scikit-learn Classification GradientBoostingClassifier


Example

Gradient Boosting for classification. The Gradient Boosting Classifier is an additive ensemble: it starts from a simple base model and, in successive iterations (or stages), adds regression trees that are fit to the residuals of the previous stage (more precisely, to the negative gradient of the loss function), so that each new tree corrects part of the remaining error.
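The stage-wise correction described above can be sketched by hand for a binary problem. This is only a simplified illustration, not scikit-learn's implementation (which differs in details such as how leaf values are computed). The binary target "is virginica", the helper names sigmoid and log_loss, and the values of learning_rate, n_stages and max_depth are all assumptions chosen for the sketch.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeRegressor


def sigmoid(z):
    """Map raw scores (log-odds) to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(y, p):
    """Mean binary log loss; probabilities are clipped to avoid log(0)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))


# Assumption for brevity: reduce iris to a binary task ("is it virginica?").
iris = load_iris()
X = iris.data
y = (iris.target == 2).astype(float)

learning_rate = 0.1  # illustrative value
n_stages = 20        # illustrative value

# Stage 0: a constant base model that predicts the prior log-odds.
prior = y.mean()
F = np.full(len(y), np.log(prior / (1 - prior)))

losses = [log_loss(y, sigmoid(F))]
for _ in range(n_stages):
    # Residuals = negative gradient of the log loss with respect to F.
    residuals = y - sigmoid(F)
    # Fit a regression tree to the residuals of the previous stage.
    tree = DecisionTreeRegressor(max_depth=3, random_state=0)
    tree.fit(X, residuals)
    # Add the (shrunken) correction to the ensemble's raw score.
    F += learning_rate * tree.predict(X)
    losses.append(log_loss(y, sigmoid(F)))

print("training loss before boosting:", losses[0])
print("training loss after boosting: ", losses[-1])
```

Each pass through the loop is one stage: the training loss recorded in losses shrinks as trees are added, which is the "error corrected in successive iterations" behaviour in miniature.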

Import:

from sklearn.ensemble import GradientBoostingClassifier

Load some toy classification data:

from sklearn.datasets import load_iris

iris_dataset = load_iris()

X, y = iris_dataset.data, iris_dataset.target

Let us split this data into training and test sets:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

Instantiate a GradientBoostingClassifier model using the default parameters and fit it on the training set:

gbc = GradientBoostingClassifier()
gbc.fit(X_train, y_train)

Let us score it on the test set:

# score() returns the default metric for classifiers: mean accuracy
>>> gbc.score(X_test, y_test)
1.0
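To make explicit what the default score measures, the same number can be computed from the predictions with accuracy_score. This is a small self-contained sketch; random_state=0 on the classifier is an addition made here only so the run is repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# random_state is set only for reproducibility (not in the example above).
gbc = GradientBoostingClassifier(random_state=0)
gbc.fit(X_train, y_train)

# Fraction of test samples whose predicted label matches the true label.
y_pred = gbc.predict(X_test)
manual_accuracy = accuracy_score(y_test, y_pred)
builtin_accuracy = gbc.score(X_test, y_test)

print(manual_accuracy, builtin_accuracy)
```

Both values agree because score() on a classifier is the accuracy of predict() on the given data.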

By default, 100 estimators (boosting stages) are built:

>>> gbc.n_estimators
100

This can be controlled by setting n_estimators to a different value when instantiating the model.
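As a brief illustration of that setting, the sketch below builds a smaller ensemble. The value 50, the variable name gbc_small and random_state=0 are arbitrary choices for the example, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Ask for 50 boosting stages instead of the default 100.
gbc_small = GradientBoostingClassifier(n_estimators=50, random_state=0)
gbc_small.fit(X_train, y_train)

print(gbc_small.n_estimators)       # 50
# One row per stage; for the three iris classes, one tree per class per stage.
print(gbc_small.estimators_.shape)  # (50, 3)
```

Fewer stages train faster but may underfit, while more stages can fit the training data more closely, so n_estimators is usually tuned together with learning_rate.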