Tutorial by Examples | RIP Tutorial

Installation of scikit-learn

At the time this tutorial was written, the stable version of scikit-learn required Python (>= 2.6 or >= 3.3), NumPy (>= 1.6.1), and SciPy (>= 0.9); newer releases need more recent versions. For most installations, the pip package manager can install scikit-learn and all of its dependencies:

    pip install scikit-learn

However, for Linux systems it is recomm...
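One quick way to confirm that the installation worked is to import the package and print its version. The sketch below is an illustration rather than part of the original tutorial, and it assumes scikit-learn, NumPy, and SciPy are already installed in the active environment.

```python
import numpy
import scipy
import sklearn

# Collect the installed versions of scikit-learn and its two core dependencies.
versions = {
    "scikit-learn": sklearn.__version__,
    "NumPy": numpy.__version__,
    "SciPy": scipy.__version__,
}

# Print one line per package so the versions can be compared with the requirements.
for name, version in versions.items():
    print(f"{name}: {version}")
```

If any of the imports fails, the corresponding package is missing from the environment that Python is running in.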

scikit-learn • Getting started with scikit-learn

Train a classifier with cross-validation

Using the iris dataset:

    import sklearn.datasets
    iris_dataset = sklearn.datasets.load_iris()
    X, y = iris_dataset['data'], iris_dataset['target']

The data is split into train and test sets. To do this, we use the train_test_split utility function to split both X and y (data and target vectors) randomly w...
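The split and the cross-validation step named in the heading could look roughly like the sketch below. The classifier (KNeighborsClassifier), the 25% test size, the fixed random_state, and the 5 folds are all assumptions chosen for illustration; the original text does not specify them here. The imports come from sklearn.model_selection, which is where these utilities live in recent scikit-learn releases.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris_dataset = load_iris()
X, y = iris_dataset["data"], iris_dataset["target"]

# Hold out 25% of the samples as a test set (illustrative choice).
# stratify=y keeps the class proportions the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Assumed classifier for this sketch; any scikit-learn classifier would fit here.
clf = KNeighborsClassifier(n_neighbors=5)

# 5-fold cross-validation on the training part only.
cv_scores = cross_val_score(clf, X_train, y_train, cv=5)
print("Accuracy per fold:", cv_scores)
print("Mean cross-validation accuracy:", cv_scores.mean())

# Fit on the whole training part and evaluate once on the held-out test set.
clf.fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)
print("Test accuracy:", test_accuracy)
```

Keeping the test set out of cross-validation means the final score is measured on samples the model selection step never saw.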

Creating pipelines

Finding patterns in data often proceeds in a chain of data-processing steps, e.g., feature selection, normalization, and classification. In sklearn, a pipeline of stages is used for this. For example, the following code shows a pipeline consisting of two stages. The first scales the features, and t...
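A two-stage pipeline of the kind described above can be sketched as follows. The first stage scales the features, as the text says; the second stage is cut off in the text, so the classifier used here (LogisticRegression), the step names "scale" and "classify", and the choice of the iris data are assumptions made only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Each stage is a (name, estimator) pair; stages run in the order listed.
pipeline = Pipeline([
    ("scale", StandardScaler()),                      # stage 1: standardize each feature
    ("classify", LogisticRegression(max_iter=1000)),  # stage 2: assumed final estimator
])

# fit() fits the scaler, transforms X, then fits the classifier on the result.
pipeline.fit(X, y)

# predict() applies the fitted scaler before calling the classifier.
predictions = pipeline.predict(X[:5])
print(predictions)
```

Because the pipeline behaves like a single estimator, it can be passed to utilities such as cross-validation without handling the scaling step separately.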

Interfaces and conventions

Different operations on data are performed by special classes. Most of these classes belong to one of the following groups: classification algorithms (derived from sklearn.base.ClassifierMixin) to solve classification problems; regression algorithms (derived from sklearn.base.RegressorMixin) to so...
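The grouping by mixin class can be checked directly with isinstance. The sketch below picks LogisticRegression and LinearRegression as example estimators; that choice is an assumption for illustration and not taken from the original text.

```python
from sklearn.base import ClassifierMixin, RegressorMixin
from sklearn.linear_model import LinearRegression, LogisticRegression

classifier = LogisticRegression()
regressor = LinearRegression()

# A classifier derives from ClassifierMixin, a regressor from RegressorMixin.
is_classifier = isinstance(classifier, ClassifierMixin)
is_regressor = isinstance(regressor, RegressorMixin)

# The groups are distinct: a plain regressor is not a classifier.
regressor_is_classifier = isinstance(regressor, ClassifierMixin)

print(is_classifier, is_regressor, regressor_is_classifier)  # True True False
```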

Sample datasets

For ease of testing, sklearn provides some built-in datasets in the sklearn.datasets module. For example, let's load Fisher's iris dataset:

    import sklearn.datasets
    iris_dataset = sklearn.datasets.load_iris()
    iris_dataset.keys()
    ['target_names', 'data', 'target', 'DESCR', 'feature_names']

You can ...
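To get a feel for what the loaded object contains, the entries can be inspected by key. The following is a small sketch under the assumption that only the keys listed above are needed; recent scikit-learn versions may return additional keys.

```python
import sklearn.datasets

iris_dataset = sklearn.datasets.load_iris()

# 'data' holds one row per flower and one column per measurement.
data_shape = iris_dataset["data"].shape      # (150, 4)
# 'target' holds one integer class label per row.
target_shape = iris_dataset["target"].shape  # (150,)

# Human-readable names for the columns and for the class labels.
feature_names = list(iris_dataset["feature_names"])
target_names = [str(name) for name in iris_dataset["target_names"]]

print(data_shape, target_shape)
print(feature_names)
print(target_names)  # ['setosa', 'versicolor', 'virginica']
```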
