Support vector machines (SVMs) are a family of algorithms that attempt to pass a (possibly high-dimensional) hyperplane between two labelled sets of points, such that the distance of the points from the plane is optimal in some sense, typically by maximizing the margin between the hyperplane and the nearest points of each set. SVMs can be used for classification or regression (corresponding to sklearn.svm.SVC and sklearn.svm.SVR, respectively).
Example:
Suppose we work in a 2D space. First, we import NumPy:
import numpy as np
Now we create the data, x and y:
x0, x1 = np.random.randn(10, 2), np.random.randn(10, 2) + (1, 1)
x = np.vstack((x0, x1))
y = [0] * 10 + [1] * 10
Note that x stacks 10 points sampled from each of two Gaussians: one centered around (0, 0), labelled 0 in y, and one centered around (1, 1), labelled 1.
To build a classifier, we can use:
from sklearn import svm
svm.SVC(kernel='linear').fit(x, y)
Let's check the prediction for (0, 0):
>>> svm.SVC(kernel='linear').fit(x, y).predict([[0, 0]])
array([0])
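The prediction is that the class is 0. Because the data is random and unseeded, results can vary between runs, but (0, 0) is the center of the class-0 Gaussian, so 0 is the expected answer.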
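To make the hyperplane itself visible, the following is a minimal sketch that keeps the fitted classifier in a variable and reads off its parameters. The fixed seed and the names rng, clf, w, b, scores and margin_width are assumptions introduced for illustration; coef_, intercept_ and decision_function are attributes of a fitted linear-kernel SVC.

```python
import numpy as np
from sklearn import svm

# Fixed seed: an assumption, added only so repeated runs use the same data.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((10, 2))            # class 0, centered around (0, 0)
x1 = rng.standard_normal((10, 2)) + (1, 1)   # class 1, centered around (1, 1)
x = np.vstack((x0, x1))
y = [0] * 10 + [1] * 10

# Fit once and keep the model, instead of refitting for every call.
clf = svm.SVC(kernel='linear').fit(x, y)

# With a linear kernel the separating hyperplane is w . p + b = 0.
w, b = clf.coef_[0], clf.intercept_[0]

# Signed score of each training point; its sign decides the predicted class.
scores = x @ w + b

# Width of the margin band between the two supporting hyperplanes.
margin_width = 2 / np.linalg.norm(w)

print("w =", w, "b =", b, "margin width =", margin_width)
print("predictions:", clf.predict([[0, 0], [1, 1]]))
```

Keeping the fitted model in clf avoids training twice, and the scores computed by hand should match what clf.decision_function(x) returns.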
For regression, we can similarly fit an SVR, which treats the values in y as continuous targets rather than class labels:
svm.SVR(kernel='linear').fit(x, y)
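As a rough illustration of how the regressor differs from the classifier, the sketch below fits the SVR on the same kind of data and asks for predictions at the two Gaussian centers. The fixed seed and the names rng, reg and pred are assumptions made for this example; the output is a float array rather than class labels, and its exact values depend on the sampled data.

```python
import numpy as np
from sklearn import svm

# Fixed seed: an assumption, added only so repeated runs use the same data.
rng = np.random.default_rng(0)
x = np.vstack((rng.standard_normal((10, 2)),
               rng.standard_normal((10, 2)) + (1, 1)))
y = [0] * 10 + [1] * 10

# SVR fits a real-valued function to y instead of separating two classes.
reg = svm.SVR(kernel='linear').fit(x, y)

# Predictions are continuous values, not the discrete labels 0 and 1.
pred = reg.predict([[0, 0], [1, 1]])
print(pred)
```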