Support Vector Machine with Scikit-learn

1. How to get data?

  1. Import the following packages
import pandas as pd
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import load_iris

2. Load a predefined dataset by assigning the result of load_datasetname() to a variable

iris = load_iris()

3. Access the data features and the target with datasetname.data and datasetname.target, then store them in two variables

X = iris.data
y = iris.target
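
Before training, it can help to check what X and y actually contain. The following is a small sketch of that check; the attribute names feature_names and target_names come from the object returned by load_iris().

```python
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data    # feature matrix: one row per flower, one column per measurement
y = iris.target  # class labels encoded as integers

print(X.shape)             # (150, 4)
print(y.shape)             # (150,)
print(iris.feature_names)  # names of the four measured features
print(iris.target_names)   # names of the three iris species
```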

4. Create a Support Vector Machine classifier and store it in a variable. Set the kernel parameter to 'linear' to get a linear SVM

model = SVC(kernel='linear')

5. Fit the model to the data features and the target
model.fit(X, y)
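
Once fitted, the model can be used to predict labels. A minimal sketch, assuming we simply reuse the training data to see the calls in action (the names predictions and train_accuracy are only illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris = load_iris()
X, y = iris.data, iris.target

model = SVC(kernel='linear')
model.fit(X, y)

# Predict the class of the first five samples
predictions = model.predict(X[:5])
print(predictions)

# Accuracy on the same data the model was trained on.
# This is optimistic, which is why cross-validation is used in the next steps.
train_accuracy = model.score(X, y)
print(train_accuracy)
```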

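6. Import the cross-validation score function. In scikit-learn 0.18 and later it lives in sklearn.model_selection (the older sklearn.cross_validation module has been removed)

from sklearn.model_selection import cross_val_score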

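7. Write the call to the cross-validation score function. On its own, without arguments, this call raises an error; the arguments are filled in over the next steps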

cross_val_score()

8. Pass the Support Vector Machine model, the data features and the target to the function

cross_val_score(model, X, y)

9. Set the cv parameter to 5 and n_jobs to -1, and store the result in a variable. cv=5 runs 5-fold cross-validation. The n_jobs parameter is the number of CPUs used; n_jobs=-1 means all CPUs are used.

cvscores = cross_val_score(model, X, y, cv = 5, n_jobs=-1)
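
To make cv=5 less of a black box: for a classifier such as SVC, an integer cv is turned into a stratified k-fold split, so each fold keeps roughly the same class proportions. The sketch below builds that splitter explicitly and checks that both calls give the same scores. It assumes a scikit-learn version that provides sklearn.model_selection, and the names scores_int, folds and scores_explicit are only illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

iris = load_iris()
X, y = iris.data, iris.target
model = SVC(kernel='linear')

# Integer cv: scikit-learn picks the splitter for us
scores_int = cross_val_score(model, X, y, cv=5)

# The same splitter written out explicitly
folds = StratifiedKFold(n_splits=5)
scores_explicit = cross_val_score(model, X, y, cv=folds)

print(scores_int)                                # one score per fold
print(np.allclose(scores_int, scores_explicit))  # True
```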

10. cvscores now holds one accuracy score per fold. Summarize it with its mean (the average accuracy across the five folds) and its standard deviation (how much the score varies from fold to fold)
print("CV score: {:.3} +/- {:.3}".format(cvscores.mean(), cvscores.std()))
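
Putting the steps together, here is a sketch of the whole workflow that also computes the mean and standard deviation by hand to show what .mean() and .std() return. It assumes Python 3 and a scikit-learn version that provides sklearn.model_selection; the names mean_score, std_score, manual_mean and manual_std are only illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

iris = load_iris()
X, y = iris.data, iris.target

model = SVC(kernel='linear')
cvscores = cross_val_score(model, X, y, cv=5)  # array with one score per fold

mean_score = cvscores.mean()
std_score = cvscores.std()

# The same quantities computed by hand:
# mean = sum of the scores divided by the number of folds
manual_mean = cvscores.sum() / len(cvscores)
# std = square root of the average squared distance from the mean
manual_std = np.sqrt(((cvscores - manual_mean) ** 2).mean())

print("CV score: {:.3} +/- {:.3}".format(mean_score, std_score))
print(np.isclose(mean_score, manual_mean), np.isclose(std_score, manual_std))
```

A small standard deviation means the model scores about the same on every fold; a large one suggests the result depends heavily on how the data was split.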
