1. How to get data?
- Import the following packages
import pandas as pd
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import load_iris
2. Load a predefined dataset by assigning the result of load_datasetname() to a variable
iris = load_iris()
3. Access the data features and the target through datasetname.data and datasetname.target, then store each in its own variable
X = iris.data
y = iris.target
4. Create a Support Vector Machine classifier with its kernel parameter set to 'linear' and store it in a variable
model = SVC(kernel='linear')
5. Fit the model to the data features and the target
model.fit(X, y)
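6. Import the cross_val_score function. It lives in sklearn.model_selection; the older sklearn.cross_validation module was removed in scikit-learn 0.20
from sklearn.model_selection import cross_val_score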
7. Write the cross-validation score function call. It is empty for now and will not run until its arguments are added in the next step
cross_val_score()
8. Pass the Support Vector Machine model, the data features and the target to the function as arguments
cross_val_score(model, X, y)
9. Set the cv parameter to 5 (five folds) and n_jobs to -1. The n_jobs parameter sets the number of CPUs used; n_jobs=-1 means all CPUs are used. Store the result in a variable
cvscores = cross_val_score(model, X, y, cv=5, n_jobs=-1)
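10. Calculate the mean and standard deviation of the stored result. cross_val_score returns a NumPy array with one score per fold (for a classifier, the accuracy on that fold), so cvscores.mean() gives the average score across folds and cvscores.std() gives how much the score varies from fold to fold
print("CV score: {:.3} +/- {:.3}".format(cvscores.mean(), cvscores.std()))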
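The steps above can be collected into one script. The sketch below also repeats the cross-validation by hand to show what the per-fold scores are. It assumes a recent scikit-learn where cross_val_score lives in sklearn.model_selection. The manual loop, the StratifiedKFold splitter and the names manual_scores and fold_model are illustrative additions that are not part of the original steps. n_jobs is left at its default here; n_jobs=-1 from step 9 could be added back.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Steps 2-3: load the dataset and split it into features and target.
iris = load_iris()
X, y = iris.data, iris.target

# Step 4: linear Support Vector Machine classifier.
model = SVC(kernel="linear")

# Steps 6-9: five-fold cross-validation, one accuracy score per fold.
cvscores = cross_val_score(model, X, y, cv=5)

# Manual equivalent (illustrative): for a classifier and an integer cv,
# scikit-learn splits the data with an unshuffled StratifiedKFold.
manual_scores = []
for train_idx, test_idx in StratifiedKFold(n_splits=5).split(X, y):
    fold_model = clone(model)                   # fresh, unfitted copy per fold
    fold_model.fit(X[train_idx], y[train_idx])  # train on four folds
    manual_scores.append(fold_model.score(X[test_idx], y[test_idx]))  # test on the fifth
manual_scores = np.array(manual_scores)

# Step 10: summarise the per-fold scores.
print("Per-fold scores:", cvscores)
print("CV score: {:.3} +/- {:.3}".format(cvscores.mean(), cvscores.std()))
print("Manual loop matches:", np.allclose(cvscores, manual_scores))
```

Note that cross_val_score fits its own copies of the model on each training split, so the model.fit(X, y) call from step 5 is not needed for cross-validation and is left out of this sketch.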