Introduction
In this introduction, we write a simple experiment to find out which SVM kernel works best on scikit-learn's handwritten digits dataset (a small, MNIST-like dataset loaded with datasets.load_digits).
Defining the experiment
classifier.py:
    from sklearn import datasets, svm
    from sklearn.metrics import accuracy_score, precision_recall_fscore_support

    from experitur import Experiment, Trial
    from experitur.configurators import Grid


    @Grid({"svc_kernel": ["linear", "poly", "rbf", "sigmoid"]})
    @Experiment()
    def classifier_svm(trial: Trial):
        X, y = datasets.load_digits(return_X_y=True)

        n_samples = len(X)

        # Flatten
        X = X.reshape((n_samples, -1))

        # Extract parameters prefixed with "svc_"
        svc_parameters = trial.prefixed("svc_")

        # Create a support vector classifier
        classifier = svc_parameters.call(svm.SVC)

        # svc_parameters.call automatically filled `parameters` in with the default values:
        assert "svc_gamma" in trial
        assert trial["svc_gamma"] == "scale"

        print("Classifier:", classifier)

        # Fit on the first half of the data
        X_train = X[: n_samples // 2]
        y_train = y[: n_samples // 2]
        classifier.fit(X_train, y_train)

        # Predict on the second half of the data
        X_test = X[n_samples // 2 :]
        y_test = y[n_samples // 2 :]
        y_test_pred = classifier.predict(X_test)

        # Calculate some metrics
        macro_prfs = precision_recall_fscore_support(y_test, y_test_pred, average="macro")

        result = dict(zip(("macro_precision", "macro_recall", "macro_f_score"), macro_prfs))

        result["accuracy"] = accuracy_score(y_test, y_test_pred)

        print(result)

        return result
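The Grid configurator in the example lists four values for svc_kernel, and the results further down contain one trial per value. The following is only a conceptual sketch of that expansion in plain Python, not experitur's implementation; expand_grid is a hypothetical helper introduced here for illustration.

```python
from itertools import product


def expand_grid(grid):
    """Return one parameter dict per combination of the listed values.

    Hypothetical helper: it mimics the idea of a parameter grid and is
    not part of experitur.
    """
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


# The grid from classifier.py has one parameter with four values,
# so it expands to four parameter sets.
combinations = expand_grid({"svc_kernel": ["linear", "poly", "rbf", "sigmoid"]})
for parameters in combinations:
    print(parameters)

# With a second parameter, every pairing of values would be generated.
larger = expand_grid({"svc_kernel": ["linear", "rbf"], "svc_C": [0.1, 1.0, 10.0]})
print(len(larger), "combinations")
```

Listing a second parameter multiplies the number of combinations, which is why grids are usually kept small.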
Running the experiment
Run the experiment from the command line with experitur run classifier.py.
Looking at the results
The results are saved in a folder named after the description of experiment (DOX), in this case classifier/.
For every trial, a subfolder classifier/<trial_id> is created with a trial.yaml file containing the following data:
Key | Description
---|---
experiment | Experiment data.
id | Trial ID.
parameters | Parameters as defined by the …
resolved_parameters | Parameters after parameter substitution, filled in with values derived from …
result | Result returned by the experiment function.
success | TODO
time_start | Time when the trial was started.
time_end | Time when the trial ended.
wdir | Working directory of the trial.
Running experitur collect classifier.py produces the following CSV file, which can be used to examine the results.
id | success | time_end | time_start | wdir | experiment.func | experiment.meta | experiment.name | experiment.parent | parameters.svc_kernel | resolved_parameters.svc_C | resolved_parameters.svc_break_ties | resolved_parameters.svc_cache_size | resolved_parameters.svc_class_weight | resolved_parameters.svc_coef0 | resolved_parameters.svc_decision_function_shape | resolved_parameters.svc_degree | resolved_parameters.svc_gamma | resolved_parameters.svc_kernel | resolved_parameters.svc_max_iter | resolved_parameters.svc_probability | resolved_parameters.svc_random_state | resolved_parameters.svc_shrinking | resolved_parameters.svc_tol | resolved_parameters.svc_verbose | result.accuracy | result.macro_f_score | result.macro_precision | result.macro_recall
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
classifier_svm/svc_kernel-linear | True | 2020-03-26 21:00:54.434675 | 2020-03-26 21:00:54.331190 | classifier/classifier_svm/svc_kernel-linear | classifier.classifier_svm | | classifier_svm | | linear | 1.0 | False | 200 | | 0.0 | ovr | 3 | scale | linear | -1 | False | | True | 0.001 | False | 0.9443826473859844 | 0.9447774845903595 | 0.9464082561378555 | 0.9447239678783855
classifier_svm/svc_kernel-poly | True | 2020-03-26 21:00:54.552524 | 2020-03-26 21:00:54.444060 | classifier/classifier_svm/svc_kernel-poly | classifier.classifier_svm | | classifier_svm | | poly | 1.0 | False | 200 | | 0.0 | ovr | 3 | scale | poly | -1 | False | | True | 0.001 | False | 0.9588431590656284 | 0.9592749897784388 | 0.9605287212162343 | 0.9590717352596622
classifier_svm/svc_kernel-sigmoid | True | 2020-03-26 21:00:54.925145 | 2020-03-26 21:00:54.725383 | classifier/classifier_svm/svc_kernel-sigmoid | classifier.classifier_svm | | classifier_svm | | sigmoid | 1.0 | False | 200 | | 0.0 | ovr | 3 | scale | sigmoid | -1 | False | | True | 0.001 | False | 0.8865406006674083 | 0.8854842866699852 | 0.8886035858711934 | 0.8863762321227018
classifier_svm/svc_kernel-rbf | True | 2020-03-26 21:00:54.712331 | 2020-03-26 21:00:54.567612 | classifier/classifier_svm/svc_kernel-rbf | classifier.classifier_svm | | classifier_svm | | rbf | 1.0 | False | 200 | | 0.0 | ovr | 3 | scale | rbf | -1 | False | | True | 0.001 | False | 0.9610678531701891 | 0.9611883180540721 | 0.9621048944956506 | 0.9612942318641812
As you can see, resolved_parameters also contains the default values of sklearn.svm.SVC.
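To answer the question posed in the introduction, the collected results can be ranked by accuracy. The sketch below uses only the standard library and embeds an excerpt of the values listed above (four of the 29 columns) as an inline string, because the name of the generated CSV file is not stated here; with the real file, the same csv.DictReader call could presumably be given an open file handle instead.

```python
import csv
import io

# Excerpt of the collected results shown above. The values are copied from
# the table; only the columns needed for the ranking are kept.
collected = """\
id,parameters.svc_kernel,result.accuracy,result.macro_f_score
classifier_svm/svc_kernel-linear,linear,0.9443826473859844,0.9447774845903595
classifier_svm/svc_kernel-poly,poly,0.9588431590656284,0.9592749897784388
classifier_svm/svc_kernel-sigmoid,sigmoid,0.8865406006674083,0.8854842866699852
classifier_svm/svc_kernel-rbf,rbf,0.9610678531701891,0.9611883180540721
"""

rows = list(csv.DictReader(io.StringIO(collected)))

# CSV values are strings, so convert the accuracy before sorting.
ranked = sorted(rows, key=lambda row: float(row["result.accuracy"]), reverse=True)

for row in ranked:
    kernel = row["parameters.svc_kernel"]
    accuracy = float(row["result.accuracy"])
    print(f"{kernel:<8} {accuracy:.4f}")

best_kernel = ranked[0]["parameters.svc_kernel"]
print("Best kernel:", best_kernel)
```

On these numbers the rbf kernel ranks first, followed by poly, linear and sigmoid. All four trials used the default values for the remaining SVC parameters, so the ranking only holds for that configuration.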
Concepts
The example uses the following concepts:
Name | Description
---|---
Experiment | Define an experiment.
Trial | Data related to a trial.

The following methods of Trial were used:

Method | Description
---|---
prefixed | Return new …
call | Call the function applying the configured parameters.
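The interplay of the two methods (select the parameters with a given prefix, call a function with them, and record the defaults that were used) can be illustrated in plain Python. This is a simplified sketch of the idea and not experitur's implementation: prefixed and call_with_defaults are hypothetical helpers, a plain dict stands in for the trial, and make_classifier is a stand-in for svm.SVC so that the example runs without scikit-learn.

```python
import inspect


def prefixed(parameters, prefix):
    """Return the entries whose key starts with `prefix`, with the prefix removed."""
    return {
        key[len(prefix):]: value
        for key, value in parameters.items()
        if key.startswith(prefix)
    }


def call_with_defaults(func, parameters, prefix):
    """Call `func` with the prefixed parameters and record the defaults it used."""
    kwargs = prefixed(parameters, prefix)

    # Fill in every default from the signature that was not set explicitly.
    for name, param in inspect.signature(func).parameters.items():
        if param.default is not inspect.Parameter.empty:
            kwargs.setdefault(name, param.default)

    # Write the resolved values back so that they become part of the record.
    for name, value in kwargs.items():
        parameters[prefix + name] = value

    return func(**kwargs)


def make_classifier(kernel="rbf", gamma="scale", C=1.0):
    """Stand-in for svm.SVC with three of its keyword arguments."""
    return f"SVC(kernel={kernel!r}, gamma={gamma!r}, C={C})"


# A plain dict plays the role of the trial; "seed" has no "svc_" prefix
# and is therefore not passed to the function.
trial = {"svc_kernel": "linear", "seed": 0}
classifier = call_with_defaults(make_classifier, trial, "svc_")

print(classifier)
print(trial)
```

After the call, the dict also holds svc_gamma and svc_C, which mirrors the assertions on trial["svc_gamma"] in classifier.py and explains why resolved_parameters lists values that were never set explicitly.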