Introduction

In this introduction, we write a simple experiment to find out which SVM kernel works best on scikit-learn's handwritten digits dataset (a small, MNIST-like dataset loaded with datasets.load_digits).

Defining the experiment

classifier.py:

from sklearn import datasets, svm
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

from experitur import Experiment, Trial
from experitur.configurators import Grid


@Grid({"svc_kernel": ["linear", "poly", "rbf", "sigmoid"]})
@Experiment()
def classifier_svm(trial: Trial):
    X, y = datasets.load_digits(return_X_y=True)

    n_samples = len(X)

    # Flatten (a no-op here, since load_digits(return_X_y=True) already returns 2D data)
    X = X.reshape((n_samples, -1))

    # Extract parameters prefixed with "svc_"
    svc_parameters = trial.prefixed("svc_")

    # Create a support vector classifier
    classifier = svc_parameters.call(svm.SVC)

    # svc_parameters.call automatically recorded the default values of svm.SVC in the trial's parameters:
    assert "svc_gamma" in trial
    assert trial["svc_gamma"] == "scale"

    print("Classifier:", classifier)

    # Fit
    X_train = X[: n_samples // 2]
    y_train = y[: n_samples // 2]
    classifier.fit(X_train, y_train)

    # Predict
    X_test = X[n_samples // 2 :]
    y_test = y[n_samples // 2 :]
    y_test_pred = classifier.predict(X_test)

    # Calculate some metrics
    macro_prfs = precision_recall_fscore_support(y_test, y_test_pred, average="macro")

    result = dict(zip(("macro_precision", "macro_recall", "macro_f_score"), macro_prfs))

    result["accuracy"] = accuracy_score(y_test, y_test_pred)

    print(result)

    return result
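
The Grid configurator in classifier.py runs the experiment once per kernel. As a rough illustration of the idea only (this is not experitur's implementation, and expand_grid is a hypothetical helper), a parameter grid can be expanded into individual configurations with itertools.product:

```python
from itertools import product


def expand_grid(grid):
    """Yield one parameter dict per combination of the grid's values.

    Hypothetical helper for illustration; not part of experitur.
    """
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


# The grid from classifier.py has a single parameter, so it yields four
# configurations, one per kernel.
configurations = list(
    expand_grid({"svc_kernel": ["linear", "poly", "rbf", "sigmoid"]})
)
for configuration in configurations:
    print(configuration)
```

With more than one parameter in the grid, the same function would yield every combination of values, which is why grids grow quickly as parameters are added.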

Running the experiment

[Animation: terminal recording of the experiment being run (classifier.gif)]

Looking at the results

The results are saved in a folder named after the description of experiment (DOX) file, in this case classifier/. For every trial, a subfolder classifier/<trial_id> is created, containing a trial.yaml file with the following data:

experiment

Experiment data.

id

Trial ID (<experiment name>/<trial id>)

parameters

Parameters as defined by the ParameterGenerator.

resolved_parameters

Parameters after parameter substitution, filled in with the default values recorded by call() and record_defaults().

result

Result returned by the experiment function.

success

Whether the trial finished without errors (True for all trials in this example).

time_start

Time when the trial was started.

time_end

Time when the trial ended.

wdir

Working directory of the trial.
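
The time_start and time_end fields make it easy to work out how long a trial took. The sketch below assumes the timestamps are available as strings in the format shown in the results of this example; depending on how trial.yaml is loaded, they may already be datetime objects, in which case the parsing step is unnecessary.

```python
from datetime import datetime

# Values copied from the svc_kernel-linear trial in this example.
time_start = datetime.fromisoformat("2020-03-26 21:00:54.331190")
time_end = datetime.fromisoformat("2020-03-26 21:00:54.434675")

# Subtracting two datetimes gives a timedelta.
duration = (time_end - time_start).total_seconds()
print(f"Trial took {duration:.3f} s")
```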

Running experitur collect classifier.py produces the following CSV file, which can be used to examine the results.

Results

id,success,time_end,time_start,wdir,experiment.func,experiment.meta,experiment.name,experiment.parent,parameters.svc_kernel,resolved_parameters.svc_C,resolved_parameters.svc_break_ties,resolved_parameters.svc_cache_size,resolved_parameters.svc_class_weight,resolved_parameters.svc_coef0,resolved_parameters.svc_decision_function_shape,resolved_parameters.svc_degree,resolved_parameters.svc_gamma,resolved_parameters.svc_kernel,resolved_parameters.svc_max_iter,resolved_parameters.svc_probability,resolved_parameters.svc_random_state,resolved_parameters.svc_shrinking,resolved_parameters.svc_tol,resolved_parameters.svc_verbose,result.accuracy,result.macro_f_score,result.macro_precision,result.macro_recall
classifier_svm/svc_kernel-linear,True,2020-03-26 21:00:54.434675,2020-03-26 21:00:54.331190,classifier/classifier_svm/svc_kernel-linear,classifier.classifier_svm,,classifier_svm,,linear,1.0,False,200,,0.0,ovr,3,scale,linear,-1,False,,True,0.001,False,0.9443826473859844,0.9447774845903595,0.9464082561378555,0.9447239678783855
classifier_svm/svc_kernel-poly,True,2020-03-26 21:00:54.552524,2020-03-26 21:00:54.444060,classifier/classifier_svm/svc_kernel-poly,classifier.classifier_svm,,classifier_svm,,poly,1.0,False,200,,0.0,ovr,3,scale,poly,-1,False,,True,0.001,False,0.9588431590656284,0.9592749897784388,0.9605287212162343,0.9590717352596622
classifier_svm/svc_kernel-sigmoid,True,2020-03-26 21:00:54.925145,2020-03-26 21:00:54.725383,classifier/classifier_svm/svc_kernel-sigmoid,classifier.classifier_svm,,classifier_svm,,sigmoid,1.0,False,200,,0.0,ovr,3,scale,sigmoid,-1,False,,True,0.001,False,0.8865406006674083,0.8854842866699852,0.8886035858711934,0.8863762321227018
classifier_svm/svc_kernel-rbf,True,2020-03-26 21:00:54.712331,2020-03-26 21:00:54.567612,classifier/classifier_svm/svc_kernel-rbf,classifier.classifier_svm,,classifier_svm,,rbf,1.0,False,200,,0.0,ovr,3,scale,rbf,-1,False,,True,0.001,False,0.9610678531701891,0.9611883180540721,0.9621048944956506,0.9612942318641812

As you can see, resolved_parameters also contains the default values of sklearn.svm.SVC.
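
To answer the question posed at the start (which kernel works best), the collected results can be ranked by accuracy. The following is a minimal sketch using only the standard library; it embeds three of the columns from the results above as an inline string, whereas in practice you would open the CSV file written by experitur collect.

```python
import csv
import io

# Three of the columns from the results above, embedded inline so that the
# example is self-contained.
CSV_TEXT = """\
id,parameters.svc_kernel,result.accuracy
classifier_svm/svc_kernel-linear,linear,0.9443826473859844
classifier_svm/svc_kernel-poly,poly,0.9588431590656284
classifier_svm/svc_kernel-sigmoid,sigmoid,0.8865406006674083
classifier_svm/svc_kernel-rbf,rbf,0.9610678531701891
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# CSV fields are read as strings, so convert the accuracy before comparing.
best = max(rows, key=lambda row: float(row["result.accuracy"]))
print(best["parameters.svc_kernel"], best["result.accuracy"])
```

In this run, the rbf kernel has the highest accuracy, closely followed by poly.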

Concepts

The example uses the following concepts:

Experiment

Define an experiment.

Trial

Data related to a trial.

The following methods of Trial were used:

prefixed(prefix)

Return new Trial instance with prefix applied.

call(func, *args, **kwargs)

Call the function applying the configured parameters.
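
To make the interplay of prefixed() and call() more concrete, here is a plain-Python sketch of the general idea: select the parameters that share a prefix, pass them to a function, and record the function's defaults for everything that was not configured. It is an illustration under simplifying assumptions, not experitur's actual implementation; call_with_prefix and make_classifier are hypothetical names, with make_classifier standing in for svm.SVC.

```python
import inspect


def call_with_prefix(parameters, prefix, func):
    """Call func with the entries of `parameters` that start with `prefix`.

    Defaults of func that were not configured are written back into
    `parameters` under the prefixed name, so that they are recorded too.
    Hypothetical helper for illustration; not part of experitur.
    """
    kwargs = {}
    for name, param in inspect.signature(func).parameters.items():
        key = prefix + name
        if key in parameters:
            # Explicitly configured value wins.
            kwargs[name] = parameters[key]
        elif param.default is not inspect.Parameter.empty:
            # Record the default so it shows up alongside the configuration.
            parameters[key] = param.default
            kwargs[name] = param.default
    return func(**kwargs)


def make_classifier(kernel="rbf", gamma="scale", C=1.0):
    """Hypothetical stand-in for svm.SVC that just returns its arguments."""
    return {"kernel": kernel, "gamma": gamma, "C": C}


parameters = {"svc_kernel": "linear"}
classifier = call_with_prefix(parameters, "svc_", make_classifier)
print(classifier)
print(parameters)
```

After the call, parameters also contains svc_gamma and svc_C, which mirrors why resolved_parameters in the results lists the defaults of sklearn.svm.SVC.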

Further reading