(runnables)= # Using Your Own Algorithms When using the package mlrl-testbed, as described {ref}`here`, the Python module of the program to be run must be specified. For running the algorithms provided by this project, the module names `mlrl.boosting` and `mlrl.seco` can be used. However, you can also specify the name of a custom module, or the path to a Python source file, that provides an integration with a machine learning algorithm of your choice. ## Integrating an Algorithm The module or source file, which is specified via the command line API, must contain a class named `Runnable` that extends from {py:class}`mlrl.testbed.SkLearnRunnable `. If you want to use a different class name, you can specify a different one via the command line arguments `-r` or `--runnable` as described {ref}`here`. Besides the name of the machine learning algorithm to be integrated, the class must override the methods {py:meth}`create_classifier ` and {py:meth}`create_regressor `. If you do not intend to support either classification or regression problems, you can just return `None` from the respective method. Otherwise, it must return a scikit-learn compatible [estimator](https://scikit-learn.org/stable/glossary.html#term-estimators) to be used in experiments. In the following, we provide an exemplary implementation of such a class using scikit-learn's {py:class}`RandomForestClassifier `: ```python from argparse import Namespace from mlrl.testbed.experiments.state import ExperimentMode from mlrl.testbed_sklearn.runnables import SkLearnRunnable from sklearn.ensemble import RandomForestClassifier from sklearn.base import ClassifierMixin, RegressorMixin from typing import Optional class Runnable(SkLearnRunnable): def create_classifier(self, mode: ExperimentMode, args: Namespace) -> Optional[ClassifierMixin]: return RandomForestClassifier() def create_regressor(self, mode: ExperimentMode, args: Namespace) -> Optional[RegressorMixin]: return None ``` Assuming that the source code shown above is saved to a file named `custom_runnable.py` in the working directory, the package mlrl-testbed can be instructed to use it as follows: ```text mlrl-testbed custom_runnable.py --data-dir path/to/datasets/ --dataset dataset-name ``` ## Defining Command Line Arguments To ease the configuration of a machine learning algorithm, for which you created a custom integration, the base class {py:class}`SkLearnRunnable ` provides a simple mechanism for defining custom command line arguments by overriding the method {py:meth}`get_algorithmic_arguments `. As illustrated below, the user-specified values for these arguments can then be retrieved in the methods {py:meth}`create_classifier ` and {py:meth}`create_regressor `: ```python from argparse import Namespace from mlrl.testbed import SkLearnRunnable from mlrl.testbed.experiments.state import ExperimentMode from mlrl.util.cli import Argument, IntArgument from sklearn.ensemble import RandomForestClassifier from sklearn.base import ClassifierMixin, RegressorMixin from typing import Optional, Set class Runnable(SkLearnRunnable): N_ESTIMATORS = IntArgument( '--n-estimators', description='The number of trees in the forest', default=100, ) def get_algorithmic_arguments(self, known_args: Namespace) -> Set[Argument]: return { self.N_ESTIMATORS } def create_classifier(self, mode: ExperimentMode, args: Namespace) -> Optional[ClassifierMixin]: return RandomForestClassifier(n_estimators=self.N_ESTIMATORS.get_value(args)) def create_regressor(self, mode: ExperimentMode, args: Namespace) -> Optional[RegressorMixin]: return None ``` The method {py:meth}`get_algorithmic_arguments ` must return a set of {py:class}`Argument ` objects. The following subclasses, corresponding to different types of arguments, are available: - {py:class}`IntArgument `: For specifying an integer value. - {py:class}`FloatArgument `: For specifying a float value. - {py:class}`StringArgument `: For specifying an arbitrary string. - {py:class}`SetArgument `: For specifying one out of a predefined set of string values. - {py:class}`EnumArgument `: For specifying one out of a predefined set of enum values. - {py:class}`FlagArgument `: For flags that are disabled by default. - {py:class}`PathArgument `: For file system paths. Instead of retrieving the value specified by the user directly from the given `Namespace` object, we recommend to use the method {py:meth}`get_value `, as it validates the given value and prints helpful information in the case of validation errors. ## Providing Version Information Optionally, you can provide information about the version and authors of your custom program by overriding the method {py:meth}`get_program_info `: ```python from mlrl.testbed_sklearn.runnables import SkLearnRunnable from mlrl.testbed.program_info import ProgramInfo from typing import Optional class Runnable(SkLearnRunnable): # ... def get_program_info(self) -> Optional[ProgramInfo]: return ProgramInfo(name='Random Forest Classifier', version='1.0.0', year='1934', authors=['Bonnie', 'Clyde']) ```