mlrl.testbed.runnables module¶

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides base classes for programs that can be configured via command line arguments.

class mlrl.testbed.runnables.LearnerRunnable(learner_name: str)¶

Bases: Runnable, ABC

A base class for all programs that perform an experiment that involves training and evaluation of a machine learning algorithm.

class ClearOutputDirHook(output_dir: str)¶

Bases: ExecutionHook

Deletes all files from the output directory before an experiment starts.

execute()¶: See mlrl.testbed.experiments.Experiment.ExecutionHook.execute()

DATA_SPLIT_CROSS_VALIDATION = 'cross-validation'¶

DATA_SPLIT_TRAIN_TEST = 'train-test'¶

DATA_SPLIT_VALUES: Dict[str, Set[str]] = {'cross-validation': {'current_fold', 'num_folds'}, 'none': {}, 'train-test': {'test_size'}}¶

OPTION_CURRENT_FOLD = 'current_fold'¶

OPTION_NUM_FOLDS = 'num_folds'¶

OPTION_TEST_SIZE = 'test_size'¶

PARAM_DATA_SPLIT = '--data-split'¶
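The `*_VALUES` dictionaries map each accepted value of a parameter to the option names that may accompany it. The following is a minimal sketch of how such a mapping could be used for validation. The helper `check_options` is hypothetical and not part of the testbed, and the empty entry for 'none' is written as `set()` here for clarity.

```python
from typing import Dict, Set

# Values copied from the listing above; 'none' accepts no options.
DATA_SPLIT_VALUES: Dict[str, Set[str]] = {
    'cross-validation': {'current_fold', 'num_folds'},
    'none': set(),
    'train-test': {'test_size'},
}


def check_options(value: str, options: Set[str], allowed: Dict[str, Set[str]]) -> None:
    """Hypothetical helper: raises a ValueError if the value or an option name is not allowed."""
    if value not in allowed:
        raise ValueError(f'Invalid value "{value}", expected one of {sorted(allowed)}')
    unknown = options - allowed[value]
    if unknown:
        raise ValueError(f'Invalid options {sorted(unknown)} for value "{value}"')


# Accepted: 'num_folds' is a valid option of 'cross-validation'.
check_options('cross-validation', {'num_folds'}, DATA_SPLIT_VALUES)
```

How the testbed itself parses and validates the argument string is not described in this section, so the sketch only illustrates the shape of the data.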

PARAM_OUTPUT_DIR = '--output-dir'¶

PARAM_PREDICTION_TYPE = '--prediction-type'¶

PARAM_PRINT_DATA_CHARACTERISTICS = '--print-data-characteristics'¶

PARAM_PRINT_EVALUATION = '--print-evaluation'¶

PARAM_PRINT_LABEL_VECTORS = '--print-label-vectors'¶

PARAM_PRINT_PREDICTIONS = '--print-predictions'¶

PARAM_PRINT_PREDICTION_CHARACTERISTICS = '--print-prediction-characteristics'¶

PARAM_PROBLEM_TYPE = '--problem-type'¶

PARAM_RANDOM_STATE = '--random-state'¶

PARAM_STORE_DATA_CHARACTERISTICS = '--store-data-characteristics'¶

PARAM_STORE_EVALUATION = '--store-evaluation'¶

PARAM_STORE_LABEL_VECTORS = '--store-label-vectors'¶

PARAM_STORE_PREDICTIONS = '--store-predictions'¶

PARAM_STORE_PREDICTION_CHARACTERISTICS = '--store-prediction-characteristics'¶

PRINT_DATA_CHARACTERISTICS_VALUES: Dict[str, Set[str]] = {'false': {}, 'true': {'decimals', 'distinct_label_vectors', 'examples', 'feature_density', 'feature_sparsity', 'features', 'label_cardinality', 'label_imbalance_ratio', 'nominal_features', 'numerical_features', 'output_density', 'output_sparsity', 'outputs', 'percentage'}}¶

PRINT_EVALUATION_VALUES: Dict[str, Set[str]] = {'false': {}, 'true': {'accuracy', 'coverage_error', 'dcg', 'decimals', 'enable_all', 'example_wise_f1', 'example_wise_jaccard', 'example_wise_precision', 'example_wise_recall', 'f1', 'hamming_accuracy', 'hamming_loss', 'jaccard', 'lrap', 'macro_f1', 'macro_jaccard', 'macro_precision', 'macro_recall', 'mean_absolute_error', 'mean_absolute_percentage_error', 'mean_squared_error', 'micro_f1', 'micro_jaccard', 'micro_precision', 'micro_recall', 'ndcg', 'percentage', 'precision', 'rank_loss', 'recall', 'subset_accuracy', 'subset_zero_one_loss', 'zero_one_loss'}}¶

PRINT_LABEL_VECTORS_VALUES: Dict[str, Set[str]] = {'false': {}, 'true': {'sparse'}}¶

PRINT_PREDICTIONS_VALUES: Dict[str, Set[str]] = {'false': {}, 'true': {'decimals'}}¶

PRINT_PREDICTION_CHARACTERISTICS_VALUES: Dict[str, Set[str]] = {'false': {}, 'true': {'decimals', 'distinct_label_vectors', 'label_cardinality', 'label_imbalance_ratio', 'output_density', 'output_sparsity', 'outputs', 'percentage'}}¶

STORE_DATA_CHARACTERISTICS_VALUES = {'false': {}, 'true': {'decimals', 'distinct_label_vectors', 'examples', 'feature_density', 'feature_sparsity', 'features', 'label_cardinality', 'label_imbalance_ratio', 'nominal_features', 'numerical_features', 'output_density', 'output_sparsity', 'outputs', 'percentage'}}¶

STORE_EVALUATION_VALUES: Dict[str, Set[str]] = {'false': {}, 'true': {'accuracy', 'coverage_error', 'dcg', 'decimals', 'enable_all', 'example_wise_f1', 'example_wise_jaccard', 'example_wise_precision', 'example_wise_recall', 'f1', 'hamming_accuracy', 'hamming_loss', 'jaccard', 'lrap', 'macro_f1', 'macro_jaccard', 'macro_precision', 'macro_recall', 'mean_absolute_error', 'mean_absolute_percentage_error', 'mean_squared_error', 'micro_f1', 'micro_jaccard', 'micro_precision', 'micro_recall', 'ndcg', 'percentage', 'precision', 'prediction_time', 'rank_loss', 'recall', 'subset_accuracy', 'subset_zero_one_loss', 'training_time', 'zero_one_loss'}}¶

STORE_LABEL_VECTORS_VALUES = {'false': {}, 'true': {'sparse'}}¶

STORE_PREDICTIONS_VALUES = {'false': {}, 'true': {'decimals'}}¶

STORE_PREDICTION_CHARACTERISTICS_VALUES = {'false': {}, 'true': {'decimals', 'distinct_label_vectors', 'label_cardinality', 'label_imbalance_ratio', 'output_density', 'output_sparsity', 'outputs', 'percentage'}}¶

configure_arguments(parser: ArgumentParser)¶

May be overridden by subclasses in order to configure the command line arguments of the program.

Parameters:: parser – An ArgumentParser that is used for parsing command line arguments
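As an illustration of the override pattern, the sketch below registers one additional argument. To stay self-contained it uses a stand-in base class instead of importing the testbed; `BaseRunnable`, `MyRunnable` and the flag `--my-option` are invented for this example.

```python
from abc import ABC
from argparse import ArgumentParser


class BaseRunnable(ABC):
    """Stand-in for the documented base class; only the hook is modeled."""

    def configure_arguments(self, parser: ArgumentParser):
        pass


class MyRunnable(BaseRunnable):
    """Hypothetical subclass that adds one command line argument."""

    def configure_arguments(self, parser: ArgumentParser):
        # Keep the arguments registered by the base class, then add our own.
        super().configure_arguments(parser)
        # '--my-option' is an invented flag, not part of the testbed.
        parser.add_argument('--my-option', type=int, default=1, help='An example option')


parser = ArgumentParser()
MyRunnable().configure_arguments(parser)
args = parser.parse_args(['--my-option', '3'])
print(args.my_option)
```

Calling the base implementation first is a design choice assumed here; it keeps arguments defined by parent classes available to the subclass.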

configure_problem_specific_arguments(parser: ArgumentParser, problem_type: ProblemType)¶

May be overridden by subclasses in order to configure the command line arguments of the program, depending on the type of machine learning problem to be solved.

Parameters:

parser – An ArgumentParser that is used for parsing command line arguments
problem_type – The type of the machine learning problem to be solved

abstractmethod create_classifier(args) → ClassifierMixin | None¶

Must be implemented by subclasses in order to create a machine learning algorithm that can be applied to classification problems.

Parameters:: args – The command line arguments
Returns:: The learner that has been created or None, if classification problems are not supported

abstractmethod create_regressor(args) → RegressorMixin | None¶

Must be implemented by subclasses in order to create a machine learning algorithm that can be applied to regression problems.

Parameters:: args – The command line arguments
Returns:: The learner that has been created or None, if regression problems are not supported
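The contract of the two factory methods can be sketched as follows, assuming a program that supports classification only. `LearnerFactory`, `DummyClassifier` and `ClassificationOnly` are stand-ins invented for this example; a real subclass would derive from LearnerRunnable and return scikit-learn compatible estimators.

```python
from abc import ABC, abstractmethod
from argparse import Namespace
from typing import Any, Optional


class LearnerFactory(ABC):
    """Stand-in that models only the two documented abstract methods."""

    @abstractmethod
    def create_classifier(self, args) -> Optional[Any]:
        ...

    @abstractmethod
    def create_regressor(self, args) -> Optional[Any]:
        ...


class DummyClassifier:
    """Placeholder for an estimator; not a real learning algorithm."""

    def __init__(self, random_state: int):
        self.random_state = random_state


class ClassificationOnly(LearnerFactory):
    """Hypothetical program that cannot solve regression problems."""

    def create_classifier(self, args):
        return DummyClassifier(random_state=args.random_state)

    def create_regressor(self, args):
        # Returning None signals that regression is not supported.
        return None


args = Namespace(random_state=1)
factory = ClassificationOnly()
classifier = factory.create_classifier(args)
regressor = factory.create_regressor(args)
```

The attribute `args.random_state` is assumed to be what argparse derives from the `--random-state` parameter listed above.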

class mlrl.testbed.runnables.LogLevel(*values)¶

Bases: Enum

Specifies all valid textual representations of log levels.

CRITICAL = 'critical'¶

DEBUG = 'debug'¶

ERROR = 'error'¶

FATAL = 'fatal'¶

INFO = 'info'¶

NOTSET = 'notset'¶

WARN = 'warn'¶

WARNING = 'warning'¶

static parse(text: str)¶

Parses a given text that represents a log level. If the given text does not represent a valid log level, a ValueError is raised.

Parameters:: text – The text to be parsed
Returns:: A log level, depending on the given text
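A minimal sketch of the described parsing behavior is given below. It is a stand-in, not the testbed's implementation: returning the numeric constants of Python's `logging` module and ignoring the case of the input are both assumptions.

```python
import logging
from enum import Enum


class LogLevel(Enum):
    """Stand-in enum with the members documented above."""
    DEBUG = 'debug'
    INFO = 'info'
    WARN = 'warn'
    WARNING = 'warning'
    ERROR = 'error'
    CRITICAL = 'critical'
    FATAL = 'fatal'
    NOTSET = 'notset'

    @staticmethod
    def parse(text: str) -> int:
        try:
            # Look up the member by its textual value (case-insensitive by assumption).
            level = LogLevel(text.lower())
        except ValueError:
            valid = ', '.join(member.value for member in LogLevel)
            raise ValueError(f'Invalid log level "{text}", expected one of: {valid}') from None
        # Assumption: map the member to the logging constant of the same name.
        return getattr(logging, level.name)


logging.basicConfig(level=LogLevel.parse('debug'))
```

Because `logging.WARN` and `logging.FATAL` are aliases of `logging.WARNING` and `logging.CRITICAL`, both spellings resolve to the same numeric level in this sketch.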

class mlrl.testbed.runnables.RuleLearnerRunnable(learner_name: str, classifier_type: type | None, classifier_config_type: type | None, classifier_parameters: Set[Parameter] | None, regressor_type: type | None, regressor_config_type: type | None, regressor_parameters: Set[Parameter] | None)¶

Bases: LearnerRunnable

A base class for all programs that perform an experiment that involves training and evaluation of a rule learner.

INCREMENTAL_EVALUATION_VALUES: Dict[str, Set[str]] = {'false': {}, 'true': {'max_size', 'min_size', 'step_size'}}¶

OPTION_MAX_SIZE = 'max_size'¶

OPTION_MIN_SIZE = 'min_size'¶

OPTION_STEP_SIZE = 'step_size'¶

PARAM_FEATURE_FORMAT = '--feature-format'¶

PARAM_INCREMENTAL_EVALUATION = '--incremental-evaluation'¶

PARAM_PRINT_JOINT_PROBABILITY_CALIBRATION_MODEL = '--print-joint-probability-calibration-model'¶

PARAM_PRINT_MARGINAL_PROBABILITY_CALIBRATION_MODEL = '--print-marginal-probability-calibration-model'¶

PARAM_PRINT_RULES = '--print-rules'¶

PARAM_SPARSE_FEATURE_VALUE = '--sparse-feature-value'¶

PARAM_STORE_JOINT_PROBABILITY_CALIBRATION_MODEL = '--store-joint-probability-calibration-model'¶

PARAM_STORE_MARGINAL_PROBABILITY_CALIBRATION_MODEL = '--store-marginal-probability-calibration-model'¶

PARAM_STORE_RULES = '--store-rules'¶

PRINT_JOINT_PROBABILITY_CALIBRATION_MODEL_VALUES: Dict[str, Set[str]] = {'false': {}, 'true': {'decimals'}}¶

PRINT_MARGINAL_PROBABILITY_CALIBRATION_MODEL_VALUES: Dict[str, Set[str]] = {'false': {}, 'true': {'decimals'}}¶

PRINT_RULES_VALUES: Dict[str, Set[str]] = {'false': {}, 'true': {'decimals_body', 'decimals_head', 'print_bodies', 'print_feature_names', 'print_heads', 'print_nominal_values', 'print_output_names'}}¶

STORE_JOINT_PROBABILITY_CALIBRATION_MODEL_VALUES = {'false': {}, 'true': {'decimals'}}¶

STORE_MARGINAL_PROBABILITY_CALIBRATION_MODEL_VALUES = {'false': {}, 'true': {'decimals'}}¶

STORE_RULES_VALUES = {'false': {}, 'true': {'decimals_body', 'decimals_head', 'print_bodies', 'print_feature_names', 'print_heads', 'print_nominal_values', 'print_output_names'}}¶

configure_problem_specific_arguments(parser: ArgumentParser, problem_type: ProblemType)¶

May be overridden by subclasses in order to configure the command line arguments of the program, depending on the type of machine learning problem to be solved.

Parameters:

parser – An ArgumentParser that is used for parsing command line arguments
problem_type – The type of the machine learning problem to be solved

create_classifier(args) → ClassifierMixin | None¶

Must be implemented by subclasses in order to create a machine learning algorithm that can be applied to classification problems.

Parameters:: args – The command line arguments
Returns:: The learner that has been created or None, if classification problems are not supported

create_regressor(args) → RegressorMixin | None¶

Must be implemented by subclasses in order to create a machine learning algorithm that can be applied to regression problems.

Parameters:: args – The command line arguments
Returns:: The learner that has been created or None, if regression problems are not supported

class mlrl.testbed.runnables.Runnable¶

Bases: ABC

A base class for all programs that can be configured via command line arguments.

class ProgramInfo(name: str, version: str, year: str | None = None, authors: ~typing.Set[str] = <factory>, python_packages: ~typing.List[~mlrl.common.info.PythonPackageInfo] = <factory>)¶

Bases: object

Provides information about a program.

Parameters:

name – A string that specifies the program name
version – A string that specifies the program version
year – A string that specifies the year when the program was released or None, if the year is not specified
authors – A set that contains the name of each author of the program
python_packages – A list that contains a PythonPackageInfo for each Python package that is used by the program

property all_python_packages: List[PythonPackageInfo]¶: A list that contains a PythonPackageInfo for each Python package that is used by the program, as well as for the testbed package.

authors: Set[str]¶

name: str¶

python_packages: List[PythonPackageInfo]¶

version: str¶

year: str | None = None¶
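The `<factory>` defaults in the signature suggest a dataclass whose collection fields use default factories. The sketch below mirrors the documented fields under that assumption. `PackageInfo` is a stand-in for PythonPackageInfo, whose real fields are not shown in this section, and the entry appended by `all_python_packages` is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class PackageInfo:
    """Stand-in for PythonPackageInfo; the single field is an assumption."""
    name: str


@dataclass
class ProgramInfo:
    """Stand-in that mirrors the documented fields."""
    name: str
    version: str
    year: Optional[str] = None
    # Default factories give every instance its own empty collection.
    authors: Set[str] = field(default_factory=set)
    python_packages: List[PackageInfo] = field(default_factory=list)

    @property
    def all_python_packages(self) -> List[PackageInfo]:
        # Assumption: the testbed package is appended to the program's own packages.
        return self.python_packages + [PackageInfo(name='testbed')]


info = ProgramInfo(name='my-program', version='1.0.0', authors={'Jane Doe'})
print([package.name for package in info.all_python_packages])
```

Using `field(default_factory=...)` rather than a literal `set()` or `[]` as the default avoids sharing one mutable object between instances.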

configure_arguments(parser: ArgumentParser)¶

May be overridden by subclasses in order to configure the command line arguments of the program.

Parameters:: parser – An ArgumentParser that is used for parsing command line arguments

configure_logger(args)¶

May be overridden by subclasses in order to configure the logger to be used by the program.

Parameters:: args – The command line arguments

get_program_info() → ProgramInfo | None¶

May be overridden by subclasses in order to provide information about the program to be printed via the command line argument ‘-v’ or ‘--version’.

Returns:: The Runnable.ProgramInfo that has been provided or None, if no information is provided

run(args)¶

Executes the runnable.

Parameters:: args – The command line arguments