mlrl.testbed.experiments module¶
Author: Michael Rapp (michael.rapp.ml@gmail.com)
Provides classes for performing experiments.
- class mlrl.testbed.experiments.Evaluation(prediction_type: PredictionType, output_writers: List[OutputWriter])¶
Bases:
ABC
An abstract base class for all classes that evaluate predictions obtained from a previously trained model.
- abstract predict_and_evaluate(meta_data: MetaData, data_split: DataSplit, data_type: DataType, train_time: float, learner, x, y)¶
Must be implemented by subclasses in order to obtain and evaluate predictions for given query examples from a previously trained model.
- Parameters:
meta_data – The meta-data of the data set
data_split – The split of the available data to which the predictions and ground truth labels correspond
data_type – Specifies whether the predictions and ground truth labels correspond to the training or test data
train_time – The time needed to train the model
learner – The learner from which the predictions should be obtained
x – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_features), that stores the feature values of the query examples
y – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_labels), that stores the ground truth labels of the query examples
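The contract of predict_and_evaluate can be illustrated with a minimal, self-contained sketch. Note that the class and helper names below (SketchEvaluation, AccuracyEvaluation, ConstantLearner) are illustrative stand-ins, not part of mlrl.testbed:

```python
from abc import ABC, abstractmethod
from timeit import default_timer as timer


class SketchEvaluation(ABC):
    """Illustrative stand-in for the Evaluation contract."""

    @abstractmethod
    def predict_and_evaluate(self, learner, x, y):
        """Obtain predictions for x from the learner and compare them to y."""


class AccuracyEvaluation(SketchEvaluation):
    """Obtains predictions from a trained learner and reports a simple accuracy score."""

    def predict_and_evaluate(self, learner, x, y):
        start = timer()
        predictions = learner.predict(x)  # query the previously trained model
        predict_time = timer() - start
        correct = sum(p == t for p, t in zip(predictions, y))
        return correct / len(y), predict_time


class ConstantLearner:
    """Trivial 'trained model' that always predicts 1."""

    def predict(self, x):
        return [1] * len(x)


score, _ = AccuracyEvaluation().predict_and_evaluate(
    ConstantLearner(), x=[[0.1], [0.2], [0.3]], y=[1, 0, 1])
```

In the real library, the results would be forwarded to the configured OutputWriter instances instead of being returned directly.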
- class mlrl.testbed.experiments.Experiment(base_learner: BaseEstimator, learner_name: str, data_splitter: DataSplitter, pre_training_output_writers: List[OutputWriter], post_training_output_writers: List[OutputWriter], pre_execution_hook: ExecutionHook | None = None, train_evaluation: Evaluation | None = None, test_evaluation: Evaluation | None = None, parameter_input: ParameterInput | None = None, persistence: ModelPersistence | None = None)¶
Bases:
Callback
An experiment that trains and evaluates a single multi-label classifier or ranker on a specific data set using cross validation or separate training and test sets.
- class ExecutionHook¶
Bases:
ABC
An abstract base class for all operations that may be executed before or after an experiment.
- abstract execute()¶
Must be overridden by subclasses in order to execute the operation.
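The hook pattern can be sketched in a self-contained way. The concrete subclass below (LogStartHook) is an illustrative example, not part of mlrl.testbed; only the execute method mirrors the interface above:

```python
from abc import ABC, abstractmethod


class ExecutionHook(ABC):
    """An operation that may be executed before or after an experiment."""

    @abstractmethod
    def execute(self):
        """Must be overridden by subclasses in order to execute the operation."""


class LogStartHook(ExecutionHook):
    """Illustrative hook that records that an experiment has started."""

    def __init__(self):
        self.messages = []

    def execute(self):
        self.messages.append('Starting experiment...')


# A runner would invoke the hook before training; the hook is optional (may be None)
hook = LogStartHook()
if hook is not None:
    hook.execute()
```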
- run()¶
Runs the experiment.
- train_and_evaluate(meta_data: MetaData, data_split: DataSplit, train_x, train_y, test_x, test_y)¶
Trains a model on a training set and evaluates it on a test set.
- Parameters:
meta_data – The meta-data of the training data set
data_split – Information about the split of the available data that should be used for training and evaluating the model
train_x – The feature matrix of the training examples
train_y – The label matrix of the training examples
test_x – The feature matrix of the test examples
test_y – The label matrix of the test examples
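The train/evaluate flow performed by this method can be sketched with self-contained placeholders. The learner and metric below (MajorityLearner, accuracy) are illustrative and do not reflect the mlrl API:

```python
from collections import Counter
from timeit import default_timer as timer


class MajorityLearner:
    """Placeholder learner that always predicts the most frequent training label."""

    def fit(self, train_x, train_y):
        self.majority_ = Counter(train_y).most_common(1)[0][0]
        return self

    def predict(self, test_x):
        return [self.majority_] * len(test_x)


def train_and_evaluate(learner, train_x, train_y, test_x, test_y):
    start = timer()
    learner.fit(train_x, train_y)          # train a model on the training set
    train_time = timer() - start           # the time needed to train the model
    predictions = learner.predict(test_x)  # evaluate it on the test set
    accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
    return accuracy, train_time


accuracy, _ = train_and_evaluate(
    MajorityLearner(), train_x=[[0], [1], [2]], train_y=[0, 0, 1],
    test_x=[[3], [4]], test_y=[0, 1])
```

In the real Experiment, the measured training time is passed on to the configured Evaluation instances via the train_time parameter.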
- class mlrl.testbed.experiments.GlobalEvaluation(prediction_type: PredictionType, output_writers: List[OutputWriter])¶
Bases:
Evaluation
Obtains and evaluates predictions from a previously trained global model.
- predict_and_evaluate(meta_data: MetaData, data_split: DataSplit, data_type: DataType, train_time: float, learner, x, y)¶
Obtains and evaluates predictions for given query examples from a previously trained global model. See Evaluation.predict_and_evaluate.
- Parameters:
meta_data – The meta-data of the data set
data_split – The split of the available data to which the predictions and ground truth labels correspond
data_type – Specifies whether the predictions and ground truth labels correspond to the training or test data
train_time – The time needed to train the model
learner – The learner from which the predictions should be obtained
x – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_features), that stores the feature values of the query examples
y – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_labels), that stores the ground truth labels of the query examples
- class mlrl.testbed.experiments.IncrementalEvaluation(prediction_type: PredictionType, output_writers: List[OutputWriter], min_size: int, max_size: int, step_size: int)¶
Bases:
Evaluation
Repeatedly obtains and evaluates predictions from a previously trained ensemble model, e.g., a model consisting of several rules, using only subsets of the ensemble members of increasing size.
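The incremental scheme can be illustrated with a self-contained sketch: predictions are evaluated using the first min_size ensemble members, then min_size + step_size members, and so on up to max_size. All names below are illustrative, and the toy "rules" simply vote 0 or 1 per example; this is not the mlrl API:

```python
def evaluate_incrementally(ensemble, x, y, min_size, max_size, step_size):
    """Evaluate predictions of growing ensemble prefixes; returns {size: accuracy}."""
    results = {}

    for size in range(min_size, max_size + 1, step_size):
        members = ensemble[:size]  # only use the first `size` ensemble members
        # Each member votes; predict 1 if the average vote exceeds 0.5
        predictions = [
            int(sum(rule(example) for rule in members) / size > 0.5) for example in x
        ]
        results[size] = sum(p == t for p, t in zip(predictions, y)) / len(y)

    return results


# Toy ensemble of rules, each mapping an example (a number) to a 0/1 vote
ensemble = [
    lambda e: int(e > 1),
    lambda e: int(e > 2),
    lambda e: int(e > 3),
    lambda e: 1,
]
results = evaluate_incrementally(
    ensemble, x=[0, 2, 4], y=[0, 1, 1], min_size=2, max_size=4, step_size=2)
```

This mirrors the idea that a rule ensemble can be evaluated at several sizes without retraining, since a prefix of the members already constitutes a valid (smaller) model.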
- predict_and_evaluate(meta_data: MetaData, data_split: DataSplit, data_type: DataType, train_time: float, learner, x, y)¶
Repeatedly obtains and evaluates predictions for given query examples from a previously trained ensemble model, using subsets of the ensemble members of increasing size. See Evaluation.predict_and_evaluate.
- Parameters:
meta_data – The meta-data of the data set
data_split – The split of the available data to which the predictions and ground truth labels correspond
data_type – Specifies whether the predictions and ground truth labels correspond to the training or test data
train_time – The time needed to train the model
learner – The learner from which the predictions should be obtained
x – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_features), that stores the feature values of the query examples
y – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_labels), that stores the ground truth labels of the query examples