mlrl.testbed.evaluation module¶
Author: Michael Rapp (michael.rapp.ml@gmail.com)
Provides classes for evaluating the predictions or rankings provided by a multi-label learner according to different measures. The evaluation results can be written to one or several outputs, e.g., to the console or to a file.
- class mlrl.testbed.evaluation.BinaryEvaluationWriter(sinks: List[Sink])¶
Bases: EvaluationWriter
Evaluates the quality of binary predictions provided by a single- or multi-label classifier according to commonly used bipartition measures.
- class mlrl.testbed.evaluation.EvaluationFunction(option: str, name: str, evaluation_function, percentage: bool = True, **kwargs)¶
Bases: Formatter
An evaluation function.
- class mlrl.testbed.evaluation.EvaluationWriter(sinks: List[Sink])¶
Bases: OutputWriter, ABC
An abstract base class for all classes that evaluate the predictions provided by a learner and allow the evaluation results to be written to one or several sinks.
- class CsvSink(output_dir: str, options: ~mlrl.common.options.Options = <mlrl.common.options.Options object>)¶
Bases: CsvSink
Allows evaluation results to be written to CSV files.
- write_output(meta_data: MetaData, data_split: DataSplit, data_type: DataType | None, prediction_scope: PredictionScope | None, output_data, **kwargs)¶
See mlrl.testbed.output_writer.OutputWriter.Sink.write_output()
- class EvaluationResult¶
Bases: Formattable, Tabularizable
Stores the evaluation results according to different measures.
- avg(measure: Formatter, **kwargs) Tuple[str, str] ¶
Returns the score and standard deviation according to a specific measure averaged over all available folds.
- Parameters:
measure – The measure
- Returns:
A tuple consisting of textual representations of the averaged score and standard deviation
- avg_dict(**kwargs) Dict[Formatter, str] ¶
Returns a dictionary that stores, for each measure, the score averaged across all folds, as well as the standard deviation.
- Returns:
A dictionary that stores textual representations of the averaged scores and standard deviations according to each measure
- dict(fold: int | None, **kwargs) Dict[Formatter, str] ¶
Returns a dictionary that stores the scores for a specific fold according to each measure.
- Parameters:
fold – The fold the scores correspond to, or None, if no cross validation is used
- Returns:
A dictionary that stores textual representations of the scores for the given fold according to each measure
- get(measure: Formatter, fold: int | None, **kwargs) str ¶
Returns the score according to a specific measure and fold.
- Parameters:
measure – The measure
fold – The fold the score corresponds to, or None, if no cross validation is used
- Returns:
A textual representation of the score
- put(measure: Formatter, score: float, num_folds: int, fold: int | None)¶
Adds a new score according to a specific measure to the evaluation result.
- Parameters:
measure – The measure
score – The score according to the measure
num_folds – The total number of cross validation folds
fold – The fold the score corresponds to, or None, if no cross validation is used
- KWARG_FOLD = 'fold'¶
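The per-fold bookkeeping that put(), get() and avg() describe can be sketched in plain Python. The class below is a hypothetical stand-in, not the real EvaluationResult: the actual class uses Formatter objects as keys and configurable formatting, whereas this sketch uses plain strings and assumes two decimal places for illustration.

```python
import statistics


class EvaluationResultSketch:
    """Hypothetical stand-in mimicking EvaluationResult's put/get/avg semantics."""

    def __init__(self):
        # Maps a measure name to a list of scores with one slot per fold
        self._scores = {}

    def put(self, measure: str, score: float, num_folds: int, fold):
        # fold may be None if no cross validation is used; store in slot 0 then
        scores = self._scores.setdefault(measure, [None] * max(num_folds, 1))
        scores[fold if fold is not None else 0] = score

    def get(self, measure: str, fold) -> str:
        # Textual representation of the score for a single fold
        score = self._scores[measure][fold if fold is not None else 0]
        return f'{score:.2f}'

    def avg(self, measure: str):
        # Average the available folds and report mean and standard deviation
        scores = [s for s in self._scores[measure] if s is not None]
        mean = statistics.mean(scores)
        std = statistics.stdev(scores) if len(scores) > 1 else 0.0
        return f'{mean:.2f}', f'{std:.2f}'
```

For example, after storing the scores 80.0, 82.0 and 84.0 for three folds, avg() would report a mean of 82.00 with a standard deviation of 2.00.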
- class LogSink(options: ~mlrl.common.options.Options = <mlrl.common.options.Options object>)¶
Bases: LogSink
Allows evaluation results to be written to the console.
- write_output(meta_data: MetaData, data_split: DataSplit, data_type: DataType | None, prediction_scope: PredictionScope | None, output_data, **kwargs)¶
See mlrl.testbed.output_writer.OutputWriter.Sink.write_output()
- class mlrl.testbed.evaluation.ProbabilityEvaluationWriter(sinks: List[Sink])¶
Bases: ScoreEvaluationWriter
Evaluates the quality of probability estimates provided by a single- or multi-label classifier according to commonly used regression and ranking measures.
- class mlrl.testbed.evaluation.ScoreEvaluationWriter(sinks: List[Sink])¶
Bases: EvaluationWriter
Evaluates the quality of regression scores provided by a single- or multi-output regressor according to commonly used regression and ranking measures.
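Taken together, the classes above follow a simple dispatch pattern: an evaluation writer is constructed with a list of sinks and forwards each set of results to every one of them, e.g., to the console and to a CSV file at once. The following is a minimal sketch of that pattern using hypothetical stand-in classes; the names mirror the documentation above but are not the real mlrl.testbed API, and the output formats are assumptions.

```python
import csv
import io


class LogSinkSketch:
    """Stand-in for a console sink; collects lines instead of printing them."""

    def __init__(self):
        self.lines = []

    def write_output(self, output_data):
        for measure, score in output_data.items():
            self.lines.append(f'{measure}: {score:.2f}')


class CsvSinkSketch:
    """Stand-in for a CSV sink; writes to an in-memory buffer instead of a file."""

    def __init__(self):
        self.buffer = io.StringIO()

    def write_output(self, output_data):
        writer = csv.writer(self.buffer)
        writer.writerow(output_data.keys())                     # header: measure names
        writer.writerow(f'{v:.2f}' for v in output_data.values())  # one row of scores


class EvaluationWriterSketch:
    """Stand-in for an evaluation writer that dispatches to all of its sinks."""

    def __init__(self, sinks):
        self.sinks = sinks

    def write(self, output_data):
        for sink in self.sinks:
            sink.write_output(output_data)
```

With this sketch, a single call such as `EvaluationWriterSketch([log_sink, csv_sink]).write({'Hamming loss': 4.83})` produces both a console line and a CSV row, which is the benefit of separating evaluation from its output sinks.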