mlrl.common.cython.learner_classification module

@author: Michael Rapp (michael.rapp.ml@gmail.com)

class mlrl.common.cython.learner_classification.ClassificationRuleLearner

Bases: object

A rule learner that can be applied to classification problems.

can_predict_binary(feature_matrix, num_labels) bool

Returns whether the rule learner is able to predict binary labels or not.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • num_labels – The number of labels to predict for

Returns:

True, if the rule learner is able to predict binary labels, False otherwise

can_predict_probabilities(feature_matrix, num_labels) bool

Returns whether the rule learner is able to predict probability estimates or not.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • num_labels – The number of labels to predict for

Returns:

True, if the rule learner is able to predict probability estimates, False otherwise

can_predict_scores(feature_matrix, num_labels) bool

Returns whether the rule learner is able to predict scores or not.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • num_labels – The number of labels to predict for

Returns:

True, if the rule learner is able to predict scores, False otherwise
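
The three can_predict_* methods share the same signature, so a caller can query them together before creating any predictor. The following is a minimal sketch of that pattern. StubLearner and supported_prediction_types are hypothetical names introduced for illustration only; they are not part of mlrl, and the stub merely mimics the documented method signatures.

```python
from typing import Any, List


def supported_prediction_types(learner: Any, feature_matrix: Any, num_labels: int) -> List[str]:
    """Return the names of the prediction types the learner reports as supported."""
    checks = {
        'binary': learner.can_predict_binary,
        'probabilities': learner.can_predict_probabilities,
        'scores': learner.can_predict_scores,
    }
    return [name for name, check in checks.items() if check(feature_matrix, num_labels)]


class StubLearner:
    """Hypothetical stand-in for a ClassificationRuleLearner (not part of mlrl)."""

    def can_predict_binary(self, feature_matrix, num_labels) -> bool:
        return True

    def can_predict_probabilities(self, feature_matrix, num_labels) -> bool:
        return False

    def can_predict_scores(self, feature_matrix, num_labels) -> bool:
        return True


# The stub ignores the feature matrix, so None is passed as a placeholder.
supported = supported_prediction_types(StubLearner(), feature_matrix=None, num_labels=3)
```

With a real learner, feature_matrix would be a RowWiseFeatureMatrix holding the query examples.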

create_binary_predictor(feature_matrix, rule_model, output_space_info, marginal_probability_calibration_model, joint_probability_calibration_model, num_labels) BinaryPredictor

Creates and returns a predictor that may be used to predict binary labels for given query examples. If the prediction of binary labels is not supported by the rule learner, a RuntimeError is raised.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • rule_model – The RuleModel that should be used to obtain predictions

  • output_space_info – The OutputSpaceInfo that provides information about the output space that may be used as a basis for obtaining predictions

  • marginal_probability_calibration_model – The MarginalProbabilityCalibrationModel that may be used for the calibration of marginal probabilities

  • joint_probability_calibration_model – The JointProbabilityCalibrationModel that may be used for the calibration of joint probabilities

  • num_labels – The number of labels to predict for

Returns:

A BinaryPredictor that may be used to predict binary labels for the given query examples
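
Because creating a predictor fails with a RuntimeError when the prediction type is unsupported, callers may want to combine the capability check with exception handling. The sketch below shows one way to do so. The helper try_create_binary_predictor and both stub classes are hypothetical and only imitate the documented signatures; the same pattern would apply to the other create_*_predictor methods.

```python
from typing import Any, Optional


def try_create_binary_predictor(learner: Any, feature_matrix: Any, rule_model: Any, output_space_info: Any,
                                marginal_model: Any, joint_model: Any, num_labels: int) -> Optional[Any]:
    """Return a binary predictor, or None if the learner cannot predict binary labels."""
    # Ask first, so that the common unsupported case does not rely on exceptions.
    if not learner.can_predict_binary(feature_matrix, num_labels):
        return None
    try:
        return learner.create_binary_predictor(feature_matrix, rule_model, output_space_info, marginal_model,
                                               joint_model, num_labels)
    except RuntimeError:
        # Documented failure mode when binary prediction is not supported.
        return None


class UnsupportedStubLearner:
    """Hypothetical stand-in whose predictor creation always fails (not part of mlrl)."""

    def can_predict_binary(self, feature_matrix, num_labels) -> bool:
        return True

    def create_binary_predictor(self, *args):
        raise RuntimeError('binary prediction is not supported')


class SupportedStubLearner(UnsupportedStubLearner):
    """Hypothetical stand-in that returns a placeholder instead of a BinaryPredictor."""

    def create_binary_predictor(self, *args):
        return 'binary-predictor'


unsupported_result = try_create_binary_predictor(UnsupportedStubLearner(), None, None, None, None, None, 3)
supported_result = try_create_binary_predictor(SupportedStubLearner(), None, None, None, None, None, 3)
```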

create_probability_predictor(feature_matrix, rule_model, output_space_info, marginal_probability_calibration_model, joint_probability_calibration_model, num_labels) ProbabilityPredictor

Creates and returns a predictor that may be used to predict probability estimates for given query examples. If the prediction of probability estimates is not supported by the rule learner, a RuntimeError is raised.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • rule_model – The RuleModel that should be used to obtain predictions

  • output_space_info – The OutputSpaceInfo that provides information about the output space that may be used as a basis for obtaining predictions

  • marginal_probability_calibration_model – The MarginalProbabilityCalibrationModel that may be used for the calibration of marginal probabilities

  • joint_probability_calibration_model – The JointProbabilityCalibrationModel that may be used for the calibration of joint probabilities

  • num_labels – The number of labels to predict for

Returns:

A ProbabilityPredictor that may be used to predict probability estimates for the given query examples

create_score_predictor(feature_matrix, rule_model, output_space_info, num_labels) ScorePredictor

Creates and returns a predictor that may be used to predict scores for given query examples. If the prediction of scores is not supported by the rule learner, a RuntimeError is raised.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • rule_model – The RuleModel that should be used to obtain predictions

  • output_space_info – The OutputSpaceInfo that provides information about the output space that may be used as a basis for obtaining predictions

  • num_labels – The number of labels to predict for

Returns:

A ScorePredictor that may be used to predict scores for the given query examples

create_sparse_binary_predictor(feature_matrix, rule_model, output_space_info, marginal_probability_calibration_model, joint_probability_calibration_model, num_labels) SparseBinaryPredictor

Creates and returns a predictor that may be used to predict sparse binary labels for given query examples. If the prediction of sparse binary labels is not supported by the rule learner, a RuntimeError is raised.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • rule_model – The RuleModel that should be used to obtain predictions

  • output_space_info – The OutputSpaceInfo that provides information about the output space that may be used as a basis for obtaining predictions

  • marginal_probability_calibration_model – The MarginalProbabilityCalibrationModel that may be used for the calibration of marginal probabilities

  • joint_probability_calibration_model – The JointProbabilityCalibrationModel that may be used for the calibration of joint probabilities

  • num_labels – The number of labels to predict for

Returns:

A SparseBinaryPredictor that may be used to predict sparse binary labels for the given query examples

fit(example_weights, feature_info, feature_matrix, label_matrix) TrainingResult

Applies the rule learner to given training examples and corresponding ground truth labels.

Parameters:
  • example_weights – ExampleWeights that provide access to the weights of individual training examples

  • feature_info – A FeatureInfo that provides information about the types of individual features

  • feature_matrix – A ColumnWiseFeatureMatrix that provides column-wise access to the feature values of the training examples

  • label_matrix – A RowWiseLabelMatrix that provides row-wise access to the ground truth labels of the training examples

Returns:

The TrainingResult that provides access to the result of fitting the rule learner to the training data
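
Note that fit expects column-wise access to the feature values, whereas the prediction methods expect row-wise access. The difference between the two access patterns can be illustrated with plain Python lists. This is only a conceptual sketch: it does not use the library's ColumnWiseFeatureMatrix or RowWiseFeatureMatrix classes, and the data is made up.

```python
from typing import List

# Four training examples (rows) described by three features (columns).
rows: List[List[float]] = [
    [0.1, 1.0, 5.0],
    [0.2, 0.0, 3.0],
    [0.3, 1.0, 4.0],
    [0.4, 0.0, 2.0],
]

# Column-wise view of the same data: one list per feature.
columns: List[List[float]] = [list(column) for column in zip(*rows)]

# Row-wise access yields all feature values of a single example.
first_example = rows[0]

# Column-wise access yields the values of a single feature across all examples.
third_feature = columns[2]
```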

class mlrl.common.cython.learner_classification.ExampleWiseStratifiedBiPartitionSamplingMixin

Bases: ABC

Allows configuring a rule learner to partition the available training examples into a training set and a holdout set using stratification, where distinct label vectors are treated as individual classes.

abstractmethod use_example_wise_stratified_bi_partition_sampling() ExampleWiseStratifiedBiPartitionSamplingConfig

Configures the rule learner to partition the available training examples into a training set and a holdout set using stratification, where distinct label vectors are treated as individual classes.

Returns:

An ExampleWiseStratifiedBiPartitionSamplingConfig that allows further configuration of the method for partitioning the available training examples into a training and a holdout set
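
The idea of treating distinct label vectors as individual classes can be sketched in plain Python. The function below is an illustration of the concept only, not the algorithm implemented by the library; the name example_wise_stratified_split, the holdout_fraction parameter, and the seed are assumptions made for this example.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def example_wise_stratified_split(label_vectors: Sequence[Sequence[int]], holdout_fraction: float = 0.25,
                                  seed: int = 1) -> Tuple[List[int], List[int]]:
    """Split example indices into a training set and a holdout set, stratified by label vector."""
    rng = random.Random(seed)

    # Each distinct label vector forms its own stratum.
    strata: Dict[Tuple[int, ...], List[int]] = defaultdict(list)
    for index, labels in enumerate(label_vectors):
        strata[tuple(labels)].append(index)

    training: List[int] = []
    holdout: List[int] = []
    for indices in strata.values():
        rng.shuffle(indices)
        # Take roughly the same fraction of every stratum for the holdout set.
        num_holdout = round(len(indices) * holdout_fraction)
        holdout.extend(indices[:num_holdout])
        training.extend(indices[num_holdout:])

    return sorted(training), sorted(holdout)


# Eight examples with two distinct label vectors, four examples each.
label_vectors = [(1, 0)] * 4 + [(0, 1)] * 4
training_indices, holdout_indices = example_wise_stratified_split(label_vectors)
```

In this example, each of the two strata contributes exactly one example to the holdout set, so both label vectors remain represented in both partitions.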

class mlrl.common.cython.learner_classification.ExampleWiseStratifiedInstanceSamplingMixin

Bases: ABC

Allows configuring a rule learner to use example-wise stratified instance sampling.

abstractmethod use_example_wise_stratified_instance_sampling() ExampleWiseStratifiedInstanceSamplingConfig

Configures the rule learner to sample from the available training examples using stratification, where distinct label vectors are treated as individual classes, whenever a new rule should be learned.

Returns:

An ExampleWiseStratifiedInstanceSamplingConfig that allows further configuration of the method for sampling instances

class mlrl.common.cython.learner_classification.OutputWiseStratifiedBiPartitionSamplingMixin

Bases: ABC

Allows configuring a rule learner to partition the available training examples into a training set and a holdout set using stratification, such that for each label the proportion of relevant and irrelevant examples is maintained.

abstractmethod use_output_wise_stratified_bi_partition_sampling() OutputWiseStratifiedBiPartitionSamplingConfig

Configures the rule learner to partition the available training examples into a training set and a holdout set using stratification, such that for each label the proportion of relevant and irrelevant examples is maintained.

Returns:

An OutputWiseStratifiedBiPartitionSamplingConfig that allows further configuration of the method for partitioning the available training examples into a training and a holdout set
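
The property that output-wise stratification aims to maintain can be checked with a few lines of Python. The sketch below does not perform the stratification itself and does not use any library class; relevant_proportions is a hypothetical helper, and the label matrix and the two index sets are made-up data chosen so that the proportions match.

```python
from typing import List, Sequence


def relevant_proportions(label_matrix: Sequence[Sequence[int]], indices: Sequence[int]) -> List[float]:
    """Return, for each label, the fraction of the given examples for which the label is relevant."""
    num_labels = len(label_matrix[0])
    return [sum(label_matrix[i][j] for i in indices) / len(indices) for j in range(num_labels)]


# Eight examples with two labels each (1 = relevant, 0 = irrelevant).
label_matrix = [
    [1, 0], [1, 0],
    [0, 1], [0, 1],
    [1, 1], [1, 1],
    [0, 0], [0, 0],
]

# A hand-picked partition into a training set and a holdout set.
training_indices = [0, 2, 4, 6]
holdout_indices = [1, 3, 5, 7]

overall = relevant_proportions(label_matrix, range(len(label_matrix)))
in_training = relevant_proportions(label_matrix, training_indices)
in_holdout = relevant_proportions(label_matrix, holdout_indices)
```

Here every label is relevant for half of the examples overall, and the same holds within each partition, which is the situation the stratification is meant to preserve.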

class mlrl.common.cython.learner_classification.OutputWiseStratifiedInstanceSamplingMixin

Bases: ABC

Allows configuring a rule learner to use output-wise stratified instance sampling.

abstractmethod use_output_wise_stratified_instance_sampling() OutputWiseStratifiedInstanceSamplingConfig

Configures the rule learner to sample from the available training examples using stratification, such that for each label the proportion of relevant and irrelevant examples is maintained, whenever a new rule should be learned.

Returns:

An OutputWiseStratifiedInstanceSamplingConfig that allows further configuration of the method for sampling instances