mlrl.common.cython.learner module

@author: Michael Rapp (michael.rapp.ml@gmail.com)

class mlrl.common.cython.learner.BeamSearchTopDownRuleInductionMixin

Bases: ABC

Allows to configure a rule learner to use a top-down beam search.

abstract use_beam_search_top_down_rule_induction() BeamSearchTopDownRuleInductionConfig

Configures the algorithm to use a top-down beam search for the induction of individual rules.

Returns:

A BeamSearchTopDownRuleInductionConfig that allows further configuration of the algorithm for the induction of individual rules
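
The mixin pattern used throughout this module can be sketched with stand-in classes. Only the method name use_beam_search_top_down_rule_induction comes from the documentation above; the stub classes, the set_beam_width setter and the default beam width are assumptions made for illustration and should not be read as the library's API.

```python
from abc import ABC, abstractmethod


class StubBeamSearchConfig:
    """Hypothetical stand-in for BeamSearchTopDownRuleInductionConfig."""

    def __init__(self):
        self.beam_width = 4  # assumed default, for illustration only

    def set_beam_width(self, beam_width: int) -> "StubBeamSearchConfig":
        # Hypothetical setter; returns self so that calls can be chained.
        self.beam_width = beam_width
        return self


class StubBeamSearchMixin(ABC):
    """Mirrors the shape of BeamSearchTopDownRuleInductionMixin."""

    @abstractmethod
    def use_beam_search_top_down_rule_induction(self) -> StubBeamSearchConfig:
        ...


class StubLearnerConfig(StubBeamSearchMixin):
    """Hypothetical concrete configuration that implements the mixin."""

    def __init__(self):
        self.rule_induction = None

    def use_beam_search_top_down_rule_induction(self) -> StubBeamSearchConfig:
        # Select beam search and hand back its config for further tuning.
        self.rule_induction = StubBeamSearchConfig()
        return self.rule_induction


config = StubLearnerConfig()
config.use_beam_search_top_down_rule_induction().set_beam_width(8)
print(config.rule_induction.beam_width)  # 8
```

Because the method is abstract, a class that inherits the mixin without implementing it cannot be instantiated.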

class mlrl.common.cython.learner.DefaultRuleMixin

Bases: ABC

Allows to configure a rule learner to induce a default rule.

abstract use_default_rule()

Configures the rule learner to induce a default rule.

class mlrl.common.cython.learner.EqualFrequencyFeatureBinningMixin

Bases: ABC

Allows to configure a rule learner to use equal-frequency feature binning.

abstract use_equal_frequency_feature_binning() EqualFrequencyFeatureBinningConfig

Configures the rule learner to use a method for the assignment of numerical feature values to bins, such that each bin contains approximately the same number of values.

Returns:

An EqualFrequencyFeatureBinningConfig that allows further configuration of the method for the assignment of numerical feature values to bins

class mlrl.common.cython.learner.EqualWidthFeatureBinningMixin

Bases: ABC

Allows to configure a rule learner to use equal-width feature binning.

abstract use_equal_width_feature_binning() EqualWidthFeatureBinningConfig

Configures the rule learner to use a method for the assignment of numerical feature values to bins, such that each bin contains values from equally sized value ranges.

Returns:

An EqualWidthFeatureBinningConfig that allows further configuration of the method for the assignment of numerical feature values to bins
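
The difference between the two binning strategies can be shown on a handful of values. The following is a minimal pure-Python sketch, not the library's implementation; the function names and the sample values are made up, and the equal-frequency variant ignores ties for brevity.

```python
def equal_width_bins(values, num_bins):
    """Assigns each value to one of `num_bins` equally wide value ranges."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    if width == 0:
        return [0 for _ in values]
    # The maximum value would land in bin `num_bins`, so it is clamped.
    return [min(int((v - lo) / width), num_bins - 1) for v in values]


def equal_frequency_bins(values, num_bins):
    """Assigns values to bins that hold roughly the same number of values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, index in enumerate(order):
        bins[index] = rank * num_bins // len(values)
    return bins


values = [1.0, 2.0, 3.0, 4.0, 50.0, 100.0]
print(equal_width_bins(values, 3))      # [0, 0, 0, 0, 1, 2]
print(equal_frequency_bins(values, 3))  # [0, 0, 1, 1, 2, 2]
```

With skewed data, equal-width binning puts most values into the first bin, whereas equal-frequency binning spreads them evenly.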

class mlrl.common.cython.learner.ExampleWiseStratifiedBiPartitionSamplingMixin

Bases: ABC

Allows to configure a rule learner to partition the available training examples into a training set and a holdout set using stratification, where distinct label vectors are treated as individual classes.

abstract use_example_wise_stratified_bi_partition_sampling() ExampleWiseStratifiedBiPartitionSamplingConfig

Configures the rule learner to partition the available training examples into a training set and a holdout set using stratification, where distinct label vectors are treated as individual classes.

Returns:

An ExampleWiseStratifiedBiPartitionSamplingConfig that allows further configuration of the method for partitioning the available training examples into a training and a holdout set
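
Treating distinct label vectors as individual classes can be sketched as follows. This is an illustrative approximation only: the function name, the holdout_fraction parameter and the rounding rule are assumptions, not taken from the library.

```python
import random
from collections import defaultdict


def example_wise_stratified_split(label_vectors, holdout_fraction, seed):
    """Splits example indices so that each distinct label vector is
    represented in the holdout set in roughly the same proportion."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for index, labels in enumerate(label_vectors):
        # Each distinct label vector acts as its own class.
        groups[tuple(labels)].append(index)

    training, holdout = [], []
    for indices in groups.values():
        rng.shuffle(indices)
        num_holdout = round(len(indices) * holdout_fraction)
        holdout.extend(indices[:num_holdout])
        training.extend(indices[num_holdout:])
    return sorted(training), sorted(holdout)


label_vectors = [(1, 0)] * 4 + [(0, 1)] * 4 + [(1, 1)] * 2
training, holdout = example_wise_stratified_split(label_vectors, 0.5, seed=1)
print(len(training), len(holdout))  # 5 5
```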

class mlrl.common.cython.learner.ExampleWiseStratifiedInstanceSamplingMixin

Bases: ABC

Allows to configure a rule learner to use example-wise stratified instance sampling.

abstract use_example_wise_stratified_instance_sampling() ExampleWiseStratifiedInstanceSamplingConfig

Configures the rule learner to sample from the available training examples using stratification, where distinct label vectors are treated as individual classes, whenever a new rule should be learned.

Returns:

An ExampleWiseStratifiedInstanceSamplingConfig that allows further configuration of the method for sampling instances

class mlrl.common.cython.learner.FeatureSamplingWithoutReplacementMixin

Bases: ABC

Allows to configure a rule learner to use feature sampling without replacement.

abstract use_feature_sampling_without_replacement() FeatureSamplingWithoutReplacementConfig

Configures the rule learner to sample from the available features without replacement whenever a rule should be refined.

Returns:

A FeatureSamplingWithoutReplacementConfig that allows further configuration of the method for sampling features

class mlrl.common.cython.learner.GreedyTopDownRuleInductionMixin

Bases: ABC

Allows to configure a rule learner to use a greedy top-down search for the induction of individual rules.

abstract use_greedy_top_down_rule_induction() GreedyTopDownRuleInductionConfig

Configures the algorithm to use a greedy top-down search for the induction of individual rules.

Returns:

A GreedyTopDownRuleInductionConfig that allows further configuration of the algorithm for the induction of individual rules

class mlrl.common.cython.learner.InstanceSamplingWithReplacementMixin

Bases: ABC

Defines an interface for all classes that allow to configure a rule learner to use instance sampling with replacement.

abstract use_instance_sampling_with_replacement() InstanceSamplingWithReplacementConfig

Configures the rule learner to sample from the available training examples with replacement whenever a new rule should be learned.

Returns:

An InstanceSamplingWithReplacementConfig that allows further configuration of the method for sampling instances

class mlrl.common.cython.learner.InstanceSamplingWithoutReplacementMixin

Bases: ABC

Defines an interface for all classes that allow to configure a rule learner to use instance sampling without replacement.

abstract use_instance_sampling_without_replacement() InstanceSamplingWithoutReplacementConfig

Configures the rule learner to sample from the available training examples without replacement whenever a new rule should be learned.

Returns:

An InstanceSamplingWithoutReplacementConfig that allows further configuration of the method for sampling instances
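
The two instance sampling strategies differ only in whether an example may be drawn more than once. A small sketch using Python's standard library, with arbitrary sample sizes chosen for illustration:

```python
import random

rng = random.Random(42)
example_indices = list(range(10))

# With replacement: the same training example may be drawn more than once.
with_replacement = rng.choices(example_indices, k=10)

# Without replacement: every drawn training example is distinct.
without_replacement = rng.sample(example_indices, k=6)

print(sorted(with_replacement))
print(sorted(without_replacement))
```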

class mlrl.common.cython.learner.IrepRulePruningMixin

Bases: ABC

Allows to configure a rule learner to prune individual rules by following the principles of “incremental reduced error pruning” (IREP).

abstract use_irep_rule_pruning()

Configures the rule learner to prune individual rules by following the principles of “incremental reduced error pruning” (IREP).
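
The general idea behind IREP can be sketched as follows: a rule is grown on one part of the data, then trailing conditions are removed as long as the rule's quality on a separate pruning set does not get worse. The sketch is a simplification under stated assumptions: precision as the quality measure, the data layout and all function names are hypothetical and do not describe the library's implementation.

```python
def covers(conditions, features):
    """A rule covers an example if all of its conditions are satisfied."""
    return all(condition(features) for condition in conditions)


def precision(conditions, examples):
    """Fraction of covered examples that are positive (0 if none are covered)."""
    covered = [label for features, label in examples if covers(conditions, features)]
    return sum(covered) / len(covered) if covered else 0.0


def prune_rule(conditions, pruning_set):
    """Drops trailing conditions while quality on the pruning set does not decrease."""
    best = list(conditions)
    best_quality = precision(best, pruning_set)
    for length in range(len(conditions) - 1, 0, -1):
        candidate = conditions[:length]
        quality = precision(candidate, pruning_set)
        if quality >= best_quality:
            best, best_quality = candidate, quality
    return best


# A rule with two conditions; the second one is unnecessarily specific.
rule = [lambda x: x[0] > 1, lambda x: x[1] > 5]
pruning_set = [((2, 6), 1), ((3, 1), 1), ((0, 9), 0), ((2, 7), 1)]
pruned = prune_rule(rule, pruning_set)
print(len(pruned))  # 1
```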

class mlrl.common.cython.learner.LabelSamplingWithoutReplacementMixin

Bases: ABC

Allows to configure a rule learner to use label sampling without replacement.

abstract use_label_sampling_without_replacement() LabelSamplingWithoutReplacementConfig

Configures the rule learner to sample from the available labels without replacement whenever a new rule should be learned.

Returns:

A LabelSamplingWithoutReplacementConfig that allows further configuration of the method for sampling labels

class mlrl.common.cython.learner.LabelWiseStratifiedBiPartitionSamplingMixin

Bases: ABC

Allows to configure a rule learner to partition the available training examples into a training set and a holdout set using stratification, such that for each label the proportion of relevant and irrelevant examples is maintained.

abstract use_label_wise_stratified_bi_partition_sampling() LabelWiseStratifiedBiPartitionSamplingConfig

Configures the rule learner to partition the available training examples into a training set and a holdout set using stratification, such that for each label the proportion of relevant and irrelevant examples is maintained.

Returns:

A LabelWiseStratifiedBiPartitionSamplingConfig that allows further configuration of the method for partitioning the available training examples into a training and a holdout set

class mlrl.common.cython.learner.LabelWiseStratifiedInstanceSamplingMixin

Bases: ABC

Allows to configure a rule learner to use label-wise stratified instance sampling.

abstract use_label_wise_stratified_instance_sampling() LabelWiseStratifiedInstanceSamplingConfig

Configures the rule learner to sample from the available training examples using stratification, such that for each label the proportion of relevant and irrelevant examples is maintained, whenever a new rule should be learned.

Returns:

A LabelWiseStratifiedInstanceSamplingConfig that allows further configuration of the method for sampling instances

class mlrl.common.cython.learner.NoFeatureBinningMixin

Bases: ABC

Allows to configure a rule learner to not use any method for the assignment of numerical feature values to bins.

abstract use_no_feature_binning()

Configures the rule learner to not use any method for the assignment of numerical feature values to bins.

class mlrl.common.cython.learner.NoFeatureSamplingMixin

Bases: ABC

Allows to configure a rule learner to not use feature sampling.

abstract use_no_feature_sampling()

Configures the rule learner to not sample from the available features whenever a rule should be refined.

class mlrl.common.cython.learner.NoGlobalPruningMixin

Bases: ABC

Allows to configure a rule learner to not use global pruning.

abstract use_no_global_pruning()

Configures the rule learner to not use global pruning.

class mlrl.common.cython.learner.NoInstanceSamplingMixin

Bases: ABC

Defines an interface for all classes that allow to configure a rule learner to not use instance sampling.

abstract use_no_instance_sampling()

Configures the rule learner to not sample from the available training examples whenever a new rule should be learned.

class mlrl.common.cython.learner.NoJointProbabilityCalibrationMixin

Bases: ABC

Allows to configure a rule learner to not calibrate joint probabilities.

abstract use_no_joint_probability_calibration()

Configures the rule learner to not calibrate joint probabilities.

class mlrl.common.cython.learner.NoLabelSamplingMixin

Bases: ABC

Allows to configure a rule learner to not use label sampling.

abstract use_no_label_sampling()

Configures the rule learner to not sample from the available labels whenever a new rule should be learned.

class mlrl.common.cython.learner.NoMarginalProbabilityCalibrationMixin

Bases: ABC

Allows to configure a rule learner to not calibrate marginal probabilities.

abstract use_no_marginal_probability_calibration()

Configures the rule learner to not calibrate marginal probabilities.

class mlrl.common.cython.learner.NoParallelPredictionMixin

Bases: ABC

Allows to configure a rule learner to not use any multi-threading for prediction.

abstract use_no_parallel_prediction()

Configures the rule learner to not use any multi-threading to predict for several query examples in parallel.

class mlrl.common.cython.learner.NoParallelRuleRefinementMixin

Bases: ABC

Allows to configure a rule learner to not use any multi-threading for the parallel refinement of rules.

abstract use_no_parallel_rule_refinement()

Configures the rule learner to not use any multi-threading for the parallel refinement of rules.

class mlrl.common.cython.learner.NoParallelStatisticUpdateMixin

Bases: ABC

Allows to configure a rule learner to not use any multi-threading for the parallel update of statistics.

abstract use_no_parallel_statistic_update()

Configures the rule learner to not use any multi-threading for the parallel update of statistics.

class mlrl.common.cython.learner.NoPartitionSamplingMixin

Bases: ABC

Allows to configure a rule learner to not partition the available training examples into a training set and a holdout set.

abstract use_no_partition_sampling()

Configures the rule learner to not partition the available training examples into a training set and a holdout set.

class mlrl.common.cython.learner.NoPostProcessorMixin

Bases: ABC

Allows to configure a rule learner to not use any post-processor.

abstract use_no_post_processor()

Configures the rule learner to not use any post-processor.

class mlrl.common.cython.learner.NoRulePruningMixin

Bases: ABC

Allows to configure a rule learner to not prune individual rules.

abstract use_no_rule_pruning()

Configures the rule learner to not prune individual rules.

class mlrl.common.cython.learner.NoSequentialPostOptimizationMixin

Bases: ABC

Allows to configure a rule learner to not use a post-optimization method that optimizes each rule in a model by relearning it in the context of the other rules.

abstract use_no_sequential_post_optimization()

Configures the rule learner to not use a post-optimization method that optimizes each rule in a model by relearning it in the context of the other rules.

class mlrl.common.cython.learner.NoSizeStoppingCriterionMixin

Bases: ABC

Allows to configure a rule learner to not use a stopping criterion that ensures that the number of induced rules does not exceed a certain maximum.

abstract use_no_size_stopping_criterion()

Configures the rule learner to not use a stopping criterion that ensures that the number of induced rules does not exceed a certain maximum.

class mlrl.common.cython.learner.NoTimeStoppingCriterionMixin

Bases: ABC

Allows to configure a rule learner to not use a stopping criterion that ensures that a certain time limit is not exceeded.

abstract use_no_time_stopping_criterion()

Configures the rule learner to not use a stopping criterion that ensures that a certain time limit is not exceeded.

class mlrl.common.cython.learner.ParallelPredictionMixin

Bases: ABC

Allows to configure a rule learner to use multi-threading to predict for several examples in parallel.

abstract use_parallel_prediction() ManualMultiThreadingConfig

Configures the rule learner to use multi-threading to predict for several query examples in parallel.

Returns:

A ManualMultiThreadingConfig that allows further configuration of the multi-threading behavior
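
Predicting for several query examples in parallel amounts to distributing the examples over a pool of workers. The sketch below shows the idea with Python's standard library; predict_one and the thread count are hypothetical stand-ins, and nothing here reflects how the library implements multi-threading internally.

```python
from concurrent.futures import ThreadPoolExecutor


def predict_one(example):
    """Hypothetical stand-in for applying a rule model to one query example."""
    return 1 if sum(example) > 1.0 else 0


query_examples = [[0.2, 0.3], [0.9, 0.8], [1.5, 0.0], [0.1, 0.1]]

# `max_workers` plays the role of a configurable number of threads.
with ThreadPoolExecutor(max_workers=2) as executor:
    # `map` returns the results in the order of the inputs.
    predictions = list(executor.map(predict_one, query_examples))

print(predictions)  # [0, 1, 1, 0]
```

In standard CPython builds, threads like these do not speed up pure-Python computation; the sketch only illustrates how the work is divided.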

class mlrl.common.cython.learner.ParallelRuleRefinementMixin

Bases: ABC

Allows to configure a rule learner to use multi-threading for the parallel refinement of rules.

abstract use_parallel_rule_refinement() ManualMultiThreadingConfig

Configures the rule learner to use multi-threading for the parallel refinement of rules.

Returns:

A ManualMultiThreadingConfig that allows further configuration of the multi-threading behavior

class mlrl.common.cython.learner.ParallelStatisticUpdateMixin

Bases: ABC

Allows to configure a rule learner to use multi-threading for the parallel update of statistics.

abstract use_parallel_statistic_update() ManualMultiThreadingConfig

Configures the rule learner to use multi-threading for the parallel update of statistics.

Returns:

A ManualMultiThreadingConfig that allows further configuration of the multi-threading behavior

class mlrl.common.cython.learner.PostPruningMixin

Bases: ABC

Allows to configure a rule learner to use a stopping criterion that keeps track of the number of rules in a model that perform best with respect to the examples in the training or holdout set according to a certain measure.

abstract use_global_post_pruning() PostPruningConfig

Configures the rule learner to use a stopping criterion that keeps track of the number of rules in a model that perform best with respect to the examples in the training or holdout set according to a certain measure.

Returns:

A PostPruningConfig that allows further configuration of the stopping criterion
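
The bookkeeping behind global post-pruning can be sketched as remembering the number of rules at which the model scored best and truncating the model to that size afterwards. The loss values and names below are made up for illustration.

```python
def best_number_of_rules(losses_per_rule):
    """Returns the number of rules for which the tracked loss is lowest."""
    best_index = min(range(len(losses_per_rule)), key=lambda i: losses_per_rule[i])
    return best_index + 1


# losses[i] is the (made-up) loss of the model after adding rule i + 1.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63]
rules = ["rule_%d" % i for i in range(1, len(losses) + 1)]

num_rules = best_number_of_rules(losses)
pruned_model = rules[:num_rules]
print(pruned_model)  # ['rule_1', 'rule_2', 'rule_3']
```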

class mlrl.common.cython.learner.PrePruningMixin

Bases: ABC

Allows to configure a rule learner to use a stopping criterion that stops the induction of rules as soon as the quality of a model’s predictions for the examples in the training or holdout set does not improve according to a certain measure.

abstract use_global_pre_pruning() PrePruningConfig

Configures the rule learner to use a stopping criterion that stops the induction of rules as soon as the quality of a model’s predictions for the examples in the training or holdout set does not improve according to a certain measure.

Returns:

A PrePruningConfig that allows further configuration of the stopping criterion
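
Global pre-pruning behaves like early stopping. A minimal sketch, assuming a hypothetical patience parameter that controls how many rules may be added without improvement before induction stops; the actual options are defined by PrePruningConfig and are not shown here.

```python
def induce_with_pre_pruning(losses_per_rule, patience):
    """Returns how many rules are induced before the criterion stops.

    `losses_per_rule[i]` is the (made-up) loss of the model after adding
    rule i + 1; `patience` is a hypothetical parameter.
    """
    best_loss = float("inf")
    rules_without_improvement = 0
    for num_rules, loss in enumerate(losses_per_rule, start=1):
        if loss < best_loss:
            best_loss = loss
            rules_without_improvement = 0
        else:
            rules_without_improvement += 1
            if rules_without_improvement >= patience:
                return num_rules
    return len(losses_per_rule)


losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]
print(induce_with_pre_pruning(losses, patience=3))  # 6
```

Note that stopping after six rules misses the improvement at the seventh, which is the usual trade-off of stopping early.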

class mlrl.common.cython.learner.RandomBiPartitionSamplingMixin

Bases: ABC

Allows to configure a rule learner to partition the available training examples into a training set and a holdout set by randomly splitting the training examples into two mutually exclusive sets.

abstract use_random_bi_partition_sampling() RandomBiPartitionSamplingConfig

Configures the rule learner to partition the available training examples into a training set and a holdout set by randomly splitting the training examples into two mutually exclusive sets.

Returns:

A RandomBiPartitionSamplingConfig that allows further configuration of the method for partitioning the available training examples into a training set and a holdout set
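
A random split into two mutually exclusive sets can be sketched in a few lines. The function name and the holdout_fraction parameter are assumptions made for the example.

```python
import random


def random_bi_partition(num_examples, holdout_fraction, seed):
    """Shuffles the example indices and cuts them into two disjoint sets."""
    rng = random.Random(seed)
    indices = list(range(num_examples))
    rng.shuffle(indices)
    num_holdout = round(num_examples * holdout_fraction)
    return sorted(indices[num_holdout:]), sorted(indices[:num_holdout])


training, holdout = random_bi_partition(10, holdout_fraction=0.3, seed=0)
print(len(training), len(holdout))  # 7 3
```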

class mlrl.common.cython.learner.RoundRobinLabelSamplingMixin

Bases: ABC

Allows to configure a rule learner to sample single labels in a round-robin fashion.

abstract use_round_robin_label_sampling()

Configures the rule learner to sample a single label in a round-robin fashion whenever a new rule should be learned.
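
Round-robin sampling simply cycles through the label indices, one per rule. A sketch with three labels and seven rules, both numbers chosen arbitrarily:

```python
from itertools import cycle, islice

num_labels = 3
label_indices = cycle(range(num_labels))

# One label index is handed out per rule; after the last label, start over.
labels_for_rules = list(islice(label_indices, 7))
print(labels_for_rules)  # [0, 1, 2, 0, 1, 2, 0]
```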

class mlrl.common.cython.learner.RuleLearner

Bases: object

A rule learner.

can_predict_binary(feature_matrix, num_labels) bool

Returns whether the rule learner is able to predict binary labels or not.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • num_labels – The number of labels to predict for

Returns:

True, if the rule learner is able to predict binary labels, False otherwise

can_predict_probabilities(feature_matrix, num_labels) bool

Returns whether the rule learner is able to predict probability estimates or not.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • num_labels – The number of labels to predict for

Returns:

True, if the rule learner is able to predict probability estimates, False otherwise

can_predict_scores(feature_matrix, num_labels) bool

Returns whether the rule learner is able to predict regression scores or not.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • num_labels – The number of labels to predict for

Returns:

True, if the rule learner is able to predict regression scores, False otherwise

create_binary_predictor(feature_matrix, rule_model, label_space_info, marginal_probability_calibration_model, joint_probability_calibration_model, num_labels) BinaryPredictor

Creates and returns a predictor that may be used to predict binary labels for given query examples. If the prediction of binary labels is not supported by the rule learner, a RuntimeError is raised.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • rule_model – The RuleModel that should be used to obtain predictions

  • label_space_info – The LabelSpaceInfo that provides information about the label space that may be used as a basis for obtaining predictions

  • marginal_probability_calibration_model – The MarginalProbabilityCalibrationModel that may be used for the calibration of marginal probabilities

  • joint_probability_calibration_model – The JointProbabilityCalibrationModel that may be used for the calibration of joint probabilities

  • num_labels – The number of labels to predict for

Returns:

A BinaryPredictor that may be used to predict binary labels for the given query examples

create_probability_predictor(feature_matrix, rule_model, label_space_info, marginal_probability_calibration_model, joint_probability_calibration_model, num_labels) ProbabilityPredictor

Creates and returns a predictor that may be used to predict probability estimates for given query examples. If the prediction of probability estimates is not supported by the rule learner, a RuntimeError is raised.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • rule_model – The RuleModel that should be used to obtain predictions

  • label_space_info – The LabelSpaceInfo that provides information about the label space that may be used as a basis for obtaining predictions

  • marginal_probability_calibration_model – The MarginalProbabilityCalibrationModel that may be used for the calibration of marginal probabilities

  • joint_probability_calibration_model – The JointProbabilityCalibrationModel that may be used for the calibration of joint probabilities

  • num_labels – The number of labels to predict for

Returns:

A ProbabilityPredictor that may be used to predict probability estimates for the given query examples

create_score_predictor(feature_matrix, rule_model, label_space_info, num_labels) ScorePredictor

Creates and returns a predictor that may be used to predict regression scores for given query examples. If the prediction of regression scores is not supported by the rule learner, a RuntimeError is raised.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • rule_model – The RuleModel that should be used to obtain predictions

  • label_space_info – The LabelSpaceInfo that provides information about the label space that may be used as a basis for obtaining predictions

  • num_labels – The number of labels to predict for

Returns:

A ScorePredictor that may be used to predict regression scores for the given query examples

create_sparse_binary_predictor(feature_matrix, rule_model, label_space_info, marginal_probability_calibration_model, joint_probability_calibration_model, num_labels) SparseBinaryPredictor

Creates and returns a predictor that may be used to predict sparse binary labels for given query examples. If the prediction of sparse binary labels is not supported by the rule learner, a RuntimeError is raised.

Parameters:
  • feature_matrix – A RowWiseFeatureMatrix that provides row-wise access to the feature values of the query examples

  • rule_model – The RuleModel that should be used to obtain predictions

  • label_space_info – The LabelSpaceInfo that provides information about the label space that may be used as a basis for obtaining predictions

  • marginal_probability_calibration_model – The MarginalProbabilityCalibrationModel that may be used for the calibration of marginal probabilities

  • joint_probability_calibration_model – The JointProbabilityCalibrationModel that may be used for the calibration of joint probabilities

  • num_labels – The number of labels to predict for

Returns:

A SparseBinaryPredictor that may be used to predict sparse binary labels for the given query examples
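
Since each create_*_predictor method signals missing support with a RuntimeError, callers may prefer to query the matching can_predict_* method first. The sketch below shows that guard with a stub class that only mimics the documented method signatures; the stub, its constructor flag and the placeholder return value are hypothetical.

```python
class StubRuleLearner:
    """Hypothetical stand-in that mimics part of the RuleLearner interface."""

    def __init__(self, supports_probabilities):
        self.supports_probabilities = supports_probabilities

    def can_predict_probabilities(self, feature_matrix, num_labels):
        return self.supports_probabilities

    def create_probability_predictor(self, feature_matrix, rule_model, label_space_info,
                                     marginal_probability_calibration_model,
                                     joint_probability_calibration_model, num_labels):
        if not self.supports_probabilities:
            raise RuntimeError("prediction of probability estimates is not supported")
        return "probability-predictor"  # placeholder for a ProbabilityPredictor


def try_create_probability_predictor(learner, feature_matrix, rule_model, label_space_info,
                                     marginal_model, joint_model, num_labels):
    """Checks for support first instead of relying on the RuntimeError."""
    if learner.can_predict_probabilities(feature_matrix, num_labels):
        return learner.create_probability_predictor(feature_matrix, rule_model, label_space_info,
                                                    marginal_model, joint_model, num_labels)
    return None


# All arguments other than the learner are placeholders in this sketch.
supported = try_create_probability_predictor(StubRuleLearner(True), None, None, None, None, None, 3)
unsupported = try_create_probability_predictor(StubRuleLearner(False), None, None, None, None, None, 3)
print(supported, unsupported)  # probability-predictor None
```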

fit(feature_info, feature_matrix, label_matrix, random_state) TrainingResult

Applies the rule learner to given training examples and corresponding ground truth labels.

Parameters:
  • feature_info – A FeatureInfo that provides information about the types of individual features

  • feature_matrix – A ColumnWiseFeatureMatrix that provides column-wise access to the feature values of the training examples

  • label_matrix – A RowWiseLabelMatrix that provides row-wise access to the ground truth labels of the training examples

  • random_state – The seed to be used by random number generators

Returns:

The TrainingResult that provides access to the result of fitting the rule learner to the training data

class mlrl.common.cython.learner.RuleLearnerConfig

Bases: object

class mlrl.common.cython.learner.SequentialPostOptimizationMixin

Bases: ABC

Allows to configure a rule learner to use a post-optimization method that optimizes each rule in a model by relearning it in the context of the other rules.

abstract use_sequential_post_optimization() SequentialPostOptimizationConfig

Configures the rule learner to use a post-optimization method that optimizes each rule in a model by relearning it in the context of the other rules.

Returns:

A SequentialPostOptimizationConfig that allows further configuration of the post-optimization method

class mlrl.common.cython.learner.SequentialRuleModelAssemblageMixin

Bases: ABC

Allows to configure a rule learner to use an algorithm that sequentially induces several rules.

abstract use_sequential_rule_model_assemblage()

Configures the rule learner to use an algorithm that sequentially induces several rules, optionally starting with a default rule, that are added to a rule-based model.

class mlrl.common.cython.learner.SizeStoppingCriterionMixin

Bases: ABC

Allows to configure a rule learner to use a stopping criterion that ensures that the number of induced rules does not exceed a certain maximum.

abstract use_size_stopping_criterion() SizeStoppingCriterionConfig

Configures the rule learner to use a stopping criterion that ensures that the number of induced rules does not exceed a certain maximum.

Returns:

A SizeStoppingCriterionConfig that allows further configuration of the stopping criterion

class mlrl.common.cython.learner.TimeStoppingCriterionMixin

Bases: ABC

Allows to configure a rule learner to use a stopping criterion that ensures that a certain time limit is not exceeded.

use_time_stopping_criterion() TimeStoppingCriterionConfig

Configures the rule learner to use a stopping criterion that ensures that a certain time limit is not exceeded.

Returns:

A TimeStoppingCriterionConfig that allows further configuration of the stopping criterion
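
The size and time stopping criteria can be pictured as two checks at the top of the rule induction loop. This is a schematic sketch; the function, its parameters and the induce_rule callback are hypothetical.

```python
import time


def induce_rules(max_rules, time_limit_seconds, induce_rule):
    """Induces rules until one of the two stopping criteria is met."""
    if max_rules is None and time_limit_seconds is None:
        raise ValueError("at least one stopping criterion is required")
    rules = []
    start = time.monotonic()
    while True:
        if max_rules is not None and len(rules) >= max_rules:
            break  # size stopping criterion
        if time_limit_seconds is not None and time.monotonic() - start >= time_limit_seconds:
            break  # time stopping criterion
        rules.append(induce_rule(len(rules)))
    return rules


rules = induce_rules(max_rules=5, time_limit_seconds=60,
                     induce_rule=lambda i: "rule_%d" % (i + 1))
print(len(rules))  # 5
```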

class mlrl.common.cython.learner.TrainingResult

Bases: object

Provides access to the results of fitting a rule learner to training data. It incorporates the model that has been trained, as well as additional information that is necessary for obtaining predictions for unseen data.

joint_probability_calibration_model
label_space_info
marginal_probability_calibration_model
num_labels
rule_model
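
The attributes listed above line up with the arguments of RuleLearner.create_binary_predictor. The following sketch uses a stand-in dataclass to show how they might be passed along; the stub class, the helper function and the string placeholders are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StubTrainingResult:
    """Hypothetical stand-in exposing the same attribute names as TrainingResult."""
    num_labels: int
    rule_model: object
    label_space_info: object
    marginal_probability_calibration_model: object
    joint_probability_calibration_model: object


def predictor_arguments(feature_matrix, result):
    """Orders the attributes as in the create_binary_predictor signature."""
    return (feature_matrix, result.rule_model, result.label_space_info,
            result.marginal_probability_calibration_model,
            result.joint_probability_calibration_model, result.num_labels)


result = StubTrainingResult(3, "rules", "label-space", "marginal", "joint")
args = predictor_arguments("query-features", result)
print(args)
```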