mlrl.boosting.cython.learner_boomer module

@author: Michael Rapp (michael.rapp.ml@gmail.com)

class mlrl.boosting.cython.learner_boomer.Boomer

Bases: RuleLearner

The BOOMER rule learning algorithm.

class mlrl.boosting.cython.learner_boomer.BoomerConfig

Bases: RuleLearnerConfig, AutomaticPartitionSamplingMixin, AutomaticFeatureBinningMixin, AutomaticParallelRuleRefinementMixin, AutomaticParallelStatisticUpdateMixin, ConstantShrinkageMixin, NoL1RegularizationMixin, L1RegularizationMixin, NoL2RegularizationMixin, L2RegularizationMixin, NoDefaultRuleMixin, AutomaticDefaultRuleMixin, CompleteHeadMixin, FixedPartialHeadMixin, DynamicPartialHeadMixin, SingleLabelHeadMixin, AutomaticHeadMixin, DenseStatisticsMixin, SparseStatisticsMixin, AutomaticStatisticsMixin, ExampleWiseLogisticLossMixin, ExampleWiseSquaredErrorLossMixin, ExampleWiseSquaredHingeLossMixin, LabelWiseLogisticLossMixin, LabelWiseSquaredErrorLossMixin, LabelWiseSquaredHingeLossMixin, NoLabelBinningMixin, EqualWidthLabelBinningMixin, AutomaticLabelBinningMixin, IsotonicMarginalProbabilityCalibrationMixin, IsotonicJointProbabilityCalibrationMixin, LabelWiseBinaryPredictorMixin, ExampleWiseBinaryPredictorMixin, GfmBinaryPredictorMixin, AutomaticBinaryPredictorMixin, LabelWiseScorePredictorMixin, LabelWiseProbabilityPredictorMixin, MarginalizedProbabilityPredictorMixin, AutomaticProbabilityPredictorMixin, SequentialRuleModelAssemblageMixin, DefaultRuleMixin, GreedyTopDownRuleInductionMixin, BeamSearchTopDownRuleInductionMixin, NoPostProcessorMixin, NoFeatureBinningMixin, EqualWidthFeatureBinningMixin, EqualFrequencyFeatureBinningMixin, NoLabelSamplingMixin, RoundRobinLabelSamplingMixin, LabelSamplingWithoutReplacementMixin, NoInstanceSamplingMixin, InstanceSamplingWithReplacementMixin, InstanceSamplingWithoutReplacementMixin, LabelWiseStratifiedInstanceSamplingMixin, ExampleWiseStratifiedInstanceSamplingMixin, NoFeatureSamplingMixin, FeatureSamplingWithoutReplacementMixin, NoPartitionSamplingMixin, RandomBiPartitionSamplingMixin, LabelWiseStratifiedBiPartitionSamplingMixin, ExampleWiseStratifiedBiPartitionSamplingMixin, NoRulePruningMixin, IrepRulePruningMixin, NoParallelRuleRefinementMixin, ParallelRuleRefinementMixin, NoParallelStatisticUpdateMixin, ParallelStatisticUpdateMixin, NoParallelPredictionMixin, ParallelPredictionMixin, NoSizeStoppingCriterionMixin, SizeStoppingCriterionMixin, NoTimeStoppingCriterionMixin, TimeStoppingCriterionMixin, PrePruningMixin, NoGlobalPruningMixin, PostPruningMixin, NoSequentialPostOptimizationMixin, SequentialPostOptimizationMixin, NoMarginalProbabilityCalibrationMixin, NoJointProbabilityCalibrationMixin

Allows configuring the BOOMER algorithm.
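
The following is a minimal sketch of how the configuration methods below might be combined. It only uses methods documented in this module and assumes that a BoomerConfig instance is obtained from the surrounding learner (how such an instance is created is not covered here):

    from mlrl.boosting.cython.learner_boomer import BoomerConfig

    def configure_boomer(config: BoomerConfig) -> None:
        # Induce individual rules via a greedy top-down search.
        config.use_greedy_top_down_rule_induction()
        # Optimize a label-wise decomposable variant of the logistic loss.
        config.use_label_wise_logistic_loss()
        # Let the algorithm decide on the type of rule heads automatically.
        config.use_automatic_heads()
        # Do not set aside a holdout set.
        config.use_no_partition_sampling()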

use_automatic_binary_predictor()

Configures the rule learner to automatically decide on a predictor for predicting whether individual labels are relevant or irrelevant.

use_automatic_default_rule()

Configures the rule learner to automatically decide whether a default rule should be induced or not.

use_automatic_feature_binning()

Configures the rule learner to automatically decide whether a method for the assignment of numerical feature values to bins should be used or not.

use_automatic_heads()

Configures the rule learner to automatically decide on the type of rule heads to be used.

use_automatic_label_binning()

Configures the rule learner to automatically decide whether a method for the assignment of labels to bins should be used or not.

use_automatic_parallel_rule_refinement()

Configures the rule learner to automatically decide whether multi-threading should be used for the parallel refinement of rules or not.

use_automatic_parallel_statistic_update()

Configures the rule learner to automatically decide whether multi-threading should be used for the parallel update of statistics or not.

use_automatic_partition_sampling()

Configures the rule learner to automatically decide whether a holdout set should be used or not.

use_automatic_probability_predictor()

Configures the rule learner to automatically decide on a predictor for predicting probability estimates.

use_automatic_statistics()

Configures the rule learner to automatically decide whether a dense or sparse representation of gradients and Hessians should be used.

use_beam_search_top_down_rule_induction() → BeamSearchTopDownRuleInductionConfig

Configures the algorithm to use a top-down beam search for the induction of individual rules.

Returns:

A BeamSearchTopDownRuleInductionConfig that allows further configuration of the algorithm for the induction of individual rules
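
As a sketch, the returned configuration object can be used to adjust the beam search further. The setter name set_beam_width below is an assumption made for illustration; consult the BeamSearchTopDownRuleInductionConfig documentation for the actual API:

    from mlrl.boosting.cython.learner_boomer import BoomerConfig

    def configure_beam_search(config: BoomerConfig) -> None:
        beam_search_config = config.use_beam_search_top_down_rule_induction()
        # Assumed setter name; uses a beam of width 4 as an example.
        beam_search_config.set_beam_width(4)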

use_complete_heads()

Configures the rule learner to induce rules with complete heads that predict for all available labels.

use_constant_shrinkage_post_processor() → ConstantShrinkageConfig

Configures the rule learner to use a post-processor that shrinks the weights of rules by a constant “shrinkage” parameter.

Returns:

A ConstantShrinkageConfig that allows further configuration of the post-processor
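
For example, the post-processor might be configured as follows. The setter name set_shrinkage is an assumption used for illustration only and should be verified against the ConstantShrinkageConfig documentation:

    from mlrl.boosting.cython.learner_boomer import BoomerConfig

    def configure_shrinkage(config: BoomerConfig) -> None:
        shrinkage_config = config.use_constant_shrinkage_post_processor()
        # Assumed setter name; a shrinkage (learning rate) of 0.3 serves as an example.
        shrinkage_config.set_shrinkage(0.3)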

use_default_rule()

Configures the rule learner to induce a default rule.

use_dense_statistics()

Configures the rule learner to use a dense representation of gradients and Hessians.

use_dynamic_partial_heads() → DynamicPartialHeadConfig

Configures the rule learner to induce rules with partial heads that predict for a subset of the available labels that is determined dynamically. Only those labels for which the square of the predictive quality exceeds a certain threshold are included in a rule head.

Returns:

A DynamicPartialHeadConfig that allows further configuration of the rule heads
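
A sketch of enabling dynamic partial heads is given below. The setter name set_threshold is an assumption for illustration; the actual options are defined by DynamicPartialHeadConfig:

    from mlrl.boosting.cython.learner_boomer import BoomerConfig

    def configure_dynamic_heads(config: BoomerConfig) -> None:
        head_config = config.use_dynamic_partial_heads()
        # Assumed setter name; the threshold value 0.02 is only an example.
        head_config.set_threshold(0.02)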

use_equal_frequency_feature_binning() → EqualFrequencyFeatureBinningConfig

Configures the rule learner to use a method for the assignment of numerical feature values to bins, such that each bin contains approximately the same number of values.

Returns:

An EqualFrequencyFeatureBinningConfig that allows further configuration of the method for the assignment of numerical feature values to bins

use_equal_width_feature_binning() → EqualWidthFeatureBinningConfig

Configures the rule learner to use a method for the assignment of numerical feature values to bins, such that each bin contains values from equally sized value ranges.

Returns:

An EqualWidthFeatureBinningConfig that allows further configuration of the method for the assignment of numerical feature values to bins

use_equal_width_label_binning() → EqualWidthLabelBinningConfig

Configures the rule learner to use a method for the assignment of labels to bins, such that each bin contains labels for which the predicted score is expected to belong to the same value range.

Returns:

An EqualWidthLabelBinningConfig that allows further configuration of the method for the assignment of labels to bins

use_example_wise_binary_predictor() → ExampleWiseBinaryPredictorConfig

Configures the rule learner to use a predictor that predicts known label vectors for given query examples by comparing the predicted regression scores or probability estimates to the label vectors encountered in the training data.

Returns:

An ExampleWiseBinaryPredictorConfig that allows further configuration of the predictor

use_example_wise_logistic_loss()

Configures the rule learner to use a loss function that implements a multi-label variant of the logistic loss that is applied example-wise.

use_example_wise_squared_error_loss()

Configures the rule learner to use a loss function that implements a multi-label variant of the squared error loss that is applied example-wise.

use_example_wise_squared_hinge_loss()

Configures the rule learner to use a loss function that implements a multi-label variant of the squared hinge loss that is applied example-wise.

use_example_wise_stratified_bi_partition_sampling() → ExampleWiseStratifiedBiPartitionSamplingConfig

Configures the rule learner to partition the available training examples into a training set and a holdout set using stratification, where distinct label vectors are treated as individual classes.

Returns:

An ExampleWiseStratifiedBiPartitionSamplingConfig that allows further configuration of the method for partitioning the available training examples into a training and a holdout set

use_example_wise_stratified_instance_sampling() → ExampleWiseStratifiedInstanceSamplingConfig

Configures the rule learner to sample from the available training examples using stratification, where distinct label vectors are treated as individual classes, whenever a new rule should be learned.

Returns:

An ExampleWiseStratifiedInstanceSamplingConfig that allows further configuration of the method for sampling instances

use_feature_sampling_without_replacement() → FeatureSamplingWithoutReplacementConfig

Configures the rule learner to sample from the available features without replacement whenever a rule should be refined.

Returns:

A FeatureSamplingWithoutReplacementConfig that allows further configuration of the method for sampling features

use_fixed_partial_heads() → FixedPartialHeadConfig

Configures the rule learner to induce rules with partial heads that predict for a predefined number of labels.

Returns:

A FixedPartialHeadConfig that allows further configuration of the rule heads

use_gfm_binary_predictor() → GfmBinaryPredictorConfig

Configures the rule learner to use a predictor that predicts whether individual labels of given query examples are relevant or irrelevant by discretizing the regression scores or probability estimates that are predicted for each label according to the general F-measure maximizer (GFM).

Returns:

A GfmBinaryPredictorConfig that allows further configuration of the predictor

use_global_post_pruning() → PostPruningConfig

Configures the rule learner to use a stopping criterion that keeps track of the number of rules in a model that perform best with respect to the examples in the training or holdout set according to a certain measure.

Returns:

A PostPruningConfig that allows further configuration of the stopping criterion

use_global_pre_pruning() → PrePruningConfig

Configures the rule learner to use a stopping criterion that stops the induction of rules as soon as the quality of a model’s predictions for the examples in the training or holdout set does not improve according to a certain measure.

Returns:

A PrePruningConfig that allows further configuration of the stopping criterion

use_greedy_top_down_rule_induction() → GreedyTopDownRuleInductionConfig

Configures the algorithm to use a greedy top-down search for the induction of individual rules.

Returns:

A GreedyTopDownRuleInductionConfig that allows further configuration of the algorithm for the induction of individual rules

use_instance_sampling_with_replacement() → InstanceSamplingWithReplacementConfig

Configures the rule learner to sample from the available training examples with replacement whenever a new rule should be learned.

Returns:

An InstanceSamplingWithReplacementConfig that allows further configuration of the method for sampling instances

use_instance_sampling_without_replacement() → InstanceSamplingWithoutReplacementConfig

Configures the rule learner to sample from the available training examples without replacement whenever a new rule should be learned.

Returns:

An InstanceSamplingWithoutReplacementConfig that allows further configuration of the method for sampling instances
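
A sketch of enabling instance sampling is shown below. The setter name set_sample_size is an assumption for illustration; the available options are defined by InstanceSamplingWithoutReplacementConfig:

    from mlrl.boosting.cython.learner_boomer import BoomerConfig

    def configure_instance_sampling(config: BoomerConfig) -> None:
        sampling_config = config.use_instance_sampling_without_replacement()
        # Assumed setter name; samples 66% of the training examples per rule.
        sampling_config.set_sample_size(0.66)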

use_irep_rule_pruning()

Configures the rule learner to prune individual rules by following the principles of “incremental reduced error pruning” (IREP).

use_isotonic_joint_probability_calibration() → IsotonicJointProbabilityCalibratorConfig

Configures the rule learner to calibrate joint probabilities via isotonic regression.

Returns:

An IsotonicJointProbabilityCalibratorConfig that allows further configuration of the calibrator

use_isotonic_marginal_probability_calibration() → IsotonicMarginalProbabilityCalibratorConfig

Configures the rule learner to calibrate marginal probabilities via isotonic regression.

Returns:

An IsotonicMarginalProbabilityCalibratorConfig that allows further configuration of the calibrator

use_l1_regularization() → ManualRegularizationConfig

Configures the rule learner to use L1 regularization.

Returns:

A ManualRegularizationConfig that allows further configuration of the regularization term

use_l2_regularization() → ManualRegularizationConfig

Configures the rule learner to use L2 regularization.

Returns:

A ManualRegularizationConfig that allows further configuration of the regularization term
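
As a sketch, L1 regularization can be disabled while L2 regularization is configured via the returned object. The setter name set_regularization_weight is an assumption for illustration; consult ManualRegularizationConfig for the actual API:

    from mlrl.boosting.cython.learner_boomer import BoomerConfig

    def configure_regularization(config: BoomerConfig) -> None:
        # Use L2 regularization only.
        config.use_no_l1_regularization()
        l2_config = config.use_l2_regularization()
        # Assumed setter name; a regularization weight of 1.0 serves as an example.
        l2_config.set_regularization_weight(1.0)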

use_label_sampling_without_replacement() → LabelSamplingWithoutReplacementConfig

Configures the rule learner to sample from the available labels without replacement whenever a new rule should be learned.

Returns:

A LabelSamplingWithoutReplacementConfig that allows further configuration of the method for sampling labels

use_label_wise_binary_predictor() → LabelWiseBinaryPredictorConfig

Configures the rule learner to use a predictor that predicts whether individual labels of given query examples are relevant or irrelevant by discretizing the regression scores or probability estimates that are predicted for each label individually.

Returns:

A LabelWiseBinaryPredictorConfig that allows further configuration of the predictor

use_label_wise_logistic_loss()

Configures the rule learner to use a loss function that implements a multi-label variant of the logistic loss that is applied label-wise.

use_label_wise_probability_predictor() → LabelWiseProbabilityPredictorConfig

Configures the rule learner to use a predictor that predicts label-wise probabilities for given query examples by transforming the regression scores that are predicted for each label individually into probabilities.

Returns:

A LabelWiseProbabilityPredictorConfig that allows further configuration of the predictor

use_label_wise_score_predictor()

Configures the rule learner to use a predictor that predicts label-wise regression scores for given query examples by summing up the scores that are provided by individual rules for each label individually.

use_label_wise_squared_error_loss()

Configures the rule learner to use a loss function that implements a multi-label variant of the squared error loss that is applied label-wise.

use_label_wise_squared_hinge_loss()

Configures the rule learner to use a loss function that implements a multi-label variant of the squared hinge loss that is applied label-wise.
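
The loss function to be optimized is selected by calling the corresponding method on the configuration; the commented-out line in the sketch below shows a label-wise decomposable alternative to the example-wise variant:

    from mlrl.boosting.cython.learner_boomer import BoomerConfig

    def configure_loss(config: BoomerConfig) -> None:
        # Use a non-decomposable logistic loss that is applied example-wise ...
        config.use_example_wise_logistic_loss()
        # ... or, alternatively, a label-wise decomposable variant:
        # config.use_label_wise_logistic_loss()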

use_label_wise_stratified_bi_partition_sampling() → LabelWiseStratifiedBiPartitionSamplingConfig

Configures the rule learner to partition the available training examples into a training set and a holdout set using stratification, such that for each label the proportion of relevant and irrelevant examples is maintained.

Returns:

A LabelWiseStratifiedBiPartitionSamplingConfig that allows further configuration of the method for partitioning the available training examples into a training and a holdout set

use_label_wise_stratified_instance_sampling() → LabelWiseStratifiedInstanceSamplingConfig

Configures the rule learner to sample from the available training examples using stratification, such that for each label the proportion of relevant and irrelevant examples is maintained, whenever a new rule should be learned.

Returns:

A LabelWiseStratifiedInstanceSamplingConfig that allows further configuration of the method for sampling instances

use_marginalized_probability_predictor() → MarginalizedProbabilityPredictorConfig

Configures the rule learner to use a predictor for predicting probability estimates by summing up the scores that are provided by individual rules of an existing rule-based model and comparing the aggregated score vector to the known label vectors according to a certain distance measure. The probability for an individual label is calculated as the sum of the distances obtained for all label vectors in which the respective label is relevant, divided by the total sum of all distances.

Returns:

A MarginalizedProbabilityPredictorConfig that allows further configuration of the predictor
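
As a sketch using only the methods documented in this module, isotonic probability calibration can be combined with the marginalized probability predictor as follows:

    from mlrl.boosting.cython.learner_boomer import BoomerConfig

    def configure_probabilities(config: BoomerConfig) -> None:
        # Calibrate marginal and joint probabilities via isotonic regression ...
        config.use_isotonic_marginal_probability_calibration()
        config.use_isotonic_joint_probability_calibration()
        # ... and predict probability estimates by marginalizing over the
        # label vectors encountered in the training data.
        config.use_marginalized_probability_predictor()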

use_no_default_rule()

Configures the rule learner to not induce a default rule.

use_no_feature_binning()

Configures the rule learner to not use any method for the assignment of numerical feature values to bins.

use_no_feature_sampling()

Configures the rule learner to not sample from the available features whenever a rule should be refined.

use_no_global_pruning()

Configures the rule learner to not use global pruning.

use_no_instance_sampling()

Configures the rule learner to not sample from the available training examples whenever a new rule should be learned.

use_no_joint_probability_calibration()

Configures the rule learner to not calibrate joint probabilities.

use_no_l1_regularization()

Configures the rule learner to not use L1 regularization.

use_no_l2_regularization()

Configures the rule learner to not use L2 regularization.

use_no_label_binning()

Configures the rule learner to not use any method for the assignment of labels to bins.

use_no_label_sampling()

Configures the rule learner to not sample from the available labels whenever a new rule should be learned.

use_no_marginal_probability_calibration()

Configures the rule learner to not calibrate marginal probabilities.

use_no_parallel_prediction()

Configures the rule learner to not use any multi-threading to predict for several query examples in parallel.

use_no_parallel_rule_refinement()

Configures the rule learner to not use any multi-threading for the parallel refinement of rules.

use_no_parallel_statistic_update()

Configures the rule learner to not use any multi-threading for the parallel update of statistics.

use_no_partition_sampling()

Configures the rule learner to not partition the available training examples into a training set and a holdout set.

use_no_post_processor()

Configures the rule learner to not use any post-processor.

use_no_rule_pruning()

Configures the rule learner to not prune individual rules.

use_no_sequential_post_optimization()

Configures the rule learner to not use a post-optimization method that optimizes each rule in a model by relearning it in the context of the other rules.

use_no_size_stopping_criterion()

Configures the rule learner to not use a stopping criterion that ensures that the number of induced rules does not exceed a certain maximum.

use_no_time_stopping_criterion()

Configures the rule learner to not use a stopping criterion that ensures that a certain time limit is not exceeded.

use_parallel_prediction() → ManualMultiThreadingConfig

Configures the rule learner to use multi-threading to predict for several query examples in parallel.

Returns:

A ManualMultiThreadingConfig that allows further configuration of the multi-threading behavior

use_parallel_rule_refinement() → ManualMultiThreadingConfig

Configures the rule learner to use multi-threading for the parallel refinement of rules.

Returns:

A ManualMultiThreadingConfig that allows further configuration of the multi-threading behavior

use_parallel_statistic_update() → ManualMultiThreadingConfig

Configures the rule learner to use multi-threading for the parallel update of statistics.

Returns:

A ManualMultiThreadingConfig that allows further configuration of the multi-threading behavior
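
A sketch of configuring the multi-threading behavior is given below. The setter name set_num_threads is an assumption for illustration; consult ManualMultiThreadingConfig for the actual API:

    from mlrl.boosting.cython.learner_boomer import BoomerConfig

    def configure_multi_threading(config: BoomerConfig) -> None:
        threading_config = config.use_parallel_statistic_update()
        # Assumed setter name; uses 4 threads for updating statistics.
        threading_config.set_num_threads(4)
        # Rule refinement and prediction remain single-threaded in this sketch.
        config.use_no_parallel_rule_refinement()
        config.use_no_parallel_prediction()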

use_random_bi_partition_sampling() → RandomBiPartitionSamplingConfig

Configures the rule learner to partition the available training examples into a training set and a holdout set by randomly splitting the training examples into two mutually exclusive sets.

Returns:

A RandomBiPartitionSamplingConfig that allows further configuration of the method for partitioning the available training examples into a training set and a holdout set
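
For example, a randomly sampled holdout set might be configured as follows. The setter name set_holdout_set_size is an assumption for illustration; the actual options are defined by RandomBiPartitionSamplingConfig:

    from mlrl.boosting.cython.learner_boomer import BoomerConfig

    def configure_holdout(config: BoomerConfig) -> None:
        partition_config = config.use_random_bi_partition_sampling()
        # Assumed setter name; reserves 33% of the examples as a holdout set.
        partition_config.set_holdout_set_size(0.33)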

use_round_robin_label_sampling()

Configures the rule learner to sample a single label in a round-robin fashion whenever a new rule should be learned.

use_sequential_post_optimization() → SequentialPostOptimizationConfig

Configures the rule learner to use a post-optimization method that optimizes each rule in a model by relearning it in the context of the other rules.

Returns:

A SequentialPostOptimizationConfig that allows further configuration of the post-optimization method

use_sequential_rule_model_assemblage()

Configures the rule learner to use an algorithm that sequentially induces several rules, optionally starting with a default rule, and adds them to a rule-based model.

use_single_label_heads()

Configures the rule learner to induce rules with single-label heads that predict for a single label.

use_size_stopping_criterion() → SizeStoppingCriterionConfig

Configures the rule learner to use a stopping criterion that ensures that the number of induced rules does not exceed a certain maximum.

Returns:

A SizeStoppingCriterionConfig that allows further configuration of the stopping criterion

use_sparse_statistics()

Configures the rule learner to use a sparse representation of gradients and Hessians, if possible.

use_time_stopping_criterion() → TimeStoppingCriterionConfig

Configures the rule learner to use a stopping criterion that ensures that a certain time limit is not exceeded.

Returns:

A TimeStoppingCriterionConfig that allows further configuration of the stopping criterion
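
A sketch of combining both stopping criteria is shown below. The setter names set_max_rules and set_time_limit are assumptions for illustration; consult SizeStoppingCriterionConfig and TimeStoppingCriterionConfig for the actual API:

    from mlrl.boosting.cython.learner_boomer import BoomerConfig

    def configure_stopping_criteria(config: BoomerConfig) -> None:
        size_config = config.use_size_stopping_criterion()
        # Assumed setter name; limits the model to at most 1000 rules.
        size_config.set_max_rules(1000)
        time_config = config.use_time_stopping_criterion()
        # Assumed setter name; limits training to at most 3600 seconds.
        time_config.set_time_limit(3600)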