mlrl.common.cython.stopping_criterion module¶

@author: Michael Rapp (michael.rapp.ml@gmail.com)

class mlrl.common.cython.stopping_criterion.AggregationFunction(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)¶

Bases: Enum

Specifies different types of aggregation functions that can be used to aggregate the values that are stored in a buffer.

ARITHMETIC_MEAN = 2¶

MAX = 1¶

MIN = 0¶
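As a minimal sketch of what the three aggregation types mean for a buffer of values, the following example uses a local enum `Agg` and a helper `aggregate`. Both names are hypothetical stand-ins introduced for illustration and are not part of this module; only the member names and their numeric values are taken from the listing above.

```python
from enum import Enum
from statistics import fmean


class Agg(Enum):
    """Hypothetical stand-in that mirrors the members of AggregationFunction."""
    MIN = 0
    MAX = 1
    ARITHMETIC_MEAN = 2


def aggregate(values, function):
    """Aggregates a non-empty buffer of values according to the given function."""
    if function is Agg.MIN:
        return min(values)
    if function is Agg.MAX:
        return max(values)
    return fmean(values)


buffer = [0.5, 0.25, 0.75]
print(aggregate(buffer, Agg.MIN))
print(aggregate(buffer, Agg.MAX))
print(aggregate(buffer, Agg.ARITHMETIC_MEAN))
```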

class mlrl.common.cython.stopping_criterion.AggregationFunctionImpl(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)¶

Bases: IntFlag

ARITHMETIC_MEAN = 2¶

MAX = 1¶

MIN = 0¶

class mlrl.common.cython.stopping_criterion.PostPruningConfig¶

Bases: object

Defines an interface for all classes that allow configuring a stopping criterion that keeps track of the number of rules in a model that perform best with respect to the examples in the training or holdout set according to a certain measure.

This stopping criterion assesses the performance of the current model after every interval rules and checks whether it is the best model evaluated so far.
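The idea described above can be sketched in plain Python. The function `best_model_size` is hypothetical and not part of this module. It assumes that a lower quality value is better and that model sizes below `min_rules` are never considered; neither assumption is confirmed by this page.

```python
def best_model_size(qualities, interval, min_rules=1):
    """
    Returns the number of rules of the best model among those evaluated.

    qualities[i] is the quality of the model that consists of the first
    i + 1 rules (assumption: lower is better). The model is only evaluated
    after every `interval` rules.
    """
    best_size = None
    best_quality = float("inf")

    for num_rules in range(interval, len(qualities) + 1, interval):
        if num_rules < min_rules:
            # Assumption: models with fewer than min_rules rules are skipped
            continue

        quality = qualities[num_rules - 1]

        if quality < best_quality:
            best_quality = quality
            best_size = num_rules

    return best_size


qualities = [0.9, 0.8, 0.6, 0.5, 0.55, 0.4, 0.45, 0.5]
print(best_model_size(qualities, interval=2))
```

With an interval of 2, only the models with 2, 4, 6 and 8 rules are compared, so a better model with an odd number of rules would not be found.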

get_interval() → int¶

Returns the interval that is used to check whether the current model is the best one evaluated so far.

Returns:: The interval that is used to check whether the current model is the best one evaluated so far

get_min_rules() → int¶

Returns the minimum number of rules that must be included in a model.

Returns:: The minimum number of rules that must be included in a model

is_holdout_set_used() → bool¶

Returns whether the quality of the current model’s predictions is measured on the holdout set, if available, or if the training set is used instead.

Returns:: True, if the quality of the current model’s predictions is measured on the holdout set, if available, False, if the training set is used instead

is_remove_unused_rules() → bool¶

Returns whether rules that have been induced, but are not used, should be removed from the final model or not.

Returns:: True, if unused rules should be removed from the model, False otherwise

set_interval(interval: int) → PostPruningConfig¶

Sets the interval that should be used to check whether the current model is the best one evaluated so far.

Parameters:: interval – The interval that should be used to check whether the current model is the best one evaluated so far, e.g., a value of 10 means that the best model may include 10, 20, … rules
Returns:: A PostPruningConfig that allows further configuration of the stopping criterion

set_min_rules(min_rules: int) → PostPruningConfig¶

Sets the minimum number of rules that must be included in a model.

Parameters:: min_rules – The minimum number of rules that must be included in a model. Must be at least 1
Returns:: A PostPruningConfig that allows further configuration of the stopping criterion

set_remove_unused_rules(remove_unused_rules: bool) → PostPruningConfig¶

Sets whether rules that have been induced, but are not used, should be removed from the final model or not.

Parameters:: remove_unused_rules – True, if unused rules should be removed from the model, False otherwise
Returns:: A PostPruningConfig that allows further configuration of the stopping criterion

set_use_holdout_set(use_holdout_set: bool) → PostPruningConfig¶

Sets whether the quality of the current model’s predictions should be measured on the holdout set, if available, or if the training set should be used instead.

Parameters:: use_holdout_set – True, if the quality of the current model’s predictions should be measured on the holdout set, if available, False, if the training set should be used instead
Returns:: A PostPruningConfig that allows further configuration of the stopping criterion
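Because every setter returns the configuration object, calls can be chained. The following sketch demonstrates this pattern with `PostPruningConfigSketch`, a hypothetical pure-Python stand-in that only mimics the method names documented above. How an actual PostPruningConfig is obtained is not covered on this page, and the default values used here are placeholders.

```python
class PostPruningConfigSketch:
    """Hypothetical stand-in that mimics the documented getters and setters."""

    def __init__(self):
        # Placeholder defaults, chosen for illustration only
        self._interval = 1
        self._min_rules = 1
        self._use_holdout_set = True
        self._remove_unused_rules = True

    def get_interval(self) -> int:
        return self._interval

    def get_min_rules(self) -> int:
        return self._min_rules

    def is_holdout_set_used(self) -> bool:
        return self._use_holdout_set

    def is_remove_unused_rules(self) -> bool:
        return self._remove_unused_rules

    def set_interval(self, interval: int) -> "PostPruningConfigSketch":
        self._interval = interval
        return self

    def set_min_rules(self, min_rules: int) -> "PostPruningConfigSketch":
        if min_rules < 1:
            raise ValueError("min_rules must be at least 1")
        self._min_rules = min_rules
        return self

    def set_use_holdout_set(self, use_holdout_set: bool) -> "PostPruningConfigSketch":
        self._use_holdout_set = use_holdout_set
        return self

    def set_remove_unused_rules(self, remove_unused_rules: bool) -> "PostPruningConfigSketch":
        self._remove_unused_rules = remove_unused_rules
        return self


# Each setter returns the object itself, which allows chaining
config = (PostPruningConfigSketch()
          .set_interval(10)
          .set_min_rules(100)
          .set_use_holdout_set(True)
          .set_remove_unused_rules(False))
print(config.get_interval(), config.get_min_rules())
```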

class mlrl.common.cython.stopping_criterion.PrePruningConfig¶

Bases: object

Allows configuring a stopping criterion that stops the induction of rules as soon as the quality of a model’s predictions for the examples in a holdout set does not improve according to a certain measure.

This stopping criterion assesses the performance of the current model after every update_interval rules and stores its quality in a buffer that keeps track of the last num_current iterations. If the capacity of this buffer is already reached, the oldest quality is passed to a second buffer of size num_past. Every stop_interval rules, it is decided whether the rule induction should be stopped. For this purpose, the num_current qualities in the first buffer, as well as the num_past qualities in the second buffer, are aggregated according to a certain aggregation_function. If the percentage improvement, which results from comparing the more recent qualities from the first buffer to the older qualities from the second buffer, is greater than a certain min_improvement, the rule induction is continued; otherwise, it is stopped.

get_aggregation_function() → AggregationFunction¶

Returns the type of the aggregation function that is used to aggregate the values that are stored in a buffer.

Returns:: A value of the enum AggregationFunction that specifies the type of the aggregation function that is used to aggregate the values that are stored in a buffer

get_min_improvement() → float¶

Returns the minimum improvement that must be reached for the rule induction to be continued.

Returns:: The minimum improvement that must be reached for the rule induction to be continued

get_min_rules() → int¶

Returns the minimum number of rules that must have been learned until the induction of rules might be stopped.

Returns:: The minimum number of rules that must have been learned until the induction of rules might be stopped

get_num_current() → int¶

Returns the number of the most recent iterations that are stored in a buffer.

Returns:: The number of the most recent iterations that are stored in a buffer

get_num_past() → int¶

Returns the number of quality scores of past iterations that are stored in a buffer.

Returns:: The number of quality scores of past iterations that are stored in a buffer

get_stop_interval() → int¶

Returns the interval that is used to decide whether the induction of rules should be stopped.

Returns:: The interval that is used to decide whether the induction of rules should be stopped

get_update_interval() → int¶

Returns the interval that is used to update the quality of the current model.

Returns:: The interval that is used to update the quality of the current model

is_holdout_set_used() → bool¶

Returns whether the quality of the current model’s predictions is measured on the holdout set, if available, or if the training set is used instead.

Returns:: True, if the quality of the current model’s predictions is measured on the holdout set, if available, False, if the training set is used instead

is_remove_unused_rules() → bool¶

Returns whether rules that have been induced, but are not used, should be removed from the final model or not.

Returns:: True, if unused rules should be removed from the model, False otherwise

set_aggregation_function(aggregation_function: AggregationFunction) → PrePruningConfig¶

Sets the type of the aggregation function that should be used to aggregate the values that are stored in a buffer.

Parameters:: aggregation_function – A value of the enum AggregationFunction that specifies the type of the aggregation function that should be used to aggregate the values that are stored in a buffer
Returns:: A PrePruningConfig that allows further configuration of the stopping criterion

set_min_improvement(min_improvement: float) → PrePruningConfig¶

Sets the minimum improvement that must be reached for the rule induction to be continued.

Parameters:: min_improvement – The minimum improvement that must be reached for the rule induction to be continued, given as a fraction, e.g., a value of 0.05 corresponds to 5 percent. Must be in [0, 1]
Returns:: A PrePruningConfig that allows further configuration of the stopping criterion

set_min_rules(min_rules: int) → PrePruningConfig¶

Sets the minimum number of rules that must have been learned until the induction of rules might be stopped.

Parameters:: min_rules – The minimum number of rules that must have been learned until the induction of rules might be stopped. Must be at least 1
Returns:: A PrePruningConfig that allows further configuration of the stopping criterion

set_num_current(num_current: int) → PrePruningConfig¶

Sets the number of the most recent iterations that should be stored in a buffer.

Parameters:: num_current – The number of the most recent iterations that should be stored in a buffer. Must be at least 1
Returns:: A PrePruningConfig that allows further configuration of the stopping criterion

set_num_past(num_past: int) → PrePruningConfig¶

Sets the number of past iterations that should be stored in a buffer.

Parameters:: num_past – The number of past iterations that should be stored in a buffer. Must be at least 1
Returns:: A PrePruningConfig that allows further configuration of the stopping criterion

set_remove_unused_rules(remove_unused_rules: bool) → PrePruningConfig¶

Sets whether rules that have been induced, but are not used, should be removed from the final model or not.

Parameters:: remove_unused_rules – True, if unused rules should be removed from the model, False otherwise
Returns:: A PrePruningConfig that allows further configuration of the stopping criterion

set_stop_interval(stop_interval: int) → PrePruningConfig¶

Sets the interval that should be used to decide whether the induction of rules should be stopped.

Parameters:: stop_interval – The interval that should be used to decide whether the induction of rules should be stopped, e.g., a value of 10 means that the rule induction might be stopped after 10, 20, … rules. Must be a multiple of the update interval
Returns:: A PrePruningConfig that allows further configuration of the stopping criterion
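The constraint that the stop interval must be a multiple of the update interval can be checked up front. The helper `check_intervals` below is a hypothetical sketch and not part of this module; how the library itself reacts to invalid values is not described on this page.

```python
def check_intervals(update_interval: int, stop_interval: int) -> int:
    """
    Validates both intervals and returns the number of quality updates
    that take place between two consecutive stopping decisions.
    """
    if update_interval < 1:
        raise ValueError("update_interval must be at least 1")
    if stop_interval < 1 or stop_interval % update_interval != 0:
        raise ValueError("stop_interval must be a multiple of update_interval")
    return stop_interval // update_interval


print(check_intervals(update_interval=5, stop_interval=10))
```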

set_update_interval(update_interval: int) → PrePruningConfig¶

Sets the interval that should be used to update the quality of the current model.

Parameters:: update_interval – The interval that should be used to update the quality of the current model, e.g., a value of 5 means that the model quality is assessed every 5 rules. Must be at least 1
Returns:: A PrePruningConfig that allows further configuration of the stopping criterion

set_use_holdout_set(use_holdout_set: bool) → PrePruningConfig¶

Sets whether the quality of the current model’s predictions should be measured on the holdout set, if available, or if the training set should be used instead.

Parameters:: use_holdout_set – True, if the quality of the current model’s predictions should be measured on the holdout set, if available, False, if the training set should be used instead
Returns:: A PrePruningConfig that allows further configuration of the stopping criterion

class mlrl.common.cython.stopping_criterion.SizeStoppingCriterionConfig¶

Bases: object

Allows configuring a stopping criterion that ensures that the number of induced rules does not exceed a certain maximum.

get_max_rules() → int¶

Returns the maximum number of rules that are induced.

Returns:: The maximum number of rules that are induced

set_max_rules(max_rules: int) → SizeStoppingCriterionConfig¶

Sets the maximum number of rules that should be induced.

Parameters:: max_rules – The maximum number of rules that should be induced. Must be at least 1
Returns:: A SizeStoppingCriterionConfig that allows further configuration of the stopping criterion

class mlrl.common.cython.stopping_criterion.TimeStoppingCriterionConfig¶

Bases: object

Allows configuring a stopping criterion that ensures that a certain time limit is not exceeded.

get_time_limit() → int¶

Returns the time limit.

Returns:: The time limit in seconds

set_time_limit(time_limit: int) → TimeStoppingCriterionConfig¶

Sets the time limit.

Parameters:: time_limit – The time limit in seconds. Must be at least 1
Returns:: A TimeStoppingCriterionConfig that allows further configuration of the stopping criterion
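To illustrate how a maximum number of rules and a time limit could bound a rule induction loop, the following sketch uses a hypothetical function `induce_rules` with a callback `induce_next_rule`. Neither is part of this module. Unlike the documented setter, which expects whole seconds, the sketch also accepts fractional seconds to keep the example fast.

```python
import time


def induce_rules(induce_next_rule, max_rules=None, time_limit=None):
    """
    Calls induce_next_rule repeatedly until one of the limits is reached.

    max_rules: the maximum number of rules, or None for no limit
    time_limit: the time limit in seconds, or None for no limit
    """
    if max_rules is None and time_limit is None:
        raise ValueError("At least one limit is required to terminate")

    start = time.monotonic()
    rules = []

    while True:
        if max_rules is not None and len(rules) >= max_rules:
            break  # Size limit reached
        if time_limit is not None and time.monotonic() - start >= time_limit:
            break  # Time limit reached

        rules.append(induce_next_rule(len(rules)))

    return rules


rules = induce_rules(lambda index: f"rule {index}", max_rules=5)
print(len(rules))
```

The time limit is only checked between two rules, so the total runtime can exceed the limit by the time that is needed to induce a single rule.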