mlrl.common.cython.stopping_criterion module¶
@author: Michael Rapp (michael.rapp.ml@gmail.com)
- class mlrl.common.cython.stopping_criterion.AggregationFunction(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
Enum
Specifies the different types of aggregation functions that can be used to aggregate the values that are stored in a buffer.
- ARITHMETIC_MEAN = 2¶
- MAX = 1¶
- MIN = 0¶
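As a rough illustration of what the three members mean, the sketch below reduces a small buffer of quality scores with each of them. The Aggregation enum and the aggregate helper are hypothetical stand-ins that only mirror the member names and values listed above; they are not part of mlrl.

```python
from enum import Enum
from statistics import fmean


class Aggregation(Enum):
    """Hypothetical stand-in that mirrors the documented member values."""
    MIN = 0
    MAX = 1
    ARITHMETIC_MEAN = 2


def aggregate(values, function):
    """Reduces a non-empty buffer of values to a single number."""
    if not values:
        raise ValueError('buffer must not be empty')
    if function is Aggregation.MIN:
        return min(values)
    if function is Aggregation.MAX:
        return max(values)
    return fmean(values)


buffer = [0.5, 0.25, 0.75]
# Maps each member name to the aggregated value of the buffer
results = {function.name: aggregate(buffer, function) for function in Aggregation}
```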
- class mlrl.common.cython.stopping_criterion.AggregationFunctionImpl(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases:
IntFlag
- ARITHMETIC_MEAN = 2¶
- MAX = 1¶
- MIN = 0¶
- class mlrl.common.cython.stopping_criterion.PostPruningConfig¶
Bases:
object
Defines an interface for all classes that allow configuring a stopping criterion that keeps track of the number of rules in a model that perform best with respect to the examples in the training or holdout set according to a certain measure.
This stopping criterion assesses the performance of the current model after every interval rules and checks whether the current model is the best one evaluated so far.
- get_interval() int ¶
Returns the interval that is used to check whether the current model is the best one evaluated so far.
- Returns:
The interval that is used to check whether the current model is the best one evaluated so far
- get_min_rules() int ¶
Returns the minimum number of rules that must be included in a model.
- Returns:
The minimum number of rules that must be included in a model
- is_holdout_set_used() bool ¶
Returns whether the quality of the current model’s predictions is measured on the holdout set, if available, or if the training set is used instead.
- Returns:
True, if the quality of the current model’s predictions is measured on the holdout set, if available, False, if the training set is used instead
- is_remove_unused_rules() bool ¶
Returns whether rules that have been induced, but are not used, should be removed from the final model or not.
- Returns:
True, if unused rules should be removed from the model, False otherwise
- set_interval(interval: int) PostPruningConfig ¶
Sets the interval that should be used to check whether the current model is the best one evaluated so far.
- Parameters:
interval – The interval that should be used to check whether the current model is the best one evaluated so far, e.g., a value of 10 means that the best model may include 10, 20, … rules
- Returns:
A PostPruningConfig that allows further configuration of the stopping criterion
- set_min_rules(min_rules: int) PostPruningConfig ¶
Sets the minimum number of rules that must be included in a model.
- Parameters:
min_rules – The minimum number of rules that must be included in a model. Must be at least 1
- Returns:
A PostPruningConfig that allows further configuration of the stopping criterion
- set_remove_unused_rules(remove_unused_rules: bool) PostPruningConfig ¶
Sets whether rules that have been induced, but are not used, should be removed from the final model or not.
- Parameters:
remove_unused_rules – True, if unused rules should be removed from the model, False otherwise
- Returns:
A PostPruningConfig that allows further configuration of the stopping criterion
- set_use_holdout_set(use_holdout_set: bool) PostPruningConfig ¶
Sets whether the quality of the current model’s predictions should be measured on the holdout set, if available, or if the training set should be used instead.
- Parameters:
use_holdout_set – True, if the quality of the current model’s predictions should be measured on the holdout set, if available, False, if the training set should be used instead
- Returns:
A PostPruningConfig that allows further configuration of the stopping criterion
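The behaviour described for PostPruningConfig can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: best_num_rules is a hypothetical helper, the quality values are made-up losses where lower is better, and ties are resolved in favour of the smaller model.

```python
def best_num_rules(qualities, interval=1, min_rules=1):
    """Returns the size of the best model seen at the checkpoints.

    qualities[i] is the (hypothetical) loss of the model that consists of
    the first i + 1 rules. Lower values are assumed to be better.
    """
    best_size, best_quality = None, float('inf')
    # Only models with interval, 2 * interval, ... rules are evaluated
    for num_rules in range(interval, len(qualities) + 1, interval):
        if num_rules < min_rules:
            continue
        quality = qualities[num_rules - 1]
        if quality < best_quality:
            best_size, best_quality = num_rules, quality
    # Keep all rules if no checkpoint was reached
    return best_size if best_size is not None else len(qualities)


losses = [0.9, 0.7, 0.6, 0.5, 0.55, 0.4, 0.45, 0.5]
rules = [f'rule {i + 1}' for i in range(len(losses))]

size = best_num_rules(losses, interval=2, min_rules=2)
# Analogous to removing the rules that are not used by the best model
pruned = rules[:size]
```

With an interval of 2, only the models with 2, 4, 6 and 8 rules are compared, so a better model at an odd size would go unnoticed. A larger interval trades this granularity for fewer evaluations.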
- class mlrl.common.cython.stopping_criterion.PrePruningConfig¶
Bases:
object
Allows configuring a stopping criterion that stops the induction of rules as soon as the quality of a model’s predictions for the examples in a holdout set does not improve according to a certain measure.
This stopping criterion assesses the performance of the current model after every updateInterval rules and stores its quality in a buffer that keeps track of the last numCurrent iterations. If the capacity of this buffer is already reached, the oldest quality is passed to a buffer of size numPast. Every stopInterval rules, it is decided whether the rule induction should be stopped. For this purpose, the numCurrent qualities in the first buffer, as well as the numPast qualities in the second buffer, are aggregated according to a certain aggregation_function. If the percentage improvement, which results from comparing the more recent qualities from the first buffer to the older qualities from the second buffer, is greater than a certain minImprovement, the rule induction is continued; otherwise it is stopped.
- get_aggregation_function() AggregationFunction ¶
Returns the type of the aggregation function that is used to aggregate the values that are stored in a buffer.
- Returns:
A value of the enum AggregationFunction that specifies the type of the aggregation function that is used to aggregate the values that are stored in a buffer
- get_min_improvement() float ¶
Returns the minimum improvement that must be reached for the rule induction to be continued.
- Returns:
The minimum improvement that must be reached for the rule induction to be continued
- get_min_rules() int ¶
Returns the minimum number of rules that must have been learned until the induction of rules might be stopped.
- Returns:
The minimum number of rules that must have been learned until the induction of rules might be stopped
- get_num_current() int ¶
Returns the number of the most recent iterations that are stored in a buffer.
- Returns:
The number of the most recent iterations that are stored in a buffer
- get_num_past() int ¶
Returns the number of quality scores of past iterations that are stored in a buffer.
- Returns:
The number of quality scores of past iterations that are stored in a buffer
- get_stop_interval() int ¶
Returns the interval that is used to decide whether the induction of rules should be stopped.
- Returns:
The interval that is used to decide whether the induction of rules should be stopped
- get_update_interval() int ¶
Returns the interval that is used to update the quality of the current model.
- Returns:
The interval that is used to update the quality of the current model
- is_holdout_set_used() bool ¶
Returns whether the quality of the current model’s predictions is measured on the holdout set, if available, or if the training set is used instead.
- Returns:
True, if the quality of the current model’s predictions is measured on the holdout set, if available, False, if the training set is used instead
- is_remove_unused_rules() bool ¶
Returns whether rules that have been induced, but are not used, should be removed from the final model or not.
- Returns:
True, if unused rules should be removed from the model, False otherwise
- set_aggregation_function(aggregation_function: AggregationFunction) PrePruningConfig ¶
Sets the type of the aggregation function that should be used to aggregate the values that are stored in a buffer.
- Parameters:
aggregation_function – A value of the enum AggregationFunction that specifies the type of the aggregation function that should be used to aggregate the values that are stored in a buffer
- Returns:
A PrePruningConfig that allows further configuration of the stopping criterion
- set_min_improvement(min_improvement: float) PrePruningConfig ¶
Sets the minimum improvement that must be reached for the rule induction to be continued.
- Parameters:
min_improvement – The minimum improvement in percent that must be reached for the rule induction to be continued. Must be in [0, 1]
- Returns:
A PrePruningConfig that allows further configuration of the stopping criterion
- set_min_rules(min_rules: int) PrePruningConfig ¶
Sets the minimum number of rules that must have been learned until the induction of rules might be stopped.
- Parameters:
min_rules – The minimum number of rules that must have been learned until the induction of rules might be stopped. Must be at least 1
- Returns:
A PrePruningConfig that allows further configuration of the stopping criterion
- set_num_current(num_current: int) PrePruningConfig ¶
Sets the number of the most recent iterations that should be stored in a buffer.
- Parameters:
num_current – The number of the most recent iterations that should be stored in a buffer. Must be at least 1
- Returns:
A PrePruningConfig that allows further configuration of the stopping criterion
- set_num_past(num_past: int) PrePruningConfig ¶
Sets the number of past iterations that should be stored in a buffer.
- Parameters:
num_past – The number of past iterations that should be stored in a buffer. Must be at least 1
- Returns:
A PrePruningConfig that allows further configuration of the stopping criterion
- set_remove_unused_rules(remove_unused_rules: bool) PrePruningConfig ¶
Sets whether rules that have been induced, but are not used, should be removed from the final model or not.
- Parameters:
remove_unused_rules – True, if unused rules should be removed from the model, False otherwise
- Returns:
A PrePruningConfig that allows further configuration of the stopping criterion
- set_stop_interval(stop_interval: int) PrePruningConfig ¶
Sets the interval that should be used to decide whether the induction of rules should be stopped.
- Parameters:
stop_interval – The interval that should be used to decide whether the induction of rules should be stopped, e.g., a value of 10 means that the rule induction might be stopped after 10, 20, … rules. Must be a multiple of the update interval
- Returns:
A PrePruningConfig that allows further configuration of the stopping criterion
- set_update_interval(update_interval: int) PrePruningConfig ¶
Sets the interval that should be used to update the quality of the current model.
- Parameters:
update_interval – The interval that should be used to update the quality of the current model, e.g., a value of 5 means that the model quality is assessed every 5 rules. Must be at least 1
- Returns:
A PrePruningConfig that allows further configuration of the stopping criterion
- set_use_holdout_set(use_holdout_set: bool) PrePruningConfig ¶
Sets whether the quality of the current model’s predictions should be measured on the holdout set, if available, or if the training set should be used instead.
- Parameters:
use_holdout_set – True, if the quality of the current model’s predictions should be measured on the holdout set, if available, False, if the training set should be used instead
- Returns:
A PrePruningConfig that allows further configuration of the stopping criterion
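The two-buffer scheme described for PrePruningConfig can be sketched as follows. This is a simplified model under several assumptions that the reference above does not confirm: stopping_point is a hypothetical helper, qualities are losses where lower is better, the improvement is computed as the relative decrease from the aggregated past scores to the aggregated recent scores, and no decision is made before both buffers are full.

```python
from collections import deque
from statistics import fmean


def stopping_point(qualities, update_interval=1, stop_interval=1,
                   num_past=3, num_current=2, min_improvement=0.01,
                   min_rules=1, aggregate=fmean):
    """Returns the number of rules after which induction would stop, or None.

    qualities[i] is the (hypothetical) loss of the model with i + 1 rules.
    """
    if stop_interval % update_interval != 0:
        raise ValueError('stop_interval must be a multiple of update_interval')
    past = deque(maxlen=num_past)        # older scores, oldest dropped first
    current = deque(maxlen=num_current)  # most recent scores
    for num_rules, quality in enumerate(qualities, start=1):
        if num_rules % update_interval == 0:
            if len(current) == num_current:
                # The oldest recent score moves on to the buffer of past scores
                past.append(current.popleft())
            current.append(quality)
        buffers_full = len(past) == num_past and len(current) == num_current
        if (num_rules >= min_rules and num_rules % stop_interval == 0
                and buffers_full):
            old, new = aggregate(past), aggregate(current)
            improvement = (old - new) / old if old != 0 else 0.0
            if improvement <= min_improvement:
                return num_rules
    return None


losses = [1.0, 0.8, 0.6, 0.5, 0.45, 0.44, 0.44, 0.44, 0.44, 0.44]
stopped_after = stopping_point(losses)
# Using the minimum instead of the mean reacts earlier to the plateau
stopped_after_min = stopping_point(losses, aggregate=min)
```

The aggregate argument plays the role of the aggregation function: passing min, max or fmean corresponds to the MIN, MAX and ARITHMETIC_MEAN members.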
- class mlrl.common.cython.stopping_criterion.SizeStoppingCriterionConfig¶
Bases:
object
Allows configuring a stopping criterion that ensures that the number of induced rules does not exceed a certain maximum.
- get_max_rules() int ¶
Returns the maximum number of rules that are induced.
- Returns:
The maximum number of rules that are induced
- set_max_rules(max_rules: int) SizeStoppingCriterionConfig ¶
Sets the maximum number of rules that should be induced.
- Parameters:
max_rules – The maximum number of rules that should be induced. Must be at least 1
- Returns:
A SizeStoppingCriterionConfig that allows further configuration of the stopping criterion
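The effect of such a size limit can be illustrated with a small loop. The induce_rules function and the rule names are hypothetical and only model the idea; the actual check happens inside the library.

```python
from itertools import count


def induce_rules(candidate_rules, max_rules):
    """Collects rules until max_rules have been added to the model."""
    if max_rules < 1:
        raise ValueError('max_rules must be at least 1')
    model = []
    for rule in candidate_rules:
        if len(model) >= max_rules:
            break  # the size limit has been reached
        model.append(rule)
    return model


# An endless stream of candidates shows that the limit ends the loop
candidates = (f'rule {i}' for i in count(1))
model = induce_rules(candidates, max_rules=3)
```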
- class mlrl.common.cython.stopping_criterion.TimeStoppingCriterionConfig¶
Bases:
object
Allows configuring a stopping criterion that ensures that a certain time limit is not exceeded.
- set_time_limit(time_limit: int) TimeStoppingCriterionConfig ¶
Sets the time limit.
- Parameters:
time_limit – The time limit in seconds. Must be at least 1
- Returns:
A TimeStoppingCriterionConfig that allows further configuration of the stopping criterion
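A time limit of this kind can be sketched with a monotonic clock. The names induce_until and FakeClock are hypothetical, and checking the elapsed time before each new rule is an assumption made for this sketch, not a statement about how the library measures time. The fake clock only serves to make the example deterministic.

```python
import time


def induce_until(time_limit, induce_rule, clock=time.monotonic):
    """Adds rules to a model until time_limit seconds have elapsed."""
    if time_limit < 1:
        raise ValueError('time_limit must be at least 1')
    start = clock()
    model = []
    # The elapsed time is checked before each new rule is induced
    while clock() - start < time_limit:
        model.append(induce_rule(len(model)))
    return model


class FakeClock:
    """Deterministic clock that advances by a fixed step on every call."""

    def __init__(self, step):
        self.now = 0.0
        self.step = step

    def __call__(self):
        current = self.now
        self.now += self.step
        return current


model = induce_until(2, lambda index: f'rule {index + 1}', clock=FakeClock(0.5))
```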