mlrl.common package
Submodules
mlrl.common.arrays module
Author: Michael Rapp (michael.rapp.ml@gmail.com)
Provides utility functions for handling arrays.
- mlrl.common.arrays.enforce_dense(a, order: str, dtype)
Converts a given array into a np.ndarray, if necessary, and enforces a specific memory layout and type.
- Parameters
a – A np.ndarray or scipy.sparse.matrix to be converted
order – The memory layout to be used. Must be C or F
dtype – The type to be used
- Returns
A np.ndarray that uses the given memory layout
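The conversion described above can be approximated with plain NumPy. The following is a minimal sketch, not the library's implementation: the name `enforce_dense_sketch` is hypothetical, and sparse inputs are detected by duck-typing on `toarray()` rather than by importing SciPy.

```python
import numpy as np


def enforce_dense_sketch(a, order: str, dtype):
    """Illustrative stand-in; the name and the details are assumptions."""
    if order not in ("C", "F"):
        raise ValueError("order must be 'C' or 'F'")
    # scipy.sparse matrices expose toarray(); duck-typing avoids a SciPy import here.
    if hasattr(a, "toarray"):
        a = a.toarray()
    # np.require only copies if the dtype or the memory layout does not match yet.
    return np.require(a, dtype=dtype, requirements=[order])


x = np.arange(6, dtype=np.int64).reshape(2, 3)  # C-contiguous by default
dense = enforce_dense_sketch(x, order="F", dtype=np.float32)
print(dense.flags["F_CONTIGUOUS"], dense.dtype)
```

Using `np.require` keeps the call cheap when the input already satisfies the requested layout and type, because no copy is made in that case.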
mlrl.common.data_types module
Author: Michael Rapp (michael.rapp.ml@gmail.com)
Provides type definitions.
mlrl.common.learners module
Author: Michael Rapp (michael.rapp.ml@gmail.com)
Provides base classes for implementing single- or multi-label classifiers or rankers.
- class mlrl.common.learners.Learner
Bases: sklearn.base.BaseEstimator
A base class for all single- or multi-label classifiers or rankers.
- fit(x, y)
Fits a model according to given training examples and corresponding ground truth labels.
- Parameters
x – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_features), that stores the feature values of the training examples
y – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_labels), that stores the labels of the training examples according to the ground truth
- Returns
The fitted learner
- abstract get_name() -> str
Returns a human-readable name that identifies the configuration used by the classifier or ranker.
- Returns
The name of the classifier or ranker
- predict(x)
Makes a prediction for given query examples.
- Parameters
x – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_features), that stores the feature values of the query examples
- Returns
A numpy.ndarray or scipy.sparse matrix of shape (num_examples, num_labels), that stores the prediction for individual examples and labels
- predict_proba(x)
Returns probability estimates for given query examples.
- Parameters
x – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_features), that stores the feature values of the query examples
- Returns
A numpy.ndarray or scipy.sparse matrix of shape (num_examples, num_labels), that stores the probabilities for individual examples and labels
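The method contract documented above (an abstract `get_name`, a `fit` that returns the fitted learner, and a `predict` that yields one row per query example and one column per label) can be illustrated without scikit-learn. The sketch below uses a hypothetical stand-in base class, `LearnerSketch`, with hypothetical hooks `_fit` and `_predict`, and plain lists in place of arrays; it does not reproduce the library's own class.

```python
from abc import ABC, abstractmethod
from collections import Counter


class LearnerSketch(ABC):
    """Hypothetical stand-in that mirrors the documented method contract."""

    @abstractmethod
    def get_name(self) -> str:
        """Returns a human-readable name of the configuration."""

    @abstractmethod
    def _fit(self, x, y):
        """Hypothetical hook that trains the model."""

    @abstractmethod
    def _predict(self, x):
        """Hypothetical hook that computes the predictions."""

    def fit(self, x, y):
        self._fit(x, y)
        return self  # returning the fitted learner allows call chaining

    def predict(self, x):
        return self._predict(x)


class MajorityLabelLearner(LearnerSketch):
    """Predicts, for every label, the value seen most often during training."""

    def get_name(self) -> str:
        return "majority-label"

    def _fit(self, x, y):
        # y has shape (num_examples, num_labels); zip(*y) iterates over label columns.
        self.majority_ = [Counter(column).most_common(1)[0][0] for column in zip(*y)]

    def _predict(self, x):
        # One row per query example, one column per label.
        return [list(self.majority_) for _ in x]


x_train = [[0.1, 1.0], [0.3, 0.2], [0.9, 0.5]]
y_train = [[1, 0], [1, 0], [0, 0]]
learner = MajorityLabelLearner().fit(x_train, y_train)
predictions = learner.predict([[0.5, 0.5], [0.2, 0.8]])
print(learner.get_name(), predictions)
```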
mlrl.common.options module
Author: Michael Rapp (michael.rapp.ml@gmail.com)
Provides a data structure for storing and parsing options that are provided as key-value pairs.
- class mlrl.common.options.BooleanOption(value)
Bases: enum.Enum
An enumeration.
- FALSE = 'false'
- TRUE = 'true'
- static parse(s) -> bool
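The reference does not describe `parse`. A plausible reading, given the two members and the `bool` return type, is that it maps the strings 'true' and 'false' to the corresponding boolean. The sketch below assumes exactly that, and additionally assumes that any other input raises a ValueError; the class name `BooleanOptionSketch` is hypothetical.

```python
from enum import Enum


class BooleanOptionSketch(Enum):
    """Illustrative stand-in; the behavior of parse() is an assumption."""

    TRUE = "true"
    FALSE = "false"

    @staticmethod
    def parse(s: str) -> bool:
        if s == BooleanOptionSketch.TRUE.value:
            return True
        if s == BooleanOptionSketch.FALSE.value:
            return False
        allowed = ", ".join(option.value for option in BooleanOptionSketch)
        raise ValueError(f"Expected one of: {allowed}, but got: {s!r}")


print(BooleanOptionSketch.parse("true"), BooleanOptionSketch.parse("false"))
```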
- class mlrl.common.options.Options
Bases: object
Stores key-value pairs in a dictionary and provides methods to access and validate them.
- ERROR_MESSAGE_INVALID_OPTION = 'Expected comma-separated list of key-value pairs'
- ERROR_MESSAGE_INVALID_SYNTAX = 'Invalid syntax used to specify additional options'
- classmethod create(string: str, allowed_keys: Set[str])
Parses the options that are provided via a given string that is formatted according to the following syntax: “[key1=value1,key2=value2]”. If the given string is malformed, a ValueError will be raised.
- Parameters
string – The string to be parsed
allowed_keys – A set that contains all valid keys
- Returns
An object of type Options that stores the key-value pairs that have been parsed from the given string
- get_bool(key: str, default_value: bool) -> bool
Returns a boolean that corresponds to a specific key.
- Parameters
key – The key
default_value – The default value to be returned, if no value is associated with the given key
- Returns
The value that is associated with the given key or the given default value
- get_float(key: str, default_value: float) -> float
Returns a float that corresponds to a specific key.
- Parameters
key – The key
default_value – The default value to be returned, if no value is associated with the given key
- Returns
The value that is associated with the given key or the given default value
- get_int(key: str, default_value: int) -> int
Returns an integer that corresponds to a specific key.
- Parameters
key – The key
default_value – The default value to be returned, if no value is associated with the given key
- Returns
The value that is associated with the given key or the given default value
- get_string(key: str, default_value: str) -> str
Returns a string that corresponds to a specific key.
- Parameters
key – The key
default_value – The default value to be returned, if no value is associated with the given key
- Returns
The value that is associated with the given key or the given default value
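To make the documented syntax and the typed getters concrete, here is a minimal sketch of how a string such as “[key1=value1,key2=value2]” could be parsed and queried. The class `OptionsSketch` is a hypothetical stand-in. The two error messages are taken from the constants listed above, but which message is raised in which situation, the handling of unknown keys, and the handling of empty strings are all assumptions. The keys used in the usage example are made up for illustration.

```python
from typing import Dict, Set


class OptionsSketch:
    """Illustrative stand-in for the documented Options class."""

    def __init__(self) -> None:
        self.dictionary: Dict[str, str] = {}

    @classmethod
    def create(cls, string: str, allowed_keys: Set[str]) -> "OptionsSketch":
        options = cls()
        if not string:
            return options  # assumption: an empty string yields no options
        if not (string.startswith("[") and string.endswith("]")):
            raise ValueError("Invalid syntax used to specify additional options")
        body = string[1:-1].strip()
        for pair in (body.split(",") if body else []):
            key, separator, value = pair.partition("=")
            key, value = key.strip(), value.strip()
            if not separator or not key or not value:
                raise ValueError("Expected comma-separated list of key-value pairs")
            if key not in allowed_keys:
                raise ValueError(f"Key must be one of {sorted(allowed_keys)}, but got: {key!r}")
            options.dictionary[key] = value
        return options

    def get_string(self, key: str, default_value: str) -> str:
        return self.dictionary.get(key, default_value)

    def get_int(self, key: str, default_value: int) -> int:
        return int(self.dictionary[key]) if key in self.dictionary else default_value

    def get_float(self, key: str, default_value: float) -> float:
        return float(self.dictionary[key]) if key in self.dictionary else default_value

    def get_bool(self, key: str, default_value: bool) -> bool:
        if key not in self.dictionary:
            return default_value
        value = self.dictionary[key]
        if value not in ("true", "false"):
            raise ValueError(f"Expected 'true' or 'false' for key {key!r}, but got: {value!r}")
        return value == "true"


# Hypothetical keys, chosen only for this example.
allowed = {"sample_size", "num_bins", "enabled"}
options = OptionsSketch.create("[sample_size=0.5,num_bins=32,enabled=true]", allowed)
print(options.get_float("sample_size", 1.0), options.get_int("num_bins", 8))
```

Values are kept as strings and converted only when a typed getter is called, so the same parsed object can serve callers that expect different types.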
mlrl.common.rule_learners module
Author: Michael Rapp (michael.rapp.ml@gmail.com)
Provides base classes for implementing single- or multi-label rule learning algorithms.
- class mlrl.common.rule_learners.MLRuleLearner(random_state: int, feature_format: str, label_format: str, prediction_format: str)
Bases: mlrl.common.learners.Learner, mlrl.common.learners.NominalAttributeLearner
A scikit-multilearn implementation of a rule learning algorithm for multi-label classification or ranking.
- class mlrl.common.rule_learners.SparseFormat(value)
Bases: enum.Enum
An enumeration.
- CSC = 'csc'
- CSR = 'csr'
- class mlrl.common.rule_learners.SparsePolicy(value)
Bases: enum.Enum
An enumeration.
- AUTO = 'auto'
- FORCE_DENSE = 'dense'
- FORCE_SPARSE = 'sparse'
- mlrl.common.rule_learners.configure_feature_binning(config: mlrl.common.cython.learner.RuleLearnerConfig, feature_binning: Optional[str])
- mlrl.common.rule_learners.configure_feature_sampling(config: mlrl.common.cython.learner.RuleLearnerConfig, feature_sampling: Optional[str])
- mlrl.common.rule_learners.configure_instance_sampling(config: mlrl.common.cython.learner.RuleLearnerConfig, instance_sampling: Optional[str])
- mlrl.common.rule_learners.configure_label_sampling(config: mlrl.common.cython.learner.RuleLearnerConfig, label_sampling: Optional[str])
- mlrl.common.rule_learners.configure_parallel_prediction(config: mlrl.common.cython.learner.RuleLearnerConfig, parallel_prediction: Optional[str])
- mlrl.common.rule_learners.configure_parallel_rule_refinement(config: mlrl.common.cython.learner.RuleLearnerConfig, parallel_rule_refinement: Optional[str])
- mlrl.common.rule_learners.configure_parallel_statistic_update(config: mlrl.common.cython.learner.RuleLearnerConfig, parallel_statistic_update: Optional[str])
- mlrl.common.rule_learners.configure_partition_sampling(config: mlrl.common.cython.learner.RuleLearnerConfig, partition_sampling: Optional[str])
- mlrl.common.rule_learners.configure_pruning(config: mlrl.common.cython.learner.RuleLearnerConfig, pruning: Optional[str])
- mlrl.common.rule_learners.configure_rule_induction(config: mlrl.common.cython.learner.RuleLearnerConfig, rule_induction: Optional[str])
- mlrl.common.rule_learners.configure_rule_model_assemblage(config: mlrl.common.cython.learner.RuleLearnerConfig, rule_model_assemblage: Optional[str])
- mlrl.common.rule_learners.configure_size_stopping_criterion(config: mlrl.common.cython.learner.RuleLearnerConfig, max_rules: Optional[int])
- mlrl.common.rule_learners.configure_time_stopping_criterion(config: mlrl.common.cython.learner.RuleLearnerConfig, time_limit: Optional[int])
- mlrl.common.rule_learners.create_sparse_policy(parameter_name: str, policy: str) -> mlrl.common.rule_learners.SparsePolicy
- mlrl.common.rule_learners.is_sparse(m, sparse_format: mlrl.common.rule_learners.SparseFormat, dtype, sparse_values: bool = True) -> bool
Returns whether a given matrix is considered sparse or not. A matrix is considered sparse if it is given in a sparse format and is expected to occupy less memory than a dense matrix.
- Parameters
m – A np.ndarray or scipy.sparse.matrix to be checked
sparse_format – The SparseFormat to be used
dtype – The type of the values that should be stored in the matrix
sparse_values – True, if the values must explicitly be stored when using a sparse format, False otherwise
- Returns
True, if the given matrix is considered sparse, False otherwise
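The memory comparison behind this check can be sketched with simple arithmetic. The estimate below is an assumption about how such a comparison might work, not the library's formula: it models a CSR or CSC layout as one pointer array, one index per non-zero element and, if `sparse_values` is true, one stored value per non-zero element. The function names and the 4-byte index size are hypothetical.

```python
def estimated_sparse_bytes(num_pointers: int, num_non_zero: int, value_size: int,
                           index_size: int = 4, sparse_values: bool = True) -> int:
    # Pointer array (one entry per row or column, plus one) and one index per non-zero.
    size = (num_pointers + 1) * index_size + num_non_zero * index_size
    if sparse_values:
        # The non-zero values themselves must be stored as well.
        size += num_non_zero * value_size
    return size


def estimated_dense_bytes(num_rows: int, num_cols: int, value_size: int) -> int:
    return num_rows * num_cols * value_size


def is_smaller_when_sparse(num_rows: int, num_cols: int, num_non_zero: int, value_size: int,
                           csr: bool = True, sparse_values: bool = True) -> bool:
    # CSR keeps one pointer per row, CSC one pointer per column.
    num_pointers = num_rows if csr else num_cols
    sparse = estimated_sparse_bytes(num_pointers, num_non_zero, value_size,
                                    sparse_values=sparse_values)
    return sparse < estimated_dense_bytes(num_rows, num_cols, value_size)


# A 1000 x 100 matrix of 4-byte values with 5000 vs. 60000 non-zero elements.
print(is_smaller_when_sparse(1000, 100, 5000, 4))
print(is_smaller_when_sparse(1000, 100, 60000, 4))
```

Under these assumptions, a matrix whose values need not be stored (for example, binary labels) stays worthwhile in a sparse format at a higher density than one whose values must be stored.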
- mlrl.common.rule_learners.parse_param(parameter_name: str, value: str, allowed_values: Set[str]) -> str
- mlrl.common.rule_learners.parse_param_and_options(parameter_name: str, value: str, allowed_values_and_options: typing.Dict[str, typing.Set[str]]) -> (str, mlrl.common.options.Options)
- mlrl.common.rule_learners.should_enforce_sparse(m, sparse_format: mlrl.common.rule_learners.SparseFormat, policy: mlrl.common.rule_learners.SparsePolicy, dtype, sparse_values: bool = True) -> bool
Returns whether it is preferable to convert a given matrix into a scipy.sparse.csr_matrix, scipy.sparse.csc_matrix or scipy.sparse.dok_matrix, depending on the format of the given matrix and a given SparsePolicy:
If the given policy is SparsePolicy.AUTO, the matrix will be converted into the given sparse format, if possible, provided that the sparse matrix is expected to occupy less memory than a dense matrix. To be convertible into a sparse format, the matrix must be a scipy.sparse.lil_matrix, scipy.sparse.dok_matrix or scipy.sparse.coo_matrix. If the given sparse format is csr or csc and the matrix is already in that format, it will not be converted.
If the given policy is SparsePolicy.FORCE_SPARSE, the matrix will always be converted into the specified sparse format, if possible.
If the given policy is SparsePolicy.FORCE_DENSE, the matrix will always be converted into a dense matrix.
- Parameters
m – A np.ndarray or scipy.sparse.matrix to be checked
sparse_format – The SparseFormat to be used
policy – The SparsePolicy to be used
dtype – The type of the values that should be stored in the matrix
sparse_values – True, if the values must explicitly be stored when using a sparse format, False otherwise
- Returns
True, if it is preferable to convert the matrix into a sparse matrix of the given format, False otherwise
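The policy rules can be condensed into a small decision function. This is a sketch under stated assumptions rather than the library's code: the inspection of the matrix is abstracted into two booleans (`is_convertible`, whether the matrix can be converted into the requested sparse format, and `saves_memory`, whether the sparse representation is expected to be smaller), and it assumes that FORCE_SPARSE prefers the sparse format whenever conversion is possible while FORCE_DENSE never does. All names ending in `Sketch` or `_sketch` are hypothetical.

```python
from enum import Enum


class SparsePolicySketch(Enum):
    AUTO = "auto"
    FORCE_DENSE = "dense"
    FORCE_SPARSE = "sparse"


def should_enforce_sparse_sketch(is_convertible: bool, saves_memory: bool,
                                 policy: SparsePolicySketch) -> bool:
    if policy is SparsePolicySketch.FORCE_DENSE:
        # A dense matrix is always preferred, regardless of the memory estimate.
        return False
    if policy is SparsePolicySketch.FORCE_SPARSE:
        # The sparse format is preferred whenever conversion is possible.
        return is_convertible
    # AUTO: convert only if possible and if it is expected to save memory.
    return is_convertible and saves_memory


for policy in SparsePolicySketch:
    print(policy.value, should_enforce_sparse_sketch(True, False, policy))
```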
mlrl.common.strings module
Author: Michael Rapp (michael.rapp.ml@gmail.com)
Provides utility functions for dealing with strings.
- mlrl.common.strings.format_dict_keys(dictionary: Dict[str, Set[str]]) -> str
Creates and returns a textual representation of the keys in a dictionary.
- Parameters
dictionary – The dictionary to be formatted
- Returns
The textual representation that has been created
- mlrl.common.strings.format_enum_values(enum) -> str
Creates and returns a textual representation of an enum’s values.
- Parameters
enum – The enum to be formatted
- Returns
The textual representation that has been created
- mlrl.common.strings.format_string_set(strings: Set[str]) -> str
Creates and returns a textual representation of the strings in a set.
- Parameters
strings – The set of strings to be formatted
- Returns
The textual representation that has been created
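The reference does not specify the exact textual representation that these three functions produce. The sketch below shows one plausible shape, assuming sorted, double-quoted items enclosed in curly braces; the output format and the `_sketch` function names are assumptions made for illustration.

```python
from enum import Enum
from typing import Dict, Set


def format_string_set_sketch(strings: Set[str]) -> str:
    # Sorting makes the output deterministic, since sets are unordered.
    return "{" + ", ".join(f'"{s}"' for s in sorted(strings)) + "}"


def format_dict_keys_sketch(dictionary: Dict[str, Set[str]]) -> str:
    return format_string_set_sketch(set(dictionary.keys()))


def format_enum_values_sketch(enum) -> str:
    return format_string_set_sketch({member.value for member in enum})


class FormatSketch(Enum):
    """Hypothetical enum used only to exercise the helper."""
    CSC = "csc"
    CSR = "csr"


print(format_string_set_sketch({"csr", "csc"}))
print(format_enum_values_sketch(FormatSketch))
```

Helpers of this kind are typically used to build error messages that list the allowed values of a parameter.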