mlrl.common package

Submodules

mlrl.common.arrays module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides utility functions for handling arrays.

mlrl.common.arrays.enforce_dense(a, order: str, dtype)

Converts a given array into a np.ndarray, if necessary, and enforces a specific memory layout and type.

Parameters

a – A np.ndarray or scipy.sparse.matrix to be converted
order – The memory layout to be used. Must be C or F
dtype – The type to be used

Returns

A np.ndarray that uses the given memory layout
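
The conversion described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the name enforce_dense_sketch is hypothetical, and sparse input is detected by duck typing on toarray(), which is an assumption.

```python
import numpy as np


def enforce_dense_sketch(a, order: str, dtype):
    """Hypothetical stand-in for mlrl.common.arrays.enforce_dense."""
    if order not in ("C", "F"):
        raise ValueError("order must be 'C' or 'F'")
    # scipy.sparse matrices expose toarray(); plain arrays do not (assumption:
    # duck typing is good enough for this sketch).
    if hasattr(a, "toarray"):
        a = a.toarray()
    # np.require only copies if the dtype or memory layout does not match yet.
    return np.require(a, dtype=dtype, requirements=[order])


x = np.arange(6, dtype=np.int64).reshape(2, 3)  # C-contiguous by default
dense = enforce_dense_sketch(x, order="F", dtype=np.float32)
```

Here dense is a Fortran-contiguous float32 array with the same shape and values as x.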

mlrl.common.data_types module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides type definitions.

mlrl.common.learners module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides base classes for implementing single- or multi-label classifiers or rankers.

class mlrl.common.learners.Learner

Bases: sklearn.base.BaseEstimator

A base class for all single- or multi-label classifiers or rankers.

fit(x, y)

Fits a model according to given training examples and corresponding ground truth labels.

Parameters

x – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_features), that stores the feature values of the training examples
y – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_labels), that stores the labels of the training examples according to the ground truth

Returns

The fitted learner

abstract get_name() → str

Returns a human-readable name that identifies the configuration used by the classifier or ranker.

Returns: The name of the classifier or ranker

predict(x)

Makes a prediction for given query examples.

Parameters: x – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_features), that stores the feature values of the query examples
Returns: A numpy.ndarray or scipy.sparse matrix of shape (num_examples, num_labels), that stores the prediction for individual examples and labels

predict_proba(x)

Returns probability estimates for given query examples.

Parameters: x – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_features), that stores the feature values of the query examples
Returns: A numpy.ndarray or scipy.sparse matrix of shape (num_examples, num_labels), that stores the probabilities for individual examples and labels
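
The contract documented above (fit returns the fitted learner, get_name is abstract) can be illustrated with a dependency-free stand-in. LearnerSketch, MajorityLabelLearner and the _fit/_predict hooks are hypothetical names that do not exist in mlrl.common, and nested lists are used in place of numpy.ndarray to keep the sketch self-contained.

```python
from abc import ABC, abstractmethod


class LearnerSketch(ABC):
    """Hypothetical stand-in that mirrors the documented Learner interface."""

    def fit(self, x, y):
        self._fit(x, y)
        return self  # returning the fitted learner allows chained calls

    def predict(self, x):
        return self._predict(x)

    @abstractmethod
    def get_name(self) -> str:
        ...

    @abstractmethod
    def _fit(self, x, y):
        ...

    @abstractmethod
    def _predict(self, x):
        ...


class MajorityLabelLearner(LearnerSketch):
    """Toy learner that predicts each label's majority value in the training data."""

    def get_name(self) -> str:
        return "majority-label"

    def _fit(self, x, y):
        num_examples = len(y)
        num_labels = len(y[0])
        # A label is predicted as relevant if at least half of the examples have it.
        self.majority_ = [
            int(2 * sum(row[j] for row in y) >= num_examples)
            for j in range(num_labels)
        ]

    def _predict(self, x):
        return [list(self.majority_) for _ in x]


x_train = [[0.1, 1.0], [0.4, 0.2], [0.9, 0.7]]  # shape (num_examples, num_features)
y_train = [[1, 0], [1, 1], [0, 0]]              # shape (num_examples, num_labels)
learner = MajorityLabelLearner().fit(x_train, y_train)
predictions = learner.predict([[0.5, 0.5], [0.2, 0.8]])
```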

class mlrl.common.learners.NominalAttributeLearner

Bases: abc.ABC

A base class for all single- or multi-label classifiers or rankers that natively support nominal attributes.

nominal_attribute_indices: List[int] = None

mlrl.common.options module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides a data structure for storing and parsing options that are provided as key-value pairs.

class mlrl.common.options.BooleanOption(value)

Bases: enum.Enum

An enumeration.

FALSE = 'false'

TRUE = 'true'

static parse(s) → bool
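
The behaviour of parse is not documented here. A plausible reading, given the two enum values, is sketched below; BooleanOptionSketch is a hypothetical name, and raising a ValueError for unknown input is an assumption.

```python
from enum import Enum


class BooleanOptionSketch(Enum):
    """Hypothetical mirror of BooleanOption with an assumed parse() behaviour."""
    TRUE = "true"
    FALSE = "false"

    @staticmethod
    def parse(s) -> bool:
        if s == BooleanOptionSketch.TRUE.value:
            return True
        if s == BooleanOptionSketch.FALSE.value:
            return False
        # Assumption: anything other than 'true' or 'false' is rejected.
        raise ValueError(f"Expected 'true' or 'false', got {s!r}")


enabled = BooleanOptionSketch.parse("true")
```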

class mlrl.common.options.Options

Bases: object

Stores key-value pairs in a dictionary and provides methods to access and validate them.

ERROR_MESSAGE_INVALID_OPTION = 'Expected comma-separated list of key-value pairs'

ERROR_MESSAGE_INVALID_SYNTAX = 'Invalid syntax used to specify additional options'

classmethod create(string: str, allowed_keys: Set[str])

Parses the options that are provided via a given string that is formatted according to the following syntax: “[key1=value1,key2=value2]”. If the given string is malformed, a ValueError will be raised.

Parameters

string – The string to be parsed
allowed_keys – A set that contains all valid keys

Returns

An object of type Options that stores the key-value pairs that have been parsed from the given string
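
The bracketed syntax described above can be parsed roughly as follows. This is a sketch under stated assumptions, not the library's code: parse_options is a hypothetical function that returns a plain dictionary instead of an Options object, the keys in the example are invented, and rejecting keys outside allowed_keys with a ValueError is assumed.

```python
from typing import Dict, Set

ERROR_MESSAGE_INVALID_SYNTAX = "Invalid syntax used to specify additional options"
ERROR_MESSAGE_INVALID_OPTION = "Expected comma-separated list of key-value pairs"


def parse_options(string: str, allowed_keys: Set[str]) -> Dict[str, str]:
    """Hypothetical parser for strings such as '[key1=value1,key2=value2]'."""
    if not (string.startswith("[") and string.endswith("]")):
        raise ValueError(ERROR_MESSAGE_INVALID_SYNTAX + ": " + string)
    body = string[1:-1]
    options: Dict[str, str] = {}
    if not body:
        return options  # '[]' yields no options
    for pair in body.split(","):
        key, separator, value = pair.partition("=")
        key, value = key.strip(), value.strip()
        if not separator or not key or not value:
            raise ValueError(ERROR_MESSAGE_INVALID_OPTION + ", got: " + pair)
        if key not in allowed_keys:
            # Assumption: unknown keys are treated as an error.
            raise ValueError(f"Key must be one of {sorted(allowed_keys)}, got: {key}")
        options[key] = value
    return options


parsed = parse_options("[min_bins=5,ratio=0.33]", {"min_bins", "max_bins", "ratio"})
```

All values are kept as strings at this stage; conversion to a specific type happens on access.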

get_bool(key: str, default_value: bool) → bool

Returns a boolean that corresponds to a specific key.

Parameters

key – The key
default_value – The default value to be returned, if no value is associated with the given key

Returns

The value that is associated with the given key or the given default value

get_float(key: str, default_value: float) → float

Returns a float that corresponds to a specific key.

Parameters

key – The key
default_value – The default value to be returned, if no value is associated with the given key

Returns

The value that is associated with the given key or the given default value

get_int(key: str, default_value: int) → int

Returns an integer that corresponds to a specific key.

Parameters

key – The key
default_value – The default value to be returned, if no value is associated with the given key

Returns

The value that is associated with the given key or the given default value

get_string(key: str, default_value: str) → str

Returns a string that corresponds to a specific key.

Parameters

key – The key
default_value – The default value to be returned, if no value is associated with the given key

Returns

The value that is associated with the given key or the given default value
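
The typed getters share one pattern: return the converted value if the key is present, otherwise the default. The sketch below assumes that values are stored as strings and that an unconvertible value raises a ValueError; OptionsSketch and the example keys are hypothetical.

```python
class OptionsSketch:
    """Hypothetical stand-in for Options, backed by a dictionary of strings."""

    def __init__(self, dictionary=None):
        self.dictionary = dict(dictionary or {})

    def get_string(self, key: str, default_value: str) -> str:
        return str(self.dictionary.get(key, default_value))

    def get_int(self, key: str, default_value: int) -> int:
        if key in self.dictionary:
            return int(self.dictionary[key])  # raises ValueError if malformed
        return default_value

    def get_float(self, key: str, default_value: float) -> float:
        if key in self.dictionary:
            return float(self.dictionary[key])  # raises ValueError if malformed
        return default_value

    def get_bool(self, key: str, default_value: bool) -> bool:
        if key in self.dictionary:
            value = self.dictionary[key]
            if value not in ("true", "false"):
                raise ValueError(f"Expected 'true' or 'false', got {value!r}")
            return value == "true"
        return default_value


options = OptionsSketch({"min_bins": "5", "ratio": "0.33", "shuffle": "false"})
min_bins = options.get_int("min_bins", 2)
max_bins = options.get_int("max_bins", 32)  # key is absent, so the default is used
ratio = options.get_float("ratio", 0.5)
shuffle = options.get_bool("shuffle", True)
```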

mlrl.common.rule_learners module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides base classes for implementing single- or multi-label rule learning algorithms.

class mlrl.common.rule_learners.MLRuleLearner(random_state: int, feature_format: str, label_format: str, prediction_format: str)

Bases: mlrl.common.learners.Learner, mlrl.common.learners.NominalAttributeLearner

A scikit-multilearn implementation of a rule learning algorithm for multi-label classification or ranking.

class mlrl.common.rule_learners.SparseFormat(value)

Bases: enum.Enum

An enumeration.

CSC = 'csc'

CSR = 'csr'

class mlrl.common.rule_learners.SparsePolicy(value)

Bases: enum.Enum

An enumeration.

AUTO = 'auto'

FORCE_DENSE = 'dense'

FORCE_SPARSE = 'sparse'

mlrl.common.rule_learners.configure_feature_binning(config: mlrl.common.cython.learner.RuleLearnerConfig, feature_binning: Optional[str])

mlrl.common.rule_learners.configure_feature_sampling(config: mlrl.common.cython.learner.RuleLearnerConfig, feature_sampling: Optional[str])

mlrl.common.rule_learners.configure_instance_sampling(config: mlrl.common.cython.learner.RuleLearnerConfig, instance_sampling: Optional[str])

mlrl.common.rule_learners.configure_label_sampling(config: mlrl.common.cython.learner.RuleLearnerConfig, label_sampling: Optional[str])

mlrl.common.rule_learners.configure_parallel_prediction(config: mlrl.common.cython.learner.RuleLearnerConfig, parallel_prediction: Optional[str])

mlrl.common.rule_learners.configure_parallel_rule_refinement(config: mlrl.common.cython.learner.RuleLearnerConfig, parallel_rule_refinement: Optional[str])

mlrl.common.rule_learners.configure_parallel_statistic_update(config: mlrl.common.cython.learner.RuleLearnerConfig, parallel_statistic_update: Optional[str])

mlrl.common.rule_learners.configure_partition_sampling(config: mlrl.common.cython.learner.RuleLearnerConfig, partition_sampling: Optional[str])

mlrl.common.rule_learners.configure_pruning(config: mlrl.common.cython.learner.RuleLearnerConfig, pruning: Optional[str])

mlrl.common.rule_learners.configure_rule_induction(config: mlrl.common.cython.learner.RuleLearnerConfig, rule_induction: Optional[str])

mlrl.common.rule_learners.configure_rule_model_assemblage(config: mlrl.common.cython.learner.RuleLearnerConfig, rule_model_assemblage: Optional[str])

mlrl.common.rule_learners.configure_size_stopping_criterion(config: mlrl.common.cython.learner.RuleLearnerConfig, max_rules: Optional[int])

mlrl.common.rule_learners.configure_time_stopping_criterion(config: mlrl.common.cython.learner.RuleLearnerConfig, time_limit: Optional[int])

mlrl.common.rule_learners.create_sparse_policy(parameter_name: str, policy: str) → mlrl.common.rule_learners.SparsePolicy

mlrl.common.rule_learners.is_sparse(m, sparse_format: mlrl.common.rule_learners.SparseFormat, dtype, sparse_values: bool = True) → bool

Returns whether a given matrix is considered sparse or not. A matrix is considered sparse if it is given in a sparse format and is expected to occupy less memory than a dense matrix.

Parameters

m – A np.ndarray or scipy.sparse.matrix to be checked
sparse_format – The SparseFormat to be used
dtype – The type of the values that should be stored in the matrix
sparse_values – True, if the values must explicitly be stored when using a sparse format, False otherwise

Returns

True, if the given matrix is considered sparse, False otherwise
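
The memory comparison behind this check can be approximated with simple arithmetic. The formula below is an assumption about how such an estimate might be computed for the CSR format (32-bit indices, one row pointer per row plus one); it is not taken from the library, and all function names are hypothetical.

```python
def dense_bytes(num_rows: int, num_cols: int, value_size: int) -> int:
    """Memory of a dense matrix: one value per cell."""
    return num_rows * num_cols * value_size


def csr_bytes(num_rows: int, num_non_zero: int, value_size: int,
              index_size: int = 4, sparse_values: bool = True) -> int:
    """Estimated memory of a CSR matrix (assumed layout, 32-bit indices)."""
    size = num_non_zero * index_size        # one column index per non-zero element
    size += (num_rows + 1) * index_size     # row pointers
    if sparse_values:
        size += num_non_zero * value_size   # the values themselves, if stored
    return size


def is_smaller_when_sparse(num_rows: int, num_cols: int, num_non_zero: int,
                           value_size: int, sparse_values: bool = True) -> bool:
    sparse = csr_bytes(num_rows, num_non_zero, value_size, sparse_values=sparse_values)
    return sparse < dense_bytes(num_rows, num_cols, value_size)


# 1000 x 100 matrix of 4-byte values: 400000 bytes when stored densely.
few_non_zeros = is_smaller_when_sparse(1000, 100, 5000, 4)    # 44004 bytes sparse
many_non_zeros = is_smaller_when_sparse(1000, 100, 60000, 4)  # 484004 bytes sparse
binary_only = is_smaller_when_sparse(1000, 100, 60000, 4, sparse_values=False)
```

The last call shows the effect of sparse_values: when the values need not be stored explicitly, only the indices count, so the same matrix drops to 244004 bytes and becomes cheaper than the dense layout.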

mlrl.common.rule_learners.parse_param(parameter_name: str, value: str, allowed_values: Set[str]) → str

mlrl.common.rule_learners.parse_param_and_options(parameter_name: str, value: str, allowed_values_and_options: Dict[str, Set[str]]) → (str, mlrl.common.options.Options)

mlrl.common.rule_learners.should_enforce_sparse(m, sparse_format: mlrl.common.rule_learners.SparseFormat, policy: mlrl.common.rule_learners.SparsePolicy, dtype, sparse_values: bool = True) → bool

Returns whether it is preferable to convert a given matrix into a scipy.sparse.csr_matrix, scipy.sparse.csc_matrix or scipy.sparse.dok_matrix, depending on the format of the given matrix and a given SparsePolicy:

If the given policy is SparsePolicy.AUTO, the matrix will be converted into the given sparse format, if possible, provided that the sparse matrix is expected to occupy less memory than a dense matrix. To be convertible into a sparse format, the matrix must be a scipy.sparse.lil_matrix, scipy.sparse.dok_matrix or scipy.sparse.coo_matrix. If the given sparse format is csr or csc and the matrix is already in that format, it will not be converted.

If the given policy is SparsePolicy.FORCE_DENSE, the matrix will always be converted into a dense matrix.

If the given policy is SparsePolicy.FORCE_SPARSE, the matrix will always be converted into the specified sparse format, if possible.

Parameters

m – A np.ndarray or scipy.sparse.matrix to be checked
sparse_format – The SparseFormat to be used
policy – The SparsePolicy to be used
dtype – The type of the values that should be stored in the matrix
sparse_values – True, if the values must explicitly be stored when using a sparse format, False otherwise

Returns

True, if it is preferable to convert the matrix into a sparse matrix of the given format, False otherwise
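
The policy handling can be condensed into a small decision function. The sketch reads FORCE_SPARSE as "always sparse" and FORCE_DENSE as "always dense", in line with the enum values 'sparse' and 'dense'. The function name and its boolean inputs are hypothetical; the real function inspects the matrix itself.

```python
from enum import Enum


class SparsePolicy(Enum):
    """Local mirror of the documented enum values, so the sketch is self-contained."""
    AUTO = "auto"
    FORCE_DENSE = "dense"
    FORCE_SPARSE = "sparse"


def should_enforce_sparse_sketch(is_sparse_input: bool, sparse_is_smaller: bool,
                                 policy: SparsePolicy) -> bool:
    """Hypothetical condensation of the documented rules."""
    if policy is SparsePolicy.FORCE_SPARSE:
        return True
    if policy is SparsePolicy.FORCE_DENSE:
        return False
    # AUTO: convert only an input that is already given in a sparse format,
    # and only if the sparse representation is expected to save memory.
    return is_sparse_input and sparse_is_smaller


auto_sparse_input = should_enforce_sparse_sketch(True, True, SparsePolicy.AUTO)
auto_dense_input = should_enforce_sparse_sketch(False, True, SparsePolicy.AUTO)
forced_dense = should_enforce_sparse_sketch(True, True, SparsePolicy.FORCE_DENSE)
```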

mlrl.common.strings module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides utility functions for dealing with strings.

mlrl.common.strings.format_dict_keys(dictionary: Dict[str, Set[str]]) → str

Creates and returns a textual representation of the keys in a dictionary.

Parameters: dictionary – The dictionary to be formatted
Returns: The textual representation that has been created

mlrl.common.strings.format_enum_values(enum) → str

Creates and returns a textual representation of an enum’s values.

Parameters: enum – The enum to be formatted
Returns: The textual representation that has been created

mlrl.common.strings.format_string_set(strings: Set[str]) → str

Creates and returns a textual representation of the strings in a set.

Parameters: strings – The set of strings to be formatted
Returns: The textual representation that has been created
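
The exact output format of these helpers is not specified here. The sketch below assumes a brace-delimited, comma-separated list of quoted, sorted items; the function names carry a _sketch suffix and the example data is invented to mark them as hypothetical.

```python
from enum import Enum
from typing import Dict, Set


def format_string_set_sketch(strings: Set[str]) -> str:
    # Sorting makes the output deterministic, since sets are unordered.
    return "{" + ", ".join('"' + s + '"' for s in sorted(strings)) + "}"


def format_dict_keys_sketch(dictionary: Dict[str, Set[str]]) -> str:
    return format_string_set_sketch(set(dictionary.keys()))


def format_enum_values_sketch(enum) -> str:
    return format_string_set_sketch({member.value for member in enum})


class ExampleFormat(Enum):
    """Invented enum, used only to demonstrate the helper."""
    CSC = "csc"
    CSR = "csr"


set_text = format_string_set_sketch({"sparse", "auto", "dense"})
keys_text = format_dict_keys_sketch({"equal-width": {"min_bins"}, "none": set()})
enum_text = format_enum_values_sketch(ExampleFormat)
```

Such strings are typically embedded in error messages that list the allowed values of a parameter.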
