mlrl.common package

Submodules

mlrl.common.arrays module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides utility functions for handling arrays.

mlrl.common.arrays.enforce_dense(a, order: str, dtype)

Converts a given array into a np.ndarray, if necessary, and enforces a specific memory layout and type.

Parameters
  • a – A np.ndarray or scipy.sparse.matrix to be converted

  • order – The memory layout to be used. Must be C or F

  • dtype – The type to be used

Returns

A np.ndarray that uses the given memory layout
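The conversion described above can be approximated with NumPy alone. The following is a minimal sketch, not the library's implementation: the name enforce_dense_sketch is hypothetical, and detecting sparse input through a toarray() method is an assumption.

```python
import numpy as np


def enforce_dense_sketch(a, order: str, dtype):
    """Hypothetical stand-in for enforce_dense (not mlrl's code)."""
    if order not in ("C", "F"):
        raise ValueError("order must be 'C' or 'F'")
    # Assumption: sparse matrices are recognized by their toarray() method,
    # which scipy.sparse matrices provide and plain arrays do not.
    if hasattr(a, "toarray"):
        a = a.toarray()
    # np.require copies only if the type or memory layout does not match yet.
    return np.require(a, dtype=dtype, requirements=[order])


x = np.arange(6, dtype=np.int64).reshape(2, 3)  # C-contiguous by default
dense = enforce_dense_sketch(x, order="F", dtype=np.float32)
```

The result has the same shape as the input, but uses column-major (Fortran) layout and 32-bit floats.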

mlrl.common.data_types module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides type definitions.

mlrl.common.learners module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides base classes for implementing single- or multi-label classifiers or rankers.

class mlrl.common.learners.Learner

Bases: sklearn.base.BaseEstimator

A base class for all single- or multi-label classifiers or rankers.

fit(x, y)

Fits a model according to given training examples and corresponding ground truth labels.

Parameters
  • x – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_features), that stores the feature values of the training examples

  • y – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_labels), that stores the labels of the training examples according to the ground truth

Returns

The fitted learner

abstract get_name() str

Returns a human-readable name that identifies the configuration used by the classifier or ranker.

Returns

The name of the classifier or ranker

predict(x)

Makes a prediction for given query examples.

Parameters

x – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_features), that stores the feature values of the query examples

Returns

A numpy.ndarray or scipy.sparse matrix of shape (num_examples, num_labels), that stores the prediction for individual examples and labels

predict_proba(x)

Returns probability estimates for given query examples.

Parameters

x – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_features), that stores the feature values of the query examples

Returns

A numpy.ndarray or scipy.sparse matrix of shape (num_examples, num_labels), that stores the probabilities for individual examples and labels
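To illustrate the shapes involved in fit and predict, here is a minimal sketch of a learner that follows the same method names. MajorityLabelSketch is a hypothetical class that does not inherit from Learner or from sklearn.base.BaseEstimator; it only mirrors the documented contract (fit returns the fitted learner, predict returns one row per query example and one column per label).

```python
import numpy as np


class MajorityLabelSketch:
    """Hypothetical learner that mirrors the fit/predict/get_name contract."""

    def get_name(self) -> str:
        return "majority-label"

    def fit(self, x, y):
        y = np.asarray(y)
        # For each label, remember whether it is relevant for at least half
        # of the training examples.
        self.majority_ = (y.mean(axis=0) >= 0.5).astype(y.dtype)
        return self  # the fitted learner

    def predict(self, x):
        num_examples = np.asarray(x).shape[0]
        # Repeat the per-label majority vote once per query example.
        return np.tile(self.majority_, (num_examples, 1))


x_train = np.array([[0.1, 1.0], [0.4, 0.0], [0.9, 1.0]])  # (num_examples, num_features)
y_train = np.array([[1, 0], [1, 1], [0, 0]])              # (num_examples, num_labels)
learner = MajorityLabelSketch().fit(x_train, y_train)
predictions = learner.predict(np.array([[0.5, 0.5], [0.2, 0.8]]))
```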

class mlrl.common.learners.NominalAttributeLearner

Bases: abc.ABC

A base class for all single- or multi-label classifiers or rankers that natively support nominal attributes.

nominal_attribute_indices: List[int] = None

mlrl.common.options module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides a data structure for storing and parsing options that are provided as key-value pairs.

class mlrl.common.options.BooleanOption(value)

Bases: enum.Enum

An enumeration.

FALSE = 'false'

TRUE = 'true'

static parse(s) bool

class mlrl.common.options.Options

Bases: object

Stores key-value pairs in a dictionary and provides methods to access and validate them.

ERROR_MESSAGE_INVALID_OPTION = 'Expected comma-separated list of key-value pairs'

ERROR_MESSAGE_INVALID_SYNTAX = 'Invalid syntax used to specify additional options'

classmethod create(string: str, allowed_keys: Set[str])

Parses the options that are provided via a given string that is formatted according to the following syntax: “[key1=value1,key2=value2]”. If the given string is malformed, a ValueError will be raised.

Parameters
  • string – The string to be parsed

  • allowed_keys – A set that contains all valid keys

Returns

An object of type Options that stores the key-value pairs that have been parsed from the given string
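The syntax described above can be parsed with a few string operations. The following is a minimal sketch under stated assumptions, not the library's parser: parse_options_sketch and the keys max_depth and shrinkage are hypothetical, and rejecting unknown keys with a ValueError, as well as the pairing of each error message with a failure case, is assumed.

```python
from typing import Dict, Set


def parse_options_sketch(string: str, allowed_keys: Set[str]) -> Dict[str, str]:
    """Hypothetical parser for "[key1=value1,key2=value2]" (not mlrl's code)."""
    if not (string.startswith("[") and string.endswith("]")):
        raise ValueError("Invalid syntax used to specify additional options")
    options: Dict[str, str] = {}
    body = string[1:-1]
    if not body:
        return options  # assumption: "[]" yields no options
    for pair in body.split(","):
        key, separator, value = pair.partition("=")
        if not separator or not key or not value:
            raise ValueError("Expected comma-separated list of key-value pairs")
        if key not in allowed_keys:
            # Assumption: keys outside allowed_keys are rejected.
            raise ValueError("Unknown key '" + key + "'")
        options[key] = value
    return options


parsed = parse_options_sketch("[max_depth=3,shrinkage=0.5]", {"max_depth", "shrinkage"})
```

Values stay strings at this stage; converting them to a specific type is left to the typed getters.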

get_bool(key: str, default_value: bool) bool

Returns a boolean that corresponds to a specific key.

Parameters
  • key – The key

  • default_value – The default value to be returned, if no value is associated with the given key

Returns

The value that is associated with the given key or the given default value

get_float(key: str, default_value: float) float

Returns a float that corresponds to a specific key.

Parameters
  • key – The key

  • default_value – The default value to be returned, if no value is associated with the given key

Returns

The value that is associated with the given key or the given default value

get_int(key: str, default_value: int) int

Returns an integer that corresponds to a specific key.

Parameters
  • key – The key

  • default_value – The default value to be returned, if no value is associated with the given key

Returns

The value that is associated with the given key or the given default value

get_string(key: str, default_value: str) str

Returns a string that corresponds to a specific key.

Parameters
  • key – The key

  • default_value – The default value to be returned, if no value is associated with the given key

Returns

The value that is associated with the given key or the given default value
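The four getters share one pattern: look up the key, convert the stored string, and fall back to the default if the key is missing. A minimal sketch of that pattern follows. OptionsSketch is a hypothetical stand-in, and accepting only the strings 'true' and 'false' as booleans (matching the values of BooleanOption) is an assumption.

```python
from typing import Dict


class OptionsSketch:
    """Hypothetical stand-in that mirrors the getter signatures above."""

    def __init__(self, values: Dict[str, str]):
        self.values = values

    def get_string(self, key: str, default_value: str) -> str:
        return self.values.get(key, default_value)

    def get_int(self, key: str, default_value: int) -> int:
        return int(self.values[key]) if key in self.values else default_value

    def get_float(self, key: str, default_value: float) -> float:
        return float(self.values[key]) if key in self.values else default_value

    def get_bool(self, key: str, default_value: bool) -> bool:
        if key not in self.values:
            return default_value
        value = self.values[key]
        # Assumption: only 'true' and 'false' are valid boolean strings.
        if value == "true":
            return True
        if value == "false":
            return False
        raise ValueError("Expected 'true' or 'false', but got '" + value + "'")


options = OptionsSketch({"max_depth": "3", "shrinkage": "0.5", "verbose": "true"})
max_depth = options.get_int("max_depth", 1)
min_coverage = options.get_int("min_coverage", 10)  # key is missing, default is used
```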

mlrl.common.rule_learners module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides base classes for implementing single- or multi-label rule learning algorithms.

class mlrl.common.rule_learners.MLRuleLearner(random_state: int, feature_format: str, label_format: str, prediction_format: str)

Bases: mlrl.common.learners.Learner, mlrl.common.learners.NominalAttributeLearner

A scikit-multilearn implementation of a rule learning algorithm for multi-label classification or ranking.

class mlrl.common.rule_learners.SparseFormat(value)

Bases: enum.Enum

An enumeration.

CSC = 'csc'

CSR = 'csr'

class mlrl.common.rule_learners.SparsePolicy(value)

Bases: enum.Enum

An enumeration.

AUTO = 'auto'

FORCE_DENSE = 'dense'

FORCE_SPARSE = 'sparse'

mlrl.common.rule_learners.configure_feature_binning(config: mlrl.common.cython.learner.RuleLearnerConfig, feature_binning: Optional[str])

mlrl.common.rule_learners.configure_feature_sampling(config: mlrl.common.cython.learner.RuleLearnerConfig, feature_sampling: Optional[str])

mlrl.common.rule_learners.configure_instance_sampling(config: mlrl.common.cython.learner.RuleLearnerConfig, instance_sampling: Optional[str])

mlrl.common.rule_learners.configure_label_sampling(config: mlrl.common.cython.learner.RuleLearnerConfig, label_sampling: Optional[str])

mlrl.common.rule_learners.configure_parallel_prediction(config: mlrl.common.cython.learner.RuleLearnerConfig, parallel_prediction: Optional[str])

mlrl.common.rule_learners.configure_parallel_rule_refinement(config: mlrl.common.cython.learner.RuleLearnerConfig, parallel_rule_refinement: Optional[str])

mlrl.common.rule_learners.configure_parallel_statistic_update(config: mlrl.common.cython.learner.RuleLearnerConfig, parallel_statistic_update: Optional[str])

mlrl.common.rule_learners.configure_partition_sampling(config: mlrl.common.cython.learner.RuleLearnerConfig, partition_sampling: Optional[str])

mlrl.common.rule_learners.configure_pruning(config: mlrl.common.cython.learner.RuleLearnerConfig, pruning: Optional[str])

mlrl.common.rule_learners.configure_rule_induction(config: mlrl.common.cython.learner.RuleLearnerConfig, rule_induction: Optional[str])

mlrl.common.rule_learners.configure_rule_model_assemblage(config: mlrl.common.cython.learner.RuleLearnerConfig, rule_model_assemblage: Optional[str])

mlrl.common.rule_learners.configure_size_stopping_criterion(config: mlrl.common.cython.learner.RuleLearnerConfig, max_rules: Optional[int])

mlrl.common.rule_learners.configure_time_stopping_criterion(config: mlrl.common.cython.learner.RuleLearnerConfig, time_limit: Optional[int])

mlrl.common.rule_learners.create_sparse_policy(parameter_name: str, policy: str) mlrl.common.rule_learners.SparsePolicy

mlrl.common.rule_learners.is_sparse(m, sparse_format: mlrl.common.rule_learners.SparseFormat, dtype, sparse_values: bool = True) bool

Returns whether a given matrix is considered sparse or not. A matrix is considered sparse if it is given in a sparse format and is expected to occupy less memory than a dense matrix.

Parameters
  • m – A np.ndarray or scipy.sparse.matrix to be checked

  • sparse_format – The SparseFormat to be used

  • dtype – The type of the values that should be stored in the matrix

  • sparse_values – True, if the values must explicitly be stored when using a sparse format, False otherwise

Returns

True, if the given matrix is considered sparse, False otherwise
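The memory comparison can be made concrete with a rough size estimate. The sketch below is not the library's calculation: csr_is_smaller_sketch is a hypothetical helper, and it assumes a CSR layout with one column index per non-zero entry, one row pointer per row plus one, and 4-byte indices.

```python
def csr_is_smaller_sketch(num_rows: int, num_cols: int, num_non_zero: int,
                          value_bytes: int, index_bytes: int = 4,
                          sparse_values: bool = True) -> bool:
    """Hypothetical estimate of whether CSR storage beats a dense array."""
    dense_bytes = num_rows * num_cols * value_bytes
    # Each non-zero entry needs a column index and, if values are stored
    # explicitly, the value itself.
    bytes_per_entry = index_bytes + (value_bytes if sparse_values else 0)
    sparse_bytes = num_non_zero * bytes_per_entry + (num_rows + 1) * index_bytes
    return sparse_bytes < dense_bytes


# 1000 x 100 matrix of 4-byte values with only 500 non-zero entries
mostly_empty = csr_is_smaller_sketch(1000, 100, 500, value_bytes=4)
# The same matrix with every entry set
fully_dense = csr_is_smaller_sketch(1000, 100, 100000, value_bytes=4)
```

With sparse_values set to False only the indices are counted, so a matrix can still qualify as sparse at a higher density.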

mlrl.common.rule_learners.parse_param(parameter_name: str, value: str, allowed_values: Set[str]) str

mlrl.common.rule_learners.parse_param_and_options(parameter_name: str, value: str, allowed_values_and_options: typing.Dict[str, typing.Set[str]]) -> (str, mlrl.common.options.Options)

mlrl.common.rule_learners.should_enforce_sparse(m, sparse_format: mlrl.common.rule_learners.SparseFormat, policy: mlrl.common.rule_learners.SparsePolicy, dtype, sparse_values: bool = True) bool

Returns whether it is preferable to convert a given matrix into a scipy.sparse.csr_matrix, scipy.sparse.csc_matrix or scipy.sparse.dok_matrix, depending on the format of the given matrix and a given SparsePolicy:

If the given policy is SparsePolicy.AUTO, the matrix will be converted into the given sparse format, if possible, provided that the sparse matrix is expected to occupy less memory than a dense matrix. To be convertible into a sparse format, the matrix must be a scipy.sparse.lil_matrix, scipy.sparse.dok_matrix or scipy.sparse.coo_matrix. If the given sparse format is csr or csc and the matrix is already in that format, it will not be converted.

If the given policy is SparsePolicy.FORCE_SPARSE, the matrix will always be converted into the specified sparse format, if possible.

If the given policy is SparsePolicy.FORCE_DENSE, the matrix will always be converted into a dense matrix.

Parameters
  • m – A np.ndarray or scipy.sparse.matrix to be checked

  • sparse_format – The SparseFormat to be used

  • policy – The SparsePolicy to be used

  • dtype – The type of the values that should be stored in the matrix

  • sparse_values – True, if the values must explicitly be stored when using a sparse format, False otherwise

Returns

True, if it is preferable to convert the matrix into a sparse matrix of the given format, False otherwise
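The role of the policy can be sketched as a small decision function. This is an illustration, not the library's logic: SparsePolicySketch and prefer_sparse_sketch are hypothetical names, the sketch assumes that FORCE_SPARSE requests the sparse format and FORCE_DENSE the dense one, as the member names suggest, and it covers only the case where the conversion is possible.

```python
from enum import Enum


class SparsePolicySketch(Enum):
    """Hypothetical copy of the SparsePolicy members listed above."""
    AUTO = "auto"
    FORCE_DENSE = "dense"
    FORCE_SPARSE = "sparse"


def prefer_sparse_sketch(policy: SparsePolicySketch, sparse_is_smaller: bool) -> bool:
    """Decides whether a convertible matrix should be stored in sparse format."""
    if policy == SparsePolicySketch.FORCE_SPARSE:
        return True
    if policy == SparsePolicySketch.FORCE_DENSE:
        return False
    # AUTO: follow the memory estimate.
    return sparse_is_smaller


auto_decision = prefer_sparse_sketch(SparsePolicySketch.AUTO, sparse_is_smaller=True)
```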

mlrl.common.strings module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides utility functions for dealing with strings.

mlrl.common.strings.format_dict_keys(dictionary: Dict[str, Set[str]]) str

Creates and returns a textual representation of the keys in a dictionary.

Parameters

dictionary – The dictionary to be formatted

Returns

The textual representation that has been created

mlrl.common.strings.format_enum_values(enum) str

Creates and returns a textual representation of an enum’s values.

Parameters

enum – The enum to be formatted

Returns

The textual representation that has been created

mlrl.common.strings.format_string_set(strings: Set[str]) str

Creates and returns a textual representation of the strings in a set.

Parameters

strings – The set of strings to be formatted

Returns

The textual representation that has been created
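The reference does not state what the textual representations look like. The sketch below shows one plausible convention, with sorted, quoted entries in braces. The function names ending in _sketch and the output format itself are assumptions, not the library's behavior.

```python
from enum import Enum
from typing import Dict, Set


def format_string_set_sketch(strings: Set[str]) -> str:
    # Sorting makes the output deterministic; the brace-and-quote style is
    # an assumption.
    return "{" + ", ".join('"' + s + '"' for s in sorted(strings)) + "}"


def format_dict_keys_sketch(dictionary: Dict[str, Set[str]]) -> str:
    return format_string_set_sketch(set(dictionary.keys()))


def format_enum_values_sketch(enum) -> str:
    return format_string_set_sketch({member.value for member in enum})


class SparseFormatSketch(Enum):
    """Hypothetical copy of the SparseFormat members."""
    CSC = "csc"
    CSR = "csr"


formatted = format_enum_values_sketch(SparseFormatSketch)
```

Such strings are typically useful in error messages that list the values a parameter accepts.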

Module contents