mlrl.testbed.characteristics module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides functions that compute characteristics of feature or label matrices, such as density, label cardinality, and label imbalance ratio.

class mlrl.testbed.characteristics.Characteristic(option: str, name: str, getter_function, percentage: bool = False)

Bases: Formatter

Creates textual representations of characteristics.

format(value, **kwargs) → str

See mlrl.testbed.output_writer.Formattable.format()

class mlrl.testbed.characteristics.LabelCharacteristics(y)

Bases: Formattable, Tabularizable

Stores characteristics of a label matrix.

property avg_label_cardinality

The average label cardinality of the label matrix.

property avg_label_imbalance_ratio

The average label imbalance ratio of the label matrix.

format(options: Options, **_) → str

See mlrl.testbed.output_writer.Formattable.format()

property label_density

The density of the label matrix.

property label_sparsity

The sparsity of the label matrix.

property num_distinct_label_vectors

The number of distinct label vectors in the label matrix.

tabularize(options: Options, **_) → List[Dict[str, str]] | None

See mlrl.testbed.output_writer.Tabularizable.tabularize()

mlrl.testbed.characteristics.density(matrix) → float

Calculates and returns the density of a given feature or label matrix.

Parameters:

matrix – A numpy.ndarray or scipy.sparse matrix, shape (num_rows, num_cols), that stores the feature values of training examples or their labels

Returns:

The fraction of elements in the given matrix that are non-zero
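The following is a minimal sketch of how such a density could be computed, not the module's actual implementation. The name density_sketch is hypothetical, sparse inputs are assumed to expose toarray() (as scipy.sparse matrices do), and returning 0 for an empty matrix is an assumption.

```python
import numpy as np


def density_sketch(matrix) -> float:
    """Fraction of non-zero elements among all elements of a 2-D matrix."""
    # Assumption: sparse inputs expose `toarray()`, as scipy.sparse matrices do.
    # Converting to a dense array keeps the sketch short but costs memory.
    dense = matrix.toarray() if hasattr(matrix, "toarray") else np.asarray(matrix)
    if dense.size == 0:
        return 0.0  # assumption: an empty matrix is reported with density 0
    return float(np.count_nonzero(dense)) / dense.size


# Hypothetical feature matrix: 2 of its 6 elements are non-zero.
features = np.array([[0.0, 1.5, 0.0],
                     [2.0, 0.0, 0.0]])
print(density_sketch(features))
```

For large sparse matrices, counting the stored non-zero entries directly would avoid the dense conversion used here.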

mlrl.testbed.characteristics.distinct_label_vectors(y) → int

Determines and returns the number of distinct label vectors in a label matrix.

Parameters:

y – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_labels), that stores the labels of training examples

Returns:

The number of distinct label vectors in the given matrix
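As an illustration of what this count means, the sketch below treats each row of the label matrix as one label vector and counts the unique rows. It is not the module's implementation: distinct_label_vectors_sketch is a hypothetical name, and treating any non-zero entry as a relevant label is an assumption.

```python
import numpy as np


def distinct_label_vectors_sketch(y) -> int:
    """Number of unique rows (label vectors) in a 2-D label matrix."""
    # Assumption: sparse inputs expose `toarray()`, as scipy.sparse matrices do.
    dense = y.toarray() if hasattr(y, "toarray") else np.asarray(y)
    # Assumption: any non-zero entry marks a relevant label.
    binary = (dense != 0).astype(np.uint8)
    # np.unique with axis=0 collapses identical rows.
    return int(np.unique(binary, axis=0).shape[0])


# Hypothetical label matrix: rows 0 and 1 are identical.
labels = np.array([[1, 0, 1],
                   [1, 0, 1],
                   [0, 1, 0],
                   [0, 0, 0]])
print(distinct_label_vectors_sketch(labels))
```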

mlrl.testbed.characteristics.label_cardinality(y) → float

Calculates and returns the average label cardinality of a given label matrix.

Parameters:

y – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_labels), that stores the labels of training examples

Returns:

The average number of relevant labels per training example
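The description above amounts to dividing the total number of relevant labels by the number of examples. The sketch below shows that arithmetic under stated assumptions: label_cardinality_sketch is a hypothetical name, non-zero entries are taken to mark relevant labels, and a matrix without examples is reported as 0.

```python
import numpy as np


def label_cardinality_sketch(y) -> float:
    """Average number of relevant (non-zero) labels per example."""
    # Assumption: sparse inputs expose `toarray()`, as scipy.sparse matrices do.
    dense = y.toarray() if hasattr(y, "toarray") else np.asarray(y)
    num_examples = dense.shape[0]
    if num_examples == 0:
        return 0.0  # assumption: no examples means a cardinality of 0
    return float(np.count_nonzero(dense)) / num_examples


# Hypothetical label matrix: 2 + 1 + 0 + 3 = 6 relevant labels over 4 examples.
labels = np.array([[1, 0, 1],
                   [1, 0, 0],
                   [0, 0, 0],
                   [1, 1, 1]])
print(label_cardinality_sketch(labels))
```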

mlrl.testbed.characteristics.label_imbalance_ratio(y) → float

Calculates and returns the average label imbalance ratio of a given label matrix.

Parameters:

y – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_labels), that stores the labels of training examples

Returns:

The label imbalance ratio averaged over the available labels
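This page does not spell out the formula, so the sketch below assumes a definition commonly used for multi-label data: for each label, the imbalance ratio is the count of the most frequent label divided by that label's own count, and the result is the mean of these ratios. The name label_imbalance_ratio_sketch is hypothetical, and skipping labels that never occur (to avoid division by zero) is also an assumption; the module's actual behavior may differ.

```python
import numpy as np


def label_imbalance_ratio_sketch(y) -> float:
    """Mean over labels of (count of most frequent label / count of label)."""
    # Assumption: sparse inputs expose `toarray()`, as scipy.sparse matrices do.
    dense = y.toarray() if hasattr(y, "toarray") else np.asarray(y)
    # Number of examples for which each label is relevant (non-zero).
    num_relevant_per_label = np.count_nonzero(dense, axis=0)
    # Assumption: labels that never occur are excluded from the average.
    num_relevant_per_label = num_relevant_per_label[num_relevant_per_label > 0]
    if num_relevant_per_label.size == 0:
        return 0.0
    ratios = num_relevant_per_label.max() / num_relevant_per_label
    return float(np.mean(ratios))


# Hypothetical label matrix with per-label counts 4, 2, 1 and 0.
# The per-label ratios are 1, 2 and 4; the unused label is skipped.
labels = np.array([[1, 1, 1, 0],
                   [1, 1, 0, 0],
                   [1, 0, 0, 0],
                   [1, 0, 0, 0]])
print(label_imbalance_ratio_sketch(labels))
```

Under this definition, a value of 1 means all occurring labels are equally frequent, and larger values indicate stronger imbalance.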