mlrl.testbed.characteristics module
Author: Michael Rapp (michael.rapp.ml@gmail.com)
Provides functions to determine certain characteristics of feature or label matrices.
- class mlrl.testbed.characteristics.Characteristic(option: str, name: str, getter_function, percentage: bool = False)
Bases: Formatter
Creates textual representations of characteristics.
- class mlrl.testbed.characteristics.LabelCharacteristics(y)
Bases: Formattable, Tabularizable
Stores characteristics of a label matrix.
- property avg_label_cardinality
The average label cardinality of the label matrix.
- property avg_label_imbalance_ratio
The average label imbalance ratio of the label matrix.
- property label_density
The density of the label matrix.
- property label_sparsity
The sparsity of the label matrix.
- property num_distinct_label_vectors
The number of distinct label vectors in the label matrix.
- mlrl.testbed.characteristics.density(matrix) → float
Calculates and returns the density of a given feature or label matrix.
- Parameters:
matrix – A numpy.ndarray or scipy.sparse matrix, shape (num_rows, num_cols), that stores the feature values of training examples or their labels
- Returns:
The fraction of non-zero elements in the given matrix among all elements
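To make the definition concrete, the following is a minimal plain-Python sketch of "fraction of non-zero elements among all elements". The name `density_sketch` is hypothetical and not part of the library; it works on nested lists or a dense numpy.ndarray iterated row by row, does not handle scipy.sparse matrices, and returning 0.0 for an empty matrix is an assumption.

```python
def density_sketch(matrix):
    """Hypothetical helper: fraction of non-zero elements among all elements."""
    num_elements = 0
    num_non_zero = 0
    for row in matrix:
        for value in row:
            num_elements += 1
            if value != 0:
                num_non_zero += 1
    # Assumption: an empty matrix is reported as density 0.0
    return num_non_zero / num_elements if num_elements > 0 else 0.0


# 3 of the 8 elements are non-zero
x = [[1, 0, 0, 2],
     [0, 0, 3, 0]]
print(density_sketch(x))  # 0.375
```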
- mlrl.testbed.characteristics.distinct_label_vectors(y) → int
Determines and returns the number of distinct label vectors in a label matrix.
- Parameters:
y – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_labels), that stores the labels of training examples
- Returns:
The number of distinct label vectors in the given matrix
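As an illustration, counting distinct label vectors amounts to counting unique rows. The sketch below is plain Python under the assumption that any non-zero entry marks a relevant label; `distinct_label_vectors_sketch` is a hypothetical name, not the library's implementation, and scipy.sparse input is not covered.

```python
def distinct_label_vectors_sketch(y):
    """Hypothetical helper: number of unique rows in a dense label matrix."""
    # Each row becomes a hashable tuple of booleans, so equal rows collapse in the set
    return len({tuple(bool(value) for value in row) for row in y})


y = [[1, 0, 1],
     [0, 1, 0],
     [1, 0, 1],   # duplicate of the first row
     [0, 0, 0]]
print(distinct_label_vectors_sketch(y))  # 3
```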
- mlrl.testbed.characteristics.label_cardinality(y) → float
Calculates and returns the average label cardinality of a given label matrix.
- Parameters:
y – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_labels), that stores the labels of training examples
- Returns:
The average number of relevant labels per training example
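The returned value can be reproduced by hand: count the relevant (non-zero) labels in each row and average over the rows. A minimal plain-Python sketch follows; the name `label_cardinality_sketch` is hypothetical, dense input is assumed, and returning 0.0 for a matrix without rows is an assumption.

```python
def label_cardinality_sketch(y):
    """Hypothetical helper: average number of relevant labels per example."""
    num_examples = 0
    num_relevant = 0
    for row in y:
        num_examples += 1
        num_relevant += sum(1 for value in row if value != 0)
    # Assumption: a matrix without examples is reported as cardinality 0.0
    return num_relevant / num_examples if num_examples > 0 else 0.0


# The rows contain 2, 1, 2 and 0 relevant labels: (2 + 1 + 2 + 0) / 4
y = [[1, 0, 1],
     [0, 1, 0],
     [1, 0, 1],
     [0, 0, 0]]
print(label_cardinality_sketch(y))  # 1.25
```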
- mlrl.testbed.characteristics.label_imbalance_ratio(y) → float
Calculates and returns the average label imbalance ratio of a given label matrix.
- Parameters:
y – A numpy.ndarray or scipy.sparse matrix, shape (num_examples, num_labels), that stores the labels of training examples
- Returns:
The label imbalance ratio averaged over the available labels
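The documentation above does not spell out the formula. One common definition, assumed here, takes for each label the ratio between the frequency of the most frequent label and that label's own frequency, then averages these ratios over the labels. The sketch below follows that assumption; `label_imbalance_ratio_sketch` is a hypothetical name, and both skipping labels that are never relevant and returning 0.0 for empty input are further assumptions, so the library may behave differently.

```python
def label_imbalance_ratio_sketch(y):
    """Hypothetical helper: mean of (max label frequency / label frequency)."""
    rows = [list(row) for row in y]
    if not rows:
        return 0.0
    num_labels = len(rows[0])
    # Number of examples for which each label is relevant (non-zero)
    counts = [sum(1 for row in rows if row[j] != 0) for j in range(num_labels)]
    # Assumption: labels that are never relevant are skipped to avoid division by zero
    counts = [count for count in counts if count > 0]
    if not counts:
        return 0.0
    max_count = max(counts)
    return sum(max_count / count for count in counts) / len(counts)


# Label frequencies are 4, 1 and 4, so the per-label ratios are 1, 4 and 1
y = [[1, 0, 1],
     [1, 1, 1],
     [1, 0, 1],
     [1, 0, 1]]
print(label_imbalance_ratio_sketch(y))  # 2.0
```

A value of 1.0 under this definition means that all labels occurring in the data are equally frequent; larger values indicate stronger imbalance.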