mlrl.util.arrays module

Author: Michael Rapp (michael.rapp.ml@gmail.com)

Provides utility functions for handling arrays.

class mlrl.util.arrays.SparseFormat(*values)

Bases: StrEnum

Specifies all valid textual representations of sparse matrix formats.

BSR = 'bsr'
COO = 'coo'
CSC = 'csc'
CSR = 'csr'
DIA = 'dia'
DOK = 'dok'
LIL = 'lil'

mlrl.util.arrays.enforce_2d(array: ndarray) → ndarray

Converts a given np.ndarray into a two-dimensional array if it is one-dimensional.

Parameters:

array – A np.ndarray to be converted

Returns:

A np.ndarray with at least two dimensions
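
The description above does not say whether a one-dimensional array becomes a row or a column. The following sketch uses a hypothetical stand-in, enforce_2d_sketch, which assumes that the new axis is appended so that the result is a single column. It is not the library's implementation.

```python
import numpy as np


def enforce_2d_sketch(array: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in, not the library's implementation."""
    if array.ndim == 1:
        # Assumption: the new axis is appended, yielding a single column
        return np.expand_dims(array, axis=1)
    return array


vector = np.array([1, 2, 3])
column = enforce_2d_sketch(vector)
print(column.shape)  # (3, 1)

# Arrays that already have two dimensions are returned unchanged
unchanged = enforce_2d_sketch(np.zeros((2, 3)))
print(unchanged.shape)  # (2, 3)
```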

mlrl.util.arrays.enforce_dense(array, order: str, dtype: dtype | None = None, sparse_value: int | float = 0) → ndarray

Converts a given array into a np.ndarray, if necessary, and enforces a specific memory layout and data type to be used.

Parameters:
  • array – A np.ndarray, scipy.sparse.spmatrix or scipy.sparse.sparray to be converted

  • order – The memory layout to be used. Must be C or F

  • dtype – The data type to be used or None, if the data type should not be changed

  • sparse_value – The value that should be used for sparse elements in the given array

Returns:

A np.ndarray that uses the given memory layout and data type
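
As a rough illustration of the conversion described above, the following sketch builds a dense array from either kind of input. The function enforce_dense_sketch is a hypothetical stand-in, and it assumes that elements not stored explicitly in a sparse input are filled with sparse_value. The library may handle this differently.

```python
import numpy as np
import scipy.sparse as sp


def enforce_dense_sketch(array, order, dtype=None, sparse_value=0):
    """Hypothetical stand-in, not the library's implementation."""
    if sp.issparse(array):
        coo = array.tocoo()
        target_dtype = coo.dtype if dtype is None else dtype
        # Assumption: elements that are not stored explicitly take `sparse_value`
        dense = np.full(coo.shape, sparse_value, dtype=target_dtype, order=order)
        dense[coo.row, coo.col] = coo.data
        return dense
    # np.require copies only if the layout or data type does not match yet
    return np.require(array, dtype=dtype, requirements=[order])


sparse_input = sp.csr_matrix(np.array([[0, 1], [2, 0]]))
from_sparse = enforce_dense_sketch(sparse_input, order='F', dtype=np.float32)
print(from_sparse.flags['F_CONTIGUOUS'], from_sparse.dtype)

dense_input = np.arange(6).reshape(2, 3)  # C-contiguous by default
from_dense = enforce_dense_sketch(dense_input, order='F')
print(from_dense.flags['F_CONTIGUOUS'], from_dense.dtype == dense_input.dtype)
```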

mlrl.util.arrays.ensure_no_complex_data(array) → Any

Raises a ValueError if the given array stores complex numbers.

Returns:

The given array
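
Since the given array is returned, such a check can be used inline when passing data on. The sketch below shows one way to implement it with np.iscomplexobj. The function ensure_no_complex_data_sketch and its error message are hypothetical and not taken from the library.

```python
import numpy as np


def ensure_no_complex_data_sketch(array):
    """Hypothetical stand-in, not the library's implementation."""
    # np.iscomplexobj inspects the data type, not the individual values
    if np.iscomplexobj(array):
        raise ValueError('Complex data is not supported')
    return array


# Real-valued data passes through unchanged
real = ensure_no_complex_data_sketch(np.array([1.0, 2.0]))

try:
    ensure_no_complex_data_sketch(np.array([1 + 2j]))
    rejected = False
except ValueError:
    rejected = True

print(rejected)  # True
```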

mlrl.util.arrays.get_unique_values(matrix) → ndarray

Returns a np.ndarray that stores all unique values contained in a given matrix, sorted in increasing order.

Parameters:

matrix – A np.ndarray or scipy.sparse.sparray

Returns:

A np.ndarray that stores the unique values
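
For dense input, this behavior corresponds to np.unique, which returns sorted unique values. The sketch below also covers sparse input under the assumption that elements not stored explicitly count as zeros. The source does not state how the library treats them, and get_unique_values_sketch is a hypothetical name.

```python
import numpy as np
import scipy.sparse as sp


def get_unique_values_sketch(matrix):
    """Hypothetical stand-in, not the library's implementation."""
    if sp.issparse(matrix):
        csr = matrix.tocsr()
        values = np.unique(csr.data)
        # Assumption: elements that are not stored explicitly count as zeros
        if csr.nnz < csr.shape[0] * csr.shape[1]:
            values = np.union1d(values, [0])
        return values
    # np.unique flattens the input and returns the sorted unique values
    return np.unique(matrix)


dense = np.array([[3, 1], [1, 2]])
sparse = sp.csr_array(np.array([[0, 5], [5, 0]]))

dense_values = get_unique_values_sketch(dense)
sparse_values = get_unique_values_sketch(sparse)
print(dense_values.tolist())   # [1, 2, 3]
print(sparse_values.tolist())  # [0, 5]
```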

mlrl.util.arrays.is_bsr(array) → bool

Returns whether a given scipy.sparse.spmatrix or scipy.sparse.sparray uses the BSR format or not.

Parameters:

array – A scipy.sparse.spmatrix or scipy.sparse.sparray to be checked

Returns:

True, if the given array uses the BSR format, False otherwise

mlrl.util.arrays.is_coo(array) → bool

Returns whether a given scipy.sparse.spmatrix or scipy.sparse.sparray uses the COO format or not.

Parameters:

array – A scipy.sparse.spmatrix or scipy.sparse.sparray to be checked

Returns:

True, if the given array uses the COO format, False otherwise

mlrl.util.arrays.is_csc(array) → bool

Returns whether a given scipy.sparse.spmatrix or scipy.sparse.sparray uses the CSC format or not.

Parameters:

array – A scipy.sparse.spmatrix or scipy.sparse.sparray to be checked

Returns:

True, if the given array uses the CSC format, False otherwise

mlrl.util.arrays.is_csr(array) → bool

Returns whether a given scipy.sparse.spmatrix or scipy.sparse.sparray uses the CSR format or not.

Parameters:

array – A scipy.sparse.spmatrix or scipy.sparse.sparray to be checked

Returns:

True, if the given array uses the CSR format, False otherwise

mlrl.util.arrays.is_dia(array) → bool

Returns whether a given scipy.sparse.spmatrix or scipy.sparse.sparray uses the DIA format or not.

Parameters:

array – A scipy.sparse.spmatrix or scipy.sparse.sparray to be checked

Returns:

True, if the given array uses the DIA format, False otherwise

mlrl.util.arrays.is_dok(array) → bool

Returns whether a given scipy.sparse.spmatrix or scipy.sparse.sparray uses the DOK format or not.

Parameters:

array – A scipy.sparse.spmatrix or scipy.sparse.sparray to be checked

Returns:

True, if the given array uses the DOK format, False otherwise

mlrl.util.arrays.is_lil(array) → bool

Returns whether a given scipy.sparse.spmatrix or scipy.sparse.sparray uses the LIL format or not.

Parameters:

array – A scipy.sparse.spmatrix or scipy.sparse.sparray to be checked

Returns:

True, if the given array uses the LIL format, False otherwise
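
The format checks above (is_bsr through is_lil) all follow the same pattern. One way to express such a check with SciPy alone is to compare the format attribute of a sparse container with the expected name, as in the following sketch. The helper uses_format_sketch is hypothetical and only approximates what the documented functions might do.

```python
import numpy as np
import scipy.sparse as sp


def uses_format_sketch(array, sparse_format: str) -> bool:
    """Hypothetical stand-in for the format checks documented above."""
    # SciPy exposes the storage format as a lowercase string, e.g. 'csr'
    return sp.issparse(array) and array.format == sparse_format


dense = np.eye(3)
csr = sp.csr_matrix(dense)
lil = csr.tolil()

csr_is_csr = uses_format_sketch(csr, 'csr')
csr_is_csc = uses_format_sketch(csr, 'csc')
lil_is_lil = uses_format_sketch(lil, 'lil')
dense_is_csr = uses_format_sketch(dense, 'csr')
print(csr_is_csr, csr_is_csc, lil_is_lil, dense_is_csr)  # True False True False
```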

mlrl.util.arrays.is_sparse(array, supported_formats: set[SparseFormat] | None = None) → bool

Returns whether a given array is a scipy.sparse.spmatrix or scipy.sparse.sparray or not.

Parameters:
  • array – A np.ndarray, scipy.sparse.spmatrix or scipy.sparse.sparray to be checked

  • supported_formats – A set of the SparseFormat values that the scipy.sparse.spmatrix or scipy.sparse.sparray may use or None, if the format should not be checked

Returns:

True, if the given array is a scipy.sparse.spmatrix or scipy.sparse.sparray using one of the supported formats, False otherwise
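
The effect of the optional format restriction can be illustrated as follows. This is a minimal sketch with a hypothetical stand-in, is_sparse_sketch, that takes plain format strings instead of SparseFormat members so that it runs without mlrl installed.

```python
import numpy as np
import scipy.sparse as sp


def is_sparse_sketch(array, supported_formats=None) -> bool:
    """Hypothetical stand-in, not the library's implementation."""
    if not sp.issparse(array):
        return False
    # None disables the format check
    return supported_formats is None or array.format in supported_formats


coo = sp.coo_matrix(np.eye(2))

dense_result = is_sparse_sketch(np.eye(2))
any_format_result = is_sparse_sketch(coo)
restricted_result = is_sparse_sketch(coo, {'csr', 'csc'})
converted_result = is_sparse_sketch(coo.tocsr(), {'csr', 'csc'})
print(dense_result, any_format_result, restricted_result, converted_result)
```

In this sketch, a COO matrix is rejected when only CSR and CSC are allowed, and accepted once it has been converted to CSR.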

mlrl.util.arrays.is_sparse_and_memory_efficient(array, sparse_format: SparseFormat, dtype: dtype | None = None, sparse_values: bool = True) → bool

Returns whether a given matrix uses a sparse format and is expected to occupy less memory than a dense matrix.

Parameters:
  • array – A np.ndarray, scipy.sparse.spmatrix or scipy.sparse.sparray to be checked

  • sparse_format – The SparseFormat to be used. Must be SparseFormat.CSC or SparseFormat.CSR

  • dtype – The type of the values that should be stored in the matrix or None, if it should be obtained from the given array

  • sparse_values – True, if the values must explicitly be stored when using a sparse format, False otherwise

Returns:

True, if the given matrix uses a sparse format and is expected to occupy less memory than a dense matrix, False otherwise
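
The source does not spell out how the expected memory consumption is estimated. The following sketch shows one plausible estimate for the CSR and CSC formats. It assumes 32-bit indices and counts one index per stored element, one pointer per row (CSR) or column (CSC) plus a final pointer, and the values themselves only if sparse_values is True. The function estimate_sizes_sketch and its index_dtype parameter are hypothetical.

```python
import numpy as np
import scipy.sparse as sp


def estimate_sizes_sketch(array, sparse_format='csr', dtype=None,
                          sparse_values=True, index_dtype=np.uint32):
    """Returns hypothetical (sparse, dense) size estimates in bytes."""
    value_size = np.dtype(array.dtype if dtype is None else dtype).itemsize
    index_size = np.dtype(index_dtype).itemsize
    num_rows, num_cols = array.shape
    # CSR keeps one pointer per row, CSC one per column, plus a final one
    num_pointers = (num_cols if sparse_format == 'csc' else num_rows) + 1
    stored_value_size = value_size if sparse_values else 0
    num_stored = int(array.nnz)
    sparse_size = num_stored * (index_size + stored_value_size) + num_pointers * index_size
    dense_size = num_rows * num_cols * value_size
    return sparse_size, dense_size


# A 100 x 50 matrix with only 10 non-zero elements
mostly_zero = np.zeros((100, 50), dtype=np.float32)
mostly_zero[::10, 0] = 1.0
sparse_size, dense_size = estimate_sizes_sketch(sp.csr_matrix(mostly_zero))
print(sparse_size, dense_size, sparse_size < dense_size)  # 484 20000 True

# A matrix without any zeros gains nothing from a sparse format
full = sp.csr_matrix(np.ones((4, 4), dtype=np.float32))
full_sparse_size, full_dense_size = estimate_sizes_sketch(full)
print(full_sparse_size, full_dense_size, full_sparse_size < full_dense_size)  # 148 64 False
```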