File stratified_sampling_output_wise.hpp

template<typename LabelMatrix, typename IndexIterator>
class LabelWiseStratification
#include <stratified_sampling_output_wise.hpp>

Implements iterative stratified sampling for selecting a subset of the available training examples as proposed in the following publication:

Sechidis K., Tsoumakas G., Vlahavas I. (2011) On the Stratification of Multi-label Data. In: Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol 6913. Springer.

Template Parameters:
  • LabelMatrix – The type of the label matrix that provides random or row-wise access to the labels of the training examples

  • IndexIterator – The type of the iterator that provides access to the indices of the examples that should be considered
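The iterative stratification idea from the cited publication can be illustrated with a simplified two-subset sketch (sample vs. remainder). This is not the code from this header: the function name `iterativeStratifiedSample`, the representation of labels as a vector of label indices per example, and the deterministic tie-breaking are assumptions made for illustration. The publication breaks ties randomly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the documented API.
// labels[i] holds the (distinct) indices of the labels that are relevant to example i.
// Returns one flag per example: true if the example ends up in the sample.
std::vector<bool> iterativeStratifiedSample(const std::vector<std::vector<std::size_t>>& labels,
                                            std::size_t numLabels, double sampleSize) {
    assert(sampleSize > 0.0 && sampleSize < 1.0);
    const std::size_t numExamples = labels.size();
    std::vector<bool> inSample(numExamples, false);
    std::vector<bool> assigned(numExamples, false);

    // Number of not yet assigned examples per label.
    std::vector<std::size_t> remaining(numLabels, 0);
    for (const auto& row : labels)
        for (std::size_t label : row) ++remaining[label];

    // Desired number of examples per subset (0 = sample, 1 = remainder), overall and per label.
    const double fraction[2] = {sampleSize, 1.0 - sampleSize};
    double desired[2];
    std::vector<double> desiredPerLabel[2];
    for (int s = 0; s < 2; ++s) {
        desired[s] = fraction[s] * static_cast<double>(numExamples);
        for (std::size_t count : remaining)
            desiredPerLabel[s].push_back(fraction[s] * static_cast<double>(count));
    }

    // Assigns example i to subset s and updates all counters.
    auto assign = [&](std::size_t i, int s) {
        assigned[i] = true;
        inSample[i] = (s == 0);
        desired[s] -= 1.0;
        for (std::size_t label : labels[i]) {
            --remaining[label];
            desiredPerLabel[s][label] -= 1.0;
        }
    };

    while (true) {
        // Pick the label with the fewest remaining examples (at least one).
        std::size_t rarest = numLabels;
        for (std::size_t l = 0; l < numLabels; ++l)
            if (remaining[l] > 0 && (rarest == numLabels || remaining[l] < remaining[rarest])) rarest = l;
        if (rarest == numLabels) break;

        for (std::size_t i = 0; i < numExamples; ++i) {
            const auto& row = labels[i];
            if (assigned[i] || std::find(row.begin(), row.end(), rarest) == row.end()) continue;
            // Prefer the subset that still needs more examples with this label;
            // on a tie, the subset that needs more examples overall, then subset 0.
            const double a = desiredPerLabel[0][rarest], b = desiredPerLabel[1][rarest];
            const int s = (a != b) ? (a > b ? 0 : 1) : (desired[0] >= desired[1] ? 0 : 1);
            assign(i, s);
        }
    }

    // Examples without any label go to the subset that still needs more examples.
    for (std::size_t i = 0; i < numExamples; ++i)
        if (!assigned[i]) assign(i, desired[0] >= desired[1] ? 0 : 1);
    return inSample;
}
```

Processing the rarest label first keeps infrequent labels from being distributed unevenly, which is the main motivation given in the publication.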

Public Functions

LabelWiseStratification(std::unique_ptr<RNG> rngPtr, const LabelMatrix &labelMatrix, IndexIterator indicesBegin, IndexIterator indicesEnd)
Parameters:
  • rngPtr – A unique pointer to an object of type RNG that should be used for generating random numbers

  • labelMatrix – A reference to an object of template type LabelMatrix that provides random or row-wise access to the labels of the training examples

  • indicesBegin – An iterator to the beginning of the indices of the examples that should be considered

  • indicesEnd – An iterator to the end of the indices of the examples that should be considered

void sampleWeights(BitWeightVector &weightVector, EqualWeightVector::const_iterator weightIterator, float32 sampleSize, uint32 minSamples, uint32 maxSamples)

Randomly selects a stratified sample of the available examples and sets the corresponding weights in a BitWeightVector to the value in a given iterator, while the remaining weights are set to 0.

Parameters:
  • weightVector – A reference to an object of type BitWeightVector that the weights should be written to

  • weightIterator – An iterator that provides access to the weights of individual training examples

  • sampleSize – The fraction of the available examples to be selected

  • minSamples – The minimum number of examples to be included in the sample. Must be at least 1

  • maxSamples – The maximum number of examples to be included in the sample. Must be at least minSamples, or 0 if the number of examples should not be restricted
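One plausible way in which sampleSize, minSamples and maxSamples could combine into an absolute number of examples is sketched below. The helper `calculateNumSamples` is hypothetical, and both the truncating conversion and the final cap at the number of available examples are assumptions. The documentation above only states the constraints on the parameters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the documented API.
std::uint32_t calculateNumSamples(std::uint32_t numExamples, float sampleSize,
                                  std::uint32_t minSamples, std::uint32_t maxSamples) {
    assert(minSamples >= 1);
    assert(maxSamples == 0 || maxSamples >= minSamples);
    // Assumption: the fraction is converted by truncation.
    std::uint32_t numSamples = static_cast<std::uint32_t>(sampleSize * static_cast<float>(numExamples));
    // Enforce the lower bound.
    numSamples = std::max(numSamples, minSamples);
    // A value of 0 means that there is no upper bound.
    if (maxSamples != 0) numSamples = std::min(numSamples, maxSamples);
    // Assumption: never request more examples than are available.
    return std::min(numSamples, numExamples);
}
```

With 100 available examples, a sampleSize of 0.5 and no upper bound, this sketch yields 50 examples. Setting maxSamples to 20 reduces the result to 20.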

void sampleWeights(DenseWeightVector<float32> &weightVector, DenseWeightVector<float32>::const_iterator weightIterator, float32 sampleSize, uint32 minSamples, uint32 maxSamples)

Randomly selects a stratified sample of the available examples and sets the corresponding weights in a DenseWeightVector<float32> to the value in a given iterator, while the remaining weights are set to 0.

Parameters:
  • weightVector – A reference to an object of type DenseWeightVector<float32> that the weights should be written to

  • weightIterator – An iterator that provides access to the weights of individual training examples

  • sampleSize – The fraction of the available examples to be selected

  • minSamples – The minimum number of examples to be included in the sample. Must be at least 1

  • maxSamples – The maximum number of examples to be included in the sample. Must be at least minSamples, or 0 if the number of examples should not be restricted
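The effect that both sampleWeights overloads describe (selected examples receive the weight provided by the iterator, all other weights become 0) can be pictured with plain standard containers. In this sketch, `assignSampleWeights`, the use of `std::vector<float>` in place of the weight vector types, and the `selected` flags are illustrative assumptions, not the library's types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the documented API.
// weightIterator must be a random-access iterator over the original weights.
template<typename WeightIterator>
void assignSampleWeights(std::vector<float>& weights, WeightIterator weightIterator,
                         const std::vector<bool>& selected) {
    assert(weights.size() == selected.size());
    for (std::size_t i = 0; i < weights.size(); ++i) {
        // Selected examples keep their original weight, all others are set to 0.
        weights[i] = selected[i] ? static_cast<float>(weightIterator[i]) : 0.0f;
    }
}
```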

void sampleBiPartition(BiPartition &partition)

Randomly splits the available examples into two distinct sets and updates a given BiPartition accordingly.

Parameters:
  • partition – A reference to an object of type BiPartition to be updated
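A stratified split into two disjoint sets can be sketched as follows. To keep the example short, it assigns a single stratum to each example instead of handling multiple labels per example, and it uses `std::mt19937` instead of the RNG type. `SimpleBiPartition`, `splitStratified` and the `fractionFirst` parameter are hypothetical and only loosely mirror what a BiPartition might store.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <vector>

// Hypothetical stand-in for a partition into two disjoint sets of example indices.
struct SimpleBiPartition {
    std::vector<std::size_t> first;
    std::vector<std::size_t> second;
};

// Hypothetical helper, not part of the documented API.
// stratumOfExample[i] is the stratum that example i belongs to.
SimpleBiPartition splitStratified(const std::vector<int>& stratumOfExample, double fractionFirst,
                                  std::mt19937& rng) {
    assert(fractionFirst >= 0.0 && fractionFirst <= 1.0);
    // Group the example indices by stratum.
    std::map<int, std::vector<std::size_t>> strata;
    for (std::size_t i = 0; i < stratumOfExample.size(); ++i)
        strata[stratumOfExample[i]].push_back(i);

    SimpleBiPartition partition;
    for (auto& entry : strata) {
        std::vector<std::size_t>& members = entry.second;
        // Shuffle within the stratum, then cut it at the (rounded) desired size.
        std::shuffle(members.begin(), members.end(), rng);
        const std::size_t numFirst =
            static_cast<std::size_t>(fractionFirst * static_cast<double>(members.size()) + 0.5);
        for (std::size_t j = 0; j < members.size(); ++j)
            (j < numFirst ? partition.first : partition.second).push_back(members[j]);
    }
    return partition;
}
```

Because every stratum is split separately, each of the two sets receives roughly the same proportion of every stratum, and each example index appears in exactly one of them.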

Private Members

const std::unique_ptr<RNG> rngPtr_
BinarySparseMatrixDecorator<AllocatedBinaryCscView> stratificationMatrix_