bqlearn.corruptions.make_imbalance

bqlearn.corruptions.make_imbalance(y, *arrays, majority_ratio=1.0, imbalance_distribution='step', minority_class_fraction=0.5, random_state=None, labels=None)

Create class imbalance in a multiclass scenario, following [1].

When imbalance_distribution='step', a fraction of the classes, set by minority_class_fraction, is treated as the minority group and subsampled according to majority_ratio.

When imbalance_distribution='linear', imbalance is created across all classes: the subsampling ratio decreases linearly from one class to the next, according to majority_ratio.
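The two distributions described above can be illustrated with a small sketch of the per-class sample counts they would produce. This is not the library's implementation: the helper name target_class_counts is hypothetical, it assumes every class starts with the same number of samples and that majority_ratio is at least 1, and the choice of which classes end up as the minority (here, the last ones in label order) is an assumption.

```python
def target_class_counts(n_per_class, n_classes, majority_ratio=1.0,
                        imbalance_distribution="step",
                        minority_class_fraction=0.5):
    """Hypothetical helper: per-class counts after step or linear imbalance."""
    if majority_ratio < 1.0:
        # Assumption: the ratio is expressed as majority / minority, so >= 1.
        raise ValueError("majority_ratio must be >= 1")

    # Size of the smallest class once the imbalance is applied.
    smallest = n_per_class / majority_ratio

    if imbalance_distribution == "step":
        # A fixed fraction of the classes is subsampled to the same size.
        n_minority = int(minority_class_fraction * n_classes)
        n_majority = n_classes - n_minority
        # Assumption: the last classes in label order are the minority ones.
        return [n_per_class] * n_majority + [round(smallest)] * n_minority

    if imbalance_distribution == "linear":
        if n_classes == 1:
            return [n_per_class]
        # Counts decrease by a constant amount from one class to the next.
        step = (n_per_class - smallest) / (n_classes - 1)
        return [round(n_per_class - i * step) for i in range(n_classes)]

    raise ValueError("imbalance_distribution must be 'step' or 'linear'")


print(target_class_counts(100, 4, majority_ratio=10))
# [100, 100, 10, 10]
print(target_class_counts(100, 4, majority_ratio=10,
                          imbalance_distribution="linear"))
# [100, 70, 40, 10]
```

In both cases the largest class is majority_ratio times the size of the smallest one. The distributions differ only in how the classes in between are sized.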

Parameters:
y : array-like of shape (n_samples,)

The targets.

*arrays : sequence of indexables with length / shape[0] equal to n_samples

Allowed inputs are lists, numpy arrays, scipy-sparse matrices or pandas dataframes.

majority_ratio : float, default=1.0

Ratio between number of samples in majority classes and number of samples in minority classes.

imbalance_distribution : {'step', 'linear'}, default='step'

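Shape of the imbalance across classes. With 'step', a subset of the classes is subsampled; with 'linear', the subsampling ratio decreases linearly across all classes.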

minority_class_fraction : float, default=0.5

Fraction of classes considered as minority classes. Only used when imbalance_distribution='step'.

random_state : int or RandomState, default=None

Controls the randomness of the subsampling procedure.

labels : array-like of shape (n_classes,), default=None

List of class labels. This may be used to reorder or select a subset of labels. If None is given, the labels that appear at least once in y are used in sorted order.

Returns:
y_imbalanced : array-like of shape (n_samples_new,)

The imbalanced targets.

*arrays_imbalanced : list, length=len(arrays)

The corresponding imbalanced arrays.
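As an illustration of how the targets and the companion arrays can be subsampled together so that they stay aligned, here is a minimal standard-library sketch. It is not the library's code: the function subsample_per_class and its counts argument are hypothetical, plain lists stand in for arrays, and keeping the retained samples in their original order is an assumption.

```python
import random
from collections import defaultdict


def subsample_per_class(y, *arrays, counts, random_state=None):
    """Hypothetical helper: keep at most counts[label] samples of each class."""
    rng = random.Random(random_state)  # a fixed seed gives a reproducible draw

    # Group sample indices by class label.
    indices_by_class = defaultdict(list)
    for i, label in enumerate(y):
        indices_by_class[label].append(i)

    # Draw the indices to keep, class by class, without replacement.
    keep = []
    for label, indices in indices_by_class.items():
        n_keep = min(counts.get(label, len(indices)), len(indices))
        keep.extend(rng.sample(indices, n_keep))
    keep.sort()  # assumption: preserve the original sample order

    # Apply the same index selection to y and to every companion array.
    y_new = [y[i] for i in keep]
    arrays_new = [[a[i] for i in keep] for a in arrays]
    return (y_new, *arrays_new)


y = [0] * 6 + [1] * 6
X = list(range(12))
y_small, X_small = subsample_per_class(y, X, counts={0: 2}, random_state=0)
print(len(y_small), len(X_small))
# 8 8
```

Because one index selection is applied to every input, each row of a returned array still corresponds to the label at the same position in the returned targets.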

References

[1] Mateusz Buda, et al. “A systematic study of the class imbalance problem in convolutional neural networks.” Neural Networks, 106:249-259, 2018.