bqlearn.model_selection.make_biquality_cv
- bqlearn.model_selection.make_biquality_cv(X, sample_quality, cv=None, *, y=None, groups=None)
Utility function for building a biquality cross-validator.
In the Biquality Data setup, cross-validators behave the same way as usual cross-validators, except that untrusted samples should be removed from the generated test sets.
At the moment this cross-validator is built on PredefinedSplit: untrusted samples are removed from all test sets generated by the provided cv. For this reason, each sample should be assigned to at most one test set; otherwise a warning is issued.
- Parameters:
- X : array-like of shape (n_samples, n_features)
The samples.
- sample_quality : array-like of shape (n_samples,)
The sample quality.
- cv : int, cross-validation generator or an iterable, default=None
Determines the cross-validation splitting strategy. Possible inputs for cv are:
  - None, to use the default 5-fold cross validation,
  - integer, to specify the number of folds,
  - CV splitter,
  - An iterable that generates (train, test) splits as arrays of indices.
For integer/None inputs, if y is either binary or multiclass, StratifiedKFold is used. In all other cases, KFold is used.
- y : array-like of shape (n_samples,), default=None
The target variable.
- groups : array-like of shape (n_samples,), default=None
Group labels for the samples, used while splitting the dataset into train/test sets.
- Returns:
- biquality_cv : a cross-validator instance.
The return value is a cross-validator which generates the train/test splits via the split method.
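The behaviour described above can be illustrated with scikit-learn alone. The following is a minimal sketch, not the library's actual implementation: the helper name `sketch_biquality_cv` is hypothetical, and it assumes that a quality of 0 marks an untrusted sample and 1 a trusted one, which this page does not state. It relies on the fact that PredefinedSplit never places a sample whose fold id is -1 in a test set.

```python
import numpy as np
from sklearn.model_selection import KFold, PredefinedSplit


def sketch_biquality_cv(X, sample_quality, cv):
    """Hypothetical helper: mimic the described behaviour with PredefinedSplit."""
    sample_quality = np.asarray(sample_quality)
    # -1 tells PredefinedSplit that a sample never appears in a test set.
    test_fold = np.full(len(sample_quality), -1, dtype=int)
    # Record, for each sample, the single test fold the base cv assigns it to.
    for fold_id, (_, test_idx) in enumerate(cv.split(X)):
        test_fold[test_idx] = fold_id
    # Assumption: quality 0 means untrusted. Such samples stay train-only.
    test_fold[sample_quality == 0] = -1
    return PredefinedSplit(test_fold)


X = np.arange(20).reshape(10, 2)
sample_quality = np.array([1, 0, 1, 0, 1, 1, 0, 1, 1, 0])

biquality_cv = sketch_biquality_cv(X, sample_quality, KFold(n_splits=5))
splits = list(biquality_cv.split())

for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Untrusted samples still appear in the training indices of every split; they are only excluded from the test indices. Because each sample carries a single fold id, a base cv that assigns a sample to several test sets cannot be represented faithfully this way, which is consistent with the warning mentioned above.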