bqlearn.irbl.IRBL

class bqlearn.irbl.IRBL(base_estimator, final_estimator)

A Reweighted Classifier for Biquality Learning.

An IRBL [1] classifier is a meta-algorithm that uses the covariate shift trick to reweight untrusted examples, using two classifiers learned on the trusted and untrusted datasets respectively.
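
The reweighting idea can be sketched in plain Python. This is only an illustration under stated assumptions, not bqlearn's implementation: it assumes the weight of an untrusted example is the ratio of the probabilities that the trusted and untrusted classifiers assign to its observed label, that trusted examples keep a weight of 1, and that sample_quality uses 1 for trusted and 0 for untrusted. The helper name irbl_style_weights and the eps guard are hypothetical.

```python
def irbl_style_weights(p_trusted, p_untrusted, y, sample_quality, eps=1e-12):
    """Hypothetical helper: weight untrusted examples by p_T(y|x) / p_U(y|x).

    p_trusted[i][k] and p_untrusted[i][k] are the class-k probabilities that
    the two base classifiers assign to example i. Trusted examples
    (sample_quality == 1, an assumed encoding) keep a weight of 1.
    """
    weights = []
    for p_t, p_u, label, quality in zip(p_trusted, p_untrusted, y, sample_quality):
        if quality == 1:
            weights.append(1.0)
        else:
            # Ratio of the two conditional probabilities of the observed label;
            # eps avoids a division by zero (an assumption of this sketch).
            weights.append(p_t[label] / max(p_u[label], eps))
    return weights


# Toy probabilities for three examples and two classes.
p_trusted = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
p_untrusted = [[0.6, 0.4], [0.5, 0.5], [0.5, 0.5]]
y = [0, 0, 1]
sample_quality = [1, 0, 0]
weights = irbl_style_weights(p_trusted, p_untrusted, y, sample_quality)
```

In this toy run the second example is down-weighted because the trusted classifier finds its label less plausible than the untrusted classifier does.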

Parameters:
base_estimator : object, optional (default=None)

The base estimator from which the IRBL classifier is built. Support for probability prediction is required.

final_estimator : object, optional (default=None)

The final estimator from which the IRBL classifier is built. Support for sample weighting is required.
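
The two requirements above can be checked before fitting. The following is a minimal sketch with a hypothetical helper (check_irbl_estimators is not part of bqlearn) and stub classes standing in for real estimators; it assumes "probability prediction" means a predict_proba method and "sample weighting" means a sample_weight argument to fit.

```python
import inspect


def check_irbl_estimators(base_estimator, final_estimator):
    """Hypothetical pre-flight check, not part of bqlearn."""
    if not hasattr(base_estimator, "predict_proba"):
        raise TypeError("base_estimator must implement predict_proba")
    fit_params = inspect.signature(final_estimator.fit).parameters
    if "sample_weight" not in fit_params:
        raise TypeError("final_estimator.fit must accept sample_weight")
    return True


class ProbabilisticStub:
    """Stub that predicts probabilities and accepts sample weights."""

    def fit(self, X, y, sample_weight=None):
        return self

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


class UnweightedStub:
    """Stub whose fit method has no sample_weight argument."""

    def fit(self, X, y):
        return self


ok = check_irbl_estimators(ProbabilisticStub(), ProbabilisticStub())
try:
    check_irbl_estimators(ProbabilisticStub(), UnweightedStub())
    rejected = False
except TypeError:
    rejected = True
```

scikit-learn provides a comparable utility, sklearn.utils.validation.has_fit_parameter, for the second check.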

Attributes:
final_estimator_ : classifier

The final fitted estimator.

sample_weight_ : ndarray, shape (n_samples,)

The weights of the examples computed during fit().

classes_ : ndarray of shape (n_classes,)

The class labels.

n_classes_ : int

The number of classes.

The number of classes.

References

[1] P. Nodet, V. Lemaire, A. Bondu, A. Cornuéjols, “Importance Reweighting for Biquality Learning”, IJCNN, 2021.

Methods

decision_function(X)

Call decision function of the final_estimator.

fit(X, y[, sample_quality])

Fit the reweighted model.

get_params([deep])

Get parameters for this estimator.

predict(X)

Predict the classes of X.

predict_log_proba(X)

Predict log probability for each possible outcome.

predict_proba(X)

Predict probability for each possible outcome.

score(X, y[, sample_weight])

Return the mean accuracy on the given test data and labels.

set_params(**params)

Set the parameters of this estimator.

decision_function(X)

Call decision function of the final_estimator.

Parameters:
X : array-like, shape (n_samples, n_features)

The input samples.

Returns:
y : ndarray, shape (n_samples,)

The decision function values returned by the final_estimator for the input samples.

fit(X, y, sample_quality=None)

Fit the reweighted model.

Parameters:
X : {array-like, sparse matrix} of shape (n_samples, n_features)

The training input samples. Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK, and LIL are converted to CSR.

y : array-like of shape (n_samples,)

The target values (class labels).

sample_quality : array-like, shape (n_samples,), default=None

Per-sample quality indicators that distinguish trusted examples from untrusted ones.
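
A common way to prepare the inputs of fit is to stack a trusted and an untrusted dataset and flag each row. The sketch below uses toy lists and assumes the encoding 1 = trusted, 0 = untrusted, which this page does not state; the final call is shown as a comment because it requires bqlearn and two concrete estimators.

```python
# Toy data standing in for a small trusted set and a larger untrusted set.
X_trusted = [[0.0, 1.0], [1.0, 0.0]]
y_trusted = [0, 1]
X_untrusted = [[0.1, 0.9], [0.9, 0.2], [0.8, 0.1]]
y_untrusted = [0, 0, 1]  # may contain label noise

# Stack both sets and flag each row: 1 = trusted, 0 = untrusted (assumed encoding).
X = X_trusted + X_untrusted
y = y_trusted + y_untrusted
sample_quality = [1] * len(X_trusted) + [0] * len(X_untrusted)

# With bqlearn installed, the call would then look like this (not executed here):
# clf = IRBL(base_estimator, final_estimator).fit(X, y, sample_quality=sample_quality)
```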

Returns:
self : object

Fitted estimator.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

predict(X)

Predict the classes of X.

Parameters:
X : array-like, shape (n_samples, n_features)

The input samples.

Returns:
y : ndarray, shape (n_samples,)

The predicted classes.

predict_log_proba(X)

Predict log probability for each possible outcome.

Parameters:
X : array-like, shape (n_samples, n_features)

The input samples.

Returns:
log_p : array, shape (n_samples, n_classes)

The log of the predicted class probabilities of the input samples.
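
The relation between the two probability outputs can be sketched as follows, assuming predict_log_proba is the element-wise natural logarithm of predict_proba, as is conventional for scikit-learn classifiers. This page does not state how IRBL computes it; the helper log_proba is hypothetical.

```python
import math


def log_proba(proba_rows):
    """Element-wise natural log of class probabilities; log(0) maps to -inf."""
    return [
        [math.log(p) if p > 0 else float("-inf") for p in row]
        for row in proba_rows
    ]


# Two samples, two classes; the second sample has a zero-probability class.
proba = [[0.25, 0.75], [1.0, 0.0]]
log_p = log_proba(proba)
```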

predict_proba(X)

Predict probability for each possible outcome.

Parameters:
X : array-like, shape (n_samples, n_features)

The input samples.

Returns:
p : array, shape (n_samples, n_classes)

The class probabilities of the input samples. The order of the classes corresponds to that in the attribute classes_.
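
Because the columns follow the order of classes_, a probability row can be mapped back to a label by indexing classes_ with the position of the largest entry. A minimal sketch with made-up labels and probabilities; the helper most_likely_classes is hypothetical.

```python
classes_ = ["cat", "dog", "fish"]  # hypothetical fitted labels
proba = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]  # made-up predict_proba output


def most_likely_classes(proba_rows, classes):
    """Pick, for each row, the label whose column holds the largest probability."""
    return [
        classes[max(range(len(row)), key=row.__getitem__)]
        for row in proba_rows
    ]


predicted = most_likely_classes(proba, classes_)
```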

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set of each sample be correctly predicted.

Parameters:
X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:
score : float

Mean accuracy of self.predict(X) w.r.t. y.
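
The effect of sample_weight on the mean accuracy can be sketched as a weighted average of per-sample correctness. The helper weighted_accuracy is a hypothetical stand-in written for illustration, not the library's code.

```python
def weighted_accuracy(y_true, y_pred, sample_weight=None):
    """Weighted fraction of correct predictions (uniform weights by default)."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return correct / sum(sample_weight)


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]  # the third prediction is wrong
plain = weighted_accuracy(y_true, y_pred)
weighted = weighted_accuracy(y_true, y_pred, sample_weight=[1, 1, 2, 1])
```

Giving the misclassified sample a larger weight lowers the score, since its error counts for more of the total weight.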

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.
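
The <component>__<parameter> naming can be illustrated by splitting each key at the first double underscore. This is a sketch of the naming scheme only; split_nested_params is a hypothetical helper, and the parameter names C and max_depth are examples that depend on which estimators are actually passed to IRBL.

```python
def split_nested_params(params):
    """Group ``<component>__<parameter>`` keys by component (hypothetical helper)."""
    own, nested = {}, {}
    for key, value in params.items():
        name, delim, sub_key = key.partition("__")
        if delim:
            # Everything after the first "__" is forwarded to the component.
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


own, nested = split_nested_params(
    {"final_estimator__C": 0.1, "base_estimator__max_depth": 3}
)
```

With real estimators, the equivalent call would be along the lines of clf.set_params(final_estimator__C=0.1), provided the final estimator exposes a parameter named C.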