bqlearn.metrics.gold_transition_matrix

bqlearn.metrics.gold_transition_matrix(y_true, y_prob, labels=None)

Compute the gold transition matrix [1].

For each class, it averages the predictions of a model trained on untrusted data over the trusted samples of that class:

\[\hat{T}_{(i,*)} = \frac{1}{|D^i_T|}\sum_{(x,y) \in D^i_T}f_U(x)\]

where:

\[\forall k \in [\![1,K]\!], \quad D^k_T = \{(x,y) \in D_T \mid y=k\}\]

Here $f_U$ is the model trained on the untrusted data, $D_T$ is the trusted dataset, and $K$ is the number of classes.
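The per-class average above can be sketched with plain NumPy. This is an illustration of the formula, not bqlearn's implementation: the helper name `gold_transition_matrix_sketch` is hypothetical, and it assumes integer labels `0..K-1` with the columns of `y_prob` in that same order. Leaving the row of a class with no trusted samples at zero is also an assumption.

```python
import numpy as np


def gold_transition_matrix_sketch(y_true, y_prob, n_classes):
    """Hypothetical helper: average y_prob over the trusted samples of each class."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    T = np.zeros((n_classes, n_classes))
    for i in range(n_classes):
        mask = y_true == i  # trusted samples whose true label is class i
        if mask.any():  # assumption: a class with no trusted samples keeps a zero row
            T[i] = y_prob[mask].mean(axis=0)
    return T


# Trusted labels and the predictions of a model trained on untrusted data.
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([[0.8, 0.2], [0.6, 0.4], [0.3, 0.7], [0.1, 0.9]])

T = gold_transition_matrix_sketch(y_true, y_prob, n_classes=2)
print(T)
```

In this toy case, row 0 is the mean of the first two prediction vectors and row 1 the mean of the last two. Each row sums to 1 whenever every row of `y_prob` does.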

Parameters:

y_true : array-like of shape (n_samples,)
    Ground truth (correct) target values.
y_prob : array-like of shape (n_samples, n_classes)
    Predicted probabilities, as returned by a classifier’s predict_proba method.
labels : array-like of shape (n_classes,), default=None
    List of labels to index the matrix. This may be used to reorder or select a subset of labels. If None is given, those that appear at least once in y_true or y_prob are used in sorted order.

Returns:

C : ndarray of shape (n_classes, n_classes)
    Gold transition matrix whose entry in row i and column j estimates the probability that a sample whose true label is the i-th class is corrupted to the j-th class.

References

[1] Hendrycks, “Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise”, 2019