bqlearn.metrics.gold_transition_matrix

bqlearn.metrics.gold_transition_matrix(y_true, y_prob, labels=None)

Compute the gold transition matrix [1].

For each class, it averages the predictions that a model trained on untrusted data makes on the trusted samples of that class:

\[\hat{T}_{(i,*)} = \frac{1}{|D^i_T|}\sum_{x_i \in D^i_T}f_U(x_i)\]

where:

\[\forall k \in [\![1,K]\!], \quad D^k_T = \{(x,y) \in D_T \mid y=k\}\]

and $K$ is the number of classes. Here $f_U$ denotes the model trained on untrusted data and $D_T$ the trusted dataset.
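The formula above can be sketched in a few lines of NumPy. This is an illustrative reimplementation, not the bqlearn source: the name `gold_transition_matrix_sketch` is hypothetical, the `labels` argument is ignored, and the sketch assumes that every class appears at least once in `y_true` and that the columns of `y_prob` follow the sorted class order.

```python
import numpy as np


def gold_transition_matrix_sketch(y_true, y_prob):
    """Average the predicted probabilities over each true class.

    Hypothetical helper for illustration only; it is not part of bqlearn.
    """
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    classes = np.unique(y_true)  # sorted class labels found in y_true
    T = np.zeros((len(classes), y_prob.shape[1]))
    for row, k in enumerate(classes):
        # D_T^k: trusted samples whose true label is k
        T[row] = y_prob[y_true == k].mean(axis=0)
    return T


# Toy trusted set: two samples per class, with made-up model outputs f_U(x).
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([
    [0.9, 0.1],
    [0.7, 0.3],
    [0.2, 0.8],
    [0.4, 0.6],
])
T_hat = gold_transition_matrix_sketch(y_true, y_prob)
print(T_hat)  # approximately [[0.8, 0.2], [0.3, 0.7]]
```

Row 0 is the mean of the first two probability vectors and row 1 the mean of the last two, matching the per-class average in the definition.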

Parameters:
y_true : array-like of shape (n_samples,)

Ground truth (correct) target values.

y_prob : array-like of shape (n_samples, n_classes)

Predicted probabilities, as returned by a classifier’s predict_proba method.

labels : array-like of shape (n_classes,), default=None

List of labels to index the matrix. This may be used to reorder or select a subset of labels. If None is given, those that appear at least once in y_true or y_prob are used in sorted order.

Returns:
C : ndarray of shape (n_classes, n_classes)

Gold transition matrix whose entry in the i-th row and j-th column estimates the probability that a sample whose true label is the i-th class is corrupted to the j-th class.
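As a sanity check on this interpretation, and assuming each output $f_U(x_i)$ is a probability vector whose entries sum to 1 (an assumption about the classifier, not something the function is stated to enforce), every row of the matrix also sums to 1, because it is an average of such vectors:

\[\sum_{j=1}^{K} \hat{T}_{(i,j)} = \frac{1}{|D^i_T|}\sum_{x_i \in D^i_T}\sum_{j=1}^{K} f_U(x_i)_j = \frac{1}{|D^i_T|}\sum_{x_i \in D^i_T} 1 = 1\]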

References

[1] Hendrycks, “Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise”, 2019