bqlearn.metrics.anchor_transition_matrix

bqlearn.metrics.anchor_transition_matrix(y_prob, quantile=0.97, anchor_idx=None)

Compute the anchor transition matrix [1].

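It uses anchor points \(A\) as trusted points in an unlabelled dataset \(D\). The anchor point of a class is the sample most confidently predicted as belonging to that class.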

\[\forall i \in [\![1,K]\!], A_i = \operatorname*{argmax}_{x \in D} \mathbb{P}(Y=i|X=x)\]

It then uses the predictions of a model trained on the untrusted data, evaluated at the anchor points, to estimate the transition matrix.

\[\forall (i,j) \in [\![1,K]\!]^2, \hat{T}_{(i,j)} = \mathbb{P}(\tilde{Y}=j|X=A_i)\]

Parameters:

y_prob : array-like of shape (n_samples, n_classes): Predicted probabilities, as returned by a classifier’s predict_proba method.
quantile : float, default=0.97: Quantile of the predicted probabilities used to select the anchor points. A value below 1 avoids selecting outlier points with unusually high predicted probabilities.
anchor_idx : array-like of shape (n_classes), default=None: Indices of the anchor points. If not None, the provided anchor points are used instead of being computed.

Returns:

C : ndarray of shape (n_classes, n_classes): Anchor transition matrix whose entry in row i and column j is the estimated probability that a sample whose true label is the i-th class is given the j-th class as its corrupted label.
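
Examples

The two formulas above can be illustrated with a minimal NumPy sketch. This is not the bqlearn implementation: the function name `anchor_transition_sketch` is hypothetical, and the way the quantile is applied (taking, for each class, the sample whose predicted probability is closest to the requested quantile of that column) is an assumption. With `quantile=1.0` the sketch reduces to the argmax rule in the definition of \(A_i\).

```python
import numpy as np


def anchor_transition_sketch(y_prob, quantile=0.97, anchor_idx=None):
    """Illustrative estimate of an anchor transition matrix (hypothetical helper)."""
    y_prob = np.asarray(y_prob, dtype=float)

    if anchor_idx is None:
        # Per-class quantile of the predicted probabilities, shape (n_classes,).
        thresholds = np.quantile(y_prob, quantile, axis=0)
        # Assumption: the anchor of class i is the sample whose probability
        # for class i is closest to that quantile.
        anchor_idx = np.argmin(np.abs(y_prob - thresholds), axis=0)

    anchor_idx = np.asarray(anchor_idx)
    # Row i of the result is the predicted distribution at anchor point A_i.
    return y_prob[anchor_idx]


# Toy probabilities for 4 samples and 2 classes.
y_prob = np.array(
    [
        [0.9, 0.1],
        [0.6, 0.4],
        [0.2, 0.8],
        [0.3, 0.7],
    ]
)

# With quantile=1.0, anchors are the per-class argmax: samples 0 and 2.
T_max = anchor_transition_sketch(y_prob, quantile=1.0)

# Anchors can also be supplied directly, mirroring the anchor_idx parameter.
T_given = anchor_transition_sketch(y_prob, anchor_idx=[1, 3])
```

Because each row is copied from one row of `y_prob`, every row of the result sums to 1 whenever the input rows do.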

References

[1] G. Patrini, “Making Deep Neural Networks Robust to Label Noise: a Loss Correction Approach”, 2017