bqlearn.metrics.anchor_transition_matrix

bqlearn.metrics.anchor_transition_matrix(y_prob, quantile=0.97, anchor_idx=None)

Compute the anchor transition matrix [1].

It uses anchor points \(A\) as trusted points in an unlabelled dataset.

\[\forall i \in [\![1,K]\!], A_i = \operatorname*{argmax}_{x \in D} \mathbb{P}(Y=i|X=x)\]

It then uses the predictions of a model trained on untrusted data, evaluated at the anchor points, to estimate the transition matrix.

\[\forall (i,j) \in [\![1,K]\!]^2, \hat{T}_{(i,j)} = \mathbb{P}(\tilde{Y}=j|X=A_i)\]
Parameters:
y_prob : array-like of shape (n_samples, n_classes)

Predicted probabilities, as returned by a classifier’s predict_proba method.

quantile : float, default=0.97

Quantile of the predicted probabilities used to select the anchor points. Using a quantile below 1 instead of the maximum filters out outliers with unusually high predicted probabilities.

anchor_idx : array-like of shape (n_classes,), default=None

Indices of the anchor points. If not None, the provided anchor points are used instead of being computed.

Returns:
C : ndarray of shape (n_classes, n_classes)

Anchor transition matrix whose entry in row i and column j estimates the probability that a sample whose true label is class i is assigned the (possibly corrupted) label of class j.
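
The estimator described above can be sketched in plain NumPy. This is an illustrative sketch, not bqlearn's actual source: the helper name `anchor_transition_sketch` is hypothetical, and the way `quantile` is applied here (masking probabilities strictly above the per-class quantile before taking the argmax) is an assumption that follows the outlier filtering idea in [1]. With `quantile=1.0` it reduces to the plain argmax in the definition of \(A_i\).

```python
import numpy as np


def anchor_transition_sketch(y_prob, quantile=0.97, anchor_idx=None):
    """Illustrative estimate of the anchor transition matrix (hypothetical helper)."""
    y_prob = np.asarray(y_prob, dtype=float)
    if anchor_idx is None:
        # Per-class threshold: the requested quantile of each probability column.
        thresholds = np.quantile(y_prob, quantile, axis=0)
        # Assumption: probabilities strictly above the threshold are treated as
        # outliers and excluded from the argmax.
        masked = np.where(y_prob > thresholds, -np.inf, y_prob)
        # A_i = index of the sample with the highest remaining P(Y=i | X=x).
        anchor_idx = masked.argmax(axis=0)
    anchor_idx = np.asarray(anchor_idx)
    # Row i of the estimate is the predicted distribution at anchor A_i.
    return y_prob[anchor_idx]


# Toy predicted probabilities for 4 samples and 2 classes.
y_prob = np.array([[0.9, 0.1],
                   [0.8, 0.2],
                   [0.3, 0.7],
                   [0.2, 0.8]])

T_argmax = anchor_transition_sketch(y_prob, quantile=1.0)
T_given = anchor_transition_sketch(y_prob, anchor_idx=[1, 2])
```

Because every row is a row of `y_prob`, each row of the result sums to 1 whenever the classifier outputs proper probability distributions.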

References

[1] G. Patrini et al., “Making Deep Neural Networks Robust to Label Noise: a Loss Correction Approach”, 2017.