Etiq Docs
Accuracy RCA Scan
scan_accuracy_metrics_rca is the typical RCA scan. You can find an example notebook here.
An example config file is as below:

```json
{
  "dataset": {
    "label": "income",
    "bias_params": {
      "protected": "gender",
      "privileged": 1,
      "unprivileged": 0,
      "positive_outcome_label": 1,
      "negative_outcome_label": 0
    },
    "train_valid_test_splits": [0.0, 1.0, 0.0],
    "remove_protected_from_features": true
  },
  "scan_accuracy_metrics": {
    "thresholds": {
      "accuracy": [0.8, 1.0],
      "true_pos_rate": [0.75, 1.0],
      "true_neg_rate": [0.7, 1.0]
    }
  },
  "scan_accuracy_metrics_rca": {
    "thresholds": {
      "accuracy": [0.8, 1.0],
      "true_pos_rate": [0.7, 1.0]
    },
    "metric_filter": ["accuracy", "true_pos_rate"],
    "minimum_segment_size": 1000
  }
}
```
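Each threshold is a `[lower, upper]` pair: a metric value outside that range is flagged as an issue. Before logging a config, a quick sanity check can catch malformed threshold entries. This is a plain-Python sketch; the `validate_thresholds` helper is illustrative and not part of the Etiq API:

```python
def validate_thresholds(config: dict) -> list:
    """Collect any "thresholds" entries that are not a valid [lower, upper] pair."""
    problems = []
    for scan_name, scan_cfg in config.items():
        if not isinstance(scan_cfg, dict):
            continue
        for metric, bounds in scan_cfg.get("thresholds", {}).items():
            if (not isinstance(bounds, list) or len(bounds) != 2
                    or not bounds[0] <= bounds[1]):
                problems.append(f"{scan_name}.{metric}: {bounds}")
    return problems

# A well-formed fragment of the config above passes with no problems reported.
config = {
    "scan_accuracy_metrics_rca": {
        "thresholds": {"accuracy": [0.8, 1.0], "true_pos_rate": [0.7, 1.0]},
        "minimum_segment_size": 1000,
    }
}
assert validate_thresholds(config) == []
```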
The syntax to run the scan, after logging the model and dataset, is the following:

```python
snapshot.scan_accuracy_metrics_rca()
```
As with all RCA scans, the principle behind the scan is that it searches through different combinations of records and finds those combinations for which the metric falls outside the thresholds. As with the usual scans, you can set the thresholds that define an issue for your use case. You can also select which metrics you do or do not want RCA for, e.g. "metric_filter": ["accuracy", "true_pos_rate"].
To keep the per-group metrics meaningful, the scan requires a minimum number of records for a group to count as a segment. You can change this with the following parameter: "minimum_segment_size": 1000, as in the config example above.
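The principle can be pictured as: group records into candidate segments, compute the metric per segment, and keep only the segments that are both large enough and outside the thresholds. The following is a simplified, self-contained sketch of that idea, not the Etiq implementation; the segment names and `flag_segments` helper are illustrative:

```python
def flag_segments(segments, lower, upper, minimum_segment_size=1000):
    """Return names of segments whose accuracy falls outside [lower, upper].

    `segments` maps a segment name to a list of (label, prediction) pairs.
    Segments smaller than minimum_segment_size are skipped, since their
    metric estimates would be too noisy to act on.
    """
    flagged = []
    for name, records in segments.items():
        if len(records) < minimum_segment_size:
            continue
        accuracy = sum(y == p for y, p in records) / len(records)
        if not (lower <= accuracy <= upper):
            flagged.append(name)
    return flagged

# Two toy segments of 100 records each: one at 60% accuracy, one at 95%.
segments = {
    "age<30": [(1, 1)] * 60 + [(1, 0)] * 40,
    "age>=30": [(1, 1)] * 95 + [(0, 1)] * 5,
}
flag_segments(segments, 0.8, 1.0, minimum_segment_size=100)  # → ["age<30"]
```

With the default `minimum_segment_size=1000`, neither toy segment would be considered at all, which is why the parameter matters when your dataset or its subgroups are small.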
At the moment, results can only be retrieved through the IDE or from the snapshot, using the following syntax and then calling each of the returned elements:

```python
(segments_accuracy, issues_accuracy, issue_summary_accuracy) = snapshot.scan_accuracy_metrics_rca()
```
The end results attach business rules to the segments, to help you understand which records you’re having an issue with.
We are working to add more retrieval methods.
Out of the box you can scan for the following metrics:
  1. accuracy - the % correct out of total
  2. true positive rate - the proportion of positive outcome labels that are correctly classified, out of all positive outcome labels
  3. true negative rate - the proportion of negative outcome labels that are correctly classified, out of all negative outcome labels
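For reference, the three metrics above can be computed from labels and predictions as follows (plain Python, assuming binary classification with 1 as the positive outcome label, as in the config example):

```python
def accuracy_metrics(labels, preds):
    """Compute accuracy, true positive rate and true negative rate."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    return {
        "accuracy": (tp + tn) / len(labels),         # correct out of total
        "true_pos_rate": tp / pos if pos else None,  # correct positives / all positives
        "true_neg_rate": tn / neg if neg else None,  # correct negatives / all negatives
    }
```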
This pipeline is experimental.