Etiq Docs
Scan
A scan is a test pipeline applied to a snapshot: it tests whether the model/snapshot has a specific issue. Calling a scan is a one-liner:
snapshot.scan_<scan_name>()
# for example
snapshot.scan_bias_metrics()

Issues, Metrics, Thresholds

A scan is a testing pipeline. Many scans test for a single issue only, e.g. whether accuracy is above a certain threshold, but others, e.g. scan_bias_sources, test for many different issues at the same time because it is more efficient. This is why we added ISSUE as a sub-element of the scan.
Whether an issue is found or not is based on whether the METRIC associated with the issue is outside acceptable THRESHOLDS.
You can set thresholds based on your use case, although we provide config files with suggested thresholds to get you started. You can also add custom metrics as per this section.
A subset of metrics is MEASURES. In our convention, measures are used to uncover the causes of issues rather than high-level issues on your snapshot/model, e.g. a correlation coefficient would be a measure, though the line is blurry.
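The issue/metric/threshold relationship can be sketched in a few lines of plain Python. This is an illustration of the concept only, not the Etiq API; the metric names and threshold values below are made up:

```python
# Sketch of the issue/metric/threshold relationship (plain Python,
# not the Etiq API): an issue is flagged when a metric falls outside
# its acceptable thresholds.

def find_issues(metrics, thresholds):
    """Return the names of metrics whose values fall outside [low, high]."""
    issues = []
    for name, value in metrics.items():
        low, high = thresholds[name]
        if not (low <= value <= high):
            issues.append(name)
    return issues

metrics = {"accuracy": 0.71, "tpr": 0.93, "tnr": 0.88}
# Note the upper bound: suspiciously high accuracy is also flagged
# (e.g. possible leakage).
thresholds = {"accuracy": (0.75, 0.99), "tpr": (0.80, 1.0), "tnr": (0.80, 1.0)}

print(find_issues(metrics, thresholds))  # ['accuracy']
```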

Scans Summary

| Scan Type | Issue | Metric/Measure | Release |
| --- | --- | --- | --- |
| Accuracy metrics | Is accuracy above or below the accepted threshold? (Accuracy above threshold can also be a problem.) | Accuracy (no. correctly labelled / total); TPR: true positive rate; TNR: true negative rate | 1.3.1 |
| Bias metrics | Is a given bias metric above or below the acceptable threshold? | Equal opportunity; Demographic parity; Equal_odds_TNR; Individual fairness; Individual fairness counterfactuals | 1.3.1 |
| Bias sources | What proxies and sampling issues could lead to bias later on? For automatically derived business rules, use the option auto in your config file. | A measure of correlation (4 options based on feature types: Pearson, Cramér's V, Rank-Biserial, Point-Biserial), plus differential measures between demographic groups | 1.3.5 |
| Leakage | Target leakage: has the target leaked into a feature you use in your model? Demographic leakage: has a demographic feature leaked into a feature in your model? | A measure of correlation (4 options based on feature types: Pearson, Cramér's V, Rank-Biserial, Point-Biserial), rather than a metric | 1.3.5 |
| Drift metrics | Feature drift: has the feature dataset changed from the initial/benchmark dataset? For which feature? | Kolmogorov-Smirnov; Jensen-Shannon distance; PSI: Population Stability Index | 1.3.1 |
| Drift metrics | Target drift: has the target feature distribution changed from the initial/benchmark dataset? | Kolmogorov-Smirnov; Jensen-Shannon distance; PSI: Population Stability Index | 1.3.1 |
| Drift metrics | Concept drift: have the relationships between target and features changed from the initial/benchmark dataset? | Earth Mover's Distance; Kullback-Leibler divergence; Jensen-Shannon distance | 1.3.5 |
| Accuracy metrics RCA | Are there segments in the data where the model seems to be performing considerably worse than average? | Accuracy (no. correctly labelled / total); TPR: true positive rate; TNR: true negative rate | 1.3.3 |
| Bias metrics RCA | Are there segments in the data where the model performs worse for one demographic group than for another? | Equal opportunity; Demographic parity; Equal_odds_TNR; Individual fairness | 1.3.3 |
| Drift metrics RCA | Feature drift RCA: are there segments in the data where feature drift was observed? Target drift RCA: are there segments where target drift was observed? | Kolmogorov-Smirnov; Jensen-Shannon distance; PSI: Population Stability Index | 1.3.5 |
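As a concrete illustration of one of the bias metrics above, demographic parity can be computed by hand. The snippet uses the common rate-difference form (positive-outcome rate of the privileged group minus that of the unprivileged group); Etiq's exact formulation may differ, so treat this as a sketch, not the library's implementation:

```python
# Hand-computed demographic parity (rate-difference form), purely for
# illustration; the group data below is made up.

def positive_rate(outcomes):
    """Fraction of individuals receiving the positive outcome."""
    return sum(outcomes) / len(outcomes)

# 1 = positive outcome (e.g. loan approved), one list per demographic group
privileged = [1, 1, 0, 1, 1, 0, 1, 1]    # 6/8 approved
unprivileged = [1, 0, 0, 1, 0, 0, 1, 0]  # 3/8 approved

parity_gap = positive_rate(privileged) - positive_rate(unprivileged)
print(round(parity_gap, 3))  # 0.375 -- a large gap would be flagged as an issue
```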
Etiq also has test suites available in other areas, such as explainability-related issues or drift metrics RCA, which are not currently part of the public release. If you are interested in them, just get in touch with us: [email protected]
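The drift metrics listed above are ordinary statistical measures. Below is a minimal sketch of PSI (Population Stability Index) using the textbook formula on pre-binned proportions; it is not Etiq's implementation, the binning is assumed to have been done already, and the example distributions are made up:

```python
import math

# Minimal PSI sketch: sum over bins of (actual - expected) * ln(actual / expected),
# where expected/actual are per-bin proportions of the benchmark and
# current datasets. Illustrative only, not Etiq's implementation.

def psi(expected, actual, eps=1e-6):
    total = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)  # guard against empty bins
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total

benchmark = [0.25, 0.25, 0.25, 0.25]  # feature distribution at training time
current = [0.40, 0.30, 0.20, 0.10]    # same feature in production

print(round(psi(benchmark, current), 4))
# A common rule of thumb: PSI < 0.1 little drift, 0.1-0.25 moderate, > 0.25 major.
```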

Parameters

To be able to use the scans, you will also need to log parameters about the dataset. As the scans primarily handle classification problems at this stage, the parameters are as follows:
  • For all scans:
    • ‘label’ - Feature you are predicting
    • ‘train_valid_test_splits’ (if your model is already trained and you’re providing only the test dataset for scans, please set the percentages accordingly)
    • Optional: ‘cat_col’ - list of categorical features
    • Optional: ‘cont_col’ - list of continuous features
Parameters ‘cat_col’ and ‘cont_col’ are optional, but for the scans relying on correlations, having these parameters logged means the scan can use the right measure.
  • For bias scans:
    • ‘protected’ - a demographic feature or features that you are checking for bias for (protected characteristics) - for more information please see Bias Scans section
    • ‘privileged’ - usually the majority class or the class not protected by legislation
    • ‘unprivileged’ - the minority class or the class protected by legislation
    • ‘positive_outcome_label’ - for bias-type tests it’s important to know which outcome label for the predicted feature is a positive outcome for the individual (e.g. low likelihood of default on a loan, or high likelihood of performing well in a role). This allows you to set up the test to understand whether the group that needs to be ‘protected’ is more likely to be treated negatively by the model. (For more details please see the Bias tests section.)
    • ‘negative_outcome_label’ - a negative outcome for the individual (e.g. high likelihood of default on a loan)
Having appropriate and accurate labels for your features means that you’ll be able to make use of the automated segment discovery and business rules creation that come with the dashboard.
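Gathered together, the parameters above might look like the plain Python dict below. The key names are taken from this page; the feature names and values are illustrative, the split format (fractions summing to 1) is an assumption, and the exact call that logs them to Etiq (typically via the config file) is not shown, since it depends on your setup:

```python
# The scan parameters described above, collected into a plain dict.
# Key names come from this page; values and feature names are made up,
# and the split format is an assumption.

params = {
    # For all scans
    "label": "loan_default",                     # feature you are predicting
    "train_valid_test_splits": [0.0, 0.0, 1.0],  # model pre-trained: test data only
    "cat_col": ["gender", "employment_type"],    # optional: categorical features
    "cont_col": ["income", "age"],               # optional: continuous features
    # For bias scans
    "protected": "gender",                   # demographic feature checked for bias
    "privileged": "male",                    # class not protected by legislation
    "unprivileged": "female",                # class protected by legislation
    "positive_outcome_label": "no_default",  # good outcome for the individual
    "negative_outcome_label": "default",     # bad outcome for the individual
}

# Basic sanity checks before logging
assert abs(sum(params["train_valid_test_splits"]) - 1.0) < 1e-9
assert not set(params["cat_col"]) & set(params["cont_col"])
assert params["positive_outcome_label"] != params["negative_outcome_label"]
```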
Metrics, thresholds and parameters are customized as part of the config file (see the next key concept).