Custom Tests
How to set up your own metrics to run regular scans and RCA type scans
You have multiple ways to customize your tests. You can choose the scan types, scan metrics and thresholds.
You can also add your own custom metrics and measures and include them in your config. Your scans, both the regular scans and the RCA type scans, will then check for these metrics (or measures) as well 🎉. For a notebook and config example, check our GitHub repo.
Custom Metrics for Accuracy and Bias Scans
The decorators you can use to build your custom accuracy and bias metrics are as follows:
Decorator | Description |
---|---|
prediction_values | The model's predictions (scores). Should be a list. |
actual_values | The actuals. If your custom metric is for production and no actuals or model are available, the score is used as the actual when it is provided. Should be a list. |
protected_values | The demographic variable you want to check for bias. If you have multiple demographics, create a feature with their intersection. |
positive_outcome | Directional. What is considered a positive prediction or outcome, e.g. for a lending model, a low risk score or the customer being accepted for the loan. Should be a value. |
negative_outcome | Directional. What is considered a negative prediction or outcome, e.g. for a lending model, a high risk score or the customer being rejected for the loan. Should be a value. |
privileged_class | The class in the demographics that is privileged, i.e. not protected by the legislation. Should be a value. |
unprivileged_class | The class in the demographics that is not privileged and is protected by the legislation. Should be a value; future releases will add support for multiple values here. |
These names follow the parameters available in the config file.
Decorator | Description |
---|---|
@etiq.metrics.accuracy_metric | Logs your metric as an accuracy metric. |
@etiq.metrics.bias_metric | Logs your metric as a bias metric. |
@etiq.custom_metric | Specifies that this is a custom metric. |
Below is an example of how to add a custom metric to the accuracy metrics scan suite:
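As a minimal sketch of the metric logic only, the function below computes plain accuracy from a list of predictions and a list of actuals. The function name `accuracy_custom` and its argument names are illustrative assumptions. The Etiq decorators from the tables above are mentioned only in comments, since their exact call syntax is not shown in this section.

```python
# Hypothetical custom accuracy metric, written as plain Python.
# In Etiq, this function would additionally be marked with
# @etiq.metrics.accuracy_metric and @etiq.custom_metric, and its inputs
# mapped with the prediction_values / actual_values decorators.
# Those decorators are omitted here (assumption: see the Etiq repo for usage).

def accuracy_custom(predictions, actual):
    """Share of predictions that match the actuals (both are lists)."""
    if len(predictions) != len(actual):
        raise ValueError("predictions and actual must have the same length")
    if not predictions:
        raise ValueError("at least one observation is required")
    correct = sum(1 for p, a in zip(predictions, actual) if p == a)
    return correct / len(predictions)


if __name__ == "__main__":
    print(accuracy_custom([1, 0, 1, 1], [1, 0, 0, 1]))
```

The function returns a single number, which is the shape a threshold in the config can be compared against.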
Below is an example of how to add a custom metric to the bias metrics scan suite:
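A bias metric typically compares outcomes between the privileged and unprivileged classes. The sketch below, in plain Python, computes the gap in positive-outcome rates between the two groups. The function name `positive_rate_gap` and the argument names are illustrative assumptions, and the Etiq decorators are again left out because their call syntax is not given here.

```python
# Hypothetical custom bias metric, written as plain Python.
# In Etiq, this would be marked with @etiq.metrics.bias_metric and
# @etiq.custom_metric, with inputs mapped via the decorators in the table
# above (protected_values, privileged_class, and so on). Omitted here.

def positive_rate_gap(predictions, protected, privileged_class,
                      unprivileged_class, positive_outcome):
    """Positive-outcome rate of the privileged class minus the unprivileged class."""
    if len(predictions) != len(protected):
        raise ValueError("predictions and protected must have the same length")

    def positive_rate(group):
        # Predictions for rows that belong to the given demographic class.
        outcomes = [p for p, g in zip(predictions, protected) if g == group]
        if not outcomes:
            raise ValueError(f"no observations for class {group!r}")
        return sum(1 for p in outcomes if p == positive_outcome) / len(outcomes)

    return positive_rate(privileged_class) - positive_rate(unprivileged_class)


if __name__ == "__main__":
    preds = [1, 1, 0, 1, 0, 0]
    groups = ["a", "a", "a", "b", "b", "b"]
    print(positive_rate_gap(preds, groups, "a", "b", positive_outcome=1))
```

A value near 0 means both groups receive the positive outcome at similar rates; a positive value means the privileged class receives it more often.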
Afterwards, don't forget to update your config file with the metric name and the thresholds you want before you run your scan.
Custom Metrics for Drift Scans
You can now add custom metrics for drift scans as well. Examples are available in this notebook.
Below is an example of how to add a custom metric for feature or target drift scans:
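As a minimal sketch of a feature or target drift measure, the function below computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical CDFs) in plain Python. The function name `ks_drift` is an illustrative assumption; the argument names `expected_dist` and `new_dist` follow the naming used on this page. How the function is registered with Etiq is not shown, since that syntax is not given in this section.

```python
from bisect import bisect_right

# Hypothetical custom drift measure (registration with Etiq omitted).
def ks_drift(expected_dist, new_dist):
    """Largest absolute gap between the empirical CDFs of two samples.

    expected_dist: observations of a feature in the baseline dataset
    new_dist: observations of the same feature in the new dataset
    """
    expected = sorted(expected_dist)
    new = sorted(new_dist)
    if not expected or not new:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # The gap between two step functions can only peak at an observed value.
    for x in expected + new:
        cdf_expected = bisect_right(expected, x) / len(expected)
        cdf_new = bisect_right(new, x) / len(new)
        max_gap = max(max_gap, abs(cdf_expected - cdf_new))
    return max_gap


if __name__ == "__main__":
    print(ks_drift([1, 2, 3, 4], [3, 4, 5, 6]))
```

The result lies between 0 (identical empirical distributions) and 1 (no overlap), which makes it straightforward to pair with a threshold.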
Below is an example of how to add a custom metric for a concept drift scan:
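For concept drift, the inputs are probabilities rather than raw observations. The sketch below compares two probability vectors using the total variation distance, in plain Python. The function name `total_variation_drift` is an illustrative assumption, and it assumes both inputs list probabilities for the same target values in the same order. Registration with Etiq is omitted because that syntax is not given in this section.

```python
# Hypothetical custom concept drift measure (registration with Etiq omitted).
def total_variation_drift(expected_dist, new_dist):
    """Total variation distance between two probability vectors.

    expected_dist: probabilities of the target values given a feature value
                   in the baseline dataset
    new_dist: the same probabilities in the new dataset
    Assumes both vectors are aligned on the same target values.
    """
    if len(expected_dist) != len(new_dist):
        raise ValueError("both probability vectors must have the same length")
    return 0.5 * sum(abs(p - q) for p, q in zip(expected_dist, new_dist))


if __name__ == "__main__":
    print(total_variation_drift([0.8, 0.2], [0.6, 0.4]))
```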
Don't forget to add the new metrics to the config file:
To build your own drift type measures, consider the logic of feature/target drift vs. concept drift.
Feature and target drift look at whether the distribution of a given feature (or of the target) has changed. For an explanation of how this is calculated with the out-of-the-box metrics, check out the drift section. When building your own feature/target drift measure, the function takes two arguments, which stand for the following:
first argument (e.g. expected_dist in the example above): observations of a given feature in the baseline dataset
second argument (e.g. new_dist in the example above): observations of a given feature in the new dataset that we're assessing for feature drift
Concept drift looks at whether the relationships between the input features and the target have changed over time. The out-of-the-box measures for concept drift look at how conditional probabilities change between two datasets, for instance the probability that feature A has value 1 given that the target has value 0. The measure does not look at a single probability: conditional probabilities are calculated for the different combinations of target and feature values, and the measure then compares the distribution of all these probabilities in the two datasets. The custom measures follow the same logic. This means that the arguments you use to build concept drift measures stand for slightly different things than those you use for feature/target drift:
first argument (e.g. expected_dist in the example above): probabilities of the target values given a feature value in the baseline dataset
second argument (e.g. new_dist in the example above): probabilities of the target values given a feature value in the new dataset that we're assessing for concept drift
Note that for continuous features and/or target, the values will have to be binned.
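To illustrate the kind of input a concept drift measure works with, the sketch below bins a continuous feature and then computes the probabilities of each target value given one feature bin. This is an illustration in plain Python, not Etiq's internal implementation; the helper names `bin_values` and `conditional_target_probs` and the bin edges are assumptions.

```python
from bisect import bisect_right
from collections import Counter

def bin_values(values, edges):
    """Map each continuous value to a bin index, given sorted bin edges.

    A value equal to an edge falls into the bin above that edge.
    """
    return [bisect_right(edges, v) for v in values]

def conditional_target_probs(feature_bins, target, feature_value, target_values):
    """P(target = t | feature bin = feature_value) for each t in target_values."""
    selected = [t for f, t in zip(feature_bins, target) if f == feature_value]
    if not selected:
        raise ValueError(f"no observations with feature value {feature_value!r}")
    counts = Counter(selected)
    return [counts[t] / len(selected) for t in target_values]


if __name__ == "__main__":
    ages = [22, 35, 47, 58, 31, 64]
    target = [0, 1, 0, 1, 1, 1]
    bins = bin_values(ages, edges=[30, 50])  # three bins: <30, 30-49, >=50
    print(bins)
    print(conditional_target_probs(bins, target, feature_value=1, target_values=[0, 1]))
```

Computing these probabilities once on the baseline dataset and once on the new dataset gives the two vectors that a concept drift measure compares.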
Custom Metrics for RCA Type Scans
You can also use your own custom metric in an RCA type scan.
Continuing the drift measure example above, we can run the feature and target drift metrics RCA scans on the snapshot using the config below:
This would yield the following results:
This means you can now use Etiq to fully customize your tests to your use case, as well as to experiment with the best metrics and measures.
Custom Correlation/Association Measures
For bias sources and leakage scans, we use correlation and association measures. We provide multiple correlation measures out of the box, to be chosen based on the type of features you have: Pearson, Cramér's V, Rank-Biserial and Point-Biserial. For more info, see Bias Sources Scan.
To add your own custom correlation or association measure, use the decorator @correlation_measure, as in the example below:
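As a minimal sketch of the measure logic only, the function below computes Spearman's rank correlation in plain Python. The function name `spearman_correlation` and the two-argument signature are assumptions; the `@correlation_measure` decorator is mentioned only in a comment because its import path and expected function signature are not given in this section.

```python
from math import sqrt

def _average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

# Hypothetical custom measure; in Etiq it would carry @correlation_measure.
def spearman_correlation(x, y):
    """Pearson correlation of the ranks of x and y (Spearman's rho)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    rx, ry = _average_ranks(x), _average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant input has no defined rank correlation
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    print(spearman_correlation([1, 2, 3, 4], [4, 3, 2, 1]))
```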
You can then use your custom measure in the relevant scans. We recommend using it in the scan types mentioned above.