Custom Tests
How to set up your own metrics to run regular scans and RCA type scans
You have multiple ways to customize your tests. You can choose the scan types, scan metrics and thresholds.
You can also add your own custom metrics and measures and include them in your config. Your scans, both the regular scans and the RCA type scans, will then check for these metrics (or measures) as well. For a notebook and config example, check our GitHub repo.
The decorators you can use to build your custom accuracy and bias metrics are as follows:
prediction_values
refers to what the model scores
should be a list
actual_values
refers to the actuals
if your custom metric is for production and a score is provided but neither actuals nor a model are available, the score is used as the actual
should be a list
protected_values
refers to the demographic variable you want to check for bias
if you have multiple demographic variables, create a single feature that encodes their intersection
positive_outcome
directional, refers to what is considered a positive prediction or outcome
e.g. in the case of a lending model, it would be a low risk score or the customer being accepted for the loan
should be a value
negative_outcome
directional, refers to what is considered a negative prediction or outcome
e.g. in the case of a lending model, it would be a high risk score or the customer being rejected for the loan
should be a value
privileged_class
refers to the class in the demographics which is privileged - not protected by the legislation
should be a value
unprivileged_class
refers to the class in the demographics which is not privileged - and which is protected by the legislation
should be a value; functionality for multiple values will be added in future releases
They follow the parameters available in the config file.
@etiq.metrics.accuracy_metric
refers to logging your metric as an accuracy metric
@etiq.metrics.bias_metric
refers to logging your metric as a bias metric
@etiq.custom_metric
specifies that this is a custom metric
Below is an example of how to add a custom metric to the accuracy metrics scan suite:
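As a minimal sketch of what the body of such a metric could look like, the function below computes balanced accuracy from two lists. It is plain Python and deliberately leaves out the Etiq decorators (@etiq.metrics.accuracy_metric, @etiq.custom_metric and the parameter decorators), because their exact call signatures are not shown on this page. The function name, the argument names and the choice of balanced accuracy are illustrative assumptions, not part of the Etiq library.

```python
def balanced_accuracy(predictions, actuals, positive_outcome=1, negative_outcome=0):
    """Mean of the true positive rate and the true negative rate.

    Hypothetical metric body: in Etiq this function would additionally be
    wrapped with the accuracy/custom metric decorators described above.
    """
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")

    positives = sum(1 for a in actuals if a == positive_outcome)
    negatives = sum(1 for a in actuals if a == negative_outcome)
    if positives == 0 or negatives == 0:
        raise ValueError("actuals must contain both outcomes")

    # Count correct predictions separately for each actual outcome.
    true_positives = sum(
        1 for p, a in zip(predictions, actuals)
        if a == positive_outcome and p == positive_outcome
    )
    true_negatives = sum(
        1 for p, a in zip(predictions, actuals)
        if a == negative_outcome and p == negative_outcome
    )
    return 0.5 * (true_positives / positives + true_negatives / negatives)
```

Both inputs are lists, matching the "should be a list" requirement for prediction_values and actual_values.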
Below is an example of how to add a custom metric to the bias metrics scan suite:
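A minimal sketch of a bias metric body, assuming the metric receives the predictions, the protected feature and the outcome and class values described above. It computes the gap in positive-outcome rates between the privileged and unprivileged classes. The Etiq decorators (@etiq.metrics.bias_metric, @etiq.custom_metric and the parameter decorators) are omitted because their exact signatures are not shown on this page; the function and argument names are hypothetical.

```python
def positive_rate_gap(predictions, protected, positive_outcome,
                      privileged_class, unprivileged_class):
    """Positive-outcome rate of the privileged class minus that of the
    unprivileged class. A value near 0 means both classes receive the
    positive outcome at a similar rate.

    Hypothetical metric body: the Etiq decorators are left out here.
    """
    if len(predictions) != len(protected):
        raise ValueError("predictions and protected must have the same length")

    def positive_rate(group):
        # Predictions for the rows that belong to this demographic class.
        outcomes = [p for p, g in zip(predictions, protected) if g == group]
        if not outcomes:
            raise ValueError(f"no rows found for class {group!r}")
        return sum(1 for p in outcomes if p == positive_outcome) / len(outcomes)

    return positive_rate(privileged_class) - positive_rate(unprivileged_class)
```

For instance, with a privileged class that is accepted 75% of the time and an unprivileged class that is accepted 25% of the time, the function returns 0.5.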
Afterwards, don't forget to update your config file with the metric name and the thresholds you want before you run your scan.
You can now add custom metrics for drift scans as well. Examples are available in this notebook.
Below is an example of how to add a custom metric for feature or target drift scans:
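A minimal sketch of a feature/target drift measure body, assuming the measure receives two lists of raw observations named expected_dist (baseline) and new_dist (new dataset). It returns the shift in the mean, scaled by the baseline standard deviation. The measure itself is chosen purely for illustration, and the Etiq decorator that registers it as a drift measure is omitted because it is not named on this page.

```python
from statistics import mean, pstdev


def standardized_mean_shift(expected_dist, new_dist):
    """Absolute difference between the two means, in units of the
    baseline standard deviation. 0.0 means the mean has not moved.

    Hypothetical drift measure: the name and the formula are assumptions.
    """
    if not expected_dist or not new_dist:
        raise ValueError("both inputs must contain at least one observation")

    baseline_spread = pstdev(expected_dist)
    if baseline_spread == 0:
        raise ValueError("baseline observations have zero variance")

    return abs(mean(new_dist) - mean(expected_dist)) / baseline_spread
```

A threshold on this value could then be set in the config in the same way as for the out-of-the-box drift measures.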
Below is an example of how to add a custom metric for a concept drift scan:
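A minimal sketch of a concept drift measure body. It assumes that expected_dist and new_dist are lists of conditional probabilities and that the two lists are aligned, so that position i refers to the same combination of target and feature values in both datasets. It returns half the sum of absolute differences (the total variation distance when each list sums to 1). The alignment assumption, the names and the choice of distance are illustrative, and the Etiq decorator is omitted because it is not named on this page.

```python
def probability_shift(expected_dist, new_dist):
    """Half the summed absolute difference between two aligned lists of
    conditional probabilities.

    Hypothetical concept drift measure; assumes both lists use the same
    ordering of target/feature value combinations.
    """
    if len(expected_dist) != len(new_dist):
        raise ValueError("both probability lists must have the same length")

    return 0.5 * sum(abs(e - n) for e, n in zip(expected_dist, new_dist))
```

Identical probability lists give 0.0; for lists that each sum to 1, completely disjoint distributions give 1.0.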
Don't forget to add the new metrics to the config file:
To build your own drift type measures, consider the logic of feature/target drift vs. concept drift.
Feature and target drift look at whether the distribution of a given feature (or of the target) has changed. For an explanation of how this is calculated with the out-of-the-box metrics, check out the drift section. When building your own feature/target drift measure, the two arguments stand for the following:
first argument (e.g. expected_dist in the example above): observations of a given feature in the baseline dataset
second argument (e.g. new_dist in the example above): observations of a given feature in the new dataset that we're assessing for feature drift
Concept drift looks at whether the relationships between the input features and the target have changed over time. The out-of-the-box measures for concept drift look at the change between two datasets in, for instance, the probability that feature A has value 1 given that the target has value 0. The measure does not look at a single probability value: conditional probabilities are calculated for the different combinations of target and feature values, and the measure then compares the distribution of all these probabilities across the two datasets. The custom measures follow the same logic. This means that the parameters you use to build concept drift measures stand for slightly different things than those you use for feature/target drift:
first argument (e.g. expected_dist in the example above): probabilities of the target values given a feature value in the baseline dataset
second argument (e.g. new_dist in the example above): probabilities of the target values given a feature value in the new dataset that we're assessing for concept drift
Note that for continuous features and/or target, the values will have to be binned.
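To illustrate the binning step, here is a minimal sketch of how conditional probabilities of the target given a binned feature could be computed in plain Python. This is not how Etiq computes them internally. The equal-width binning, the function names and the (bin, target value) keys are assumptions made for the example.

```python
from collections import Counter, defaultdict


def bin_index(value, lower, upper, n_bins):
    """Index of the equal-width bin that contains value; values outside
    [lower, upper] are clamped to the first or last bin."""
    if upper == lower:
        return 0
    position = int((value - lower) / (upper - lower) * n_bins)
    return min(max(position, 0), n_bins - 1)


def conditional_target_probabilities(feature, target, lower, upper, n_bins=4):
    """Return {(bin, target_value): P(target = target_value | feature in bin)}."""
    counts = defaultdict(Counter)
    for x, t in zip(feature, target):
        counts[bin_index(x, lower, upper, n_bins)][t] += 1

    probabilities = {}
    for b, target_counts in counts.items():
        rows_in_bin = sum(target_counts.values())
        for t, count in target_counts.items():
            probabilities[(b, t)] = count / rows_in_bin
    return probabilities
```

When comparing a baseline and a new dataset, the same lower, upper and n_bins should be used for both, so that the bins (and therefore the probabilities) are comparable.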
You can also use your own custom metric in an RCA type scan.
Continuing the drift measure example above, we can run the feature and target drift RCA scans on the snapshot using the config below:
This would yield the following results:
This means you can now use Etiq to fully customize your tests to your use case, as well as to experiment with the best metrics and measures.
To add your own custom correlation or association metric, use the @correlation_measure decorator, as in the example below:
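A minimal sketch of a correlation measure body, using Spearman rank correlation as an example of a measure that is not in the out-of-the-box list. The code is plain Python and leaves out the @correlation_measure decorator, since its import path and arguments are not shown on this page; the function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_correlation(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    ranks_x, ranks_y = average_ranks(x), average_ranks(y)
    mean_x, mean_y = mean(ranks_x), mean(ranks_y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(ranks_x, ranks_y))
    spread_x = sum((a - mean_x) ** 2 for a in ranks_x)
    spread_y = sum((b - mean_y) ** 2 for b in ranks_y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("each input needs at least two distinct values")
    return covariance / (spread_x * spread_y) ** 0.5
```

Because it works on ranks, this measure picks up monotonic relationships that are not linear, which is one reason to add it alongside Pearson.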
You can then use this for the relevant scans - we recommend using them in the scan types exemplified above.
For bias sources and leakage scans, we use correlation and association measures. We provide multiple correlation measures out of the box, to be chosen based on the type of features you have: Pearson, Cramér's V, Rank-Biserial and Point-Biserial.