1.2 Functionality
If you have purchased Etiq v1.2 via the AWS Marketplace, use these docs. This version provides bias-identification functionality and a different dashboard. Version 1.3 will be available via AWS Marketplace shortly.
A typical use case for etiq: suppose you are building a predictive model using tabular customer data. You have wrangled your data and tried a few model classes. Now you want to see whether your model is unintentionally discriminating against certain demographic groups, e.g. based on gender, ethnicity or age, and you have access to the demographic labels. This is where you can use the etiq library.
The etiq library provides different kinds of pipelines that are intended to plug into your existing pipelines and test them for a specific purpose. The pipelines currently available focus on identifying and mitigating unintended discrimination. Etiq pipelines provide identify methods, repair methods, and metrics to evaluate outcomes, including fairness metrics.
For more details on the theoretical underpinnings of our methods, go to Definitions. We'd like to stress that the 'fairness' literature and methodology covers a very wide field with many divergent opinions. Where applicable we refer to the framework we are using, but some of our approaches are experimental.
In addition to the library, the solution includes a dashboard that presents the results of the different pipelines logged by the library. This added functionality includes the ability to retrieve pipeline results from one session to another.
If you would like support, or want to submit comments, feature requests, issues or bug reports, please log in to our Slack channel or email us at info@etiq.ai.
The Etiq library supports Python 3.6, 3.7, 3.8 and 3.9 on Windows, macOS and Linux. Apple M1 Macs are not supported at the moment.
We recommend using pip to install the Etiq library and its dependencies.
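For example, from your terminal (the package name shown here is an assumption; check the instructions that came with your AWS Marketplace purchase for the exact name and any private index URL):

```shell
# Assumed package name -- confirm against your purchase instructions
pip install etiq-core
```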
From your Python environment, import etiq_core.
Once you have imported the library, go to the dashboard site on your AWS-hosted version, sign up, and log in.
To start storing metrics from your notebook or other IDE to your dashboard, you will need a token to associate your session with your account. To create this token, go to the Token Management window in your account and click Add New Access Token. Then copy and paste the token into your notebook.
From your notebook, log in to the dashboard and you're all set. As you log different pipelines and debiasing pipelines and tie them to a project, you'll be able to retrieve them both via the dashboard and via your notebook across sessions.
Data about your pipelines and debiasing pipelines are stored on Etiq's AWS instance. However, your datasets and models themselves are not stored anywhere, so you can rest assured.
If your security set-up requires a deployment entirely on your own cloud instance or on premises, get in touch with us at info@etiq.ai.
Please don't leave your token lying around: anyone who finds it can use it to retrieve the information stored about your pipelines. Treat it as you would a username/password combination.
To start using the versioning and dashboard functionality, set a project and a project name. You only have to do this once per session, and all details logged as part of data pipelines or debias pipelines will be stored. In your dashboard you will then see the metrics of all your pipelines and debiasing pipelines, split by project name.
To follow the example analysis below, download the Adult dataset from https://archive.ics.uci.edu/ml/datasets/adult or load it in the notebook as a Pandas dataframe from the samples included in the library. A demo notebook is available here.
The DataPipeline object has the model we'd like to evaluate, the dataset used to train it and the fairness metrics that are most relevant to our project.
Below, we define the parameters for the debiasing process using the BiasParams structure. This allows us to specify the protected category (often a demographic feature you'd like to mitigate bias for) using the protected parameter; specify who is in the privileged and unprivileged groups (set using the privileged and unprivileged parameters respectively); and specify the positive and negative outcomes in this dataset (set using the positive_outcome_label and negative_outcome_label parameters respectively).
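To picture how these fields fit together, here is a small stand-in structure. This is not etiq's actual BiasParams class, only an illustrative sketch: the field names come from the description above, while the types and the Adult-dataset values are assumptions.

```python
from dataclasses import dataclass

# Illustrative stand-in for etiq's BiasParams; field names are taken from
# the docs above, types and example values are assumptions.
@dataclass
class BiasParamsSketch:
    protected: str               # demographic feature to check bias for
    privileged: str              # value of `protected` in the privileged group
    unprivileged: str            # value of `protected` in the unprivileged group
    positive_outcome_label: str  # label treated as the positive outcome
    negative_outcome_label: str  # label treated as the negative outcome

# Example values for the Adult dataset (predicting income > $50K)
params = BiasParamsSketch(
    protected="gender",
    privileged="Male",
    unprivileged="Female",
    positive_outcome_label=">50K",
    negative_outcome_label="<=50K",
)
print(params.protected)
```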
Even if your model does not use the specific demographic features you want to identify bias for, you should include them in the dataset (etiq will automatically exclude them later during any model refitting).
It is important to note that the protected feature is removed from the dataset for the purposes of training a model and will only be used to evaluate the model for bias.
Specify transforms like Dropna or EncodeLabels to make sure data are numeric and without missing values.
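The effect of such transforms can be reproduced with plain pandas and scikit-learn. This is only an illustration of what Dropna- and EncodeLabels-style steps do, not etiq's implementation:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy frame with a missing value and two string-valued columns
df = pd.DataFrame({
    "age": [39.0, 50.0, None, 28.0],
    "gender": ["Male", "Female", "Female", "Male"],
    "income": ["<=50K", ">50K", "<=50K", ">50K"],
})

# Dropna-style step: remove rows containing missing values
df = df.dropna().reset_index(drop=True)

# EncodeLabels-style step: make string columns numeric
for col in ["gender", "income"]:
    df[col] = LabelEncoder().fit_transform(df[col])

print(df.isna().sum().sum())  # 0 -- no missing values remain
```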
Choose the metrics you want computed for this project.
Each of these metrics measures how well our model performs when classifying the data. For example, the accuracy metric returns the fraction of the training dataset that is correctly classified, while the equal_opportunity metric measures the difference in true positive rate between the privileged demographic group and the unprivileged demographic group. The other available metrics used to evaluate model performance are:
- accuracy (the proportion of outcomes correctly classified out of all outcomes)
- true_neg_rate (the proportion of negative outcome labels correctly classified out of all negative outcome labels)
- true_pos_rate (the proportion of positive outcome labels correctly classified out of all positive outcome labels)
- demographic_parity (the difference in the proportion of positive labels between the privileged and unprivileged demographic groups)
- equal_odds_tpr & equal_odds_tnr (unlike equal_opportunity, these criteria look at both the difference in true positive rate and the difference in true negative rate between the privileged and unprivileged groups, with the aim of keeping both differences minimal)
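The definitions above can be written out directly. The sketch below computes them with NumPy on toy arrays; it mirrors the stated definitions rather than calling etiq itself:

```python
import numpy as np

# Toy ground truth, predictions, and group membership (True = privileged)
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
priv   = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)

def accuracy(t, p):
    return np.mean(t == p)              # correct out of all outcomes

def true_pos_rate(t, p):
    return np.mean(p[t == 1] == 1)      # correct out of all positive labels

def true_neg_rate(t, p):
    return np.mean(p[t == 0] == 0)      # correct out of all negative labels

# demographic_parity: difference in positive-prediction rate between groups
dem_parity = np.mean(y_pred[priv]) - np.mean(y_pred[~priv])

# equal_opportunity: difference in true positive rate between groups
equal_opp = (true_pos_rate(y_true[priv], y_pred[priv])
             - true_pos_rate(y_true[~priv], y_pred[~priv]))

print(accuracy(y_true, y_pred))  # 0.75
```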
For a discussion on how metrics behave and recommended usage, please see our blogpost.
Load the model you'd like to evaluate with the dataset, or choose one of the classifiers that are already available. For this release the available wrappers are: DefaultXGBoostClassifier (a wrapper around the XGBoost classifier), DefaultRandomForestClassifier (a wrapper around the random forest classifier from sklearn) and DefaultLogisticRegression (a wrapper around the logistic regression classifier from sklearn).
You can use a pre-trained model and are not restricted to the model classes we provide wrappers for; the wrappers simply cover some widely used model classes for convenience.
Models from other libraries (Etiq supports models from XGBoost, LightGBM, PyTorch, TensorFlow, Keras and scikit-learn) may be used by wrapping them in the EtiqModel class. We could, for example, create an LGBMClassifier model, train it, and use the trained model.
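As a sketch of this "train elsewhere, then wrap" flow, the example below trains a scikit-learn classifier standing in for the LGBMClassifier; the wrapping call itself is etiq-specific, so it is left as a comment with an assumed signature:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Train a model outside etiq, exactly as you normally would
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# The fitted estimator is what you would then hand to etiq, e.g.
# (assumed API, see the EtiqModel docs for the actual signature):
# wrapped = etiq_core.EtiqModel(model)
print(model.score(X, y))
```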
Now you can create the DataPipeline. The DatasetLoader class takes the data, transforms it, splits it into training/validation/test sets, and loads it in. The DataPipeline then computes your metrics of interest on the Dataset, using the model you provided.
DebiasPipeline takes as inputs a data pipeline, an identify and/or repair method, and the metrics you want to use to evaluate your model. Identify methods, as the name suggests, are intended to help you identify bias issues. Repair methods are designed to help fix or mitigate the issues identified, and include algorithms implemented from the fairness literature.
The repair pipeline we currently provide works at the pre-processing level, i.e. it changes the dataset with the objective of mitigating some of the sources of bias in it. Methods at the in-processing or post-processing stages can be more effective from an optimization point of view, but they may not address some of the issues in the data itself, which is why pre-processing is a good starting area. Our full solution includes additional pipelines.
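A pre-processing repair of this kind can be illustrated with plain pandas: resample the unprivileged group until its positive-outcome rate matches the privileged group's. This is only a toy illustration of the idea, not etiq's RepairResamplePipeline:

```python
import pandas as pd

# Toy dataset: privileged group has a 50% positive-outcome rate,
# unprivileged group only ~17%
df = pd.DataFrame({
    "group":   ["priv"] * 6 + ["unpriv"] * 6,
    "outcome": [1, 1, 1, 0, 0, 0,   1, 0, 0, 0, 0, 0],
})

target_rate = df.loc[df.group == "priv", "outcome"].mean()  # 0.5

# Oversample positive rows in the unprivileged group until its
# positive-outcome rate matches the privileged group's rate
unpriv = df[df.group == "unpriv"]
pos, neg = unpriv[unpriv.outcome == 1], unpriv[unpriv.outcome == 0]
n_pos_needed = int(round(target_rate * len(neg) / (1 - target_rate)))
pos_resampled = pos.sample(n=n_pos_needed, replace=True, random_state=0)

repaired = pd.concat([df[df.group == "priv"], pos_resampled, neg])
print(repaired.loc[repaired.group == "unpriv", "outcome"].mean())  # 0.5
```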
An example debiasing pipeline is given below.
IdentifyBiasSources denotes the type of identify pipeline you are using; it is the one provided in this release. Similarly, RepairResamplePipeline denotes the type of repair pipeline.
The parameters for the identify pipeline available in this release are as follows:
- group_definition = unsupervised. This pipeline method looks for groups (i.e. segments of the dataset) with issues that could cause bias. The test version includes only this option, but the full package offers multiple options.
- nr_groups - the number of groups/segments you think your dataset could be split into. Experiment with a few different values depending on how large your dataset is.
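The unsupervised grouping idea can be pictured with ordinary clustering: below, k-means splits a toy dataset into nr_groups segments whose outcome rates can then be compared. This is only an analogy for what the identify pipeline does, not its actual algorithm:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data: two well-separated blobs with very different outcome rates
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(4, 1, (50, 3))])
outcome = np.concatenate([rng.binomial(1, 0.7, 50), rng.binomial(1, 0.2, 50)])

nr_groups = 2  # analogous to the nr_groups parameter described above
labels = KMeans(n_clusters=nr_groups, n_init=10, random_state=0).fit_predict(X)

# A large gap in positive-outcome rate between segments is the kind of
# signal an identify pipeline flags for further investigation
rates = [outcome[labels == g].mean() for g in range(nr_groups)]
print(sorted(rates))
```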
Version 1.3 has more functionality across bias pipelines and other areas.
As with the data pipeline, when running the pipeline, we get the logs of how the pipeline has run:
In the fairness literature, 'mitigation' is the preferred terminology, as these types of issues are hard to remove entirely. Our use of the terms repair and debias refers primarily to mitigation rather than removal.
Now that you've checked the logs and the etiq pipeline has run, retrieve the outputs using the following methods:
Example output:
Our library is intended to help you test your models and see if there are any issues. The pipeline surfaces potential issues, and it is then up to you whether you consider them issues for your specific model or not. For more details on definitions, please see the Definitions tab.
Example output
To help make sense of the segments, we also have a profiler method which gives you an idea about the rows found to have specific issues.
To understand more about the types of errors this pipeline finds, please use the following method; it will give you definitions and thresholds used. Also, please see a discussion of different bias sources at this link.
In release 1.3 you can customize the definitions and thresholds. This release will be available on AWS Marketplace shortly.
If you've just built a pipeline using a repair method and want to see whether the issues you identified earlier are still present, use the evaluate method.
To see all your projects and pipelines from your notebook or IDE use the methods below:
This should give you an output like the one below:
If you've just opened a new session but want your pipelines and debiasing pipelines logged under the project you used in a previous session, find that project's ID and set your current project to it:
To see what pipelines are associated with the project, use the methods below: