"Algorithmic bias" refers to unintended discrimination occurring as a result of an automated decision. The term "protected feature" refers to a specific demographic characteristic such as age or sex. Legislation defines a series of protected features. For example, in the UK, citizens are protected against discrimination on the basis of age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex or sexual orientation status by the Equality Act 2010.
The unprivileged group within the protected feature (for example, people over 65 when age is the protected feature) tends to be discriminated against and as a result tends to be the one protected by legislation. The privileged group within the protected feature tends to not be discriminated against.
We also use terms such as "debias" in the library as shorthand. The consensus in the literature, and our own view, is that algorithmic bias can be mitigated but not removed entirely.
There is no consensus on the most appropriate way to measure bias. However, depending on the framework used, there are some key metrics worth knowing.
Before we get into this, a quick explanation of model building. A model uses data to make predictions. During training, a model "learns" which combinations of features in the training data predict a positive or negative outcome (the label). Testing the model on a validation or test dataset lets the user quantify how accurate the model's predictions are. Measuring bias is an ongoing research topic. One way to measure it is to compute fairness metrics from a trained model's predictions and the ground truth for a dataset. The fairness metrics attempt to encode in mathematical terms a notion of what a fair outcome should be for the model and the dataset. For each individual project, it is important to consider whether a particular fairness metric actually captures what a fair model means in that context.
Some of the metrics commonly used in the algorithmic fairness literature that the Etiq library provides are:
- Demographic parity - is the ratio of users predicted to be positive over all the users the same for all groups in a demographic? For instance, is the proportion of women accepted for an interview the same as the proportion of men?
- Equal opportunity - is the model as good at identifying actual positives for all demographic groups? That is, is the true positive rate the same for all demographics? The true positive rate measures the proportion of actual positives that are correctly identified as such (e.g., the percentage of sick people who are correctly identified as having the condition). If the true positive rate is lower for a group, that group is likely experiencing bias.
- Equal odds - an extension of equal opportunity, often called equalized odds in the literature. It looks not just at true positive rates but also at false positive rates for the different demographic groups, to ensure that the model performs equally well for all of them.
- Individual fairness - a different angle on bias is to ensure that customers who display the same characteristics are treated the same. There is not yet a single agreed definition of this notion.
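As a minimal sketch of how the group metrics above can be computed from predictions and ground truth, the plain-Python example below derives per-group selection rates, true positive rates and false positive rates, then reports the largest gap between groups for each metric. The function names (`group_rates`, `gap`), the toy data and the choice to summarise each metric as a max-minus-min gap are illustrative assumptions, not the Etiq library's API. Equal odds is taken here as the larger of the true positive rate gap and the false positive rate gap, following the usual equalized odds formulation.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return per-group selection rate, true positive rate and false positive rate.

    y_true and y_pred are sequences of 0/1 labels; groups holds the value of
    the protected feature for each row.
    """
    counts = defaultdict(
        lambda: {"n": 0, "pred_pos": 0, "pos": 0, "tp": 0, "neg": 0, "fp": 0}
    )
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts[group]
        c["n"] += 1
        c["pred_pos"] += pred
        if truth == 1:
            c["pos"] += 1
            c["tp"] += pred  # predicted positive and actually positive
        else:
            c["neg"] += 1
            c["fp"] += pred  # predicted positive but actually negative

    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "selection_rate": c["pred_pos"] / c["n"],
            "tpr": c["tp"] / c["pos"] if c["pos"] else float("nan"),
            "fpr": c["fp"] / c["neg"] if c["neg"] else float("nan"),
        }
    return rates


def gap(rates, key):
    """Largest difference between any two groups for the given rate."""
    values = [r[key] for r in rates.values()]
    return max(values) - min(values)


# Hypothetical toy data: 1 = positive outcome (e.g. invited to interview).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["m", "m", "m", "m", "f", "f", "f", "f"]

rates = group_rates(y_true, y_pred, groups)
demographic_parity_gap = gap(rates, "selection_rate")
equal_opportunity_gap = gap(rates, "tpr")
equal_odds_gap = max(gap(rates, "tpr"), gap(rates, "fpr"))
```

A gap of 0 means the groups are treated identically under that metric; how large a gap is acceptable depends on the project.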
In our understanding of the fairness literature, the key general areas are:
Optimization: pre-processing, in-processing and post-processing methods which attempt to optimize for both fairness metrics and accuracy. Some examples include: mapping the training data to a space independent of the specific demographic, adversarial debiasing, and calibrating the model once it is built. The repairs range from very non-intrusive ones, such as resampling, to those that change the labels and feature distributions quite heavily.
Causality: causality-based approaches overlap with both counterfactual and optimization approaches, but are firmly rooted in the idea that a dataset can be modelled as a causal graph, which can then indicate whether belonging to a certain demographic class impacts other features and, via them, the outcome.
The pipeline we released uses the group metrics and sources of bias approaches.
The sources of bias framework used relies on this lecture. According to this framework, there are roughly 5 sources of bias within the model build process (leaving aside outside issues like team diversity, data collection, etc.). 3 of them are visible from the data and/or model:
- proxies - features that act as proxies for demographics
- sample size disparity - the sample size for the protected demographic group is much lower than for the majority class
- limited features - features might be less reliable for a certain demographic group than for a majority class
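Two of the data-visible sources above, proxies and sample size disparity, lend themselves to simple first-pass checks. The sketch below flags features whose Pearson correlation with a binary protected indicator exceeds a threshold, and reports each group's share of the dataset. The function names, the 0.7 threshold and the toy features are hypothetical choices for illustration, not how the Etiq library detects these issues. Linear correlation will also miss non-linear or multi-feature proxies.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def proxy_candidates(features, protected, threshold=0.7):
    """Return features whose absolute correlation with the protected indicator is >= threshold.

    features maps a feature name to a list of numbers; protected is a list of
    0/1 values marking membership of the unprivileged group.
    """
    flagged = {}
    for name, values in features.items():
        r = pearson(values, protected)
        if abs(r) >= threshold:
            flagged[name] = r
    return flagged


def group_shares(protected):
    """Share of rows belonging to each value of the protected feature."""
    counts = Counter(protected)
    total = len(protected)
    return {group: count / total for group, count in counts.items()}


# Hypothetical toy data: 1 marks the unprivileged group.
protected = [1, 1, 0, 0, 0, 0, 0, 0]
features = {
    "part_time": [1, 1, 0, 0, 0, 0, 0, 0],     # mirrors the protected feature
    "tenure_years": [3, 5, 4, 4, 2, 6, 3, 5],  # unrelated to the protected feature
}

flagged = proxy_candidates(features, protected)
shares = group_shares(protected)
```

Here `part_time` is flagged as a likely proxy, and `shares` shows the unprivileged group making up only a quarter of the rows, which hints at a sample size disparity.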
The remaining 2 sources of bias require more background or context knowledge:
- 'tainted' examples - the target variable is reflective of past bias, e.g. a model predicting who might make a good hire using data on who was hired in the past not on who was the objectively best candidate for the role
- skewed sample - the dataset is not representative of the population for which the model will be used
As expected, different bias sources can be mitigated by different repairs. The repair we focus on at the moment is only at the pre-processing stage: changing the dataset in such a way as to mitigate some of the inherent bias issues it presents.
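To make the idea of a pre-processing repair concrete, here is a minimal sketch of one of the least intrusive options, resampling: rows from under-represented groups are duplicated at random until every group has as many rows as the largest one. This is an illustrative assumption about what such a repair can look like, not a description of the repair the Etiq library implements. The function name `oversample_to_balance`, the `age_band` column and the toy rows are hypothetical.

```python
import random
from collections import defaultdict


def oversample_to_balance(rows, group_key, seed=0):
    """Return a new list of rows in which every group has as many rows as the largest group.

    rows is a list of dicts; group_key names the protected feature. Extra rows
    are drawn at random, with replacement, from the smaller groups.
    """
    rng = random.Random(seed)  # fixed seed so the repair is reproducible
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[group_key]].append(row)

    target = max(len(members) for members in by_group.values())
    balanced = []
    for members in by_group.values():
        balanced.extend(members)  # keep every original row
        extra = target - len(members)
        balanced.extend(rng.choice(members) for _ in range(extra))

    rng.shuffle(balanced)  # avoid leaving the rows grouped by demographic
    return balanced


# Hypothetical toy dataset with a sample size disparity (6 rows vs 2 rows).
rows = (
    [{"age_band": "under_65", "label": i % 2} for i in range(6)]
    + [{"age_band": "over_65", "label": i % 2} for i in range(2)]
)
balanced = oversample_to_balance(rows, "age_band")
```

Resampling leaves labels and feature values untouched, so it addresses sample size disparity but does nothing about proxies or labels that reflect past bias.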
Our goal is for our pipelines to be transparent enough in terms of the outcome that it will be clear to the user how reliable the results are. We are also working on adding stability measures to our library.