Low Level API

More complex scans, such as bias_sources, are built on longer pipelines with multiple steps. This section describes the structure of these pipelines.

These pipelines can currently be used and customized in the IDE, but they are not displayed in the dashboard. As this API evolves, we will surface methods that let you add them to the dashboard as well.

DataPipeline

To follow the example analysis below, download the Adult dataset from https://archive.ics.uci.edu/ml/datasets/adult or load it in the notebook as a Pandas dataframe from the samples included in the library. A demo notebook is available at https://github.com/ETIQ-AI/demo/blob/main/DemoAdultLibrary03.ipynb

data = load_sample('adultdata')

The DataPipeline object has the model we'd like to evaluate, the dataset used to train it and associated fairness metrics.

Below, we define the parameters for the debiasing process using the BiasParams structure. It lets us specify:

  • the protected category (often a demographic feature you'd like to mitigate bias for), using the protected parameter;

  • who is in the privileged and unprivileged groups, using the privileged and unprivileged parameters respectively;

  • which outcome in this dataset is positive and which is negative, using the positive_outcome_label and negative_outcome_label parameters respectively.

debias_param = BiasParams(protected='gender',
                          privileged='Male',
                          unprivileged='Female', 
                          positive_outcome_label='>50K',
                          negative_outcome_label='<=50K')
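
To make the roles of these five parameters concrete, here is a minimal plain-Python sketch that does not use etiq. It only shows how the values above partition rows into groups and outcomes. The rows and the helper name describe_row are made up for illustration, and etiq's internal handling may differ.

```python
# Illustrative only: plain Python, not etiq's implementation.
# The parameter values mirror the BiasParams call above; the rows are made up.
params = {
    "protected": "gender",
    "privileged": "Male",
    "unprivileged": "Female",
    "positive_outcome_label": ">50K",
    "negative_outcome_label": "<=50K",
}

rows = [
    {"gender": "Male", "income": ">50K"},
    {"gender": "Female", "income": "<=50K"},
    {"gender": "Female", "income": ">50K"},
]


def describe_row(row, params, label="income"):
    """Return (group, outcome) for one row (hypothetical helper)."""
    value = row[params["protected"]]
    if value == params["privileged"]:
        group = "privileged"
    elif value == params["unprivileged"]:
        group = "unprivileged"
    else:
        group = "other"
    is_positive = row[label] == params["positive_outcome_label"]
    return group, "positive" if is_positive else "negative"


described = [describe_row(r, params) for r in rows]
print(described)
```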

Include the demographic features you want to identify bias for in the dataset, even if your model does not use them. Etiq removes the protected feature from the dataset before training or refitting a model and uses it only to evaluate the model for bias.

Specify transforms like Dropna or EncodeLabels to make sure data are numeric and without missing values. Preferably use your own transform functions.

transforms = [Dropna, EncodeLabels] 
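
As a rough illustration of what these two transforms are for, the sketch below drops rows with missing values and maps string categories to integer codes in plain Python. This is an assumption about their purpose based on the description above, not etiq's code or its transform interface. The function names drop_missing and encode_labels are hypothetical.

```python
def drop_missing(rows):
    """Keep only rows where no value is None."""
    return [row for row in rows if all(v is not None for v in row.values())]


def encode_labels(rows):
    """Replace each string value with an integer code, per column."""
    codes = {}  # column -> {category: code}
    encoded = []
    for row in rows:
        new_row = {}
        for col, value in row.items():
            if isinstance(value, str):
                col_codes = codes.setdefault(col, {})
                # Codes are assigned in order of first appearance.
                new_row[col] = col_codes.setdefault(value, len(col_codes))
            else:
                new_row[col] = value
        encoded.append(new_row)
    return encoded, codes


raw = [
    {"age": 39, "gender": "Male", "income": "<=50K"},
    {"age": None, "gender": "Female", "income": ">50K"},
    {"age": 28, "gender": "Female", "income": ">50K"},
]
clean, codes = encode_labels(drop_missing(raw))
print(clean)
print(codes)
```

Keeping the code mapping around, as the sketch does, makes it possible to translate encoded values back to the original categories when reading results.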

The DatasetLoader reads in the data, applies any transformations, splits the data into training, validation and test datasets and sets aside the test dataset to avoid data leakage in your analysis. The training and validation datasets are loaded into the Dataset class.

dl = DatasetLoader(data=data, 
                   label='income', 
                   transforms=transforms,
                   bias_params=debias_param,
                   train_valid_test_splits=[0.8, 0.1, 0.1],
                   names_col=data.columns.values)
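
The train_valid_test_splits argument gives the proportions of the three subsets. Below is a minimal sketch of such a split, assuming a simple shuffle-then-slice strategy. Etiq's actual splitting logic is not described here and may differ; split_rows is a hypothetical helper.

```python
import random


def split_rows(rows, splits=(0.8, 0.1, 0.1), seed=0):
    """Shuffle rows and slice them into train/validation/test lists."""
    if abs(sum(splits) - 1.0) > 1e-9:
        raise ValueError("splits must sum to 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * splits[0])
    n_valid = int(len(shuffled) * splits[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # held out, not used during analysis
    return train, valid, test


train, valid, test = split_rows(range(1000))
print(len(train), len(valid), len(test))
```

Fixing the seed keeps the split reproducible, so the held-out test rows stay the same between runs.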

Choose the metrics you want computed for this project.

metrics_initial = [accuracy, equal_opportunity]

Each of these metrics measures how well our model performs when classifying the data. For more details, please see Definitions.
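
For intuition, both metrics can be computed per group by hand. The sketch below assumes that equal_opportunity is reported as the true positive rate within each group, which is the usual definition in the fairness literature; see Definitions for how etiq defines it. The data and function names are illustrative and are not part of etiq.

```python
def accuracy_by_group(y_true, y_pred, groups):
    """Fraction of correct predictions within each group."""
    result = {}
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        correct = sum(y_true[i] == y_pred[i] for i in idx)
        result[g] = correct / len(idx)
    return result


def true_positive_rate_by_group(y_true, y_pred, groups, positive=1):
    """Among rows whose true label is positive, the fraction predicted positive."""
    result = {}
    for g in set(groups):
        pos = [i for i, grp in enumerate(groups) if grp == g and y_true[i] == positive]
        hits = sum(y_pred[i] == positive for i in pos)
        result[g] = hits / len(pos) if pos else float("nan")
    return result


groups = ["privileged"] * 4 + ["unprivileged"] * 4
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]

acc = accuracy_by_group(y_true, y_pred, groups)
tpr = true_positive_rate_by_group(y_true, y_pred, groups)
print(acc)
print(tpr)
```

In this toy data both groups have the same accuracy, yet the true positive rate differs between them, which is why a fairness metric is tracked alongside accuracy.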

Load the model you'd like to evaluate with the dataset, or choose one of the classifiers that are already available. For this test release these are DefaultXGBoostClassifier (a wrapper around the XGBoost classifier), DefaultRandomForestClassifier (a wrapper around the random forest classifier from sklearn) and DefaultLogisticRegression (a wrapper around the logistic regression classifier from sklearn).

clf_model = DefaultXGBoostClassifier()

You can use a pre-trained model and are not restricted to the model classes we have wrappers for. We just provided some widely-used model classes for ease of use.

Etiq supports models from XGBoost, LightGBM, PyTorch, TensorFlow, Keras and scikit-learn. Models from these libraries may be used by wrapping them in the EtiqModel class. We could, for example, create an LGBMClassifier model, train it and use the trained model.

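import lightgbm as lgb

# X_train and y_train are your own training features and labels
lgb_model = lgb.LGBMClassifier()
fitted_lgb = lgb_model.fit(X_train, y_train)
# wrap the model architecture and the fitted model so that etiq can use them
clf_model = Model(model_architecture=lgb_model, model_fitted=fitted_lgb)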

Now you can create the DataPipeline. The DatasetLoader class will take the data, transform it, split it into training/validation/testing data and load it in. The DataPipeline computes your metrics of interest on the Dataset, using the model you provided.

pipeline_initial = DataPipeline(dataset_loader=dl, model=clf_model, metrics=metrics_initial)
pipeline_initial.run()

Your dataset can have as many features as you want, but in this limited release of the library the DataPipeline will only pick up the first 15 features.

DebiasPipeline

DebiasPipeline takes as inputs a data pipeline, an identify and/or repair method, and the metrics you want to use to evaluate your model. Identify methods, as the name suggests, are intended to help you identify bias issues. Repair methods are designed to help fix or mitigate the issues identified and include implementations of algorithms from the fairness literature.

The repair pipeline we currently provide works at the pre-processing level, i.e. it changes the dataset with the aim of mitigating some of the sources of bias in it. Methods at the in-processing or post-processing stages can be more effective from an optimization point of view, but they might not address some of the issues in the data, which is why pre-processing is a good place to start. Our full solution includes additional pipelines.

An example debiasing pipeline is given below:

identify_pipeline = IdentifyBiasSources(nr_groups=20, # nr of segments based on using unsupervised learning to group similar rows
                                        train_model_segment=True,
                                        group_def=['unsupervised'],
                                        fit_metrics=[accuracy, equal_opportunity])
    
# the DebiasPipeline aims to mitigate sources of bias by applying different types of repair algorithms
# the library offers implementations of repair algorithms described in the academic fairness literature

repair_pipeline = RepairResamplePipeline(steps=[ResampleUnbiasedSegmentsStep(ratio_resample=1)], random_seed=4)

debias_pipeline = DebiasPipeline(data_pipeline=pipeline_initial, 
                                 model=clf_model,
                                 metrics=metrics_initial,
                                 identify_pipeline=identify_pipeline,
                                 repair_pipeline=repair_pipeline)
debias_pipeline.run()

IdentifyBiasSources is the type of identify pipeline used here; it is the one provided in this test release. Similarly, RepairResamplePipeline denotes the type of repair pipeline.

By convention, pipeline types are named <TypeOfPipeline>Pipeline.

The parameters for the identify pipeline available in this release are as follows:

  • group_def (group definition) = unsupervised. This is a type of pipeline method that looks for groups (i.e. segments of the dataset) with issues that could cause bias. The test version includes only this option; the full package has multiple options.

  • nr_groups - The number of groups/segments you think your dataset could be split into. Experiment with a few different values depending on how large your dataset is.

Remember this is just one of the pipelines we provide and arguably not our most interesting one. If you want to explore using our other pipelines get in touch with us: info@etiq.ai

As with the data pipeline, when running the pipeline, we get the logs of how the pipeline has run:

INFO:etiq_core.pipeline.DebiasPipeline36:Starting pipeline
INFO:etiq_core.pipeline.DebiasPipeline36:Start Phase IdentifyPipeline844
INFO:etiq_core.pipeline.IdentifyPipeline844:Starting pipeline
INFO:etiq_core.pipeline.IdentifyPipeline844:Completed pipeline
INFO:etiq_core.pipeline.DebiasPipeline36:Completed Phase IdentifyPipeline844
INFO:etiq_core.pipeline.DebiasPipeline36:Start Phase RepairPipeline558
INFO:etiq_core.pipeline.RepairPipeline558:Starting pipeline
INFO:etiq_core.pipeline.RepairPipeline558:Completed pipeline
INFO:etiq_core.pipeline.DebiasPipeline36:Completed Phase RepairPipeline558
INFO:etiq_core.pipeline.DebiasPipeline36:Refitting model
INFO:etiq_core.pipeline.DebiasPipeline36:Computed metrics for the repaired dataset
INFO:etiq_core.pipeline.DebiasPipeline36:Completed pipeline
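
The log lines above have the "LEVEL:logger name:message" shape that Python's standard logging module produces with its default configuration. Assuming etiq emits them through that module (an assumption, not something this page states), a sketch of how to show or quieten them could look like this:

```python
import logging

# Show INFO-level messages using logging's default "LEVEL:name:message" format.
logging.basicConfig(level=logging.INFO)

# Assumption: pipeline loggers live under the "etiq_core.pipeline" namespace,
# as the logger names in the output above suggest. Raising the level on the
# parent logger hides INFO messages from every pipeline logger below it.
logging.getLogger("etiq_core.pipeline").setLevel(logging.WARNING)
```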

In the fairness literature, mitigation is the preferred term, as these types of issues are hard to remove entirely. Our use of the terms repair and debias refers primarily to mitigation rather than removal.

Output methods

Once you've checked the logs and confirmed that the etiq pipeline ran, use the following methods to retrieve the outputs:

Metrics

debias_pipeline.get_protected_metrics()

Example output:

{'DataPipeline502':
 [{'accuracy': ('privileged', 0.84, 'unprivileged', 0.93)},
  {'equal_opportunity': ('privileged', 0.6901408450704225, 'unprivileged', 0.55)}],
 'DebiasPipeline426':
 [{'accuracy': ('privileged', 0.82, 'unprivileged', 0.91)},
  {'equal_opportunity': ('privileged', 0.6539235412474849, 'unprivileged', 0.65)}]}
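
One way to read this output is to compare the gap between the privileged and unprivileged groups before and after repair. The sketch below works on the example output copied from above; group_gaps is a hypothetical helper, not an etiq method, and it assumes the flat (group, value, group, value) tuple layout shown in the example.

```python
# Example output copied from above.
protected_metrics = {
    "DataPipeline502": [
        {"accuracy": ("privileged", 0.84, "unprivileged", 0.93)},
        {"equal_opportunity": ("privileged", 0.6901408450704225, "unprivileged", 0.55)},
    ],
    "DebiasPipeline426": [
        {"accuracy": ("privileged", 0.82, "unprivileged", 0.91)},
        {"equal_opportunity": ("privileged", 0.6539235412474849, "unprivileged", 0.65)},
    ],
}


def group_gaps(metrics_by_pipeline):
    """Return {pipeline: {metric: privileged value - unprivileged value}}."""
    gaps = {}
    for pipeline, entries in metrics_by_pipeline.items():
        gaps[pipeline] = {}
        for entry in entries:
            for metric, values in entry.items():
                # values is a flat tuple: (group name, value, group name, value)
                by_group = dict(zip(values[0::2], values[1::2]))
                gaps[pipeline][metric] = by_group["privileged"] - by_group["unprivileged"]
    return gaps


gaps = group_gaps(protected_metrics)
for pipeline, per_metric in gaps.items():
    print(pipeline, {m: round(g, 3) for m, g in per_metric.items()})
```

On this example output, the equal_opportunity gap shrinks from about 0.14 in the initial data pipeline to about 0.004 after the debias pipeline, while the accuracy gap stays at about -0.09.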

Issues found by the pipeline

Our library is intended to help you test your models and see whether there are any issues. The pipeline surfaces potential issues, and it is then up to you to decide whether they are real issues for your specific model. For more details on definitions, please see the Definitions tab.

debias_pipeline.get_issues_summary()


To help make sense of the segments, we also have a profiler method which gives you an idea about the rows found to have specific issues.

debias_pipeline.get_profiler()

To understand more about the types of errors this pipeline finds, please use the following method; it will give you definitions and thresholds used. Also, please see a discussion of different bias sources at this link.

debias_pipeline.get_thresholds()
