# Bias Mitigation in Credit Scoring by Reweighting

Bias mitigation is the process of removing bias from a data set or a model in order to make it fair. Bias mitigation usually follows a bias detection step, where a series of metrics are computed based on a data set or model predictions. Bias mitigation methods fall into three categories, depending on where they act in the workflow: pre-processing (on the data), in-processing (during model training), and post-processing (on the model predictions). This example demonstrates a pre-processing method to mitigate bias in a credit scoring workflow. The example uses bias detection and bias mitigation functionality from the Statistics and Machine Learning Toolbox™. For a detailed example on bias detection, see Explore Fairness Metrics for Credit Scoring Model.

The bias mitigation method in this example is reweighting, which assigns a weight to each observation in a data set so that, in the weighted data, the response is balanced across the subgroups of a sensitive attribute. After reweighting, the Statistical Parity Difference (SPD) of every subgroup in the training data is 0 and the Disparate Impact (DI) is 1. This example demonstrates how reweighting works in a credit scoring workflow.
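For reference, the two metrics and the reweighting weights can be written out as follows. This is a sketch of the standard definitions, not a statement of the toolbox internals: the notation (sensitive attribute $G$ with subgroup $g$ and reference group $r$, response $Y$ with positive class 1, counts $n$) is chosen here for illustration, and the fairnessMetrics and fairnessWeights documentation describes the exact conventions the toolbox uses.

```latex
\mathrm{SPD}_g = P(Y = 1 \mid G = g) - P(Y = 1 \mid G = r),
\qquad
\mathrm{DI}_g = \frac{P(Y = 1 \mid G = g)}{P(Y = 1 \mid G = r)}

w_{g,y} = \frac{P(G = g)\,P(Y = y)}{P(G = g,\; Y = y)}
        = \frac{n_g \, n_y}{n \, n_{g,y}}
```

With these weights, the weighted count of observations in subgroup $g$ with response $y$ is $n_{g,y}\,w_{g,y} = n_g n_y / n$ and the weighted size of subgroup $g$ is $n_g$, so the weighted rate $P_w(Y = y \mid G = g) = n_y / n$ is the same for every subgroup. That is why SPD becomes 0 and DI becomes 1 in the reweighted data.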

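Load the CreditCardData data set, discretize the 'CustAge' predictor into a categorical AgeGroup variable, and add AgeGroup to the data table after CustAge. The first rows of the resulting table follow.

load CreditCardData.mat
AgeGroup = discretize(data.CustAge,[min(data.CustAge) 30 45 60 max(data.CustAge)], ...
'categorical',{'Age < 30','30 <= Age < 45','45 <= Age < 60','Age >= 60'});
data = addvars(data,AgeGroup,'After','CustAge');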

CustID    CustAge       AgeGroup       TmAtAddress    ResStatus     EmpStatus    CustIncome    TmWBank    OtherCC    AMBalance    UtilRate    status
______    _______    ______________    ___________    __________    _________    __________    _______    _______    _________    ________    ______

1         53       45 <= Age < 60        62         Tenant        Unknown        50000         55         Yes       1055.9        0.22        0
2         61       Age >= 60             22         Home Owner    Employed       52000         25         Yes       1161.6        0.24        0
3         47       45 <= Age < 60        30         Tenant        Employed       37000         61         No        877.23        0.29        0
4         50       45 <= Age < 60        75         Home Owner    Employed       53000         20         Yes       157.37        0.08        0
5         68       Age >= 60             56         Home Owner    Employed       53000         14         Yes       561.84        0.11        0
6         65       Age >= 60             13         Home Owner    Employed       48000         59         Yes       968.18        0.15        0
7         34       30 <= Age < 45        32         Home Owner    Unknown        32000         26         Yes       717.82        0.02        1
8         50       45 <= Age < 60        57         Other         Employed       51000         33         No        3041.2        0.13        0

Split the data set into training data (70%) and testing data (30%). Use the training data to fit the model and the testing data to evaluate the model predictions.

rng('default');
c = cvpartition(size(data,1),'HoldOut',0.3);
data_Train = data(c.training(),:);
data_Test = data(c.test(),:);

### Compute Fairness Metrics at Predictor and Model Level

Compute the fairness metrics for the training data by creating a fairnessMetrics object and then generating a metrics report using report. Because you are working only with data and there is no fitted model, only two bias metrics are computed: StatisticalParityDifference and DisparateImpact. Two group metrics, GroupCount and GroupSizeRatio, are also computed. The fairness metrics are computed for two sensitive attributes, age ('AgeGroup') and residential status ('ResStatus').

trainingDataMetrics = fairnessMetrics(data_Train, 'status', 'SensitiveAttributeNames',{'AgeGroup', 'ResStatus'});
tdmReport = report(trainingDataMetrics)
tdmReport=7×4 table
SensitiveAttributeNames        Groups        StatisticalParityDifference    DisparateImpact
_______________________    ______________    ___________________________    _______________

AgeGroup            Age < 30                   0.039827                   1.1357
AgeGroup            30 <= Age < 45             0.096324                   1.3282
AgeGroup            45 <= Age < 60                    0                        1
AgeGroup            Age >= 60                  -0.19181                  0.34648
ResStatus           Home Owner                        0                        1
ResStatus           Tenant                      0.01689                   1.0529
ResStatus           Other                      -0.02108                  0.93404

figure
tiledlayout(2,1)
nexttile
plot(trainingDataMetrics,'spd')
nexttile
plot(trainingDataMetrics,'di')

Comparing the DisparateImpact metric for AgeGroup and ResStatus, you can see that the values for AgeGroup are spread much farther from 1 (from 0.34648 to 1.3282) than the values for ResStatus (from 0.93404 to 1.0529). This suggests that, in this data set, customers are treated more unequally with respect to their age than with respect to their residential status. This example focuses on the AgeGroup predictor and attempts to reduce bias among its subgroups.
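To make the two data-level metrics concrete, the following minimal sketch computes them by hand on a small made-up data set. The variables group, label, and refGroup are hypothetical and are not part of the example data. The sketch assumes that each metric compares a subgroup with a reference group (taken here to be the largest subgroup), which is consistent with the rows in the report that have a difference of 0 and a ratio of 1.

```matlab
% Hypothetical toy data: a sensitive attribute with two subgroups and a
% binary label (1 = positive class, for example "default").
group = categorical({'Young';'Young';'Young';'Young';'Old';'Old';'Old';'Old';'Old'});
label = [1;1;0;0;1;0;0;0;0];

refGroup = 'Old';                        % reference group (assumed choice: largest subgroup)
[idx,names] = findgroups(group);         % subgroup index for each observation
posRate = splitapply(@mean,label,idx);   % P(label = 1 | subgroup)
refRate = posRate(names == refGroup);    % positive rate of the reference group

spd = posRate - refRate;                 % statistical parity difference
di  = posRate ./ refRate;                % disparate impact
disp(table(names,posRate,spd,di))
```

In this toy data, the positive rate is 0.2 for Old and 0.5 for Young, so the Young subgroup has an SPD of 0.3 and a DI of 2.5 relative to the reference group.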

To begin, fit a credit scoring model and compute the model-level bias metrics. This provides a baseline for comparison.

Because CustAge and AgeGroup carry essentially the same information and represent the sensitive attribute, exclude both from the model. Additionally, use 'status' as the response variable and 'CustID' as the ID variable.

PredictorVars = setdiff(data_Train.Properties.VariableNames, ...
{'CustAge','AgeGroup','CustID','FairWeights','status'});
sc1 = creditscorecard(data_Train,'IDVar','CustID', ...
'PredictorVars',PredictorVars,'ResponseVar','status');
sc1 = autobinning(sc1);
sc1 = fitmodel(sc1,'VariableSelection','fullmodel');
Generalized linear regression model:
logit(status) ~ 1 + TmAtAddress + ResStatus + EmpStatus + CustIncome + TmWBank + OtherCC + AMBalance + UtilRate
Distribution = Binomial

Estimated Coefficients:
Estimate       SE        tStat        pValue
________    ________    ________    __________

(Intercept)     0.73924    0.077237      9.5711     1.058e-21
ResStatus         1.755       1.295      1.3552       0.17535
EmpStatus       0.88652     0.32232      2.7504     0.0059516
CustIncome      0.95991     0.19645      4.8862    1.0281e-06
TmWBank           1.132      0.3157      3.5856    0.00033637
OtherCC         0.85227      2.1198     0.40204       0.68765
AMBalance        1.0773     0.31969      3.3698    0.00075232
UtilRate       -0.19784     0.59565    -0.33214       0.73978

840 observations, 831 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 66.5, p-value = 2.44e-11
pointsinfo1 = displaypoints(sc1)
pointsinfo1=38×3 table
Predictors              Bin            Points
_______________    _________________    _________

{'ResStatus'  }    {'Tenant'       }    -0.017688
{'ResStatus'  }    {'Home Owner'   }      0.11681
{'ResStatus'  }    {'Other'        }      0.29011
{'ResStatus'  }    {'<missing>'    }          NaN
{'EmpStatus'  }    {'Unknown'      }    -0.097582
{'EmpStatus'  }    {'Employed'     }      0.33162
{'EmpStatus'  }    {'<missing>'    }          NaN
{'CustIncome' }    {'[-Inf,30000)' }     -0.61962
{'CustIncome' }    {'[30000,36000)'}     -0.10695
{'CustIncome' }    {'[36000,40000)'}    0.0010845
{'CustIncome' }    {'[40000,42000)'}     0.065532
⋮

pd1 = probdefault(sc1,data_Test);

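Set the threshold value that controls the allocation of "goods" and "bads." Customers whose predicted probability of default exceeds the threshold are classified as "bads" (prediction of 1).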

threshold = 0.35;
predictions1 = double(pd1>threshold);
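
As a small illustration of the thresholding step, the following sketch applies the same rule to made-up probabilities. The variables pdToy, thresholdToy, and predToy are hypothetical. Because the comparison is strict, a probability exactly equal to the threshold is not flagged.

```matlab
% Hypothetical default probabilities for five applicants
pdToy = [0.10; 0.35; 0.36; 0.80; 0.20];
thresholdToy = 0.35;

% 1 = predicted "bad" (probability above the threshold), 0 = predicted "good"
predToy = double(pdToy > thresholdToy);

% Fraction of applicants flagged as "bad"
flaggedShare = mean(predToy);
```

Raising the threshold flags fewer applicants and lowering it flags more, so the model-level fairness metrics that follow depend on this choice as well as on the fitted model.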

Create a fairnessMetrics object to compute fairness metrics at the model level and then generate a metrics report using report.

modelMetrics1 = fairnessMetrics(data_Test, 'status', 'Predictions', predictions1, 'SensitiveAttributeNames','AgeGroup');
mmReport1 = report(modelMetrics1)
mmReport1=4×7 table
ModelNames    SensitiveAttributeNames        Groups        StatisticalParityDifference    DisparateImpact    EqualOpportunityDifference    AverageAbsoluteOddsDifference
__________    _______________________    ______________    ___________________________    _______________    __________________________    _____________________________

Model1             AgeGroup            Age < 30                    0.54312                  2.6945                   0.47391                         0.5362
Model1             AgeGroup            30 <= Age < 45              0.19922                  1.6216                   0.35645                        0.22138
Model1             AgeGroup            45 <= Age < 60                    0                       1                         0                              0
Model1             AgeGroup            Age >= 60                  -0.15385                    0.52                  -0.18323                        0.16375

Measure the accuracy of the model using validatemodel.

validatemodel(sc1)
ans=4×2 table
Measure              Value
________________________    _______

{'Accuracy Ratio'      }    0.33751
{'Area under ROC curve'}    0.66876
{'KS statistic'        }    0.26418
{'KS score'            }     1.0403

figure
tiledlayout(2,1)
nexttile
plot(modelMetrics1,'spd')
nexttile
plot(modelMetrics1,'di')

### Reweight Data at Predictor and Model Level

Use fairnessWeights to reweight the training data to remove bias for the sensitive attribute 'AgeGroup'.

fairWeights = fairnessWeights(data_Train, 'AgeGroup', 'status');
data_Train.FairWeights = fairWeights;
CustID    CustAge       AgeGroup       TmAtAddress    ResStatus     EmpStatus    CustIncome    TmWBank    OtherCC    AMBalance    UtilRate    status    FairWeights
______    _______    ______________    ___________    __________    _________    __________    _______    _______    _________    ________    ______    ___________

1        53       45 <= Age < 60        62         Tenant        Unknown        50000         55         Yes       1055.9        0.22        0         0.95879
2        61       Age >= 60             22         Home Owner    Employed       52000         25         Yes       1161.6        0.24        0         0.75407
3        47       45 <= Age < 60        30         Tenant        Employed       37000         61         No        877.23        0.29        0         0.95879
4        50       45 <= Age < 60        75         Home Owner    Employed       53000         20         Yes       157.37        0.08        0         0.95879
7        34       30 <= Age < 45        32         Home Owner    Unknown        32000         26         Yes       717.82        0.02        1         0.82759
8        50       45 <= Age < 60        57         Other         Employed       51000         33         No        3041.2        0.13        0         0.95879
9        50       45 <= Age < 60        10         Tenant        Unknown        52000         25         Yes       115.56        0.02        1          1.0992
10        49       45 <= Age < 60        30         Home Owner    Unknown        53000         23         Yes        718.5        0.17        1          1.0992

Use fairnessMetrics to compute fairness metrics for the training data after reweighting, and use report to generate a fairness metrics report.

trainingDataMetrics_AfterReweighting = fairnessMetrics(data_Train, 'status', 'SensitiveAttributeNames','AgeGroup','Weights','FairWeights');
tdmrReport = report(trainingDataMetrics_AfterReweighting)
tdmrReport=4×4 table
SensitiveAttributeNames        Groups        StatisticalParityDifference    DisparateImpact
_______________________    ______________    ___________________________    _______________

AgeGroup            Age < 30                  -2.9976e-15                   1
AgeGroup            30 <= Age < 45            -5.5511e-16                   1
AgeGroup            45 <= Age < 60                      0                   1
AgeGroup            Age >= 60                 -2.9421e-15                   1

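Applying the reweighting algorithm to the AgeGroup predictor removes the disparate impact for AgeGroup in the training data: every DisparateImpact value is 1, and the StatisticalParityDifference values are zero up to floating-point round-off (on the order of 1e-15). Next, use the reweighted data to fit a model whose predictions have an overall reduced disparate impact at the model level.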
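
The effect of the weights can be reproduced by hand on a small made-up data set. The following is a minimal sketch, assuming the common reweighting formula in which the weight for subgroup g and class y is n_g*n_y/(n*n_gy); it is not the implementation of fairnessWeights, and the variables g, y, W, and w are hypothetical.

```matlab
% Hypothetical toy data: subgroup membership g and binary label y
g = categorical({'A';'A';'A';'A';'B';'B';'B';'B'});
y = [1;1;1;0;1;0;0;0];

n  = numel(y);
gi = findgroups(g);                        % subgroup index (1 = A, 2 = B)
yi = y + 1;                                % class index (1 for y = 0, 2 for y = 1)

nGY = accumarray([gi yi],1,[max(gi) 2]);   % joint counts n_gy
nG  = sum(nGY,2);                          % subgroup counts n_g (column)
nY  = sum(nGY,1);                          % class counts n_y (row)

W = (nG*nY) ./ (n*nGY);                    % weight for each (subgroup, class) cell
w = W(sub2ind(size(W),gi,yi));             % weight for each observation

% Weighted positive rate per subgroup, which is equal across subgroups
wPosRate = accumarray(gi,w.*y) ./ accumarray(gi,w);
disp(wPosRate')
```

Before reweighting, the positive rate is 0.75 for subgroup A and 0.25 for subgroup B. After reweighting, both weighted rates equal the overall positive rate of 0.5, so the weighted SPD is 0 and the weighted DI is 1.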

Use creditscorecard to fit a new credit scoring model with the new fair weights and compute model-level bias metrics.

sc2 = creditscorecard(data_Train,'IDVar','CustID', ...
'PredictorVars',PredictorVars,'WeightsVar','FairWeights','ResponseVar','status');
sc2 = autobinning(sc2);
sc2 = fitmodel(sc2,'VariableSelection','fullmodel');
Generalized linear regression model:
logit(status) ~ 1 + TmAtAddress + ResStatus + EmpStatus + CustIncome + TmWBank + OtherCC + AMBalance + UtilRate
Distribution = Binomial

Estimated Coefficients:
Estimate       SE        tStat        pValue
________    ________    ________    __________

(Intercept)     0.74055    0.076222      9.7158    2.5817e-22
ResStatus        2.0467      1.7669      1.1584       0.24672
EmpStatus       0.91879     0.32197      2.8536     0.0043222
CustIncome      0.91038     0.33216      2.7407       0.00613
TmWBank          1.1067     0.30826      3.5901     0.0003305
OtherCC         0.42264      3.5078     0.12049        0.9041
AMBalance        1.1347      0.3447      3.2919    0.00099504
UtilRate       -0.39861     0.77284    -0.51577       0.60601

840 observations, 831 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 46.6, p-value = 1.85e-07
pointsinfo2 = displaypoints(sc2)
pointsinfo2=34×3 table
Predictors              Bin            Points
_______________    _________________    ________

{'ResStatus'  }    {'Tenant'       }    0.016048
{'ResStatus'  }    {'Home Owner'   }    0.091092
{'ResStatus'  }    {'Other'        }     0.28326
{'ResStatus'  }    {'<missing>'    }         NaN
{'EmpStatus'  }    {'Unknown'      }    -0.10352
{'EmpStatus'  }    {'Employed'     }     0.33653
{'EmpStatus'  }    {'<missing>'    }         NaN
{'CustIncome' }    {'[-Inf,30000)' }    -0.37618
{'CustIncome' }    {'[30000,40000)'}    0.047483
{'CustIncome' }    {'[40000,42000)'}     0.10244
{'CustIncome' }    {'[42000,47000)'}     0.14652
{'CustIncome' }    {'[47000,Inf]'  }     0.40015
⋮

pd2 = probdefault(sc2,data_Test);
predictions2 = double(pd2>threshold);

Use fairnessMetrics to compute fairness metrics at the model level, and use report to generate a fairness metrics report.

modelMetrics2 = fairnessMetrics(data_Test, 'status', 'Predictions', predictions2, 'SensitiveAttributeNames','AgeGroup');
mmReport2 = report(modelMetrics2)
mmReport2=4×7 table
ModelNames    SensitiveAttributeNames        Groups        StatisticalParityDifference    DisparateImpact    EqualOpportunityDifference    AverageAbsoluteOddsDifference
__________    _______________________    ______________    ___________________________    _______________    __________________________    _____________________________

Model1             AgeGroup            Age < 30                    0.39394                  2.1818                   0.37391                        0.39377
Model1             AgeGroup            30 <= Age < 45             0.094298                  1.2829                   0.22947                        0.11509
Model1             AgeGroup            45 <= Age < 60                    0                       1                         0                              0
Model1             AgeGroup            Age >= 60                  -0.13333                     0.6                  -0.18323                         0.1511

Measure the accuracy of the model using validatemodel.

validatemodel(sc2)
ans=4×2 table
Measure              Value
________________________    _______

{'Accuracy Ratio'      }    0.27735
{'Area under ROC curve'}    0.63868
{'KS statistic'        }    0.22702
{'KS score'            }    0.90741

figure
tiledlayout(2,1)
nexttile
plot(modelMetrics2,'spd')
nexttile
plot(modelMetrics2,'di')

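Reweighting removed the statistical parity difference and disparate impact for AgeGroup from the training data. The model fit to the reweighted data shows less bias on the test data than the model fit to the original data. For example, the statistical parity difference for the Age < 30 group drops from 0.54312 to 0.39394. The bias is reduced rather than eliminated, and the reduction comes with a drop in model accuracy: the area under the ROC curve falls from 0.66876 to 0.63868. You must decide whether this tradeoff between fairness and accuracy is acceptable.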
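
One way to summarize the comparison is to tabulate the change in each metric. The following sketch uses values copied from the mmReport1, mmReport2, and validatemodel outputs shown earlier; the variable names are hypothetical.

```matlab
% Statistical parity difference per age group, copied from mmReport1 and mmReport2
groups    = {'Age < 30';'30 <= Age < 45';'45 <= Age < 60';'Age >= 60'};
spdBefore = [0.54312; 0.19922; 0; -0.15385];
spdAfter  = [0.39394; 0.094298; 0; -0.13333];

% Reduction in the magnitude of the statistical parity difference
spdReduction = abs(spdBefore) - abs(spdAfter);
disp(table(groups,spdBefore,spdAfter,spdReduction))

% Area under the ROC curve, copied from the two validatemodel outputs
aucBefore = 0.66876;
aucAfter  = 0.63868;
aucDrop   = aucBefore - aucAfter;   % cost in discriminatory power
```

The magnitude of the statistical parity difference decreases for every non-reference group, while the area under the ROC curve decreases by about 0.03.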
