Gender-Differentiated Digital Credit Algorithms Using Machine Learning
Lead Researcher: Sean Higgins, Post-Doctoral Fellow (University of California, Berkeley)
Study Timeline: April 1 - November 1, 2017
Development Challenge
Among men and women with comparable creditworthiness, women face bias in the loan amounts lenders are willing to provide [1], higher interest rates [2], and legal frameworks that can make it more difficult for them to access credit [3]. These factors restrict access to formal credit for some low-income women and prevent them from building credit histories, which can limit their ability to invest in productive assets and smooth consumption in the face of shocks.
Digital credit algorithms provide an opportunity to overcome these constraints, while simultaneously minimizing default, over-indebtedness, leakage, fraud, and other risks to consumers. Traditional credit scoring models pool data from men and women and either omit gender entirely due to discrimination concerns [4], or include gender without fully capturing the ways in which it interacts with other characteristics. As a result, traditional models may fail to detect when women with specific characteristics or behaviors (including network interactions, phone and utility payment behavior, socioeconomic status, and intra-household bargaining power) are being overlooked for credit. It has not yet been studied whether these women would benefit from credit access to enhance their economic opportunities, or how data on these characteristics and behaviors can be used to select clients with minimal risk of default and over-indebtedness. This pilot study aims to understand whether gender-differentiated credit scoring models (using non-traditional data) can increase women’s access to formal credit.
Study Summary
During this pilot study, the research team collected a wide array of data that will be used to develop a machine learning credit scoring algorithm, with the goal of rigorously testing this model in a full-scale randomized controlled trial [5]. Partnering with La Nacional (a Dominican bank), researchers obtained and analyzed data on existing clients of a credit card product designed for low-income consumers who lack credit histories. These data cover 16,091 applicants who had applied to the bank’s low-income credit product since June 2015, and include the sociodemographic data used by La Nacional to determine creditworthiness. Additionally, the research team gained access to detailed bill payment histories for existing clients. The researchers are currently in conversation with Claro, the largest telecommunications company in the Dominican Republic, to obtain call detail records for applicants. The bill payment histories and call detail records will be used to develop the preliminary gender-differentiated credit scoring model. A forthcoming survey will collect a richer set of sociodemographic outcomes and psychometric measures for existing clients; these measures will serve as important predictors in the credit scoring algorithm.
Using the aforementioned data on applicant eligibility screening and loan outcomes, researchers retroactively simulated the effect of three credit scoring models: 1) the traditional model, with data for men and women pooled and an indicator variable for gender included, 2) a male-only model, and 3) a female-only model. To mimic the current practices of La Nacional, the simulated experiments used a set of sociodemographic characteristics including gender, marital status, age, job sector, occupation, work experience, geographic location, housing status, and income.
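The comparison between a pooled model and a gender-specific model described above can be sketched as a cross-tabulation of approval decisions. This is a minimal illustration in Python, not the research team's implementation: the scoring functions, the approval threshold, the group labels, and the toy applicant records are all hypothetical, and the study summary does not state which learning algorithm or cutoff was used.

```python
from collections import Counter

# Hypothetical approval cutoff; the study does not report the threshold used.
APPROVAL_THRESHOLD = 0.5


def compare_models(applicants, pooled_score, specific_score,
                   threshold=APPROVAL_THRESHOLD):
    """Cross-tabulate approval decisions from two scoring functions.

    Each scoring function maps an applicant record to a score in [0, 1],
    where a higher score means lower predicted default risk. Returns the
    share of applicants that falls into each of four groups.
    """
    counts = Counter()
    for applicant in applicants:
        pooled_ok = pooled_score(applicant) >= threshold
        specific_ok = specific_score(applicant) >= threshold
        if pooled_ok and specific_ok:
            counts["approved by both"] += 1
        elif specific_ok:
            counts["gender-specific only"] += 1
        elif pooled_ok:
            counts["pooled only"] += 1
        else:
            counts["rejected by both"] += 1
    total = len(applicants)
    return {group: n / total for group, n in counts.items()}


# Toy records with made-up scores standing in for real model outputs.
toy_applicants = [
    {"pooled": 0.8, "female_only": 0.7},
    {"pooled": 0.4, "female_only": 0.6},
    {"pooled": 0.6, "female_only": 0.3},
    {"pooled": 0.2, "female_only": 0.1},
]

shares = compare_models(
    toy_applicants,
    pooled_score=lambda a: a["pooled"],
    specific_score=lambda a: a["female_only"],
)
print(shares)
```

In practice the two scoring functions would be fitted models (one trained on pooled data, one on female applicants only) applied to the same set of applicants, so that any difference in decisions comes from the model rather than the sample.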
Early Results & Takeaways
For a subsample of 7,552 female loan applicants, results from the simulated experiment show:
• 48% of these applicants would be approved under both the pooled and female-only credit scoring models (“always-accepted” group 1)
• 14% would be approved only under the female-only model (“female-only” group 2)
• 13% would be approved only under the pooled model (“pooled-only” group 3)
• 24% would be rejected by both models (“always-rejected” group 4).
Put differently, financial inclusion could be increased by extending credit to over one-third of the low-income women who currently lack credit access: these women are rejected by current credit scoring models, but would be considered creditworthy by a gender-differentiated credit scoring model.
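The "over one-third" figure can be reproduced from the four group shares above. The short calculation below is a sketch that assumes "rejected by current credit scoring models" corresponds to rejection by the pooled model (groups 2 and 4); it uses the rounded percentages as reported, which sum to 99, so the result is approximate.

```python
# Rounded shares reported above, in percent of the 7,552 female applicants.
shares = {
    "always_accepted": 48,   # group 1: approved by both models
    "female_only": 14,       # group 2: approved only by the female-only model
    "pooled_only": 13,       # group 3: approved only by the pooled model
    "always_rejected": 24,   # group 4: rejected by both models
}

# Applicants the pooled model rejects: groups 2 and 4 (assumed interpretation).
rejected_by_pooled = shares["female_only"] + shares["always_rejected"]

# Fraction of those rejections that the female-only model would approve.
newly_approved_fraction = shares["female_only"] / rejected_by_pooled

print(f"Rejected by pooled model: {rejected_by_pooled}%")
print(f"Of those, approved by female-only model: {newly_approved_fraction:.1%}")
```

Under this reading, 14 of the 38 percentage points rejected by the pooled model would be approved by the female-only model, or roughly 37%.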
Policy Relevance & Implications
If the model proves successful, La Nacional will use gender-differentiated credit scoring for multiple products aimed at providing credit access to low-income women. For example, the government of the Dominican Republic has expressed interest in working with La Nacional to provide mortgages for affordable, government-subsidized housing for low-income families, but the bank currently lacks a method to reliably assess the credit risk of these potential borrowers. Gender-differentiated credit scoring models using detailed bill payment histories and call detail records could provide such a method, which would in turn allow private and public sector stakeholders to collaborate in lowering barriers to financial inclusion.
If gender-differentiated credit scoring proves to be viable and benefits women, then this scoring model could be widely adopted by digital credit providers and scaled to millions of clients around the world.
[1] Agier and Szafarz, 2013
[2] Alesina et al., 2013
[3] Galan, 1998
[4] Mester, 1997
[5] This pilot study received full-scale study funding from the DCO in November 2017
Photo Credit: Sean Higgins