AI in financial services: Can you challenge decisions made by algorithms?

insight2impact has, for several years, argued that the use of data can unlock greater financial inclusion, but what happens when the use of data results in discrimination and potentially contributes to exclusion?

A team of us recently worked on a project with CGAP to develop “practical and feasible” outcomes indicators that would plausibly link financial service usage to positive customer outcomes.

While considering the desired outcome of fairness and respect, we explored the reality that decisions are no longer necessarily made by humans. Credit-scoring algorithms, for example, are now used to decide whether to approve or decline a loan application, or how to price the loan.

In the end, we came up with a somewhat ambitious indicator: “Algorithms are explainable (e.g. parameters for decision making are clear and ensure no discriminatory variables are used)”.

It’s worth noting that in this instance we use explainable to mean that the reasons for the decisions made by algorithms could be explained to an individual who questioned the outcome (implying that financial service providers had sufficient knowledge of the input variables used and the potential outcomes), rather than technical XAI (explainable AI), a relatively new field in data science.

But how can financial service providers ensure that the algorithms they deploy don’t result in negative outcomes for customers or potential customers?

We have curated a selection of articles that explore this issue and encourage you to click through to the hyperlinked texts to learn more. Please note that we don’t cover the broader theme of ethical AI here, although we do recommend this thought-provoking blog for those interested in the topic.

Where does bias creep in?

It’s too easy to forget that human decision-making is often flawed and that humans are prone to bias. In fact, behavioural science suggests that people often don’t even know how or why they make certain decisions. Data scientists argue that machine intelligence, when “done right”, is likely to be less biased than human intelligence. However, there are plenty of unfortunate examples of algorithmic decision-making resulting in undesirable outcomes and these should be prevented as far as possible.

Legal practitioners from White & Case LLP explain that, “In an algorithmic system, there are three main sources of bias that could lead to undesirable [outcomes]: input, training and programming. Input bias could occur when the source data itself is biased because it lacks certain types of information, is not representative or reflects historical biases. Training bias could appear in either the categorization of the baseline data or the assessment of whether the output matches the desired result. Programming bias could occur in the original design or when a smart algorithm is allowed to learn and modify itself through successive contacts with human users, the assimilation of existing data, or the introduction of new data.”

How do you recognise bias or test for it?

One option to test for bias or discrimination is to audit algorithms before they are deployed. Kartik Hosanagar, author of A Human’s Guide to Machine Intelligence: How Algorithms Are Shaping Our Lives and How We Can Stay in Control, suggests that the audit should be undertaken by someone other than the team that developed the algorithm. Algorithm developers are likely to focus on factors like prediction accuracy, whereas auditors could deliberately consider issues pertaining to privacy, bias and fairness.

In some cases, financial service providers might consider publishing a limited dataset and inviting external parties to test and review it.

Researchers are testing several technological solutions. The White & Case article referenced earlier mentions some of the solutions being tested – one of which, Quantitative Input Influence, shows promise. The algorithm is run repeatedly with altered input values to determine which variables have the greatest effect on the output, and the method accounts for potential correlation between variables. This could be useful in determining the weighting given to specific variables in a credit-scoring algorithm, for example.
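To make the idea of input influence more concrete, here is a minimal sketch in Python. It is not the published Quantitative Input Influence method, which is more elaborate; it only illustrates the underlying intervention idea of changing one input at a time and counting how often the decision flips. The scoring rule, the variable names and the four applicants are all invented for illustration.

```python
import random

def approve(applicant):
    """Hypothetical credit rule: a toy stand-in for a real scoring model."""
    score = 0.6 * applicant["income"] + 0.4 * applicant["repayment_history"]
    return score >= 50

def input_influence(model, applicants, feature, rounds=200, seed=0):
    """Share of decisions that flip when `feature` is replaced by a value
    drawn at random from the same dataset, with all other inputs held fixed."""
    rng = random.Random(seed)
    pool = [a[feature] for a in applicants]
    flips = 0
    total = 0
    for _ in range(rounds):
        for a in applicants:
            changed = dict(a)
            changed[feature] = rng.choice(pool)
            flips += model(a) != model(changed)
            total += 1
    return flips / total

# Made-up applicants; "postcode_group" is deliberately ignored by the toy rule.
applicants = [
    {"income": 20, "repayment_history": 90, "postcode_group": "A"},
    {"income": 80, "repayment_history": 40, "postcode_group": "B"},
    {"income": 55, "repayment_history": 60, "postcode_group": "A"},
    {"income": 30, "repayment_history": 30, "postcode_group": "B"},
]

features = ("income", "repayment_history", "postcode_group")
influence = {f: input_influence(approve, applicants, f) for f in features}

for name, value in influence.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the influence of postcode_group is zero because the rule never reads it. A non-zero value for a sensitive variable, or for a variable closely correlated with one, would be a prompt for further investigation rather than proof of discrimination.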

In order to test for bias, an organisation needs access to data variables that enable it to understand the people likely to be affected by these decisions. The Centre for Data Ethics and Innovation (a relatively new unit set up within the UK government) is conducting a review of bias in algorithmic decision-making. In its interim report, officials explain that “some organisations do not collect diversity information at all, due to nervousness of a perception that this data might be used in a biased way. This then limits the ability to properly assess whether a system is leading to biased outcomes”.
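The point about diversity data can be illustrated with a short sketch. Assuming an organisation does hold a group label for each applicant, one simple check (among many possible ones) is to compare approval rates across groups. The group names and decisions below are invented, and the ratio shown is only one of several fairness measures an auditor might use.

```python
from collections import defaultdict

def approval_rates(decisions):
    """Approval rate per group from (group, approved) pairs."""
    counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
    for group, approved in decisions:
        counts[group][0] += int(approved)
        counts[group][1] += 1
    return {group: a / n for group, (a, n) in counts.items()}

# Hypothetical decisions; without the group label this check is impossible.
decisions = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

rates = approval_rates(decisions)

# Ratio of the lowest to the highest approval rate; 1.0 means equal rates.
ratio = min(rates.values()) / max(rates.values())

print(rates)
print(f"approval-rate ratio: {ratio:.2f}")
```

A low ratio does not by itself show that the algorithm is unfair, since groups may differ in other relevant respects, but it cannot even be computed if the group information is never collected.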

How can financial service providers avoid the pitfall of bias in algorithmic decision-making?

Through rejection sampling and intuition, one financial executive suggests. In a Medium post, Andrew Watkins-Ball, founder of Jumo, explains how the company attempts to address bias in machine learning in order to deliver on its financial inclusion objectives.

“To test our impact, we target a representative sample of otherwise declined customers, often called rejection sampling, to quantify an upper threshold of what perfect financial inclusion would look like. Continuously interrogating the user data gives us access to a flow of unbiased information which we can use to minimize historical bias (that arises from historical decisioning) and avoid the unintended consequences of inaccurately excluding people. It’s important, however, that this process of measuring is regular and ongoing.”

He further suggests that, “To make real progress in advancing predictive methodologies, we need algorithms and intuition”.
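The rejection sampling approach quoted above can be sketched roughly as follows. This is an illustration under stated assumptions, not a description of Jumo's actual process: it assumes a simple random sample of declined applications, and the function names, the 10% sample share and the application identifiers are all hypothetical.

```python
import random

def select_for_test_lending(declined_ids, sample_share=0.1, seed=42):
    """Pick a random share of declined applications to approve anyway,
    so that their repayment behaviour can be observed."""
    if not declined_ids:
        return set()
    rng = random.Random(seed)
    k = max(1, round(len(declined_ids) * sample_share))
    return set(rng.sample(sorted(declined_ids), k))

def repayment_rate(outcomes):
    """Share of test loans repaid; `outcomes` maps application id -> bool."""
    return sum(outcomes.values()) / len(outcomes)

# Hypothetical identifiers for applications the model would have declined.
declined = [f"app-{i:03d}" for i in range(100)]
test_group = select_for_test_lending(declined, sample_share=0.1)

print(len(test_group), "declined applications selected for test lending")
```

Once the test loans mature, comparing repayment_rate for the test group with the rate among customers the model approved indicates how many creditworthy people the model is excluding. As the quotation stresses, the exercise is only useful if it is repeated regularly.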

A further possibility is explainable AI (XAI), an emerging field in machine learning that aims to explain how AI systems reach their decisions. Advocates suggest that simpler forms of machine learning can be used to develop algorithms that provide the required visibility.

In another Medium post, Prajwal Paudyal warns that “Skeptics [of XAI] point out (and correctly so) that most popular AI models with good performance have around 100 million parameters. This means there were 100 million numbers that were learned during training that contribute to a decision. With that complexity, how can we begin to think about which factors affect the explanations?”

It’s early days for XAI, and the science is still emerging. Interestingly, the UK government has tasked the Information Commissioner’s Office and the Alan Turing Institute with producing XAI guidelines for organisations. More on this initiative, called Project ExplAIn, can be found here.

How can regulators ensure that algorithmic decision-making results in good outcomes for individual consumers?

Olaf Groth, writing for Wired magazine, suggests that AI algorithms need FDA-style drug trials.

“The US Food and Drug Administration requires controlled testing on animals to establish safety, and then more testing on small populations to establish efficacy. Only then can a company offer a new drug to the masses. Software, by contrast, is typically subjected only to “unit tests,” to assure new lines of code perform as expected, and “integration tests,” to assure the updates don’t degrade the system’s performance. This is like checking a drug for contaminants without testing the effects of its active ingredients.”

The European Union’s General Data Protection Regulation (GDPR) already affords individuals the right not to be subject to solely automated decisions where that decision produces legal (or similarly significant) effects. The UK’s Information Commissioner’s Office (ICO) cites automatic refusal of an online credit application as one such example of a solely automated decision.

Despite our draft indicator introduced at the beginning of the article, it is probably unhelpful to insist that algorithms be completely interpretable. The ICO suggests that “some types of AI systems, for example those using deep learning, may be difficult for a human reviewer to interpret”.

The attention currently being devoted to the outcomes of decisions influenced by artificial intelligence is welcome, and it serves as a reminder to scrutinise the impact of human decisions on financial consumers, particularly marginalised consumers. The articles and papers hyperlinked in this piece explain more about the potential for bias to creep into algorithms and introduce several ways to either test for or neutralise bias. It’s clear that this is an evolving area of research and that financial service providers should, in the interim, do everything humanly possible to ensure that where assistance is sought from machines, every attempt is made to check for unintended consequences.

In an effort to flag potentially interesting resources for our readers, we occasionally share curated articles comprising links to mostly external content. Please note that sharing links to publications that are not authored by i2i does not constitute official endorsement of or agreement with the content contained in those links.