AI-based assessments of credit default risk should be treated with caution

Katja Langenbucher: The EU is still struggling to find up-to-date regulation. Prohibitions on discrimination alone are not enough

Artificial intelligence (AI) allows the prediction of future events that until now "only God knew about," as Mikio Okumura, president of the Japanese insurance group Sompo, was recently quoted as saying. The company had collected data on the lifestyle habits of nursing home residents and tested how they correlate with the onset of dementia symptoms. This not only advances medical prevention but also improves the risk-adjusted calculation of insurance premiums.

In the world of consumer credit, which fintechs have conquered, such predictions have long been in use. Unsurprisingly, data on account activity and on online shopping and payment behavior can reveal a potential borrower's cash flow history. However, age, gender or place of residence can also provide relevant information for assessing the risk of credit default. The same applies to educational or migration background, membership of a religious community or party affiliation. Platforms that assess credit default risk do not stop there. Relevant correlations may also be found in, for example, friends on social media, musical tastes or leisure behavior. The same goes for the brand of cell phone used, the number of typos in text messages, or the time taken to fill out an online form.

Both sides benefit

Lenders naturally benefit from this combination of machine learning and access to "big data". But borrowers, too, can be offered a contract that would not have been possible under a conventional assessment. Last year, Schufa announced plans of this kind: its account-data-based "Check Now" process was meant to make it easier for customers with poor creditworthiness to conclude mobile phone contracts. The very broad consent to data use that it contained irritated consumer protection organizations and immediately led to the termination of the procedure.

The European legislator has such developments in mind. Of particular relevance here is the draft EU regulation on artificial intelligence. However, the reform of the Consumer Credit Directive also contains important elements of future regulation.

"Particular attention," reads recital (37) of the proposed AI Regulation, should be paid to the use of AI when it comes to "access to basic private and public services." The EU legislator has identified "credit scoring" and the "assessment of the creditworthiness of natural persons" as particularly relevant, because they influence more than access to credit. The provision of essential "services such as housing, electricity, and telecommunications services" may also be affected by the results of algorithm-based processes. Because of their central importance for citizens' participation in the guarantees of modern civil society, such AI systems have been classified as so-called "high-risk systems".

The scope of the AI Regulation is limited to developers and users of the AI system. Although the situation of the end user lies outside this scope, it is the reason for the classification as a high-risk system. Among other things, this classification leads to special compliance requirements: in the manner of product regulation, quality assurance measures are formulated, certification procedures are introduced, and the responsibilities of supervisory authorities are allocated.

Risk of discrimination

The recital already cited notes that AI systems used to assess creditworthiness pose a risk of discrimination. Drawing on research in computer science, it speaks of "historical discrimination patterns" ("historical bias"), which perpetuate past discrimination "based on racial or ethnic origin, disability, age, or sexual orientation". The reason lies in the specifics of machine learning. An AI system "learns" from whatever empirical data its developer provides. It thus builds the profile of the successful borrower from people who have been able to service the credit extended to them in the past. Put the other way around: individuals who belong to a group that has had difficulty obtaining credit in the past will initially also be classified by the AI system as high-risk candidates.
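The learning effect described here can be made concrete with a deliberately crude sketch. It assumes a hypothetical scorer that rates applicants by how common their profile is among past successful borrowers; the district labels, the invented history and the 0.25 threshold are illustrative assumptions, not features of any real scoring system.

```python
from collections import Counter

# Invented history: the district of every borrower who was granted credit
# and repaid it. District "B" was rarely granted credit in the past, so it
# barely appears among the successful cases.
REPAID_LOANS = ["A"] * 18 + ["B"] * 2


def similarity_score(district, history=REPAID_LOANS):
    """Share of past successful borrowers who share the applicant's district."""
    return Counter(history)[district] / len(history)


def classify(district, threshold=0.25):
    """Hypothetical rule: a profile that is rare among past successes is high risk."""
    return "low risk" if similarity_score(district) >= threshold else "high risk"


if __name__ == "__main__":
    for district in ("A", "B"):
        print(district, similarity_score(district), classify(district))
```

In this sketch, an applicant from district "B" is rated high risk only because few people from that district were granted credit before, not because anything is known about their ability to repay.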

For certain members of this very group, the AI assessment may nevertheless have a positive impact ("invisible primes"). The more closely the non-traditional parameters of a potential borrower resemble those of the classically successful candidates, the higher the chance of obtaining credit despite an atypical profile. An immigrant or a young person illustrates this. Such a person may not (yet) have the credit history expected of a candidate assessed according to conventional patterns. However, other variables that the AI has identified as relevant may match a previously successful profile. She or he may use a cell phone from a high-priced manufacturer, have a prestigious college education, or fill out online forms particularly quickly. This is precisely the gap in the market that some fintech companies have filled, helping to expand the volume of loans to previously disadvantaged groups.

However, not everyone benefits equally from the inclusion achieved in this way. A whole series of empirical studies have shown for the U.S. that the expansion of credit benefits historically disadvantaged groups to a much lesser extent. This may have to do with the learning effect described above, if these groups cannot demonstrate any of the parameters that distinguish the traditionally successful group. Sometimes the AI also gives particularly strong weight to characteristics found in the majority of the successful group. Other characteristics that would actually be more meaningful for the minority may then not be weighted sufficiently ("majority bias"). An example: credit that has been successfully repaid in the past has a positive effect on creditworthiness. But whether the proper fulfilment of a "buy now, pay later" agreement or even of a private credit agreement has the same effect depends on whether these are classified as credit. If they are not, no positive points can be collected. This may disadvantage younger or inexperienced borrowers.

Right to human supervision

For the legal classification of discriminatory credit practices, the AI Regulation still essentially refers to the legal systems of the member states. In contrast, the Consumer Credit Directive reform addresses this risk for the first time. "Consumers who are legally resident in the Union", Art. 6 makes clear, "may not be discriminated against in the granting of credit on the basis of nationality, residence, or any characteristic listed in Art. 21 of the Charter of Fundamental Rights of the EU." The directive also provides that consumers have the right to request human intervention when credit scoring involves automated processing. A "meaningful explanation of the assessment and operation of the automated processing used" is also required (Art. 18).

However, prohibiting discriminatory credit practices initially creates more problems than it solves. Up to now, European law has recognized two forms of discrimination: direct and indirect. Anyone who, on the basis of a protected characteristic, treats a person less favorably in a comparable situation than a person without that characteristic discriminates directly. If a lender determines, for example, that women statistically pose a higher credit risk than men, it is nevertheless barred from training its AI model to add a risk premium to all women for simplicity's sake.

Question of comparability

Indirect discrimination is much more relevant in practical terms. It applies when an apparently neutral criterion, such as part-time employment, results in persons in the protected group, such as women, being treated less favorably than those without the protected characteristic. In order to determine this, the parameters of the decision are traced. If the omission of the criterion in question leads to the disappearance of the imbalance, indirect discrimination can be considered. Whether the circumstances are in fact comparable or whether there is an objective reason for the unequal treatment is only a subsequent question.
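The omission test described above can be sketched in a few lines of Python. This is a minimal illustration on invented toy data, not a legal test: the applicant records, the `approve` rule and the income threshold of 2000 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Applicant:
    gender: str      # protected characteristic, never read by the rule
    part_time: bool  # apparently neutral criterion
    income: int      # monthly net income in euros (invented)


# Invented toy data: part-time work is more common among the women.
APPLICANTS = [
    Applicant("f", True, 2600),
    Applicant("f", True, 2400),
    Applicant("f", False, 2500),
    Applicant("f", True, 1500),
    Applicant("m", False, 2600),
    Applicant("m", False, 2400),
    Applicant("m", True, 2500),
    Applicant("m", False, 1500),
]


def approve(applicant, use_part_time):
    """Hypothetical rule: income of at least 2000 and, if the criterion
    is switched on, no part-time employment."""
    if applicant.income < 2000:
        return False
    if use_part_time and applicant.part_time:
        return False
    return True


def approval_rate(gender, use_part_time):
    members = [a for a in APPLICANTS if a.gender == gender]
    return sum(approve(a, use_part_time) for a in members) / len(members)


def approval_gap(use_part_time):
    """Approval rate of men minus approval rate of women."""
    return approval_rate("m", use_part_time) - approval_rate("f", use_part_time)


if __name__ == "__main__":
    print("gap with criterion:   ", approval_gap(True))
    print("gap without criterion:", approval_gap(False))
```

On this toy data the gap between the groups vanishes once the part-time criterion is left out, which is the pattern that points to possible indirect discrimination. Whether the unequal treatment is objectively justified remains a separate, legal question.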

Indirect discrimination therefore presupposes that individual criteria can be isolated. Precisely for this reason, this legal concept cannot simply be applied to AI systems as a perfect fit. The larger the amount of data that an AI system processes, the more likely it is that the algorithm will find another seemingly neutral criterion that allows the same prediction to be made with comparable precision ("redundant encoding"). For example, gender can correlate not only with holding a part-time job, but also with height, first names, musical tastes or leisure behavior. A sophisticated AI system will find such parameters and establish correlations between whole bundles of variables.
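A small sketch may help to show what redundant encoding means in practice. It uses invented records and a hypothetical helper, `proxy_accuracy`, which guesses a person's gender by majority vote among records with the same values for the chosen features. Nothing here reflects a real scoring model; the point is only that the protected attribute can be recovered from features that look neutral.

```python
from collections import Counter, defaultdict

# Invented toy records: (gender, part_time, favourite_genre)
RECORDS = [
    ("f", True, "pop"),
    ("f", True, "pop"),
    ("f", True, "pop"),
    ("f", False, "rock"),
    ("m", True, "rock"),
    ("m", False, "pop"),
    ("m", False, "rock"),
    ("m", False, "rock"),
]


def proxy_accuracy(feature_indices):
    """Share of records whose gender is recovered by a majority vote
    among all records with identical values for the chosen features."""
    buckets = defaultdict(Counter)
    for record in RECORDS:
        key = tuple(record[i] for i in feature_indices)
        buckets[key][record[0]] += 1
    correct = sum(max(counts.values()) for counts in buckets.values())
    return correct / len(RECORDS)


if __name__ == "__main__":
    print("no features:     ", proxy_accuracy(()))
    print("part-time only:  ", proxy_accuracy((1,)))
    print("genre only:      ", proxy_accuracy((2,)))
    print("both as a bundle:", proxy_accuracy((1, 2)))
```

In this toy data, dropping the part-time flag changes little, because musical taste recovers gender equally well, and the two features together do better than either alone.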

If the developer of the system restricts access to certain variables, the likelihood that substitute variables ("proxies") will be found is high. One conceivable solution would be to conduct many rounds in which more and more variables are removed from the set of recorded data. The usual consequence would be that the quality of the model's predictions suffers. The inclusive potential of AI scoring, limited as it is, would diminish, and models that allow more precise predictions on a broader data basis would be penalized.

Whether prohibitions on discrimination are actually the best legal tool for dealing with algorithm-based lending is doubtful, and not only because they presuppose decision parameters that can be specified. It is also uncertain whether the current formulation of protected group characteristics meets the specific challenges of AI decision making. Imbalances may arise in the future among quite unexpected groups, such as people who update software frequently or infrequently, who are present on social media or stay away from it, who charge their cell phones carefully, or whose batteries are frequently depleted. Such groupings are covered by prohibitions on discrimination only by chance, as it were, namely if the group correlates with a protected characteristic.

Few efficient response options

For consumers, this creates a Kafkaesque situation: they do not know what data is relevant to their assessment, such as leaving bills unpaid, and consequently cannot respond by specifically changing their behavior, such as paying on time. The user of the AI system has no incentive to disclose relevant variables. On the one hand, these are usually protected trade secrets; on the other hand, a change in the borrower's behavior could reduce the significance of the variables in question. If, for example, the borrower learns that installing a dating app harms the credit score while using a trading app has a positive effect, he or she may delete the former and download the latter ("gaming the system"). If the change in behavior is limited to this, the assessment of him or her will improve without justification.

A regulatory framework for AI systems that may be used in scoring procedures and credit scoring will therefore probably have to go beyond prohibitions on discrimination. One of its central elements will be quality control, as outlined in the AI Regulation. This concerns the quality of the AI system itself, but also the reliability of the data used. If the data are collected on social media, for example, misunderstandings and errors occur frequently. The rights under the General Data Protection Regulation (GDPR) to demand access to data and, if necessary, rectification of inaccurate data require clearly understandable consumer education as well as efficient law enforcement procedures.

Plenty to discuss

The scientific and public debate on this issue is still in its infancy. It is doubtful, for example, whether the Consumer Credit Directive has found a convincing way forward when it stipulates that "personal data, such as data found on social media platforms or health data, including data on cancer, should not be used in credit scoring" (recital 47). To the extent that sophisticated AI systems can easily infer such information via the detour of surrogate variables, such as frequent Google searches for specific diseases or medications, consumers are hardly helped by this.

The debate about an up-to-date regulatory framework can also be illustrated by AI-based price discrimination. Here, the Consumer Credit Directive is again surprisingly open to personalized pricing (recital 40). This may result in efficiency gains for society as a whole in some situations. It must not be forgotten, however, that AI systems can find not only "invisible prime" candidates but also particularly vulnerable potential borrowers. Inexperienced borrowers with little financial education, or people in particularly urgent need of a loan, can appear to the morally agnostic AI system as an attractive market opportunity for a high-priced loan. It may be reassuring in this respect that the directive will in future provide for upper limits on interest rates, the APR and the total cost of the loan (Art. 31). At any rate, this should put a stop to the import of US "predatory lending".

Katja Langenbucher is a professor of civil law, commercial law and banking law in the House of Finance at Goethe University Frankfurt and coordinates the LawLab – Fintech & AI as a SAFE bridge professor.