But it worked in Harrogate!

24 July 2020

John Cant

Managing Director

MPI Europe Ltd

I have nothing against Harrogate, well, not consciously at least. However, in recent months there has been much discussion of unconscious bias in many walks of life - including mention in this article about data used for algorithms. So let's consider the role that the preparation of data for regulatory uses might play and, where issues exist, what I have been able to do to mitigate their impact in the major data programmes I have run. For example, what are the inbuilt assumptions of the input data models, and what happens if these precepts and conditions are stretched, strained, or even broken?

To look at just one example of many, take the common approach of removing words such as Limited from company name data before performing a fuzzy match. The premise is that these words add little to the uniqueness of names and, if left in, may make a character-by-character match seem far better than it is. An extreme example would be matching 'A Limited' and 'B Limited'. If the word Limited is left in, the algorithm will likely match the two names, as most of the characters in name A match their counterparts in name B. In contrast, a human observer will immediately note that they are probably totally different companies. So removing the word in this case is a sensible approach, and the Harrogate Limiteds will work okay. However, getting the full benefit of these techniques in other countries depends on having a relevant list of words to remove, and that list will differ between geographies. Applying the technique equally across a global programme needs explicit effort and analysis - for example, I never thought I would have to learn the Vietnamese word for conglomerate! A similar approach is required for selecting relevant sets of abbreviations to expand or remove.
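The effect can be illustrated with a minimal Python sketch. It is not the matching logic of any particular programme: the `NOISE_WORDS` table, the country codes and the function names are hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for whatever fuzzy matcher is actually used.

```python
import re
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical per-country lists of legal-form words. A real programme would
# build these from analysis of local naming conventions, not hard-code them.
NOISE_WORDS = {
    "GB": {"limited", "ltd", "plc"},
    "DE": {"gmbh", "ag"},
}


def strip_noise(name: str, country: str) -> str:
    """Lower-case the name and drop the noise words listed for the country."""
    tokens = re.findall(r"\w+", name.lower())
    stop = NOISE_WORDS.get(country, set())
    return " ".join(t for t in tokens if t not in stop)


def similarity(a: str, b: str, country: Optional[str] = None) -> float:
    """Character-level similarity in [0, 1]; strips noise words if a country is given."""
    if country is None:
        a, b = a.lower(), b.lower()
    else:
        a, b = strip_noise(a, country), strip_noise(b, country)
    return SequenceMatcher(None, a, b).ratio()


# Left in, the shared word dominates the score (about 0.89).
print(similarity("A Limited", "B Limited"))
# With the UK list applied, only 'a' and 'b' are compared (0.0).
print(similarity("A Limited", "B Limited", "GB"))
# The UK list does nothing for German names, so the false match returns.
print(similarity("A GmbH", "B GmbH", "GB"))
# The German list fixes it.
print(similarity("A GmbH", "B GmbH", "DE"))
```

The third call is the "it worked in Harrogate" case: a word list tuned for one country leaves the noise in place everywhere else.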

So issues of bias do exist, although much can be done to resolve them, so long as you have the experience to recognise them and plan their mitigation early in the process. Otherwise, an attempt simply to roll out globally a previously successful financial crime model developed in Europe will not only suffer from bias but will probably also fail to identify the intended targets because of the large volume of data "noise".

This content has been created by the Finextra editorial team with inputs from subject matter experts at the funding sponsor.


