Can Transformers accelerate the evolution of an Intelligent Bank? – Exploring recent research trends


Machine Learning (ML) has transformed the banking landscape over the last decade, empowering organizations to understand customers better, deliver personalized products and services, and transform the end-customer experience. However, the efficiency of ML was limited by the volume and quality of training data and the time needed to train models. The seminal 2017 paper from Google researchers, "Attention Is All You Need", truly transformed the world of language and sequence modeling with its novel self-attention-based neural network architecture, the 'Transformer'. Transformers brought the ability to integrate information across very long sequences, thereby giving greater "contextual understanding" to every token in the sequence being analyzed. Coupled with the ability to embrace parallel computing during training, Transformers delivered greater accuracy and efficiency.
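At its core, self-attention lets every token compute a weighted average over all tokens in the sequence, with the weights derived from dot products between token vectors. The following is a minimal pure-Python sketch of scaled dot-product self-attention; for simplicity it skips the learned query/key/value projection matrices a real Transformer would apply.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention over a list of token vectors.

    For clarity, queries, keys and values are the raw embeddings here;
    a real Transformer learns separate Q/K/V projections.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Each token attends to every token in the sequence, so context
        # from arbitrarily distant positions can flow into its output.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens))
                    for i in range(d)])
    return out

# Three toy 2-dimensional "token embeddings"
seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ctx = self_attention(seq)  # contextualized vectors, one per token
```

Each output vector is a convex combination of all input vectors, which is exactly how distant context reaches every position in a single layer.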

While Transformers started with the problem of machine translation, they evolved into a single architecture solving problems in content summarization, sentiment analysis, topic modeling and content generation. BERT from Google and GPT-3 from OpenAI have popularized transformer models.

While Transformers have proved their mastery in language and sequence modeling, recent research efforts explore Transformer-based architectures for decisioning through reinforcement learning, anomaly prediction and other areas, where problem statements are first recast as sequence modeling problems and then addressed. This article explores the various research trends in the Transformer architecture and how its different variants could have a profound impact on the world of banking and financial services.



ChatGPT is the talk of the town. The ability of such pretrained generative transformers, trained on a massive language corpus, to respond intelligently to varied queries has caught everyone's attention. This kind of pretrained general-purpose language model can be used effectively by banks for functions ranging from marketing to customer service to operations support. A few areas where ChatGPT-style models can be integrated into the banking IT ecosystem are:

-  Chatbot services/online assistance: instant responses to customer queries, advisory services for customers seeking guidance on banking products...

-  Language translation needs in a multi-lingual environment

-  Marketing campaigns: Generate customized content based on customer data and preferences

-  Market research: Analyze customer feedback and data and provide insights to customer servicing and marketing departments

-  Automate routine tasks such as data entry, email marketing etc. 



Recent research shows that speech processing using Transformer-based models holds much greater promise than the earlier RNN- and CNN-based models. The applications include:

-  Speech-to-text conversion (as in recordings of calls handled at the contact center) for analysis

-  User authentication using speech recognition when a user calls the contact center

-  User emotion prediction (to sense the mood of callers and route their calls intelligently, handling the requestor and the request better)

All of this can also be extended to multi-lingual and multi-dialect scenarios.

Speech recognition typically faces the problem of long sequences (as the number of audio frames is high), unlike text. Research efforts show varied approaches to handling this: some use CNNs to reduce the number of input tokens; some split the input into chunks over which attention is computed; others address the alignment problem between input and output (say, audio and its text transcription). Emformer, Gen3 Conformer, AmTRF, ESPnet and Hybrid Transformer are a few examples of research output in this space.
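The cost argument behind these approaches can be made concrete with a back-of-the-envelope sketch (the frame counts and chunk size below are illustrative, not tied to any specific model): full self-attention over n audio frames computes on the order of n² attention scores, while chunked attention only computes scores within fixed-size chunks.

```python
def attention_cost(n):
    # Full self-attention compares every frame with every other: O(n^2).
    return n * n

def chunked_attention_cost(n, chunk):
    # Attend only within fixed-size chunks: roughly O(n * chunk).
    full_chunks, rem = divmod(n, chunk)
    return full_chunks * chunk * chunk + rem * rem

# A hypothetical 10-second utterance at 100 frames per second
n_frames = 1000
full = attention_cost(n_frames)                  # 1,000,000 scores
chunked = chunked_attention_cost(n_frames, 64)   # far fewer scores
```

The chunked variant trades global context for tractability, which is why the cited architectures add tricks (memory banks, cross-chunk context) to recover long-range information.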



Products, services and offers made to banking customers have to be highly contextual. For example, customers may be made customized offers during their interactions with customer service representatives, or proactively offered deals when they are in the proximity of merchant stores. Based on a customer's spend patterns and age profile, a bank may choose to offer specific products (such as loans or discounts) that are relevant to them.

Research has shown that the encoder construct of the Transformer can be used to model customers and profile them effectively. Researchers have treated a user's encoded data as words, and the description of a user's behavior in a specific month as a sentence. The history of a customer (a one-hot encoding of the customer, numerical features of the customer's demographics, and a one-hot encoding of products owned) together becomes a sentence, recasting the problem as a sequence modeling task. The transformer encoder then learns an embedding of each user that encodes their historical data. A feed-forward layer takes the outputs of the encoder stack and the user embedding layer and produces a score for each product, normalized with a softmax layer into a probability distribution over the products a user is likely to buy, which gives the final recommendation. Research literature shows such transformer models perform better than traditional recommender engines.
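A hedged sketch of the encoding step described above (the field names and dimensions are invented for illustration, not taken from any specific paper): each month of a customer's history is flattened into one "sentence" vector, and the final recommendation comes from a softmax over per-product scores.

```python
import math

def one_hot(index, size):
    v = [0.0] * size
    v[index] = 1.0
    return v

def month_to_sentence(customer_id, n_customers, age_norm, income_norm,
                      products_owned, n_products):
    # One month of history as a single flat "sentence" vector:
    # customer one-hot + normalized demographics + product-ownership flags.
    return (one_hot(customer_id, n_customers)
            + [age_norm, income_norm]
            + [1.0 if p in products_owned else 0.0
               for p in range(n_products)])

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# One customer, 4 products; in the real pipeline the scores come from
# the transformer encoder plus a feed-forward head.
sentence = month_to_sentence(customer_id=2, n_customers=5,
                             age_norm=0.31, income_norm=0.62,
                             products_owned={0, 3}, n_products=4)
raw_scores = [0.2, 1.5, -0.3, 0.8]   # stand-in for the model's output
probs = softmax(raw_scores)           # per-product purchase likelihood
```

The softmax output is what the article calls the probability distribution over products; the highest-probability product becomes the recommendation.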



Recognizing handwritten content has always been a tough problem to solve. Variants of Convolutional Neural Networks (CNNs) have succeeded in improving the efficiency of ICR/OCR techniques, but their efficiency suffers when processing handwritten application forms.

Researchers have adopted a Convolutional Vision Transformer that comprises multiple stages, each with a CNN, a normalization layer and a vision transformer construct. The CNN is primarily used for computing the embedding patches and for feature extraction. The embedded patches output by the CNN are fed into an encoder architecture with stages of batch normalization, multi-head attention and finally a multi-layer perceptron (fully connected layers). Preliminary research indicates much higher accuracy in recognizing handwritten content.
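As an illustration of the patch-embedding step (a toy stand-in for the convolutional embedding, not a faithful Convolutional Vision Transformer implementation), the sketch below slices an image into non-overlapping tiles that would then flow through the normalization, multi-head attention and MLP stages of the encoder:

```python
def extract_patches(image, patch):
    """Slice an H x W grayscale image into non-overlapping patch x patch
    tiles and flatten each one, mimicking the patch-embedding step that
    turns an image into a token sequence for a vision transformer."""
    h, w = len(image), len(image[0])
    patches = []
    for r in range(0, h - patch + 1, patch):
        for c in range(0, w - patch + 1, patch):
            tile = [image[r + i][c + j]
                    for i in range(patch) for j in range(patch)]
            patches.append(tile)
    return patches

# A toy 4x4 "scanned form" split into four 2x2 patches; each flattened
# patch becomes one token in the encoder's input sequence.
img = [[0, 1, 2, 3],
       [4, 5, 6, 7],
       [8, 9, 10, 11],
       [12, 13, 14, 15]]
patches = extract_patches(img, 2)
```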

Banks do process handwritten forms in:

-  Account opening applications

-  Debit/ Credit Card requests

-  Loan processing forms

-  Transaction requests

-  Signature recognition, Face recognition and others

Vision Transformers (ViT) can thus play a vital role in improving the straight-through processing capability of banking organizations.



Transformers are known to model sequence data effectively. Recent research at universities and at organizations such as Google and Facebook has shown that a Transformer architecture (known as the Decision Transformer framework) can abstract reinforcement learning (RL) as a sequence modeling problem. It has been shown that the self-attention layer can form state-reward associations by maximizing the dot product of query and key vectors. Unlike the traditional approach of fitting value functions and computing policy gradients, this transformer-based approach predicts the next token, and optimal trajectories are obtained by generating the sequence of actions.
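A sketch of the sequence format the Decision Transformer trains on: trajectories are rewritten as interleaved (return-to-go, state, action) tokens, so predicting the next action token becomes an ordinary sequence modeling task. The states and actions below are illustrative placeholders, not from any published experiment.

```python
def returns_to_go(rewards):
    """Decision Transformers condition on the return-to-go: at each step,
    the sum of rewards from that step to the end of the trajectory."""
    rtg, running = [], 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    return list(reversed(rtg))

def to_token_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) triples so a sequence
    model can be trained to predict the next action token."""
    rtg = returns_to_go(rewards)
    seq = []
    for g, s, a in zip(rtg, states, actions):
        seq += [("R", g), ("s", s), ("a", a)]
    return seq

# A toy 3-step "portfolio" trajectory with placeholder states/actions
rewards = [1.0, 0.0, 2.0]
seq = to_token_sequence(["s0", "s1", "s2"],
                        ["buy", "hold", "sell"], rewards)
```

At inference time, conditioning the first return-to-go token on a high target return is what steers the generated action sequence toward good trajectories.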

In the banking industry, this approach can be explored for goal-based wealth management and financial portfolio management, including specific problems such as:

-  Optimization of retirement plans

-  Maximizing returns from fund investments for customers

-  Assistance in trading decisions

-  Maximizing customer value to a bank



Efficient anomaly detection is critical to detecting fraud in a financial ecosystem. The lack of labeled fraud data for training (frauds are infrequent) is often a challenge in building ML models. Fraudsters also use a different approach each time.

Recent research indicates that a deep transformer architecture with multi-head attention-based sequence encoders can be very effective at identifying and classifying fraudulent transactions based on knowledge of the broader temporal trends in the data. The input to the transformer model is a set of requisite features shortlisted using traditional ML models.
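One way to picture the data preparation implied above (a simplified sketch, not the published architecture): each transaction is presented to the sequence encoder together with a window of the transactions that preceded it, so temporal context is available to the classifier. The feature vectors are assumed to be pre-shortlisted by traditional ML models, as the article describes.

```python
def transaction_windows(features, k):
    """Turn a customer's transaction history into overlapping length-k
    sequences; each window gives the encoder the temporal context in
    which the most recent transaction sits."""
    return [features[i - k:i] for i in range(k, len(features) + 1)]

# Toy single-feature transactions (normalized amount); the spike at the
# end only looks anomalous against the preceding window of activity.
amounts = [[0.1], [0.2], [0.1], [0.15], [0.9]]
windows = transaction_windows(amounts, 3)
```

Each window would be fed to the multi-head attention encoder, whose output is then classified as fraudulent or legitimate.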

This approach can be effectively embraced by banks' risk departments to:

-  Identify fraud in real time in banking and credit card transactions

-  Identify and handle money laundering

-  Assist in credit decisioning and limit assessments

Another interesting predictor is the Time Series Transformer. It can be used for predicting stock price movements and for forecasting forward-looking quarterly or annual gross non-performing assets, probability of default, exchange rates and other key factors.



  • Transformer-based deep learning models gaining mainstream adoption:

Transformer-based ML algorithms are expected to be more efficient and accurate in their output, and should therefore contribute to the greater adoption of intelligence by banks. Fintech companies have been rapid adopters of the technology, and traditional banking players are catching up. Machine learning, and more so Transformer-based ML, is bound to have a lasting impact on banking by:

-  Automating/digitizing tasks, thereby reducing costs

-  Understanding the customer better, thereby improving quality of service

-  Improving cross-selling through personalized product and offer recommendations

-  Creating a sustainable and dependable enterprise by reducing fraud and risk

  • Financial Technology contextual Transformers:

FinBERT is a financial domain-specific pre-trained language model based on Google's BERT. Its goal is to enhance financial NLP research and practice. Very recently, the Indian startup Yubi launched its own transformer model, YubiBERT, which caters to the fintech industry in that part of the world and is customized to different local languages. This could well be the trend: innovative companies further exploring, refining and customizing the transformer-style deep learning architecture and training it on vast data to cater effectively to specific problem statements in the financial services world.

Transformer models may not replace humans, but they will certainly disrupt the current working model and provide huge potential for businesses to adapt and grow.



This content is provided by an external author without editing by Finextra. It expresses the views and opinions of the author.
