How Financial Services Companies Secure Data in a RAG GenAI Environment

22 December 2023

Ameesh Divatia

Co-founder and CEO

Baffle

Few industries face as much competitive pressure to innovate as the financial services sector, and few do so under as much public and regulatory scrutiny of data privacy and security. So, as companies implement new applications and services using large language model (LLM) AI platforms like ChatGPT, the financial services industry must take a different approach.

Given the data security risks that public LLM services can pose when private data is used, financial services companies are turning to GenAI deployment models that can run on private infrastructure. One such model is retrieval augmented generation (RAG), which leverages the LLM's understanding of text and applies it to accessing and processing private data. Companies get the power of AI without exposing their data in an open AI environment, and with far less risk of hallucination.

This article will cover some benefits RAG models can offer financial services companies and explore ways to ensure continuous data security while using this forward-thinking technology.

The difference between RAG and publicly available chatbots

Well-known GenAI systems like ChatGPT generate a query response based on their training on publicly available information. With RAG, companies can use their private data set to add context to the query without retraining the model. A RAG system does so by intercepting the user prompt and using it to search an index of the private data set for the most relevant information. This information is sent to the LLM as context along with the original prompt, so the response is based on this context rather than on the training data alone.

The result is a response that is grounded in private data, with a much lower risk of hallucinations. Best of all, augmenting the standard Q&A GenAI system requires no retraining of the LLM, and the knowledge base that the GenAI system relies on can be refined and updated frequently to ensure the most accurate results.
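To make the flow above concrete, here is a minimal sketch of the retrieve-then-augment step. It is an illustration under stated assumptions, not a production design: the documents, the stopword list, and the function names (`retrieve`, `build_augmented_prompt`) are hypothetical, and a simple word-overlap score stands in for the vector index a real RAG system would more likely use. The call to the LLM itself is left out.

```python
import re
from collections import Counter

# Hypothetical private knowledge base; a real index would hold far more text.
DOCUMENTS = [
    "Wire transfers submitted before 3 pm are processed the same business day.",
    "The premium savings account requires a minimum balance of 5,000 dollars.",
    "Lost cards can be frozen instantly from the mobile app settings screen.",
]

# Small assumed stopword list so common words do not dominate the score.
STOPWORDS = {"a", "an", "the", "i", "do", "how", "is", "of", "to", "from", "be", "are"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric words that are not stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def retrieve(prompt, documents, top_k=1):
    """Rank documents by word overlap with the prompt (toy relevance score)."""
    query = Counter(tokenize(prompt))
    scored = []
    for doc in documents:
        words = Counter(tokenize(doc))
        overlap = sum(min(count, words[w]) for w, count in query.items())
        scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no words with the prompt.
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_augmented_prompt(prompt, documents):
    """Combine retrieved context with the original prompt for the LLM."""
    context = "\n".join(retrieve(prompt, documents))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {prompt}"
    )
```

Calling `build_augmented_prompt("Can I freeze lost cards from the mobile app?", DOCUMENTS)` returns a prompt whose context section holds the card-freezing document. Updating the knowledge base only means changing `DOCUMENTS`; the model is never retrained.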

Benefits of RAG for the financial services sector

In my conversations with customers in the financial services industry, they have shared how they plan to use RAG to help them solve business challenges such as:

Automated customer support: GenAI-powered chatbots and virtual assistants can provide meaningful and actionable responses to customers asking questions about the institution's products, services, and applications. With RAG, these chatbots can tap into an extensive and up-to-date internal knowledge base to give customers accurate answers while supporting natural language interactions that customers are accustomed to.

Document summarization for analysis and decision-making: Financial institutions process vast numbers of documents each day, ranging from financial disclosures from customers for account applications to internal reports about market positions. By using RAG to quickly summarize the content of these documents and point employees to the most relevant information, financial institutions can analyze information and arrive at crucial decisions more quickly and efficiently.

There’s no doubt in the minds of all those I’ve talked to that a GenAI system using RAG presents an excellent opportunity to quickly gain value from the rapidly developing technology in this space.

Strategies for secure RAG use

As tempting as it may be to dive headfirst into RAG, financial institutions must exercise caution. Like other generative AI technologies, RAG comes with pitfalls that can cause financial institutions to fall out of compliance and expose them to severe penalties. It behooves companies in the financial sector to implement appropriate security measures as part of their rollout plan for RAG systems.

Review risk of data flows to LLM service: To be effective, RAG systems need access to large data sets from multiple data sources, and much of this data will be regulated. During typical RAG operation, this sensitive data must be sent to the LLM service as context to get the response. Financial institutions should look closely at the data accessible to the RAG system and determine whether the risk of exposing this data calls for an LLM service deployed fully within infrastructure they control instead of a public service.
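One way to start such a review is to check what the retrieved context actually contains before it leaves the institution's infrastructure. The sketch below is only an illustration: the two regular expressions are rough, assumed patterns for US Social Security numbers and email addresses, and the names `find_sensitive` and `choose_llm_target` are hypothetical. Pattern matching like this misses plenty of regulated data, so it supports a data-flow review but does not replace one.

```python
import re

# Assumed, deliberately simple patterns; real data classification needs much more.
SENSITIVE_PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def find_sensitive(text):
    """Return the sorted names of the patterns that match somewhere in the text."""
    return sorted(
        name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)
    )


def choose_llm_target(context_chunks):
    """Return 'private' if any chunk looks sensitive, otherwise 'public'."""
    for chunk in context_chunks:
        if find_sensitive(chunk):
            return "private"
    return "public"
```

For example, `choose_llm_target(["Contact jane.doe@example.com"])` returns `"private"`, signalling that this context should only go to an LLM service the institution controls.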

Encrypt data at field level: Because regulated data must be protected no matter where it is in the IT infrastructure, financial services companies must ensure that sensitive data is encrypted at all times and decrypted on an “as needed” basis. Data-at-rest encryption solutions are used widely today, but they are not enough on their own: the data is automatically decrypted when accessed, which exposes it as soon as it moves from one system to another.

The only way to ensure consistent and continuous protection of regulated data values as they move through a financial company’s data pipeline into a RAG system is to encrypt the data value itself at a field level as early as possible. Specifically, all PII should be encrypted, since multiple overlapping compliance regulations cover it. For RAG systems, encrypting PII ensures that no amount of prompt engineering can reveal the sensitive data value while allowing valid prompts (those not seeking to reveal PII) to work correctly.
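The idea of protecting individual field values before they reach the RAG index can be sketched as follows. Note the substitution: to stay within the Python standard library, this example uses keyed tokenization (HMAC-SHA256) as a stand-in, which is one-way and is not the reversible field-level encryption described above. A real deployment would use a vetted encryption product with proper key management so that authorized users can decrypt. The key, the field classification, and the function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical demo key; real keys belong in a key management system.
SECRET_KEY = b"demo-key-do-not-use-in-production"

# Assumed classification of which fields hold PII.
PII_FIELDS = {"name", "ssn"}


def protect_value(value):
    """Replace a string value with a keyed, deterministic token (stand-in for encryption)."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


def protect_record(record):
    """Protect only the PII fields; leave other fields readable for retrieval."""
    return {
        field: protect_value(value) if field in PII_FIELDS else value
        for field, value in record.items()
    }


customer = {"name": "Jane Doe", "ssn": "123-45-6789", "account_type": "premium savings"}
protected = protect_record(customer)
```

Only `protected` would be indexed and sent to the LLM as context. Because the clear-text name and SSN never enter the prompt, no prompt can coax them out of the model, while a question about account types still works against the unprotected `account_type` field.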

Field-level access controls: Compliance requirements demand appropriate access controls so that regulated information is available only to those with a legitimate business need. For financial institutions, the RAG system must never include PII and other sensitive information in the response. While security filters and DLP solutions may catch some leaks, they can readily be circumvented by prompts designed to evade them.

The most effective access control is tied to the sensitive data values and allows access to those values based on a well-defined access control policy. With field-level access control enforcement, financial institutions can deterministically enforce access to regulated data and provide clear audit records of the access.
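A minimal sketch of what policy-driven, field-level enforcement with an audit trail might look like is shown below. The roles, the policy table, and the `read_record` function are hypothetical and chosen only for illustration. The point is that the decision is made per field from an explicit policy, so the outcome is deterministic and every access attempt is recorded.

```python
from datetime import datetime, timezone

# Hypothetical policy: which roles may see which fields in clear text.
POLICY = {
    "fraud_analyst": {"name", "ssn", "account_type"},
    "support_agent": {"name", "account_type"},
    "rag_service": {"account_type"},
}

# In-memory audit trail for the sketch; a real system would use durable storage.
AUDIT_LOG = []


def read_record(record, role):
    """Return the record with every field the role may not see redacted."""
    allowed = POLICY.get(role, set())  # unknown roles get no access
    visible = {}
    for field, value in record.items():
        granted = field in allowed
        visible[field] = value if granted else "[REDACTED]"
        # Log every decision, granted or denied, for later audit.
        AUDIT_LOG.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "role": role,
            "field": field,
            "granted": granted,
        })
    return visible
```

With a record such as `{"name": "Jane Doe", "ssn": "123-45-6789", "account_type": "premium savings"}`, `read_record(record, "rag_service")` returns only the account type in clear text, regardless of how the user's prompt is phrased.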

Everyone agrees that RAG systems represent an immediate and unprecedented opportunity to leverage LLMs in financial institutions, but it is crucial to consider data security and privacy requirements early on. By adopting the necessary security measures to protect regulated data in RAG systems, financial services companies can address the data security and privacy compliance requirements posed by RAG-based applications before they become liabilities.

This content has been created by the Finextra editorial team with inputs from subject matter experts at the funding sponsor.
