RAG Applications: Input Guardrails

Question and Answer (QnA) and semantic summarization RAG applications let users ask questions of potentially private enterprise data in natural language. The flexibility of these free-form queries can make enterprise data far more available where it is needed most, whether that is to internal employees or to customers.

This flexibility can also cause problems when a user, unintentionally or maliciously, asks questions that are inappropriate or outside the intended scope of the RAG application. An enterprise-grade RAG builder needs to provide defenses around the inputs being submitted to prevent misuse of the RAG system; the consequences of misuse range from unnecessary expense to significant brand damage. “Misuse” can be defined as any interaction or question that exceeds the bounds of reasonable use of the system, and it falls into the following categories:

Topical Issues

These are questions outside the domain of the data, for example asking for food recipes on a site focused on financial data. While not necessarily harmful, they place unnecessary load on components of the RAG pipeline and increase the cost of the system. Occasionally the LLM responds with what it thinks is an appropriate answer, which is confusing at best and brand-damaging at worst. Users may ask off-topic questions unknowingly, because they are not entirely familiar with the data a particular RAG application covers, or mischievously, to test the limits of the system. A RAG designer should be able to provide reasonable feedback that redirects the user back to relevant topics.


Profanity Issues

This is a wide category of input that contains or asks for socially and workplace-inappropriate content. It generally includes racial slurs and other culturally and socially unacceptable terms, and it can also include inputs that go against the values and ethos of an enterprise.

This is a complex topic because terms that are inappropriate for one enterprise and data corpus may be perfectly legitimate in another. For example, a financial help website might flag certain anatomical references in questions, while a corpus of medical research would accept them.

An enterprise-grade RAG builder needs to provide configuration to decide which terms are considered profanity and out of bounds and which are not. Reasonable feedback needs to be provided when input is flagged as containing profanity.


Personally Identifiable Information (PII)

Users might inadvertently include their own personally identifiable information in a question, or try to jailbreak the LLM into exposing PII from the data corpus. Ideally, the indexed data corpus should not contain any PII in the first place, but a RAG designer will still want to catch and thwart any such attempts, and to warn users when they are entering PII themselves.


Dataworkz RAG Builder Input Guardrails

Dataworkz RAG Builder gives the RAG designer tools to protect the RAG pipeline from all of these types of hazardous input. “Input Guardrails” sit at the very entry point of the RAG pipeline and can be configured to suit an enterprise’s needs. The following configuration options are available:

Keyword-based checks

A list of keywords can be provided to ensure that offending terms do not appear in the input. This check has the advantage of being inexpensive because no LLM calls are made. The keyword list needs to be updated regularly by reviewing incoming questions and identifying any terms that should be added.
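The sketch below shows the general shape of such a check; the blocked list, function name, and example question are hypothetical and do not reflect the Dataworkz implementation.

```python
# Minimal sketch of a keyword-based input guardrail (illustrative only).
import re

# Maintained by the RAG designer; the entries here are placeholders.
BLOCKED_KEYWORDS = {"offensive_term_1", "offensive_term_2"}

def passes_keyword_check(user_input: str) -> bool:
    """Return False if the input contains any blocked keyword."""
    tokens = set(re.findall(r"[a-z']+", user_input.lower()))
    return tokens.isdisjoint(BLOCKED_KEYWORDS)

if __name__ == "__main__":
    question = "How do I rebalance my retirement portfolio?"
    if not passes_keyword_check(question):
        print("Your question contains terms that are not permitted here.")
    else:
        print("Input passed the keyword check.")
```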

LLM Detection

Using an LLM for input guardrails casts a wider net and catches a larger set of invalid questions. The trade-offs are that it requires initial review and iteration to ensure it is on target, and it carries a modest cost that depends on query volume.

For Profanity

An LLM can be configured to detect profanity in the input. This is a strong choice for protecting against a large body of invalid inputs while requiring far less supervision and fewer updates. Dataworkz Input Guardrails let you customize the prompt so that checks specific to your data requirements can be added.
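As a rough sketch of how such a check might be wired up, the example below uses the OpenAI Python SDK purely for illustration; the model choice, prompt, and function are hypothetical, and the actual Dataworkz guardrail is driven by configuration rather than code.

```python
# Illustrative LLM-based profanity guardrail; not the Dataworkz implementation.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The prompt can be extended with enterprise-specific rules, e.g. permitting
# anatomical terms for a medical research corpus.
PROFANITY_PROMPT = (
    "You are an input guardrail for an enterprise Q&A application. "
    "Reply with exactly one word, ALLOW or BLOCK. "
    "BLOCK the question if it contains profanity, slurs, or other "
    "workplace-inappropriate language."
)

def contains_profanity(question: str) -> bool:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        temperature=0,
        messages=[
            {"role": "system", "content": PROFANITY_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content.strip().upper() == "BLOCK"
```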

For Off-topic Questions

Off-topic questions are identified by providing a description of the RAG application’s data corpus so that the LLM can confidently determine whether a question is on topic. This requires review and iteration by the RAG designer during the initial monitoring stage to ensure it is working well.
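A comparable sketch for off-topic detection is shown below; the corpus description, model, and redirect message are invented examples, and in Dataworkz this is supplied as configuration rather than code.

```python
# Illustrative off-topic check driven by a corpus description (sketch only).
from openai import OpenAI

client = OpenAI()

# Description of what the RAG application actually covers.
CORPUS_DESCRIPTION = (
    "Personal finance topics: budgeting, retirement accounts, "
    "mortgages, and investment basics."
)

def is_on_topic(question: str) -> bool:
    system_prompt = (
        f"This application answers questions about: {CORPUS_DESCRIPTION} "
        "Reply with exactly ON_TOPIC or OFF_TOPIC for the user question."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        temperature=0,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content.strip().upper() == "ON_TOPIC"

if __name__ == "__main__":
    if not is_on_topic("What is a good recipe for banana bread?"):
        print("That question is outside the scope of this application; "
              "try asking about budgeting, retirement, or mortgages.")
```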

PII

Dataworkz provides PII checks to ensure that the input to a RAG application does not contain, or ask for, any personally identifiable information, protecting your organization as well as your users.
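The sketch below shows a simple pattern-based screen of the kind often used for PII; production detectors typically combine patterns with named-entity models, and the patterns and names here are illustrative rather than Dataworkz’s own.

```python
# Illustrative pattern-based PII screen (sketch only; not exhaustive).
import re

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}\b"),
}

def detect_pii(user_input: str) -> list[str]:
    """Return the names of any PII categories found in the input."""
    return [name for name, pattern in PII_PATTERNS.items()
            if pattern.search(user_input)]

if __name__ == "__main__":
    hits = detect_pii("My SSN is 123-45-6789, can you look up my account?")
    if hits:
        print(f"Please remove personal information ({', '.join(hits)}) and try again.")
```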


Best Practices

Input Guardrails are a powerful way to put up defenses against malicious or accidental invalid input in your RAG application. However, a RAG designer should monitor the system using the tools that Dataworkz provides, note any false positives or false negatives, and tweak the configuration so that the RAG application serves its intended purpose while protecting your enterprise and your users.

This is a complex topic – technically as well as socially – and it is constantly evolving. The terms that are considered offensive, the biases present in inputs, and so on can change over time and require updating. Additionally, as the technology for guardrails improves, Dataworkz will introduce stronger, more configurable capabilities that enable a finer degree of control.

Conclusion

As RAG applications thrive and enable a new host of information-empowered users, it is also important to protect them from the inevitable accidental and malicious inputs that will follow. Dataworkz RAG Builder provides a powerful set of cutting-edge tools that lets a RAG designer take control of these defenses and protect their application, their users, and their reputation.
