Markus Klokhøj is an independent consultant at the intersection of AI and law. He teaches AI and Legal Disruption at the University of Copenhagen. A self-taught developer, Markus began his tech journey by creating native iOS apps to assist in studying for his LL.M exams, alongside developing a tutoring platform for law students. His interest in tech led him to found the first student-driven coding course for law students. Markus has applied his expertise in law and technology through roles at the Danish Data Protection Agency, Deloitte, and IKEA.
The buzz around artificial intelligence’s role in the legal sector is impossible to ignore. Major law firms are vying to lead the conversation about harnessing the power of generative AI, positioning themselves at the forefront of technological innovation.
This wave of enthusiasm is not just about staying ahead in the tech race; it is also a strategic move to showcase adaptability and a modern approach. Amidst these declarations of “we’re working on it too,” one cannot help but question the actual state of practical, useful applications of generative AI in the legal realm.
Drawing from my experience in developing and deploying generative AI within this sector, I am more cautious about its immediate potential. While the theoretical benefits are touted far and wide, the reality of implementing these AI systems effectively in legal contexts is far more nuanced and challenging. This scepticism stems not from a disbelief in the technology’s future capabilities but rather from a grounded understanding of where we truly stand today in the journey of making generative AI a genuinely helpful tool for legal professionals.
Over the last year, Retrieval Augmented Generation, commonly known as RAG, has gained considerable attention in the legal tech community. But what exactly is behind all this hype? This blog sets out to unravel the mysteries of RAG through specific experiences, providing engaging insights into its real-world functionality and impact.
What is RAG?
Let’s start from scratch: RAG enhances Large Language Models like GPT-4, Mistral, or Llama 3 by adding an information retrieval system that supplies context-specific data. Research indicates that RAG can significantly improve performance on natural language processing tasks, particularly where access to specific data matters. This makes it a natural fit for the legal domain, where the accuracy and relevance of information are fundamental to the conclusions the LLM draws.
Legal assessments require a deep understanding of facts and figures and an application of law to those facts; that is, in essence, the legal methodology. The added layer of “context-specific data” refers to the relevant legal material the system uses as a base for processing and generating responses, such as legal frameworks or policy. This addition lets the language model ground its responses in that data during generation.
The process has the following steps:
- Query/input: The user sends a query/question through the RAG application interface. This could, for example, be an interface similar to OpenAI’s ChatGPT.
- Retrieval: A retrieval component searches through a source you have provided as additional context, enhancing your control of what the model (for example, GPT-4) sees. Think of this as making the LLM smarter, or more competent, within a specific area. If you’re a lawyer, it might be legal information in a specific domain and the relevant legislation. The purpose of the search is to find text containing information relevant to the input.
- Augmentation: The relevant information found in the retrieval phase is then used to augment the original user input. This involves prompt engineering, where the retrieved data is integrated into a new prompt or reformulated query. This new prompt is designed to elicit the best possible answer by guiding the model’s response with context-specific details (for example, specific legal information).
- Generation: The model now uses the augmented prompt to generate a response. This is a combination of the original input, the contextual information and the model’s knowledge base. Hopefully, the response will be more accurate and useful for the user based on the context that has been added to the model.
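The four steps above can be sketched in a few lines of Python. Everything here (the tiny corpus, the keyword-overlap retriever, and the stubbed `generate()` function) is a hypothetical stand-in for real components such as a vector index and an LLM API, not a production implementation:

```python
# Minimal RAG loop: retrieve -> augment -> generate.
# Corpus, scoring, and generate() are illustrative stand-ins only.

CORPUS = [
    "GDPR Art. 6: processing is lawful only on a legal basis such as consent.",
    "GDPR Art. 33: a personal data breach must be notified within 72 hours.",
    "Internal policy: vendor software must pass a privacy assessment.",
]

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank passages by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(corpus, key=lambda p: -len(q_words & set(p.lower().split())))
    return scored[:k]

def augment(query: str, passages: list[str]) -> str:
    """Build a grounded prompt from the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def generate(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. a chat-completion API)."""
    return f"[model response grounded in a prompt of {len(prompt)} characters]"

question = "When must a data breach be notified?"
prompt = augment(question, retrieve(question, CORPUS))
print(generate(prompt))
```

In a real system, the keyword overlap would be replaced by embedding similarity over a vector store, but the control flow stays the same.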
RAG customises generative AI in an enterprise setting to meet specific business needs. By adding context-specific information to the powers of existing LLMs, RAG has the potential to create a digital colleague with superpowers. It is an advanced tool that generates highly specialised responses, such as those required in the legal sector, potentially changing business workflows.
Microsoft Azure offers a depiction of the infrastructure for RAG applications: https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview
This makes RAG an attractive business opportunity. Several major cloud service providers already offer the required infrastructure, with Microsoft Azure/OpenAI leading the pack. This allows corporations to explore RAG’s potential while relying on their existing corporate infrastructure, ensuring data privacy and security.
The RAG advantage
What are the practical benefits of this approach?
In a dynamic corporate setting, where legal reviews often slow progress, the ability to expedite legal assessments related to contractual terms, privacy/security, and internal policies is highly valuable. Quick determinations of approval or disapproval are crucial to avoid wasting resources. Therefore, empowering legal teams with tools to access information and generate responses swiftly becomes a compelling proposition.
Let us use the privacy assessment of corporate software as an example: companies onboard new software all the time, and evaluating these vendors involves a comprehensive process: assessing the software’s privacy and security features, its compliance with relevant laws, and its compatibility with existing systems. This thorough analysis helps ensure the software meets the company’s needs and standards for quality and safety.
I undertook a project for a client, and the following outlines the experience so far. Starting with the pros:
Pros
- Speed and efficiency: RAG can provide access to relevant privacy laws and regulations, such as GDPR, HIPAA, and CCPA, if these are added as context-specific data. The same goes for internal policies. The RAG environment can furthermore be set up depending on the geographical and sector-specific context and retrieve and summarise the most relevant legal information based on your role and location.
- Consistency: If an organisation has an existing dataset of privacy assessments, the company can leverage this data to improve the quality of the RAG-generated output. If the assessments follow a uniform format, they can be more effectively utilised because the structure facilitates the comparison process for the LLM. By using the existing assessments as compliance benchmarks, the RAG system identifies patterns and common issues, improving its ability to identify privacy risks.
- Support for legal teams: The draft version of the assessment can be generated almost entirely by the system for basic software products. This means that legal resources can spend less time on the basic assessment and focus on bigger and more complex tasks. Ideally, the system also provides recommendations upon request on how to improve privacy settings/standards for software products based on the corpus of existing data.
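The geographical and sector-specific setup mentioned above can be implemented as a metadata filter applied before ranking. This is a minimal sketch; the documents and metadata fields are made up for illustration:

```python
# Filter the corpus by jurisdiction and sector before ranking.
# Documents and metadata fields are hypothetical examples.

DOCS = [
    {"text": "GDPR applies to processing of EU residents' data.",
     "jurisdiction": "EU", "sector": "any"},
    {"text": "HIPAA governs protected health information.",
     "jurisdiction": "US", "sector": "health"},
    {"text": "CCPA grants California consumers rights over their data.",
     "jurisdiction": "US", "sector": "any"},
]

def filtered_retrieve(query, docs, jurisdiction, sector):
    """Keep only documents matching the user's jurisdiction and sector, then rank."""
    candidates = [d for d in docs
                  if d["jurisdiction"] == jurisdiction
                  and d["sector"] in (sector, "any")]
    q = set(query.lower().split())
    return sorted(candidates,
                  key=lambda d: -len(q & set(d["text"].lower().split())))

hits = filtered_retrieve("consumer data rights", DOCS,
                         jurisdiction="US", sector="retail")
print([h["text"] for h in hits])
```

A US retail user would only ever see CCPA-style sources here; HIPAA and GDPR material is filtered out before it can pollute the prompt.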
It would be great if RAG could just be implemented effortlessly and seamlessly integrated into organisational processes. However, there are several challenges and drawbacks to consider when adopting this approach. Here are some of the cons that I’ve found so far:
Cons
- Quality of results and time savings: While RAG expedites the assessment process, the current state of these systems still requires human oversight. The need for review and potential corrections raises questions about the net time saved. If significant human intervention is necessary to ensure accuracy and compliance, the efficiency gains of RAG might not be as impactful as initially perceived. So, while RAG might give more control over the LLM’s output, it’s not necessarily enough.
- Reliance on extensive documentation: The effectiveness of RAG relies on the availability of comprehensive, well-structured documentation. Creating and maintaining such a detailed knowledge base is a task in itself, demanding considerable time and resources. This dependency can be a significant hurdle, especially for organisations that lack established documentation practices.
- Data preparation challenges: You cannot simply feed the model a piece of legislation and expect it to interpret and apply the knowledge in the document correctly. Patterns in legislation can be hard for the LLM to recognise, and the groundwork for implementing RAG is no small feat.
Cleaning and preparing data for the pipeline is a complex and often labour-intensive process. The quality of the input data directly influences the output, so any shortcomings in this preparatory stage can diminish the effectiveness, correctness and reliability of the RAG system, presenting a substantial challenge in its application. The same applies to fine-tuning, which is formally a separate process but in practice inseparable from RAG if the best possible results are to be achieved.
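To make the data-preparation point concrete, here is a minimal sketch of splitting a statute into article-level chunks and normalising whitespace, so each retrieval unit is self-contained. The input format (one “Article N.” heading per provision) is a simplifying assumption; real legal texts are far messier and need far more careful parsing:

```python
import re

# Split a statute into article-level chunks for retrieval.
# Assumes an "Article N." heading per provision, which is a simplification.

RAW = """
Article 1. This   regulation lays down rules on data protection.

Article 2. It applies to the processing of personal data.
"""

def chunk_legislation(text: str) -> list[dict]:
    chunks = []
    for match in re.finditer(r"(Article \d+)\.\s+(.*?)(?=Article \d+\.|\Z)",
                             text, re.S):
        article, body = match.group(1), match.group(2)
        body = re.sub(r"\s+", " ", body).strip()  # normalise whitespace
        chunks.append({"id": article, "text": body})
    return chunks

for c in chunk_legislation(RAW):
    print(c["id"], "->", c["text"])
```

Even this toy version shows why the groundwork matters: the chunk boundaries decide what the retriever can find, and a statute split mid-provision will produce misleading context.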
Unfortunately, there’s still not a lot of guidance on the best way to improve the performance of AI. As noted in a study by Harvard Business School, “Contributing further to the opacity is that the best ways to use these AI systems are not provided by their developers and appear to be best learned via ongoing user trial-and-error and the sharing of experiences and heuristics via various online forums like user groups, hackathons, Twitter feeds and YouTube channels”. This highlights the necessity for users to continually test and refine their approaches through hands-on experimentation and community engagement.
A balanced approach weighing the operational costs vs the technological benefits might not always lead to a clear picture of what implementation will actually look like. The setup and administration of the system might end up becoming just as big a burden as the legal work would’ve been, which should be a part of the considerations when deciding how to leverage the power of RAG.
Meet RAG. He might be your new assistant, but his skills are currently overhyped. The RAG(doll) still has a lot to learn.
RAG’s role in modern law
The integration of RAG into modern legal practices represents a potentially transformative advancement but comes with significant challenges. While the benefits – such as efficiency, consistency and simplification – might all seem compelling, they’re not guaranteed for every specific use case because of the underlying requirements of clean and big datasets, updated data pipelines, and the analysis of cost vs benefit.
Despite these hurdles, ongoing development holds promise for significant impacts on legal work. Exploring diverse applications is crucial, and there are plenty of use cases out there. Still, it’s essential to acknowledge that viable use cases may be more limited than initially thought if the business opportunity doesn’t outweigh the cost of maintaining the system.
Currently, the only way to thoroughly assess the capabilities of a RAG setup is by exploring and experimenting. Organisations implementing RAG might have to accept the higher initial cost because of the trial and error approach these projects currently carry – and, in some cases, accept that the use case might not be a good fit after all.
The future of RAG as an AI tool for improving legal work therefore rests not only on technological advances, but also on the strategic approach and implementation of the organisations willing to take on this task.