Part III: Optimizing eDiscovery with AI — Overcoming Scaling Challenges

Recap and Introduction to Scaling Challenges

In previous blog posts, we discussed how large language models (LLMs) can be used in eDiscovery, along with the importance of data security, keeping costs low, and transparency of data analysis. One of the novel risks introduced by using LLMs is "hallucinations," instances in which an LLM generates inaccurate or irrelevant text. In this post, we will put everything together and tackle the challenge of scale.

Meeting Multiple Requirements with LLMs

Three requirements are critical when deploying LLMs for legal eDiscovery:

  • Data security: Datasets are segregated safely in customer environments to maintain a single-tenant policy. This method minimizes the risk of data breaches and unauthorized access, providing a robust security framework that is essential in handling sensitive legal information.
  • Cost: To manage costs effectively, Hanzo utilizes LLMs in a targeted manner, deploying the smallest appropriate model for each specific task. Furthermore, the operational model is designed so that machines remain active only for the time needed to complete the task, significantly reducing unnecessary expenditures on processing power and energy.
  • Transparency: To address the issue of hallucinations, where LLMs might generate inaccurate or irrelevant information, the system is designed to either return the AI-generated content to the user for a thorough review or limit responses to simple yes/no answers. This method not only mitigates the risk of misinformation but also enhances the transparency of the AI processes, enabling users to understand and trust the results provided by the LLMs.
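The yes/no constraint described above can be sketched in a few lines. This is an illustrative sketch, not Hanzo's actual implementation: the `ask_llm` function is a stand-in for a call to a small, task-specific model, and the `needs_review` label is a hypothetical name for the human-review path.

```python
def ask_llm(prompt: str) -> str:
    """Stand-in for a call to a small, task-specific LLM.

    A real deployment would invoke the model here; this stub
    always answers 'yes' so the sketch is self-contained.
    """
    return "yes"

def classify_document(doc_text: str) -> str:
    """Constrain the model to a yes/no relevance answer.

    Anything other than a clean 'yes' or 'no' is routed to human
    review instead of being passed along, which limits the impact
    of a hallucinated free-text answer.
    """
    prompt = (
        "Answer with exactly 'yes' or 'no'.\n"
        f"Is this document relevant to the matter?\n\n{doc_text}"
    )
    answer = ask_llm(prompt).strip().lower()
    if answer in ("yes", "no"):
        return answer
    return "needs_review"  # ambiguous output goes to a human reviewer

print(classify_document("Q3 board minutes discussing the merger."))  # yes
```

The key design point is that free-form model output never flows directly into the review record: it is either collapsed to a binary label or handed back to a person.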

Challenges of Scaling LLMs in eDiscovery

The main challenge we face with LLMs is scale. LLMs are expensive to run due to the high-capacity computing resources required. Sometimes, the necessary hardware isn't available, leading to delays in dataset analysis. Solving this scale issue is critical because a secure, cost-effective, and transparent solution is pointless without the essential hardware to analyze your datasets. However, even though LLMs require more expensive hardware compared to traditional machine learning models used in eDiscovery, the overall cost remains lower than continuous active learning / technology-assisted review (CAL/TAR) due to the elimination of human review costs.
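A back-of-envelope calculation illustrates why eliminating review hours can outweigh pricier hardware. Every figure below is hypothetical and chosen only for illustration; none come from Hanzo's pricing or the post above.

```python
# Purely illustrative comparison with made-up rates and throughputs.
docs = 1_000_000

# LLM pass: hypothetical cloud GPU rate and per-GPU throughput.
gpu_hourly = 4.00           # $/GPU-hour (assumed)
docs_per_gpu_hour = 20_000  # documents classified per GPU-hour (assumed)
llm_cost = docs / docs_per_gpu_hour * gpu_hourly

# CAL/TAR pass: humans still review a fraction of the corpus.
reviewer_hourly = 60.00       # $/reviewer-hour (assumed)
docs_per_reviewer_hour = 50   # documents reviewed per hour (assumed)
review_fraction = 0.10        # share of corpus that reaches human eyes (assumed)
cal_tar_cost = docs * review_fraction / docs_per_reviewer_hour * reviewer_hourly

print(f"LLM pass:     ${llm_cost:,.0f}")      # $200
print(f"CAL/TAR pass: ${cal_tar_cost:,.0f}")  # $120,000
```

Even if the hardware assumptions are off by an order of magnitude, the gap is driven by the per-document cost of human review, which is the term the LLM workflow removes.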

Impact of Scaling on Data Processing

Being able to scale up data processing can be crucial when time is limited, datasets grow in size, or additional processing is needed at a later time. By keeping the LLM-based data processing within customer environments, we ensure that customers do not compete for shared resources and that processing can be scaled up as long as resources are available. This also means that there are no quotas, rate limits, or API tokens to worry about, and that costs scale linearly with processing time.
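The linear-cost claim above can be made concrete: when there are no shared-resource quotas, adding workers shortens wall-clock time while total machine-hours, and therefore cost, stay constant. The numbers and the `plan` helper below are illustrative assumptions, not Hanzo's capacity model.

```python
def plan(total_docs: int, docs_per_worker_hour: int, workers: int,
         rate_per_machine_hour: float) -> tuple[float, float]:
    """Return (wall-clock hours, total cost) for a parallel processing run.

    Cost = hours * workers * rate, so doubling workers halves the hours
    but leaves the cost unchanged: cost scales with processing time only.
    """
    hours = total_docs / (docs_per_worker_hour * workers)
    cost = hours * workers * rate_per_machine_hour
    return hours, cost

# Hypothetical corpus of 800k documents at 10k docs/worker-hour, $4/machine-hour.
for w in (1, 4, 16):
    hours, cost = plan(800_000, 10_000, w, 4.00)
    print(f"{w:2d} workers -> {hours:5.1f} h, ${cost:,.0f}")
```

In a multi-tenant API, rate limits would cap `workers`; in a single-tenant customer environment, the only cap is available hardware.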


Written by:

Hanzo
