A Buddhist AI – “The DUDE” – Explains the Eight Steps of Hybrid Multimodal Document Review with Help from a Human Lawyer

EDRM - Electronic Discovery Reference Model

Written by Chat GPT 3.5, partly in “Comic Hipster Style” with Ralph Losey

Losey to Chat: “Use a comedic hipster writing style to explain my eight steps of electronic document review.” Chat GPT 3.5 replies:

The Dude dancing barefoot, in his robe, on a rug

Image: Losey

“Yo dude, imagine like, you’re on this journey of enlightenment and you’re trying to find your inner peace and all that jazz. But instead of meditating in a cave, you’re sorting through mountains of documents, trying to find the ones that will bring you to legal nirvana.

It’s like the Eightfold Doc Review Path, man.

First, you have to have a little chit-chat with your homies about what’s relevant and what’s not. (“right speech”)

Then, you use all the fancy tech tools at your disposal to sift through the documents like a boss. (“right livelihood”)

You take a random sample, just to get a feel for the lay of the land. (“right mindfulness”)

Then you train that machine learning algorithm like a Zen master (“right resolve”), until it’s ranking documents like a pro. (“right effort”)

Then, you let the attorneys have a look-see, and make sure everything’s on point. (“right view”)

Then, you apply some Zen-like QC to make sure there’s no errors or inconsistencies. (“right concentration”)

And finally, you produce those relevant documents like a true compliance ninja. (“right action”)

It’s like achieving legal enlightenment, but better, because you don’t have to sit on a hard floor for hours.

That’s Right, Man.

 

The Dude, sitting cross legged in meditation pose
Image: Losey

The Dude Bowls Over Losey with his Doc Review Summary

You don’t have to sit on a hard floor for hours!

Image is GPT Generated


Leaving Hipster Style Humor Language and Asking Chat GPT to further explain the Eight Steps using an Authoritative Writing Style and Ralph’s Graphics.

The Eight Steps of Electronic Document Review

 

Predictive Coding 4.0 Document Review
Image: Losey

2 people talking "I hear you saying that"

Image: Losey

1. Right Talk (Step 1 – Relevance Dialogues): Talk is the first step in the electronic document review process. It involves having relevance dialogues between the trial counsel and other parties to understand the scope of the review and to identify the relevant documents.

Rule 26(f) and Rule 1 require communication with the attorney for the party requesting ESI in order to: a) discuss and clarify the scope of relevance; b) provide a brief overview of past and proposed e-discovery efforts, i.e., the Discovery Plan; c) advise of the intent to use a multimodal search process for computer-assisted review and proportional, phased productions; d) discuss the plan to review or ignore any non-text documents; and e) depending on the case, invite input from the requesting party.

2. Right ECA (Step 2 – Early Case Assessment using all methods): Early Case Assessment (ECA) is the second step in the process, in which all the methods are used to identify and classify the documents. This includes using active machine learning, keyword search, focused linear search, and other methods to identify potentially relevant documents.

In this critical second step you primarily use your professional judgment to begin to search for key documents, ones that you hope may be fairly easy to find, aka the “low hanging fruit.” This is called judgmental sampling. It is not a random sample. It is a selection of sample documents for review, chosen by attorney judgment, case knowledge, and various types of searches using a hybrid multimodal process. Coding of documents begins in this step, including those already known.

Hybrid Multimodal AI-Enhanced Review. Expert Manual Review on bottom of pyramid, going up, Keyword Search, Similarity, Concept, AI

Image: Losey

The four types of search here are:

Active Machine Learning (aka Predictive Coding): Active Machine Learning is a method of using computer algorithms to automatically identify and classify documents based on their relevance to a legal matter. It involves training a machine learning model using a set of manually coded documents, and then using that model to classify the remaining documents. It is a form of active learning because the machine is actively trying to classify the documents.

Concept & Similarity Searches (aka Passive Learning): Concept & Similarity Searches are another form of machine learning that is used to identify and classify documents. Unlike Active Machine Learning, which is trained on a set of manually coded documents, Concept & Similarity Searches rely on the machine to identify similarities between documents and group them together based on those similarities. This is considered a passive form of machine learning because the machine is not actively trying to classify the documents.

Keyword Search (Tested, Boolean, Parametric): Keyword search is a method of searching for documents that contain specific words or phrases. It can be done using tested, Boolean, or parametric search methods. Tested search methods are based on a set of predefined keywords, Boolean search allows the use of logical operators (and, or, not) to combine keywords, and parametric search allows the use of variables. This method is useful when searching for a specific set of documents, but it is less effective if the keywords used are not specific enough.

Focused Linear Search (Key Dates & People): Focused Linear Search is a method of searching for documents by looking at specific dates or people related to the case. It can be used to narrow down the search to a specific time period or group of people, which can help to identify relevant documents more quickly.
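To make the last two search types concrete, here is a minimal sketch in Python of a Boolean keyword search and a focused linear search by key people and dates. The `Doc` record, the function names, and the four sample documents are all made up for illustration; real review platforms have their own search syntax and indexing, so treat this as a toy model of the idea rather than how any particular tool works.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    doc_id: str
    custodian: str
    sent: date
    text: str


def boolean_search(docs, all_of=(), any_of=(), none_of=()):
    """Keep docs with every `all_of` term (AND), at least one `any_of`
    term when any are given (OR), and no `none_of` term (NOT)."""
    hits = []
    for d in docs:
        words = set(d.text.lower().split())
        if not all(t in words for t in all_of):
            continue
        if any_of and not any(t in words for t in any_of):
            continue
        if any(t in words for t in none_of):
            continue
        hits.append(d)
    return hits


def focused_linear(docs, custodians, start, end):
    """Keep docs from key people inside a key date range, sorted by date
    so a reviewer can read them one after another."""
    picked = [d for d in docs
              if d.custodian in custodians and start <= d.sent <= end]
    return sorted(picked, key=lambda d: d.sent)


# Hypothetical sample collection.
docs = [
    Doc("D1", "alice", date(2021, 3, 1), "merger pricing memo draft"),
    Doc("D2", "bob", date(2021, 3, 5), "lunch order for friday"),
    Doc("D3", "alice", date(2021, 6, 9), "pricing update and merger newsletter"),
    Doc("D4", "carol", date(2021, 3, 7), "merger pricing call notes"),
]

keyword_hits = boolean_search(docs, all_of=("merger", "pricing"),
                              none_of=("newsletter",))
linear_hits = focused_linear(docs, {"alice", "bob"},
                             date(2021, 3, 1), date(2021, 3, 31))

print([d.doc_id for d in keyword_hits])  # ['D1', 'D4']
print([d.doc_id for d in linear_hits])   # ['D1', 'D2']
```

Note how the two methods overlap only on D1: the keyword search finds documents by what they say, while the focused linear search finds them by who sent them and when. That is one reason for using several methods together.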


Here I asked the hipster Dude style of Chat GPT writing to come back. I wanted to hear The Dude’s explanation of these four types of search.

The Dude, walking with purpose

The Dude Explains Search

Image is GPT Generated

Alright “The Dude” explains in a cool, music related, hipster way:

  1. Active Machine Learning, or Predictive Coding, is like getting a personalized playlist from your cool music app. It’s like, you know, “The Dude’s got a good taste in music, I trust him to know what I’ll like.”
  2. Concept & Similarity Searches, or Passive Learning, is like browsing the record store for new music. It’s like, “The Dude’s just casually browsing, but he’s still finding some hidden gems.”
  3. Keyword Search is like going to the record store with a specific album in mind. It’s like, “The Dude knows exactly what he’s looking for, and he’s not leaving until he finds it.”
  4. Focused Linear Search is like thumbing through your collection of vinyls one by one. It’s like, “The Dude’s taking his time, going through every single album he’s got, making sure he doesn’t miss a thing.”

So, in summary, The Dude’s got different methods to find that perfect legal document, whether it’s by trusting the algorithm, casually browsing, searching for something specific or by taking his time going through everything.

End Dude. Putting Chat GPT 3.5 back to normal authoritative style.

8 Dice thrown in the air

Image: Losey

3. Right Random (Step 3 – Prevalence Range Estimate, not control sets): In this step, a random sample of documents is drawn and reviewed to estimate the percentage of relevant documents in the entire set, aka the Prevalence. The Prevalence calculation is always a range of numbers, based on a statistical calculation of probability. This is an important quality control metric and review guide. The random sample Prevalence calculation also provides evidence of proportional efforts and Recall. The quality and persuasive value of this evidence depends in part on the sample size. A sample of 1,534 documents, which gives a Confidence Interval of 2.5% at a 95% Confidence Level, is used in most cases. A larger size is only rarely needed. In smaller value cases, a sample of only 783 documents could be sufficient to give you an idea of Prevalence, but it is less precise, with a wider 3.5% Interval.
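The sample-size arithmetic can be checked with a short Python sketch. It uses the standard textbook formula for estimating a proportion (worst-case proportion of 0.5, z-score of 1.96 for a 95% Confidence Level), with an optional finite-population correction. The 1,000,000-document population is purely an assumption chosen for illustration; the article does not say how its figures were derived, but with that assumption the formula happens to reproduce both 1,534 and 783. The 153-relevant-documents example is also made up.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level


def sample_size(margin, population=None, z=Z_95, p=0.5):
    """Sample size (as a float) needed for a given margin of error.

    Uses the worst-case proportion p = 0.5. If `population` is given,
    applies the usual finite-population correction.
    """
    n0 = z * z * p * (1 - p) / (margin * margin)
    if population is None:
        return n0
    return n0 / (1 + (n0 - 1) / population)


def prevalence_range(relevant_in_sample, sample_n, z=Z_95):
    """Point estimate and normal-approximation range for prevalence."""
    p_hat = relevant_in_sample / sample_n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / sample_n)
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Very large population: the plain formula.
print(round(sample_size(0.025)))  # 1537
print(round(sample_size(0.035)))  # 784

# Assumed population of 1,000,000 documents (hypothetical).
print(round(sample_size(0.025, population=1_000_000)))  # 1534
print(round(sample_size(0.035, population=1_000_000)))  # 783

# Hypothetical review result: 153 relevant documents in a sample of 1,534.
p_hat, low, high = prevalence_range(153, 1534)
print(f"prevalence about {p_hat:.1%}, range {low:.1%} to {high:.1%}")
```

Because the sample sizes are computed at the worst-case proportion of 0.5, the range you actually observe is usually narrower than the nominal 2.5%. In the made-up example, a prevalence of about 10% gives a range of roughly plus or minus 1.5%.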

Iteration

Image: Losey

4. Right Select (Step 4 – Choose Documents for Training Machine): In this step, documents selected from the random sample and ECA are used to train the machine learning algorithm. This allows the machine to learn and improve its classification abilities. The AI Trainer works in an iterative process to train the machine on Relevance (or any binary decision, including privilege). This allows the machine intelligence to rank the probable relevance of all text documents to be reviewed. (Remember – Predictive Coding only searches alphanumeric text, or images with text extracted during the vendor’s discovery processing.) Your role as the human attorney in charge of the project is to act as the ultimate authority on relevance, the “Subject Matter Expert” or “SME.” Yes. This will be on the test.

Gears with eDiscovery Team Players in a complex ESI review project. ESI, AI, AI trainer, Project Manager, Jr and Sr SME's, Judge, Client

Image: Losey

5. Right AI Rank (Step 5 – Machine Ranks Documents According to Probabilities): In this step, the machine learning algorithm ranks the documents based on their probability of being relevant. This ranking is done based on the information obtained during the training step. In Rank the AI does its work: it performs a logistic regression based statistical analysis of the whole database and ranks all documents as to probable relevance. Multiple rounds of training and ranking are typically required before the machine obtains a good understanding of the intent behind Relevance. It can take as few as three rounds for simple issues and databases, and as many as thirty rounds of training in a large, complex matter (very rare).
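As a rough illustration of training and ranking, the sketch below fits a tiny logistic regression on a handful of coded documents and then ranks uncoded ones by estimated probability of relevance. Everything here is an assumption made for illustration: the bag-of-words features, the plain gradient-descent training loop, the function names, and the sample texts. Commercial predictive coding tools use their own, usually proprietary, models and far richer features.

```python
import math


def tokenize(text):
    return text.lower().split()


def score(text, weights, bias):
    """Estimated probability of relevance for one document."""
    z = bias + sum(weights.get(w, 0.0) for w in set(tokenize(text)))
    return 1.0 / (1.0 + math.exp(-z))


def train(coded_docs, epochs=200, lr=0.5):
    """Fit per-word weights on (text, label) pairs, label 1 = relevant,
    using simple stochastic gradient descent on the logistic loss."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in coded_docs:
            err = label - score(text, weights, bias)
            bias += lr * err
            for w in set(tokenize(text)):
                weights[w] = weights.get(w, 0.0) + lr * err
    return weights, bias


def rank(uncoded, weights, bias):
    """Return (probability, text) pairs, most probably relevant first."""
    return sorted(((score(t, weights, bias), t) for t in uncoded),
                  reverse=True)


# Hypothetical documents already coded by the SME (Step 4).
coded = [
    ("merger pricing memo", 1),
    ("pricing call with merger team", 1),
    ("lunch order friday", 0),
    ("holiday party friday", 0),
]
# Hypothetical documents still to be ranked (Step 5).
uncoded = ["friday lunch menu", "team call agenda", "merger pricing draft"]

weights, bias = train(coded)
ranked = rank(uncoded, weights, bias)
for probability, text in ranked:
    print(f"{probability:.2f}  {text}")
```

With this toy data the ranking comes out as "merger pricing draft" first, "team call agenda" second, and "friday lunch menu" last. In a real project the loop repeats: attorneys code the top-ranked documents, those codings are added to the training set, and the machine is retrained and re-ranks.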

6. Right Review (Step 6 – Attorneys Review and Code Documents): In this step, the attorneys review the documents that were ranked highly by the machine learning algorithm, and they code them as relevant or non-relevant. This step is important to ensure the accuracy of the machine learning algorithm, and to further refine the list of relevant documents. In step six, Review, the bulk of the work is performed. The gears graphic above shows the primary roles in a complex ESI search and review project. This review stage can include all other types of search based on new relevant documents found, not just predictive coding. For instance, based on documents found, you may want to run searches on new keywords, or similarity searches on any Highly Relevant documents found. It is also important to remember the Add Families type of search for all new relevant documents found.
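The two follow-up searches mentioned here, similarity and Add Families, can be sketched in a few lines of Python. Similarity is measured with cosine similarity over simple word counts, which is only one of many possible measures and is my assumption, not necessarily what any review tool uses. The document IDs, texts, family mapping, and the 0.5 threshold are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two texts using raw word counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def similar_to(seed_text, corpus, threshold=0.5):
    """IDs of documents at least `threshold` similar to the seed text."""
    return [doc_id for doc_id, text in corpus.items()
            if cosine(seed_text, text) >= threshold]


def add_families(hit_ids, family_of):
    """Expand hits to every document in the same family, for example
    an email together with all of its attachments."""
    families = {family_of[d] for d in hit_ids}
    return sorted(d for d, fam in family_of.items() if fam in families)


# Hypothetical Highly Relevant document found during review.
seed = "merger pricing memo final"

corpus = {
    "E1": "merger pricing memo draft",
    "E2": "friday lunch order",
    "A1": "spreadsheet of prices",
}
family_of = {"E1": "F1", "A1": "F1", "E2": "F2"}  # A1 is attached to E1

similar = similar_to(seed, corpus)
with_families = add_families(similar, family_of)
print(similar)        # ['E1']
print(with_families)  # ['A1', 'E1']
```

The attachment A1 shares no words with the seed document, so no text search would find it on its own. It is pulled in only because it belongs to the same family as E1.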

7. Right Zen QC (Step 7 – Zero Error Numerics Quality Control procedures): Zero Error Numerics Quality Control (Zen QC) is a set of procedures that are used to ensure the accuracy and completeness of the electronic document review process. This includes checking for errors, inconsistencies, and missing documents, and making sure that the process is conducted in accordance with the relevant laws and regulations. There are four elements of quality control especially designed for use with Predictive Coding.

Balanced Hybrid (Man-Machine Balance with IST): Balanced Hybrid is a method of electronic document review that combines the use of technology with human review. It involves using machine learning algorithms to identify and classify documents, and then having human reviewers (IST) review the documents that the machine is unsure about. This approach is used to achieve a balance between technology and human review, and between efficiency and accuracy. Man and machine work together without over-delegation: trust and verify, in a two-way street.
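One simple way to picture the hand-off described above is to split documents by the machine's probability score, sending the middle band, where the machine is unsure, to human reviewers. The sketch below is a hypothetical illustration only: the function name, the queue names, and the 0.25 and 0.75 cut-offs are my assumptions, and in practice high-ranked documents still go through attorney review and coding in Step 6.

```python
def route_by_confidence(scored_docs, low=0.25, high=0.75):
    """Split (doc_id, probability) pairs into three work queues.

    Scores at or above `high` and at or below `low` are ones the machine
    is fairly sure about; everything in between goes to a human.
    """
    queues = {
        "likely_relevant": [],
        "unsure_human_review": [],
        "likely_irrelevant": [],
    }
    for doc_id, probability in scored_docs:
        if probability >= high:
            queues["likely_relevant"].append(doc_id)
        elif probability <= low:
            queues["likely_irrelevant"].append(doc_id)
        else:
            queues["unsure_human_review"].append(doc_id)
    return queues


# Hypothetical machine scores from a ranking round.
scored = [("D1", 0.97), ("D2", 0.55), ("D3", 0.08), ("D4", 0.40)]
queues = route_by_confidence(scored)
print(queues["likely_relevant"])      # ['D1']
print(queues["unsure_human_review"])  # ['D2', 'D4']
print(queues["likely_irrelevant"])    # ['D3']
```

The human codings of the unsure documents then feed back into the next training round, which is the two-way street: the machine guides the humans to the documents that matter, and the humans correct the machine.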

This is an important concept, so I asked The Dude mode of Chat GPT 3.5 to come out and explain it:

Meditating Dude
Image of Self was GPT Generated

 

Balanced Hybrid, or Man-Machine Balance with IST, is like getting a good cup of coffee. You know, like, you’ve got your fancy automatic pour-over machine that can make a mean cup of joe, but you still want that human touch, so you’ve got your barista to check and adjust the temperature and the water flow to make sure it’s just right.

In the same way, the machine learning algorithms work like the automatic pour-over machine, they can identify and classify documents quickly and efficiently, but sometimes they might miss something or be unsure about a document, that’s where the IST comes in, like the barista, to check and adjust the machine’s work, to make sure that the legal documents are accurate and nothing is missed.

So, in summary, Balanced Hybrid is like having the best of both worlds, the efficiency of technology and the accuracy of human review, ensuring that The Dude’s legal document review is the best cup of coffee.

Buddha in small rust colored circle on black background

Image: Losey

Intertwined Software, SME and Method Hybrid IST Multi, Predictive Coding Quality on outer rim

Image: Losey

End Dude. Back to normal.

SME (Subject Matter Expert, typically trial counsel): Subject Matter Expert (SME) is a person who has specialized knowledge or experience in a specific area. In the context of electronic document review, SME is typically trial counsel, who is responsible for reviewing the documents and making decisions about their relevance to the case.

Method (for electronic document review): Method refers to the specific procedures and techniques that are used to conduct an electronic document review. It can include a combination of different methods, such as active machine learning, keyword search, and human review.

Software (for electronic document review): Software refers to the computer programs that are used to conduct an electronic document review. These can include machine learning algorithms, search engines, and document management systems.

8. Right Produce (Step 8 – Production of Relevant, Non-Privileged Documents): In this step, the relevant, non-privileged documents are produced to the other parties involved in the legal matter. This includes the relevant documents that have been identified and reviewed during the electronic document review process. A final search for privileged content should be run on all documents that have been identified for production to again verify that they are not privileged. All ESI produced should be triple checked before it goes out the door. Production these days is usually by FTP, so be very careful about what files you put where. This last step should never be rushed.
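The final privilege search can be pictured as one last filter over the production set. The sketch below is a minimal, hypothetical version: the term list, function name, and sample documents are invented for illustration, and a real privilege screen would also use attorney names, domains, and human review rather than keywords alone.

```python
import re

# Hypothetical privilege indicators; a real list is case-specific.
PRIVILEGE_TERMS = re.compile(
    r"\b(attorney[- ]client|privileged|work product|legal advice)\b",
    re.IGNORECASE,
)


def final_privilege_check(production_set):
    """Split (doc_id, text) pairs marked for production into documents
    cleared to go out and documents held back for a second look."""
    cleared, held = [], []
    for doc_id, text in production_set:
        if PRIVILEGE_TERMS.search(text):
            held.append(doc_id)
        else:
            cleared.append(doc_id)
    return cleared, held


# Hypothetical documents coded relevant and not privileged so far.
production_set = [
    ("P1", "Q3 pricing memo for the merger team"),
    ("P2", "Forwarding counsel's note. Privileged and confidential."),
    ("P3", "Draft prepared as attorney work product for the hearing"),
]

cleared, held = final_privilege_check(production_set)
print(cleared)  # ['P1']
print(held)     # ['P2', 'P3']
```

Anything in the held list goes back to an attorney for a decision. The check errs on the side of holding documents back, since a document wrongly held can still be produced later, while a privileged document wrongly produced is much harder to fix.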

How All of the 8-Steps Work Together

AI Enhanced Ralph Losey

Ralph Losey 2022 AI enhanced

The process of electronic document review involves the combination of all the steps discussed to ensure that all relevant and non-privileged documents are identified and produced. The process starts with relevance dialogues between the trial counsel and other parties to understand the scope of the review and to identify the relevant documents. (“a little chit-chat with your homies about what’s relevant and what’s not.”) After that, various methods such as active machine learning, keyword search, focused linear search, and other methods are used to identify potentially relevant documents in the ECA step. (“use all the fancy tech tools at your disposal to sift through the documents like a boss.”)

In the next step, a random sample of the documents is chosen to provide an estimate of the prevalence of relevant documents within the entire set. This is done to get an idea of the amount of work that needs to be done. (“You take a random sample, just to get a feel for the lay of the land.”) After that, the documents selected from the random sample and ECA are used to train the machine learning algorithm, which allows the machine to learn and improve its classification abilities. Then, the machine learning algorithm ranks the documents based on their probability of being relevant. (“You train that machine learning algorithm like a Zen master, until it’s ranking documents like a pro.”)

The next step is to have the attorneys review the documents that were ranked highly by the machine learning algorithm, and they code them as relevant or non-relevant. (“you let the attorneys have a look-see, and make sure everything’s on point.”) This step is important to ensure the accuracy of the machine learning algorithm, and to further refine the list of relevant documents.

After that, Zero Error Numerics Quality Control (Zen QC) is used as a set of procedures to ensure the accuracy and completeness of the electronic document review process. This includes checking for errors, inconsistencies, and missing documents, and making sure that the process is conducted in accordance with the relevant laws and regulations. (“you apply some Zen-like QC to make sure there’s no errors or inconsistencies.”)

ZEN QC factors

ZEN QC Factors, Image: Losey

Finally, the relevant, non-privileged documents are produced to the other parties involved in the legal matter. (“you produce those relevant documents like a true compliance ninja.”) This includes the relevant documents that have been identified and reviewed during the electronic document review process.

All of these steps come together to ensure that justice is done by identifying and producing all relevant and non-privileged documents, which is essential for fair and accurate legal decision-making. The use of technology such as machine learning algorithms and quality control procedures helps to increase the efficiency and accuracy of the document review process, while the involvement of human reviewers (IST) ensures that the process is conducted in a fair and impartial manner. (“It’s like achieving legal enlightenment, but better, because you don’t have to sit on a hard floor for hours.”)

Conclusion

Colorful angel with wings and a gavel, AI and GPT

Image is GPT Generated

The eight steps in the electronic document review process include relevance dialogues, early case assessment using various methods, prevalence range estimate, selecting documents for training machine, machine ranking documents, attorneys reviewing and coding documents, Zero Error Numerics Quality Control procedures and producing relevant, non-privileged documents.

All of these steps come together to ensure that justice is done by identifying and producing all relevant and non-privileged documents which is essential for fair and accurate legal decision-making. It’s like achieving legal enlightenment, but better, because you get paid good money for doing it.

Finally, a note from Ralph: these eight steps for document review have nothing to do with the Noble Eightfold Path of Buddhism. This Path is a fundamental precept of Buddhism, and it seems to me disrespectful not to at least mention its elements: Right View, Right Resolve, Right Speech, Right Action, Right Livelihood, Right Effort, Right Mindfulness, and Right Concentration. I’ll give the Dude version of Chat GPT 3.5 a chance to summarize.

Meditating Dude
Image 8 is GPT Generated with Ralph Prompt

The noble eightfold path of Buddhism is like the ultimate guide to being a chill and enlightened human being.

You know, like, first, you’ve got the Right View, that’s like understanding the nature of reality and not getting caught up in illusions. Then you’ve got the Right Resolve, that’s like setting your intentions to be a good person and not just doing things for yourself.

Next up is Right Speech, that’s like choosing your words wisely and not being a jerk to people. And Right Action, that’s like living your life according to your intentions and not causing harm to others.

Right Livelihood is like having a job that aligns with your values and not exploiting others. Right Effort is like putting in the work to improve yourself.

Right Mindfulness is like being present in the moment and not getting lost in your thoughts. And last but not least, Right Concentration, that’s like focusing your mind and achieving a state of inner peace.

So, in summary, The Dude’s eightfold path is like a road map to living a good and enlightened life, taking the right actions, having the right intentions and the right mindset, to achieve inner peace, and The Dude’s definitely down for that.
