Generative AI for Smart Discovery Professionals: An Introduction to Large Language Models – Second Edition

GenAI for Smart Discovery Professionals by John Tredennick and Dr. William Webber
Image: John Tredennick with AI.

[EDRM Editor’s Note: The opinions and positions are those of John Tredennick and Dr. William Webber.] 


In November 2022, ChatGPT upended our thinking about artificial intelligence with a new form of machine learning called Generative AI (GenAI). Since then, discussions about GenAI models like GPT have taken center stage in nearly every field, especially in the legal profession. Today, most legal publications feature articles about how GenAI will change the legal profession, and vendors at legal conferences tout their GenAI-powered software.

This Guide introduces Generative AI for Smart Discovery and Investigation Professionals. It is designed for the smart individuals in our profession who want to learn the basics of how GenAI models like GPT work and how they can improve discovery workflow. While the underlying algorithms may be complex, we can learn enough about their function to put them to use in our practices.

The book is divided into two parts. Part 1 explores the fundamentals of GenAI and Large Language Models (LLMs), including key concepts like training, context windows, data security, and the potential for hallucinations in AI-generated content. Part 2 focuses on practical applications, demonstrating how GenAI and LLMs can streamline tasks such as document review, analysis, and transcript summarization in discovery and investigation workflows.

Our goal is to teach smart discovery professionals how GenAI models operate and how to use them for more efficient and effective work. Even if you are not focused on investigations and discovery, you should find this book interesting and helpful. While the examples in Part 2 are geared towards finding information in large document sets, the capabilities can be applied to a wide range of information needs. By understanding and harnessing the power of GenAI, professionals across various domains can unlock new levels of efficiency, insight, and innovation in their work.

Let’s get going. We hope you enjoy the ride.

John Tredennick and Dr. William Webber


TABLE OF CONTENTS

Introduction

PART 1: What is Generative AI and How Does it Work?

  • What is GPT?
  • What are LLMs?
    • LLMs: Modern Supercomputers
    • Training an LLM
    • Training Cutoff
    • If an LLM Has No Memory, How Does it Carry on a Conversation?
    • The Role of ChatGPT
    • The Importance of Context Windows
    • Carrying on a Conversation
  • The Context Window Size is Limited
    • Enlargements in Context Window Sizes
  • Is the Data We Send to GPT Secure?
    • Can an LLM Share Confidential Information, Even by Accident?
  • What about Hallucinations?
    • Reducing Chances of Hallucination
    • DiscoveryPartner, An Advanced RAG System

PART 2: Using Generative AI and Large Language Models for Investigation and Discovery

  • Using LLMs to Read, Analyze, and Report on Documents
    • The Assignment
    • Key Issues
    • Key Individuals and Organizations
  • Using LLMs to Answer Questions about Deposition and Hearing Transcripts
    • Deposition Summaries
    • Searching Across Transcripts
  • Conclusion
  • Key GenAI Terms Smart Discovery Professionals Should Know
  • About the Authors
  • About Merlin Search Technologies


Assisted by GAI and LLM Technologies per EDRM GAI and LLM Policy.

Authors

  • John Tredennick

    John Tredennick (JT@Merlin.Tech) is the CEO and founder of Merlin Search Technologies, a software company leveraging generative AI and cloud technologies to make investigation and discovery workflows faster, easier, and less expensive. Prior to that, he was founder and CEO of Catalyst Repository Systems, which he sold to a public company in early 2019. For the first 20 years of his career, he was a trial lawyer and litigation partner at a national law firm. Tredennick is a prolific speaker and writer. Over the past 30 years, he has written eight books and countless articles on legal technology topics, including “TAR for Smart People” (3rd Ed.), two ABA bestsellers “Winning with Computers” (Vols. One and Two), “How to Prepare For, Take and Use a Deposition at Trial” (James Publishing), and several editions of “The Lawyer’s Guide to Spreadsheets.” Tredennick has served as a Chair of the ABA’s Law Practice Management Section and Editor in Chief of its flagship magazine. He is currently active with EDRM and the Sedona Conference, with a lead drafting role on many of their publications.

  • William Webber

    Dr. William Webber (wwebber@Merlin.Tech) is the Chief Data Scientist of Merlin Search Technologies. With a PhD in Measurement in Information Retrieval Evaluation from the University of Melbourne, Dr. Webber is a leading authority in AI and statistical measurement for information retrieval and ediscovery. He has conducted post-doctoral research at the E-Discovery Lab of the University of Maryland and has over 30 peer-reviewed scientific publications in the areas of information retrieval, statistical evaluation, and machine learning. Dr. Webber has nearly a decade of industry experience as a consulting data scientist for ediscovery software vendors, service providers, and law firms.
