Since its debut in late 2022, ChatGPT has become the focal point of discussions in almost every field that involves some form of computing. The advent of these new generative large language models (LLMs) heralds a transformative era, comparable to the groundbreaking inventions of the printing press, steam engine, cell phones, and the Internet. It also raises a number of important questions regarding the risks inherent in this new generative AI technology, both now and in the future.
Our focus today is on tasks relating to investigations and ediscovery. We hope to illustrate how modern search platforms can take advantage of a generative AI system like GPT to reengineer the discovery workflow, drastically improving its efficiency and cost-effectiveness.
For these exercises we will be using B2, an internal system created in the Merlin lab to explore how we might best integrate the analytical power of GPT in a traditional discovery platform. Our goal is to identify relevant documents in response to a natural language prompt and submit them to GPT for analysis and response.
B2’s job is to:
- Comprehend the initial prompt and identify relevant documents from the database that could aid in responding to it.
- Distill these documents into a summary that aligns with the informational requirements of the initial prompt.
- Condense this information into a form that can be fed into GPT for response. Depending on the complexity and volume of the information, this may necessitate breaking our efforts into a series of prompts and responses.
- Receive GPT’s response, which should be based on both the reviewed information and the content of the original prompt.
- Record this information, and transmit it back to the user.
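The five steps above can be sketched as a single function. This is only an illustrative outline, not B2's actual code: the names `answer_prompt`, `search`, `summarize`, and `ask_llm` are hypothetical, and the search engine, summarizer, and LLM are passed in as plain callables so the sketch stays self-contained. It also assumes everything fits in one submission, which a real system could not take for granted.

```python
from typing import Callable, Dict, List


def answer_prompt(prompt: str,
                  search: Callable[[str], List[str]],
                  summarize: Callable[[str, str], str],
                  ask_llm: Callable[[str], str],
                  log: List[Dict[str, str]]) -> str:
    """Run one prompt through the five steps (all collaborators are hypothetical)."""
    # Step 1: find documents that could help answer the prompt.
    documents = search(prompt)
    # Step 2: summarize each document with the prompt's needs in mind.
    summaries = [summarize(prompt, doc) for doc in documents]
    # Step 3: condense the summaries into one submission for the model.
    context = "\n\n".join(summaries)
    # Step 4: ask the model to answer from the supplied text and the prompt.
    response = ask_llm(f"{context}\n\nQuestion: {prompt}")
    # Step 5: record the exchange and hand the answer back to the user.
    log.append({"prompt": prompt, "response": response})
    return response
```

Passing the collaborators in as arguments keeps the orchestration independent of any particular search engine or model, which is one plausible way to let the same workflow run against different LLMs.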
These steps are complicated by a feature known as a “context window.” This is akin to GPT’s working memory, which includes the prompt, the text it analyzes and its subsequent response. A software system like B2 must be cognizant of the memory limitations of each LLM it interacts with, tailoring the text it feeds into the LLM to stay within the boundaries of the context window. Limited context windows require that the system hold pieces of information in separate storage until they are needed.
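To make the context window constraint concrete, here is a minimal sketch of how summaries might be grouped into batches that each fit within a window. It is an assumption-laden illustration rather than a description of B2: the function names are hypothetical, and the four-characters-per-token estimate is a rough rule of thumb, not a real tokenizer.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def pack_into_batches(summaries: List[str], window: int,
                      prompt_tokens: int, reply_tokens: int) -> List[List[str]]:
    """Group summaries so that each batch fits in the context window.

    The window has to hold the prompt, the supplied text, and the reply,
    so only window - prompt_tokens - reply_tokens is left for the text.
    """
    budget = window - prompt_tokens - reply_tokens
    if budget <= 0:
        raise ValueError("prompt and reply reserve exceed the context window")

    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for summary in summaries:
        cost = estimate_tokens(summary)
        if cost > budget:
            raise ValueError("a single summary exceeds the available budget")
        # Start a new batch when adding this summary would overflow the budget.
        if used + cost > budget and current:
            batches.append(current)
            current, used = [], 0
        current.append(summary)
        used += cost
    if current:
        batches.append(current)
    return batches
```

Batches that are not being sent yet simply wait in the returned list, which mirrors the idea of holding information in separate storage until it is needed.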
From Search Hits to Discovery Answers
Before we jump into specifics, let us offer this observation. The integration of an LLM like GPT into a discovery platform marks the beginning of a new era for investigations and ediscovery.
Up until now, keywords were the primary means to find relevant documents. The search engine’s job was to locate potential candidates that might be responsive to your information needs. Once this step was complete, the search engine’s job ended. The onus then fell on you and your team to read, analyze, and interpret the results, a process that was often tedious, time-consuming, and expensive.
Generative AI systems like GPT can take discovery beyond simple search. They can analyze result sets and use them to answer questions or otherwise provide meaningful information in response to your prompt.
Thus, for the first time in history, we have at our fingertips a generative AI system that can assist with the second half of the discovery process: ESI review and analysis. The second half is where all the money (and the time) goes.
Keep an eye on how well B2 performs its tasks through these exercises. We believe you’ll be impressed with its analytical prowess and capabilities.
Our Topic for This Exercise
Our discovery collection consists of approximately 300,000 emails that Jeb Bush made public from his two terms as Governor of Florida. NIST used these emails for several of its annual Text Retrieval Conferences (TREC). Among other things, the program coordinators and a team of reviewers went through the emails and created a series of topics to be used for research.
For this exercise, we will focus on this legal track topic:
Slot Machines — All documents concerning the definition, legality, and licensing of “slot machines” in Florida.
We will ask B2 to find relevant documents, summarize and synthesize their content and send them on to GPT for analysis and response.
Let’s start by asking GPT to give us an overview of discussions around our topic.
Prompt: I am looking for information on the following topic: Slot Machines — All documents concerning the definition, legality, and licensing of “slot machines” in Florida. Please identify and summarize the discussion over different concerns in a numbered list. Identify the people involved with the issue.
GPT:
1. The state should own machines to ensure that Florida’s families are put first and to increase the state’s take. (Document 210965)
2. “Slot Machines” should be defined as class II gaming devices and class III gaming devices should be outlawed to avoid Indian Nations upgrading to class III and to remove the pressure for the Gov. to negotiate with them. (Document 210965, Document 210748)
3. “Revenue” should be defined as any money put into machines to avoid taxation on profits and to get the money upfront. (Document 210965, Document 210803)
4. Near misses or any programming that causes the machine to act in a way that is not purely random should be prohibited. (Document 210965, Document 210748)
5. Those who manufacture or distribute the slot machines should not have any ownership or profit sharing in Para mutual to avoid making industry-friendly machines. (Document 210965, Document 210748)
6. Alcohol should be prohibited in the casino area. (Document 210965, Document 210748)
7. Minors should be prohibited from entering the casino. (Document 210965, Document 210748)
8. ATM’s should be prohibited. (Document 210965, Document 210748)
9. Felons should not own, operate, vend to, or work in any way (inc independent contractors) for casinos to help police organized crime infiltration. (Document 210965, Document 210748)
10. Slot Machines may be placed only in existing Para mutual buildings to honor the commitment made by the industry during the Nov election. (Document 210965, Document 210748)
11. DRI should be required for any changes if number 10 is not a part of the bill. (Document 210748)
12. Hours of operation should not exceed 12 am to 12 pm and only on days when para-mutuals are currently allowed to operate. (Document 210965, Document 210748)
13. Slot machines should only accept and remit tokens, not cash. (Document 210965, Document 210748)
14. Single bets should be limited to $2 and series wagers should not exceed $8. (Document 210965, Document 210748)
15. The state should have the right to permanently revoke a license with habitual rule-breaking Para-mutuals. (Document 210965, Document 210748)
16. The tax rate should be set at 50% if the state does not own machines to put us within the reasonable range of taxation countrywide. (Document 210965, Document 210748)
17. Out-of-state ownership or interest should be prohibited if state ownership is not part of the bill. (Document 210965, Document 210748)
18. The outside deal cut between county commission and Para-mutual industry should be paid for out of Para-mutual pockets, and not before taxation. (Document 210748)
19. There have been conflicting court rulings on whether modified slot machines, which require players to stop the computer themselves, violate the slot machine prohibition. (Document 253501, Document 253557)
20. The Department of Business and Professional Regulation (DBPR) has no regulatory authority in the area of modified slot machines, and enforcement efforts vary among jurisdictions. (Document 253557)
21. The legislative intent of voters on the issue of implementing slot machines pursuant to Amendment 4 should be considered, including the rate at which machines are to be taxed, whether the machines should be Class 2 or Class 3, and whether a repeal amendment should be considered. (Document 255936)
22. Advertisements of proponents of Amendment 4 could be used to determine what voters wanted, and they show that voters believed they were voting to “tax and regulate slot machines at existing facilities” to generate “$500 million” for schools. (Document 255936)
People involved in the issue include Randy Johnson, Allan Bense, Jeb Bush, Dorothy Hukill, Cindy L. Gas, Simone Marstiller, Raquel Rodriguez, Patricia Levesque, Janice Gilley, John Sowinski, and legislators in Florida.
The issue of the definition, legality, and licensing of slot machines in Florida has been discussed by various individuals and committees over the years. The concerns raised include the ownership of machines, the definition of “slot machines,” the definition of “revenue,” the prohibition of near misses or any programming that causes the machine to act in a way that is not purely random, the prohibition of those who manufacture or distribute the slot machines from having any ownership or profit sharing in Para mutual, the prohibition of alcohol and minors in the casino area, the prohibition of ATM’s, the prohibition of felons from owning, operating, vending to, or working in any way for casinos, the placement of slot machines only in existing Para mutual buildings, the requirement of DRI for any changes if number 10 is not a part of the bill, the limitation of hours of operation, the acceptance and remittance of tokens instead of cash, the limitation of single bets and series wagers, and the right of the state to permanently revoke a license with habitual rule-breaking Para-mutuals.
There have also been discussions about the legality of modified slot machines, which require players to stop the computer themselves, and the lack of regulatory authority of the DBPR in this area. Additionally, the legislative intent of voters on the issue of implementing slot machines pursuant to Amendment 4 has been considered, including the rate at which machines are to be taxed, whether the machines should be Class 2 or Class 3, and whether a repeal amendment should be considered. Advertisements of proponents of Amendment 4 have been used to determine what voters wanted, and they show that voters believed they were voting to “tax and regulate slot machines at existing facilities” to generate “$500 million” for schools.
The people involved in the discussion include Randy Johnson, Allan Bense, Jeb Bush, Dorothy Hukill, Cindy L. Gas, Simone Marstiller, Raquel Rodriguez, Patricia Levesque, Janice Gilley, John Sowinski, and legislators in Florida. The discussions have taken place over several years, with documents dating back to 1999 (Document 025484) and continuing through 2005 (Document 210965, Document 210803, Document 210748, Document 253501, Document 253557) and 2006 (Document 255936).
This is an interesting and well-written synthesis of some of the concerns about slot machines in Florida. GPT not only summarized the points made and the people making them, but provided links to the documents used to develop its answer.
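Because the response cites its sources in a regular "(Document 210965)" style, those references can be pulled out and matched back to the underlying records. The snippet below is a small sketch of that idea; the pattern is inferred from the sample output above, and the function name `cited_documents` is our own, hypothetical choice.

```python
import re
from typing import List

# Matches references such as "Document 210965" in a model response.
DOC_REF = re.compile(r"Document (\d+)")


def cited_documents(response: str) -> List[str]:
    """Return the distinct document IDs cited in a response, in first-seen order."""
    seen: List[str] = []
    for doc_id in DOC_REF.findall(response):
        if doc_id not in seen:
            seen.append(doc_id)
    return seen
```

IDs are kept as strings so that a leading zero, as in Document 025484, survives the round trip.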
Imagine having something like this handed to you at the beginning of an investigation (or at the start of a case). Prior to generative AI, it might take an attorney-led team hours (or even days) to find, analyze and report out this kind of information. GPT can do it in minutes.
Using GPT to Summarize Documents
During the course of responding to prompts, we ask GPT to summarize each document it finds, providing information about the people involved and communication dates. Summaries aren’t meant to be a substitute for the original documents but they can surely be helpful as a starting point.
Here are several examples that GPT prepared before crafting its response. They will give you an idea of GPT’s capabilities in this regard:
Docid: 210748 Score: 0.862 Date: March 10, 2005 People involved: Randy Johnson, Allan Bense, Jeb Bush
Summary: Randy Johnson sent a note to Allan Bense and cc’d Jeb Bush with ideas for Slots enabling legislation in Florida. The note includes the following concerns:
1. The state should own machines to ensure that Florida’s families are put first.
2. Define “Slot Machines” as class II gaming devices and outlaw class III gaming devices to avoid Indian Nation upgrading to class III and remove pressure for the Gov. to negotiate with them.
3. Define “revenue” as any money put into machines to avoid taxation on profits.
4. Prohibit near misses or any programming that causes the machine to act in a way that is not purely random.
5. Prohibit those who manufacture or distribute the slot machines from having any ownership or profit sharing in Para mutual to avoid making industry-friendly machines.
6. Alcohol should be prohibited in the casino area.
7. Minors should be prohibited from entering the casino.
8. ATM’s should be prohibited.
9. Felons should not own, operate, vend to, or work in any way (inc independent contractors) for Casinos to help police organized crime infiltration.
10. Slot Machines may be placed only in existing Para mutual buildings to honor the commitment made by the industry during the Nov election.
11. DRI should be required for any changes if number 10 is not a part of the bill.
12. Hours of operation should not exceed 12 am to 12 pm and only on days when para-mutuals are currently allowed to operate.
13. Slot machines should only accept and remit tokens, not cash.
14. Single bets should be limited to $2, and series wagers should not exceed $8.
15. Give the state the right to permanently revoke a license with habitual rule-breaking Para-mutuals.
16. Set the tax rate at 50% if the state does not own machines.
17. Prohibit out-of-state ownership or interest if state ownership is not part of the bill.
18. The outside deal cut between county commission and Para-mutual industry should be paid for out of Para-mutual pockets, and not before taxation.
19. Broward County community leaders should be engaged as partners in this legislation.
20. The legislation should be self-repealing if amendment 4 is repealed from the constitution in the future.
Docid: 210803 Score: 0.852
Summary: This is an email from Dorothy Hukill, a member of the Business Regulation Committee in the Florida House, to Governor Jeb Bush and Chris Flack regarding the definition of “revenue” as it relates to slot machines. Hukill believes that the legal definition of revenue should include all money that comes into the slot machine, while the proponents of slot machines define revenue as what is left in the machine after payout. Hukill is considering referring this issue to legal counsel for further research.
Discussion:
1. The Business Regulation Committee in the Florida House is discussing how much tax should be levied on slot machines.
2. There is a disagreement over the definition of “revenue” as it relates to slot machines.
3. Hukill believes that the legal definition of revenue should include all money that comes into the slot machine.
4. Hukill is considering referring this issue to legal counsel for further research.
5. Governor Jeb Bush and Chris Flack are copied on this email.
6. The email was sent on March 10, 2005.
7. The main people involved in this issue are Dorothy Hukill, Governor Jeb Bush, and Chris Flack.
Docid: 207523 Score: 0.838
Summary: This is an email sent by Jeb Bush on March 9, 2005, to Raquel Rodriguez, Chris Flack, and Alan Philp regarding the issue of slot machines in Florida. In the email, Bush suggests that they meet with the pro-slots group to try and narrow their differences and pass an implementing bill. The email also includes a disclaimer that all written communications regarding state business are public records available to the public and media upon request.
Discussion over different concerns:
1. The legality of slot machines in Florida.
2. The definition of slot machines in Florida.
3. The licensing of slot machines in Florida.
People involved:
1. Jeb Bush – sender of the email.
2. Raquel Rodriguez – recipient of the email.
3. Chris Flack – recipient of the email.
4. Alan Philp – recipient of the email.
Docid: 244365 Score: 0.837 Date: October 10, 2005 People involved: Daniel K. Adkins, Jeb Bush, Raquel Rodriguez, Simone Marstiller, Patricia Levesque, and others.
1. Daniel K. Adkins, the owner of a greyhound racing track, expresses concern over the proliferation of unregulated gambling machines in Florida, which he claims are taking advantage of seniors.
2. Adkins suggests that existing statutes be cleaned up to remove the “gray” area and make violations a felony.
3. Jeb Bush asks Adkins how he suggests dealing with the issue.
4. Adkins proposes that the state regulate machine gambling and limit it to areas approved by voters, which would reduce gambling.
5. Adkins suggests that the same package regulating the voter-approved Broward tracks should also shut down these unregulated operations.
6. The article attached to the email discusses the opening of a penny arcade in Lauderhill, Florida, which offers all-adult gaming machines that look like slots and spin like slots.
7. Concerns are raised over safety, the elderly spending too much time and money in arcades with little regulation, and the potential for players to develop gambling problems.
8. The Florida Arcade Association argues that these arcades are not equivalent to gambling and provide a low-cost form of entertainment for seniors.
9. The article notes that the state benefits from these arcades, as it receives 4% of the gross earnings.
B2’s document summaries strike us as hugely valuable for an investigation or discovery effort. They can be persisted for later viewing or exported as part of a report.
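As a rough illustration of what persisting and exporting summaries could look like, the sketch below models one summary as a small record and writes a list of them out as JSON. The field names follow the labels visible in the examples above (Docid, Score, Date, People involved, Summary), but the `DocSummary` class and `export_summaries` function are hypothetical and are not taken from B2.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List, Optional


@dataclass
class DocSummary:
    """One generated summary; Date and People are optional, as in the samples."""
    docid: str
    score: float
    summary: str
    date: Optional[str] = None
    people: List[str] = field(default_factory=list)


def export_summaries(records: List[DocSummary]) -> str:
    """Serialize summaries to JSON, highest relevance score first."""
    ordered = sorted(records, key=lambda r: r.score, reverse=True)
    return json.dumps([asdict(r) for r in ordered], indent=2)
```

A plain JSON export like this can be stored alongside the case file or loaded into a report template later.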
Using GPT to Synthesize Information
Along with summarization, GPT does an excellent job at synthesizing information. In this case we will ask GPT to report on concerns regarding slot machines. This request is similar to our starting request but we will give GPT a bit more room to respond here. It will also give you a chance to see how GPT responds to different types of prompts.
Prompt: What were the concerns about modified slot machines and the need for regulation and customer protection in Florida? Describe the concerns, who did they affect and who was concerned. Summarize the pros and cons regarding these concerns.
GPT: The concerns about modified slot machines in Florida were related to their legality and the need for regulation and customer protection. The main concern was that these machines violated the prohibition on slot machines, which could lead to unregulated gambling and potential harm to customers. There were also concerns about the lack of regulation and customer protection for these machines, which could lead to fraud, theft, and other criminal activities. (Document 253570)
The need for regulation and customer protection was felt by various stakeholders, including government officials, law enforcement agencies, and community leaders. They were concerned about the potential harm to customers, especially the elderly, who could spend too much time and money in arcades with little regulation. They were also concerned about the lack of oversight and regulation for unlicensed operators, which could lead to criminal activities and other forms of harm. (Document 244436, Document 244426, Document 244423)
The pros of regulating modified slot machines were that it could help prevent unregulated gambling, protect customers from fraud and theft, and provide oversight and regulation for unlicensed operators. (Document 253570) The cons were that it could limit the availability of these machines, which could be popular with customers, and could lead to increased costs for operators who would have to comply with regulations. (Document 253570)
Overall, the concerns about modified slot machines and the need for regulation and customer protection in Florida were related to the potential harm to customers and the lack of oversight and regulation for unlicensed operators. The pros of regulating these machines were related to preventing unregulated gambling and protecting customers, while the cons were related to limiting availability and increasing costs for operators. The main stakeholders involved in these concerns were government officials, law enforcement agencies, and community leaders. (Document 253570, Document 244436, Document 244426, Document 244423)
In addition to the concerns about modified slot machines, there were also concerns about unregulated gambling machines, also known as “gray machines,” and the need for regulation and customer protection in that area as well. The concerns about unregulated gambling machines were similar to those related to modified slot machines, including the potential harm to customers and the lack of oversight and regulation for unlicensed operators. (Document 244436, Document 244426, Document 244423)
The pros of regulating unregulated gambling machines were related to preventing unregulated gambling and protecting customers from fraud and theft. The cons were related to limiting availability and increasing costs for operators who would have to comply with regulations. (Document 244436, Document 244426, Document 244423)
The stakeholders involved in these concerns were government officials, law enforcement agencies, and community leaders. They were concerned about the potential harm to customers, especially the elderly, and the lack of regulation and oversight for unlicensed operators. They were also concerned about the potential for criminal activities and other forms of harm related to unregulated gambling. (Document 244436, Document 244426, Document 244423)
Overall, the concerns about modified slot machines and unregulated gambling machines in Florida were related to the potential harm to customers and the lack of regulation and oversight for unlicensed operators. The pros of regulating these machines were related to preventing unregulated gambling and protecting customers, while the cons were related to limiting availability and increasing costs for operators. The main stakeholders involved in these concerns were government officials, law enforcement agencies, and community leaders. (Document 253570, Document 244436, Document 244426, Document 244423)
You can quickly see the value of these types of reports.
Using GPT to Identify Related Statutes and Regulations
We have seen references to a number of rules and regulations that bear on these discussions. Let’s ask GPT to summarize them.
Prompt: What statutes, regulations and rules are implicated by these discussions? Summarize and provide the citations to each and tell me who is talking about them and why.
GPT: The discussions related to the definition, legality, and licensing of slot machines in Florida implicate several statutes, regulations, and rules, including:
1. Chapter 849, F.S.: This statute defines gambling and provides for the regulation of gambling activities in Florida. The discussions related to the definition of “revenue” in relation to slot machines and the need to clarify the language to preclude the practice of modified machines that violate the slot machine prohibition implicate this statute. (Document 253570)
2. DBPR (Department of Business and Professional Regulation): This agency is responsible for regulating and licensing various businesses and professions in Florida, including gambling activities. The discussions related to giving regulatory authority to the DBPR over unregulated gambling machines and the need for regulation and customer protection implicate this agency. (Document 253570, Document 244436, Document 244426, Document 244423)
3. Felony statutes: The discussions related to making violations of existing statutes a felony implicate the criminal statutes in Florida that provide for felony charges for certain offenses. (Document 253570)
4. Voter approval: The discussions related to limiting machine gambling to areas approved by voters implicate the need for voter approval for certain types of gambling activities in Florida. (Document 253570)
The stakeholders involved in these discussions were government officials, law enforcement agencies, and community leaders. They were discussing these statutes, regulations, and rules in order to determine the best course of action for regulating and overseeing the gambling industry in Florida. The discussions centered around the need for regulation and customer protection, as well as the potential harm to customers and the lack of oversight and regulation for unlicensed operators.
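A list like this, tied back to the cited documents, gives the team a quick starting point for identifying the authorities at issue and who raised them.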
Using GPT to Create an Investigation Report
Our final step is to use GPT to create what we would call an investigation report. Assume that an investigator has found a number of important documents through any of a variety of methods (e.g., keyword searches or witness meetings). Rather than asking a series of questions about the documents, the investigator could simply ask for a comprehensive report designed to meet the needs of the investigation.
The prompt might ask GPT to include this kind of information:
- Individuals involved, including their roles
- A timeline of discussions
- Summaries of the issues involved
- Rules and regulations involved in the discussion
Take a look at the report we received.
Prompt: Prepare an investigation report regarding the information you have found. Present it in the following sections:
Section 1: A bullet point listing of the people involved including their positions (if you have them) and their roles in this discussion.
Section 2: A bullet point timeline of the emails you have reviewed by date ascending. Give me a two line summary of each including From and To.
Section 3: Summarize the issues involved and the pros and cons discussed for each.
Section 4: List any rules, regulations or statutes being discussed with a summary of how each relates to this discussion.
GPT:
Section 1:
- Jeb Bush, Governor of Florida
- Raquel Rodriguez, Deputy Chief of Staff for Policy
- Patricia Levesque, Deputy Chief of Staff for Education
- Simone Marstiller, Secretary of the Florida Department of Business and Professional Regulation
- Daniel K. Adkins, President of Mardi Gras Gaming
- John Sowinski, President of No Casinos
- Cindy Gass, Florida resident
- Allan Bense, Speaker of the Florida House of Representatives
- Chris Flack, Communications Director for the Florida House of Representatives
- Dorothy Hukill, Member of the Business Regulation Committee in the Florida House of Representatives
- Judge Moe, Florida judge
Section 2:
- June 23, 1999: Email from Cindy Gass to Jeb Bush regarding the law in Florida that prohibits the purchase of slot machines unless they are over 30 years old.
- March 10, 2005: Email from Randy Johnson to Allan Bense and Jeb Bush regarding ideas for Slots enabling legislation in Florida.
- March 10, 2005: Email from Dorothy Hukill to Jeb Bush and Chris Flack regarding the discussion of how much tax should be levied on slot machines in Florida.
- November 7, 2005: Email from Simone Marstiller to Jeb Bush, Raquel Rodriguez, and Patricia Levesque discussing the legality of modified slot machines in Florida.
- October 10, 2005: Email exchange discussing the issue of unregulated gambling machines, also known as “gray machines,” in Florida.
Section 3: The issues involved in these emails include the definition, legality, and licensing of slot machines in Florida, as well as the need for regulation and customer protection. The pros of regulating these machines were related to preventing unregulated gambling and protecting customers, while the cons were related to limiting availability and increasing costs for operators. The main stakeholders involved in these concerns were government officials, law enforcement agencies, and community leaders. They were concerned about the potential harm to customers, especially the elderly, and the lack of regulation and oversight for unlicensed operators.
Section 4: The rules, regulations, and statutes being discussed in these emails include Chapter 849, F.S., which permits “arcade amusement centers” and truck stops to operate “amusement games or machines” that require “application of skill” to win. There were also discussions about the definition of “revenue” in relation to slot machines and the need to clarify the language to preclude the practice of modified machines that violate the slot machine prohibition. Additionally, there were discussions about the need to give regulatory authority to the Department of Business and Professional Regulation (DBPR) over unregulated gambling machines, and the possibility of requiring unlicensed operators to pay a fee to the state. There were also discussions about the need to limit machine gambling to areas approved by voters to reduce gambling and the possibility of making violations of existing statutes a felony. Overall, the discussions centered around the need for regulation and customer protection in the gambling industry in Florida.
It is worth noting that the issue of unregulated gambling machines, also known as “gray machines,” was also a concern in these emails. The lack of regulation and oversight for unlicensed operators was a major concern, as it could lead to criminal activities and other forms of harm related to unregulated gambling. The main stakeholders involved in these concerns were government officials, law enforcement agencies, and community leaders. They were concerned about the potential harm to customers, especially the elderly, and the need for regulation and oversight to prevent unregulated gambling.
In conclusion, the emails reviewed in this investigation report highlight the concerns and discussions related to the definition, legality, and licensing of slot machines in Florida, as well as the need for regulation and customer protection in the gambling industry. The stakeholders involved in these discussions include government officials, law enforcement agencies, and community leaders, who were concerned about the potential harm to customers and the lack of regulation and oversight for unlicensed operators. The discussions centered around the need for regulation and customer protection in the gambling industry in Florida, including the possibility of clarifying existing statutes, giving regulatory authority to the DBPR, and limiting machine gambling to areas approved by voters.
We plan to develop a series of report formats and content options to standardize the process. At any point, you can create your own report format simply by specifying what information you need and how you want to see it. Imagine having these kinds of capabilities at your fingertips.
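One simple way to think about a reusable report format is as a list of section descriptions that gets turned into a prompt. The sketch below illustrates that idea using the wording of the prompt shown earlier; the function name `build_report_prompt` is hypothetical, and this is not a description of how B2 will actually standardize its reports.

```python
from typing import List


def build_report_prompt(sections: List[str]) -> str:
    """Assemble a report prompt from an ordered list of section descriptions."""
    lines = [
        "Prepare an investigation report regarding the information you have found.",
        "Present it in the following sections:",
    ]
    # Number the sections in the order the user supplied them.
    for number, description in enumerate(sections, start=1):
        lines.append(f"Section {number}: {description}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_report_prompt([
        "A bullet point listing of the people involved and their roles.",
        "A bullet point timeline of the emails reviewed, by date ascending.",
    ]))
```

Changing the report then means editing the list of section descriptions rather than rewriting the whole prompt.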
Conclusion
Welcome to a new era of discovery, one where we can quickly move from search hits to discovery answers with the extraordinary power of artificial intelligence. By seamlessly integrating an LLM with an algorithmic search engine like Sherlock, legal professionals can harness the immense capabilities of large language models to streamline discovery processes and quickly dive deeper into the key documents that are most relevant for the case.
The applications we have explored in this article merely scratch the surface of an LLM’s potential to assist in investigations and discovery efforts. Using an integrated system like B2 to help synthesize information, summarize documents, answer questions, and create investigation reports will make investigations and discovery more efficient, improving outcomes and saving time and money.
Looking ahead, we believe that LLMs, when integrated with systems like B2, will dramatically reduce the number of attorneys required for “first pass review” of large document populations (for productions or return review).[1] If our hypothesis is correct (and there are field reports supporting our conclusions), the impact on investigations and ediscovery could be revolutionary.
If you have case data, particularly data that has already been judged by human reviewers, and want to test out our system, reach out. We are looking for test data as we work to bring it live this fall.
[1] We discussed some of our early research on using GPT to augment or replace human review teams in this article: Will ChatGPT Replace Ediscovery Review Teams? (Law.com 02/21/2023). We plan to explore this topic further in subsequent papers discussing how an LLM like GPT could be integrated into a TAR 1.0 or 2.0 process to make that review even more efficient and cost-effective.