Humans Against the Machines: Is Predictive Coding Really Better Than Humans? – Part 1

Technological advancements are significantly influencing the legal services landscape. At unprecedented rates, corporations, law firms, and state and federal enforcement agencies are adopting advanced technology in legal matters, including automation, machine learning, and algorithm-driven data analytics. With respect to discovery, the expansion of technology-assisted review over the past decade has been well documented and debated. The wide embrace of technology-assisted review, or “TAR” for short, has met with acclaim from clients and their counsel. It is essentially undisputed by now, for instance, that TAR helps produce quality results while also achieving quantifiable cost savings. But TAR has not been without controversy. In many conversations, the widespread and growing adoption of TAR has fueled speculation that attorneys will ultimately be largely supplanted by machines.

Over the past few years, some industry observers have prophesied an “artificial intelligence invasion” that would ultimately erode the role of attorneys, and maybe even bring about the “extinction of the legal profession,” in the expansive, multi-billion-dollar electronic discovery realm. Specifically, in the e-discovery process, TAR, or ‘predictive coding,’ has been tagged as the sort of disruptive technology that would altogether obviate the need for attorney review and judgment.

Predictive coding is a type of technology-assisted review that employs algorithms to help classify documents (for example, by relevance to the subject matter at hand, or by screening for privileged attorney-client communications). This technology has surged in popularity, supported by the conventional wisdom that predictive coding is faster, cheaper, and more accurate than manual (i.e., human) review. And for the most part, all of that is true. Years of case studies show that TAR adds speed, quality, and cost efficiency to the discovery process. But while experience supports an emerging, and powerful, role for artificial intelligence in data review, research has not actually shown that predictive coding is simply ‘better than’ humans, or that predictive coding should ever be employed without human training, iteration, and final review.

This article’s authors recently published research in the Richmond Journal of Law & Technology that tests, and challenges, the prevailing wisdom about what predictive coding purports to do. The article posits that machines are simply not what they are promoted to be in terms of their perceived superiority over humans, particularly in the discovery process. The authors analyzed the results of prior research on predictive coding, revealing flaws and correcting misunderstandings about the effectiveness and necessity of human involvement.

Their study examined new data that challenge the prevailing ‘dim’ view the market has of the quality and utility of human review. The authors outlined new research that cuts against that narrative, showing that human attorney review can significantly increase the quality of document review results. The data further revealed an important limitation of predictive coding: unlike humans, predictive coding cannot be ‘the’ sole tool for identifying key documents used in actual proceedings. Finally, the study concluded by surveying the significant risks inherent in relying on predictive coding alone to drive high-quality, legally defensible document reviews. Namely, the exclusive use of predictive coding can lead to unwanted disclosures, threatening attorney-client privilege and work-product protections.

With a full evaluation of predictive coding’s capabilities, limitations, and drawbacks, the authors concluded that the much talked-about ‘rise of the robot overlords’ was actually much further away than some people were anticipating.

The Richmond Journal of Law & Technology article can be found here: https://jolt.richmond.edu/files/2020/10/Keeling-FE.pdf

About the Authors of the JOLT article: Rishi Chhatwal, Robert Keeling, Peter Gronvall, and Nathaniel Huber-Fliflet are frequent collaborators on machine learning research in the legal realm.  Together they have conducted thousands of experiments to assess the effectiveness of predictive modeling on legal data and recognize the importance of its role in the e-discovery process.

  • Rishi P. Chhatwal is Assistant Vice President, Senior Legal Counsel at AT&T Services, Inc. He advises internal clients on antitrust and discovery issues for litigations, regulatory inquiries, and internal investigations.
  • Robert Keeling is a partner at Sidley Austin LLP whose practice includes a special focus on electronic discovery matters. He is co-chair of Sidley’s eDiscovery and Data Analytics Group and represents companies across a range of industries in civil litigation and government investigations that contain complex eDiscovery issues. Robert is a published author and frequent speaker on topics relating to eDiscovery, information governance, machine learning, and the attorney-client privilege. [https://www.sidley.com/en/people/k/keeling-robert-d]

[First published in The National Law Review.]

Authors

  • Peter Gronvall

    Peter Gronvall is a Senior Managing Director at Ankura. He is an accomplished business visionary in the legal services and information-risk marketplace. He leads a global team of professionals that handles global investigations and large-scale litigation matters. Peter and his teams are routinely called upon to help corporate clients solve their most important legal needs as they relate to data-intensive requirements, especially as those matters implicate enterprise-level litigations and investigations. Peter’s engagements routinely involve C-suite executives, general counsel, and outside counsel confronting legal matters that involve information and technology concerns.

  • Nathaniel Huber-Fliflet

    Nathaniel (Nate) Huber-Fliflet is a Senior Managing Director at Ankura, based in Washington, DC. He has 15 years of experience consulting with law firms and corporations on advanced data analytics solutions and legal technology services. His core expertise is in machine learning, speech recognition, data mining, and software development – all within the legal technology and services market space.
