[EDRM Editor’s Note: EDRM is happy to amplify our Trusted Partners’ news and events. The opinions and positions are those of Veritas and Irfan Shuttari. This article was first published on May 2, 2023.]
Ever since the Electronic Discovery Reference Model (EDRM) was created back in 2005, there has been an assumed workflow to the practice of eDiscovery. Electronically stored information (ESI) has historically moved through a progression of being identified, preserved, collected, processed, analyzed, reviewed, produced and presented.
Back then, email and office files stored on-premises made up most of the ESI involved in eDiscovery workflows. That was a significant reason the phases are represented the way they are on the model today: a rigid, structured and contained eDiscovery workflow best fit the ESI that predominated in discovery at the time.
However, the volume of ESI and the variety of ESI sources have evolved dramatically over the years and that has forced eDiscovery to move UPSTREAM on the EDRM. eDiscovery is no longer contained – today, it involves many different ESI sources, requires a variety of workflows and is continually evolving. This is forcing many organizations to learn how to conduct “eDiscovery in the wild”!
ESI Drivers to “eDiscovery in the Wild”
The evolution of ESI is driving eDiscovery “wild” in two ways:
Data Volume
According to IDC, the global data sphere is expected to reach 175 zettabytes by 2025, with a compound annual growth rate of 61 percent. When the EDRM model was created back in 2005, the global data sphere was 0.1 zettabytes. That means global data will have grown 1,750 times in 20 years!
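The growth arithmetic above can be checked with a short sketch. The two data-sphere figures come from the text; the variable names and the derived 20-year average rate are illustrative additions. That average is computed only from the two endpoints and assumes smooth compounding, so it need not match the 61 percent rate quoted for IDC's own forecast window.

```python
# Figures quoted in the text, in zettabytes. Names are illustrative.
DATASPHERE_2005_ZB = 0.1
DATASPHERE_2025_ZB = 175.0
YEARS = 2025 - 2005

# Overall growth multiple between the two endpoints.
growth_multiple = DATASPHERE_2025_ZB / DATASPHERE_2005_ZB

# Average annual growth rate implied by the two endpoints alone.
# This assumes smooth compounding and is not an IDC figure.
implied_annual_rate = growth_multiple ** (1 / YEARS) - 1

print(f"Growth multiple: {growth_multiple:,.0f}x over {YEARS} years")
print(f"Implied average annual growth: {implied_annual_rate:.1%}")
```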
With such an explosive growth rate for data in organizations (much of it being unstructured data), the old “collect everything” mentality of gathering all the data and addressing it downstream is no longer efficient or cost-effective. Traditional “right-side” EDRM phases like Analysis and Review must be conducted further to the left, where the data lives, to keep eDiscovery projects on time and to keep eDiscovery costs from spiraling out of control.
Data Variety
That data lives today in a variety of forms and in a variety of sources. It’s in the cloud in enterprise solutions like Microsoft 365 and Google Workspace (formerly G Suite), as well as collaboration apps like Slack, Teams and Zoom (and many others). It’s in mobile devices and even wearables and Internet of Things (IoT) devices. These varied data sources make it considerably harder for organizations to keep their eDiscovery workflows “in the wild” current with changing data formats and sources.
Taming “eDiscovery in the Wild”
How can organizations tame “eDiscovery in the wild” today? Here are some requirements that must be addressed:
Early Data Assessment is Earlier Than Ever
Early data assessment (EDA) used to be conducted once the data was collected for discovery. But, as I discussed in my previous article, early data assessment must be conducted earlier than ever today, at the source where the data lives. Conducting EDA post-collection is too late and too costly to support today’s aggressive case timelines and budgets.
EDA Involves More Data Sources Than Ever
Not only must EDA be conducted where the data lives, but it must accommodate the data living in multiple places. Can you count on source solutions like Microsoft 365 to provide the capabilities and features for you to conduct the early assessment necessary to make decisions and move the case forward? No. Even if you could, conducting EDA within each of the source solutions would be inefficient and time-consuming. You need a technology solution that enables you to apply insight to data across multiple data sources.
More Application of AI and Machine Learning Than Ever
You didn’t think I would forget to mention “AI”, did you? Today, you can’t talk about technology and not mention AI and machine learning: the power and potential of AI technology is simply too great to ignore. For eDiscovery, AI and machine learning technology has already been applied for years in “right-side” phases like Analysis and Review through predictive coding. Today, AI and machine learning are being applied in several additional ways, such as advanced filtering, sentiment analysis, and classification tags. And because Analysis and Review are now happening further upstream on the EDRM model, so is the application of AI and machine learning technology.
InfoGov and eDiscovery are More Integrated Than Ever
Many of these requirements apply to both information governance and eDiscovery – after all, InfoGov is the beginning of the EDRM life cycle! Organizations need to conduct analysis of data – where it lives, across various sources, using the latest AI tools – to support both information governance and eDiscovery. Obtaining data insight across your data sources happens in the information governance phase, which can then be applied to support eDiscovery use cases.
Platforms like Veritas Data Insight (to classify and control unstructured data), working with Alta™ eDiscovery (to conduct end-to-end discovery), enable your organization to “tame” today’s “eDiscovery in the wild” through early data assessment where the data lives, across various data sources, using the latest proven AI techniques.
The Path Forward
How “wild” has eDiscovery gotten? So wild that EDRM has announced the launch of a new EDRM 2.0 project to update the EDRM model! One of the drivers cited by EDRM CEO and Chief Legal Technologist Mary Mack for the new project was a “big push for preservation in place, [early case assessment] in place”. Moving today’s eDiscovery upstream will involve an integrated approach between information governance and eDiscovery, enabling organizations to conduct early data assessment earlier and more comprehensively than ever!
Assisted by GAI and LLM Technologies as per EDRM GAI and LLM Policy.