How DeNISTing and Deduplication Instantly Reduce Ediscovery Costs

As workplace data grows, corporate legal teams are struggling to keep up with skyrocketing ediscovery volumes from an ever-expanding list of sources. Depending on the matter, that data needs to be collected, processed, reviewed, and finally produced.

Whether legal teams handle review entirely in-house, send all of it to outside counsel, or split the work between the two, cutting down data volumes before review starts is critical. One of the fastest ways to do this is by deNISTing and deduplicating document sets.

What’s a NIST File?

Every computer contains a large amount of data. Some of this information is highly useful during document review, and some of it is irrelevant application data or system files that exist only to enable the computer and its software to function (think of files with an .exe extension, for instance).

The National Institute of Standards and Technology (NIST) is a federal technology agency that maintains and publishes the National Software Reference Library (NSRL), a list of known, traceable software application files and the hash value for each one. 

More commonly called the NIST List, this information can be used to separate the files that may matter as ediscovery evidence from the files that are part of the background noise and simply help computers and applications operate smoothly.

What Are Hash Values?

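At a basic level, “hashing” refers to the process of using an algorithm to compute an identification number, or hash value, from a file’s contents. Two files with identical contents always produce the same hash value, and files that differ will almost certainly produce different ones. Think of the hash value as a digital fingerprint for each file your team needs to process: by comparing fingerprints, you can keep or eliminate file matches.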
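To make the fingerprint idea concrete, here is a minimal Python sketch using the standard library's hashlib module. The helper names (hash_file, hash_bytes) and the sample text are illustrative assumptions, and SHA-1 is used only as an example algorithm; the article does not say which algorithm any particular review tool relies on.

```python
import hashlib


def hash_file(path: str, algorithm: str = "sha1", chunk_size: int = 65536) -> str:
    """Return the hex digest of a file's contents, read in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_bytes(data: bytes, algorithm: str = "sha1") -> str:
    """Return the hex digest of an in-memory byte string."""
    return hashlib.new(algorithm, data).hexdigest()


if __name__ == "__main__":
    # Hypothetical sample contents, for illustration only.
    original = hash_bytes(b"Quarterly report, final version")
    same_content = hash_bytes(b"Quarterly report, final version")
    one_char_changed = hash_bytes(b"Quarterly report, final version.")

    # Identical contents give identical fingerprints.
    print(original == same_content)
    # Changing even one character gives a different fingerprint.
    print(original == one_char_changed)
```

Note that the fingerprint depends only on the file's contents, not on its name or location, which is what makes it useful for matching files across different machines.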

DeNISTing, Deduplication and Ediscovery

In ediscovery, “deNISTing” is the process of removing non-responsive operating system files and standard application files from collected electronically stored information (ESI) before document review begins. While this may seem like a time- and energy-intensive task, advancements in document review software have made deNISTing a standard step in processing data for review, saving legal teams time.

Meg McLaughlin

When ingesting new documents for review, modern document review software like Zapproved’s ZDiscovery Review compares the collected data against the NIST List’s catalog of hash values. Files whose hash values match an entry on the list are flagged as irrelevant and eliminated. Once the NIST List files have been removed from the dataset, there will be significantly fewer documents to review.
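The comparison step can be sketched in a few lines of Python. This is only an illustration of the general technique, not how ZDiscovery Review or any other product implements it: the denist function, the file names, and the one-entry known_hashes set are all hypothetical, and a real workflow would load the known hash values from the published NSRL data rather than build them by hand.

```python
import hashlib
from typing import Dict, List, Set, Tuple


def sha1_hex(data: bytes) -> str:
    """Return the uppercase SHA-1 hex digest of some file contents."""
    return hashlib.sha1(data).hexdigest().upper()


def denist(
    files: Dict[str, bytes], known_hashes: Set[str]
) -> Tuple[Dict[str, bytes], List[str]]:
    """Split collected files into those kept for review and those removed.

    A file is removed when the hash of its contents appears in known_hashes,
    a stand-in here for the NIST List's catalog of hash values.
    """
    kept: Dict[str, bytes] = {}
    removed: List[str] = []
    for name, content in files.items():
        if sha1_hex(content) in known_hashes:
            removed.append(name)  # known system or application file
        else:
            kept[name] = content  # unknown file, stays in the review set
    return kept, removed


if __name__ == "__main__":
    # Hypothetical data: one "system" file and one user document.
    system_file = b"MZ placeholder bytes for a known executable"
    known_hashes = {sha1_hex(system_file)}
    collected = {
        "C:/Windows/notepad.exe": system_file,
        "C:/Users/jdoe/contract.docx": b"Draft supply agreement",
    }
    kept, removed = denist(collected, known_hashes)
    print("kept for review:", sorted(kept))
    print("removed as known files:", removed)
```

Storing the known hash values in a set keeps each lookup fast, which matters when the reference list is large and every collected file has to be checked against it.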

Like deNISTing, file deduplication is an effective way to reduce collected data ahead of the review process. As the name suggests, deduplicating files, or “deduping,” uses hash values and metadata to remove identical files or documents so that only one copy remains. For example, if several collected email inboxes contained the same email thread with the same document attachments, only one copy is needed for review. The remaining copies can be eliminated from the dataset. This means document reviewers aren’t wasting time reviewing multiple copies of identical documents.
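As a rough sketch of the idea, the Python below keeps the first copy of each document and records which custodians held the duplicates. It is a simplification under stated assumptions: the Document class and deduplicate function are hypothetical names, and the sketch matches on content hashes alone, whereas real tools may also compare metadata (for email, fields such as sender, recipients, and sent date).

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Document:
    custodian: str  # whose inbox or drive the file was collected from
    path: str
    content: bytes


def deduplicate(documents: List[Document]) -> List[Tuple[Document, List[str]]]:
    """Keep one copy per unique content hash.

    Returns (kept_document, custodians) pairs, where custodians lists
    everyone who held a copy, so that information is not lost when
    the extra copies are dropped.
    """
    first_seen: Dict[str, Tuple[Document, List[str]]] = {}
    for doc in documents:
        fingerprint = hashlib.sha1(doc.content).hexdigest()
        if fingerprint in first_seen:
            # Duplicate: remember the custodian, drop the copy.
            first_seen[fingerprint][1].append(doc.custodian)
        else:
            first_seen[fingerprint] = (doc, [doc.custodian])
    return list(first_seen.values())


if __name__ == "__main__":
    # Hypothetical collection: the same attachment found in two inboxes.
    collected = [
        Document("alice", "inbox/budget.xlsx", b"FY budget figures"),
        Document("bob", "inbox/budget.xlsx", b"FY budget figures"),
        Document("bob", "inbox/memo.docx", b"Internal memo"),
    ]
    for doc, custodians in deduplicate(collected):
        print(doc.path, "held by", custodians)
```

Tracking custodians alongside the surviving copy is a common design choice, since knowing who had a document can matter even when only one copy is reviewed.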

Benefits of DeNISTing and Deduplication

By decreasing the volume of documents that need to be reviewed through deNISTing and deduplication, corporate legal teams can save significant time and money and shorten time-to-insight for matters involving document review.

For corporate ediscovery teams managing document review in-house, this means massive time savings. Fewer hours, fewer reviewers, or both are needed when NIST files and duplicates are automatically removed from the dataset before review begins.

For corporate ediscovery teams paying outside counsel or outside vendors to review document sets, deNISTing and deduping can lead to huge cost savings. This is especially true if outside reviewers are paid by the hour or by the gigabyte, since fewer documents are sent over for review in the first place. Fewer documents mean less money wasted on non-responsive or irrelevant files.
