EDRM Announces Platform and Service Provider Adoption of Duplicate Email Identification Specification

[Image: logos of Reveal, Relativity, EDT, Nuix and Law In Order. Credit: Kaylee Walstad, EDRM (except logos)]

Four leading eDiscovery platforms and one leading service provider have formally adopted and implemented the EDRM Message ID Hash (EDRM MIH) to identify duplicate emails, even across platforms and productions.

While current approaches effectively identify email duplicates within native datasets processed by a single vendor, they do not enable duplicate identification across emails processed by multiple vendor platforms. Vendors use similar methods to detect email duplicates, but there are nuanced differences in their proprietary algorithms.

Currently, the only means of cross-platform email duplicate identification is to reprocess the data using a single vendor platform, which often takes significant time and cost. The EDRM Duplicate Identification project set out to develop a solution to cross-platform email duplicate identification.

Reveal, Relativity, EDT and Nuix are the first supporting platforms to market, with Law In Order creating the app for Relativity’s offering. EDRM is proud to provide links to each supporter in turn. EDRM is exceptionally proud of the global project team, led by Trustee Beth Patterson, in which leading, competing eDiscovery organizations collaborated with other professionals to create and test the EDRM MIH.

Supporting the EDRM Message ID Hash

In 2020, a team of volunteers gathered to take on a long-standing eDiscovery challenge: how to better enable cross-platform email duplicate identification.

The result of the group’s efforts is the EDRM Message Identification Hash (EDRM MIH), released by EDRM on January 11, 2023. Details about the project are at the EDRM DupeID page.

Reveal supports EDRM MIH and we recently added instructions for using it to our User Documentation site.

The Project’s History

The EDRM Duplicate Identification Project was the brainchild of Beth Patterson, the director and founder of ESPconnect and an industry stalwart from Australia. To hear directly from Beth about her role in the DupeID project, check out the eDiscovery Leaders Live discussion with her earlier this year.

Read more about Reveal’s implementation here.

Simplify Cross-Platform Email Duplicate Identification with EDRM Message ID Hash (MIH)

A groundbreaking digital tool, EDRM Message ID Hash (EDRM MIH), has been launched to tackle the long-standing issue of identifying duplicate emails across different eDiscovery platforms. Developed by a global team of industry experts, the tool eliminates the need for costly and time-consuming reprocessing of emails. Launched in February 2023, EDRM MIH promises to revolutionise email duplicate identification, offering substantial time and cost savings for legal professionals.

A Better Way to Identify Cross Platform Email Duplication

Have you had to find duplicate emails quickly after receiving an opposing party’s production but couldn’t because the opposing party used a different software platform to process their emails?

If what you received was processed by a different digital tool or “cross platform”, your only option has been to reprocess these emails using a single eDiscovery software platform. This method is expensive, time-consuming and stressful, especially on a tight deadline. Your lawyers or client probably wonder why it takes so long and costs so much.

Read about Law In Order’s app here.

Introducing the EDRM Message ID Hash: Simplify Cross-Platform Email Duplicate Identification

Have you ever received a supplementary production of email and needed to know what’s new and what’s been produced before? How about finding duplicate email messages across productions from different vendors or in different forms of production?

Identifying such “cross-platform” duplicates traditionally entailed reprocessing native email from multiple sources, assuming native forms were produced. It’s expensive and time-consuming. Often, it just isn’t feasible.

Did you ever think, there must be a better way? We did. And that’s how the EDRM Message ID Hash (EDRM MIH) was born.

The Challenge: Unravelling the Cross-Platform Puzzle

Digitally fingerprinting or “hashing” email messages to identify duplicates requires that precisely the same parts of email messages be processed in precisely the same way to obtain matching “hash values.” However, no electronic discovery tool approaches the task quite like another, so duplicate messages between productions couldn’t be identified when the productions were processed by different tools or produced in different forms.
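To make the point above concrete, here is a minimal Python sketch of why two tools can disagree about the same message. Both "tools" are hypothetical and invented for illustration; they do not represent any vendor's actual algorithm. Each one hashes a slightly different selection of fields, so the resulting hash values do not match even though the email is identical.

```python
import hashlib

# One email, represented as a plain dict of fields (illustrative data only).
message = {
    "from": "alice@example.com",
    "to": "bob@example.com",
    "subject": "Quarterly report",
    "sent": "2023-01-11T09:30:00Z",
    "body": "Please find the report attached.",
}


def tool_a_hash(msg):
    """Hypothetical tool A: hashes sender, recipient, subject and body."""
    parts = [msg["from"], msg["to"], msg["subject"], msg["body"]]
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()


def tool_b_hash(msg):
    """Hypothetical tool B: also includes the sent time and lower-cases the subject."""
    parts = [msg["from"], msg["to"], msg["subject"].lower(), msg["sent"], msg["body"]]
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()


hash_a = tool_a_hash(message)
hash_b = tool_b_hash(message)

# Same email, different processing choices, different fingerprints.
print(hash_a == hash_b)  # prints False
```

Each tool is internally consistent, so duplicates are still found within a single platform. The mismatch only appears when hash values from the two tools are compared with each other.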

From opposite sides of the planet, these authors—Craig Ball and Beth Patterson—reasoned there must be a way to achieve cross-platform duplicate identification without obliging service providers to change their workflows. That idea brought us together at ILTA’s 2016 conference (see Craig’s 2016 blog post on the subject!) and served as the impetus to found the EDRM Duplicate Identification Project in March 2021.

Read the blog post here.

Identifying duplicate emails just got easier, thanks to EDRM

Identifying and managing duplicate emails within one evidence management platform is already complicated. Do you use case-wide, per-custodian, custodian ranking, or family duplicate identification? The answer is often ‘all of the above’.

But what happens when you receive a production of emails from someone who used a different software to yours – or, the horror, as TIFF files – and you want to know which ones you’ve already seen? Until recently, it was double trouble.

The main challenge has been that each discovery software vendor uses its own formula to generate MD5 hashes that uniquely identify each email message. As a result, the same email message would generate a different hash in EDT than it would in, say, Relativity. So even if you received a load file that included hashes for email messages, your platform wouldn’t be able to match up the duplicates. Instead, you’d have to reprocess the production using your software to generate hashes using its formula. Even then, that could be problematic if you’re not working with native versions of the emails.
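As a rough illustration of how a shared identifier sidesteps this problem, the Python sketch below hashes only an email's Message-ID header, as the name "Message ID Hash" suggests. The details here are assumptions, not the EDRM specification: the function name message_id_hash is hypothetical, and the choices to strip surrounding whitespace and angle brackets and to use MD5 over UTF-8 are made for illustration only. Consult the EDRM MIH specification for the authoritative rules.

```python
import hashlib
from email import message_from_string


def message_id_hash(raw_email: str) -> str:
    """Hash only the Message-ID header value (illustrative, not the official spec)."""
    msg = message_from_string(raw_email)
    message_id = msg["Message-ID"]
    if message_id is None:
        raise ValueError("email has no Message-ID header")
    # Assumed normalization: trim whitespace and the surrounding angle brackets.
    normalized = message_id.strip().strip("<>")
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Two copies of the same message as they might look after passing through
# different tools: header order and trailing line endings differ.
copy_from_platform_a = (
    "Message-ID: <abc123@mail.example.com>\n"
    "From: alice@example.com\n"
    "Subject: Quarterly report\n"
    "\n"
    "Please find the report attached.\n"
)
copy_from_platform_b = (
    "Subject: Quarterly report\n"
    "From: alice@example.com\n"
    "Message-ID: <abc123@mail.example.com>\n"
    "\n"
    "Please find the report attached.\r\n"
)

hash_a = message_id_hash(copy_from_platform_a)
hash_b = message_id_hash(copy_from_platform_b)
print(hash_a == hash_b)  # prints True
```

Because the hash depends on a single header value rather than on each platform's choice of fields and formatting, the two copies produce the same value and can be matched from a load file without reprocessing.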

Read about the implementation here.



As of July 2023, Nuix Neo comes standard with EDRM Message Identification Hash (EDRM MIH) support. In an era where efficiency and accuracy are paramount, the world of eDiscovery and legal professionals has witnessed a groundbreaking innovation: the EDRM MIH. This revolutionary specification, introduced by EDRM on January 11, 2023, is a game-changer that promises to transform the way we identify duplicate emails across various platforms. In this blog post, we will explore the significance of EDRM MIH and its seamless integration into Nuix Neo, the cutting-edge industry solution that promises to streamline the eDiscovery process.   

Unlocking the Potential of EDRM MIH: 

  • Smaller Review Populations for Greater Efficiency: EDRM MIH offers the invaluable advantage of identifying duplicate emails, resulting in smaller review populations. This reduction in size not only saves precious time but also conserves valuable resources. 
  • Shorter Review Times for Time-Sensitive Matters: In the fast-paced world of litigation, time is of the essence. With EDRM MIH, users can conduct reviews more swiftly, a crucial aspect in time-sensitive cases where every minute counts. 

Read more about the implementation here.


  • Mary Mack

    Mary Mack is the CEO and Chief Legal Technologist for EDRM. Mary was the co-editor of the Thomson Reuters West Treatise, eDiscovery for Corporate Counsel for 10 years and the co-author of A Process of Illumination: the Practical Guide to Electronic Discovery. She holds the CISSP among her certifications.
