EDRM and ZL Technologies Launch New Enron Email Data Set

Leading E-Discovery Standards Organization and Enterprise Vendor Unite to Launch Standard Data Set for E-Discovery and Email Research

ST. PAUL, MN & SAN JOSE, CA – November 15, 2010 – The Electronic Discovery Reference Model (EDRM) and ZL Technologies today announced the launch of Version 2 of the EDRM Enron Email Data Set. This version offers the largest and richest set of publicly available, general-purpose corporate email to date. Research into e-discovery and information retrieval requires such data sets to test, develop and refine new capabilities and approaches; however, their availability is limited due to the competitive and private nature of enterprise communications. To facilitate advancements in commercial e-discovery and academic research, the EDRM data sets are provided free of charge and build on industry best practices. For Version 2 of the data set, EDRM and ZL Technologies are pleased to announce a collaboration with the Text REtrieval Conference (TREC) Legal Track project, which is using the data set in its research.

The new data set includes several improvements over the previous version and incorporates direct input from various research communities. Some highlights include:


  • Larger Data Set: Inclusion of 1,227,255 emails with 493,384 attachments covering 151 custodians

  • Rich Metadata: Threading information, tracking IDs, and general Internet headers are included

  • Multiple Email Formats: The new data set provides both full and de-duplicated email in PST, MIME and EDRM XML, which allows organizations to test and compare results across formats

  • Attached Files: Attachments to email messages are included in Version 2, as they were in Version 1

Public data sets allow organizations to perform better research by comparing test results and reproducing third-party tests. This enables organizations to build on their own past work and that of others. Additionally, the use of a common data set facilitates the creation of benchmarks against which to judge competing applications and workflows. It is in this manner that Version 2 of the EDRM Enron Email Data Set will contribute to the information management and e-discovery community.

This data set was produced with input from the Text REtrieval Conference (TREC) Legal Track project, which uses the EDRM data set for its research. Doug Oard, Professor at the University of Maryland and TREC Legal Track Coordinator, remarked, “We are delighted to have had this opportunity to collaborate with EDRM. One of our goals in the TREC Legal Track is to produce test collections that are of lasting value to the e-discovery community, which is well in line with those of the EDRM Data Set Project. EDRM provides us with the opportunity to leverage current commercial best practice for collection processing, thus significantly facilitating the research of both commercial and academic research teams.”

The EDRM Data Set Project has taken on a growing role in the practice of e-discovery. Littler Mendelson, P.C., an AMLAW 100 law firm, uses the EDRM Data Sets to meet several needs. “We regularly evaluate new e-discovery technologies, and thus require a large, rich data set for our tests that does not include client information,” said Michael McGuire, Shareholder and eDiscovery counsel for Littler. “The EDRM Data Sets allow us to direct vendors to download the data directly from EDRM.net and set up the test systems with data that we are familiar with, understand, and do not have a responsibility to hold in confidence. We also use the EDRM Data Sets for training and demonstrations of our tools and processes.
We look forward to using this new data set.”

“With delivery of this data set, the EDRM Data Set Project has achieved and surpassed its initial goals,” said John Wang, Project Lead for the EDRM Data Set Project and Product Manager at ZL Technologies, Inc. “We have been delighted with the adoption of the data sets already released as well as the interest in this data set. For this release, ZL Technologies worked closely with EDRM and TREC Legal Track to process the data using ZL Unified Archive®, which ensured the data was accurate, rich, and available in industry standard EDRM XML, PST, and MIME formats. We look forward to continuing to facilitate and refine the practice of e-discovery.”

About EDRM

Launched in May 2005, the EDRM Project was created to address the lack of standards and guidelines in the e-discovery market – a problem identified in the 2003 and 2004 Socha-Gelbmann Electronic Discovery surveys as a major concern for vendors and consumers alike. The completed reference model provides a common, flexible and extensible framework for the development, selection, evaluation and use of e-discovery products and services. Building on the base defined by the Reference Model, the EDRM projects were expanded in May 2006 to include the EDRM Metrics and the EDRM XML projects. Over the past four years, the EDRM project has comprised more than 170 organizations, including 115 service and software providers, 43 law firms, three industry groups and 12 corporations involved with e-discovery. Information about EDRM is available at https://edrm.net.

About ZL Technologies

Established in 1999, ZL Technologies, Inc. (ZL) provides cutting-edge enterprise software solutions for email and file archiving for regulatory compliance, litigation support, corporate governance, and storage management.
ZL’s Unified Archive offers a single unified platform to provide all the above capabilities, while maintaining a single copy and a unified policy across the enterprise. With a proven track record and enterprise clients that include top global institutions in finance and industry, ZL has emerged as the specialized provider of large-scale email archiving for e-discovery and compliance. For more information, please visit www.ZLTI.com.

Press contacts

George Socha
Socha Consulting, LLC
651.690.1739
pr@edrm.net

Tom Gelbmann
Gelbmann & Associates, LLC
651.483.0022
pr@edrm.net

Rob Elliott
ZL Technologies
408.240.8989
relliott@zlti.com


George Socha
