About this Project
A common question for eDiscovery professionals is “How many documents/files per gigabyte?” The answer can vary widely depending on the type of file at issue. Emails and text files are typically small, which means more files per GB. PowerPoint and video files can be quite large, which lowers the files-per-GB count. Wouldn’t it be nice to have industry averages, drawn from large file sets, to answer this question?
John Tredennick, CEO of Merlin Search Technologies, an EDRM Trusted Partner, is leading this project to answer that question using statistics from large document populations. The goal is to solicit anonymized aggregate information from a wide variety of litigation support and software hosting companies, as well as law firms, that can be combined to provide averages for different file types and across typical eDiscovery populations.
“Over the years, I wrote several articles and blog posts asking, ‘How many docs in a gig?’” explained John Tredennick in proposing the project. “While we had a lot of data at Catalyst, my dream was to reach out to a wide swath of litigation support vendors and clients to see if we couldn’t put together a valuable set of numbers for our industry. With EDRM’s support and help from a few volunteers, I think we can do just that.”