The EDRM Enron Email Data Set v2 consists of Enron e-mail messages and attachments in two sets of downloadable compressed files: XML and PST.
Files in each group are organized by custodian and listed alphabetically with compressed file sizes in parentheses. Materials for some custodians are spread across more than a single XML or PST file.
Select any combination of XML files or any combination of PST files to download.
If you find these files useful, please consider joining EDRM!
XML Files
The XML files are organized by custodian. Each compressed file should contain some combination of the following (depending on availability, of course):
- XML
- EML with attachments
- Native attachments
- Text email bodies
- Text email attachments
PLEASE NOTE: These files may contain viruses, as can be the case with any set of files collected during discovery. 
Select any combination of EDRM Enron Email Data Set v2 XML files to download:
PST Files
PST files are organized by custodian.
PLEASE NOTE: These files may contain viruses, as can be the case with any set of files collected during discovery. 
Select any combination of EDRM Enron Email Data Set v2 PST files to download:







EDRM Enron Email Data Set v2
Thanks so much for putting this data out there.
These types of datasets are very useful in our environment. Thanks for the notice.
Thanks for these. (So these Enron emails are now under a Creative Commons license?)
Membering now…
We have released this collection under a Creative Commons Attribution 3.0 United States License. This collection is a reworking of the previously available data, with substantial effort required to transform it to the released version. Thank you.
Anyone have any advice for rewriting the domain enron.com to someother.com? We’d like to use the PST data in a testing environment, and would like to use routable email addresses. Thanks, Paul.
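One possible approach to the domain question above, as a minimal sketch: it only covers the text-based material (EML, XML, and text bodies), not the binary PST files, which would need a PST-aware tool. The function name `rewrite_domain` and the default target `someother.com` are illustrative assumptions.

```python
import re

# Match "@enron.com" only when it ends the domain, so longer names such as
# "@enron.community" or "@enron.com.au" are left untouched.
_ENRON_DOMAIN = re.compile(
    r"@enron\.com(?![A-Za-z0-9-]|\.[A-Za-z0-9])",
    re.IGNORECASE,
)


def rewrite_domain(text, new_domain="someother.com"):
    """Return text with every @enron.com address moved to new_domain."""
    return _ENRON_DOMAIN.sub("@" + new_domain, text)


if __name__ == "__main__":
    print(rewrite_domain("To: <alice@Enron.com>, bob@enron.com."))
```

Subdomains (for example addresses under a host below enron.com) are deliberately not handled here; extend the pattern if your test environment needs them.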
Thanks for making all of these files available for download. FYI, the file edrm-enron-v2_rodrique-r_pst.zip is missing and gives an error.
Greg,
Earlier this week the folks preparing the files notified me that the file you cite was improperly named. It was supposed to be “rodrigue” with a “G” and not “rodrique” with a Q. I made the appropriate changes. That file is posted and you should be able to download it from this page.
Thanks,
George
Please note that we have replaced two files and corrected the file name of a third. We replaced “edrm-enron-v2_reitmeyer-j_pst.zip” and “edrm-enron-v2_arnold-j_pst.zip”; neither was decompressing properly. We changed the name of “edrm-enron-v2_rodrique-r_pst.zip” to “edrm-enron-v2_rodrigue-r_pst.zip”.
Thanks for making this data available. Would it be possible to calculate and publish md5sums of these files, so that users can verify download integrity?
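Until checksums are published, a downloader can at least compute an MD5 locally and compare it against a second download or a colleague's copy. A minimal sketch using Python's standard `hashlib`; the helper name `md5sum` and the 1 MiB chunk size are assumptions chosen for illustration.

```python
import hashlib


def md5sum(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in fixed-size chunks so multi-gigabyte zips fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `md5sum("edrm-enron-v2_arnold-j_pst.zip")` returns a 32-character hex string that can be compared across copies of the same file.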
It’s probably worth noting here that all of the v2 files together are 116 GB (37 GB for PST, 79 GB for XML).
If you use a download tool to limit your bandwidth to, say, 50 KB/s (400 Kb/s) so you don’t destroy your employer’s internet connection or network proxies, that translates into about a month. Divide appropriately if you choose to use more bandwidth or download only the PST or XML files.
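The estimate above can be reproduced with a few lines of arithmetic. This sketch assumes decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes) and a perfectly steady transfer rate; the function name `download_days` is illustrative.

```python
def download_days(total_gb, rate_kb_per_s):
    """Days needed to fetch total_gb gigabytes at rate_kb_per_s kilobytes/s."""
    total_bytes = total_gb * 1e9            # decimal gigabytes (assumption)
    seconds = total_bytes / (rate_kb_per_s * 1e3)
    return seconds / 86400                  # 86,400 seconds per day


if __name__ == "__main__":
    for label, size_gb in [("all", 116), ("PST", 37), ("XML", 79)]:
        print(f"{label}: {download_days(size_gb, 50):.1f} days at 50 KB/s")
```

At 50 KB/s the full 116 GB works out to a little under 27 days, which matches the "about a month" figure.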
Thank you so much for your work.
These datasets are very useful for my project.
Thank you for making these available!
When attempting to unzip two of these I’m repeatedly encountering problems. The zips I’m having a problem with are:
edrm-enron-v2_kaminski-v_xml_1of2.zip
edrm-enron-v2_kaminski-v_xml_2of2.zip
Are there any known issues with these? The error I receive is:
“! \edrm-enron-v2_kaminski-v_xml_2of2.zip: The archive is corrupt” and
“! \edrm-enron-v2_kaminski-v_xml_1of2.zip: The archive is corrupt”
Thanks again!!
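To tell a damaged download apart from a problem with the extraction tool, the archive can be checked with Python's standard `zipfile` module. This is a sketch only: the helper name `check_zip` and its return strings are assumptions, and it cannot say whether the file on the server itself is bad.

```python
import zipfile


def check_zip(path):
    """Return "ok", or a short description of what is wrong with the zip."""
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() reads every member and verifies its CRC; it returns
            # the name of the first bad member, or None if all pass.
            bad = archive.testzip()
    except zipfile.BadZipFile as exc:
        # Raised when the file is not a readable zip, for example because
        # the download was cut short.
        return f"not a readable zip: {exc}"
    return f"first corrupt member: {bad}" if bad else "ok"
```

Running `check_zip("edrm-enron-v2_kaminski-v_xml_1of2.zip")` on a freshly downloaded copy shows whether the file fails before any extraction tool is involved; if it does, re-downloading is the first thing to try.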