EDRM Evergreen/Analysis
| Comments: Please submit comments to the EDRM Evergreen Analysis forum |
Categories
Types of Analytics
As e-discovery tools and processes have matured and coalesced around the EDRM model, analytics are showing up in multiple steps of the EDRM process. When the model was originally conceived, the focus was on the analysis of the collected documents to make it easier to cull documents and provide increased productivity during the review step. The kinds of analysis ranged from indexing of the documents for keyword searching, to sophisticated analysis of the content for clustering and foldering of the documents, to extraction of metadata to provide social network and event analysis, to extraction of additional information from within the content of email and documents, such as the names of people, companies, cities, states and countries.
More recently the types of analytics being used to increase productivity through the whole process include:
- Content Analytics – extending content analytics back to the information management step and forward to the trial presentation step
- Reviewer Analytics – applying productivity metrics and analysis to the individual reviewer.
- Project and Matter Analytics – the collection of project metrics to answer the many forms of the question “what is the status of my matter?”
- Case Analytics – multiple types of case analytics are now used to help speed attorney understanding and improve legal matter strategy, including:
- Fact Based – using content analytics tools on the most relevant documents to cluster the important documents and extract relevant facts for the case
- Case Law Based – using content analytics tools on the case law and the relevant statutes to understand which cases bear on the matter
- Case Intelligence Based – using the results of previous decisions by the judges in charge of the case and the previous experience of opposing counsel in the case to aid in the crafting of case strategy both for the discovery phase and the trial phase
As cases grow larger (from gigabytes to terabytes to petabytes) and more professionals from different organizations become involved, it gets harder both to track where the case stands and to keep testing that the letter and the spirit of the Federal Rules of Civil Procedure (FRCP) are adhered to. As corporations and law firms mature in their understanding and management of the e-discovery process, the focus shifts through the following progression of questions:
- Did we process everything that we collected?
- Did we collect and preserve everything that we were supposed to from all the custodians we were supposed to?
- How do we make sense of terabytes of responsive material?
- Did we do everything as cost effectively and productively as possible?
Content Analytics
As corporations start to install better systems for information management (Records Management, Enterprise Content Management and/or Archiving) and for the preserve and collect phases, they are now looking for content analytics technologies to further aid with these infrastructure investments. The two questions that corporations are asking are:
- How do we reduce the amount of information that goes into information management and archiving systems in the first place?
- Once information is in these systems how do we improve our process for identifying and preserving the responsive information for any given matter?
For the first question, corporations are looking at rules-based extensions to their compliance and investigatory systems to apply against the flow of information to determine whether a document or email is a record. For the second question, corporations are looking at the same kind of content analytics they use later in the process as a filter when they are bringing information from the archives into a matter-based system. The more non-relevant information that is kept out of the e-discovery flow, the more effective the e-discovery process will be.
Some corporations are moving from a purely exclusionary process of e-discovery to an inclusionary approach. In an exclusionary process, one starts with the corpus of documents and then uses a variety of techniques, from deduplication and near-deduplication to keyword search terms to date ranges, to cull the set down to the responsive documents. In an inclusionary process, the review team typically starts with a collection of exemplar documents from the highest likelihood custodians and then uses those exemplar documents as comprehensive “document concept search” examples to bring in related documents. Using social network analysis tools, the review team looks for additional custodians who were involved with the matter. Further, these same techniques are utilized to search publicly available information on the Internet and specialized databases for documents that could be relevant to the case.
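The exclusionary process described above can be illustrated with a minimal sketch. It assumes each document is a dictionary with hypothetical "id", "text" and "date" fields, and it handles only exact duplicates (near-deduplication would need a similarity measure). The function name `cull` is an illustration, not part of any particular product.

```python
import hashlib
from datetime import date


def cull(documents, start, end, keywords):
    """Exclusionary culling: drop exact duplicates, then documents outside
    the date range, then documents with no keyword hit."""
    seen_hashes = set()
    kept = []
    for doc in documents:
        # Hash the text so identical content is detected as a duplicate.
        digest = hashlib.sha256(doc["text"].encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate of a document already seen
        seen_hashes.add(digest)
        if not (start <= doc["date"] <= end):
            continue  # outside the agreed date range
        text = doc["text"].lower()
        if not any(term.lower() in text for term in keywords):
            continue  # no keyword hit
        kept.append(doc)
    return kept
```

Each filter only removes documents, which is why the process is called exclusionary; an inclusionary process would instead start from exemplar documents and pull related material in.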
Types of Content
Client data
Public Data
Opponent data
Third-party data
Reviewer Analytics
As more and more corporations and law firms have moved to a contingent labor force for first-pass review, project managers have realized that there can be a wide discrepancy in the capabilities of individual reviewers. From a cost and productivity standpoint, good project managers want to identify quickly those reviewers who are both highly productive and make high quality decisions. By tracking reviewer productivity, project managers have a better handle on how long it is going to take to finish the review phase of the e-discovery process.
The most important step in providing reviewer analytics is ensuring that the review tools track reviewer work product for each document as well as the rate of decision making per reviewer. In addition to document decision rate information, the review toolset needs to capture the quality of the decisions. One way to capture quality is to track quality control reviews, in which a second reviewer looks at the decisions made by a first-pass reviewer and records whether any of them are reversed. The ideal reviewer should combine a consistently high number of document decisions per hour with few or no reversed decisions.
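The two measures described above (decisions per hour and reversal rate) can be computed from simple per-reviewer counts. The sketch below is one possible way to do it; the function name and the four input dictionaries are hypothetical and would normally come from the review platform's reporting exports.

```python
def reviewer_scorecard(decision_counts, hours, qc_checked, qc_reversed):
    """Return decisions per hour and QC reversal rate for each reviewer.

    decision_counts: reviewer -> number of document decisions made
    hours:           reviewer -> hours spent reviewing
    qc_checked:      reviewer -> decisions re-examined by a second reviewer
    qc_reversed:     reviewer -> re-examined decisions that were reversed
    """
    scorecard = {}
    for reviewer, count in decision_counts.items():
        checked = qc_checked.get(reviewer, 0)
        # The reversal rate is undefined until some decisions have been QC'd.
        reversal_rate = qc_reversed.get(reviewer, 0) / checked if checked else None
        scorecard[reviewer] = {
            "decisions_per_hour": count / hours[reviewer],
            "reversal_rate": reversal_rate,
        }
    return scorecard
```

A reviewer with a high decision rate but a high reversal rate may be moving too quickly, so the two numbers are best read together.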
Productivity
Accuracy
Project Analytics
The primary focus of analytics for the matter or project is to determine how close the project is to completion and to perform ongoing logical checks of expected versus actual results. Some corporations and law firms now realize they can gain additional insights into their matter and perform extensive quality checks by creating automatic taxonomies of the terms in the complete matter set of documents and the terms in the responsive documents subset.
By using a wide range of project analytics, the matter team can be proactive in finding problems in their e-discovery strategy or actual process. Building on Reviewer Analytics, the project manager can continually ask and answer questions that impact both project cost and project risk, including:
- Have we gotten all of the documents that we expected?
- Are we getting the average number of responsive documents per custodian that we expected?
- Are reviewers making decisions at the rate that we expected and the quality that we expected?
- Have we missed any custodians or collections of documents?
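As an illustration of an expected-versus-actual check, the sketch below flags custodians whose share of responsive documents differs noticeably from what the team expected. The function name, the input dictionaries and the 50% relative tolerance are assumptions chosen for the example, not recommended values.

```python
def flag_custodians(reviewed, responsive, expected_rate, tolerance=0.5):
    """Flag custodians whose responsive rate is far from the expected rate.

    reviewed:      custodian -> documents reviewed so far
    responsive:    custodian -> documents marked responsive
    expected_rate: expected fraction of responsive documents (e.g. 0.10)
    tolerance:     allowed relative deviation from expected_rate
    """
    flagged = {}
    for custodian, n_reviewed in reviewed.items():
        if n_reviewed == 0:
            flagged[custodian] = None  # nothing reviewed: possible collection gap
            continue
        rate = responsive.get(custodian, 0) / n_reviewed
        if abs(rate - expected_rate) > tolerance * expected_rate:
            flagged[custodian] = rate
    return flagged
```

A flagged custodian is a prompt for a closer look (a missed data source, an overly narrow search, or simply an unimportant custodian), not a conclusion in itself.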
Case Analytics
As the overall size of a case gets larger, the number of responsive documents grows with it. In many cases, it is now as difficult to understand the collection of responsive documents as it used to be to winnow down the total set of documents to the responsive document set. Many law firms are now using content analytics tools to review and understand the total population of their own documents and the documents produced by the opposing side. The ability to cluster related documents on selected topics and then to tag and articulate the facts that each document supports or refutes requires tools similar to those used in the review stage. These content analytics technologies are used to extract the facts of the matter.
One of the big challenges for litigating attorneys is searching for relevant case law to support their case. However, case law content systems still use simple keyword search capabilities to find relevant case law. Those attorneys who have gotten used to the advanced content analytics of e-discovery now use those same tools to go through hundreds to thousands of potentially relevant cases as easily as they sort through the millions of documents in e-discovery. Advanced visualization techniques are even more powerful with the structure provided by case law content systems, which supply reference links to relevant statutes and other case decisions.
As more and more federal, state, and local cases are digitized by content providers such as LexisNexis and Thomson West, a new set of litigation analytics is arising. With content systems such as Lexis AtVantage, you can quickly find out the past track records and experience of judges and opposing counsel in cases that are similar to the current matter. In addition, a legal analyst can uncover the track record for settlement or continued prosecution of both the current client and the current opposition.
As relevant information from each of these types of case analytics is extracted, the information becomes a form of exemplar document to feed back into an inclusionary process to identify additional relevant documents and potential custodians.
Facts Based
Case Law Based
Introduction to Document Framework and Structure of Use Cases
The majority of content in this document is organized around examples of analysis use cases as they relate to the Electronic Discovery Reference Model. Each use case is organized into the following sections:
Inputs
The inputs section defines what information or content is needed for the specific analysis use case.
Roles
The success of any electronic discovery process is very often defined by the skills and experience of the people involved. For this document, we have defined the following roles that contribute to the analysis use cases in this document. As corporations have increased their level of involvement in and control of the e-discovery process, many of these roles are now fulfilled with in-house and/or outside resources such as law firms and/or service providers.
Document Analyst
Increasingly, document analysts are directly involved with electronic discovery — particularly when analysis is involved during identification, preservation and collection. Document analysts define and apply criteria to eliminate unnecessary data as well as ensure that potentially relevant data is collected and/or preserved.
Project Manager
E-discovery project managers help manage project expectations and ensure that the overall project stays on-time and within established budget parameters.
Senior Attorney
A senior attorney helps determine what documents should be collected and also defines the criteria for classifying documents during review.
Discovery Lead Attorney
The discovery lead attorney typically works for the senior attorney and with the project manager and review team to ensure that the review strategy is executed to the specification of the senior attorney.
Litigation Support Manager
In many scenarios, a litigation support manager is involved to help manage the e-discovery project and/or coordinate between legal and IT teams.
Review Attorney
Review attorneys classify documents based upon established criteria. These attorneys can be in-house, at a law firm or with a specialized provider of document review services.
Legal IT
Increasingly, many corporations today have dedicated IT resources to assist in the e-discovery process. These people have both a technical understanding of how to identify and collect necessary data in addition to a solid understanding of the entire electronic discovery process.
Tools & Technology
This section refers to the types of tools typically involved in the analysis use case.
Outputs & Desired Outcomes
This section provides information related to the objectives of the analysis use case.
Metrics
This section provides metrics that can be useful in determining the efficacy of the analysis.
Considerations
Considerations include tips and tricks as well as potential pitfalls to avoid.
Analysis Use Cases for Electronic Discovery
Project Management
Project Analysis
Inputs
Key inputs to project analysis include:
- Project budget
- Project timeline and key dates (e.g. production deadlines, etc.)
- Information related to the collected documents, including number of documents and/or total volume of data collected
- Information related to the review, including document classifications (e.g. marks and tags) and reviewer speed.
Roles
Most often, the e-discovery project manager is the primary role involved in project analysis. It is also not uncommon for the litigation support manager and/or the discovery lead attorney to be involved.
Tools & Technology
Many leading e-discovery software products offered through service providers or directly to the corporation or law firm provide reporting tools and project portals. These tools help provide key project status information that can be used during project analysis.
Outputs & Desired Outcomes
The most common questions that project managers are trying to answer with project analysis are:
- How much is this going to cost?
- Are we operating within our budget?
- When will this project be complete?
- Are we going to complete this project within our established timeline?
Using available project data such as total data collected and documents reviewed, project managers can ascertain how much of the overall collection has been reviewed. Many products now also provide attorney review rate information that can be useful in estimating project timelines, and/or adding additional resources (e.g. review attorneys) if needed.
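A simple way to turn these figures into a status estimate is sketched below. It assumes a constant average review rate and a fixed team size, which real projects rarely have, so the result is a rough projection only. The function name and the default of 8 review hours per day are hypothetical.

```python
import math


def estimate_review(total_docs, reviewed_docs, docs_per_hour, reviewers,
                    hours_per_day=8):
    """Return (fraction of collection reviewed, working days remaining)."""
    remaining = total_docs - reviewed_docs
    fraction_done = reviewed_docs / total_docs
    # Documents the whole team can review in one working day.
    daily_capacity = docs_per_hour * reviewers * hours_per_day
    days_left = math.ceil(remaining / daily_capacity)
    return fraction_done, days_left
```

For example, with 100,000 documents collected, 40,000 reviewed, and five reviewers each averaging 50 decisions per hour, the team clears 2,000 documents per day and needs roughly 30 more working days.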
Metrics
Useful metrics to track in the analysis of e-discovery projects include:
- Volume of data collected, processed and reviewed.
- Types of data collected (e.g. by file type)
- Number of documents processed and reviewed.
- Number of custodians.
- Number of documents per custodian.
- Number of files suppressed (e.g. by deduplication and/or criteria such as date range or keyword)
- Number of documents by classification (e.g. responsive, privileged, etc.).
- Document classification percentages by custodian.
- Number of documents reviewed and not reviewed.
- Attorney review rates (e.g. document decisions per hour).
Considerations
Like any business process, optimization of the e-discovery process and the resulting cost and risk reduction improves over time due to the team’s experience. Start by tracking each and every matter using a consistent and rich set of metrics. This enables your team to develop knowledge and experience that can be applied in future matters of any size or scope. Over time, you will be able to more accurately gauge projected costs and timelines for all of your matters that involve e-discovery and document review.
Identification: Inclusionary & Exclusionary Criteria Development
Inputs
There are many different factors that are taken into consideration when analyzing the data corpus to determine appropriate items to move forward into the processing environment. Several of these include: Data Sampling (Interval or Weighted); External Research; Case Materials (both client and also opponent/third party)
Roles
The primary roles involved with this process would be the Document Analyst, Senior Attorney, Discovery Lead Attorney and Project Manager although others may be involved as needed.
Tools & Technology
There are many different tools and technology available to assist in this process. This is largely a research process and ultimately the decisions made regarding the application of any of the criteria developed will be a legal decision made by Counsel.
Outputs & Desired Outcomes
The desired outcome for this level of analysis is the development of criteria for selecting the items most likely to yield responsive documents during review. A second outcome is the converse: criteria that may be used to exclude items that are not likely to be responsive.
Metrics
There are various metrics that are created and used during this analysis. The first would be the sampling results, including:
- The percentage of the collection
- Percentage per custodian
- Date or other criteria if using weighted sampling
- Total number of documents
- Volume of documents (MB or GB)
- Document identifiers
This information should be collected for each iteration of sampling so there will be a record of the items contained within each sample set.
Testing of the inclusionary/exclusionary criteria will yield metrics regarding the effectiveness of the search: the number and volume of the documents retrieved, the percentage of documents returned compared to the original sample taken, and the resulting responsiveness of the items.
There should also be analysis of items that are dropped from the resulting sets of various iterations of selection criteria to ensure that responsive items are all still being returned within the revised criteria set.
One of the most important areas of metrics when identifying selection criteria comes from sampling and reviewing documents that do not hit on any of the selection criteria. If responsive documents are identified by the case team, then the selection criteria must be revised to retrieve those items in the resulting sets.
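The check on documents that do not hit any selection criteria can be sketched as follows. The example draws a random sample from the non-hit documents and, once the case team has reviewed it, reports the fraction found to be responsive. Function names, the use of document identifiers, and the fixed random seed (kept so the sample can be reproduced and documented) are assumptions for illustration.

```python
import random


def sample_non_hits(all_ids, hit_ids, sample_size, seed=0):
    """Draw a random sample from documents not returned by the criteria."""
    non_hits = sorted(set(all_ids) - set(hit_ids))
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    return rng.sample(non_hits, min(sample_size, len(non_hits)))


def missed_rate(sample_ids, responsive_ids):
    """Fraction of the reviewed sample that turned out to be responsive."""
    if not sample_ids:
        return 0.0
    found = [doc_id for doc_id in sample_ids if doc_id in responsive_ids]
    return len(found) / len(sample_ids)
```

A rate above zero means the criteria are missing responsive material and should be revised, then re-tested on a fresh sample.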
Considerations
The large quantities of ESI (electronically stored information) must be reduced in size prior to attorney review to make economic sense within the electronic discovery process. Reviewing all of the ESI in order to respond to discovery requests is not feasible. The time and expense involved would far outweigh the value of the large majority of cases.
Analysis of the data corpus will allow the case team to make decisions that would enable them to target the documents most likely to be responsive to the legal issues of their case. A review of the case materials would suggest time frames, custodians, data sources and subject matter that can be used to target the selection of documents to be moved forward into review.
External research may also be performed relative to the subject matter of the case. This may identify additional selection criteria to be used. There may be publicly available materials available through the internet or through traditional research methods that may yield additional information to be used by the case team during their analysis and development of both inclusionary and exclusionary criteria.
Sampling of documents may be performed on an interval basis (every nth item), or samples may be weighted to include specific date ranges, custodians, locations, etc. By analyzing the document samples, the case team can identify additional date ranges, custodians, data sources, etc. to further the development of their criteria.
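The two sampling approaches can be sketched as below. Interval sampling takes every nth item. The weighted version shown here is only one possible interpretation: it draws a chosen share of the sample from documents that match a criterion (for example, a key custodian) and the rest from everything else. The function names, the `matches` callback and the default 80% share are hypothetical.

```python
import random


def interval_sample(doc_ids, n):
    """Interval sampling: every nth item, starting with the first."""
    return doc_ids[::n]


def weighted_sample(documents, matches, size, share=0.8, seed=0):
    """Draw `share` of the sample from documents where matches(doc) is true,
    and the remainder from the other documents."""
    rng = random.Random(seed)  # fixed seed so the sample set can be recorded
    favored = [d for d in documents if matches(d)]
    others = [d for d in documents if not matches(d)]
    n_favored = min(len(favored), round(size * share))
    n_others = min(len(others), size - n_favored)
    return rng.sample(favored, n_favored) + rng.sample(others, n_others)
```

Recording the seed and the resulting document identifiers for each iteration keeps a record of exactly which items were in each sample set.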
The above-referenced samples will then be used for testing the potential selection criteria to determine the effectiveness of the terms being considered or developed by the case team. The selection criteria would be run against a sample of items within the data corpus and the results are then reviewed to determine the effectiveness of the criteria. Comparison analysis of the criteria results between the various iterations of the criteria would be performed to ensure that no responsive items were inadvertently dropped during modifications made to the criteria. Lastly, samples of items that are not returned by the criteria would be reviewed and if responsive items are found within the sample, the criteria are modified to ensure that those items would be included in the resulting set of items.
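The comparison between iterations of the criteria amounts to a set difference. In this minimal sketch, the function name is hypothetical and the inputs are collections of document identifiers: the hits from the previous criteria, the hits from the revised criteria, and the documents already judged responsive.

```python
def dropped_responsive(previous_hits, revised_hits, known_responsive):
    """Return responsive documents returned by the previous criteria
    but no longer returned by the revised criteria."""
    dropped = set(previous_hits) - set(revised_hits)
    return sorted(dropped & set(known_responsive))
```

An empty result suggests the revision did not lose any known responsive items; a non-empty result lists the documents to examine before the revised criteria are adopted.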
Lastly, a very important consideration relative to this analysis is to determine what stage in the electronic discovery process this analysis will take place. It is very likely that it will take place during the Identification phase but can also occur during processing. Quite often decisions are made to preserve and collect larger volumes of data than are needed as a tactical decision so they do not need to re-collect if the needs of the project change. Rather they preserve and collect a larger volume and then perform the analysis discussed above in order to only process and ultimately review a much smaller set of data while still having the larger data set available if needed as case requirements and issues change. The analysis is a collaborative and iterative process and may be ongoing throughout the lifecycle of the project.
Preservation
Compliance Gap Analysis
Inputs
Information derived from interviews with Business Clients and IT Representatives.
Roles
The primary roles involved with this process would be the Document Analyst, Project Manager, Senior Attorney, Discovery Lead Attorney, Litigation Support Manager and the IT representatives.
Tools and Technology
With the volume of pending litigation ever increasing, there are many tools that have been developed that will assist with the development and tracking of the litigation hold. There are also many tools that are used during the initial analysis and identification of documents to be collected that will track the types and locations of data within a client organization. In addition, many tools today provide content and visual analytic capabilities that can help identify gaps through sampling and review of collected or preserved data sets. For example, social networking visualizations can quickly provide an overview of other custodians of interest based on interactions with key custodians in a given matter.
Outputs and Desired Outcomes
The desired outcome of this analysis is to make an assessment of the nature and volume of the electronically stored information (ESI) and to analyze the scope of the project and the preservation and collection efforts that will be required. Documentation may be created in the format of a data map, a checklist, custodian interview questionnaires or other formats as defined for the particular needs of the project. This will allow the case team to review the information and set appropriate expectations and budgetary requirements for the remainder of the project.
Metrics
Although specific metrics are not usually created during this process, the documentation that is created and/or collected will be invaluable in the preservation and collection processes.
Considerations
It is essential to determine the scope of your preservation and collection efforts and to analyze this information early in the process to ensure compliance with discovery rules and obligations. This should be an iterative process, since additional information may be compiled or the needs of the project may change throughout the project life cycle.
Identification of key custodians and sources of potentially responsive information will begin with the interview process. Preparation of a questionnaire and/or checklist that will be used throughout the interview process will be helpful in maintaining a consistent approach to obtaining the information necessary for the identification of potentially relevant data. There are many of these types of documents that are publicly available or may be developed by the case team to use during the interview and collection process.
During a custodian interview, the custodian will need to define their use of computers, the types of software used on a regular basis, the outputs of any of the software used, where they save documents locally and also any locations on the server(s) that they use for storage of ESI. They should also identify any devices that they use that may contain ESI – desktop computer, laptop computer, Blackberry or other handheld devices, CD, DVD, thumb drive, external hard drive, etc. The custodian interview should identify other potential custodians, people of interest, types of documents and specific content matter that may assist in the review of the content identified for collection.
The interview process with the IT representative(s) should focus on the structure of the organization and storage of ESI. During this interview process it is very helpful if a data map is prepared outlining all of the applications that generate ESI that needs to be collected and also storage locations of ESI and the backup methods involved.
The key areas of focus should be the applications used and locations of the following: email system, backup processes, server shares and mapped drives used by the user population, enterprise level applications, and any other systems that may contain ESI that is potentially relevant to the case.
The IT representatives have a wealth of information that should be used to map the areas where potentially relevant data may exist, and they may also be able to provide information that will allow for the exclusion of certain data sets during the collection process.
IT representatives who are responsible for specific applications should discuss the format in which data is held within those applications. They may also advise on the practicality of extracting useful information, the potential export formats available, and any other details that may assist in the collection process.
Once the interviews are completed for both the IT representatives and the Business Clients, the information should be compared and a complete collection protocol should be developed through analysis of that information. As the actual collection process commences, the protocol may need to be modified to account for additional types of ESI, custodians, or storage location details that emerge during that process.
Collection
Content Gap Analysis
Inputs
Information, statistics and samples derived from the collected data set.
Roles
The primary roles involved with this process would be the Document Analyst, Project Manager, Litigation Support Manager and the IT representatives.
Tools and Technology
There are many tools that assist in reviewing the collected data to determine whether there are any gaps within the content. A timeline view into the data set, for example, will allow the user to determine whether there are any gaps within the date ranges of the documents.
Another example would be tools that display the documents organized by custodian. The user may then analyze the data to confirm that all custodians expected to be collected and processed are accounted for within the data set.
Additionally, sampling may be used within the document corpus to determine whether gaps exist. A weighted sample can select documents based upon specific criteria. The sample may be weighted to a particular group of custodians, timeframe or other factor identified by the case team. There are a myriad of ways that the data set may be sampled or organized that will point to gaps within the document collection. This will allow the case team to analyze whether additional data needs to be collected or if the gaps are expected and explainable.
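A timeline check of the kind described above can be approximated by sorting document dates and looking for unusually long stretches with no documents. This is a minimal sketch; the function name and the seven-day threshold are assumptions, and an appropriate threshold depends on how frequently a custodian normally generates documents.

```python
from datetime import date


def find_date_gaps(doc_dates, max_gap_days=7):
    """Return (last date before gap, first date after gap) pairs where
    no documents exist for more than max_gap_days."""
    ordered = sorted(set(doc_dates))
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if (later - earlier).days > max_gap_days:
            gaps.append((earlier, later))
    return gaps
```

Running the check separately for each custodian tends to be more informative than running it over the whole collection, because one active custodian can mask a gap in another's data.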
Outputs and Desired Outcomes
With the large volumes of data involved, it becomes necessary to analyze the items collected to determine whether there are any gaps within the content that will be needed for processing and ultimately review and production.
There are many factors to consider when determining whether there are gaps within your content. The first consideration is normally to ensure that you have data for the custodians that you have identified. The next step would be to determine that all available data types have been collected (email, hard drive and server share are the most common) for each identified custodian.
Sampling and metadata may be used to determine gaps within date ranges, specific subject matter, and communications between parties. Review of the sampled documents will allow the case team to analyze and ensure they have the data needed to respond to their case needs.
Metrics
The results of sampling are one set of metrics that will be used in this analysis. There are also other statistics that can be generated from the document corpus that will also be used. Document counts and volumes for data by custodian, by file type, by date range and any other category that would be important to the matter should be analyzed to determine the completeness of the document collection.
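The counts described above can be produced with a few lines of code once basic metadata is available. The sketch below assumes each document is a dictionary with hypothetical "custodian" and "file_type" fields and compares the custodians found against the list the team expected to collect.

```python
from collections import Counter


def collection_summary(documents, expected_custodians):
    """Return document counts by custodian and by file type, plus any
    expected custodians with no documents in the collection."""
    by_custodian = Counter(doc["custodian"] for doc in documents)
    by_file_type = Counter(doc["file_type"] for doc in documents)
    missing = sorted(set(expected_custodians) - set(by_custodian))
    return by_custodian, by_file_type, missing
```

The same pattern extends to date ranges or any other category that matters to the case; a custodian in the `missing` list is either a collection gap or something that needs to be documented and explained.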
Considerations
There may be specific reasons for content gaps that are identified in the document corpus, but an analysis must be performed in order to document and explain the gaps to the opposing parties. The content gap analysis may identify additional collection that is required. It may also identify specific collection and/or processing issues.
Processing
Derived Metadata Extraction
Categorization
Automatic Tagging
Early Case Assessment
Sampling
Triage & Organizing for Review
Review
As document review has evolved from paper-based processes to the digital age, new capabilities have been introduced to help better manage document review projects.
Review & Assignment Management
Document review projects can range from very small internal investigations to large-scale reviews that occur during litigation or regulatory requests. Even the smallest review of a single custodian can require the reviewer to navigate thousands — or even tens of thousands — of individual documents. As matter size increases, multiple reviewers are often involved and the workload needs to be distributed to keep the review project on course and within necessary timelines.
Inputs
Key inputs to review and assignment management are an understanding of the documents to be reviewed (volume, type, etc.), the review team’s capabilities and relevant project constraints (e.g. budget or time).
Roles
The discovery lead attorney, project manager and/or litigation support manager are typically involved in the decision-making process for creating review assignments. In small investigative scenarios, the overall team may include ten or fewer attorneys. In large-scale scenarios, hundreds of contract attorneys may be involved in the document review. The discovery lead attorney typically sets the tone for how the review will be prioritized and assigned.
Tools & Technology
Most review platforms available through e-discovery solution providers or as in-house software applications provide workflow management tools that help distribute document sets across the review team. These tools help assign batches of documents to the reviewer(s) and track relevant work product (e.g. marks, tags, annotations, etc.) through the process.
Depending on the type of review, the team will want to ensure that the review platform’s functionality matches the needs of the review. For example, in investigative review scenarios the team will want to ensure that key documents can be shared across the team. These documents often can be used as exemplar or “seed” documents to help investigators identify similar documents during review.
Outputs & Desired Outcomes
The most common questions that need to be answered during review and assignment management include:
- How many documents will require review?
- How many documents have been reviewed?
- When do we need to complete this review?
- How many attorneys will I need to stay within our timeline?
- How many documents are we reviewing per day?
- How much are we spending on the review?
- How many documents require additional review or QC?
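Two of these questions (staffing and spend) reduce to simple arithmetic once an average review rate is known. The sketch below assumes a constant rate per reviewer and a flat hourly cost; the function names and the default of 8 review hours per day are hypothetical.

```python
import math


def reviewers_needed(remaining_docs, docs_per_hour, working_days_left,
                     hours_per_day=8):
    """Number of reviewers required to finish before the deadline."""
    # Documents one reviewer can complete in the time that is left.
    per_reviewer = docs_per_hour * hours_per_day * working_days_left
    return math.ceil(remaining_docs / per_reviewer)


def projected_cost(remaining_docs, docs_per_hour, hourly_rate):
    """Estimated cost of reviewing the remaining documents."""
    return remaining_docs / docs_per_hour * hourly_rate
```

For instance, 60,000 remaining documents at 50 decisions per hour with 10 working days left would call for 15 reviewers, and at an assumed rate of 60 per hour the remaining review would cost about 72,000 in the same currency.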
Metrics
Useful metrics to track in review and assignment management include:
- Number of documents reviewed and not reviewed.
- Attorney review rates (e.g. document decisions per hour).
- Document classification percentages (overall, by custodian and by file type).
Considerations
Prioritizing review sets is essential. Criteria such as date range or custodian can be useful in determining what documents should be reviewed first. By prioritizing the documents, teams can more quickly start refining the review strategy based on key relevant documents or other information (e.g. new custodians) that can impact the matter.
Bulk Tagging
Documents can be tagged in groups based on shared criteria and/or document similarity.
Inputs
The review criteria established for document classification are the primary input for bulk tagging of documents. Documents must be indexed for keyword searching, and access to document metadata may also be useful.
Roles
Discovery lead attorneys, document analysts and review attorneys are most frequently involved in bulk tagging analysis.
Tools & Technology
There are many content and visual analysis tools available today that can improve speed and accuracy with bulk tagging. Software that groups documents together based on content similarity can help present groups of similar documents for inspection. Some tools enable reviewers to quickly examine differences within a group of related documents. By doing so, the reviewer can quickly and accurately determine if bulk tagging is possible and/or applicable. For example, weekly reports distributed by email will likely group together based on the similarity of their content. The reviewer can quickly see the subtle differences in these documents and accurately classify the documents based on the review criteria. Additionally, many review platforms have built-in sampling capabilities to create quality control (QC) sets for review.
Outputs & Desired Outcomes
Bulk tagging helps accelerate first-pass review by more quickly classifying groups of similar documents. This saves valuable time and reduces unnecessary expenses associated with attorney review. When based solely on criteria (such as custodian, date and/or keywords), some bulk tagging may be automated.
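Criteria-based bulk tagging of the kind described above can be sketched as a simple filter. This is an illustrative example only: the `bulk_tag` function and the document fields (`id`, `custodian`, `date`, `text`) are hypothetical, and a real review platform would apply such criteria through its own search index rather than a loop over records.

```python
from datetime import date

def bulk_tag(docs, tag, custodians=None, start=None, end=None, keywords=()):
    """Apply `tag` to every document matching all supplied criteria.

    Criteria left as None or empty are ignored. Returns the ids of tagged documents.
    """
    tagged = []
    for doc in docs:
        if custodians and doc["custodian"] not in custodians:
            continue
        if start and doc["date"] < start:
            continue
        if end and doc["date"] > end:
            continue
        text = doc["text"].lower()
        # Simple substring match; a real index would handle stemming, phrases, etc.
        if keywords and not any(k.lower() in text for k in keywords):
            continue
        doc.setdefault("tags", set()).add(tag)
        tagged.append(doc["id"])
    return tagged
```

For example, calling `bulk_tag(docs, "not_responsive", custodians={"smith"}, start=date(2020, 1, 1), end=date(2020, 12, 31), keywords=["status report"])` would tag only documents from that custodian, within that year, that mention the phrase. Returning the list of tagged ids makes it straightforward to draw a QC sample from the bulk-tagged set afterwards.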
Metrics
As is done with any type of review, metrics that validate the accuracy and efficiency of attorney decision-making are useful in assessing bulk tagging.
Considerations
Work in close collaboration with the senior attorney and the lead discovery attorney in determining criteria that are applicable for bulk tagging.
Refinement of Case & Review Criteria
Analysis can be extremely useful early in the case to establish good review criteria for the document review team.
Inputs
Whether during an investigation or in response to a discovery request, refining case and review criteria requires a good understanding of the issues at hand in the matter and documents collected from key custodians.
Roles
The senior attorney and discovery lead attorney are most often involved in this analytic process.
Tools & Technology
Refining case and review criteria can be accomplished with many available tools that allow for processing and review of documents in native format. This process can be further enhanced with the use of content analytics and visualization. These tools enable more rapid understanding of what information is contained in the collected set of documents, including communications between key custodians and identifying groups of documents that share similar concepts. Using these tools, attorneys can also use example documents to identify similar documents in the collected set.
Outputs & Desired Outcomes
The key objective in this process is to establish useful review criteria for classifying the documents. By reviewing a small set of documents from key custodians and/or interesting documents, the senior attorney and discovery lead attorney can get a better understanding of how to instruct the review team. This enables a more efficient and accurate review.
Metrics
None.
Considerations
Although most review platforms provide no limits on the number of tags that can be applied to a document, consider using as few classifications (also known as tags) as possible. Unnecessary classifications can slow down first-pass review and result in lost time and unnecessary expense. Refining the review criteria early in the process helps the senior review team establish high-quality criteria and eliminate classifications that are not useful.
Quality Control
As even a small investigation may involve the initial review of thousands of documents, a good quality control process is critical for any review process.
Inputs
Reviewed documents and information about the review team.
Roles
The lead discovery attorney and project manager are most frequently involved in the quality control process.
Tools & Technology
Many review platforms provide out-of-the-box reporting tools that provide information such as:
- Number of documents reviewed (by type, classification, custodian, etc.).
- Attorney review rates (e.g. document decisions per hour).
Several available software packages group documents by similarity and can visually display how documents relate to one another. Using these tools, attorneys can quickly see how documents that have been classified a certain way (e.g. “responsive”) relate to other documents. For example, if eight of nine documents that share similar concepts are marked as responsive, the QC reviewer can quickly check the ninth document to understand why it was not marked accordingly.
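The "eight of nine" check described above can be approximated in a few lines once each document carries a cluster label. The sketch below assumes the similarity tool has already assigned a `cluster` value to each document; the field names and the 80% agreement threshold are illustrative assumptions.

```python
from collections import Counter, defaultdict

def flag_cluster_outliers(docs, min_agreement=0.8):
    """Return ids of documents tagged differently from a strong cluster majority.

    A cluster is only considered when its most common tag covers at least
    `min_agreement` of its members but not all of them.
    """
    clusters = defaultdict(list)
    for doc in docs:
        clusters[doc["cluster"]].append(doc)

    flagged = []
    for members in clusters.values():
        majority_tag, count = Counter(d["tag"] for d in members).most_common(1)[0]
        if count < len(members) and count / len(members) >= min_agreement:
            flagged.extend(d["id"] for d in members if d["tag"] != majority_tag)
    return flagged
```

A flagged document is not necessarily miscoded; the list is only a prioritized starting point for the QC reviewer.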
Outputs & Desired Outcomes
The goal of quality control is to ensure that documents are marked in accordance with the established review criteria.
Metrics
Track the document classifications that are changed during quality control, broken down by review attorney. This helps identify review attorneys who may not understand, or may not be following, the established review criteria.
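One way to express this metric is as a per-reviewer overturn rate: the share of QC-checked documents whose classification was changed. The sketch below is illustrative; the record fields (`reviewer`, `original_tag`, `qc_tag`) are a hypothetical QC log format.

```python
from collections import Counter

def overturn_rates(qc_results):
    """Return {reviewer: fraction of QC-checked documents whose tag was changed}."""
    checked = Counter()
    changed = Counter()
    for record in qc_results:
        checked[record["reviewer"]] += 1
        if record["original_tag"] != record["qc_tag"]:
            changed[record["reviewer"]] += 1
    return {reviewer: changed[reviewer] / checked[reviewer] for reviewer in checked}
```

Rates based on only a handful of checked documents are noisy, so it can help to report the number of documents checked alongside each rate.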
Considerations
Many review tools allow users to generate a sample set based on criteria such as the document classification, custodian, keywords, date range, type or even senders and/or recipients of email. See also Reviewer Analysis.
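Where a platform's built-in sampling is not available, a criteria-based QC sample can be sketched as a filter followed by a random draw. The `qc_sample` function and the document fields below are hypothetical and serve only to illustrate the idea.

```python
import random

def qc_sample(docs, size, seed=None, **criteria):
    """Draw a random QC sample from documents matching every key=value criterion.

    A fixed seed makes the sample reproducible, which helps when the
    sampling method has to be documented later.
    """
    pool = [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]
    rng = random.Random(seed)
    return rng.sample(pool, min(size, len(pool)))
```

For instance, `qc_sample(docs, 50, seed=1, tag="responsive", custodian="smith")` would return up to 50 responsive documents from one custodian.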
Reviewer Analysis
Tracking the review team’s progress and efficiency can help improve the overall accuracy of the review and reduce unnecessary expenses.
Inputs
Key information includes attorney review rates (e.g. document decisions per hour) and access to the documents reviewed.
Roles
The project manager and discovery lead attorney are most frequently involved in reviewer analysis.
Tools & Technology
Many review platforms provide tracking and reports of reviewer decisions and the decision-making rate.
Outputs & Desired Outcomes
The goal of reviewer analysis is to ensure that the reviewing attorneys are efficiently classifying documents in accordance with the established review criteria.
Metrics
Useful metrics to track in reviewer analysis include:
- Number of documents by classification (e.g. responsive, privileged, etc.).
- Number of documents reviewed and not reviewed.
- Attorney review rates (e.g. document decisions per hour).
Considerations
In most cases, there will be a bell curve of productivity based on the rates of the team. Start by analyzing the decisions made by review attorneys at both ends of the curve (e.g. those reviewing documents more slowly than the team average and those reviewing documents faster than the team average). See also Quality Control.
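Identifying the reviewers at both ends of the curve can be sketched with a z-score on review rates. This is a rough illustration under stated assumptions: the input is a hypothetical mapping of reviewer name to document decisions per hour, and the cutoff of one standard deviation is an arbitrary starting point rather than a recommended standard.

```python
from statistics import mean, stdev

def reviewers_to_examine(rates, z_cutoff=1.0):
    """Split out reviewers whose rate is far below or far above the team average.

    `rates` maps reviewer name to decisions per hour; at least two reviewers
    are required to compute a standard deviation.
    """
    values = list(rates.values())
    avg = mean(values)
    spread = stdev(values)
    if spread == 0:
        return {"slow": [], "fast": []}
    slow = sorted(name for name, r in rates.items() if (r - avg) / spread <= -z_cutoff)
    fast = sorted(name for name, r in rates.items() if (r - avg) / spread >= z_cutoff)
    return {"slow": slow, "fast": fast}
```

An unusually fast rate is not proof of careless review, and a slow rate is not proof of poor performance; both groups simply mark where a closer look at the underlying decisions is most likely to be informative.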

