Collecting Personal Data for E-Discovery

A huge component of e-discovery relates to electronic files that are created and stored every day by employees: e-mails, Word documents, spreadsheets, presentations and more. Oftentimes, it is inadvertent spoliation or omission of such files in discovery that results in undesired sanctions and even default judgments. Thus, developing sound methodologies for identifying, preserving, and collecting files from personal data repositories is a key component of being litigation ready.

Know what data exists and where
The first consideration in improving an organization’s litigation readiness is to identify where and how personal data is being created and stored. What applications are used to create messages and/or documents throughout the organization? Are application programs centrally managed to limit the types or versions being used?

Once electronic information is created, where are files being stored? Are they on desktop and laptop computer hard drives, mapped network share drives, portable flash drives or other removable media? Are document management systems, such as SharePoint sites or other collaboration repositories, utilized? Are portable computing devices being used, such as PDAs, BlackBerries or smart phones? Do employees use computers at home for business?
For these systems, it’s important to ask the following questions:

  • Are personal files backed up for disaster recovery?
  • Where are backup files stored?
  • Are entire hard drives backed up or only those files stored in “My Documents”?
  • Are the back-up processes automated and scheduled or do they rely on an employee’s action?

If using enterprise-wide content management systems:

  • Are there guidelines on how to move or restore files in and out of the records management repository?
  • Does a records retention plan exist that requires the preservation of specific file types?
  • Are e-mail mailboxes governed in terms of timeframes, file size, or overall mailbox capacity?
  • Can e-mails be archived to personal data stores?
  • How does the organization manage files from departing employees?
  • Are e-mail mailboxes routinely archived and files from computer hard drives preserved before redeploying computer hardware or disposing of stored network files?

Determine an approach to file collection
Once an employee’s personal data repositories have been identified as potentially relevant to a particular matter, there are a variety of methods used to preserve or copy source files for electronic discovery. Typical collection methodologies range from user discretion, where the employee chooses which files are appropriate, to full forensic imaging, which uses investigative software to preserve an entire hard drive. Different methodologies have differing cost and risk impacts and, therefore, vary in their applicability.

1. User Discretion
The simplest and most straightforward approach to file collection is to rely on the end-user to provide relevant files and e-mail messages. Typically, employees are the most knowledgeable as to where their personal data is at least logically stored and how it is organized. They also know if files are password-protected or encrypted.

However, relying on witnesses to identify relevant evidence increases the risk of incomplete preservation or spoliation of crucial metadata, whether inadvertent or willful. Employees may not be aware of every location to which a file has been copied, such as backup locations or temporary folders.

Such user discretion may be appropriate if the exposure of the matter is low, chain of custody is not critical, or metadata is deemed irrelevant. However, user discretion relies on the integrity and thoroughness of the employee to assure all relevant files are indeed preserved and collected. Metadata is also rarely preserved, since even opening a file to check its relevance can update the “last modified date” and invite a claim of spoliation. Relevant dates may be retained if files are moved as part of zip files or as entire folders, but care must be taken both in copying and in saving files to a target repository. In addition, context can be lost when files are no longer stored as they were originally created and stored during the normal course of business.
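One way to reduce the metadata risk described above is to record file system dates before anyone opens a file. The following is a minimal Python sketch of that idea, not a procedure taken from this article: the function name inventory_metadata and the record fields are hypothetical, and it captures only size and last modified date as reported by the operating system.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def inventory_metadata(root):
    """Record size and last-modified date for every file under root.

    Only file system metadata is read (via stat); file contents are never
    opened, so reviewing the inventory does not touch the files themselves.
    """
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = Path(dirpath) / name
            info = path.stat()
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            records.append({
                "path": str(path),
                "size_bytes": info.st_size,
                "modified_utc": modified.isoformat(),
            })
    return records
```

An inventory like this could be saved alongside the collected files so that original dates can still be shown if a later copy or review step alters them.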

2. Forensic Imaging

We oftentimes hear the word “forensics” when it comes to collecting potentially relevant evidence for electronic discovery. Creating a forensic image of a hard drive, for example, typically refers to creating a bit-by-bit replication of a hard drive, capturing not only active files but also deleted files, swap space, and unallocated and slack space. This approach may be crucial when facts lie beyond the active files, such as what files might have been in place before an employee had the opportunity to start deleting incriminating evidence.

Tools capable of making such images are referred to as investigation tools. These tools are used by trained experts for recovering critical evidence, such as file fragments that might otherwise be lost. Examples of tools used for forensic imaging include Guidance Software’s EnCase and AccessData’s Forensic Toolkit (FTK).

Because a forensic image preserves every bit of the hard drive, such tools in the hands of experts offer the strongest legal defensibility in litigation. However, this approach can also be far more time-consuming and costly. To begin with, a forensic image requires a target with the same capacity as the original drive: a 200GB source drive requires a 200GB target, even if only 25% is used for active files. Software for making forensic images is also typically costly. The time required to create an image, later restore the image to a target drive, and then extract relevant files or analyze fragments can be extensive as well.
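The storage comparison above can be worked through in a few lines of Python. This is only an illustration of the arithmetic in the preceding paragraph; the function names are hypothetical, and the figures are the 200GB drive and 25% usage from the example.

```python
def image_target_gb(drive_capacity_gb):
    """A bit-by-bit image needs a target as large as the entire source drive."""
    return drive_capacity_gb


def active_files_gb(drive_capacity_gb, fraction_used):
    """Copying only active files needs space for the used portion of the drive."""
    return drive_capacity_gb * fraction_used


capacity_gb = 200      # source drive size from the example above
fraction_used = 0.25   # share of the drive holding active files

print(image_target_gb(capacity_gb))                 # 200
print(active_files_gb(capacity_gb, fraction_used))  # 50.0
```

In this example the forensic image needs four times the storage that the active files alone would occupy.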

An added risk of creating a forensic image is related to the fact that this approach preserves all the bits on a hard drive. Once imaged, everything on that drive may as a result become discoverable, including long-since deleted files and file fragments, even if such fragments would otherwise have been outside the scope of a discovery request.

3. Active File Collection
Conversely, files can be copied from one storage device to another in a forensically-sound manner that preserves both the content of the file and its associated operating system metadata, such as create date, last modified date and folder path. When the process assures both legal defensibility and data authenticity, the methodology is considered “forensically sound.”
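As a rough illustration of copying a file while keeping its content and dates intact, here is a minimal Python sketch. It is an assumption-laden example rather than a recommended collection tool: the function names sha256_of and collect_file are hypothetical, the SHA-256 comparison is one common way to show that content did not change (it is not something this article prescribes), and shutil.copy2 preserves the last modified date but not the creation date or the original folder path on every platform.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks; reading a file does not change its modified date."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_file(source, target_dir):
    """Copy one file with its timestamps and confirm the copy is identical."""
    source = Path(source)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name

    source_hash = sha256_of(source)
    shutil.copy2(source, target)  # copy2 copies content plus timestamps where supported
    if sha256_of(target) != source_hash:
        raise OSError(f"Copy of {source} does not match the original")

    return {
        "source": str(source),  # keep the original path, since context matters
        "target": str(target),
        "sha256": source_hash,
        "modified_epoch": source.stat().st_mtime,
    }

# Example call (hypothetical paths):
# record = collect_file(r"C:\Users\jdoe\Documents\memo.docx", r"E:\collection\jdoe")
```

Recording the source path and hash for each file gives a simple log that can support both authenticity and chain of custody.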

There are a number of methodologies that can be utilized to preserve both content and metadata in a legally defensible manner. Entire folders can be copied without changing underlying metadata. In other cases, the metadata of certain container files, such as a .pst file for Outlook or a zipped file, may not be important, since the vital metadata is embedded within the file and able to be extracted during processing. The recommended approach for active file collection is to use technology that is designed to preserve metadata when copying files from one source to another. Microsoft’s Robocopy is an example of a free utility that can be used for such a collection.

When using tools like Robocopy, files are copied from a source repository, such as a PC’s hard drive, to a target repository, such as a removable USB hard drive, while maintaining important metadata. The cost can be substantially less than with forensic imaging, and the collection can be performed by a company’s trained IT professionals as opposed to third-party computer forensic experts.
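A Robocopy collection of this kind might be scripted as follows. This Python sketch only assembles and runs a Robocopy command on a Windows machine; the function names, paths and choice of switches are illustrative assumptions, not settings given in this article, and the switches should be checked against the Robocopy version in use.

```python
import subprocess


def build_robocopy_command(source, target, log_file):
    """Assemble a Robocopy command that copies a folder tree with its metadata."""
    return [
        "robocopy", source, target,
        "/E",          # include subfolders, even empty ones
        "/COPY:DAT",   # copy file data, attributes and timestamps
        "/DCOPY:T",    # keep folder timestamps as well
        "/R:1",        # retry a failed file once
        "/W:1",        # wait one second between retries
        "/NP",         # leave progress percentages out of the log
        f"/LOG:{log_file}",  # write a log of what was copied
    ]


def run_collection(source, target, log_file):
    """Run the copy; Robocopy exit codes of 8 or higher mean some copies failed."""
    result = subprocess.run(build_robocopy_command(source, target, log_file))
    return result.returncode < 8

# Example call (hypothetical paths):
# ok = run_collection(r"C:\Users\jdoe\Documents", r"E:\collection\jdoe", r"E:\collection\jdoe.log")
```

Keeping the log file with the collected data documents which files were copied and when.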

Following is a summary of e-discovery collection processes and the costs, time and risks associated with each:

| Collection Process | User Discretion | Active File | Forensic Image |
| | Low risk/exposure matter; non-key players | Most common approach | Malfeasance, HR matters |
| | User selects specific files | Copy all active files (by drive or path) | Create a forensic image of entire drive |
| Typical Cost | | | |
| Typical Risk | Inconsistent, not defensible, loss of metadata | Business interruption | Time and expertise required; discoverability |
| Typical Volume | | | |

Figure 1: Personal Computer Hard Drive Collection Methodologies ©Fios, Inc.

By Brad Harris

