Legally Defensible Data Remediation

A document retention policy is in reality a document destruction policy.  Therefore, a key reason for an organization to adopt a document retention policy is to establish a program for the deletion/destruction of information that is not required for business, regulatory and other needs.  This need is driven by the fact that digital information is growing at an unprecedented rate and that much of it is contained in “unstructured” storage such as email, SharePoint and shared network drives.  Data hoarding not only increases direct information technology costs but also presents other substantial risks and costs to an organization, ranging from the discovery of “smoking gun” documents during an investigation, litigation or audit to reputational damage from information security breaches (hacking).

Document retention/destruction policies have long been recognized as a good business practice.  Inherent in the practice is the notion that information has a life cycle and that there are valid reasons to protect that information from competitors, thieves, snoops and even government investigators.  In the context of an appeal of an obstruction of justice conviction against Arthur Andersen LLP, this practice was blessed by the U.S. Supreme Court.  Chief Justice William Rehnquist delivered the opinion of the Court:

‘Document Retention Policies,’ which are created in part to keep certain information from getting into the hands of others, including the Government, are common in business.  It is, of course, not wrongful for a manager to instruct his employees to comply with a valid document retention policy under ordinary circumstances.

Arthur Andersen LLP v. U.S., 544 U.S. 696, 704 (2005)


Data remediation is but one component of an evolving Information Governance program. A plan for data remediation should start with the recognition that it is not simply a massive digital “shredding party” directing all employees to delete any email older than 30 days, 90 days, 1 year, etc.  Data remediation efforts can be selective, implemented in phases and preferably combined with a “day forward” plan to prevent data hoarding from recurring.  Importantly, data remediation efforts must avoid destruction of documents and records required to be maintained by statutes, regulations, contracts, the all-important “legal hold,” and other requirements.

Retention Policy is Key

In conjunction with other Information Governance efforts, the first major step in a data remediation initiative is creation of a document retention policy (aka records retention policy) and records retention schedules.  This step is fundamental in determining the legal, regulatory and business requirements of any of the data in the particular repository targeted for data remediation. 

Graphic showing retention policy structure - statement-procedures-schedule-departmental protocols

A document retention policy is a top-level directive emanating from the Board and/or the C-Level executive team, instructing all managers and employees to comply with the retention and destruction requirements of the policy and the accompanying retention schedules, as well as any particular procedures that are developed to implement the policy.  A legally defensible document retention policy must be created in good faith, related to business needs, reasonable, and consistently followed.  It must contain a clear statement that “legal hold” directives supersede all regular or automated destruction of records and documents.

While there may be similarities within industry sectors, one size does not fit all.  The retention policy should reflect the way the organization actually does business and be designed to enhance the value and reduce the risk of information assets.

A records retention schedule is a tabular summary which classifies all organizational records by business use and regulatory requirements for retention.  It is commonly organized by department (e.g., Legal, HR, Accounting & Finance, etc.). Some retention schedules contain upwards of 250 different classifications.  The modern trend is to group classifications together to create what records & information management (RIM) professionals call “big buckets.”  A legally defensible records retention schedule should include the particular legal citations relied upon to set the minimum period of retention for that classification.  The legal and regulatory requirements generally set the “minimum” retention period.  Historical, business and risk-management considerations can lengthen the retention periods.  The main exception is Personal Information (PI or PII) and other data regulated by privacy, identity theft or security concerns, where hoarding of unnecessary data is deemed so dangerous that it is expressly prohibited.
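
For teams that want to see how a schedule entry might be represented in software, the following is a minimal sketch in Python, not a prescribed format.  The class name, field names, classification code and seven-year period are all hypothetical placeholders; the actual periods and citations must come from counsel and the RIM team.  The sketch also assumes the retention period runs from a single record date, which will not hold for event-based retention rules.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RetentionClass:
    """One row of a hypothetical records retention schedule."""
    code: str             # illustrative classification code
    department: str
    description: str
    retention_years: int  # legal minimum, possibly lengthened for business reasons
    legal_citation: str   # placeholder; a real schedule cites the authority relied upon


def earliest_destruction_date(
    record_date: date, rule: RetentionClass, on_legal_hold: bool
) -> Optional[date]:
    """Return the first date a record may be destroyed, or None while it is on hold."""
    if on_legal_hold:
        # Legal hold supersedes all regular or automated destruction.
        return None
    target_year = record_date.year + rule.retention_years
    try:
        return record_date.replace(year=target_year)
    except ValueError:
        # Feb 29 falling in a non-leap target year: roll forward to Mar 1.
        return date(target_year, 3, 1)


# Illustrative entry only; the period and citation are placeholders, not legal advice.
example = RetentionClass(
    code="FIN-010",
    department="Accounting & Finance",
    description="Accounts payable invoices",
    retention_years=7,
    legal_citation="(citation to be supplied by counsel)",
)

print(earliest_destruction_date(date(2015, 6, 30), example, on_legal_hold=False))
print(earliest_destruction_date(date(2015, 6, 30), example, on_legal_hold=True))
```

Keeping the citation as a field on every classification mirrors the point above: each retention period can be traced back to the authority that justifies it.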

Retention Policy Team

Graphic depicting makeup of retention policy team: Legal, IT, Privacy/Security, and RIM (Records and Information Management)

Creating the retention policy and records retention schedules is a team effort.  Each team will be composed differently depending on the type of organization.  At a minimum the team should include people with expertise in information technology, legal requirements, records & information management (RIM), and privacy/security.  Where the retention policy is being developed with respect to a particular data remediation initiative, management from the business unit owning the data should be included.  Obviously, senior management must be involved and kept in the loop in order to demonstrate that there is an enforceable mandate for change.  Information Governance teams may already be partially constituted.

Both to fill expertise gaps and to provide unifying perspective and experience in this cross-disciplinary effort, the internal team must often be advised by appropriate outside consultants and counsel.  The varied disciplines necessary to successfully implement an Information Governance initiative each have their own technical language, developed in silos.  At the very least, the team needs a “big picture” person who can help ensure that everyone is on the same page.

Understanding Data Systems Architecture

“Data systems architecture” is a term that has found usage in the legal realm following the 2006 amendments to the Federal Rules of Civil Procedure governing eDiscovery and the famous series of Zubulake cases in Federal Court.  Because nearly all the evidence is now “inside the computer,” lawyers are exhorted to understand their clients’ data systems architecture.

Outside the litigation context, it is fundamental that the document retention policy team must know what exists before it can intelligently address any electronic retention policy. In order to create an overall retention policy with retention schedules or to select a data repository for remediation, the team must survey and document the types, usage, duplication, ownership, security, and sensitivity of the organization’s data.  IT can often create detailed maps of its servers and produce reports regarding storage and usage. 

However, much more is required.  Especially where the data repository is “unstructured,” as in email, SharePoint and shared network drives, the task of identifying needed vs. unneeded data is more complex.  At a minimum, the departmental owners and users of the data need to be consulted as to the value of the records and documents contained in each targeted data repository.  IT can assist greatly by identifying apparently useless files: duplicates, files that have not been accessed for X years, files belonging to former employees, etc.  Working together with data custodians, the team can develop and document business-oriented strategies for data remediation.
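
As an illustration of the kind of report IT might produce, here is a minimal Python sketch that lists files on a shared drive that have not been used for a given number of years.  The function name and the approach are assumptions made for this example, not a description of any particular product.  Because last-access timestamps are often disabled or unreliable on file servers, the sketch uses the later of the access and modification times.  It only reports candidates; it deletes nothing, and any candidate list would still need review by data custodians and against legal holds.

```python
import os
import time
from pathlib import Path
from typing import Iterator, Optional, Tuple

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def find_stale_files(
    root: str, years: float, now: Optional[float] = None
) -> Iterator[Tuple[Path, float]]:
    """Yield (path, last_used_timestamp) for files unused for more than `years`."""
    now = time.time() if now is None else now
    cutoff = now - years * SECONDS_PER_YEAR
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                # Unreadable or vanished file: skip it rather than abort the survey.
                continue
            # Access times may be switched off, so fall back on modification time.
            last_used = max(info.st_atime, info.st_mtime)
            if last_used < cutoff:
                yield path, last_used
```

A call such as find_stale_files("/path/to/share", years=5), where the path and threshold are placeholders, would produce a candidate list for the business unit to review.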

Legal Hold

No discussion of data remediation or document retention/destruction policies is complete without a serious warning that the legal hold doctrine requires a suspension of normal operation of the policy, for all records and documents relevant to the particular audit, litigation, arbitration or government investigation.  Much has been written on this subject including recent articles by this author. 

The team must work with in-house and outside counsel to determine (and narrow where appropriate) the scope of all legal hold directives, to ensure that evidence is preserved.  Destruction of evidence (aka spoliation) presents a serious source of risk to organizations.  It can result in sanctions in civil court and arbitration. It can result in criminal charges of obstruction of justice as in the Arthur Andersen case.  It can result in grave reputational damage.


Always subject to legal hold directives, in its simplest form, remediation can involve basic deduplication.  Employees may be tasked with identifying records and documents that are important to their jobs and moving them to a new repository.  IT policies can be changed and enforced to prevent employees from making “just in case” backups of data to USB drives, personal cloud accounts, home computers, smartphones and PST archive files.  Existing enterprise software may have features that have not been fully implemented and may provide indexing and automated retention/destruction options.  A new formalized document retention policy may provide that all drafts of certain records can be destroyed earlier than the final record.  Of course, new and improved software tools such as archiving and ERM systems can be implemented.  Data analytics and technology assisted review (aka TAR), including artificial intelligence concepts, can be utilized to separate the wheat from the chaff.  It is entirely reasonable to develop a system for statistical analysis and selection to cull down vast collections of unstructured data.
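
Basic deduplication of the kind mentioned above can be sketched in a few lines of Python.  This is one common approach (grouping files by size, then by a SHA-256 hash of their contents), offered as an assumption-laden illustration rather than a recommended tool; the function names are hypothetical.  The sketch only identifies groups of byte-identical files.  Deciding which copy is the record of authority, and whether any copy is subject to legal hold, remains a human decision.

```python
import hashlib
import os
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_groups(root: str) -> Dict[str, List[Path]]:
    """Map content hash -> paths, keeping only hashes shared by two or more files."""
    by_size: Dict[int, List[Path]] = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                by_size[path.stat().st_size].append(path)
            except OSError:
                continue  # unreadable file: leave it out of the report

    by_hash: Dict[str, List[Path]] = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a byte-identical twin
        for path in paths:
            try:
                by_hash[sha256_of(path)].append(path)
            except OSError:
                continue

    return {h: sorted(p) for h, p in by_hash.items() if len(p) > 1}
```

Note that identical content does not always mean identical significance: the same attachment saved by two custodians may matter for different reasons, which is why the output is a report and not a deletion list.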

As part of a longer-term remediation strategy, “day forward” retention policies can be rolled out to restrict future hoarding.  Change management and employee training are essential as well.  The range of technical, legal and policy options that are legally defensible is quite broad.  Chief Justice Rehnquist recognized that under ordinary circumstances, organizations are entitled to broad freedom to manage their information assets to maximize value and to minimize costs and risks. That is the law of the land.

Documentation, Testing and Analysis

A final component in implementing a legally defensible data remediation initiative is to make sure that the organization can document and prove that the policy and remediation actions were taken in good faith, related to business needs, reasonable, and consistently followed.  In addition, where technical and statistical methods are utilized to select records and documents for deletion, that process should be scientific, repeatable and tested. For some organizations, it may be advisable as a matter of legal defense strategy to utilize an independent expert statistician to test and validate internal assumptions about the adequacy of selection for remediation.  Anything that can demonstrate good faith and reasonableness is likely to carry weight in the event a record or document covered by a legal hold or a legal retention requirement is inadvertently destroyed.
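
One way to make such testing repeatable is acceptance sampling: draw a random sample from the files selected for deletion, have reviewers check each one, and proceed only if none should have been retained.  The Python sketch below is an illustration under stated assumptions, not a substitute for an expert statistician.  It assumes a simple random sample, a candidate population much larger than the sample, and a "zero errors found" acceptance rule; the function names and the 1% / 95% figures are illustrative choices.  Under those assumptions, finding no errors in n sampled items supports, at the chosen confidence level, the claim that the true error rate is below the stated maximum.

```python
import math
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def zero_defect_sample_size(max_error_rate: float, confidence: float) -> int:
    """Smallest n such that zero errors in n random draws supports
    'true error rate < max_error_rate' at the given confidence level.

    Derived from (1 - max_error_rate) ** n <= 1 - confidence.
    """
    if not (0 < max_error_rate < 1 and 0 < confidence < 1):
        raise ValueError("rates must be strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_error_rate))


def draw_review_sample(candidates: Iterable[T], sample_size: int, seed: int) -> List[T]:
    """Draw a reproducible simple random sample for human review.

    Recording the seed lets the same sample be regenerated later if the
    selection process is ever challenged.
    """
    items = list(candidates)
    rng = random.Random(seed)
    return rng.sample(items, min(sample_size, len(items)))


n = zero_defect_sample_size(max_error_rate=0.01, confidence=0.95)
print(n)  # 299
```

If reviewers do find a record in the sample that should have been retained, the selection criteria should be revisited and the test rerun, and both the failed and the passing runs should be documented.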


Clearly, as organizations try to contain an unprecedented data explosion, the hoarding problem with legacy and other unstructured data is not going to go away.  Until some yet un-invented magical AI robot can clean our digital cubicles, a continuing strategy of delaying data remediation presents significant cost and risk. 

In 1772 Voltaire wrote: “Le mieux est l’ennemi du bien,” a line he attributed to an Italian sage and which is roughly translated as “the perfect is the enemy of the good.”  In conjunction with other Information Governance initiatives, such as the updating of a document retention policy, data remediation efforts can be selective, implemented in phases and accomplished in a team effort.