Among the many types of challenges presented by the adoption of cloud computing are those involving computer forensics. Computer forensics can be thought of as the set of tools and techniques that make eDiscovery possible and reliable. Wikipedia defines it as “a branch of digital forensic science pertaining to legal evidence found in computers and digital storage media.” The National Institute of Standards and Technology (NIST) Information Technology Laboratory (ITL) defines cloud computing forensic science more specifically as:
the application of scientific principles, technological practices and derived and proven methods to reconstruct past cloud computing events through identification, collection, preservation, examination, interpretation and reporting of digital evidence.
As with other legal evidence, digital evidence is subject to challenge in court. It has to be what it purports to be. Therefore, accurately establishing the creator, custodian, chain of custody, authenticity and other attributes of digital evidence is essential in any eDiscovery setting. Essentially, a computer forensic investigation must locate and identify “documents” and other information that can be traced to the actions, knowledge and information available to parties and other witnesses involved in a lawsuit, arbitration or investigation.
While a number of technical tools and techniques (e.g., Guidance Software EnCase and Symantec Clearwell) have been developed to secure forensically sound images of data stored in workstation computers, servers, in-house data centers and mobile devices, the rapidly developing and widely varying makeup of cloud computing architecture has presented or exacerbated numerous challenges to computer forensics.
In conjunction with its work to define and apply some standards for understanding cloud computing issues (See my Cloudy Laws – Part I article), NIST formed a working group to research and evaluate the special challenges facing cloud computing forensics. The NIST Cloud Computing Forensic Science Working Group (NCC FSWG) created a draft Report dated July 2014, which identified and categorized 65 current challenges (hereinafter “Challenges”). The Report stresses the fact that the Challenges are multidisciplinary and that solutions require the interaction of experts in several fields. It states:
Cloud forensics challenges cannot be solved by technology, law, or organizational principles alone. Many of the challenges need solutions in all three areas. Technical, legal and organizational scholars and practitioners have begun to discuss these challenges. This report focuses more on the technical challenges, which need to be understood in order to develop technology- and standards-based mitigation approaches.
The Report goes on to warn that the interests and expectations of the technical, legal, and organizational stakeholders must be properly allocated and documented in contracts in order to avoid “misunderstandings” in the event cloud computing evidence is required in any kind of litigation, arbitration, government investigation, criminal probe, homeland security investigation or otherwise. It states:
There are many stakeholders involved in cloud forensics activities, including members of government, industry, and academia. One of the biggest challenges in cloud computing is understanding who holds the responsibilities for the various tasks involved in managing the cloud. All responsibilities should be clear at the time of contract signing. Forensics is an area that is particularly prone to misunderstandings since it is often not until a forensic investigation is under way that stakeholders start making assertions about ownership and responsibilities.
The Report, NIST Cloud Computing Forensic Science Challenges (Draft NISTIR 8006), contains a densely packed 15-page table categorizing and describing the Challenges. The Report also includes the following “mind map” summarizing the findings of the NCC FSWG.
The forensics Challenges are mainly technical, but as in other situations involving the closely intertwined fields of eDiscovery, computer forensics and document retention and destruction policies, the components must all work together as a system for managing information assets. All the processes in the system must be coordinated to produce a legally defensible system, if and when it is ever tested, challenged or scrutinized by outside forces.
The Report categorizes the 65 Challenges into nine major groups (represented in the above chart in red). Some of the Challenges reside in more than one category. A quick review of the major categories brings to mind many other questions about the need for coordination among stakeholders. The nine categories are: Architecture, Data Collection, Analysis, Anti-forensics, Incident First Responders, Role Management, Legal, Standards, and Training. The Report explains:
- Architecture (e.g., diversity, complexity, provenance, multi-tenancy, data segregation, etc.) – Architecture challenges in cloud forensics include dealing with variability in cloud architectures between providers; tenant data compartmentalization and isolation during resource provisioning; proliferation of systems, locations and endpoints that can store data; accurate and secure provenance for maintaining and preserving chain of custody; infrastructure to support seizure of cloud resources without disrupting other tenants; etc.
- Data collection (e.g., data integrity, data recovery, data location, imaging, etc.) — Data collection challenges in cloud forensics include locating forensic artifacts in large, distributed and dynamic systems; locating and collecting volatile data; data collection from virtual machines; data integrity in a multi-tenant environment where data is shared among multiple computers in multiple locations and accessible by multiple parties; inability to image all the forensic artifacts in the cloud; accessing the data of one tenant without breaching the confidentiality of other tenants; recovery of deleted data in a shared and distributed virtual environment; etc.
- Analysis (e.g., correlation, reconstruction, time synchronization, logs, metadata, timelines, etc.) — Analysis challenges in cloud forensics include correlation of forensic artifacts across and within cloud providers; reconstruction of events from virtual images or storage; integrity of metadata; timeline analysis of log data including synchronization of timestamps; etc.
- Anti-forensics (e.g., obfuscation, data hiding, malware, etc.) — Anti-forensics are a set of techniques used specifically to prevent or mislead forensic analysis. Challenges in cloud forensics include the use of obfuscation, malware, data hiding, or other techniques to compromise the integrity of evidence; malware may circumvent virtual machine isolation methods; etc.
- Incident first responders (e.g., trustworthiness of cloud providers, response time, reconstruction, etc.) — Incident first responder challenges in cloud forensics include confidence, competence, and trustworthiness of the cloud providers to act as first-responders and perform data collection; difficulty in performing initial triage; processing a large volume of forensic artifacts collected; etc.
- Role management (e.g., data owners, identity management, users, access control, etc.) — Role management challenges in cloud forensics include uniquely identifying the owner of an account; decoupling between cloud user credentials and physical users; ease of anonymity and creating fictitious identities online; determining exact ownership of data; authentication and access control; etc.
- Legal (e.g., jurisdictions, laws, service level agreements, contracts, subpoenas, international cooperation, privacy, ethics, etc.) — Legal challenges in cloud forensics include identifying and addressing issues of jurisdictions for legal access to data; lack of effective channels for international communication and cooperation during an investigation; data acquisition that relies on the cooperation of cloud providers, as well as their competence and trustworthiness; missing terms in contracts and service level agreements; issuing subpoenas without knowledge of the physical location of data; seizure and confiscation of cloud resources may interrupt business continuity of other tenants; etc.
- Standards (e.g., standard operating procedures, interoperability, testing, validation, etc.) — Standards challenges in cloud forensics include lack of even minimum/basic SOPs, practices, and tools; lack of interoperability among cloud providers; lack of test and validation procedures; etc.
- Training (e.g., forensic investigators, cloud providers, qualification, certification, etc.) — Training challenges in cloud forensics include misuse of digital forensic training materials that are not applicable to cloud forensics; lack of cloud forensic training and expertise for both investigators and instructors; limited knowledge by record-keeping personnel in cloud providers about evidence; etc.
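The time synchronization problem listed under Analysis can be made concrete with a small example. The following is a minimal Python sketch, not a method taken from the Report: it assumes each provider's log timestamps are ISO 8601 strings that carry a UTC offset, and that an examiner has separately measured how far each provider's clock runs ahead of a reference clock. The function names, the tuple layout and the skew values are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_utc(timestamp: str, clock_skew_seconds: float = 0.0) -> datetime:
    """Parse an ISO 8601 timestamp with an offset and return corrected UTC time.

    clock_skew_seconds is an assumed, separately measured value: how many
    seconds the source clock runs ahead of the reference clock.
    """
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # A timestamp without an offset cannot be placed on a shared timeline.
        raise ValueError(f"timestamp has no UTC offset: {timestamp!r}")
    return parsed.astimezone(timezone.utc) - timedelta(seconds=clock_skew_seconds)


def build_timeline(sources):
    """Merge (source_name, skew_seconds, entries) tuples into one UTC-ordered list.

    Each entry is a (timestamp_string, message) pair from that source's log.
    """
    timeline = []
    for name, skew, entries in sources:
        for timestamp, message in entries:
            timeline.append((to_utc(timestamp, skew), name, message))
    # Sort only on the corrected time so events from all sources interleave.
    return sorted(timeline, key=lambda event: event[0])
```

In this sketch, a log entry that appears later by local wall-clock time can sort earlier once its offset and measured skew are applied, which is why the order of events across providers cannot be read directly from raw logs.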
The need for cloud customers to have their data forensically searched should be addressed early, during vendor selection and contract negotiations, not after an incident (a lawsuit, discrimination claim, employee termination or subpoena) that requires a forensic search. eDiscovery standards and best practices can help stakeholders communicate without having to reinvent the wheel. The Electronic Discovery Reference Model (EDRM.net) provides a widely known standard that delineates the processes required to boil down a forensic search into relevant evidence. Typically, the Identification, Preservation and Collection stages will create a forensic copy (bit-by-bit, verified by hash values). That copy can then be searched for relevant information without altering the original digital evidence.
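To illustrate what “verified by hash values” means in practice, here is a minimal Python sketch. It is an illustration only: it copies an ordinary file rather than imaging a disk, the function names are hypothetical, and the choice of SHA-256 is an assumption rather than something EDRM or the Report prescribes.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Chunked reads keep memory use flat even for very large files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> str:
    """Copy source to destination and confirm both files hash identically."""
    before = sha256_of(source)
    shutil.copyfile(source, destination)
    after = sha256_of(destination)
    if before != after:
        raise ValueError(f"hash mismatch: {before} != {after}")
    # The digest can be recorded alongside the copy for later re-verification.
    return after
```

If the working copy is later altered by even a single byte, recomputing `sha256_of` on it will no longer match the recorded digest, which is what allows the copy to be searched while the integrity of the original remains demonstrable.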
Because of inconsistent log protocols, virtualization technology, elasticity and shared cloud servers, cloud computing presents some particularly difficult challenges to the Identification, Preservation and Collection of evidence. It is no longer a matter of pulling hard drives, attaching a write blocker and running a bit-by-bit image. The NIST Report highlights the following characteristics of cloud forensics, many of which do not exist in a typical on-site forensics examination:
- Identification of the cloud provider and its partners. This is needed to better understand the environment and thus address the factors below.
- The ability to conclusively identify the proper accounts held within the cloud by a consumer, especially if different cyber personas are used.
- The ability of the forensics examiner to gain access to the desired media.
- Obtaining assistance of the cloud infrastructure/application provider service staff.
- Understanding the topology, proprietary policies, and storage system within the cloud.
- Once access is obtained, the examiner’s ability to complete a forensically sound image of the media.
- The sheer volume of the media.
- The ability to respond in a timely fashion to more than one physical location if necessary.
- E-discovery, log file collection and privacy rights given a multi-tenancy system. (How does one collect the set of log files applicable for this matter versus extraneous information with possible privacy rights protections?)
- Validation of the forensic image.
- The ability to perform analysis on encrypted data and the collector’s ability to obtain keys for decryption.
- The storage system no longer being local.
- There is often no way to link given evidence to a particular suspect other than by relying on the cloud provider’s word.
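The log collection question raised in the list above (how to collect one tenant's log entries without sweeping in those of other tenants) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it supposes the provider's shared log is available as one JSON record per line and that each record carries a `tenant` field, neither of which is guaranteed by any real provider. The function names are hypothetical.

```python
import hashlib
import json


def extract_tenant_entries(log_lines, tenant_id):
    """Keep only the JSON log records whose 'tenant' field equals tenant_id."""
    kept = []
    for line in log_lines:
        record = json.loads(line)
        # Records belonging to other tenants are never copied into the extract.
        if record.get("tenant") == tenant_id:
            kept.append(line)
    return kept


def manifest_for(lines):
    """Return a count and SHA-256 digest so the extract can be re-verified."""
    digest = hashlib.sha256()
    for line in lines:
        digest.update(line.encode("utf-8") + b"\n")
    return {"count": len(lines), "sha256": digest.hexdigest()}
```

Even in this simplified form, the extract is only as trustworthy as the tenant labels the provider wrote into the log, which echoes the last point in the list above.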
The fact that this Report was first drafted as late as mid-2014 demonstrates that this emerging technology has jumped ahead of some very important legal and organizational controls. Previously I highlighted the risk that a vendor somewhere in the chain of companies that provision a cloud service could render a company’s data (i.e., its information assets) beyond its reach. The breadth and complexity of these 65 Challenges to successful eDiscovery in the cloud should, at the very least, motivate stakeholders to investigate whether their existing or intended cloud solutions can be inspected with current forensic tools. For example, where a cloud vendor is unwilling to create or share log files and other metadata, or to permit forensic collection in a multi-tenant system, legitimate eDiscovery efforts would be frustrated and the vendor would appear to be a poor choice.
To borrow the phrase Ronald Reagan used with the Soviet Union, “trust but verify”: cloud users should take steps to make sure that the systems they deploy are reasonably safe from these known Challenges by running tests to determine whether authenticated data can be extracted. You, opposing counsel or the government may someday need to come looking in your cloud for authentic and verifiable documents. Cloud providers should compete on their ability to correct, manage, mitigate or indemnify their users against these risks to valuable information assets.