glossary

Data Remanence


Data remanence is the residual representation of data that remains even after attempts have been made to remove or erase the data. This residue may result from data being left intact by a nominal file deletion operation, by reformatting of storage media that does not remove data previously written to the media, or through physical properties of the storage medium that allow previously written data to be recovered. Data remanence may make inadvertent disclosure of sensitive information possible, should the storage media be released into an uncontrolled environment (e.g., thrown in the trash, or given to a third party).

Various techniques have been developed to counter data remanence. These techniques are classified as clearing, purging/sanitizing or destruction. Specific methods include overwriting, degaussing, encryption, and physical destruction.

Effective application of countermeasures can be complicated by several factors, including media that are inaccessible, media that cannot effectively be erased, advanced storage systems that maintain histories of data throughout the data's life cycle, and persistence of data in memory that is typically considered volatile.

Several standards exist for the secure removal of data and the elimination of data remanence.

Causes

Many operating systems, file managers, and other software provide a facility where a file is not immediately deleted when the user requests that action. Instead, the file is moved to a holding area, to allow the user to easily revert a mistake. Similarly, many software products automatically create backup copies of files that are being edited, to allow the user to restore the original version, or to recover from a possible crash (autosave feature).

Even when an explicit deleted file retention facility is not provided or when the user does not use it, operating systems do not actually remove the contents of a file when it is deleted. Instead, they simply remove the file's entry from the file system directory, because this requires less work and is therefore faster. The contents of the file—the actual data—remain on the storage medium. The data will remain there until the operating system reuses the space for new data. In some systems, enough filesystem metadata are also left behind to enable easy undeletion by commonly available utility software. Even when undelete has become impossible, the data, until it has been overwritten, can be read by software that reads disk sectors directly. Computer forensics often employs such software.

Likewise, reformatting, repartitioning or reimaging a system is not always guaranteed to write to every area of the disk, though all will cause the disk to appear empty or, in the case of reimaging, empty except for the files present in the image, to most software.

Finally, even when the storage medium is overwritten, physical properties of the medium may make it possible to recover the previous contents. In most cases however, this recovery is not possible by just reading from the storage device in the usual way, but requires using laboratory techniques such as disassembling the device and directly accessing/reading from its components.

The section on complications gives further explanations for causes of data remanence.

Countermeasures

There are three levels commonly recognized for eliminating remnant data:

Clearing

Clearing is the removal of sensitive data from storage devices in such a way that there is assurance that the data may not be reconstructed using normal system functions or software file/data recovery utilities. The data may still be recoverable, but not without special laboratory techniques.

Clearing is typically an administrative protection against accidental disclosure within an organization. For example, before a hard drive is re-used within an organization, its contents may be cleared to prevent their accidental disclosure to the next user.

Purging

Purging or sanitising is the removal of sensitive data from a system or storage device with the intent that the data can not be reconstructed by any known technique. Purging, proportional to the sensitivity of the data, is generally done before releasing media outside of control, such as before discarding old media, or moving media to a computer with different security requirements.

Destruction

The storage medium is physically destroyed. Effectiveness of physical destruction varies. Depending on recording density of the medium, and/or the destruction technique, this may leave data recoverable by laboratory methods. Conversely, physical destruction using appropriate techniques is generally considered the most secure method available.

Specific methods

Overwriting

A common method used to counter data remanence is to overwrite the storage medium with new data. This is often called wiping or shredding a file or disk. Because such methods can often be implemented in software alone, and may be able to selectively target only part of a medium, it is a popular, low-cost option for some applications. Overwriting is generally an acceptable method of clearing, as long as the media is writable and not damaged.

The simplest overwrite technique writes the same data everywhere—often just a pattern of all zeros. At a minimum, this will prevent the data from being retrieved simply by reading from the medium again using standard system functions.

In an attempt to counter more advanced data recovery techniques, specific overwrite patterns and multiple passes have often been prescribed. These may be generic patterns intended to eradicate any trace signatures, for example, the seven-pass pattern: 0xF6, 0x00, 0xFF, random, 0x00, 0xFF, random; sometimes erroneously attributed to the US standard DOD_5220.22-M.

One challenge with an overwrite is that some areas of the disk may be inaccessible, due to media degradation or other errors. Software overwrite may also be problematic in high-security environments which require stronger controls on data commingling than can be provided by the software in use. The use of advanced storage technologies may also make file-based overwrite ineffective (see the discussion below under Complications).

There are specialized machines and software that are capable of doing overwriting. The software can some times be a standalone Operating System specifically designed for data destruction. There are also machines specifically designed to wipe hard drives to the department of defense specifications DOD_5220.22-M as well.

Feasibility of recovering overwritten data

Peter Gutmann investigated data recovery from nominally overwritten media in the mid-1990s. He suggested magnetic force microscopy may be able to recover such data, and developed specific patterns, for specific drive technologies, designed to counter such. These patterns have come to be known as the Gutmann method.

Daniel Feenberg, an economist at the private National Bureau of Economic Research, claims that the chances of overwritten data being recovered from a modern hard drive amount to "urban legend". He also points to the "18½ minute gap" Rose Mary Woods created on a tape of Richard Nixon discussing the Watergate break-in. Erased information in the gap has not been recovered, and Feenberg claims doing so would be an easy task compared to recovery of a modern high density digital signal.

As of November 2007, the United States Department of Defense considers overwriting acceptable for clearing magnetic media within the same security area/zone, but not as a sanitization method. Only degaussing or physical destruction is acceptable for the latter.

On the other hand, according to the 2006 NIST Special Publication 800-88 (p. 7): "Studies have shown that most of today’s media can be effectively cleared by one overwrite" and "for ATA disk drives manufactured after 2001 (over 15 GB) the terms clearing and purging have converged." An analysis by Wright et al. of recovery techniques, including magnetic force microscopy, also concludes that a single wipe is all that is required for modern drives. They point out that the long time required for multiple wipes "has created a situation where many organisations ignore the issue all together – resulting in data leaks and loss. "

Degaussing

Degaussing is the removal or reduction of a magnetic field of a disk or drive, using a device called a degausser that has been designed for the media being erased. Applied to magnetic media, degaussing may purge an entire media element quickly and effectively.

Degaussing often renders hard disks inoperable, as it erases low-level formatting that is only done at the factory during manufacturing. It is possible, however, to return the drive to a functional state by having it serviced at the manufacturer. Degaussed floppy disks can generally be reformatted and reused with standard consumer hardware.

In some high-security environments, one may be required to use a degausser that has been approved for the task. For example, in US government and military jurisdictions, one may be required to use a degausser from the NSA's "Evaluated Products List".

Encryption

Encrypting data before it is stored on the medium may mitigate concerns about data remanence. If the decryption key is strong and carefully controlled (i.e., not itself subject to data remanence), it may effectively make any data on the medium unrecoverable. Even if the key is stored on the medium, it may prove easier or quicker to overwrite just the key, vs the entire disk.

Encryption may be done on a file-by-file basis, or on the whole disk. Cold boot attacks are one of the few possible methods for subverting a full-disk encryption method, as there is no possibility of storing the plain text key in an unencrypted section of the medium. See the section Complications: Data in RAM for further discussion.

Other side-channel attacks, like the use of hardware-based keyloggers or acquisition of a written note containing the decryption key, may offer a greater chance to success, but do not rely on weaknesses in the cryptographic method employed. As such, their relevance for this article is minor.

Physical destruction

Thorough physical destruction of the entire data storage medium is generally considered the most certain way to counter data remanence. However, the process is generally time-consuming and cumbersome. Physical destruction may require extremely thorough methods, as even a small media fragment may contain large amounts of data.

Specific destruction techniques include:

    Physically breaking the media apart, by grinding, shredding, etc.
    Incinerating
    Phase transition (i.e., liquefaction or vaporization of a solid disk)
    Application of corrosive chemicals, such as acids, to recording surfaces
    For magnetic media, raising its temperature above the Curie point
    For many electric volatile and non-volatile storage mediums, application of extremely high voltage as compared to safe operational specifications

Complications

Inaccessible media areas

Storage media may have areas which become inaccessible by normal means. For example, magnetic disks may develop new "bad sectors" after data have been written, and tapes require inter-record gaps. Modern hard disks often feature automatic remapping of marginal sectors or tracks, which the OS may not even be aware of. This problem is especially significant in solid state drives (SSDs) that rely on relatively large relocated bad block tables. Attempts to counter data remanence by overwriting may not be successful in such situations, as data remnants may persist in such nominally inaccessible areas.

Advanced storage systems

Data storage systems with more sophisticated features may make overwrite ineffective, especially on a per-file basis.

Journaling file systems increase the integrity of data by recording write operations in multiple locations, and applying transaction-like semantics. On such systems, data remnants may exist in locations "outside" the nominal file storage location.

Some file systems implement copy-on-write or built-in revision control, with the intent that writing to a file never overwrites data in-place.

Technologies such as RAID and anti-fragmentation techniques may result in file data being written to multiple locations, either by design (for fault tolerance), or as data remnants.

Wear leveling can also defeat data erasure, by relocating blocks between the time when they are originally written and the time when they are overwritten.

Optical media

Optical media are not magnetic and are not affected by degaussing. Write-once optical media (CD-R, DVD-R, etc.) also cannot be purged by overwrite. Read/write optical media, such as CD-RW and DVD-RW, may be receptive to overwriting. Methods for successfully sanitizing optical discs include delaminating-abrasion of the metallic data layer, shredding, destructive electrical arcing (as by exposure to microwave energy), and submersion in a polycarbonate solvent (e.g., acetone).

Data on solid-state drives

Research from the Center for Magnetic Recording and Research, University of California, San Diego has uncovered problems inherent in erasing data stored on solid-state drives (SSDs). Researchers discovered three problems with file storage on SSDs:

    First, built-in commands are effective, but manufacturers sometimes implement them incorrectly. Second, overwriting the entire visible address space of an SSD twice is usually, but not always, sufficient to sanitize the drive. Third, none of the existing hard drive-oriented techniques for individual file sanitization are effective on SSDs.

Flash-based solid-state drives differ from hard drives in two ways: first, in the way data is stored and second, the way the algorithms are used to manage and access that data. These differences can be exploited to recover previously erased data. SSDs maintain a layer of indirection between the logical addresses used by computer systems to access data and the internal addresses that identify physical storage. This layer of indirection enhances SSD performance and reliability by hiding idiosyncratic interfaces and managing flash memory's limited lifetime. But it can also produce copies of the data that are invisible to the user and that a sophisticated attacker could recover. For sanitizing entire disks, sanitize commands built into the SSD hardware have been found to be effective when implemented correctly, and software-only techniques for sanitizing entire disks have been found to work most, but not all, of the time. In testing, none of the software techniques were effective for sanitizing individual files. These included well-known algorithms such as the Gutmann method, US DoD 5220.22-M, RCMP TSSIT OPS-II, Schneier 7 Pass, and Mac OS X Secure Erase Trash.

Data in RAM

Data remanence has been observed in static random access memory (SRAM), which is typically considered volatile (i.e., contents are erased with loss of electrical power). In the study, data retention was sometimes observed even at room temperature.

Data remanence has also been observed in dynamic random access memory (DRAM). Modern DRAM chips have a built-in self-refresh module, as they not only require a power supply to retain data, but must also be periodically refreshed to prevent their data contents from fading away from the capacitors in their integrated circuits. A study found data remanence in DRAM with data retention of seconds to minutes at room temperature and "a full week without refresh when cooled with liquid nitrogen." The study authors were able to use a cold boot attack to recover cryptographic keys for several popular full disk encryption systems, including Microsoft Bitlocker, Apple FileVault, dm-crypt for Linux, and TrueCrypt. Despite some memory degradation, they were able to take advantage of redundancy in the way keys are stored after they have been expanded for efficient use, such as in key scheduling. The authors recommend that computers be powered down, rather than be left in a "sleep" state, when not in physical control of the owner. In some cases, such as certain modes of the software program Bitlocker, the authors recommend that a boot password or a key on a removable USB device be used. It is also possible to prevent data remanence in RAM by running a memory testing tool, such as Memtest86, in order to overwrite the entire RAM.

Technical Terms