Securely wipe disk
Wiping a disk is done by writing new data over every single bit.
Common use cases
Wipe all data left on the device
There may be (possibly unencrypted) data left on the device, and you want to protect against simple forensic investigation of the kind that is possible with, for example, file recovery software.
If you are not going to set up block device encryption but just want to roughly wipe everything from the disk, consider using /dev/zero or simple patterns instead of a cryptographically strong random number generator (referred to as RNG from now on in this article). This lets you wipe large disks at maximum speed.
This provides a level of data erasure that prevents recovery through normal system functions and hardware interfaces such as standard ATA/SCSI commands. Any file recovery software of the kind mentioned above would then need to be specialized for proprietary storage hardware features.
In the case of an HDD, no data can be recovered without at least undocumented drive commands or tampering with the device's controller or firmware to make it read out, for example, reallocated sectors (bad blocks that S.M.A.R.T. retired from use).
Read the section on #Data remanence if you want to take wiping seriously. This is especially important for all flash storage devices.
Preparations for block device encryption
If you want to prepare your drive for securely setting up block device encryption inside the wiped area afterwards, you really should use random data.
Select a data source for overwriting
As described above, if you only want to wipe sensitive data you can use any data source that matches your needs.
If you want to set up block device encryption afterwards, you should always wipe with at least pseudorandom data.
For data that is not truly random, your disk's write speed should be the only limiting factor. If you need random data, performance may depend heavily on what you choose as the source of entropy.
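To get a feeling for how much the choice of source matters, the sources can be read into /dev/null so that no disk is involved. The following is only a rough sketch: source_throughput is a hypothetical helper name, the 64 MiB sample size is an arbitrary assumption, and the exact wording of the statistics line depends on the installed dd.

```shell
#!/bin/sh
# Read a fixed amount from a data source and discard it, so that only the
# speed of the source itself is measured. Nothing is written to any disk.
# source_throughput is a hypothetical helper name used for this sketch.
source_throughput() {
    src=$1
    mib=${2:-64}
    # dd prints its statistics on stderr; the last line holds the rate.
    dd if="$src" of=/dev/null bs=1M count="$mib" 2>&1 | tail -n 1
}

for src in /dev/zero /dev/urandom; do
    printf '%s: ' "$src"
    source_throughput "$src" 64
done
```

If the rate reported for a source is well above the write speed of the target disk, the disk is the bottleneck and the source does not slow the wipe down.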
Non-random data
Overwriting with /dev/zero or simple patterns is considered secure by most resources. For current HDDs it should be sufficient for fast disk wipes.
Pattern write test
The badblocks command overwrites the drive at a much faster rate by generating data that is not truly random. See also #Badblocks.
Random data
Kernel built-in RNG
Entropy
The kernel built-in RNG /dev/random provides the same quality of random data that you would use for key generation, but it can be nearly impractical to use, at least for wiping HDDs of current capacities. What makes disk wiping with it take so long is waiting for it to gather enough true entropy. In an entropy-starved situation (e.g. a remote server) this might never end, whereas doing search operations on large directories or moving the mouse in X can slowly refill the entropy pool.
You can always compare /proc/sys/kernel/random/entropy_avail against /proc/sys/kernel/random/poolsize to keep an eye on your entropy pool.
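The comparison of the two files can be wrapped in a small function. This is a minimal sketch: entropy_status is a hypothetical name, and the optional directory argument exists only so that the function can be tried against sample files.

```shell
#!/bin/sh
# entropy_status [DIR]
# Print the two kernel values side by side. entropy_status is a hypothetical
# helper; DIR defaults to the kernel's random directory under /proc.
entropy_status() {
    dir=${1:-/proc/sys/kernel/random}
    avail=$(cat "$dir/entropy_avail") || return 1
    pool=$(cat "$dir/poolsize") || return 1
    echo "entropy_avail=$avail poolsize=$pool"
}

entropy_status || echo "entropy pool files are not available on this system"
```

Running it repeatedly, for example once per second during a wipe, shows whether the pool is being drained faster than it refills.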
/dev/urandom
/dev/random uses the kernel entropy pool and, once this pool has been exhausted, will halt overwriting until more input entropy is available. This can make it impractical for overwriting large hard disks.
/dev/urandom, in contrast, will reuse entropy when it runs low, so you will not get stuck. Nevertheless, it might still take a long time to fill a large drive with data.
The output may contain less entropy than the corresponding read from /dev/random. However, it is still intended as a pseudorandom number generator suitable for most cryptographic purposes.
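How long "a long time" is can be estimated by dividing the size of the disk by the sustained data rate. The sketch below uses figures that appear as examples elsewhere in this article (a 250059350016-byte disk and a dd rate of 77.6 MB/s); estimate_wipe_seconds is a hypothetical helper, and a real wipe will rarely hold a constant rate.

```shell
#!/bin/sh
# estimate_wipe_seconds BYTES BYTES_PER_SECOND
# Integer division, so the result is rounded down to whole seconds.
# estimate_wipe_seconds is a hypothetical helper name.
estimate_wipe_seconds() {
    echo $(( $1 / $2 ))
}

# Example figures: disk size in bytes and 77.6 MB/s expressed in bytes/s.
secs=$(estimate_wipe_seconds 250059350016 77600000)
echo "about $secs seconds, or $(( secs / 60 )) minutes"
```

With these figures the script prints "about 3222 seconds, or 53 minutes". A source that delivers only a few MB/s stretches the same wipe to many hours.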
Pseudorandom data
A good compromise between performance and security might be to use a pseudorandom number generator (such as Frandom).
There are also cryptographically secure pseudorandom number generators like Yarrow (FreeBSD/OS X) or Fortuna (the intended successor of Yarrow).
Select a program
In the examples below, /dev/<drive> is the drive to be wiped (and, where applicable, encrypted afterwards).
Coreutils
Official documentation for dd and shred is linked to under #See also.
Dd
Checking progress of dd while running
By default, dd produces no output until the task has finished. With kill and the USR1 signal you can force status output without actually killing the program. Open a second root terminal and issue the following command:
# killall -USR1 dd
Or:
# kill -USR1 <PID_OF_dd_COMMAND>
For example:
# kill -USR1 $(pidof dd)
This causes the terminal in which dd is running to output the progress at the time the command was run. For example:
605+0 records in
605+0 records out
634388480 bytes (634 MB) copied, 8.17097 s, 77.6 MB/s
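The signal can be tried out safely on a dd process that only copies zeros from /dev/zero to /dev/null. This is a sketch under assumptions: dd_status_demo and the temporary log file are hypothetical names, the one-second pauses are arbitrary, and the behaviour shown is that of GNU dd (a dd that does not handle USR1 would simply be terminated by it).

```shell
#!/bin/sh
# Start a harmless dd, ask it for a status report, then stop it.
# dd_status_demo is a hypothetical name used for this sketch.
dd_status_demo() {
    log=$(mktemp)
    # No disk is touched: zeros are read and immediately discarded.
    dd if=/dev/zero of=/dev/null bs=1M count=1000000 2>"$log" &
    pid=$!
    sleep 1                      # give dd time to start up
    kill -USR1 "$pid"            # request a status report on dd's stderr
    sleep 1
    kill "$pid" 2>/dev/null      # end the demonstration run
    wait "$pid" 2>/dev/null || true
    cat "$log"
    rm -f "$log"
}

dd_status_demo
```

Because the PID is taken from $! rather than from pidof or killall, the signal reaches only this one dd process, which matters when several dd commands are running at once.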
Dd spin-offs
Other dd-like programs feature periodic status output, e.g. a simple progress bar.
dcfldd
dcfldd is an enhanced version of dd with features useful for forensics and security. It accepts most of dd's parameters and includes status output. The last stable version of dcfldd was released on December 19, 2006.[1]
ddrescue
GNU ddrescue is a data recovery tool. It is capable of ignoring read errors, which is a useless feature for disk wiping in almost any case. See the GNU ddrescue Manual.
shred
Shred uses three passes by default, writing pseudorandom data to the hard drive on each pass. The number of passes can be reduced or increased.
# shred -v /dev/<drive>
This invokes shred with default settings and displays its progress while it runs.
# shred --verbose --random-source=/dev/urandom -n1 /dev/<drive>
Invokes shred telling it to only do one pass, with entropy from /dev/urandom.
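Before pointing shred at a real drive, the same options can be tried on a scratch file. The sketch below assumes GNU shred and GNU cmp; shred_scratch_demo is a hypothetical name, and the 1 MiB temporary file merely stands in for /dev/<drive>.

```shell
#!/bin/sh
# Overwrite a scratch file once with pseudorandom data and check the result.
# shred_scratch_demo is a hypothetical name used for this sketch.
shred_scratch_demo() {
    scratch=$(mktemp)
    # Fill the scratch file with 1 MiB of zeros so a change is easy to see.
    dd if=/dev/zero of="$scratch" bs=1024 count=1024 2>/dev/null
    # One pass, random data taken from /dev/urandom.
    shred --random-source=/dev/urandom -n1 "$scratch"
    # Compare the first 1 MiB against /dev/zero.
    if cmp -s -n 1048576 "$scratch" /dev/zero; then
        echo "still all zeros"
    else
        echo "overwritten"
    fi
    rm -f "$scratch"
}

shred_scratch_demo
```

The function prints "overwritten" when the single pass has replaced the zeros with random data.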
Badblocks
Badblocks is part of e2fsprogs.
To make badblocks perform a disk wipe, a destructive read-write test has to be done.
# badblocks -c 10240 -wsv /dev/<drive>
Badblocks can be made to write "random patterns" with the -t option.
# badblocks -wsvt random /dev/<drive>
Badblocks can run a settable number of consecutive passes with the -p option. Default is one pass.
# badblocks -wsvp <number> /dev/<drive>
This makes it more likely to find all weak blocks. As a side effect this could help in limiting #Data_remanence to very rare cases for most storage devices.
Select a target
Use fdisk to locate all read/write devices the user has read access to.
Check the output for lines that start with devices such as /dev/sdX.
This is an example for an HDD formatted to boot a Linux system:
# fdisk -l
Disk /dev/sda: 250.1 GB, 250059350016 bytes, 488397168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00ff784a

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048      206847      102400   83  Linux
/dev/sda2          206848   488397167   244095160   83  Linux
Or the Arch Install Medium written to a 4GB USB thumb drive:
# fdisk -l
Disk /dev/sdb: 4075 MB, 4075290624 bytes, 7959552 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x526e236e

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           0      802815      401408   17  Hidden HPFS/NTFS
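Once a candidate has been picked from the fdisk output, it is worth checking that none of its partitions is currently mounted. The sketch below is an assumption-laden helper, not a complete safety check: is_mounted is a hypothetical name, it only looks at mount sources in /proc/mounts, and it will not notice swap, LVM or dm-crypt usage. The prefix match is deliberately conservative, so /dev/sda also matches /dev/sda1.

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE or one of its partitions appears as a mount source.
# is_mounted is a hypothetical helper; the second argument exists only so
# that the check can be tried against a sample file.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    # Field 1 of each line is the mount source; match it by prefix.
    awk -v d="$dev" 'index($1, d) == 1 { found = 1 } END { exit !found }' "$mounts"
}

# /dev/sdX is a placeholder; replace it with the device chosen above.
if is_mounted /dev/sdX; then
    echo "/dev/sdX is in use, do not wipe it"
else
    echo "/dev/sdX has no mounted file systems"
fi
```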
Block size
If you have an Advanced Format hard drive, it is recommended that you specify a block size larger than the default 512 bytes. To speed up the overwriting process, choose a block size matching your drive's physical geometry by appending the block size option to the dd command (e.g. bs=4096 for 4 KB).
Fdisk prints physical and logical sector size for every disk.
Alternatively, sysfs exposes this information:
/sys/block/sdX/queue/physical_block_size
/sys/block/sdX/queue/logical_block_size
/sys/block/sdX/alignment_offset
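The sysfs files can be read directly to pick a value for bs. A minimal sketch, assuming the sysfs layout listed above: physical_block_size is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a test directory.

```shell
#!/bin/sh
# physical_block_size NAME [SYSFS_ROOT]
# Print the physical block size of a block device such as sdX.
# physical_block_size is a hypothetical helper name used for this sketch.
physical_block_size() {
    cat "${2:-/sys/block}/$1/queue/physical_block_size"
}

# List physical and logical sizes for every block device that exposes them.
for d in /sys/block/*; do
    [ -r "$d/queue/physical_block_size" ] || continue
    name=${d##*/}
    echo "$name: physical $(physical_block_size "$name"), logical $(cat "$d/queue/logical_block_size")"
done
```

The printed physical size is the value to pass to dd, for example bs=4096 on a drive that reports 4096.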
Overwrite the disk
Make certain that the of=... option points to the target drive and not to a system disk.
Zero-fill the disk by writing a zero byte to every addressable location on the disk using the /dev/zero stream:
# dd if=/dev/zero of=/dev/sdX bs=4096
or fill it with random data using the /dev/urandom stream:
# dd if=/dev/urandom of=/dev/sdX bs=4096
The process is finished when dd reports No space left on device:
dd: writing to ‘/dev/sdb’: No space left on device
7959553+0 records in
7959552+0 records out
4075290624 bytes (4.1 GB) copied, 1247.7 s, 3.3 MB/s
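After a zero-fill, the result can be spot-checked by comparing the target against /dev/zero. This is a sketch, assuming GNU cmp with its -n (byte limit) option: is_zero_filled is a hypothetical helper, and a scratch file is used here in place of a real drive.

```shell
#!/bin/sh
# is_zero_filled TARGET BYTES
# Succeeds if the first BYTES bytes of TARGET are all zero.
# is_zero_filled is a hypothetical helper name used for this sketch.
is_zero_filled() {
    cmp -s -n "$2" "$1" /dev/zero
}

# Try it on a 1 MiB scratch file instead of a real drive.
scratch=$(mktemp)
dd if=/dev/zero of="$scratch" bs=4096 count=256 2>/dev/null
if is_zero_filled "$scratch" 1048576; then
    echo "zero-filled"
fi
rm -f "$scratch"
```

For a real drive, the byte count would be the size that dd reported when it finished, e.g. 4075290624 in the output above. The check only applies to zero-fills; a drive filled from /dev/urandom cannot be verified this way.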
Data remanence
A residual representation of data may remain even after attempts have been made to remove or erase it.
Residual data may be removed by writing random data to the disk or with more than one iteration. However, more than one iteration may not significantly decrease the ability to reconstruct the data of hard disk drives. For more information see Secure deletion: a single overwrite will do it - The H Security.
If the data can be located on the disk and you can confirm that it has never been copied anywhere else, overwriting just that data with output from a random number generator is a quick and thorough alternative to wiping the whole disk.
Residual magnetism
Wiped hard disk drives and other magnetic storage can be disassembled in a cleanroom and then analyzed with equipment such as a magnetic force microscope. This may allow the overwritten data to be reconstructed by analyzing the measured residual magnetism.
This method of data recovery is largely theoretical for current HDDs and would require substantial financial resources. Nevertheless, degaussing is still practiced.
Old magnetic storage
Securely wiping old magnetic storage (e.g. floppy disks, magnetic tape) is much harder due to much lower memory storage density. Many iterations with random data might be needed to wipe any sensitive data. To ensure that data has been completely erased most resources advise physical destruction.
Flash memory
Like older magnetic storage, flash memory can be difficult to wipe because of wear leveling and transparent compression. For more information see Reliably Erasing Data From Flash-Based Solid State Drives.
Filesystem, operating system, programs
The operating system, executed programs or journaling file systems may copy your unencrypted data throughout the block device. However, this should only be relevant in conjunction with one of the above, because you are writing to plain disks.
See also
- GNU Coreutils Manpage on Basic operations. Official documentation for dd and shred.
- Learn the DD command. - linuxquestions.org