Disk encryption

From ArchWiki
Revision as of 08:53, 20 June 2012 by Sas (Talk | contribs) (remove stub template)


Summary
Transparent encryption/decryption software
Related
System Encryption with LUKS
System Encryption with eCryptfs
TrueCrypt
EncFS
Mount encrypted volumes in parallel
Removing System Encryption

This article discusses common techniques available in Arch Linux for cryptographically protecting a logical part of a storage disk (folder, partition, whole disk, ...), so that all data that is written to / read from it by an authorized user is automatically encrypted/decrypted on-the-fly.

Why use encryption?

Disk encryption makes sure that (all or some of) your files are always stored on disk in encrypted form, and only "magically" become available to the operating system and applications in readable form while the system is running and unlocked by you (or someone you trust). Anyone who looks at the contents of (the protected parts of) your hard drive without your consent will only see garbled, random-looking data instead of your actual files.

This can, for example, prevent unauthorized persons from viewing your data (documents / emails / locally stored passwords / ...) when your computer or hard disk is...

  • located in a place (e.g. office / living room) to which non-trusted people might gain access while you're away (but see warning below)
  • lost (in case of laptops/netbooks or external hard drives)
  • stolen
  • in the repair shop
  • discarded after its end-of-life
  • ...
Warning: Disk encryption does not protect you against:
  • Attackers who can break into your system (e.g. over the Internet) while it is running, after you've already unlocked and mounted the encrypted parts of your hard drive.
  • Attackers who are able to gain physical access to the computer while (or very shortly after) it is running, and have the resources to perform a cold boot attack.
  • Attackers who are able to gain physical access to the computer before you use it, and have the resources to e.g. install a hidden key logger or Trojan horse in a non-encrypted part of the disk, like the boot partition.
    (Encrypting the whole system disk and keeping the boot partition on an external USB stick that you carry with you can add more security in that regard, but the only true remedy would be something like hardware-supported Trusted Computing.)
  • The government.
    In addition to having the resources to easily pull off the above attacks, governments may simply force you to give up your keys/passphrases using various techniques of coercion. In most non-democratic countries around the world as well as in the USA and UK, it is legal for law enforcement agencies to do so if they have suspicions that you might be hiding something of interest.


It also won't protect you against someone simply wiping out your data - you have to do regular backups to keep your data safe in this regard.

Data encryption vs system encryption

Disk encryption can serve as a means for both data encryption, and (partial or full) system encryption.

Data encryption, defined as encrypting only the user's data itself (often located within the /home directory), can already provide a certain level of security, but has some significant drawbacks. In modern computing systems, there are many background processes that may (temporarily) store information about encrypted data or parts of the encrypted data itself in non-encrypted areas of the hard drive, thus reducing the effectiveness of any data encryption system in place. Depending on the type of data and applications used, it might be possible to avoid/circumvent all cases where this is happening, but it takes extra diligence on the side of the user.

Places outside of the home directory where fragments of user data might end up include (but are not limited to):

  • swap partitions
    • (potential remedy: disable swapping)
  • /tmp (temporary files created by user applications)
    • (potential remedies: avoid such applications; mount /tmp inside a ramdisk)
  • /var (log files and such; for example, mlocate stores an index of all file names in /var/lib/mlocate/mlocate.db)

In addition, mere data encryption will leave the system vulnerable to system tampering attacks like keyloggers (see warning above).


System encryption, defined as the encryption of (part of) the operating system and user data, helps to address some of the inadequacies of data encryption. The benefits of system encryption over data encryption alone include:

  • Preventing unauthorized physical access to operating system files
  • Preventing unauthorized physical access to private data that may be cached by the system.

Despite the use of system encryption, there are still points of physical insecurity (see warning above). However, it is presently the best way to minimize the loss of data privacy by physical attempts at invasion.

From a usability perspective, a potential disadvantage of full system encryption over mere data encryption is that unlocking/locking of encrypted parts of the disk can no longer coincide with user login/logout (sharing the same password), because now the unlocking already needs to happen before or during boot.


In practice, there's not always a clear line between data encryption and system encryption, and many different compromises and customized setups are possible.

In any case, disk encryption should only be viewed as an adjunct to the existing security mechanisms of the operating system - focused on securing offline physical access, while relying on other parts of the system to provide things like network security and user-based access control.

Available methods

All disk encryption methods operate in such a way that even though the disk actually holds encrypted data, the operating system and applications "see" it as the corresponding normal readable data as long as the cryptographic container (i.e. the logical part of the disk that holds the encrypted data) has been "unlocked" and mounted.

For this to happen, some "secret information" (usually in the form of a keyfile and/or passphrase) needs to be supplied by the user, from which the actual encryption key can be derived (and kept in kernel memory, for example in the kernel keyring, for the duration of the session).
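
As a rough illustration of what "deriving" a key from a passphrase can look like, the following Python sketch stretches a passphrase into a 256-bit key with PBKDF2 from the standard library. The function name derive_key, the salt size, the iteration count and the key length are assumptions made for this example only; each of the tools discussed here has its own scheme and parameters, and many of them use the derived key to unwrap a separately stored encryption key rather than encrypting data with it directly.

```python
import hashlib
import os

def derive_key(passphrase: str, salt: bytes,
               iterations: int = 100_000, key_len: int = 32) -> bytes:
    """Stretch a passphrase into key_len bytes of key material (hypothetical helper).

    The salt makes the result unique per container; the iteration count
    makes every passphrase guess expensive for an attacker.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=key_len
    )

salt = os.urandom(16)   # would be stored (unencrypted) alongside the container
key = derive_key("correct horse battery staple", salt)
print(len(key))         # 32 bytes = 256 bits
```

The same passphrase and salt always produce the same key, which is what allows the container to be unlocked again later; a different salt produces an unrelated key.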

If you are completely unfamiliar with this sort of operation, please first read the #How the encryption works section below.

Stacked filesystem encryption vs block device encryption

The available disk encryption methods can be separated into two types by their layer of operation:

Stacked filesystem encryption solutions are implemented as a layer that stacks on top of an existing filesystem, causing all files written to an encryption-enabled folder to be encrypted on-the-fly before the underlying filesystem writes them to disk, and decrypted whenever the filesystem reads them from disk. This way, the files are stored in the host filesystem in encrypted form (meaning that their contents, and usually also their file/folder names, are replaced by random-looking data of roughly the same length), but other than that they still exist in that filesystem as they would without encryption, as normal files / symlinks / hardlinks / etc.

In practice, the folder storing the raw encrypted files in the host filesystem (the "lower directory") is unlocked by mounting it, using a special stacked pseudo-filesystem, onto itself or optionally onto a different location (the "upper directory"). The same files then appear there in readable form until it is unmounted again or the system is turned off.

Available solutions in this category are:

eCryptfs
...
EncFS
...

Block device encryption methods, on the other hand, operate below the filesystem layer and make sure that everything written to a certain block device (i.e. a whole disk, or a partition, or a file acting as a virtual loop-back device) is encrypted. This means that while the block device is offline, its whole content looks like a large blob of random data, with no way of determining what kind of filesystem and data it contains. Accessing the data happens, again, by mounting the protected container (in this case the block device) to an arbitrary location in a special way.

The following "block device encryption" solutions are available in Arch Linux:

loop-AES
loop-AES is a descendant of cryptoloop and is a secure and fast solution to system encryption.
However loop-AES is considered less user-friendly than other options as it requires non-standard kernel support.
dm-crypt + LUKS
dm-crypt is the standard device-mapper encryption functionality provided by the Linux kernel. It can be used directly by those who like to have full control over all aspects of partition and key management.
LUKS is an additional convenience layer which stores all of the needed setup information for dm-crypt on the disk itself and abstracts partition and key management in an attempt to improve ease of use.
TrueCrypt
...

For the practical implications of the chosen layer of operation, see the comparison table below, as well as [1].

Possible setups

In practice, which disk encryption setup is appropriate for you will depend on your goals (see especially #Data encryption vs system encryption above) and system parameters.

An example of a simple (but not really watertight) setup for using disk encryption as a means for data encryption, would be:

- a folder called "Private" in the user's home dir encrypted with EncFS (keyfile on unencrypted part of home dir, unlocked by user on demand with passphrase)

Here's a setup implementing (partial) system encryption, but in which the whole security would still rely on the strength of the user's passphrase (and the assumption that no attacker will tamper with the boot/root partitions to e.g. install a keylogger):

- each user's home directory encrypted with eCryptfs (keyfile on unencrypted part of disk, passphrase entered on user login)
- swap and /tmp partitions encrypted with dm-crypt+LUKS (using automatically generated per-session random key)
- indexing/caching of contents of /home by slocate (and similar apps) disabled

A more paranoid system encryption setup in which both a special USB stick and a passphrase are needed for booting or tampering with any files, but those secrets need to be shared between all users of the system:

- whole hard drive encrypted with dm-crypt+LUKS (keyfile on USB stick, passphrase entered on boot)
- boot partition on USB stick

Many other combinations are of course possible. You should carefully plan what kind of setup will be appropriate for your system.

Comparison table

summary

type

  • Loop-AES, dm-crypt + LUKS, Truecrypt: block device encryption
  • eCryptfs, EncFs: stacked filesystem encryption

main selling points

  • Loop-AES: longest-existing one; possibly the fastest; works on legacy systems
  • dm-crypt + LUKS: de-facto standard for block device encryption on Linux; very flexible
  • Truecrypt: very portable, well-polished, self-contained solution
  • eCryptfs: slightly faster than EncFS; individual encrypted files portable between systems
  • EncFs: easiest one to use; supports non-root administration

availability in Arch Linux

  • Loop-AES: must manually compile custom kernel
  • dm-crypt + LUKS: kernel modules: already shipped with default kernel; tools: device-mapper, cryptsetup [core]
  • Truecrypt: truecrypt [extra]
  • eCryptfs: kernel module: already shipped with default kernel; tools: ecryptfs-utils [community]
  • EncFs: encfs [community]

license

  • Loop-AES: GPL
  • dm-crypt + LUKS: GPL
  • Truecrypt: custom[1]
  • eCryptfs: GPL
  • EncFs: GPL

basic classification

encrypts...

  • Loop-AES, dm-crypt + LUKS, Truecrypt: whole block devices
  • eCryptfs, EncFs: files

container for encrypted data may be...

  • Loop-AES, dm-crypt + LUKS, Truecrypt: a disk or disk partition, or a file acting as a virtual partition
  • eCryptfs, EncFs: a directory in an existing file system

relation to filesystem

  • Loop-AES, dm-crypt + LUKS, Truecrypt: operates below the filesystem layer - doesn't care whether the content of the encrypted block device is a filesystem, a partition table, a LVM setup, or anything else
  • eCryptfs, EncFs: adds an additional layer to an existing filesystem, to automatically encrypt/decrypt files whenever they're written/read

encryption implemented in...

  • Loop-AES: kernelspace
  • dm-crypt + LUKS: kernelspace
  • Truecrypt: kernelspace
  • eCryptfs: kernelspace
  • EncFs: userspace (using FUSE)

cryptographic metadata stored in...

  • Loop-AES: ?
  • dm-crypt + LUKS: ?
  • Truecrypt: ?
  • eCryptfs: header of each encrypted file
  • EncFs: control file at the top level of each EncFs container

wrapped encryption key stored in...

  • Loop-AES: ?
  • dm-crypt + LUKS: ?
  • Truecrypt: ?
  • eCryptfs: key file that can be stored anywhere
  • EncFs: control file at the top level of each EncFs container

practical implications
Loop-AES dm-crypt + LUKS Truecrypt eCryptfs EncFs

file metadata (number of files, dir structure, file sizes, permissions, mtimes, etc.) is encrypted


(file and dir names can be encrypted though)

can be used to encrypt whole hard drives (including partition tables)

can be used to encrypt swap space

can be used without pre-allocating a fixed amount of space for the encrypted data container

can be used to protect existing filesystems without block device access, e.g. NFS or Samba shares, cloud storage, etc.

    [2]

allows offline file-based backups of encrypted files

usability features
Loop-AES dm-crypt + LUKS Truecrypt eCryptfs EncFs

support for automounting on login

 ?  ?  ?  ?

support for automatic unmounting in case of inactivity

 ?  ?  ?  ?

non-root users can create/destroy containers for encrypted data

provides a GUI

security features
Loop-AES dm-crypt + LUKS Truecrypt eCryptfs EncFs

supported ciphers

  • Loop-AES: AES
  • dm-crypt + LUKS: ?
  • Truecrypt: ?
  • eCryptfs: AES, blowfish, twofish...
  • EncFs: ?

support for salting

 ?
(with LUKS)
 ?

support for chaining multiple ciphers

 ?  ?  ?  ?

support for key-slot diffusion

 ?
(with LUKS)
 ?  ?  ?

protection against key scrubbing

 ?  ?  ?  ?

support for multiple (independently revokable) keys for the same encrypted data

 ?
(with LUKS)
 ?  ?  ?
performance features
Loop-AES dm-crypt + LUKS Truecrypt eCryptfs EncFs

multithreading support

 ?  ?  ?

hardware-accelerated encryption support

 ?  ?  ?

optimised handling of sparse files

 ?  ?  ?  ?
block device encryption specific
Loop-AES dm-crypt + LUKS Truecrypt

support for (manually) resizing the encrypted block device in-place

 ?
stacked filesystem encryption specific
eCryptfs EncFs

supported file systems

  • eCryptfs: ext3, ext4, xfs (with caveats), jfs, nfs...
  • EncFs: ?

ability to encrypt filenames

ability to not encrypt filenames

compatibility & prevalence
Loop-AES dm-crypt + LUKS Truecrypt eCryptfs EncFs

supported Linux kernel versions

  • Loop-AES: 2.0 or newer
  • dm-crypt + LUKS: ?
  • Truecrypt: ?
  • eCryptfs: ?
  • EncFs: 2.4 or newer

encrypted data can also be accessed from... Windows (with [3]) (with [4])  ?  ?
Mac OS X  ?  ?  ?     [5]
FreeBSD  ?  ?  ?     [6]

used by

  • Loop-AES: ?
  • dm-crypt + LUKS: Arch Linux installer (system encryption); Ubuntu alternate installer (system encryption)
  • Truecrypt: ?
  • eCryptfs: Ubuntu installer (home dir encryption); Chromium OS (encryption of cached user data[7])
  • EncFs: ?

How the encryption works

This section is intended as a high-level introduction to the concepts and processes which are at the heart of usual disk encryption setups.

It does not go into technical or mathematical details (consult the appropriate literature for that), but should provide a system administrator with a rough understanding of how different setup choices (especially regarding key management) can affect usability and security.

Basic principle

For the purposes of disk encryption, each block device (or individual file in the case of stacked filesystem encryption) is divided into sectors of equal length, for example 512 bytes (4,096 bits). The encryption/decryption then happens on a per-sector basis, so the n-th sector of the block device/file on disk will store the encrypted version of the n-th sector of the original data.

Whenever the operating system or an application requests a certain fragment of data from the block device/file, the whole sector (or sectors) that contains the data will be read from disk, decrypted on-the-fly, and temporarily stored in memory:

          ╔═══════╗
 sector 1 ║"???.."║
          ╠═══════╣         ╭┈┈┈┈┈╮
 sector 2 ║"???.."║         ┊ key ┊
          ╠═══════╣         ╰┈┈┬┈┈╯
          ⁝       ⁝            │
          ╠═══════╣            ▼             ┣┉┉┉┉┉┉┉┫
 sector n ║"???.."║━━━━━━━(decryption)━━━━━━▶┋"abc.."┋ sector n
          ╠═══════╣                          ┣┉┉┉┉┉┉┉┫
          ⁝       ⁝
          ╚═══════╝
 
          encrypted                          unencrypted
     blockdevice or                          data in memory
       file on disk

Similarly, on each write operation, all sectors that are affected must be re-encrypted completely (while the rest of the sectors remain untouched).
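
The per-sector granularity can be sketched in a few lines of Python. The helper name sectors_touched is made up for this illustration, the 512-byte sector size is just the example value used above, and sectors are counted from 0 here (unlike in the diagram, which starts at 1).

```python
SECTOR_SIZE = 512  # bytes; example value, real setups may use other sizes

def sectors_touched(offset: int, length: int,
                    sector_size: int = SECTOR_SIZE) -> range:
    """Return the (zero-based) sector numbers covered by a read or write
    of `length` bytes starting at byte `offset` (hypothetical helper)."""
    if length <= 0:
        return range(0)
    first = offset // sector_size
    last = (offset + length - 1) // sector_size
    return range(first, last + 1)

# A 20-byte write that straddles a sector boundary forces two whole
# sectors (1024 bytes) to be decrypted, modified and re-encrypted.
print(list(sectors_touched(500, 20)))    # [0, 1]
# A sector-aligned 512-byte write touches exactly one sector.
print(list(sectors_touched(1024, 512)))  # [2]
```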

Keys, keyfiles and passphrases

This article or section needs expansion.

Reason: explain the relationship between passphrases, keyfiles, and encryption keys (for common disk encryption setups)

Ciphers and modes of operation

The actual algorithm used for translating between pieces of unencrypted and encrypted data (so-called "plaintext" and "ciphertext") that correspond to each other with respect to a given encryption key is called a "cipher".

Disk encryption employs "block ciphers", which operate on fixed-length blocks of data, e.g. 16 bytes (128 bits). At the time of this writing, the predominantly used ones are AES and Blowfish. (Note: AES with a key-length of 192 or 256 bits has been approved by the NSA for protecting "SECRET" and "TOP SECRET" classified government information.)

Encrypting/decrypting a sector (see above) is achieved by dividing it into small blocks matching the cipher's block-size, and following a certain rule-set (a so-called "mode of operation") for how to consecutively apply the cipher to the individual blocks.

Simply applying it to each block separately without modification (dubbed the "electronic codebook (ECB)" mode) would not be secure, because if the same 16 bytes of plaintext always produce the same 16 bytes of ciphertext, an attacker could easily recognize patterns in the ciphertext that is stored on disk.
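
The pattern problem can be made concrete with a small Python sketch. The function toy_encrypt_block is a hypothetical stand-in for a real block cipher such as AES: a 4-round Feistel construction built on SHA-256, chosen only so that the example runs with nothing but the standard library. It is not secure and is not what any of the tools discussed in this article use; the key and plaintext are likewise made up.

```python
import hashlib

BLOCK = 16  # cipher block size in bytes (128 bits)

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy cipher: a keyed hash of one half-block.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel cipher; for illustration only, NOT secure."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """ECB mode: every block is encrypted on its own, without modification."""
    assert len(plaintext) % BLOCK == 0
    return b"".join(
        toy_encrypt_block(key, plaintext[off:off + BLOCK])
        for off in range(0, len(plaintext), BLOCK)
    )

key = b"example key"
# Three plaintext blocks; the first and third are identical.
plaintext = b"A" * BLOCK + b"B" * BLOCK + b"A" * BLOCK
ciphertext = ecb_encrypt(key, plaintext)
blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]

print(blocks[0] == blocks[2])  # True: the repetition is visible in the ciphertext
print(blocks[0] == blocks[1])  # False
```

An attacker does not need the key to notice that the first and third ciphertext blocks match, and can therefore tell that the corresponding plaintext blocks match too.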

The most basic (and common) mode of operation used in practice is "cipher-block chaining (CBC)". When encrypting a sector with this mode, each block of plaintext data is combined (by bitwise XOR) with the ciphertext of the previous block before it is encrypted with the cipher. For the first block, since it has no previous ciphertext to use, a special pre-generated data block stored with the sector's cryptographic metadata and called an "initialization vector (IV)" is used:

                                  ╭──────────────╮
                                  │initialization│
                                  │vector        │
                                  ╰────────┬─────╯
          ╭  ╠══════════╣        ╭─key     │      ┣┉┉┉┉┉┉┉┉┉┉┫        
          │  ║          ║        ▼         ▼      ┋          ┋         . START
          ┴  ║"????????"║◀━━━━(cipher)━━━━(+)━━━━━┋"Hello, W"┋ block  ╱╰────┐
    sector n ║          ║                         ┋          ┋ 1      ╲╭────┘
  of file or ║          ║──────────────────╮      ┋          ┋         ' 
 blockdevice ╟──────────╢        ╭─key     │      ┠┈┈┈┈┈┈┈┈┈┈┨
          ┬  ║          ║        ▼         ▼      ┋          ┋
          │  ║"????????"║◀━━━━(cipher)━━━━(+)━━━━━┋"orld !!!"┋ block
          │  ║          ║                         ┋          ┋ 2
          │  ║          ║──────────────────╮      ┋          ┋
          │  ╟──────────╢                  │      ┠┈┈┈┈┈┈┈┈┈┈┨
          │  ║          ║                  ▼      ┋          ┋
          ⁝  ⁝   ...    ⁝        ...      ...     ⁝   ...    ⁝ ...
 
               ciphertext                         plaintext
                  on disk                         in memory

When decrypting, the procedure is reversed analogously.
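
The chaining shown in the diagram can be sketched in Python as follows. The (+) step is taken to be a bitwise XOR, as in standard CBC. The functions toy_encrypt_block and toy_decrypt_block are hypothetical stand-ins for a real block cipher such as AES (a 4-round Feistel construction on top of SHA-256, used only to keep the example within the standard library); they are not secure. The key, the IV and the sector contents are made-up example values.

```python
import hashlib

BLOCK = 16  # cipher block size in bytes (128 bits)

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy cipher: a keyed hash of one half-block.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel cipher; for illustration only, NOT secure."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt_block: undo the rounds in reverse order."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Encrypt one sector: XOR each block with the previous ciphertext
    block (the IV for the first block), then apply the cipher."""
    assert len(plaintext) % BLOCK == 0 and len(iv) == BLOCK
    prev, out = iv, bytearray()
    for off in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[off:off + BLOCK], prev))
        out += prev
    return bytes(out)

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Decrypt one sector: apply the inverse cipher, then XOR with the
    previous ciphertext block (the IV for the first block)."""
    prev, out = iv, bytearray()
    for off in range(0, len(ciphertext), BLOCK):
        block = ciphertext[off:off + BLOCK]
        out += _xor(toy_decrypt_block(key, block), prev)
        prev = block
    return bytes(out)

key = b"example key"
iv = bytes(range(BLOCK))   # example IV only
sector = b"A" * BLOCK * 3  # three identical plaintext blocks
ciphertext = cbc_encrypt(key, iv, sector)
blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]

print(cbc_decrypt(key, iv, ciphertext) == sector)  # True: round trip works
print(len(set(blocks)))                            # number of distinct ciphertext blocks
```

Because every block depends on the ciphertext before it, the three identical plaintext blocks no longer encrypt to identical ciphertext blocks, and the same sector contents encrypted with a different IV give a different result.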

One thing worth noting is the generation of the unique initialization vector for each sector. The simplest choice is to calculate it in a predictable fashion from a readily available value such as the sector number. However, this might allow an attacker with repeated access to the system to perform a so-called watermarking attack. To prevent that, a method called "Encrypted salt-sector initialization vector (ESSIV)" can be used to generate the initialization vectors in a way that makes them look completely random to a potential attacker.
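
The difference between the two approaches can be sketched in Python. In the ESSIV scheme, the sector number is encrypted with a key derived by hashing the actual encryption key, so the resulting IV cannot be predicted without that key. The function toy_encrypt_block below is a hypothetical stand-in for a real block cipher such as AES (a 4-round Feistel construction on top of SHA-256, not secure), and the names plain_iv and essiv, the choice of SHA-256 and the little-endian encoding of the sector number are assumptions made for this example.

```python
import hashlib

BLOCK = 16  # cipher block size in bytes (128 bits)

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy cipher: a keyed hash of one half-block.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel cipher; for illustration only, NOT secure."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def plain_iv(sector: int) -> bytes:
    """Predictable IV: simply the sector number, padded to one block."""
    return sector.to_bytes(BLOCK, "little")

def essiv(key: bytes, sector: int) -> bytes:
    """ESSIV-style IV: encrypt the sector number under a hash of the key."""
    salt = hashlib.sha256(key).digest()
    return toy_encrypt_block(salt, sector.to_bytes(BLOCK, "little"))

key = b"example key"
print(plain_iv(5).hex())    # 05000000000000000000000000000000 (anyone can compute this)
print(essiv(key, 5).hex())  # looks random without knowledge of the key
print(essiv(key, 6).hex())  # unrelated to the IV of the neighbouring sector
```

Both functions are deterministic, so no IV needs to be stored on disk: it can be recomputed from the sector number (and, for ESSIV, the key) whenever a sector is read or written.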

There are also a number of other, more complicated modes of operation available for disk encryption, which already provide built-in security against such attacks. Some can also additionally guarantee authenticity (see below) of the encrypted data.

Additional considerations

Performance

This article or section needs expansion.

Data integrity/authenticity

This article or section needs expansion.

Plausible deniability

This article or section needs expansion.

Notes & References

  1. see http://www.truecrypt.org/legal/license
  2. well, a single file in those filesystems could be used as a container (virtual loop-back device!) but then one wouldn't actually be using the filesystem (and the features it provides) anymore
  3. CrossCrypt - Open Source AES and TwoFish Linux compatible on the fly encryption for Windows XP and Windows 2000
  4. FreeOTFE - supports Windows 2000 and later (for PC), and Windows Mobile 2003 and later (for PDA)
  5. see EncFs build instructions for Mac
  6. see http://www.freshports.org/sysutils/fusefs-encfs/
  7. see http://www.chromium.org/chromium-os/chromiumos-design-docs/protecting-cached-user-data