Synchronization and backup programs

Intro

This wiki page contains information about various backup programs. It's a good idea to have regular backups of important data, most notably configuration files (/etc/*) and the local pacman database (usually /var/lib/pacman/local/*).
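
For instance, a minimal one-shot archive of exactly those paths could look like this (the destination path is only an example):

  # Run as root; tar strips the leading '/' from member names by default.
  tar -cjf /mnt/backup/system-config.tar.bz2 /etc /var/lib/pacman/local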

A few words to end the introduction: before you start trying programs out, think about your needs, e.g. by considering the following questions:

  • What backup medium do I have available?
    • CD/DVD
    • remote server (with what access? SSH? Can I install some software on it, which is necessary for e.g. rsync-based solutions?)
    • external hard drive
  • How often do I plan to back up?
    • daily?
    • weekly?
    • less often?
  • What goodies do I expect from the backup solution?
    • compression? (what algorithms?)
    • encryption? (GPG or something more straightforward?)
  • Most importantly: how do I plan to restore backups if needed?

All right, enough with this, let's see some options!

Incremental backups

The point of these is that they remember what has been backed up during the last run, and back up only what has changed. Great if you back up often.

Rsync-type backups

The main characteristic of this type of backup is that it maintains a copy of the directory you want backed up, in a traditional "mirror" fashion.

Certain rsync-type packages also do snapshot backups by storing files which describe how the contents of files and folders changed since the last backup (so-called 'diffs'). Hence, they are inherently incremental, but usually they do not offer compression or encryption. On the other hand, a working copy of everything is immediately available, with no decompression/decryption needed. Finally, the way this works makes it hard to burn backups to CD/DVD.
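
In its simplest form, the mirroring idea is a single rsync invocation (the paths here are placeholders):

  # Mirror a home directory; --delete makes the copy track removals too.
  rsync -a --delete /home/user/ /mnt/backup/home/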

CLI
  • rsync (in extra repo)
    • rsync almost always makes a mirror of the source
    • Impossible to restore anything older than the most recent backup
    • Win32 version available
  • rdiff-backup (in community repo)
    • Stores most recent backup as regular files
    • To revert to older versions, you apply the diff files to recreate them (see the sketch after this list)
    • It is truly incremental at the delta level: only the changes to a file are stored; a changed file is not copied in full
    • Win32 version available
  • rsnapshot (in community repo)
    • Does not store diffs; instead it copies entire files if they have changed
    • Creates hard links between a series of backed-up trees (snapshots), as illustrated in the sketch after this list
    • It is incremental in that the size of the backup is only the original backup size plus the size of all files that have changed since the last backup.
    • Destination filesystem must support hard links
    • Win32 version available
  • SafeKeep (in AUR)
    • Enhancement to rdiff-backup
    • Integrates with Linux LVM and databases to create consistent backups
    • Bandwidth throttling
  • Link-Backup (AUR may be patched with additional features) is similar to rsync-based scripts, but does not use rsync
    • Creates hard links between a series of backed-up trees (snapshots)
    • Intelligently handles renames, moves, and duplicate files without additional storage or transfer
    • dstdir/.catalog is a catalog of all unique file instances; backup trees hard-link to the catalog
    • If a backup tree would be identical to the previous backup tree, it won't be needlessly created
    • Transfer occurs over standard I/O locally or remotely between a client and server instance of this script
    • It copies itself to the server; it does not need to be installed on the server
    • Remote backups rely on SSH
    • It resumes stopped backups; it can even be told to run for n minutes
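
To make this concrete, here is a rough sketch (all paths and file names are placeholders; check each tool's man page for details). The first two commands show rdiff-backup's mirror-plus-reverse-diffs workflow; the last two show the hard-link snapshot rotation that tools like rsnapshot build on:

  # rdiff-backup: back up, then restore one file as it was two weeks ago
  rdiff-backup /home/user /mnt/backup/home
  rdiff-backup -r 2W /mnt/backup/home/notes.txt /tmp/notes-two-weeks-ago.txt
  # Hard-link snapshots: a linked copy costs almost no space; rsync then
  # updates the newest tree in place. Since rsync replaces changed files
  # rather than overwriting them, older snapshots keep the old contents.
  # (Rotation of the older daily.N trees is omitted here.)
  cp -al /mnt/backup/daily.0 /mnt/backup/daily.1
  rsync -a --delete /home/user/ /mnt/backup/daily.0/
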
GUI
  • Back In Time (in AUR)
    • Creates hard links between a series of backed-up trees (snapshots)
    • Inspired by FlyBack and TimeVault
    • Really just a front-end to rsync, diff and cp
    • A new snapshot is created only if something changed since the last snapshot
  • FlyBack (in AUR)
    • A clone of Apple's Mac OS X Time Machine software
  • Areca Backup (in AUR)
    • Written in Java
    • Primarily archive-based (ZIP), but will do file-based backup as well
    • Claims delta backup supported (stores only changes)
  • TimeVault (in AUR)
    • Creates hard links between a series of backed-up trees (snapshots)
    • Imitates Windows Volume Shadow Copy feature in that it integrates with Nautilus to provide a "Previous Versions" tab in the Properties dialog.

Not rsync-based

They tend to create (big) archive files (like tar.bz2) and, of course, keep track of what has been archived. Creating tar.bz2 or tar.gz archives has the advantage that you can extract the backups with just tar/bzip2/gzip, so you do not need to have the backup program around.
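
GNU tar can even do the incremental bookkeeping itself; a minimal sketch (paths are examples):

  # The first run creates a full backup; later runs with the same snapshot
  # file (.snar) only archive what changed since then.
  tar --listed-incremental=/var/backup/etc.snar -cjf /var/backup/etc-0.tar.bz2 /etc
  tar --listed-incremental=/var/backup/etc.snar -cjf /var/backup/etc-1.tar.bz2 /etc
  # Plain tar is enough to extract:
  tar -xjf /var/backup/etc-0.tar.bz2 -C /tmp/restore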

  • arch-backup (in community repo; website); trivial backup scripts with simple configuration:
    • compression method can be configured
    • possible to specify multiple directories to back up
  • hdup (in extra repo; website; no longer developed, the author now develops rdup (below), but it is still a decent choice):
    • creates tar.gz or tar.bz2 archives
    • supports gpg encryption
    • supports pushing over ssh
    • possible to specify multiple directories to back up
  • rdup (in AUR), successor to hdup: the program just determines which files have changed since the last backup; it is completely up to you what you want to do with that list. Some helper scripts are supplied, and with them it supports:
    • creating tar.gz archives or an rsync-type copy
    • encryption (GPG, as well as conventional strong ciphers, e.g. Blowfish); also applies to the rsync-type copy
    • compression (also for rsync-type copy)
  • duplicity (in community repo) is similar to hdup, supports tarring and encrypting (see the sketch after this list). But:
    • the files backed up are "randomly" distributed between encrypted tar archives, which makes it harder to recover a particular file
    • you can back up just one directory at a time (while with hdup you can specify as many as you want in one backup profile)
  • dar (in community repo); see the sketch after this list:
    • it uses its own archive format (so you need to have dar around when you want to restore)
    • supports splitting backups into multiple files by size
    • makefile-type config files; some custom scripts are available along with it
    • supports basic encryption (not GPG, but still strong; you need to supply a password every time)
    • some GUI tools for inspecting backups are also available (kdar, in AUR, though the current dar needs the beta version)
    • a script suitable for running from cron is sarab (in AUR): supports pretty much any backup scheme (Towers of Hanoi, Grandfather-Father-Son, etc.)
  • backerupper (in AUR) is a simple program for backing up selected directories over a local network. Its main intended purpose is backing up a user's personal data.
    • GUI-based
    • creates tar.gz archives
    • possible to define backup frequency, backup time and maximum number of copies
  • Manent (in AUR) is an algorithmically strong backup and archival program. It is Python-based and has the following features:
    • Efficient backup to anything that looks like storage
    • Works well over a slow and unreliable network
    • Offers online access to the contents of the backup
    • Backed up storage is completely encrypted
    • Several computers can use the same storage for backup, automatically sharing data
    • Not reliant on timestamps of the remote system to detect changes
    • Cross-platform support for Unicode file names
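
As promised above, rough usage sketches for duplicity and dar (paths, slice size and file names are placeholders):

  # duplicity: full backup on the first run, incremental afterwards;
  # archives are GPG-encrypted by default
  duplicity /home/user file:///mnt/backup/home
  duplicity restore file:///mnt/backup/home /tmp/restored-home
  # dar: compressed archive split into 700 MB slices (e.g. for burning to CD)
  dar -c /mnt/backup/home-full -R /home/user -s 700M -z
  dar -x /mnt/backup/home-full -R /tmp/restored-home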

Cloud backups

  • Dropbox (in AUR with GNOME support, and also in AUR without GNOME dependencies).
    • A daemon monitors a specified directory, and uploads incremental changes to Dropbox.com.
    • Changes automatically show up on your other computers.
    • Includes file sharing and a public directory.
    • You can recover deleted files.
    • Community written add-ons.
    • Free accounts have 2 GB of storage.
  • Jungle Disk (in AUR)
    • Automatic backups to Amazon's servers.
    • Not free, but very low prices.

Not incremental backups

  • Q7Z (in AUR) is a P7Zip GUI for Linux, which attempts to simplify data compression and backup. It can create the following archive types: 7z, BZip2, Zip, GZip, Tar. Use Q7Z if you want to:
    • Update existing archives quickly
    • Backup multiple folders to a storage location
    • Create or extract protected archives
    • Lessen effort by using archiving profiles and lists
  • "Just copy everything into one big archive, but support writing to cd/dvd"-type: backup-manager (in AUR)
  • Partclone -- back up and restore only the used blocks of a partition (see the sketch after this list)
  • filesystem-backup -- a simple bash script (originally a MySQL backup script) that creates rolling 7-day, rolling 4-week and static monthly backups in tar format. Good for servers without a GUI. Available in this repo: http://repo.falconn.nl/any/
  • Clonezilla boots from live CD and USB flash drive images to make backup images of hard drives and partitions, as well as to copy drives and partitions from one to another. It uses Partimage, ntfsclone, partclone and dd, which allows for compatibility with many file systems (ext2, ext3, ext4 supported in the testing branch, reiserfs, xfs, jfs, FAT, NTFS and HFS+). It can also back up over a network, and there is a server edition.
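
A rough partclone sketch (device and image names are examples; pick the variant matching your filesystem, and run it on an unmounted partition):

  # Back up only the used blocks of an ext3 partition to an image file
  partclone.ext3 -c -s /dev/sda1 -o /mnt/backup/sda1.img
  # Restore the image back onto the partition
  partclone.ext3 -r -s /mnt/backup/sda1.img -o /dev/sda1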

Versioning systems

These are traditionally used for keeping track of software development; but if you want a simple way to manage your config files in one directory, they might be a good solution.

  • mercurial or git (both in extra repo)
  • gibak: a backup system based on git. It also supports binary diffs (for binaries, e-books, pictures, multimedia files, etc.). There is short usage advice on the homepage. It is meant to back up only the $HOME directory; one could also back up other directories (like /etc) by changing the $HOME variable to point to that directory (though this is not really recommended). gibak is handy for people who are familiar with git: it uses .gitignore to filter files, and one can use the git commands to restore files, browse through logs, diffs, etc. If one needs a GUI, it is also possible to use gitk or qgit to browse through commits or do whatever these interfaces support. Get it from AUR: http://aur.archlinux.org/packages.php?ID=18318.
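
As a minimal sketch of the idea with plain git (the directory is just an example; use root for /etc):

  # Put a config directory under version control
  cd /etc && git init && git add . && git commit -m "initial snapshot"
  # After editing files: review, commit, or restore a single file
  git diff
  git commit -am "describe the change"
  git checkout HEAD -- fstab   # discard uncommitted changes to one file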

Articles