Synchronization and backup programs

From ArchWiki


This wiki page contains information about various backup programs. It is a good idea to have regular backups of important data, most notably configuration files (/etc/*) and the local pacman database (usually /var/lib/pacman/local/*).

A few words to end the introduction: before you start trying programs out, think about your needs, e.g. by considering the following questions:

  • What backup medium do I have available?
    • CD/DVD
    • remote server (with what access? SSH? Can I install software on it, as needed for e.g. rsync-based solutions?)
    • external hard drive
  • How often do I plan to backup?
    • daily?
    • weekly?
    • less often?
  • What goodies do I expect from the backup solution?
    • compression? (what algorithms?)
    • encryption? (GPG or something more straightforward?)
  • Most importantly: how do I plan to restore backups if needed?

All right, enough with this, let's see some options!

Incremental backups

The point of these is that they remember what was backed up during the last run, and back up only what has changed. Great if you back up often.

Rsync-type backups

The main characteristic of this type of backup is that it maintains a copy of the directory you want to back up, together with files which describe how the contents changed since the last backup (the so-called 'diffs'). Hence, they are inherently incremental, but they usually lack compression and encryption. On the other hand, a working copy of everything is immediately available, with no decompression or decryption needed. Finally, the way they work makes it hard to burn backups to CD/DVD.

Command line based
  • rsync (in extra repo)
  • rdiff-backup (in community repo)
  • rsnapshot (in community repo)
  • safekeep (in AUR)
  • Link-Backup (in AUR; the AUR package may be patched with additional features) is similar to rsync-based scripts, but does not use rsync:
    • creates hard links between a series of backed-up trees (snapshots)
    • intelligently handles renames, moves, and duplicate files without additional storage or transfer
    • dstdir/.catalog is a catalog of all unique file instances; backup trees hard-link to the catalog
    • if a backup tree would be identical to the previous backup tree, it won't be needlessly created
    • transfer occurs over standard I/O locally or remotely between a client and server instance of this script
    • it copies itself to the server; it does not need to be installed on the server
    • remote backups rely on SSH
    • it resumes stopped backups; it can even be told to run for n minutes
GUI based

Not rsync-based

They tend to create (big) archive files (like tar.bz2) and, of course, keep track of what has been archived. Creating tar.bz2 or tar.gz archives has the advantage that you can extract the backups with just tar/bzip2/gzip, so you do not need to have the backup program around.
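
The bookkeeping these tools do can be sketched with plain GNU tar, whose --listed-incremental option records what has already been archived in a snapshot file (the paths here are throwaway examples):

```shell
#!/bin/sh
# Incremental tar.gz archives via GNU tar's snapshot file.
# DATA, DEST and RESTORE are throwaway example directories.
set -e
DATA=$(mktemp -d)
DEST=$(mktemp -d)
echo "one" > "$DATA/a.txt"

# Level-0 (full) backup; backup.snar records what was archived.
tar --listed-incremental="$DEST/backup.snar" \
    -czf "$DEST/full.tar.gz" -C "$DATA" .

# A later run with the same snapshot file stores only the changes.
echo "two" > "$DATA/b.txt"
tar --listed-incremental="$DEST/backup.snar" \
    -czf "$DEST/incr.tar.gz" -C "$DATA" .

# Restoring needs nothing but tar: extract the full archive,
# then each incremental one in order.
RESTORE=$(mktemp -d)
tar --listed-incremental=/dev/null -xzf "$DEST/full.tar.gz" -C "$RESTORE"
tar --listed-incremental=/dev/null -xzf "$DEST/incr.tar.gz" -C "$RESTORE"
```

Note that extraction uses --listed-incremental=/dev/null, the documented way to tell tar the archive is incremental without consulting a snapshot file.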

  • arch-backup (in community repo; website): trivial backup scripts with simple configuration:
    • compression method can be configured
    • possible to specify multiple directories to back up
  • hdup (in extra repo; website; no longer developed, as the author now develops rdup (below), but still a decent one):
    • creates tar.gz or tar.bz2 archives
    • supports gpg encryption
    • supports pushing over ssh
    • possible to specify multiple directories to back up
  • rdup (in AUR), successor to hdup: the program *just determines* which files have changed since the last backup; it is completely up to you what you want to do with that list. Some helper scripts are supplied, and with them it supports:
    • creating tar.gz archives or an rsync-type copy
    • encryption (GPG as well as conventional strong ciphers, e.g. Blowfish), which also applies to the rsync-type copy
    • compression (also for the rsync-type copy)
  • duplicity (in community repo) is similar to hdup and supports tarring and encrypting. However:
    • the files backed up are "randomly" distributed between encrypted tar archives, which makes it harder to recover a particular file
    • you can back up just one directory at a time (while with hdup you can specify as many as you want in one backup profile)
  • dar (in community repo):
    • it uses its own archive format (so you need to have dar around when you want to restore)
    • supports splitting backups into multiple files by size
    • makefile-type config files; some custom scripts are available along with it
    • supports basic encryption (not GPG, but still strong; you need to supply a password every time)
    • some GUI tools for inspecting backups are also available (kdar, in AUR, though current dar needs the beta version)
    • a script suitable for running from cron is sarab (in AUR): it supports pretty much any backup scheme (Towers of Hanoi, Grandfather-Father-Son, etc.)
  • backerupper (in AUR): a simple program for backing up selected directories over a local network. Its main intended purpose is backing up a user's personal data.
    • GUI based
    • creates tar.gz archives
    • possible to define backup frequency, backup time, and maximum number of copies
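
The compress-then-encrypt pipeline that several of the tools above automate can also be reproduced by hand with tar and gpg. A sketch, where the paths and the passphrase are placeholders:

```shell
#!/bin/sh
# Hand-rolled "tar.gz + GPG" backup, the combination tools like hdup
# automate. Paths and the passphrase are placeholders.
set -e
DATA=$(mktemp -d)
OUT=$(mktemp -d)
echo "secret config" > "$DATA/app.conf"

# Compress, then symmetrically encrypt (no keyring needed).
tar -czf - -C "$DATA" . |
    gpg --batch --yes --pinentry-mode loopback \
        --passphrase "example-pass" --symmetric \
        -o "$OUT/backup.tar.gz.gpg"

# Restore: decrypt, then extract; only gpg and tar are required.
RESTORE=$(mktemp -d)
gpg --batch --yes --pinentry-mode loopback \
    --passphrase "example-pass" --decrypt "$OUT/backup.tar.gz.gpg" |
    tar -xzf - -C "$RESTORE"
```

For real backups, public-key encryption (gpg --encrypt -r <keyid>) avoids embedding a passphrase in the script.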

Not incremental backups

  • "Just copy everything into one big archive, but support writing to CD/DVD"-type: backup-manager (in AUR)
  • Q7Z (in AUR) uses P7Zip, which can create these archive types: 7z, BZip2, Zip, GZip, Tar. Use Q7Z if you want to:
    • Update existing archives quickly
    • Backup multiple folders to a storage location
    • Create or extract protected archives
    • Lessen effort by using archiving profiles and lists

Versioning systems

These are traditionally used for keeping track of software development; but if you want to have a simple way to manage your config files in one directory, it might be a good solution.

  • mercurial or git (both in extra repo)
  • gibak: a backup system based on git. it also supports binary diffs (for binaries, e-books, pictures, multimedia files, etc). on the homepage there is a short usage advice. it is meant to backup only the $HOME directory. one could also backup other directories (like /etc) by changing the $HOME variable to point to that directory (though i don't really recommend this). gibak is handy for people who are familiar with git. it uses .gitignore to filter files and one can use the git commands to restore files, browse through logs, diffs, etc. if one needs a gui, it is also possible to use gitk or qgit to browse through commits or do whatever these interfaces support. get it from AUR:
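
For those who prefer plain git over a wrapper like gibak, the same idea can be sketched directly. The directory and file below are throwaway stand-ins for a real config directory such as /etc:

```shell
#!/bin/sh
# Versioning a configuration directory with plain git, the mechanism
# gibak builds on. CFG is a throwaway stand-in for a real config dir.
set -e
CFG=$(mktemp -d)
cd "$CFG"
echo "Port 22" > sshd_config

git init -q
git config user.email "backup@example.org"   # placeholder identity
git config user.name  "Backup Script"
git add -A
git commit -qm "initial snapshot"

# Change a file, snapshot again, then recover the old version.
echo "Port 2222" > sshd_config
git commit -qam "change ssh port"
git checkout HEAD~1 -- sshd_config   # restore previous contents
```

Every commit is a restorable snapshot, and git log / git diff show exactly when and how a config file changed.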