rsync
rsync is an open source utility that provides fast incremental file transfer.
Installation
rsync must be installed on both the source and the destination machine.
Front-ends
- Grsync — GTK front-end.
- JotaSync — Java Swing GUI for rsync with integrated scheduler.
- luckyBackup — Qt front-end written in C++.
Other tools using rsync are rdiff-backup, osync and yarsync (available as AUR packages).
As cp/mv alternative
rsync can be used as an advanced alternative for the cp or mv command, especially for copying larger files:
$ rsync -P source destination
The -P option is the same as --partial --progress, which keeps partially transferred files and shows a progress bar. You may want to use the -r/--recursive option to recurse into directories.
Files can be copied locally as with cp, but the motivating purpose of rsync is to copy files remotely, i.e. between two different hosts. Remote locations can be specified with a host-colon syntax:
$ rsync source host:destination
or
$ rsync host:source destination
Network file transfers use the SSH protocol by default, and host can be a real hostname or a predefined profile/alias from .ssh/config.
Whether transferring files locally or remotely, rsync first creates a file list containing information (by default, the file size and last modification timestamp) which is used to determine whether a file needs to be constructed. For each file to be constructed, weak and strong checksums are computed for all blocks such that each block is of length S bytes, non-overlapping, and has an offset divisible by S. Using this information a large file can be constructed by rsync without having to transfer the entire file. For a more detailed practical and mathematical explanation refer to how rsync works and the rsync algorithm, respectively.
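The block-splitting step above can be sketched with plain coreutils. This toy example is not rsync's actual implementation: the block size S and the use of md5sum as the strong checksum are illustrative choices only.

```shell
#!/bin/sh
# Toy sketch of rsync's per-block strong checksums (illustrative only):
# split a file into non-overlapping S-byte blocks, each at an offset
# divisible by S, and compute one strong checksum per block.
S=1024                              # block length in bytes (arbitrary)
f=$(mktemp)
head -c 2500 /dev/urandom > "$f"    # a 2500-byte example file

# 2500 bytes -> blocks of 1024, 1024 and 452 bytes.
split -b "$S" -d "$f" "$f.block."

for b in "$f".block.*; do
    md5sum "$b"                     # one strong checksum per block
done

rm -f "$f" "$f".block.*
```

In the real protocol, the receiver sends these per-block checksums to the sender, which then only transmits the blocks the receiver does not already have.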
To use sane defaults quickly, you could use some aliases:
cpr() {
    rsync --archive -hh --partial --info=stats1,progress2 --modify-window=1 "$@"
}
mvr() {
    rsync --archive -hh --partial --info=stats1,progress2 --modify-window=1 --remove-source-files "$@"
}
- -hh: output numbers in a human-readable format
- --info=stats1,progress2: stats1 displays rsync transfer statistics with verbosity level 1; progress2 prints total transfer progress as opposed to per-file transfer progress (progress1)
- --modify-window=1: when comparing the timestamps of two files, treat them as equivalent if they differ by less than one second
- --remove-source-files: remove files from the source directory after they have been successfully synced
Note that this file comparison is separate from the --checksum option. The --checksum option only affects the file skip heuristic used prior to any file being transferred. Independent of --checksum, a checksum is always used for the block-based file construction, which is how rsync transfers a file.

Trailing slash caveat
Arch by default uses GNU cp (part of GNU coreutils). However, rsync follows the convention of BSD cp, which gives special treatment to source directories with a trailing slash "/". Whereas
$ rsync -r source destination
creates a directory "destination/source" with the contents of "source", the command
$ rsync -r source/ destination
copies all of the files in "source/" directly into "destination", with no intervening subdirectory - just as if you had invoked it as
$ rsync -r source/. destination
This behavior is different from that of GNU cp, which treats "source" and "source/" identically (but not "source/."). Also, some shells automatically append the trailing slash when tab-completing directory names. Because of these factors, there can be a tendency among new or occasional rsync users to forget about rsync's different behavior, and inadvertently create a mess or even overwrite important files by leaving the trailing slash on the command line.
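The effect of the /. idiom that rsync's trailing slash emulates can be seen with plain cp in a throwaway directory (a sketch using temporary paths):

```shell
#!/bin/sh
# Demonstrate "dir" vs "dir/.": copying "src" creates dest/src/...,
# while copying "src/." copies src's contents directly into dest.
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/a" "$tmp/b"
touch "$tmp/src/file"

cp -r "$tmp/src"   "$tmp/a"    # creates a/src/file
cp -r "$tmp/src/." "$tmp/b"    # creates b/file directly

ls "$tmp/a/src/file" "$tmp/b/file"
rm -rf "$tmp"
```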
Thus it can be prudent to use a wrapper script to automatically remove trailing slashes before invoking rsync:
#!/bin/bash
new_args=()
for i in "${@}"; do
    case "${i}" in
        /) i="/" ;;
        */) i="${i%/}" ;;
    esac
    new_args+=("${i}")
done
exec rsync "${new_args[@]}"
This script can be put somewhere in the path, and aliased to rsync in the shell init file.
As a backup utility
The rsync protocol can easily be used for backups, only transferring files that have changed since the last backup. This section describes a very simple scheduled backup script using rsync, typically used for copying to removable media.
Automated backup
For the sake of this example, the script is created in the /etc/cron.daily
directory, and will be run on a daily basis if a cron daemon is installed and properly configured. Configuring and using cron is outside the scope of this article.
First, create a script containing the appropriate command options:
/etc/cron.daily/backup
#!/bin/sh
rsync -a --delete --quiet /path/to/backup /location/of/backup
- -a: indicates that files should be archived, meaning that most of their characteristics are preserved (but not ACLs, hard links or extended attributes such as capabilities)
- --delete: means files deleted on the source are to be deleted on the backup as well
Here, /path/to/backup should be changed to what needs to be backed up (e.g. /home) and /location/of/backup to where the backup should be saved (e.g. /media/disk).
Finally, the script must be executable.
Automated backup with SSH
If backing-up to a remote host using SSH, use this script instead:
/etc/cron.daily/backup
#!/bin/sh
rsync -a --delete --quiet -e ssh /path/to/backup remoteuser@remotehost:/location/of/backup
- -e ssh: tells rsync to use SSH
- remoteuser: the user on remotehost
- -a: groups the options -rlptgoD (recursive, links, perms, times, group, owner, devices)

To preserve ownership and permissions on the target (part of what the -a option implies), root access to the target machine is needed. The preferred way to achieve this for automation is to set up the SSH daemon to allow root to login using a public key without password and run the rsync command as root.

Automated backup with NetworkManager
This script starts a backup when network connection is established.
First, create a script containing the appropriate command options:
/etc/NetworkManager/dispatcher.d/backup
#!/bin/sh
if [ x"$2" = "xup" ] ; then
    rsync --force --ignore-errors -a --delete --bwlimit=2000 --files-from=files.rsync /path/to/backup /location/of/backup
fi
- -a: groups the options -rlptgoD (recursive, links, perms, times, group, owner, devices)
- --files-from: read the list of source paths, relative to /path/to/backup, from the given file
- --bwlimit: limit I/O bandwidth, in kilobytes per second
The script must be owned by root (see NetworkManager#Network services with NetworkManager dispatcher for details).
Automated backup with systemd and inotify
- Due to the limitations of inotify and systemd (see this question and answer), recursive filesystem monitoring is not possible. Although you can watch a directory and its contents, it will not recurse into subdirectories and watch the contents of them; you must explicitly specify every directory to watch, even if that directory is a child of an already watched directory.
- This setup is based on a systemd/User instance.
Instead of running backups on a time-based schedule, such as those implemented in cron, it is possible to run a backup every time one of the files you are backing up changes. systemd.path units use inotify to monitor the filesystem, and can be used in conjunction with systemd.service files to start any process (in this case your rsync backup) based on a filesystem event.
First, create the systemd.path unit that will monitor the files you are backing up:
~/.config/systemd/user/backup.path
[Unit]
Description=Checks if paths that are currently being backed up have changed

[Path]
PathChanged=%h/documents
PathChanged=%h/music

[Install]
WantedBy=default.target
Then create a systemd.service file that will be activated when it detects a change. By default, a service file of the same name as the path unit (in this case backup.path) will be activated, except with the .service extension instead of .path (in this case backup.service).

Consider making the service of Type=oneshot. This allows you to specify multiple ExecStart= parameters, one for each rsync command, that will be executed. Alternatively, you can simply write a script to perform all of your backups, just like cron scripts.

~/.config/systemd/user/backup.service
[Unit]
Description=Backs up files

[Service]
ExecStart=/usr/bin/rsync %h/./documents %h/./music -CERrltm --delete ubuntu:
Now all you have to do is enable/start backup.path like a normal systemd user service (systemctl --user enable --now backup.path) and it will start monitoring file changes and automatically start backup.service.
Differential backup on a week
This approach produces a full backup on each run while keeping a differential copy of only the changed files in a separate directory for each day of the week.
First, create a script containing the appropriate command options:
/etc/cron.daily/backup
#!/bin/sh
DAY=$(date +%A)

if [ -e /location/to/backup/incr/$DAY ] ; then
    rm -fr /location/to/backup/incr/$DAY
fi

rsync -a --delete --quiet --inplace --backup --backup-dir=/location/to/backup/incr/$DAY /path/to/backup/ /location/to/backup/full/
The --inplace option implies --partial and updates destination files in-place.
Snapshot backup
The same idea can be used to maintain a tree of snapshots of your files, in other words, a directory with date-ordered copies of the files. The copies are made using hardlinks, which means that only files that did change will occupy space. Generally speaking, this is the idea behind Apple's Time Machine.
This basic script is easy to implement and creates quick incremental snapshots using the --link-dest option to hardlink unchanged files:
/usr/local/bin/snapbackup.sh
#!/bin/sh
# Basic snapshot-style rsync backup script

# Config
OPT="-aPh"
LINK="--link-dest=/snapshots/username/last/"
SRC="/home/username/files/"
SNAP="/snapshots/username/"
LAST="/snapshots/username/last"
date=$(date "+%Y-%b-%d:_%T")

# Run rsync to create snapshot
rsync $OPT $LINK $SRC ${SNAP}$date

# Remove symlink to previous snapshot
rm -f $LAST

# Create new symlink to latest snapshot for the next backup to hardlink
ln -s ${SNAP}$date $LAST
There must be a symlink to a full backup already in existence as a target for --link-dest. If the most recent snapshot is deleted, the symlink will need to be recreated to point to the most recent remaining snapshot. If --link-dest does not find a working symlink, rsync will proceed to copy all source files instead of only the changes.
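If the last symlink is missing, it can be recreated by hand by pointing it at the most recently modified snapshot directory. The sketch below is a self-contained illustration in a temporary directory; substitute your real $SNAP path. ln -sfn replaces an existing symlink in place.

```shell
#!/bin/sh
# Sketch: point the "last" symlink at the newest snapshot directory.
# Uses a throwaway layout for illustration; adapt SNAP to your setup.
SNAP=$(mktemp -d)/
mkdir "${SNAP}2024-May-01" "${SNAP}2024-May-02"
touch -d '2024-05-01' "${SNAP}2024-May-01"   # make mtimes distinct

# newest snapshot by modification time, ignoring the symlink itself
newest=$(ls -1dt "${SNAP}"*/ | grep -v '/last/$' | head -n 1)
ln -sfn "${newest%/}" "${SNAP}last"

readlink "${SNAP}last"
```

Sorting by modification time (ls -t) is used here because the script's date tags (%Y-%b-%d with month names) do not sort chronologically by name.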
A more sophisticated version keeps an up-to-date full backup at $SNAP/latest and, in case a certain number of files has changed since the last full backup, creates a snapshot $SNAP/$DATETAG of the current full backup, utilizing cp -al to hardlink unchanged files:
/usr/local/bin/rsnapshot.sh
#!/bin/sh

## my own rsync-based snapshot-style backup procedure
## (cc) marcio rps AT gmail.com

# config vars
SRC="/home/username/files/"   # dont forget trailing slash!
SNAP="/snapshots/username"
OPTS="-rltgoi --delay-updates --delete --chmod=a-w"
MINCHANGES=20

# run this process with real low priority
ionice -c 3 -p $$
renice +12 -p $$

# sync
rsync $OPTS $SRC $SNAP/latest >> $SNAP/rsync.log

# check if enough has changed and if so
# make a hardlinked copy named as the date
COUNT=$( wc -l $SNAP/rsync.log | cut -d" " -f1 )
if [ $COUNT -gt $MINCHANGES ] ; then
    DATETAG=$(date +%Y-%m-%d)
    if [ ! -e $SNAP/$DATETAG ] ; then
        cp -al $SNAP/latest $SNAP/$DATETAG
        chmod u+w $SNAP/$DATETAG
        mv $SNAP/rsync.log $SNAP/$DATETAG
        chmod u-w $SNAP/$DATETAG
    fi
fi
To keep things simple, this script can be run from a systemd/Timers unit.
Full system backup
This section is about using rsync to transfer a copy of the entire / tree, excluding a few selected directories. This approach is considered better than disk cloning with dd, since it allows for a different size, partition table and filesystem to be used, and better than copying with cp -a as well, because it allows greater control over file permissions, attributes, Access Control Lists and extended attributes.
rsync will work even while the system is running, but files changed during the transfer may or may not be transferred, which can cause undefined behavior of some programs using the transferred files. For mitigation log out all users and shut down all programs and databases.
This approach works well for migrating an existing installation to a new hard drive or SSD.
Run the following command as root to make sure that rsync can access all system files and preserve the ownership:
# rsync -aAXHv --exclude='/dev/*' --exclude='/proc/*' --exclude='/sys/*' --exclude='/tmp/*' --exclude='/run/*' --exclude='/mnt/*' --exclude='/media/*' --exclude='/lost+found/' / /path/to/backup
By using the -aAX set of options, the files are transferred in archive mode, which ensures that symbolic links, devices, permissions, ownerships, modification times, ACLs and extended attributes are preserved, assuming that the target file system supports the feature. The option -H preserves hard links, but uses more memory.
The --exclude option causes files/directories that match the given patterns to be excluded. Instead, or in conjunction, the --exclude-from=file option excludes files/directories that match patterns (one per line) in file, similar to the example described in #Advanced usage of filter rules but without the +/- syntax.
The directories /dev, /proc, /sys, /tmp and /run are included in the above command, but the contents of those directories are excluded. This is because they are populated on boot, but the directories themselves are not created. /lost+found is filesystem-specific. Quoting the exclude patterns avoids expansion by the shell, which is necessary, for example, when backing up over SSH. Ending the excluded paths with * ensures that the directories themselves are created if they do not already exist.
- If you plan on backing up your system somewhere other than /mnt or /media, do not forget to add it to the list of exclude patterns to avoid an infinite loop.
- If there are any bind mounts in the system, they should be excluded as well so that the bind-mounted contents are copied only once.
- If you use a swap file, make sure to exclude it as well.
- Consider whether you want to back up the /home/ directory. If it contains your data, it might be considerably larger than the system. Otherwise consider excluding unimportant sub-directories such as /home/*/.thumbnails/*, /home/*/.cache/mozilla/*, /home/*/.cache/chromium/* and /home/*/.local/share/Trash/*, depending on the software installed on the system.
- If GVFS is installed, /home/*/.gvfs must be excluded to prevent rsync errors.
- If dhcpcd ≥ 9.0.0 is installed, exclude the /var/lib/dhcpcd/* directory, as it mounts several system directories as sub-directories there.
You may want to include additional rsync options, or remove some, such as the following. See rsync(1) for the full list.
- If you run on a system with very low memory, consider removing the -H option; however, it should be no problem on most modern machines. There can be many hard links on the file system depending on the software used (e.g. if you are using Flatpak). Many hard links reside under the /usr/ directory.
- You may want to add rsync's --delete option if you are running this multiple times to the same backup directory. In this case make sure that the source path does not end with /*, or this option will only have effect on the files inside the subdirectories of the source directory, but it will have no effect on the files residing directly inside the source directory.
- If you use any sparse files, such as virtual disks, Docker images and similar, you should add the -S option.
- The --numeric-ids option will disable mapping of user and group names; instead, numeric group and user IDs will be transferred. This is useful when backing up over SSH or when using a live system to back up a different system disk.
- Choosing the --info=progress2 option instead of -v will show the overall progress info and transfer speed instead of the list of files being transferred.
- To avoid crossing a filesystem boundary when recursing, add the -x/--one-file-system option. This will prevent backing up any mount point in the hierarchy.
Restore a backup
If you wish to restore a backup, use the same rsync command that was executed but with the source and destination reversed.
Advanced usage of filter rules
Instead of specifying include and exclude rules separately rsync can read all of these from a single filter file. rsync then processes the rules in a top-down order; the first matching rule wins.
backup.filter
# Exclude patterns
- .thumbnails/***
- node_modules/***
- venv/***

# Include patterns
+ /Documents/***
+ /Books/***
+ /Music/***

# Exclude everything else
- /**
*** is a special rsync pattern which matches a directory and all of its contents recursively.
Check rsync(1) § PATTERN MATCHING RULES and rsync(1) § FILTER RULES IN DEPTH for more details.
Then run rsync with:
$ rsync -aAXHv --filter="merge backup.filter" $SRC $DEST
The key is the --filter="merge ..." parameter, which takes the filter file and parses the rules in order for each synced file.
Copy from list of paths
An alternative to the #Advanced usage of filter rules method is to use the --files-from option. This can take input from a text file containing a list of directory or file paths, one per line. Note that the -r flag must be manually specified for this option if recursive directory copying is wanted, even when -a is already included.
For example, a list of directories and all recursive directories can be archived with the following:
$ rsync -aAXHvr --files-from="dir_list.txt" $SRC $DEST
File system cloning
rsync provides a way to do a copy of all data in a file system while preserving as much information as possible, including the file system metadata. It is a procedure of data cloning on a file system level where source and destination file systems do not need to be of the same type. It can be used for backing up, file system migration or data recovery.
rsync's archive mode comes close to being fit for the job, but it does not back up the special file system metadata such as access control lists, extended attributes or sparse file properties. For successful cloning at the file system level, some additional options need to be provided:
rsync -qaHAXS SOURCE_DIR DESTINATION_DIR
And their meaning is (from the manpage):
--hard-links, -H preserve hard links --acls, -A preserve ACLs (implies --perms) --xattrs, -X preserve extended attributes --sparse, -S turn sequences of nulls into sparse blocks
Additionally, use -x if you have other filesystems mounted under the tree that you want to exclude from the copy.
Mind the presence or absence of a trailing slash on SOURCE_DIR, for the reason mentioned in #Trailing slash caveat.

The produced copy can be simply reread and checked (for example after a data recovery attempt) at the file system level with diff's recursive option:
diff -r SOURCE_DIR DESTINATION_DIR
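As a toy illustration of the check, using temporary directories and no rsync: diff -r exits with status 0 when the trees match and non-zero when they differ.

```shell
#!/bin/sh
# diff -r compares directory trees recursively.
a=$(mktemp -d)
b=$(mktemp -d)
echo data > "$a/f"
cp "$a/f" "$b/f"

diff -r "$a" "$b" && echo "trees identical"

echo changed > "$b/f"
diff -r "$a" "$b" > /dev/null || echo "trees differ"
```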
It is possible to do a successful file system migration by using rsync as described in this article and updating the fstab and boot loader as described in Migrate installation to new hardware. This essentially provides a way to convert any root file system to another one.
As a daemon
rsync can be run as a daemon on a server, listening on TCP port 873.
Edit the template /etc/rsyncd.conf, configure a share and start rsyncd.service.
The provided units, rsyncd.service and rsyncd@.service, ship with systemd sandboxing options. The change for ProtectHome has been commented out, but the security feature ProtectSystem=full under the [Service] section is still active. This makes the /boot/, /etc/ and /usr/ directories read-only. If you need rsyncd to write to system directories, you can edit the unit and set ProtectSystem=off in the [Service] section of the overriding snippet.

Usage from a client, e.g. to list server content:
$ rsync rsync://server/share
To transfer a file from client to server:
$ rsync local-file rsync://server/share/
Consider opening TCP port 873 in the firewall, and using user authentication.
Example configuration
Sharing from a list of files
/etc/rsyncd.conf
...
# Needed when crossing filesystem boundaries.
#use chroot = no
read only = yes
...

[sync]
path = /
# List of files to copy.
include from = /backup.list
# Exclude the rest.
exclude = *
Inside the file list, all the intermediary paths are necessary, except when the *** wildcard is used:
/backup.list
/etc/
/etc/conf.d/
/etc/conf.d/hwclock
/etc/fonts/***
See also
- More usage examples can be searched in the Community Contributions and General Programming forums
- Howto – local and remote snapshot backup using rsync with hard links: includes file deduplication with hard links, MD5 integrity signature, 'chattr' protection, filter rules, disk quota, and a retention policy with exponential distribution (backup rotation keeping more recent backups than older ones)
- Using SSH keys/identity files with rsync