Difference between revisions of "Improve pacman performance"

From ArchWiki

Revision as of 18:03, 25 August 2010

Improving database access speeds

Pacman stores all package information in a collection of small files, one for each package. Improving database access speeds reduces the time taken in database-related tasks, e.g. searching packages and resolving package dependencies.

The safest and easiest method is to run

# pacman-optimize && sync

as root. This attempts to place all the small files together in one (physical) location on the hard disk, so that the disk head does not have to move as much when accessing the packages. This method is safe, but it is not guaranteed to help: the result depends on your filesystem, disk usage and free-space fragmentation.

Pacman-cage

pacman-cage is a script that puts the pacman database in a single loop file containing its own filesystem, which can speed up access times. This very simple alteration improves pacman's speed for tasks like searching and updating. The script makes a backup (at installation only) in case something goes wrong, but it has caused some users to lose their database (e.g. when used in a chroot). Use with caution.

pacman-cage is now part of pactools in the AUR and has been renamed pt-pacman-cage (its counterpart is pt-pacman-uncage).

Improving download speeds

First, if your download speeds have slowed to a crawl, make sure you are using one of the many mirrors and not ftp.archlinux.org, which has been throttled since March 2007.

Pacman's speed in downloading packages can be improved by using a different application to download packages instead of Pacman's built-in file downloader.

In all cases, make sure you have the latest Pacman before doing any modifications.

# pacman -Syu

Using wget

Using wget is also very handy if you need more powerful proxy settings than pacman's built-in capabilities.

To use wget, first install it with pacman -S wget and then modify /etc/pacman.conf by adding the following line to the [options] section:

XferCommand = /usr/bin/wget --passive-ftp -c %u

Instead of putting wget parameters in /etc/pacman.conf, you can also modify the wget configuration file directly (the system-wide file is /etc/wgetrc, per user files are $HOME/.wgetrc).

Using aria2

aria2 is a lightweight download utility that supports resuming and segmented downloading over HTTP/HTTPS and FTP. This means that you can make several HTTP/HTTPS and FTP connections to an Arch mirror at the same time, which should result in an increase in download speeds.

Install aria2 with pacman -S aria2 and then edit /etc/pacman.conf by adding the following line to the [options] section:

XferCommand = /usr/bin/aria2c --allow-overwrite=true -c --file-allocation=none --log-level=error -m2 --max-file-not-found=2 --no-conf --remote-time=true --summary-interval=60 -t5 -d / -o %o %u
  • /usr/bin/aria2c : The full path of the aria2 executable
  • --allow-overwrite=true : Restart the download if the corresponding control file does not exist (default: false)
  • -c : Continue downloading a partially downloaded file
  • --file-allocation=none : Do not pre-allocate file space before the download begins (default: prealloc) (see note 1)
  • --log-level=error : Set the log level to output errors only (default: debug)
  • -m2 : Make at most 2 attempts to download the specified package(s) per mirror (default: 5)
  • --max-file-not-found=2 : Force the download to fail if the server reports "file not found" 2 times without a single byte being received (default: 0)
  • --no-conf : Disable loading an aria2.conf file if it exists (typically ~/.aria2/aria2.conf)
  • --remote-time=true : Retrieve the timestamps of the remote file(s) and apply them to the local file(s) (default: false)
  • --summary-interval=60 : Output a download progress summary every 60 seconds (default: 60) (see note 2)
  • -t5 : Set a 5 second timeout per mirror after a connection is established (default: 60)
  • -d / : The directory in which to store the downloaded file(s); the file name given with -o is taken relative to it
  • -o %o : The file name(s) of the downloaded file(s), passed by pacman as %o
  • %u : The URL(s) of the file(s) to download, passed by pacman as %u

Note 1: --file-allocation=falloc is recommended for newer file systems such as ext4 (with extents support), btrfs or xfs, as it allocates large files (GB) almost instantly. Do not use falloc with legacy file systems such as ext3: there it takes about as long as prealloc and blocks the aria2 process from downloading until allocation finishes.

Note 2: --summary-interval=0 suppresses the download progress summary output and may improve overall performance. Errors, if encountered, will be output regardless.
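
To make the role of the two placeholders concrete, here is a minimal sketch of the substitution described above: %u is replaced by the URL and %o by the output file name before the command is run. The helper names replace_all and expand_xfercommand, the mirror host and the file paths are made up for illustration. This is not pacman's actual code, only a shell approximation that should run in any POSIX shell.

```shell
#!/bin/sh
# Illustrative only: approximates how the %u and %o placeholders in an
# XferCommand template are filled in. Not taken from pacman itself.

# replace_all STRING SEARCH REPLACEMENT
# Print STRING with every literal occurrence of SEARCH replaced.
# SEARCH must not be empty.
replace_all() {
  rest=$1
  result=
  while :; do
    case $rest in
      *"$2"*)
        result=$result${rest%%"$2"*}$3   # text before the match + replacement
        rest=${rest#*"$2"}               # continue after the match
        ;;
      *) break ;;
    esac
  done
  printf '%s\n' "$result$rest"
}

# expand_xfercommand TEMPLATE URL OUTFILE
# Fill in %u first, then %o.
expand_xfercommand() {
  replace_all "$(replace_all "$1" '%u' "$2")" '%o' "$3"
}

# Example with a hypothetical mirror and output path:
expand_xfercommand '/usr/bin/aria2c -d / -o %o %u' \
  'http://mirror.example/core/os/i686/foo-1.0-1-i686.pkg.tar.gz' \
  '/var/cache/pacman/pkg/foo-1.0-1-i686.pkg.tar.gz'
```

Printing the expanded command like this can be a convenient way to check an XferCommand line by eye before putting it into /etc/pacman.conf.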

Powerpill

Powerpill is a wrapper for pacman that uses aria2 to download packages. Unlike the other aria2 solutions, powerpill uses simultaneous downloads for all files and segmented downloads only for larger files, which really makes the most of your bandwidth without wasting time splitting small files unnecessarily. Powerpill is available in the community repo.

# pacman -S powerpill

For more info, see the Powerpill wiki article.

Using airpac

In a nutshell, airpac is an aria2c wrapper for pacman. Unlike powerpill, which acts as a frontend to pacman, airpac serves as a backend downloader for pacman. As far as downloading is concerned, it behaves similarly to powerpill, since both use aria2c to actually download the files. Because it is a backend, however, it cannot download multiple packages simultaneously as powerpill can.

Essentially, airpac is a Python implementation of the pacget script below. The main difference lies in the handling of aria2c output: airpac shows only the most relevant information, i.e. the download progress, although it does not currently use a progress bar (this may come in the near future). airpac also caches the db files so that they are not downloaded again for every pacman -Sy. On the downside, this breaks pacman -Syy, since airpac has no way of knowing which options pacman was executed with. As a workaround, use pacman -Sc to delete the cached files in /var/lib/pacman/.airpac.

The configuration file is located in /etc/airpac.conf. This is actually an aria2c config file. Because of this, the user can directly configure how aria2c is used by airpac without meddling with airpac's code. For more info about the available options, consult the aria2c manpage.

airpac also uses the Server Performance Profile feature of aria2c by default. The statistics file is located in /var/lib/airpac.stats. The default URI selector is adaptive.

Usage in /etc/pacman.conf

XferCommand = /usr/bin/airpac %u %o

pacget (aria2) Mirror Script

This script can greatly improve the download speed for broadband users. It passes the servers in /etc/pacman.d/mirrorlist to aria2 as mirrors, so that aria2 downloads from multiple servers simultaneously, which can give a large boost in download speed.
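
The core idea, turning the single URL passed by pacman into the same file on several mirrors, can be sketched as follows. This is a simplified illustration rather than the script itself: the function name build_mirror_urls and the example hosts are hypothetical, and it assumes that the repo name is the third path component from the end of the server URL (repo/os/arch), as in typical mirrorlist entries.

```shell
#!/bin/sh
# Sketch: expand one package URL into the same file on several mirrors.
# Host names used with this function in examples are made up.

# build_mirror_urls URL MIRRORLIST MAX
# Print at most MAX URLs, one per active "Server = ..." line in MIRRORLIST.
build_mirror_urls() {
  filename=${1##*/}   # last path component, e.g. foo-1.0-1-i686.pkg.tar.gz
  server=${1%/*}      # URL without the file name
  # Assumed layout: .../<repo>/os/<arch>, so the repo is field NF-2.
  repo=$(printf '%s\n' "$server" | awk -F/ '{ print $(NF-2) }')
  # Keep only uncommented Server lines, substitute the literal $repo
  # placeholder, then append the file name to each server URL.
  sed -n 's/^Server *= *//p' "$2" |
    sed "s|\\\$repo|$repo|; s|\$|/$filename|" |
    head -n "$3"
}

# Usage (prints up to 10 candidate URLs for aria2):
# build_mirror_urls "$url" /etc/pacman.d/mirrorlist 10
```

Commented-out Server lines are skipped because the first sed expression only matches lines that begin with "Server".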

Note that you have to put 'exec' before /usr/bin/pacget in the XferCommand. This is needed so that when you terminate pacget or aria2 (with the process id used by pacget), pacman also terminates. Otherwise pacman would carry on downloading a file after you have told it to stop.
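
The effect of exec can be seen with a small experiment. This sketch only illustrates the shell built-in, and it assumes (as the explanation above implies) that the XferCommand is run through a shell: exec replaces the shell process with the new command instead of starting a child, so both commands report the same process ID.

```shell
#!/bin/sh
# The outer shell prints its PID, then exec replaces it with a second shell,
# which prints its own PID. With exec, no new process is created.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
first=$(printf '%s\n' "$pids" | head -n 1)
second=$(printf '%s\n' "$pids" | tail -n 1)

if [ "$first" = "$second" ]; then
  echo "same process: $first"
else
  echo "different processes: $first and $second"
fi
```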

WARNING: You may experience problems if the mirrors used are out of sync or simply not up to date. Use the Reflector script to generate a list of up-to-date and fast mirrors. Also, ftp.archlinux.org resolves to two IPs. You may want to choose only one of them and hard-code ftp.archlinux.org and the chosen IP address in /etc/hosts.

/usr/bin/pacget

#!/bin/bash

msg() {
  echo ""
  echo -e "   \033[1;34m->\033[1;0m \033[1;1m${1}\033[1;0m" >&2
}

error() {
  echo -e "\033[1;31m==> ERROR:\033[1;0m \033[1;1m$1\033[1;0m" >&2
}

CONF=/etc/pacget.conf
STATS=/etc/pacget.stats
ARIA2=$(which aria2c 2> /dev/null)

# ----- do some checks first -----
if [ ! -x "$ARIA2" ]; then
  error "aria2c was not found or isn't executable."
  exit 1
fi

if [ $# -ne 2 ]; then
  error "Incorrect number of arguments"
  exit 1
fi

filename=$(basename "$1")
server=${1%/$filename}

# Determine which repo is being used
repo=$(awk -F'/' '$(NF-2)~/^(community|core|extra|testing)$/{print $(NF-2)}' <<< "$server")
[ -z "$repo" ] && repo="custom"

# For db files, or when using a custom repo (which most likely doesn't have any mirror),
# use only the URL passed by pacman; Otherwise, extract the list of servers (from the include file of the repo) to download from
url=$1
if ! [[ $filename = *.db.tar.gz || $repo = "custom" ]]; then
  mirrorlist=$(awk -F' *= *' '$0~"^\\["r"\\]",/Include *= */{l=$2} END{print l}' r=$repo /etc/pacman.conf)
  # Only build a mirror URL list if an Include file was found for this repo
  if [ -n "$mirrorlist" ]; then
    num_conn=$(grep ^split $CONF | cut -d'=' -f2)
    url=$(sed -r '/^Server *= */!d; s/Server *= *//; s/\$repo'"/$repo/; s:$:/$filename:" $mirrorlist | head -n $(($num_conn * 2)))
  fi
fi

msg "Downloading $filename"
cd /var/cache/pacman/pkg/ || exit 1

touch $STATS

$ARIA2 --conf-path=$CONF --max-tries=1 --max-file-not-found=5 \
  --uri-selector=adaptive --server-stat-if=$STATS --server-stat-of=$STATS \
  --allow-overwrite=true --remote-time=true --log-level=error --summary-interval=0 \
  $url --out="${filename}.pacget" && [ ! -f "${filename}.pacget.aria2" ] && mv "${filename}.pacget" "$2" && chmod 644 "$2"

exit $?

/etc/pacget.conf

# The log file
log=/var/log/pacget.log
# Number of servers to download from
split=5
# Maximum download speed (0 = unrestricted)
max-download-limit=0
# Minimum download speed (0 = don't care)
lowest-speed-limit=0
# Server timeout period
timeout=5
# Passive FTP or not
ftp-pasv=true
# 'none' or 'prealloc'
file-allocation=none

Save the script as /usr/bin/pacget and the configuration as /etc/pacget.conf, then make the script executable:

# chmod 755 /usr/bin/pacget

In /etc/pacman.conf, in the [options] section, the following needs to be added:

XferCommand = exec /usr/bin/pacget %u %o

PS: If ftp.archlinux.org is the first server listed in your include files (/etc/pacman.d/*), problems may occur when the mirrors you are using have not yet synced. To get the most out of this script, choose a mirror that suits you and syncs in a timely manner, and put it at the top of the server lists. This prevents downloading only from ftp.archlinux.org when the other mirrors have not yet synced. The rankmirrors Python script can be useful here.

Using other applications

There are other download applications that you can use with pacman. They are listed here with their associated XferCommand settings:

  • snarf: XferCommand = /usr/bin/snarf -N %u
  • lftp: XferCommand = /usr/bin/lftp -c pget %u
  • axel: XferCommand = /usr/bin/axel -n 2 -v -a -o %o %u

Choosing the fastest mirror

When downloading packages, pacman uses the mirrors in the order they appear in /etc/pacman.d/mirrorlist. The mirror at the top of the list by default, however, may not be the fastest for you.

Choosing a local mirror

The simple way is to edit the mirrorlist file and place a local mirror at the top of the list. pacman will then prefer this mirror.
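
As an illustration of this edit, the sketch below prints a mirrorlist with a chosen server moved to the top. The function name prefer_mirror and the host names are hypothetical, and the output is written to stdout so that the original file is never modified in place. It assumes entries are written exactly as "Server = URL".

```shell
#!/bin/sh
# prefer_mirror SERVER_URL MIRRORLIST
# Print MIRRORLIST with SERVER_URL as the first Server entry.
prefer_mirror() {
  printf 'Server = %s\n' "$1"
  # Drop an identical existing entry so the server is not listed twice.
  # -F treats the pattern as a fixed string, -x matches whole lines only.
  grep -v -F -x "Server = $1" "$2"
}

# Usage (review the result before replacing the real file):
# prefer_mirror 'http://mirror.example/$repo/os/i686' /etc/pacman.d/mirrorlist > /tmp/mirrorlist.new
```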

Alternatively, the pacman.conf file can be edited by placing a local mirror before the line sourcing the mirrorlist file, i.e. where it says "add your preferred servers here". It is safer to use the same server for each repository.

Using rankmirrors

You can use rankmirrors to rank pacman mirrors by their connection and opening speed.

Back up the original in case any problems come up:

mv /etc/pacman.d/mirrorlist /etc/pacman.d/mirrorlist.org

Then run rankmirrors to test and add the five fastest mirrors:

rankmirrors -n 5 /etc/pacman.d/mirrorlist.org > /etc/pacman.d/mirrorlist
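
Conceptually, the selection step amounts to sorting mirrors by a measured time and keeping the first few. The sketch below shows only that step on made-up timings. The function name fastest_mirrors, the input format and the hosts are assumptions for illustration, and it does not reproduce how rankmirrors actually measures or ranks servers.

```shell
#!/bin/sh
# fastest_mirrors N
# Read "<seconds> <url>" lines on stdin and print the N fastest mirrors
# as Server lines, fastest first.
fastest_mirrors() {
  sort -n | head -n "$1" | while read -r _seconds url; do
    printf 'Server = %s\n' "$url"
  done
}

# Example with invented timings:
printf '%s\n' \
  '0.90 http://slow.example/$repo/os/i686' \
  '0.12 http://fast.example/$repo/os/i686' \
  '0.40 http://mid.example/$repo/os/i686' | fastest_mirrors 2
```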

See the help for more information.

rankmirrors -h

After changing mirrors

After changing your mirror, it is a good idea to refresh the pacman database. Using two y's forces a download of fresh copies of the master package lists from the server, even if they are thought to be up to date.

# pacman -Syy

Sharing packages over your LAN

If you run several Arch boxes on your LAN, you can share packages among them to greatly decrease your download times. Keep in mind that you should not share between different architectures (i.e. i686 and x86_64), or you will run into trouble. There are two ways to achieve this:

The do-it-yourself way

Get your hands dirty: http://wiki.archlinux.org/index.php/Howto_Upgrade_via_Home_Network

The easy way

Install and configure Xyne's pkgd: http://xyne.archlinux.ca/info/pkgd

Related forum thread