Difference between revisions of "Improve pacman performance"

From ArchWiki

Revision as of 16:39, 14 December 2008


Improving database access speeds

Pacman stores all package information in a collection of small files, one for each package. Improving database access speeds reduces the time taken in database-related tasks, e.g. searching packages and resolving package dependencies.

The safest and easiest method is to run

pacman-optimize && sync

as root. This attempts to place all the small files together in one (physical) location on the hard disk, so that the disk head does not have to move as much when accessing the packages. The method is safe, but an improvement is not guaranteed: the result depends on your filesystem, disk usage and free-space fragmentation.
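
To get a rough feel for how many small files are involved, the following sketch counts the regular files below a directory. The function name count_db_files is made up for illustration, and /var/lib/pacman is assumed to be the database location; adjust it if your setup differs.

```bash
#!/bin/bash
# Hypothetical helper (not part of pacman): count regular files below a directory.
count_db_files() {
  local dir=$1
  # tr strips the padding that some wc implementations add
  find "$dir" -type f | wc -l | tr -d ' '
}

# Example, assuming the database lives in /var/lib/pacman:
#   count_db_files /var/lib/pacman
```

Running it before and after installing a few packages shows how the number of files grows with the number of packages.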

Further tweaks

ody has posted a script on the forum that replaces the current Pacman database with a loopback filesystem, which ensures that all the small files stay together on the hard disk. Several users have reported great improvements, but problems have also been reported, so do not do this unless you are an expert user.

To use ody's script you must have a kernel compiled with loopback filesystem support. The default kernels already have this, so you only need to be concerned with this if you compile your own custom kernel.

Improving download speeds

First, if your download speeds have slowed to a crawl, make sure you are using one of the many mirrors and not ftp.archlinux.org, which has been throttled since March 2007.

Pacman's speed in downloading packages can be improved by using a different application to download packages instead of Pacman's built-in file downloader.

In all cases, make sure you have the latest Pacman before doing any modifications.

pacman -Sy pacman

Using wget

This is also very handy if you need more powerful proxy settings than pacman's built-in capabilities.

To use wget, first install it with pacman -S wget and then modify /etc/pacman.conf by adding the following line to the [options] section:

XferCommand = /usr/bin/wget --passive-ftp -c %u

Instead of putting wget parameters in /etc/pacman.conf, you can also modify the wget configuration file directly (the system-wide file is /etc/wgetrc, per user files are $HOME/.wgetrc).


Using aria2

According to the aria2 website, aria2 is "a download utility with resuming and segmented downloading. Supports HTTP/HTTPS/FTP/BitTorrent/Metalink." This means that you can make several HTTP/FTP connections to an Arch mirror at the same time, which should result in an increase in download speeds.

Install it with pacman -S aria2 and then edit /etc/pacman.conf by adding the following line to the [options] section:

XferCommand = /usr/bin/aria2c -s 2 -m 2 -d / -o %o %u

Let's run over the options here:

  • /usr/bin/aria2c - the location of the aria2 application
  • -s 2 - use 2 concurrent connections (you can set this higher if you want, but it's not going to do a whole lot)
  • -m 2 - make 2 attempts to download the package per mirror
  • -d / - use / as the download directory, so that the absolute path given by -o %o is used as-is
  • -o %o - output to the file pacman specifies
  • %u - download the file pacman specifies
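
As a rough illustration of what the %u and %o placeholders turn into, the sketch below performs the same kind of substitution in bash. This is not pacman's actual code: the function name expand_xfer_command, the mirror URL and the output path are all made up for the example.

```bash
#!/bin/bash
# Hypothetical illustration of the %u / %o substitution; not pacman's implementation.
expand_xfer_command() {
  local template=$1 url=$2 out=$3
  local cmd=${template//\%u/$url}   # replace every %u with the download URL
  cmd=${cmd//\%o/$out}              # replace every %o with the output file
  printf '%s\n' "$cmd"
}

# Example values (assumptions, for illustration only)
expand_xfer_command '/usr/bin/aria2c -s 2 -m 2 -d / -o %o %u' \
  'http://mirror.example.org/core/os/i686/foo-1.0-1-i686.pkg.tar.gz' \
  '/var/cache/pacman/pkg/foo-1.0-1-i686.pkg.tar.gz'
```

The printed line is the command that would be run for that one file, which can help when checking that a new XferCommand has its options in the right place.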


Powerpill

Powerpill is a wrapper for pacman that uses aria2 to download packages. Unlike the other aria2 solutions, powerpill uses simultaneous downloads for all files and segmented downloads only for larger files, which really makes the most of your bandwidth without wasting time splitting small files unnecessarily. You can read more about powerpill and get the latest version in this thread in the Community Contributions forum. An AUR package is available.

Pacget Aria2 Mirror Script

This script can greatly improve download speed for broadband users. It takes the download URL from pacman, looks up the mirror list in /etc/pacman.d/mirrorlist, and passes those mirrors to aria2. aria2 then downloads from several servers at the same time (10-20 if you raise the limit; the script's default maximum is 4), which should max out the download speed for most people on broadband.
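
The mirror lookup can be sketched on its own as follows. This is a simplified stand-in for the corresponding step in the script, not a copy of it: the function name mirror_urls and its argument order are assumptions made for the example.

```bash
#!/bin/bash
# Hypothetical helper: print up to <max> download URLs for <filename> in <repo>,
# built from the "Server = ..." lines of a mirror list file.
# Usage: mirror_urls <mirrorlist> <repo> <filename> <max>
mirror_urls() {
  local mirrorlist=$1 repo=$2 filename=$3 max=$4
  awk -v repo="$repo" -v file="$filename" -v max="$max" '
    /^Server *= */ && n < max {
      sub(/^Server *= */, "")   # keep only the URL part of the line
      gsub(/\$repo/, repo)      # fill in the repository name
      print $0 "/" file         # append the file name
      n++
    }' "$mirrorlist"
}

# Example (assumes the usual mirror list location):
#   mirror_urls /etc/pacman.d/mirrorlist core foo-1.0-1-i686.pkg.tar.gz 4
```

Each printed line is one candidate URL for the same file; aria2c accepts several such URLs on its command line and downloads from them in parallel.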

Note that you have to put 'exec' before /usr/bin/pacget in the XferCommand. This is needed so that when you terminate pacget or aria2 (with the process ID used by pacget), pacman terminates as well. Otherwise pacman would carry on downloading a file after you have told it to stop.
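
What exec changes can be seen with a small experiment: with exec, the command takes over the process of the shell that launched it; without exec, it runs as a separate child process. The two function names below are made up for this demonstration and are not part of pacget.

```bash
#!/bin/bash
# Hypothetical demo functions; each prints two process IDs, one per line.

# With exec: the inner shell replaces the outer one, so both PIDs are the same.
pids_with_exec() {
  bash -c 'echo $$; exec bash -c "echo \$\$"'
}

# Without exec: the inner shell is a child process with its own PID.
# The trailing "true" keeps bash from exec'ing the last command on its own.
pids_without_exec() {
  bash -c 'echo $$; bash -c "echo \$\$"; true'
}

pids_with_exec
pids_without_exec
```

The first pair of numbers is identical and the second pair differs, which is why a signal aimed at the launching shell reaches the downloader directly only in the exec case.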

WARNING: You may experience a lot of problems if you become greedy. Just choose a handful of reliable servers (5 is already a lot) that sync regularly with the master server. Do NOT choose out-of-date mirrors, as these may cause problems such as corrupted downloads. Also, ftp.archlinux.org resolves to two IPs. You may want to choose only one of them and hard code ftp.archlinux.org and the chosen IP address in /etc/hosts.

WARNING: Needs testing with custom repos.

#!/bin/bash

# ------------ Begin Configuration ------------ #

# The log file
LOG=/var/log/pacget.log
# Number of connections per server (don't go beyond 2, please)
CONNECTIONS=2
# Max number of servers to download from (follows ordering in include file)
SERVERS=4
# Maximum download speed (0 = unrestricted)
MAX_SPEED=0
# Minimum download speed (0 = don't care)
MIN_SPEED=0
# Maximum tries per download
MAX_TRIES=2
# Server timeout period
TIMEOUT=15
# Passive FTP or not: 'yes' or 'no'
FTP_PASV="no"
# 'none' or 'prealloc'
FILE_ALLOC="none"
# Use color in messages
USE_COLOR="yes"

# ------------- End Configuration ------------- #

msg() {
  echo ""
  if [ "${USE_COLOR}" = "yes" ]; then
    echo -ne "   \033[1;34m->\033[1;0m \033[1;1m${1}\033[1;0m" >&2
  else
    echo -n "   -> ${1}" >&2
  fi
}

error() {
  if [ "$USE_COLOR" = "yes" ]; then
    echo -e "\033[1;31m==> ERROR:\033[1;0m \033[1;1m$1\033[1;0m" >&2
  else
    echo "==> ERROR: $1" >&2
  fi
}

ARIA2_BIN=$(which aria2c 2> /dev/null)

# ----- do some checks first -----
if [ ! -x "$ARIA2_BIN" ]; then
  error "aria2c was not found or isn't executable."
  exit 1
fi

if [ $# -ne 2 ]; then
  error "Incorrect number of arguments"
  exit 1
fi

filename=$(basename "${1}")
server=${1%/${filename}}

# Determine which repo is being used
repo=$( awk -F'/' '$(NF-2)~/^(community|core|extra|testing)$/{print $(NF-2)}' <<<"$server" )
[ -z "$repo" ] && repo="custom"

# Override number of connections for db files
[[ ${filename} = *.db.tar.gz ]] && CONNECTIONS=1

# For db files, or when using a custom repo (which most likely doesn't have any mirror),
# use only the URL passed by pacman; Otherwise, extract the list of servers (from the include file of the repo) to download from
url=${1}
if ! [[ ${filename} = *.db.tar.gz || ${repo} = "custom" ]]; then
        mirrorlist=$( awk -F' *= *' '$0~"^\\["r"\\]",/Include *= */{l=$2} END{print l}' r=${repo} /etc/pacman.conf )
        # Only build the list of mirror URLs if an include file was found for this repo
        [ -n "${mirrorlist}" ] &&
        url=$( sed -r '/^Server *= */!d; s/Server *= *//; s/\$repo'"/${repo}/; s:$:/${filename}:" "${mirrorlist}" | head -n ${SERVERS} )
        # awk way of doing the same thing:
        #url=$( awk -F' *= *' 'i<n&&$1=="Server"{gsub(/\$repo/, r, $2);print $2"/"f;i++}' n=${SERVERS} r=${repo} f=${filename} "${mirrorlist}" )
fi

# Passive FTP or not?
[ "${FTP_PASV}" = "yes" ] && OPT_FTP_PASV="--ftp-pasv" || OPT_FTP_PASV=""

msg "Downloading ${filename}"

# Stop here if the package cache directory is not accessible
cd /var/cache/pacman/pkg/ || exit 1

${ARIA2_BIN} --log=${LOG} --timeout=${TIMEOUT} --max-tries=${MAX_TRIES} --allow-overwrite=true \
             --split=${CONNECTIONS} ${OPT_FTP_PASV} --file-allocation=${FILE_ALLOC} \
             --lowest-speed-limit=${MIN_SPEED} --max-download-limit=${MAX_SPEED} \
             ${url} --out=${filename}.pacget && [ ! -f ${filename}.pacget.aria2 ] && mv ${filename}.pacget ${2} && chmod 644 ${2}

exit $?

Save this script as /usr/bin/pacget and make it executable:

chmod 755 /usr/bin/pacget

In /etc/pacman.conf, in the [options] section, the following needs to be added:

XferCommand = exec /usr/bin/pacget %u %o

PS: If ftp.archlinux.org is the first server listed in your include files (/etc/pacman.d/*), problems may occur when the mirrors you are using have not yet synced. To get the most out of this script, choose a mirror that suits you and syncs in a timely manner, then put it at the top of the server lists. This prevents downloading only from ftp.archlinux.org when the mirrors have not yet synced.
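
Moving a preferred mirror to the top of a list can also be done with a short helper. The sketch below is an assumption-laden example rather than part of pacget: the function name prefer_mirror and the host mirror.example.org are invented, and the result is printed so that you can review it before replacing any file.

```bash
#!/bin/bash
# Hypothetical helper: print a mirror list with every line that mentions
# <host> moved to the top; all other lines keep their original order.
# Usage: prefer_mirror <host> <mirrorlist>
prefer_mirror() {
  local host=$1 file=$2
  grep -F -- "$host" "$file" || true      # lines for the preferred mirror first
  grep -v -F -- "$host" "$file" || true   # then everything else
}

# Example (host name is a placeholder):
#   prefer_mirror mirror.example.org /etc/pacman.d/mirrorlist > /tmp/mirrorlist.new
```

Note that comment lines mentioning the host are moved as well, so check the output before copying it over the original list.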

Using other applications

There are other downloading applications that you can use with Pacman. Here they are, and their associated XferCommand settings:

  • snarf: XferCommand = /usr/bin/snarf -N %u
  • lftp: XferCommand = /usr/bin/lftp -c pget %u
  • axel: XferCommand = /usr/bin/axel -n 2 -v -a -o %o %u