This article is a retrospective analysis and basic rundown about gaining performance in Arch Linux.
- 1 The basics
- 2 Storage devices
- 3 CPU
- 4 Network
- 5 Graphics
- 6 RAM and swap
- 7 Boot time
- 8 Application-specific tips
The basics
Know your system
The best way to tune a system is to target the bottlenecks, that is the subsystems that limit the overall speed. They usually can be identified by knowing the specifications of the system, but there are some basic indications:
- If the computer becomes slow when big applications, like OpenOffice and Firefox, are running at the same time, then there is a good chance the amount of RAM is insufficient. To verify available RAM, use this command and check the line beginning with -/+buffers:
$ free -m
- If boot time is really slow, and if applications take a lot of time to load the first time they are launched, but run fine afterwards, then the hard drive is probably too slow. The speed of a hard drive can be measured using the hdparm command:
# hdparm -t /dev/harddrive
This is only the raw read speed of the hard drive, and is not a valid benchmark, but a value above 40 MB/s can be considered decent on an average system.
- If the CPU load is consistently high even when RAM is available, then lowering CPU usage should be a priority. CPU load can be monitored in many ways, for example with the top command:
$ top
- If the only applications lagging are the ones using direct rendering, meaning they use the graphics card, like video players and games, then improving graphics performance should help. To verify this, try running this command for 20 seconds:
$ glxgears
This is not a valid benchmark either, but a value of 300 fps or less can be considered very low on an average system. If that is the case, then maybe direct rendering simply is not enabled. This is indicated by the glxinfo command:
$ glxinfo | grep direct
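The RAM and disk checks above can also be scripted. The following is a minimal sketch and not part of the original page: the function names mem_available_mib and hdparm_read_speed are made up for illustration, and the first one assumes a kernel that exposes a MemAvailable line in /proc/meminfo (on older systems the -/+buffers line of free serves the same purpose). Both functions read text on standard input, so they can be tried on saved output as well as on a live system.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not standard tools.

# Print the MemAvailable value from /proc/meminfo-style text as whole MiB.
# Assumption: the input has a "MemAvailable:" line, reported in kB.
# Live use:  mem_available_mib < /proc/meminfo
mem_available_mib() {
    awk '$1 == "MemAvailable:" { printf "%d\n", $2 / 1024 }'
}

# Print the MB/sec figure from the output of "hdparm -t".
# Live use (as root):  hdparm -t /dev/harddrive | hdparm_read_speed
hdparm_read_speed() {
    awk '/Timing buffered disk reads/ { print $(NF - 1) }'
}
```

The number printed by hdparm_read_speed can then be compared against the rough 40 MB/s guideline given above.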
The first thing to do
The simplest and most efficient way of improving overall performance is to run fewer and lighter applications. This can be achieved by:
- Changing the desktop environment to a lighter one. Good choices are LXDE, Xfce, or a standalone window manager like Openbox.
- Using lightweight applications. See Lightweight Software and the Light and Fast Applications Awards threads in the forum: 2007, 2008, 2009, and 2010.
- Removing unnecessary daemons in rc.conf.
Almost all tuning brings drawbacks. Lighter applications usually come with fewer features, and some tweaks may make a system unstable or simply require time to implement and maintain. This page tries to highlight those drawbacks, but the final judgment rests with the user.
The effects of optimization are often difficult to judge. They can however be measured by benchmarking tools.
Storage devices
Choosing and tuning your filesystem
Choosing the best filesystem for a specific system is very important because each has its own strengths. The beginner's guide provides a short summary of the most popular ones. You can also find relevant articles here.
- XFS: Excellent performance with large files. Low speed with small files. A good choice for /home.
- Reiserfs: Excellent performance with small files. A good choice for /var.
- Ext3: Average performance, reliable.
- Ext4: Great overall performance, reliable, has performance issues with sqlite and some other databases.
- JFS: Good overall performance, very low CPU usage.
- Btrfs: Great overall performance (better than ext4) and lots of features. Expected to be reliable once it becomes stable, but still in heavy development and considered unstable. Do not use this filesystem yet unless you know what you are doing and are prepared for potential data loss.
Mount options offer an easy way to improve speed without reformatting. They can be set using the mount command:
# mount -o option1,option2 /dev/partition /mnt/partition
To set them permanently, you can modify /etc/fstab to make the relevant line look like this:
/dev/partition /mnt/partition partitiontype option1,option2 0 0
A mount option that improves performance on almost all filesystems is noatime, which stops the kernel from recording an access time on every read. In rare cases, for example if you use mutt, it can cause minor problems. You can instead use the relatime option.
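To show how an option such as noatime ends up in the fourth fstab field, here is a minimal sketch. The function name fstab_add_option is hypothetical, and the helper only filters text from standard input to standard output; it does not touch /etc/fstab itself, so the result can be reviewed before being copied into place.

```shell
#!/bin/sh
# Hypothetical helper: append a mount option to the entry for one mount point.
# Reads fstab-formatted text on stdin and writes the result to stdout.
# Note: a modified entry is re-joined with single spaces between fields.
# Usage: fstab_add_option MOUNTPOINT OPTION < /etc/fstab > /tmp/fstab.new
fstab_add_option() {
    awk -v mp="$1" -v opt="$2" '
        # Pass blank lines and comments through unchanged.
        NF == 0 || $1 ~ /^#/ { print; next }
        # Add the option only if the entry does not already carry it.
        $2 == mp && index("," $4 ",", "," opt ",") == 0 { $4 = $4 "," opt }
        { print }
    '
}
```

Comparing the generated file with the original (for example with diff) before replacing /etc/fstab is a sensible precaution, since a broken fstab can prevent the system from booting.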
See Ext3 Filesystem Tips.
See JFS Filesystem.
For optimal speed, create an XFS file-system with:
# mkfs.xfs -l internal,size=128m -d agcount=2 /dev/thetargetpartition
An XFS specific mount option that may increase performance is Template:Codeline. As its speed when dealing with small files is poor, you should definitely consider using pacman-cage. For defragmentation, see Defragmentation XFS.
The Template:Codeline mount option improves speed, but may corrupt data during power loss. The Template:Codeline mount option increases the space used by the filesystem by about 5%, but also improves overall speed. You can also reduce disk load by putting the journal and data on separate drives. This is done when creating the filesystem:
# mkreiserfs -j /dev/hda1 /dev/hdb1
Replace /dev/hda1 with the partition reserved for the journal, and /dev/hdb1 with the partition for data. You can learn more about reiserfs with this article.
Btrfs is a new filesystem offering online defragmentation, optimized mode for SSDs, writable snapshots, changing size of partition without data loss and many other features. Btrfs is still in active development, and is available in the kernel (marked experimental). See more info on the Btrfs homepage.
mkinitcpio.conf for btrfs
For non-root btrfs filesystems the btrfs module and dependencies are loaded when required. For a root btrfs filesystem you should ensure the initial ramdisk has the correct modules. There is a dependency of the btrfs module on the libcrc32c module. You can add crc32c to the modules line of /etc/mkinitcpio.conf like so:
MODULES="crc32c libcrc32c zlib_deflate btrfs"
This avoids pitfalls like "unknown symbol" errors when loading the btrfs modules. See also mkinitcpio-btrfs.
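Before regenerating the initial ramdisk, it can be worth checking that a module really is listed. The following is a minimal sketch; the function name mkinitcpio_has_module is hypothetical, and it assumes the MODULES line is written on a single line, either quoted as shown above or in the parenthesised array form used by newer mkinitcpio versions.

```shell
#!/bin/sh
# Hypothetical helper: succeed if MODULE appears in the MODULES line of
# mkinitcpio.conf-style text read from stdin.
# Usage: mkinitcpio_has_module btrfs < /etc/mkinitcpio.conf && echo listed
mkinitcpio_has_module() {
    awk -v m="$1" '
        /^MODULES=/ {
            line = $0
            sub(/^MODULES=/, "", line)
            # Turn quotes and parentheses into spaces so only names remain.
            gsub(/["()]/, " ", line)
            n = split(line, mods, " ")
            for (i = 1; i <= n; i++)
                if (mods[i] == m)
                    found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```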
A way to speed up reading from the hard drive is to compress the data, because there is less data to be read. It must however be decompressed, which means a greater CPU load. Some filesystems support transparent compression, most notably btrfs and reiserfs4, but their compression ratio is limited by the 4k block size. A good alternative is to compress /usr in a squashfs file, with a 64k (or 128k) block size, as instructed in this Gentoo forums thread.
What that tutorial does is compress the /usr folder into a squashfs filesystem, then mount it with aufs. A lot of space is saved, usually two thirds of the original size of /usr, and applications load faster. However, each time an application is installed or reinstalled, it is written uncompressed, so /usr must be re-compressed periodically.
Squashfs is already in the kernel, and aufs2 is in the extra repository, so no kernel compilation is needed if using the stock kernel. Since the linked guide is for Gentoo, the next commands outline the steps for Arch. First, install two packages:
# pacman -S aufs2 squashfs-tools
This command installs the aufs-modules and some userspace-tools for the squash-filesystem. Now we need some extra directories where we can store the archive of /usr as read-only and another folder where we can store the data changed after the last compression as writeable:
# mkdir /squashed
# mkdir /squashed/usr
# cd /squashed/usr
# mkdir ro
# mkdir rw
Now that we have a rough setup, you should perform a complete system upgrade, since every change to the content of /usr after the compression will be excluded from this speedup. If you use prelink, you should also perform a complete prelink before creating the archive. Now it is time to invoke the command to compress /usr:
# mksquashfs /usr /squashed/usr/usr.sfs -b 65536
These parameters/options are the ones suggested by the Gentoo link but there might be some room for improvement using some of the options described here. Now to get the archive mounted together with the writeable folder it is necessary to edit fstab:
# nano /etc/fstab
Add the following lines:
/squashed/usr/usr.sfs /squashed/usr/ro squashfs loop,ro 0 0
usr /usr aufs udba=reval,br:/squashed/usr/rw:/squashed/usr/ro 0 0
Now you should be done and able to reboot. The original author suggests deleting all the old content of /usr, but this might cause problems if anything goes wrong during a later re-compression. It is safer to leave the old files in place.
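To check whether the "two thirds" saving holds on a given system, the sizes can be compared with a little arithmetic. This is a minimal sketch; the function name saved_percent is hypothetical, and the example in the comments assumes /usr is measured before the aufs union is mounted over it.

```shell
#!/bin/sh
# Hypothetical helper: percentage of space saved by compression.
# Both arguments are sizes in the same unit, for example KiB from "du -sk".
# The result is rounded down to a whole percent.
saved_percent() {
    if [ "$1" -le 0 ]; then
        echo "original size must be positive" >&2
        return 1
    fi
    echo $(( ($1 - $2) * 100 / $1 ))
}

# Example with the paths used above:
#   original=$(du -sk /usr | cut -f1)
#   squashed=$(du -k /squashed/usr/usr.sfs | cut -f1)
#   saved_percent "$original" "$squashed"
```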
A bash script has been created that will automate the process of re-compressing (read updating) the archive since the tutorial is meant for Gentoo and some options don't correlate to what they should be in Arch.
Tuning for an SSD
This tutorial has some very simple tricks to fully harness the power of an SSD and to reduce disk read/write cycles to prolong its life. Keep in mind, however, that the tip suggesting to mount /tmp in RAM is only really useful as long as you do not compile large packages such as games or watch long Flash videos in your browser. Yaourt uses /tmp for storing everything that it compiles and might abort due to insufficient disk space, which in this case means not enough RAM. To let Yaourt use another directory for its temporary files, edit its configuration file and uncomment the line:
# TmpDirectory /what/ever/you/want/
and change it to point at a directory of your choice. Just make sure that this directory is not on your SSD; that is what we are trying to prevent in the first place!
See also compcache.
See this guide for information on the introduction of online SSD TRIM support in the 2.6.33 kernel.
If you have a kernel with online TRIM support and use ext4, you can activate it by adding the discard mount option to your fstab:
/dev/sda1 / ext4 defaults,noatime,discard 0 1
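Before adding discard, it may help to confirm that the drive accepts TRIM requests at all. The following is a minimal sketch: the function name supports_trim is hypothetical, and it assumes a kernel that exposes the discard_max_bytes attribute under /sys/block, where a value of 0 means discard is not supported.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the value on stdin is a non-zero byte count.
# Assumption: the kernel exposes /sys/block/DEVICE/queue/discard_max_bytes,
# where 0 means the device does not accept discard (TRIM) requests.
# Live use:  supports_trim < /sys/block/sda/queue/discard_max_bytes
supports_trim() {
    read -r max_bytes || return 1
    [ "$max_bytes" -gt 0 ]
}
```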
CPU
The only way to directly improve CPU speed is overclocking. As it is a complicated and risky task, it is not recommended for anyone except experts. The best way to overclock is through the BIOS. When purchasing your system, keep in mind that most Intel motherboards are notorious for disabling the capacity to overclock.
Another way to affect performance is to use Con Kolivas' desktop-centric kernel patchset, which, among other things, replaces the Completely Fair Scheduler (CFS) with the Brain Fuck Scheduler (BFS).
To install a kernel that contains the ck patchset from the AUR using yaourt:
$ yaourt -S kernel26-ck
Or a kernel that contains just BFS patch:
$ yaourt -S kernel26-bfs
Some other kernels in the AUR have the BFS patch included or available as an option.
Verynice is a daemon, available on AUR, for dynamically adjusting the nice levels of executables. The nice level represents the priority of the executable when allocating CPU resources. Simply define executables for which responsiveness is important, like X or multimedia applications, as goodexe in Template:Filename. Similarly, CPU-hungry executables running in the background, like make, can be defined as badexe. This prioritisation greatly improves system responsiveness under heavy load.
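What Verynice automates can also be done by hand with the standard nice utility. The wrapper below is a minimal sketch with a hypothetical name, run_low_priority; it simply starts a command at the lowest priority so that a background job such as make yields the CPU to interactive programs.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command at the lowest CPU priority.
# "nice -n 19" adds 19 to the inherited nice level, capped at 19.
run_low_priority() {
    nice -n 19 "$@"
}

# Example: keep a long build from starving interactive programs.
#   run_low_priority make
```

For a process that is already running, renice -n 10 -p PID raises its nice level in the same way; lowering a nice level below its current value requires root.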
Network
See the relevant section in General Recommendations.
Graphics
Driconf is a small utility that allows you to change the direct rendering settings for open source drivers. Enabling HyperZ can drastically improve performance.
Overclocking a graphics card is typically more expedient than with a CPU, since there are readily accessible software packages which allow for on-the-fly GPU clock adjustments. For ATI users, get rovclock, and Nvidia users should get nvclock in the extra repository.
The changes can be made permanent by running the appropriate command after X boots, for example by adding it to Template:Filename. A safer approach would be to only apply the overclocked settings when needed.
RAM and swap
The swappiness represents how much the kernel prefers swap to RAM. Setting it to a very low value, meaning the kernel will almost always use RAM, is known to improve responsiveness on many systems. To do that, simply add these lines to Template:Filename:
To test this, and to read more on why it may work, take a look at this article.
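The swappiness setting is exposed by the kernel as the vm.swappiness sysctl, and its current value can be read from /proc/sys/vm/swappiness. The following is a minimal sketch for producing a configuration line; the function name swappiness_line is hypothetical, the value 10 in the example is only an illustration and not a recommendation from this page, and the sketch assumes the traditional 0 to 100 range.

```shell
#!/bin/sh
# Hypothetical helper: print a sysctl configuration line for swappiness.
# Assumption: valid values are whole numbers from 0 to 100.
swappiness_line() {
    case "$1" in
        ''|*[!0-9]*)
            echo "swappiness must be a whole number" >&2
            return 1
            ;;
    esac
    if [ "$1" -gt 100 ]; then
        echo "values above 100 are rejected in this sketch" >&2
        return 1
    fi
    printf 'vm.swappiness=%s\n' "$1"
}

# Examples:
#   cat /proc/sys/vm/swappiness     # show the current value
#   swappiness_line 10              # prints vm.swappiness=10
#   sysctl -w vm.swappiness=10      # as root: apply until the next reboot
```

Trying a value with sysctl -w first makes it easy to revert, since the change is lost on reboot unless it is also written to the sysctl configuration file.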
Compcache, also known as the ramzswap kernel module, creates a swap device in RAM and compresses it. That means that part of the RAM can hold much more information, but uses more CPU. Still, it is much quicker than a hard drive swap. If a system often falls back to swap, this could improve responsiveness. Compcache is available in [community].
It is also possible (and recommended) to tell compcache to fall back on the hard drive swap when full. To do this, define a backing swap device in the configuration file. This swap device must not be in use when compcache is started, so remove it from your /etc/fstab!
This is also a good way to reduce disk read/write cycles due to swap on SSDs.
Mounting /tmp to RAM
This will make your system a tiny bit faster, but will take up some of your RAM. It also reduces disk read/write cycles, and is therefore a good choice if using an SSD or if you have RAM to spare. Simply add this line to /etc/fstab and reboot:
tmpfs /tmp tmpfs defaults,noatime,mode=1777 0 0
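After rebooting, it is easy to confirm that /tmp really lives in RAM by looking at the kernel's mount table. The following is a minimal sketch; the function name mount_fstype is hypothetical, and it expects text in the format of /proc/mounts on standard input.

```shell
#!/bin/sh
# Hypothetical helper: print the filesystem type mounted at MOUNTPOINT.
# Reads /proc/mounts-style text on stdin. If several entries share the
# mount point, the last one (the most recent mount) wins.
# Live use:  mount_fstype /tmp < /proc/mounts
mount_fstype() {
    awk -v mp="$1" '
        $2 == mp { fs = $3 }
        END { if (fs != "") print fs }
    '
}
```

If the command prints tmpfs for /tmp, the fstab entry is in effect; df -h /tmp then shows how much of the RAM-backed space is in use.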
Using the graphic card's RAM
In the unlikely case that you have very little RAM and a surplus of video RAM, you can use the latter as swap. See Swap on video ram.
Preloading is the action of putting and keeping target files in RAM. The practical use is that preloaded applications always start very quickly, because reading from RAM is always quicker than from the hard drive. However, part of your RAM will be dedicated to this task, but no more than if you kept the application open. Therefore, preloading is best used with heavy, often-used applications, like Firefox and OpenOffice.
# gopreload-prepare program
Then, as instructed, press enter when the program is fully loaded. This will add a list of files needed by the program in Template:Filename. To load all lists at boot, simply add gopreload to your DAEMONS array in /etc/rc.conf. To disable the loading of a program, remove the appropriate list in Template:Filename, or move it to Template:Filename.
A more automated, albeit less KISS, approach is used by Preload. All you have to do is add it to your DAEMONS array in /etc/rc.conf. It will monitor the most used files on your system, and with time build its own list of files to preload at boot.
Boot time
Suspend to RAM
The best way to reduce boot time is not booting at all. Consider suspending your system to RAM instead.
Kernel boot options
Some boot options can decrease kernel boot time. The Template:Codeline option usually can take off one second or so. Also, if you see a message saying "Waiting 8s for device XXX" at boot, adding Template:Codeline can reduce the waiting time, but be careful, as it may break the booting process. Those options are set in Template:Filename or Template:Filename, depending on which bootloader you use.
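After editing the bootloader configuration and rebooting, it is worth confirming that an option actually reached the kernel. The following is a minimal sketch; the function name cmdline_has is hypothetical, and it expects the contents of /proc/cmdline on standard input.

```shell
#!/bin/sh
# Hypothetical helper: succeed if OPTION is present on the kernel command
# line read from stdin, either as a bare word or in OPTION=value form.
# Live use:  cmdline_has quiet < /proc/cmdline && echo "option is active"
cmdline_has() {
    # Put each word on its own line, then look for an exact match.
    tr ' ' '\n' | awk -v o="$1" '
        $0 == o || index($0, o "=") == 1 { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```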
Compiling a custom kernel will reduce boot time and memory usage, but can be long, complicated and even painful. It usually is not worth the effort, but can be very interesting and a great learning experience. If you really know what you are doing, start here.
User josh_ from the forum has made impressive changes to the mkinitcpio script, making it two or three times faster. While waiting for these changes to be implemented, you can get them here.
Application-specific tips
See Speed up OpenOffice.
See Speed up SSH.