Difference between revisions of "Distcc"

From ArchWiki
m (For use with makepkg: reorder)
 
(54 intermediate revisions by 21 users not shown)
Line 1: Line 1:
 
[[Category:Package development]]
 
[[Category:Package development]]
[[Category:Daemons and system services]]
+
[[Category:Distributed computing]]
[[Category:Networking]]
+
 
[[it:Distcc]]
 
[[it:Distcc]]
 +
[[ja:Distcc]]
 
[[zh-CN:Distcc]]
 
[[zh-CN:Distcc]]
{{Article summary start}}
+
{{Related articles start}}
{{Article summary text|
+
{{Related|TORQUE}}
 +
{{Related|Slurm}}
 +
{{Related articles end}}
  
'''distcc''' is a program that distributes source code among a number of distcc-servers allowing many machines to compile one program and thus speed up the compilation process. The cool part is you can use it together with pacman/srcpac.}}
+
Distcc is a program to distribute builds of C, C++, Objective C or Objective C++ code across several machines on a network. It should always generate the same results as a local build, is simple to install and use, and is usually much faster than a local compile. The cool part is one can use it together with native Arch build tools such as makepkg.
{{Article summary heading|Related}}
+
{{Article summary wiki|TORQUE}}
+
{{Article summary end}}
+
  
Distcc is a program to distribute builds of C, C++, Objective C or Objective C++ code across several machines on a network. distcc should always generate the same results as a local build, is simple to install and use, and is usually much faster than a local compile.
+
== Terms ==
  
==Terms==
+
; master: The master is the computer which initiates the compilation.
{{Note| The terminology used by the software can be a bit counterintuitive in that "the daemon" is the master and "the server(s)" are the slave PC(s) in a distcc cluster.}}
+
; slaves: The slave(s) accept compilation requests send by the master.
  
; distcc daemon: The PC or server that's running distcc to distribute the source code. The daemon itself will compile parts of the source code but will also send other parts to the hosts defined in ''DISTCC_HOSTS''.
+
{{Note|Both master and slave(s) machines need to be running distcc.}}
  
; distcc server: The PC or server compiling the source code it gets from the daemon. When it's done compiling, it sends back the object code (i.e. compiled source code) to the daemon, which in turn sends back some more source code (if there's any left to compile).
+
== Getting started ==
  
==Getting started==
+
[[Install]] the {{pkg|distcc}} package from [[official repositories]] on all PCs in the cluster:
  
Install the ''distcc'' package from [community] on all PCs in the cluster:
+
For other distros, or even OSes including Windows through using Cygwin, refer to the [http://distcc.samba.org/doc.html distcc docs].
  
# pacman -S distcc
+
== Configuration ==
  
For other distros or even OSes including Windows through using Cygwin, refer to the [http://distcc.samba.org/doc.html distcc docs].
+
=== Slaves ===
  
==Configuration==
+
The configuration for the slaves is stored in {{ic|/etc/conf.d/distccd}}. The available command line options are listed in the [http://distcc.googlecode.com/svn%20...%20%3Cdiv%20class=/trunk/doc/web/man/distccd_1.html distcc manual]. At a minimum, configure the allowed address ranges in [[wikipedia:Classless_Inter-Domain_Routing|CIDR]] format:
===Both Daemon and Server(s)===
+
Edit {{ic|/etc/conf.d/distccd}} and modify the only uncommented line with the correct IP address or range of your daemon or of your entire subnet:
+
DISTCC_ARGS="--user nobody --allow 192.168.0.0/24"
+
  
===Daemon Only===
+
DISTCC_ARGS="--allow 192.168.0.0/24"
 +
 
 +
A nice tool for converting address ranges to CIDR format can be found here: [http://www.ipaddressguide.com/cidr CIDR Utility Tool].
 +
 
 +
[[Start]] {{ic|distccd.service}} on every participating slave. To have {{ic|distccd.service}} start at boot-up, [[enable]] it on every participating machine.
 +
 
 +
=== Master ===
 +
==== For use with makepkg ====
  
 
Edit {{ic|/etc/makepkg.conf}} in the following three sections:
 
Edit {{ic|/etc/makepkg.conf}} in the following three sections:
#BUILDENV has distcc unbanged i.e. without exclamation point.
 
#Uncomment the ''DISTCC_HOSTS'' line and add the IP addresses of the servers (slaves) then a slash and the number of threads they are to use.  The subsequent IP address/threads should be separated by a white space.  This list is ordered from most powerful to least powerful (processing power).
 
#Adjust the MAKEFLAGS variable to correspond to the number of sum of the number of individual values specified for the max threads per server.  In the example below, this is 5+3+3=11.  If users specify more than this sum, the extra theoretical thread(s) will be blocked by distcc and appear as such in monitoring utils such as ''distccmon-text'' described below.
 
  
{{Note|It is common practice to define the number of threads as the number of physical core+hyperhtreaded cores (if they exist) plus 1.  Do this on a per-server basis, NOT in the MAKEFLAGS!}}
+
# BUILDENV has distcc unbanged i.e. without exclamation point.
 +
# Uncomment the ''DISTCC_HOSTS'' line and add the IP addresses of the slaves then a slash and the number of threads they are to use.  The subsequent IP address/threads should be separated by a white space.  This list is ordered from most powerful to least powerful (processing power).
 +
# Set the MAKEFLAGS variable to the sum of the per-server thread limits.  In the example below, this is 5+3+3=11.  If users specify more than this sum, the extra theoretical thread(s) will be blocked by distcc and appear as such in monitoring utilities such as ''distccmon-text'' described below.
 +
 
 +
{{Note|It is common, although optional, practice to define the number of threads as the number of physical cores plus hyperthreaded cores (if they exist) plus 1.  Do this on a per-server basis, NOT in the MAKEFLAGS!}}
  
 
Example using relevant lines:
 
Example using relevant lines:
  BUILDENV=(distcc fakeroot color !ccache !check)
+
 
 +
  BUILDENV=(distcc fakeroot color !ccache check !sign)
 
  MAKEFLAGS="-j11"
 
  MAKEFLAGS="-j11"
 
  DISTCC_HOSTS="192.168.0.2/5 192.168.0.3/3 192.168.0.4/3"
 
  DISTCC_HOSTS="192.168.0.2/5 192.168.0.3/3 192.168.0.4/3"
Line 50: Line 55:
 
If users wish to use distcc through SSH, add an "@" symbol in front of the IP address in this section.  If key-based auth is not setup on the systems, set the DISTCC_SSH variable to ignore checking for authenticated hosts, i.e. DISTCC_SSH="ssh -i"
 
If users wish to use distcc through SSH, add an "@" symbol in front of the IP address in this section.  If key-based auth is not setup on the systems, set the DISTCC_SSH variable to ignore checking for authenticated hosts, i.e. DISTCC_SSH="ssh -i"
  
==Compile==
+
{{Warning|1=Make sure that neither the '''CFLAGS''' and '''CXXFLAGS''' have -march=native set or else distccd will not distribute work to other machines!  Using the Arch defaults for these variables is recommended.}}
Start the distcc daemon on every participating machine:
+
# rc.d start distccd
+
  
If having ''distccd'' run at boot up, add it to the DAEMONS array in {{ic|/etc/rc.conf}}.
+
==== For use without makepkg ====
 +
 
 +
The minimal configuration for distcc on the master includes the setting of the available slaves. This can either be done by setting the addresses in the environment variable {{ic|DISTCC_HOSTS}} or in either of the configuration files {{ic|$DISTCC_HOSTS}}, {{ic|$DISTCC_DIR/hosts}}, {{ic|~/.distcc/hosts}} or {{ic|/etc/distcc/hosts}}.
 +
 
 +
Example for setting the slave address using {{ic|DISTCC_HOSTS}}:
 +
 
 +
$ export DISTCC_HOSTS="192.168.0.3,lzo,cpp 192.168.0.4,lzo,cpp"
 +
 
 +
{{Note|This is a white space separated list.}}
 +
 
 +
Example for setting the slave addresses in the hosts configuration file:
 +
 
 +
{{hc|~/.distcc/hosts|
 +
192.168.0.3,lzo,cpp 192.168.0.4,lzo,cpp
 +
}}
 +
 
 +
Instead of explicitly listing the server addresses one can also use the avahi zeroconf mode. To use this mode {{ic|+zeroconf}} must be in place instead of the server addresses and the distcc daemons on the slaves have to be started using the {{ic|--zeroconf}} option. Note that this option does not support the pump mode!
 +
 
 +
The examples add the following options to the address:
 +
 
 +
* {{ic|lzo}}: Enables LZO compression for this TCP or SSH host (slave).
 +
* {{ic|cpp}}: Enables distcc-pump mode for this host (slave). Note: the build command must be wrapped in the pump script in order to start the include server.
 +
 
 +
A description for the pump mode can be found here: [http://distcc.googlecode.com/svn%20...%20%3Cdiv%20class=/trunk/doc/web/man/distcc_1.html#TOC_8 HOW DISTCC-PUMP MODE WORKS] and [http://google-opensource.blogspot.de/2008/08/distccs-pump-mode-new-design-for.html distcc's pump mode: A New Design for Distributed C/C++ Compilation ]
 +
 
 +
To use distcc-pump mode for a slave, users must start the compilation using the pump script otherwise the compilation will fail.
 +
 
 +
== Compile ==
 +
=== With makepkg ===
  
 
Compile via makepkg as normal.
 
Compile via makepkg as normal.
  
==Monitoring Progress==
+
=== Without makepkg ===
 +
 
 +
To compile a source file using the distcc pump mode, use the following command:
 +
 
 +
$ pump distcc g++ -c hello_world.cpp
 +
 
 +
In this case the pump script will execute distcc which in turn calls g++ with "-c hello_world.cpp" as parameter.
 +
 
 +
To compile a Makefile project, first find out which variables are set by the compiler. For example in gzip-1.6,  one can find the following line in the Makefile: {{ic|1=CC = gcc -std=gnu99}}. Normally the variables are called {{ic|CC}} for C projects and {{ic|CXX}} for C++ projects. To compile the project using distcc it would look like this:
 +
 
 +
$ wget ftp://ftp.gnu.org/pub/gnu/gzip/gzip-1.6.tar.xz
 +
$ tar xf gzip-1.6.tar.xz
 +
$ cd gzip-1.6
 +
$ ./configure
 +
$ pump make -j2 CC="distcc gcc -std=gnu99"
 +
 
 +
This example would compile gzip using distcc's pump mode with two compile threads. For the correct {{ic|-j}} setting have a look at [https://cdn.rawgit.com/distcc/distcc/master/doc/web/faq.html What -j level to use?]
 +
 
 +
== Monitoring progress ==
 +
 
 
Progress can be monitored via several methods.
 
Progress can be monitored via several methods.
 +
 
#distccmon-text
 
#distccmon-text
 
#tailing log file
 
#tailing log file
  
 
Invoke distccmon-text to check on compilation status:
 
Invoke distccmon-text to check on compilation status:
 +
 
{{bc|$ distccmon-text
 
{{bc|$ distccmon-text
 
29291 Preprocess  probe_64.c                                192.168.0.2[0]
 
29291 Preprocess  probe_64.c                                192.168.0.2[0]
Line 79: Line 131:
  
 
One can have this program run continuously by using watch or by appending a space followed by integer to the command which corresponds to the number of sec to wait for a repeat query:
 
One can have this program run continuously by using watch or by appending a space followed by integer to the command which corresponds to the number of sec to wait for a repeat query:
 +
 
  $ watch distccmon-text
 
  $ watch distccmon-text
 +
 
or
 
or
 +
 
  $ distccmon-text 2
 
  $ distccmon-text 2
  
 
One can also simply tail {{ic|/var/log/messages.log}} on daemon:
 
One can also simply tail {{ic|/var/log/messages.log}} on daemon:
 +
 
  # tail -f /var/log/messages.log
 
  # tail -f /var/log/messages.log
  
=="Cross Compiling" with Distcc==
+
== "Cross Compiling" with distcc ==
There are currently two method from which to select to have the ability of distcc distribution of tasks over a cluster building i686 packages from a native x86_64 environment.  Neither is ideal, but to date, there are the only two methods documented on the wiki.
+
 
 +
=== X86 ===
 +
 
 +
There are currently two methods from which to select to have the ability of distcc distribution of tasks over a cluster building i686 packages from a native x86_64 environment.  Neither is ideal, but to date, there are the only two methods documented on the wiki.
  
 
An ideal setup is one that uses the unmodified ARCH packages for distccd running only once one each node regardless of building from the native environment or from within a chroot AND one that works with makepkg.  Again, this Utopian setup is not currently known.
 
An ideal setup is one that uses the unmodified ARCH packages for distccd running only once one each node regardless of building from the native environment or from within a chroot AND one that works with makepkg.  Again, this Utopian setup is not currently known.
Line 93: Line 152:
 
A [https://bbs.archlinux.org/viewtopic.php?id=129762 discussion thread] has been started on the topic; feel free to contribute.
 
A [https://bbs.archlinux.org/viewtopic.php?id=129762 discussion thread] has been started on the topic; feel free to contribute.
  
=== Chroot Method (Preferred) ===
+
==== Chroot method (preferred) ====
 +
 
 
{{Note|This method works, but is not very elegant requiring duplication of distccd on all nodes AND need to have a 32-bit chroots on all nodes.}}
 
{{Note|This method works, but is not very elegant requiring duplication of distccd on all nodes AND need to have a 32-bit chroots on all nodes.}}
  
Assuming the user has a [https://wiki.archlinux.org/index.php/Install_bundled_32-bit_system_in_Arch64 32-bit chroot] setup and configured on '''each node''' of the distcc cluster, the strategy is to have two separate instances of distccd running on different ports on each node -- one runs in the native x86_64 environment and the other in the x86 chroot on a modified port.  Start makepkg via a [https://wiki.archlinux.org/index.php/Install_bundled_32-bit_system_in_Arch64#Executing_32-bit_applications_from_a_64-bit_environment schroot command] invoking makepkg.
+
Assuming the user has a [[Install_bundled_32-bit_system_in_Arch64|32-bit chroot]] setup and configured on '''each node''' of the distcc cluster, the strategy is to have two separate instances of distccd running on different ports on each node -- one runs in the native x86_64 environment and the other in the x86 chroot on a modified port.  Start makepkg via a [[Install_bundled_32-bit_system_in_64-bit_system#Schroot|schroot command]] invoking makepkg.
  
==== Setup ====
+
===== Add port numbers to DISTCC_HOSTS on the i686 chroot =====
Setup the chroot according to the aforementioned link.  Be sure to install the discc package!
+
  
Once distcc is installed in the chroot, simply make three minor modifications '''inside''' the chroot:
+
Append the port number defined earlier (3692) to each of the hosts in {{ic|/opt/arch32/etc/makepkg.conf}} as follows:
# Configure distccd to run on another port which allows two version (one outside the chroot and once inside) to co-exist.
+
# Create a link in /usr/bin to "distccd2".
+
# Modify two lines in {{ic|/etc/rc.d/distccd}} pointing the script to the symlink
+
# Redefine the DISTCC_HOSTS including a the modified port number in {{ic|/opt/arch32/etc/rc.d/distccd}}
+
  
==== Configuration ====
+
  DISTCC_HOSTS="192.168.1.101/5:3692 192.168.1.102/5:3692 192.168.1.103/3:3692"
Example {{ic|/etc/conf.d/distcc.conf}} in the chroot:
+
  DISTCC_ARGS="--user nobody --allow 192.168.0.0/24 --port 3692 --log-level info --log-file /tmp/distccd-i686.log"
+
  
==== Symlink ====
+
{{Note|This only needs to be setup on the "master" i686 chroot.  Where "master" is defined as the one from which the compilation will take place.}}
  
The symlink serves as a trivial method to avoid having two of the same programs (and start paths) running in the process list.  Omitting this step will render the second instance of distcc unable to start by design of the init script: specifically, "pidof -o %PPID /usr/bin/distccd"!
+
===== Invoke makepkg from the Native Environment =====
  
Execute the following inside the chroot:
+
Setup [[Install_bundled_32-bit_system_in_Arch64#Executing_32-bit_applications_from_a_64-bit_environment|schroot]]{{Broken section link}} on the native x86_64 environment.  Invoke makepkg to build an i686 package from the native x86_64 environment, simply by:
# ln -s /usr/bin/distccd /usr/bin/distccd2
+
  
==== Modification to Daemon Script ====
+
$ schroot -p -- makepkg -src
{{bc|1=#!/bin/bash
+
  
[ -f /etc/conf.d/distccd ] && . /etc/conf.d/distccd
+
==== Multilib GCC method (not recommended) ====
  
. /etc/rc.conf
+
See [[Makepkg#Build 32-bit packages on a 64-bit system]].
. /etc/rc.d/functions
+
  
PID=`pidof -o %PPID /usr/bin/distccd2`    <-----
+
=== Other architectures ===
case "$1" in
+
==== Arch ARM ====
  start)
+
When building on an Arch ARM device, the developers ''highly'' recommend using the official project toolchains.
    stat_busy "Starting distcc Daemon"
+
*[https://archlinuxarm.org/builder/xtools/x-tools8.tar.xz ARMv8]
    [ -z "$PID" ] && /usr/bin/distccd2 --daemon ${DISTCC_ARGS}    <-----
+
*[https://archlinuxarm.org/builder/xtools/x-tools7h.tar.xz ARMv7l hard]
    if [ $? -gt 0 ]; then
+
*[https://archlinuxarm.org/builder/xtools/x-tools6h.tar.xz ARMv6l hard]
      stat_fail
+
*[https://archlinuxarm.org/builder/xtools/x-tools.tar.xz ARMv5te soft]
    else
+
      add_daemon distccd
+
      stat_done
+
    fi
+
    ;;}}
+
  
Streamline this by simply adding a line to {{ic|/etc/rc.d/arch32}} to invoke the modified distccd like so:
+
Extract the toolchain corresponding to the requisite architecture somewhere on the '''slave filesystem''' and edit {{ic|/etc/conf.d/distccd}} adjusting the PATH to allow the toolchain to be used.
{{bc|<nowiki>case $1 in
+
start)
+
stat_busy "Starting Arch32 chroot"
+
for d in "${dirs[@]}"; do
+
  mount -o bind $d /opt/arch32/$d
+
done
+
mount -t proc none /opt/arch32/proc
+
mount -t sysfs none /opt/arch32/sys
+
      ----->    linux32 chroot /opt/arch32 sh -c "/etc/rc.d/distccd start" || return 1    <-----
+
add_daemon arch32
+
stat_done
+
;;
+
        stop)
+
                stat_busy "Stopping Arch32 chroot"
+
      ----->    linux32 chroot /opt/arch32 sh -c "/etc/rc.d/distccd stop" || return 1    <-----
+
                ...</nowiki>}}
+
==== Add port numbers to DISTCC_HOSTS on the i686 chroot ====
+
Append the port number defined eariler (3692) to each of the hosts in {{ic|/opt/arch32/etc/makepkg.conf}} as follows:
+
  
  DISTCC_HOSTS="192.168.1.101/5:3692 192.168.1.102/5:3692 192.168.1.103/3:3692"
+
Example with the toolchain extracted to {{ic|/mnt/data}}:
 +
  PATH=/mnt/data/x-tools8/aarch64-unknown-linux-gnueabi/bin:$PATH
  
{{Note|This only needs to be setup on the "master" i686 chroot.  Where "master" is defined as the one from which the compilation will take place.}}
+
To read in the configuration file, [[restart]] {{ic|distccd.service}}.
  
==== Invoke makepkg from the Native Environment ====
+
Optionally link it to your user's homedir if planning to build without makepkg.
 +
Example:
 +
$ ln -s /mnt/data/x-tools8 x-tools8
  
Setup [https://wiki.archlinux.org/index.php/Install_bundled_32-bit_system_in_Arch64#Executing_32-bit_applications_from_a_64-bit_environment schroot] on the native x86_64 environment.  Invoke makepkg to build an i686 package from the native x86_64 environment, simply by:
+
==== Additional toolchains ====
 +
* [https://embtoolkit.org/ EmbToolkit]: Tool for creating cross compilation tool chain; supports ARM and MIPS architectures; supports building of an LLVM based tool chain
 +
* [http://crosstool-ng.org/ crosstool-ng]: Similar to EmbToolkit; supports more architectures (see website for more information)
 +
* [https://www.linaro.org/downloads/ Linaro]: Provides tool chains for ARM development
  
$ schroot -p -- makepkg -src
+
The {{ic|EmbToolkit}} provides a nice graphical configuration menu ({{ic|make xconfig}}) for configuring the tool chain.
  
=== Multilib GCC Method (Not Recommended) ===
+
== Troubleshooting ==
{{Warning|Errors have been reported when using this method to build the i686 linux package from a native x86_64 system!  The chroot method is preferred and has been verified to work building the kernel packages.}}
+
  
Edit {{ic|/etc/pacman.conf}} and uncomment the multilib repo:
+
=== Journalctl ===
  
[multilib]
+
Use {{ic|journalctl}} to find out what was going wrong:
Include = /etc/pacman.d/mirrorlist
+
  
Install gcc-multilib and its dependencies
+
$ journalctl $(which distccd) -e --since "5 min ago"
  
# pacman -Syy & pacman -S gcc-multilib binutils-multilib
+
=== code 110 ===
  
Compile packages on x86_64 for i686 is as easy as adding the following lines to {{ic|$HOME/.makepkg.conf}}
+
Make sure that the tool chain works for the user account under which the distcc daemon process gets started (default is nobody). The following will test if the tool chain works for user nobody. In {{ic|/etc/passwd}} change the login for the nobody user to the following:
CARCH="i686"
+
CHOST="i686-pc-linux-gnu"
+
CFLAGS="-march=i686 -O2 -pipe -m32"
+
CXXFLAGS="${CFLAGS}"
+
  
and invoking makepkg via the following
+
{{hc|$ cat /etc/passwd|
$ linux32 makepkg -src
+
...
 +
nobody:x:99:99:nobody:/:/bin/bash
 +
...
 +
}}
  
Remember to remove or modify {{ic|$HOME/.makepkg.conf}} when finished compiling i686 packages!
+
Then cd into the directory containing the cross compiler binaries and try to execute the compiler:
  
==Tips/Tricks==
+
# su nobody
===Limit HDD/SDD usage===
+
$ ./gcc --version
====Relocate $HOME/.distcc====
+
  bash: ./gcc: Permission denied
By default, distcc creates {{ic|$HOME/.distcc}} which stores transient relevant info as it serves up work for nodes to compile. Create a directory named ''.distcc'' in RAM such as /tmp and soft link to it in $HOME.  This will avoid needless HDD read/writes and is particularly important for SSDs.
+
  
$ mv $HOME/.distcc /tmp
+
Users experiencing this error should make sure that the group permissions described in [[#Other architectures]] are set up correctly.
$ ln -s $HOME/.distcc /tmp/.distcc
+
  
One only needs to have {{ic|/etc/rc.local}} re-create this directory on a reboot (the soft link will remain until it is manually removed like any other file):
+
Make sure to change back {{ic|/etc/passwd}} to its original state after these modifications.
  
su -c "mkdir /tmp/.distcc" USERNAME
+
Alternatively, use sudo without changing the shell in /etc/passwd.
  
====Adjust log level====
+
  # sudo -u nobody gcc --version
 +
 
 +
=== Adjust log level ===
  
 
By default, distcc will log to {{ic|/var/log/messages.log}} as it goes along.  One trick (actually recommended in the distccd manpage) is to log to an alternative file directly.  Again, one can locate this in RAM via /tmp.  Another trick is to lower to log level of minimum severity of error that will be included in the log file.  Useful if only wanting to see error messages rather than an entry for each connection.  LEVEL can be any of the  standard syslog levels, and in particular critical, error, warning, notice, info, or debug.
 
By default, distcc will log to {{ic|/var/log/messages.log}} as it goes along.  One trick (actually recommended in the distccd manpage) is to log to an alternative file directly.  Again, one can locate this in RAM via /tmp.  Another trick is to lower to log level of minimum severity of error that will be included in the log file.  Useful if only wanting to see error messages rather than an entry for each connection.  LEVEL can be any of the  standard syslog levels, and in particular critical, error, warning, notice, info, or debug.
  
Both of these lines are to be '''appended''' to DISTCC_ARGS in {{ic|/etc/conf.d/distccd}}
+
Either call distcc with the arguments mentioned here on the master, or append them to DISTCC_ARGS in {{ic|/etc/conf.d/distccd}} on the slaves:
 +
 
 +
DISTCC_ARGS="--allow 192.168.0.0/24 --log-level error --log-file /tmp/distccd.log"
 +
 
 +
=== Failure work with CMake or other tools ===
 +
 
 +
CMake sometimes passes a [http://gcc.gnu.org/wiki/Response_Files "response file"] to gcc, which distcc cannot handle.
 +
There is a [http://code.google.com/p/distcc/issues/detail?id=85&q=response patch file], but it has not been applied to upstream code.
 +
Users encountering this problem can apply this patch or use the {{AUR|distcc-rsp}}{{Broken package link|{{aur-mirror|distcc-rsp}}}} package.
 +
 
 +
=== Limit HDD/SSD usage ===
 +
 
 +
==== Relocate $HOME/.distcc ====
 +
 
 +
By default, distcc creates {{ic|$HOME/.distcc}} which stores transient relevant info as it serves up work for nodes to compile.  Create a directory named ''.distcc'' in RAM such as /tmp and soft link to it in $HOME.  This will avoid needless HDD read/writes and is particularly important for SSDs.
 +
 
 +
$ mv $HOME/.distcc /tmp
 +
$ ln -s /tmp/.distcc $HOME/.distcc
 +
 
 +
Use systemd to re-create this directory on a reboot (the soft link will remain until it is manually removed like any other file):
  
DISTCC_ARGS="--user nobody --allow 192.168.0.0/24 --log-level error --log-file /tmp/distccd.log"
+
Create the following tmpfile.
  
==Failure work with CMake or other tools==
+
{{hc|/etc/tmpfiles.d/tmpfs-create.conf|<nowiki>
CMake sometimes pass [http://gcc.gnu.org/wiki/Response_Files  "response file"] to gcc, but the distcc can't deal with it.
+
d /tmp/.distcc 0755 <username> users -
There a [http://code.google.com/p/distcc/issues/detail?id=85&q=response  patch file], but has not patched to main stream code.
+
</nowiki>}}
If you encounter this problem, you can path this file or use the [http://aur.archlinux.org/packages.php?ID=58822 "distcc-rsp"] package on aur.
+

Latest revision as of 12:57, 24 September 2016

Related articles

distcc is a program to distribute builds of C, C++, Objective C or Objective C++ code across several machines on a network. It should always generate the same results as a local build, is simple to install and use, and is usually much faster than a local compile. It can also be used together with native Arch build tools such as makepkg.

Terms

master
The master is the computer which initiates the compilation.
slaves
The slave(s) accept compilation requests sent by the master.
Note: Both master and slave(s) machines need to be running distcc.

Getting started

Install the distcc package from the official repositories on all PCs in the cluster.

For other distributions, or other operating systems (including Windows via Cygwin), refer to the distcc docs.

Configuration

Slaves

The configuration for the slaves is stored in /etc/conf.d/distccd. The available command line options are listed in the distcc manual. At a minimum, configure the allowed address ranges in CIDR format:

DISTCC_ARGS="--allow 192.168.0.0/24"

A nice tool for converting address ranges to CIDR format can be found here: CIDR Utility Tool.
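
To check which addresses a CIDR block such as 192.168.0.0/24 covers without an online tool, the range can be computed with shell arithmetic. The following is only a minimal sketch: the function name cidr_range is made up for this example, it handles IPv4 only, and it assumes a shell with 64-bit arithmetic (bash or dash on a current system):

```shell
#!/bin/sh
# cidr_range is a hypothetical helper, not part of distcc.
# It prints the first and last IPv4 address covered by a CIDR block.
cidr_range() {
    addr=${1%/*}        # part before the slash, e.g. 192.168.0.0
    prefix=${1#*/}      # part after the slash, e.g. 24
    old_ifs=$IFS
    IFS=.
    set -- $addr        # split the address into its four octets
    IFS=$old_ifs
    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    first=$(( ip & mask ))                      # network address
    last=$(( first | (~mask & 0xFFFFFFFF) ))    # highest address in the block
    printf '%d.%d.%d.%d - %d.%d.%d.%d\n' \
        $(( first >> 24 & 255 )) $(( first >> 16 & 255 )) \
        $(( first >> 8 & 255 ))  $(( first & 255 )) \
        $(( last >> 24 & 255 ))  $(( last >> 16 & 255 )) \
        $(( last >> 8 & 255 ))   $(( last & 255 ))
}

cidr_range 192.168.0.0/24   # prints: 192.168.0.0 - 192.168.0.255
```

If the printed range contains the address of the master, the value is suitable for the --allow option shown above.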

Start distccd.service on every participating slave. To have distccd.service start at boot-up, enable it on every participating machine.

Master

For use with makepkg

Edit /etc/makepkg.conf in the following three sections:

  1. In BUILDENV, enable distcc by removing the exclamation point in front of it.
  2. Uncomment the DISTCC_HOSTS line and list the IP address of each slave, followed by a slash and the number of threads it is to use. Separate the entries with white space and order them from most powerful to least powerful (processing power).
  3. Set the MAKEFLAGS variable to the sum of the per-server thread limits. In the example below, this is 5+3+3=11. If users specify more than this sum, the extra theoretical thread(s) will be blocked by distcc and appear as such in monitoring utilities such as distccmon-text described below.
Note: It is common, although optional, practice to define the number of threads as the number of physical cores plus hyperthreaded cores (if they exist) plus 1. Do this on a per-server basis, NOT in the MAKEFLAGS!

Example using relevant lines:

BUILDENV=(distcc fakeroot color !ccache check !sign)
MAKEFLAGS="-j11"
DISTCC_HOSTS="192.168.0.2/5 192.168.0.3/3 192.168.0.4/3"
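
The sum used for MAKEFLAGS can also be derived from the host list instead of being added up by hand. The following is a rough sketch; sum_distcc_slots is a hypothetical helper name, and hosts without an explicit /N limit are simply ignored here:

```shell
#!/bin/sh
# sum_distcc_slots is a hypothetical helper, not part of distcc or makepkg.
# It adds up the "/N" thread limits in a DISTCC_HOSTS-style list.
sum_distcc_slots() {
    total=0
    for host in $1; do
        case $host in
            */*)
                limit=${host#*/}          # text after the first slash
                limit=${limit%%[!0-9]*}   # keep only the leading digits
                total=$(( total + ${limit:-0} ))
                ;;
        esac
    done
    echo "$total"
}

sum_distcc_slots "192.168.0.2/5 192.168.0.3/3 192.168.0.4/3"   # prints 11
```

The result matches the -j11 value used in the example above.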

If users wish to use distcc through SSH, add an "@" symbol in front of the IP address in this section. If key-based authentication is not set up on the systems, set the DISTCC_SSH variable to ignore checking for authenticated hosts, i.e. DISTCC_SSH="ssh -i".

Warning: Make sure that neither CFLAGS nor CXXFLAGS has -march=native set, or else distcc will not distribute work to other machines! Using the Arch defaults for these variables is recommended.
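
A quick way to verify this before starting a build is to scan the flag strings for the offending option. The following is a minimal sketch; has_march_native is a hypothetical helper name and the flag string is only an illustration:

```shell
#!/bin/sh
# has_march_native is a hypothetical helper, not part of distcc or makepkg.
# It succeeds (exit status 0) if the given flag string contains
# -march=native as a separate word.
has_march_native() {
    for flag in $1; do
        if [ "$flag" = "-march=native" ]; then
            return 0
        fi
    done
    return 1
}

# Illustrative flag string; real values come from makepkg.conf.
if has_march_native "-march=native -O2 -pipe"; then
    echo "remove -march=native before building with distcc"
fi
```

In bash, after sourcing /etc/makepkg.conf, calling has_march_native "$CFLAGS $CXXFLAGS" checks both variables at once.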

For use without makepkg

The minimal configuration for distcc on the master consists of the list of available slaves. This can be set either in the environment variable DISTCC_HOSTS or in one of the configuration files $DISTCC_DIR/hosts, ~/.distcc/hosts or /etc/distcc/hosts.

Example for setting the slave address using DISTCC_HOSTS:

$ export DISTCC_HOSTS="192.168.0.3,lzo,cpp 192.168.0.4,lzo,cpp"
Note: This is a white space separated list.

Example for setting the slave addresses in the hosts configuration file:

~/.distcc/hosts
192.168.0.3,lzo,cpp 192.168.0.4,lzo,cpp
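
The hosts file can be created from the shell as well. The following is a small sketch; write_distcc_hosts is a hypothetical helper name, and it assumes that the file should go to $DISTCC_DIR/hosts when DISTCC_DIR is set and to ~/.distcc/hosts otherwise:

```shell
#!/bin/sh
# write_distcc_hosts is a hypothetical helper, not part of distcc.
# It writes its arguments as one white space separated line to the hosts file.
write_distcc_hosts() {
    dir=${DISTCC_DIR:-$HOME/.distcc}   # assumed location, see the lead-in
    mkdir -p "$dir" || return 1
    printf '%s\n' "$*" > "$dir/hosts"
}
```

Calling write_distcc_hosts "192.168.0.3,lzo,cpp" "192.168.0.4,lzo,cpp" produces the file shown above. Note that an existing hosts file is overwritten.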

Instead of explicitly listing the server addresses, one can also use the Avahi zeroconf mode. To use this mode, put +zeroconf in place of the server addresses and start the distcc daemons on the slaves with the --zeroconf option. Note that this mode does not support pump mode!

The examples add the following options to the address:

  • lzo: Enables LZO compression for this TCP or SSH host (slave).
  • cpp: Enables distcc-pump mode for this host (slave). Note: the build command must be wrapped in the pump script in order to start the include server.

A description for the pump mode can be found here: HOW DISTCC-PUMP MODE WORKS and distcc's pump mode: A New Design for Distributed C/C++ Compilation

To use distcc-pump mode for a slave, users must start the compilation using the pump script; otherwise the compilation will fail.

Compile

With makepkg

Compile via makepkg as normal.

Without makepkg

To compile a source file using the distcc pump mode, use the following command:

$ pump distcc g++ -c hello_world.cpp

In this case the pump script will execute distcc which in turn calls g++ with "-c hello_world.cpp" as parameter.

To compile a Makefile project, first find out which variables select the compiler. For example, in gzip-1.6 the Makefile contains the following line: CC = gcc -std=gnu99. Normally the variables are called CC for C projects and CXX for C++ projects. To compile the project using distcc, it would look like this:

$ wget ftp://ftp.gnu.org/pub/gnu/gzip/gzip-1.6.tar.xz
$ tar xf gzip-1.6.tar.xz
$ cd gzip-1.6
$ ./configure
$ pump make -j2 CC="distcc gcc -std=gnu99"

This example would compile gzip using distcc's pump mode with two compile threads. For the correct -j setting have a look at What -j level to use?

Monitoring progress

Progress can be monitored via several methods.

  1. distccmon-text
  2. tailing log file

Invoke distccmon-text to check on compilation status:

$ distccmon-text
29291 Preprocess  probe_64.c                                 192.168.0.2[0]
30954 Compile     apic_noop.c                                192.168.0.2[0]
30932 Preprocess  kfifo.c                                    192.168.0.2[0]
30919 Compile     blk-core.c                                 192.168.0.2[1]
30969 Compile     i915_gem_debug.c                           192.168.0.2[3]
30444 Compile     block_dev.c                                192.168.0.3[1]
30904 Compile     compat.c                                   192.168.0.3[2]
30891 Compile     hugetlb.c                                  192.168.0.3[3]
30458 Compile     catalog.c                                  192.168.0.4[0]
30496 Compile     ulpqueue.c                                 192.168.0.4[2]
30506 Compile     alloc.c                                    192.168.0.4[0]
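
To see at a glance how the jobs are spread over the slaves, the output can be summarized per host. The following is a rough sketch; jobs_per_host is a hypothetical helper name, and it assumes that the last column always has the host[slot] form shown above:

```shell
#!/bin/sh
# jobs_per_host is a hypothetical helper, not part of distcc.
# It reads distccmon-text output on standard input and prints
# "host count" pairs, one per line, sorted by host.
jobs_per_host() {
    awk 'NF {
             host = $NF                # last column, e.g. 192.168.0.2[0]
             sub(/\[.*$/, "", host)    # drop the slot index in brackets
             count[host]++
         }
         END { for (h in count) print h, count[h] }' | sort
}

# Demo with three lines taken from the sample output above.
jobs_per_host <<'EOF'
30919 Compile     blk-core.c      192.168.0.2[1]
30444 Compile     block_dev.c     192.168.0.3[1]
30904 Compile     compat.c        192.168.0.3[2]
EOF
# prints:
# 192.168.0.2 1
# 192.168.0.3 2
```

On a running build, use distccmon-text | jobs_per_host. For the complete sample output above, this yields 5 jobs on 192.168.0.2 and 3 each on 192.168.0.3 and 192.168.0.4, which matches the limits set in DISTCC_HOSTS.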

One can have this program run continuously by using watch or by appending a space followed by an integer, which sets the number of seconds to wait between queries:

$ watch distccmon-text

or

$ distccmon-text 2
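To see at a glance how the work is spread over the cluster, the monitor output can be summarized per host. This is a minimal sketch, assuming the last column always has the form HOST[SLOT] as in the sample output above; the helper name jobs_per_host is hypothetical and not part of distcc:

```shell
#!/bin/bash
# jobs_per_host is a hypothetical helper, not part of distcc.
# It reads distccmon-text output on stdin and counts the jobs per host,
# assuming the last column looks like HOST[SLOT].
jobs_per_host() {
    awk 'NF >= 4 {
        host = $NF
        sub(/\[[0-9]+\]$/, "", host)   # strip the slot index
        count[host]++
    }
    END {
        for (h in count) print h, count[h]
    }' | sort
}
```

It would be used as distccmon-text | jobs_per_host and prints one line per host followed by the number of jobs currently assigned to it.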

One can also simply tail /var/log/messages.log on the machine running the daemon:

# tail -f /var/log/messages.log

"Cross Compiling" with distcc

X86

There are currently two methods for distributing distcc tasks over a cluster when building i686 packages from a native x86_64 environment. Neither is ideal, but to date these are the only two methods documented on the wiki.

An ideal setup would use the unmodified Arch packages, run distccd only once on each node regardless of whether the build starts from the native environment or from within a chroot, and work with makepkg. Again, this Utopian setup is not currently known.

A discussion thread has been started on the topic; feel free to contribute.

Chroot method (preferred)

Note: This method works but is not very elegant: it requires a second distccd instance and a 32-bit chroot on every node.

Assuming the user has a 32-bit chroot set up and configured on each node of the distcc cluster, the strategy is to have two separate instances of distccd running on different ports on each node: one runs in the native x86_64 environment and the other in the x86 chroot on a modified port. Start the build by invoking makepkg through schroot.

Add port numbers to DISTCC_HOSTS on the i686 chroot

Append the port number defined earlier (3692) to each of the hosts in /opt/arch32/etc/makepkg.conf as follows:

DISTCC_HOSTS="192.168.1.101/5:3692 192.168.1.102/5:3692 192.168.1.103/3:3692"
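With many hosts, appending the port by hand is error-prone. The following is a minimal sketch that mirrors the host/limit:port form used above, assuming none of the entries carries a port or option suffix yet; the helper name add_port is hypothetical and not part of distcc:

```shell
#!/bin/bash
# add_port is a hypothetical helper, not part of distcc.
# It appends :PORT to every entry of a space-separated host list,
# assuming no entry has a port or option suffix yet.
add_port() {
    local port=$1 hosts=$2 host out=""
    for host in $hosts; do              # unquoted on purpose: split on spaces
        out+="${out:+ }${host}:${port}"
    done
    printf '%s\n' "$out"
}
```

Calling add_port 3692 "192.168.1.101/5 192.168.1.102/5 192.168.1.103/3" prints the host list shown above, ready to be assigned to DISTCC_HOSTS.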
Note: This only needs to be set up on the "master" i686 chroot, where "master" is defined as the one from which the compilation will take place.

Invoke makepkg from the Native Environment

Set up schroot in the native x86_64 environment. To build an i686 package from the native x86_64 environment, simply invoke makepkg as follows:

$ schroot -p -- makepkg -src

Multilib GCC method (not recommended)

See Makepkg#Build 32-bit packages on a 64-bit system.

Other architectures

Arch ARM

When building on an Arch ARM device, the developers highly recommend using the official project toolchains.

Extract the toolchain corresponding to the requisite architecture somewhere on the slave filesystem and edit /etc/conf.d/distccd adjusting the PATH to allow the toolchain to be used.

Example with the toolchain extracted to /mnt/data:

PATH=/mnt/data/x-tools8/aarch64-unknown-linux-gnueabi/bin:$PATH

To read in the configuration file, restart distccd.service.

Optionally, link the toolchain into the user's home directory if planning to build without makepkg. Example:

$ ln -s /mnt/data/x-tools8 x-tools8

Additional toolchains

  • EmbToolkit: Tool for creating cross compilation tool chain; supports ARM and MIPS architectures; supports building of an LLVM based tool chain
  • crosstool-ng: Similar to EmbToolkit; supports more architectures (see website for more information)
  • Linaro: Provides tool chains for ARM development

The EmbToolkit provides a nice graphical configuration menu (make xconfig) for configuring the tool chain.

Troubleshooting

Journalctl

Use journalctl to find out what went wrong:

$ journalctl $(which distccd) -e --since "5 min ago"

code 110

Make sure that the tool chain works for the user account under which the distcc daemon process is started (nobody by default). The following tests whether the tool chain works for user nobody. In /etc/passwd, change the login shell of the nobody user as follows:

$ cat /etc/passwd
...
nobody:x:99:99:nobody:/:/bin/bash
...

Then cd into the directory containing the cross compiler binaries and try to execute the compiler:

# su nobody
$ ./gcc --version
bash: ./gcc: Permission denied

Users experiencing this error should make sure that the group permissions described in #Other architectures are set up correctly.

Make sure to restore /etc/passwd to its original state after these modifications.

Alternatively, use sudo without changing the shell in /etc/passwd.

 # sudo -u nobody gcc --version
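A "Permission denied" error is often caused by a parent directory rather than by the compiler binary itself. The following is a minimal sketch that walks from the compiler up to / and reports every path component lacking the execute (traverse) bit for "others". It assumes GNU stat and readlink, and it ignores group membership and ACLs, so it is only a first approximation; the helper name check_other_access is hypothetical and not part of distcc:

```shell
#!/bin/bash
# check_other_access is a hypothetical helper, not part of distcc.
# It reports every component of the given path that "others" cannot
# execute or traverse. Group permissions and ACLs are NOT considered.
check_other_access() {
    local path mode status=0
    path=$(readlink -f -- "$1") || return 2
    while :; do
        mode=$(stat -c %A -- "$path") || return 2
        # %A looks like drwxr-x---; the 10th character is the x bit for others
        case ${mode:9:1} in
            x|t) ;;
            *) printf 'not accessible to others: %s (%s)\n' "$path" "$mode"
               status=1 ;;
        esac
        [ "$path" = / ] && break
        path=$(dirname -- "$path")
    done
    return $status
}
```

For the toolchain example above, one would pass the full path of the cross compiler binary to check_other_access. No output and a zero exit status mean that every component is open to others.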

Adjust log level

By default, distcc logs to /var/log/messages.log as it goes along. One trick (actually recommended in the distccd manpage) is to log to an alternative file directly. This file can be kept in RAM by placing it under /tmp. Another trick is to raise the minimum severity a message needs in order to be included in the log file. This is useful when only error messages are wanted rather than an entry for each connection. LEVEL can be any of the standard syslog levels, in particular critical, error, warning, notice, info, or debug.

Either call distcc with the arguments mentioned here on the master or append them to DISTCC_ARGS in /etc/conf.d/distccd on the slaves:

DISTCC_ARGS="--allow 192.168.0.0/24 --log-level error --log-file /tmp/distccd.log"
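A mistyped level name is easy to miss in a configuration file. The following is a minimal sketch that checks the level against the names listed above before printing the DISTCC_ARGS line; it uses only the --allow, --log-level and --log-file options already shown in the example. The helper name distccd_args is hypothetical and not part of distcc:

```shell
#!/bin/bash
# distccd_args is a hypothetical helper, not part of distcc.
# It validates the log level and prints a DISTCC_ARGS line built from
# the allowed network, the level and the log file.
distccd_args() {
    local allow=$1 level=$2 logfile=$3
    case $level in
        critical|error|warning|notice|info|debug) ;;
        *) printf 'unknown log level: %s\n' "$level" >&2
           return 1 ;;
    esac
    printf 'DISTCC_ARGS="--allow %s --log-level %s --log-file %s"\n' \
        "$allow" "$level" "$logfile"
}
```

Calling distccd_args 192.168.0.0/24 error /tmp/distccd.log reproduces the line shown above, while an unknown level name is rejected with a non-zero exit status.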

Failure to work with CMake or other tools

CMake sometimes passes a "response file" to gcc, which distcc cannot handle. A patch exists, but it has not been applied upstream. Users encountering this problem can apply that patch or use the distcc-rsp package from the AUR (archived in aur-mirror).

Limit HDD/SSD usage

Relocate $HOME/.distcc

By default, distcc creates $HOME/.distcc, which stores transient state while it hands out work for nodes to compile. Create a directory named .distcc in RAM, for example under /tmp, and soft link to it from $HOME. This avoids needless HDD reads and writes and is particularly important for SSDs.

$ mv $HOME/.distcc /tmp
$ ln -s /tmp/.distcc $HOME/.distcc
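The two commands above fail when run a second time, for example after a reboot has emptied /tmp. The following is a minimal sketch of a repeatable variant, assuming GNU ln; the helper name relocate_distcc_dir is hypothetical and not part of distcc:

```shell
#!/bin/bash
# relocate_distcc_dir is a hypothetical helper, not part of distcc.
# It moves $HOME/.distcc to a target directory (assumed to be in RAM)
# and leaves a symlink behind; running it again changes nothing.
relocate_distcc_dir() {
    local target=${1:-/tmp/.distcc} link=$HOME/.distcc
    if [ -L "$link" ]; then
        :                                   # already a symlink, nothing to move
    elif [ -d "$link" ]; then
        if [ -e "$target" ]; then
            printf '%s already exists, not moving %s\n' "$target" "$link" >&2
            return 1
        fi
        mv -- "$link" "$target" || return 1
    fi
    mkdir -p -- "$target" || return 1       # recreate it if /tmp was emptied
    ln -sfn -- "$target" "$link"            # -n: replace the link itself
}
```

Called without arguments, it uses /tmp/.distcc as in the commands above.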

Use systemd to re-create this directory on a reboot (the soft link will remain until it is manually removed like any other file) by creating the following tmpfiles.d configuration file:

/etc/tmpfiles.d/tmpfs-create.conf
d /tmp/.distcc 0755 <username> users -