Linux Containers

Current state of this HowTo

Delerious010 21:35, 1 December 2009 (EST)

  • Currently just a rough draft. It needs some restructuring and has become a bit too verbose; I'll be along shortly to complete it and clean it up.

Introduction

Synopsis

Linux Containers (LXC) are an operating system-level virtualization method for running multiple isolated server installs (containers) on a single control host. LXC does not provide a virtual machine, but rather provides a virtual environment that has its own process and network space. It is similar to a chroot, but offers much more isolation.

About this HowTo

This document is intended as an overview on setting up and deploying containers, and is not an in depth detailed instruction by instruction guide. A certain amount of prerequisite knowledge and skills are assumed (running commands as root, kernel configuration, mounting filesystems, shell scripting, chroot type environments, networking setup, etc).

Much of this was taken verbatim from Dwight Schauer, Tuxe and Ulhume. It has been copied here to give the community a place to share its knowledge on this topic.

Kernel configuration

Through the GUI

General Setup

  • [*] Group CPU scheduler
    • [*] Group scheduling for SCHED_OTHER
    • Basis for grouping tasks (Control groups)
      • [*] Control groups
  • [*] Control Group support
    • [*] Namespace cgroup subsystem
    • [*] Freezer cgroup subsystem
    • [*] Device controller for cgroups
    • [*] Cpuset support
    • [*] Include legacy /proc/<pid>/cpuset file
    • [*] Simple CPU accounting cgroup subsystem
    • [*] Resource counters
      • [*] Memory Resource Controller for Control Groups
        • [*] Memory Resource Controller Swap Extension (EXPERIMENTAL)
  • [*] Namespace support
    • [*] UTS namespace
    • [*] IPC namespace
    • [*] User namespace (EXPERIMENTAL)
    • [*] PID Namespaces (EXPERIMENTAL)
    • [*] Network namespace

Networking support

  • Networking options
    • [*] QoS and/or fair queueing
      • [*] Control Group Classifier

Device drivers

  • Character devices
    • [*] Unix98 pty support
      • [*] Support multiple instances of devpts

Security options

  • [*] File POSIX Capabilities

Through the .config

CONFIG_GROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_RT_GROUP_SCHED=y
CONFIG_CGROUP_SCHED=y
CONFIG_CGROUPS=y
CONFIG_CGROUP_NS=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_RESOURCE_COUNTERS=y
CONFIG_CGROUP_MEM_RES_CTLR=y
CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
CONFIG_MM_OWNER=y
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
CONFIG_NET_CLS_CGROUP=y
CONFIG_SECURITY_FILE_CAPABILITIES=y
CONFIG_DEVPTS_MULTIPLE_INSTANCES=y


Testing capabilities

Once the lxc package is installed, running lxc-checkconfig will print out a list of your system's capabilities.
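For example:

lxc-checkconfig

If the kernel was configured as described above, the listed capabilities should be reported as enabled.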

Host configuration

Control group filesystem

LXC depends on the control group filesystem being mounted. At present, there is no standard location for it, so you are free to mount it wherever you see fit.

Mounting manually

mkdir /cgroup
mount -t cgroup none /cgroup

In /etc/fstab

none /cgroup cgroup defaults 0 0
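To verify that the control group filesystem is actually mounted:

grep cgroup /proc/mounts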

Userspace tools

Both lxc and lxc-git can be found in AUR. In the example below, yaourt will be used to download and compile the package for us.

yaourt -S lxc-git

Bridge device setup

/etc/conf.d/bridges

bridge_br0="eth0"
config_br0="brctl setfd br0 0"
BRIDGE_INTERFACES=(br0)

/etc/rc.conf

MODULES=(... bridge ...)   # YMMV, but this was not required
eth0="eth0 0.0.0.0 up"     # I had to do the 0.0.0.0, "eth0 up" was not sufficient.
br0="dhcp"                 # or however you set your address
INTERFACES=(eth0 br0)

Bridge forward delay

In order for br0 to obtain a DHCP lease quickly (and for container network devices to be available quickly), the forward delay of the bridge device must be set to zero.

brctl setfd br0 0

Patch for /etc/rc.d/network

This is required to use the above-mentioned config_br0 statement as of initscripts 2009.08-1.

--- network.0 2009-10-13 13:05:40.924603683 -0500
+++ network 2009-10-13 13:18:59.534523717 -0500
@@ -172,6 +172,15 @@
          /usr/sbin/brctl addif $br $brif || error=1
        fi
      done
+     eval brconfig="\$config_${br}"
+     if [ -n "${brconfig}" ]; then
+       if ${brconfig}; then
+         true
+       else
+         echo config_${br}=\"${brconfig}\" \<-- invalid  configuration statement
+         error=1
+       fi
+     fi
    fi
  done
 }

See also: FS#16625

Container setup

There are various ways to do this.

Creating the filesystem

Bootstrap

Bootstrap an install (using mkarchroot, debootstrap, rinse, or Install From Existing Linux). You can also copy or reuse an existing installation's complete root filesystem.
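As a sketch, an Arch root filesystem could be bootstrapped with mkarchroot from the devtools package (the target directory /srv/lxc/arch01/rootfs is just an example):

mkdir -p /srv/lxc/arch01/rootfs
mkarchroot /srv/lxc/arch01/rootfs base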

Download existing

You can download a base install tarball; OpenVZ templates work just fine.
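For instance, assuming an OpenVZ template tarball has already been downloaded (the file name and target directory below are examples only):

mkdir -p /srv/lxc/arch01/rootfs
tar -xzf centos-5-x86.tar.gz -C /srv/lxc/arch01/rootfs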

Using the lxc tools

/usr/bin/lxc-debian {create|destroy|purge|help}
/usr/bin/lxc-fedora {create|destroy|purge|help}
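For example, to bootstrap a Debian container with the bundled helper script (which in turn relies on debootstrap being installed):

lxc-debian create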

Creating the device nodes

Since udev does not work within the container, you'll want to make sure that a minimum set of device nodes is created for it. This may be done with the following script:

#!/bin/bash
# Creates a minimal set of device nodes for a container.
# Run this from the root of the container's filesystem.
ROOT=$(pwd)
DEV=${ROOT}/dev
# Keep the old dev directory around and start from a clean one
mv ${DEV} ${DEV}.old
mkdir -p ${DEV}
mknod -m 666 ${DEV}/null c 1 3
mknod -m 666 ${DEV}/zero c 1 5
mknod -m 666 ${DEV}/random c 1 8
mknod -m 666 ${DEV}/urandom c 1 9
mkdir -m 755 ${DEV}/pts
mkdir -m 1777 ${DEV}/shm
mknod -m 666 ${DEV}/tty c 5 0
mknod -m 600 ${DEV}/console c 5 1
mknod -m 666 ${DEV}/tty0 c 4 0
mknod -m 666 ${DEV}/full c 1 7
mknod -m 600 ${DEV}/initctl p
mknod -m 666 ${DEV}/ptmx c 5 2
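Assuming the script above has been saved as mkdevices.sh (a name used here purely for illustration), run it from the root of the container's filesystem:

cd $CONTAINER_ROOTFS
sh /path/to/mkdevices.sh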

Container configuration

Configuration file

The configuration blocks in the #Main configuration section below are all parts of the same configuration file, which we'll use to launch containers. This file may be located anywhere, though /etc/lxc is probably a good place.

Warning : These configuration files are only used for the original creation of the container. Once created, lxc-start will use another copy of the file. More information on this is provided below in the section named #Creation.

Main configuration

Basic settings

# This will be used both for the cgroup name and the auto populated hostname of the container
lxc.utsname = $CONTAINER_NAME
# Path to the host side fstab used to create the container
lxc.mount = $CONTAINER_FSTAB
# Path to the container rootfs
lxc.rootfs = $CONTAINER_ROOTFS
# network.type = veth
## creates a virtual device in the container
## creates veth0$PID on the host
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.hwaddr = $CONTAINER_MACADDR 
lxc.network.ipv4 = $CONTAINER_IPADDR
# network.name = eth0
## name of the virtual device on the container side
## a good default would be eth0
lxc.network.name = $CONTAINER_DEVICENAME
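As an illustration, the same block with the placeholders filled in for a hypothetical container named arch01 (all values are examples only):

lxc.utsname = arch01
lxc.mount = /etc/lxc/arch01.fstab
lxc.rootfs = /srv/lxc/arch01/rootfs
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.hwaddr = 00:16:3e:00:00:01
lxc.network.ipv4 = 192.168.1.101/24
lxc.network.name = eth0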

Terminals

The following configuration parameters are optional. You may add them to your main configuration file if you wish to log in via lxc-console, or through a terminal (e.g. Ctrl+Alt+F1).

# number of usable /dev/tty* devices in the container
lxc.tty = 1
# number of pseudo ttys
lxc.pseudo = 1024

Device access

lxc.cgroup.devices.deny = a # Deny all access to devices
lxc.cgroup.devices.allow = c 1:3 rwm # dev/null
lxc.cgroup.devices.allow = c 1:5 rwm # dev/zero
lxc.cgroup.devices.allow = c 5:1 rwm # dev/console
lxc.cgroup.devices.allow = c 5:0 rwm # dev/tty
lxc.cgroup.devices.allow = c 4:0 rwm # dev/tty0
lxc.cgroup.devices.allow = c 1:9 rwm # dev/urandom
lxc.cgroup.devices.allow = c 1:8 rwm # dev/random
lxc.cgroup.devices.allow = c 136:* rwm # dev/pts/*
lxc.cgroup.devices.allow = c 5:2 rwm # dev/pts/ptmx
# No idea what this is .. dev/bsg/0:0:0:0 ???
lxc.cgroup.devices.allow = c 254:0 rwm

Configuration notes

lxc-start re-creates /dev/ttyX

If you've enabled multiple DevPTS instances in your kernel, lxc-start will recreate as many /dev/ttyX devices as specified by lxc.tty when it is executed.

This means that you will have that many pseudo ttys available. If you're planning on accessing the container via a "real" terminal (Ctrl+Alt+FX), make sure that X is not greater than lxc.tty.

To tell whether it's been re-created, you need only execute the following before lxc-start:

rm $CONTAINER_ROOTFS/dev/tty1
touch $CONTAINER_ROOTFS/dev/tty1

Once you start the container, the file will be replaced with a new node with major number 136. This is evident when running ls in the container (not the host).

Containers have access to host's TTY nodes

If you do not properly restrict the container's access to the /dev/tty nodes, the container may have access to the host's.

Since, as previously mentioned, lxc-start recreates only as many /dev/ttyX devices as specified by lxc.tty, any tty nodes present in the container with a minor number greater than lxc.tty will be connected to the host's terminals.

To access the container from a host TTY
  1. On the host, verify no getty is started for that tty by checking /etc/inittab.
  2. In the container, start a getty for that tty.
To prevent access to the host TTY

Please have a look at the #Device access section above.

To test this access

The output of the ls command below shows both the major and minor device numbers; they appear after the owner and group, represented as, for example, 4, 2.

  1. Set lxc.tty to 1
  2. Make sure that the container has /dev/tty1 and /dev/tty2
  3. lxc-start the container
  4. lxc-console into the container
  5. ls -Al /dev/tty
    crw------- 1 root root 4, 2 Dec 2 00:20 /dev/tty2
  6. echo "test output" > /dev/tty2
  7. Ctrl+Alt+F2 to view the host's second terminal
  8. You should see "test output" printed on the screen

Configuration troubleshooting

console access denied: Permission denied

If, when executing lxc-console, you receive the error lxc-console: console access denied: Permission denied, you've most likely either omitted lxc.tty or set it to 0.

lxc-console does not provide a login prompt

Though you're reaching a tty on the container, it most likely is not running a getty. Double check that a getty is defined in the container's /etc/inittab for that specific tty.
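For example, a sysvinit-style inittab entry that spawns a getty on tty1 could look like the following (an illustrative line; adjust the getty program and arguments to the container's distribution):

c1:2345:respawn:/sbin/agetty -8 38400 tty1 linux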

Configuration fstab

none $CONTAINER_ROOTFS/dev/pts devpts defaults 0 0
none $CONTAINER_ROOTFS/proc    proc   defaults 0 0
none $CONTAINER_ROOTFS/sys     sysfs  defaults 0 0
none $CONTAINER_ROOTFS/dev/shm tmpfs  defaults 0 0

Note : This fstab is used by lxc-start when mounting the container. As such, you can define any mount that would be possible on the host, such as bind mounting part of the host's own filesystem. However, please be aware of any and all security implications this may have.
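For instance, a bind mount entry in this fstab might look like the following (the host path /srv/shared is purely an example):

/srv/shared $CONTAINER_ROOTFS/srv/shared none bind 0 0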

Warning : You certainly do not want to bind mount the host's /dev to the container as this would allow it to, amongst other things, reboot the host.

Container Creation and Destruction

Creation

lxc-create -f $CONTAINER_CONFIGPATH -n $CONTAINER_NAME

lxc-create will create /var/lib/lxc/$CONTAINER_NAME with a new copy of the container configuration file found in $CONTAINER_CONFIGPATH.

As such, if you need to make modifications to the container's configuration file, it's advisable to modify only the original file and then perform lxc-destroy and lxc-create operations afterwards. No data will be lost by doing this.

Note : When copying the file over, lxc-create will strip all comments from the file.

Note : As of lxc-git from at least 2009-12-01, performing lxc-create no longer splits the config file into multiple files and folders. Therefore, we only have the configuration file to worry about.
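Once created, the container can be started and accessed with the lxc tools. A minimal sketch, assuming a container named arch01 and lxc.tty set to at least 1 (lxc-start runs the container in the foreground by default):

lxc-start -n arch01
lxc-console -n arch01 -t 1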

Destruction

lxc-destroy -n $CONTAINER_NAME

This will delete /var/lib/lxc/$CONTAINER_NAME which only contains configuration files. No data will be lost.