Kubernetes (aka k8s) is an open-source system for automating the deployment, scaling, and management of containerized applications.
A k8s cluster consists of its control-plane components and node components, each node representing one or more host machines running a container runtime and
kubelet.service. There are two options to install Kubernetes: "the real one", described here, and a local install with k3s, kind, or minikube.
When creating a Kubernetes cluster with the help of
kubeadm, install kubeadm and kubelet on each node.
Both control-plane and regular worker nodes require a container runtime for their
kubelet instances, which is used for hosting containers.
Install either containerd or cri-o to meet this dependency.
To control a Kubernetes cluster, install kubectl on the control-plane hosts and on any external host that is supposed to be able to interact with the cluster.
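A sketch of installing the pieces above on Arch Linux (package names are assumed to match the official repositories; pick one of the two container runtimes):

# pacman -S kubeadm kubelet kubectl containerd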
All nodes in a cluster (control-plane and worker) require a running instance of the container runtime;
kubelet.service will otherwise fail to start.
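For example, when containerd is used:

# systemctl enable --now containerd.service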
kubelet.service on a btrfs drive or subvolume running Kubernetes versions prior to 1.20.4 may be affected by a kubelet error like:
Failed to start ContainerManager failed to get rootfs info: failed to get device for dir "/var/lib/kubelet": could not find device with major: 0, minor: 25 in cached partitions map
If you cannot upgrade to version 1.20.4+, a workaround for this bug is to create an explicit mountpoint (added to fstab) for e.g. either the entire
/var/lib/ or just
/var/lib/kubelet/ and
/var/lib/containers/, like this:
# btrfs subvolume create /var/lib/kubelet
# btrfs subvolume create /var/lib/containers
# echo "/dev/vda2 /var/lib/kubelet btrfs subvol=/var/lib/kubelet 0 0" >> /etc/fstab
# echo "/dev/vda2 /var/lib/containers btrfs subvol=/var/lib/containers 0 0" >> /etc/fstab
# mount -t btrfs -o subvol=/var/lib/kubelet /dev/vda2 /var/lib/kubelet/
# mount -t btrfs -o subvol=/var/lib/containers /dev/vda2 /var/lib/containers/

Beware that
kubeadm reset undoes this! For more information check out #95826, #65204, and #94335.
All provided systemd services accept CLI overrides in environment files below /etc/kubernetes/; for example, kubelet.service reads extra flags from the KUBELET_ARGS variable in /etc/kubernetes/kubelet.env.
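A sketch of such an override (the --node-ip flag is only an illustration):

/etc/kubernetes/kubelet.env
KUBELET_ARGS="--node-ip=192.168.1.10"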
The networking setup for the cluster has to be configured for the respective container runtime. This can be done using a CNI plugin.
Pass the virtual network's CIDR to
kubeadm init via the --pod-network-cidr flag.
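For example, with Flannel's default pod network (the CIDR is an assumption; it has to match the CNI plugin's configuration):

# kubeadm init --pod-network-cidr='10.244.0.0/16'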
The container runtime has to be configured and started before
kubelet.service can make use of it.
When using CRI-O as the container runtime, it is required to provide
kubeadm init or
kubeadm join with its CRI endpoint:
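A sketch, assuming CRI-O's default socket path:

# kubeadm init --cri-socket='unix:///run/crio/crio.sock'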
CRI-O uses systemd as its default cgroup manager (see
/etc/crio/crio.conf). This is not compatible with kubelet's default (
cgroupfs) when using kubelet < v1.22.
Change kubelet's default by appending
--cgroup-driver='systemd' to the
KUBELET_ARGS environment variable in
/etc/kubernetes/kubelet.env upon first start (i.e. before using kubeadm init or kubeadm join).
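A sketch of the resulting environment file (any flags already present would be kept alongside):

/etc/kubernetes/kubelet.env
KUBELET_ARGS="--cgroup-driver='systemd'"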
Note that the
KUBELET_EXTRA_ARGS variable, used by older versions, is no longer read by the default kubelet.service.
When kubeadm updates from 1.19.x to 1.20.x, it should be possible to use a config file (https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/#config-file) to set the cgroup driver, as explained on https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#configure-cgroup-driver-used-by-kubelet-on-control-plane-node and as done in https://github.com/cri-o/cri-o/pull/4440/files, instead of the above. (TBC, untested.)
After the node has been configured, the CLI flag could (but does not have to) be replaced by a configuration entry for the kubelet. Note that
kubelet.service will fail (but restart) until the configuration for it is present.
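A sketch of the equivalent configuration entry, assuming kubeadm's default kubelet configuration file path:

/var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd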
When creating a new Kubernetes cluster with
kubeadm, a control-plane has to be created before further worker nodes can join it.
- If the cluster is supposed to be turned into a high availability cluster (a stacked etcd topology) later on,
kubeadm init needs to be provided with
--control-plane-endpoint=<IP or domain> (it is not possible to do this retroactively!).
- It is possible to use a config file for
kubeadm init instead of a set of parameters; see the sketch below.
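A minimal sketch of such a config file (all values are placeholders; the apiVersion has to match the installed kubeadm version):

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "cluster.example.org:6443"
networking:
  podSubnet: "10.244.0.0/16"

It is then passed with kubeadm init --config=<file> instead of the individual flags.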
Use kubeadm init to initialize a control-plane on a host machine:
# kubeadm init --node-name=<name_of_the_node> --pod-network-cidr=<CIDR> --cri-socket=<SOCKET>
If run successfully,
kubeadm init will have generated configurations for the
kubelet and various control-plane components below /etc/kubernetes/.
Finally, it will output commands ready to be copied and pasted to set up a worker node and make it join the cluster (based on a token, valid for 24 hours).
To use kubectl with the freshly created control-plane node, set up the configuration (either as root or as a normal user):
$ mkdir -p $HOME/.kube
# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
# chown $(id -u):$(id -g) $HOME/.kube/config
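To verify that the configuration works, list the cluster's nodes:

$ kubectl get nodes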
With the token information generated in #Control-plane, it is possible to make a node machine join an existing cluster:
# kubeadm join <control-plane-host>:<control-plane-port> --token <token> --discovery-token-ca-cert-hash sha256:<hash> --node-name=<name_of_the_node> --cri-socket=<SOCKET>
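If the 24-hour token has expired, a fresh join command can be printed on a control-plane node:

# kubeadm token create --print-join-command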
Tips and tricks
Tear down a cluster
When it is necessary to start from scratch, use kubectl to tear down a cluster. First, drain the node:
kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
where <node name> is the name of the node that should be drained and reset. Use
kubectl get node -A to list all nodes.
Then reset the node:
# kubeadm reset
Operating from behind a proxy
kubeadm reads the
http_proxy, https_proxy, and no_proxy environment variables. Kubernetes internal networking should be included in the last one, for example:
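A sketch (the first CIDR is an assumed pod network; the second is kubeadm's default):

no_proxy="10.244.0.0/16,10.96.0.0/12"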
where the second one is the default service network CIDR.
Failed to get container stats
When kubelet.service logs an error like

Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"

it is necessary to add configuration for the kubelet (see the relevant upstream ticket):
systemCgroups: '/systemd/system.slice'
kubeletCgroups: '/systemd/system.slice'
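These entries belong in the kubelet configuration file (assumed to be kubeadm's default, /var/lib/kubelet/config.yaml); kubelet.service has to be restarted for them to take effect.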
Pods cannot communicate when using Flannel CNI and systemd-networkd
See upstream bug report.
systemd-networkd assigns a persistent MAC address to every link. This policy is defined in its shipped configuration file
/usr/lib/systemd/network/99-default.link. However, Flannel relies on being able to pick its own MAC address. To override systemd-networkd's behaviour for
flannel* interfaces, create the following configuration file (e.g. /etc/systemd/network/50-flannel.link):
[Match]
OriginalName=flannel*

[Link]
MACAddressPolicy=none
If the cluster is already running, you might need to manually delete the
flannel.1 interface and the
kube-flannel-ds-* pod on each node, including the master. The pods will be recreated immediately and they themselves will recreate the flannel.1 interfaces.
Delete the interface:
# ip link delete flannel.1
Then delete the
kube-flannel-ds-* pod. Use the following command to delete all
kube-flannel-ds-* pods on all nodes:
$ kubectl -n kube-system delete pod -l="app=flannel"
See also
- Kubernetes Documentation - The upstream documentation
- Kubernetes Cluster with Kubeadm - Upstream documentation on how to setup a Kubernetes cluster using kubeadm
- Kubernetes Glossary - The official glossary explaining all Kubernetes specific terminology
- Kubernetes Addons - A list of third-party addons
- Kubelet Config File - Documentation on the Kubelet configuration file
- Taints and Tolerations - Documentation on node affinities and taints