IT, Notes

Setting Up and Testing Kubernetes at Home (1)

I had always heard that building your own Kubernetes cluster was very difficult, and I finally had time to try it at home. Fortunately there are plenty of tutorials online to reference, and it worked in the end. I recorded the installation process below (see Setup Log).

Here are some lessons learned from the installation:

  1. The test environment is Windows 11 + VMware, with only 16 GB RAM and an i7-3770s CPU.
  2. What K8S used to call the master node is now called the control-plane node.
  3. Do not give the control-plane node too little RAM: 2 GB causes errors, so 4 GB is recommended (I guess 3 GB would just about work). Worker nodes run fine with 2 GB RAM.
  4. Set the VM network to bridged, and reserve a fixed IP for each VM on the router.
  5. After powering on a CentOS 7 VM, if networking was not enabled during installation, manually set ONBOOT=yes after boot; also make sure the timezone is set correctly.
  6. This time I wanted to start with the old CentOS 7, but the latest stable release, e.g. CentOS 9, should be used instead.
  7. The articles I referenced were somewhat dated, and the better ones are Ubuntu-based, so the process was not entirely smooth. The main reference article is: Kubernetes (K8S) 自建地端伺服器 (on-premise) 建置實錄 - 清新下午茶 (jks.coffee)
  8. The biggest problem I hit was that the Flannel and CoreDNS pods failed to start. The root cause turned out to be the CNI: I had to manually edit the /etc/systemd/system/cri-docker.service unit file, use a matching pause image version, and run kubeadm reset to start over before it worked.
  9. With limited RAM I could only add one worker node for now; later I can add another worker node on the NAS for testing.
  10. Because of the cluster API, port 443/tcp must be opened in each node's firewall. Since this is just a test, simply disabling firewalld also works.
  11. When running cluster init (kubeadm init) and configuring the other plugins, the IP ranges must match up; many problems turn out to be caused by conflicting IP ranges.
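The Flannel/CoreDNS debugging in point 8 can be narrowed down with a few standard commands; a sketch of the checks I would run (assuming kubectl is already configured on the control-plane node; the pod name coredns-xxxx is a placeholder):

```shell
# List all pods in every namespace; look for CrashLoopBackOff / ContainerCreating
kubectl get pods -A -o wide

# Show the events of a failing pod (replace coredns-xxxx with the real pod name)
kubectl -n kube-system describe pod coredns-xxxx

# Check the CNI config directory; an empty directory means no CNI is installed
ls -l /etc/cni/net.d

# Check cri-dockerd logs for CNI / pause-image errors
journalctl -u cri-docker.service --no-pager | tail -n 50
```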

Setup log

Install CentOS on vmware

  1. Download ISO, I use CentOS-7-x86_64-Minimal-2009.iso
    • We can also use CentOS 9 minimal
  2. Create VM and install
    • Name: CentOS 7-2009 Min
    • RAM: 2GB
    • Processor: 1x2 Core
    • HDD: 20GB
    • Network: Bridged
    • Guest OS: Linux, CentOS 7 64Bit
  3. Start VM, login as root
    • If the network is not set during installation:
      • edit /etc/sysconfig/network-scripts/ifcfg-<interface> and set ONBOOT=yes
      • restart the network: systemctl restart network
    • If the timezone is not set during installation:
      • print the timezone: timedatectl
      • set to HK: timedatectl set-timezone Asia/Hong_Kong
    • Perform yum update and upgrade, to latest version
    • Poweroff
  4. Create CentOS VM for K8S Node base template
    • Select VM: CentOS 7-2009 Min
    • <Right-click> -> Manage -> Clone
      • Clone from: Current state in the VM
      • Clone method: Create a linked clone
      • Name: K8S-Node-base
    • Power on
    • Install basic tools
      • yum install -y nc git net-tools

Preparation for k8s installation: disable swap

# Check swap status
free -h

# Turn off swap temporarily
swapoff -a

# Turn off swap permanently
vi /etc/fstab
# Edit and comment out: /dev/mapper/centos-swap swap...

# also set the vm.swappiness
sysctl -w vm.swappiness=0

# Reboot
reboot

# check swap again
free -h
sysctl vm.swappiness
  1. NOTE: vm.swappiness reverted to the default value after reboot because sysctl -w only applies to the running system; to persist it, set it in a file under /etc/sysctl.d (the k8s.conf created later does this).
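The manual vi edit of /etc/fstab above can also be scripted; a minimal sketch that comments out any active swap entry:

```shell
# Prefix any uncommented fstab line containing a swap mount with '#'
sed -i '/\sswap\s/ s/^[^#]/#&/' /etc/fstab

# Verify no active swap entries remain
grep -v '^#' /etc/fstab | grep ' swap ' || echo "no active swap entries"
```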

Install Docker

Ref: How To Install and Use Docker on CentOS 7 | DigitalOcean

# Install docker
curl -fsSL https://get.docker.com/ | sh

# Start docker and check
systemctl start docker
systemctl status docker
docker ps

# Enable the docker
systemctl enable docker

Update daemon.json (to avoid network conflicts)

vi /etc/docker/daemon.json

{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "tag": "{{.Name}}",
    "max-size": "2m",
    "max-file": "2"
  },
  "default-address-pools": [
    {
      "base": "172.31.0.0/16",
      "size": 24
    }
  ],
  "bip": "172.7.0.1/16"
}

Note: change the default-address-pools if needed
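A JSON typo in daemon.json will prevent Docker from starting at all, so it is worth validating the file before restarting the daemon; a quick check (CentOS 7 ships Python 2, where json.tool also works):

```shell
# Fails with a parse-error message if the JSON is malformed
python -m json.tool /etc/docker/daemon.json > /dev/null && echo "daemon.json OK"
```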

Reboot and check if docker is started and working

Check if docker is using systemd for cgroupdriver

# restart docker if needed
systemctl daemon-reload
systemctl restart docker

docker info | grep -i cgroup

# Note: the reported Cgroup Version is 1, not 2

Now let's install the k8s components: kubelet, kubeadm and kubectl (Ref: Installing kubeadm | Kubernetes)

Login as root and execute the following commands

# Set SELinux in permissive mode (effectively disabling it)
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

# This overwrites any existing configuration in /etc/yum.repos.d/kubernetes.repo
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF

# Reset the iptables
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && iptables -P FORWARD ACCEPT

# Set firewall (NOTE: 443/tcp is very important!)
firewall-cmd --add-port=443/tcp --add-port=6443/tcp --add-port=10250/tcp --permanent
firewall-cmd --list-all
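If you prefer to keep firewalld running instead of disabling it, the full set of ports listed in the Kubernetes "Ports and Protocols" documentation can be opened; a sketch (Flannel's VXLAN port 8472/udp added on the assumption Flannel is the CNI, as in this setup):

```shell
# Control-plane node
firewall-cmd --permanent --add-port=6443/tcp       # Kubernetes API server
firewall-cmd --permanent --add-port=2379-2380/tcp  # etcd server client API
firewall-cmd --permanent --add-port=10250/tcp      # kubelet API
firewall-cmd --permanent --add-port=10257/tcp      # kube-controller-manager
firewall-cmd --permanent --add-port=10259/tcp      # kube-scheduler

# Worker nodes
firewall-cmd --permanent --add-port=10250/tcp        # kubelet API
firewall-cmd --permanent --add-port=30000-32767/tcp  # NodePort services

# Flannel VXLAN (all nodes)
firewall-cmd --permanent --add-port=8472/udp

firewall-cmd --reload
```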

# Enable overlay and br_netfilter
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

modprobe overlay
modprobe br_netfilter

# Set sysctl.d for k8s
cat <<EOF > /etc/sysctl.d/k8s.conf
vm.swappiness = 0
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

# Apply it:
sysctl --system

# Check if overlay and br_netfilter are working:
lsmod | grep br_netfilter
lsmod | grep overlay

# Check the values are set to 1 properly
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward

# Install kubelet, kubeadm and kubectl
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes

# Create kubelet config
# NOTE: --cgroup-driver is deprecated and no longer required:
#cat <<EOF > /etc/sysconfig/kubelet
#KUBELET_EXTRA_ARGS="--cgroup-driver=systemd"
#EOF
echo -n "" > /etc/sysconfig/kubelet

# Enable kubelet
systemctl enable --now kubelet

# NOTE: the kubelet service fails to start at this point; it keeps restarting until kubeadm init is run, which is expected

Install Container Runtime Interface (CRI) – cri-dockerd

Ref: https://github.com/Mirantis/cri-dockerd

# Download cri-dockerd
curl -O -L https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.9/cri-dockerd-0.3.9.amd64.tgz

# Extract
tar xzvf cri-dockerd-0.3.9.amd64.tgz

# Install
install -o root -g root -m 0755 cri-dockerd/cri-dockerd /usr/bin/cri-dockerd
rm -Rf cri-dockerd
git clone https://github.com/Mirantis/cri-dockerd.git
install cri-dockerd/packaging/systemd/* /etc/systemd/system

# IMPORTANT - Update /etc/systemd/system/cri-docker.service
vi /etc/systemd/system/cri-docker.service
# Change from:
# ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd://
# To:
# ExecStart=/usr/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.k8s.io/pause:3.9
# NOTE: if "kubeadm init" complains the pause:3.9 is outdated, please change the above version to the suggested one.

# Enable systemd
systemctl daemon-reload
systemctl enable --now cri-docker.service
systemctl enable --now cri-docker.socket

# Clean up
rm -Rf cri-dockerd cri-dockerd-0.3.9.amd64.tgz

reboot

Take a VM snapshot for cloning later

Setup the Control-plane nodes: Ctrl (a.k.a. master)

Clone the VM for the master node from the base VM prepared above.

Note:

  • The control-plane node needs 4 GB RAM; 2 GB does not work
  • Reserve a static IP for this VM (important!)

Start the VM, login as root then:

# Reset the machine ID
rm /etc/machine-id
systemd-machine-id-setup

# Change hostname to: k8s-ctrl
vi /etc/hostname

# Regenerate the SSH host keys and restart sshd
rm /etc/ssh/ssh_host_* -f && systemctl restart sshd

# Reboot and connect ssh again
reboot

# Get the mac address of the physical interface
ip link show $(arp | sed -n '2p' | awk '{print $NF}') |sed -n '2p' |awk '{print $2}'
# -OR-
ip link show $(ls -l /sys/class/net/ | grep -v virtual | awk 'NR==2 {print $9}') |sed -n '2p' |awk '{print $2}'

# Set the static IP (in router's DHCP) for this VM to keep the IP unchanged

# Prepare the variables:
IPADDR=$(ip route get 8.8.8.8 | head -1 | awk '{print $NF}')
NODENAME=$(hostname -s)
POD_CIDR="10.244.0.0/16"

# Perform Kubeadm init:
# Note: Service CIDR can be set if needed: --service-cidr=10.96.0.0/16
kubeadm init \
--apiserver-advertise-address=$IPADDR \
--control-plane-endpoint=$IPADDR \
--node-name $NODENAME \
--pod-network-cidr=$POD_CIDR \
--cri-socket unix:///var/run/cri-dockerd.sock \
--ignore-preflight-errors=all

Wait until the initialization is finished. Jot down the join command:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

  kubeadm join 192.168.11.44:6443 --token xxx \
        --discovery-token-ca-cert-hash sha256:xxxx\
        --control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.11.44:6443 --token xxxx \
        --discovery-token-ca-cert-hash sha256:xxxx

We can print the join command at any time if needed:

kubeadm token create --print-join-command

Update the .bash_profile

vi ~/.bash_profile

# Add:
# export KUBECONFIG=/etc/kubernetes/admin.conf

# Reboot and check

Since the CNI (Flannel will be used) has not been installed yet, kubelet reports an error:

systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Wed 2024-01-24 00:51:13 EST; 4min 48s ago
     Docs: https://kubernetes.io/docs/
 Main PID: 1032 (kubelet)
    Tasks: 11
   Memory: 113.7M
   CGroup: /system.slice/kubelet.service
           └─1032 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime-endpoint=unix:///var/run/cri-dockerd.sock --pod-infra-contai...

Jan 24 00:55:13 k8s-ctrl kubelet[1032]: E0124 00:55:13.138304    1032 kubelet.go:2855] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"

Install helm

# Install via script
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh

# Check installation
helm version

# clean up
rm ./get_helm.sh

Install Flannel CNI

# NOTE: POD_CIDR is the same as in the previous step
POD_CIDR="10.244.0.0/16"

# Create namespace "kube-flannel"
kubectl create ns kube-flannel

# Allow privileged pods in namespace "kube-flannel"
kubectl label --overwrite ns kube-flannel pod-security.kubernetes.io/enforce=privileged

# Add flannel to helm repo
helm repo add flannel https://flannel-io.github.io/flannel/

# Install flannel via helm
helm install flannel --set podCidr="$POD_CIDR" --namespace kube-flannel flannel/flannel

# Check if container is started:
docker ps --filter name=^.*flannel.*$

# If ok, a running flannel container will be found
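Besides checking the local container, the cluster view should also show Flannel and CoreDNS running; a sketch of the checks (run on the control-plane node; CoreDNS pods carry the k8s-app=kube-dns label):

```shell
# Flannel DaemonSet pods should be Running
kubectl get pods -n kube-flannel

# CoreDNS pods should leave Pending/ContainerCreating once the CNI is up
kubectl get pods -n kube-system -l k8s-app=kube-dns

# The node itself should report Ready
kubectl get nodes
```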

Setup the Worker nodes: 1

Clone the VM for the worker node from the base VM prepared above.

Note:

  • A worker node can be set to 2 GB RAM
  • Reserve a static IP for this VM (important!)

Start the VM, login as root then:

# Reset the machine ID
rm /etc/machine-id
systemd-machine-id-setup

# Change hostname to: k8s-node-1
vi /etc/hostname

# Regenerate the SSH host keys and restart sshd
rm /etc/ssh/ssh_host_* -f && systemctl restart sshd

# Reboot and connect ssh again
reboot

# Get the mac address of the physical interface
ip link show $(arp | sed -n '2p' | awk '{print $NF}') |sed -n '2p' |awk '{print $2}'
# -OR-
ip link show $(ls -l /sys/class/net/ | grep -v virtual | awk 'NR==2 {print $9}') |sed -n '2p' |awk '{print $2}'

# Set the static IP (in router's DHCP) for this VM to keep the IP unchanged

# Prepare the variables:
IPADDR=$(ip route get 8.8.8.8 | head -1 | awk '{print $NF}')
NODENAME=$(hostname -s)
POD_CIDR="10.244.0.0/16"

Join the cluster

# In case you need to get the join-command:
# Run it on the control-plane node; don't forget to append the cri-socket option when joining
kubeadm token create --print-join-command

# Join the worker node to cluster
kubeadm join 192.168.11.44:6443 --token xxx \
--discovery-token-ca-cert-hash sha256:xxxx \
--cri-socket unix:///var/run/cri-dockerd.sock

Check if the node was added successfully. Perform the following commands on the control-plane node (master node):

kubectl get nodes

# NAME         STATUS   ROLES           AGE     VERSION
# k8s-ctrl     Ready    control-plane   3h55m   v1.28.6
# k8s-node-1   Ready    <none>          2m45s   v1.28.6

Setup the Worker nodes: n-th

Repeat the above steps for additional nodes

Reset / Start-over

In case something goes wrong and you want to reset the cluster and start over:

# In EACH control-plane and worker nodes:
kubeadm reset -f --cri-socket unix:///var/run/cri-dockerd.sock
rm -rf /etc/cni/net.d

# Reset the iptables
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && iptables -P FORWARD ACCEPT

# Optionally
rm $HOME/.kube/config
