Kubernetes deploy

k8s deploy

install kubelet kubeadm kubectl

These packages should be installed on all nodes.

  • Step 1
# Set SELinux in permissive mode (effectively disabling it)
sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
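
To verify, getenforce should now report Permissive:

getenforce
# Permissive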
  • Step 2

    Note: v1.32 in the example below is not the latest release; change it to the latest version when installing.

CentOS/RHEL/Rocky

cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.ustc.edu.cn/kubernetes/core:/stable:/v1.32/rpm/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.ustc.edu.cn/kubernetes/core:/stable:/v1.32/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF

Ubuntu/Debian

sudo apt update
sudo apt install -y apt-transport-https ca-certificates curl
sudo curl -fsSL https://mirrors.ustc.edu.cn/kubernetes/core:/stable:/v1.32/deb/Release.key | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/kubernetes-ustc.gpg

cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.ustc.edu.cn/kubernetes/core:/stable:/v1.32/deb/ /
EOF

sudo apt update
  • Step 3

CentOS/RHEL/Rocky

# on all nodes
sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes

sudo systemctl enable --now kubelet

Ubuntu/Debian

sudo apt install -y kubelet kubeadm kubectl
sudo systemctl enable --now kubelet
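
Optionally, pin the package versions so an unattended upgrade does not move kubelet, kubeadm and kubectl to an incompatible release (on CentOS/RHEL the exclude= line in the repo file above already has this effect):

sudo apt-mark hold kubelet kubeadm kubectl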
  • Step 4: disable swap

    Swap must be disabled mainly because the kubelet service requires it.

sudo swapoff -a
sudo sed -i '/[[:space:]]swap[[:space:]]/s/^/#/' /etc/fstab
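
To confirm swap is fully off, free -h should show 0 for Swap and swapon --show should print nothing:

free -h
swapon --show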

install Container Runtimes

https://www.cloudraft.io/blog/container-runtimes

# Runtime	Path to Unix domain socket
containerd unix:///var/run/containerd/containerd.sock
CRI-O unix:///var/run/crio/crio.sock
Docker Engine (using cri-dockerd) unix:///var/run/cri-dockerd.sock
  • Enable IPv4 packet forwarding
# sysctl params required by setup, params persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
EOF

# Apply sysctl params without reboot
sudo sysctl --system

# Verify that net.ipv4.ip_forward is set to 1
sysctl net.ipv4.ip_forward
  • cgroup driver (optional for containerd and CRI-O)

To set systemd as the cgroup driver, set the cgroupDriver field of the KubeletConfiguration to systemd. For example:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
...
cgroupDriver: systemd

Starting with v1.22, when creating a cluster with kubeadm, if the user does not set the cgroupDriver field under KubeletConfiguration, kubeadm defaults it to systemd.
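
A minimal sketch of how to pass this setting: append the KubeletConfiguration as an extra document (separated by ---) to the kubeadm config file used later in this guide, then run kubeadm init --config with that file. This is only needed if you want a value other than the v1.22+ default.

# kubeadm.conf (excerpt)
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd

sudo kubeadm init --config kubeadm.conf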

containerd

  • init config.toml

    # the /etc/containerd directory must be created in advance
    sudo mkdir -p /etc/containerd
    sudo containerd config default | sudo tee /etc/containerd/config.toml
  • config

Configuring the systemd cgroup driver

sudo vim /etc/containerd/config.toml

#disabled_plugins = ["cri"]

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9" # "swr.cn-north-4.myhuaweicloud.com/ddn-k8s/registry.k8s.io/pause:3.10"

[plugins."io.containerd.grpc.v1.cri".registry.configs]
  xxxxxx
[plugins."io.containerd.grpc.v1.cri".registry]
  xxxxxx

sudo systemctl restart containerd

Once containerd is up, its local CLI tools ctr and crictl can be used.
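
For example (the k8s.io namespace is where the CRI plugin keeps Kubernetes images; crictl needs the /etc/crictl.yaml configuration shown later):

sudo ctr -n k8s.io images ls   # images pulled through the CRI plugin
sudo crictl ps                 # running containers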

  • check containerd version

    containerd --version
    # containerd containerd.io 1.7.19

CRI-O (optional)

  • set repo

    cat <<EOF | sudo tee /etc/yum.repos.d/cri-o.repo
    [cri-o]
    name=CRI-O
    baseurl=https://mirrors.ustc.edu.cn/kubernetes/addons:/cri-o:/stable:/v1.30/rpm/
    enabled=1
    gpgcheck=1
    gpgkey=https://mirrors.ustc.edu.cn/kubernetes/addons:/cri-o:/stable:/v1.30/rpm/repodata/repomd.xml.key
    EOF
  • install

    dnf install -y container-selinux
    dnf install -y cri-o
    systemctl start crio.service
  • bootstrap cluster

    swapoff -a
    modprobe br_netfilter
    sysctl -w net.ipv4.ip_forward=1

Docker Engine (optional)

  • install Docker (assumed to be installed already)

  • install cri-dockerd; its CRI socket is /run/cri-dockerd.sock by default.

    # download 
    wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.16/cri-dockerd-0.3.16.amd64.tgz

    # extract tgz
    tar -xzvf cri-dockerd-0.3.16.amd64.tgz

    sudo mv ./cri-dockerd /usr/local/bin/ # move the cri-dockerd binary into place
    wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/master/packaging/systemd/cri-docker.service
    wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/master/packaging/systemd/cri-docker.socket
    sudo mv cri-docker.socket cri-docker.service /etc/systemd/system/
    sudo sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service

    # configure the socket unit to restart automatically
    vim /etc/systemd/system/cri-docker.socket
    [Socket]
    Restart=always

    # enable service, cri-docker.socket -> trigger -> cri-docker.service
    sudo systemctl daemon-reload
    sudo systemctl enable cri-docker.service
    sudo systemctl enable --now cri-docker.socket
    sudo systemctl status cri-docker.socket

crictl

A CRI client that can be used to inspect the containers running on a Kubernetes node.

It should be installed on all nodes.

  • Configure the CRI endpoints: vim /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock # for Docker Engine use unix:///var/run/cri-dockerd.sock
image-endpoint: unix:///run/containerd/containerd.sock # for Docker Engine use unix:///var/run/cri-dockerd.sock
timeout: 2
debug: true
pull-image-on-create: false
  • verify the runtime configuration (including registry mirrors)
sudo crictl info
  • command
# list all pod sandboxes
crictl pods

# filter pods by namespace with the --namespace (or -n) option:
crictl pods --namespace=<namespace_name>

# list containers
crictl ps

# inspect a specific container
crictl inspect <container_id>

# check the logs of a container (if applicable)
crictl logs <container_id>

kubeadm deploy cluster

To use a different container runtime, or when more than one runtime is installed on the provisioned node, pass the --cri-socket argument to kubeadm (see the example after the table below).

Runtime Path to Unix domain socket
containerd unix:///var/run/containerd/containerd.sock
CRI-O unix:///var/run/crio/crio.sock
Docker Engine (using cri-dockerd) unix:///var/run/cri-dockerd.sock
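
For example, a sketch of selecting CRI-O explicitly (substitute the socket path from the table above for your runtime):

sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --cri-socket unix:///var/run/crio/crio.sock
sudo kubeadm join <control-plane-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --cri-socket unix:///var/run/crio/crio.sock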

Initializing the control-plane node

  • bootstrap cluster

    # use flannel as the Pod network add-on (CNI plugin); 10.244.0.0/16 is the CIDR flannel expects
    sudo kubeadm init --apiserver-advertise-address=10.220.32.16 --pod-network-cidr=10.244.0.0/16 --cri-socket=unix:///var/run/containerd/containerd.sock # change the socket path if another CRI is used

    # sudo kubeadm config images pull --config kubeadm.conf (pre-pull the required images)
    # sudo kubeadm init --config kubeadm.conf (use a config file; parameters such as apiserver-advertise-address above are then set in the file)
    • (optional) reset (roll back kubeadm init)

      If init fails, or you need to change the init configuration, reset first:

      sudo kubeadm reset

      sudo rm -rf /etc/cni/net.d
      sudo rm -rf $HOME/.kube/config
      sudo systemctl restart kubelet # the restart is required
  • Use kubectl

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
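
    kubectl should now be able to reach the cluster; the node typically shows NotReady until a CNI plugin (flannel below) is installed, roughly like:

    kubectl get nodes
    # NAME        STATUS     ROLES           AGE   VERSION
    # dingo7232   NotReady   control-plane   1m    v1.30.3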
  • install flannel

    # apply the flannel CNI plugin (optional)
    kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

    # To use a different registry for the flannel images, first wget kube-flannel.yml and edit the image fields, e.g.:
    swr.cn-north-4.myhuaweicloud.com/ddn-k8s/ghcr.io/flannel-io/flannel:v0.26.4
    swr.cn-north-4.myhuaweicloud.com/ddn-k8s/ghcr.io/flannel-io/flannel-cni-plugin:v1.6.2-flannel1
  • check that the CoreDNS Pods are Running

    kubectl get pods --all-namespaces

    NAMESPACE NAME READY STATUS RESTARTS AGE
    kube-system coredns-7b5944fdcf-mgwq2 1/1 Running 0 4h22m
    kube-system coredns-7b5944fdcf-sxfvt 1/1 Running 0 4h22m
    kube-system etcd-dingo7232 1/1 Running 1 4h22m
    kube-system kube-apiserver-dingo7232 1/1 Running 1 4h22m
    kube-system kube-controller-manager-dingo7232 1/1 Running 1 4h22m
    kube-system kube-proxy-w2lfk 1/1 Running 0 4h22m
    kube-system kube-scheduler-dingo7232 1/1 Running 1 4h22m
  • kubeadm.conf info

    sudo kubeadm config print init-defaults > kubeadm.conf

    apiVersion: kubeadm.k8s.io/v1beta3
    bootstrapTokens:
    - groups:
      - system:bootstrappers:kubeadm:default-node-token
      token: abcdef.0123456789abcdef
      ttl: 24h0m0s
      usages:
      - signing
      - authentication
    kind: InitConfiguration
    localAPIEndpoint:
      advertiseAddress: 172.20.7.232
      bindPort: 6443
    nodeRegistration:
      criSocket: unix:///var/run/containerd/containerd.sock
      imagePullPolicy: IfNotPresent
      name: dingo7232
      taints: null
    ---
    apiServer:
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta3
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controllerManager: {}
    dns: {}
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: registry.aliyuncs.com/google_containers
    kind: ClusterConfiguration
    kubernetesVersion: 1.30.0
    networking:
      dnsDomain: cluster.local
      serviceSubnet: 10.96.0.0/12
    scheduler: {}

join node

Worker nodes join the cluster (kubelet, kubeadm and kubectl must already be installed on the worker nodes).

# shangdi 16
sudo kubeadm join 172.20.7.232:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:9f2fac23dc994bb63b7510b1925f143a5b6fd3305ec8f376b402aa3c08ae5e90 --cri-socket unix:///var/run/containerd/containerd.sock

# sjj 127
sudo kubeadm join 172.30.14.127:6443 --token c2ixm4.l1iv90mj1b1ug3bq \
--discovery-token-ca-cert-hash sha256:600be5a4505e894ac2ee7a963b28ff33e6756291dfaf542884bf63a73ddf3fca \
--cri-socket unix:///var/run/cri-dockerd.sock

sudo kubeadm join 172.30.14.127:6443 --token c0ctm3.nkii2s87omvon67u --discovery-token-ca-cert-hash sha256:600be5a4505e894ac2ee7a963b28ff33e6756291dfaf542884bf63a73ddf3fca \
--cri-socket unix:///var/run/cri-dockerd.sock

Note: kubeadm tokens have a TTL. Use kubeadm token list to check whether a token is still valid; if it has expired, create a new one with kubeadm token create --print-join-command.
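
For example:

# check existing tokens and their expiry
kubeadm token list

# generate a new token together with a ready-to-run join command
kubeadm token create --print-join-command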

  • Check on control-plane
kubectl get nodes

NAME STATUS ROLES AGE VERSION
dingo7232 Ready control-plane 4h31m v1.30.3
dingo7233 Ready <none> 98s v1.30.3
  • check all pods

    kubectl get pods --all-namespaces
    NAMESPACE NAME READY STATUS RESTARTS AGE
    kube-flannel kube-flannel-ds-gmdks 1/1 Running 0 56s
    kube-flannel kube-flannel-ds-qzmb9 1/1 Running 0 10m
    kube-flannel kube-flannel-ds-tvdjg 1/1 Running 0 3m58s
    kube-system coredns-55cb58b774-7wwq7 1/1 Running 0 74m
    kube-system coredns-55cb58b774-z8v87 1/1 Running 0 74m
    kube-system etcd-dingo127 1/1 Running 0 74m
    kube-system kube-apiserver-dingo127 1/1 Running 0 74m
    kube-system kube-controller-manager-dingo127 1/1 Running 0 74m
    kube-system kube-proxy-6wv22 1/1 Running 0 74m
    kube-system kube-proxy-mc6nt 1/1 Running 0 56s
    kube-system kube-proxy-zv8fp 1/1 Running 0 3m58s
    kube-system kube-scheduler-dingo127 1/1 Running 0 74m

plugin

dashboard

best practices

change etcd default port

  • step 1: init kubeadm config

    Only the parameters that need to change have to be supplied; for example, the configuration below moves etcd's default ports to 12379 and 12380.

    apiVersion: kubeadm.k8s.io/v1beta3
    kind: InitConfiguration
    nodeRegistration:
      criSocket: unix:///var/run/cri-dockerd.sock # Docker Engine is used as the container runtime
      name: dingo127
    ---
    apiVersion: kubeadm.k8s.io/v1beta3
    kind: ClusterConfiguration
    kubernetesVersion: 1.30.0
    networking:
      podSubnet: "10.244.0.0/16" # equivalent to the --pod-network-cidr command-line flag
    apiServer:
      timeoutForControlPlane: 4m0s
      extraArgs:
        advertise-address: "172.30.14.127"
        etcd-servers: "https://127.0.0.1:12379" # etcd service address
    etcd:
      local:
        dataDir: /var/lib/k8s-etcd
        extraArgs:
          advertise-client-urls: "https://172.30.14.127:12379"
          initial-advertise-peer-urls: "https://172.30.14.127:12380"
          listen-client-urls: "https://127.0.0.1:12379,https://172.30.14.127:12379"
          listen-peer-urls: "https://172.30.14.127:12380"
          initial-cluster: "dingo127=https://172.30.14.127:12380"
          listen-metrics-urls: "http://127.0.0.1:12381"

    Note: in kubeadm.k8s.io/v1beta3 the extraArgs field is a map[string]string; with kubeadm.k8s.io/v1beta4 it must instead be a list of name/value pairs:

    ...
    apiServer:
      extraArgs:
      - name: "advertise-address"
        value: "172.30.14.127"
      - name: "etcd-servers"
        value: "https://127.0.0.1:12379"
    ...
    etcd:
      local:
        extraArgs:
        - name: "advertise-client-urls"
          value: "https://172.30.14.127:12379"
        - name: "initial-advertise-peer-urls"
          value: "https://172.30.14.127:12380"
        - name: "listen-client-urls"
          value: "https://127.0.0.1:12379,https://172.30.14.127:12379"
        - name: "listen-peer-urls"
          value: "https://172.30.14.127:12380"
        - name: "initial-cluster"
          value: "dingo127=https://172.30.14.127:12380"
        - name: "listen-metrics-urls"
          value: "http://127.0.0.1:12381"
  • step 2: run kubeadm init

    sudo kubeadm init --config=kubeadm.conf --ignore-preflight-errors=Port-2379,Port-2380
    # kubeadm's pre-flight checks verify that the etcd ports 2379 and 2380 are free, so those two checks have to be skipped here

rejoin worker node

  • step 1: remove node

    kubectl drain <old-node-name> --ignore-daemonsets --delete-emptydir-data
    kubectl delete node <old-node-name>
  • step 2: reset kubeadm

    # Reset kubeadm on the Worker Node
    sudo kubeadm reset -f # --cri-socket unix:///var/run/cri-dockerd.sock

    # restart kubelet on the worker node
    sudo systemctl restart kubelet
  • step 3: join node

    sudo kubeadm join <control-plane-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash> # --cri-socket unix:///var/run/cri-dockerd.sock

use kubectl from another node

# method 1
scp root@<control-plane-host>:/etc/kubernetes/admin.conf .
kubectl --kubeconfig ./admin.conf get nodes

# method 2: set kubectl default config
scp root@<control-plane-host>:/etc/kubernetes/admin.conf ~/.kube/config
kubectl get nodes

log

init control-plane (kubeadm init)

[reset] Reading configuration from the cluster...                                                          
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0804 19:50:00.492596 337004 reset.go:124] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: configmaps "kubeadm-config" is forbidden: User "kubernetes-admin" cannot get
resource "configmaps" in API group "" in the namespace "kube-system"
W0804 19:50:00.492782 337004 preflight.go:56] [reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0804 19:50:04.771583 337004 removeetcdmember.go:106] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Deleted contents of the etcd data directory: /var/lib/etcd
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /var/lib/kubelet /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/super-admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/sched
uler.conf]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
[dongw@dingo7232 kubernetes]$
[dongw@dingo7232 kubernetes]$ sudo kubeadm init --config kubeadm-containerd.conf
[init] Using Kubernetes version: v1.30.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0804 19:50:17.819534 337405 checks.go:844] detected that the sandbox image "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6" of the container runtime is inconsistent with that used by kubeadm.It is
recommended to use "registry.aliyuncs.com/google_containers/pause:3.9" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [dingo7232 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.20.7.232]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [dingo7232 localhost] and IPs [172.20.7.232 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [dingo7232 localhost] and IPs [172.20.7.232 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.002018544s
[api-check] Waiting for a healthy API server. This can take up to 4m0s
[api-check] The API server is healthy after 6.001201324s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node dingo7232 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node dingo7232 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.20.7.232:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:9f2fac23dc994bb63b7510b1925f143a5b6fd3305ec8f376b402aa3c08ae5e90

join node to cluster

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.501334955s
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

troubleshoot

flannel

Failed to check br_netfilter: stat /proc/sys/net/bridge/bridge-nf-call-iptables: no such file or directory

Solution: load the br_netfilter module

  1. Check if br_netfilter is loaded

    lsmod | grep br_netfilter
  2. If not loaded, load the module:

    sudo modprobe br_netfilter
  3. Enable br_netfilter on boot

    # Add the following line to /etc/modules-load.d/k8s.conf:
    br_netfilter
  4. Ensure bridge-nf-call-iptables and bridge-nf-call-ip6tables are set to 1:

    sudo sysctl net.bridge.bridge-nf-call-iptables=1
    sudo sysctl net.bridge.bridge-nf-call-ip6tables=1
  5. Persist the settings by adding the following to /etc/sysctl.d/k8s.conf:

    net.bridge.bridge-nf-call-iptables = 1
    net.bridge.bridge-nf-call-ip6tables = 1
  6. Reload sysctl settings:

    sudo sysctl --system

containerd

change containerd’s default data path

  1. Identify the Current Data Path

    containerd config default | grep "root" # Expected output: root = "/var/lib/containerd"
  2. Modify the following lines in /etc/containerd/config.toml:

    root = "/path/to/new/data/path" # the location where container data (images, volumes) is stored
  3. Move Existing Data (if required)

    sudo systemctl stop containerd  # stop containerd before moving its data
    sudo mv /var/lib/containerd /data/containerd
    sudo systemctl start containerd

kubelet service failed

The connection to the server 10.220.32.16:6443 was refused - did you specify the right host or port?

Make sure the API server process is up and listening on port 6443; a quick check is shown below.
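
For example, assuming 10.220.32.16 is the control-plane address used earlier:

# is anything listening on 6443 on the control-plane node?
sudo ss -tlnp | grep 6443

# does the API server answer? (-k skips certificate verification; adjust the address to your control plane)
curl -k https://10.220.32.16:6443/healthz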

  • disable swap
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
  • systemctl start kubelet
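
For example, start it and confirm it stays active:

sudo systemctl start kubelet
sudo systemctl status kubelet --no-pager
# follow the logs if it keeps crash-looping
sudo journalctl -u kubelet -f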

control-plane NotReady

  • restart the kubelet service on the control-plane node

    sudo systemctl restart kubelet