This post collects the problems I ran into while installing Kubernetes; it is based on notes from setting up a cluster on Tencent Cloud lightweight servers.

[TOC]

1. Environment

  • Ubuntu 22.04.3 LTS. Since CentOS Linux is no longer maintained, building a Kubernetes cluster on any CentOS release is not recommended.

  • Memory: 2 GB, a hard requirement

  • CPU: 2 cores, a hard requirement

  • Kubernetes version: 1.28.2

  • Container runtime: containerd.io 1.7.12. Starting with v1.24, Kubernetes removed its built-in Docker support (dockershim) and containerd is the usual choice, so this guide goes with the times.

    The following points are very important; every one of them was learned the hard way:

    • If you install on Tencent Cloud or Alibaba Cloud, open the required ports in the security group ahead of time to avoid unnecessary connectivity trouble. For example, on Tencent Cloud you must open port 8472 (UDP), otherwise after installing flannel the master node cannot ping pod IPs on the worker nodes (see the sketch after this list).
    • If you build local virtual machines with Vagrant, note that on Vagrant-generated VMs eth0 is always 10.0.2.15 and eth1 carries the real host IP. When initializing the master and when installing network plugins such as flannel, you must explicitly specify the correct host IP or interface, otherwise the setup will break.
    • Make sure the network is reachable. Registries such as docker.io and registry.k8s.io cannot be accessed from mainland China, so configure registry mirrors or pull the required images manually in advance.
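
For reference, these are the ports that typically need to be open between nodes (taken from the Kubernetes port list plus flannel's VXLAN default), together with a quick reachability check. Treat this as a sketch and adjust the IPs to your environment:

# Control plane: 6443 (kube-apiserver), 2379-2380 (etcd), 10250 (kubelet),
#                10257 (kube-controller-manager), 10259 (kube-scheduler)
# Workers:       10250 (kubelet), 30000-32767 (NodePort services)
# Flannel VXLAN: 8472/UDP
# Quick TCP check from another node (nc comes from the netcat package):
nc -zv 10.0.24.15 6443
# UDP cannot be probed reliably with nc; verify 8472/UDP in the cloud console instead.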

2. Set the hostname and hosts

hostnamectl set-hostname node1

cat <<EOF >> /etc/hosts
192.168.100.101     node1
192.168.100.102     node2
192.168.100.103     node3
EOF
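
A quick sanity check (a small sketch) that the hostnames resolve on each node:

getent hosts node1 node2 node3
ping -c 1 node2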

3. Set the time zone

timedatectl set-timezone Asia/Shanghai

systemctl restart rsyslog

timedatectl

4. Disable swap and SELinux

swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
setenforce 0 && sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config

Note: Ubuntu ships with AppArmor rather than SELinux, so the second line will usually report that setenforce is not found; it only matters on SELinux-based distributions.
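
A quick way to confirm swap is really off:

swapon --show   # should print nothing
free -h         # the Swap line should read 0B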

5. Disable the firewall

ufw disable

ufw status

6. Adjust kernel parameters

modprobe overlay
modprobe br_netfilter

cat <<EOF | tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF

sudo sysctl --system

Run the following commands to verify that the changes took effect:

lsmod | grep br_netfilter
lsmod | grep overlay
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward

7. Install containerd

apt update

apt -y install containerd

systemctl status containerd

Check the client and server versions; the client-side tool is ctr.

Run: ctr version

root@master:~# ctr version
Client:
  Version:  1.7.12
  Revision: 
  Go version: go1.21.1

Server:
  Version:  1.7.12
  Revision: 
  UUID: 3102980b-8c2f-422b-ab94-2387efecdb98

Run: systemctl status containerd

root@master:~# systemctl status containerd
● containerd.service - containerd container runtime
     Loaded: loaded (/lib/systemd/system/containerd.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2025-01-08 17:53:19 CST; 4 days ago
       Docs: https://containerd.io
   Main PID: 893 (containerd)
      Tasks: 117
     Memory: 91.3M
        CPU: 32min 37.934s
     CGroup: /system.slice/containerd.service
             ├─ 893 /usr/bin/containerd
             ├─1625 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 0bccf5c5dac302415759c74a0f7d755df47389836bde0c29c69c322202cf2083 -address /run/containerd/containerd.sock
             ├─1626 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 351e5fcd5715a1067af2c25ec78051eb122259d1aa2ec1023feaae055b49979d -address /run/containerd/containerd.sock
             ├─1627 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 7b2a26ea07a43d8339c35d9eb37827ac71f66b626d6b8f5563746c4c43bf55b4 -address /run/containerd/containerd.sock
             ├─1628 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id c5268729f9f1db7073bdd60bfb2856389b7e8e9a47d7de4afa256b965c5dbc24 -address /run/containerd/containerd.sock
             ├─1957 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 200195cd2c04b14526732eb9b8fa5b619a9e338b7f7caa9f8258791ac5a92862 -address /run/containerd/containerd.sock
             ├─1978 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 1e10203dd5000f911dfbf9d8c66bd571e11efbffb11e5e2b37fd007f72be7776 -address /run/containerd/containerd.sock
             ├─2607 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id c2e4475054041e7689105aecc785ddab98aaf2a7ebc8fa70dea6b05b0ea05831 -address /run/containerd/containerd.sock
             └─2736 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 05f0711763da1dc33add864309e5756794a043c8329a7104a34b8afd556933e8 -address /run/containerd/containerd.sock

Jan 08 17:53:44 master containerd[893]: time="2025-01-08T17:53:44.335577926+08:00" level=info msg="CreateContainer within sandbox \"05f0711763da1dc33add864309e5756794a043c8329a7104a34b8afd556933e8\" for container &ContainerMetadata{Name:coredns,Attempt:1,}"
Jan 08 17:53:44 master containerd[893]: time="2025-01-08T17:53:44.370769792+08:00" level=info msg="CreateContainer within sandbox \"05f0711763da1dc33add864309e5756794a043c8329a7104a34b8afd556933e8\" for &ContainerMetadata{Name:coredns,Attempt:1,} returns container id 

8. Modify the containerd configuration

Generate the default configuration file:

mkdir /etc/containerd/

containerd config default > /etc/containerd/config.toml

grep sandbox_image /etc/containerd/config.toml

Replace the sandbox_image source registry.k8s.io with the Aliyun google_containers mirror:

sed -i "s#registry.k8s.io/pause#registry.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml

Configure containerd to use the systemd cgroup driver:

sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml

Check the SystemdCgroup parameter:

grep SystemdCgroup /etc/containerd/config.toml

Restart containerd:

systemctl restart containerd
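
To confirm containerd picked up the changes after the restart, you can check the merged runtime configuration:

containerd config dump | grep -E 'sandbox_image|SystemdCgroup'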

9. Configure containerd registry mirrors

Edit /etc/containerd/config.toml and set config_path:

[plugins."io.containerd.grpc.v1.cri".registry]
    config_path = "/etc/containerd/certs.d"

You can make this change with the following one-liner:

sudo sed -ri 's@(config_path).*@\1 = "/etc/containerd/certs.d"@g' /etc/containerd/config.toml

sudo systemctl restart containerd

Then configure mirrors for the docker.io registry:

sudo mkdir -p /etc/containerd/certs.d/docker.io

cat <<'EOF' | sudo tee /etc/containerd/certs.d/docker.io/hosts.toml > /dev/null
server = "https://docker.io"
[host."https://dockerproxy.com"]
  capabilities = ["pull", "resolve"]

[host."https://docker.m.daocloud.io"]
  capabilities = ["pull", "resolve"]

[host."https://hub-mirror.c.163.com"]
  capabilities = ["pull", "resolve"]
EOF

Then configure a mirror for the registry.k8s.io registry:

sudo mkdir -p /etc/containerd/certs.d/registry.k8s.io

cat <<'EOF' | sudo tee /etc/containerd/certs.d/registry.k8s.io/hosts.toml > /dev/null
server = "https://registry.k8s.io"
[host."https://k8s.m.daocloud.io"]
  capabilities = ["pull", "resolve"]
EOF

Then configure a mirror for the k8s.gcr.io registry:

sudo mkdir -p /etc/containerd/certs.d/k8s.gcr.io

cat <<'EOF' | sudo tee /etc/containerd/certs.d/k8s.gcr.io/hosts.toml > /dev/null
server = "https://k8s.gcr.io"
[host."k8s-gcr.m.daocloud.io"]
  capabilities = ["pull", "resolve"]
EOF

How do you confirm that the mirrors are actually being used?

crictl --debug pull nginx

Without any prefix, the image being pulled is docker.io/library/nginx; the log output looks like this:

root@node1:~# crictl --debug pull  nginx
DEBU[0000] get image connection                         
DEBU[0000] PullImageRequest: &PullImageRequest{Image:&ImageSpec{Image:nginx,Annotations:map[string]string{},},Auth:nil,SandboxConfig:nil,} 
DEBU[0002] PullImageResponse: &PullImageResponse{ImageRef:sha256:f876bfc1cc63d905bb9c8ebc5adc98375bb8e22920959719d1a96e8f594868fa,} 
Image is up to date for sha256:f876bfc1cc63d905bb9c8ebc5adc98375bb8e22920959719d1a96e8f594868fa

Run crictl images | grep nginx to look at a specific image:

root@node1:~# crictl images|grep nginx
docker.io/library/nginx                              1.19                f0b8a9a541369       53.7MB
docker.io/library/nginx                              1.7.9               35d28df486f61       39.9MB
docker.io/library/nginx                              1.9.1               ee609d78a6476       54.7MB
docker.io/library/nginx                              latest              f876bfc1cc63d       72.1MB

10. Install the Kubernetes components

Install following the instructions on the Aliyun mirror page at https://developer.aliyun.com/mirror/kubernetes:

apt-get update && apt-get install -y apt-transport-https
curl -fsSL https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/deb/Release.key |
    gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/deb/ /" |
    tee /etc/apt/sources.list.d/kubernetes.list
apt-get update

List the kubeadm versions available for installation:

apt-cache madison kubeadm | head

To install a specific version, use the following command; this article pins 1.28.2. Note that the Debian revision suffix depends on the repository layout: with the new community-style repos it is typically 1.28.2-1.1 rather than 1.28.2-00, so check the apt-cache madison output above first.

apt install -y kubeadm=1.28.2-00 kubelet=1.28.2-00 kubectl=1.28.2-00

To install the latest version instead:

apt install -y kubeadm kubelet kubectl
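
Whichever route you take, it is worth pinning the packages so a routine apt upgrade does not move them unexpectedly; a small optional step:

sudo apt-mark hold kubeadm kubelet kubectl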

11. Configure crictl

See https://www.51cto.com/article/717474.html

crictl config runtime-endpoint unix:///run/containerd/containerd.sock
crictl config image-endpoint unix:///run/containerd/containerd.sock
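
These two commands simply write /etc/crictl.yaml; the file ends up looking roughly like this (exact fields vary by crictl version):

runtime-endpoint: "unix:///run/containerd/containerd.sock"
image-endpoint: "unix:///run/containerd/containerd.sock"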

All of the steps above must be executed on the master node, on node1, and on every other node that will later join the Kubernetes cluster.

12. Install and initialize the master node (master only)

Run the initialization:

kubeadm init --kubernetes-version=v1.28.2 --pod-network-cidr 172.16.0.0/16 --apiserver-advertise-address=10.0.24.15 --image-repository registry.aliyuncs.com/google_containers

Parameter notes:

--kubernetes-version=v1.28.2: replace with the version you want to install

--pod-network-cidr 172.16.0.0/16: the Pod IP address range

--apiserver-advertise-address=10.0.24.15: replace with the real IP of the master machine, i.e. the IP used when setting up the hosts file earlier

--image-repository registry.aliyuncs.com/google_containers: pull the control-plane images from the Aliyun mirror
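
If you prefer a configuration file to flags, a roughly equivalent kubeadm config is sketched below (same values as above; kubeadm.k8s.io/v1beta3 is the config API version used by kubeadm 1.28):

cat <<EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.0.24.15
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.28.2
imageRepository: registry.aliyuncs.com/google_containers
networking:
  podSubnet: 172.16.0.0/16
EOF

kubeadm init --config kubeadm-config.yaml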

Post-initialization steps:

mkdir -p $HOME/.kube

sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

sudo chown $(id -u):$(id -g) $HOME/.kube/config

Install command completion:

# Install the bash-completion tool
sudo apt install bash-completion
# Load bash_completion in the current shell
source /usr/share/bash-completion/bash_completion
# Make it permanent
echo "source <(kubectl completion bash)" >> ~/.bashrc
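
Optionally, add the short k alias with completion (taken from the kubectl documentation):

echo 'alias k=kubectl' >> ~/.bashrc
echo 'complete -o default -F __start_kubectl k' >> ~/.bashrc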

13. Install the Flannel network plugin (master only)

wget https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

Open kube-flannel.yml and check the images it references:

root@node1:~# cat kube-flannel.yml |grep image
        image: docker.io/flannel/flannel:v0.26.3
        image: docker.io/flannel/flannel-cni-plugin:v1.6.0-flannel1
        image: docker.io/flannel/flannel:v0.26.3

Make sure both of these images can be pulled successfully. You can pull them manually in advance to avoid surprises:

crictl --debug pull docker.io/flannel/flannel:v0.26.3

crictl --debug pull docker.io/flannel/flannel-cni-plugin:v1.6.0-flannel1

Change the network to the Pod CIDR chosen above: the file ships with 10.244.0.0/16, so change it to 172.16.0.0/16.

  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "EnableNFTables": false,
      "Backend": {
        "Type": "vxlan"
      }
    }
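
The same edit can be done with a one-line sed (a sketch, assuming the 172.16.0.0/16 CIDR used at kubeadm init). If you are on Vagrant, this is also the place to deal with the interface issue mentioned at the top: flanneld accepts an --iface argument (e.g. --iface=eth1) that can be appended to the flanneld container's args in the DaemonSet.

sed -i 's#10.244.0.0/16#172.16.0.0/16#' kube-flannel.yml
grep -n 'Network' kube-flannel.yml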

Then apply the manifest:

kubectl apply -f kube-flannel.yml

Then run the following commands; if the output looks normal, the master node is installed correctly:

kubectl get pod -A

kubectl get node

14. Join worker nodes to the cluster (workers only)

Generate the join command on the master node:

root@node1:~# kubeadm token create --print-join-command
kubeadm join 10.0.24.15:6443 --token w2moq6.8mifsq8hiez6bcim --discovery-token-ca-cert-hash sha256:47f0082054a9bba11cf351b92a6e26a451151393079042a3dbf893696c8addec

Then run the generated join command on the worker node:

kubeadm join 10.0.24.15:6443 --token w2moq6.8mifsq8hiez6bcim --discovery-token-ca-cert-hash sha256:47f0082054a9bba11cf351b92a6e26a451151393079042a3dbf893696c8addec
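
Back on the master, confirm that the new node registers and eventually turns Ready (recent manifests run flannel in the kube-flannel namespace, and a node stays NotReady until its flannel pod is up):

kubectl get nodes -o wide
kubectl get pods -n kube-flannel -o wide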

To make kubectl on the worker node able to display nodes, pods, and so on, do the following.

On the worker node, run:

mkdir -p $HOME/.kube

On the master, copy the config over (the target here is shown as node1; substitute the hostname of your worker node):

cd /root/.kube

scp config root@node1:/root/.kube

Finally, you can run kubectl commands on the worker node:

kubectl get node
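
As a final smoke test that ties back to the port warning at the beginning, deploy something across the nodes and check pod-to-pod connectivity (the deployment name here is arbitrary):

kubectl create deployment nginx-test --image=nginx --replicas=2
kubectl get pods -o wide
# From the master, ping the pod IPs listed above; if UDP 8472 is blocked,
# pods scheduled on other nodes will not respond.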