These are notes on problems I ran into while installing k8s, based on setting up the machines on Tencent Cloud lightweight servers.
[TOC]
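1. Environment
- Ubuntu 22.04.3 LTS. CentOS is no longer maintained, so building a Kubernetes cluster on any CentOS release is not recommended.
- Memory: 2 GB, a hard requirement
- CPU: 2 cores, a hard requirement
- Kubernetes version: 1.28.2
- Container runtime: containerd.io 1.7.12. Starting with 1.24, Kubernetes removed dockershim and no longer talks to Docker directly, so this guide goes with containerd.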
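The following points are very important; each one was learned the hard way:
- If you install on Tencent Cloud or Alibaba Cloud, open the required ports in the security group beforehand to avoid needless connectivity problems. For example, on Tencent Cloud port 8472 (UDP, used by flannel's VXLAN backend) must be open; otherwise, after flannel is installed, the master node cannot ping pod IPs on the worker nodes.
- If you build local VMs with Vagrant, note that eth0 on every Vagrant-generated VM is 10.0.2.15, while eth1 carries the machine's real IP. When initializing the master and when installing network plugins such as flannel, you must explicitly specify the correct IP or interface, or the setup will run into problems.
- Make sure the network is reachable. Image registries such as docker.io and registry.k8s.io cannot be accessed from mainland China, so set up registry mirrors or download the required images manually in advance.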
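For the Vagrant case above, the address to hand to kubeadm can be read from the interface instead of typed by hand. This is a minimal sketch, assuming the real address sits on eth1 as described; `iface_ipv4` is a hypothetical helper name, and it only parses the one-line-per-address format printed by `ip -o -4 addr show`.

```shell
# Hypothetical helper: print the IPv4 address of one interface.
# Reads `ip -o -4 addr show` output on stdin; $1 is the interface name.
iface_ipv4() {
  awk -v ifc="$1" '$2 == ifc && $3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}

# Usage on a node (eth1 is an assumption; confirm with `ip addr` first):
#   ip -o -4 addr show | iface_ipv4 eth1
```

The printed address is what `--apiserver-advertise-address` expects during master initialization; flanneld likewise accepts an `--iface` argument to pin the interface.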
2. Set the hostname and hosts
hostnamectl set-hostname node1
cat <<EOF >> /etc/hosts
192.168.100.101 node1
192.168.100.102 node2
192.168.100.103 node3
EOF
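Re-running the heredoc above appends the same entries to /etc/hosts again. A sketch of an idempotent variant follows; `add_host_entry` is a hypothetical helper name, and the check only looks for an exact hostname match on non-comment lines.

```shell
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is already mapped.
add_host_entry() {
  local file="$1" ip="$2" name="$3"
  # Succeeds when any non-comment line already lists NAME after the address column.
  if awk -v n="$name" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                       END { exit (found ? 0 : 1) }' "$file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage (as root):
#   add_host_entry /etc/hosts 192.168.100.101 node1
```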
3. Set the time zone
timedatectl set-timezone Asia/Shanghai
systemctl restart rsyslog
timedatectl
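4. Disable swap
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Only on hosts that have SELinux installed; a default Ubuntu install uses AppArmor and has no setenforce
setenforce 0 && sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config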
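The sed expression for /etc/fstab only matches swap entries whose fields are separated by spaces, so a tab-separated entry would be left active. A field-based sketch, assuming the standard fstab layout where the third column is the filesystem type; `comment_out_swap` is a hypothetical helper name.

```shell
# Hypothetical helper: comment out every active swap entry in an fstab-style file.
comment_out_swap() {
  local file="$1" tmp
  tmp="$(mktemp)" || return 1
  # Field 3 is the filesystem type; lines already starting with '#' are left alone.
  awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' "$file" > "$tmp" &&
    cat "$tmp" > "$file"   # overwrite in place so the owner and mode are preserved
  local rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Usage (as root):
#   swapoff -a && comment_out_swap /etc/fstab
```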
5. Turn off the firewall
ufw disable
ufw status
6. Adjust kernel parameters
modprobe overlay
modprobe br_netfilter
cat <<EOF | tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
Run the following commands to verify that the changes took effect:
lsmod | grep br_netfilter
lsmod | grep overlay
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
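The sysctl check can also be scripted so that it fails loudly instead of relying on reading the output. A sketch, assuming the usual `key = value` output format of sysctl; `check_sysctl_enabled` is a hypothetical helper name.

```shell
# Hypothetical helper: read `sysctl key...` output on stdin and fail unless
# every printed key is set to 1 (an empty input also counts as a failure).
check_sysctl_enabled() {
  awk -F' *= *' '$2 != "1" { print "not enabled: " $1; bad = 1 }
                 END { exit ((bad || NR == 0) ? 1 : 0) }'
}

# Usage:
#   sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables \
#     net.ipv4.ip_forward | check_sysctl_enabled && echo "kernel parameters OK"
```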
7. Install containerd
apt update
apt -y install containerd
systemctl status containerd
Check the client and server versions; the client tool is ctr.
Run: ctr version
root@master:~# ctr version
Client:
  Version:  1.7.12
  Revision:
  Go version: go1.21.1

Server:
  Version:  1.7.12
  Revision:
  UUID: 3102980b-8c2f-422b-ab94-2387efecdb98
Run: systemctl status containerd
root@master:~# systemctl status containerd
● containerd.service - containerd container runtime
     Loaded: loaded (/lib/systemd/system/containerd.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2025-01-08 17:53:19 CST; 4 days ago
       Docs: https://containerd.io
   Main PID: 893 (containerd)
      Tasks: 117
     Memory: 91.3M
        CPU: 32min 37.934s
     CGroup: /system.slice/containerd.service
             ├─ 893 /usr/bin/containerd
             ├─1625 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 0bccf5c5dac302415759c74a0f7d755df47389836bde0c29c69c322202cf2083 -address /run/containerd/containerd.sock
             ├─1626 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 351e5fcd5715a1067af2c25ec78051eb122259d1aa2ec1023feaae055b49979d -address /run/containerd/containerd.sock
             ├─1627 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 7b2a26ea07a43d8339c35d9eb37827ac71f66b626d6b8f5563746c4c43bf55b4 -address /run/containerd/containerd.sock
             ├─1628 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id c5268729f9f1db7073bdd60bfb2856389b7e8e9a47d7de4afa256b965c5dbc24 -address /run/containerd/containerd.sock
             ├─1957 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 200195cd2c04b14526732eb9b8fa5b619a9e338b7f7caa9f8258791ac5a92862 -address /run/containerd/containerd.sock
             ├─1978 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 1e10203dd5000f911dfbf9d8c66bd571e11efbffb11e5e2b37fd007f72be7776 -address /run/containerd/containerd.sock
             ├─2607 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id c2e4475054041e7689105aecc785ddab98aaf2a7ebc8fa70dea6b05b0ea05831 -address /run/containerd/containerd.sock
             └─2736 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id 05f0711763da1dc33add864309e5756794a043c8329a7104a34b8afd556933e8 -address /run/containerd/containerd.sock

Jan 08 17:53:44 master containerd[893]: time="2025-01-08T17:53:44.335577926+08:00" level=info msg="CreateContainer within sandbox \"05f0711763da1dc33add864309e5756794a043c8329a7104a34b8afd556933e8\" for container &ContainerMetadata{Name:coredns,Attempt:1,}"
Jan 08 17:53:44 master containerd[893]: time="2025-01-08T17:53:44.370769792+08:00" level=info msg="CreateContainer within sandbox \"05f0711763da1dc33add864309e5756794a043c8329a7104a34b8afd556933e8\" for &ContainerMetadata{Name:coredns,Attempt:1,} returns container id
8. Modify the containerd configuration
Generate the default configuration file:
mkdir -p /etc/containerd/
containerd config default > /etc/containerd/config.toml
grep sandbox_image /etc/containerd/config.toml
Switch the sandbox_image source from registry.k8s.io to the Alibaba Cloud google_containers mirror:
sed -i "s#registry.k8s.io/pause#registry.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml
Set the containerd cgroup driver to systemd:
sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml
Check the SystemdCgroup setting:
grep SystemdCgroup /etc/containerd/config.toml
Restart containerd:
systemctl restart containerd
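Both sed edits fail silently if the pattern is not found, for example when the default file uses a different spelling. A sketch that verifies the result before moving on; `check_containerd_config` is a hypothetical helper name, and the expected mirror prefix is taken from the command in this section.

```shell
# Hypothetical helper: confirm the two settings changed in this section.
check_containerd_config() {
  local file="$1" rc=0
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$file" ||
    { echo "SystemdCgroup is not set to true"; rc=1; }
  grep -Eq '^[[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*"registry\.aliyuncs\.com/google_containers/pause:' "$file" ||
    { echo "sandbox_image does not point at the mirror"; rc=1; }
  return "$rc"
}

# Usage:
#   check_containerd_config /etc/containerd/config.toml && systemctl restart containerd
```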
9. containerd registry mirrors
Open /etc/containerd/config.toml and set config_path:
[plugins."io.containerd.grpc.v1.cri".registry]
  config_path = "/etc/containerd/certs.d"
Alternatively, make the change with a single command:
# Anchored at the start of the line so that only the config_path key itself is rewritten
sudo sed -ri 's@^(\s*config_path).*@\1 = "/etc/containerd/certs.d"@' /etc/containerd/config.toml
sudo systemctl restart containerd
Then configure mirrors for the docker.io registry:
sudo mkdir -p /etc/containerd/certs.d/docker.io
cat <<'EOF' | sudo tee /etc/containerd/certs.d/docker.io/hosts.toml > /dev/null
server = "https://docker.io"

[host."https://dockerproxy.com"]
  capabilities = ["pull", "resolve"]

[host."https://docker.m.daocloud.io"]
  capabilities = ["pull", "resolve"]

[host."https://hub-mirror.c.163.com"]
  capabilities = ["pull", "resolve"]
EOF
Then configure a mirror for the registry.k8s.io registry:
sudo mkdir -p /etc/containerd/certs.d/registry.k8s.io
cat <<'EOF' | sudo tee /etc/containerd/certs.d/registry.k8s.io/hosts.toml > /dev/null
server = "https://registry.k8s.io"

[host."https://k8s.m.daocloud.io"]
  capabilities = ["pull", "resolve"]
EOF
Then configure a mirror for the k8s.gcr.io registry:
sudo mkdir -p /etc/containerd/certs.d/k8s.gcr.io
cat <<'EOF' | sudo tee /etc/containerd/certs.d/k8s.gcr.io/hosts.toml > /dev/null
server = "https://k8s.gcr.io"

[host."https://k8s-gcr.m.daocloud.io"]
  capabilities = ["pull", "resolve"]
EOF
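The three hosts.toml files follow the same pattern, so they can be generated by one function. A sketch, assuming every mirror only needs the pull and resolve capabilities as in the files above; `write_hosts_toml` is a hypothetical helper name.

```shell
# Hypothetical helper: write <certs_dir>/<registry>/hosts.toml for a list of mirrors.
write_hosts_toml() {
  local certs_dir="$1" registry="$2" dir mirror
  shift 2
  dir="$certs_dir/$registry"
  mkdir -p "$dir" || return 1
  {
    printf 'server = "https://%s"\n' "$registry"
    for mirror in "$@"; do
      printf '\n[host."%s"]\n  capabilities = ["pull", "resolve"]\n' "$mirror"
    done
  } > "$dir/hosts.toml"
}

# Usage (as root):
#   write_hosts_toml /etc/containerd/certs.d registry.k8s.io https://k8s.m.daocloud.io
```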
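How can you confirm that the mirror is taking effect? (crictl comes from the cri-tools package, which is normally installed together with kubeadm in step 10.)
crictl --debug pull nginx
An image name with no prefix means docker.io/library/nginx is pulled. The log looks like this: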
root@node1:~# crictl --debug pull nginx
DEBU[0000] get image connection
DEBU[0000] PullImageRequest: &PullImageRequest{Image:&ImageSpec{Image:nginx,Annotations:map[string]string{},},Auth:nil,SandboxConfig:nil,}
DEBU[0002] PullImageResponse: &PullImageResponse{ImageRef:sha256:f876bfc1cc63d905bb9c8ebc5adc98375bb8e22920959719d1a96e8f594868fa,}
Image is up to date for sha256:f876bfc1cc63d905bb9c8ebc5adc98375bb8e22920959719d1a96e8f594868fa
Run crictl images | grep nginx to look up a particular image:
root@node1:~# crictl images | grep nginx
docker.io/library/nginx    1.19      f0b8a9a541369    53.7MB
docker.io/library/nginx    1.7.9     35d28df486f61    39.9MB
docker.io/library/nginx    1.9.1     ee609d78a6476    54.7MB
docker.io/library/nginx    latest    f876bfc1cc63d    72.1MB
10. Install the Kubernetes components
Follow the installation guide from Alibaba Cloud at https://developer.aliyun.com/mirror/kubernetes:
apt-get update && apt-get install -y apt-transport-https
curl -fsSL https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/deb/Release.key |
    gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/deb/ /" |
    tee /etc/apt/sources.list.d/kubernetes.list
apt-get update
List the kubeadm versions available for installation:
apt-cache madison kubeadm | head
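To install a specific version, use the following command; this article uses 1.28.2. The revision suffix must match what apt-cache madison prints (with this repository layout it is typically of the form 1.28.2-1.1 rather than 1.28.2-00):
apt install -y kubeadm=1.28.2-1.1 kubelet=1.28.2-1.1 kubectl=1.28.2-1.1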
To install the latest version, use the following command:
apt install -y kubeadm kubelet kubectl
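Because the package revision suffix depends on the repository, the exact version string can be taken from the `apt-cache madison` listing instead of being hard-coded. A sketch, assuming the usual `package | version | source` column layout; `pick_pkg_version` is a hypothetical helper name.

```shell
# Hypothetical helper: read `apt-cache madison <pkg>` output on stdin and print
# the first full package version that belongs to the requested release.
pick_pkg_version() {
  awk -F'|' -v want="$1" '{
    v = $2
    gsub(/ /, "", v)                                  # strip column padding
    if (index(v, want "-") == 1) { print v; exit }    # e.g. "1.28.2-..." for want=1.28.2
  }'
}

# Usage:
#   ver="$(apt-cache madison kubeadm | pick_pkg_version 1.28.2)"
#   apt install -y kubeadm="$ver" kubelet="$ver" kubectl="$ver"
```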
11. Configure crictl
See https://www.51cto.com/article/717474.html
crictl config runtime-endpoint unix:///run/containerd/containerd.sock
crictl config image-endpoint unix:///run/containerd/containerd.sock
All of the steps above must be run on the master node, on node1, and on every node that will later join the k8s cluster.
12. Install and initialize the master node (master node only)
Run the initialization:
kubeadm init --kubernetes-version=v1.28.2 --pod-network-cidr 172.16.0.0/16 --apiserver-advertise-address=10.0.24.15 --image-repository registry.aliyuncs.com/google_containers
Parameters:
--kubernetes-version=v1.28.2: replace with the version you are installing
--pod-network-cidr 172.16.0.0/16: the IP address range for pods
--apiserver-advertise-address=10.0.24.15: replace with the real IP of the machine hosting the master, i.e. the IP used when setting up hosts in step 2
--image-repository registry.aliyuncs.com/google_containers: pull the images from the Alibaba Cloud mirror
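The kubeadm documentation asks that the pod network not overlap with any host network. A sketch for checking that a node IP, such as the advertise address, does not fall inside the chosen pod CIDR; `ip_to_int` and `ip_in_cidr` are hypothetical helper names, IPv4 only, and the inputs are assumed to be well-formed.

```shell
# Hypothetical helper: convert a dotted IPv4 address to an integer.
ip_to_int() {
  local ip="$1" old_ifs="$IFS"
  IFS=.
  set -- $ip            # split the address into its four octets
  IFS="$old_ifs"
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Hypothetical helper: succeed when IP ($1) lies inside CIDR ($2, e.g. 172.16.0.0/16).
ip_in_cidr() {
  local ip="$1" net="${2%/*}" bits="${2#*/}"
  local mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Usage with the values from this section:
#   ip_in_cidr 10.0.24.15 172.16.0.0/16 && echo "node IP overlaps the pod CIDR"
```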
Post-initialization steps:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Install command completion:
# Install the bash-completion package
sudo apt install bash-completion
# Load bash_completion
source /usr/share/bash-completion/bash_completion
# Make it permanent
echo "source <(kubectl completion bash)" >> ~/.bashrc
13. Install the Flannel network plugin (master node only)
wget https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
Open kube-flannel.yml and check the images:
root@node1:~# cat kube-flannel.yml | grep image
        image: docker.io/flannel/flannel:v0.26.3
        image: docker.io/flannel/flannel-cni-plugin:v1.6.0-flannel1
        image: docker.io/flannel/flannel:v0.26.3
Make sure both images above can be pulled successfully! You can pull them manually to prevent errors:
crictl --debug pull docker.io/flannel/flannel:v0.26.3
crictl --debug pull docker.io/flannel/flannel-cni-plugin:v1.6.0-flannel1
Change the network address to the pod CIDR used above. For example, the file contains 10.244.0.0/16 as shown here; change it to 172.16.0.0/16:
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "EnableNFTables": false,
      "Backend": {
        "Type": "vxlan"
      }
    }
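The CIDR change can also be scripted rather than edited by hand. A sketch, assuming GNU sed and that the manifest contains a single `"Network": "..."` entry as in the excerpt; `set_flannel_network` is a hypothetical helper name.

```shell
# Hypothetical helper: set the "Network" value in a flannel manifest.
set_flannel_network() {
  local file="$1" cidr="$2"
  # '#' is the sed delimiter because the CIDR itself contains '/'.
  sed -i -E "s#(\"Network\":[[:space:]]*)\"[^\"]*\"#\1\"${cidr}\"#" "$file"
}

# Usage:
#   set_flannel_network kube-flannel.yml 172.16.0.0/16
#   grep '"Network"' kube-flannel.yml
```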
Then apply it:
kubectl apply -f kube-flannel.yml
Then run the following commands. If the results look normal, the master node is installed correctly!
kubectl get pod -A
kubectl get node
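Whether the result "looks normal" can be checked mechanically for nodes. A sketch, assuming the default column layout of `kubectl get node --no-headers` (name first, status second); `not_ready_nodes` is a hypothetical helper name, and it treats any status other than exactly `Ready` as not ready.

```shell
# Hypothetical helper: read `kubectl get node --no-headers` output on stdin,
# print the nodes whose STATUS is not exactly "Ready", and fail if there are any.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1; bad = 1 } END { exit (bad ? 1 : 0) }'
}

# Usage:
#   kubectl get node --no-headers | not_ready_nodes && echo "all nodes Ready"
```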
14. Join worker nodes to the cluster (worker nodes only)
Generate the join command on the master node:
root@node1:~# kubeadm token create --print-join-command
kubeadm join 10.0.24.15:6443 --token w2moq6.8mifsq8hiez6bcim --discovery-token-ca-cert-hash sha256:47f0082054a9bba11cf351b92a6e26a451151393079042a3dbf893696c8addec
Then run the generated command on the worker node:
kubeadm join 10.0.24.15:6443 --token w2moq6.8mifsq8hiez6bcim --discovery-token-ca-cert-hash sha256:47f0082054a9bba11cf351b92a6e26a451151393079042a3dbf893696c8addec
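To let kubectl on a worker node display node, pod, and other information, do the following.
On the worker node, run:
mkdir -p $HOME/.kube
On the master, run the following; note that the destination here is node1, so substitute the hostname of your own worker node:
cd /root/.kube
scp config root@node1:/root/.kube
Finally, kubectl commands can be run on the worker node:
kubectl get node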