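Kubernetes 1.28.2 Cluster Installation Guide

Deploying a Kubernetes 1.28.2 cluster on VMware virtual machines with kubeadm, including Containerd configuration and installation of the Flannel network plugin.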

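Initial installation date: 2026-03-25. Last updated: 2026-03-27 (fixed the IP configuration, network mode, and image sources; completed the Flannel deployment configuration).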

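1. Environment

Virtual machine configuration

| Hostname | IP address      | MAC address       | Notes               |
|----------|-----------------|-------------------|---------------------|
| master   | 192.168.157.101 | 00:0c:29:86:31:c3 | Static IP, NAT mode |
| worker1  | 192.168.157.102 | 00:0c:29:f9:c6:00 | Static IP, NAT mode |
| worker2  | 192.168.157.103 | 00:0c:29:fb:c4:b7 | Static IP, NAT mode |

VMware network mode note: in this setup the cluster VMs must use NAT mode (attached to VMnet8). In NAT mode the DHCP pool is 192.168.157.128-254, so static IPs must fall outside that range and must not reuse the NAT gateway address 192.168.157.2 (for example, .3-.127). The Host-Only network (VMnet1) hands out addresses from 192.168.116.x, not 192.168.157.x; do not confuse the two.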

System environment

  • Operating system: CentOS Linux 7 (Core)
  • Container runtime: Containerd 1.6.33
  • Kubernetes version: 1.28.2

2. VMware network configuration

2.1 Make sure the VMs use NAT mode

Check the VMX configuration file:

ethernet0.connectionType = "nat"

DHCP pool in NAT mode: 192.168.157.128 - 192.168.157.254
NAT gateway address: 192.168.157.2
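
Whether a candidate address is safe can be checked mechanically. The following is a minimal sketch rather than part of the original setup: is_safe_static_ip is a hypothetical helper that hard-codes this guide's subnet, and it assumes that .1 is taken by the host's VMnet8 adapter (a common VMware default) in addition to the .2 gateway.

```shell
# Hypothetical helper: succeeds only if the address is on 192.168.157.0/24,
# outside the DHCP pool (.128-.254), and not .1/.2 (assumed host adapter and NAT gateway).
is_safe_static_ip() {
  host=${1#192.168.157.}          # strip the subnet prefix; unchanged if it does not match
  case "$host" in
    ''|*[!0-9]*) return 1 ;;      # empty, or not a plain number (wrong subnet, extra dots)
  esac
  [ "$host" -ge 3 ] && [ "$host" -le 127 ]
}

for ip in 192.168.157.101 192.168.157.150 192.168.157.2; do
  if is_safe_static_ip "$ip"; then
    echo "$ip: ok"
  else
    echo "$ip: in the DHCP pool, reserved, or on the wrong subnet"
  fi
done
```

Of the three sample addresses, only 192.168.157.101 passes: .150 sits inside the DHCP pool and .2 is the gateway.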

2.2 Configure a static IP (all nodes)

Important: the static IP must lie outside the DHCP pool (.128-.254); otherwise IP address conflicts will occur.

Edit /etc/sysconfig/network-scripts/ifcfg-ens33:

DEVICE=ens33
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.157.101    # master
# IPADDR=192.168.157.102  # worker1
# IPADDR=192.168.157.103  # worker2
NETMASK=255.255.255.0
GATEWAY=192.168.157.2
DNS1=192.168.157.2

Alternatively, use nmcli:

nmcli connection modify ens33 ipv4.addresses 192.168.157.101/24 \
  ipv4.gateway 192.168.157.2 \
  ipv4.dns 192.168.157.2 \
  ipv4.method manual
 
nmcli connection up ens33
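
Since only IPADDR differs between the three nodes, the ifcfg file can also be generated from a single template. This is a minimal sketch assuming the interface name ens33 and the addresses used in this guide; render_ifcfg is a hypothetical helper, not an existing tool.

```shell
# Hypothetical helper: print an ifcfg file for the given node IP on stdout.
render_ifcfg() {
  node_ip=$1
  cat <<EOF
DEVICE=ens33
BOOTPROTO=static
ONBOOT=yes
IPADDR=$node_ip
NETMASK=255.255.255.0
GATEWAY=192.168.157.2
DNS1=192.168.157.2
EOF
}

# Preview the file for worker1; nothing is written to disk here.
render_ifcfg 192.168.157.102
```

To apply it on a node, redirect the output to /etc/sysconfig/network-scripts/ifcfg-ens33 as root and restart networking.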

3. Installation steps

3.1 Base configuration (all nodes)

# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld

# Disable SELinux (setenforce 0 only lasts until the next reboot)
setenforce 0
# /etc/sysconfig/selinux is a symlink to /etc/selinux/config, so edit the real file only;
# running sed -i on the symlink would replace it with a regular file
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

# Disable swap
swapoff -a
# Comment out swap entries that are not already commented
sed -i '/^[^#].*swap/s/^/#/' /etc/fstab

# Configure hosts
cat >> /etc/hosts << EOF
192.168.157.101 master
192.168.157.102 worker1
192.168.157.103 worker2
EOF

# Load the kernel modules now and on every boot
cat > /etc/modules-load.d/containerd.conf << EOF
overlay
br_netfilter
EOF

modprobe overlay
modprobe br_netfilter

# Configure sysctl, including IP forwarding (worker nodes need this too!)
# sysctl --system applies the values immediately and the file keeps them across reboots
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
EOF
sysctl --system
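
The hosts step appends with >>, so running it a second time leaves duplicate entries. One way to make such appends repeatable is a small guard. append_once below is a hypothetical helper, and the demo writes to a temporary file rather than /etc/hosts.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not already present.
append_once() {
  file=$1
  line=$2
  # -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demo against a scratch file; on a real node the target would be /etc/hosts.
demo_hosts=$(mktemp)
append_once "$demo_hosts" "192.168.157.101 master"
append_once "$demo_hosts" "192.168.157.102 worker1"
append_once "$demo_hosts" "192.168.157.101 master"   # already present, so nothing is added
cat "$demo_hosts"
```

The scratch file ends up with two lines even though append_once was called three times.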

3.2 Configure Containerd (all nodes)

# Generate the default configuration
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml

# Enable SystemdCgroup (required, so that containerd uses the same cgroup driver as kubelet)
sed -i '/SystemdCgroup/s/false/true/g' /etc/containerd/config.toml

# Point sandbox_image at the Aliyun mirror (faster from mainland China)
sed -i '/sandbox_image/s/registry.k8s.io/registry.aliyuncs.com\/google_containers/g' /etc/containerd/config.toml

# Restart containerd
systemctl enable containerd
systemctl restart containerd

Verify:

systemctl is-active containerd  # should print "active"
ctr -n k8s.io images pull registry.aliyuncs.com/google_containers/pause:3.9  # confirm the image can be pulled
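
Because both sed edits fail silently when the pattern does not match, it can be useful to read the result back. The sketch below is one way to do that; check_containerd_config is a hypothetical helper, and the demo runs against a scratch file containing only the two relevant settings.

```shell
# Hypothetical helper: fail unless SystemdCgroup = true is set, then print the sandbox image.
check_containerd_config() {
  cfg=${1:-/etc/containerd/config.toml}
  if ! grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$cfg"; then
    echo "SystemdCgroup is not enabled in $cfg" >&2
    return 1
  fi
  sed -n 's/^[[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$cfg"
}

# Demo on a scratch file that mimics the relevant lines of config.toml.
sample_cfg=$(mktemp)
cat > "$sample_cfg" <<'EOF'
[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
EOF
check_containerd_config "$sample_cfg"
```

On a node, calling check_containerd_config with no argument reads /etc/containerd/config.toml.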

3.3 Install Kubernetes 1.28.2 (all nodes)

# Add the Aliyun Kubernetes repository
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

# Install the Kubernetes components
yum install -y kubectl-1.28.2 kubelet-1.28.2 kubeadm-1.28.2

# Enable kubelet (enable only, do not start it; kubeadm starts it during init/join)
systemctl enable kubelet
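
kubeadm runs its own pre-flight checks, but the three settings from section 3.1 that most often cause trouble (IP forwarding, bridge netfilter, swap) can be confirmed beforehand. The following is a minimal sketch: preflight is a hypothetical helper, and its optional root argument exists only so it can be dry-run against scratch files.

```shell
# Hypothetical pre-flight check. The optional argument is an alternate root directory,
# used here only for a dry run; with no argument the live /proc is read.
preflight() {
  root=${1:-}
  rc=0
  if [ "$(cat "$root/proc/sys/net/ipv4/ip_forward" 2>/dev/null)" != "1" ]; then
    echo "FAIL: net.ipv4.ip_forward is not 1"; rc=1
  fi
  if [ "$(cat "$root/proc/sys/net/bridge/bridge-nf-call-iptables" 2>/dev/null)" != "1" ]; then
    echo "FAIL: bridge-nf-call-iptables is not 1 (is br_netfilter loaded?)"; rc=1
  fi
  # /proc/swaps starts with a header line; every further line is an active swap area.
  active_swaps=$(awk 'NR > 1 { n++ } END { print n + 0 }' "$root/proc/swaps" 2>/dev/null)
  if [ "${active_swaps:-0}" -gt 0 ]; then
    echo "FAIL: swap is still enabled"; rc=1
  fi
  [ "$rc" -eq 0 ] && echo "pre-flight OK"
  return "$rc"
}

# Dry run against a scratch tree that imitates a correctly prepared node.
fake_root=$(mktemp -d)
mkdir -p "$fake_root/proc/sys/net/ipv4" "$fake_root/proc/sys/net/bridge"
echo 1 > "$fake_root/proc/sys/net/ipv4/ip_forward"
echo 1 > "$fake_root/proc/sys/net/bridge/bridge-nf-call-iptables"
echo "Filename Type Size Used Priority" > "$fake_root/proc/swaps"
preflight "$fake_root"
```

On a real node, call preflight with no argument before kubeadm init or kubeadm join.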

3.4 Initialize the master node

# Create the kubeadm configuration file
cat > /tmp/kubeadm-init.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.28.2
controlPlaneEndpoint: 192.168.157.101:6443
imageRepository: registry.aliyuncs.com/google_containers
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
EOF

# Clean up data left by a previous attempt (if any)
rm -rf /etc/kubernetes/ /var/lib/kubelet/
systemctl start containerd

# Initialize the cluster
kubeadm init --config=/tmp/kubeadm-init.yaml

# Configure kubectl
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
# For the current root shell, the admin kubeconfig can also be referenced directly
export KUBECONFIG=/etc/kubernetes/admin.conf
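
The configuration above sets kube-proxy to ipvs mode, which depends on IPVS kernel modules that section 3.1 does not load. The sketch below reports which ones are absent. missing_modules is a hypothetical helper, and the module list (ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh, nf_conntrack_ipv4) is an assumption based on the CentOS 7 3.10 kernel; newer kernels use nf_conntrack instead of nf_conntrack_ipv4.

```shell
# Hypothetical helper: print every requested module that is not listed in the given
# /proc/modules-style file (first column = module name).
missing_modules() {
  modules_file=$1
  shift
  for mod in "$@"; do
    if ! awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$modules_file"; then
      echo "$mod"
    fi
  done
}

# Demo on a scratch file; on a node, pass /proc/modules instead.
sample_modules=$(mktemp)
cat > "$sample_modules" <<'EOF'
ip_vs_rr 12600 0 - Live 0x0000000000000000
ip_vs 145497 2 ip_vs_rr, Live 0x0000000000000000
br_netfilter 22256 0 - Live 0x0000000000000000
overlay 91659 0 - Live 0x0000000000000000
EOF
missing_modules "$sample_modules" ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4
```

Anything it prints can be loaded with modprobe and added to a file under /etc/modules-load.d/, in the same way as overlay and br_netfilter. Modules compiled into the kernel do not appear in /proc/modules, so a reported name is a hint rather than proof that the feature is unavailable.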

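3.5 Install the Flannel network plugin

Note: use the complete YAML below rather than the upstream kube-flannel.yml, whose namespace does not match this setup.

# Complete Flannel YAML (namespace and RBAC configuration fixed)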
kubectl apply -f - << 'EOF'
# The PodSecurityPolicy object that older Flannel manifests start with is omitted here:
# the policy/v1beta1 PodSecurityPolicy API was removed in Kubernetes 1.25,
# so kubectl apply on 1.28.2 rejects it.
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: registry.cn-hangzhou.aliyuncs.com/alvinos/flanned:v0.13.1-rc1
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: registry.cn-hangzhou.aliyuncs.com/alvinos/flanned:v0.13.1-rc1
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
EOF
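
One detail that is easy to miss when adapting this manifest: the Network value in net-conf.json has to be the same CIDR as podSubnet in the kubeadm configuration (both are 10.244.0.0/16 in this guide). A minimal sketch of a consistency check follows. Both function names are hypothetical, and the demo uses scratch copies of the two fragments.

```shell
# Hypothetical helpers: extract the pod CIDR from a kubeadm config file
# and from a file holding Flannel's net-conf.json.
kubeadm_pod_subnet() {
  sed -n 's/^[[:space:]]*podSubnet:[[:space:]]*//p' "$1" | head -n 1
}
flannel_network() {
  sed -n 's/.*"Network"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Demo with scratch copies of the relevant fragments from this guide.
kubeadm_cfg=$(mktemp)
netconf=$(mktemp)
cat > "$kubeadm_cfg" <<'EOF'
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
EOF
cat > "$netconf" <<'EOF'
{
  "Network": "10.244.0.0/16",
  "Backend": {
    "Type": "vxlan"
  }
}
EOF

if [ "$(kubeadm_pod_subnet "$kubeadm_cfg")" = "$(flannel_network "$netconf")" ]; then
  echo "pod CIDR matches: $(flannel_network "$netconf")"
else
  echo "mismatch: kubeadm=$(kubeadm_pod_subnet "$kubeadm_cfg") flannel=$(flannel_network "$netconf")"
fi
```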

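3.6 Join the worker nodes

On the master, generate the join command:

kubeadm token create --print-join-command

Run the following on each worker node:

# Make sure ip_forward is enabled first
echo 1 > /proc/sys/net/ipv4/ip_forward

# Clean up old configuration (if the node joined a cluster before)
# ipvsadm -C requires the ipvsadm package to be installed
kubeadm reset --force
iptables -F && iptables -t nat -F && ipvsadm -C
rm -rf /etc/kubernetes/ /var/lib/kubelet/

# Join again (use the command generated on the master)
kubeadm join 192.168.157.101:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>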
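
A truncated copy-paste of the token or hash is a common reason for a failed join. Both values have a fixed shape, so they can be sanity-checked before running kubeadm join. This is a minimal sketch: valid_join_args is a hypothetical helper that checks format only and cannot tell whether a token has expired.

```shell
# Hypothetical helper: check the shape of a bootstrap token and a CA cert hash.
# kubeadm tokens are 6 characters, a dot, then 16 characters, all from [a-z0-9];
# the discovery hash is "sha256:" followed by 64 hex digits.
valid_join_args() {
  token=$1
  hash=$2
  printf '%s\n' "$token" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$' || return 1
  printf '%s\n' "$hash" | grep -Eq '^sha256:[0-9a-f]{64}$' || return 1
}

# Placeholder values for illustration only; they do not belong to any real cluster.
placeholder_hash="sha256:$(printf '%064d' 0)"
if valid_join_args "abcdef.0123456789abcdef" "$placeholder_hash"; then
  echo "join arguments look well-formed"
else
  echo "token or hash is malformed (truncated copy-paste?)"
fi
```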

4. Final cluster state

NAME      STATUS   ROLES           AGE     VERSION
master    Ready    control-plane   87m     v1.28.2
worker1   Ready    <none>          5m      v1.28.2
worker2   Ready    <none>          31m     v1.28.2
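
Nodes stay NotReady until a CNI plugin such as Flannel is running on them, so after a join it helps to filter for nodes that have not reached Ready yet. The sketch below is a hypothetical helper that parses the default kubectl get nodes table. The demo feeds it canned output in which worker1 is deliberately shown as NotReady.

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print the
# name of every node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Demo with canned output; on the master this would be: kubectl get nodes | not_ready_nodes
cat <<'EOF' | not_ready_nodes
NAME      STATUS     ROLES           AGE     VERSION
master    Ready      control-plane   87m     v1.28.2
worker1   NotReady   <none>          5m      v1.28.2
worker2   Ready      <none>          31m     v1.28.2
EOF
```

An empty result means every node reports Ready.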

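5. Key points

  1. Use Containerd: containerd is the recommended runtime for K8s 1.24+ and avoids the Docker dependency problems
  2. Configure SystemdCgroup: it must be enabled, otherwise kubelet will not work properly
  3. Static IPs in NAT mode: they must lie outside the DHCP pool (.128-.254) and must not reuse the gateway .2, e.g. .3-.127
  4. IP forwarding on worker nodes: /proc/sys/net/ipv4/ip_forward must be set to 1, and the setting must survive a reboot
  5. Flannel namespace: this guide keeps every Flannel component in the kube-system namespace rather than the upstream default kube-flannel; the manifest must use one namespace consistently
  6. Flannel RBAC: ClusterRole and ClusterRoleBinding are cluster-scoped, so the namespace named in the ClusterRoleBinding subject must match the namespace of the ServiceAccount and the DaemonSet
  7. Clean up before joining: before re-joining a cluster, run kubeadm reset and flush iptables/IPVS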

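6. Common problems

Problem 1: kubelet fails to start (config.yaml not found)

Cause: kubeadm init did not finish successfully or was interrupted.
Fix:

rm -rf /etc/kubernetes/ /var/lib/kubelet/
kubeadm init --config=/tmp/kubeadm-init.yaml

Problem 2: Flannel pods report ImagePullBackOff

Cause: the image cannot be pulled from docker.io.
Fix: use the Aliyun-hosted image registry.cn-hangzhou.aliyuncs.com/alvinos/flanned:v0.13.1-rc1

Problem 3: Flannel pods report "error retrieving pod spec"

Cause: the Flannel RBAC configuration is wrong; the ServiceAccount is in the wrong namespace.
Fix: use the complete YAML above and make sure every component is in the kube-system namespace.

Problem 4: kubeadm join reports an ip_forward error

Fix:

echo 1 > /proc/sys/net/ipv4/ip_forward
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.d/k8s.conf

Problem 5: Kubernetes certificates become invalid after an IP change

Cause: the Kubernetes certificates are bound to the node IP.
Fix: the cluster must be re-initialized.

kubeadm reset --force
rm -rf /etc/kubernetes/ /var/lib/kubelet/
kubeadm init --config=/tmp/kubeadm-init.yaml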

Problem 6: CoreDNS / Pods stay in ContainerCreating

Cause: possibly a corrupted containerd cache or a problem with the sandbox image.
Fix:

# Clean up containerd (this deletes all locally stored images and containers)
systemctl stop containerd kubelet
rm -rf /var/lib/containerd/* /run/containerd/*
systemctl restart containerd

# Pull the image again
ctr -n k8s.io images pull registry.aliyuncs.com/google_containers/pause:3.9

# Restart kubelet
systemctl restart kubelet
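
Wiping /var/lib/containerd removes every cached image, so it is worth confirming first which pods are actually stuck. The sketch below is a hypothetical helper that filters the default kubectl get pods -A table. The pod name suffixes in the demo are made up.

```shell
# Hypothetical helper: read `kubectl get pods -A` output on stdin and print
# namespace/name and status for pods that are neither Running nor Completed.
stuck_pods() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2 ": " $4 }'
}

# Demo with canned output; on the master this would be: kubectl get pods -A | stuck_pods
cat <<'EOF' | stuck_pods
NAMESPACE     NAME                       READY   STATUS              RESTARTS   AGE
kube-system   coredns-aaaaaaaaaa-bbbbb   0/1     ContainerCreating   0          12m
kube-system   kube-flannel-ds-ccccc      1/1     Running             0          15m
EOF
```

Running kubectl describe pod on a reported pod shows its Events, which usually name the step that is failing.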

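Problem 7: containerd timeouts cause sandbox creation to fail

Cause: insufficient memory, or containerd has hung.
Fix:

# Restart containerd
systemctl restart containerd

# If that does not help, do a full cleanup (this deletes all locally stored images and containers)
systemctl stop kubelet containerd
rm -rf /var/lib/containerd/* /run/containerd/*
systemctl start containerd
sleep 3
systemctl start kubelet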

7. Common commands

# Cluster status
kubectl get nodes -o wide
kubectl get pods -A
kubectl cluster-info

# Node operations
systemctl restart kubelet
systemctl restart containerd

# Image operations
ctr -n k8s.io images list                           # list images
ctr -n k8s.io images pull registry.aliyuncs.com/google_containers/pause:3.9  # pull an image

# Token operations
kubeadm token create --print-join-command  # generate a join command
kubeadm token list                         # list existing tokens

# Viewing logs
journalctl -u kubelet --no-pager -n 50
journalctl -u containerd --no-pager -n 50

8. Join command for the current cluster

Note: the token is valid for 24 hours; once it expires, a new one must be generated.

# Currently valid join command
kubeadm join 192.168.157.101:6443 --token 8sp3hd.kcytxrfrz4goqg01 \
  --discovery-token-ca-cert-hash sha256:b4541c7da1c4cb7cf2d7ffbb1c6db2ea2cfbbc53bb7706c07519b321a2868cfa

# Regenerate the join command
kubeadm token create --print-join-command