14 Jun 2021
The official Kubernetes documentation recommends kubeadm for bootstrapping self-managed clusters. A binary installation is still perfectly workable, though, and in my opinion it gives operators a much better grasp of the cluster's moving parts: the problems you hit and solve during a manual install teach you a lot about running Kubernetes.
About this document
| items | version |
|---|---|
| OS | centos7 |
| kubernetes | 1.21.0 |
| containerd | 1.4.6 |
| etcd | v3.4.16 |

| name | ip address | service | comment |
|---|---|---|---|
| proxy | 192.168.33.70 | haproxy | |
| master1 | 192.168.33.71 | kube-apiserver,kube-controller-manager,kube-scheduler | |
| master2 | 192.168.33.72 | kube-apiserver,kube-controller-manager,kube-scheduler | |
| master3 | 192.168.33.73 | kube-apiserver,kube-controller-manager,kube-scheduler | |
| etcd1 | 192.168.33.81 | etcd | |
| etcd2 | 192.168.33.82 | etcd | |
| etcd3 | 192.168.33.83 | etcd | |
| node01 | 192.168.33.91 | kubelet,kube-proxy | |
| node02 | 192.168.33.92 | kubelet,kube-proxy | |
| node03 | 192.168.33.93 | kubelet,kube-proxy | |

| name | CIDR |
|---|---|
| pod | 10.5.0.0/16 |
| service | 10.254.0.0/16 |
| host | 192.168.33.0/24 |
```bash
# Install basic tools
yum install -y wget vim

# Make sure MAC addresses and product_uuid are unique across all nodes
# MAC addresses
ip link
# product_uuid
cat /sys/class/dmi/id/product_uuid

# firewalld's nftables backend is incompatible with kube-proxy, so switch back to legacy iptables
systemctl stop firewalld
systemctl disable firewalld
systemctl mask firewalld
yum install -y iptables iptables-services
systemctl disable iptables
systemctl stop iptables
# The firewall stays off during installation; before production use, open the apiserver and other service ports

# Disable SELinux
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
setenforce 0

# Put the hostnames into /etc/hosts
cat << EOF >> /etc/hosts
192.168.33.71 master1
192.168.33.72 master2
192.168.33.73 master3
192.168.33.81 etcd1
192.168.33.82 etcd2
192.168.33.83 etcd3
192.168.33.91 node01
192.168.33.92 node02
192.168.33.93 node03
EOF

# Turn off swap
swapoff -a
# Comment out the swap entry in /etc/fstab so it stays off after reboot
sed -ri 's|^([^#].*[[:space:]]swap[[:space:]].*)$|#\1|g' /etc/fstab
# Swap is disabled so CPU/memory limits are enforced strictly and the scheduler never places pods into swap.

# Load the br_netfilter kernel module
cat << EOF > /etc/modules-load.d/k8s.conf
br_netfilter
EOF
lsmod | grep br_netfilter
[ $? -eq 0 ] || modprobe br_netfilter

# Kernel parameters required by Kubernetes
cat << EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
```
Install containerd by following the "containerd installation on CentOS 7" guide.
Key points:
- the storage driver has moved from graph drivers to snapshotters (no configuration change needed)
- use systemd instead of cgroupfs as the cgroup manager (see the sketch below)
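A minimal sketch of applying that cgroup setting, assuming the config file is created with `containerd config default` (containerd 1.4.x CRI plugin layout):

```bash
# Sketch: generate the default containerd config, then switch the runc runtime to the systemd cgroup driver.
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
# In /etc/containerd/config.toml, under
#   [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
# add or change:
#   SystemdCgroup = true
systemctl restart containerd
```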
```bash
# IP addresses of every role
declare -A IP_LIST
IP_LIST=( [master1]="192.168.33.71" \
          [master2]="192.168.33.72" \
          [master3]="192.168.33.73" \
          [etcd1]="192.168.33.81" \
          [etcd2]="192.168.33.82" \
          [etcd3]="192.168.33.83" \
          [node01]="192.168.33.91" \
          [node02]="192.168.33.92" \
          [node03]="192.168.33.93")
KUBE_API_PROXY_IP=192.168.33.70

# Installation variables
DEPLOY_DIR=/root/k8s
K8S_VER=v1.21.0
ETCD_VER=v3.4.16

# Certificate directories
K8S_PKI_DIR=/etc/kubernetes/pki
ETCD_PKI_DIR=/etc/etcd/pki
ADMIN_KUBECONFIG_DIR=/root/.kube
KUBECONFIG_DIR=/etc/kubernetes/kubeconfig

# Network settings
SERVICE_CLUSTER_IP_RANGE=10.254.0.0/16
SERVICE_NODE_PORT_RANGE=30000-32767
POD_CLUSTER_IP_RANGE=10.5.0.0/16
```
Install docker and docker-compose on the haproxy server, then start haproxy.
```bash
DOCKER_YML_DIR=/data/docker/yml
DOCKER_RUNTIME_DIR=/data/docker/runtime
mkdir -p ${DOCKER_YML_DIR}

cat << EOF > ${DOCKER_YML_DIR}/docker-compose-haproxy.yml
version: '3'
services:
  haproxy:
    container_name: haproxy-kube-apiserver
    image: haproxy
    ports:
      - 443:6443
    volumes:
      - /data/docker/runtime/haproxy/etc/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
EOF

mkdir -p ${DOCKER_RUNTIME_DIR}/haproxy/etc
cat << EOF > ${DOCKER_RUNTIME_DIR}/haproxy/etc/haproxy.cfg
frontend k8s-api
    bind 0.0.0.0:6443
    mode tcp
    option tcplog
    timeout client 1h
    default_backend k8s-api

backend k8s-api
    mode tcp
    timeout server 1h
    option tcplog
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server k8s-api-1 ${IP_LIST["master1"]}:6443 check
    server k8s-api-2 ${IP_LIST["master2"]}:6443 check
    server k8s-api-3 ${IP_LIST["master3"]}:6443 check
EOF

docker-compose -f /data/docker/yml/docker-compose-haproxy.yml up -d
```
The binaries are prepared on master1 and then pushed to every other machine, so set up SSH trust from master1 to all hosts beforehand; a minimal sketch follows.
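One simple way to set up that trust from master1 (the key type and the root account are assumptions):

```bash
# Run once on master1: create a key pair and push the public key to every host in the inventory.
ssh-keygen -t rsa -b 4096 -N '' -f ~/.ssh/id_rsa
for host in master1 master2 master3 etcd1 etcd2 etcd3 node01 node02 node03; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub root@${host}
done
```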
```bash
mkdir -p ${DEPLOY_DIR}/{node,master,etcd,cni}/bin
cd ${DEPLOY_DIR}

# Download kubernetes
wget https://dl.k8s.io/${K8S_VER}/kubernetes-server-linux-amd64.tar.gz -O ${DEPLOY_DIR}/kubernetes-server-linux-amd64.tar.gz
tar zxvf kubernetes-server-linux-amd64.tar.gz
cp ${DEPLOY_DIR}/kubernetes/server/bin/{kube-apiserver,kube-scheduler,kube-controller-manager,kubectl} ${DEPLOY_DIR}/master/bin
cp ${DEPLOY_DIR}/kubernetes/server/bin/{kubelet,kube-proxy} ${DEPLOY_DIR}/node/bin

# Download etcd
curl -L https://github.com/coreos/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz \
  -o etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf etcd-${ETCD_VER}-linux-amd64.tar.gz
cp etcd-${ETCD_VER}-linux-amd64/{etcd,etcdctl} ${DEPLOY_DIR}/etcd/bin/

## # Download the cni plugins
## wget https://github.com/containernetworking/plugins/releases/download/v0.8.4/cni-plugins-linux-amd64-v0.8.4.tgz
## tar zxvf cni-plugins-linux-amd64-v0.8.4.tgz -C ${DEPLOY_DIR}/cni/bin/

chmod +x ${DEPLOY_DIR}/etcd/bin/*
chmod +x ${DEPLOY_DIR}/node/bin/*
chmod +x ${DEPLOY_DIR}/master/bin/*

# Push the master binaries
for master in {master1,master2,master3};do
  rsync -av ${DEPLOY_DIR}/master/bin/* ${master}:/usr/local/bin/
done
# Push the node binaries
for node in {node01,node02,node03};do
  rsync -av ${DEPLOY_DIR}/node/bin/* ${node}:/usr/local/bin/
done
# Push the etcd binaries
for etcd in {etcd1,etcd2,etcd3};do
  rsync -av ${DEPLOY_DIR}/etcd/bin/* ${etcd}:/usr/local/bin/
done
## # Push the cni binaries
## for node in {node01,node02,node03};do
##   ssh root@${node} "mkdir -p /opt/cni/bin"
##   rsync -av ${DEPLOY_DIR}/cni/bin/* ${node}:/opt/cni/bin
## done
```
Before creating any certificates, I strongly recommend reading the apiserver authentication documentation.
Its key point: for X.509 client certificates, the apiserver uses the certificate subject's Common Name (CN) as the user name and the Organization (O) fields as the user's group memberships.
Here is an example CSR showing where CN and O are set:
{ "CN": "kubernetes", "key": { "algo": "rsa", "size": 2048 }, "names":[{ "C": "<country>", "ST": "<state>", "L": "<city>", "O": "<organization>", "OU": "<organization unit>" }] }
```bash
curl -s -L -o /usr/local/bin/cfssl https://github.com/cloudflare/cfssl/releases/download/v1.6.0/cfssl_1.6.0_linux_amd64
curl -s -L -o /usr/local/bin/cfssljson https://github.com/cloudflare/cfssl/releases/download/v1.6.0/cfssljson_1.6.0_linux_amd64
curl -s -L -o /usr/local/bin/cfssl-certinfo https://github.com/cloudflare/cfssl/releases/download/v1.6.0/cfssl-certinfo_1.6.0_linux_amd64
chmod +x /usr/local/bin/*

# Working directory for the generated certificates (temporary; any location works)
mkdir -p ${DEPLOY_DIR}/pki/{etcd,kubernetes}
```
```bash
cd ${DEPLOY_DIR}/pki/etcd

# step 1. Create the root CA
# ETCD CA certificate signing request
cat > ca-csr.json << EOF
{
  "CN": "etcd.local",
  "key": {
    "algo": "rsa",
    "size": 4096
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "kubernetes",
      "OU": "System"
    }
  ]
}
EOF

# step 2. Signing profiles
# CA signing configuration
# [issue]: because etcd runs with --client-cert-auth, the server profile also needs "client auth"
# [issue-url]: https://github.com/etcd-io/etcd/issues/9785
cat > ca-config.json <<EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "server": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "87600h"
      },
      "client": {
        "usages": [
          "signing",
          "key encipherment",
          "client auth"
        ],
        "expiry": "87600h"
      },
      "peer": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "87600h"
      }
    }
  }
}
EOF

# step 3. Certificate signing requests
# server: restricted to the listen IPs of all etcd nodes
cat > server-csr.json << EOF
{
  "CN": "server",
  "hosts": [
    "127.0.0.1",
    "${IP_LIST['etcd1']}",
    "${IP_LIST['etcd2']}",
    "${IP_LIST['etcd3']}"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "kubernetes",
      "OU": "System"
    }
  ]
}
EOF

# client: not restricted to any IP
cat > client-csr.json << EOF
{
  "CN": "client",
  "hosts": [""],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}
EOF

# peer: restricted to the peer-communication IPs of all etcd nodes
cat > peer-csr.json << EOF
{
  "CN": "peer",
  "hosts": [
    "${IP_LIST['etcd1']}",
    "${IP_LIST['etcd2']}",
    "${IP_LIST['etcd3']}"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "kubernetes",
      "OU": "System"
    }
  ]
}
EOF
```
```bash
# step 1. Generate the CA certificate and private key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
# Produces: ca-key.pem ca.csr ca.pem

# step 2. Generate CA-signed certificates from the CSR files
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server-csr.json | cfssljson -bare server
# Produces: server-key.pem server.csr server.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client-csr.json | cfssljson -bare client
# Produces: client-key.pem client.csr client.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer peer-csr.json | cfssljson -bare peer
# Produces: peer-key.pem peer.csr peer.pem
```
```bash
cd ${DEPLOY_DIR}/pki/kubernetes

# step 1. Create the root CA
# K8S CA certificate signing request
cat > ${DEPLOY_DIR}/pki/kubernetes/ca-csr.json << EOF
{
  "CN": "kubernetes",
  "key": {
    "algo": "rsa",
    "size": 4096
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "kubernetes",
      "OU": "System"
    }
  ]
}
EOF

# step 2. Signing profiles
# CA signing configuration
cat > ${DEPLOY_DIR}/pki/kubernetes/ca-config.json <<EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "server": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "87600h"
      },
      "client": {
        "usages": [
          "signing",
          "key encipherment",
          "client auth"
        ],
        "expiry": "87600h"
      }
    }
  }
}
EOF

# step 3. Certificate signing requests
# kube-apiserver
# hosts must contain:
# - every HA listen IP and the VIP
# - the IP passed to --apiserver-advertise-address
# - the first IP of the service CIDR
# - the in-cluster DNS names of the kubernetes service
# - the master node names
cat > ${DEPLOY_DIR}/pki/kubernetes/kube-apiserver-csr.json << EOF
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "${KUBE_API_PROXY_IP}",
    "${IP_LIST['master1']}",
    "${IP_LIST['master2']}",
    "${IP_LIST['master3']}",
    "${SERVICE_CLUSTER_IP_RANGE%.*}.1",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "kubernetes",
      "OU": "System"
    }
  ]
}
EOF

# api-kubelet-client
cat > ${DEPLOY_DIR}/pki/kubernetes/api-kubelet-client.json << EOF
{
  "CN": "system:kubelet-api-admin",
  "hosts": [
    "127.0.0.1",
    "node01",
    "node02",
    "node03",
    "${IP_LIST['master1']}",
    "${IP_LIST['master2']}",
    "${IP_LIST['master3']}",
    "${IP_LIST['node01']}",
    "${IP_LIST['node02']}",
    "${IP_LIST['node03']}"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "system:kubelet-api-admin",
      "OU": "System"
    }
  ]
}
EOF

# kube-controller-manager
# Note: the CN must be exactly system:kube-controller-manager
cat > ${DEPLOY_DIR}/pki/kubernetes/kube-controller-manager-csr.json << EOF
{
  "CN": "system:kube-controller-manager",
  "hosts": [
    "127.0.0.1",
    "${IP_LIST['master1']}",
    "${IP_LIST['master2']}",
    "${IP_LIST['master3']}"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "system:kube-controller-manager",
      "OU": "System"
    }
  ]
}
EOF

# kube-scheduler
cat > ${DEPLOY_DIR}/pki/kubernetes/kube-scheduler-csr.json << EOF
{
  "CN": "system:kube-scheduler",
  "hosts": [
    "127.0.0.1",
    "${IP_LIST['master1']}",
    "${IP_LIST['master2']}",
    "${IP_LIST['master3']}"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "system:kube-scheduler",
      "OU": "System"
    }
  ]
}
EOF

# kube-proxy
cat > ${DEPLOY_DIR}/pki/kubernetes/kube-proxy-csr.json << EOF
{
  "CN": "system:kube-proxy",
  "hosts": [""],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "system:node-proxier",
      "OU": "System"
    }
  ]
}
EOF

# service-account
cat > ${DEPLOY_DIR}/pki/kubernetes/service-account-csr.json << EOF
{
  "CN": "service-accounts",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "US",
      "L": "Portland",
      "O": "Kubernetes",
      "OU": "Kubernetes The Hard Way",
      "ST": "Oregon"
    }
  ]
}
EOF
```
```bash
# step 1. Generate the CA certificate and private key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
# Produces: ca-key.pem ca.csr ca.pem

# step 2. Generate CA-signed certificates from the CSR files
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server kube-apiserver-csr.json | cfssljson -bare kube-apiserver
# Produces: kube-apiserver-key.pem kube-apiserver.csr kube-apiserver.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server api-kubelet-client.json | cfssljson -bare api-kubelet-client
# Produces: api-kubelet-client-key.pem api-kubelet-client.csr api-kubelet-client.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
# Produces: kube-controller-manager-key.pem kube-controller-manager.csr kube-controller-manager.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client kube-scheduler-csr.json | cfssljson -bare kube-scheduler
# Produces: kube-scheduler-key.pem kube-scheduler.csr kube-scheduler.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client kube-proxy-csr.json | cfssljson -bare kube-proxy
# Produces: kube-proxy-key.pem kube-proxy.csr kube-proxy.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server service-account-csr.json | cfssljson -bare service-account
# Produces: service-account-key.pem service-account.csr service-account.pem
```
```bash
cd ${DEPLOY_DIR}/pki/kubernetes

# step 1. Certificate signing request
cat > admin-csr.json << EOF
{
  "CN": "admin",
  "hosts": [""],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}
EOF

# step 2. Generate the CA-signed certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client admin-csr.json | cfssljson -bare admin
# Produces: admin-key.pem admin.csr admin.pem
```
```bash
# Inspect a generated certificate
cfssl-certinfo -cert server.pem
```
```bash
# Push certificates to the masters
# What each master gets:
# - root CA
# - admin certificate
# - apiserver certificate
# - apiserver-etcd-client certificate
for master in {master1,master2,master3};do
  ssh root@${master} "mkdir -p ${K8S_PKI_DIR}"
  ssh root@${master} "mkdir -p ${ETCD_PKI_DIR}"
  scp ${DEPLOY_DIR}/pki/kubernetes/{ca.pem,ca-key.pem} ${master}:${K8S_PKI_DIR}
  scp ${DEPLOY_DIR}/pki/kubernetes/kube-apiserver.pem ${master}:${K8S_PKI_DIR}
  scp ${DEPLOY_DIR}/pki/kubernetes/kube-apiserver-key.pem ${master}:${K8S_PKI_DIR}
  scp ${DEPLOY_DIR}/pki/kubernetes/api-kubelet-client.pem ${master}:${K8S_PKI_DIR}
  scp ${DEPLOY_DIR}/pki/kubernetes/api-kubelet-client-key.pem ${master}:${K8S_PKI_DIR}
  scp ${DEPLOY_DIR}/pki/etcd/ca.pem ${master}:${ETCD_PKI_DIR}
  scp ${DEPLOY_DIR}/pki/etcd/client.pem ${master}:${K8S_PKI_DIR}/apiserver-etcd-client.pem
  scp ${DEPLOY_DIR}/pki/etcd/client-key.pem ${master}:${K8S_PKI_DIR}/apiserver-etcd-client-key.pem
  scp ${DEPLOY_DIR}/pki/kubernetes/service-account.pem ${master}:${K8S_PKI_DIR}
  scp ${DEPLOY_DIR}/pki/kubernetes/service-account-key.pem ${master}:${K8S_PKI_DIR}
done

# Push certificates to the nodes
for node in {node01,node02,node03};do
  ssh root@${node} "mkdir -p ${K8S_PKI_DIR}"
  scp ${DEPLOY_DIR}/pki/kubernetes/{ca.pem,api-kubelet-client.pem,api-kubelet-client-key.pem} ${node}:${K8S_PKI_DIR}
done
# These secure master <-> node traffic; the kubelet needs ca.pem for authentication, mainly for kubectl logs/exec

# Push certificates to the etcd nodes
for etcd in {etcd1,etcd2,etcd3};do
  ssh root@${etcd} "mkdir -p ${ETCD_PKI_DIR}"
  scp ${DEPLOY_DIR}/pki/etcd/{ca.pem,server.pem,server-key.pem,peer.pem,peer-key.pem} ${etcd}:${ETCD_PKI_DIR}
done
```
Reference: Kubernetes the Hard Way, the section on kubeconfig files.
```bash
# Working directory for the kubeconfig files
mkdir -p ${DEPLOY_DIR}/kubeconfig
```
```bash
cd ${DEPLOY_DIR}/kubeconfig
export KUBE_APISERVER="https://${KUBE_API_PROXY_IP}:443"

# step 1. Create the bootstrap token file
# The token is an arbitrary 128-bit string; use a secure random source to generate it.
export BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')
cat > ${DEPLOY_DIR}/kubeconfig/token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:bootstrappers"
EOF
# Note: before moving on, check token.csv and confirm that ${BOOTSTRAP_TOKEN} was replaced by a real value.
cat ${DEPLOY_DIR}/kubeconfig/token.csv
# Output should look like: 31c5af9c14a8f8ddbed6564234b2644f,kubelet-bootstrap,10001,"system:bootstrappers"

# step 2. Generate the kubeconfig files and set the current context
# Note: the credential must be the system:node group prefix plus the lowercase hostname
for node in {node01,node02,node03};do
  kubectl config set-cluster kubernetes \
    --certificate-authority=${DEPLOY_DIR}/pki/kubernetes/ca.pem \
    --embed-certs=true \
    --server=${KUBE_APISERVER} \
    --kubeconfig=bootstrap-kubelet-${node}.conf
  kubectl config set-credentials system:node:${node} \
    --token=${BOOTSTRAP_TOKEN} \
    --kubeconfig=bootstrap-kubelet-${node}.conf
  kubectl config set-context default \
    --cluster=kubernetes \
    --user=system:node:${node} \
    --kubeconfig=bootstrap-kubelet-${node}.conf
  kubectl config use-context default --kubeconfig=bootstrap-kubelet-${node}.conf
done
# The four commands, in order:
# create bootstrap-kubelet-${node}.conf with the cluster information (CA, apiserver address, cluster name)
# add the token credential
# add the context
# set the current context to default
```
```bash
export KUBE_APISERVER="https://${KUBE_API_PROXY_IP}:443"

# step 1. Generate the kubeconfig
kubectl config set-cluster kubernetes \
  --certificate-authority=${DEPLOY_DIR}/pki/kubernetes/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=kube-proxy.conf
# Creates kube-proxy.conf with the cluster information (CA, apiserver address, cluster name)

kubectl config set-credentials system:kube-proxy \
  --client-certificate=${DEPLOY_DIR}/pki/kubernetes/kube-proxy.pem \
  --client-key=${DEPLOY_DIR}/pki/kubernetes/kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.conf
# Adds the user and its credentials

kubectl config set-context default \
  --cluster=kubernetes \
  --user=system:kube-proxy \
  --kubeconfig=kube-proxy.conf
# Adds the context

# Set the current context
kubectl config use-context default --kubeconfig=kube-proxy.conf
```
```bash
# step 1. Generate the kubeconfig
kubectl config set-cluster kubernetes \
  --certificate-authority=${DEPLOY_DIR}/pki/kubernetes/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=admin.conf
# Creates admin.conf with the cluster information (CA, apiserver address, cluster name)

kubectl config set-credentials admin \
  --client-certificate=${DEPLOY_DIR}/pki/kubernetes/admin.pem \
  --embed-certs=true \
  --client-key=${DEPLOY_DIR}/pki/kubernetes/admin-key.pem \
  --kubeconfig=admin.conf
# Adds the user and its credentials

kubectl config set-context kubernetes \
  --cluster=kubernetes \
  --user=admin \
  --kubeconfig=admin.conf
# Adds the context

# Set the current context
kubectl config use-context kubernetes --kubeconfig=admin.conf
```
```bash
for master in {master1,master2,master3};do
  kubectl config set-cluster kubernetes \
    --certificate-authority=${DEPLOY_DIR}/pki/kubernetes/ca.pem \
    --embed-certs=true \
    --server=https://${IP_LIST[${master}]}:6443 \
    --kubeconfig=kube-controller-manager-${master}.conf
  # Note: the credential must be system:kube-controller-manager
  kubectl config set-credentials system:kube-controller-manager \
    --client-certificate=${DEPLOY_DIR}/pki/kubernetes/kube-controller-manager.pem \
    --client-key=${DEPLOY_DIR}/pki/kubernetes/kube-controller-manager-key.pem \
    --embed-certs=true \
    --kubeconfig=kube-controller-manager-${master}.conf
  kubectl config set-context default \
    --cluster=kubernetes \
    --user=system:kube-controller-manager \
    --kubeconfig=kube-controller-manager-${master}.conf
  kubectl config use-context default --kubeconfig=kube-controller-manager-${master}.conf
done
```
Notes:
- the --server here must point to the kube-apiserver on the controller-manager's own node, otherwise leader election can hang and fail; see issue 49000
- in the systemd unit file, --master would override this setting, so it can be omitted there
```bash
for master in {master1,master2,master3};do
  kubectl config set-cluster kubernetes \
    --certificate-authority=${DEPLOY_DIR}/pki/kubernetes/ca.pem \
    --embed-certs=true \
    --server=https://${IP_LIST[${master}]}:6443 \
    --kubeconfig=kube-scheduler-${master}.conf
  # Note: the credential must be system:kube-scheduler
  kubectl config set-credentials system:kube-scheduler \
    --client-certificate=${DEPLOY_DIR}/pki/kubernetes/kube-scheduler.pem \
    --client-key=${DEPLOY_DIR}/pki/kubernetes/kube-scheduler-key.pem \
    --embed-certs=true \
    --kubeconfig=kube-scheduler-${master}.conf
  kubectl config set-context default \
    --cluster=kubernetes \
    --user=system:kube-scheduler \
    --kubeconfig=kube-scheduler-${master}.conf
  kubectl config use-context default --kubeconfig=kube-scheduler-${master}.conf
done
```
Notes:
- the --server here must point to the kube-apiserver on the scheduler's own node, otherwise leader election can hang and fail; see issue 49000
- in the systemd unit file, --master would override this setting, so it can be omitted there
```bash
cd ${DEPLOY_DIR}/kubeconfig

# Push bootstrap-kubelet-<node-hostname>.conf and kube-proxy.conf to the nodes
for node in {node01,node02,node03};do
  ssh root@${node} "mkdir -p ${KUBECONFIG_DIR}"
  scp ${DEPLOY_DIR}/kubeconfig/bootstrap-kubelet-${node}.conf ${node}:${KUBECONFIG_DIR}/bootstrap-kubelet.conf
  scp ${DEPLOY_DIR}/kubeconfig/kube-proxy.conf ${node}:${KUBECONFIG_DIR}
done

# Push the master kubeconfig files to every master
for master in {master1,master2,master3};do
  ssh root@${master} "mkdir -p ${KUBECONFIG_DIR}"
  scp ${DEPLOY_DIR}/kubeconfig/token.csv $master:${KUBECONFIG_DIR}
  scp ${DEPLOY_DIR}/kubeconfig/kube-controller-manager-${master}.conf $master:${KUBECONFIG_DIR}/kube-controller-manager.conf
  scp ${DEPLOY_DIR}/kubeconfig/kube-scheduler-${master}.conf $master:${KUBECONFIG_DIR}/kube-scheduler.conf
done

# Copy admin.conf to the management machine; the haproxy host is used here
ssh root@${KUBE_API_PROXY_IP} "mkdir -p ${ADMIN_KUBECONFIG_DIR}"
scp ${DEPLOY_DIR}/kubeconfig/admin.conf ${KUBE_API_PROXY_IP}:${ADMIN_KUBECONFIG_DIR}/config
```
```bash
mkdir -p ${DEPLOY_DIR}/{node,master,etcd}/systemd-unit-files
```
Each master's unit file must use that node's own IP address; the loops below template one file per master.
```bash
# kube-apiserver.service
for master in {master1,master2,master3};do
cat << EOF > ${DEPLOY_DIR}/master/systemd-unit-files/kube-apiserver-${master}.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=network.target

[Service]
ExecStart=/usr/local/bin/kube-apiserver \\
  --advertise-address=${IP_LIST[${master}]} \\
  --bind-address=${IP_LIST[${master}]} \\
  --secure-port=6443 \\
  --insecure-port=0 \\
  --authorization-mode=Node,RBAC \\
  --enable-admission-plugins=NodeRestriction \\
  --enable-bootstrap-token-auth=true \\
  --token-auth-file=${KUBECONFIG_DIR}/token.csv \\
  --service-cluster-ip-range=${SERVICE_CLUSTER_IP_RANGE} \\
  --service-node-port-range=${SERVICE_NODE_PORT_RANGE} \\
  --client-ca-file=${K8S_PKI_DIR}/ca.pem \\
  --tls-cert-file=${K8S_PKI_DIR}/kube-apiserver.pem \\
  --tls-private-key-file=${K8S_PKI_DIR}/kube-apiserver-key.pem \\
  --service-account-key-file=${K8S_PKI_DIR}/ca-key.pem \\
  --etcd-cafile=${ETCD_PKI_DIR}/ca.pem \\
  --etcd-certfile=${K8S_PKI_DIR}/apiserver-etcd-client.pem \\
  --etcd-keyfile=${K8S_PKI_DIR}/apiserver-etcd-client-key.pem \\
  --etcd-servers=https://${IP_LIST["etcd1"]}:2379,https://${IP_LIST["etcd2"]}:2379,https://${IP_LIST["etcd3"]}:2379 \\
  --kubelet-certificate-authority=${K8S_PKI_DIR}/ca.pem \\
  --kubelet-client-certificate=${K8S_PKI_DIR}/api-kubelet-client.pem \\
  --kubelet-client-key=${K8S_PKI_DIR}/api-kubelet-client-key.pem \\
  --service-account-key-file=${K8S_PKI_DIR}/service-account.pem \\
  --service-account-signing-key-file=${K8S_PKI_DIR}/service-account-key.pem \\
  --service-account-issuer=api \\
  --allow-privileged=true \\
  --apiserver-count=3 \\
  --audit-log-maxage=30 \\
  --audit-log-maxbackup=3 \\
  --audit-log-maxsize=100 \\
  --audit-log-path=/var/lib/audit.log \\
  --event-ttl=1h \\
  --v=2
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
done
# --admission-control: since 1.10 the admission plugins below are enabled by default:
#   "NamespaceLifecycle, LimitRanger, ServiceAccount, TaintNodesByCondition, Priority,
#    DefaultTolerationSeconds, DefaultStorageClass, StorageObjectInUseProtection, PersistentVolumeClaimResize,
#    MutatingAdmissionWebhook, ValidatingAdmissionWebhook, RuntimeClass, ResourceQuota"
#   For additional admission plugins, see the Kubernetes admission controller documentation.
# --enable-bootstrap-token-auth: enable bootstrap-token authentication, see the
#   [official documentation](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/)
# Per the issue and PR below, the following three flags secure the master -> kubelet connection (mainly kubectl logs/exec):
# [issue: 14700](https://github.com/kubernetes/kubernetes/pull/14700)
# [PR: 31562](https://github.com/kubernetes/kubernetes/pull/31562)
# [kubelet-authentication-authorization](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-authentication-authorization/)
# --kubelet-certificate-authority
# --kubelet-client-certificate
# --kubelet-client-key

# kube-controller-manager.service
for master in {master1,master2,master3};do
cat << EOF > ${DEPLOY_DIR}/master/systemd-unit-files/kube-controller-manager-${master}.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-controller-manager \\
  --bind-address=127.0.0.1 \\
  --controllers=*,bootstrapsigner,tokencleaner \\
  --allocate-node-cidrs=true \\
  --service-cluster-ip-range=${SERVICE_CLUSTER_IP_RANGE} \\
  --cluster-cidr=${POD_CLUSTER_IP_RANGE} \\
  --cluster-name=kubernetes \\
  --kubeconfig=${KUBECONFIG_DIR}/kube-controller-manager.conf \\
  --root-ca-file=${K8S_PKI_DIR}/ca.pem \\
  --cluster-signing-cert-file=${K8S_PKI_DIR}/ca.pem \\
  --cluster-signing-key-file=${K8S_PKI_DIR}/ca-key.pem \\
  --use-service-account-credentials=true \\
  --service-account-private-key-file=${K8S_PKI_DIR}/ca-key.pem \\
  --leader-elect=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
done

# kube-scheduler.service
for master in {master1,master2,master3};do
cat << EOF > ${DEPLOY_DIR}/master/systemd-unit-files/kube-scheduler-${master}.service
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-scheduler \\
  --bind-address=127.0.0.1 \\
  --kubeconfig=${KUBECONFIG_DIR}/kube-scheduler.conf \\
  --leader-elect=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
done
```
If you change --service-cluster-ip-range (kube-apiserver, kube-controller-manager) after the cluster has been created, you may hit this error:
"message": "Cluster IP *.*.*.* is not within the service CIDR *.*.*.*/**; please recreate service"
Workaround: list the services with kubectl get services --all-namespaces to find the built-in kubernetes service created when the cluster came up, then delete it; the apiserver recreates it automatically inside the new CIDR (see the commands below). Do not do this on a production cluster that already runs application services!
```bash
for etcd in {etcd1,etcd2,etcd3};do
cat << EOF > ${DEPLOY_DIR}/etcd/systemd-unit-files/etcd-${etcd}.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd \\
  --name=${etcd} \\
  --client-cert-auth=true \\
  --trusted-ca-file=${ETCD_PKI_DIR}/ca.pem \\
  --cert-file=${ETCD_PKI_DIR}/server.pem \\
  --key-file=${ETCD_PKI_DIR}/server-key.pem \\
  --peer-client-cert-auth=true \\
  --peer-trusted-ca-file=${ETCD_PKI_DIR}/ca.pem \\
  --peer-cert-file=${ETCD_PKI_DIR}/peer.pem \\
  --peer-key-file=${ETCD_PKI_DIR}/peer-key.pem \\
  --initial-advertise-peer-urls=https://${IP_LIST[${etcd}]}:2380 \\
  --listen-peer-urls=https://${IP_LIST[${etcd}]}:2380 \\
  --listen-client-urls=https://${IP_LIST[${etcd}]}:2379,https://127.0.0.1:2379 \\
  --advertise-client-urls=https://${IP_LIST[${etcd}]}:2379 \\
  --initial-cluster-token=etcd-cluster-0 \\
  --initial-cluster=etcd1=https://${IP_LIST["etcd1"]}:2380,etcd2=https://${IP_LIST["etcd2"]}:2380,etcd3=https://${IP_LIST["etcd3"]}:2380 \\
  --initial-cluster-state=new \\
  --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
done
```
```bash
# kubelet.service
for node in {node01,node02,node03};do
cat << EOF > ${DEPLOY_DIR}/node/systemd-unit-files/kubelet-${node}.service
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes

[Service]
WorkingDirectory=/var/lib/kubelet
ExecStart=/usr/local/bin/kubelet \\
  --address=${IP_LIST[${node}]} \\
  --hostname-override=${node} \\
  --network-plugin=cni \\
  --pod-infra-container-image=k8s.gcr.io/pause-amd64:3.0 \\
  --bootstrap-kubeconfig=${KUBECONFIG_DIR}/bootstrap-kubelet.conf \\
  --kubeconfig=${KUBECONFIG_DIR}/kubelet.conf \\
  --client-ca-file=${K8S_PKI_DIR}/ca.pem \\
  --cert-dir=${K8S_PKI_DIR} \\
  --tls-cert-file=${K8S_PKI_DIR}/api-kubelet-client.pem \\
  --tls-private-key-file=${K8S_PKI_DIR}/api-kubelet-client-key.pem \\
  --anonymous-auth=false \\
  --hairpin-mode promiscuous-bridge \\
  --serialize-image-pulls=false \\
  --cgroup-driver=systemd \\
  --cluster-dns=${SERVICE_CLUSTER_IP_RANGE%.*}.2 \\
  --cluster-domain=cluster.local \\
  --container-runtime=remote \\
  --container-runtime-endpoint=unix:///run/containerd/containerd.sock \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
done
# The cgroup driver matches the container runtime: both use systemd
# --cert-dir is where the kubelet stores the certificates signed by the master
# Per the issue and PR below, --client-ca-file secures the master -> kubelet connection (mainly kubectl logs/exec):
# [issue: 14700](https://github.com/kubernetes/kubernetes/pull/14700)
# [PR: 31562](https://github.com/kubernetes/kubernetes/pull/31562)
# [kubelet-authentication-authorization](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-authentication-authorization/)
# Per [issue: 63164](https://github.com/kubernetes/kubernetes/issues/63164),
# the key that TLS bootstrapping generates under --cert-dir is only used for kubelet -> apiserver.
# For apiserver -> kubelet authentication the two flags below must be set explicitly;
# otherwise the kubelet generates its own self-signed CA:
# --tls-cert-file
# --tls-private-key-file

# kube-proxy.service
for node in {node01,node02,node03};do
cat << EOF > ${DEPLOY_DIR}/node/systemd-unit-files/kube-proxy-${node}.service
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
ExecStart=/usr/local/bin/kube-proxy \\
  --logtostderr=true \\
  --v=0 \\
  --master=https://${KUBE_API_PROXY_IP}:443 \\
  --bind-address=${IP_LIST[${node}]} \\
  --hostname-override=${node} \\
  --kubeconfig=${KUBECONFIG_DIR}/kube-proxy.conf \\
  --cluster-cidr=${POD_CLUSTER_IP_RANGE} \\
  --proxy-mode=ipvs
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
done
```
```bash
# Push the master unit files
for master in {master1,master2,master3};do
  rsync -av ${DEPLOY_DIR}/master/systemd-unit-files/kube-apiserver-${master}.service \
    ${master}:/usr/lib/systemd/system/kube-apiserver.service
  rsync -av ${DEPLOY_DIR}/master/systemd-unit-files/kube-controller-manager-${master}.service \
    ${master}:/usr/lib/systemd/system/kube-controller-manager.service
  rsync -av ${DEPLOY_DIR}/master/systemd-unit-files/kube-scheduler-${master}.service \
    ${master}:/usr/lib/systemd/system/kube-scheduler.service
done

# Push the node unit files
for node in {node01,node02,node03};do
  rsync -av ${DEPLOY_DIR}/node/systemd-unit-files/kubelet-${node}.service \
    ${node}:/usr/lib/systemd/system/kubelet.service
  rsync -av ${DEPLOY_DIR}/node/systemd-unit-files/kube-proxy-${node}.service \
    ${node}:/usr/lib/systemd/system/kube-proxy.service
done

# Push the etcd unit files
for etcd in {etcd1,etcd2,etcd3};do
  rsync -av ${DEPLOY_DIR}/etcd/systemd-unit-files/etcd-${etcd}.service ${etcd}:/usr/lib/systemd/system/etcd.service
done
```
```bash
for etcd in {etcd1,etcd2,etcd3};do
  ssh root@${etcd} "mkdir -p /var/lib/etcd"
  ssh root@${etcd} "systemctl daemon-reload"
  ssh root@${etcd} "systemctl enable etcd"
done

# Start etcd on each etcd node by hand; all members must come up at roughly the same time
# for the initial cluster bootstrap to succeed.
systemctl start etcd

# Check the cluster
ETCDCTL_API=3 ${DEPLOY_DIR}/etcd/bin/etcdctl \
  --endpoints https://${IP_LIST["etcd1"]}:2379,https://${IP_LIST["etcd2"]}:2379,https://${IP_LIST["etcd3"]}:2379 \
  --cacert=${DEPLOY_DIR}/pki/etcd/ca.pem \
  --cert=${DEPLOY_DIR}/pki/etcd/peer.pem \
  --key=${DEPLOY_DIR}/pki/etcd/peer-key.pem \
  endpoint status
```
```bash
for master in {master1,master2,master3};do
  ssh root@${master} "systemctl daemon-reload"
  ssh root@${master} "systemctl start kube-apiserver kube-controller-manager kube-scheduler"
  ssh root@${master} "systemctl enable kube-apiserver kube-controller-manager kube-scheduler"
done
```
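Before moving on, it is worth checking that the apiserver answers through the haproxy endpoint. A quick sketch using the admin client certificate generated earlier:

```bash
# Query the version endpoint through the load balancer (certificate paths from the PKI steps above)
curl --cacert ${DEPLOY_DIR}/pki/kubernetes/ca.pem \
     --cert ${DEPLOY_DIR}/pki/kubernetes/admin.pem \
     --key ${DEPLOY_DIR}/pki/kubernetes/admin-key.pem \
     https://${KUBE_API_PROXY_IP}:443/version
```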
The steps below are part of kubelet TLS bootstrapping; see the official documentation for the details.
Create a ClusterRoleBinding that allows bootstrapping kubelets to create CSRs (certificate signing requests):
```bash
# Enable bootstrapping nodes to create CSRs
cat << EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: create-csrs-for-bootstrapping
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:node-bootstrapper
  apiGroup: rbac.authorization.k8s.io
EOF
```
Create a ClusterRoleBinding that allows kubelets to request and receive certificates:
```bash
# Approve all CSRs for the group "system:bootstrappers"
cat << EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: auto-approve-csrs-for-group
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
  apiGroup: rbac.authorization.k8s.io
EOF
```
Create a ClusterRoleBinding that allows kubelets to renew their own certificates:
```bash
# Approve renewal CSRs for the group "system:nodes"
cat << EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: auto-approve-renewals-for-nodes
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
  apiGroup: rbac.authorization.k8s.io
EOF
```
```bash
for node in {node01,node02,node03};do
  ssh root@${node} "mkdir -p /var/lib/kubelet"
  ssh root@${node} "systemctl daemon-reload && systemctl start kubelet kube-proxy && systemctl enable kubelet kube-proxy"
done
```
```bash
# List the CSR requests
kubectl get csr
# If automatic approval fails and requests are left Pending, approve them manually:
# kubectl get csr | awk '/Pending/ {print $1}' | xargs kubectl certificate approve
```
```bash
kubectl get componentstatuses
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}
etcd-2               Healthy   {"health":"true"}
etcd-1               Healthy   {"health":"true"}

kubectl get nodes
NAME     STATUS   ROLES    AGE   VERSION
node01   Ready    <none>   40m   v1.9.1
node02   Ready    <none>   3m    v1.9.1
node03   Ready    <none>   3m    v1.9.1
```
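To see which master currently leads (relevant to the `--server` note earlier), the coordination leases can be inspected; in 1.20+ both components use Lease objects by default, so something like this should work:

```bash
# The holderIdentity field names the master that currently holds the leader lock
kubectl -n kube-system get lease kube-controller-manager -o yaml | grep holderIdentity
kubectl -n kube-system get lease kube-scheduler -o yaml | grep holderIdentity
```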
For the CNI network plugin, pick whichever fits your needs; this document uses Calico. Compared with flannel, Calico is a layer-3 network and generally performs better.
Reference: deploying Calico on a self-managed Kubernetes cluster.
The manifest used here is the "self-managed, more than 50 nodes" Calico template (calico-typha).
```bash
mkdir -p ${DEPLOY_DIR}/kube-addon
curl https://docs.projectcalico.org/manifests/calico-typha.yaml -o ${DEPLOY_DIR}/kube-addon/calico.yaml
# The manifest contains the following resources
# - ConfigMap
# - CustomResourceDefinition > BGPConfiguration
# - CustomResourceDefinition > BGPPeer
# - CustomResourceDefinition > BlockAffinity
# - CustomResourceDefinition > ClusterInformation
# - CustomResourceDefinition > FelixConfiguration
# - CustomResourceDefinition > GlobalNetworkPolicy
# - CustomResourceDefinition > GlobalNetworkSet
# - CustomResourceDefinition > HostEndpoint
# - CustomResourceDefinition > IPAMBlock
# - CustomResourceDefinition > IPAMConfig
# - CustomResourceDefinition > IPAMHandle
# - CustomResourceDefinition > IPPool
# - CustomResourceDefinition > KubeControllersConfiguration
# - CustomResourceDefinition > NetworkPolicy
# - CustomResourceDefinition > NetworkSet
# - ClusterRoleBinding
# - ClusterRole
# - ClusterRoleBinding
# - Service
# - Deployment
# - PodDisruptionBudget
# - DaemonSet
# - ServiceAccount
# - Deployment
# - ServiceAccount
# - PodDisruptionBudget

# Point CALICO_IPV4POOL_CIDR at the pod CIDR
sed -i 's/# - name: CALICO_IPV4POOL_CIDR/- name: CALICO_IPV4POOL_CIDR/g' ${DEPLOY_DIR}/kube-addon/calico.yaml
sed -i "s|# value: \"192.168.0.0/16\"| value: \"${POD_CLUSTER_IP_RANGE}\"|g" ${DEPLOY_DIR}/kube-addon/calico.yaml

kubectl apply -f ${DEPLOY_DIR}/kube-addon/calico.yaml
```
```bash
kubectl get pods --namespace kube-system
NAME                                       READY   STATUS    RESTARTS   AGE
calico-kube-controllers-78d6f96c7b-tf7bx   1/1     Running   3          2m46s
calico-node-829jl                          1/1     Running   1          2m46s
calico-node-h2jf4                          1/1     Running   1          2m46s
calico-node-svntn                          1/1     Running   1          2m46s
calico-typha-c5bb98bbd-xqtt7               1/1     Running   0          38s
```
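Since this is the over-50-nodes template, the Typha replica count can be raised as the cluster grows; the deployment name below is the one from the manifest, and the replica count is just an example:

```bash
# Scale Typha (Calico's guidance is roughly one replica per ~200 nodes)
kubectl -n kube-system scale deployment calico-typha --replicas=3
```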
Recent Kubernetes releases recommend CoreDNS as the cluster DNS; see K8S-COREDNS & COREDNS-GITHUB.
Create the CoreDNS deployment script:
```bash
# The coredns deploy script needs jq
yum install epel-release -y
yum install jq -y

# Download the coredns deployment files
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh -O ${DEPLOY_DIR}/kube-addon/deploy.sh
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed -O ${DEPLOY_DIR}/kube-addon/coredns.yaml.sed

# The upstream script uses hyphens in its function names, which plain sh does not accept
sed -i "s/translate-kube-dns-configmap/translate_kube_dns_configmap/g" ${DEPLOY_DIR}/kube-addon/deploy.sh
sed -i "s/kube-dns-federation-to-coredns/kube_dns_federation_to_coredns/g" ${DEPLOY_DIR}/kube-addon/deploy.sh
sed -i "s/kube-dns-upstreamnameserver-to-coredns/kube_dns_upstreamnameserver_to_coredns/g" ${DEPLOY_DIR}/kube-addon/deploy.sh
sed -i "s/kube-dns-stubdomains-to-coredns/kube_dns_stubdomains_to_coredns/g" ${DEPLOY_DIR}/kube-addon/deploy.sh

# Deploy coredns
# -i sets the cluster DNS IP
sh ${DEPLOY_DIR}/kube-addon/deploy.sh -i ${SERVICE_CLUSTER_IP_RANGE%.*}.2 | kubectl apply -f -

# Check the coredns deployment
kubectl -n kube-system get deploy coredns
NAME      READY   UP-TO-DATE   AVAILABLE   AGE
coredns   1/1     1            1           97s
```
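A quick way to confirm cluster DNS works end to end (assumes the busybox image can be pulled on the nodes):

```bash
# Run a throwaway pod and resolve the kubernetes service through CoreDNS
kubectl run dns-test --rm -it --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default
```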
After reading the Kubernetes PV documentation: this cluster runs on bare metal, so the cloud-provider backed options are out, which leaves GlusterFS or Ceph. A quick search suggests GlusterFS has a slight performance edge over Ceph, and it also comes with heketi, which exposes a RESTful API for managing GlusterFS volumes. I have not benchmarked the two myself, but I lean towards GlusterFS.
Also, I had not realized that a storage backend needs to be in place before Kubernetes persistent storage can be used, which cost me a small detour.
Every storage node needs the mount.glusterfs command and a few kernel modules:

```bash
# Check the raw disk situation
fdisk -l
# Look for an unused disk in the output

# Check and load the required kernel modules
lsmod | grep dm_snapshot || modprobe dm_snapshot
lsmod | grep dm_mirror || modprobe dm_mirror
lsmod | grep dm_thin_pool || modprobe dm_thin_pool
# Verify they are loaded
lsmod | egrep '^(dm_snapshot|dm_mirror|dm_thin_pool)'
# Expected output:
# dm_thin_pool 66358 0
# dm_snapshot 39103 0
# dm_mirror 22289 0

# Install the mount.glusterfs command
yum install -y https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-7/glusterfs-libs-7.1-1.el7.x86_64.rpm
yum install -y https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-7/glusterfs-7.1-1.el7.x86_64.rpm
yum install -y https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-7/glusterfs-client-xlators-7.1-1.el7.x86_64.rpm
yum install -y https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-7/glusterfs-fuse-7.1-1.el7.x86_64.rpm
# The distro default is glusterfs 3.12; 7.1 is installed by hand to match the version the gk-deploy script uses

# Check the glusterfs version
glusterfs --version
glusterfs 7.1
mount.glusterfs -V
glusterfs 7.1
```
```bash
# step 1. Download the installer
# The run below used the latest commit at the time:
#   Latest commit 7246eb4 on Jul 19, 2019
# That revision is not compatible with kubernetes 1.17 and needs quite a few changes.
# I patched and debugged it by hand; upstream will probably fix these eventually,
# so judge for yourself whether the steps below are still needed.
git clone https://github.com/gluster/gluster-kubernetes.git
cd gluster-kubernetes/deploy

# step 2. Prepare the topology file
# ***************************
# Pay attention to:
# - hostnames.manage holds the node hostname
# - hostnames.storage holds the node IP
# - devices holds the raw disk names
# ***************************
cat << EOF > topology.json
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": [
                "node01"
              ],
              "storage": [
                "${IP_LIST['node01']}"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdb"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "node02"
              ],
              "storage": [
                "${IP_LIST['node02']}"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdb"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "node03"
              ],
              "storage": [
                "${IP_LIST['node03']}"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdb"
          ]
        }
      ]
    }
  ]
}
EOF

# Make sure every node is Ready before installing
kubectl get nodes

# step 3. Issues to fix (upstream PRs exist, but at the time of writing they still have to be applied by hand)
# 1) k8s 1.17 moved to the apps/v1 API
sed -ir "s|apiVersion: extensions/v1beta1|apiVersion: apps/v1|g" kube-templates/deploy-heketi-deployment.yaml
sed -ir "s|apiVersion: extensions/v1beta1|apiVersion: apps/v1|g" kube-templates/gluster-s3-template.yaml
sed -ir "s|apiVersion: extensions/v1beta1|apiVersion: apps/v1|g" kube-templates/glusterfs-daemonset.yaml
sed -ir "s|apiVersion: extensions/v1beta1|apiVersion: apps/v1|g" kube-templates/heketi-deployment.yaml
sed -ir "s|apiVersion: extensions/v1beta1|apiVersion: apps/v1|g" ocp-templates/glusterfs-template.yaml

# 2) error: error validating "STDIN": error validating data: ValidationError(DaemonSet.spec): missing required
#    field "selector" in io.k8s.api.apps.v1.DaemonSetSpec; if you choose to ignore these errors, turn validation off with --validate=false
# k8s 1.17 requires an explicit pod selector.
# Confirm the following blocks exist in each template; add them by hand if they are missing.
vim kube-templates/glusterfs-daemonset.yaml
spec:
  selector:
    matchLabels:
      name: glusterfs
  template:
    metadata:
      labels:
        name: glusterfs

vim kube-templates/deploy-heketi-deployment.yaml
spec:
  selector:
    matchLabels:
      name: deploy-heketi
  template:
    metadata:
      labels:
        name: deploy-heketi

vim kube-templates/gluster-s3-template.yaml
- kind: Deployment
  spec:
    selector:
      matchLabels:
        name: gluster-s3
    template:
      metadata:
        labels:
          name: gluster-s3

vim kube-templates/heketi-deployment.yaml
spec:
  selector:
    matchLabels:
      name: heketi
  template:
    metadata:
      labels:
        name: heketi

# 3) Determining heketi service URL ... Error: unknown flag: --show-all
#    See 'kubectl get --help' for usage.
#    Failed to communicate with heketi service.
# kubectl v1.17 no longer has the --show-all flag
vim gk-deploy
# Change
#   heketi_pod=$(${CLI} get pod --no-headers --show-all --selector="heketi" | awk '{print $1}')
# to
#   heketi_pod=$(${CLI} get pod --no-headers --selector="heketi" | awk '{print $1}')

# step 4. Deploy heketi and GlusterFS
ADMIN_KEY=adminkey
USER_KEY=userkey
./gk-deploy -g -y -v --admin-key ${ADMIN_KEY} --user-key ${USER_KEY}
# If the first run fails and you need to reinstall, clean up the previous attempt first:
# Remove the resources and services
#   ./gk-deploy -g --abort --admin-key adminkey --user-key userkey
# List the logical volumes
#   lvs
# Remove the logical volumes
#   lvremove /dev/vg
# Wipe the disk (run on the storage node)
#   wipefs -a /dev/sdc

# step 5. Check that heketi and glusterfs are running
export HEKETI_CLI_SERVER=$(kubectl get svc/heketi --template 'http://{{.spec.clusterIP}}:{{(index .spec.ports 0).port}}')
echo $HEKETI_CLI_SERVER
curl $HEKETI_CLI_SERVER/hello
# Hello from Heketi
# If this times out, check whether the master is also a node running kube-proxy;
# otherwise take the address and run the curl from one of the nodes.

# step 6. Create a StorageClass so PVs are provisioned automatically for PVCs
SECRET_KEY=`echo -n "${ADMIN_KEY}" | base64`
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: heketi-secret
  namespace: default
data:
  # base64 encoded password. E.g.: echo -n "mypassword" | base64
  key: ${SECRET_KEY}
type: kubernetes.io/glusterfs
EOF

cat << EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-storage
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "${HEKETI_CLI_SERVER}"
  restuser: "admin"
  secretNamespace: "default"
  secretName: "heketi-secret"
  volumetype: "replicate:3"
EOF
# Note: PVC resources must reference the "glusterfs-storage" StorageClass configured above

kubectl get nodes,pods
NAME          STATUS   ROLES    AGE    VERSION
node/node01   Ready    <none>   5d3h   v1.17.0
node/node02   Ready    <none>   5d3h   v1.17.0
node/node03   Ready    <none>   5d3h   v1.17.0

NAME                          READY   STATUS    RESTARTS   AGE
pod/glusterfs-bhprz           1/1     Running   0          45m
pod/glusterfs-jt64n           1/1     Running   0          45m
pod/glusterfs-vkfp5           1/1     Running   0          45m
pod/heketi-779bc95979-272qk   1/1     Running   0          38m
```
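As a usage sketch, a PVC that requests a volume from the glusterfs-storage class could look like this (the claim name and size are placeholders):

```bash
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc
spec:
  storageClassName: glusterfs-storage
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
EOF
```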