2025-08-10
请注意,本文编写于 122 天前,最后修改于 122 天前,其中某些信息可能已经过时。

目录

5、云原生
1、ElasticSearch
1、单机部署
2、ES故障排查技巧
3、ElasticSearch集群部署
4、ES相关的面试题
2、部署kibana
3、部署filebeat
1、filebeat采集日志文件细节分析
2、今日内容回顾
3、Ansible一键部署EFK架构
1、安装ES
2、安装kibana
3、安装filebeat
4、filebeat的模块管理
1、filebeat的模块案例之nginx
1、加模块与不加模块的区别
2、EFK架构基础环境准备
3、ELK画图
4、kibana数据排错原因
5、filebeat的多实例实战案例
6、filebeat写入数据到不同的ES索引
7、filebeat采集json数据
8、filestream类型采集tomcat日志案例
9、filestream解析json格式
10、filebeat多行合并
5、filebeat处理器processors
6、filebeat数据流走向
4、Logstash
1、安装Logstash
2、logstash对接filebeat实战
3、Logstash的pipeline内部结构
4、Logstash数据分析实战案例
5、logstash采集文本日志
6、geoip分析经纬度
7、logstash的多实例
8、logstash的多分支语法
9、logstash的多pipeline配置
6、ES集群加密
1、使用明文密码方式
1、filebeat对接ES加密集群
2、Logstash对接ES加密集群
3、ES7重置elastic管理员密码案例
2、使用api_key方式
1、配置api_key
2、基于ES的api创建api-key并实现权限管理(扩展)
3、ES集群配置https证书
4、kibana对接ES的https加密集群
5、logstash基于api-key访问ES集群
6、RBAC案例
练习
7、Kibana登录报错
8、生产环境忘记ES7密码如何操作
7、集群优化
8、安装ES8
1、ES8单点环境部署
2、ES8重置管理员elatic密码
3、部署ES8集群
4、部署kibana对接ES8集群
5、filebeat对接ES8实战
6、logstash对接ES8集群
7、ES8和ES7对比部署
9、zookeeper
1、zookeeper单点部署
2、zookeeper的基本使用
3、zookeeper的集群部署
4、zookeeper的watch机制及znode类型
5、zookeeper的JVM调优
6、zookeeper图形化管理工具(扩展)
10、kafka
1、单点部署实战
2、kafka集群部署
3、kafka常用术语
4、启动生产者和消费者验证kafka集群
5、Kafka和ZooKeeper的关系
6、KRaft
7、kafka 2.8+ 为什么要移除zookeeper组件呢?
8、kafka常用术语精讲
9、kafka的topics管理
10、kafka的消费者组管理
11、kafka的JVM调优
12、kafka的图形化管理工具EFAK(了解)
13、kafka丢失数据问题
14、kafka消费数据延迟问题解决方案
15、filebeat对接kafka集群
16、logstsh对接kafka集群
17、企业级ELFK+KAFKA架构设计方案
11、docker
1、docker架构概述
1、docker架构图
2、安装docker
3、docker基于VPN配置代理
4、linux实现vpn翻墙
1、前言
2、设置 V2rayN
2、docker管理基础
1、镜像基础管理
2、容器管理基础
3、基于docker部署游戏镜像案例
4、容器名称修改,在容器中执行命令及IP地址查看
5、基于docker部署WordPress
6、外部是如何访问容器的
7、docker安装redis
8、docker安装mq
3、故障排查命令
4、容器4种重启策略
5、docker的单机网络类型
6、基于docker部署ES8
1、单点部署
2、基于docker部署ES8集群
3、部署kibana
7、自定义网络案例之部署zabbix系统
8、虚拟机磁盘不足解决方案
9、docker底层特性
1、内核转发参数
2、iptables
3、overlayFS概述
4、chroot
5、cgroup
6、namespace
10、docker支持跨主机互联的解决方案
2、docker容器的数据持久化
3、docker的存储卷实战
4、docker底层Linux特性之cgroup
5、docker底层Linux特性之namespace技术
11、制作镜像案例
1、手动制作镜像
2、Dockerfile
1、Dockerfile构建多服务镜像案例
2、ENV环境变量实战案例之自定义root密码
3、docker容器无法访问时可能遇到的错误总结
4、ARG和ENV的区别实战案例
5、ENTRYPOINT和CMD区别验证
6、EXPOSE端口暴露案例
7、WORKDIR指定进入容器默认的工作目录
8、VOLUME指令对容器指定路径数据持久化
9、HEALCHECK检查容器服务是否健康及日志输出到docker logs
10、USER指令可以指定运行容器的用户
11、ONBUILD指定基础镜像触发器
12、SHELL声明解释器案例
13、多阶段构建
14、通过dockerignore忽略不必要的文件
3、Dockerfile优化
4、scratch自定义基础镜像
5、docker-compose
1、docker-compose实现单个服务案例
2、docker-compose实现环境变量传递和启动命令参数
3、docker-compose服务引用存储卷
4、docker-compose服务引用自定义网络
5、docker-compose实现服务依赖之wordpress
6、docker-compose部署ES和kibana环境及env变量文件使用
7、Services引用secrets实战案例
8、docker-compose部署zabbix案例
6、docker-compose编译镜像
7、docker-registry私有镜像仓库部署
12、harbor企业级镜像仓库实战
1、部署harbor镜像仓库
2、harbor实现镜像的基础管理
4、harbor同步数据
5、harbor的高可用解决方案
1、仓库复制(官方推荐)
6、harbor仓库密码base64编码存储路径
13、将本地镜像推送到第三方阿里云服务器
14、docker hub官方仓库使用
15、containerd基本使用
1、containerd安装
2、Containerd的名称空间管理【用来隔离容器,镜像,任务等资源信息】
3、ctr镜像管理
4、容器管理
5、containerd连接harbor服务
6、Containerd实现数据持久化
7、docker集成Containerd
8、harbor的https实战
9、docker访问https的harbor
10、containerd访问https的harbor
12、kubernetes
1、kubernetes安装与介绍
1、Kubernetes集群架构
2、Kubernetes的三种网络类型
3、Kubernetes的部署方式
4、k8s集群环境准备
5、基于kubeadm组件初始化K8S的master组件
6、基于kubeadm部署worker组件
7、部署CNI插件之Flannel
8、harbor基于自建证书https实战
9、配置harbor服务端证书
10、K8S节点配置docker客户端证书实战
11、k8s部署服务的三种方式
2、pod
1、pod的容器类型
2、删除容器对Pod的IP地址变化
3、部署MySQL服务到K8S集群
3、服务的暴露方式
1、在k8s集群部署WordPress
2、k8s部署jenkins实战案例基于hostPort暴露
3、基于port-forward暴露sonarQube服务
3、k8s部署ES单点案例
4、k8s部署kibana对接ES
5、一个Pod运行多个容器
4、故障排查技巧
1、故障排查技巧describe
2、故障排查技巧之cp
3、故障排查技巧之exec
4、故障排查技巧之command&args
5、故障排查技巧之explain
6、故障排查技巧之logs
7、查看Pod容器重启前的日志

5、云原生

1、ElasticSearch

image

bash
- ElasticStack课程内容包含的技术栈: 可以轻松应对PB级别的数据
    - ElasticSearch: 数据库,用于数据存储和检索。
    - filebeat: 数据采集,将数据写入ES集群。
    - Kibana: 数据展示,支持DSL语句,从ES集群获取数据,并提供图形化界面。
    - Logstash: 做数据处理。
    - zookeeper: 分布式协调服务。
    - Kafka: 分布式消息队列。

在企业当中,有可能用到的架构: EFK,ELK,ELFK,ELFK+kafka

优化:
# 更新软件源
[root@elk91 ~]# apt update
[root@elk91 ~]# apt -y install gdm3 net-tools iputils-ping

# 修改sshd服务的配置文件
[root@elk92 ~]# cat /etc/ssh/sshd_config
PermitRootLogin yes

# 重启sshd服务
[root@elk92 ~]# systemctl restart sshd

# Ubuntu配置root的ssh登录,参考链接:
https://www.cnblogs.com/yinzhengjie/p/18257781

1、单机部署

bash
# 环境准备: 2 Core 4GB 50GB+ 10.0.0.91 elk91 10.0.0.92 elk92 10.0.0.93 elk93 - ElasticSearch单机部署 参考链接: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/install-elasticsearch.html # 1.下载ES软件包 wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.28-amd64.deb SVIP: [root@elk91 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/ES7/7.17.28/elasticsearch-7.17.28-amd64.deb # 2.安装ES [root@elk91 ~]# dpkg -i elasticsearch-7.17.28-amd64.deb # 3.修改ES的配置文件 [root@elk91 ~]# vim /etc/elasticsearch/elasticsearch.yml ... [root@elk91 ~]# egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml cluster.name: weixiang-weixiang98-single path.data: /var/lib/elasticsearch path.logs: /var/log/elasticsearch network.host: 0.0.0.0 http.port: 9200 discovery.type: single-node [root@elk91 ~]# 相关参数说明: cluster.name: # 指定集群的名称。 path.data: # 数据的存储路径。 path.logs: # 日志的存储路径。 network.host: # 服务的监听地址。 http.port: # 服务的监听端口。 discovery.type: # 指定部署ES的模式,可以指定单点模式。 # 4.登录管理员用户 oldboy@elk93:~$ sudo su - [sudo] password for oldboy: # 输入密码1 配置ps1 [root@elk93 ~]#vim .bashrc [root@elk93 ~]#source .bashrc # 5.启动ES服务 [root@elk91 ~]# systemctl enable --now elasticsearch [root@elk91 ~]# ss -ntl | egrep "92|300" LISTEN 0 4096 *:9200 *:* LISTEN 0 4096 *:9300 *:* [root@elk91 ~]# # 6.访问测试 [root@elk01 /var/lib/elasticsearch]#curl http://43.139.47.66:9200 { "name" : "elk01", "cluster_name" : "weixiang-weixiang98-cluster", "cluster_uuid" : "_lgs7qe9R7uiXvWJbz9bVw", "version" : { "number" : "7.17.28", "build_flavor" : "default", "build_type" : "deb", "build_hash" : "139cb5a961d8de68b8e02c45cc47f5289a3623af", "build_date" : "2025-02-20T09:05:31.349013687Z", "build_snapshot" : false, "lucene_version" : "8.11.3", "minimum_wire_compatibility_version" : "6.8.0", "minimum_index_compatibility_version" : "6.0.0-beta1" }, "tagline" : "You Know, for Search" } # 7.查看集群的节点数量 [root@elk01 /var/lib/elasticsearch]#curl http://43.139.47.66:9200/_cat/nodes 10.1.24.4 62 97 1 0.06 0.04 0.01 cdfhilmrstw - elk03 # 这个-表示从节点 10.1.20.5 51 97 2 0.15 0.12 0.06 cdfhilmrstw - elk01 # 这个-表示从节点 10.1.24.13 14 96 1 0.00 0.02 0.02 cdfhilmrstw * elk02 # 这个*表示主节点

2、ES故障排查技巧

bash
- ES故障排查技巧
#1、查看服务配置文件,所有服务都适用
systemctl cat elasticsearch.service

#2、实时查看ElasticSearch服务的日志,所有服务都适用
journalctl -u elasticsearch.service -f

#3、查看日志观察详细的日志信息
tail -f /var/log/elasticsearch/weixiang-weixiang98-single.log

#4、手动启动ES服务,观察是否有错误信息输出;如果进程被直接kill,则可能是内存不足导致。
# 具体操作如下(调小JVM堆内存):
[root@elk91 ~]# vim /etc/elasticsearch/jvm.options
...
-Xms256m
-Xmx256m

温馨提示:
- 建议各位同学学习环境无论是IP地址还是主机名一定要和我保持一致。
- 虚拟机模板必须"干净",不要启动其他服务,避免和我们即将学习课程的服务存在资源("配置文件"、"端口"、"内存"、"CPU"等)抢占。

3、ElasticSearch集群部署

bash
1.停止旧集群服务 [root@elk91 ~]# systemctl disable --now elasticsearch.service 2.清空原始数据 [root@elk91 ~]# rm -rf /var/{log,lib}/elasticsearch/* 3.修改ES的配置文件 [root@elk91 ~]# vim /etc/elasticsearch/elasticsearch.yml ... [root@elk91 ~]# egrep -v "^$|^#" /etc/elasticsearch/elasticsearch.yml cluster.name: weixiang-weixiang98-cluster path.data: /var/lib/elasticsearch path.logs: /var/log/elasticsearch network.host: 0.0.0.0 http.port: 9200 discovery.seed_hosts: ["10.1.20.5", "10.1.24.13","10.1.24.4"] cluster.initial_master_nodes: ["10.1.20.5", "10.1.24.13","10.1.24.4"] # 云ununtu配置主机名, # 1、cluster.initial_master_nodes: ["elk01", "elk02","elk03"] # 2、放行安全组规则9020,9030,5601 4.拷贝ES程序到其他节点 [root@elk91 ~]# scp elasticsearch-7.17.28-amd64.deb 10.0.0.92:~ [root@elk91 ~]# scp elasticsearch-7.17.28-amd64.deb 10.0.0.93:~ 5.其他节点安装ES服务 [root@elk92 ~]# dpkg -i elasticsearch-7.17.28-amd64.deb [root@elk93 ~]# dpkg -i elasticsearch-7.17.28-amd64.deb 6.拷贝配置文件到其他节点 [root@elk91 ~]# scp /etc/elasticsearch/elasticsearch.yml 10.1.24.13:/etc/elasticsearch [root@elk91 ~]# scp /etc/elasticsearch/elasticsearch.yml 10.1.24.4:/etc/elasticsearch 7.集群启动 [root@elk91 ~]# systemctl enable --now elasticsearch.service [root@elk91 ~]# ss -ntl | egrep "92|300" LISTEN 0 4096 *:9200 *:* LISTEN 0 4096 *:9300 *:* [root@elk91 ~]# [root@elk92 ~]# systemctl enable --now elasticsearch.service [root@elk92 ~]# ss -ntl | egrep "92|300" LISTEN 0 4096 *:9200 *:* LISTEN 0 4096 *:9300 *:* [root@elk92 ~]# [root@elk93 ~]# systemctl enable --now elasticsearch.service [root@elk93 ~]# ss -ntl | egrep "92|300" LISTEN 0 4096 *:9300 *:* LISTEN 0 4096 *:9200 *:* [root@elk93 ~]# 8.测试验证 [root@elk91 ~]# for i in `seq 220 222`; do curl -s 172.22.233.${i}:9200 | egrep "cluster_uuid";done [root@elk91 ~]# for i in `seq 134.126.249 134.173.173 148.220.37`; do curl -s 8.${i}:9200 | egrep "cluster_uuid";done "cluster_uuid" : "rvKtmLJURKaYV5mUI3LTAg", "cluster_uuid" : "rvKtmLJURKaYV5mUI3LTAg", "cluster_uuid" : "rvKtmLJURKaYV5mUI3LTAg", [root@elk91 ~]# [root@elk91 ~]# [root@elk91 ~]# curl 10.1.20.5:9200/_cat/nodes 10.0.0.93 5 97 4 0.07 0.14 0.06 cdfhilmrstw - elk93 10.0.0.92 5 97 5 0.05 0.12 0.04 cdfhilmrstw * elk92 10.0.0.91 25 97 5 0.04 0.10 0.03 cdfhilmrstw - elk91 [root@elk91 ~]# - 测试集群是否可以正常读写 1.写入数据 curl --location --request POST 'http://10.1.24.13:9200/_bulk' \ --header 'Content-Type: application/json' \ --data-raw '{ "create" : { "_index" : "weixiang-weixiang98", "_id" : "1001" } } { "name" : "猪八戒","hobby": ["猴哥","高老庄"] } { "create" : { "_index" : "weixiang-weixiang98", "_id" : "1002" } } { "name" : "沙和尚","hobby": ["流沙河","挑行李"] } { "create" : { "_index" : "weixiang-weixiang98", "_id" : "1003" } } { "name" : "白龙马","hobby": ["大师兄,师傅被妖怪抓走啦"] } ' 2.查询数据 [root@elk91 ~]#apt -y install jq [root@elk91 ~]#curl -s --location --request GET '8.148.219.35:9200/weixiang-weixiang98/_search' \ --header 'Content-Type: application/json' \ --data-raw '{ "query": { "match": { "hobby": "猴哥" } } }' | jq 3.删除索引 curl --location --request DELETE '10.0.0.92:9200/weixiang-weixiang98'

4、ES相关的面试题

image

bash
ES集群架构:用户的实际数据(例如 name:老男孩)基于 weixiang-linux 索引(类似于 MySQL 的表)存放到多个分片里面,每个分片存放不同的数据,从而实现数据的分布式存储。为了保证高可用,每个主分片可以有0到多个副本分片;如果主分片挂掉,副本分片能够接管。分片里面存放的是文档,文档才是用户实实在在的数据。
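下面给出一个最小的动手验证示例(非课程原文,示例中的索引名 weixiang-linux 与ES节点地址均为假设,请按自己的环境替换),用来直观感受"索引-分片-副本-文档"的关系:

```bash
# 1.创建一个3分片、1副本的索引
curl -X PUT 'http://10.0.0.91:9200/weixiang-linux' \
  -H 'Content-Type: application/json' \
  -d '{"settings": {"number_of_shards": 3, "number_of_replicas": 1}}'

# 2.查看分片和副本在各节点上的分布(prirep列中p为主分片,r为副本分片)
curl 'http://10.0.0.91:9200/_cat/shards/weixiang-linux?v'

# 3.写入一个文档并查询,文档是数据存储的最小单元
curl -X POST 'http://10.0.0.91:9200/weixiang-linux/_doc/1' \
  -H 'Content-Type: application/json' \
  -d '{"name": "老男孩"}'
curl 'http://10.0.0.91:9200/weixiang-linux/_doc/1?pretty'
```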
bash
# ES集群的常用术语: # 索引: Index 客户端对ES进行数据读写的逻辑单元。 # 分片: Shard 一个索引最少有1个或多个分片,是数据的实际存储载体。 分片不可切分,隶属于某个ES节点,分片可以从某个节点迁移到其他节点。 如果说一个索引只有1个分片的话,该索引将无法充分利用集群资源。因为数据只能存放在一个分片 分片实现数据的分布式存储 # 副本: replica 副本是针对分片而言的,用于对分片的数据进行备份。 一个分片可以有0个或多个副本。 当分片的副本数量为0时,则可能会存在数据丢失的风险。 存储同一数据的分片0跟副本0不能在同一节点,因为节点挂了数据会丢失 副本实现数据的高可用 # 文档: document 文档是用户进行数据存储的最小单元。文档包含元数据和源数据。 元数据: 用于描述源数据的数据。 源数据: 用户实际存储的数据。 举个例子: 源数据: {"name": "孙悟空","hobby": "紫霞仙子"} ES中存储的样子: { _index: "weixiang-weixiang98", _type: "_doc", _id: "XXXXXX" ... _source: {"name": "孙悟空","hobby": "紫霞仙子"} } 其中源数据就是"_source"字段的内容,而"_source","_type","_index","_id"都是用来描述源数据的数据,这些字段称之为"元数据"# ES相关的面试题: - Q1: ES集群颜色有几种,分别代表什么含义? ES集群颜色有三种,分别为: Red,Yellow,Green。 Green: 代表所有的主分片和副本分片都可以正常访问。 Yellow: 代表所有的主分片可以正常访问,但有部分副本分片无法访问。 Red: 代表有副本主分片无法访问。 - Q2: ES集群监听端口是多少,各自协议及作用是啥? 9200:http|https ES集群对外的访问端口。 9300:tcp ES集群内部数据传输及master选举的端口。 温馨提示: ES集群启动时优先启动的是9300,而后才是9200。 # ES集群故障排查思路: 1.检查配置文件是否正确 egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml 2.尝试清空数据 systemctl disable --now elasticsearch.service rm -rf /var/{log,lib}/elasticsearch/* 3.启动服务 systemctl enable --now elasticsearch.service 4.查看日志 cat /var/log/elasticsearch/
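补充一个最小的健康检查示例(非课程原文,ES地址为假设,按需替换),可以直接看到上面提到的集群颜色与两个端口:

```bash
# 查看集群健康状态,返回的status字段即为green/yellow/red
curl 'http://10.0.0.91:9200/_cluster/health?pretty'

# 用_cat接口快速查看集群状态和分片统计
curl 'http://10.0.0.91:9200/_cat/health?v'

# 验证9200(对外访问)和9300(集群内部通信/选举)端口的监听情况
ss -ntl | egrep "9200|9300"
```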

2、部署kibana

bash
# 1.下载kibana 软件包 wget https://artifacts.elastic.co/downloads/kibana/kibana-7.17.28-amd64.deb SVIP: [root@elk91 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/ES7/7.17.28/kibana-7.17.28-amd64.deb # 2.安装kibana [root@elk91 ~]# dpkg -i kibana-7.17.28-amd64.deb # 3.修改kibana的配置文件 [root@elk91 ~]# vim /etc/kibana/kibana.yml ... [root@elk91 ~]# egrep -v "^#|^$" /etc/kibana/kibana.yml server.port: 5601 server.host: "0.0.0.0" elasticsearch.hosts: ["http://10.1.20.5:9200","http://10.1.24.13:9200","http://10.1.24.4:9200"] i18n.locale: "zh-CN" [root@elk91 ~]# 相关参数说明: server.port: # 服务的监听端口。 server.host: # 服务的监听地址。 elasticsearch.hosts: # 指定ES集群地址。 i18n.locale: "zh-CN" # kibana图形化展示使用的语言。 # 4.启动服务 [root@elk91 ~]# systemctl enable --now kibana.service [root@elk91 ~]# ss -ntl | grep 5601 LISTEN 0 511 0.0.0.0:5601 0.0.0.0:* [root@elk91 ~]# # 5.访问测试 http://43.139.47.66:5601/


3、部署filebeat

bash
1.下载软件包 wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.17.28-amd64.deb SVIP: [root@elk92 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/ES7/7.17.28/filebeat-7.17.28-amd64.deb 2.安装filebeat [root@elk92 ~]# dpkg -i filebeat-7.17.28-amd64.deb 3.编写配置文件 [root@elk92 ~]# mkdir /etc/filebeat/config [root@elk92 ~]# cd /etc/filebeat [root@elk92 filebeat]# cat config/01-stdin-to-console.yaml filebeat.inputs: # 配置输入源 - type: stdin # 指定输入类型为标准输入(键盘输入) output.console: # 配置输出目标到控制台 pretty: true # 启用美化输出格式 [root@elk92 filebeat]# 4.启动服务 [root@elk92 filebeat]# filebeat -e -c config/01-stdin-to-console.yaml # -e输出到终端,-c指定配置文件 ... 111111111111111111111111111111111111 # 键盘输入: { "@timestamp": "2025-06-19T08:37:35.519Z", "@metadata": { "beat": "filebeat", "type": "_doc", "version": "7.17.28" }, "agent": { "ephemeral_id": "60ff7608-8f0f-4cf1-abdb-bb7ed555c0b0", "id": "d2d0d6a3-a362-4400-b47a-0658acabe034", "name": "elk92", "type": "filebeat", "version": "7.17.28", "hostname": "elk92" }, "log": { "offset": 0, "file": { "path": "" } }, "message": "111111111111111111111111111111111111", # 输出到控制台 "input": { "type": "stdin" }, "ecs": { "version": "1.12.0" }, "host": { "name": "elk92" } }
1、filebeat采集日志文件细节分析
bash
1.编写filebeat配置文件 [root@elk92 filebeat]# cat config/02-log-to-console.yaml filebeat.inputs: # 配置输入源 - type: log # 指定输入类型为日志文件 paths: # 定义要监控的日志文件路径 - /tmp/xixi.log # 具体日志文件路径 output.console: # 配置输出目标为控制台 pretty: true # 启用美化输出格式 [root@elk92 filebeat]# 2.启动服务 [root@elk92 filebeat]# filebeat -e -c config/02-log-to-console.yaml 3.发送测试数据【观察filebeat采集效果】 [root@elk92 ~]# echo ABCD > /tmp/xixi.log [root@elk92 ~]# [root@elk92 ~]# cat /tmp/xixi.log ABCD [root@elk92 ~]# [root@elk92 ~]# echo -n abc >> /tmp/xixi.log # filebeat采集不到 [root@elk92 ~]# [root@elk92 ~]# echo -n 1234 >> /tmp/xixi.log # filebeat采集不到 [root@elk92 ~]# [root@elk92 ~]# cat /tmp/xixi.log ABCD abc1234[root@elk92 ~]# [root@elk92 ~]# [root@elk92 ~]# echo def >> /tmp/xixi.log [root@elk92 ~]# cat /tmp/xixi.log ABCD abc1234def [root@elk92 ~]# 温馨提示: # filebeat默认按行采集数据,如果采集完上一次新增数据不换行是不会采集,比如加echo -n参数输出; # filebeat会保留采集源文件的offset记录。 - /var/lib/filebeat/registry/filebeat/log.json # filebeat增量采集,首次采集,或数据目录被删除时,则默认从头采集数据。 - filebeat采集nginx日志并写入ES集群 1.安装nginx服务并启动 [root@elk92 ~]# apt -y install nginx [root@elk92 ~]# systemctl enable --now nginx 2.filebeat采集nginx日志 [root@elk92 filebeat]# cat > config/03-nginx-to-es.yaml <<EOF filebeat.inputs: - type: log paths: - /var/log/nginx/access.log* output.console: pretty: true EOF # 记得重启nginx

image

bash
3.filebeat采集nginx日志写入ES集群 [root@elk92 filebeat]# cat > config/03-nginx-to-es.yaml<<EOF filebeat.inputs: - type: log paths: - /var/log/nginx/access.log* #output.console: # pretty: true output.elasticsearch: # 输出到elasticsearch集群 hosts: - 43.139.47.66:9200 - 106.55.44.37:9200 - 43.139.77.96:9200 index: "weixiang98-nginx-accesslog-%{+yyyy.MM.dd}" # 自定义索引名称 setup.ilm.enabled: false # 关闭索引生命周期,如果开启,则不支持filebeat自定义索引名称 setup.template.name: "weixiang-weixiang98" # 定义索引模板的名称 setup.template.pattern: "weixiang98*" # 定义索引的匹配模式,该模式可以匹配写入ES的自定义索引。匹配的索引必须遵循该索引模板的配置。也就是索引名称必须是weixiang98*开始 setup.template.overwrite: false # 如果索引模板已经存在是否覆盖。 setup.template.settings: # 设置索引模板 index.number_of_shards: 3 # 自定义索引的分片数量 index.number_of_replicas: 0 # 自定义索引的副本数量 EOF 补充: #如果定义的索引匹配模式(zwx*)与自定义索引名称(weixiang98-nginx-accesslog-%{+yyyy.MM.dd})不一样 当写入数据到不存在的索引时: Elasticsearch 检查是否有匹配的模板 → 没有("zwx*" 不匹配) # 使用默认配置创建新索引: number_of_shards: 1(默认5分片,但可能被系统修改) number_of_replicas: 1 # 页面查看
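关于索引模板是否真正生效,可以用下面的命令做一个快速验证(非课程原文,属于补充思路;模板名、索引名取自上面的配置,ES地址按需替换):

```bash
# 1.查看filebeat创建的索引模板及其匹配模式(7.x默认走legacy模板接口;若查不到,可再试 _index_template 接口)
curl 'http://10.0.0.91:9200/_template/weixiang-weixiang98?pretty'

# 2.查看自定义索引实际生效的settings,确认分片/副本数是否套用了模板
curl 'http://10.0.0.91:9200/weixiang98-nginx-accesslog-*/_settings?pretty'

# 3.列出索引,对比pri(主分片数)与rep(副本数)
curl 'http://10.0.0.91:9200/_cat/indices/weixiang98*?v'
```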

image

image


bash
3.采集日志到ES集群 [root@elk92 filebeat]# rm -rf /var/lib/filebeat/ [root@elk92 filebeat]# filebeat -e -c config/03-nginx-to-es.yaml 4.kibana查看数据 略,见视频。 5.发送测试数据 [root@elk91 ~]# for i in `seq 10`; do curl 10.1.20.5 ;done
2、今日内容回顾

image

bash
用户访问nginx,filebeat采集nginx访问日志写到ES集群,kibana从ES集群获取数据,并提供图形化界面。

- ElasticStack架构 *****
    - EFK
    - ELK
    - ELFK
    - ELFK + KAFKA
- ES单点部署 **
- ES集群部署 *****
- Kibana部署 *****
- filebeat部署 *****
- filebeat日志采集原理 *****
- EFK架构梳理NGINX案例 *****

- 今日作业:
    - 完成课堂的所有练习并整理思维导图;
    - 使用ansible-Playbook一键部署EFK架构;
- 扩展作业:
    - 实现"科学上网",测试站点:
        https://www.google.com/
        https://hub.docker.com/
3、Ansible一键部署EFK架构
1、安装ES
bash
root@iZ7xvfih1ndn3k7gd7z9b9Z:~/roles/cld# vim tasks/main.yml: - name: 检查 Elasticsearch 是否已安装 command: dpkg -s elasticsearch # 执行shell命令检查deb包安装状态 register: es_check # 将命令结果保存到es_check变量 ignore_errors: yes # 忽略命令错误(如未安装时的返回码) changed_when: false # 标记此任务不会改变系统状态 - name: 拷贝ES安装包 copy: # 使用文件复制模块 src: /root/roles/cld/files/elasticsearch-7.17.28-amd64.deb # 源文件路径 dest: /tmp/elasticsearch.deb # 目标路径(临时目录) mode: '0644' # 设置文件权限(所有者读写,其他只读) when: es_check.rc != 0 # 仅当ES未安装时执行(rc≠0表示未安装) - name: 拷贝配置文件 copy: # 使用文件复制模块 src: /root/roles/cld/files/elasticsearch.yml # 源配置文件路径 dest: /etc/elasticsearch/ # 目标目录(ES配置目录) notify: Restart elasticsearch # 触发重启handler(当配置文件变更时) - name: 安装ES apt: # 使用APT包管理模块 deb: /tmp/elasticsearch.deb # 指定本地deb包路径 force: yes # 强制安装(覆盖冲突文件) register: dpkg_result # 保存安装结果到变量 ignore_errors: yes # 忽略安装错误(如依赖问题) when: es_check.rc != 0 # 仅当ES未安装时执行 - name: 修复依赖关系 apt: # 使用APT包管理模块 update_cache: yes # 更新包索引(相当于apt update) fix_broken: yes # 修复损坏的依赖(相当于apt --fix-broken) when: # 条件判断: - es_check.rc != 0 # 1) ES未安装 - dpkg_result is failed # 2) 且安装任务失败 - name: 启动ES systemd: # 使用systemd服务管理模块 name: elasticsearch # 服务名称 state: started # 确保服务处于运行状态 enabled: yes # 设置开机自启动 root@iZ7xvfih1ndn3k7gd7z9b9Z:~/roles/cld# vim handlers/main.yml: - name: Restart elasticsearch systemd: # 使用systemd模块管理服务 name: elasticsearch # 指定要管理的服务名称 state: restarted # 确保服务被重启 daemon_reload: yes # 执行前重新加载systemd配置
2、安装kibana
bash
root@iZ7xvfih1ndn3k7gd7z9b9Z:~/roles/kibana# vim tasks/main.yml: - name: 检查 Kibana 是否已安装 command: dpkg -s kibana # 使用dpkg检查kibana包状态(注意:包名应为小写kibana) register: ki_check # 保存检查结果到ki_check变量 ignore_errors: yes # 忽略未安装时的错误(返回码非0) changed_when: false # 标记为永不改变系统状态的任务 - name: 拷贝Kibana安装包 copy: src: /root/roles/kibana/files/kibana-7.17.28-amd64.deb # 源安装包路径 dest: /tmp/kibana-7.17.28-amd64.deb # 目标路径(临时目录) mode: '0644' # 设置文件权限(rw-r--r--) when: ki_check.rc != 0 # 仅在Kibana未安装时执行(应使用.rc检查返回码) - name: 拷贝配置文件 copy: src: /root/roles/kibana/files/kibana.yml # 源配置文件 dest: /etc/kibana/kibana.yml # 目标路径(Kibana主配置文件) notify: Restart kibana # 配置文件变更时触发重启handler - name: 安装Kibana apt: deb: /tmp/kibana-7.17.28-amd64.deb # 指定本地deb包路径 force: yes # 强制安装(覆盖冲突文件) register: dpkg_result # 保存安装结果 ignore_errors: yes # 忽略安装过程中的错误 when: ki_check.rc != 0 # 仅在Kibana未安装时执行 - name: 修复依赖关系 apt: update_cache: yes # 更新软件包缓存(apt update) fix_broken: yes # 修复损坏的依赖(apt --fix-broken) when: dpkg_result is failed # 仅当安装任务失败时执行 - name: 启动Kibana systemd: name: kibana.service # 服务名称(完整systemd单元名) state: started # 确保服务处于运行状态 enabled: yes # 启用开机自启动 root@iZ7xvfih1ndn3k7gd7z9b9Z:~/roles/kibana# vim handlers/main.yml: - name: Restart kibana systemd: # 使用systemd模块管理服务 name: kibana.service # 指定要管理的服务名称 state: restarted # 确保服务被重启 daemon_reload: yes # 执行前重新加载systemd配置
3、安装filebeat
bash
root@iZ7xvfih1ndn3k7gd7z9b9Z:~/roles/filebeat# vim tasks/main.yml: - name: 检查 Filebeat 是否已安装 command: dpkg -s filebeat # 检查filebeat是否安装(修正为小写包名) register: fi_check # 保存检查结果到变量 ignore_errors: yes # 忽略未安装时的错误 changed_when: false # 标记为永不改变系统状态的任务 - name: 清理旧版 Filebeat 数据目录 command: rm -rf /var/lib/filebeat/ # 递归删除filebeat数据目录 args: warn: no # 禁止Ansible警告(使用原生命令) when: fi_check.rc == 0 # 仅在Filebeat已安装时执行 changed_when: false # 不标记为系统变更 ignore_errors: yes # 忽略可能的删除错误(如目录不存在) - name: 拷贝 Filebeat 安装包 copy: src: /root/roles/filebeat/files/filebeat-7.17.28-amd64.deb # 源安装包 dest: /tmp/filebeat-7.17.28-amd64.deb # 目标路径 mode: '0644' # 文件权限设置 when: fi_check.rc != 0 # 仅在未安装时执行 - name: 安装 Filebeat apt: deb: /tmp/filebeat-7.17.28-amd64.deb # 指定本地deb包 force: yes # 强制安装(覆盖冲突) register: dpkg_result # 保存安装结果 ignore_errors: yes # 忽略可能的安装错误 when: fi_check.rc != 0 # 仅在未安装时执行 - name: 拷贝配置文件 copy: src: /root/roles/filebeat/files/03-nginx-to-es.yaml # Filebeat配置文件 dest: /etc/filebeat/03-nginx-to-es.yaml # 目标配置文件路径 notify: Reload filebeat - name: 修复依赖关系 apt: update_cache: yes # 更新软件包缓存 fix_broken: yes # 修复损坏的依赖关系 when: dpkg_result is failed # 仅在安装失败时执行 - name: 测试 Filebeat 配置 command: filebeat test config -c /etc/filebeat/03-nginx-to-es.yaml # 测试配置文件有效性 register: filebeat_test # 保存测试结果 changed_when: false # 配置测试不改变系统状态 - name: 显示配置测试结果 debug: msg: "{{ filebeat_test.stdout }}" # 输出配置测试结果 - name: 启动 Filebeat 服务 systemd: name: filebeat # 服务名称 state: started # 确保服务运行 enabled: yes # 启用开机自启 daemon_reload: yes # 重新加载systemd配置 - name: 安装 Nginx apt: name: nginx # 软件包名称 state: present # 确保安装 - name: 启动 Nginx systemd: name: nginx # 服务名称 state: started # 启动服务 enabled: yes # 启用开机自启 root@iZ7xvfih1ndn3k7gd7z9b9Z:~/roles/filebeat# vim handlers/main.yml: handlers: - name: Reload filebeat systemd: name: filebeat state: reloaded
4、filebeat的模块管理
bash
# 1.什么是filebeat模块 其实就是filebeat针对不同主流中间件日志采集的预定方案。 # 2.查看模块列表 [root@elk92 ~]# filebeat modules list Enabled: Disabled: activemq apache auditd aws awsfargate azure barracuda bluecoat cef checkpoint cisco coredns ... # 3.启用模块 [root@elk92 ~]# filebeat modules enable nginx tomcat mysql traefik Enabled nginx Enabled tomcat Enabled mysql Enabled traefik [root@elk92 ~]# [root@elk92 ~]# filebeat modules list Enabled: mysql nginx tomcat traefik Disabled: activemq apache auditd aws awsfargate azure ... [root@elk92 ~]# ll /etc/filebeat/modules.d/*.yml # 启动逻辑就是把mysql.yml.disable改成mysql.yml -rw-r--r-- 1 root root 472 Feb 14 00:58 /etc/filebeat/modules.d/mysql.yml -rw-r--r-- 1 root root 784 Feb 14 00:58 /etc/filebeat/modules.d/nginx.yml -rw-r--r-- 1 root root 623 Feb 14 00:58 /etc/filebeat/modules.d/tomcat.yml -rw-r--r-- 1 root root 303 Feb 14 00:58 /etc/filebeat/modules.d/traefik.yml [root@elk92 ~]# # 4.禁用模块 [root@elk92 ~]# filebeat modules disable mysql traefik Disabled mysql Disabled traefik [root@elk92 ~]# [root@elk92 ~]# ll /etc/filebeat/modules.d/*.yml -rw-r--r-- 1 root root 784 Feb 14 00:58 /etc/filebeat/modules.d/nginx.yml -rw-r--r-- 1 root root 623 Feb 14 00:58 /etc/filebeat/modules.d/tomcat.yml [root@elk92 ~]# [root@elk92 ~]# filebeat modules list Enabled: nginx tomcat Disabled: activemq apache auditd aws awsfargate azure ... # 5.验证模块启用和禁用的原理 [root@elk92 ~]# ll /etc/filebeat/modules.d/*.yml -rw-r--r-- 1 root root 784 Feb 14 00:58 /etc/filebeat/modules.d/nginx.yml -rw-r--r-- 1 root root 623 Feb 14 00:58 /etc/filebeat/modules.d/tomcat.yml [root@elk92 ~]# [root@elk92 ~]# mv /etc/filebeat/modules.d/tomcat.yml{,.disabled} [root@elk92 ~]# [root@elk92 ~]# ll /etc/filebeat/modules.d/*.yml -rw-r--r-- 1 root root 784 Feb 14 00:58 /etc/filebeat/modules.d/nginx.yml [root@elk92 ~]# [root@elk92 ~]# filebeat modules list Enabled: nginx Disabled: activemq apache auditd aws awsfargate azure ...
1、filebeat的模块案例之nginx
bash
# 1.准备Nginx访问日志 [root@elk92 ~]# cat /var/log/nginx/access.log 123.117.19.236 - - [19/Jun/2025:17:27:13 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0" 123.117.19.236 - - [19/Jun/2025:17:36:41 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0" 123.117.19.236 - - [19/Jun/2025:17:36:41 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0" 123.117.19.236 - - [19/Jun/2025:17:36:41 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0" 123.117.19.236 - - [19/Jun/2025:17:36:41 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0" 123.117.19.236 - - [19/Jun/2025:17:36:41 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0" 123.117.19.236 - - [19/Jun/2025:17:36:41 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0" 123.117.19.236 - - [19/Jun/2025:17:36:41 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0" 123.117.19.236 - - [19/Jun/2025:17:36:41 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0" 123.117.19.236 - - [19/Jun/2025:17:36:41 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0" 123.117.19.236 - - [19/Jun/2025:17:36:41 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/7.81.0" 23.117.19.236 - - [20/Jun/2025:09:31:30 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36" 24.117.19.236 - - [20/Jun/2025:09:31:30 +0800] "GET /favicon.ico HTTP/1.1" 404 197 "http://10.0.0.92/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36" 25.117.19.236 - - [20/Jun/2025:09:31:58 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 16_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1" 30.117.19.236 - - [20/Jun/2025:09:32:17 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 16_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1" 31.117.19.236 - - [20/Jun/2025:09:32:30 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Safari/605.1.15" 32.117.19.236 - - [20/Jun/2025:09:32:51 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Safari/605.1.15" 40.117.19.236 - - [20/Jun/2025:09:33:04 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (iPad; CPU OS 16_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1" 41.117.19.236 - - [20/Jun/2025:09:33:12 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (iPad; CPU OS 16_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1" 42.117.19.236 - - [20/Jun/2025:09:33:13 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (iPad; CPU OS 16_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1" 51.117.19.236 - - [20/Jun/2025:09:33:23 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Safari/605.1.15" 52.117.19.236 - - [20/Jun/2025:09:33:42 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Mobile Safari/537.36" 53.117.19.236 - - [20/Jun/2025:09:33:49 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Mobile Safari/537.36" 80.117.19.236 - - 
[20/Jun/2025:09:33:54 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Mobile Safari/537.36" 82.117.19.236 - - [20/Jun/2025:09:33:54 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Mobile Safari/537.36" 92.117.19.236 - - [20/Jun/2025:09:33:54 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Mobile Safari/537.36" 78.117.19.236 - - [20/Jun/2025:09:33:55 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Mobile Safari/537.36" 110.117.19.236 - - [20/Jun/2025:09:33:55 +0800] "GET / HTTP/1.1" 404 396 "-" "Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Mobile Safari/537.36" [root@elk92 ~]# # 2.启用模块 [root@elk92 ~]# ll /etc/filebeat/modules.d/*.yml -rw-r--r-- 1 root root 784 Feb 14 00:58 /etc/filebeat/modules.d/nginx.yml [root@elk92 ~]# [root@elk92 ~]# egrep -v "^.*#|^$" /etc/filebeat/modules.d/nginx.yml - module: nginx access: enabled: true # 启动访问日志 # var.paths: ["/tmp/access.log"]:# 如果nginx访问日志不是默认位置,需要在这里指定,默认是/var/log/nginx/access.log error: enabled: false # 不启动错误日志 ingress_controller: enabled: false [root@elk92 ~]# # 3.编写filebeat配置文件 [root@elk92 filebeat]# cat config/04-module-to-es.yaml filebeat.config.modules: # 使用模块 path: ${path.config}/modules.d/nginx.yml # 指定Nginx模块配置文件路径 reload.enabled: true # 支持热加载,文件发生改变会去加载 #output.console: # pretty: true output.elasticsearch: hosts: - 10.0.0.91:9200 - 10.0.0.92:9200 - 10.0.0.93:9200 index: "weixiang98-modeules-nginx-accesslog-%{+yyyy.MM.dd}" # 自定义索引名称 setup.ilm.enabled: false # 关闭索引生命周期,如果开启,则不支持filebeat自定义索引名称 setup.template.name: "weixiang-weixiang98" # 定义索引模板的名称 setup.template.pattern: "weixiang98*" # 定义索引的匹配模式,该模式可以匹配写入ES的索引。匹配的索引必须遵循该索引模板的配置。 setup.template.overwrite: false # 如果索引模板已经存在是否覆盖。 setup.template.settings: # 设置索引模板 index.number_of_shards: 3 # 自定义索引的分片数量 index.number_of_replicas: 0 # 自定义索引的副本数量 # 4.启动实例 [root@elk92 filebeat]# rm -rf /var/lib/filebeat/ [root@elk92 filebeat]# [root@elk92 filebeat]# filebeat -e -c config/04-module-to-es.yaml # 5.kibana查询数据
1、加模块与不加模块的区别
bash
1.filebeat.config.modules:
    path: ${path.config}/modules.d/nginx.yml
    reload.enabled: true
工作方式: 加载 Filebeat 预定义的 Nginx 模块,该模块包含:
    - 预解析管道(Ingest pipelines)
    - 字段映射(ECS 字段标准化)
    - Kibana 仪表板(预建可视化)
    - 自动处理访问日志/错误日志
核心优势:
    ✅ 自动日志解析(如拆分 $request 到独立字段)
    ✅ 自动生成监控仪表板
    ✅ 符合 Elastic Common Schema (ECS) 规范
    ✅ 支持自动重载配置变更

2.filebeat.inputs:
    - type: filestream   # 注:Filebeat 8.x+ 推荐用 filestream 替代 log
      paths:
        - /var/log/nginx/access.log
工作方式: 仅进行原始日志收集,无任何高级处理:
    ❌ 无字段解析(所有内容堆在 message 字段)
    ❌ 无 ECS 字段标准化
    ❌ 需手动创建解析管道
    ❌ 无预建仪表板

📊 功能差异详解

| 功能 | 模块模式 (filebeat.config.modules) | 基础输入模式 (filebeat.inputs) |
| --- | --- | --- |
| 日志解析 | ✅ 自动解析 URL/状态码/客户端IP | ❌ 原始文本,需手动 Grok 解析 |
| ECS 字段标准化 | ✅ 生成 http.request.method 等字段 | ❌ 所有内容在 message 字段 |
| Kibana 仪表板 | ✅ 自动创建 Nginx 监控仪表板 | ❌ 需手动创建可视化 |
| 错误日志处理 | ✅ 自动区分 access/error 日志 | ❌ 需单独配置 |
| GeoIP 集成 | ✅ 自动解析客户端地理位置 | ❌ 需手动添加 GeoIP 处理器 |
| 用户代理解析 | ✅ 自动拆分设备/OS/浏览器信息 | ❌ 需手动处理 |
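为了直观体会基础输入模式下"需要手动解析"的含义,下面给出一个用dissect处理器手动拆分Nginx访问日志的最小示例(非课程原文,仅为演示思路;配置文件名、拆出的字段名均为假设,输出到控制台便于观察):

```bash
cat > config/11-nginx-dissect-to-console.yaml <<'EOF'
filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/access.log
processors:
  # 按Nginx默认的combined日志格式手动拆字段;模块模式下这一步由Nginx模块的ingest pipeline自动完成
  - dissect:
      tokenizer: '%{clientip} %{ident} %{auth} [%{timestamp}] "%{method} %{request} %{http_version}" %{status} %{bytes} "%{referrer}" "%{agent}"'
      field: "message"
      target_prefix: "nginx"
output.console:
  pretty: true
EOF

filebeat -e -c config/11-nginx-dissect-to-console.yaml
```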

2、EFK架构基础环境准备
bash
1.清空现有的数据 rm -rf /var/lib/filebeat/ 2.启动filebeat实例 [root@elk92 filebeat]# cat config/05-efk-to-es.yaml filebeat.config.modules: path: ${path.config}/modules.d/nginx.yml reload.enabled: true output.elasticsearch: hosts: - 8.148.219.35:9200 - 8.148.229.196:9200 - 8.134.186.218:9200 index: "weixiang98-efk-nginx-accesslog-%{+yyyy.MM.dd}" setup.ilm.enabled: false setup.template.name: "weixiang-weixiang98" setup.template.pattern: "weixiang98*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 0 [root@elk92 filebeat]# [root@elk92 filebeat]# rm -rf /var/lib/filebeat/ [root@elk92 filebeat]# filebeat -e -c config/05-efk-to-es.yaml 3.统计PV 略,见视频。
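除了在Kibana里出图,也可以直接用ES的聚合查询快速核对PV(非课程原文,属于补充示例;索引名取自上面的配置,ES地址按需替换):

```bash
# size:0 表示只要聚合结果不要原始文档;按天统计文档条数,即每天的PV
curl -s -X GET 'http://10.0.0.91:9200/weixiang98-efk-nginx-accesslog-*/_search' \
  -H 'Content-Type: application/json' \
  -d '{
    "size": 0,
    "aggs": {
      "pv_per_day": {
        "date_histogram": {
          "field": "@timestamp",
          "calendar_interval": "1d"
        }
      }
    }
  }' | jq '.aggregations'
```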

3、ELK画图

搜索字段画图

image

image

1、根据Nginx状态码进行分组统计


2、根据Nginx日志统计PV


3、根据Nginx日志统计IP


4、统计带宽


5、创建地图


bash
source.geo.location

6、用户操作系统数量统计


7、设备类型占比图


8、Dashboard图


4、kibana数据排错原因
bash
# Kibana如果查询不到数据,可能是由什么原因呢?
- Filebeat端存在问题的可能性:
    - filebeat挂掉无法采集数据;
    - 配置文件和实际采集的数据不对应;          # 看配置文件填写的路径
    - 源数据文件为空,未能写入;                # 查看采集的数据文件
    - 数据已经采集过了,本地缓存offset未清空;  # rm -rf /var/lib/filebeat
    - logstash和Filebeat同理,也会存在类似的问题。
- ES集群挂掉,导致kibana无法查询数据;
- kibana的时间选择有问题,也会查询不到数据;
- kibana做了KQL数据过滤,也可能导致数据查询不到;
- kibana的索引被删除,索引模式不生效;
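按照上面的排查思路,可以用下面这组命令快速自检一遍数据链路(非课程原文,属于补充示例;配置文件路径、ES地址均为假设,按需替换):

```bash
# 1.filebeat进程是否存活,配置与输出端是否可用
ps -ef | grep [f]ilebeat
filebeat test config -c /etc/filebeat/config/05-efk-to-es.yaml
filebeat test output -c /etc/filebeat/config/05-efk-to-es.yaml

# 2.查看本地offset缓存,排查"数据已经采集过"的情况
cat /var/lib/filebeat/registry/filebeat/log.json

# 3.确认ES集群健康,且目标索引确实存在并有文档写入
curl 'http://10.0.0.91:9200/_cat/health?v'
curl 'http://10.0.0.91:9200/_cat/indices/weixiang98*?v'
```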
5、filebeat的多实例实战案例
bash
1.什么是多实例? 一台服务器运行多个filebeat实例。多个实例共同同一套程序。 2.实战案例 2.1 启动第一个实例 [root@elk92 filebeat]# filebeat -e -c config/03-nginx-to-es.yaml 2.2 启动第二个实例 [root@elk92 filebeat]# filebeat -e -c config/02-log-to-console.yaml --path.data /tmp/xixi 2.3 测试验证 [root@elk92 ~]# ps -ef | grep filebeat root 111460 109818 0 14:42 pts/0 00:00:00 /usr/share/filebeat/bin/filebeat --path.home /usr/share/filebeat --path.config /etc/filebeat --path.data /var/lib/filebeat --path.logs /var/log/filebeat -e -c config/03-nginx-to-es.yaml root 111513 110130 1 14:44 pts/1 00:00:00 /usr/share/filebeat/bin/filebeat --path.home /usr/share/filebeat --path.config /etc/filebeat --path.data /var/lib/filebeat --path.logs /var/log/filebeat -e -c config/02-log-to-console.yaml --path.data /tmp/xixi root 111640 111620 0 14:44 pts/2 00:00:00 grep --color=auto filebeat [root@elk92 ~]# [root@elk92 ~]# 3.实战案例 3.1 实例1-采集系统日志文件 [root@elk92 filebeat]# [root@elk92 filebeat]# cat config/06-systlog-to-es.yaml filebeat.inputs: - type: log paths: - /var/log/syslog* # 排除以"*.gz"结尾的文件 exclude_files: ['\.gz$'] # 排除掉.gz结尾的 output.elasticsearch: hosts: - 10.0.0.91:9200 - 10.0.0.92:9200 - 10.0.0.93:9200 index: "weixiang98-efk-syslog-%{+yyyy.MM.dd}" setup.ilm.enabled: false setup.template.name: "weixiang-weixiang98" setup.template.pattern: "weixiang98*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 0 [root@elk92 filebeat]# [root@elk92 filebeat]# [root@elk92 filebeat]# filebeat -e -c config/06-systlog-to-es.yaml 3.2 实例2-采集auth日志文件 [root@elk92 filebeat]# cat config/07-auth-to-es.yaml filebeat.inputs: - type: log paths: - /var/log/auth.log exclude_files: ['\.gz$'] output.elasticsearch: hosts: - 10.0.0.91:9200 - 10.0.0.92:9200 - 10.0.0.93:9200 index: "weixiang98-efk-auth-%{+yyyy.MM.dd}" setup.ilm.enabled: false setup.template.name: "weixiang-weixiang98" setup.template.pattern: "weixiang98*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 0 [root@elk92 filebeat]# [root@elk92 filebeat]# filebeat -e -c config/07-auth-to-es.yaml --path.data /var/lib/filebeat-auth 3.3 kibana查看数据 略,见视频。 syslog日志过滤: message :"Successfully " auth日志过滤: message :"10.0.0.1 "
6、filebeat写入数据到不同的ES索引
bash
- filebeat写入数据到不同的ES索引 1.编写filebeat配置文件 [root@elk92 filebeat]# cat config/08-multiple_input-to-es.yaml filebeat.inputs: - type: log paths: - /var/log/auth.log exclude_files: ['\.gz$'] # 给数据打标签 tags: "auth" - type: log tags: "syslog" paths: - /var/log/syslog* exclude_files: ['\.gz$'] output.elasticsearch: hosts: - 10.0.0.91:9200 - 10.0.0.92:9200 - 10.0.0.93:9200 # index: "weixiang98-efk-auth-%{+yyyy.MM.dd}" indices: - index: "weixiang98-efk-contains-auth-%{+yyyy.MM.dd}" # 当tags的值为"auth"时,则写入当前索引。 when.contains: tags: "auth" - index: "weixiang98-efk-contains-syslog-%{+yyyy.MM.dd}" when.contains: tags: "syslog" setup.ilm.enabled: false # 关闭索引生命周期,如果开启,则不支持filebeat自定义索引名称 setup.template.name: "weixiang-weixiang98" # 定义索引模板的名称 setup.template.pattern: "weixiang98*" # 定义索引的匹配模式,该模式可以匹配写入ES的自定义索引。匹配的索引必须遵循该索引模板的配置。也就是索引名称必须是weixiang98*开始 # setup.template.overwrite: false setup.template.overwrite: true # 如果索引模板已经存在是否覆盖。 setup.template.settings: # 设置索引模板 index.number_of_shards: 5 # 自定义索引的分片数量 index.number_of_replicas: 0 # 自定义索引的副本数量 2.启动filebeat实例 [root@elk92 filebeat]# rm -rf /var/lib/filebeat [root@elk92 filebeat]# [root@elk92 filebeat]# filebeat -e -c config/08-multiple_input-to-es.yaml 3.kibana出图展示 略,见视频。 syslog日志过滤: message :"Successfully " auth日志过滤: message :"10.0.0.1 "

image

image

7、filebeat采集json数据
bash
filebeat采集json数据 1.测试文件 [root@elk92 filebeat]# cat /tmp/student.json {"name":"张锋","hobby":["玩手机","俯卧撑","看美女"],"gender": "boy"} {"name":"常义朝","hobby":["打台球","吹牛","喝啤酒"],"gender": "boy","age":18} {"name":"刘志松","hobby":["打游戏","看动漫"],"gender":"boy","class": "weixiang98"} {"name":"李鑫","hobby":["听音乐","打飞机"]} {"name":"杨晓东","hobby":["学习","打飞机"]} [root@elk92 filebeat]# 2.准备配置文件 [root@elk92 filebeat]# cat config/09-log_json-to-es.yaml filebeat.inputs: - type: log paths: - /tmp/student.json # 将message字段进行解析,解析后的数据放在顶级字段中,和message同级。 # 如果解析正确,则message字段就删除,如果解析错误,则message字段保留。 json.keys_under_root: true #output.console: # pretty: true output.elasticsearch: hosts: - 43.139.47.66:9200 - 106.55.44.37:9200 - 43.139.77.96:9200 index: "weixiang98-efk-log-json-%{+yyyy.MM.dd}" setup.ilm.enabled: false # 关闭索引生命周期,如果开启,则不支持filebeat自定义索引名称 setup.template.name: "weixiang-weixiang98" # 定义索引模板的名称 setup.template.pattern: "weixiang98*" # 定义索引的匹配模式,该模式可以匹配写入ES的自定义索引。匹配的索引必须遵循该索引模板的配置。也就是索引名称必须是weixiang98*开始 setup.template.overwrite: true # 如果索引模板已经存在是否覆盖。 setup.template.settings: # 设置索引模板 index.number_of_shards: 3 # 自定义索引的分片数量 index.number_of_replicas: 0 # 自定义索引的副本数量 3.启动实例 [root@elk92 filebeat]# rm -rf /var/lib/filebeat [root@elk92 filebeat]# [root@elk92 filebeat]# filebeat -e -c config/09-log_json-to-es.yaml 4.kibana验证数据 略,见视频。

image

8、filestream类型采集tomcat日志案例
bash
1.安装tomcat 1.1 下载tomcat wget https://dlcdn.apache.org/tomcat/tomcat-11/v11.0.8/bin/apache-tomcat-11.0.8.tar.gz SVIP: [root@elk92 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/tomcat/apache-tomcat-11.0.8.tar.gz 1.2 安装tomcat [root@elk92 ~]# tar xf apache-tomcat-11.0.8.tar.gz -C /usr/local/ [root@elk92 ~]# 1.3 配置环境变量,环境变量会加载/etc/profile.d/目录 [root@elk92 ~]# cat /etc/profile.d/tomcat.sh #!/bin/bash export JAVA_HOME=/usr/share/elasticsearch/jdk export TOMCAT_HOME=/usr/local/apache-tomcat-11.0.8 export PATH=$PATH:$TOMCAT_HOME/bin:$JAVA_HOME/bin [root@elk92 ~]# [root@elk92 ~]# source /etc/profile.d/tomcat.sh [root@elk92 ~]# 1.4 启动tomcat [root@elk92 ~]# startup.sh Using CATALINA_BASE: /usr/local/apache-tomcat-11.0.8 Using CATALINA_HOME: /usr/local/apache-tomcat-11.0.8 Using CATALINA_TMPDIR: /usr/local/apache-tomcat-11.0.8/temp Using JRE_HOME: /usr/share/elasticsearch/jdk Using CLASSPATH: /usr/local/apache-tomcat-11.0.8/bin/bootstrap.jar:/usr/local/apache-tomcat-11.0.8/bin/tomcat-juli.jar Using CATALINA_OPTS: Tomcat started. [root@elk92 ~]# [root@elk92 ~]# ss -ntl | grep 8080 LISTEN 0 100 *:8080 *:* [root@elk92 ~]# 1.5 访问tomcat的webUI http://10.0.0.92:8080/ 1.6 查看访问日志 [root@elk92 ~]# cat /usr/local/apache-tomcat-11.0.8/logs/localhost_access_log.2025-06-21.txt 10.0.0.1 - - [20/Jun/2025:16:49:43 +0800] "GET / HTTP/1.1" 200 11235 10.0.0.1 - - [20/Jun/2025:16:49:43 +0800] "GET /tomcat.css HTTP/1.1" 200 5584 10.0.0.1 - - [20/Jun/2025:16:49:43 +0800] "GET /tomcat.svg HTTP/1.1" 200 67795 10.0.0.1 - - [20/Jun/2025:16:49:43 +0800] "GET /asf-logo-wide.svg HTTP/1.1" 200 27235 10.0.0.1 - - [20/Jun/2025:16:49:43 +0800] "GET /bg-nav.png HTTP/1.1" 200 1401 10.0.0.1 - - [20/Jun/2025:16:49:43 +0800] "GET /bg-button.png HTTP/1.1" 200 713 10.0.0.1 - - [20/Jun/2025:16:49:43 +0800] "GET /bg-upper.png HTTP/1.1" 200 3103 10.0.0.1 - - [20/Jun/2025:16:49:43 +0800] "GET /bg-middle.png HTTP/1.1" 200 1918 10.0.0.1 - - [20/Jun/2025:16:49:43 +0800] "GET /favicon.ico HTTP/1.1" 200 21630 [root@elk92 ~]# 2.filebeat采集tomcat日志案例 2.1 编写filebeat配置文件 [root@elk92 filebeat]# cat config/10-filestream-to-es.yaml filebeat.inputs: - type: filestream paths: - /usr/local/apache-tomcat-11.0.8/logs/localhost_access_log.2025-06-21.txt output.elasticsearch: hosts: - 43.139.47.66:9200 - 106.55.44.37:9200 - 43.139.77.96:9200 index: "weixiang98-efk-filestream-tomcat-%{+yyyy.MM.dd}" setup.ilm.enabled: false setup.template.name: "weixiang-weixiang98" setup.template.pattern: "weixiang98*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 0 [root@elk92 filebeat]# 2.2 启动filebeat实例 [root@elk92 filebeat]# filebeat -e -c config/10-filestream-to-es.yaml 2.3 kibana查看数据 略,见视频。

image

9、filestream解析json格式
bash
1.修改tomcat的访问日志格式 [root@elk92 ~]# vim /usr/local/apache-tomcat-11.0.8/conf/server.xml ... <Host name="tomcat.weixiang.com" appBase="webapps" unpackWARs="true" autoDeploy="true"> <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" prefix="tomcat.weixiang.com_access_log" suffix=".json" pattern="{&quot;clientip&quot;:&quot;%h&quot;,&quot;ClientUser&quot;:&quot;%l&quot;,&quot;authenticated&quot;:&quot;%u&quot;,&quot;AccessTime&quot;:&quot;%t&quot;,&quot;request&quot;:&quot;%r&quot;,&quot;status&quot;:&quot;%s&quot;,&quot;SendBytes&quot;:&quot;%b&quot;,&quot;Query?string&quot;:&quot;%q&quot;,&quot;partner&quot;:&quot;%{Referer}i&quot;,&quot;http_user_agent&quot;:&quot;%{User-Agent}i&quot;}"/> </Host> 2.重启tomcat服务 [root@elk92 ~]# source /etc/profile.d/tomcat.sh [root@elk92 ~]# [root@elk92 ~]# shutdown.sh Using CATALINA_BASE: /usr/local/apache-tomcat-11.0.8 Using CATALINA_HOME: /usr/local/apache-tomcat-11.0.8 Using CATALINA_TMPDIR: /usr/local/apache-tomcat-11.0.8/temp Using JRE_HOME: /usr/share/elasticsearch/jdk Using CLASSPATH: /usr/local/apache-tomcat-11.0.8/bin/bootstrap.jar:/usr/local/apache-tomcat-11.0.8/bin/tomcat-juli.jar Using CATALINA_OPTS: [root@elk92 ~]# [root@elk92 ~]# startup.sh Using CATALINA_BASE: /usr/local/apache-tomcat-11.0.8 Using CATALINA_HOME: /usr/local/apache-tomcat-11.0.8 Using CATALINA_TMPDIR: /usr/local/apache-tomcat-11.0.8/temp Using JRE_HOME: /usr/share/elasticsearch/jdk Using CLASSPATH: /usr/local/apache-tomcat-11.0.8/bin/bootstrap.jar:/usr/local/apache-tomcat-11.0.8/bin/tomcat-juli.jar Using CATALINA_OPTS: Tomcat started. [root@elk92 ~]# [root@elk92 ~]# ss -ntl | grep 8080 LISTEN 0 100 *:8080 *:* [root@elk92 ~]# 3.访问测试 http://tomcat.weixiang.com:8080/ 温馨提示: windows添加解析:"10.0.0.92 tomcat.weixiang.com" 4.查看日志 【大家也可以跳过上面的步骤直接复制我的内容】 [root@elk92 ~]# cat /usr/local/apache-tomcat-11.0.8/logs/tomcat.weixiang.com_access_log.2025-06-20.json {"clientip":"123.117.19.236","ClientUser":"-","authenticated":"-","AccessTime":"[20/Jun/2025:17:37:16 +0800]","request":"GET / HTTP/1.1","status":"200","SendBytes":"11235","Query?string":"","partner":"-","http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"} {"clientip":"123.117.19.236","ClientUser":"-","authenticated":"-","AccessTime":"[20/Jun/2025:17:37:16 +0800]","request":"GET /tomcat.svg HTTP/1.1","status":"200","SendBytes":"67795","Query?string":"","partner":"http://tomcat.weixiang.com:8080/","http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"} {"clientip":"123.117.19.236","ClientUser":"-","authenticated":"-","AccessTime":"[20/Jun/2025:17:37:16 +0800]","request":"GET /tomcat.css HTTP/1.1","status":"200","SendBytes":"5584","Query?string":"","partner":"http://tomcat.weixiang.com:8080/","http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"} {"clientip":"123.117.19.236","ClientUser":"-","authenticated":"-","AccessTime":"[20/Jun/2025:17:37:16 +0800]","request":"GET /asf-logo-wide.svg HTTP/1.1","status":"200","SendBytes":"27235","Query?string":"","partner":"http://tomcat.weixiang.com:8080/tomcat.css","http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"} {"clientip":"123.117.19.236","ClientUser":"-","authenticated":"-","AccessTime":"[20/Jun/2025:17:37:16 +0800]","request":"GET /bg-middle.png 
HTTP/1.1","status":"200","SendBytes":"1918","Query?string":"","partner":"http://tomcat.weixiang.com:8080/tomcat.css","http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"} {"clientip":"123.117.19.236","ClientUser":"-","authenticated":"-","AccessTime":"[20/Jun/2025:17:37:16 +0800]","request":"GET /bg-button.png HTTP/1.1","status":"200","SendBytes":"713","Query?string":"","partner":"http://tomcat.weixiang.com:8080/tomcat.css","http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"} {"clientip":"123.117.19.236","ClientUser":"-","authenticated":"-","AccessTime":"[20/Jun/2025:17:37:16 +0800]","request":"GET /bg-nav.png HTTP/1.1","status":"200","SendBytes":"1401","Query?string":"","partner":"http://tomcat.weixiang.com:8080/tomcat.css","http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"} {"clientip":"123.117.19.236","ClientUser":"-","authenticated":"-","AccessTime":"[20/Jun/2025:17:37:16 +0800]","request":"GET /bg-upper.png HTTP/1.1","status":"200","SendBytes":"3103","Query?string":"","partner":"http://tomcat.weixiang.com:8080/tomcat.css","http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"} {"clientip":"123.117.19.236","ClientUser":"-","authenticated":"-","AccessTime":"[20/Jun/2025:17:37:16 +0800]","request":"GET /favicon.ico HTTP/1.1","status":"200","SendBytes":"21630","Query?string":"","partner":"http://tomcat.weixiang.com:8080/","http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36"} [root@elk92 ~]# 5.filebeat采集日志 5.1 编写filebeat配置文件 [root@elk92 filebeat]# cat config/11-filestream_json-to-es.yaml filebeat.inputs: - type: filestream paths: - /usr/local/apache-tomcat-11.0.8/logs/tomcat.weixiang.com_access_log*.json # 配置解析器 parsers: # 配置json格式解析 - ndjson: # 将解析的字段放在指定的字段中,如果为"",表示放在顶级字段。 target: "" # 指定要解析的json格式字段 message_key: message output.elasticsearch: hosts: - 43.139.47.66:9200 - 106.55.44.37:9200 - 43.139.77.96:9200 index: "weixiang98-efk-filestream-tomcat-json-%{+yyyy.MM.dd}" setup.ilm.enabled: false setup.template.name: "weixiang-weixiang98" setup.template.pattern: "weixiang98*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 0 [root@elk92 filebeat]# 5.2 启动filebeat实例 [root@elk92 filebeat]# rm -rf /var/lib/filebeat [root@elk92 filebeat]# [root@elk92 filebeat]# filebeat -e -c config/11-filestream_json-to-es.yaml 5.3 查看数据

image

10、filebeat多行合并
bash
1.制造错误 1.1 停止tomcat服务 [root@elk92 filebeat]# shutdown.sh Using CATALINA_BASE: /usr/local/apache-tomcat-11.0.8 Using CATALINA_HOME: /usr/local/apache-tomcat-11.0.8 Using CATALINA_TMPDIR: /usr/local/apache-tomcat-11.0.8/temp Using JRE_HOME: /usr/share/elasticsearch/jdk Using CLASSPATH: /usr/local/apache-tomcat-11.0.8/bin/bootstrap.jar:/usr/local/apache-tomcat-11.0.8/bin/tomcat-juli.jar Using CATALINA_OPTS: [root@elk92 filebeat]# 1.2 修改tomcat配置文件【请你故意改错】 [root@elk92 ~]# vim /usr/local/apache-tomcat-11.0.8/conf/server.xml [root@elk92 ~]# 1.3 启动服务 [root@elk92 ~]# startup.sh Using CATALINA_BASE: /usr/local/apache-tomcat-11.0.8 Using CATALINA_HOME: /usr/local/apache-tomcat-11.0.8 Using CATALINA_TMPDIR: /usr/local/apache-tomcat-11.0.8/temp Using JRE_HOME: /usr/share/elasticsearch/jdk Using CLASSPATH: /usr/local/apache-tomcat-11.0.8/bin/bootstrap.jar:/usr/local/apache-tomcat-11.0.8/bin/tomcat-juli.jar Using CATALINA_OPTS: Tomcat started. [root@elk92 ~]# [root@elk92 ~]# ss -ntl | grep 8080 [root@elk92 ~]# 1.4 查看日志信息 [root@elk92 ~]# tail -100f /usr/local/apache-tomcat-11.0.8/logs/catalina.out er.xml] org.xml.sax.SAXParseException; systemId: file:/usr/local/apache-tomcat-11.0.8/conf/server.xml; lineNumber: 142; columnNumber: 17; The end-tag for element type "Host" must end with a '>' delimiter. at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1252) at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643) at org.apache.tomcat.util.digester.Digester.parse(Digester.java:1506) at org.apache.catalina.startup.Catalina.parseServerXml(Catalina.java:605) at org.apache.catalina.startup.Catalina.load(Catalina.java:695) at org.apache.catalina.startup.Catalina.load(Catalina.java:733) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) at java.base/java.lang.reflect.Method.invoke(Method.java:580) at org.apache.catalina.startup.Bootstrap.load(Bootstrap.java:299) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:472) 21-Jun-2025 17:57:50.215 SEVERE [main] org.apache.catalina.startup.Catalina.start Cannot start server, server instance is not configured 2.配置多行合并 2.1 编写filebeat配置文件 [root@elk92 filebeat]# cat config/12-filestream_tomcat_errlog-to-es.yaml filebeat.inputs: - type: filestream paths: - /usr/local/apache-tomcat-11.0.8/logs/catalina.out* parsers: # 定义多行匹配 - multiline: # 指定多行匹配的类型,有效值为: pattern,count type: pattern # 指定匹配模式 pattern: '^\d' # 行首是[0-9] # 以下2个值参考官网: https://www.elastic.co/guide/en/beats/filebeat/7.17/multiline-examples.html # 这个配置的意思是:将不以数字开头的行合并到前一行之后 negate: true match: after output.elasticsearch: hosts: - 43.139.47.66:9200 - 106.55.44.37:9200 - 43.139.77.96:9200 index: "weixiang98-efk-filestream-tomcat-errlog-%{+yyyy.MM.dd}" setup.ilm.enabled: false setup.template.name: "weixiang-weixiang98" setup.template.pattern: "weixiang98*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 0 # 不同组合效果对比: 配置 negate match 效果 negate: true, match: after 反转匹配 追加到后面 不以数字开头的行追加到前一行之后 negate: false, match: after 正常匹配 追加到后面 以数字开头的行追加到前一行之后 negate: true, match: before 反转匹配 追加到前面 不以数字开头的行追加到下一行之前 negate: false, match: before 正常匹配 追加到前面 以数字开头的行追加到下一行之前 2.2 启动filebeat实例 [root@elk92 filebeat]# rm -rf /var/lib/filebeat [root@elk92 filebeat]# filebeat -e -c config/12-filestream_tomcat_errlog-to-es.yaml 2.3 kibana查看数据 参考查询KQL语句: message : "at" - 扩展作业: - 调研虚拟化产品ESXI。 参考链接:
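补充一个对照写法(来自上面官网链接里的Java堆栈示例,仅作示意,参数含义以官网为准):也可以反过来直接匹配"属于堆栈的行",此时negate为false,效果同样是把堆栈行合并到前一行之后。

```bash
    parsers:
      - multiline:
          type: pattern
          # 匹配以"空白+at/..."开头或以"Caused by:"开头的行,并追加到前一行之后
          pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
          negate: false
          match: after
```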

image

5、filebeat处理器processors
bash
- filebeat处理器 1.准备测试数据 [root@elk92 filebeat]# cat /opt/students.json {"name":"张锋","hobby":["玩手机","俯卧撑"],"age": "25","gender": "true"} {"name":"赵亚一","hobby":["学习","喝酒"],"age": "20","gender": "false"} {"name":"刘志松","hobby":["Linux","喝酒"],"age": "21","gender": "false"} {"name":"刘敬才","hobby":["ElasticStack","跳绳"],"age": "22","gender": "false"} {"name":"蒋梁文","hobby":["学习","喝酒"],"age": "23","gender": "false"} [root@elk92 filebeat]# 2.编写filebeat配置文件 [root@elk92 filebeat]# cat config/13-filestream-to-es.yaml filebeat.inputs: - type: filestream paths: - /opt/students.json parsers: - ndjson: target: "" message_key: message # 定义处理器,event(事件)在发送到output之前会先经过处理器处理。 processors: # 给事件进行数据转换 - convert: # 定义相关字段信息 fields: # from指定源字段,to指定模板字段,type指定数据类型。 # type支持的数据类型有: integer, long, float, double, string, boolean,和 ip。 - {from: "age", to: "age", type: "integer"} - {from: "gender", to: "gender", type: "boolean"} # 如果为true,则当在事件中找不到from键时,处理器将继续到下一个字段。 # 如果为false,则处理器返回错误,不处理其余字段。默认值为false。 ignore_missing: true # 如果忽略错误类型转换失败,处理器将继续下一个字段。默认值为true。 fail_on_error: false # 给事件打标签 - add_tags: # 给事件打标签的值 tags: [weixiang,weixiang98] # 将标签(tags)的值存储在某个特定字段(此处定义是'laonanhai'),若不定义该字段,则默认字段为"tags" target: "laonanhai" # 删除特定的事件 - drop_event: # 基于条件表达式,配置事件的正则 when: # 当hobby字段包含'喝酒'就将该event删掉。 contains: hobby: "喝酒" #output.console: # pretty: true output.elasticsearch: hosts: - 10.0.0.91:9200 - 10.0.0.92:9200 - 10.0.0.93:9200 index: "weixiang98-efk-filestream-processors-%{+yyyy.MM.dd}" setup.ilm.enabled: false setup.template.name: "weixiang-weixiang98" setup.template.pattern: "weixiang98*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 0 [root@elk92 filebeat]# 3.启动filebeat实例 [root@elk92 filebeat]# rm -rf /var/lib/filebeat [root@elk92 filebeat]# filebeat -e -c config/13-filestream-to-es.yaml 4.kibana查看数据 略,见视频。
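补充一个验证思路(假设索引名按上面的配置生成):按照上面的测试数据,含"喝酒"的3条事件会被drop_event丢弃,理论上只剩2条;convert的效果可以通过索引映射粗略确认。

```bash
# 统计写入的文档数量(预期为2条左右,以实际为准)
curl -s "http://10.0.0.91:9200/weixiang98-efk-filestream-processors-*/_count?pretty"
# 查看索引映射,确认age、gender等字段的类型
curl -s "http://10.0.0.91:9200/weixiang98-efk-filestream-processors-*/_mapping?pretty"
```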

2d1b2ca5c816e12c74aabea6c9e758ba

6、filebeat数据流走向
bash
- 源数据的event由input或者module模块采集 - 经过processors处理器进行轻量处理 - 由output输出到ES集群(最小配置骨架见下方示意)
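下面给一个最小的配置骨架(仅作示意,路径等为假设值),把这三段数据流对应到配置里:

```bash
filebeat.inputs:                # 1.采集: input(或module)产生event
- type: filestream
  paths:
    - /var/log/demo.log         # 假设的示例日志路径
processors:                     # 2.加工: processors对event做轻量处理
- drop_fields:
    fields: ["ecs","agent"]
output.elasticsearch:           # 3.输出: output把event写入ES集群
  hosts: ["10.0.0.91:9200"]
```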

4、Logstash

1、安装Logstash
bash
#1.下载Logstash wget https://artifacts.elastic.co/downloads/logstash/logstash-7.17.28-amd64.deb SVIP: [root@elk93 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/ES7/7.17.28/logstash-7.17.28-amd64.deb #2.安装Logstash [root@elk93 ~]# dpkg -i logstash-7.17.28-amd64.deb #3.添加Logstash到环境变量 [root@elk93 ~]# ln -svf /usr/share/logstash/bin/logstash /usr/local/bin/ '/usr/local/bin/logstash' -> '/usr/share/logstash/bin/logstash' [root@elk93 ~]# #4.基于命令行方式启动【不推荐,一般用于测试,可读性极差,尤其是配置较多时】 4.1 命令行中指定配置参数 [root@elk93 ~]# logstash -e 'input { stdin { type => stdin } } output { stdout { codec => rubydebug } }' 4.2 测试验证 The stdin plugin is now waiting for input: ... www.weixiang.com { "message" => "www.weixiang.com", "type" => "stdin", "host" => "elk93", "@version" => "1", "@timestamp" => 2025-06-23T01:26:37.343Z } #5.基于配置文件方式启动【推荐】 5.1 编写配置文件 [root@elk93 ~]# cat >/etc/logstash/conf.d/01-stdin-to-stdout.conf <<EOF input { stdin { # 使用标准输入插件(键盘输入) type => stdin # 给所有输入事件添加 type 字段(值为"stdin") } } output { stdout { # 使用标准输出插件(控制台显示) codec => rubydebug } } EOF #5.2 启动Logstash实例 [root@elk93 ~]# logstash -f /etc/logstash/conf.d/01-stdin-to-stdout.conf The stdin plugin is now waiting for input: ... 学IT来老男孩,月薪过万梦~ { "message" => "学IT来老男孩,月薪过万梦~", "host" => "elk93", "@version" => "1", "@timestamp" => 2025-06-23T01:28:50.526Z, "type" => "stdin" }
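补充一个可选步骤:logstash支持-t(--config.test_and_exit)参数,正式启动前可以先做一次配置语法检查,通过后再启动,能少走弯路。

```bash
# 仅校验配置语法,不真正启动
[root@elk93 ~]# logstash -t -f /etc/logstash/conf.d/01-stdin-to-stdout.conf
```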

image

2、logstash对接filebeat实战
bash
#1.启动Logstash实例 [root@elk93 ~]# cat > /etc/logstash/conf.d/02-beats-to-stdout.conf <<EOF input { beats { # beats:监听5044端口,接收Filebeat等工具发送的数据 port => 5044 # Logstash在5044端口接收来自Beats的数据,采集filebeat传来的的数据 } } output { stdout { codec => rubydebug } } EOF [root@elk93 ~]# logstash -f /etc/logstash/conf.d/02-beats-to-stdout.conf ... [root@elk93 ~]# ss -ntl | grep 5044 # 单独开一个终端测试 LISTEN 0 4096 *:5044 *:* [root@elk93 ~]# #2.生成测试数据 [root@elk92 ~]# cat > generate_log.py <<EOF #!/usr/bin/env python # -*- coding: UTF-8 -*- # @author : Jason Yin import datetime import random import logging import time import sys LOG_FORMAT = "%(levelname)s %(asctime)s [com.weixiang.%(module)s] - %(message)s " DATE_FORMAT = "%Y-%m-%d %H:%M:%S" # 配置root的logging.Logger实例的基本配置 logging.basicConfig(level=logging.INFO, format=LOG_FORMAT, datefmt=DATE_FORMAT, filename=sys.argv[1] , filemode='a',) actions = ["浏览页面", "评论商品", "加入收藏", "加入购物车", "提交订单", "使用优惠券", "领取优惠券", "搜索", "查看订单", "付款", "清空购物车"] while True: time.sleep(random.randint(1, 5)) user_id = random.randint(1, 10000) # 对生成的浮点数保留2位有效数字. price = round(random.uniform(15000, 30000),2) action = random.choice(actions) svip = random.choice([0,1]) logging.info("DAU|{0}|{1}|{2}|{3}".format(user_id, action,svip,price)) EOF [root@elk92 ~]# python3 generate_log.py /tmp/apps.log [root@elk92 ~]# tail -f /tmp/apps.log INFO 2025-06-23 10:10:09 [com.weixiang.generate_log] - DAU|1958|提交订单|0|16621.61 INFO 2025-06-23 10:10:10 [com.weixiang.generate_log] - DAU|1951|加入购物车|0|29168.27 INFO 2025-06-23 10:10:14 [com.weixiang.generate_log] - DAU|4465|付款|2|21334.82 INFO 2025-06-23 10:10:16 [com.weixiang.generate_log] - DAU|4909|领取优惠券|2|22650.45 INFO 2025-06-23 10:10:17 [com.weixiang.generate_log] - DAU|9068|领取优惠券|0|24090.4 INFO 2025-06-23 10:10:22 [com.weixiang.generate_log] - DAU|7146|付款|0|19730.06 INFO 2025-06-23 10:10:23 [com.weixiang.generate_log] - DAU|9224|搜索|1|21308.91 INFO 2025-06-23 10:10:27 [com.weixiang.generate_log] - DAU|5810|浏览页面|1|16808.16 INFO 2025-06-23 10:10:30 [com.weixiang.generate_log] - DAU|3109|评论商品|2|21532.56 INFO 2025-06-23 10:10:32 [com.weixiang.generate_log] - DAU|8684|评论商品|1|16394.65 ... #3.filebeat采集数据到Logstash [root@elk92 filebeat]# cat config/14-filestream-to-logstash.yaml filebeat.inputs: - type: filestream paths: - /tmp/apps.log output.logstash: hosts: ["106.55.44.37:5044"] # 输出到logstash的106.55.44.37 [root@elk92 filebeat]# [root@elk92 filebeat]# filebeat -e -c config/14-filestream-to-logstash.yaml #4.验证测试 在logstash可以看到数据

image

3、Logstash的pipeline内部结构

3daf66ddff64c288e26fd668d0de2cba_720

bash
1.filebeat对接Logstash架构图解 2.编写Logstash的配置文件 [root@elk93 ~]# cat >/etc/logstash/conf.d/03-beats-filter-stdout.conf <<EOF input { beats { port => 5044 } } filter { # 定义处理器 mutate { # 使用mutate插件 remove_field => [ "@version", "input","ecs","log","tags","agent","host" ] # 定义要移除的字段 } } output { stdout { codec => rubydebug } } EOF [root@elk93 ~]# 3.启动Logstash实例 [root@elk93 ~]# logstash -rf /etc/logstash/conf.d/03-beats-filter-stdout.conf ... { "message" => "INFO 2025-06-23 10:45:18 [com.weixiang.generate_log] - DAU|2475|提交订单|2|18708.17 ", "@timestamp" => 2025-06-23T02:45:21.711Z } { "message" => "INFO 2025-06-23 10:45:23 [com.weixiang.generate_log] - DAU|2137|提交订单|2|21681.93 ", "@timestamp" => 2025-06-23T02:45:23.711Z } 4.python脚本要启动 [root@elk01 /etc/filebeat]#python3 generate_log.py /tmp/apps.log 5.filebeat也要启动 [root@elk01 /etc/filebeat]#filebeat -e -c config/14-filestream-to-logstash.yaml
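补充一个最小的结构示意(可直接运行的骨架,仅为说明pipeline三个阶段的先后关系):event进入pipeline后依次经过input、filter(可选)、output。

```bash
input  { stdin { } }                                       # 入口: beats/file/tcp/stdin等
filter { mutate { add_field => { "stage" => "demo" } } }   # 可选处理: mutate/grok/date/geoip等
output { stdout { codec => rubydebug } }                   # 出口: elasticsearch/stdout等
```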

image

4、Logstash数据分析实战案例
bash
1.编写Logstash的配置文件 [root@elk93 ~]# cat >/etc/logstash/conf.d/04-beats-filter_mutate_date-stdout.conf <<EOF input { beats { port => 5044 } } filter { mutate { split => { "message" => "|" } # 按照"|"对数据进行切割 add_field => { # 添加字段 "other" => "%{[message][0]}" "userId" => "%{[message][1]}" "action" => "%{[message][2]}" "svip" => "%{[message][3]}" "price" => "%{[message][4]}" } } mutate { split => { "other" => " " } # 按照空格对数据进行切割 add_field => { "dt" => "%{[other][1]} %{[other][2]}" # 添加字段dt,并且对第二、三列进行拼接 } convert => { "price" => "float" # 对price字段进行数据类型转换成浮点型 "userId" => "integer" # 对userId字段进行数据类型转换成整型 } remove_field => [ "@version", "input","ecs","log","tags","agent","host","message","other"] } # 删除字段 date { # "dt" => "2025-06-23 11:36:20", # 参考链接: https://www.elastic.co/guide/en/logstash/7.17/plugins-filters-date.html match => [ "dt", "yyyy-MM-dd HH:mm:ss" ] # 按该格式解析dt字段的时间 target => "xixi" # 定义目标字段,若不指定,则默认覆盖"@timestamp"字段 } } output { stdout { codec => rubydebug } elasticsearch { hosts => ["http://43.139.47.66:9200","http://106.55.44.37:9200","http://43.139.77.96:9200"] index => "weixiang-weixiang98-logstash-apps-%{+YYYY.MM.dd}" } } EOF [root@elk93 ~]# 2.启动Logstash [root@elk93 ~]# logstash -rf /etc/logstash/conf.d/04-beats-filter_mutate_date-stdout.conf 3.Kibana验证 略,见视频。
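结合前面生成的日志格式,拿一条样例推演一下切割过程(大致效果,以实际输出为准):第一次按"|"切割后,message变成数组,下标0是日志前缀,下标1~4依次对应userId、action、svip、price;第二次把下标0的前缀按空格切开,其中第1、2个元素正好是日期和时间,拼接成dt。

```bash
# 样例: INFO 2025-06-23 10:10:09 [com.weixiang.generate_log] - DAU|1958|提交订单|0|16621.61
# 按"|"切割 => message[0]="INFO 2025-06-23 10:10:09 [com.weixiang.generate_log] - DAU"
#              message[1]="1958"  message[2]="提交订单"  message[3]="0"  message[4]="16621.61"
# 再把message[0]按空格切割 => other[1]="2025-06-23"  other[2]="10:10:09" => dt="2025-06-23 10:10:09"
```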

image

image

5、logstash采集文本日志
bash
# logstash采集文本日志 1.准备日志文件 [root@elk92 ~]# scp /var/log/nginx/access.log 10.0.0.93:/tmp 2.Logstash配置文件 [root@elk93 ~]# cat > /etc/logstash/conf.d/05-file-to-es.yaml<<EOF input { file { path => "/tmp/access.log*" # 指定文件的采集路径 # 指定文件采集的位置,有效值为: beginning, end。 # 默认值为'end',表示从文件末尾采集。 # 如果想要从头采集,建议设置为: 'beginning',但前提是该文件之前没有被采集过,说白了,就是首次采集生效。 start_position => "beginning" } } output { stdout { codec => rubydebug } # elasticsearch { # hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"] # index => "weixiang-weixiang98-logstash-apps-%{+YYYY.MM.dd}" # } } EOF [root@elk93 ~]# [root@elk93 ~]# logstash -f /etc/logstash/conf.d/05-file-to-es.yaml 温馨提示: - 1.logstash采集文件需要指定采集的起始位置,若不指定,则从文件的末尾开始采集; - 2.起始位置配置仅在首次使用生效,黑科技可以直接修改"/usr/share/logstash/data/plugins/inputs/file/.sincedb_*"; # grok案例分析nginx访问日志 1.参考示例 rm -f /usr/share/logstash/data/plugins/inputs/file/.sincedb_* [root@elk93 ~]# cat /etc/logstash/conf.d/06-nginx-to-es.yaml input { file { path => "/tmp/access.log*" start_position => "beginning" # 从头开始采集 } } filter { # 处理器 grok { # 使用正则表达式解析 match => { "message" => "%{HTTPD_COMBINEDLOG}" # 内置的 Apache 日志模式 } remove_field => ["message"] # 删除“message” } } output { stdout { codec => rubydebug } # elasticsearch { # hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"] # index => "weixiang-weixiang98-logstash-apps-%{+YYYY.MM.dd}" # } } [root@elk93 ~]# 2.测试样例 [root@elk93 ~]# rm -f /usr/share/logstash/data/plugins/inputs/file/.sincedb_* [root@elk93 ~]# logstash -f /etc/logstash/conf.d/06-nginx-to-es.yaml ... { "auth" => "-", "clientip" => "110.117.19.236", "httpversion" => "1.1", "agent" => "\"Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Mobile Safari/537.36\"", "response" => "404", "@timestamp" => 2025-06-23T06:55:34.212Z, "ident" => "-", "referrer" => "\"-\"", "@version" => "1", "verb" => "GET", "host" => "elk93", "bytes" => "10995116277760", "timestamp" => "20/Jun/2025:09:33:55 +0800", "path" => "/tmp/access.log.1", "request" => "/" } # useragent案例分析nginx访问日志 1.编写配置文件 [root@elk93 ~]# cat /etc/logstash/conf.d/06-nginx-to-es.yaml input { file { path => "/tmp/access.log*" start_position => "beginning" # 从头开始采集 } } filter { grok { match => { "message" => "%{HTTPD_COMBINEDLOG}" } remove_field => ["message"] } useragent { source => "agent" target => "weixiang98-agent" } } output { stdout { codec => rubydebug } # elasticsearch { # hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"] # index => "weixiang-weixiang98-logstash-apps-%{+YYYY.MM.dd}" # } } [root@elk93 ~]# 2.测试案例 [root@elk93 ~]# rm -f /usr/share/logstash/data/plugins/inputs/file/.sincedb_* [root@elk93 ~]# logstash -f /etc/logstash/conf.d/06-nginx-to-es.yaml ... 
{ "referrer" => "\"-\"", "@version" => "1", "host" => "elk93", "clientip" => "110.117.19.236", "ident" => "-", "@timestamp" => 2025-06-23T07:00:49.318Z, "agent" => "\"Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Mobile Safari/537.36\"", "path" => "/tmp/access.log.1", "timestamp" => "20/Jun/2025:09:33:55 +0800", "verb" => "GET", "response" => "404", "bytes" => "396", "weixiang98-agent" => { "name" => "Chrome Mobile", "os_minor" => "0", "version" => "137.0.0.0", "device" => "Samsung SM-G955U", "major" => "137", "minor" => "0", "os_major" => "8", "os_patch" => "0", "os_name" => "Android", "os" => "Android", "os_full" => "Android 8.0.0", "patch" => "0", "os_version" => "8.0.0" }, "request" => "/", "auth" => "-", "httpversion" => "1.1" }
6、geoip分析经纬度
bash
#1.导入本地的geoip数据库,为了提升解析经纬度的速度 [root@elk93 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/ES7/7.17.28/geoip/GeoLite2-City_20250311.tar.gz [root@elk93 ~]# tar xf GeoLite2-City_20250311.tar.gz -C /usr/local/ [root@elk93 ~]# [root@elk93 ~]# ll /usr/local/GeoLite2-City_20250311/ total 58436 drwxrwxr-x 2 root root 4096 Mar 11 17:20 ./ drwxr-xr-x 11 root root 4096 Jun 23 15:23 ../ -rw-r--r-- 1 root root 55 Mar 11 17:20 COPYRIGHT.txt -rw-r--r-- 1 root root 59816818 Mar 11 17:20 GeoLite2-City.mmdb -rw-r--r-- 1 root root 398 Mar 11 17:20 LICENSE.txt -rw-r--r-- 1 root root 116 Mar 11 17:20 README.txt - ELK架构分析nginx访问日志案例实战 1.自定义索引模板 索引模式: elk-nginx* 索引设置: { "number_of_replicas": 0, "number_of_shards": 10 } 在kibana出图的时候要注意,如果数字字段以字符串类型存储,是不能进行聚合运算的,需要在索引管理-索引模板中设置映射,将字符串映射成数值类型; 地图坐标点需要把geoip.location映射成地理坐标点(geo_point),否则默认会把经纬度拆成两个普通字段 怎么看字段属于哪个类型?在索引管理页面,点击索引名称,再点映射即可看到 ELK之Kibana出图展示没有数据: 之前kibana在出图展示统计指标和做map地图分布的时候,查询都是正常的,但是出图展示没有值,经过查阅资料发现ES是有数据类型的概念的,原来还得创建个索引模板,将原本的数据类型映射成符合要求的数据类型,之后就可以顺利出图展示了 #索引模式设置

4ceb9e9d898bf7d9d658115eac649464_720

37678aef3e46e867a7b7c45c666547bf_720

a702193c83cb87153deb6c0bb619b23f_720

image

bash
映射: bytes: 数值 ---》 长整型 【因为整型只能存储42亿数据的大小,存不下我们的测试数据!!】 可能会出现的错误: ---》 Value [10995116277760] is out of range for an integer ... #2.编写Logstash的配置文件 [root@elk93 ~]# cat /etc/logstash/conf.d/06-nginx-to-es.yaml input { file { path => "/tmp/access.log*" # 监控/tmp目录下所有access.log开头的文件 start_position => "beginning" # 首次运行时从文件开头读取(默认只读新增内容) } } filter { grok { # 使用正则表达式解析 match => { "message" => "%{HTTPD_COMBINEDLOG}" # 预定义模式,可解析:客户端IP、用户标识、认证用户、时间戳、HTTP状态码等 } remove_field => ["message"] # 删除原始日志内容 } useragent { # 用户代理解析 source => "agent" # grok解析出的agent字段 target => "weixiang98-agent" # 解析结果存储字段 } geoip { # 根据公网ip获取城市的基本信息 database => "/usr/local/GeoLite2-City_20250311/GeoLite2-City.mmdb" # database指定geoip数据库的位置 default_database_type => "City" # 精确到某个城市 source => "clientip" # 使用grok解析出的clientip字段 } date { # 时间戳处理 # "20/Jun/2025:09:33:55 +0800" match => ["timestamp","dd/MMM/yyyy:HH:mm:ss Z"] } } output { # stdout { # codec => rubydebug # } elasticsearch { hosts => ["http://43.139.47.66:9200","http://106.55.44.37:9200","http://43.139.77.96:9200"] index => "elk-nginx-weixiang98-logstash-nginx-%{+YYYY.MM.dd}" } } # 3.启动Logstash [root@elk93 ~]# rm -f /usr/share/logstash/data/plugins/inputs/file/.sincedb_* [root@elk93 ~]# [root@elk93 ~]# logstash -f /etc/logstash/conf.d/06-nginx-to-es.yaml # 4.访问测试
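上面提到的索引模板除了在kibana页面里点出来,也可以用API创建。下面是一个最小示例(索引模式、分片数等按前面的约定,字段名以实际写入的数据为准,按需调整):把bytes映射成long、geoip.location映射成geo_point,统计出图和地图分布才能正常工作。

```bash
curl -X PUT "http://43.139.47.66:9200/_index_template/elk-nginx" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["elk-nginx*"],
  "template": {
    "settings": { "number_of_shards": 10, "number_of_replicas": 0 },
    "mappings": {
      "properties": {
        "bytes": { "type": "long" },
        "geoip": { "properties": { "location": { "type": "geo_point" } } }
      }
    }
  }
}'
```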

image

7、logstash的多实例
bash
1.实例一 [root@elk93 ~]# logstash -f /etc/logstash/conf.d/06-nginx-to-es.yaml 2.实例二 [root@elk93 ~]# logstash -f /etc/logstash/conf.d/05-file-to-es.yaml --path.data /tmp/xixi-logstash 温馨提示: logstash多实例用于启动多个Logstash的场景。说白了就是需要运行多个配置。因此需要单独指定数据目录。
8、logstash的多分支语法
bash
1.编写Logstash的配置文件 [root@elk93 ~]#cat /etc/logstash/conf.d/07-if-casedemo.yaml input { beats { # beats:监听5044端口,接收Filebeat等工具发送的数据 port => 5044 # Logstash在5044端口接收来自Beats的数据,采集filebeat传来的的数据 type => "beats" # 标记来自Beats的数据 } file { # 监控/tmp/access.log*文件 path => "/tmp/access.log*" start_position => "beginning" # 如果没采集过,从头采集 type => "file" # 标记来自文件的数据 } } filter { # 处理器 if [type] == "beats" { # 处理来自Beats的数据 mutate { # 第一重mutate过滤器:分割原始消息 split => { "message" => "|" } # 用竖线分割原始消息 add_field => { # 添加字段 "other" => "%{[message][0]}" "userId" => "%{[message][1]}" "action" => "%{[message][2]}" "svip" => "%{[message][3]}" "price" => "%{[message][4]}" } } mutate { # 第二重mutate过滤器:进一步处理 split => { "other" => " " } # 分割第一部分(空格分隔) add_field => { # 添加新字段 "dt" => "%{[other][1]} %{[other][2]}" # 提取日期和时间 } convert => { # 字段类型转换 "price" => "float" "userId" => "integer" } remove_field => [ "@version", "input","ecs","log","tags","agent","host","message","other"] } # 删除不需要的字段 date { # 日期处理 match => [ "dt", "yyyy-MM-dd HH:mm:ss" ] # 解析自定义时间字段 target => "xixi" # 存储到新字段 } } else if [type] == "file" { # 处理来自文件的数据 mutate { add_field => { # 添加新字段 "school" => "weixiang" "class" => "weixiang98" } } } } output { stdout { codec => rubydebug } if [type] == "beats" { elasticsearch { hosts => ["http://43.139.47.66:9200","http://106.55.44.37:9200","http://43.139.77.96:9200"] index => "weixiang-weixiang98-logstash-if-apps-%{+YYYY.MM.dd}" } } else { elasticsearch { hosts => ["http://43.139.47.66:9200","http://106.55.44.37:9200","http://43.139.77.96:9200"] index => "weixiang-weixiang98-logstash-if-file-%{+YYYY.MM.dd}" } } } 2.启动Logstash [root@elk93 ~]# logstash -f /etc/logstash/conf.d/07-if-casedemo.yaml --path.data /tmp/logstash-haha001 3.测试验证 kibana查询数据 温馨提示: - 1.确保文件有数据; - 2.filebeat是正常运行的;

image

9、logstash的多pipeline配置
bash
1.准备logstash的配置文件 [root@elk93 ~]# cat /etc/logstash/conf.d/04-beats-filter_mutate_date-es.conf input { beats { # beats:监听5044端口,接收Filebeat等工具发送的数据 port => 5044 } } filter { # 处理器 mutate { # 过滤器 split => { "message" => "|" } add_field => { "other" => "%{[message][0]}" "userId" => "%{[message][1]}" "action" => "%{[message][2]}" "svip" => "%{[message][3]}" "price" => "%{[message][4]}" } } mutate { split => { "other" => " " } add_field => { "dt" => "%{[other][1]} %{[other][2]}" } convert => { "price" => "float" "userId" => "integer" } remove_field => [ "@version", "input","ecs","log","tags","agent","host","message","other"] } date { # "dt" => "2025-06-23 11:36:20", # 参考链接: https://www.elastic.co/guide/en/logstash/7.17/plugins-filters-date.html match => [ "dt", "yyyy-MM-dd HH:mm:ss" ] # 定义目标字段,若不指定,则默认覆盖"@timestamp"字段 target => "xixi" } } output { stdout { codec => rubydebug } elasticsearch { hosts => ["http://43.139.47.66:9200","http://106.55.44.37:9200","http://43.139.77.96:9200"] # index => "weixiang-weixiang98-logstash-apps-%{+YYYY.MM.dd}" index => "weixiang-weixiang98-logstash-pipeline-apps-%{+YYYY.MM.dd}" } } [root@elk93 ~]# cat /etc/logstash/conf.d/05-file-to-es.yaml input { file { # 指定文件的采集路径 path => "/tmp/access.log*" # 指定文件采集的位置,有效值为: beginning, end。 # 默认值为'end',表示从文件末尾采集。 # 如果想要从头采集,建议设置为: 'beginning',但前提是该文件之前没有被采集过,说白了,就是首次采集生效。 start_position => "beginning" } } output { stdout { codec => rubydebug } elasticsearch { hosts => ["http://43.139.47.66:9200","http://106.55.44.37:9200","http://43.139.77.96:9200"] index => "weixiang-weixiang98-logstash-pipeline-nginx-%{+YYYY.MM.dd}" } } 2.修改pipeline [root@elk93 ~]# vim /etc/logstash/pipelines.yml ... #- pipeline.id: main # path.config: "/etc/logstash/conf.d/*.conf" - pipeline.id: xixi path.config: "/etc/logstash/conf.d/04-beats-filter_mutate_date-es.conf" - pipeline.id: haha path.config: "/etc/logstash/conf.d/05-file-to-es.yaml" 3.启动Logstash [root@elk93 ~]# logstash --path.settings /etc/logstash/ 4.启动filebeat [root@elk01 /etc/filebeat/config]#filebeat -e -c /etc/filebeat/config/14-filestream-to-logstash.yaml 5.kibana查看数据

image

image

bash
# 错误案例 执行过程中,一直报错ERROR: Failed to read pipelines yaml file. Location: /etc/logstash/pipelines.yml 原因是pipelines语法有问题 # 1.下载语法检测工具 [root@elk02 ~]#apt -y install yamllint # 2.进行检测测试 [root@elk02 ~]#yamllint /etc/logstash/pipelines.yml /etc/logstash/pipelines.yml 1:1 error syntax error: expected the node content, but found '<document end>' (syntax) 2:2 warning missing starting space in comment (comments) 6:1 warning missing document start "---" (document-start) 11:1 error too many blank lines (1 > 0) (empty-lines) # 3.重新修正 [root@elk02 ~]#sudo tee /etc/logstash/pipelines.yml <<'EOF' > --- > # 主管道配置(示例) > # - pipeline.id: main > # path.config: "/etc/logstash/conf.d/*.conf" > > - pipeline.id: xixi > path.config: "/etc/logstash/conf.d/04-beats-filter_mutate_date-es.conf" > > - pipeline.id: haha > path.config: "/etc/logstash/conf.d/05-file-to-es.conf" # 修正扩展名 > EOF # 之前的文件 [root@elk93 ~]# vim /etc/logstash/pipelines.yml ... #- pipeline.id: main # path.config: "/etc/logstash/conf.d/*.conf" - pipeline.id: xixi path.config: "/etc/logstash/conf.d/04-beats-filter_mutate_date-es.conf" - pipeline.id: haha path.config: "/etc/logstash/conf.d/05-file-to-es.yaml" - Logstash的input常用插件 ***** - stdin # 从标准输入读取数据(键盘输入) - beats # 接收Elastic Beats发送的数据(Filebeat/Metricbeat等) - file # 从文件读取日志 - logstash的filter常用插件 ***** - date # 解析时间字段 - mutate # 字段操作工具箱:类型转换、重命名、切割、增删 - grok # 用正则解析非结构化文本 - geoip # 根据 IP 解析地理位置 - useragent # 解析 User-Agent 字符串,分析客户端设备信息 - logstash的output常用插件 ***** - stdout # 输出到控制台 - elasticsearch # 发送数据到 Elasticsearch

6、ES集群加密

1、使用明文密码方式

bash
#1 生成证书文件 [root@elk91 ~]# /usr/share/elasticsearch/bin/elasticsearch-certutil cert -out /etc/elasticsearch/elastic-certificates.p12 -pass "" --days 36500 ... Certificates written to /etc/elasticsearch/elastic-certificates.p12 This file should be properly secured as it contains the private key for your instance. This file is a self contained file and can be copied and used 'as is' For each Elastic product that you wish to configure, you should copy this '.p12' file to the relevant configuration directory and then follow the SSL configuration instructions in the product guide. [root@elk91 ~]# ll /etc/elasticsearch/elastic-certificates.p12 -rw------- 1 root elasticsearch 3596 May 7 09:04 /etc/elasticsearch/elastic-certificates.p12 [root@elk91 ~]# #2 将证书文件拷贝到其他节点 [root@elk91 ~]# chmod 640 /etc/elasticsearch/elastic-certificates.p12 [root@elk91 ~]# chown root.elasticsearch /etc/elasticsearch/elastic-certificates.p12 [root@elk91 ~]# ll /etc/elasticsearch/elastic-certificates.p12 -rw-r----- 1 root elasticsearch 3596 May 7 09:04 /etc/elasticsearch/elastic-certificates.p12 [root@elk91 ~]# scp -p /etc/elasticsearch/elastic-certificates.p12 10.1.20.5:/etc/elasticsearch [root@elk91 ~]# scp -p /etc/elasticsearch/elastic-certificates.p12 10.1.24.4:/etc/elasticsearch #3.修改ES集群的配置文件 [root@elk91 ~]# vim /etc/elasticsearch/elasticsearch.yml ... # 在最后一行添加以下内容 xpack.security.enabled: true xpack.security.transport.ssl.enabled: true xpack.security.transport.ssl.verification_mode: certificate xpack.security.transport.ssl.keystore.path: elastic-certificates.p12 xpack.security.transport.ssl.truststore.path: elastic-certificates.p12 #4.同步ES配置文件到其他节点 [root@elk91 ~]# scp /etc/elasticsearch/elasticsearch.yml 10.1.20.5:/etc/elasticsearch/ [root@elk91 ~]# scp /etc/elasticsearch/elasticsearch.yml 10.1.24.4:/etc/elasticsearch/ # 5.所有节点重启ES集群 [root@elk91 ~]# systemctl restart elasticsearch.service [root@elk92 ~]# systemctl restart elasticsearch.service [root@elk93 ~]# systemctl restart elasticsearch.service #6.测试验证ES集群访问 [root@elk91 ~]# curl 10.1.24.13:9200/_cat/nodes?v {"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/_cat/nodes?v]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/_cat/nodes?v]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401} [root@elk91 ~]# #7.生成随机密码 [root@elk01 ~]#/usr/share/elasticsearch/bin/elasticsearch-setup-passwords auto warning: usage of JAVA_HOME is deprecated, use ES_JAVA_HOME Initiating the setup of passwords for reserved users elastic,apm_system,kibana,kibana_system,logstash_system,beats_system,remote_monitoring_user. The passwords will be randomly generated and printed to the console. 
Please confirm that you would like to continue [y/N]y Changed password for user apm_system PASSWORD apm_system = uNWWUOspZb17BrIPTm7B Changed password for user kibana_system PASSWORD kibana_system = yGdahhN9BXCKXic9KPhK Changed password for user kibana PASSWORD kibana = yGdahhN9BXCKXic9KPhK Changed password for user logstash_system PASSWORD logstash_system = QlCmoxdgCJnNLHG7fCxC Changed password for user beats_system PASSWORD beats_system = 1gCeCCCeRA1Pm24oQKSM Changed password for user remote_monitoring_user PASSWORD remote_monitoring_user = 7taAyyePwRh4JWbmFYSU Changed password for user elastic PASSWORD elastic = hpmmr3qjy3KneOfWLC9d #8.验证集群是否正常,此密码不要抄我的,看你上面生成的密码 [root@elk91 ~]# curl -u elastic:123456 10.1.24.13:9200/_cat/nodes?v ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name 10.0.0.92 19 93 5 0.18 0.13 0.05 cdfhilmrstw - elk92 10.0.0.91 37 95 2 0.45 0.27 0.10 cdfhilmrstw - elk91 10.0.0.93 27 97 3 0.28 0.21 0.08 cdfhilmrstw * elk93 - kibana对接ES加密集群 1.修改kibana的配置文件 [root@elk91 ~]# vim /etc/kibana/kibana.yml ... server.port: 5601 server.host: "0.0.0.0" elasticsearch.hosts: ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"] elasticsearch.username: "kibana_system" elasticsearch.password: "14OmHmJjT6MKv2XDQNe8" # 注意,此密码不要抄我的,看你上面生成的密码 i18n.locale: "zh-CN" [root@elk91 ~]# 2.重启kibana [root@elk91 ~]# systemctl restart kibana.service [root@elk91 ~]# 3.访问kibana的webUI重置管理员密码 使用elastic用户进行登录即可。
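补充(可选):如果不想用随机生成的密码,初始化密码时也可以改用interactive模式为各内置用户逐个输入自定义密码,其余步骤不变;注意该工具只在首次初始化内置用户密码时可用。

```bash
[root@elk91 ~]# /usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive
```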

47c7b78de517b65acd3d502e57f1b8ff

51927efd2a7dd823e84feb5b8a29b939_720

1、filebeat对接ES加密集群
bash
# 1.编写filebeat配置文件 [root@elk92 filebeat]# cat config/15-tcp-to-es_tls.yaml filebeat.inputs: - type: tcp host: "0.0.0.0:9000" output.elasticsearch: hosts: - "http://10.0.0.91:9200" - "http://10.0.0.92:9200" - "http://10.0.0.93:9200" index: "weixiang-weixiang98-es-tls-filebeat-%{+yyyy-MM-dd}" username: "elastic" password: "123456" setup.ilm.enabled: false setup.template.name: "weixiang-weixiang98" setup.template.pattern: "weixiang-weixiang98-*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 1 [root@elk92 filebeat]# # 2.启动filebeat实例 [root@elk92 filebeat]# filebeat -e -c config/15-tcp-to-es_tls.yaml # 3.发送测试数据 [root@elk91 ~]# echo www.weixiang.com | nc 10.0.0.92 9000 # 4.kibana测试验证 略,见视频。

0fc2c0b750a78b31a1928499d441e08c

2、Logstash对接ES加密集群
bash
#1.编写Logstash的配置文件 [root@elk93 ~]# cat /etc/logstash/conf.d/08-tcp-to-es_tls.conf input { tcp { port => 8888 } } output { elasticsearch { hosts => ["http://10.0.0.91:9200","http://10.0.0.92:9200","http://10.0.0.93:9200"] index => "weixiang-tls-logstash-%{+yyyy-MM-dd}" user => "elastic" password => "123456" } } [root@elk93 ~]# #2.启动Logstash [root@elk93 ~]# logstash -f /etc/logstash/conf.d/08-tcp-to-es_tls.conf #3.发送测试数据 [root@elk91 ~]# echo "学IT来老男孩,月薪过万不是梦" | nc 10.0.0.93 8888 #4.kibana测试验证 略,见视频。

d0e530238a2f2fda659cd406e2a0f06e

3、ES7重置elastic管理员密码案例
bash
#1.创建一个超级管理员角色【该用户属于ES本地的一个特殊用户】 [root@elk91 ~]# /usr/share/elasticsearch/bin/elasticsearch-users useradd weixiang -p 123456 -r superuser [root@elk91 ~]# #2.查看用户列表【注意,在哪个节点创建,就在对应节点用户信息即可】 [root@elk91 ~]# /usr/share/elasticsearch/bin/elasticsearch-users list weixiang : superuser [root@elk91 ~]# #3.基于本地管理员修改密码 [root@elk91 ~]# curl -s --user weixiang:123456 -XPUT "http://localhost:9200/_xpack/security/user/elastic/_password?pretty" -H 'Content-Type: application/json' -d' { "password" : "123456" }' #4.使用密码登录 [root@elk93 ~]# curl 43.139.47.66:9200/_cat/nodes -u elastic:123456 10.0.0.92 63 94 0 0.01 0.00 0.00 cdfhilmrstw - elk92 10.0.0.93 66 83 0 0.04 0.08 0.04 cdfhilmrstw * elk93 10.0.0.91 64 95 0 0.02 0.06 0.02 cdfhilmrstw - elk91 [root@elk93 ~]# [root@elk93 ~]# [root@elk93 ~]# curl 10.0.0.91:9200/_cat/nodes -u elastic:123456 {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [elastic] for REST request [/_cat/nodes]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"unable to authenticate user [elastic] for REST request [/_cat/nodes]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401}

2、使用api_key方式

1、配置api_key
bash
ES配置启用api-key功能并Filebeat测试验证,页面默认是关闭的 1.为什么要启用api-key 为了安全性,使用用户名和密码的方式进行认证会暴露用户信息。 ElasticSearch也支持api-key的方式进行认证。这样就可以保证安全性。api-key是不能用于登录kibana,安全性得到保障。 而且可以基于api-key实现权限控制。 2.ES启用api-key [root@elk91 ~]# vim /etc/elasticsearch/elasticsearch.yml ... # 添加如下配置 # 启用api_key功能 xpack.security.authc.api_key.enabled: true # 指定API密钥加密算法 xpack.security.authc.api_key.hashing.algorithm: pbkdf2 # 缓存的API密钥时间 xpack.security.authc.api_key.cache.ttl: 1d # API密钥保存数量的上限 xpack.security.authc.api_key.cache.max_keys: 10000 # 用于内存中缓存的API密钥凭据的哈希算法 xpack.security.authc.api_key.cache.hash_algo: ssha256 3.拷贝配置文件到其他节点 [root@elk91 ~]# scp /etc/elasticsearch/elasticsearch.yml 106.55.44.37:/etc/elasticsearch [root@elk91 ~]# scp /etc/elasticsearch/elasticsearch.yml 43.139.77.96:/etc/elasticsearch 4.重启ES集群 [root@elk93 ~]# systemctl restart elasticsearch.service [root@elk92 ~]# systemctl restart elasticsearch.service [root@elk91 ~]# systemctl restart elasticsearch.service 5.访问kibana的WebUI,就可以看到api http://43.139.47.66:5601/app/management/security/api_keys 6.创建api-key 略,见视频。

image

image

bash
7.基于api-key解析 [root@elk93 ~]# echo LW9yUG41Y0JDdzZ3ckpITXlZcGY6NURMd1NzLTlUTWU5R3B6bEFuUXJIZw==| base64 -d ;echo -orPn5cBCw6wrJHMyYpf:5DLwSs-9TMe9GpzlAnQrHg [root@elk93 ~]# base64格式: LW9yUG41Y0JDdzZ3ckpITXlZcGY6NURMd1NzLTlUTWU5R3B6bEFuUXJIZw== beats格式: 1uDXn5cB4S096DtgWCow:h8YQPaEXQzq1NkGlFzDS2w logstash格式: -orPn5cBCw6wrJHMyYpf:5DLwSs-9TMe9GpzlAnQrHg JSON格式 {"id":"-orPn5cBCw6wrJHMyYpf","name":"yinzhengjie","api_key":"5DLwSs-9TMe9GpzlAnQrHg","encoded":"LW9yUG41Y0JDdzZ3ckpITXlZcGY6NURMd1NzLTlUTWU5R3B6bEFuUXJIZw=="} 8.编写Filebeat的配置文件 [root@elk92 filebeat]# cat config/16-tcp-to-es_api-key.yaml filebeat.inputs: - type: tcp host: "0.0.0.0:9000" output.elasticsearch: hosts: - 106.55.44.37:9200 - 43.139.77.96:9200 - 43.139.47.66:9200 #username: "elastic" #password: "123456" # 基于api_key方式认证,相比于上面的base_auth更加安全。(生产环境推荐使用此方式!) api_key: "1uDXn5cB4S096DtgWCow:h8YQPaEXQzq1NkGlFzDS2w" # 这里改成上面复制的beats,如果不一样,会出403错误,权限被拒绝 index: weixiang-es-tls-filebeat-api-key setup.ilm.enabled: false setup.template.name: "weixiang-es" setup.template.pattern: "weixiang-es-*" setup.template.overwrite: true setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 0 9.启动filebeat实例 [root@elk92 filebeat]# filebeat -e -c config/16-tcp-to-es_api-key.yaml 10.发送测试数据 [root@elk91 ~]# echo 1111111111111111111 | nc 43.139.47.66 9000
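补充一个理解api_key的角度(示意):它在HTTP层就是一个Authorization请求头,值为"ApiKey "加上encoded那串base64,可以直接用curl带上这个头访问ES验证;如果该key的权限不包含相应操作,会返回403,这本身也说明权限控制生效了。

```bash
# 用encoded形式的api-key直接访问ES(此时集群还是http协议)
curl -H "Authorization: ApiKey LW9yUG41Y0JDdzZ3ckpITXlZcGY6NURMd1NzLTlUTWU5R3B6bEFuUXJIZw==" http://43.139.47.66:9200/_cat/indices?v
```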

image

bash
11.kibana验证数据

image

2、基于ES的api创建api-key并实现权限管理(扩展)
bash
参考链接: https://www.elastic.co/guide/en/beats/filebeat/7.17/beats-api-keys.html https://www.elastic.co/guide/en/elasticsearch/reference/7.17/security-privileges.html#privileges-list-cluster https://www.elastic.co/guide/en/elasticsearch/reference/7.17/security-privileges.html#privileges-list-indices # 1.创建api-key POST /_security/api_key { "name": "jasonyin2020", "role_descriptors": { "filebeat_monitoring": { "cluster": ["all"], "index": [ { "names": ["weixiang-weixiang98-es-apikey*"], "privileges": ["create_index", "create_doc"] } ] } } } 执行后,返回数据如下: { "id" : "AIr9n5cBCw6wrJHMC4u_", "name" : "jasonyin2020", "api_key" : "YOo22xB_ToCVj9XaB1KIaQ", "encoded" : "QUlyOW41Y0JDdzZ3ckpITUM0dV86WU9vMjJ4Ql9Ub0NWajlYYUIxS0lhUQ==" } # 2.解码数据 [root@elk92 ~]# echo QUlyOW41Y0JDdzZ3ckpITUM0dV86WU9vMjJ4Ql9Ub0NWajlYYUIxS0lhUQ== | base64 -d ;echo AIr9n5cBCw6wrJHMC4u_:YOo22xB_ToCVj9XaB1KIaQ [root@elk92 ~]# # 3.查看创建的api-key列表 GET /_security/api_key # 温馨提示: 此案例我们无法进行删除或查看的权限验证,因为需要使用编程语言调用api-key的方式操作。

3、ES集群配置https证书

bash
#1.自建ca证书 [root@elk91 ~]# /usr/share/elasticsearch/bin/elasticsearch-certutil ca --out /etc/elasticsearch/elastic-stack-ca.p12 --pass "" --days 36500 [root@elk91 ~]# [root@elk91 ~]# ll /etc/elasticsearch/elastic-stack-ca.p12 -rw------- 1 root elasticsearch 2672 May 7 11:38 /etc/elasticsearch/elastic-stack-ca.p12 [root@elk91 ~]# #2.基于自建ca证书生成ES证书 [root@elk91 ~]# /usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca /etc/elasticsearch/elastic-stack-ca.p12 --out /etc/elasticsearch/elastic-certificates-https.p12 --pass "" --days 3650 --ca-pass "" [root@elk91 ~]# ll /etc/elasticsearch/elastic-stack-ca.p12 -rw------- 1 root elasticsearch 2672 May 7 11:38 /etc/elasticsearch/elastic-stack-ca.p12 [root@elk91 ~]# [root@elk91 ~]# ll /etc/elasticsearch/elastic-certificates-https.p12 -rw------- 1 root elasticsearch 3596 May 7 11:39 /etc/elasticsearch/elastic-certificates-https.p12 [root@elk91 ~]# #3.修改配置文件 [root@elk91 ~]# vim /etc/elasticsearch/elasticsearch.yml ... # 启用https配置 xpack.security.http.ssl.enabled: true xpack.security.http.ssl.keystore.path: elastic-certificates-https.p12 [root@elk91 ~]# #4.同步配置文件集群的其他节点 [root@elk91 ~]#chown root.elasticsearch /etc/elasticsearch/elastic-certificates-https.p12 [root@elk91 ~]# chmod 640 /etc/elasticsearch/elastic-certificates-https.p12 [root@elk91 ~]# [root@elk91 ~]# ll /etc/elasticsearch/elastic-certificates-https.p12 -rw-r----- 1 root elasticsearch 3596 May 7 11:39 /etc/elasticsearch/elastic-certificates-https.p12 [root@elk91 ~]# [root@elk91 ~]# scp -p /etc/elasticsearch/elastic{-certificates-https.p12,search.yml} 43.139.77.96:/etc/elasticsearch/ [root@elk91 ~]# scp -p /etc/elasticsearch/elastic{-certificates-https.p12,search.yml} 106.55.44.37:/etc/elasticsearch/ #5.重启ES集群 [root@elk91 ~]# systemctl restart elasticsearch.service [root@elk92 ~]# systemctl restart elasticsearch.service [root@elk93 ~]# systemctl restart elasticsearch.service #6.测试验证,使用https协议 [root@elk91 ~]# curl https://10.1.24.13:9200/_cat/nodes?v -u elastic:H6KrryuAfO03nP33NPVn -k ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name 10.0.0.91 17 94 2 0.81 0.35 0.12 cdfhilmrstw - elk91 10.0.0.93 14 87 2 0.58 0.24 0.08 cdfhilmrstw - elk93 10.0.0.92 33 96 1 0.40 0.20 0.07 cdfhilmrstw * elk92 [root@elk91 ~]# [root@elk91 ~]# curl https://43.139.47.66:9200/_cat/nodes?v -u elastic:123456 --insecure ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name 10.0.0.91 20 95 0 0.07 0.18 0.09 cdfhilmrstw - elk91 10.0.0.93 17 87 0 0.02 0.11 0.06 cdfhilmrstw - elk93 10.0.0.92 36 97 0 0.01 0.09 0.05 cdfhilmrstw * elk92

4、kibana对接ES的https加密集群

bash
# 1.修改kibana的配置跳过自建证书校验 [root@elk91 ~]# vim /etc/kibana/kibana.yml ... # 指向ES集群的地址协议为https elasticsearch.hosts: ["https://10.0.0.91:9200","https://10.0.0.92:9200","https://10.0.0.93:9200"] # 跳过证书校验 elasticsearch.ssl.verificationMode: none # 2.重启kibana [root@elk91 ~]# systemctl restart kibana.service # 3.再次访问测试 http://10.0.0.91:5601/

5、logstash基于api-key访问ES集群

bash
#1.确保ES集群使用的是https协议 略,见视频。 #2.创建api-key POST /_security/api_key { "name": "yinzhengjie", "role_descriptors": { "filebeat_monitoring": { "cluster": ["all"], "index": [ { "names": ["weixiang-logstash-api-key*"], "privileges": ["create_index", "create"] } ] } } } 返回数据: { "id" : "go3HoJcBrtagu8XV_Fyq", "name" : "yinzhengjie", "api_key" : "QTO_Rh3YSiygMqql9gYEHg", "encoded" : "Z28zSG9KY0JydGFndThYVl9GeXE6UVRPX1JoM1lTaXlnTXFxbDlnWUVIZw==" }

image

bash
解码encoded数据: [root@elk91 ~]# echo Z28zSG9KY0JydGFndThYVl9GeXE6UVRPX1JoM1lTaXlnTXFxbDlnWUVIZw== |base64 -d ;echo go3HoJcBrtagu8XV_Fyq:QTO_Rh3YSiygMqql9gYEHg #3.修改Logstash的配置文件 [root@elk93 ~]# cat > /etc/logstash/conf.d/09-tcp-to-es_api-key.conf <<EOF input { tcp { port => 8888 } } output { elasticsearch { hosts => ["https://106.55.44.37:9200","https://43.139.77.96:9200","https://43.139.47.66:9200"] index => "weixiang-logstash-api-key-xixi" #user => elastic #password => "123456" # 指定api-key的方式认证 api_key => "go3HoJcBrtagu8XV_Fyq:QTO_Rh3YSiygMqql9gYEHg" # 使用api-key则必须启用ssl ssl => true # 跳过ssl证书验证 ssl_certificate_verification => false } } EOF #4.启动Logstash [root@elk93 ~]# logstash -f /etc/logstash/conf.d/09-tcp-to-es_api-key.conf #5.发送测试数据 [root@elk91 ~]# echo 99999999999999999999999 | nc 10.0.0.93 8888

6、RBAC案例

image

创建角色:

513a54ffb9d813bc0d6678485996e275_720

dbbaef7607941a557acbc7baf2bd99c9_720

9363a223d03fd87de50abf391f5d9bdf_720

创建用户:

image

bash
#测试数据: GET weixiang-weixiang98-logstash-nginx-2025.06.20/_search POST weixiang-weixiang98-logstash-nginx-2025.06.20/_doc/1001 { "name": "赵本刚", "hobby": ["学习","台球","恋爱"] } GET weixiang-weixiang98-logstash-nginx-2025.06.20/_doc/1001 DELETE weixiang-weixiang98-logstash-nginx-2025.06.20/_doc/1001

image

练习
bash
- 1.使用Logstash基于分别监听"6666","7777","8888"端口,并基于3个不同的api-key写入3个不同的索引: weixiang-dba weixiang-k8s weixiang-sre - 2.创建3个角色,分别为: dba,k8s,sre,创建3个不同的用户,基于base认证对相应的索引有对应的访问权限,要去如下 xixi用户可以访问weixiang-dba索引的数据 haha用户可以访问weixiang-k8s索引的数据 hehe用户可以访问weixiang-sre索引的数据 解题思路: 1.创建api-key 可以在webUI的方式创建api-key,可以得到如下的信息: dba: h4_woJcBrUUXDTP_7xQ6:Raj-oob-QoqQBIrQ_CZTNg k8s: iI_yoJcBrUUXDTP_ahRg:8-749wDUTv6YTVSkrDGfTg sre: iY_yoJcBrUUXDTP_5xTN:xe3luoO3Q3uy5tt8mRm2uQ 值得注意的是,需要根据不同的索引设置不同的权限 dba索引的设置: { "superuser": { "cluster": [ "all" ], "indices": [ { "names": [ "weixiang-dba" ], "privileges": [ "all" ], "allow_restricted_indices": true } ], "run_as": [ "*" ] } } k8s索引的设置 { "superuser": { "cluster": [ "all" ], "indices": [ { "names": [ "weixiang-k8s" ], "privileges": [ "all" ], "allow_restricted_indices": true } ], "run_as": [ "*" ] } } sre索引的设置 { "superuser": { "cluster": [ "all" ], "indices": [ { "names": [ "weixiang-sre" ], "privileges": [ "all" ], "allow_restricted_indices": true } ], "run_as": [ "*" ] } } 2.编写Logstash的配置文件 [root@elk93 ~]# cat > /etc/logstash/conf.d/10-ketanglianxi-tcp-to-es.conf <<'EOF' input { tcp { port => 6666 type => dba } tcp { port => 7777 type => k8s } tcp { port => 8888 type => sre } } output { if [type] == "dba" { elasticsearch { hosts => ["https://10.0.0.91:9200","https://10.0.0.92:9200","https://10.0.0.93:9200"] index => "weixiang-dba" api_key => "h4_woJcBrUUXDTP_7xQ6:Raj-oob-QoqQBIrQ_CZTNg" ssl => true ssl_certificate_verification => false } } else if [type] == "k8s" { elasticsearch { hosts => ["https://10.0.0.91:9200","https://10.0.0.92:9200","https://10.0.0.93:9200"] index => "weixiang-k8s" api_key => "iI_yoJcBrUUXDTP_ahRg:8-749wDUTv6YTVSkrDGfTg" ssl => true ssl_certificate_verification => false } } else { elasticsearch { hosts => ["https://10.0.0.91:9200","https://10.0.0.92:9200","https://10.0.0.93:9200"] index => "weixiang-sre" api_key => "iY_yoJcBrUUXDTP_5xTN:xe3luoO3Q3uy5tt8mRm2uQ" ssl => true ssl_certificate_verification => false } } } EOF 3.启动Logstash实例 [root@elk93 ~]# logstash -f /etc/logstash/conf.d/10-ketanglianxi-tcp-to-es.conf 4.发送测试数据 [root@elk91 ~]# echo 11111111111111111 | nc 10.0.0.93 6666 ^C [root@elk91 ~]# [root@elk91 ~]# [root@elk91 ~]# echo 2222222222222222 | nc 10.0.0.93 7777 ^C [root@elk91 ~]# [root@elk91 ~]# [root@elk91 ~]# echo 33333333333333333 | nc 10.0.0.93 8888 5.测试验证 略,见视频。

7、Kibana登录报错

bash
# 登录出现Kibana server is not ready yet的解决办法: 1、先检查ES集群是否正常: http环境用 curl http://10.0.0.91:9200/_cat/nodes 检查, https环境用 curl -k -u elastic:H6KrryuAfO03nP33NPVn https://43.139.47.66:9200/_cat/nodes?v 检查 2、再检查kibana配置文件: 密码是否配置正确、yml格式缩进是否正确、配置内容是否有误

8、生产环境忘记ES7密码如何操作

bash
基于超级管理员无法重置es密码。 # 1、首先把以下https认证信息全部注释掉 [root@elk01 ~]#vim /etc/elasticsearch/elasticsearch.yml 。。。 ## 添加如下配置 ## 启用api_key功能 #xpack.security.authc.api_key.enabled: true ## 指定API密钥加密算法 #xpack.security.authc.api_key.hashing.algorithm: pbkdf2 ## 缓存的API密钥时间 #xpack.security.authc.api_key.cache.ttl: 1d ## API密钥保存数量的上限 #xpack.security.authc.api_key.cache.max_keys: 10000 ## 用于内存中缓存的API密钥凭据的哈希算法 #xpack.security.authc.api_key.cache.hash_algo: ssha256 ## 启用https配置 #xpack.security.http.ssl.enabled: true #xpack.security.http.ssl.keystore.path: elastic-certificates-https.p12 #action.destructive_requires_name: true # 2、配置文件分发到其他节点 [root@elk01 ~]#scp /etc/elasticsearch/elasticsearch.yml 10.0.0.91:/etc/elasticsearch/ [root@elk01 ~]#scp /etc/elasticsearch/elasticsearch.yml 10.0.0.92:/etc/elasticsearch/ # 3、重启所有节点的es [root@elk01 ~]#systemctl restart elasticsearch.service # 4、重置密码 [root@elk01 ~]#/usr/share/elasticsearch/bin/elasticsearch-setup-passwords auto # 5、测试 [root@elk01 ~]#curl -k http://43.139.47.66:9200/_cat/nodes -u elastic:xxx # 6、修改kibana密码 elasticsearch.username: "kibana_system" elasticsearch.password: "14OmHmJjT6MKv2XDQNe8" elasticsearch.hosts: ["http://43.139.47.66:9200","http://106.55.44.37:9200","http://43.139.77.96:9200"] # 7、重启kibana [root@elk01 ~]#systemctl restart kibana.service # 8、登录kibana 密码是重置后的密码 # 9、修改为https [root@elk01 ~]#vim /etc/elasticsearch/elasticsearch.yml # 添加如下配置 # 启用api_key功能 xpack.security.authc.api_key.enabled: true # 指定API密钥加密算法 xpack.security.authc.api_key.hashing.algorithm: pbkdf2 # 缓存的API密钥时间 xpack.security.authc.api_key.cache.ttl: 1d # API密钥保存数量的上限 xpack.security.authc.api_key.cache.max_keys: 10000 # 用于内存中缓存的API密钥凭据的哈希算法 xpack.security.authc.api_key.cache.hash_algo: ssha256 # 启用https配置 xpack.security.http.ssl.enabled: true xpack.security.http.ssl.keystore.path: elastic-certificates-https.p12 action.destructive_requires_name: true # 10、配置文件分发到其他节点 [root@elk01 ~]#scp /etc/elasticsearch/elasticsearch.yml 10.0.0.91:/etc/elasticsearch/ [root@elk01 ~]#scp /etc/elasticsearch/elasticsearch.yml 10.0.0.92:/etc/elasticsearch/ # 11、重启所有节点的es [root@elk01 ~]#systemctl restart elasticsearch.service # 12、修改kibana配置文件 [root@elk01 ~]#vim /etc/kibana/kibana.yml elasticsearch.hosts: ["https://43.139.47.66:9200","https://106.55.44.37:9200","https://43.139.77.96:9200"] # 13、重启生效 [root@elk01 ~]#systemctl restart kibana.service

7、集群优化

bash
- ES集群优化 # 1.ES集群的JVM优化 1.1 JVM优化的思路 默认情况下,ES会吃掉宿主机的一半内存。 # 生存环境中,建议大家使用宿主机的一半内存,但是这个内存上限为32GB。官方推荐是26GB。 参考链接: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/advanced-configuration.html#set-jvm-heap-size 1.2 查看JVM的大小 [root@elk91 ~]# ps -ef | grep elasticsearch | egrep "Xmx|Xms" elastic+ 19076 1 1 11:42 ? 00:04:09 /usr/share/elasticsearch/jdk/bin/java ... -Xms1937m -Xmx1937m ... [root@elk91 ~]# [root@elk91 ~]# free -h total used free shared buff/cache available Mem: 3.8Gi 2.8Gi 270Mi 1.0Mi 696Mi 718Mi Swap: 3.8Gi 0.0Ki 3.8Gi [root@elk91 ~]# 1.3 修改JVM大小 [root@elk91 ~]# vim /etc/elasticsearch/jvm.options ... -Xms256m -Xmx256m 1.4 拷贝配置文件到其他节点 [root@elk91 ~]# scp /etc/elasticsearch/jvm.options 106.55.44.37:/etc/elasticsearch [root@elk91 ~]# scp /etc/elasticsearch/jvm.options 43.139.77.96:/etc/elasticsearch 1.5 所有节点重启ES集群 [root@elk91 ~]# systemctl restart elasticsearch.service [root@elk92 ~]# systemctl restart elasticsearch.service [root@elk93 ~]# systemctl restart elasticsearch.service 1.6 验证JVM大小 [root@elk91 ~]# ps -ef | grep elasticsearch | egrep "Xmx|Xms" elastic+ 20219 1 58 17:16 ? 00:01:05 /usr/share/elasticsearch/jdk/bin/java ... -Xms256m -Xmx256m ... ... [root@elk91 ~]# free -h total used free shared buff/cache available Mem: 3.8Gi 1.1Gi 2.1Gi 1.0Mi 682Mi 2.5Gi Swap: 3.8Gi 0.0Ki 3.8Gi [root@elk91 ~]# 温馨提示: 其他两个节点都要查看验证。 [root@elk93 ~]# curl https://106.55.44.37:9200/_cat/nodes -u elastic:123456 -k 10.0.0.93 79 39 11 0.16 0.30 0.21 cdfhilmrstw - elk93 10.0.0.91 78 50 14 0.15 0.29 0.20 cdfhilmrstw - elk91 10.0.0.92 63 41 17 0.24 0.36 0.22 cdfhilmrstw * elk92 [root@elk93 ~]# # 2.ES禁用索引的通配符删除 参考链接: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/index-management-settings.html 2.0 环境准备 PUT laonanhai-weixiang98-001 PUT laonanhai-weixiang98-002 PUT laonanhai-weixiang98-003 2.1 默认支持基于通配符删除索引 [root@elk93 ~]# curl -u elastic:123456 -k -X DELETE https://10.0.0.91:9200/laonanhai-weixiang98*;echo {"acknowledged":true} [root@elk93 ~]# 2.2 修改ES集群的配置文件 [root@elk91 ~]# vim /etc/elasticsearch/elasticsearch.yml ... # 禁止使用通配符或_all删除索引 action.destructive_requires_name: true 2.3 拷贝配置文件到其他节点 [root@elk91 ~]# scp /etc/elasticsearch/elasticsearch.yml 10.0.0.92:/etc/elasticsearch/ [root@elk91 ~]# scp /etc/elasticsearch/elasticsearch.yml 10.0.0.93:/etc/elasticsearch/ 2.4 重启ES集群 [root@elk91 ~]# systemctl restart elasticsearch.service [root@elk92 ~]# systemctl restart elasticsearch.service [root@elk93 ~]# systemctl restart elasticsearch.service 2.5 验证测试 [root@elk93 ~]# curl -u elastic:123456 -k -X DELETE https://106.55.44.37:9200/laonanhai-weixiang98*;echo {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Wildcard expressions or all indices are not allowed"}],"type":"illegal_argument_exception","reason":"Wildcard expressions or all indices are not allowed"},"status":400} [root@elk93 ~]#
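补充(可选做法):7.x也支持把堆内存配置放到jvm.options.d目录下的独立文件里覆盖默认值,软件包升级时不会被覆盖,效果和直接改jvm.options一致(大小按各自环境调整,仅作示意)。

```bash
[root@elk91 ~]# cat > /etc/elasticsearch/jvm.options.d/heap.options <<EOF
-Xms256m
-Xmx256m
EOF
# 同样需要分发到其他节点并重启ES后生效
```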

8、安装ES8

1、ES8单点环境部署

bash
环境准备: 【2c 4GB】 10.0.0.81 elk81 10.0.0.82 elk82 10.0.0.83 elk83 - ES8单点环境部署 1.下载ES软件包 wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.18.3-amd64.deb svip: [root@elk81 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/ES8/8.18.3/elasticsearch-8.18.3-amd64.deb 2.安装ES8 [root@elk81 ~]# dpkg -i elasticsearch-8.18.3-amd64.deb ... --------------------------- Security autoconfiguration information ------------------------------ Authentication and authorization are enabled. TLS for the transport and HTTP layers is enabled and configured. The generated password for the elastic built-in superuser is : 6xaDe2+oGoKDLXFgS6iJ If this node should join an existing cluster, you can reconfigure this with '/usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token <token-here>' after creating an enrollment token on your existing cluster. You can complete the following actions at any time: Reset the password of the elastic built-in superuser with '/usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic'. Generate an enrollment token for Kibana instances with '/usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana'. Generate an enrollment token for Elasticsearch nodes with '/usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node'. ------------------------------------------------------------------------------------------------- ### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using systemd sudo systemctl daemon-reload sudo systemctl enable elasticsearch.service ### You can start elasticsearch service by executing sudo systemctl start elasticsearch.service [root@elk81 ~]# 3.启动ES服务 [root@elk81 ~]# systemctl enable --now elasticsearch.service Created symlink /etc/systemd/system/multi-user.target.wants/elasticsearch.service → /lib/systemd/system/elasticsearch.service. [root@elk81 ~]# [root@elk81 ~]# ss -ntl | egrep "9200|9300" LISTEN 0 4096 [::1]:9300 [::]:* LISTEN 0 4096 [::ffff:127.0.0.1]:9300 *:* LISTEN 0 4096 *:9200 *:* [root@elk81 ~]# 4.访问测试 [root@elk83 ~]# curl -k https://43.139.47.66:9200 -u "elastic:6xaDe2+oGoKDLXFgS6iJ" { "name" : "elk81", "cluster_name" : "elasticsearch", "cluster_uuid" : "rJA3htMjS6eMSMFzFj5oQQ", "version" : { "number" : "8.18.3", "build_flavor" : "default", "build_type" : "deb", "build_hash" : "28fc77664903e7de48ba5632e5d8bfeb5e3ed39c", "build_date" : "2025-06-18T22:08:41.171261054Z", "build_snapshot" : false, "lucene_version" : "9.12.1", "minimum_wire_compatibility_version" : "7.17.0", "minimum_index_compatibility_version" : "7.0.0" }, "tagline" : "You Know, for Search" } [root@elk83 ~]# [root@elk83 ~]# curl -k https://43.139.47.66:9200/_cat/nodes -u "elastic:6xaDe2+oGoKDLXFgS6iJ" 127.0.0.1 8 97 12 0.28 0.29 0.15 cdfhilmrstw * elk81 [root@elk83 ~]#

2、ES8重置管理员elastic密码

bash
1.重置密码 [root@elk81 ~]# /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic This tool will reset the password of the [elastic] user to an autogenerated value. The password will be printed in the console. Please confirm that you would like to continue [y/N]y # 手动输入字母'y' Password for the [elastic] user successfully reset. New value: 8guK_k29l_VeW*JXwHKg # 这是新密码 [root@elk81 ~]# 2.用旧密码将无法访问 [root@elk81 ~]# curl -k https://10.0.0.81:9200/_cat/nodes -u "elastic:j1NOIrCYY7Z52nZfDSL*" ; echo {"error":{"root_cause":[{"type":"security_exception","reason":"unable to authenticate user [elastic] for REST request [/_cat/nodes]","header":{"WWW-Authenticate":["Basic realm=\"security\", charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"unable to authenticate user [elastic] for REST request [/_cat/nodes]","header":{"WWW-Authenticate":["Basic realm=\"security\", charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401} [root@elk81 ~]# 3.使用新密码是可以正常访问的 [root@elk81 ~]# curl -k https://43.139.47.66:9200/_cat/nodes -u "elastic:DcaA1A*00YxCDcWaURyK" 127.0.0.1 10 95 1 0.06 0.13 0.11 cdfhilmrstw * elk81 [root@elk81 ~]# [root@elk81 ~]#curl -k https://43.139.47.66:9200 -u "elastic:DcaA1A*00YxCDcWaURyK" { "name" : "elk81", "cluster_name" : "elasticsearch", "cluster_uuid" : "rJA3htMjS6eMSMFzFj5oQQ", "version" : { "number" : "8.18.3", "build_flavor" : "default", "build_type" : "deb", "build_hash" : "28fc77664903e7de48ba5632e5d8bfeb5e3ed39c", "build_date" : "2025-06-18T22:08:41.171261054Z", "build_snapshot" : false, "lucene_version" : "9.12.1", "minimum_wire_compatibility_version" : "7.17.0", "minimum_index_compatibility_version" : "7.0.0" }, "tagline" : "You Know, for Search" }
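补充(可选):elasticsearch-reset-password也支持-i参数交互式输入自定义密码,不想要随机密码时可以用这种方式。

```bash
[root@elk81 ~]# /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic -i
```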

3、部署ES8集群

bash
参考链接: https://www.elastic.co/guide/en/elasticsearch/reference/8.18/deb.html # 1.拷贝软件包到其他节点 [root@elk81 ~]# scp elasticsearch-8.18.3-amd64.deb 106.55.44.37:~ [root@elk81 ~]# scp elasticsearch-8.18.3-amd64.deb 43.139.77.96:~ # 2.其他节点安装ES8软件包 [root@elk82 ~]# dpkg -i elasticsearch-8.18.3-amd64.deb [root@elk83 ~]# dpkg -i elasticsearch-8.18.3-amd64.deb # 3.修改ES集群的配置文件 [root@elk01 ~]# egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml path.data: /var/lib/elasticsearch path.logs: /var/log/elasticsearch network.host: 0.0.0.0 http.port: 9200 discovery.seed_hosts: ["43.139.47.66","106.55.44.37","43.139.77.96"] cluster.initial_master_nodes: ["43.139.47.66","106.55.44.37","43.139.77.96"] xpack.security.enabled: true xpack.security.enrollment.enabled: true xpack.security.http.ssl: enabled: true keystore.path: certs/http.p12 xpack.security.transport.ssl: enabled: true verification_mode: certificate keystore.path: certs/transport.p12 truststore.path: certs/transport.p12 http.host: 0.0.0.0 # 4.启动ES集群 [root@elk01 ~]#systemctl enable --now kibana.service # 5.生成token [root@elk81 ~]# /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTAuMC4wLjgxOjkyMDAiXSwiZmdyIjoiYTY5ZjhiMTI2ZmRkMDRkNDk4NWY0YWEyZDY3ZWY5YmZkOTA0ZWEyM2VkNDkzMmYwMWZhZjdjM2I3ZDUwNTg1MCIsImtleSI6IjkwME5wWmNCX1ZjNjZMMERRellQOlItNW1IZVNMbFNtaEx5ZXhJRUtNVEEifQ== # 6.新加入的节点使用81生成的token注册,ES要先启动 [root@elk82 ~]# /usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTAuMC4wLjgxOjkyMDAiXSwiZmdyIjoiYTY5ZjhiMTI2ZmRkMDRkNDk4NWY0YWEyZDY3ZWY5YmZkOTA0ZWEyM2VkNDkzMmYwMWZhZjdjM2I3ZDUwNTg1MCIsImtleSI6IjkwME5wWmNCX1ZjNjZMMERRellQOlItNW1IZVNMbFNtaEx5ZXhJRUtNVEEifQ== This node will be reconfigured to join an existing cluster, using the enrollment token that you provided. This operation will overwrite the existing configuration. Specifically: - Security auto configuration will be removed from elasticsearch.yml - The [certs] config directory will be removed - Security auto configuration related secure settings will be removed from the elasticsearch.keystore Do you want to continue with the reconfiguration process [y/N]y [root@elk83 ~]# /usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTAuMC4wLjgxOjkyMDAiXSwiZmdyIjoiYTY5ZjhiMTI2ZmRkMDRkNDk4NWY0YWEyZDY3ZWY5YmZkOTA0ZWEyM2VkNDkzMmYwMWZhZjdjM2I3ZDUwNTg1MCIsImtleSI6IjkwME5wWmNCX1ZjNjZMMERRellQOlItNW1IZVNMbFNtaEx5ZXhJRUtNVEEifQ== This node will be reconfigured to join an existing cluster, using the enrollment token that you provided. This operation will overwrite the existing configuration. 
Specifically: - Security auto configuration will be removed from elasticsearch.yml - The [certs] config directory will be removed - Security auto configuration related secure settings will be removed from the elasticsearch.keystore Do you want to continue with the reconfiguration process [y/N]y [root@elk83 ~]# # 7.拷贝主节点配置文件到别的节点。 [root@elk81 ~]# scp /etc/elasticsearch/elasticsearch.yml 10.0.0.82:/etc/elasticsearch [root@elk81 ~]# scp /etc/elasticsearch/elasticsearch.yml 10.0.0.83:/etc/elasticsearch # 8.检查集群的配置文件 [root@elk02 ~]# egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml path.data: /var/lib/elasticsearch path.logs: /var/log/elasticsearch network.host: 0.0.0.0 http.port: 9200 discovery.seed_hosts: ["43.139.47.66","106.55.44.37","43.139.77.96"] cluster.initial_master_nodes: ["43.139.47.66","106.55.44.37","43.139.77.96"] xpack.security.enabled: true xpack.security.enrollment.enabled: true xpack.security.http.ssl: enabled: true keystore.path: certs/http.p12 xpack.security.transport.ssl: enabled: true verification_mode: certificate keystore.path: certs/transport.p12 truststore.path: certs/transport.p12 http.host: 0.0.0.0 [root@elk82 ~]# [root@elk03 ~]# egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml path.data: /var/lib/elasticsearch path.logs: /var/log/elasticsearch network.host: 0.0.0.0 http.port: 9200 discovery.seed_hosts: ["43.139.47.66","106.55.44.37","43.139.77.96"] cluster.initial_master_nodes: ["43.139.47.66","106.55.44.37","43.139.77.96"] xpack.security.enabled: true xpack.security.enrollment.enabled: true xpack.security.http.ssl: enabled: true keystore.path: certs/http.p12 xpack.security.transport.ssl: enabled: true verification_mode: certificate keystore.path: certs/transport.p12 truststore.path: certs/transport.p12 http.host: 0.0.0.0 # 9.重启集群 [root@elk81 ~]# systemctl enable --now elasticsearch.service Created symlink /etc/systemd/system/multi-user.target.wants/elasticsearch.service → /lib/systemd/system/elasticsearch.service. [root@elk81 ~]# [root@elk81 ~]# ss -ntl | egrep "92|300" LISTEN 0 4096 *:9300 *:* LISTEN 0 4096 *:9200 *:* [root@elk81 ~]# [root@elk82 ~]# systemctl enable --now elasticsearch Created symlink /etc/systemd/system/multi-user.target.wants/elasticsearch.service → /lib/systemd/system/elasticsearch.service. [root@elk82 ~]# [root@elk82 ~]# ss -ntl | egrep "92|300" LISTEN 0 4096 *:9200 *:* LISTEN 0 4096 *:9300 *:* [root@elk82 ~]# [root@elk83 ~]# systemctl enable --now elasticsearch.service Created symlink /etc/systemd/system/multi-user.target.wants/elasticsearch.service → /lib/systemd/system/elasticsearch.service. [root@elk83 ~]# [root@elk83 ~]# ss -ntl | egrep "92|300" LISTEN 0 4096 *:9300 *:* LISTEN 0 4096 *:9200 *:* [root@elk83 ~]# #10.验证集群是否正常 [root@elk01 ~]# curl -k https://43.139.47.66:9200/_cat/nodes -u "elastic:DcaA1A*00YxCDcWaURyK" 10.1.20.5 45 97 24 0.28 0.24 0.13 cdfhilmrstw * elk01 10.1.24.13 42 97 25 0.38 0.23 0.09 cdfhilmrstw - elk02 10.1.24.4 7 97 23 0.48 0.37 0.17 cdfhilmrstw - elk03 温馨提示: - ES8采用该脚本生成令牌。 /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node - 被加入节点使用令牌加入,目的是为了拷贝配置文件。 /usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token <TOKEN> 彩蛋: 卸载ES服务 dpkg -P elasticsearch rm -f /var/lib/elasticsearch/*
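
下面补充一个最小化的集群状态检查示例(为本文补充的示意,假设沿用上文重置后的elastic密码,地址与密码请替换为各自环境的实际值):

bash
# 查看集群健康状态,green表示全部主分片和副本分片均已分配
curl -k -u "elastic:<你的密码>" "https://10.0.0.81:9200/_cluster/health?pretty"

# 查看节点列表,确认3个节点均已加入集群
curl -k -u "elastic:<你的密码>" "https://10.0.0.81:9200/_cat/nodes?v"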

4、部署kibana对接ES8集群

bash
# 1.下载kibana wget https://artifacts.elastic.co/downloads/kibana/kibana-8.18.3-amd64.deb svip: [root@elk81 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/ES8/8.18.3/kibana-8.18.3-amd64.deb # 2.安装kibana [root@elk81 ~]# dpkg -i kibana-8.18.3-amd64.deb # 3.修改kibana的配置文件 [root@elk81 ~]# vim /etc/kibana/kibana.yml ... server.host: "0.0.0.0" i18n.locale: "zh-CN" # 4.启动kibana [root@elk81 ~]# systemctl enable --now kibana.service Created symlink /etc/systemd/system/multi-user.target.wants/kibana.service → /lib/systemd/system/kibana.service. [root@elk81 ~]# [root@elk81 ~]# ss -ntl | grep 5601 LISTEN 0 511 0.0.0.0:5601 0.0.0.0:* [root@elk81 ~]# # 5.生成kiban专用的token [root@elk81 ~]# /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTAuMS4yMC41OjkyMDAiXSwiZmdyIjoiYmZmNWY1Y2Y2MDQ0MDQ5ZGUwNWQ1ZjA2MTM0OWZiNDkzZDc1MWI0ZWQxYzIxNDJlNTgyZDA3YmZkNTNmYzA3MSIsImtleSI6IjktN1VwWmNCYWFxM280bTZ1Y0p1OkMwenhPYWNlWGhGVnRCYkFiQ3ZCX0EifQ== # 6.访问kibana的webUI http://10.0.0.81:5601/,把令牌填入 基于token进行认证配置即可。 # 7.kiban服务器获取校验码 [root@elk81 ~]# /usr/share/kibana/bin/kibana-verification-code Your verification code is: 629 053 [root@elk81 ~]# # 8.更改密码登录 /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic

image

5、filebeat对接ES8实战

bash
参考链接: https://www.elastic.co/guide/en/beats/filebeat/8.18/configuration-ssl.html#server-verification-mode #1.下载filebeat wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.18.3-amd64.deb SVIP: [root@elk82 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/ES8/8.18.3/filebeat-8.18.3-amd64.deb #2.安装filebeat [root@elk82 ~]# dpkg -i filebeat-8.18.3-amd64.deb #3.创建api-key 略,见视频。 H-76pZcBaaq3o4m66sOY:mQfsXtGcNbfDt8Mt5W_yYA #4.编写filebeat的配置文件 [root@elk82 ~]# cat /etc/filebeat/01-tcp-to-es.yaml filebeat.inputs: - type: tcp host: "0.0.0.0:9000" output.elasticsearch: hosts: - https://43.139.47.66:9200 - https://106.55.44.37:9200 - https://43.139.77.96:9200 api_key: "H-76pZcBaaq3o4m66sOY:mQfsXtGcNbfDt8Mt5W_yYA" index: weixiang-weixiang98-es8-apikey-001 # 跳过客户端证书校验。 ssl.verification_mode: none setup.ilm.enabled: false setup.template.name: "weixiang-es" setup.template.pattern: "weixiang-es-*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 0 [root@elk82 ~]# #5.启动filebeat实例 [root@elk82 ~]# filebeat -e -c /etc/filebeat/01-tcp-to-es.yaml ... #6.发送测试数据 [root@elk83 ~]# echo 我叫小沈阳 | nc 43.139.47.66 9000 # 建立到10.0.0.92:9000的TCP连接,并将输入数据发送过去 #7.kibana查看数据

image

6、logstash对接ES8集群

bash
- # 1.下载logstash wget https://artifacts.elastic.co/downloads/logstash/logstash-8.18.3-amd64.deb SVIP: [root@elk83 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/ES8/8.18.3/logstash-8.18.3-amd64.deb # 2.安装logstash [root@elk83 ~]# dpkg -i logstash-8.18.3-amd64.deb # 3.创建符号链接 [root@elk83 ~]# ln -svf /usr/share/logstash/bin/logstash /usr/local/bin/ # 4.创建api-key 略,见视频。 建议基于图形化创建,后期可以考虑调研使用API。 JE0mppcB_Vc66L0Dujcs:8FFUdXTw6I4AFBH2Vce2Hg # 5.编写Logstash配置文件 [root@elk83 ~]# cat /etc/logstash/conf.d/01-tcp-to-es.yaml input { tcp { port => 8888 } } output { elasticsearch { hosts => ["https://43.139.47.66:9200","https://106.55.44.37:9200","https://43.139.77.96:9200"] index => "weixiang-weixiang98-es-apikey-xixi" api_key => "Ie5dppcBaaq3o4m6JcPY:hIEn5tfU4M1Ss0m0VRq8tA" ssl => true ssl_certificate_verification => false } } [root@elk83 ~]# # 6.启动Logstash [root@elk83 ~]# logstash -f /etc/logstash/conf.d/01-tcp-to-es.yaml # 7.发送测试数据 [root@elk81 ~]# echo "学IT来老男孩,月薪过万不是梦~" | nc 10.0.0.83 8888 # 建立到10.0.0.92:9000的TCP连接,并将输入数据发送过去 # 8.kibana查看数据 略,见视频。

image

7、ES8和ES7对比部署

bash
1.ES8默认启用了https,并默认开启认证等安全功能;
2.ES8新增'elasticsearch-reset-password'脚本,重置elastic用户密码更加简单;
3.ES8新增'elasticsearch-create-enrollment-token'脚本,可以为kibana、node等组件创建注册令牌;
4.kibana新增'kibana-verification-code'脚本,用于生成校验码;
5.kibana支持更多的语言:English (default) "en", Chinese "zh-CN", Japanese "ja-JP", French "fr-FR";
6.kibana的webUI更加丰富,支持AI助手、手动创建索引等功能;
7.kibana不用创建索引模式就可以直接查看数据,系统会生成一个临时的视图以供查看;
8.ES8集群部署时,需要借助'elasticsearch-reconfigure-node'脚本加入已存在的集群,默认就是单master节点的配置。
这些新增脚本的速查示例见下文。
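
下面是上述新增脚本的速查示例(路径取自本文前面的部署过程,均为deb包默认安装路径):

bash
# 重置elastic内置超级管理员密码
/usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic

# 为kibana生成注册令牌
/usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana

# 为新的ES节点生成注册令牌
/usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node

# 新节点凭令牌加入已有集群(加入前新节点需先安装好ES8)
/usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token <TOKEN>

# kibana服务器生成校验码
/usr/share/kibana/bin/kibana-verification-code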

9、zookeeper

1、zookeeper单点部署

bash
- zookeeper单点部署 1.什么是zookeeper ZooKeeper是一个集中式服务,用于维护配置信息、命名、提供分布式同步和提供组服务,常被用于分布式系统中的配置管理 主要的应用场景: Kafka,HDFS HA,YARN HA,HBase,Solr,... 官网地址: https://zookeeper.apache.org/ 2.下载zookeeper wget https://dlcdn.apache.org/zookeeper/zookeeper-3.8.4/apache-zookeeper-3.8.4-bin.tar.gz SVIP: [root@elk91 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/Zookeeper/apache-zookeeper-3.8.4-bin.tar.gz 3.解压软件包 [root@elk91 ~]# tar xf apache-zookeeper-3.8.4-bin.tar.gz -C /usr/local/ [root@elk91 ~]# 4.配置环境变量 [root@elk91 ~]# cat /etc/profile.d/zk.sh #!/bin/bash export ZK_HOME=/usr/local/apache-zookeeper-3.8.4-bin export JAVA_HOME=/usr/share/elasticsearch/jdk export PATH=$PATH:${ZK_HOME}/bin:${JAVA_HOME}/bin [root@elk91 ~]# [root@elk91 ~]# source /etc/profile.d/zk.sh [root@elk91 ~]# [root@elk91 ~]# java --version openjdk 22.0.2 2024-07-16 OpenJDK Runtime Environment (build 22.0.2+9-70) OpenJDK 64-Bit Server VM (build 22.0.2+9-70, mixed mode, sharing) [root@elk91 ~]# 5.准备配置文件 [root@elk91 ~]# cp /usr/local/apache-zookeeper-3.8.4-bin/conf/zoo{_sample,}.cfg [root@elk91 ~]# [root@elk91 ~]# ll /usr/local/apache-zookeeper-3.8.4-bin/conf/zoo* -rw-r--r-- 1 root root 1183 Jun 26 09:07 /usr/local/apache-zookeeper-3.8.4-bin/conf/zoo.cfg -rw-r--r-- 1 yinzhengjie yinzhengjie 1183 Feb 13 2024 /usr/local/apache-zookeeper-3.8.4-bin/conf/zoo_sample.cfg [root@elk91 ~]# 6.启动zookeeper服务 [root@elk91 ~]# zkServer.sh start ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Starting zookeeper ... STARTED [root@elk91 ~]# 7.查看zookeeper状态 [root@elk91 ~]# zkServer.sh status ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Client port found: 2181. Client address: localhost. Client SSL: false. Mode: standalone [root@elk91 ~]# 8.登录测试 [root@elk91 ~]# zkCli.sh -server 43.139.47.66:2181 ... [zk: 10.0.0.91:2181(CONNECTED) 0] ls / [zookeeper] [zk: 10.0.0.91:2181(CONNECTED) 1]

2、zookeeper的基本使用

bash
- zookeeper的基本使用 1.创建zookeeper node [zk: 10.0.0.91:2181(CONNECTED) 1] create /school weixiang Created /school [zk: 10.0.0.91:2181(CONNECTED) 2] 2.查看zookeeper node列表 [zk: 10.0.0.91:2181(CONNECTED) 2] ls / [school, zookeeper] [zk: 10.0.0.91:2181(CONNECTED) 3] 3.查看zookeeper node数据 [zk: 10.0.0.91:2181(CONNECTED) 3] get /school weixiang [zk: 10.0.0.91:2181(CONNECTED) 4] 4.修改zookeeper node数据 [zk: 10.0.0.91:2181(CONNECTED) 4] set /school laonanhai [zk: 10.0.0.91:2181(CONNECTED) 5] [zk: 10.0.0.91:2181(CONNECTED) 5] get /school laonanhai [zk: 10.0.0.91:2181(CONNECTED) 6] 5.删除zookeeper node [zk: 10.0.0.91:2181(CONNECTED) 6] delete /school [zk: 10.0.0.91:2181(CONNECTED) 7] [zk: 10.0.0.91:2181(CONNECTED) 7] ls / [zookeeper] [zk: 10.0.0.91:2181(CONNECTED) 8] 6.创建层级的zookeeper node [zk: 10.0.0.91:2181(CONNECTED) 10] create /school weixiang Created /school [zk: 10.0.0.91:2181(CONNECTED) 11] [zk: 10.0.0.91:2181(CONNECTED) 11] create /school/class weixiang98 Created /school/class [zk: 10.0.0.91:2181(CONNECTED) 12] [zk: 10.0.0.91:2181(CONNECTED) 12] create /school/address https://www.weixiang.com Created /school/address [zk: 10.0.0.91:2181(CONNECTED) 13] [zk: 10.0.0.91:2181(CONNECTED) 13] ls / [school, zookeeper] [zk: 10.0.0.91:2181(CONNECTED) 14] [zk: 10.0.0.91:2181(CONNECTED) 14] ls /school [address, class] [zk: 10.0.0.91:2181(CONNECTED) 15] [zk: 10.0.0.91:2181(CONNECTED) 15] get /school/address https://www.weixiang.com [zk: 10.0.0.91:2181(CONNECTED) 16] [zk: 10.0.0.91:2181(CONNECTED) 16] get /school/class weixiang98 [zk: 10.0.0.91:2181(CONNECTED) 17] 7.递归删除zookeeper node [zk: 10.0.0.91:2181(CONNECTED) 19] ls /school [address, class] [zk: 10.0.0.91:2181(CONNECTED) 20] [zk: 10.0.0.91:2181(CONNECTED) 20] delete /school/class [zk: 10.0.0.91:2181(CONNECTED) 21] [zk: 10.0.0.91:2181(CONNECTED) 21] ls /school [address] [zk: 10.0.0.91:2181(CONNECTED) 22] [zk: 10.0.0.91:2181(CONNECTED) 22] delete /school Node not empty: /school [zk: 10.0.0.91:2181(CONNECTED) 23] [zk: 10.0.0.91:2181(CONNECTED) 23] ls /school [address] [zk: 10.0.0.91:2181(CONNECTED) 24] [zk: 10.0.0.91:2181(CONNECTED) 24] deleteall /school [zk: 10.0.0.91:2181(CONNECTED) 25] [zk: 10.0.0.91:2181(CONNECTED) 25] ls /school Node does not exist: /school [zk: 10.0.0.91:2181(CONNECTED) 26] [zk: 10.0.0.91:2181(CONNECTED) 26] ls / [zookeeper] [zk: 10.0.0.91:2181(CONNECTED) 27]
命令      | 功能         | 示例
create    | 创建节点     | create /app "config_data"
get       | 获取节点数据 | get /app
set       | 更新节点数据 | set /app "new_config"
ls        | 列出子节点   | ls /
delete    | 删除节点     | delete /app
deleteall | 递归删除     | deleteall /app
stat      | 查看节点状态 | stat /app

3、zookeeper的集群部署

bash
#1.集群模式节点数量选择: 参考链接: https://zookeeper.apache.org/doc/current/zookeeperOver.html #2.实战案例 2.1 停止单点服务 [root@elk91 ~]# zkServer.sh stop ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Stopping zookeeper ... STOPPED [root@elk91 ~]# [root@elk91 ~]# zkServer.sh status ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Client port found: 2181. Client address: localhost. Client SSL: false. Error contacting service. It is probably not running. [root@elk91 ~]# 2.2 修改zookeeper的配置文件 [root@elk91 ~]# egrep -v "^#|^$" /usr/local/apache-zookeeper-3.8.4-bin/conf/zoo.cfg tickTime=2000 initLimit=10 syncLimit=5 dataDir=/var/lib/zookeeper clientPort=2181 server.11=10.1.20.5:2888:3888 server.22=10.1.24.13:2888:3888 server.33=10.1.24.4:2888:3888 [root@elk91 ~]# 2.3.将程序和环境变量文件同步到集群的其他节点 [root@elk91 ~]# scp -r /usr/local/apache-zookeeper-3.8.4-bin/ 106.55.44.37:/usr/local [root@elk91 ~]# scp -r /usr/local/apache-zookeeper-3.8.4-bin/ 43.139.77.96:/usr/local [root@elk91 ~]# scp /etc/profile.d/zk.sh 106.55.44.37:/etc/profile.d/ [root@elk91 ~]# scp /etc/profile.d/zk.sh 43.139.77.96:/etc/profile.d/ 2.4.所有节点准备数据目录【此处的myid要和zk集群的配置文件要匹配!!!】 [root@elk91 ~]# mkdir /var/lib/zookeeper && echo 11 > /var/lib/zookeeper/myid [root@elk92 ~]# mkdir /var/lib/zookeeper && echo 22 > /var/lib/zookeeper/myid [root@elk92 ~]# mkdir /var/lib/zookeeper && echo 33 > /var/lib/zookeeper/myid 2.5.启动zookeeper集群 [root@elk91 ~]# source /etc/profile.d/zk.sh && zkServer.sh start [root@elk92 ~]# source /etc/profile.d/zk.sh && zkServer.sh start [root@elk93 ~]# source /etc/profile.d/zk.sh && zkServer.sh start 2.6.检查集群的状态 [root@elk91 ~]# zkServer.sh status ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Client port found: 2181. Client address: localhost. Client SSL: false. Mode: follower [root@elk91 ~]# [root@elk91 ~]# ss -ntl | egrep "2181|2888|3888" LISTEN 0 50 *:2181 *:* LISTEN 0 50 [::ffff:10.0.0.91]:3888 *:* [root@elk91 ~]# [root@elk92 ~]# zkServer.sh status ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Client port found: 2181. Client address: localhost. Client SSL: false. Mode: leader [root@elk92 ~]# [root@elk92 ~]# ss -ntl | egrep "2181|2888|3888" LISTEN 0 50 [::ffff:10.0.0.92]:3888 *:* LISTEN 0 50 *:2181 *:* [root@elk92 ~]# [root@elk93 ~]# zkServer.sh status ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Client port found: 2181. Client address: localhost. Client SSL: false. Mode: follower [root@elk93 ~]# [root@elk93 ~]# ss -ntl | egrep "2181|2888|3888" LISTEN 0 50 [::ffff:10.0.0.93]:2888 *:* LISTEN 0 50 [::ffff:10.0.0.93]:3888 *:* LISTEN 0 50 *:2181 *:* [root@elk93 ~]# 2.7 客户端链接测试 [root@elk91 ~]# zkCli.sh -server 10.1.20.5:2181,10.1.24.13:2181,10.1.24.4:2181 ... [zk: 10.0.0.91:2181,10.0.0.92:2181,10.0.0.93:2181(CONNECTED) 0] ls / [zookeeper] [zk: 10.0.0.91:2181,10.0.0.92:2181,10.0.0.93:2181(CONNECTED) 1] [zk: 10.0.0.91:2181,10.0.0.92:2181,10.0.0.93:2181(CONNECTED) 1] create /school weixiang Created /school [zk: 10.0.0.91:2181,10.0.0.92:2181,10.0.0.93:2181(CONNECTED) 2] [zk: 10.0.0.91:2181,10.0.0.92:2181,10.0.0.93:2181(CONNECTED) 2] ls / [school, zookeeper] [zk: 10.0.0.91:2181,10.0.0.92:2181,10.0.0.93:2181(CONNECTED) 3] #3. 验证zookeeper集群高可用 3.1 开启两个终端指向不同的节点分别写入数据测试 [root@elk91 ~]# zkCli.sh -server 10.1.24.13:2181 ... 
[root@elk92 ~]# zkCli.sh -server 10.1.24.4:2181 ... 3.2 停止leader节点观察是否可用 [root@elk91 ~]# zkCli.sh -server 10.0.0.91:2181,10.0.0.92:2181,10.0.0.93:2181 ... 3.3 停止服务观察终端是否可用 [root@elk93 ~]# zkServer.sh start ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Client port found: 2181. Client address: localhost. Client SSL: false. Mode: leader [root@elk93 ~]# ss -ntl | egrep "2181|2888|3888" LISTEN 0 50 [::ffff:10.0.0.93]:2888 *:* LISTEN 0 50 [::ffff:10.0.0.93]:3888 *:* LISTEN 0 50 *:2181 *:* [root@elk93 ~]# [root@elk93 ~]# zkServer.sh stop ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Stopping zookeeper ... STOPPED [root@elk93 ~]# [root@elk93 ~]# ss -ntl | egrep "2181|2888|3888" [root@elk92 ~]# zkServer.sh status ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Client port found: 2181. Client address: localhost. Client SSL: false. Mode: leader [root@elk92 ~]# ss -ntl | egrep "2181|2888|3888" LISTEN 0 50 [::ffff:10.0.0.92]:3888 *:* LISTEN 0 50 *:2181 *:* LISTEN 0 50 [::ffff:10.0.0.92]:2888 *:* [root@elk92 ~]# [root@elk91 ~]# zkServer.sh status ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Client port found: 2181. Client address: localhost. Client SSL: false. Mode: follower [root@elk91 ~]# [root@elk91 ~]# ss -ntl | egrep "2181|2888|3888" LISTEN 0 50 *:2181 *:* LISTEN 0 50 [::ffff:10.0.0.91]:3888 *:* [root@elk91 ~]# 4.zookeeper集群要保证半数以上存活机制 一个集群如果想要容忍N台故障,则要求有2N+1台服务器,说白了,是奇数台服务器。 举例子 - 如果集群容忍挂掉1个节点,则至少要准备2*1 + 1 = 3台服务器。 - 如果集群容忍挂掉2个节点,则至少要准备2*2 + 1 = 5台服务器。 5.测试验证故障容忍节点数量 略,见视频。 # 面试题: zookeeper的leader选举流程 1、启动集群是默认都认为自己是leader,进行leader选举,选举时各节点要暴露出自己的:czxid、 myid, 2、先比较czxid事务ID,越大说明操作就越多,就优先被选举为leader 3、如果czxid比不出来,则比较myid,myid越大则优先成为leader 4、当zookeeper集群半数以上节点参与选举完成,则leader就确认下来了。
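
补充一个快速查看各节点角色、验证选举结果的示例(假设集群已按上文部署完成;四字命令中除srvr外默认不放行,如需stat、mntr等需在zoo.cfg中配置4lw.commands.whitelist):

bash
# 在每个节点上查看自身角色(Mode: leader 或 follower)
zkServer.sh status

# 通过四字命令srvr查看本机节点的角色与连接信息(srvr默认在白名单中)
echo srvr | nc 127.0.0.1 2181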

4、zookeeper的watch机制及znode类型

bash
# 1.终端1监听事件 [root@elk92 ~]# zkCli.sh -server 10.0.0.93:2181 ... [zk: 10.0.0.93:2181(CONNECTED) 1] ls /school [haha, xixi] [zk: 10.0.0.93:2181(CONNECTED) 2] [zk: 10.0.0.93:2181(CONNECTED) 2] ls -w /school # 此处的w选项就会监听/school节点是否有子节点变化。 [haha, xixi] [zk: 10.0.0.93:2181(CONNECTED) 3] [zk: 10.0.0.93:2181(CONNECTED) 3] # 2.终端2触发事件 [root@elk93 ~]# zkCli.sh -server 10.0.0.91:2181 ... [zk: 10.0.0.91:2181(CONNECTED) 4] ls /school [haha, xixi] [zk: 10.0.0.91:2181(CONNECTED) 5] [zk: 10.0.0.91:2181(CONNECTED) 5] create /school/hehe Created /school/hehe [zk: 10.0.0.91:2181(CONNECTED) 6] [zk: 10.0.0.91:2181(CONNECTED) 6] ls /school [haha, hehe, xixi] [zk: 10.0.0.91:2181(CONNECTED) 7] # 3.观察终端1出现结果 WATCHER:: WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/school # 4.类似watch还支持监控数据变化 [zk: 10.0.0.93:2181(CONNECTED) 4] get -w /school laonanhai [zk: 10.0.0.93:2181(CONNECTED) 5] [zk: 10.0.0.93:2181(CONNECTED) 5] WATCHER:: WatchedEvent state:SyncConnected type:NodeDataChanged path:/school # 5.zookeeper的类型 - 临时的zookeeper node: 当链接断开,zookeeper node在一定时间范围(30s)内自动删除。 - 永久的zookeeper node: 当链接断开后,zookeeper node数据并不丢失。 默认情况下,我们创建的 都是永久的zookeeper node。 [root@elk92 ~]# zkCli.sh -server 10.1.20.5:2181 Connecting to 10.0.0.93:2181 .... 2025-06-26 11:24:25,762 [myid:10.0.0.93:2181] - INFO [main-SendThread(10.0.0.93:2181):o.a.z.ClientCnxn$SendThread@1453] - Session establishment complete on server 10.0.0.93/10.0.0.93:2181, session id = 0x2100009cac2d0000, negotiated timeout = 30000 [zk: 10.0.0.93:2181(CONNECTED) 7] create -e /office Created /office [zk: 10.0.0.93:2181(CONNECTED) 8] [zk: 10.0.0.93:2181(CONNECTED) 8] get /office null [zk: 10.0.0.93:2181(CONNECTED) 9] [zk: 10.0.0.93:2181(CONNECTED) 9] set /office https://www.weixiang.com [zk: 10.0.0.93:2181(CONNECTED) 10] [zk: 10.0.0.93:2181(CONNECTED) 10] get /office https://www.weixiang.com [zk: 10.0.0.93:2181(CONNECTED) 11] [zk: 10.0.0.93:2181(CONNECTED) 11] ls / [class, office, school, zookeeper] [zk: 10.0.0.93:2181(CONNECTED) 12] [zk: 10.0.0.93:2181(CONNECTED) 12] stat /office cZxid = 0x40000000a ctime = Thu Jun 26 11:36:48 CST 2025 mZxid = 0x40000000b mtime = Thu Jun 26 11:37:04 CST 2025 pZxid = 0x40000000a cversion = 0 dataVersion = 1 aclVersion = 0 ephemeralOwner = 0x2100009cac2d0000 # 当前连接的session 会话ID。当当前会话断开30s(negotiated timeout)后会自动删除。 dataLength = 25 numChildren = 0 [zk: 10.0.0.93:2181(CONNECTED) 13] [zk: 10.0.0.93:2181(CONNECTED) 13] stat /school cZxid = 0x100000002 ctime = Thu Jun 26 10:30:06 CST 2025 mZxid = 0x400000009 mtime = Thu Jun 26 11:30:48 CST 2025 pZxid = 0x400000008 cversion = 6 dataVersion = 2 aclVersion = 0 ephemeralOwner = 0x0 # 如果为0,则表示是永久的zookeeper node。 dataLength = 17 numChildren = 2 [zk: 10.0.0.93:2181(CONNECTED) 14]

image

5、zookeeper的JVM调优

bash
# 1.查看默认的堆内存 [root@elk92 ~]# ps -ef | grep zookeeper | grep Xmx root 35492 1 0 10:23 pts/0 00:00:03 /usr/share/elasticsearch/jdk/bin/java ... -Xmx1000m ... # 2.修改默认的堆内存大小 [root@elk91 ~]# vim /usr/local/apache-zookeeper-3.8.4-bin/bin/zkEnv.sh ... # ZK_SERVER_HEAP="${ZK_SERVER_HEAP:-1000}" ZK_SERVER_HEAP="${ZK_SERVER_HEAP:-128}" ... # ZK_CLIENT_HEAP="${ZK_CLIENT_HEAP:-256}" ZK_CLIENT_HEAP="${ZK_CLIENT_HEAP:-128}" # 3.将修改的配置文件同步到集群的其他节点 [root@elk91 ~]# scp /usr/local/apache-zookeeper-3.8.4-bin/bin/zkEnv.sh 10.0.0.92:/usr/local/apache-zookeeper-3.8.4-bin/bin/ [root@elk91 ~]# scp /usr/local/apache-zookeeper-3.8.4-bin/bin/zkEnv.sh 10.0.0.93:/usr/local/apache-zookeeper-3.8.4-bin/bin/ # 4.滚动更新所有节点 [root@elk91 ~]# zkServer.sh restart # 5.再次检查JVM修改是否生效 [root@elk92 ~]# ps -ef | grep zookeeper | grep Xmx root 35492 1 0 10:23 pts/0 00:00:03 /usr/share/elasticsearch/jdk/bin/java ... -Xmx128m ... 温馨提示: 生产环境建议大家设置2GB+,4GB即可。

6、zookeeper图形化管理工具(扩展)

bash
#1.部署JDK [root@elk91 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/Zookeeper/jdk-8u291-linux-x64.tar.gz #2.解压软件包 [root@elk91 ~]# tar xf jdk-8u291-linux-x64.tar.gz -C /usr/local/ #3.下载zkWeb程序包 [root@elk91 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/Zookeeper/zkWeb-v1.2.1.jar #4.运行jar包 [root@elk91 ~]# nohup /usr/local/jdk1.8.0_291/bin/java -jar zkWeb-v1.2.1.jar &>> /tmp/zkweb.log & [1] 26055 [root@elk91 ~]# [root@elk91 ~]# tail -100f /tmp/zkweb.log nohup: ignoring input 11:02:02.851 [main] INFO com.yasenagat.zkweb.ZkWebSpringBootApplication - applicationYamlFileName(application-zkweb.yaml)=file:/root/zkWeb-v1.2.1.jar!/BOOT-INF/classes!/application-zkweb.yaml . ____ _ __ _ _ /\\ / ____ __ _ _(_)_ __ __ _ \ \ \ \ ( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \ \\/ ___)| |_)| | | | | || (_| | ) ) ) ) ' |____| .__|_| |_|_| |_\__, | / / / / =========|_|==============|___/=/_/_/_/ :: Spring Boot :: (v2.0.2.RELEASE) ... [2025-06-26 14:36:21 INFO main TomcatWebServer.java:206] o.s.b.w.e.tomcat.TomcatWebServer --> Tomcat started on port(s): 8099 (http) with context path '' #5.访问zkWeb进行界面管理 http://43.139.47.66:8099/#

image

image

10、kafka

bash
# 消息队列: 其应用场景就是用来缓存数据,多用于高并发,数据量较大的场景使用。 场景的MQ有: ActiveMQ,RocketMQ,RabbitMQ,Kafka等。 消息队列带来的优势: # 1、削峰填谷 场景:系统经常面临突发性、不可预测的流量高峰,高峰流量瞬间涌入导致集群压力大,服务不可用,CPU、内存、数据库连接池被占满 解决办法: 加入消息队列位于生产者和消费者之间,当流量洪峰到来时,生产者(前端应用、API网关等)快速将请求/任务转化为消息,发送到消息队列 中,消息队列将这些消息按顺序存储起来。队列的存储能力和高吞吐写入能力“削平”了洪峰,保护了脆弱的后台系统。 后台消费者服务按照自己稳定、可控的处理速度,持续不断地从消息队列中拉取消息进行处理。在流量低谷期,消费者依然可以处理队列中积 压的消息,或者保持待命状态,充分利用资源,避免了资源闲置浪费 # 2、异步提速 场景:一个用户请求需要触发多个操作,其中一些操作是非核心、耗时长或非必须实时完成的,用户必须等到所有操作都完成后才能得到响应,影响效率 解决办法: 主线程(生产者)只负责处理核心业务逻辑(如验证、扣款、创建订单记录),并快速将需要异步执行的任务封装成消息,发送到消息队列。 消息发送成功后,主线程立即返回结果给用户(如显示“下单成功”页面),用户感知到的响应时间大大缩短。 专门的后台消费者服务监听队列,独立、异步地拉取消息并执行那些耗时或非核心的操作(发短信、更新推荐、同步库存等)。这些操作的执 行时间和主流程完全解耦 # 3、架构解耦 场景:在复杂的系统中,不同服务或模块之间需要通信协作。耦合性太强, 调用方必须知道被调用方(Consumer)的网络地址、接口定义、调用方式 服务挂掉、升级、响应慢都会导致任务失败或阻塞。 解决办法: 服务之间不直接通信,它们都只和消息队列打交道,只需要知道消息队列的地址和如何连接,不关心谁消费、有多少消费者、消费者在哪里、 消费者当前是否可用、消费者处理得快慢,提高系统灵活性、可靠性和可维护性。

1、单点部署实战

bash
# 1.什么是kakfa kafka是一款开源的消息队列(可以理解为shell的数组)。 kafka可以理解为生活中的菜鸟驿站,你和快递员,就相当于程序。菜鸟驿站就是MQ。 能够替代菜鸟驿站的产品也有很多,比如:蜂巢,快递站等。 说白了,kafka也可以被替代的,比如RabbitMQ,RocketMQ,... 官网地址: https://kafka.apache.org/ 温馨提示: KAFKA 2.8+可以不使用zookeeper存储元数据信息了,而在kafka 4.0.0版本中完全不依赖于zookeeper。 https://kafka.apache.org/blog#apache_kafka_400_release_announcement # 2.下载kafka wget https://dlcdn.apache.org/kafka/3.9.1/kafka_2.13-3.9.1.tgz SVIP: [root@elk91 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/kafka/kafka_2.13-3.9.1.tgz # 3.解压软件包 [root@elk91 ~]# tar xf kafka_2.13-3.9.1.tgz -C /usr/local/ [root@elk91 ~]# [root@elk91 ~]# ll /usr/local/kafka_2.13-3.9.1/ total 80 drwxr-xr-x 7 root root 4096 May 12 12:06 ./ drwxr-xr-x 13 root root 4096 Jun 26 14:56 ../ drwxr-xr-x 3 root root 4096 May 12 12:06 bin/ drwxr-xr-x 3 root root 4096 May 12 12:06 config/ drwxr-xr-x 2 root root 12288 Jun 26 14:56 libs/ -rw-r--r-- 1 root root 15243 May 12 12:02 LICENSE drwxr-xr-x 2 root root 4096 May 12 12:06 licenses/ -rw-r--r-- 1 root root 28359 May 12 12:02 NOTICE drwxr-xr-x 2 root root 4096 May 12 12:06 site-docs/ [root@elk91 ~]# # 4.配置环境变量 [root@elk91 ~]# cat /etc/profile.d/kafka.sh #!/bin/bash export KAFKA_HOME=/usr/local/kafka_2.13-3.9.1 export PATH=$PATH:$KAFKA_HOME/bin [root@elk91 ~]# [root@elk91 ~]# [root@elk91 ~]# source /etc/profile.d/kafka.sh [root@elk91 ~]# # 5.修改kafka的配置文件 [root@elk91 ~]# vim /usr/local/kafka_2.13-3.9.1/config/server.properties ... # 唯一标识broker节点 broker.id=91 # 当前节点监听服务的IP地址和端口号 advertised.listeners=PLAINTEXT://10.1.20.5:9092 # 指定kafka数据目录 log.dirs=/var/lib/kafka # 链接zookeeper集群的地址 zookeeper.connect=10.1.20.5:2181,10.1.24.13:2181,10.1.24.4:2181/kafka-v3.9.1 # 6.启动kafka单点 [root@elk91 ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties [root@elk91 ~]# [root@elk91 ~]# ss -ntl | grep 9092 LISTEN 0 50 *:9092 *:* [root@elk91 ~]# # 7.验证zkWeb对应的数据是否生成 略,见视频。 # 8.停止服务观察zk WebUI [root@elk91 ~]# kafka-server-stop.sh [root@elk91 ~]# [root@elk91 ~]# ss -ntl | grep 9092 [root@elk91 ~]#
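
步骤7中zkWeb的验证也可以直接用zkCli在命令行完成,下面是一个示例(znode路径取决于上文zookeeper.connect中配置的chroot,即/kafka-v3.9.1;broker id对应上文的91):

bash
# 连接zookeeper集群
zkCli.sh -server 10.1.20.5:2181

# 查看kafka注册的broker列表
ls /kafka-v3.9.1/brokers/ids

# 查看指定broker的注册信息(监听地址、端口等)
get /kafka-v3.9.1/brokers/ids/91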

image

2、kafka集群部署

bash
#1.拷贝程序到其他节点 [root@elk91 ~]# scp -r /usr/local/kafka_2.13-3.9.1/ 10.1.24.13:/usr/local [root@elk91 ~]# scp -r /usr/local/kafka_2.13-3.9.1/ 10.1.12.4:/usr/local [root@elk91 ~]# scp /etc/profile.d/kafka.sh 10.1.24.13:/etc/profile.d/ [root@elk91 ~]# scp /etc/profile.d/kafka.sh 10.1.12.4:/etc/profile.d/ #2.其他节点修改配置文件 [root@elk92 ~]# source /etc/profile.d/kafka.sh [root@elk92 ~]# [root@elk92 ~]# vim ${KAFKA_HOME}/config/server.properties ... broker.id=92 advertised.listeners=PLAINTEXT://10.1.24.13:9092 [root@elk93 ~]# source /etc/profile.d/kafka.sh ; vim ${KAFKA_HOME}/config/server.properties ... broker.id=93 advertised.listeners=PLAINTEXT://10.1.24.4:9092 #3.启动kafka服务 [root@elk92 ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties [root@elk92 ~]# ss -ntl | grep 9092 LISTEN 0 50 *:9092 *:* [root@elk92 ~]# [root@elk93 ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties [root@elk93 ~]# ss -ntl | grep 9092 LISTEN 0 50 *:9092 *:* [root@elk93 ~]# #4.zk WebUI验证 略,见视频。 10.1.20.5 10.1.24.13 10.1.24.4

3、kafka常用术语

image

bash
- kafka常用的术语
  - broker: 表示kafka集群的每一个节点。
  - kafka cluster: 也称为kafka集群,还有一个名称"broker list"。
  - producer: 生产者,表示向kafka集群写入数据的一方。
  - consumer: 消费者,表示从kafka集群读取数据的一方。
  - topic: 主题,是producer和consumer进行数据读写的逻辑单元,可以类比为ES集群的索引。
  - partition: 分区,一个topic最少要有1个或多个分区。
  - offset: 位移,分区内的每条消息都对应一个位置点,消费者消费到哪条数据由offset记录。说白了,就是标识每一条消息在分区内的位置。
  - replica: 副本,每个分区最少有1个或多个副本。当副本数量大于1时,就会区分leader和follower副本。对于消费者而言,只能看到leader副本,follower副本仅对leader副本进行数据备份。
Kafka 术语 | Elasticsearch 术语 | 相似点 | 关键差异点
Broker | Node | 集群的基本工作单元/节点。 | Kafka Broker 主要负责消息传递;ES Node 负责数据存储、索引和搜索。
Kafka Cluster | ES Cluster | 多个节点组成的集合,共同工作。 | 目标不同:消息流处理 vs 数据检索。
Producer | (Indexing Client) | 向系统写入数据的客户端。 | ES 写入通常称为“索引文档”。
Consumer | (Search Client) | 从系统读取数据的客户端。 | ES 读取主要是搜索查询;Kafka 是顺序/实时消费消息流。
Topic | Index | 数据的逻辑容器/分类单元,生产者和消费者操作的核心入口点。核心相似点! | Topic 是消息流,本身不存储最终状态,只传递消息;Index 是文档集合,是持久化存储数据的最终目的地。
Partition | Shard (Primary) | 数据水平拆分的基本单位,是实现并行处理、扩展性和负载均衡的关键。核心相似点! | Kafka Partition 是消息流的有序子集;ES Shard 是索引的子集(Lucene 索引)。Topic 的 Partition 数影响并行消费能力;Index 的 Shard 数影响并行读写/搜索能力。
Offset | Document _id | 唯一标识分区/分片内一条数据的机制。核心相似点(定位数据)! | Offset 是分区内严格递增的序列号(位置),标识消息顺序;_id 是文档在分片内的唯一标识符(可以是任意值,不隐含顺序)。
Replica | Replica Shard | 数据的冗余拷贝,提供高可用性和容错能力。核心相似点! | 都用于故障转移。
Leader Replica | Primary Shard | 处理读写请求的主副本。 | Kafka Consumer 从 Leader Replica 读取;ES 读写请求可以发送给任何包含相关分片的节点,但最终由 Primary Shard 协调索引操作。
Follower Replica | Replica Shard | 异步复制 Leader 的数据,作为热备份;Leader 失效时提升为新 Leader。 | Kafka Follower Replica 默认不服务 Consumer 读请求;ES Replica Shard 可以服务读请求(搜索)。

4、启动生产者和消费者验证kafka集群

bash
- 启动生产者和消费者验证kafka集群是否正常工作 # 1.查看topic列表【默认集群是空的】 [root@elk91 ~]# kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --list kafka-topics.sh --bootstrap-server 10.0.0.101:30667--list # kafka-topics.sh:查看组 # 2.写入数据【首次写入时会提示WARN,说白了,会自动创建topic,创建了weixiang-weixiang98主题】 [root@elk91 ~]# kafka-console-producer.sh --bootstrap-server 10.1.20.5:9092 --topic weixiang-weixiang98 >1111111111111111111111 [2025-06-26 16:26:33,925] WARN [Producer clientId=console-producer] The metadata response from the cluster reported a recoverable issue with correlation id 6 : {weixiang-weixiang98=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient) >22222222222222222222222 >33333333333333333333333333333 # kafka-console-producer.sh:Kafka自带的命令行生产者工具 # --bootstrap-server 10.1.20.5:9092:指定Kafka集群的连接点 # --topic weixiang-weixiang98:指定消息要发送到的目标主题为weixiang-weixiang98 # 3.查看topic主题 [root@elk91 ~]# kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --list weixiang-weixiang98 [root@elk91 ~]# # 4.从kafka读取数据 [root@elk91 ~]# kafka-console-consumer.sh --bootstrap-server 10.1.24.4:9092 --topic weixiang-weixiang98 --from-beginning 1111111111111111111111 22222222222222222222222 33333333333333333333333333333 # Kafka自带的命令行生产者工具:Kafka自带的命令行消费者工具 # --bootstrap-server 10.1.24.4:9092: 指定Kafka集群的连接点 # --topic weixiang-weixiang98:指定消息要发送到的目标主题为weixiang-weixiang98 # --from-beginning:表示从头拿

image

5、Kafka和ZooKeeper的关系

image

bash
ZooKeeper 是一个分布式协调服务,常用于管理配置、命名和同步服务。 长期以来,Kafka使用ZooKeeper负责管理集群元数据、控制器选举和消费者组协调等任务理,包括主题、分区信息、ACL(访问控制列表)等。 ZooKeeper为Kafka提供了选主(leader election)、集群成员管理等核心功能,为Kafka提供了一个可靠的分布式协调服务,使得Kafka能够在多个节点之间进行有效的通信和管理。 然而,随着 Kafka的发展,其对ZooKeeper的依赖逐渐显露出一些问题,这些问题也是下面Kafka去除 Zookeeper的原因。 - kafka 2.8+ 为什么要弃用zookeeper组件呢?[kafka 4.0+版本彻底移除了zookeeper组件] 1.复杂性增加 ZooKeeper 是独立于 Kafka 的外部组件,需要单独部署和维护,因此,使用 ZooKeeper 使得 Kafka的运维复杂度大幅提升。 运维团队必须同时管理两个分布式系统(Kafka和 ZooKeeper),这不仅增加了管理成本,也要求运维人员具备更高的技术能力。 2. 性能瓶颈 作为一个协调服务,ZooKeeper 并非专门为高负载场景设计,因此,随着集群规模扩大,ZooKeeper在处理元数据时的性能问题日益突出。 例如,当分区数量增加时,ZooKeeper需要存储更多的信息,这导致了监听延迟增加,从而影响Kafka的整体性能。 在高负载情况下,ZooKeeper可能成为系统的瓶颈,限制了Kafka的扩展能力。 3. 一致性问题 Kafka 内部的分布式一致性模型与 ZooKeeper 的一致性模型有所不同。由于 ZooKeeper和 Kafka控制器之间的数据同步机制不够高效,可能导致状态不一致,特别是在处理集群扩展或不可用情景时,这种不一致性会影响消息传递的可靠性和系统稳定性。 4.发展自己的生态 Kafka 抛弃 ZooKeeper,我个人觉得最核心的原因是:Kafka生态强大了,需要自立门户,这样就不会被别人卡脖子。 纵观国内外,有很多这样鲜活的例子,当自己弱小时,会先选择使用别家的产品,当自己羽翼丰满时,再选择自建完善自己的生态圈。

6、KRaft

bash
5- KAFKA 2.8+引入Kraft模式抛弃ZooKeeper kafka2.8.0版本引入了基于Raft共识协议的新特性,它允许kafka集群在没有ZooKeeper的情况下运行。 为了剥离和去除ZooKeeper,Kafka引入了自己的亲儿子KRaft(Kafka Raft Metadata Mode)。 KRaft是一个新的元数据管理架构,基于Raft一致性算法实现的一种内置元数据管理方式,旨在替代ZooKeeper的元数据管理功能。 KRaft的优势有以下几点: 简化部署: Kafka集群不再依赖外部的ZooKeeper集群,简化了部署和运维的复杂性。 KRaft将所有协调服务嵌入Kafka自身,不再依赖外部系统,这样大大简化了部署和管理,因为管理员只需关注 Kafka 集群。 高效的一致性协议: Raft是一种简洁且易于理解的一致性算法,易于调试和实现。KRaft 利用Raft协议实现了强一致性的元数据管理,优化了复制机制。 提高性能: 由于元数据管理不再依赖ZooKeeper,Kafka集群的性能得到了提升,尤其是在元数据读写方面。 增强可扩展性: KRaft 模式支持更大的集群规模,可以有效地扩展到数百万个分区。 提高元数据操作的扩展性:新的架构允许更多的并发操作,并减少了因为扩展性问题导致的瓶颈,特别是在高负载场景中。 更快的控制器故障转移: 控制器(Controller)的选举和故障转移速度更快,提高了集群的稳定性。 消除 ZooKeeper 作为中间层之后,Kafka 的延迟性能有望得到改善,特别是在涉及选主和元数据更新的场景中。 KRaft模式下,kafka集群中的一些节点被指定为控制器(Controller),它们负责集群的元数据管理和共识服务,所有的元数据都存储在kafka内部的主题中, 而不是ZooKeeper,控制器通过KRaft协议来确保元数据在集群中的准确复制,这种模式使用了基于时间的存储模型,通过定期快照来保证元数据日志不会无限增长。 完全自主: 因为是自家产品,所以产品的架构设计,代码开发都可以自己说了算,未来架构走向完全控制在自己手上。 控制器(Controller)节点的去中心化: KRaft 模式中,控制器节点由一组 Kafka 服务进程代替,而不是一个独立的 ZooKeeper 集群。 这些节点共同负责管理集群的元数据,通过 Raft 实现数据的一致性。 日志复制和恢复机制: 利用 Raft 的日志复制和状态机应用机制,KRaft 实现了对元数据变更的强一致性支持,这意味着所有控制器节点都能够就集群状态达成共识。 动态集群管理: KRaft允许动态地向集群中添加或移除节点,而无需手动去ZooKeeper中更新配置,这使得集群管理更为便捷。 - kafka基于KRaft工作模式集群部署实战 1.停止zookeeper集群 【目的是防止大家以为kafka对zookeeper集群的依赖!!!】 [root@elk91 ~]# zkServer.sh stop [root@elk92 ~]# zkServer.sh stop [root@elk93 ~]# zkServer.sh stop 2.修改kafka的配置文件 [root@elk91 ~]# cp /usr/local/kafka_2.13-3.9.1/config/server.properties{,-kraft} [root@elk91 ~]# vim /usr/local/kafka_2.13-3.9.1/config/server.properties-kraft # 指的borker的ID broker.id=66 # 监听地址 advertised.listeners=PLAINTEXT://10.1.20.5:9092 num.network.threads=3 num.io.threads=8 socket.send.buffer.bytes=102400 socket.receive.buffer.bytes=102400 socket.request.max.bytes=104857600 # 指定数据目录 log.dirs=/var/lib/kafka391-kraft num.partitions=1 num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=1 transaction.state.log.replication.factor=1 transaction.state.log.min.isr=1 log.retention.hours=168 log.retention.check.interval.ms=300000 group.initial.rebalance.delay.ms=0 # zookeeper的配置请一定要注释掉!!! # zookeeper.connect=10.0.0.91:2181,10.0.0.92:2181,10.0.0.93:2181/kafka-v3.9.1 # 指定kafka集群的角色,controller维护集群的角色,broker存储数据的角色。此处我进行了复用。 process.roles=broker,controller # 配置监听 listeners=PLAINTEXT://:9092,CONTROLLER://:9093 # 定义监听的名称,必须在‘listeners’列表中 controller.listener.names=CONTROLLER # 配置kafka集群映射列表,注意端口是controller端口 controller.quorum.voters=66@10.1.20.5:9093,77@10.1.24.13:9093,88@10.1.24.4:9093[root@elk91 ~]# 3.拷贝配置文件到其他节点 [root@elk91 ~]# scp /usr/local/kafka_2.13-3.9.1/config/server.properties-kraft 10.1.24.13:/usr/local/kafka_2.13-3.9.1/config/ [root@elk91 ~]# scp /usr/local/kafka_2.13-3.9.1/config/server.properties-kraft 10.1.24.4:/usr/local/kafka_2.13-3.9.1/config/ 4.另外两个节点需要修改配置文件 [root@elk92 ~]# vim /usr/local/kafka_2.13-3.9.1/config/server.properties-kraft ... broker.id=77 advertised.listeners=PLAINTEXT://10.0.0.92:9092 [root@elk93 ~]# vim /usr/local/kafka_2.13-3.9.1/config/server.properties-kraft ... broker.id=88 advertised.listeners=PLAINTEXT://10.0.0.93:9092 5.生成UUID [root@elk91 ~]# kafka-storage.sh random-uuid gP8Rxn7eTMCcLwhK2sjSlA # 后生成的 JUD3aHegT6SS3dSZ34za3w [root@elk91 ~]# 6.所有节点初始化集群 [root@elk91 ~]# kafka-storage.sh format -t JUD3aHegT6SS3dSZ34za3w -c /usr/local/kafka_2.13-3.9.1/config/server.properties-kraft Formatting metadata directory /var/lib/kafka391-kraft with metadata.version 3.9-IV0. 
[root@elk91 ~]# [root@elk91 ~]# ll /var/lib/kafka391-kraft total 16 drwxr-xr-x 2 root root 4096 Jun 26 16:54 ./ drwxr-xr-x 66 root root 4096 Jun 26 16:54 ../ -rw-r--r-- 1 root root 249 Jun 26 16:54 bootstrap.checkpoint -rw-r--r-- 1 root root 123 Jun 26 16:54 meta.properties [root@elk91 ~]# [root@elk91 ~]# cat /var/lib/kafka391-kraft/meta.properties # #Thu Jun 26 16:54:35 CST 2025 cluster.id=gP8Rxn7eTMCcLwhK2sjSlA directory.id=_Ar3Kwhzoh8sfhm0_KvZKA node.id=66 version=1 [root@elk91 ~]# [root@elk92 ~]# kafka-storage.sh format -t JUD3aHegT6SS3dSZ34za3w -c /usr/local/kafka_2.13-3.9.1/config/server.properties-kraft Formatting metadata directory /var/lib/kafka391-kraft with metadata.version 3.9-IV0. [root@elk92 ~]# [root@elk92 ~]# ll /var/lib/kafka391-kraft/ total 16 drwxr-xr-x 2 root root 4096 Jun 26 16:55 ./ drwxr-xr-x 68 root root 4096 Jun 26 16:55 ../ -rw-r--r-- 1 root root 249 Jun 26 16:55 bootstrap.checkpoint -rw-r--r-- 1 root root 123 Jun 26 16:55 meta.properties [root@elk92 ~]# [root@elk92 ~]# cat /var/lib/kafka391-kraft/meta.properties # #Thu Jun 26 16:55:26 CST 2025 cluster.id=gP8Rxn7eTMCcLwhK2sjSlA directory.id=xNPVlmPAVq3W7xrNb_lINQ node.id=77 version=1 [root@elk92 ~]# [root@elk93 ~]# kafka-storage.sh format -t JUD3aHegT6SS3dSZ34za3w -c /usr/local/kafka_2.13-3.9.1/config/server.properties-kraft Formatting metadata directory /var/lib/kafka391-kraft with metadata.version 3.9-IV0. [root@elk93 ~]# [root@elk93 ~]# ll /var/lib/kafka391-kraft/ total 16 drwxr-xr-x 2 root root 4096 Jun 26 16:55 ./ drwxr-xr-x 66 root root 4096 Jun 26 16:55 ../ -rw-r--r-- 1 root root 249 Jun 26 16:55 bootstrap.checkpoint -rw-r--r-- 1 root root 123 Jun 26 16:55 meta.properties [root@elk93 ~]# [root@elk93 ~]# cat /var/lib/kafka391-kraft/meta.properties # #Thu Jun 26 16:55:56 CST 2025 cluster.id=gP8Rxn7eTMCcLwhK2sjSlA directory.id=krvCT8b2DepzDUMpuPHaog node.id=88 version=1 [root@elk93 ~]# # 如果已经有了集群id会报错集群Exception in thread "main" java.lang.RuntimeException: Invalid cluster.id in: /var/lib/kafka391-kraft/meta.properties. 
Expected JUD3aHegT6SS3dSZ34za3w, but read gP8Rxn7eTMCcLwhK2sjSlA at org.apache.kafka.metadata.properties.MetaPropertiesEnsemble.verify(MetaPropertiesEnsemble.java:503) #这是初始化失败,需要删除旧的元数据文件:rm -rf /var/lib/kafka391-kraft/* 7.后台启动kafka集群 [root@elk91 ~]# kafka-server-start.sh -daemon ${KAFKA_HOME}/config/server.properties-kraft [root@elk91 ~]# [root@elk91 ~]# ss -ntl | egrep "9092|9093" LISTEN 0 50 *:9093 *:* LISTEN 0 50 *:9092 *:* # -daemon: 后台启动 [root@elk92 ~]# kafka-server-start.sh -daemon ${KAFKA_HOME}/config/server.properties-kraft [root@elk92 ~]# ss -ntl | egrep "9092|9093" LISTEN 0 50 *:9092 *:* LISTEN 0 50 *:9093 *:* [root@elk92 ~]# [root@elk93 ~]# kafka-server-start.sh -daemon ${KAFKA_HOME}/config/server.properties-kraft [root@elk93 ~]# ss -ntl | egrep "9092|9093" LISTEN 0 50 *:9093 *:* LISTEN 0 50 *:9092 *:* [root@elk93 ~]# 8.测试验证 8.1 启动生产者 [root@elk91 ~]# kafka-console-producer.sh --bootstrap-server 10.1.20.5:9092 --topic weixiang-weixiang98 >www.weixiang.com [2025-06-26 17:00:53,052] WARN [Producer clientId=console-producer] The metadata response from the cluster reported a recoverable issue with correlation id 7 : {weixiang-weixiang98=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient) [2025-06-26 17:00:53,168] WARN [Producer clientId=console-producer] The metadata response from the cluster reported a recoverable issue with correlation id 8 : {weixiang-weixiang98=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient) >学IT来老男孩,月薪过万不是梦~ > 8.2 启动消费者 [root@elk93 ~]# kafka-console-consumer.sh --bootstrap-server 10.1.24.4:9092 --topic weixiang-weixiang98 --from-beginning www.weixiang.com 学IT来老男孩,月薪过万不是梦~
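
补充一个KRaft模式下检查元数据仲裁状态的示例(kafka 3.x自带kafka-metadata-quorum.sh脚本,地址沿用上文集群,仅为示意):

bash
# 查看当前的controller leader以及仲裁整体状态
kafka-metadata-quorum.sh --bootstrap-server 10.1.20.5:9092 describe --status

# 查看各个节点对元数据日志的同步进度
kafka-metadata-quorum.sh --bootstrap-server 10.1.20.5:9092 describe --replication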

image

7、kafka 2.8+ 为什么要移除zookeeper组件呢?

bash
- kafka 2.8+移除zookeeper组件的原因:
  1.复杂性增加
  2.性能瓶颈
  3.一致性问题
  4.发展自己的生态

- KAFKA 2.8+引入KRaft(Kafka Raft Metadata Mode)模式抛弃ZooKeeper的优势:
  1.简化部署
  2.高效的一致性协议
  3.提高性能
  4.增强可扩展性
  5.更快的控制器故障转移
  6.完全自主
  7.控制器节点的去中心化
  8.日志复制和恢复机制
  9.动态集群管理

- KAFKA 4.0+仅支持KRaft架构,完全弃用了zookeeper。

8、kafka常用术语精讲

image

bash
生产者连接集群任意broker,按topic将数据写入其中的一个或多个分区;数据写入的是leader副本,follower副本只负责备份leader副本的数据,同一个分区的不同副本不能放在同一个broker节点上。对于消费者来说,同样只能从leader副本读取数据,看不到follower副本(follower可能还没有完全追上leader的数据)。对于开发来说,只需要知道集群主机IP和topic名称即可进行读写。

- broker: 表示kafka集群的每一个节点。
- kafka cluster: 也称为kafka集群,还有一个名称"broker list"。
- producer: 生产者,表示向kafka集群写入数据的一方。
- consumer: 消费者,表示从kafka集群读取数据的一方。
- consumer group: 消费者组,由一个或多个消费者组成;同一个分区在同一个消费者组内只会分配给一个消费者消费。
- topic: 主题,是producer和consumer进行数据读写的逻辑单元,可以类比为ES集群的索引。
- partition: 分区,一个topic最少要有1个或多个分区。
- offset: 位移,标识每一条消息在分区内的位置,消费者消费到哪里由offset记录。
- replica: 副本,每个分区最少有1个或多个副本。当副本数量大于1时,就会区分leader和follower副本。对于消费者而言,只能看到leader副本,follower副本仅对leader副本进行数据备份。

9、kafka的topics管理

bash
# 1.查看topics列表 [root@elk91 ~]# kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --list __consumer_offsets weixiang-weixiang98 [root@elk91 ~]# # 2.查看weixiang-weixiang98主题topic详情 [root@elk91 ~]# kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --topic weixiang-weixiang98 --describe Topic: weixiang-weixiang98 TopicId: EjAFeZ61RYuP9YtLh3nAkA PartitionCount: 1 ReplicationFactor: 1 Configs: Topic: weixiang-weixiang98 Partition: 0 Leader: 66 Replicas: 66 Isr: 66 Elr: LastKnownElr: # 3.创建laonanhai的topic主题,3分区,2副本 [root@elk91 ~]# kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai --partitions 3 --replication-factor 2 --create Created topic laonanhai. [root@elk91 ~]# [root@elk02 ~]#kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai --describe Topic: laonanhai TopicId: z9bwuxaNQ2uLRnKe40ZbrQ PartitionCount: 3 ReplicationFactor: 2 Configs: Topic: laonanhai Partition: 0 Leader: 92 Replicas: 92,93 Isr: 92,93 Elr: N/A LastKnow Topic: laonanhai Partition: 1 Leader: 93 Replicas: 93,91 Isr: 93,91 Elr: N/A LastKnow Topic: laonanhai Partition: 2 Leader: 91 Replicas: 91,92 Isr: 91,92 Elr: N/A LastKnow # 查看 [root@elk03 ~]#ll /var/lib/kafka/ |grep laonanhai

image

bash
# 4.修改分区数 [root@elk91 ~]# kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai --partitions 5 --alter [root@elk91 ~]# [root@elk91 ~]#kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai --describe Topic: laonanhai TopicId: biu6Bl6YQuWcYRBVaJoKMA PartitionCount: 5 ReplicationFactor: 2 Configs: Topic: laonanhai Partition: 0 Leader: 77 Replicas: 77,88 Isr: 77,88 Elr: LastKnownElr: Topic: laonanhai Partition: 1 Leader: 88 Replicas: 88,66 Isr: 88,66 Elr: LastKnownElr: Topic: laonanhai Partition: 2 Leader: 66 Replicas: 66,77 Isr: 66,77 Elr: LastKnownElr: Topic: laonanhai Partition: 3 Leader: 77 Replicas: 77,88 Isr: 77,88 Elr: LastKnownElr: Topic: laonanhai Partition: 4 Leader: 88 Replicas: 88,66 Isr: 88,66 Elr: LastKnownElr: # 不能减少分区 [root@elk91 ~]# kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai --partitions 3 --alter Error while executing topic command : The topic laonanhai currently has 5 partition(s); 3 would not be an increase. [2025-06-27 09:59:15,250] ERROR org.apache.kafka.common.errors.InvalidPartitionsException: The topic laonanhai currently has 5 partition(s); 3 would not be an increase. (org.apache.kafka.tools.TopicCommand) 温馨提示: - 1.分区数只能增多不能减少; - 2.副本数量修改比较麻烦,生产环境建议大家设置3副本; 参考链接: https://www.cnblogs.com/yinzhengjie/p/9808125.html # 5.删除topic [root@elk91 ~]# kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai --delete [root@elk91 ~]# [root@elk91 ~]# kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --list __consumer_offsets weixiang-weixiang98 [root@elk91 ~]# [root@elk91 ~]# kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai --delete # 如果指定的topic不存在则报错!!! Error while executing topic command : Topic 'Optional[laonanhai]' does not exist as expected [2025-06-27 10:06:50,590] ERROR java.lang.IllegalArgumentException: Topic 'Optional[laonanhai]' does not exist as expected at org.apache.kafka.tools.TopicCommand.ensureTopicExists(TopicCommand.java:218) at org.apache.kafka.tools.TopicCommand.access$700(TopicCommand.java:81) at org.apache.kafka.tools.TopicCommand$TopicService.deleteTopic(TopicCommand.java:656) at org.apache.kafka.tools.TopicCommand.execute(TopicCommand.java:113) at org.apache.kafka.tools.TopicCommand.mainNoExit(TopicCommand.java:90) at org.apache.kafka.tools.TopicCommand.main(TopicCommand.java:85) (org.apache.kafka.tools.TopicCommand) [root@elk91 ~]#

10、kafka的消费者组管理

bash
# 1.创建测试的topics [root@elk91 ~]# kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai --replication-factor 2 --partitions 3 --create Created topic laonanhai. [root@elk91 ~]# [root@elk91 ~]# kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai --describe Topic: laonanhai TopicId: tZO3eMicRE6zHoGfCxBaWQ PartitionCount: 3 ReplicationFactor: 2 Configs: Topic: laonanhai Partition: 0 Leader: 66 Replicas: 66,77 Isr: 66,77 Elr: LastKnownElr: Topic: laonanhai Partition: 1 Leader: 77 Replicas: 77,88 Isr: 77,88 Elr: LastKnownElr: Topic: laonanhai Partition: 2 Leader: 88 Replicas: 88,66 Isr: 88,66 Elr: LastKnownElr: [root@elk91 ~]# # 2.启动生产者 [root@elk91 ~]# kafka-console-producer.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai >学IT来老男孩,月薪过万不是梦~ >https://www.weixiang.com > # 3.启动消费并指定消费者组 [root@elk92 ~]# kafka-console-consumer.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai --group weixiang98 --from-beginning 学IT来老男孩,月薪过万不是梦~ https://www.weixiang.com # 4.查看消费者组信息 4.1 查看消费者组列表 [root@elk91 ~]# kafka-consumer-groups.sh --bootstrap-server 10.1.20.5:9092 --list weixiang98 [root@elk91 ~]# 4.2 查看消费者组的详细信息 [root@elk91 ~]# kafka-consumer-groups.sh --bootstrap-server 10.1.20.5:9092 --describe --group weixiang98 ;echo #消费者组 #主题 #对应的分区 #消费的位置点 #最后的位置点 #延迟 #消费者id,谁消费了 GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID weixiang98 laonanhai 0 0 0 0 console-consumer-a4caf423-2bcc-42b6-9c98-0847cca339d3 /10.0.0.92 console-consumer weixiang98 laonanhai 2 2 2 0 console-consumer-a4caf423-2bcc-42b6-9c98-0847cca339d3 /10.0.0.92 console-consumer weixiang98 laonanhai 1 0 0 0 console-consumer-a4caf423-2bcc-42b6-9c98-0847cca339d3 /10.0.0.92 console-consumer [root@elk91 ~]# # 5.再次启动生产者 [root@elk93 ~]# kafka-console-producer.sh --bootstrap-server 10.1.24.13:9092 --topic laonanhai >刘峰爱做俯卧撑 > # 6.再次查看消费者组信息 [root@elk91 ~]# kafka-consumer-groups.sh --bootstrap-server 10.1.24.4:9092 --describe --group weixiang98 ;echo GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID weixiang98 laonanhai 0 1 1 0 console-consumer-a4caf423-2bcc-42b6-9c98-0847cca339d3 /10.0.0.92 console-consumer weixiang98 laonanhai 2 2 2 0 console-consumer-a4caf423-2bcc-42b6-9c98-0847cca339d3 /10.0.0.92 console-consumer weixiang98 laonanhai 1 0 0 0 console-consumer-a4caf423-2bcc-42b6-9c98-0847cca339d3 /10.0.0.92 console-consumer [root@elk91 ~]# # 7.启动新的消费者【发现没有拿到数据】 [root@elk92 ~]# kafka-console-consumer.sh --bootstrap-server 10.1.24.4:9092 --topic laonanhai --group weixiang98 --from-beginning # 8.再次查看消费者组信息 [root@elk91 ~]# kafka-consumer-groups.sh --bootstrap-server 10.1.24.4:9092 --describe --group weixiang98 ;echo GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID weixiang98 laonanhai 0 1 1 0 console-consumer-8f5a680a-3ed4-4406-a5a2-a2dbef5af2fe /10.0.0.92 console-consumer weixiang98 laonanhai 1 0 0 0 console-consumer-8f5a680a-3ed4-4406-a5a2-a2dbef5af2fe /10.0.0.92 console-consumer weixiang98 laonanhai 2 2 2 0 console-consumer-a4caf423-2bcc-42b6-9c98-0847cca339d3 /10.0.0.92 console-consumer [root@elk91 ~]# # 9.再次尝试生产者写入观察消费者组的变化 略,见视频。 # 10.停止所有的消费者生产者写入数据观察消费者组变化 略,见视频。 参考如下: 【不难发现,存在延迟的情况】 [root@elk91 ~]# kafka-consumer-groups.sh --bootstrap-server 10.1.24.4:9092 --describe --group weixiang98 ;echo Consumer group 'weixiang98' has no active members. 
GROUP       TOPIC      PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID  HOST  CLIENT-ID
weixiang98  laonanhai  0          3               6               3    -            -     -
weixiang98  laonanhai  2          5               7               2    -            -     -
weixiang98  laonanhai  1          0               0               0    -            -     -
[root@elk91 ~]#

- 总结:
  - 1.同一个消费者组内,每个分区只能被一个消费者消费;某条消息被组内某个消费者消费之后,同组的其他消费者就消费不到它了。
  - 2.当消费者组内的消费者数量发生变化,或者分区数发生变化时,会触发消费者组重新分配分区,这个现象称之为"rebalance(重平衡)"。
  - 3.别的消费者组也可以消费生产者的数据,可以使用"kafka-consumer-groups.sh --bootstrap-server 10.1.24.4:9092 --describe --group weixiang98"查看具体消费信息。如需重新消费历史数据,可参考下面的位移重置示例。
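
补充一个重置消费者组位移(offset)的示例(需要先停止该消费者组内的所有消费者;组名、topic沿用上文,仅为示意):

bash
# 先用--dry-run预览重置后的位移,确认无误再执行
kafka-consumer-groups.sh --bootstrap-server 10.1.20.5:9092 \
  --group weixiang98 --topic laonanhai \
  --reset-offsets --to-earliest --dry-run

# 将位移重置到最早位置,重新消费全部数据
kafka-consumer-groups.sh --bootstrap-server 10.1.20.5:9092 \
  --group weixiang98 --topic laonanhai \
  --reset-offsets --to-earliest --execute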

11、kafka的JVM调优

bash
# 1.调优思路 建议kafka的JVM的堆内存设置为6GB即可。 参考配置: 【JDK8,JDK11,JDK17,JDK21】 KAFKA_OPTS="-Xmx6g -Xms6g -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=85" # 2.学习环境建议配置256Mb就可以了 2.1 调优前查看配置信息 [root@elk91 ~]# ps -ef | grep kafka | grep Xmx root 45114 1 0 May08 pts/0 00:03:29 /usr/share/elasticsearch/jdk/bin/java -Xmx1G -Xms1G ... 2.2 修改堆内存大小 [root@elk91 ~]# vim /usr/local/kafka_2.13-3.9.1/bin/kafka-server-start.sh ... # export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G" export KAFKA_HEAP_OPTS="-Xmx256m -Xms256m" ... [root@elk91 ~]# [root@elk91 ~]# scp /usr/local/kafka_2.13-3.9.1/bin/kafka-server-start.sh 10.0.0.92:/usr/local/kafka_2.13-3.9.1/bin/ [root@elk91 ~]# scp /usr/local/kafka_2.13-3.9.1/bin/kafka-server-start.sh 10.0.0.93:/usr/local/kafka_2.13-3.9.1/bin/ 2.3 停止kafka [root@elk91 ~]# kafka-server-stop.sh [root@elk92 ~]# kafka-server-stop.sh [root@elk93 ~]# kafka-server-stop.sh 2.4 启动zookeeper集群 [root@elk91 ~]# zkServer.sh start ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Starting zookeeper ... STARTED [root@elk91 ~]# [root@elk92 ~]# zkServer.sh start ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Starting zookeeper ... STARTED [root@elk92 ~]# [root@elk92 ~]# [root@elk93 ~]# zkServer.sh start ZooKeeper JMX enabled by default Using config: /usr/local/apache-zookeeper-3.8.4-bin/bin/../conf/zoo.cfg Starting zookeeper ... STARTED [root@elk93 ~]# 2.5 启动kafka [root@elk91 ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties [root@elk92 ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties [root@elk93 ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties 2.5 再次验证JVM是否调优成功 [root@elk91 ~]# ps -ef | grep kafka | grep Xmx root 46218 1 29 08:44 pts/0 00:00:05 /usr/share/elasticsearch/jdk/bin/java -Xmx256m -Xms256m ...

12、kafka的图形化管理工具EFAK(了解)

bash
# 1.启动MySQL数据库 1.1 安装docker环境 [root@elk93 ~]# wget http://192.168.21.253/Resources/Docker/softwares/yinzhengjie-autoinstall-docker-docker-compose.tar.gz [root@elk93 ~]# tar xf yinzhengjie-autoinstall-docker-docker-compose.tar.gz [root@elk93 ~]# ./install-docker.sh i 1.2 导入MySQL镜像 [root@elk93 ~]# wget http://192.168.21.253/Resources/Docker/images/WordPress/weixiang-mysql-v8.0.36-oracle.tar.gz [root@elk93 ~]# docker load < weixiang-mysql-v8.0.36-oracle.tar.gz 1.3 运行MySQL服务 [root@elk93 ~]# docker run -d --network host --name mysql-server --restart always -e MYSQL_DATABASE=weixiang_kafka -e MYSQL_USER=weixiang98 -e MYSQL_PASSWORD=yinzhengjie -e MYSQL_ALLOW_EMPTY_PASSWORD=yes mysql:8.0.36-oracle --character-set-server=utf8 --collation-server=utf8_bin --default-authentication-plugin=mysql_native_password [root@elk93 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 1fd33841f55d mysql:8.0.36-oracle "docker-entrypoint.s…" 22 seconds ago Up 21 seconds mysql-server [root@elk93 ~]# [root@elk93 ~]# ss -ntl | grep 3306 LISTEN 0 151 *:3306 *:* LISTEN 0 70 *:33060 *:* [root@elk93 ~]# # 2.下载kafka [root@elk92 ~]# wget https://github.com/smartloli/kafka-eagle-bin/blob/master/efak-web-3.0.2-bin.tar.gz svip: [root@elk92 ~]# wget http://192.168.21.253/Resources/ElasticStack/softwares/kafka/efak-web/efak-web-3.0.2-bin.tar.gz # 3.解压软件包 [root@elk92 ~]# tar xf efak-web-3.0.2-bin.tar.gz -C /usr/local/ [root@elk92 ~]# # 4.修改配置文件 [root@elk92 ~]# cat > /usr/local/efak-web-3.0.2/conf/system-config.properties <<EOF efak.zk.cluster.alias=weixiang-weixiang98 weixiang-weixiang98.zk.list=10.0.0.91:2181,10.0.0.92:2181,10.0.0.93:2181/kafka-v3.9.1 weixiang-weixiang98.zk.acl.enable=false weixiang-weixiang98.efak.broker.size=20 kafka.zk.limit.size=16 efak.webui.port=8048 efak.distributed.enable=false weixiang-weixiang98.efak.jmx.acl=false weixiang-weixiang98.efak.offset.storage=kafka weixiang-weixiang98.efak.jmx.uri=service:jmx:rmi:///jndi/rmi://%s/jmxrmi efak.metrics.charts=true efak.metrics.retain=15 efak.sql.topic.records.max=5000 efak.sql.topic.preview.records.max=10 efak.topic.token=weixiang weixiang-weixiang98.efak.sasl.enable=false efak.driver=com.mysql.cj.jdbc.Driver efak.url=jdbc:mysql://10.0.0.93:3306/weixiang_kafka?useUnicode=true&characterEncoding=UTF-8&zeroDateTimeBehavior=convertToNull efak.username=weixiang98 efak.password=yinzhengjie EOF # 5.修改启动的脚本 [root@elk92 ~]# vim /usr/local/efak-web-3.0.2/bin/ke.sh ... export KE_JAVA_OPTS="-server -Xmx256m -Xms256m -XX:MaxGCPauseMillis=20 -XX:+UseG1GC -XX:MetaspaceSize=128m -XX:InitiatingH eapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80" # 6.添加环境变量 [root@elk92 ~]# cat /etc/profile.d/kafka.sh #!/bin/bash export KAFKA_HOME=/usr/local/kafka_2.13-3.9.0 export KE_HOME=/usr/local/efak-web-3.0.2 export PATH=$PATH:$KAFKA_HOME/bin:$KE_HOME/bin [root@elk92 ~]# [root@elk92 ~]# [root@elk92 ~]# source /etc/profile.d/kafka.sh [root@elk92 ~]# # 7.启动efak程序 [root@elk92 ~]# ke.sh start ... Welcome to ______ ______ ___ __ __ / ____/ / ____/ / | / //_/ / __/ / /_ / /| | / ,< / /___ / __/ / ___ | / /| | /_____/ /_/ /_/ |_|/_/ |_| ( Eagle For Apache Kafka® ) Version v3.0.2 -- Copyright 2016-2022 ******************************************************************* * EFAK Service has started success. * Welcome, Now you can visit 'http://10.0.0.92:8048' * Account:admin ,Password:123456 ... [root@elk92 ~]# ss -ntl | grep 8048 LISTEN 0 500 *:8048 *:* [root@elk92 ~]# # 8.访问efak的WebUI【使用上一步的用户名和密码登录即可】 http://10.0.0.92:8048/tv

image

13、kafka丢失数据问题

kafka为何丢失数据图解

bash
- kafka的ISR机制存在丢失数据的风险
  ISR: 和leader副本数据保持同步的所有副本集合。
  OSR: 和leader副本数据不同步(落后于leader)的副本集合。
  AR : all replica,表示所有的副本集合,包括leader和follower,AR = ISR + OSR。
  LEO: 表示某个partition最后一条消息的offset。
  HW : 表示ISR列表中最小的LEO;对于消费者而言,只能消费HW之前的offset。

  生产环境中如何避免kafka丢失数据:
  - 1.尽量保证服务器不断电(IDC机房可以解决);
  - 2.停止服务时不要使用"kill -9",要使用官方提供的"kafka-server-stop.sh"脚本优雅停止,生产环境中执行该脚本可能需要5-10分钟左右才能停下来;
  - 3.注意kafka集群压力不能过大,当集群压力过大、副本频繁在ISR和OSR列表之间来回切换时,应该考虑扩容kafka节点;

  使用kafka的场景有两种主流:
  - 大数据开发(PB级别+)
  - 日志采集(100TB级别+)

  若数据比较重要,可以考虑使用RabbitMQ来替换;也可以参考下面的acks与min.insync.replicas配置示例,降低丢数据的概率。
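
结合上面的ISR机制,下面给出一个降低丢数据概率的配置示例(topic名沿用上文,数值仅为示意;前提是该topic的副本数不小于2):

bash
# 1.为重要的topic设置最小同步副本数,要求ISR中至少2个副本写入成功才算写入成功
kafka-configs.sh --bootstrap-server 10.1.20.5:9092 --entity-type topics \
  --entity-name weixiang-elasticstack --alter --add-config min.insync.replicas=2

# 2.生产者开启acks=all,配合上面的min.insync.replicas一起生效
kafka-console-producer.sh --bootstrap-server 10.1.20.5:9092 \
  --topic weixiang-elasticstack --producer-property acks=all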

14、kafka消费数据延迟问题解决方案

bash
消费者出现消费延迟(LAG增大)时的分析与处理思路:
方案一: 扩容消费者组内的消费者数量。
一定要注意:消费者数量不要超过分区数。因为同一个消费者组内,一个分区同一时间只会分配给一个消费者,一个消费者可以负责一个或多个分区,多出来的消费者会处于空闲状态,白白浪费资源;如果分区数本身不够,应先增加分区数,示例见下文。
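
下面是排查和处理消费延迟的一个示例流程(组名、topic沿用上文,仅为示意):

bash
# 1.先查看LAG列,确认延迟集中在哪些分区
kafka-consumer-groups.sh --bootstrap-server 10.1.20.5:9092 --describe --group weixiang98

# 2.如果分区数已成为并行消费的瓶颈,先增加分区数(分区只能增不能减)
kafka-topics.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai --partitions 6 --alter

# 3.再启动更多属于同一消费者组的消费者(总数不要超过分区数,否则多出的消费者空闲)
kafka-console-consumer.sh --bootstrap-server 10.1.20.5:9092 --topic laonanhai --group weixiang98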

15、filebeat对接kafka集群

bash
# 1.创建topic [root@elk92 ~]# kafka-topics.sh --bootstrap-server 10.1.24.4:9092 --topic weixiang-elasticstack --partitions 3 --replication-factor 2 --create Created topic weixiang-elasticstack. [root@elk92 ~]# # 2.生成测试数据 [root@elk92 ~]# python3 generate_log.py /tmp/apps.log # 3.filebeat写入数据到kafka集群 [root@elk92 ~]# cat /etc/filebeat/config/17-filestream-to-kafka.yaml filebeat.inputs: - type: filestream paths: - /tmp/apps.log output.kafka: hosts: - 10.1.20.5:9092 - 10.1.24.13:9092 - 10.1.24.4:9092 topic: "weixiang-elasticstack" [root@elk92 ~]# [root@elk92 ~]# filebeat -e -c /etc/filebeat/config/17-filestream-to-kafka.yaml # 4.查看kafka数据 [root@elk93 ~]# kafka-console-consumer.sh --bootstrap-server 10.1.24.13:9092 --topic weixiang-elasticstack --group weixiang98-001 --from-beginning

16、logstash对接kafka集群

bash
# 1.kibana创建api-key 假设我拿到的api-key为: hi5XsJcBeldXDLG8xVg4:anFgm1jKRIeoSN9x8mIqdw # 2.Logstash从kafka拉取数据写入ES集群 [root@elk93 ~]# cat /etc/logstash/conf.d/11-kafka-to-es.conf input { kafka { bootstrap_servers => "43.139.77.96:9092,43.139.47.66:9092,106.55.44.37:9092" # 指定主机 group_id => "logstash-weixiang98-006" # 指定消费者组 topics => ["weixiang-elasticstack"] # 指定主题 auto_offset_reset => "earliest" # 当无消费偏移量时,从最早的消息开始消费(防止数据丢失) } } filter { json { source => "message" # 解析原始消息中的JSON字符串 } mutate { split => { "message" => "|" } # 用竖线分割message字段 add_field => { "other" => "%{[message][0]}" "userId" => "%{[message][1]}" "action" => "%{[message][2]}" "svip" => "%{[message][3]}" "price" => "%{[message][4]}" } } mutate { split => { "other" => " " } # 用空格分割other字段 add_field => { "dt" => "%{[other][1]} %{[other][2]}" # 提取日期时间(如 "2023-01-01 INFO" → 取索引1和2) } convert => { "price" => "float" # 价格转为浮点数 "userId" => "integer" # 用户ID转为整数 } remove_field => [ "@version", "input","ecs","log","tags","agent","host","message","other"] } # 删除冗余字段 date { match => [ "dt", "yyyy-MM-dd HH:mm:ss" ] # 将dt字段解析为@timestamp } } output { #stdout { # codec => rubydebug #} elasticsearch { hosts => ["https://43.139.77.96:9200","https://43.139.47.66:9200","https://106.55.44.37:9200"] index => "weixiang-logstash-kafka-apps-%{+YYYY.MM.dd}" api_key => "hi5XsJcBeldXDLG8xVg4:anFgm1jKRIeoSN9x8mIqdw" ssl => true ssl_certificate_verification => false } } [root@elk93 ~]# [root@elk93 ~]# logstash -rf /etc/logstash/conf.d/11-kafka-to-es.conf # 3.kibana出图展示

17、企业级ELFK+KAFKA架构设计方案

bash
The company had already deployed an 11-node Elasticsearch cluster on Inspur hardware; the version was 6.8 at the time (it is 7.17.28 now). Back then the website was regularly hit by hotlinking attacks — sometimes on a Tuesday, sometimes on a Friday — and a single incident could add 30+ TB of traffic, so we needed the logs for analysis. We deployed 2 Logstash servers, each with 32 cores, 32 GB of RAM and 12 x 8 TB of disk, also Inspur. In front of them sit 18 web servers, all running Nginx with filebeat installed, plus a 5-node Kafka cluster with the same 32-core / 32 GB / 12 x 8 TB spec. The web servers ship their logs through filebeat into the Kafka cluster, and Logstash pulls from Kafka to do simple log analysis. In the early days there was no Kafka at all: filebeat wrote directly to Logstash, until Logstash could no longer keep up and kept crashing — with 18 servers pushing at once it went OOM, 32 cores simply could not cope — so the Kafka cluster was added in front. We then deployed Kibana, which reads from the ES cluster and builds charts for IPs, PV, bandwidth, global user distribution and device types. The developers found Kibana hard to use, so I also deployed Grafana for visualization. Our pipeline is written in Go: it consumes data from the Kafka cluster and writes it into the ES cluster, and Kibana reads from ES for analysis. The ES cluster takes in about 1.5 TB of filtered, useful data per working day and about 2.7 TB on non-working days; the Nginx Lua authentication logs can also be shipped to the ES cluster.

There is another data path, owned by a colleague of mine. He runs Flume on a dedicated machine (also 32 cores, 32 GB RAM, 12 x 8 TB) that pulls the web logs from the Kafka cluster and writes them to an HDFS cluster for further processing; developers run MapReduce and Spark jobs against HDFS to analyze the logs of blacklisted users (the ones banned from logging in). Developers also consume from the Kafka cluster directly and analyze the data with Flink, writing the results into a distributed database cluster — ClickHouse on Ubuntu, on 64-core / 256 GB RAM / 12 x 8 TB nodes — and the open-source dashboard SuperSet sits on top to build reports, producing a new report every ten minutes.

11、docker

1、docker架构概述

1、docker架构图

image

bash
#Docker客户端(Client) · 用户与Docker交互的命令行工具(docker命令) · 发送指令给Docker守护进程(如docker run、docker build、docker pull) #Docker Host Docker守护进程(Docker daemon) · 后台服务,负责管理 Docker 对象(镜像、容器、网络、存储等) · 监听 /var/run/docker.sock(Unix 套接字)或 TCP 端口(如 2375) Docker 镜像(Images) · 只读模板,用于创建容器(类似虚拟机的ISO文件)。 · 存储在 镜像仓库(如 Docker Hub、私有 Registry)。 Docker容器(Containers) · 镜像的运行实例,具有独立的进程、网络和文件系统。 · 轻量级(共享主机内核),比虚拟机更高效。 # Docker 仓库(Registry) · 存储和分发镜像的服务: · 公共仓库:Docker Hub(docker.io)、阿里云镜像仓库。 · 私有仓库:企业自建 Registry(如 Harbor)。 # Docker工作流程: 1、用户通过docker命令(Client)发送请求(如 docker pull)。 2、Docker守护进程(Daemon)接收请求,管理本地镜像或从 Registry 拉取镜像。 3、基于镜像创建容器: 分配独立的文件系统、网络和进程空间。 启动容器内的应用(如Nginx、MySQL)。 4、容器运行期间: 可以通过docker exec进入容器。 日志通过docker logs查看。 5、停止或删除容器: docker stop 停止容器。 docker rm 删除容器。 # 容器是基于镜像的运行实例,删除镜像不会自动删除容器,但会导致容器无法重启,维护时需遵循:先清理容器 → 再删除镜像
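
To make the workflow above concrete, here is a minimal command sequence that walks through it end to end; the nginx image is just an illustrative choice, not something the steps above require:

bash
# 1. The client asks the daemon to pull an image from the registry
docker pull nginx:latest

# 2-3. The daemon creates a container from the image and starts it
docker run -d --name demo-nginx -p 8080:80 nginx:latest

# 4. Interact with the running container
docker exec -it demo-nginx sh     # enter the container
docker logs demo-nginx            # read its stdout/stderr

# 5. Stop and remove the container, then the image can be removed safely
docker stop demo-nginx
docker rm demo-nginx
docker rmi nginx:latest
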

2、安装docker
bash
# 1.什么是docker 所谓的docker其实是一款容器管理工具。 容器可以简单理解为打包Linux文件的一种工具。 官网: https://www.docker.com/ 官方文档: https://docs.docker.com/ 官方仓库: https://hub.docker.com/ # 2.Docker架构 - 二进制部署docker环境 参考链接: https://docs.docker.com/engine/install/binaries/ 1.下载docker软件包 2.解压软件包 [root@elk91 ~]# tar xf docker-28.3.0.tgz [root@elk91 ~]# ll docker total 215588 drwxrwxr-x 2 yinzhengjie yinzhengjie 4096 Jun 24 23:46 ./ drwx------ 8 root root 4096 Jun 30 09:05 ../ -rwxr-xr-x 1 yinzhengjie yinzhengjie 41451704 Jun 24 23:46 containerd* -rwxr-xr-x 1 yinzhengjie yinzhengjie 14065848 Jun 24 23:46 containerd-shim-runc-v2* -rwxr-xr-x 1 yinzhengjie yinzhengjie 21242040 Jun 24 23:46 ctr* -rwxr-xr-x 1 yinzhengjie yinzhengjie 43617760 Jun 24 23:46 docker* -rwxr-xr-x 1 yinzhengjie yinzhengjie 79277584 Jun 24 23:46 dockerd* -rwxr-xr-x 1 yinzhengjie yinzhengjie 708456 Jun 24 23:46 docker-init* -rwxr-xr-x 1 yinzhengjie yinzhengjie 2457968 Jun 24 23:46 docker-proxy* -rwxr-xr-x 1 yinzhengjie yinzhengjie 17914256 Jun 24 23:46 runc* [root@elk91 ~]# 3.将docker程序放到PATH环境变量 [root@elk91 ~]# cp docker/* /usr/bin/ [root@elk91 ~]# 4.启动docker服务端 [root@elk91 ~]# dockerd 5.客户端测试验证 [root@elk91 ~]# docker version Client: Version: 28.3.0 API version: 1.51 Go version: go1.24.4 Git commit: 38b7060 Built: Tue Jun 24 15:43:00 2025 OS/Arch: linux/amd64 Context: default Server: Docker Engine - Community Engine: Version: 28.3.0 API version: 1.51 (minimum version 1.24) Go version: go1.24.4 Git commit: 265f709 Built: Tue Jun 24 15:44:17 2025 OS/Arch: linux/amd64 Experimental: false containerd: Version: v1.7.27 GitCommit: 05044ec0a9a75232cad458027ca83437aae3f4da runc: Version: 1.2.6 GitCommit: v1.2.6-0-ge89a299 docker-init: Version: 0.19.0 GitCommit: de40ad0 [root@elk91 ~]# 6.卸载docker环境 [root@elk91 ~]# kill `ps -ef | grep dockerd | grep -v grep |awk '{print $2}'` [root@elk91 ~]# [root@elk91 ~]# for pkg in containerd containerd-shim-runc-v2 ctr docker dockerd docker-init docker-proxy runc;do rm -f /usr/bin/$pkg ;done [root@elk91 ~]# [root@elk91 ~]# rm -rf /var/lib/docker/ - 基于脚本一键部署docker环境 1.下载软件包 [root@elk91 ~]# wget http://192.168.21.253/Resources/Docker/scripts/weixiang-autoinstall-docker-docker-compose.tar.gz 2.解压软件包 [root@elk91 ~]# tar xf weixiang-autoinstall-docker-docker-compose.tar.gz [root@elk91 ~]# 3.安装docker环境 [root@elk91 ~]# ./install-docker.sh i 4.检查docker版本 [root@elk91 ~]# docker version Client: Version: 20.10.24 API version: 1.41 Go version: go1.19.7 Git commit: 297e128 Built: Tue Apr 4 18:17:06 2023 OS/Arch: linux/amd64 Context: default Experimental: true Server: Docker Engine - Community Engine: Version: 20.10.24 API version: 1.41 (minimum version 1.12) Go version: go1.19.7 Git commit: 5d6db84 Built: Tue Apr 4 18:23:02 2023 OS/Arch: linux/amd64 Experimental: false containerd: Version: v1.6.20 GitCommit: 2806fc1057397dbaeefbea0e4e17bddfbd388f38 runc: Version: 1.1.5 GitCommit: v1.1.5-0-gf19387a6 docker-init: Version: 0.19.0 GitCommit: de40ad0 [root@elk91 ~]# [root@elk91 ~]# 5.卸载docker环境 [root@elk91 ~]# ./install-docker.sh r wget https://download.docker.com/linux/static/stable/x86_64/docker-28.3.0.tgz SVIP: [root@elk91 ~]# wget http://192.168.21.253/Resources/Docker/softwares/binary/docker-28.3.0.tgz
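
Note that in the binary install above dockerd runs in the foreground and stops when the terminal closes. A minimal systemd unit sketch for keeping it running is shown below; this unit is an illustration only (the one-click script and the official packages ship a more complete docker.service):

bash
cat > /lib/systemd/system/docker.service <<'EOF'
[Unit]
Description=Docker Application Container Engine (binary install)
Wants=network-online.target
After=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/dockerd
Restart=always

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now docker
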
3、docker基于VPN配置代理
bash
1.查看本地镜像 [root@elk91 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE [root@elk91 ~]# 2.无法直接获取官方的镜像仓库 [root@elk91 ~]# docker run hello-world Unable to find image 'hello-world:latest' locally docker: Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers). See 'docker run --help'. [root@elk91 ~]# 3.修改docker的启动脚本 [root@elk91 ~]# cat /lib/systemd/system/docker.service [Unit] Description=weixiang linux Docke Engine Documentation=https://docs.docker.com,https://www.weixiang.com Wants=network-online.target [Service] Type=notify ExecStart=/usr/bin/dockerd # 配置docker代理 Environment="HTTP_PROXY=http://10.0.0.1:7890" Environment="HTTPS_PROXY=http://10.0.0.1:7890" [Install] WantedBy=multi-user.target [root@elk91 ~]# 4.加载配置并验证 [root@elk91 ~]# systemctl daemon-reload [root@elk91 ~]# systemctl restart docker [root@elk91 ~]# [root@elk91 ~]# docker info | grep Proxy HTTP Proxy: http://10.0.0.1:7890 HTTPS Proxy: http://10.0.0.1:7890 [root@elk91 ~]# 5.docker成功从hub官方拉取镜像 [root@elk91 ~]# docker run hello-world Unable to find image 'hello-world:latest' locally latest: Pulling from library/hello-world e6590344b1a5: Pull complete Digest: sha256:940c619fbd418f9b2b1b63e25d8861f9cc1b46e3fc8b018ccfe8b78f19b8cc4f Status: Downloaded newer image for hello-world:latest Hello from Docker! This message shows that your installation appears to be working correctly. To generate this message, Docker took the following steps: 1. The Docker client contacted the Docker daemon. 2. The Docker daemon pulled the "hello-world" image from the Docker Hub. (amd64) 3. The Docker daemon created a new container from that image which runs the executable that produces the output you are currently reading. 4. The Docker daemon streamed that output to the Docker client, which sent it to your terminal. To try something more ambitious, you can run an Ubuntu container with: $ docker run -it ubuntu bash Share images, automate workflows, and more with a free Docker ID: https://hub.docker.com/ For more examples and ideas, visit: https://docs.docker.com/get-started/ [root@elk91 ~]# 6.查看本地镜像 [root@elk91 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE hello-world latest 74cc54e27dc4 5 months ago 10.1kB [root@elk91 ~]# [root@elk91 ~]# docker run hello-world Hello from Docker! This message shows that your installation appears to be working correctly. To generate this message, Docker took the following steps: 1. The Docker client contacted the Docker daemon. 2. The Docker daemon pulled the "hello-world" image from the Docker Hub. (amd64) 3. The Docker daemon created a new container from that image which runs the executable that produces the output you are currently reading. 4. The Docker daemon streamed that output to the Docker client, which sent it to your terminal. 
To try something more ambitious, you can run an Ubuntu container with: $ docker run -it ubuntu bash Share images, automate workflows, and more with a free Docker ID: https://hub.docker.com/ For more examples and ideas, visit: https://docs.docker.com/get-started/ [root@elk91 ~]# [root@elk91 ~]# docker container ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@elk91 ~]# [root@elk91 ~]# docker container ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES ab37aa30f2f8 hello-world "/hello" 48 seconds ago Exited (0) 47 seconds ago infallible_jepsen 4f3e3539d0d5 hello-world "/hello" About a minute ago Exited (0) About a minute ago affectionate_chatterjee [root@elk91 ~]# 7.关闭代理功能 [root@elk91 ~]# cat /lib/systemd/system/docker.service [Unit] Description=weixiang linux Docke Engine Documentation=https://docs.docker.com,https://www.weixiang.com Wants=network-online.target [Service] Type=notify ExecStart=/usr/bin/dockerd # 配置docker代理 # Environment="HTTP_PROXY=http://10.0.0.1:7890" # Environment="HTTPS_PROXY=http://10.0.0.1:7890" [Install] WantedBy=multi-user.target [root@elk91 ~]# [root@elk91 ~]# systemctl daemon-reload [root@elk91 ~]# systemctl restart docker [root@elk91 ~]# [root@elk91 ~]# docker info | grep Proxy [root@elk91 ~]#
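
An alternative to editing /lib/systemd/system/docker.service directly is a systemd drop-in file, which survives package upgrades; this is the approach described in the Docker documentation. A sketch using the same proxy address as above (the NO_PROXY entry for the internal resource server is an assumption):

bash
mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/http-proxy.conf <<'EOF'
[Service]
Environment="HTTP_PROXY=http://10.0.0.1:7890"
Environment="HTTPS_PROXY=http://10.0.0.1:7890"
Environment="NO_PROXY=localhost,127.0.0.1,192.168.21.253"
EOF

systemctl daemon-reload
systemctl restart docker
docker info | grep -i proxy
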
4、linux实现vpn翻墙
1、前言
bash
由于需要在Linux环境下拉取docker的外部镜像,于是决定使用VMware虚拟化软件来安装Ubuntu22.04操作系统。因此产生了让Ubuntu虚拟机 共享宿主机(即运行VMware的物理机器)代理设置的需求。 # 下载V2rayN,软件包位置D:\老男孩\上课视频\代码上线软件\其他软件\v2rayN-windows-64-SelfContained.zip
2、设置 V2rayN

进入机场复制订阅地址

image

打开V2rayN

image

image

image

image

image

image

bash
VMware virtual machine configuration:
1. Create or open your Ubuntu 22.04 virtual machine in VMware.
2. Open the VM's settings (Edit virtual machine settings).
3. Find the network adapter settings.
4. Change the network connection mode to NAT.
5. Confirm that the VM's network adapter is bound to VMnet8.

image

image

bash
# 修改docker的启动脚本 [root@elk91 ~]# cat /lib/systemd/system/docker.service [Unit] Description=weixiang linux Docke Engine Documentation=https://docs.docker.com,https://www.weixiang.com Wants=network-online.target [Service] Type=notify ExecStart=/usr/bin/dockerd # 配置docker代理 Environment="HTTP_PROXY=http://10.0.0.1:10808" # 改为代理的ip Environment="HTTPS_PROXY=http://10.0.0.1:10808" # 改为代理的ip [Install] WantedBy=multi-user.target [root@elk91 ~]# systemctl daemon-reload [root@elk91 ~]# systemctl restart docker [root@elk91 ~]# [root@elk91 ~]# docker info | grep Proxy [root@elk91 ~]# # ping谷歌测试 [root@elk01 profile.d]# ping google.com PING google.com (142.250.197.206) 56(84) bytes of data. 64 bytes from nchkga-an-in-f14.1e100.net (142.250.197.206): icmp_seq=1 ttl=128 time=0.546 ms 64 bytes from nchkga-an-in-f14.1e100.net (142.250.197.206): icmp_seq=2 ttl=128 time=0.754 ms # 拉取测试 [root@elk01 profile.d]# docker run hello-world Unable to find image 'hello-world:latest' locally latest: Pulling from library/hello-world e6590344b1a5: Pull complete Digest: sha256:940c619fbd418f9b2b1b63e25d8861f9cc1b46e3fc8b018ccfe8b78f19b8cc4f Status: Downloaded newer image for hello-world:latest Hello from Docker! This message shows that your installation appears to be working correctly. To generate this message, Docker took the following steps: 1. The Docker client contacted the Docker daemon. 2. The Docker daemon pulled the "hello-world" image from the Docker Hub. (amd64) 3. The Docker daemon created a new container from that image which runs the executable that produces the output you are currently reading. 4. The Docker daemon streamed that output to the Docker client, which sent it to your terminal. To try something more ambitious, you can run an Ubuntu container with: $ docker run -it ubuntu bash Share images, automate workflows, and more with a free Docker ID: https://hub.docker.com/ For more examples and ideas, visit: https://docs.docker.com/get-started/

2、docker管理基础

1、镜像基础管理
bash
1.下载镜像 [root@elk91 ~]# docker pull registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 v1: Pulling from yinzhengjie-k8s/apps 5758d4e389a3: Pull complete 51d66f629021: Pull complete ff9c6add3f30: Pull complete dcc43d9a97b4: Pull complete 5dcfac0f2f9c: Pull complete 2c6e86e57dfd: Pull complete 2dd61e30a21a: Pull complete Digest: sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c Status: Downloaded newer image for registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 [root@elk91 ~]# 2.查看本地镜像 [root@elk91 ~]# docker image ls #镜像的来源仓库和名称。 镜像的版本标签,默认latest 唯一哈希 镜像的创建时间 占用空间 REPOSITORY TAG IMAGE ID CREATED SIZE hello-world latest 74cc54e27dc4 5 months ago 10.1kB registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps v1 f28fd43be4ad 17 months ago 23MB [root@elk91 ~]# docker image ls -q # 仅显示本地所有Docker镜像的ID 74cc54e27dc4 f28fd43be4ad 3.导入镜像 [root@elk91 ~]# wget http://192.168.21.253/Resources/Docker/images/weixiang-games-v0.6.tar.gz # 将镜像压缩包解压并导入到本地Docker环境 [root@elk91 ~]# docker load -i weixiang-games-v0.6.tar.gz 24f6c2496534: Loading layer [==================================================>] 288.1MB/288.1MB df2c564d255b: Loading layer [==================================================>] 6.144kB/6.144kB ce4dda5fa1c1: Loading layer [==================================================>] 7.168kB/7.168kB 1d0291efebc6: Loading layer [==================================================>] 70.73MB/70.73MB Loaded image: jasonyin2020/weixiang-games:v0.6 # 如何辨别是镜像文件而非压缩包,tar -tf看一下包的内容如果包含manifest.json、*.tar等文件说明是标准镜像。如果包含安装脚本,则不是镜像文件 4.导出镜像 [root@elk91 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE hello-world latest 74cc54e27dc4 5 months ago 10.1kB jasonyin2020/weixiang-games v0.6 b55cbfca1946 15 months ago 376MB registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps v1 f28fd43be4ad 17 months ago 23MB [root@elk91 ~]# # docker image save -o +自定义名 +本地镜像名称+TAG/IMAGE ID [root@elk91 ~]# docker image save -o weixiang-apps-v1.tar.gz registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 [root@elk91 ~]# ll weixiang-apps-v1.tar.gz weixiang-images.tar.gz -rw------- 1 root root 24512000 Jun 30 10:05 weixiang-apps-v1.tar.gz -rw------- 1 root root 24528896 Jun 30 10:05 weixiang-images.tar.gz 5.删除镜像 [root@elk91 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE hello-world latest 74cc54e27dc4 5 months ago 10.1kB jasonyin2020/weixiang-games v0.6 b55cbfca1946 15 months ago 376MB registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps v1 f28fd43be4ad 17 months ago 23MB [root@elk91 ~]# [root@elk91 ~]# docker image rm -f hello-world:latest Untagged: hello-world:latest Untagged: hello-world@sha256:940c619fbd418f9b2b1b63e25d8861f9cc1b46e3fc8b018ccfe8b78f19b8cc4f Deleted: sha256:74cc54e27dc41bb10dc4b2226072d469509f2f22f1a3ce74f4a59661a1d44602 [root@elk91 ~]# [root@elk91 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE jasonyin2020/weixiang-games v0.6 b55cbfca1946 15 months ago 376MB registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps v1 f28fd43be4ad 17 months ago 23MB [root@elk91 ~]# [root@elk91 ~]# docker image rm -f `docker image ls -q` Untagged: jasonyin2020/weixiang-games:v0.6 Deleted: sha256:b55cbfca19466855f8641e50dcc8b39fd670faa7ead235e91f3da5d058002f1e Deleted: sha256:ec9db6625058a29f186f9ff0bb0ced12cc8a9f742bd8153afb69ea6d4bc9f10f Deleted: sha256:24adb7d22d9dd2081102c4d27ed626e45096ed0efda6ae1ae57af023524e85c1 Deleted: sha256:fa1f3f6f0a5e10b1bcf6a8a81e25f6c852686390ea379b5921c1c489a6025aa6 Deleted: 
sha256:2e11b973beb2dd3ccdba59979f11de6cf0661e5450bffc057c894e0f617e0bef Untagged: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 Untagged: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps@sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c Deleted: sha256:f28fd43be4ad41fc768dcc3629f8479d1443df01ada10ac9a771314e4fdef599 Deleted: sha256:1d14ebb5d571f4ae7b23885c1936c7ccf7ccea25d9abe47adef4bbb08b02b0c1 Deleted: sha256:d934f66fc762f5dfba2222695f49e12207d31c10f028a60dcaed0116863946e4 Deleted: sha256:4045d0a9d114395acf42abeaa961f6cc6ecc3a6cdef1f44f9b39fe9abdddc41f Deleted: sha256:14c3b43c8b6645d8a4a8cf9f351428455fcd3b24822831f54e0ac6bfe0739313 Deleted: sha256:ca043fe36d34b379681078b276e99e77ac5d9019cca8653a0a5408ab09893aba Deleted: sha256:601a550fa75854a4beeeba9640873699e0fc4c4a9b1a88cb66e7fae6ae881f31 Deleted: sha256:7fcb75871b2101082203959c83514ac8a9f4ecfee77a0fe9aa73bbe56afdf1b4 [root@elk91 ~]# [root@elk91 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE [root@elk91 ~]# 6.基于刚刚导出的镜像导入 [root@elk91 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE [root@elk91 ~]# # 这个weixiang-apps-v1.tar.gz本地要有 [root@elk91 ~]# docker image load -i weixiang-apps-v1.tar.gz 7fcb75871b21: Loading layer [==================================================>] 5.904MB/5.904MB 15d7cdc64789: Loading layer [==================================================>] 18.32MB/18.32MB 5f66747c8a72: Loading layer [==================================================>] 3.072kB/3.072kB c39c1c35e3e8: Loading layer [==================================================>] 4.096kB/4.096kB b8dbe22b95f7: Loading layer [==================================================>] 3.584kB/3.584kB 9d5b000ce7c7: Loading layer [==================================================>] 7.168kB/7.168kB 8e2be8913e57: Loading layer [==================================================>] 238.1kB/238.1kB Loaded image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 [root@elk91 ~]# [root@elk91 ~]# docker image load -i weixiang-images.tar.gz Loaded image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 Loaded image: hello-world:latest [root@elk91 ~]# [root@elk91 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE hello-world latest 74cc54e27dc4 5 months ago 10.1kB registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps v1 f28fd43be4ad 17 months ago 23MB [root@elk91 ~]# 7.为镜像打标签 [root@elk91 ~]# docker image tag hello-world:latest weixiang98:v0.1 # docker image tag 源镜像 新标签 [root@elk91 ~]# [root@elk91 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE hello-world latest 74cc54e27dc4 5 months ago 10.1kB weixiang98 v0.1 74cc54e27dc4 5 months ago 10.1kB registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps v1 f28fd43be4ad 17 months ago 23MB [root@elk91 ~]# [root@elk91 ~]# docker image tag hello-world:latest registry.cn-hangzhou.aliyuncs.com/weixiang-weixiang98/xiuxian:v1 [root@elk91 ~]# [root@elk91 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE registry.cn-hangzhou.aliyuncs.com/weixiang-weixiang98/xiuxian v1 74cc54e27dc4 5 months ago 10.1kB hello-world latest 74cc54e27dc4 5 months ago 10.1kB weixiang98 v0.1 74cc54e27dc4 5 months ago 10.1kB registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps v1 f28fd43be4ad 17 months ago 23MB [root@elk91 ~]# docker镜像管理的简写形式 - docker rmi 删除镜像 - docker save 导出镜像 - docker load 导入镜像 - docker images 查看镜像列表 - docker tag 镜像打标签。
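
The weixiang-images.tar.gz archive above contains two images; docker save accepts multiple image references, so such an archive can be produced in a single command. A small sketch using the images from this section:

bash
# Save several images into one archive
docker save -o weixiang-images.tar.gz \
  hello-world:latest \
  registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1

# Peek inside without importing: a real image tarball contains
# manifest.json plus the layer tarballs
tar -tf weixiang-images.tar.gz | head
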
2、容器管理基础
bash
1.查看容器列表 [root@elk91 ~]# docker container ps # 查看正在运行的容器,很显然,容器没有运行的。 CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@elk91 ~]# [root@elk91 ~]# docker container ps -a # 使用-a选项可以查看所有的容器状态,包含已经退出的容器。 CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 84d25e3f6104 hello-world "/hello" 53 minutes ago Exited (0) 53 minutes ago ecstatic_chaum bc79db9286e0 hello-world:latest "/hello" 53 minutes ago Exited (0) 53 minutes ago loving_albattani ab37aa30f2f8 hello-world "/hello" About an hour ago Exited (0) About an hour ago infallible_jepsen 4f3e3539d0d5 hello-world "/hello" About an hour ago Exited (0) About an hour ago affectionate_chatterjee [root@elk91 ~]# # ps跟ls一样的功能 [root@elk91 ~]# docker container ls CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@elk91 ~]# [root@elk91 ~]# docker container ls -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 84d25e3f6104 hello-world "/hello" 54 minutes ago Exited (0) 54 minutes ago ecstatic_chaum bc79db9286e0 hello-world:latest "/hello" 54 minutes ago Exited (0) 54 minutes ago loving_albattani ab37aa30f2f8 hello-world "/hello" About an hour ago Exited (0) About an hour ago infallible_jepsen 4f3e3539d0d5 hello-world "/hello" About an hour ago Exited (0) About an hour ago affectionate_chatterjee [root@elk91 ~]# [root@elk91 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@elk91 ~]# [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 84d25e3f6104 hello-world "/hello" 55 minutes ago Exited (0) 55 minutes ago ecstatic_chaum bc79db9286e0 hello-world:latest "/hello" 55 minutes ago Exited (0) 55 minutes ago loving_albattani ab37aa30f2f8 hello-world "/hello" About an hour ago Exited (0) About an hour ago infallible_jepsen 4f3e3539d0d5 hello-world "/hello" About an hour ago Exited (0) About an hour ago affectionate_chatterjee [root@elk91 ~]# 2.创建名字为xiuxiu的容器 # 基于镜像创建 [root@elk01 ~]# docker image ls [root@elk91 ~]# docker container create --name xiuxian -p 90:80 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 ec80a71d46687d337f16d708f86553c8a5504ecadd631bc392264561f6c6aeb2 [root@elk91 ~]# [root@elk91 ~]# docker ps -a # 可以看到新创建的xiuxiu容器 CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES ec80a71d4668 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 10 seconds ago Created xiuxian 84d25e3f6104 hello-world "/hello" 56 minutes ago Exited (0) 56 minutes ago ecstatic_chaum bc79db9286e0 hello-world:latest "/hello" 56 minutes ago Exited (0) 56 minutes ago loving_albattani ab37aa30f2f8 hello-world "/hello" About an hour ago Exited (0) About an hour ago infallible_jepsen 4f3e3539d0d5 hello-world "/hello" 2 hours ago Exited (0) 2 hours ago affectionate_chatterjee [root@elk91 ~]# [root@elk91 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@elk91 ~]# [root@elk91 ~]# docker ps -l # 查看'最新'一个容器的状态 CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES ec80a71d4668 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 19 seconds ago Created xiuxian [root@elk91 ~]# 3.启动容器 [root@elk91 ~]# docker container start xiuxian xiuxian [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES ec80a71d4668 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" About a minute ago Up 1 second 0.0.0.0:90->80/tcp, :::90->80/tcp xiuxian # status为up表示已经启动 4.访问测试 http://10.0.0.91:90/ 5.直接运行容器 [root@elk91 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE 
hello-world latest 74cc54e27dc4 5 months ago 10.1kB weixiang98 v0.1 74cc54e27dc4 5 months ago 10.1kB registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps v1 f28fd43be4ad 17 months ago 23MB [root@elk91 ~]# docker container run --name xiuxian02 -p 100:80 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 Unable to find image 'registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2' locally # docker container run:Docker 命令,用于创建并启动一个新的容器。 # --name xiuxian02:为容器指定一个名称 "xiuxian02" # -p 100:80:端口映射参数,将宿主机的 100 端口映射到容器内部的 80 端口。这样外部访问宿主机IP的100端口时,请求会被转发到容器内的80端口 # registry.cn-han。。。:指定要运行的容器镜像地址 v2: Pulling from yinzhengjie-k8s/apps 5758d4e389a3: Already exists 51d66f629021: Already exists ff9c6add3f30: Already exists dcc43d9a97b4: Already exists 5dcfac0f2f9c: Already exists 2c6e86e57dfd: Already exists b07c4abce9eb: Pull complete Digest: sha256:3ac38ee6161e11f2341eda32be95dcc6746f587880f923d2d24a54c3a525227e Status: Downloaded newer image for registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/ /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh 10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf 10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh /docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh /docker-entrypoint.sh: Configuration complete; ready for start up 2025/06/30 03:03:00 [notice] 1#1: using the "epoll" event method 2025/06/30 03:03:00 [notice] 1#1: nginx/1.20.1 2025/06/30 03:03:00 [notice] 1#1: built by gcc 10.2.1 20201203 (Alpine 10.2.1_pre1) 2025/06/30 03:03:00 [notice] 1#1: OS: Linux 5.15.0-142-generic 2025/06/30 03:03:00 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 524288:524288 2025/06/30 03:03:00 [notice] 1#1: start worker processes 2025/06/30 03:03:00 [notice] 1#1: start worker process 33 2025/06/30 03:03:00 [notice] 1#1: start worker process 34 6.客户端访问测试 http://10.0.0.91:100/ 7.后台运行镜像 [root@elk91 ~]# docker run --name xiuxian03 -d -p 200:80 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 Unable to find image 'registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3' locally v3: Pulling from yinzhengjie-k8s/apps 5758d4e389a3: Already exists 51d66f629021: Already exists ff9c6add3f30: Already exists dcc43d9a97b4: Already exists 5dcfac0f2f9c: Already exists 2c6e86e57dfd: Already exists fe426320d5d6: Pull complete Digest: sha256:3d6b02b6335d8ecf3d09bc2a4a9848e6868a8f6aa4924faf53a24e8e0a017472 Status: Downloaded newer image for registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 75af4a05e2c54eeefaec16675b5d35755a35a4fbeceb6e845aa735f48ce9ed44 [root@elk91 ~]# [root@elk91 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE hello-world latest 74cc54e27dc4 5 months ago 10.1kB weixiang98 v0.1 74cc54e27dc4 5 months ago 10.1kB registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps v2 d65adc8a2f32 17 months ago 22.9MB registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps v3 b86c7f8ae11e 17 months ago 23MB registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps v1 f28fd43be4ad 17 months ago 23MB [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 505ba8caab73 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 
"/docker-entrypoint.…" 3 seconds ago Up 2 seconds 0.0.0.0:200->80/tcp, :::200->80/tcp xiuxian03 [root@elk91 ~]# 8.客户端再次访问测试 http://10.0.0.91:200/ 9.启动容器 [root@elk91 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 505ba8caab73 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 "/docker-entrypoint.…" About a minute ago Up About a minute 0.0.0.0:200->80/tcp, :::200->80/tcp xiuxian03 ec80a71d4668 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 8 minutes ago Up 6 minutes 0.0.0.0:90->80/tcp, :::90->80/tcp xiuxian [root@elk91 ~]# [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 505ba8caab73 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 "/docker-entrypoint.…" About a minute ago Up About a minute 0.0.0.0:200->80/tcp, :::200->80/tcp xiuxian03 c6e67c77def3 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 "/docker-entrypoint.…" 4 minutes ago Exited (0) 3 minutes ago xiuxian02 ec80a71d4668 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 8 minutes ago Up 6 minutes 0.0.0.0:90->80/tcp, :::90->80/tcp xiuxian 84d25e3f6104 hello-world "/hello" About an hour ago Exited (0) About an hour ago ecstatic_chaum bc79db9286e0 hello-world:latest "/hello" About an hour ago Exited (0) About an hour ago loving_albattani ab37aa30f2f8 hello-world "/hello" 2 hours ago Exited (0) 2 hours ago infallible_jepsen 4f3e3539d0d5 hello-world "/hello" 2 hours ago Exited (0) 2 hours ago affectionate_chatterjee [root@elk91 ~]# [root@elk91 ~]# [root@elk91 ~]# docker container start xiuxian02 xiuxian02 [root@elk91 ~]# [root@elk91 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 505ba8caab73 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 "/docker-entrypoint.…" 2 minutes ago Up 2 minutes 0.0.0.0:200->80/tcp, :::200->80/tcp xiuxian03 c6e67c77def3 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 "/docker-entrypoint.…" 5 minutes ago Up 3 seconds 0.0.0.0:100->80/tcp, :::100->80/tcp xiuxian02 ec80a71d4668 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 8 minutes ago Up 6 minutes 0.0.0.0:90->80/tcp, :::90->80/tcp xiuxian [root@elk91 ~]# 10.停止容器 [root@elk91 ~]# docker container stop xiuxian03 xiuxian03 [root@elk91 ~]# [root@elk91 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES c6e67c77def3 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 "/docker-entrypoint.…" 6 minutes ago Up About a minute 0.0.0.0:100->80/tcp, :::100->80/tcp xiuxian02 ec80a71d4668 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 9 minutes ago Up 7 minutes 0.0.0.0:90->80/tcp, :::90->80/tcp xiuxian [root@elk91 ~]# [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 505ba8caab73 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 "/docker-entrypoint.…" 3 minutes ago Exited (137) 7 seconds ago xiuxian03 c6e67c77def3 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 "/docker-entrypoint.…" 6 minutes ago Up About a minute 0.0.0.0:100->80/tcp, :::100->80/tcp xiuxian02 ec80a71d4668 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 9 minutes ago Up 7 minutes 0.0.0.0:90->80/tcp, :::90->80/tcp xiuxian 84d25e3f6104 hello-world "/hello" About an hour ago Exited (0) About an hour ago ecstatic_chaum bc79db9286e0 hello-world:latest "/hello" About an hour ago Exited (0) About an hour ago loving_albattani ab37aa30f2f8 hello-world "/hello" 2 hours 
ago Exited (0) 2 hours ago infallible_jepsen 4f3e3539d0d5 hello-world "/hello" 2 hours ago Exited (0) 2 hours ago affectionate_chatterjee [root@elk91 ~]# [root@elk91 ~]# docker container stop -t 0 xiuxian02 xiuxian02 [root@elk91 ~]# [root@elk91 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES ec80a71d4668 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 10 minutes ago Up 8 minutes 0.0.0.0:90->80/tcp, :::90->80/tcp xiuxian [root@elk91 ~]# 11.杀死容器 [root@elk91 ~]# docker container kill xiuxian xiuxian [root@elk91 ~]# [root@elk91 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@elk91 ~]# 12.重启容器 [root@elk91 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@elk91 ~]# [root@elk91 ~]# [root@elk91 ~]# docker container restart xiuxian xiuxian [root@elk91 ~]# [root@elk91 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES ec80a71d4668 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 11 minutes ago Up 1 second 0.0.0.0:90->80/tcp, :::90->80/tcp xiuxian [root@elk91 ~]# 13.删除容器 [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 505ba8caab73 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 "/docker-entrypoint.…" 5 minutes ago Exited (137) 2 minutes ago xiuxian03 c6e67c77def3 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 "/docker-entrypoint.…" 8 minutes ago Exited (137) About a minute ago xiuxian02 ec80a71d4668 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 11 minutes ago Up 35 seconds 0.0.0.0:90->80/tcp, :::90->80/tcp xiuxian 84d25e3f6104 hello-world "/hello" About an hour ago Exited (0) About an hour ago ecstatic_chaum bc79db9286e0 hello-world:latest "/hello" About an hour ago Exited (0) About an hour ago loving_albattani ab37aa30f2f8 hello-world "/hello" 2 hours ago Exited (0) 2 hours ago infallible_jepsen 4f3e3539d0d5 hello-world "/hello" 2 hours ago Exited (0) 2 hours ago affectionate_chatterjee [root@elk91 ~]# [root@elk91 ~]# docker container rm xiuxian02 xiuxian02 [root@elk91 ~]# [root@elk91 ~]# docker rm xiuxian03 xiuxian03 [root@elk91 ~]# [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES ec80a71d4668 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 12 minutes ago Up About a minute 0.0.0.0:90->80/tcp, :::90->80/tcp xiuxian 84d25e3f6104 hello-world "/hello" About an hour ago Exited (0) About an hour ago ecstatic_chaum bc79db9286e0 hello-world:latest "/hello" About an hour ago Exited (0) About an hour ago loving_albattani ab37aa30f2f8 hello-world "/hello" 2 hours ago Exited (0) 2 hours ago infallible_jepsen 4f3e3539d0d5 hello-world "/hello" 2 hours ago Exited (0) 2 hours ago affectionate_chatterjee [root@elk91 ~]# [root@elk91 ~]# docker rm -f `docker ps -a -q` # 删除所有的容器 ec80a71d4668 84d25e3f6104 bc79db9286e0 ab37aa30f2f8 4f3e3539d0d5 [root@elk91 ~]# [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@elk91 ~]#
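
Besides removing everything with "docker rm -f $(docker ps -aq)", stopped containers can be cleaned up more selectively; a quick sketch:

bash
# Remove only containers that have already exited
docker ps -aq -f status=exited | xargs -r docker rm

# Or let docker handle it: prune all stopped containers
docker container prune -f
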
3、基于docker部署游戏镜像案例
bash
1.hub官方搜索要使用的镜像 略,见视频。 2.拉取镜像 略,见视频。 3.运行服务 [root@elk91 ~]# docker run --restart unless-stopped -dp 80:80 jasonyin2020/weixiang-games:v0.6 3a0e656125109d252c0572380ec8220f0effecec007bdea327073315277d02bd [root@elk91 ~] [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 3a0e65612510 jasonyin2020/weixiang-games:v0.6 "/docker-entrypoint.…" 23 seconds ago Up 23 seconds 0.0.0.0:80->80/tcp, :::80->80/tcp sad_banzai [root@elk91 ~]# 4.添加windows解析记录 10.0.0.91 game01.weixiang.com 10.0.0.91 game02.weixiang.com 10.0.0.91 game03.weixiang.com 10.0.0.91 game04.weixiang.com 10.0.0.91 game05.weixiang.com 10.0.0.91 game06.weixiang.com 10.0.0.91 game07.weixiang.com 10.0.0.91 game08.weixiang.com 10.0.0.91 game09.weixiang.com 10.0.0.91 game10.weixiang.com 10.0.0.91 game11.weixiang.com 10.0.0.91 game12.weixiang.com 10.0.0.91 game13.weixiang.com 10.0.0.91 game14.weixiang.com 10.0.0.91 game15.weixiang.com 10.0.0.91 game16.weixiang.com 10.0.0.91 game17.weixiang.com 10.0.0.91 game18.weixiang.com 10.0.0.91 game19.weixiang.com 10.0.0.91 game20.weixiang.com 10.0.0.91 game21.weixiang.com 10.0.0.91 game22.weixiang.com 10.0.0.91 game23.weixiang.com 10.0.0.91 game24.weixiang.com 10.0.0.91 game25.weixiang.com 10.0.0.91 game26.weixiang.com 10.0.0.91 game27.weixiang.com 5.访问测试 略,见视频。
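
Typing 27 host records by hand is error-prone; the same list can be generated with a short loop and pasted into the Windows hosts file (C:\Windows\System32\drivers\etc\hosts). A sketch:

bash
# Generate the game01..game27 resolution records for 10.0.0.91
for i in $(seq -w 1 27); do
  echo "10.0.0.91 game${i}.weixiang.com"
done
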
4、容器名称修改,在容器中执行命令及IP地址查看
bash
- 容器名称修改,在容器中执行命令及IP地址查看 1.修改容器的名称 [root@elk91 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 3a0e65612510 jasonyin2020/weixiang-games:v0.6 "/docker-entrypoint.…" 2 hours ago Up 2 hours 0.0.0.0:80->80/tcp, :::80->80/tcp sad_banzai [root@elk91 ~]# # docker container rename 原名字 现名字 [root@elk91 ~]# docker container rename sad_banzai games-server [root@elk91 ~]# [root@elk91 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 3a0e65612510 jasonyin2020/weixiang-games:v0.6 "/docker-entrypoint.…" 2 hours ago Up 2 hours 0.0.0.0:80->80/tcp, :::80->80/tcp games-server [root@elk91 ~]# 2.在容器中执行命令 [root@elk91 ~]# docker container exec games-server ifconfig # docker container exec:用于在正在运行的容器内执行指定命令,简写docker exec eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:02 inet addr:172.17.0.2 Bcast:172.17.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:3705 errors:0 dropped:0 overruns:0 frame:0 TX packets:2438 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:619197 (604.6 KiB) TX bytes:28116419 (26.8 MiB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) [root@elk91 ~]# docker container exec games-server cat /etc/hostname 3a0e65612510 [root@elk91 ~]# [root@elk91 ~]# docker container exec games-server ps -ef PID USER TIME COMMAND 1 root 0:00 nginx: master process nginx -g daemon off; 31 nginx 0:00 nginx: worker process 32 nginx 0:00 nginx: worker process 86 root 0:00 ps -ef [root@elk91 ~]# 3.和容器进行交互 [root@elk91 ~]# docker exec -it games-server sh / # ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:02 inet addr:172.17.0.2 Bcast:172.17.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:3705 errors:0 dropped:0 overruns:0 frame:0 TX packets:2438 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:619197 (604.6 KiB) TX bytes:28116419 (26.8 MiB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # / # ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 24: eth0@if25: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0 valid_lft forever preferred_lft forever / # / # cat /etc/shells # valid login shells /bin/sh /bin/ash / # / # ps -ef PID USER TIME COMMAND 1 root 0:00 nginx: master process nginx -g daemon off; 31 nginx 0:00 nginx: worker process 32 nginx 0:00 nginx: worker process 78 root 0:00 sh 85 root 0:00 ps -ef / # [root@elk91 ~]# 4.查看容器的IP地址 [root@elk91 ~]# docker container inspect -f "{{.NetworkSettings.IPAddress}}" games-server 172.17.0.2 [root@elk91 ~]# # 查看所有的信息 [root@elk01 ~]# docker inspect ganmes-server 5.查看最新一个容器的IP地址 [root@elk91 ~]# docker container inspect -f "{{.NetworkSettings.IPAddress}}" `docker ps -lq` 172.17.0.2 [root@elk91 ~]# [root@elk91 ~]# docker container inspect -f "{{range 
.NetworkSettings.Networks}}{{.IPAddress}}{{end}}" `docker container ps -lq` # 扩展,了解即可 172.17.0.2 [root@elk91 ~]# - 向容器传递环境变量并部署MySQL服务 1.向容器传递环境变量 [root@elk91 ~]# docker run --name c1 -e SCHOOL=weixiang -e class=weixiang98 -d registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 5af436ca048cfa8153de02657ce9378e559c649ca7929cd18cc412dd99f5b6d0 --------------------------------------------------------------------------------------- docker run \ --name c1 \ # 指定容器名为 "c1" -e SCHOOL=weixiang \ # 设置环境变量 SCHOOL=weixiang -e class=weixiang98 \ # 设置环境变量 class=weixiang98 -d \ # 后台运行(detached 模式) registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 # 使用的镜像 --------------------------------------------------------------------------------------- [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 5af436ca048c registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 4 seconds ago Up 2 seconds 80/tcp c1 [root@elk91 ~]# [root@elk91 ~]# docker exec -it c1 env ... SCHOOL=weixiang class=weixiang98 ... [root@elk91 ~]# 2.导入镜像 [root@elk91 ~]# wget http://192.168.21.253/Resources/Docker/images/WordPress/weixiang-mysql-v8.0.36-oracle.tar.gz [root@elk91 ~]# docker load -i weixiang-mysql-v8.0.36-oracle.tar.gz fc037c17567d: Loading layer [==================================================>] 118.8MB/118.8MB 152c1ecea280: Loading layer [==================================================>] 11.26kB/11.26kB fb5c92e924ab: Loading layer [==================================================>] 2.359MB/2.359MB 5b76076a2dd4: Loading layer [==================================================>] 13.86MB/13.86MB a6909c467615: Loading layer [==================================================>] 6.656kB/6.656kB eaa1e85de732: Loading layer [==================================================>] 3.072kB/3.072kB 9513d2aedd12: Loading layer [==================================================>] 185.6MB/185.6MB 84d659420bad: Loading layer [==================================================>] 3.072kB/3.072kB 876b8cd855eb: Loading layer [==================================================>] 298.7MB/298.7MB 1c0ff7ed67c4: Loading layer [==================================================>] 16.9kB/16.9kB 318dde184d61: Loading layer [==================================================>] 1.536kB/1.536kB Loaded image: mysql:8.0.36-oracle [root@elk91 ~]# [root@elk91 ~]# docker image ls mysql REPOSITORY TAG IMAGE ID CREATED SIZE mysql 8.0.36-oracle f5f171121fa3 15 months ago 603MB [root@elk91 ~]# 3.启动mysql服务 [root@elk91 ~]# docker run -d --name mysql-server -e MYSQL_ALLOW_EMPTY_PASSWORD="yes" -e MYSQL_DATABASE=wordpress -e MYSQL_USER=weixiang98 -e MYSQL_PASSWORD=weixiang mysql:8.0.36-oracle 3574c86d2f610a92d1d92d0a8e6089b66aeb7bc09937265adb9640ca7c5a64b9 # -d:以后台(daemon)模式运行容器 # --name mysql-server:给容器命名为"mysql-server" # -e MYSQL_ALLOW_EMPTY_PASSWORD="yes":设置环境变量,允许空密码登录MySQL # -e MYSQL_DATABASE=word:设置环境变量,容器启动时会自动创建一个名为"wordpress"的数据库 [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 3574c86d2f61 mysql:8.0.36-oracle "docker-entrypoint.s…" 3 seconds ago Up 2 seconds 3306/tcp, 33060/tcp mysql-server [root@elk91 ~]# 4.测试验证 [root@elk91 ~]# docker exec -it mysql-server mysql # docker exec:在运行的容器中执行命令 # -it:以交互模式运行(保持 STDIN 打开并分配伪终端) # mysql-server:目标容器的名称(之前用 --name mysql-server 创建的) # mysql:要在容器内执行的命令(MySQL 客户端) Welcome to the MySQL monitor. Commands end with ; or \g. 
Your MySQL connection id is 10 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> SHOW DATABASES; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | | wordpress | +--------------------+ 5 rows in set (0.00 sec) mysql> mysql> USE wordpress; Database changed mysql> mysql> SHOW TABLES; Empty set (0.00 sec) mysql> mysql> SELECT user,host FROM mysql.user; +------------------+-----------+ | user | host | +------------------+-----------+ | weixiang98 | % | | root | % | | mysql.infoschema | localhost | | mysql.session | localhost | | mysql.sys | localhost | | root | localhost | +------------------+-----------+ 6 rows in set (0.00 sec) mysql> 5.查看mysql的IP地址 [root@elk91 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" `docker container ps -lq` 172.17.0.4 [root@elk91 ~]# [root@elk91 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" mysql-server 172.17.0.4 [root@elk91 ~]#
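
The docker inspect -f Go templates used above are not limited to the IP address; a few other fields that come in handy, shown against the mysql-server container from this section:

bash
# Environment variables passed to the container
docker inspect -f '{{range .Config.Env}}{{println .}}{{end}}' mysql-server

# Current state and restart policy
docker inspect -f '{{.State.Status}}' mysql-server
docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' mysql-server

# Published port mappings
docker inspect -f '{{json .NetworkSettings.Ports}}' mysql-server
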
5、基于docker部署WordPress
bash
# 1、下载 SVIP: [root@elk91 ~]# wget http://192.168.21.253/Resources/Docker/images/WordPress/weixiang-wordpress-v6.7.1-php8.1-apache.tar.gz # 2、将Docker镜像从 .tar.gz 压缩归档文件加载到本地Docker环境中 [root@elk91 ~]# docker load < weixiang-wordpress-v6.7.1-php8.1-apache.tar.gz 7914c8f600f5: Loading layer [==================================================>] 77.83MB/77.83MB 9d3505e94f88: Loading layer [==================================================>] 3.584kB/3.584kB cca374cc7ecc: Loading layer [==================================================>] 320.2MB/320.2MB 93531ad2cad2: Loading layer [==================================================>] 5.12kB/5.12kB 76c322751b28: Loading layer [==================================================>] 50.46MB/50.46MB e1862c15b46e: Loading layer [==================================================>] 9.728kB/9.728kB 41a48fee6648: Loading layer [==================================================>] 7.68kB/7.68kB 683fadaa2d15: Loading layer [==================================================>] 12.42MB/12.42MB cd29cc24986e: Loading layer [==================================================>] 4.096kB/4.096kB 65ed9c32ccf8: Loading layer [==================================================>] 49.07MB/49.07MB 6a874987401a: Loading layer [==================================================>] 12.8kB/12.8kB 72d18aad6507: Loading layer [==================================================>] 4.608kB/4.608kB 541b75dced10: Loading layer [==================================================>] 4.608kB/4.608kB 5f70bf18a086: Loading layer [==================================================>] 1.024kB/1.024kB dd20169e4636: Loading layer [==================================================>] 69.66MB/69.66MB 7aa076c583ee: Loading layer [==================================================>] 56.58MB/56.58MB 1bd5766fdd49: Loading layer [==================================================>] 5.632kB/5.632kB fd6f751879ec: Loading layer [==================================================>] 4.608kB/4.608kB 10ffebd37647: Loading layer [==================================================>] 91.65kB/91.65kB 9dfe5f929ccc: Loading layer [==================================================>] 76.94MB/76.94MB 3a7d623958af: Loading layer [==================================================>] 9.216kB/9.216kB 5a91ae3138b2: Loading layer [==================================================>] 6.656kB/6.656kB Loaded image: wordpress:6.7.1-php8.1-apache [root@elk91 ~]# # 3.查看MySQL容器的IP地址 [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 3574c86d2f61 mysql:8.0.36-oracle "docker-entrypoint.s…" 43 minutes ago Up 43 minutes 3306/tcp, 33060/tcp mysql-server [root@elk91 ~]# [root@elk91 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" mysql-server 172.17.0.7 [root@elk91 ~]# # 4.运行WordPress服务 [root@elk91 ~]# docker run -e WORDPRESS_DB_HOST=172.17.0.7 -e WORDPRESS_DB_USER=weixiang98 -e WORDPRESS_DB_PASSWORD=weixiang -e WORDPRESS_DB_NAME=wordpress --name wordpress-server -p 8080:80 -d wordpress:6.7.1-php8.1-apache 34daefce114492a75cd7a0e106c0355d17450355b2a8c0e3c92e4b5b8070beb5 [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 34daefce1144 wordpress:6.7.1-php8.1-apache "docker-entrypoint.s…" 5 seconds ago Up 4 seconds 0.0.0.0:8080->80/tcp, :::8080->80/tcp wordpress-server [root@elk91 ~]# 4.访问wordpress的WebUI - 今日内容回顾: - docker部署方式 - 包管理工具:apt/dnf - 二进制 - 脚本安装 【推荐】 - docker架构 - docker客户端 - dockerd服务端 - registry远程仓库 镜像的组成部分: 
[远程仓库的服务器地址/项目名称/]镜像名称[:版本号] 远程仓库的服务器地址默认值: docker.io 项目名称默认值: library 版本号默认值: latest - 镜像管理 - docker pull ***** - docker tag ***** - docker save ***** - docker load ***** - docker rmi ***** - docker images ***** - 容器管理 - docker create ** - docker start **** - docker restart **** - docker stop **** - docker kill *** - docker rm ***** - docker run ***** - docker ps ***** - docker inspect **** - docker rename ** - docker exec *****
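
One caveat in the WordPress steps above: WORDPRESS_DB_HOST is hard-coded to the MySQL container's current IP (172.17.0.7), which changes if that container is recreated. A common alternative is a user-defined bridge network, where containers resolve each other by name. This is a sketch only, not part of the original steps (the network name and the second container name are made up for illustration):

bash
# Create a user-defined bridge network with built-in DNS
docker network create wordpress-net

# Attach the existing containers to it
docker network connect wordpress-net mysql-server
docker network connect wordpress-net wordpress-server

# A new WordPress container can then reference the database by container name
docker run -d --name wordpress-server2 -p 8081:80 --network wordpress-net \
  -e WORDPRESS_DB_HOST=mysql-server \
  -e WORDPRESS_DB_USER=weixiang98 \
  -e WORDPRESS_DB_PASSWORD=weixiang \
  -e WORDPRESS_DB_NAME=wordpress \
  wordpress:6.7.1-php8.1-apache
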
6、外部是如何访问容器的

docker容器端口转发底层原理图解

bash
外部是怎么访问容器的: 如果我们启动一个容器,会自动多出一块网卡,通过ifconfig可以看到所有网卡,理论上有多少个容器就会有多少块网卡产生,现在有c1与c2两个 容器,暴露的都是80端口,c1容器的网卡本端是16对端是17,c2容器的网卡本端是18对端是19,在宿主机上有一个设备叫docker0,是部署docker 服务本身就有的,这个宿主机上就有c1c2两个容器的对端索引 c1如果想访问外网,先通过本地的eth网卡,流经对端docker交换机的对端网口,然后通过开启内核转发,转发给物理网卡,也就是91ip的网卡, 再通过snat源地址转换 #测试 1.查看网卡 [root@elk01 ~]# ifconfig docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500 inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255 inet6 fe80::42:1cff:fe31:1d2 prefixlen 64 scopeid 0x20<link> ether 02:42:1c:31:01:d2 txqueuelen 0 (Ethernet) RX packets 2290 bytes 3567613 (3.5 MB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 3295 bytes 513931 (513.9 KB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 10.0.0.91 netmask 255.255.255.0 broadcast 10.0.0.255 inet6 fe80::20c:29ff:fe31:65e prefixlen 64 scopeid 0x20<link> ether 00:0c:29:31:06:5e txqueuelen 1000 (Ethernet) RX packets 5122643 bytes 4297563150 (4.2 GB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 1668894 bytes 1617847297 (1.6 GB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536 inet 127.0.0.1 netmask 255.0.0.0 inet6 ::1 prefixlen 128 scopeid 0x10<host> loop txqueuelen 1000 (Local Loopback) RX packets 518 bytes 54532 (54.5 KB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 518 bytes 54532 (54.5 KB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 2.运行容器 [root@elk91 ~]# docker run -d --name c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 1472ea9a4cdef472ccf01e96d2cf273f0c7453d8d45a827ca6c64057bf5d70c2 [root@elk91 ~]# [root@elk91 ~]# docker run -d -p 82:80 --name c2 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 0924c0f01eb397ace624768f5f20ab250056a0e0e786fe6734252298b2f5612c [root@elk91 ~]# [root@elk01 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 335b2afa978d registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 2 minutes ago Up 2 minutes 0.0.0.0:82->80/tcp, :::82->80/tcp c2 69350502c088 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 3 minutes ago Up 3 minutes 80/tcp c1 [root@elk01 ~]# 3.再次查看网卡信息 [root@elk01 ~]# ifconfig docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255 inet6 fe80::42:1cff:fe31:1d2 prefixlen 64 scopeid 0x20<link> ether 02:42:1c:31:01:d2 txqueuelen 0 (Ethernet) RX packets 2290 bytes 3567613 (3.5 MB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 3297 bytes 514191 (514.1 KB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 10.0.0.91 netmask 255.255.255.0 broadcast 10.0.0.255 inet6 fe80::20c:29ff:fe31:65e prefixlen 64 scopeid 0x20<link> ether 00:0c:29:31:06:5e txqueuelen 1000 (Ethernet) RX packets 5123139 bytes 4297630252 (4.2 GB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 1669254 bytes 1617897621 (1.6 GB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536 inet 127.0.0.1 netmask 255.0.0.0 inet6 ::1 prefixlen 128 scopeid 0x10<host> loop txqueuelen 1000 (Local Loopback) RX packets 522 bytes 54956 (54.9 KB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 522 bytes 54956 (54.9 KB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 veth6d6eecf: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet6 fe80::ccd8:5eff:fea6:a774 prefixlen 64 scopeid 0x20<link> ether 
ce:d8:5e:a6:a7:74 txqueuelen 0 (Ethernet) RX packets 0 bytes 0 (0.0 B) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 19 bytes 2416 (2.4 KB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 vethd9b24da: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet6 fe80::9c5d:41ff:febc:21c3 prefixlen 64 scopeid 0x20<link> ether 9e:5d:41:bc:21:c3 txqueuelen 0 (Ethernet) RX packets 0 bytes 0 (0.0 B) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 16 bytes 2086 (2.0 KB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 4.查看 veth 对端索引 # 可以看到veth6d6eecf本端是37对端是36 # 可以看到vethd9b24da本端是39对端是38 [root@elk01 ~]# ip link show 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000 link/ether 00:0c:29:31:06:5e brd ff:ff:ff:ff:ff:ff altname enp2s1 altname ens33 3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default link/ether 02:42:1c:31:01:d2 brd ff:ff:ff:ff:ff:ff 37: veth6d6eecf@if36: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default link/ether ce:d8:5e:a6:a7:74 brd ff:ff:ff:ff:ff:ff link-netnsid 0 39: vethd9b24da@if38: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default link/ether 9e:5d:41:bc:21:c3 brd ff:ff:ff:ff:ff:ff link-netnsid 1 4,查看c1与c2的网卡信息 # 可以看到c1本端是36对端是37 [root@elk01 ~]# docker exec c1 ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 36: eth0@if37: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0 valid_lft forever preferred_lft forever [root@elk01 ~]# docker exec c2 ip a # 可以看到c2本端是38对端是39 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 38: eth0@if39: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0 valid_lft forever preferred_lft forever 4.查看docker0网卡的信息 [root@elk91 ~]# apt -y install bridge-utils [root@elk01 ~]# brctl show docker0 # 分别是 bridge name bridge id STP enabled interfaces docker0 8000.02421c3101d2 no veth6d6eecf vethd9b24da 5.查看iptables规则 [root@elk91 ~]# iptables-save | egrep "81|82" -A DOCKER ! -i docker0 -p tcp -m tcp --dport 81 -j DNAT --to-destination 172.17.0.2:80 -A DOCKER ! -i docker0 -p tcp -m tcp --dport 82 -j DNAT --to-destination 172.17.0.3:80 [root@elk91 ~]# 6.查看内核转发参数 [root@elk91 ~]# sysctl -q net.ipv4.ip_forward net.ipv4.ip_forward = 1 [root@elk91 ~]# 7.进入c1容器,测试容器访问外网 [root@elk01 ~]# docker exec -it c1 sh / # ping baidu.com PING baidu.com (182.61.201.211): 56 data bytes 64 bytes from 182.61.201.211: seq=0 ttl=127 time=0.529 ms 64 bytes from 182.61.201.211: seq=1 ttl=127 time=0.716 ms 64 bytes from 182.61.201.211: seq=2 ttl=127 time=0.872 ms # 通过抓取veth6d6eecf的ping协议可以抓到 [root@elk01 ~]# tcpdump -i veth6d6eecf icmp tcpdump: verbose output suppressed, use -v[v]... 
for full protocol decode listening on veth6d6eecf, link-type EN10MB (Ethernet), snapshot length 262144 bytes 08:15:51.367180 IP 172.17.0.2 > 182.61.201.211: ICMP echo request, id 12032, seq 0, length 64 08:15:51.367731 IP 182.61.201.211 > 172.17.0.2: ICMP echo reply, id 12032, seq 0, length 64 08:15:52.378209 IP 172.17.0.2 > 182.61.201.211: ICMP echo request, id 12032, seq 1, length 64 # 因为流经宿主机,所有也能抓到 [root@elk01 ~]# tcpdump -i eth0 icmp tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes 08:18:13.440313 IP elk01 > 182.61.201.211: ICMP echo request, id 12544, seq 0, length 64 08:18:13.440985 IP 182.61.201.211 > elk01: ICMP echo reply, id 12544, seq 0, length 64 08:18:14.449092 IP elk01 > 182.61.201.211: ICMP echo request, id 12544, seq 1, length 64 08:18:14.449831 IP 182.61.201.211 > elk01: ICMP echo reply, id 12544, seq 1, length 64 8、抓取c2容器的包 # 很显然,是没有的 [root@elk01 ~]# tcpdump -i vethd9b24da icmp tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on vethd9b24da, link-type EN10MB (Ethernet), snapshot length 262144 bytes 9、关闭内核转发,ping失败 [root@elk01 ~]# sysctl -w net.ipv4.ip_forward=0 net.ipv4.ip_forward = 0 [root@elk01 ~]# docker exec -it c1 sh / # ping baidu.com
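
The DNAT and forwarding behaviour demonstrated above can be inspected and restored with a couple of commands; a short sketch:

bash
# Show the DNAT rules docker created for published ports
iptables -t nat -L DOCKER -n -v

# Show the MASQUERADE (SNAT) rule for the 172.17.0.0/16 bridge subnet
iptables -t nat -L POSTROUTING -n -v | grep 172.17

# Re-enable kernel forwarding after the test in step 9
sysctl -w net.ipv4.ip_forward=1
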
7、docker安装redis
bash
# 导入redis [root@elk91 ~]# wget http://192.168.21.253/Resources/Docker/images/Redis/weixiang-redis-v7.2.8.tar.gz [root@elk91 ~]# ll -rw-r--r-- 1 root root 41729536 Jun 30 14:50 weixiang-redis-v7.2.8.tar.gz # 导入redis镜像 [root@elk91 ~]# docker load -i weixiang-redis-v7.2.8.tar.gz 08000c18d16d: Loading layer [==================================================>] 8.121MB/8.121MB 5eb48cee65ab: Loading layer [==================================================>] 10.75kB/10.75kB 80a02d377c0e: Loading layer [==================================================>] 922.6kB/922.6kB e19019c8f17e: Loading layer [==================================================>] 2.418MB/2.418MB c581187f88c5: Loading layer [==================================================>] 30.21MB/30.21MB a738c5aa361f: Loading layer [==================================================>] 1.536kB/1.536kB 5f70bf18a086: Loading layer [==================================================>] 1.024kB/1.024kB 4d7066b0e143: Loading layer [==================================================>] 4.096kB/4.096kB Loaded image: redis:7.2.8-alpine3.21 # 查看redis镜像 [root@elk91 ~]# docker image ls redis REPOSITORY TAG IMAGE ID CREATED SIZE redis 7.2.8-alpine3.21 124d2eff17a2 2 months ago 40.9MB # 运行redis [root@elk91 ~]# docker run --name redis-server -p 6379:6379 -d redis:7.2.8-alpine3.21 327514f89685a0eaec3c2a1c14df9a7a35969cb49eac6ad7d81e2c66e55dbb0a 登录做简单测试 [root@elk91 ~]# docker exec -it redis-server redis-cli 127.0.0.1:6379> ping PONG 127.0.0.1:6379> set test "hello" OK 127.0.0.1:6379> get test "hello" 127.0.0.1:6379> ======================================================================= ******************************测试远程连接************************** redis-cli -h 10.X.X.X -p 6379 确保6379端口开放远程连接 sudo iptables -L -n | grep 6379 # 检查是否已放行 注:未开放 sudo firewall-cmd --zone=public --add-port=6379/tcp --permanent # 永久添加规则 sudo firewall-cmd --reload # 重新加载防火墙 sudo firewall-cmd --list-ports # 查看已开放的端口 ======================================================================== *****************************配置永久密码****************************** 先保证redis关闭 [root@elk91 ~]# docker stop redis-server redis-server 删除旧容器 [root@elk91 ~]# docker rm redis-server redis-server 创建新容器+密码 [root@elk91 ~]# docker run --name redis-server -p 6379:6379 -d redis:7.2.8-alpine3.21 redis-server --requirepass "oldboy123.com" 1d974d8087659ee236106900486f897d03f64aa84ef49538187ecf3fa535eaf1 [root@elk91 ~]# 测试连接 [root@elk91 ~]# docker exec -it redis-server redis-cli -a oldboy123.com Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe. 127.0.0.1:6379> ping PONG 127.0.0.1:6379> set test "hello" OK 127.0.0.1:6379> get test "hello" 127.0.0.1:6379> ============================================================================ ***********************密码永久化************************************ ———————————————————————————————————————————————————————————————————————————— [root@elk91 ~]# mkdir -p /data/redis 设置固定密码 123 [root@elk91 ~]# echo "requirepass 123" > /data/redis/redis.conf 启动容器 [root@elk91 ~]# docker run --name redis-server -p 6379:6379 \ -v /data/redis/redis.conf:/usr/local/etc/redis/redis.conf \ -d redis:7.2.8-alpine3.21 redis-server /usr/local/etc/redis/redis.conf b1734537ec0073fb86f9485d00d783fa546d88e5b05f81861c88bef48b82b5a4 测试连接 [root@elk91 ~]# docker exec -it redis-server redis-cli -a 123 Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe. 
127.0.0.1:6379> ping PONG 127.0.0.1:6379> set test "hello" OK 127.0.0.1:6379> get test "hello" 127.0.0.1:6379> quit [root@elk91 ~]#
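补充: 如果除了密码之外还希望把redis的数据文件持久化到宿主机,可以在上面"配置文件"方案的基础上再挂载一个数据目录并开启AOF。下面是一个参考写法(宿主机目录 /data/redis/data 为示例假设;官方镜像的数据目录是容器内的 /data;密码沿用上面配置的123):

bash
# 若已有同名容器,先清理掉旧容器
docker rm -f redis-server
mkdir -p /data/redis/data
docker run --name redis-server -p 6379:6379 \
  -v /data/redis/redis.conf:/usr/local/etc/redis/redis.conf \
  -v /data/redis/data:/data \
  -d redis:7.2.8-alpine3.21 redis-server /usr/local/etc/redis/redis.conf --appendonly yes
# 验证: 写入数据后,宿主机/data/redis/data目录下会出现appendonlydir等持久化文件
docker exec -it redis-server redis-cli -a 123 set test hello
ls /data/redis/data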
8、docker安装mq
bash
[root@elk01 ~]# docker pull rabbitmq:4.1.1-management-alpine
[root@elk01 ~]# docker run -d -p 8091:15672 --hostname my-rabbit --name some-rabbit -e RABBITMQ_DEFAULT_USER=admin -e RABBITMQ_DEFAULT_PASS=123456 rabbitmq:4.1.1-management-alpine
# 浏览器访问http://10.0.0.91:8091/

image

创建测试消息队列

image

image
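补充一个简单的验证思路(以上面创建的 some-rabbit 容器为例,rabbitmqctl 是 RabbitMQ 镜像自带的管理命令):

bash
# 查看节点整体状态
docker exec -it some-rabbit rabbitmqctl status
# 查看用户列表(应能看到通过环境变量创建的admin用户)
docker exec -it some-rabbit rabbitmqctl list_users
# 查看队列(在WebUI创建测试队列后可以用它确认)
docker exec -it some-rabbit rabbitmqctl list_queues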

3、故障排查命令

bash
- 今日内容预告: - 故障排查命令 - docker inspect - docker exec - docker logs - docker cp # 1.故障排查命令之docker logs 1.作用 查看容器的日志信息。 2.实战案例 [root@elk01 ~]# docker pull registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 2.1 环境准备 [root@elk91 ~]# docker run -d --name c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 565f3ec27112beb69e3eeafb8f67ba12ae9eb671f04b6d14ed1d8178766bdc7a [root@elk91 ~]# [root@elk91 ~]# docker run -d --name c2 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 9879012e99be522c71caa5056bd560c936558a04f15e101ef7849fa2ed48cd68 [root@elk91 ~]# [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 9879012e99be registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 "/docker-entrypoint.…" 3 seconds ago Up 2 seconds 80/tcp c2 565f3ec27112 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 9 seconds ago Up 7 seconds 80/tcp c1 [root@elk91 ~]# [root@elk91 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" c1 172.17.0.2 [root@elk91 ~]# [root@elk91 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" c2 172.17.0.3 [root@elk91 ~]# 2.2 实时查看容器日志 [root@elk91 ~]# docker logs -f c1 ... 2.3 查看最近20分钟的日志 [root@elk91 ~]# docker logs -f --since 20m c1 ... 2.4 查看10分钟之前的日志 [root@elk91 ~]# docker logs -f --until 10m c1 # 2.故障排查命令之docker cp 1.作用 将宿主机和容器的文件进行互相拷贝的作用。 2.实战案例 2.1 将宿主机的文件拷贝到容器中 [root@elk91 ~]# docker cp /etc/hosts c1:/ 2.2 验证 [root@elk01 ~]# docker exec -it c1 sh / # ls bin etc media root sys dev home mnt run tmp docker-entrypoint.d hosts opt sbin usr docker-entrypoint.sh lib proc srv var 2.3 将容器的文件拷贝到宿主机 [root@elk91 ~]# docker cp c1:/docker-entrypoint.sh /tmp/ 2.4 验证 [root@elk01 ~]# docker cp c1:/docker-entrypoint.sh /tmp/ [root@elk01 ~]# ll /tmp total 72 drwxrwxrwt 17 root root 4096 Jul 1 10:23 ./ drwxr-xr-x 21 root root 4096 Jun 30 18:19 ../ -rwxrwxr-x 1 root root 1202 Nov 13 2021 docker-entrypoint.sh*
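docker logs 还有几个排查时很常用的参数,下面给出一个小结式示例(基于上面已创建的 c1 容器,时间点仅为示例):

bash
# 只看最近50行,避免日志刷屏
docker logs --tail 50 c1
# 输出时带上时间戳,方便和其他组件的日志对时间
docker logs -t --tail 50 c1
# --since 也支持具体时间点(RFC3339格式)
docker logs --since "2025-07-01T10:00:00" c1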

4、容器4种重启策略

bash
1.作用 所谓的重启策略指的是容器在退出时,是否重新启动。 2.指定容器的启动命令实战 # 2.1 导入镜像 [root@elk91 ~]# wget http://192.168.21.253/Resources/Docker/images/weixiang-alpine.tar.gz [root@elk91 ~]# docker load -i weixiang-alpine.tar.gz [root@elk91 ~]# docker image ls alpine REPOSITORY TAG IMAGE ID CREATED SIZE alpine latest 91ef0af61f39 9 months ago 7.8MB [root@elk91 ~]# # 2.2 查看默认的启动命令 [root@elk91 ~]# docker run --name c22 -it -d alpine 5da837de2ba04008aca5204979b37650058df73242f8db4957b9acd166c09de5 [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 5da837de2ba0 alpine "/bin/sh" 2 seconds ago Up 2 seconds c22 [root@elk91 ~]# [root@elk91 ~]# docker exec c22 ps -ef PID USER TIME COMMAND 1 root 0:00 /bin/sh 7 root 0:00 ps -ef [root@elk91 ~]# 2.3 修改默认的启动命令 [root@elk91 ~]# docker run --name c33 -itd alpine:latest sleep 30 7e9f8aa2d66bec07b308189876e523961b08ca667eb7f0cd34ea544a6df8a3c7 [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 7e9f8aa2d66b alpine:latest "sleep 10" 3 seconds ago Up 2 seconds c33 [root@elk91 ~]# # 已经改为30秒 [root@elk91 ~]# docker exec c33 ps -ef PID USER TIME COMMAND 1 root 0:00 sleep 30 6 root 0:00 ps -ef [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 7e9f8aa2d66b alpine:latest "sleep 10" 18 seconds ago Exited (0) 8 seconds ago c33 [root@elk91 ~]# 得出结论: 当容器的启动命令退出时,则容器就会处于退出状态。 3.验证容器的重启策略 3.1 移除所有的容器 [root@elk91 ~]# docker container rm -f `docker container ps -qa` 7e9f8aa2d66b 5da837de2ba0 4321f6a0de58 9879012e99be 565f3ec27112 [root@elk91 ~]# 3.2 创建容器 # 行为: 10 秒后退出,不会重启。 [root@elk91 ~]# docker run --name c33 -itd alpine:latest sleep 10 # -itd: 后台运行(-d)并分配交互式终端(-it,虽然对 sleep 无实际作用)。 # alpine:latest: 使用轻量级 Alpine Linux 镜像。 # sleep 10: 容器启动后执行 sleep 10,10 秒后自动退出。 # 效果与默认 no 相同,显式声明更清晰 [root@elk91 ~]# docker run --name policy-no --restart no -itd alpine:latest sleep 10 # --restart no 设置容器的重启策略为no 65412025e8c63872a39f29a660a823954e741a458f6fc6fadba185f96429b53e [root@elk91 ~]# # 行为: 每10秒退出后立即重启(无限循环)。 [root@elk91 ~]# docker run --name policy-always --restart always -itd alpine:latest sleep 10 bea3850b5807c116c31cd09ecb4238da9eb2fdef5329fcb171e5071d37bb4b28 [root@elk91 ~]# # 如果容器自动退出,会重启,如果手动停止则不再重启 [root@elk91 ~]# docker run --name policy-unless-stopped --restart unless-stopped -itd alpine:latest sleep 10 c6e650395e3958a683dbca7085e07342b41ccfcab7497aa8bde72fb930123a9b [root@elk91 ~]# # 仅在容器异常退出(非0状态码)时重启,最多尝试3次 [root@elk91 ~]# docker run --name policy-on-failure-max --restart on-failure:3 -itd alpine:latest sleep 10 c4118afccecaff495d264501d9cd17b6bc1953d0ec7b4c3a5d9c7f3d07bafbce [root@elk91 ~]# 3.3 查看容器的状态 [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES c4118afcceca alpine:latest "sleep 10" 37 seconds ago Exited (0) 26 seconds ago policy-on-failure-max c6e650395e39 alpine:latest "sleep 10" About a minute ago Up 6 seconds policy-unless-stopped bea3850b5807 alpine:latest "sleep 10" About a minute ago Up 1 second policy-always 65412025e8c6 alpine:latest "sleep 10" 2 minutes ago Exited (0) About a minute ago policy-no [root@elk91 ~]# 3.4 停止所有的容器并重启docker服务【模拟服务器端点】 [root@elk91 ~]# docker kill `docker ps -aq` c6e650395e39 bea3850b5807 Error response from daemon: Cannot kill container: c4118afcceca: Container c4118afccecaff495d264501d9cd17b6bc1953d0ec7b4c3a5d9c7f3d07bafbce is not running Error response from daemon: Cannot kill container: 65412025e8c6: Container 65412025e8c63872a39f29a660a823954e741a458f6fc6fadba185f96429b53e is not 
running [root@elk91 ~]# # 杀死后都不会重启 [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES c4118afcceca alpine:latest "sleep 10" 2 minutes ago Exited (0) 2 minutes ago policy-on-failure-max c6e650395e39 alpine:latest "sleep 10" 4 minutes ago Exited (137) 3 seconds ago policy-unless-stopped bea3850b5807 alpine:latest "sleep 10" 4 minutes ago Exited (137) 3 seconds ago policy-always 65412025e8c6 alpine:latest "sleep 10" 4 minutes ago Exited (0) 4 minutes ago policy-no [root@elk91 ~]# [root@elk91 ~]# systemctl restart docker.service [root@elk91 ~]# # 重启docker.service服务只有always会重启 [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES c4118afcceca alpine:latest "sleep 10" 3 minutes ago Exited (0) 3 minutes ago policy-on-failure-max c6e650395e39 alpine:latest "sleep 10" 4 minutes ago Exited (137) 33 seconds ago policy-unless-stopped bea3850b5807 alpine:latest "sleep 10" 4 minutes ago Up 2 seconds policy-always 65412025e8c6 alpine:latest "sleep 10" 4 minutes ago Exited (0) 4 minutes ago policy-no [root@elk91 ~]# 3.5 启动所有的容器并重启docker服务【模拟服务器端点】 [root@elk91 ~]# docker start `docker ps -qa` c4118afcceca c6e650395e39 bea3850b5807 65412025e8c6 [root@elk91 ~]# [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES c4118afcceca alpine:latest "sleep 10" 5 minutes ago Up 3 seconds policy-on-failure-max c6e650395e39 alpine:latest "sleep 10" 6 minutes ago Up 3 seconds policy-unless-stopped bea3850b5807 alpine:latest "sleep 10" 6 minutes ago Up 3 seconds policy-always 65412025e8c6 alpine:latest "sleep 10" 6 minutes ago Up 2 seconds policy-no [root@elk91 ~]# [root@elk91 ~]# systemctl restart docker.service [root@elk91 ~]# # 启动状态下重启只有policy-no跟policy-on-failure-max不会启动 [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES c4118afcceca alpine:latest "sleep 10" 5 minutes ago Exited (0) 21 seconds ago policy-on-failure-max c6e650395e39 alpine:latest "sleep 10" 6 minutes ago Up 9 seconds policy-unless-stopped bea3850b5807 alpine:latest "sleep 10" 7 minutes ago Up 9 seconds policy-always 65412025e8c6 alpine:latest "sleep 10" 7 minutes ago Exited (255) 32 seconds ago policy-no [root@elk91 ~]# 3.6 重启异常退出验证 [root@elk91 ~]# docker run --name c1 -d --restart on-failure:3 alpine:latest sleep 60 bab1b67d1f4f7d030fecaf48b374d3bb54839ca50d4a46d929ca5e6fb6b01b8c [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES bab1b67d1f4f alpine:latest "sleep 60" 2 seconds ago Up 2 seconds c1 [root@elk91 ~]# [root@elk91 ~]# docker inspect -f '{{.State.Pid}}' c1 25060 [root@elk91 ~]# [root@elk91 ~]# kill -9 25060 [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES bab1b67d1f4f alpine:latest "sleep 60" 20 seconds ago Up 1 second c1 [root@elk91 ~]# [root@elk91 ~]# docker inspect -f '{{.State.Pid}}' c1 25193 [root@elk91 ~]# [root@elk91 ~]# kill -9 25193 [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES bab1b67d1f4f alpine:latest "sleep 60" 46 seconds ago Up 1 second c1 [root@elk91 ~]# [root@elk91 ~]# [root@elk91 ~]# docker inspect -f '{{.State.Pid}}' c1 25304 [root@elk91 ~]# [root@elk91 ~]# kill -9 25304 [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES bab1b67d1f4f alpine:latest "sleep 60" About a minute ago Up 1 second c1 [root@elk91 ~]# [root@elk91 ~]# docker inspect -f '{{.State.Pid}}' c1 25457 [root@elk91 ~]# [root@elk91 ~]# kill -9 
25457
[root@elk91 ~]#
[root@elk91 ~]# docker ps -l
CONTAINER ID   IMAGE           COMMAND      CREATED              STATUS                      PORTS     NAMES
bab1b67d1f4f   alpine:latest   "sleep 60"   About a minute ago   Exited (137) 1 second ago             c1
[root@elk91 ~]#

得出结论:
- 当容器正常退出时,unless-stopped和always始终会重启;
- 重启docker服务前,若容器处于退出状态,只有always策略的容器会被拉起;若容器处于运行状态,always和unless-stopped会被拉起,no和on-failure(本例中sleep正常退出,退出码为0)则不会;
- 当容器的重启策略是on-failure时,只有在异常退出(非0退出码)时才会重启,且最多重启指定的次数,正常退出则不会重启;

生产环境用的最多的就是: always,unless-stopped。默认策略为: no。
| 策略 | 行为 | 适用场景 |
| --- | --- | --- |
| no(默认) | 容器退出后不重启 | 临时任务,无需自动恢复 |
| always | 容器退出后总是重启(手动stop/kill后不会立即重启,但docker服务重启时会被拉起) | 必须保持运行的长期服务(如 Web 服务器) |
| unless-stopped | 除非手动停止,否则一直重启 | 需要持久化运行的服务,避免手动停止后意外重启 |
| on-failure[:max-retries] | 仅在非0退出码时重启,可指定最大重试次数 | 需要处理临时错误的任务 |
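对于已经在运行的容器,不需要删掉重建,可以用 docker update 在线修改重启策略(以上文的c1容器为例):

bash
# 把c1的重启策略在线改为unless-stopped
docker update --restart unless-stopped c1
# 确认修改结果
docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' c1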

5、docker的单机网络类型

bash
1.docker单机网络类型 - host: 使用宿主机网络。也就是宿主机ifconfig的ip就是容器里面的ip - none: 不分配网络,仅有一个lo本地回环网络。 - bridge: 桥接网络,为容器分配一个虚拟设备对,默认容器的网络类型就是bridge。 当容器通过 bridge 网络访问外网时,宿主机会自动进行 源地址转换 - Containerd: 和一个已经存在的容器公用网络名称空间。 - custom network 自定义网络,自定义网络内置了DNS组件,可以实现基于容器名称解析为容器IP地址。 2.实战案例 2.1 移除所有的容器 [root@elk91 ~]# docker container rm -f `docker container ps -qa` bab1b67d1f4f [root@elk91 ~]# 2.2 查看内置的网络 [root@elk91 ~]# docker network ls NETWORK ID NAME DRIVER SCOPE 2635a91ba0e7 bridge bridge local e41062447c84 host host local f5d81090b2a2 none null local [root@elk91 ~]# 2.3 使用host网络 [root@elk91 ~]# docker run -d --name c1-host --network host registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 44389ca5adbb7ac780e63aec8b146d0fdba423c5505710aca5e95f508c673654 [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 44389ca5adbb registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 2 seconds ago Up 2 seconds c1-host [root@elk91 ~]# [root@elk91 ~]# docker exec -it c1-host sh / # ifconfig docker0 Link encap:Ethernet HWaddr 02:42:3C:0A:40:7B inet addr:172.17.0.1 Bcast:172.17.255.255 Mask:255.255.0.0 inet6 addr: fe80::42:3cff:fe0a:407b/64 Scope:Link UP BROADCAST MULTICAST MTU:1500 Metric:1 RX packets:16 errors:0 dropped:0 overruns:0 frame:0 TX packets:99 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2306 (2.2 KiB) TX bytes:12172 (11.8 KiB) eth0 Link encap:Ethernet HWaddr 00:0C:29:E8:8B:7C inet addr:10.0.0.91 Bcast:10.0.0.255 Mask:255.255.255.0 inet6 addr: fe80::20c:29ff:fee8:8b7c/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:226993 errors:0 dropped:60 overruns:0 frame:0 TX packets:273407 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:140925456 (134.3 MiB) TX bytes:216354611 (206.3 MiB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:568315 errors:0 dropped:0 overruns:0 frame:0 TX packets:568315 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:53990765 (51.4 MiB) TX bytes:53990765 (51.4 MiB) / # 2.4 使用bridge网络 【默认就是bridge网络】 [root@elk91 ~]# docker run -d --name c2-bridge --network bridge registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 ed7f91d7a3d449ad991e3b324be4e98d126faa368b7c67c4cc17f5e8ee9e9493 [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES ed7f91d7a3d4 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 4 seconds ago Up 3 seconds 80/tcp c2-bridge [root@elk91 ~]# [root@elk91 ~]# docker exec -it c2-bridge sh / # ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:02 inet addr:172.17.0.2 Bcast:172.17.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:18 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2346 (2.2 KiB) TX bytes:0 (0.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # 2.5 使用none网络 [root@elk91 ~]# docker run -d --name c3-none --network none registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 37fb54d45983a81ddf66553905b09692d9129f5e8423c486f7bb89f0d4d4c21c [root@elk91 ~]# [root@elk91 ~]# 
docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 37fb54d45983 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 3 seconds ago Up 2 seconds c3-none [root@elk91 ~]# [root@elk91 ~]# docker exec -it c3-none sh / # ifconfig lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # 2.6 使用Containerd网络 [root@elk91 ~]# docker run -d --name c4-container --network container:c2-bridge registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 sleep 3650d f04add074f989b300a764fdaf33652b0a4662f2c1a4a76d7ee003ed1cfa0f02b # --network container:c2-bridge 新容器 c4-container 不会创建自己的网络栈,两个容器共享相同的网络配置 [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES f04add074f98 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 6 seconds ago Up 5 seconds c4-container [root@elk91 ~]# [root@elk91 ~]# docker exec -it c4-container sh / # ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:02 inet addr:172.17.0.2 Bcast:172.17.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:22 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2626 (2.5 KiB) TX bytes:0 (0.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # / # / # ps -ef PID USER TIME COMMAND 1 root 0:00 sleep 3650d 6 root 0:00 sh 13 root 0:00 ps -ef / # / # netstat -untalp Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN - tcp 0 0 :::80 :::* LISTEN - / # / # / # curl 127.0.0.1 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> / # - 自定义网络实战 1.删除所有的容器 [root@elk91 ~]# docker container rm -f `docker container ps -qa` f04add074f98 37fb54d45983 ed7f91d7a3d4 44389ca5adbb [root@elk91 ~]# 2.创建自定义网络 [root@elk91 ~]# docker network create -d bridge --subnet 172.20.0.0/16 --ip-range 172.20.100.0/24 --gateway 172.20.0.254 weixiang 61a49874bc5b49bcec8218861bff4421f8c4f0f4fb51ee0613761d9129fd3f52 [root@elk91 ~]# [root@elk91 ~]# docker network ls NETWORK ID NAME DRIVER SCOPE 2635a91ba0e7 bridge bridge local e41062447c84 host host local f5d81090b2a2 none null local 61a49874bc5b weixiang bridge local [root@elk91 ~]# [root@elk91 ~]# 相关参数说明: -d : 指定自定义网络的驱动,目前支持"bridge","ipvlan","macvlan","overlay",后三者都是跨主机网络。 --subnet: 指定网络的子网信息,网段和子网掩码。 --ip-range: 容器分配的网段。 --gateway: 指定网管地址。 3.查看网络的详细信息 [root@elk91 ~]# docker network inspect weixiang [ { "Name": "weixiang", "Id": "61a49874bc5b49bcec8218861bff4421f8c4f0f4fb51ee0613761d9129fd3f52", "Created": "2025-07-01T11:02:15.132831304+08:00", "Scope": "local", "Driver": "bridge", "EnableIPv6": false, "IPAM": { "Driver": "default", "Options": {}, "Config": [ { "Subnet": "172.20.0.0/16", "IPRange": "172.20.100.0/24", "Gateway": "172.20.0.254" } ] }, "Internal": false, 
"Attachable": false, "Ingress": false, "ConfigFrom": { "Network": "" }, "ConfigOnly": false, "Containers": {}, "Options": {}, "Labels": {} } ] [root@elk91 ~]# 4.创建容器并指定自定义网络 [root@elk91 ~]# docker run --name c1 -d --network weixiang registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 a6091c9f989abce87541c3f011364474ffc369f73e90adabf4232324bc8d7f6b [root@elk91 ~]# [root@elk91 ~]# docker run --name c2 -d --network weixiang registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 b186c82212de7182a3a8ca46108fe3fa776feb0a4dd14fe18c0a2f64c39a0bc8 [root@elk91 ~]# [root@elk91 ~]# [root@elk91 ~]# docker run --name c3 -d --network weixiang --ip 172.20.200.200 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 d7ba459077db815fee646a9607b9f80200cd8b0d84e514576b427c703b5d94d2 [root@elk91 ~]# [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES d7ba459077db registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 4 seconds ago Up 3 seconds 80/tcp c3 b186c82212de registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 49 seconds ago Up 47 seconds 80/tcp c2 a6091c9f989a registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 56 seconds ago Up 55 seconds 80/tcp c1 [root@elk91 ~]# [root@elk91 ~]# [root@elk91 ~]# docker exec -it c1 ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:14:64:00 inet addr:172.20.100.0 Bcast:172.20.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:39 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:5454 (5.3 KiB) TX bytes:0 (0.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) [root@elk91 ~]# [root@elk91 ~]# docker exec -it c2 ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:14:64:01 inet addr:172.20.100.1 Bcast:172.20.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:17 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2156 (2.1 KiB) TX bytes:0 (0.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) [root@elk91 ~]# [root@elk91 ~]# docker exec -it c3 ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:14:C8:C8 inet addr:172.20.200.200 Bcast:172.20.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:16 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2086 (2.0 KiB) TX bytes:0 (0.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) [root@elk91 ~]# 5.测试网络的连通性 [root@elk91 ~]# docker exec -it c1 sh / # ping baidu.com -c 3 PING baidu.com (182.61.244.181): 56 data bytes 64 bytes from 182.61.244.181: seq=0 ttl=127 time=22.525 ms 64 bytes from 182.61.244.181: seq=1 ttl=127 
time=21.943 ms 64 bytes from 182.61.244.181: seq=2 ttl=127 time=21.615 ms --- baidu.com ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 21.615/22.027/22.525 ms / # / # / # ping c2 -c 3 PING c2 (172.20.100.1): 56 data bytes 64 bytes from 172.20.100.1: seq=0 ttl=64 time=0.101 ms 64 bytes from 172.20.100.1: seq=1 ttl=64 time=0.073 ms 64 bytes from 172.20.100.1: seq=2 ttl=64 time=0.054 ms --- c2 ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 0.054/0.076/0.101 ms / # / # / # ping c3 -c 3 PING c3 (172.20.200.200): 56 data bytes 64 bytes from 172.20.200.200: seq=0 ttl=64 time=0.115 ms 64 bytes from 172.20.200.200: seq=1 ttl=64 time=0.067 ms 64 bytes from 172.20.200.200: seq=2 ttl=64 time=0.063 ms --- c3 ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 0.063/0.081/0.115 ms / #
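自定义网络"内置DNS"这一点可以很直观地验证: 在同一个自定义网络里再起一个临时容器,直接用容器名互相访问。下面示例基于上文已创建的 weixiang 网络、c1 容器以及前面导入的 alpine 镜像:

bash
# 查看weixiang网络里已有的容器及其IP
docker network inspect -f '{{range .Containers}}{{.Name}} {{.IPv4Address}}{{"\n"}}{{end}}' weixiang
# 起一个临时容器,直接按容器名ping,由自定义网络内置的DNS解析为容器IP
docker run --rm --network weixiang alpine:latest ping -c 2 c1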

6、基于docker部署ES8

1、单点部署
bash
参考链接: https://www.elastic.co/guide/en/elasticsearch/reference/8.18/docker.html 1.创建自定义网络 [root@elk91 ~]# docker network create elastic 62c2988843eebae50481dcd8a33b88109289ac6a86022a9ba3f02376da35e7e6 [root@elk91 ~]# [root@elk91 ~]# docker network ls NETWORK ID NAME DRIVER SCOPE 2635a91ba0e7 bridge bridge local 62c2988843ee elastic bridge local e41062447c84 host host local f5d81090b2a2 none null local 61a49874bc5b weixiang bridge local [root@elk91 ~]# 2.拉取镜像 docker pull docker.elastic.co/elasticsearch/elasticsearch:8.18.3 SVIP: wget http://192.168.21.253/Resources/Docker/images/ElasticStack/8.18.3/weixiang-elasticsearch-v8.18.3.tar.gz docker load -i weixiang-elasticsearch-v8.18.3.tar.gz 3.运行ES [root@elk91 ~]# docker run --name es01 -d --net elastic -p 19200:9200 -it -m 2GB docker.elastic.co/elasticsearch/elasticsearch:8.18.3 4.重置ES的密码 [root@elk91 ~]# docker exec -it es01 /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic WARNING: Owner of file [/usr/share/elasticsearch/config/users] used to be [root], but now is [elasticsearch] WARNING: Owner of file [/usr/share/elasticsearch/config/users_roles] used to be [root], but now is [elasticsearch] This tool will reset the password of the [elastic] user to an autogenerated value. The password will be printed in the console. Please confirm that you would like to continue [y/N]y +SzNgpoVnnYcOqjyjRLr [root@elk91 ~]# 5.拷贝证书到当前节点 [root@elk91 ~]# docker cp es01:/usr/share/elasticsearch/config/certs/http_ca.crt . [root@elk91 ~]# [root@elk91 ~]# ll http_ca.crt -rw-rw---- 1 root root 1935 Jul 1 11:45 http_ca.crt [root@elk91 ~]# [root@elk91 ~]# 6.测试验证单点是否正常 [root@elk91 ~]# curl --cacert http_ca.crt -u elastic:+SzNgpoVnnYcOqjyjRLr https://localhost:19200 { "name" : "4a080ccaa240", "cluster_name" : "docker-cluster", "cluster_uuid" : "cSErCUHISh6rg-dmeXrMaA", "version" : { "number" : "8.18.3", "build_flavor" : "default", "build_type" : "docker", "build_hash" : "28fc77664903e7de48ba5632e5d8bfeb5e3ed39c", "build_date" : "2025-06-18T22:08:41.171261054Z", "build_snapshot" : false, "lucene_version" : "9.12.1", "minimum_wire_compatibility_version" : "7.17.0", "minimum_index_compatibility_version" : "7.0.0" }, "tagline" : "You Know, for Search" } [root@elk91 ~]# [root@elk91 ~]# curl --cacert http_ca.crt -u elastic:+SzNgpoVnnYcOqjyjRLr https://localhost:19200/_cat/nodes 172.18.0.2 62 90 12 0.23 0.31 0.19 cdfhilmrstw * 4a080ccaa240 [root@elk91 ~]# [root@elk91 ~]# curl -k -u elastic:jvNZG7yvjh1rY_laJX*+ https://localhost:19200/_cat/nodes 172.18.0.2 62 90 1 0.19 0.30 0.19 cdfhilmrstw * 4a080ccaa240 [root@elk91 ~]#
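上面的单点ES没有做数据持久化,容器删除后索引数据会丢失。如果需要保留数据,可以参考下面的写法,把官方镜像的数据目录 /usr/share/elasticsearch/data 挂到一个命名存储卷上(卷名 es01-data 为示例假设;若es01容器已存在,需要先删除或换名):

bash
docker volume create es01-data
docker run --name es01 -d --net elastic -p 19200:9200 -it -m 2GB \
  -v es01-data:/usr/share/elasticsearch/data \
  docker.elastic.co/elasticsearch/elasticsearch:8.18.3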
2、基于docker部署ES8集群
bash
1.创建节点加入的token【该token的有效期为30m】 [root@elk91 ~]# docker exec -it es01 /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTcyLjE4LjAuMjo5MjAwIl0sImZnciI6IjNlODgxODE3YmNmYWVjNjI5ZDNiNWVhYjgzYmE0ZjRiYjQ1M2Q3ZmUzZjgwNWNmOTcyNzE1NTExNDFlNjJmYjQiLCJrZXkiOiJLbWxLeEpjQko1Nmh1a3pYR3VvWjp5WjZnRG5aZGU4SWMyM25SYXBJLWV3In0= [root@elk91 ~]# 2.新节点加入并指定token [root@elk91 ~]# docker run -e ENROLLMENT_TOKEN="eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTcyLjE4LjAuMjo5MjAwIl0sImZnciI6IjNlODgxODE3YmNmYWVjNjI5ZDNiNWVhYjgzYmE0ZjRiYjQ1M2Q3ZmUzZjgwNWNmOTcyNzE1NTExNDFlNjJmYjQiLCJrZXkiOiJLbWxLeEpjQko1Nmh1a3pYR3VvWjp5WjZnRG5aZGU4SWMyM25SYXBJLWV3In0=" --name es02 -d --net elastic -it -m 1GB docker.elastic.co/elasticsearch/elasticsearch:8.18.3 [root@elk91 ~]# docker run -e ENROLLMENT_TOKEN="eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTcyLjE4LjAuMjo5MjAwIl0sImZnciI6IjNlODgxODE3YmNmYWVjNjI5ZDNiNWVhYjgzYmE0ZjRiYjQ1M2Q3ZmUzZjgwNWNmOTcyNzE1NTExNDFlNjJmYjQiLCJrZXkiOiJLbWxLeEpjQko1Nmh1a3pYR3VvWjp5WjZnRG5aZGU4SWMyM25SYXBJLWV3In0=" --name es03 -d --net elastic -it -m 1GB docker.elastic.co/elasticsearch/elasticsearch:8.18.3 3.验证集群是否正常 [root@elk91 ~]# curl --cacert http_ca.crt -u elastic:+SzNgpoVnnYcOqjyjRLr https://localhost:19200/_cat/nodes 172.18.0.3 59 71 24 1.11 0.55 0.31 cdfhilmrstw - a8fd0587a255 172.18.0.2 53 76 24 1.11 0.55 0.31 cdfhilmrstw * 4a080ccaa240 172.18.0.4 45 99 58 1.11 0.55 0.31 cdfhilmrstw - 4bd05528d38f [root@elk91 ~]# [root@elk91 ~]# curl -k -u elastic:+SzNgpoVnnYcOqjyjRLr https://localhost:19200/_cat/nodes 172.18.0.3 60 70 3 1.02 0.54 0.31 cdfhilmrstw - a8fd0587a255 172.18.0.2 54 76 3 1.02 0.54 0.31 cdfhilmrstw * 4a080ccaa240 172.18.0.4 46 98 3 1.02 0.54 0.31 cdfhilmrstw - 4bd05528d38f [root@elk91 ~]#
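除了 _cat/nodes,也可以用 _cluster/health 接口快速确认集群健康状态(密码换成自己重置后的elastic密码即可):

bash
curl --cacert http_ca.crt -u elastic:+SzNgpoVnnYcOqjyjRLr "https://localhost:19200/_cluster/health?pretty"
# 重点关注status字段: green表示主/副分片都已分配,yellow表示有副本分片未分配,red表示有主分片不可用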
3、部署kibana
bash
1.拉取镜像 [root@elk91 ~]# docker pull docker.elastic.co/kibana/kibana:8.18.3 SVIP: wget http://192.168.21.253/Resources/Docker/images/ElasticStack/8.18.3/weixiang-kibana-v8.18.3.tar.gz docker load < weixiang-kibana-v8.18.3.tar.gz 2.启动kibana [root@elk91 ~]# docker run --name kib01 --net elastic -d -p 15601:5601 docker.elastic.co/kibana/kibana:8.18.3 [root@elk91 ~]# docker logs -f kib01 3.访问kibana的WebUI http://10.0.0.91:15601/ 4.es生成kibana组件的token [root@elk91 ~]# docker exec -it es01 /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana eyJ2ZXIiOiI4LjE0LjAiLCJhZHIiOlsiMTcyLjE4LjAuMjo5MjAwIl0sImZnciI6IjI5Mjk4ODdlOWYxNmIwYTcxNTE2NTk3MWE3OWI0MzJmZWQyOThkOWFjZTA5ZTg2NDgyMjk1MzFkNjkzMDA2MGUiLCJrZXkiOiI1SmNpeEpjQlFzM1lXZ0hteFhuczo4Wi12VFlwSnA3VnpTbnF0VThpRXRBIn0= [root@elk91 ~]# 5.将token拷贝到webUI 略,见视频。 6.生成校验码 [root@elk91 ~]# docker exec kib01 bin/kibana-verification-code Kibana is currently running with legacy OpenSSL providers enabled! For details and instructions on how to disable see https://www.elastic.co/guide/en/kibana/8.18/production.html#openssl-legacy-provider Your verification code is: 214 112 [root@elk91 ~]# 7.修改kibana的配置文件并重启容器 [root@elk91 ~]# docker exec -it kib01 bash kibana@f49381b1e6c6:~$ kibana@f49381b1e6c6:~$ echo i18n.locale: "zh-CN" >> config/kibana.yml kibana@f49381b1e6c6:~$ [root@elk91 ~]# docker restart kib01 kib01 [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES f49381b1e6c6 docker.elastic.co/kibana/kibana:8.18.3 "/bin/tini -- /usr/l…" 7 minutes ago Up 4 seconds 0.0.0.0:15601->5601/tcp, :::15601->5601/tcp kib01 [root@elk91 ~]#

image

7、自定义网络案例之部署zabbix系统

bash
- 自定义网络案例之部署zabbix系统 1.清理容器 [root@elk91 ~]# docker container rm -f `docker container ps -qa` f49381b1e6c6 4bd05528d38f a8fd0587a255 4a080ccaa240 2d6d42e5b377 2.移除所有未使用的网络 [root@elk91 ~]# docker network ls NETWORK ID NAME DRIVER SCOPE 2635a91ba0e7 bridge bridge local 62c2988843ee elastic bridge local e41062447c84 host host local f5d81090b2a2 none null local 61a49874bc5b weixiang bridge local [root@elk91 ~]# [root@elk91 ~]# docker network prune -f Deleted Networks: weixiang elastic [root@elk91 ~]# [root@elk91 ~]# docker network ls NETWORK ID NAME DRIVER SCOPE 2635a91ba0e7 bridge bridge local e41062447c84 host host local f5d81090b2a2 none null local [root@elk91 ~]# 3.创建自定义网络 [root@elk91 ~]# docker network create --subnet 172.20.0.0/16 --ip-range 172.20.240.0/20 zabbix-net 8694528252c0036738f94a0617b441d02a415626ef791c047ee4e34dd4620d78 [root@elk91 ~]# [root@elk91 ~]# docker network ls NETWORK ID NAME DRIVER SCOPE 2635a91ba0e7 bridge bridge local e41062447c84 host host local f5d81090b2a2 none null local 8694528252c0 zabbix-net bridge local [root@elk91 ~]# 手动导入镜像: [root@elk91 ~]# wget http://192.168.21.253/Resources/Docker/images/Zabbix/7.2/weixiang-zabbix-java-gateway-alpine-7.2-latest.tar.gz [root@elk91 ~]# docker load -i weixiang-zabbix-java-gateway-alpine-7.2-latest.tar.gz 手动导入镜像: [root@elk91 ~]# wget http://192.168.21.253/Resources/Docker/images/Zabbix/7.2/weixiang-zabbix-server-mysql-alpine-7.2-latest.tar.gz [root@elk91 ~]# docker load -i weixiang-zabbix-server-mysql-alpine-7.2-latest.tar.gz 手动导入镜像: [root@elk91 ~]# wget http://192.168.21.253/Resources/Docker/images/Zabbix/7.2/weixiang-zabbix-web-nginx-mysql-alpine-7.2-latest.tar.gz [root@elk91 ~]# docker load -i weixiang-zabbix-web-nginx-mysql-alpine-7.2-latest.tar.gz 4.部署MySQL服务 [root@elk91 ~]# docker run --name mysql-server -t \ -e MYSQL_DATABASE="zabbix" \ -e MYSQL_USER="zabbix" \ -e MYSQL_PASSWORD="zabbix_pwd" \ -e MYSQL_ROOT_PASSWORD="root_pwd" \ --network=zabbix-net \ --restart unless-stopped \ -d mysql:8.0.36-oracle \ --character-set-server=utf8 --collation-server=utf8_bin \ --default-authentication-plugin=mysql_native_password 5.部署Zabbix Java gateway [root@elk91 ~]# docker run --name zabbix-java-gateway -t \ --network=zabbix-net \ --restart unless-stopped \ -d zabbix/zabbix-java-gateway:alpine-7.2-latest 6.部署 Zabbix server [root@elk91 ~]# docker run --name zabbix-server-mysql -t \ -e DB_SERVER_HOST="mysql-server" \ -e MYSQL_DATABASE="zabbix" \ -e MYSQL_USER="zabbix" \ -e MYSQL_PASSWORD="zabbix_pwd" \ -e MYSQL_ROOT_PASSWORD="root_pwd" \ -e ZBX_JAVAGATEWAY="zabbix-java-gateway" \ --network=zabbix-net \ -p 10051:10051 \ --restart unless-stopped \ -d zabbix/zabbix-server-mysql:alpine-7.2-latest 7.部署 Zabbix webUI docker run --name zabbix-web-nginx-mysql -t \ -e ZBX_SERVER_HOST="zabbix-server-mysql" \ -e DB_SERVER_HOST="mysql-server" \ -e MYSQL_DATABASE="zabbix" \ -e MYSQL_USER="zabbix" \ -e MYSQL_PASSWORD="zabbix_pwd" \ -e MYSQL_ROOT_PASSWORD="root_pwd" \ --network=zabbix-net \ -p 80:8080 \ --restart unless-stopped \ -d zabbix/zabbix-web-nginx-mysql:alpine-7.2-latest 8.访问zabbix webUI http://10.0.0.91/ 用户名: Admin 密 码: zabbix
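这套多容器环境如果WebUI打不开,可以按下面的思路快速定位(容器名均沿用上文创建的名字,输出以实际环境为准):

bash
# 1.确认几个容器都在zabbix-net网络里并且处于Up状态
docker ps --filter network=zabbix-net
docker network inspect -f '{{range .Containers}}{{.Name}} {{.IPv4Address}}{{"\n"}}{{end}}' zabbix-net
# 2.查看server和web容器日志,常见问题是MySQL还没初始化完成导致连接失败
docker logs --tail 30 zabbix-server-mysql
docker logs --tail 30 zabbix-web-nginx-mysql
# 3.验证MySQL账号可用
docker exec -it mysql-server mysql -uzabbix -pzabbix_pwd -e 'show databases;'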

8、虚拟机磁盘不足解决方案

bash
- docker虚拟机磁盘不足解决方案

# 1.基于LVM卷扩容
lvextend /dev/ubuntu-vg/ubuntu-lv -l +100%FREE
resize2fs /dev/mapper/ubuntu--vg-ubuntu--lv     # 适用于ext4文件系统
xfs_growfs /dev/mapper/ubuntu--vg-ubuntu--lv    # 适用于xfs文件系统

# 1.1 扩容之前可用空间49g
[root@elk91 ~]# pvs
  PV         VG        Fmt  Attr PSize   PFree
  /dev/sda3  ubuntu-vg lvm2 a--  <98.00g 49.00g

# 1.2 lvextend把剩余空间分给LV之后,再把文件系统扩容到根分区(ext4使用resize2fs)
[root@elk01 ~]# resize2fs /dev/mapper/ubuntu--vg-ubuntu--lv
resize2fs 1.46.5 (30-Dec-2021)
Filesystem at /dev/mapper/ubuntu--vg-ubuntu--lv is mounted on /; on-line resizing required
old_desc_blocks = 7, new_desc_blocks = 13
The filesystem on /dev/mapper/ubuntu--vg-ubuntu--lv is now 25689088 (4k) blocks long.

# 1.3 查看vg卷信息,可用空间已经没了
[root@elk01 ~]# vgs
  VG        #PV #LV #SN Attr   VSize   VFree
  ubuntu-vg   1   1   0 wz--n- <98.00g    0

# 2.创建磁盘配置挂载点(新增数据盘并迁移docker数据,操作示例见下方)
1、准备一个数据盘;
2、格式化并配置开机自动挂载;
3、把原有的docker数据('/var/lib/docker/')拷贝到挂载点。
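方案2的大致操作可以参考下面的草稿(假设新数据盘是 /dev/sdb、挂载点是 /data/docker,实际盘符以 lsblk 为准;操作前务必停止docker并做好备份):

bash
systemctl stop docker
# 格式化新盘并配置开机自动挂载
mkfs.ext4 /dev/sdb
mkdir -p /data/docker
echo '/dev/sdb /data/docker ext4 defaults 0 0' >> /etc/fstab
mount -a
# 把原有docker数据完整拷贝过去(保留权限等属性)
cp -a /var/lib/docker/. /data/docker/
# 修改daemon.json,把数据目录指向新挂载点(若文件已有内容,手动增加"data-root"配置项,不要直接覆盖)
vim /etc/docker/daemon.json    # 增加: "data-root": "/data/docker"
systemctl start docker
docker info | grep "Docker Root Dir"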

9、docker底层特性

1、内核转发参数
bash
核心作用: 让Linux内核充当路由器。

什么是IP转发?
默认情况下,Linux主机只会处理目标地址是本机IP的数据包(接收或响应)。
开启IP转发后,内核会根据路由表,将目标地址不是本机IP的数据包从一个网络接口转发到另一个网络接口,这是路由器最基本的功能。

为什么Docker需要它?
容器有自己的IP地址空间: Docker容器运行在虚拟网络中(如默认的 172.17.0.0/16 bridge 网络),这些IP地址在宿主机的物理网络(如 10.0.0.0/24)中是不可路由的。
跨网络通信:
  - 容器访问外网(互联网): 容器的请求包目标IP是外网地址,源IP是容器IP。宿主机物理网卡收到回包时,目标IP是容器IP(在物理网络不可见),需要内核转发给正确的容器。
  - 不同宿主机上的容器通信(Overlay网络): 数据包需要穿越宿主机的物理网络,内核需要转发封装/解封装后的Overlay网络包。
  - 容器访问宿主机其他网络接口: 目标IP是宿主机的另一个物理IP(如 192.168.1.100),源IP是容器IP。

如何启用?
临时启用:
sysctl -w net.ipv4.ip_forward=1
永久启用(修改配置文件):
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
sysctl -p    # 使配置生效
Docker在启动时通常会检查并尝试启用此设置。

Docker中的工作场景示例(容器访问外网):
1.容器 172.17.0.2 发起请求访问 8.8.8.8。
2.请求包离开容器,通过 veth pair 到达宿主机的 docker0 网桥。
3.宿主机的内核路由表判断目标 8.8.8.8 需要通过物理网卡 eth0 发送出去。
4.IP转发开启: 内核允许这个源IP为容器IP(172.17.0.2)、目标IP为外部IP(8.8.8.8)的包从 eth0 发出。
5.回包目标IP是 172.17.0.2,到达宿主机的 eth0。
6.IP转发开启: 内核允许这个目标IP是容器IP的包进入,并根据路由表和网络配置(需要 iptables NAT 配合)将其转发回 docker0 网桥,最终到达容器。

关键点: IP转发解决的是"允许非本机目标IP的数据包流经宿主机网络栈"的问题,是实现容器网络连通性的基础前提。但它不改变数据包的源/目标IP地址,这就轮到 iptables 上场了。
2、iptables
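docker在IP转发的基础上,还依赖iptables的NAT规则: 出方向靠POSTROUTING链里的MASQUERADE(SNAT)把容器IP伪装成宿主机IP;-p端口映射则靠DOCKER链里的DNAT把宿主机端口转发到容器。可以用下面的命令直接查看这些规则(规则内容以实际环境为准,这里只是查看方式的示例):

bash
# 查看SNAT规则,默认bridge网段(172.17.0.0/16)会有一条MASQUERADE
iptables -t nat -nL POSTROUTING | grep -i masquerade
# 查看端口映射产生的DNAT规则
iptables -t nat -nL DOCKER
# 查看FORWARD链,docker也会插入放行容器流量的规则
iptables -nL FORWARD | head -n 20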
3、overlayFS概述
bash
docker使用Linux底层特性之overlayFS # 1、overlayFS概述 OverlayFS是一种堆叠文件系统,它依赖并建立在其它的文件系统之上(例如ext4fs和xfs等),并不直接参与磁盘空间结构的划分,仅仅将原来系 统文件中的文件或者目录进行"合并一起",最后向用户展示"合并"的文件是在同一级的目录, 这就是联合挂载技术, 相对于AUFS (<1.12早期 使用的存储技术), OverlayFS速度更快,实现更简单。 Linux内核为Docker提供的OverlayFS驱动有两种:Overlay和Overlay2。而Overlay2是相对于Overlay的一种改进,在Inode利用率方面比 Overlay更有效。 但是Overlay有环境需求: (1)Docker版本17.06.02+; (2)宿主机文件系统需要是EXT4或XFS格式; OverlayFS实现方式: OverlayFS通过三个目录:lower目录、upper目录、以及work目录实现。 lower: 一般对应的是只读数据。 upper: 可以进行读写操作的目录。 work: 目录为工作基础目录,挂载后会自动创建一个work子目录(实际测试手动卸载后该目录并不会被删除) 该目录主要是存储一些临时存放的结果或中间数据的工作目录。 值得注意的是,在使用过程中其内容用户不可见,最后联合挂载完成给用户呈现的统一视图称为merged目录。 OverlayFS结构分为三个层: LowerDir、Upperdir、MergedDir # LowerDir (只读) 只读的image layer,其实就是rootfs。 在使用Dockfile构建镜像的时候, Image Layer可以分很多层,所以对应的lowerdir会很多(源镜像)。 Lower 包括两个层: (1)系统的init 1)容器在启动以后, 默认情况下lower层是不能够修改内容的, 但是用户有需求需要修改主机名与域名地址, 那么就需要添加init层中的文件(hostname, resolv.conf,hosts,mtab等文件), 用于解决此类问题; 2)修改的内容只对当前的容器生效, 而在docker commit提交为镜像时候,并不会将init层提交。 3)init文件存放的目录为/var/lib/docker/overlay2/<init_id>/diff (2)容器的镜像层 不可修改的数据。 # Upperdir (读写) upperdir则是在lowerdir之上的一层, 为读写层。容器在启动的时候会创建, 所有对容器的修改, 都是在这层。比如容器启动写入的日志文件,或者是应用程序写入的临时文件。 # MergedDir (展示) merged目录是容器的挂载点,在用户视角能够看到的所有文件,都是从这层展示的。 2 overlayFS参考案例 2.1 创建工作目录 [root@elk92 ~]# mkdir -pv /weixiang2025/lower{0..2} /weixiang2025/{uppper,work,merged} mkdir: created directory '/weixiang2025' mkdir: created directory '/weixiang2025/lower0' mkdir: created directory '/weixiang2025/lower1' mkdir: created directory '/weixiang2025/lower2' mkdir: created directory '/weixiang2025/uppper' mkdir: created directory '/weixiang2025/work' mkdir: created directory '/weixiang2025/merged' [root@elk92 ~]# 2.2 挂载文件系统 [root@elk92 ~]# mount -t overlay overlay -o lowerdir=/weixiang2025/lower0:/weixiang2025/lower1:/weixiang 2025/lower2,upperdir=/weixiang2025/uppper,workdir=/weixiang2025/work /weixiang2025/merged/ # -t overlay:指定文件系统类型为 overlay(联合文件系统) # -o:挂载选项 # lowerdir:只读目录 # upperdir:读写目录 # workdir:工作目录 # /weixiang2025/merged/:最终合并后的视图目录(用户实际访问的路径) 2.3 查看挂载信息 [root@elk92 ~]# df -h | grep weixiang overlay 48G 11G 35G 24% /weixiang2025/merged [root@elk92 ~]# [root@db01 ~]#mount |grep merged overlay on /weixiang2025/merged type overlay (rw,relatime,lowerdir=/weixiang2025/lower0:/weixiang2025/ lower1:/weixiang2025/lower2,upperdir=/weixiang2025/uppper,workdir=/weixiang2025/work) 2.4 尝试在lower层写入准备初始数据 ll /weixiang2025/ -R cp /etc/hosts /weixiang2025/lower0/ cp /etc/issue /weixiang2025/lower1/ cp /etc/resolv.conf /weixiang2025/lower2/ ll /weixiang2025/ -R # 如下图所示,文件复制到只读层,但是merged可以看到只读层的数据,用户只能看到merged展示层的数据

image

bash
2.5 尝试在merged目录写入数据,观察数据实际写入的应该是upper层
cp /etc/fstab /weixiang2025/merged/
ll /weixiang2025/ -R

2.6 尝试直接在upper目录(本例为uppper)写入数据,merged层同样可以看到
cp /etc/fstab /weixiang2025/uppper/
ll /weixiang2025/ -R

# 如命令所示,写入到merged层的数据实际落在upper层;直接写入upper层的数据也能在merged层看到

image

bash
2.7 重新挂载,但不挂载upperdir层 umount /weixiang2025/merged mount -t overlay overlay -o lowerdir=/weixiang2025/lower0:/weixiang2025/lower1:/weixiang2025/lower2,work dir=/weixiang2025/work /weixiang2025/merged/ # -t overlay:指定文件系统类型为 overlay(联合文件系统) # -o:挂载选项 # lowerdir:只读目录 # workdir:工作目录 # /weixiang2025/merged/:最终合并后的视图目录(用户实际访问的路径) 2.8 再次尝试写入数据失败,因为没有读写层。 [root@db01 ~]#cp /etc/os-release /weixiang2025/merged/ [root@db01 ~]#cp: cannot create regular file '/weixiang2025/merged/os-release': Read-only file system 2.9 验证docker底层用到了overlay FS文件系统 [root@elk91 ~]# df | grep docker overlay 100820384 23866276 72313928 25% /var/lib/docker/overlay2/96405f04635ebbe0e42547edf21d05ae4241e0f0c70a9e72fb69b442606c9935/merged overlay 100820384 23866276 72313928 25% /var/lib/docker/overlay2/e5fc4c25f17922f516bc403c838af5f337087adcc21c6e4b633c9ac2425d8c34/merged overlay 100820384 23866276 72313928 25% /var/lib/docker/overlay2/6d0fa7d337ca3ff48bcd384678f8dff6c83cf3122dec6c4d69252f0836bd3c20/merged overlay 100820384 23866276 72313928 25% /var/lib/docker/overlay2/bdfea0ad779519994a232dff0cf60926ab1f9b557263b3757efcd8082b635294/merged [root@elk91 ~]# # docker inspect+容器 [root@elk91 ~]# docker inspect mysql-server | grep "Dir" "LowerDir": "/var/lib/docker/overlay2/96405f04635ebbe0e42547edf21d05ae4241e0f0c70a9e72fb69b442606c9935-init/diff:/var/lib/docker/overlay2/90c56df74d925473c86e926ccdeb62b35b1de78cf670ddc8a8d605758f8eb8e5/diff:/var/lib/docker/overlay2/7aa1323f766957477af66065ed1bca4669903017f10c3606ff6849a4db802d86/diff:/var/lib/docker/overlay2/3f8a7a2502ca042400492bacce26d950b69a60156d882cd430d16783232a12dc/diff:/var/lib/docker/overlay2/c096cdb2d64456d432863858286e6cb53af032ed76411934075aa119006c53e4/diff:/var/lib/docker/overlay2/b7d95e1aec2956be158f7629d0470549db57efe6da78413311f7ad034462b07f/diff:/var/lib/docker/overlay2/a1f6294eb0223b17fc21a7cb42b8c18de112e3d0f7fe2fa47822f4aff1a0a735/diff:/var/lib/docker/overlay2/159090b2ba27d544dd7eae6fe980c5326e730572d9275a8f05a12f09c8a55145/diff:/var/lib/docker/overlay2/f6723b4421ded006eedde8941a99b32143ba57654bad7f5dcf77c7dba8af793c/diff:/var/lib/docker/overlay2/93dbd346b04bc87b719873d8c6d9b677099e53246d134f97e63edbd750e0fc2c/diff:/var/lib/docker/overlay2/4f7446615ded070e4077266408230c9b75b8bc9e785c4ff345bf61001a589339/diff:/var/lib/docker/overlay2/30117734fcb31b5f7344bdc10bfcf0b08b4ebc47879b17dbd15df1532219f40d/diff", "MergedDir": "/var/lib/docker/overlay2/96405f04635ebbe0e42547edf21d05ae4241e0f0c70a9e72fb69b442606c9935/merged", "UpperDir": "/var/lib/docker/overlay2/96405f04635ebbe0e42547edf21d05ae4241e0f0c70a9e72fb69b442606c9935/diff", "WorkDir": "/var/lib/docker/overlay2/96405f04635ebbe0e42547edf21d05ae4241e0f0c70a9e72fb69b442606c9935/work" "WorkingDir": "", [root@elk91 ~]# -
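顺着上面的实验再补充一个现象: 在merged层删除lower层(只读)的文件时,OverlayFS会在upper层生成一个同名的whiteout文件(主次设备号为0,0的字符设备)来"遮住"lower层的文件。下面是基于前面已创建的目录结构的验证草稿(目录名沿用上文的uppper):

bash
# 重新带upper层挂载
umount /weixiang2025/merged
mount -t overlay overlay -o lowerdir=/weixiang2025/lower0:/weixiang2025/lower1:/weixiang2025/lower2,upperdir=/weixiang2025/uppper,workdir=/weixiang2025/work /weixiang2025/merged/
# hosts文件位于lower0(只读层),从merged层删除它
rm -f /weixiang2025/merged/hosts
# merged层看不到hosts了,但lower0里的原文件还在,upper层多了一个c类型(字符设备0,0)的whiteout文件
ls -l /weixiang2025/merged/ /weixiang2025/lower0/ /weixiang2025/uppper/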
4、chroot
bash
Q1: 基于同一个镜像启动的两个容器,各自修改根目录结构为啥互不影响??? chroot是一个轻量级的文件系统隔离工具,适合简单场景。 1.准备bash程序及依赖文件 [root@elk92 ~]# mkdir xixi [root@elk92 ~]# mkdir xixi/bin [root@elk91 ~]# chroot xixi # 更改当前运行进程及其子进程的根目录,因为没有文件及依赖,导致报错 chroot: failed to run command ‘/bin/bash’: No such file or directory [root@elk92 ~]# cp /bin/bash xixi/bin [root@elk92 ~]# ldd /bin/bash # 查看/bin/bash依赖 linux-vdso.so.1 (0x00007ffd69cba000) libtinfo.so.6 => /lib/x86_64-linux-gnu/libtinfo.so.6 (0x00007fa2a07a4000) libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa2a057b000) /lib64/ld-linux-x86-64.so.2 (0x00007fa2a0945000) [root@elk92 ~]# [root@elk92 ~]# mkdir xixi/{lib,lib64} [root@elk92 ~]# mkdir xixi/lib/x86_64-linux-gnu [root@elk92 ~]# cp /lib/x86_64-linux-gnu/{libtinfo.so.6,libc.so.6} xixi/lib/ [root@elk92 ~]# cp /lib64/ld-linux-x86-64.so.2 xixi/lib64/ [root@elk92 ~]# 2.准备ls程序及依赖文件 [root@elk92 ~]# ldd /usr/bin/ls linux-vdso.so.1 (0x00007ffe18345000) libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007ff98f8b0000) libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff98f687000) libpcre2-8.so.0 => /lib/x86_64-linux-gnu/libpcre2-8.so.0 (0x00007ff98f5f0000) /lib64/ld-linux-x86-64.so.2 (0x00007ff98f90e000) [root@elk92 ~]# [root@elk92 ~]# cp /usr/bin/ls xixi/bin/ [root@elk92 ~]# cp /lib/x86_64-linux-gnu/{libselinux.so.1,libpcre2-8.so.0} xixi/lib/x86_64-linux-gnu/ [root@elk92 ~]# 3.测试验证 [root@elk92 ~]# cp -r xixi haha # 请确保haha目录不存在!!! [root@elk92 ~]# chroot xixi # 此处改变的根目录是'xixi' bash-5.1# echo xixi > /xixi.log bash-5.1# exit [root@elk92 ~]# [root@elk92 ~]# chroot haha # 改变根目录'haha' bash-5.1# echo haha > /haha.log bash-5.1# exit [root@elk92 ~]# [root@elk92 ~]# chroot haha ls -l / total 16 drwxr-xr-x 2 0 0 4096 Jul 1 08:15 bin -rw-r--r-- 1 0 0 5 Jul 1 08:16 haha.log drwxr-xr-x 3 0 0 4096 Jul 1 08:15 lib drwxr-xr-x 2 0 0 4096 Jul 1 08:15 lib64 [root@elk92 ~]# [root@elk92 ~]# chroot xixi ls -l / total 16 drwxr-xr-x 2 0 0 4096 Jul 1 08:10 bin drwxr-xr-x 3 0 0 4096 Jul 1 08:07 lib drwxr-xr-x 2 0 0 4096 Jul 1 08:08 lib64 -rw-r--r-- 1 0 0 5 Jul 1 08:14 xixi.log [root@elk92 ~]# 4.测试docker验证 [root@elk91 ~]# docker container rm -f `docker container ps -qa` d7b9844a1fb6 742f0c1bfef8 be02bc1ecb79 e839e6a38cd0 8ee508e018c6 c34eab89b17a [root@elk91 ~]# [root@elk91 ~]# [root@elk91 ~]# docker run --name c1 -d registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 d32a067cd5201f9479efa31cb6423820cebb3c891159565751de61215db3dbeb [root@elk91 ~]# [root@elk91 ~]# docker run --name c2 -d registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 7ab5484020043ce99eacdb35e87994cf1767a2b4b9271be7a79dc1409136c6cf [root@elk91 ~]# [root@elk91 ~]# docker exec -it c1 sh / # echo xixi > /xixi.log # 新增xixi.log / # [root@elk91 ~]# [root@elk91 ~]# docker exec -it c2 sh / # echo haha > /haha.log # 新增haha.log / # [root@elk91 ~]# [root@elk91 ~]# docker exec -it c1 ls / bin media srv dev mnt sys docker-entrypoint.d opt tmp docker-entrypoint.sh proc usr etc root var home run xixi.log lib sbin [root@elk91 ~]# [root@elk91 ~]# docker exec -it c2 ls / bin lib sbin dev media srv docker-entrypoint.d mnt sys docker-entrypoint.sh opt tmp etc proc usr haha.log root var home run [root@elk91 ~]# [root@elk91 ~]# docker inspect -f "{{.GraphDriver.Data.MergedDir}}" c1 # 取外部展示目录 /var/lib/docker/overlay2/bc8776713e012bba4469c20cf570000e5db3c93782c0c374cc09cb1dd77eadd0/merged [root@elk91 ~]# [root@elk91 ~]# docker inspect -f "{{.GraphDriver.Data.MergedDir}}" c2 # 取外部展示目录 /var/lib/docker/overlay2/20a3e63e25bd70f06a1a6f7f81b6099f7d202b899504bfa9e73f9ca317b5f97a/merged 
[root@elk91 ~]# [root@elk91 ~]# chroot /var/lib/docker/overlay2/bc8776713e012bba4469c20cf570000e5db3c93782c0c374cc09cb1dd77eadd0/merged ls -l / total 88 drwxr-xr-x 2 root root 4096 Nov 12 2021 bin drwxr-xr-x 1 root root 4096 Jul 1 08:18 dev drwxr-xr-x 1 root root 4096 Nov 13 2021 docker-entrypoint.d -rwxrwxr-x 1 root root 1202 Nov 13 2021 docker-entrypoint.sh drwxr-xr-x 1 root root 4096 Jul 1 08:18 etc drwxr-xr-x 2 root root 4096 Nov 12 2021 home drwxr-xr-x 1 root root 4096 Nov 12 2021 lib drwxr-xr-x 5 root root 4096 Nov 12 2021 media drwxr-xr-x 2 root root 4096 Nov 12 2021 mnt drwxr-xr-x 2 root root 4096 Nov 12 2021 opt dr-xr-xr-x 2 root root 4096 Nov 12 2021 proc drwx------ 1 root root 4096 Jul 1 08:18 root drwxr-xr-x 1 root root 4096 Jul 1 08:18 run drwxr-xr-x 2 root root 4096 Nov 12 2021 sbin drwxr-xr-x 2 root root 4096 Nov 12 2021 srv drwxr-xr-x 2 root root 4096 Nov 12 2021 sys drwxrwxrwt 1 root root 4096 Nov 13 2021 tmp drwxr-xr-x 1 root root 4096 Nov 12 2021 usr drwxr-xr-x 1 root root 4096 Nov 12 2021 var -rw-r--r-- 1 root root 5 Jul 1 08:18 xixi.log # c1可以看到创建的xx [root@elk91 ~]# [root@elk91 ~]# chroot /var/lib/docker/overlay2/20a3e63e25bd70f06a1a6f7f81b6099f7d202b899504bfa9e73f9ca317b5f97a/merged ls -l / total 88 drwxr-xr-x 2 root root 4096 Nov 12 2021 bin drwxr-xr-x 1 root root 4096 Jul 1 08:18 dev drwxr-xr-x 1 root root 4096 Nov 13 2021 docker-entrypoint.d -rwxrwxr-x 1 root root 1202 Nov 13 2021 docker-entrypoint.sh drwxr-xr-x 1 root root 4096 Jul 1 08:18 etc -rw-r--r-- 1 root root 5 Jul 1 08:18 haha.log # c2可以看到创建的haha drwxr-xr-x 2 root root 4096 Nov 12 2021 home drwxr-xr-x 1 root root 4096 Nov 12 2021 lib drwxr-xr-x 5 root root 4096 Nov 12 2021 media drwxr-xr-x 2 root root 4096 Nov 12 2021 mnt drwxr-xr-x 2 root root 4096 Nov 12 2021 opt dr-xr-xr-x 2 root root 4096 Nov 12 2021 proc drwx------ 1 root root 4096 Jul 1 08:18 root drwxr-xr-x 1 root root 4096 Jul 1 08:18 run drwxr-xr-x 2 root root 4096 Nov 12 2021 sbin drwxr-xr-x 2 root root 4096 Nov 12 2021 srv drwxr-xr-x 2 root root 4096 Nov 12 2021 sys drwxrwxrwt 1 root root 4096 Nov 13 2021 tmp drwxr-xr-x 1 root root 4096 Nov 12 2021 usr drwxr-xr-x 1 root root 4096 Nov 12 2021 var [root@elk91 ~]# 调研KVM虚拟机,并部署Ubuntu系统。 https://www.cnblogs.com/yinzhengjie/tag/KVM/
5、cgroup
bash
- docker底层Linux特性之cgroup
1.什么是cgroup
cgroup(control group)本质上是Linux内核用来做资源限制的机制,可以限制一组进程的CPU、memory、磁盘I/O等资源,docker正是借助它给容器做资源限额,简单示例见下方。
具体见10.4
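cgroup的效果可以用一个最小示例直观感受一下(limit-demo为示例容器名;最后一条命令里的cgroup文件路径以cgroup v2 + systemd驱动的环境为例,实际路径与发行版、cgroup版本有关,仅作参考):

bash
# cgroup在docker里的直观体现: 启动容器时限制内存和CPU
docker run -d --name limit-demo -m 256m --cpus 0.5 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1
# 查看限制是否生效
docker stats --no-stream limit-demo
# 直接查看内核里对应的cgroup限制文件(路径因环境而异)
cat /sys/fs/cgroup/system.slice/docker-$(docker inspect -f '{{.Id}}' limit-demo).scope/memory.max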
6、namespace
bash
namespace是Linux用于隔离进程资源的,比如IPC,NET,MNT,PID,UTS,USER。 具体看10.5
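namespace同样可以在宿主机上直接观察(以一个正在运行的容器为例,这里假设容器名为c1):

bash
# 拿到容器主进程在宿主机上的PID
PID=$(docker inspect -f '{{.State.Pid}}' c1)
# 查看该进程所属的各类namespace(ipc/net/mnt/pid/uts等)
ls -l /proc/$PID/ns/
# 进入它的网络namespace查看IP,效果等同于在容器里执行ip addr
nsenter -t $PID -n ip addr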

10、docker支持跨主机互联的解决方案

bash
- docker支持跨主机互联的解决方案: - macvlan 【不需要安装第三方组件,内核原生支持,不推荐,需要手动分配IP地址】 - overlay + consul【在更高的版本中移除了相关配置,因此高版本方案不推荐】 - flannel + etcd【推荐】 - calico + etcd【推荐】 - macvlan实现容器跨主机网络互联 1 安装docker环境 1.1 拷贝脚本 [root@elk91 ~]# scp weixiang-autoinstall-docker-docker-compose.tar.gz 10.0.0.92:~ 1.2 安装docker环境 [root@elk92 ~]# tar xf weixiang-autoinstall-docker-docker-compose.tar.gz [root@elk92 ~]# ./install-docker.sh i 1.3 导入镜像 [root@elk92 ~]# wget http://192.168.21.253/Resources/Docker/images/weixiang-alpine.tar.gz [root@elk92 ~]# docker load -i weixiang-alpine.tar.gz 2.所有节点加载Linux内核macvlan模块 lsmod | grep macvlan modprobe macvlan lsmod | grep macvlan 3.所有节点创建同网段的自定义网络类型【注意观察你的parent对应的物理网卡名称是不是eth0,如果是ens33需要做修改】 [root@elk92 ~]# docker network create -d macvlan --subnet 172.29.0.0/16 --gateway 172.29.0.254 -o parent=eth0 weixiang-macvlan # -d macvlan 指定网络驱动为 Macvlan(一种让容器直接复用宿主机物理网卡的技术) # --subnet 172.29.0.0/16 定义该网络的IP地址范围(172.29.0.0 ~ 172.29.255.255) # --gateway 172.29.0.254 指定该网络的网关地址(通常是宿主机或路由器的IP) # -o parent=eth0 绑定到宿主机的物理网卡 eth0(容器会通过此网卡直接通信) # weixiang-macvlan 自定义的网络名称 4.10.0.0.91节点基于自定义网络启动容器并手动分配IP地址 4.1 启动容器 [root@elk91 ~]# docker container run --rm -it --name xixi --network weixiang-macvlan --ip 172.29.0.91 alpine # docker container run 创建并启动一个新容器 # --rm 容器退出时自动删除(适合临时测试) # -it 交互式终端(-i 保持STDIN打开,-t 分配伪终端) # --name xixi 将容器命名为 xixi # --network weixiang-macvlan 连接到名为 weixiang-macvlan 的 Macvlan 网络 # --ip 172.29.0.91 手动指定容器IP(必须属于 Macvlan 的子网范围) # alpine 使用 Alpine Linux 镜像(轻量级Linux发行版) / # ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:1D:00:5B inet addr:172.29.0.91 Bcast:172.29.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # / # ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 5: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP link/ether 02:42:ac:1d:00:5b brd ff:ff:ff:ff:ff:ff inet 172.29.0.91/16 brd 172.29.255.255 scope global eth0 valid_lft forever preferred_lft forever / # 4.2 查看物理网卡并抓包 [root@elk91 ~]# tcpdump -i eth0 icmp # 【抓包的效果需要下一步进行ping测试】 tcpdump: verbose output suppressed, use -v[v]... 
for full protocol decode listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes 09:05:26.290015 IP 172.29.0.92 > 172.29.0.91: ICMP echo request, id 9, seq 0, length 64 09:05:26.290048 IP 172.29.0.91 > 172.29.0.92: ICMP echo reply, id 9, seq 0, length 64 09:05:27.290437 IP 172.29.0.92 > 172.29.0.91: ICMP echo request, id 9, seq 1, length 64 09:05:27.290472 IP 172.29.0.91 > 172.29.0.92: ICMP echo reply, id 9, seq 1, length 64 09:05:28.290845 IP 172.29.0.92 > 172.29.0.91: ICMP echo request, id 9, seq 2, length 64 09:05:28.290875 IP 172.29.0.91 > 172.29.0.92: ICMP echo reply, id 9, seq 2, length 64 5.10.0.0.92节点基于自定义网络启动容器并手动分配IP地址 [root@elk92 ~]# docker container run --rm -it --name haha --network weixiang-macvlan --ip 172.29.0.92 alpine / # ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:1D:00:5C inet addr:172.29.0.92 Bcast:172.29.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # / # ping 172.29.0.91 -c 3 PING 172.29.0.91 (172.29.0.91): 56 data bytes 64 bytes from 172.29.0.91: seq=0 ttl=64 time=0.581 ms 64 bytes from 172.29.0.91: seq=1 ttl=64 time=0.327 ms 64 bytes from 172.29.0.91: seq=2 ttl=64 time=0.237 ms --- 172.29.0.91 ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 0.237/0.381/0.581 ms / # / # ping baidu.com # 不难发现,macvlan的网络是无法联网的 PING baidu.com (182.61.201.211): 56 data bytes 6.解决macvlan无法访问外网的情况 [root@elk92 ~]# docker network connect bridge haha [root@elk92 ~]# [root@elk92 ~]# docker exec -it haha sh / # ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:1D:00:5C inet addr:172.29.0.92 Bcast:172.29.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:14 errors:0 dropped:0 overruns:0 frame:0 TX packets:37 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:1251 (1.2 KiB) TX bytes:1890 (1.8 KiB) eth1 Link encap:Ethernet HWaddr 02:42:AC:11:00:02 inet addr:172.17.0.2 Bcast:172.17.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:37 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:5314 (5.1 KiB) TX bytes:0 (0.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:38 errors:0 dropped:0 overruns:0 frame:0 TX packets:38 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:4103 (4.0 KiB) TX bytes:4103 (4.0 KiB) / # / # ping baidu.com -c 3 PING baidu.com (182.61.201.211): 56 data bytes 64 bytes from 182.61.201.211: seq=0 ttl=127 time=6.510 ms 64 bytes from 182.61.201.211: seq=1 ttl=127 time=6.914 ms 64 bytes from 182.61.201.211: seq=2 ttl=127 time=6.947 ms --- baidu.com ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 6.510/6.790/6.947 ms / # 7.移除网卡 [root@elk92 ~]# docker exec -it haha sh / # ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:1D:00:01 inet addr:172.29.0.1 Bcast:172.29.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 
Metric:1 RX packets:2 errors:0 dropped:0 overruns:0 frame:0 TX packets:9 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:120 (120.0 B) TX bytes:378 (378.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:38 errors:0 dropped:0 overruns:0 frame:0 TX packets:38 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:3252 (3.1 KiB) TX bytes:3252 (3.1 KiB) / # / # ping baidu.com -c 3 PING baidu.com (182.61.244.181): 56 data bytes [root@elk92 ~]# docker network disconnect bridge haha #总结 macvlan原理就是创建一个网络驱动为Macvlan,直接绑定到宿主机的物理网卡上,为容器分配与物理网络同网段的独立IP地址。当另一台节点 的容器也连接到这个Macvlan网络就可以实现互相通信,前提是必须使用相同的子网和网关配置,如果想上网,那就将容器连接到宿主机的Bridge 网络(默认docker0),宿主机会自动进行源地址转换,容器流量会被伪装成宿主机的IP地址,从而共享宿主机的网络连接可以让容器共享宿主 机的网络访问能力 8.macvlan的优缺点 优点: (1)docker原生支持,无需安装额外插件,只需要加载内核模块,配置起来相对简单。 (2)适合小规模docker环境,例如只有1-3台,如果服务器过多,手动分配IP地址可能会无形之间增加工作量; 缺点: (1)需要手动分配IP地址,如果让其自动分配IP地址可能会存在多个主机自动分配的IP地址冲突的情况,到时候还需要人工介入维护; (2)本机相同网络(本案例为"weixiang_macvlan")的容器之间相互通信没问题,跨主机之间的容器进行通信也没问题,但容器无法与宿主机之间进行通信,也无法连接到外网 (3).macvlan需要绑定一块物理网卡,若网卡已经被绑定,则无法创建; 温馨提示: 如果非要使用macvlan,我们需要手动分配IP地址,无法联网的问题,只需要使用"docker network connect"重新分配一块网卡即可解决。
2、docker容器的数据持久化
bash
- docker容器的数据持久化 1.准备测试数据 [root@elk91 ~]# mkdir /data /scripts [root@elk91 ~]# [root@elk91 ~]# cp /etc/fstab /scripts/ [root@elk91 ~]# echo www.weixiang.com > /data/index.html [root@elk91 ~]# [root@elk91 ~]# ll /data/ /scripts/ /data/: total 12 drwxr-xr-x 2 root root 4096 Jul 2 10:00 ./ drwxr-xr-x 23 root root 4096 Jul 2 09:58 ../ -rw-r--r-- 1 root root 18 Jul 2 10:00 index.html /scripts/: total 12 drwxr-xr-x 2 root root 4096 Jul 2 10:00 ./ drwxr-xr-x 23 root root 4096 Jul 2 09:58 ../ -rw-r--r-- 1 root root 657 Jul 2 10:00 fstab [root@elk91 ~]# 2.启动容器 [root@elk91 ~]# docker run -d --name c1 -v /data/:/usr/share/nginx/html -v /scripts/:/weixiang/weixiang98 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 9b548e7ec5937acc5b53747be3fbc56412e9070217dca8c2aac4ee243a6fbc03 # /data/(宿主机目录)挂载到容器内的 /usr/share/nginx/html(Nginx 默认网站根目录) [root@elk91 ~]# [root@elk91 ~]# docker run -d --name c2 --volumes-from c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 882d6a663090b3582f45d48cc1e5f67e9545441b1916049851accd8256523d71 # 让c2共享c1的所有数据卷(即 /usr/share/nginx/html 和 /weixiang/weixiang98),数据实时同步 # --volumes-from 实现了容器间的数据卷共享 [root@elk91 ~]# [root@elk91 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 882d6a663090 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 4 seconds ago Up 2 seconds 80/tcp c2 9b548e7ec593 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" About a minute ago Up About a minute 80/tcp c1 [root@elk91 ~]# 3.查看容器的IP地址 [root@elk91 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" c1 172.17.0.2 [root@elk91 ~]# [root@elk91 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" c2 172.17.0.3 [root@elk91 ~]# 4.访问测试 [root@elk91 ~]# curl 172.17.0.2 www.weixiang.com [root@elk91 ~]# [root@elk91 ~]# curl 172.17.0.3 www.weixiang.com [root@elk91 ~]# 5.修改数据验证测试 [root@elk91 ~]# docker exec -it c2 sh / # / # ls -l /weixiang/weixiang98/ total 4 -rw-r--r-- 1 root root 657 Jul 2 02:00 fstab / # / # ls -l /usr/share/nginx/html/ total 4 -rw-r--r-- 1 root root 18 Jul 2 02:00 index.html / # / # echo weixiang98 6666 > /usr/share/nginx/html/index.html / # [root@elk91 ~]# [root@elk91 ~]# curl 172.17.0.3 weixiang98 6666 [root@elk91 ~]# [root@elk91 ~]# curl 172.17.0.2 weixiang98 6666 [root@elk91 ~]# [root@elk91 ~]# cat /data/index.html weixiang98 6666 [root@elk91 ~]#
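补充一点: -v挂载时还可以在容器路径后面追加:ro,让容器对挂载目录只读不可写,适合只提供静态内容的场景(c3-ro为示例容器名):

bash
docker run -d --name c3-ro -v /data/:/usr/share/nginx/html:ro registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1
# 容器内尝试写入会直接报 Read-only file system
docker exec -it c3-ro sh -c 'echo test > /usr/share/nginx/html/index.html'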
3、docker的存储卷实战
bash
- docker的存储卷实战 1.查看本地的存储卷列表 [root@elk91 ~]# docker volume ls DRIVER VOLUME NAME local 1cb075fcbafedb384081f3b298daf27fd936b6051702848347fb64e5d7f63e3b local 2bd2db3887b6034c116d7aba1bf35b084300dacd2de8045c23fe7e5013ea8a5b local 9a5b60fd57eec338cc29e10364cf1ee906846cb6bea7ee9223a21ab379ae5e2b local 5916872fe9be5af1746276a82bfd4595ae370d4746b2cd19e5ed43a4c9c2fdb1 local bbefc5477cd59d4f710cc92e2188fcb5181b23c77c5ea4490b792a87f819bfc9 [root@elk91 ~]# 2.移除所有未使用的存储卷 [root@elk91 ~]# docker volume prune -f Deleted Volumes: bbefc5477cd59d4f710cc92e2188fcb5181b23c77c5ea4490b792a87f819bfc9 1cb075fcbafedb384081f3b298daf27fd936b6051702848347fb64e5d7f63e3b 2bd2db3887b6034c116d7aba1bf35b084300dacd2de8045c23fe7e5013ea8a5b 5916872fe9be5af1746276a82bfd4595ae370d4746b2cd19e5ed43a4c9c2fdb1 9a5b60fd57eec338cc29e10364cf1ee906846cb6bea7ee9223a21ab379ae5e2b Total reclaimed space: 752.3MB [root@elk91 ~]# [root@elk91 ~]# docker volume ls DRIVER VOLUME NAME [root@elk91 ~]# 3.创建匿名存储卷 [root@elk91 ~]# docker volume create 43835c8fe50a9469852d3a719a592816254b88de62b7b879fc7d62fee9f67951 [root@elk91 ~]# [root@elk91 ~]# docker volume ls DRIVER VOLUME NAME local 43835c8fe50a9469852d3a719a592816254b88de62b7b879fc7d62fee9f67951 [root@elk91 ~]# 4.创建自定义的存储卷 [root@elk91 ~]# docker volume create weixiang weixiang [root@elk91 ~]# [root@elk91 ~]# docker volume ls DRIVER VOLUME NAME local 43835c8fe50a9469852d3a719a592816254b88de62b7b879fc7d62fee9f67951 local weixiang [root@elk91 ~]# 5.查看存储卷的存储路径 [root@elk91 ~]# docker volume inspect weixiang [ { "CreatedAt": "2025-07-02T10:26:11+08:00", "Driver": "local", "Labels": {}, "Mountpoint": "/var/lib/docker/volumes/weixiang/_data", "Name": "weixiang", "Options": {}, "Scope": "local" } ] [root@elk91 ~]# [root@elk91 ~]# ll /var/lib/docker/volumes/weixiang/_data/ total 8 drwxr-xr-x 2 root root 4096 Jul 2 10:26 ./ drwx-----x 3 root root 4096 Jul 2 10:26 ../ [root@elk91 ~]# 6.往存储卷内写入测试数据 [root@elk91 ~]# echo https://www.weixiang.com > /var/lib/docker/volumes/weixiang/_data/index.html [root@elk91 ~]# [root@elk91 ~]# ll /var/lib/docker/volumes/weixiang/_data/ total 12 drwxr-xr-x 2 root root 4096 Jul 2 10:26 ./ drwx-----x 3 root root 4096 Jul 2 10:26 ../ -rw-r--r-- 1 root root 26 Jul 2 10:26 index.html [root@elk91 ~]# 7.为容器使用已存在的存储卷 [root@elk91 ~]# docker run -d --name c1 -v weixiang:/usr/share/nginx/html registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 9fef585d02ec788134c223f49471ebdb5c549bdbb1eeedcbee0774f78cd748e5 # /usr/share/nginx/html是容器内的目标路径 # 不会自动创建_data目录,因为_data是Docker管理卷的宿主机侧结构 [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 9fef585d02ec registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 4 seconds ago Up 3 seconds 80/tcp c1 [root@elk91 ~]# [root@elk91 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" `docker container ps -lq` 172.17.0.2 [root@elk91 ~]# [root@elk91 ~]# curl 172.17.0.2 https://www.weixiang.com [root@elk91 ~]# 8.为容器使用存储卷若不存在则会自动创建 [root@elk91 ~]# docker volume ls DRIVER VOLUME NAME local 43835c8fe50a9469852d3a719a592816254b88de62b7b879fc7d62fee9f67951 local weixiang [root@elk91 ~]# [root@elk91 ~]# docker run -d --name c2 -v weixiang98:/usr/share/nginx/html registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 6d1fbfa9815f930e30bade908bdbb315bd47875a526af2c452229096d370886f [root@elk91 ~]# [root@elk91 ~]# docker volume ls DRIVER VOLUME NAME local 43835c8fe50a9469852d3a719a592816254b88de62b7b879fc7d62fee9f67951 local weixiang98 local 
weixiang [root@elk91 ~]# [root@elk91 ~]# docker volume inspect weixiang98 [ { "CreatedAt": "2025-07-02T10:29:29+08:00", "Driver": "local", "Labels": null, "Mountpoint": "/var/lib/docker/volumes/weixiang98/_data", "Name": "weixiang98", "Options": null, "Scope": "local" } ] [root@elk91 ~]# [root@elk91 ~]# ll /var/lib/docker/volumes/weixiang98/_data/ total 244 drwxr-xr-x 2 root root 4096 Jul 2 10:29 ./ drwx-----x 3 root root 4096 Jul 2 10:29 ../ -rw-r--r-- 1 root root 233472 Jan 20 2024 1.jpg -rw-r--r-- 1 root root 494 May 25 2021 50x.html -rw-r--r-- 1 root root 357 Jan 20 2024 index.html [root@elk91 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" c2 172.17.0.3 [root@elk91 ~]# curl 172.17.0.3 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> 9.删除容器时数据并不丢失 [root@elk91 ~]# docker exec -it c2 sh / # echo AAAAAAAAAAAAAAAAAAAAAA > /usr/share/nginx/html/index.html / # [root@elk91 ~]# [root@elk91 ~]# curl 172.17.0.3 AAAAAAAAAAAAAAAAAAAAAA [root@elk91 ~]# [root@elk91 ~]# cat /var/lib/docker/volumes/weixiang98/_data/index.html AAAAAAAAAAAAAAAAAAAAAA [root@elk91 ~]# [root@elk91 ~]# docker container rm -f `docker container ps -qa` 6d1fbfa9815f 9fef585d02ec [root@elk91 ~]# [root@elk91 ~]# cat /var/lib/docker/volumes/weixiang98/_data/index.html AAAAAAAAAAAAAAAAAAAAAA [root@elk91 ~]# 10.容器未指定存储卷则使用匿名存储卷 [root@elk91 ~]# docker volume ls DRIVER VOLUME NAME local 43835c8fe50a9469852d3a719a592816254b88de62b7b879fc7d62fee9f67951 local weixiang98 local weixiang [root@elk91 ~]# [root@elk91 ~]# docker run -d --name c3 -v /usr/share/nginx/html registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 a1a5968829cba43282e75921fe2c17cfc4c58e5187a57ec6e261184a0913df74 [root@elk91 ~]# [root@elk91 ~]# docker volume ls DRIVER VOLUME NAME local 2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea local 43835c8fe50a9469852d3a719a592816254b88de62b7b879fc7d62fee9f67951 local weixiang98 local weixiang [root@elk91 ~]# [root@elk91 ~]# docker exec -it c3 sh / # echo BBBBBBBBBBBBBBBBBBB > /usr/share/nginx/html/index.html / # [root@elk91 ~]# 11.基于容器查看对应的存储卷 [root@elk91 ~]# docker volume inspect 2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea [ { "CreatedAt": "2025-07-02T10:32:22+08:00", "Driver": "local", "Labels": null, "Mountpoint": "/var/lib/docker/volumes/2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea/_data", "Name": "2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea", "Options": null, "Scope": "local" } ] [root@elk91 ~]# [root@elk91 ~]# cat /var/lib/docker/volumes/2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea/_data/index.html BBBBBBBBBBBBBBBBBBB [root@elk91 ~]# [root@elk91 ~]# docker container inspect -f "{{range .Mounts}}{{.Name}}{{end}}" c3 2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea # 通过上面这条命令可以看到容器所属的存储券 [root@elk91 ~]# [root@elk91 ~]# docker volume ls | grep 2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea local 2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea [root@elk91 ~]# [root@elk91 ~]# docker volume inspect 2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea [ { "CreatedAt": "2025-07-02T10:32:22+08:00", "Driver": "local", "Labels": null, "Mountpoint": 
"/var/lib/docker/volumes/2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea/_data", "Name": "2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea", "Options": null, "Scope": "local" } ] [root@elk91 ~]# [root@elk91 ~]# [root@elk91 ~]# cat /var/lib/docker/volumes/2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea/_data/index.html BBBBBBBBBBBBBBBBBBB [root@elk91 ~]# 12.删除容器时可以删除匿名存储卷 [root@elk91 ~]# cat /var/lib/docker/volumes/2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea/_data/index.html BBBBBBBBBBBBBBBBBBB [root@elk91 ~]# [root@elk91 ~]# docker rm -fv c3 # 使用-v选项可以一并删除匿名存储卷,慎用!!! c3 # -v 同时删除容器关联的匿名卷(未命名的卷) [root@elk91 ~]# [root@elk91 ~]# cat /var/lib/docker/volumes/2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea/_data/index.html cat: /var/lib/docker/volumes/2a36e6c19f8f1507af750e8836e531090cb8c0d68c8e4224ab1a8ec9c56fd0ea/_data/index.html: No such file or directory [root@elk91 ~]# [root@elk91 ~]# docker run -d --name c11 -v xixi:/data registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 a35adef48c8e33ed3396face791db08eba83beeb14b0b103bd29ba0be817e462 [root@elk91 ~]# [root@elk91 ~]# docker container inspect -f "{{range .Mounts}}{{.Name}}{{end}}" c11 xixi [root@elk91 ~]# [root@elk91 ~]# docker volume ls | grep xixi local xixi [root@elk91 ~]# [root@elk91 ~]# docker rm -fv c11 # 【如果存储卷的名称自定义过,则-v参数无效!】 c11 [root@elk91 ~]# [root@elk91 ~]# docker volume ls | grep xixi local xixi [root@elk91 ~]# 13.存储卷删除 [root@elk91 ~]# docker volume ls DRIVER VOLUME NAME local 8a2e675ac658b11a48a2c71b9b7236af46e991d5ce4f4295845db05bbb3be073 local 1124b3efd4d97a08c6ebaa91a6e8038b0f309ae6e39680d5ee735577aa96fc04 local 4869fabfc4c7779d78987afd8210a056302c4eda08ddc4fc24238daa79cf6ff5 local 7040d72c84839a50bf7d5e101ba6a59c2b4c0dd08f5f01c18e87ee9149c3979a local 65058e93eaa69eb8058488868d5ae36f9a6aea7d385d691ec4f24fc04d7bb3cf local a73a2164815f90db0bda1dd0f77eaf32062a109ba44975fb3b99b3217f051af8 local c3223eb83a386d29b2329cb340965758690fa79a12e20027ec7816568480f106 local d0845aeb50ef73ebb9b10662713be9752d448c440c196aab61819ee3ddeb5936 local f4dfa064d9572bb1e717a79a944810efa4f059b7db38dbda72c9f04b9de97b7c [root@elk91 ~]# [root@elk91 ~]# [root@elk91 ~]# docker volume rm 8a2e675ac658b11a48a2c71b9b7236af46e991d5ce4f4295845db05bbb3be073 1124b3efd4d97a08c6ebaa91a6e8038b0f309ae6e39680d5ee735577aa96fc04 4869fabfc4c7779d78987afd8210a056302c4eda08ddc4fc24238daa79cf6ff5 8a2e675ac658b11a48a2c71b9b7236af46e991d5ce4f4295845db05bbb3be073 1124b3efd4d97a08c6ebaa91a6e8038b0f309ae6e39680d5ee735577aa96fc04 4869fabfc4c7779d78987afd8210a056302c4eda08ddc4fc24238daa79cf6ff5 [root@elk91 ~]# [root@elk91 ~]# docker volume ls DRIVER VOLUME NAME local 7040d72c84839a50bf7d5e101ba6a59c2b4c0dd08f5f01c18e87ee9149c3979a local 65058e93eaa69eb8058488868d5ae36f9a6aea7d385d691ec4f24fc04d7bb3cf local a73a2164815f90db0bda1dd0f77eaf32062a109ba44975fb3b99b3217f051af8 local c3223eb83a386d29b2329cb340965758690fa79a12e20027ec7816568480f106 local d0845aeb50ef73ebb9b10662713be9752d448c440c196aab61819ee3ddeb5936 local f4dfa064d9572bb1e717a79a944810efa4f059b7db38dbda72c9f04b9de97b7c [root@elk91 ~]# [root@elk91 ~]# docker volume prune -f Deleted Volumes: 7040d72c84839a50bf7d5e101ba6a59c2b4c0dd08f5f01c18e87ee9149c3979a d0845aeb50ef73ebb9b10662713be9752d448c440c196aab61819ee3ddeb5936 c3223eb83a386d29b2329cb340965758690fa79a12e20027ec7816568480f106 f4dfa064d9572bb1e717a79a944810efa4f059b7db38dbda72c9f04b9de97b7c a73a2164815f90db0bda1dd0f77eaf32062a109ba44975fb3b99b3217f051af8 
65058e93eaa69eb8058488868d5ae36f9a6aea7d385d691ec4f24fc04d7bb3cf Total reclaimed space: 1.406MB [root@elk91 ~]# [root@elk91 ~]# docker volume ls DRIVER VOLUME NAME [root@elk91 ~]#
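补充:除了命名卷和匿名卷,实际工作中也经常直接使用宿主机目录做绑定挂载(bind mount)。下面是一个演示写法的最小示意(/opt/html 目录和容器名 c4 均为假设的示例值,镜像沿用上面的 apps:v1):

bash
# 绑定挂载:把宿主机目录直接挂载到容器内,路径由自己管理,不会出现在 docker volume ls 中
mkdir -p /opt/html
echo 'bind mount test' > /opt/html/index.html

# -v 宿主机绝对路径:容器路径[:ro],加 :ro 表示容器内只读
docker run -d --name c4 -v /opt/html:/usr/share/nginx/html:ro \
  registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1

# 验证:容器读到的就是宿主机目录的内容
curl $(docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" c4)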
4、docker底层Linux特性之cgroup
bash
- docker底层Linux特性之cgroup 1.什么是cgroup 所谓的cgroup本质上是Linux用做资源限制,可以限制Linux的cpu,memory,disk,I/O。 2.docker底层基于system管理cgroup [root@elk92 ~]# docker info | grep Cgroup Cgroup Driver: systemd Cgroup Version: 2 [root@elk92 ~]# 3. 拉取镜像 [root@elk92 ~]# docker pull jasonyin2020/weixiang-linux-tools:v0.1 v0.1: Pulling from jasonyin2020/weixiang-linux-tools 59bf1c3509f3: Pull complete cdc010c9a849: Pull complete bac97e2f09ed: Pull complete d2167fa4e835: Pull complete Digest: sha256:eac6c50d80c7452db54871790fb26a6ca4d63dd3d4c98499293b3bab90832259 Status: Downloaded newer image for jasonyin2020/weixiang-linux-tools:v0.1 docker.io/jasonyin2020/weixiang-linux-tools:v0.1 [root@elk92 ~]# SVIP: [root@elk91 ~]# wget http://192.168.21.253/Resources/Docker/images/weixiang-stress-tools.tar.gz [root@elk91 ~]# docker load -i weixiang-stress-tools.tar.gz 4.启动容器 [root@elk91 ~]# docker run -d --name stress --cpu-quota 30000 -m 209715200 jasonyin2020/weixiang-linux-tools:v0.1 tail -f /etc/hosts ae48e0f8b4657a80216ea7029a36447e84558f117837c7af0bff2923de8508d1 # --cpu-quota 30000 限制CPU使用量为30% # -m 209715200 限制内存为200MB [root@elk91 ~]# [root@elk91 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES ae48e0f8b465 jasonyin2020/weixiang-linux-tools:v0.1 "tail -f /etc/hosts" 2 seconds ago Up 2 seconds stress [root@elk91 ~]# 5.CPU压测 [root@elk91 ~]# docker exec -it stress sh /usr/local/stress # /usr/local/stress # stress -c 4 --verbose --timeout 10m stress: info: [12] dispatching hogs: 4 cpu, 0 io, 0 vm, 0 hdd stress: dbug: [12] using backoff sleep of 12000us stress: dbug: [12] setting timeout to 600s stress: dbug: [12] --> hogcpu worker 4 [13] forked stress: dbug: [12] using backoff sleep of 9000us stress: dbug: [12] setting timeout to 600s stress: dbug: [12] --> hogcpu worker 3 [14] forked stress: dbug: [12] using backoff sleep of 6000us stress: dbug: [12] setting timeout to 600s stress: dbug: [12] --> hogcpu worker 2 [15] forked stress: dbug: [12] using backoff sleep of 3000us stress: dbug: [12] setting timeout to 600s stress: dbug: [12] --> hogcpu worker 1 [16] forked ... [root@elk91 ~]# docker stats stress # 需要单独再开一个终端,发现CPU使用率仅有30%左右 ... CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS ae48e0f8b465 stress 30.55% 1.508MiB / 200MiB 0.75% 2.49kB / 0B 0B / 8.19kB 7 CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS ae48e0f8b465 stress 29.72% 1.508MiB / 200MiB 0.75% 2.49kB / 0B 0B / 8.19kB 7 CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS ae48e0f8b465 stress 29.72% 1.508MiB / 200MiB 0.75% 2.49kB / 0B 0B / 8.19kB 7 CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS ae48e0f8b465 stress 30.02% 1.508MiB / 200MiB 0.75% 2.49kB / 0B 0B / 8.19kB 7 ... 6.内存压测 [root@elk91 ~]# docker exec -it stress sh /usr/local/stress # stress -m 5 --vm-bytes 52428800 --vm-keep --verbose stress: info: [23] dispatching hogs: 0 cpu, 0 io, 5 vm, 0 hdd stress: dbug: [23] using backoff sleep of 15000us stress: dbug: [23] --> hogvm worker 5 [24] forked stress: dbug: [23] using backoff sleep of 12000us stress: dbug: [23] --> hogvm worker 4 [25] forked stress: dbug: [23] using backoff sleep of 9000us stress: dbug: [23] --> hogvm worker 3 [26] forked stress: dbug: [23] using backoff sleep of 6000us stress: dbug: [23] --> hogvm worker 2 [27] forked stress: dbug: [23] using backoff sleep of 3000us stress: dbug: [23] --> hogvm worker 1 [28] forked stress: dbug: [26] allocating 52428800 bytes ... stress: dbug: [26] touching bytes in strides of 4096 bytes ... 
stress: dbug: [28] allocating 52428800 bytes ... stress: dbug: [28] touching bytes in strides of 4096 bytes ... stress: dbug: [25] allocating 52428800 bytes ... stress: dbug: [25] touching bytes in strides of 4096 bytes ... stress: dbug: [27] allocating 52428800 bytes ... stress: dbug: [27] touching bytes in strides of 4096 bytes ... stress: dbug: [24] allocating 52428800 bytes ... stress: dbug: [24] touching bytes in strides of 4096 bytes ... ... [root@elk91 ~]# docker stats stress #不难发现,内存的使用率仅有200MB左右,尽管我的压测的结果是250M,很明显,打不到250MB的流量。 ... CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS ae48e0f8b465 stress 30.51% 199.8MiB / 200MiB 99.91% 2.63kB / 0B 6.34GB / 5.25GB 12 CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS ae48e0f8b465 stress 30.51% 199.8MiB / 200MiB 99.91% 2.63kB / 0B 6.34GB / 5.25GB 12 CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS ae48e0f8b465 stress 31.31% 200MiB / 200MiB 100.00% 2.63kB / 0B 6.39GB / 5.31GB 12 CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS ae48e0f8b465 stress 31.31% 200MiB / 200MiB 100.00% 2.63kB / 0B 6.39GB / 5.31GB 12 彩蛋:对已经运行的容器做资源限制 1.实验环境 [root@elk91 ~]# docker run -d --name xixi registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 6bb7e1e43c727660b50afc15e79303917c78ca2ed0f81df9217cf3644a0da111 [root@elk91 ~]# [root@elk91 ~]# docker stats xixi --no-stream CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS 6bb7e1e43c72 xixi 0.00% 2.918MiB / 3.785GiB 0.08% 2.09kB / 0B 0B / 24.6kB 3 [root@elk91 ~]# [root@elk91 ~]# free -h total used free shared buff/cache available Mem: 3.8Gi 1.3Gi 676Mi 1.0Mi 1.9Gi 2.2Gi Swap: 3.8Gi 0.0Ki 3.8Gi [root@elk91 ~]# 2.在不停止容器的情况下配置资源限制【配置0.5核心,50MiB内存】 [root@elk91 ~]# docker update --cpu-quota 50000 -m 52428800 --memory-swap 52428800 xixi xixi [root@elk91 ~]# 3.验证测试 [root@elk91 ~]# docker stats xixi --no-stream CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS 6bb7e1e43c72 xixi 0.00% 2.918MiB / 50MiB 5.84% 2.3kB / 0B 0B / 24.6kB 3 [root@elk91 ~]#
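补充:在 cgroup v2 且驱动为 systemd 的环境下,也可以直接在宿主机上查看容器对应的 cgroup 文件,验证 --cpu-quota 和 -m 是否生效。下面只是一个演示思路的草稿,system.slice/docker-<ID>.scope 这一路径以实际发行版为准,可能略有差异:

bash
# 拿到容器完整ID
CID=$(docker inspect -f '{{.Id}}' stress)

# cpu.max 的格式为 "配额 周期",30000 100000 即 30%
cat /sys/fs/cgroup/system.slice/docker-${CID}.scope/cpu.max

# memory.max 为内存上限(字节),209715200 即 200MiB
cat /sys/fs/cgroup/system.slice/docker-${CID}.scope/memory.max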
5、docker底层Linux特性之namespace技术
bash
- docker底层Linux特性之namespace技术 1.namespace概述 Linux Namespace是内核级别的资源隔离机制,Docker利用其实现容器化。关键Namespace类型: PID:隔离进程ID(容器内PID独立) NET:隔离网络栈(容器拥有独立IP、端口等) IPC:隔离进程间通信 MNT:隔离文件系统挂载点 UTS:隔离主机名和域名 USER:隔离用户和用户组ID 在"/proc"目录可以查看当前进程的名称空间: [root@elk92 ~]# ll /proc/$$/ns total 0 dr-x--x--x 2 root root 0 Mar 20 11:54 ./ dr-xr-xr-x 9 root root 0 Mar 19 10:54 ../ lrwxrwxrwx 1 root root 0 Mar 20 11:54 cgroup -> 'cgroup:[4026531835]' lrwxrwxrwx 1 root root 0 Mar 20 11:54 ipc -> 'ipc:[4026531839]' lrwxrwxrwx 1 root root 0 Mar 20 11:54 mnt -> 'mnt:[4026531841]' lrwxrwxrwx 1 root root 0 Mar 20 11:54 net -> 'net:[4026531840]' lrwxrwxrwx 1 root root 0 Mar 20 11:54 pid -> 'pid:[4026531836]' lrwxrwxrwx 1 root root 0 Mar 20 11:54 pid_for_children -> 'pid:[4026531836]' lrwxrwxrwx 1 root root 0 Mar 20 11:54 time -> 'time:[4026531834]' lrwxrwxrwx 1 root root 0 Mar 20 11:54 time_for_children -> 'time:[4026531834]' lrwxrwxrwx 1 root root 0 Mar 20 11:54 user -> 'user:[4026531837]' lrwxrwxrwx 1 root root 0 Mar 20 11:54 uts -> 'uts:[4026531838]' [root@elk92 ~]# 2.验证docker多容器共享了net网络名称空间 [root@elk92 ~]# docker run -d --name c1 alpine sleep 10d a61b1bdfec98f15781d095e611b3461e8abdccfa0723c77e5d8bb2cc1c4bf202 [root@elk92 ~]# [root@elk92 ~]# docker run -d --name c2 --network container:c1 alpine sleep 20d 116d88d5a2c04b201787d8d82cedccd04f88d8e8eaf68a1f542db85702164c61 [root@elk92 ~]# [root@elk92 ~]# docker inspect -f '{{.State.Pid}}' c1 5550 [root@elk92 ~]# [root@elk92 ~]# docker inspect -f '{{.State.Pid}}' c2 5676 [root@elk92 ~]# [root@elk92 ~]# ll /proc/5550/ns/net lrwxrwxrwx 1 root root 0 Jul 2 11:41 /proc/5550/ns/net -> 'net:[4026532667]' [root@elk92 ~]# [root@elk92 ~]# ll /proc/5676/ns/net lrwxrwxrwx 1 root root 0 Jul 2 11:42 /proc/5676/ns/net -> 'net:[4026532667]' [root@elk92 ~]# [root@elk92 ~]# [root@elk92 ~]# docker exec c1 ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:02 inet addr:172.17.0.2 Bcast:172.17.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:20 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2486 (2.4 KiB) TX bytes:0 (0.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) [root@elk92 ~]# [root@elk92 ~]# docker exec c2 ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:02 inet addr:172.17.0.2 Bcast:172.17.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:20 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2486 (2.4 KiB) TX bytes:0 (0.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) [root@elk92 ~]# # PID Namespace未共享,各自容器有独立的进程空间 [root@elk92 ~]# ll /proc/5550/ns/pid lrwxrwxrwx 1 root root 0 Jul 2 11:42 /proc/5550/ns/pid -> 'pid:[4026532666]' [root@elk92 ~]# [root@elk92 ~]# ll /proc/5676/ns/pid lrwxrwxrwx 1 root root 0 Jul 2 11:42 /proc/5676/ns/pid -> 'pid:[4026532733]' [root@elk92 ~]# [root@elk92 ~]# docker exec c1 ps -ef PID USER TIME COMMAND 1 root 0:00 sleep 10d 13 root 0:00 ps -ef [root@elk92 ~]# [root@elk92 ~]# docker 
exec c2 ps -ef PID USER TIME COMMAND 1 root 0:00 sleep 20d 13 root 0:00 ps -ef [root@elk92 ~]# - Docker底层用到了Linux特性 - 内核转发 - iptables - overlay FS - chroot - cgroup - namespace
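补充:除了进入容器内执行命令,也可以在宿主机上用 lsns、nsenter 直接观察和进入容器的名称空间,下面是一个简单示意(容器名 c1 沿用上面的实验环境):

bash
# 获取容器主进程在宿主机上的PID
PID=$(docker inspect -f '{{.State.Pid}}' c1)

# 列出该进程所属的所有名称空间
lsns -p ${PID}

# 只进入该进程的net名称空间,在宿主机上直接查看容器的网卡和IP
nsenter -t ${PID} -n ip addr

# 同时进入pid和mnt名称空间查看进程,效果上接近 docker exec
nsenter -t ${PID} -p -m ps -ef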

11、制作镜像案例

1、手动制作镜像
bash
1.启动基础镜像 [root@elk92 ~]# docker run -d --name myweb alpine tail -f /etc/hosts 33cd639560a251bcd8547fd30ff617e0f8522605eda8f1eee947290245ec58ca # tail -f /etc/hosts 容器启动后执行的命令 [root@elk92 ~]# [root@elk92 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 33cd639560a2 alpine "tail -f /etc/hosts" 2 seconds ago Up 1 second myweb [root@elk92 ~]# 2.安装nginx服务 [root@elk92 ~]# docker exec -it myweb sh / # cat /etc/apk/repositories https://dl-cdn.alpinelinux.org/alpine/v3.20/main https://dl-cdn.alpinelinux.org/alpine/v3.20/community / # / # sed -i 's#dl-cdn.alpinelinux.org#mirrors.aliyun.com#g' /etc/apk/repositories / # / # cat /etc/apk/repositories https://mirrors.aliyun.com/alpine/v3.20/main https://mirrors.aliyun.com/alpine/v3.20/community / # / # apk update # 更新软件源 fetch https://mirrors.aliyun.com/alpine/v3.20/main/x86_64/APKINDEX.tar.gz fetch https://mirrors.aliyun.com/alpine/v3.20/community/x86_64/APKINDEX.tar.gz v3.20.6-210-g791eef147fb [https://mirrors.aliyun.com/alpine/v3.20/main] v3.20.6-210-g791eef147fb [https://mirrors.aliyun.com/alpine/v3.20/community] OK: 24177 distinct packages available / # / # apk add nginx (1/2) Installing pcre (8.45-r3) (2/2) Installing nginx (1.26.3-r0) Executing nginx-1.26.3-r0.pre-install Executing nginx-1.26.3-r0.post-install Executing busybox-1.36.1-r29.trigger OK: 9 MiB in 16 packages / # / # nginx -v nginx version: nginx/1.26.3 / # 3.启动nginx / # nginx -t nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: configuration file /etc/nginx/nginx.conf test is successful / # / # nginx 4.访问测试 [root@elk92 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" myweb 172.17.0.2 [root@elk92 ~]# [root@elk92 ~]# curl 172.17.0.2 <html> <head><title>404 Not Found</title></head> <body> <center><h1>404 Not Found</h1></center> <hr><center>nginx</center> </body> </html> [root@elk92 ~]# 5.手动将镜像提交 [root@elk92 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 33cd639560a2 alpine "tail -f /etc/hosts" 4 minutes ago Up 4 minutes myweb # 将当前运行的容器 myweb 保存为一个新的镜像,并命名为 myweb:v0.1。 [root@elk92 ~]# docker container commit myweb myweb:v0.1 sha256:cd48894434af1010053cef97eac3439ed8aae69075f3ff74fc5d304f6532e0ce [root@elk92 ~]# [root@elk92 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE myweb v0.1 cd48894434af 3 seconds ago 11.6MB alpine latest 91ef0af61f39 9 months ago 7.8MB [root@elk92 ~]# 6.基于自建的镜像运行 [root@elk92 ~]# docker run -d --name c1 myweb:v0.1 7d272a5b8e6e1ac943d7abd2238759844dba06aeaca7e5c9703d3f3ee9b61754 [root@elk92 ~]# [root@elk92 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 7d272a5b8e6e myweb:v0.1 "tail -f /etc/hosts" 3 seconds ago Up 2 seconds c1 [root@elk92 ~]# [root@elk92 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" c1 172.17.0.3 [root@elk92 ~]# [root@elk92 ~]# curl 172.17.0.3 curl: (7) Failed to connect to 172.17.0.3 port 80 after 0 ms: Connection refused [root@elk92 ~]# # 因为构建的镜像已经有nginx,所以直接可以启动 [root@elk92 ~]# docker exec c1 nginx [root@elk92 ~]# [root@elk92 ~]# curl 172.17.0.3 <html> <head><title>404 Not Found</title></head> <body> <center><h1>404 Not Found</h1></center> <hr><center>nginx</center> </body> </html> [root@elk92 ~]# 7.修改容器的启动命令 [root@elk92 ~]# docker run -d --name c2 myweb:v0.1 nginx -g 'daemon off;' 846fd82a137935985a2546998046efef20da4eecefa374e3a174835a3b834c56 # nginx -g 'daemon off;'这是传递给容器的命令(CMD),它会覆盖镜像中默认的CMD或ENTRYPOINT [root@elk92 ~]# [root@elk92 ~]# docker ps -l CONTAINER 
ID IMAGE COMMAND CREATED STATUS PORTS NAMES 846fd82a1379 myweb:v0.1 "nginx -g 'daemon of…" 2 seconds ago Up 2 seconds c2 [root@elk92 ~]# [root@elk92 ~]# docker ps -l --no-trunc CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 846fd82a137935985a2546998046efef20da4eecefa374e3a174835a3b834c56 myweb:v0.1 "nginx -g 'daemon off;'" 8 seconds ago Up 7 seconds c2 [root@elk92 ~]# [root@elk92 ~]# [root@elk92 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" c2 172.17.0.4 [root@elk92 ~]# [root@elk92 ~]# curl 172.17.0.4 <html> <head><title>404 Not Found</title></head> <body> <center><h1>404 Not Found</h1></center> <hr><center>nginx</center> </body> </html> [root@elk92 ~]# 8.如果想要修改容器的启动命令,则需要再次手动提交一次镜像 [root@elk92 ~]# docker commit c2 myweb:v0.2 sha256:6c1a7d94dec69c4b5f4e7f80ab960b55e79de5458edabe35647e3417118a5973 [root@elk92 ~]# [root@elk92 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE myweb v0.2 6c1a7d94dec6 3 seconds ago 11.6MB myweb v0.1 cd48894434af 4 minutes ago 11.6MB alpine latest 91ef0af61f39 9 months ago 7.8MB [root@elk92 ~]# [root@elk92 ~]# docker run -d --name c3 myweb:v0.2 77057fb5e0825c97d61ddf0e51d5b069f375182ae8a305f622fdde433d76c030 [root@elk92 ~]# [root@elk92 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 77057fb5e082 myweb:v0.2 "nginx -g 'daemon of…" 3 seconds ago Up 2 seconds c3 [root@elk92 ~]# [root@elk92 ~]# docker ps -l --no-trunc CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 77057fb5e0825c97d61ddf0e51d5b069f375182ae8a305f622fdde433d76c030 myweb:v0.2 "nginx -g 'daemon off;'" 9 seconds ago Up 8 seconds c3 [root@elk92 ~]# [root@elk92 ~]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" c3 172.17.0.5 [root@elk92 ~]# [root@elk92 ~]# curl 172.17.0.5 <html> <head><title>404 Not Found</title></head> <body> <center><h1>404 Not Found</h1></center> <hr><center>nginx</center> </body> </html> [root@elk92 ~]#
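补充:docker commit 还支持 --change(-c)选项,提交镜像时可以直接追加 CMD、ENTRYPOINT、ENV、EXPOSE 等Dockerfile指令,从而省去"再起一个容器改启动命令、再提交一次"的步骤。下面是一个示意写法(镜像标签 myweb:v0.3 和容器名 c9 为示例):

bash
# 提交时直接把默认启动命令改为前台运行nginx,并声明80端口
docker container commit \
  --change='CMD ["nginx","-g","daemon off;"]' \
  --change='EXPOSE 80' \
  myweb myweb:v0.3

# 验证:新镜像启动后无需再手动exec进入容器启动nginx
docker run -d --name c9 myweb:v0.3
docker ps -l --no-trunc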
2、Dockerfile
bash
https://docs.docker.com/reference/dockerfile/

- Dockerfile基础
1.什么是Dockerfile
所谓的Dockerfile,指的是通过指令快速构建镜像的一种方式。

常见的Dockerfile指令如下:
- FROM
  指定基础镜像,并调用基础镜像的触发器(ONBUILD)。
- ONBUILD
  定义触发器,当别人以该镜像为基础镜像构建时触发指定的Dockerfile指令。
- MAINTAINER
  声明作者信息。
- LABEL
  给镜像打标签。
- RUN
  在构建阶段的容器中运行命令。
- COPY
  将本地文件复制到容器中,也用于多阶段构建。
- ADD
  将本地文件复制到容器中,并且可以自动解压tar包格式文件。
- CMD
  指定容器的启动命令。
- ENTRYPOINT
  指定容器的启动命令。
- ENV
  向容器传递环境变量。
- ARG
  定义仅在构建阶段生效的变量。
- HEALTHCHECK
  配置容器的健康检查,周期性检测容器是否健康。
- USER
  指定容器的运行用户。
- WORKDIR
  指定工作目录。
- EXPOSE
  声明容器暴露的端口。
- VOLUME
  定义容器的匿名存储卷。
- SHELL
  声明解释器,Linux默认为"/bin/sh"。
此外还支持多阶段构建。

Dockerfile和镜像的区别是啥?
我们可以基于Dockerfile构建镜像,而镜像可以直接拉取后使用,无需编译。
镜像就好像一道做好的菜,而Dockerfile就好像做菜的流程。

2.构建Dockerfile的流程
- A.手动构建镜像并记录相关命令;
- B.使用Dockerfile改写;
(docker build 的常用参数见下方示例)
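补充:后面的案例统一使用 docker build 编译镜像,这里先列出几个最常用的参数组合便于对照(镜像标签、文件名和 --build-arg 的取值均为示例):

bash
# 默认使用当前目录下名为 Dockerfile 的文件,"." 为构建上下文
docker build -t myweb:v1.0 .

# -f 指定其它名称的Dockerfile文件
docker build -t myweb:v1.0 -f weixiang.dockerfile .

# --no-cache 不使用构建缓存;--build-arg 覆盖Dockerfile中通过ARG定义的变量(后面ARG案例会用到)
docker build --no-cache --build-arg VERSION=1.26 -t myweb:v1.1 .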
1、Dockerfile构建多服务镜像案例

基于CentOS构建镜像案例

bash
1 导入镜像 [root@elk92 ~]# wget http://192.168.21.253/Resources/Docker/images/Linux/centos.tar.gz [root@elk92 ~]# docker load -i centos.tar.gz 2 编写Dockerfile [root@elk92 centos]# cat Dockerfile # 指定基础镜像 FROM centos:centos7.9.2009 # 声明作者信息 MAINTAINER yinzhengjie@weixiang.com # 给镜像打标签 LABEL school=weixiang \ class=weixiang98 # 在容器中运行命令 RUN mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup && \ curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo && \ curl -o /etc/yum.repos.d/epel.repo https://mirrors.aliyun.com/repo/epel-7.repo && \ yum -y install nginx openssh-server && \ ssh-keygen -A && \ rm -rf /var/cache/yum/ # 拷贝宿主机的文件到容器中 COPY start.sh / # CMD ["tail","-f","/etc/hosts"] CMD ["bash","-x","/start.sh"] [root@elk92 centos]# [root@elk92 centos]# cat start.sh #!/bin/bash # 启动SSH服务 /usr/sbin/sshd # init root password echo 123456 | passwd --stdin root # 启动Nginx并强制它在前台运行 nginx -g 'daemon off;' # Docker容器要求至少有一个前台进程,否则容器会立即退出。这里nginx是主进程,保持前台运行使得容器持续工作。 3 编译安装 [root@elk92 centos]# docker build -t myweb:v1.1 . [root@elk92 centos]# [root@elk92 centos]# docker image ls myweb REPOSITORY TAG IMAGE ID CREATED SIZE myweb v1.1 ae3dfaae5128 3 minutes ago 264MB myweb v1.0 fae885ada54e 11 minutes ago 264MB myweb v0.2 6c1a7d94dec6 3 hours ago 11.6MB myweb v0.1 cd48894434af 3 hours ago 11.6MB [root@elk92 centos]# [root@elk92 centos]# 4 测试验证 [root@elk92 centos]# docker run -d --name myweb-server myweb:v1.1 40cdefe2de80d0cb708a57d43c71d1c8d56d6f7a53fde26edfb5833761c5a329 [root@elk92 centos]# [root@elk92 centos]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 40cdefe2de80 myweb:v1.1 "bash -x /start.sh" 2 seconds ago Up 1 second myweb-server [root@elk92 centos]# [root@elk92 centos]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" myweb-server 172.17.0.7 [root@elk92 centos]# [root@elk92 centos]# curl 172.17.0.7 <html> <head><title>403 Forbidden</title></head> <body> <center><h1>403 Forbidden</h1></center> <hr><center>nginx/1.20.1</center> </body> </html> [root@elk92 centos]# [root@elk92 centos]# ssh 172.17.0.7 The authenticity of host '172.17.0.7 (172.17.0.7)' can't be established. ED25519 key fingerprint is SHA256:Oolb92cq59P5NDG39OONo1rfEXi2eMwbf4SeDbzgLBI. This host key is known by the following other names/addresses: ~/.ssh/known_hosts:1: [hashed name] Are you sure you want to continue connecting (yes/no/[fingerprint])? yes Warning: Permanently added '172.17.0.7' (ED25519) to the list of known hosts. root@172.17.0.7's password: [root@40cdefe2de80 ~]# [root@40cdefe2de80 ~]# ps -ef UID PID PPID C STIME TTY TIME CMD root 1 0 0 07:14 ? 00:00:00 bash -x /start.sh root 8 1 0 07:14 ? 00:00:00 /usr/sbin/sshd root 11 1 0 07:14 ? 00:00:00 nginx: master process nginx -g daemon off; nginx 12 11 0 07:14 ? 00:00:00 nginx: worker process nginx 13 11 0 07:14 ? 00:00:00 nginx: worker process root 14 8 0 07:15 ? 00:00:00 sshd: root@pts/0 root 16 14 0 07:15 pts/0 00:00:00 -bash root 29 16 0 07:15 pts/0 00:00:00 ps -ef [root@40cdefe2de80 ~]#

基于Ubuntu构建镜像案例

bash
1.导入镜像 [root@elk92 centos]# wget http://192.168.21.253/Resources/Docker/images/Linux/ubuntu.tar.gz [root@elk92 centos]# docker load -i ubuntu.tar.gz 2.编写Dockerfile [root@elk92 ubuntu]# cat Dockerfile FROM ubuntu:22.04 MAINTAINER yinzhengjie@weixiang.com LABEL school=weixiang \ class=weixiang98 RUN apt update && apt -y install nginx openssh-server && \ mkdir /run/sshd && \ ssh-keygen -A && \ rm -rf /var/cache/ && \ sed -ri 's@#(PermitRootLogin) prohibit-password@\1 yes@' /etc/ssh/sshd_config COPY start.sh / CMD ["bash","-x","/start.sh"] [root@elk92 ubuntu]# [root@elk92 ubuntu]# [root@elk92 ubuntu]# cat start.sh #!/bin/bash /usr/sbin/sshd echo root:123456 | chpasswd nginx -g "daemon off;" [root@elk92 ubuntu]# [root@elk92 ubuntu]# 3.编译镜像 [root@elk92 ubuntu]# docker build -t myweb:v2.0 . [root@elk92 ubuntu]# docker image ls myweb:v2.0 REPOSITORY TAG IMAGE ID CREATED SIZE myweb v2.0 8e1db47a5230 7 seconds ago 265MB [root@elk92 ubuntu]# [root@elk92 ubuntu]# docker image ls myweb REPOSITORY TAG IMAGE ID CREATED SIZE myweb v2.0 8e1db47a5230 16 seconds ago 265MB myweb v1.1 ae3dfaae5128 49 minutes ago 264MB myweb v1.0 fae885ada54e 56 minutes ago 264MB myweb v0.2 6c1a7d94dec6 4 hours ago 11.6MB myweb v0.1 cd48894434af 4 hours ago 11.6MB [root@elk92 ubuntu]# 4.测试运行 [root@elk92 ubuntu]# docker run -d --name haha myweb:v2.0 8a710a1651e230a6d3a70e20d832a0b4a85e8e4ec178e5ff3f7d2e8d2314fd86 [root@elk92 ubuntu]# [root@elk92 ubuntu]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 8a710a1651e2 myweb:v2.0 "bash -x /start.sh" 3 seconds ago Up 2 seconds haha [root@elk92 ubuntu]# [root@elk92 ubuntu]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" haha 172.17.0.8 [root@elk92 ubuntu]# [root@elk92 ubuntu]# curl 172.17.0.8 <!DOCTYPE html> <html> <head> <title>Welcome to nginx!</title> <style> body { width: 35em; margin: 0 auto; font-family: Tahoma, Verdana, Arial, sans-serif; } </style> </head> <body> <h1>Welcome to nginx!</h1> <p>If you see this page, the nginx web server is successfully installed and working. Further configuration is required.</p> <p>For online documentation and support please refer to <a href="http://nginx.org/">nginx.org</a>.<br/> Commercial support is available at <a href="http://nginx.com/">nginx.com</a>.</p> <p><em>Thank you for using nginx.</em></p> </body> </html> [root@elk92 ubuntu]# [root@elk92 ubuntu]# rm -f /root/.ssh/known_hosts [root@elk92 ubuntu]# [root@elk92 ubuntu]# ssh 172.17.0.8 The authenticity of host '172.17.0.8 (172.17.0.8)' can't be established. ED25519 key fingerprint is SHA256:CxYhv5m8pOco/yCgKHp2ZS8eq7QdvJyRrEg3Pp1enY4. This key is not known by any other names Are you sure you want to continue connecting (yes/no/[fingerprint])? yes Warning: Permanently added '172.17.0.8' (ED25519) to the list of known hosts. root@172.17.0.8's password: Welcome to Ubuntu 22.04.4 LTS (GNU/Linux 5.15.0-142-generic x86_64) * Documentation: https://help.ubuntu.com * Management: https://landscape.canonical.com * Support: https://ubuntu.com/pro This system has been minimized by removing packages and content that are not required on a system that users do not log into. To restore this content, you can run the 'unminimize' command. The programs included with the Ubuntu system are free software; the exact distribution terms for each program are described in the individual files in /usr/share/doc/*/copyright. Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by applicable law. 
root@8a710a1651e2:~# root@8a710a1651e2:~# ps -ef UID PID PPID C STIME TTY TIME CMD root 1 0 0 08:02 ? 00:00:00 bash -x /start.sh root 9 1 0 08:02 ? 00:00:00 sshd: /usr/sbin/sshd [listener] 0 of 10-100 startups root 14 1 0 08:02 ? 00:00:00 nginx: master process nginx -g daemon off; www-data 15 14 0 08:02 ? 00:00:00 nginx: worker process www-data 16 14 0 08:02 ? 00:00:00 nginx: worker process root 19 9 0 08:02 ? 00:00:00 sshd: root@pts/0 root 30 19 0 08:02 pts/0 00:00:00 -bash root 35 30 0 08:02 pts/0 00:00:00 ps -ef root@8a710a1651e2:~#

基于alpine构建镜像案例

bash
- 基于alpine构建镜像案例 1.编写Dockerfile [root@elk92 alpine]# cat Dockerfile FROM alpine MAINTAINER yinzhengjie@weixiang.com LABEL school=weixiang \ class=weixiang98 RUN apk add nginx openssh-server && \ ssh-keygen -A && \ sed -ri 's@#(PermitRootLogin) prohibit-password@\1 yes@' /etc/ssh/sshd_config && \ rm -rf /var/cache/ COPY start.sh / CMD ["/bin/sh","-x","/start.sh"] [root@elk92 alpine]# [root@elk92 alpine]# [root@elk92 alpine]# cat start.sh #!/bin/bash # start sshd service /usr/sbin/sshd # init root password echo root:123| chpasswd # start nginx service nginx -g 'daemon off;' [root@elk92 alpine]# [root@elk92 alpine]# 2.编译镜像 [root@elk92 alpine]# docker build -t myweb:v3.0 . # Docker会从当前目录(.)查找名为Dockerfile的文件,并根据其中的指令构建镜像 [root@elk92 alpine]# docker image ls myweb REPOSITORY TAG IMAGE ID CREATED SIZE myweb v3.0 9dc4c8d330ea 2 minutes ago 10.6MB myweb v2.0 8e1db47a5230 31 minutes ago 265MB myweb v1.1 ae3dfaae5128 About an hour ago 264MB myweb v1.0 fae885ada54e About an hour ago 264MB myweb v0.2 6c1a7d94dec6 5 hours ago 11.6MB myweb v0.1 cd48894434af 5 hours ago 11.6MB [root@elk92 alpine]# 3.测试验证 [root@elk92 alpine]# docker run -d --name c5 myweb:v3.0 d85bba9e6a53ece8eb08f3f0c85cf72dfff53cbc53d35e8644d77dc502598f91 [root@elk92 alpine]# [root@elk92 alpine]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES d85bba9e6a53 myweb:v3.0 "/bin/sh -x /start.sh" 29 seconds ago Up 28 seconds c5 [root@elk92 alpine]# [root@elk92 alpine]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" c5 172.17.0.10 [root@elk92 alpine]# [root@elk92 alpine]# curl 172.17.0.10 <html> <head><title>404 Not Found</title></head> <body> <center><h1>404 Not Found</h1></center> <hr><center>nginx</center> </body> </html> [root@elk92 alpine]# [root@elk92 alpine]# ssh 172.17.0.10 The authenticity of host '172.17.0.10 (172.17.0.10)' can't be established. ED25519 key fingerprint is SHA256:PbUaRmPAM2zaAUktZCh4HZKi+4jQnj14JK/zlORHaX4. This key is not known by any other names Are you sure you want to continue connecting (yes/no/[fingerprint])? yes Warning: Permanently added '172.17.0.10' (ED25519) to the list of known hosts. root@172.17.0.10's password: Welcome to Alpine! The Alpine Wiki contains a large amount of how-to guides and general information about administrating Alpine systems. See <https://wiki.alpinelinux.org/>. You can setup the system with the command: setup-alpine You may change this message by editing /etc/motd. 
d85bba9e6a53:~# ps -ef PID USER TIME COMMAND 1 root 0:00 /bin/sh -x /start.sh 9 root 0:00 sshd: /usr/sbin/sshd [listener] 0 of 10-100 startups 12 root 0:00 nginx: master process nginx -g daemon off; 13 nginx 0:00 nginx: worker process 14 nginx 0:00 nginx: worker process 23 root 0:00 sshd: root@pts/0 25 root 0:00 -sh 26 root 0:00 ps -ef d85bba9e6a53:~# 课堂练习之使用docker构建游戏镜像。 1.准备游戏代码 略。见视频。 2.准备Dockerfile文件 [root@elk92 01-games]# cat Dockerfile FROM myweb:v3.0 COPY games.conf /etc/nginx/http.d/default.conf #COPY weixiang-bird.tar.gz weixiang-killbird.tar.gz weixiang-tankedazhan.tar.gz / # #RUN mkdir /usr/share/nginx/html && \ # tar xf /weixiang-bird.tar.gz -C /usr/share/nginx/html && \ # tar xf /weixiang-killbird.tar.gz -C /usr/share/nginx/html && \ # tar xf /weixiang-tankedazhan.tar.gz -C /usr/share/nginx/html && \ # rm -f /weixiang-bird.tar.gz /weixiang-killbird.tar.gz /weixiang-tankedazhan.tar.gz # ADD指令可以将tar包解压并放在特定的路径。如果不是tar包,则和COPY作用相同,只用于拷贝文件到特定路径 ADD weixiang-bird.tar.gz /usr/share/nginx/html ADD weixiang-killbird.tar.gz /usr/share/nginx/html ADD weixiang-tankedazhan.tar.gz /usr/share/nginx/html [root@elk92 01-games]# [root@elk92 01-games]# [root@elk92 01-games]# cat games.conf server { listen 0.0.0.0:80; root /usr/share/nginx/html/bird/; server_name game01.weixiang.com; } server { listen 0.0.0.0:80; root /usr/share/nginx/html/killbird/; server_name game02.weixiang.com; } server { listen 0.0.0.0:80; root /usr/share/nginx/html/tankedazhan/; server_name game03.weixiang.com; } [root@elk92 01-games]# 3.编译镜像 [root@elk92 01-games]# docker build -t myweb:v3.7 . [root@elk92 01-games]# docker image ls myweb REPOSITORY TAG IMAGE ID CREATED SIZE myweb v3.7 26c60729ce8a 4 minutes ago 28.1MB myweb v3.6 6078aaa4287a 7 minutes ago 42.9MB myweb v3.5 3b10924f6415 10 minutes ago 42.9MB myweb v3.4 4dc24061e7c5 14 minutes ago 13MB myweb v3.3 959502b93c4e 16 minutes ago 13MB myweb v3.2 e08cb9855424 18 minutes ago 13MB myweb v3.1 4cda27d6de2c 21 minutes ago 13MB myweb v3.0 9dc4c8d330ea 36 minutes ago 10.6MB myweb v2.0 8e1db47a5230 About an hour ago 265MB myweb v1.1 ae3dfaae5128 2 hours ago 264MB myweb v1.0 fae885ada54e 2 hours ago 264MB myweb v0.2 6c1a7d94dec6 5 hours ago 11.6MB myweb v0.1 cd48894434af 5 hours ago 11.6MB [root@elk92 01-games]# 4.测试验证 [root@elk92 01-games]# docker run -dp 83:80 --name c37 myweb:v3.7 5.windows添加解析并访问测试 10.0.0.92 game01.weixiang.com game02.weixiang.com game03.weixiang.com
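补充:如果暂时不方便修改windows的hosts文件,也可以直接在Linux上用curl验证基于域名的虚拟主机是否正常(10.0.0.92:83 沿用上面的端口映射):

bash
# 方式一:手动指定Host请求头
curl -H 'Host: game01.weixiang.com' http://10.0.0.92:83/

# 方式二:--resolve 把域名临时解析到指定IP,不依赖hosts文件
curl --resolve game02.weixiang.com:83:10.0.0.92 http://game02.weixiang.com:83/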
2、ENV环境变量实战案例之自定义root密码
bash
# 1.编写Dockerfile [root@elk92 02-games-new]# cat Dockerfile FROM myweb:v3.0 # 指定基础镜像为myweb:v3.0 COPY games.conf /etc/nginx/http.d/default.conf # 拷贝当前目录下的games.conf到/etc/nginx/http.d/default.conf,目录 ADD code/beiJiXiong.tar.gz /usr/share/nginx/html/beijixiong # ADD如果tar.gz包那就解压后删除 ADD code/haiDao.tar.gz /usr/share/nginx/html/haidao ADD code/huaXiangJi.tar.gz /usr/share/nginx/html/huaxiangji ADD code/qiZhuan.tar.gz /usr/share/nginx/html/qizhuan ADD code/saiChe.tar.gz /usr/share/nginx/html/saiche ADD start.sh / # 执行脚本 ENV SCHOOL=weixiang \ # 向容器传递环境变量,但是该变量可以在容器运行时使用-e选项覆盖! CLASS=weixiang98 \ ROOT_INIT_PASSWD=yinzhengjie # 传参密码为yinzhengjie [root@elk92 02-games-new]# # 2.编写nginx配置文件 [root@elk92 02-games-new]# cat games.conf server { listen 0.0.0.0:80; root /usr/share/nginx/html/beijixiong/; server_name game01.weixiang.com; } server { listen 0.0.0.0:80; root /usr/share/nginx/html/haidao/; server_name game02.weixiang.com; } server { listen 0.0.0.0:80; root /usr/share/nginx/html/huaxiangji/; server_name game03.weixiang.com; } server { listen 0.0.0.0:80; root /usr/share/nginx/html/qizhuan/; server_name game04.weixiang.com; } server { listen 0.0.0.0:80; root /usr/share/nginx/html/saiche/; server_name game05.weixiang.com; } # 3.编写执行脚本 [root@elk92 02-games-new]# cat build.sh #!/bin/bash VERSION=${1:-9} # 不指定参数,默认是9 docker build -t myweb:v3.${VERSION} . # 构建镜像 docker rm -f `docker ps -aq` # 强制删除所有容器 docker run -d --name games -p 81:80 myweb:v3.${VERSION} # 运行容器games docker run -d --name games02 -e ROOT_INIT_PASSWD=laonanhai -p 82:80 myweb:v3.${VERSION} # # 运行容器games02并指定密码为laonanhai docker ps -a # 查看所有容器 docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" games docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" games02 [root@elk91 02-games-new]# vim start.sh 1 #!/bin/bash 2 4 /usr/sbin/sshd # 启动SSH守护进程,允许远程登录容器 5 7 if [ -n "$1" ]; # 如果启动脚本传入了参数,如mypassword 8 then 9 echo root:$1 | chpasswd # 那么就使用mypassword作为密码传入 10 elif [ -n "$ROOT_INIT_PASSWD" ]; # 如果存在环境变量ROOT_INIT_PASSWD 11 then 12 echo root:${ROOT_INIT_PASSWD}| chpasswd # 那么就使用环境变量的值作为密码传入 13 else 14 echo root:123| chpasswd # 否则默认密码 "123" 15 fi 16 # 优先级: 手动参数 > 环境变量 ROOT_INIT_PASSWD > 默认密码 123 17 # start nginx service 18 nginx -g 'daemon off;' # 强制 Nginx 在前台运行(容器需要前台进程防止退出)。 # 4.启动执行脚本 [root@elk91 02-games-new]# ./build.sh Sending build context to Docker daemon 154.2MB Step 1/9 : FROM myweb:v3.0 ---> 5c488db4198a Step 2/9 : COPY games.conf /etc/nginx/http.d/default.conf ---> 09a3119360ae Step 3/9 : ADD code/beiJiXiong.tar.gz /usr/share/nginx/html/beijixiong ---> cb31e287cf6e Step 4/9 : ADD code/haiDao.tar.gz /usr/share/nginx/html/haidao ---> c21f581ceb9b Step 5/9 : ADD code/huaXiangJi.tar.gz /usr/share/nginx/html/huaxiangji ---> 0302b1971428 Step 6/9 : ADD code/qiZhuan.tar.gz /usr/share/nginx/html/qizhuan ---> 7061d7aea82a Step 7/9 : ADD code/saiChe.tar.gz /usr/share/nginx/html/saiche ---> b63ae0db62f4 Step 8/9 : ADD start.sh / ---> 45cab762d6a0 Step 9/9 : ENV SCHOOL=weixiang CLASS=weixiang98 ROOT_INIT_PASSWD=yinzhengjie ---> Running in cab48bd93397 Removing intermediate container cab48bd93397 ---> c18d5d1b5a28 Successfully built c18d5d1b5a28 Successfully tagged myweb:v3.9 "docker rm" requires at least 1 argument. See 'docker rm --help'. Usage: docker rm [OPTIONS] CONTAINER [CONTAINER...] 
Remove one or more containers 192cfbeea3e55e4ede0d983def5b551ea2f4ec88c44c8b91f909dfeaf9d3d127 116d4c40e1e18a6979dadc45bc74ccf67519fa2869a0b5952364f359c9e10c4b CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 116d4c40e1e1 myweb:v3.9 "/bin/sh -x /start.sh" 1 second ago Up Less than a second 0.0.0.0:82->80/tcp, :::82->80/tcp games02 192cfbeea3e5 myweb:v3.9 "/bin/sh -x /start.sh" 2 seconds ago Up Less than a second 0.0.0.0:81->80/tcp, :::81->80/tcp games 172.17.0.2 172.17.0.3 [root@elk91 02-games-new]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 116d4c40e1e1 myweb:v3.9 "/bin/sh -x /start.sh" 7 seconds ago Up 5 seconds 0.0.0.0:82->80/tcp, :::82->80/tcp games02 192cfbeea3e5 myweb:v3.9 "/bin/sh -x /start.sh" 8 seconds ago Up 6 seconds 0.0.0.0:81->80/tcp, :::81->80/tcp games # 登陆测试密码

image

image
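补充:除了登录测试密码,也可以直接对比两个容器的环境变量,确认 -e 选项确实覆盖了Dockerfile中ENV定义的默认值:

bash
# games 启动时未指定 -e,应输出Dockerfile中的默认值 yinzhengjie
docker exec games env | grep ROOT_INIT_PASSWD

# games02 启动时指定了 -e ROOT_INIT_PASSWD=laonanhai,应输出 laonanhai
docker exec games02 env | grep ROOT_INIT_PASSWD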

3、docker容器无法访问时可能遇到的错误总结
bash
- # 1.内核参数未开启
[root@elk92 02-games-new]# sysctl -q net.ipv4.ip_forward
net.ipv4.ip_forward = 1

- # 2.检查宿主机的路由【如果有问题重启docker服务】
[root@elk92 02-games-new]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.0.254      0.0.0.0         UG    0      0        0 eth0
10.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
[root@elk92 02-games-new]#

- # 3.docker容器未启动
[root@elk92 02-games-new]# docker ps -a
CONTAINER ID   IMAGE        COMMAND                  CREATED          STATUS          PORTS                               NAMES
e1a939f1802c   myweb:v3.9   "/bin/sh -x /start.sh"   44 minutes ago   Up 44 minutes   0.0.0.0:82->80/tcp, :::82->80/tcp   games02
3a60f702bf5f   myweb:v3.9   "/bin/sh -x /start.sh"   44 minutes ago   Up 44 minutes   0.0.0.0:81->80/tcp, :::81->80/tcp   games

- # 4.docker容器没有配置端口映射
查看docker ps输出的PORTS列。

- # 5.iptables规则有问题
可能会缺失某个链,重启docker服务后会自动重置这些链和相应的路由表。

- # 6.windows开启了代理
尤其是基于域名的方式访问,走代理可能无法访问!

- # 7.windows没有配置解析
修改windows的hosts文件。

上述检查点可以串成一个小脚本快速过一遍,见下方示例。

温馨提示:
生产环境中尽量少重启服务!!!
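补充:下面是把上述检查点串起来的一个排查小脚本,仅为思路示意(容器名通过参数传入,默认值 games 为示例):

bash
#!/bin/bash
# usage: bash check.sh <容器名>
NAME=${1:-games}

echo '==> 1.内核转发参数';        sysctl -n net.ipv4.ip_forward
echo '==> 2.宿主机路由';          route -n | grep docker0
echo '==> 3.容器是否运行';        docker ps --filter "name=${NAME}"
echo '==> 4.容器端口映射';        docker port ${NAME}
echo '==> 5.iptables的DOCKER链';  iptables -t nat -nL DOCKER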
4、ARG和ENV的区别实战案例
bash
1.编写Dockerfile
[root@elk92 02-games-new]# cat Dockerfile
FROM myweb:v3.0

COPY games.conf /etc/nginx/http.d/default.conf

ADD code/beiJiXiong.tar.gz /usr/share/nginx/html/beijixiong
ADD code/haiDao.tar.gz /usr/share/nginx/html/haidao
ADD code/huaXiangJi.tar.gz /usr/share/nginx/html/huaxiangji
ADD code/qiZhuan.tar.gz /usr/share/nginx/html/qizhuan
ADD code/saiChe.tar.gz /usr/share/nginx/html/saiche

ADD start.sh /

# 向容器传递环境变量,但是该变量可以在容器运行时使用-e选项覆盖!
ENV SCHOOL=weixiang \
    CLASS=weixiang98 \
    ROOT_INIT_PASSWD=yinzhengjie

# 向容器传递变量,但是仅限于构建阶段有效,容器启动阶段则无效。
ARG fanRen=HanLi \
    xianNi=WangLin

RUN mkdir /weixiang/ && \
    touch /weixiang/${fanRen} && \
    touch /weixiang/${xianNi} && \
    # 引用上面ENV的变量
    touch /weixiang/${SCHOOL} && \
    # 引用上面ENV的变量
    touch /weixiang/${CLASS}

# 查看环境变量
[root@elk91 02-games-new]# docker exec -it games02 env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=688dab63a94c
TERM=xterm
ROOT_INIT_PASSWD=laonanhai
SCHOOL=weixiang
CLASS=weixiang98
HOME=/root

# 查看目录
[root@elk91 02-games-new]# docker exec -it games02 ls -l /weixiang
total 0
-rw-r--r--    1 root     root             0 Jul  5 07:44 HanLi
-rw-r--r--    1 root     root             0 Jul  5 07:44 WangLin
-rw-r--r--    1 root     root             0 Jul  5 07:44 weixiang98
-rw-r--r--    1 root     root             0 Jul  5 07:44 weixiang

# 总结: 如果想让变量在编译(构建)阶段生效就用ARG,如果想让变量在容器运行阶段生效就用ENV。

ARG和ENV的区别
相同点:
    都是可以用来传递变量的。
区别:
    - 1.ARG在构建阶段可以被--build-arg替换,其生命周期仅限于构建阶段,容器运行时不可见。
    - 2.ENV在构建阶段不可以被替换,但在容器运行时可以用-e选项替换,容器运行后变量依旧有效;
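补充:可以分别用 --build-arg 和 -e 验证两者的覆盖时机,下面是一个演示思路的示例(沿用上面Dockerfile里的 fanRen 和 SCHOOL 变量,标签 v3.10、容器名 arg-test / env-test 以及取值均为假设的示例):

bash
# 构建阶段覆盖ARG:/weixiang目录下生成的文件名会从HanLi变成XuChangQing
docker build --build-arg fanRen=XuChangQing -t myweb:v3.10 .
docker run -d --name arg-test myweb:v3.10
docker exec arg-test ls -l /weixiang

# 运行阶段覆盖ENV:容器内SCHOOL变量被替换,但镜像本身不变
docker run -d --name env-test -e SCHOOL=beijing myweb:v3.10
docker exec env-test env | grep SCHOOL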
5、ENTRYPOINT和CMD区别验证
bash
# 查看父镜像的cmd指令 [root@elk91 02-games-new]# docker history --no-trunc myweb:v3.0 IMAGE CREATED CREATED BY SIZE COMMENT sha256:5c488db4198aee786162ac4935a68722ef8941a88bb8daa6bd123df0c42c7f78 2 days ago /bin/sh -c #(nop) CMD ["/bin/sh" "-x" "/start.sh"] # 1.编写Dockerfile [root@elk92 02-games-new]# cat Dockerfile FROM myweb:v3.0 COPY games.conf /etc/nginx/http.d/default.conf ADD code/beiJiXiong.tar.gz /usr/share/nginx/html/beijixiong ADD code/haiDao.tar.gz /usr/share/nginx/html/haidao ADD code/huaXiangJi.tar.gz /usr/share/nginx/html/huaxiangji ADD code/qiZhuan.tar.gz /usr/share/nginx/html/qizhuan ADD code/saiChe.tar.gz /usr/share/nginx/html/saiche ADD start.sh / # 向容器传递环境变量,但是该变量可以在容器运行时使用-e选项覆盖! ENV SCHOOL=weixiang \ CLASS=weixiang98 \ ROOT_INIT_PASSWD=yinzhengjie # 向容器传递环境变量,但是仅限于构建阶段有效,容器启动阶段则无效。 ARG fanRen=HanLi \ xianNi=WangLin RUN mkdir /weixiang/ && \ touch /weixiang/${fanRen} && \ touch /weixiang/${xianNi} && \ touch /weixiang/${SCHOOL} && \ touch /weixiang/${CLASS} # 指定容器的启动命令,若没有指定,则继承基础镜像的CMD指令。 # 该启动命令在容器运行时可以被覆盖 CMD ["sleep","10d"] # 可以覆盖基础镜像的指令 docker run -d --name c1 myweb:v3.11 sleep 20d # 2. 修改 Dockerfile [root@elk92 02-games-new]# cat Dockerfile FROM myweb:v3.0 COPY games.conf /etc/nginx/http.d/default.conf ADD code/beiJiXiong.tar.gz /usr/share/nginx/html/beijixiong ADD code/haiDao.tar.gz /usr/share/nginx/html/haidao ADD code/huaXiangJi.tar.gz /usr/share/nginx/html/huaxiangji ADD code/qiZhuan.tar.gz /usr/share/nginx/html/qizhuan ADD code/saiChe.tar.gz /usr/share/nginx/html/saiche ADD start.sh / # 向容器传递环境变量,但是该变量可以在容器运行时使用-e选项覆盖! ENV SCHOOL=weixiang \ CLASS=weixiang98 \ ROOT_INIT_PASSWD=yinzhengjie # 向容器传递环境变量,但是仅限于构建阶段有效,容器启动阶段则无效。 ARG fanRen=HanLi \ xianNi=WangLin RUN mkdir /weixiang/ && \ touch /weixiang/${fanRen} && \ touch /weixiang/${xianNi} && \ touch /weixiang/${SCHOOL} && \ touch /weixiang/${CLASS} # 指定容器的启动名,该命令不可用在容器运行时进行替换。 ENTRYPOINT ["sleep""10d"] # 编译镜像后可以看到sleep 10d的指令 [root@elk91 02-games-new]# ./build.sh 12

image

bash
# 运行run,追加的sleep 20d以参数的方式传递给容器 [root@elk91 02-games-new]# docker run -d --name c1 myweb:v3.12 sleep 20d

image

bash
注意: # 如果ENTRYPOINT和CMD指令搭配使用,则CMD将作为参数传递给ENTRYPOINT指令 ENTRYPOINT ["tail"] CMD ["-f","/etc/hosts"] 1、上面这样编译后的指令就会是tail -f /etc/hosts, 2、如果 docker run -d --name c1 myweb:v3.14 -F /etc/hostnane,指令就变成tail -F /etc/hosts ENTRYPOINT ["/bin/sh","-x","/start.sh"] CMD ["weixiang98"] 1、 这样编译后的指令是/bin/sh -x /start.sh weixiang98,weixiang98以参数的方式传进去了,如果指定参数将覆盖weixiang98 # 编写脚本 [root@elk91 02-games-new]# vim start.sh 1 #!/bin/bash 2 4 /usr/sbin/sshd # 启动SSH守护进程,允许远程登录容器 5 7 if [ -n "$1" ]; # 检查是否存在第一个位置参数,下方传入了run命令传入了xixi 8 then 9 echo root:$1 | chpasswd # 使用第一个位置参数作为密码传入 10 elif [ -n "$ROOT_INIT_PASSWD" ]; # 如果存在环境变量ROOT_INIT_PASSWD 11 then 12 echo root:${ROOT_INIT_PASSWD}| chpasswd # 那么就使用环境变量的值作为密码传入 13 else 14 echo root:123| chpasswd # 否则默认密码 "123" 15 fi 16 # 优先级: 手动参数 > 环境变量 ROOT_INIT_PASSWD > 默认密码 123 17 # start nginx service 18 nginx -g 'daemon off;' # 强制 Nginx 在前台运行(容器需要前台进程防止退出)。 [root@elk92 02-games-new]# 2.编译镜像 [root@elk92 02-games-new]# docker build -t myweb:v3.15 . 3.测试验证 [root@elk92 02-games-new]# docker run -d --name c2 myweb:v3.15 xixi # 这里是把xixi以变量的方式传递进去,被脚本读到 8d025967357bb5f351f2b873ba4160628beb9e2a1be8503d54db8759a9cce25c [root@elk92 02-games-new]# [root@elk92 02-games-new]# docker ps -l --no-trunc CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 8d025967357bb5f351f2b873ba4160628beb9e2a1be8503d54db8759a9cce25c myweb:v3.15 "/bin/sh -x /start.sh xixi" 38 seconds ago Up 37 seconds c2 [root@elk92 02-games-new]# [root@elk92 02-games-new]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" c2 172.17.0.5 [root@elk92 02-games-new]# [root@elk92 02-games-new]# ssh 172.17.0.5 The authenticity of host '172.17.0.5 (172.17.0.5)' can't be established. ED25519 key fingerprint is SHA256:PbUaRmPAM2zaAUktZCh4HZKi+4jQnj14JK/zlORHaX4. This host key is known by the following other names/addresses: ~/.ssh/known_hosts:7: [hashed name] ~/.ssh/known_hosts:10: [hashed name] ~/.ssh/known_hosts:11: [hashed name] ~/.ssh/known_hosts:12: [hashed name] Are you sure you want to continue connecting (yes/no/[fingerprint])? yes Warning: Permanently added '172.17.0.5' (ED25519) to the list of known hosts. root@172.17.0.5's password: # 注意,密码应该是'xixi' Welcome to Alpine! The Alpine Wiki contains a large amount of how-to guides and general information about administrating Alpine systems. See <https://wiki.alpinelinux.org/>. You can setup the system with the command: setup-alpine You may change this message by editing /etc/motd. 8d025967357b:~#
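补充:CMD 直接在 docker run 后面追加参数即可覆盖,而 ENTRYPOINT 需要使用 --entrypoint 选项才能在运行时覆盖,示意如下(镜像标签沿用上面的 v3.15,容器名 c3、c4 为示例):

bash
# 只覆盖CMD:相当于执行 /bin/sh -x /start.sh haha,haha作为参数传给脚本,即root密码
docker run -d --name c3 myweb:v3.15 haha

# 覆盖ENTRYPOINT:容器启动命令整体换成 sleep 30d,原来的脚本不再执行
docker run -d --name c4 --entrypoint sleep myweb:v3.15 30d
docker ps -l --no-trunc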
6、EXPOSE端口暴露案例
bash
1.编写Dockerfile [root@elk92 02-games-new]# cat Dockerfile FROM myweb:v3.0 COPY games.conf /etc/nginx/http.d/default.conf ADD code/beiJiXiong.tar.gz /usr/share/nginx/html/beijixiong ADD code/haiDao.tar.gz /usr/share/nginx/html/haidao ADD code/huaXiangJi.tar.gz /usr/share/nginx/html/huaxiangji ADD code/qiZhuan.tar.gz /usr/share/nginx/html/qizhuan ADD code/saiChe.tar.gz /usr/share/nginx/html/saiche ADD start.sh / # 向容器传递环境变量,但是该变量可以在容器运行时使用-e选项覆盖! ENV SCHOOL=weixiang \ CLASS=weixiang98 \ ROOT_INIT_PASSWD=yinzhengjie # 向容器传递环境变量,但是仅限于构建阶段有效,容器启动阶段则无效。 ARG fanRen=HanLi \ xianNi=WangLin RUN mkdir /weixiang/ && \ touch /weixiang/${fanRen} && \ touch /weixiang/${xianNi} && \ touch /weixiang/${SCHOOL} && \ touch /weixiang/${CLASS} # 对外暴露容器的服务端口,默认就是tcp协议,因此可以不指定,也可以指定udp,sctp协议。 EXPOSE 80/tcp 22/tcp # 指定容器的启动命令,若没有指定,则继承基础镜像的CMD指令。 # 该启动命令在容器运行时可以被覆盖 # CMD ["sleep","10d"] # 指定容器的启动名,该命令不可用在容器运行时进行替换。 # 如果ENTRYPOINT和CMD指令搭配使用,则CMD将作为参数传递给ENTRYPOINT指令 #ENTRYPOINT ["tail"] #CMD ["-f","/etc/hosts"] ENTRYPOINT ["/bin/sh","-x","/start.sh"] CMD ["weixiang98"] [root@elk92 02-games-new]# 2.编译 [root@elk92 02-games-new]# cat build.sh #!/bin/bash VERSION=${1:-16} docker build -t myweb:v3.${VERSION} . docker rm -f `docker ps -aq` # docker run -d --name games -p 81:80 myweb:v3.${VERSION} # docker run -d --name games02 -e ROOT_INIT_PASSWD=laonanhai -p 82:80 myweb:v3.${VERSION} docker run -dP --name games myweb:v3.${VERSION} docker ps -a docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" games # docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" games02 [root@elk92 02-games-new]# [root@elk92 02-games-new]# ./build.sh [root@elk92 02-games-new]# docker ps -l --no-trunc CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES c8734104a2b497ffb5ead61ee7522e4452873a2ef5fec1943d816e8befc55fc7 myweb:v3.16 "/bin/sh -x /start.sh weixiang98" 43 seconds ago Up 42 seconds 0.0.0.0:32769->22/tcp, :::32769->22/tcp, 0.0.0.0:32768->80/tcp, :::32768->80/tcp games [root@elk92 02-games-new]# [root@elk92 02-games-new]# docker container inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" games 172.17.0.2 [root@elk92 02-games-new]# 3.测试运行 [root@elk93 ~]# ssh 10.0.0.92 -p 32769 The authenticity of host '[10.0.0.92]:32769 ([10.0.0.92]:32769)' can't be established. ED25519 key fingerprint is SHA256:PbUaRmPAM2zaAUktZCh4HZKi+4jQnj14JK/zlORHaX4. This key is not known by any other names Are you sure you want to continue connecting (yes/no/[fingerprint])? yes Warning: Permanently added '[10.0.0.92]:32769' (ED25519) to the list of known hosts. root@10.0.0.92's password: Welcome to Alpine! The Alpine Wiki contains a large amount of how-to guides and general information about administrating Alpine systems. See <https://wiki.alpinelinux.org/>. You can setup the system with the command: setup-alpine You may change this message by editing /etc/motd. 
c8734104a2b4:~# ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:02 inet addr:172.17.0.2 Bcast:172.17.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:57 errors:0 dropped:0 overruns:0 frame:0 TX packets:29 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:8238 (8.0 KiB) TX bytes:5727 (5.5 KiB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) c8734104a2b4:~#
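补充:EXPOSE 本身只是声明(元数据),并不会真正把端口发布到宿主机;真正的发布依赖 docker run 的 -p/-P 选项。可以用 docker port 快速查看随机映射的结果(下面命令沿用本节的镜像和容器名):

bash
# -P 会把镜像中EXPOSE声明的所有端口随机映射到宿主机高位端口
docker run -dP --name games myweb:v3.16

# 查看容器的端口映射关系(等价于docker ps里的PORTS列)
docker port games

# -p 则是手动指定映射,与是否写EXPOSE无关
docker run -d --name games02 -p 88:80 myweb:v3.16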
7、WORKDIR指定进入容器默认的工作目录
bash
1.编写Dockerfile指令 [root@elk92 02-games-new]# cat Dockerfile FROM myweb:v3.0 COPY games.conf /etc/nginx/http.d/default.conf ADD code/beiJiXiong.tar.gz /usr/share/nginx/html/beijixiong ADD code/haiDao.tar.gz /usr/share/nginx/html/haidao ADD code/huaXiangJi.tar.gz /usr/share/nginx/html/huaxiangji ADD code/qiZhuan.tar.gz /usr/share/nginx/html/qizhuan ADD code/saiChe.tar.gz /usr/share/nginx/html/saiche ADD start.sh / # 向容器传递环境变量,但是该变量可以在容器运行时使用-e选项覆盖! ENV SCHOOL=weixiang \ CLASS=weixiang98 \ ROOT_INIT_PASSWD=yinzhengjie # 向容器传递环境变量,但是仅限于构建阶段有效,容器启动阶段则无效。 ARG fanRen=HanLi \ xianNi=WangLin RUN mkdir /weixiang/ && \ touch /weixiang/${fanRen} && \ touch /weixiang/${xianNi} && \ touch /weixiang/${SCHOOL} && \ touch /weixiang/${CLASS} # 对外暴露容器的服务端口,默认就是tcp协议,因此可以不指定,也可以指定udp,sctp协议。 EXPOSE 80/tcp 22/tcp # 指定容器的默认工作目录 WORKDIR /etc/nginx/ # 指定容器的启动命令,若没有指定,则继承基础镜像的CMD指令。 # 该启动命令在容器运行时可以被覆盖 # CMD ["sleep","10d"] # 指定容器的启动名,该命令不可用在容器运行时进行替换。 # 如果ENTRYPOINT和CMD指令搭配使用,则CMD将作为参数传递给ENTRYPOINT指令 #ENTRYPOINT ["tail"] #CMD ["-f","/etc/hosts"] ENTRYPOINT ["/bin/sh","-x","/start.sh"] CMD ["weixiang98"] [root@elk92 02-games-new]# [root@elk92 02-games-new]# ./build.sh 17 2.运行容器时可以指定工作目录 [root@elk92 02-games-new]# docker exec -it games sh /etc/nginx # ls -l total 36 -rw-r--r-- 1 root root 1077 Mar 16 18:49 fastcgi.conf -rw-r--r-- 1 root root 1007 Mar 16 18:49 fastcgi_params drwxr-xr-x 1 root root 4096 Jul 3 01:11 http.d -rw-r--r-- 1 root root 5349 Mar 16 18:49 mime.types drwxr-xr-x 2 root root 4096 Jul 2 08:30 modules -rw-r--r-- 1 root root 3214 Mar 16 18:49 nginx.conf -rw-r--r-- 1 root root 636 Mar 16 18:49 scgi_params -rw-r--r-- 1 root root 664 Mar 16 18:49 uwsgi_params /etc/nginx # /etc/nginx # pwd /etc/nginx /etc/nginx # /etc/nginx # [root@elk92 02-games-new]# [root@elk92 02-games-new]# docker exec -it -w /usr/share/nginx/html games sh /usr/share/nginx/html # ls -l total 20 drwxr-xr-x 5 root root 4096 Jul 3 01:11 beijixiong drwxr-xr-x 4 root root 4096 Jul 3 01:11 haidao drwxr-xr-x 4 root root 4096 Jul 3 01:11 huaxiangji drwxr-xr-x 4 root root 4096 Jul 3 01:11 qizhuan drwxr-xr-x 6 root root 4096 Jul 3 01:11 saiche /usr/share/nginx/html # /usr/share/nginx/html # pwd /usr/share/nginx/html /usr/share/nginx/html #
8、VOLUME指令对容器指定路径数据持久化
bash
1.编写Dockerfile [root@elk92 02-games-new]# cat Dockerfile FROM myweb:v3.0 COPY games.conf /etc/nginx/http.d/default.conf ADD code/beiJiXiong.tar.gz /usr/share/nginx/html/beijixiong ADD code/haiDao.tar.gz /usr/share/nginx/html/haidao ADD code/huaXiangJi.tar.gz /usr/share/nginx/html/huaxiangji ADD code/qiZhuan.tar.gz /usr/share/nginx/html/qizhuan ADD code/saiChe.tar.gz /usr/share/nginx/html/saiche ADD start.sh / # 向容器传递环境变量,但是该变量可以在容器运行时使用-e选项覆盖! ENV SCHOOL=weixiang \ CLASS=weixiang98 \ ROOT_INIT_PASSWD=yinzhengjie # 向容器传递环境变量,但是仅限于构建阶段有效,容器启动阶段则无效。 ARG fanRen=HanLi \ xianNi=WangLin RUN mkdir /weixiang/ && \ touch /weixiang/${fanRen} && \ touch /weixiang/${xianNi} && \ touch /weixiang/${SCHOOL} && \ touch /weixiang/${CLASS} # 对外暴露容器的服务端口,默认就是tcp协议,因此可以不指定,也可以指定udp,sctp协议。 EXPOSE 80/tcp 22/tcp # 指定容器的默认工作目录 WORKDIR /etc/nginx/ # 指定容器的匿名存储卷,相当于启动容器时执行了 "docker run -v /usr/share/nginx/html ..." VOLUME /usr/share/nginx/html # 指定容器的启动命令,若没有指定,则继承基础镜像的CMD指令。 # 该启动命令在容器运行时可以被覆盖 # CMD ["sleep","10d"] # 指定容器的启动名,该命令不可用在容器运行时进行替换。 # 如果ENTRYPOINT和CMD指令搭配使用,则CMD将作为参数传递给ENTRYPOINT指令 #ENTRYPOINT ["tail"] #CMD ["-f","/etc/hosts"] ENTRYPOINT ["/bin/sh","-x","/start.sh"] CMD ["weixiang98"] [root@elk92 02-games-new]# ./build.sh 18 2.测试验证 [root@elk92 02-games-new]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 05a01bf8249a myweb:v3.18 "/bin/sh -x /start.s…" 2 minutes ago Up 2 minutes 0.0.0.0:32773->22/tcp, :::32773->22/tcp, 0.0.0.0:32772->80/tcp, :::32772->80/tcp games [root@elk92 02-games-new]# # 查看存储券的VOLUMENAME [root@elk92 02-games-new]# docker inspect -f "{{range .Mounts}}{{.Name}}---{{.Source}}{{end}}" games 08a02acc04915ae523becc6fae9283fa0d8015a23384ebf23f32016a3827a95b---/var/lib/docker/volumes/08a02acc04915ae523becc6fae9283fa0d8015a23384ebf23f32016a3827a95b/_data # 查看存储券 [root@elk92 02-games-new]# docker volume ls DRIVER VOLUME NAME local 08a02acc04915ae523becc6fae9283fa0d8015a23384ebf23f32016a3827a95b [root@elk92 02-games-new]# # 查看指定Docker卷的详细信息。 [root@elk92 02-games-new]# docker volume inspect 08a02acc04915ae523becc6fae9283fa0d8015a23384ebf23f32016a3827a95b [ { "CreatedAt": "2025-07-03T11:48:53+08:00", "Driver": "local", "Labels": null, "Mountpoint": "/var/lib/docker/volumes/08a02acc04915ae523becc6fae9283fa0d8015a23384ebf23f32016a3827a95b/_data", "Name": "08a02acc04915ae523becc6fae9283fa0d8015a23384ebf23f32016a3827a95b", "Options": null, "Scope": "local" } ] [root@elk92 02-games-new]# [root@elk92 02-games-new]# ls -l /var/lib/docker/volumes/08a02acc04915ae523becc6fae9283fa0d8015a23384ebf23f32016a3827a95b/_data total 20 drwxr-xr-x 5 root root 4096 Jul 3 11:48 beijixiong drwxr-xr-x 4 root root 4096 Jul 3 11:48 haidao drwxr-xr-x 4 root root 4096 Jul 3 11:48 huaxiangji drwxr-xr-x 4 root root 4096 Jul 3 11:48 qizhuan drwxr-xr-x 6 root root 4096 Jul 3 11:48 saiche # 删除所有的容器 [root@elk92 02-games-new]# docker container rm -f `docker container ps -qa` 05a01bf8249a [root@elk92 02-games-new]# [root@elk92 02-games-new]# ls -l /var/lib/docker/volumes/08a02acc04915ae523becc6fae9283fa0d8015a23384ebf23f32016a3827a95b/_data # 可以保证数据不丢失!!! total 20 drwxr-xr-x 5 root root 4096 Jul 3 11:48 beijixiong drwxr-xr-x 4 root root 4096 Jul 3 11:48 haidao drwxr-xr-x 4 root root 4096 Jul 3 11:48 huaxiangji drwxr-xr-x 4 root root 4096 Jul 3 11:48 qizhuan drwxr-xr-x 6 root root 4096 Jul 3 11:48 saiche
9、HEALTHCHECK检查容器服务是否健康及日志输出到docker logs
bash
1.编写Dockerfile [root@elk92 02-games-new]# cat Dockerfile FROM myweb:v3.0 COPY games.conf /etc/nginx/http.d/default.conf ADD code/beiJiXiong.tar.gz /usr/share/nginx/html/beijixiong ADD code/haiDao.tar.gz /usr/share/nginx/html/haidao ADD code/huaXiangJi.tar.gz /usr/share/nginx/html/huaxiangji ADD code/qiZhuan.tar.gz /usr/share/nginx/html/qizhuan ADD code/saiChe.tar.gz /usr/share/nginx/html/saiche ADD start.sh / # 向容器传递环境变量,但是该变量可以在容器运行时使用-e选项覆盖! ENV SCHOOL=weixiang \ CLASS=weixiang98 \ ROOT_INIT_PASSWD=yinzhengjie # 向容器传递环境变量,但是仅限于构建阶段有效,容器启动阶段则无效。 ARG fanRen=HanLi \ xianNi=WangLin RUN mkdir /weixiang/ && \ touch /weixiang/${fanRen} && \ touch /weixiang/${xianNi} && \ touch /weixiang/${SCHOOL} && \ touch /weixiang/${CLASS} # 对外暴露容器的服务端口,默认就是tcp协议,因此可以不指定,也可以指定udp,sctp协议。 EXPOSE 80/tcp 22/tcp # 指定容器的默认工作目录 WORKDIR /etc/nginx/ # 指定容器的匿名存储卷,相当于启动容器时执行了 "docker run -v /usr/share/nginx/html ..." VOLUME /usr/share/nginx/html # 将日志输出到控制台,以便于使用docker logs命令进行查看。/dev/stdout表示白标准输出 RUN ln -svf /dev/stdout /var/log/nginx/access.log && \ ln -svf /dev/stderr /var/log/nginx/error.log # 周期性检查服务是否健康,如果不健康则将服务标记为不健康状态。 # --interval: 间隔多长时间检查一次。 # --timeout: 检测的超时时间。 HEALTHCHECK --interval=3s --timeout=1s \ CMD wget http://localhost/ || exit 1 # 指定容器的启动命令,若没有指定,则继承基础镜像的CMD指令。 # 该启动命令在容器运行时可以被覆盖 # CMD ["sleep","10d"] # 指定容器的启动名,该命令不可用在容器运行时进行替换。 # 如果ENTRYPOINT和CMD指令搭配使用,则CMD将作为参数传递给ENTRYPOINT指令 #ENTRYPOINT ["tail"] #CMD ["-f","/etc/hosts"] ENTRYPOINT ["/bin/sh","-x","/start.sh"] CMD ["weixiang98"] [root@elk92 02-games-new]# [root@elk92 02-games-new]# 2.测试验证 [root@elk92 02-games-new]# ./build.sh 19 [root@elk92 02-games-new]# docker logs -f games [root@elk92 02-games-new]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 49679bd724dc myweb:v3.19 "/bin/sh -x /start.s…" 3 seconds ago Up 2 seconds (health: starting) 0.0.0.0:32783->22/tcp, :::32783->22/tcp, 0.0.0.0:32782->80/tcp, :::32782->80/tcp games [root@elk92 02-games-new]# [root@elk92 02-games-new]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 49679bd724dc myweb:v3.19 "/bin/sh -x /start.s…" 13 seconds ago Up 11 seconds (healthy) 0.0.0.0:32783->22/tcp, :::32783->22/tcp, 0.0.0.0:32782->80/tcp, :::32782->80/tcp games [root@elk92 02-games-new]# 温馨提示: 可以修改nginx的监听端口并热加载,可以观察到就处于不健康状态啦~ [root@elk92 02-games-new]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 49679bd724dc myweb:v3.19 "/bin/sh -x /start.s…" 2 minutes ago Up 2 minutes (unhealthy) 0.0.0.0:32783->22/tcp, :::32783->22/tcp, 0.0.0.0:32782->80/tcp, :::32782->80/tcp games [root@elk92 02-games-new]# # 没做日志文件的软连接之前是看不到日志的 [root@elk91 02-games-new]# docker logs -f games + /usr/sbin/sshd + '[' -n weixiang98 ] + chpasswd + echo root:weixiang98 chpasswd: password for 'root' changed + nginx -g 'daemon off;' # 在Dockerfile里面加了以下的指令可以输出 RUN ln -svf /dev/stdout /var/log/nginx/access.log && \ ln -svf /dev/stderr /var/log/nginx/error.log # 执行编译 [root@elk91 02-games-new]# ./build.sh 19 # 成功查看日志,3秒一次更新检查 [root@elk91 02-games-new]# docker logs -f games + /usr/sbin/sshd + '[' -n weixiang98 ] + + chpasswd echo root:weixiang98 chpasswd: password for 'root' changed + nginx -g 'daemon off;' 127.0.0.1 - - [05/Jul/2025:11:57:15 +0000] "GET / HTTP/1.1" 200 1075 "-" "Wget" "-" 127.0.0.1 - - [05/Jul/2025:11:57:18 +0000] "GET / HTTP/1.1" 200 1075 "-" "Wget" "-" 127.0.0.1 - - [05/Jul/2025:11:57:22 +0000] "GET / HTTP/1.1" 200 1075 "-" "Wget" "-" 127.0.0.1 - - [05/Jul/2025:11:57:25 +0000] "GET / HTTP/1.1" 200 1075 "-" "Wget" "-" 127.0.0.1 - - [05/Jul/2025:11:57:28 +0000] "GET / 
HTTP/1.1" 200 1075 "-" "Wget" "-"
10、USER指令可以指定运行容器的用户
bash
1.默认是以root用户运行 [root@elk92 02-games-new]# docker exec -it games sh /etc/nginx # whoami root /etc/nginx # 2.编写Dockerfile [root@elk92 02-games-new]# cat Dockerfile FROM myweb:v3.0 COPY games.conf /etc/nginx/http.d/default.conf ADD code/beiJiXiong.tar.gz /usr/share/nginx/html/beijixiong ADD code/haiDao.tar.gz /usr/share/nginx/html/haidao ADD code/huaXiangJi.tar.gz /usr/share/nginx/html/huaxiangji ADD code/qiZhuan.tar.gz /usr/share/nginx/html/qizhuan ADD code/saiChe.tar.gz /usr/share/nginx/html/saiche ADD start.sh / # 向容器传递环境变量,但是该变量可以在容器运行时使用-e选项覆盖! ENV SCHOOL=weixiang \ CLASS=weixiang98 \ ROOT_INIT_PASSWD=yinzhengjie # 向容器传递环境变量,但是仅限于构建阶段有效,容器启动阶段则无效。 ARG fanRen=HanLi \ xianNi=WangLin RUN mkdir /weixiang/ && \ touch /weixiang/${fanRen} && \ touch /weixiang/${xianNi} && \ touch /weixiang/${SCHOOL} && \ touch /weixiang/${CLASS} # 对外暴露容器的服务端口,默认就是tcp协议,因此可以不指定,也可以指定udp,sctp协议。 EXPOSE 80/tcp 22/tcp # 指定容器的默认工作目录 WORKDIR /etc/nginx/ # 指定容器的匿名存储卷,相当于启动容器时执行了 "docker run -v /usr/share/nginx/html ..." VOLUME /usr/share/nginx/html # 将日志输出到控制台,以便于使用docker logs命令进行查看。 RUN ln -svf /dev/stdout /var/log/nginx/access.log && \ ln -svf /dev/stderr /var/log/nginx/error.log && \ apk add curl && \ # adduser weixiang -D -S -s /bin/sh -h /home/weixiang -u 666 && \ rm -rf /var/cache/ # 周期性检查服务是否健康,如果不健康则将服务标记为不健康状态。 # --interval: 间隔多长时间检查一次。 # --timeout: 检测的超时时间。 HEALTHCHECK --interval=3s --timeout=1s \ CMD curl -f http://localhost/ || exit 1 USER nginx # 指定容器的启动命令,若没有指定,则继承基础镜像的CMD指令。 # 该启动命令在容器运行时可以被覆盖 #CMD ["sleep","10d"] # 指定容器的启动名,该命令不可用在容器运行时进行替换。 # 如果ENTRYPOINT和CMD指令搭配使用,则CMD将作为参数传递给ENTRYPOINT指令 #ENTRYPOINT ["tail"] #CMD ["-f","/etc/hosts"] ENTRYPOINT ["/bin/sh","-x","/start.sh"] CMD ["weixiang98"] 3.测试验证 [root@elk92 02-games-new]# docker exec -it games sh /etc/nginx $ whoami nginx /etc/nginx $ /etc/nginx $ netstat -untalp Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 10/nginx: master pr tcp 0 0 127.0.0.1:50378 127.0.0.1:80 TIME_WAIT - tcp 0 0 127.0.0.1:50380 127.0.0.1:80 TIME_WAIT - tcp 0 0 127.0.0.1:50374 127.0.0.1:80 TIME_WAIT - tcp 0 0 127.0.0.1:55706 127.0.0.1:80 TIME_WAIT - tcp 0 0 127.0.0.1:50392 127.0.0.1:80 TIME_WAIT - tcp 0 0 127.0.0.1:55710 127.0.0.1:80 TIME_WAIT - tcp 0 0 127.0.0.1:55702 127.0.0.1:80 TIME_WAIT - tcp 0 0 127.0.0.1:55682 127.0.0.1:80 TIME_WAIT - tcp 0 0 127.0.0.1:55692 127.0.0.1:80 TIME_WAIT - /etc/nginx $ /etc/nginx $ [root@elk92 02-games-new]# [root@elk92 02-games-new]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 3a222b11866c myweb:v3.20 "/bin/sh -x /start.s…" 38 seconds ago Up 37 seconds (healthy) 0.0.0.0:32791->22/tcp, :::32791->22/tcp, 0.0.0.0:32790->80/tcp, :::32790->80/tcp games [root@elk92 02-games-new]# # 如果指定别的用户,要确定是否有对服务运行的权限
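补充:如果要用自定义的普通用户运行服务,可以参考下面的最小化示意(用户名与目录仅为举例,需确保该用户对服务要写入的目录有权限):
bash
# Dockerfile 片段(示意)
# RUN adduser weixiang -D -S -s /bin/sh -h /home/weixiang -u 666 && \
#     chown -R weixiang /var/log/nginx
# USER weixiang

# 容器运行后确认实际身份
docker exec games id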
11、ONBUILD指定基础镜像触发器
bash
1.编写Dockerfile并构建基础镜像 [root@elk92 02-games-new]# cat Dockerfile FROM myweb:v3.0 COPY games.conf /etc/nginx/http.d/default.conf ADD code/beiJiXiong.tar.gz /usr/share/nginx/html/beijixiong ADD code/haiDao.tar.gz /usr/share/nginx/html/haidao ADD code/huaXiangJi.tar.gz /usr/share/nginx/html/huaxiangji ADD code/qiZhuan.tar.gz /usr/share/nginx/html/qizhuan ADD code/saiChe.tar.gz /usr/share/nginx/html/saiche ADD start.sh / # 向容器传递环境变量,但是该变量可以在容器运行时使用-e选项覆盖! ENV SCHOOL=weixiang \ CLASS=weixiang98 \ ROOT_INIT_PASSWD=yinzhengjie # 向容器传递环境变量,但是仅限于构建阶段有效,容器启动阶段则无效。 ARG fanRen=HanLi \ xianNi=WangLin RUN mkdir /weixiang/ && \ touch /weixiang/${fanRen} && \ touch /weixiang/${xianNi} && \ touch /weixiang/${SCHOOL} && \ touch /weixiang/${CLASS} # 对外暴露容器的服务端口,默认就是tcp协议,因此可以不指定,也可以指定udp,sctp协议。 EXPOSE 80/tcp 22/tcp # 指定容器的默认工作目录 WORKDIR /etc/nginx/ # 指定容器的匿名存储卷,相当于启动容器时执行了 "docker run -v /usr/share/nginx/html ..." VOLUME /usr/share/nginx/html # 将日志输出到控制台,以便于使用docker logs命令进行查看。 RUN ln -svf /dev/stdout /var/log/nginx/access.log && \ ln -svf /dev/stderr /var/log/nginx/error.log && \ apk add curl && \ # adduser weixiang -D -S -s /bin/sh -h /home/weixiang -u 666 && \ rm -rf /var/cache/ # 周期性检查服务是否健康,如果不健康则将服务标记为不健康状态。 # --interval: 间隔多长时间检查一次。 # --timeout: 检测的超时时间。 HEALTHCHECK --interval=3s --timeout=1s \ CMD curl -f http://localhost/ || exit 1 # 指定运行服务的用户 # USER nginx # 指定基础镜像触发器,说白了,就是谁用咱们的镜像,则会调用这些Dockerfile指令 ONBUILD RUN mkdir /weixiang-xixi && mkdir /weixiang-haha ONBUILD LABEL email=y1053419035@qq.com \ auther=JasonYin # 指定容器的启动命令,若没有指定,则继承基础镜像的CMD指令。 # 该启动命令在容器运行时可以被覆盖 #CMD ["sleep","10d"] # 指定容器的启动名,该命令不可用在容器运行时进行替换。 # 如果ENTRYPOINT和CMD指令搭配使用,则CMD将作为参数传递给ENTRYPOINT指令 #ENTRYPOINT ["tail"] #CMD ["-f","/etc/hosts"] ENTRYPOINT ["/bin/sh","-x","/start.sh"] CMD ["weixiang98"] [root@elk92 02-games-new]# [root@elk92 02-games-new]# ./build.sh 21 [root@elk92 02-games-new]# docker image ls myweb:v3.21 REPOSITORY TAG IMAGE ID CREATED SIZE myweb v3.21 307603a4ecb2 19 seconds ago 112MB [root@elk92 02-games-new]# 2.查看基础镜像触发器指令 [root@elk92 02-games-new]# docker image inspect -f "{{.Config.OnBuild}}" myweb:v3.21 [RUN mkdir /weixiang-xixi && mkdir /weixiang-haha LABEL email=y1053419035@qq.com auther=JasonYin] [root@elk92 02-games-new]# 3.进入到基础镜像容器查验证 [root@elk92 02-games-new]# docker run -d --name c1 myweb:v3.21 ba323851f6e32252cc02f85a978b2281cc51c47db0a5e066a7bcf852b9f477a4 [root@elk92 02-games-new]# [root@elk92 02-games-new]# docker exec c1 ls -l / total 72 drwxr-xr-x 2 root root 4096 Sep 6 2024 bin drwxr-xr-x 5 root root 340 Jul 3 07:32 dev drwxr-xr-x 1 root root 4096 Jul 3 07:32 etc drwxr-xr-x 2 root root 4096 Sep 6 2024 home drwxr-xr-x 1 root root 4096 Sep 6 2024 lib drwxr-xr-x 5 root root 4096 Sep 6 2024 media drwxr-xr-x 2 root root 4096 Sep 6 2024 mnt drwxr-xr-x 2 root root 4096 Jul 3 02:48 weixiang drwxr-xr-x 2 root root 4096 Sep 6 2024 opt dr-xr-xr-x 305 root root 0 Jul 3 07:32 proc drwx------ 2 root root 4096 Sep 6 2024 root drwxr-xr-x 1 root root 4096 Jul 2 08:30 run drwxr-xr-x 2 root root 4096 Sep 6 2024 sbin drwxr-xr-x 2 root root 4096 Sep 6 2024 srv -rw-r--r-- 1 root root 292 Jul 3 02:48 start.sh dr-xr-xr-x 13 root root 0 Jul 3 07:32 sys drwxrwxrwt 2 root root 4096 Sep 6 2024 tmp drwxr-xr-x 1 root root 4096 Sep 6 2024 usr drwxr-xr-x 1 root root 4096 Jul 3 06:50 var [root@elk92 02-games-new]# 4.再次编写Dockerfile [root@elk92 02-games-new]# cat weixiang98-onbuild.dockerfile FROM myweb:v3.21 # 必须包含上面创建的父镜像 RUN mkdir /weixiang-heihei [root@elk92 02-games-new]# 5.编译测试 , -f指定dockerfile文件 [root@elk92 02-games-new]# 
docker build -t myweb:v3.23 -f weixiang98-onbuild.dockerfile . Sending build context to Docker daemon 143.2MB Step 1/2 : FROM myweb:v3.21 # Executing 2 build triggers ---> Running in 9da668347ea7 Removing intermediate container 9da668347ea7 ---> Running in 715a01bcb870 Removing intermediate container 715a01bcb870 ---> e96d7942515e Step 2/2 : RUN mkdir /weixiang-heihei ---> Running in 06a9fdd9c306 Removing intermediate container 06a9fdd9c306 ---> 761ab3a07c83 Successfully built 761ab3a07c83 Successfully tagged myweb:v3.22 [root@elk92 02-games-new]# [root@elk92 02-games-new]# [root@elk92 02-games-new]# docker run -d --name c2 myweb:v3.22 786d7bac1b3a6cad4b5ac0e6f85626e795cefecd1636af6f402fd5f399ac6134 [root@elk92 02-games-new]# [root@elk92 02-games-new]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 786d7bac1b3a myweb:v3.22 "/bin/sh -x /start.s…" 2 seconds ago Up 1 second (health: starting) 22/tcp, 80/tcp c2 [root@elk92 02-games-new]# [root@elk92 02-games-new]# docker exec c2 ls -l / total 88 drwxr-xr-x 2 root root 4096 Sep 6 2024 bin drwxr-xr-x 5 root root 340 Jul 3 07:38 dev drwxr-xr-x 1 root root 4096 Jul 3 07:38 etc drwxr-xr-x 2 root root 4096 Sep 6 2024 home drwxr-xr-x 1 root root 4096 Sep 6 2024 lib drwxr-xr-x 5 root root 4096 Sep 6 2024 media drwxr-xr-x 2 root root 4096 Sep 6 2024 mnt drwxr-xr-x 2 root root 4096 Jul 3 02:48 weixiang drwxr-xr-x 2 root root 4096 Jul 3 07:36 weixiang-haha # 很明显,我们埋下的脚本操作成功了! drwxr-xr-x 2 root root 4096 Jul 3 07:36 weixiang-heihei drwxr-xr-x 2 root root 4096 Jul 3 07:36 weixiang-xixi # 很明显,我们埋下的脚本操作成功了! drwxr-xr-x 2 root root 4096 Sep 6 2024 opt dr-xr-xr-x 310 root root 0 Jul 3 07:38 proc drwx------ 2 root root 4096 Sep 6 2024 root drwxr-xr-x 1 root root 4096 Jul 3 07:38 run drwxr-xr-x 2 root root 4096 Sep 6 2024 sbin drwxr-xr-x 2 root root 4096 Sep 6 2024 srv -rw-r--r-- 1 root root 292 Jul 3 02:48 start.sh dr-xr-xr-x 13 root root 0 Jul 3 07:38 sys drwxrwxrwt 2 root root 4096 Sep 6 2024 tmp drwxr-xr-x 1 root root 4096 Sep 6 2024 usr drwxr-xr-x 1 root root 4096 Jul 3 06:50 var [root@elk92 02-games-new]#
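补充:ONBUILD 触发器在子镜像构建时被执行并清空,不会再传递给"孙子"镜像,可以用 inspect 做个简单验证(基于上文已构建的镜像):
bash
# 父镜像中记录了触发器
docker image inspect -f '{{.Config.OnBuild}}' myweb:v3.21

# 基于父镜像构建出的子镜像中,触发器已被执行并清空(预期为空列表)
docker image inspect -f '{{.Config.OnBuild}}' myweb:v3.22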
12、SHELL声明解释器案例
bash
1.编写Dockerfile [root@elk92 02-games-new]# cat shell.dockerfile FROM myweb:v3.0 # SHELL ["/bin/bash","-c"] SHELL ["/bin/sh","-c"] RUN mkdir /weixiang-heihei [root@elk92 02-games-new]# 2.构建测试 [root@elk92 02-games-new]# docker build -t myweb:v3.23 -f shell.dockerfile . Sending build context to Docker daemon 143.2MB Step 1/3 : FROM myweb:v3.0 ---> 9dc4c8d330ea Step 2/3 : SHELL ["/bin/sh","-c"] ---> Running in 2f58c0fc4a6f Removing intermediate container 2f58c0fc4a6f ---> 4e23dbcb37eb Step 3/3 : RUN mkdir /weixiang-heihei ---> Running in cc2fa5d2a5af Removing intermediate container cc2fa5d2a5af ---> 159e4d61d8c2 Successfully built 159e4d61d8c2 Successfully tagged myweb:v3.23 [root@elk92 02-games-new]# [root@elk92 02-games-new]# cat /etc/shells # /etc/shells: valid login shells /bin/sh /bin/bash /usr/bin/bash /bin/rbash /usr/bin/rbash /usr/bin/sh /bin/dash /usr/bin/dash /usr/bin/tmux /usr/bin/screen [root@elk92 02-games-new]#
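补充:SHELL 只影响其后 RUN、CMD、ENTRYPOINT 的 shell 形式所使用的解释器;alpine 基础镜像默认没有 bash,想切换到 bash 需要先安装。下面是一个最小化示意(仅演示写法):
bash
# 示意片段:安装bash后切换解释器,并加-x便于在构建日志中追踪命令
FROM alpine
RUN apk add --no-cache bash
SHELL ["/bin/bash","-xc"]
# 此时RUN由bash执行,BASH_VERSION有值;默认的/bin/sh下该变量为空
RUN echo "bash version: $BASH_VERSION"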
13、多阶段构建
bash
1.编写Dockerfile [root@elk92 02-games-new]# cat mutiple-step.dockerfile FROM alpine AS xixi RUN mkdir /weixiang-xixi && \ echo xixi > /weixiang-xixi/xixi.txt && \ dd if=/dev/zero of=/weixiang-xixi/xixi.log count=1024 bs=1M EXPOSE 22 WORKDIR /usr CMD ["sleep","20d"] FROM alpine AS haha RUN mkdir /weixiang-haha && \ echo haha > /weixiang-haha/haha.txt && \ dd if=/dev/zero of=/weixiang-haha/haha.log count=1024 bs=1M EXPOSE 80 WORKDIR /etc CMD ["sleep","10d"] FROM alpine RUN mkdir /weixiang-hehe && \ echo hehe > /weixiang-hehe/hehe.log COPY --from=xixi /weixiang-xixi/xixi.txt /weixiang-hehe COPY --from=haha /weixiang-haha/haha.txt /weixiang-hehe CMD ["tail","-f","/etc/hosts"] [root@elk92 02-games-new]# 2.编译镜像 [root@elk92 02-games-new]# docker build -t demo:v0.1 -f mutiple-step.dockerfile . [root@elk92 02-games-new]# [root@elk92 02-games-new]# [root@elk92 02-games-new]# docker image ls demo REPOSITORY TAG IMAGE ID CREATED SIZE demo v0.1 c99d325e940f 3 seconds ago 7.8MB [root@elk92 02-games-new]#
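补充:多阶段构建时可以用 --target 只构建到指定阶段,便于单独调试某个阶段(--target 为 docker build 内置参数,镜像标签 demo:xixi 仅为举例):
bash
# 只构建名为xixi的阶段,不执行后续阶段
docker build --target xixi -t demo:xixi -f mutiple-step.dockerfile .

# 对比镜像大小,可以看到最终镜像并不包含中间阶段的大文件
docker image ls demo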

多阶段构建图解

bash
虽然第0阶段跟第1阶段加起来有2个g的文件,但是拷贝过来的只有xixi.txt跟haha.txt - 多阶段构建实战案例 1.编写Dockerfile [root@elk92 03-mutiple-step]# cat Dockerfile FROM alpine # FROM alpine AS myweb RUN sed -i 's#dl-cdn.alpinelinux.org#mirrors.aliyun.com#' /etc/apk/repositories && \ apk update && apk add openssh-server curl wget gcc g++ make && \ wget "https://sourceforge.net/projects/pcre/files/pcre/8.44/pcre-8.44.tar.gz" && \ wget "http://nginx.org/download/nginx-1.20.2.tar.gz" && \ tar xf pcre-8.44.tar.gz && \ tar xf nginx-1.20.2.tar.gz && \ rm -f nginx-1.20.2.tar.gz pcre-8.44.tar.gz && \ cd nginx-1.20.2 && \ ./configure --prefix=/usr/local/nginx --with-pcre=/pcre-8.44 --without-http_gzip_module && \ make && \ make install CMD ["tail","-f","/etc/hosts"] FROM alpine AS apps MAINTAINER weixiang-weixiang98 LABEL school=weixiang \ class=weixiang98 \ auther=JasonYin \ address="老男孩it教育沙河镇" # 从哪个FROM阶段拷贝,可以指定别名,如果未使用AS关键字定义别名,则可以用索引下标。 COPY --from=0 /usr/local/nginx /usr/local/nginx # COPY --from=myweb /usr/local/nginx /usr/local/nginx RUN sed -i 's#dl-cdn.alpinelinux.org#mirrors.aliyun.com#' /etc/apk/repositories && \ apk update && apk add openssh-server curl && \ sed -i 's@#PermitRootLogin prohibit-password@PermitRootLogin yes@g' /etc/ssh/sshd_config && \ ln -sv /usr/local/nginx/sbin/nginx /usr/sbin && \ rm -rf /var/cache/ && \ ssh-keygen -A WORKDIR /weixiang-nginx-1.20.2/conf HEALTHCHECK --interval=3s \ --timeout=3s \ --start-period=30s \ --retries=3 \ CMD curl -f http://localhost/ || exit 1 #ADD weixiang-killbird.tar.gz /usr/local/nginx/html #COPY start.sh / EXPOSE 80 22 # CMD ["sh","-x","/start.sh"] CMD ["sleep","10d"] [root@elk92 03-mutiple-step]# [root@elk92 03-mutiple-step]# docker build -t demo:v1.0 . 2.测试验证 [root@elk92 03-mutiple-step]# docker run -d demo:v1.0 09164109d79e9fc592f43e5a25e426d5cd4593eda0c6d82eb80055e0f1f40380 [root@elk92 03-mutiple-step]# [root@elk92 03-mutiple-step]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 09164109d79e demo:v1.0 "sleep 10d" 2 seconds ago Up 1 second (health: starting) 22/tcp, 80/tcp suspicious_brahmagupta [root@elk92 03-mutiple-step]# [root@elk92 03-mutiple-step]# [root@elk92 03-mutiple-step]# [root@elk92 03-mutiple-step]# docker exec -it suspicious_brahmagupta sh /weixiang-nginx-1.20.2/conf # /weixiang-nginx-1.20.2/conf # ls -l /usr/local/nginx/ total 16 drwxr-xr-x 2 root root 4096 Jul 3 08:55 conf drwxr-xr-x 2 root root 4096 Jul 3 08:55 html drwxr-xr-x 2 root root 4096 Jul 3 08:55 logs drwxr-xr-x 2 root root 4096 Jul 3 08:55 sbin /weixiang-nginx-1.20.2/conf #
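补充:COPY --from 除了引用本文件中的构建阶段,也可以直接引用一个已存在的镜像,例如从官方镜像中拷贝文件(镜像名与路径仅为举例):
bash
# Dockerfile 片段(示意)
# COPY --from=nginx:1.20.2 /etc/nginx/nginx.conf /tmp/nginx.conf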
14、通过dockerignore忽略不必要的文件
bash
参考链接: https://docs.docker.com/reference/dockerfile/#dockerignore-file 1.dockerignore的作用 和".gitignore"类似,可以忽略不必要的代码。 而".dockerignore"也是相同的作用,就是忽略当前目录下指定的文件,忽略的文件将不会发送到docker daemon守护进程。 2.实战案例 2.1 准备文件 [root@elk92 test]# dd if=/dev/zero of=bigfile01 count=1024 bs=1M 1024+0 records in 1024+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.74814 s, 614 MB/s [root@elk92 test]# [root@elk92 test]# dd if=/dev/zero of=bigfile02 count=1024 bs=2M 1024+0 records in 1024+0 records out 2147483648 bytes (2.1 GB, 2.0 GiB) copied, 5.37281 s, 400 MB/s [root@elk92 test]# [root@elk92 test]# ll total 3145748 drwxr-xr-x 2 root root 4096 Jul 4 09:08 ./ drwx------ 10 root root 4096 Jul 4 09:07 ../ -rw-r--r-- 1 root root 1073741824 Jul 4 09:08 bigfile01 -rw-r--r-- 1 root root 2147483648 Jul 4 09:08 bigfile02 -rw-r--r-- 1 root root 14 Jul 4 09:07 Dockerfile [root@elk92 test]# 2.2 编写Dockerfile编译测试 [root@elk92 test]# cat Dockerfile FROM alpine RUN mkdir /oldbyedu-haha CMD ["sleep","20d"] [root@elk92 test]# [root@elk92 test]# docker build -t test:v0.1 . # 直接构建镜像 Sending build context to Docker daemon 3.2GB # 会读取当前目录下的所有文件并传输给docker daemon进程。 Step 1/3 : FROM alpine ---> 91ef0af61f39 Step 2/3 : RUN mkdir /oldbyedu-haha ---> Running in f6a41e9b99f0 Removing intermediate container f6a41e9b99f0 ---> 5399bd3b8126 Step 3/3 : CMD ["sleep","20d"] ---> Running in 985d33589226 Removing intermediate container 985d33589226 ---> 075c4ce99cfd Successfully built 075c4ce99cfd Successfully tagged test:v0.1 [root@elk92 test]# [root@elk92 test]# docker image ls test REPOSITORY TAG IMAGE ID CREATED SIZE test v0.1 075c4ce99cfd 45 seconds ago 7.8MB # 但是用不上3.2G的文件,所以只读取了但是没用上 [root@elk92 test]# 2.3 编写".dockerignore"文件进行优化 [root@elk92 test]# cat .dockerignore # 把要忽略的文件名称放在.dockerignore里面 big* [root@elk92 test]# [root@elk92 test]# docker build -t test:v0.1 . Sending build context to Docker daemon 3.072kB # 很明显,忽略的大文件传输。 Step 1/3 : FROM alpine ---> 91ef0af61f39 Step 2/3 : RUN mkdir /oldbyedu-haha ---> Using cache ---> 5399bd3b8126 Step 3/3 : CMD ["sleep","20d"] ---> Using cache ---> 075c4ce99cfd Successfully built 075c4ce99cfd Successfully tagged test:v0.1 [root@elk92 test]# [root@elk92 test]#
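补充:.dockerignore 的匹配规则与 .gitignore 类似,支持通配符,也支持以"!"开头的例外规则(以下内容为补充示例):
bash
# .dockerignore 示例:忽略日志、.git目录和big开头的大文件,但保留keep.log
cat > .dockerignore <<'EOF'
*.log
.git
big*
!keep.log
EOF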
3、Dockerfile优化
bash
- Dockerfile的优化
  编译速度: ---> 快【时间】
    - 1.使用".dockerignore"忽略不需要发送给docker daemon的文件;
    - 2.编写Dockerfile时将不经常修改的指令往上放,充分利用构建缓存;
    - 3.使用国内或私有的镜像源下载软件包;
    - 4.直接将较大的软件包下载到本地,使用COPY或者ADD指令进行拷贝;(节省下载过程)
  镜像大小: ---> 小【带宽】
    - 1.删除无用的缓存;
    - 2.使用较小的基础镜像;
    - 3.删除无用的软件包;
    - 4.合并多条Dockerfile指令;【减少镜像分层,示例见下文】
    - 5.多阶段构建;
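以"合并指令并在同一层清理缓存"为例,下面给出一个最小化示意(基于alpine,仅演示写法):
bash
# 反例:拆成两条RUN,各自成层,后一层的清理无法减小前一层的体积
# RUN apk add curl
# RUN rm -rf /var/cache/apk/*

# 正例:合并为一条RUN,在同一层内完成安装与清理;或直接使用--no-cache不产生缓存
# RUN apk add --no-cache curl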
4、scratch自定义基础镜像
bash
1.下载软件包 [root@elk92 04-scratch]# wget https://mirrors.tuna.tsinghua.edu.cn/lxc-images/images/alpine/3.22/amd64/default/20250628_13%3A00/rootfs.tar.xz 2.解压软件包 [root@elk92 04-scratch]# tar xf rootfs.tar.xz [root@elk92 04-scratch]# rm -f rootfs.tar.xz 3.更改根目录结果 [root@elk92 04-scratch]# vim etc/os-release [root@elk92 04-scratch]# [root@elk92 04-scratch]# cat etc/os-release NAME="weixiang Alpine Linux" ID=alpine VERSION_ID=3.22.1 PRETTY_NAME="weixiang Alpine Linux v3.22" HOME_URL="https://www.weixiang.com" [root@elk92 04-scratch]# [root@elk92 04-scratch]# mkdir weixiang [root@elk92 04-scratch]# [root@elk92 04-scratch]# ll total 80 drwxr-xr-x 20 root root 4096 Jul 4 09:33 ./ drwxr-xr-x 6 root root 4096 Jul 4 09:30 ../ drwxr-xr-x 2 root root 4096 Jun 28 21:00 bin/ drwxr-xr-x 2 root root 4096 Jun 28 21:00 dev/ drwxr-xr-x 24 root root 4096 Jul 4 09:32 etc/ drwxr-xr-x 2 root root 4096 May 30 20:13 home/ drwxr-xr-x 7 root root 4096 Jun 28 21:00 lib/ drwxr-xr-x 5 root root 4096 May 30 20:13 media/ drwxr-xr-x 2 root root 4096 May 30 20:13 mnt/ drwxr-xr-x 2 root root 4096 Jul 4 09:32 weixiang/ drwxr-xr-x 2 root root 4096 May 30 20:13 opt/ dr-xr-xr-x 2 root root 4096 May 30 20:13 proc/ drwx------ 2 root root 4096 May 30 20:13 root/ drwxr-xr-x 3 root root 4096 May 30 20:13 run/ drwxr-xr-x 2 root root 4096 Jun 28 21:00 sbin/ drwxr-xr-x 2 root root 4096 May 30 20:13 srv/ drwxr-xr-x 2 root root 4096 May 30 20:13 sys/ drwxrwxrwt 2 root root 4096 May 30 20:13 tmp/ drwxr-xr-x 8 root root 4096 Jun 28 21:00 usr/ drwxr-xr-x 11 root root 4096 May 30 20:13 var/ [root@elk92 04-scratch]# 4.将文件重新打包 [root@elk92 04-scratch]# mkdir test [root@elk92 04-scratch]# mv * test/ mv: cannot move 'test' to a subdirectory of itself, 'test/test' [root@elk92 04-scratch]# [root@elk92 04-scratch]# ll total 12 drwxr-xr-x 3 root root 4096 Jul 4 09:33 ./ drwxr-xr-x 6 root root 4096 Jul 4 09:30 ../ drwxr-xr-x 20 root root 4096 Jul 4 09:33 test/ [root@elk92 04-scratch]# [root@elk92 04-scratch]# cd test/ [root@elk92 test]# [root@elk92 test]# tar zcf weixiang-rootfs.tar.xz * [root@elk92 test]# [root@elk92 test]# mv weixiang-rootfs.tar.xz ../ [root@elk92 test]# [root@elk92 test]# cd .. [root@elk92 04-scratch]# [root@elk92 04-scratch]# rm -rf test/ [root@elk92 04-scratch]# [root@elk92 04-scratch]# ll total 4028 drwxr-xr-x 2 root root 4096 Jul 4 09:34 ./ drwxr-xr-x 6 root root 4096 Jul 4 09:30 ../ -rw-r--r-- 1 root root 4115303 Jul 4 09:34 weixiang-rootfs.tar.xz [root@elk92 04-scratch]# 5.编写Dockerfile [root@elk92 04-scratch]# ll total 4032 drwxr-xr-x 2 root root 4096 Jul 4 09:37 ./ drwxr-xr-x 6 root root 4096 Jul 4 09:30 ../ -rw-r--r-- 1 root root 220 Jul 4 09:37 Dockerfile -rw-r--r-- 1 root root 4115303 Jul 4 09:34 weixiang-rootfs.tar.xz [root@elk92 04-scratch]# [root@elk92 04-scratch]# [root@elk92 04-scratch]# cat Dockerfile # 使用'scratch'关键字表示不引用任何基础镜像。 FROM scratch MAINTAINER JasonYin LABEL auther=JasonYin \ school=weixiang \ class=weixiang98 ADD weixiang-rootfs.tar.xz / CMD ["sleep","3600"] [root@elk92 04-scratch]# 6.编译镜像 [root@elk92 04-scratch]# docker build -t mylinux:v0.1.0 . 
Sending build context to Docker daemon 4.118MB Step 1/5 : FROM scratch ---> Step 2/5 : MAINTAINER JasonYin ---> Running in 78463430659d Removing intermediate container 78463430659d ---> 9f95c9a2ac62 Step 3/5 : LABEL auther=JasonYin school=weixiang class=weixiang98 ---> Running in 3ab46601f937 Removing intermediate container 3ab46601f937 ---> 0f930f845273 Step 4/5 : ADD weixiang-rootfs.tar.xz / ---> 92c8a03883e3 Step 5/5 : CMD ["sleep","3600"] ---> Running in fc32263ff343 Removing intermediate container fc32263ff343 ---> d92b359fe466 Successfully built d92b359fe466 Successfully tagged mylinux:v0.1.0 [root@elk92 04-scratch]# [root@elk92 04-scratch]# docker image ls mylinux REPOSITORY TAG IMAGE ID CREATED SIZE mylinux v0.1.0 d92b359fe466 6 seconds ago 10.2MB [root@elk92 04-scratch]# 7.启动测试 [root@elk92 04-scratch]# docker run --name c1 -d mylinux:v0.1.0 35bcf6fceaaed02029399b9477eafd6210f94489a0429e6105d030c10747ba25 [root@elk92 04-scratch]# [root@elk92 04-scratch]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 35bcf6fceaae mylinux:v0.1.0 "sleep 3600" 3 seconds ago Up 2 seconds c1 [root@elk92 04-scratch]# [root@elk92 04-scratch]# docker exec -it c1 sh / # cat /etc/os-release NAME="weixiang Alpine Linux" ID=alpine VERSION_ID=3.22.1 PRETTY_NAME="weixiang Alpine Linux v3.22" HOME_URL="https://www.weixiang.com" / # / # ls -l / total 60 drwxr-xr-x 2 root root 4096 Jun 28 13:00 bin drwxr-xr-x 5 root root 340 Jul 4 01:38 dev drwxr-xr-x 1 root root 4096 Jul 4 01:38 etc drwxr-xr-x 2 root root 4096 May 30 12:13 home drwxr-xr-x 7 root root 4096 Jun 28 13:00 lib drwxr-xr-x 5 root root 4096 May 30 12:13 media drwxr-xr-x 2 root root 4096 May 30 12:13 mnt drwxr-xr-x 2 root root 4096 Jul 4 01:32 weixiang drwxr-xr-x 2 root root 4096 May 30 12:13 opt dr-xr-xr-x 288 root root 0 Jul 4 01:38 proc drwx------ 1 root root 4096 Jul 4 01:38 root drwxr-xr-x 3 root root 4096 May 30 12:13 run drwxr-xr-x 2 root root 4096 Jun 28 13:00 sbin drwxr-xr-x 2 root root 4096 May 30 12:13 srv dr-xr-xr-x 13 root root 0 Jul 4 01:38 sys drwxrwxrwt 2 root root 4096 May 30 12:13 tmp drwxr-xr-x 8 root root 4096 Jun 28 13:00 usr drwxr-xr-x 11 root root 4096 May 30 12:13 var / #
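补充:除了下载第三方的rootfs打包,也可以把现有容器的文件系统导出后再导入为基础镜像,思路类似(docker export/import 为docker自带命令,容器名c1沿用上文):
bash
# 导出容器的文件系统为tar包
docker export c1 -o my-rootfs.tar

# 将tar包导入为新的基础镜像,并指定默认启动命令
docker import -c 'CMD ["sleep","3600"]' my-rootfs.tar mylinux:v0.2.0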
5、docker-compose
bash
定位:容器化应用的本地编排工具
1、统一管理:服务、网络、存储卷、环境变量等集中配置
2、一键启动/停止完整环境:docker compose up -d / docker compose down
3、高效管理服务依赖
   - 健康检查集成:确保服务真正可用才启动依赖项
   - 启动顺序控制:避免数据库未就绪时应用已启动

与原生 Docker 命令对比

| 场景 | 原生 Docker 命令 | Docker Compose |
| --- | --- | --- |
| 启动多容器 | 多个 docker run + 手动维护网络 | docker compose up 一键完成 |
| 查看服务状态 | docker ps 过滤容器 | docker compose ps 按服务展示 |
| 查看日志 | docker logs <container-id> | docker compose logs -f web |
| 扩展副本数 | 手动创建多个容器 | docker compose up --scale api=3 |
| 环境变量管理 | 冗长的 -e VAR=value 参数 | 在 YAML 中集中管理 |
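补充:编写完compose文件后,可以先用config子命令校验语法并查看变量插值后的最终配置(docker-compose自带子命令,以下为补充示例):
bash
# 校验并渲染当前目录下的docker-compose.yaml
docker-compose config

# 只做语法检查,不输出内容
docker-compose config -q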
1、docker-compose实现单个服务案例
bash
1.编写docker-compose文件 [root@elk92 01-xiuxian]# cat docker-compose.yaml # 定义一个服务 services: # 服务的名称 xiuxian: # 指定镜像的名称 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 # 指定容器的名称,如果不指定名称,那就是目录名+服务的名称+序号 container_name: xiuxian # 指定该端口映射,把宿主机的81端口映射到容器的80端口 ports: - "81:80" [root@elk92 01-xiuxian]# 2.启动服务【创建并启动容器】 [root@elk92 01-xiuxian]# docker-compose up -d # -d选项表示放在后台运行 [+] Building 0.0s (0/0) docker:default [+] Running 2/2 ✔ Network 01-xiuxian_default Created 0.1s ✔ Container xiuxian Started 0.0s [root@elk92 01-xiuxian]# 3.查看服务 [root@elk92 01-xiuxian]# docker-compose ps # 定义的容器名 # 定义的镜像名 # 不定义使用默认启动命令 NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS xiuxian registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" xiuxian 3 seconds ago Up 2 seconds 0.0.0.0:81->80/tcp, :::81->80/tcp [root@elk92 01-xiuxian]# [root@elk92 01-xiuxian]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 24fe8c8b93a2 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 8 seconds ago Up 8 seconds 0.0.0.0:81->80/tcp, :::81->80/tcp xiuxian [root@elk92 01-xiuxian]# 4.访问测试 [root@elk92 01-xiuxian]# ss -ntl | grep 81 LISTEN 0 4096 0.0.0.0:81 0.0.0.0:* LISTEN 0 4096 [::]:81 [::]:* [root@elk92 01-xiuxian]# 访问测试: http://10.0.0.92:81/ 5.移除服务 【停止并删除容器】 [root@elk92 01-xiuxian]# docker-compose down -t 0 # 此处的-t表示指定停止服务的延迟时间,设置为0表示立刻移除容器 [+] Running 2/2 ✔ Container xiuxian Removed 0.2s ✔ Network 01-xiuxian_default Removed 0.2s [root@elk92 01-xiuxian]# [root@elk92 01-xiuxian]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES [root@elk92 01-xiuxian]# docker-compose
2、docker-compose实现环境变量传递和启动命令参数
bash
1.环境准备 1.1 拷贝MySQL镜像 [root@elk91 ~]# scp weixiang-mysql-v8.0.36-oracle.tar.gz 10.0.0.92:~ 1.2 导入镜像 [root@elk92 ~]# docker load < weixiang-mysql-v8.0.36-oracle.tar.gz 2.编写docker-compose文件 [root@elk92 02-mysql]# cat docker-compose.yaml services: mysql_server: image: mysql:8.0.36-oracle container_name: db ports: - "3306:3306" # 定义环境变量 environment: MYSQL_ALLOW_EMPTY_PASSWORD: "yes" MYSQL_DATABASE: "wordpress" MYSQL_USER: "weixiang98" MYSQL_PASSWORD: "weixiang" # 向容器传递参数【覆盖容器的CMD指令,若容器是ENTRYPOINT则下面就是传参。】 command: ["--character-set-server=utf8", "--collation-server=utf8_bin", "--default-authentication-plugin=mysql_native_password"] [root@elk92 02-mysql]# [root@elk92 02-mysql]# 3.启动服务 [root@elk92 02-mysql]# docker-compose up -d [+] Building 0.0s (0/0) docker:default [+] Running 2/2 ✔ Network 02-mysql_default Created 0.1s ✔ Container db Started 0.0s [root@elk92 02-mysql]# [root@elk92 02-mysql]# docker-compose ps NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS db mysql:8.0.36-oracle "docker-entrypoint.s…" mysql_server 2 seconds ago Up 2 seconds 0.0.0.0:3306->3306/tcp, :::3306->3306/tcp, 33060/tcp [root@elk92 02-mysql]# [root@elk92 02-mysql]# 4.测试验证 [root@elk92 02-mysql]# docker-compose exec -it mysql_server mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 8 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> SHOW DATABASES; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | | wordpress | +--------------------+ 5 rows in set (0.00 sec) mysql> mysql> SELECT user,host,plugin FROM mysql.user; +------------------+-----------+-----------------------+ | user | host | plugin | +------------------+-----------+-----------------------+ | weixiang98 | % | mysql_native_password | | root | % | mysql_native_password | | mysql.infoschema | localhost | caching_sha2_password | | mysql.session | localhost | caching_sha2_password | | mysql.sys | localhost | caching_sha2_password | | root | localhost | mysql_native_password | +------------------+-----------+-----------------------+ 6 rows in set (0.00 sec) mysql> 5.停止服务 [root@elk92 02-mysql]# docker-compose down -t 0 [+] Running 2/2 ✔ Container db Removed 0.2s ✔ Network 02-mysql_default Removed 0.2s [root@elk92 02-mysql]#
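补充:除了在environment中逐条列出,也可以把变量统一放到文件里,用env_file引用(compose原生支持的写法,文件名mysql.env为假设):
bash
# mysql.env 文件内容(示意)
MYSQL_ALLOW_EMPTY_PASSWORD=yes
MYSQL_DATABASE=wordpress
MYSQL_USER=weixiang98
MYSQL_PASSWORD=weixiang

# docker-compose.yaml 中引用(片段)
services:
  mysql_server:
    image: mysql:8.0.36-oracle
    env_file:
      - mysql.env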
3、docker-compose服务引用存储卷
bash
# 清理存储卷 [root@elk92 02-mysql]# docker volume prune -f 1.编写docker-compose文件 [root@elk92 03-mysql-volumes]# cat docker-compose.yaml services: mysql_server: image: mysql:8.0.36-oracle container_name: db ports: - "3306:3306" environment: MYSQL_ALLOW_EMPTY_PASSWORD: "yes" MYSQL_DATABASE: "wordpress" MYSQL_USER: "weixiang98" MYSQL_PASSWORD: "weixiang" command: ["--character-set-server=utf8", "--collation-server=utf8_bin", "--default-authentication-plugin=mysql_native_password"] # 定义存储卷 volumes: # 存储卷名称:容器挂载点:读写模式(rw,ro) - db:/var/lib/mysql:rw - web:/weixiang/weixiang98:rw # 定义存储卷 volumes: # 定义存储卷名称,便于上面的services进行调用 db: # 指定docker 存储卷的名称,如果不指定,会用目录名称_存储券名称 name: mysql80 web: name: web-server [root@elk92 03-mysql-volumes]# 2.启动服务 [root@elk92 03-mysql-volumes]# docker-compose up -d [+] Building 0.0s (0/0) docker:default [+] Running 4/4 ✔ Network 03-mysql-volumes_default Created 0.1s ✔ Volume "mysql80" Created 0.0s ✔ Volume "web-server" Created 0.0s ✔ Container db Started 0.0s [root@elk92 03-mysql-volumes]# 3.查看存储卷 [root@elk92 03-mysql-volumes]# docker volume ls DRIVER VOLUME NAME local mysql80 local web-server [root@elk92 03-mysql-volumes]# 4.移除服务时,可以使用-v选项删除存储卷 [root@elk92 03-mysql-volumes]# docker-compose down -t 0 -v [+] Running 4/4 ✔ Container db Removed 0.3s ✔ Volume web-server Removed 0.0s ✔ Volume mysql80 Removed 0.0s ✔ Network 03-mysql-volumes_default Removed 0.2s [root@elk92 03-mysql-volumes]# 5.再次查看验证发现存储卷被删除 [root@elk92 03-mysql-volumes]# docker volume ls DRIVER VOLUME NAME [root@elk92 03-mysql-volumes]#
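补充:除了命名存储卷,volumes还支持直接绑定宿主机目录(bind mount),适合挂载配置文件等场景(以下片段为示意,宿主机路径为假设):
bash
# docker-compose.yaml 片段(示意)
services:
  mysql_server:
    image: mysql:8.0.36-oracle
    volumes:
      # 命名卷:存放数据目录
      - db:/var/lib/mysql:rw
      # 绑定挂载:宿主机的配置目录,以只读方式挂载
      - ./conf.d:/etc/mysql/conf.d:ro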
4、docker-compose服务引用自定义网络
bash
1.编写docker-compose文件 [root@elk92 04-mysql-volume-network]# cat docker-compose.yaml services: mysql_server: image: mysql:8.0.36-oracle container_name: db ports: - "3306:3306" environment: MYSQL_ALLOW_EMPTY_PASSWORD: "yes" MYSQL_DATABASE: "wordpress" MYSQL_USER: "weixiang98" MYSQL_PASSWORD: "weixiang" command: ["--character-set-server=utf8", "--collation-server=utf8_bin", "--default-authentication-plugin=mysql_native_password"] volumes: - db:/var/lib/mysql:rw - web:/weixiang/weixiang98:rw # 指定服务使用的网络 networks: - mysql - redis # 自定义docker的网络 networks: # 自定义网络 mysql: # 指定docker在创建自定义网络的名称 name: mysql-net # 声明网络是外部网络,意思是该网络不通过docker-compose创建或删除。 # 说白了,该网络需要你自己提前创建好。 # external: true # 指定网络的驱动 driver: bridge # 定义IP配置 ipam: # 指定IPAM 的驱动,一般使用默认值。 driver: default # 定义具体的配置信息 config: # 指定子网 - subnet: 172.28.0.0/16 # 指定IP地址的分配范围 ip_range: 172.28.100.0/24 # 指定网关地址 gateway: 172.28.100.254 # 如果定义多个网络,如果Service没有引用,则不会创建该网络 redis: name: redis-net mongodb: name: mongodb-net volumes: db: name: mysql80 web: name: web-server [root@elk92 04-mysql-volume-network]# 2.创建服务测试 [root@elk92 04-mysql-volume-network]# docker-compose up -d [+] Building 0.0s (0/0) docker:default [+] Running 5/5 ✔ Network mysql-net Created 0.1s ✔ Network redis-net Created 0.1s ✔ Volume "web-server" Created 0.0s ✔ Volume "mysql80" Created 0.0s ✔ Container db Started 0.0s [root@elk92 04-mysql-volume-network]# 3.验证网络 [root@elk92 04-mysql-volume-network]# docker network ls NETWORK ID NAME DRIVER SCOPE 229d1a22723b bridge bridge local 3f773caea859 host host local 5b2221ed06ae mysql-net bridge local 0f118e21048e none null local d257acbf283e redis-net bridge local [root@elk92 04-mysql-volume-network]# 4.删除服务 [root@elk92 04-mysql-volume-network]# docker-compose down -t 0 -v [+] Running 5/5 ✔ Container db Removed 0.4s ✔ Volume web-server Removed 0.0s ✔ Volume mysql80 Removed 0.0s ✔ Network redis-net Removed 0.2s ✔ Network mysql-net Removed 0.4s [root@elk92 04-mysql-volume-network]# 5.观察网络是否自动删除 [root@elk92 04-mysql-volume-network]# docker network ls NETWORK ID NAME DRIVER SCOPE 229d1a22723b bridge bridge local 3f773caea859 host host local 0f118e21048e none null local [root@elk92 04-mysql-volume-network]#
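补充:在自定义网络中还可以为服务分配固定IP(前提是该网络定义了对应子网),写法如下(IP地址仅为示例,需落在mysql-net的子网内):
bash
# docker-compose.yaml 片段(示意)
services:
  mysql_server:
    image: mysql:8.0.36-oracle
    networks:
      mysql:
        ipv4_address: 172.28.100.10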
5、docker-compose实现服务依赖之wordpress
bash
1.导入wordpress镜像 1.1 准备镜像 [root@elk91 ~]# scp weixiang-wordpress-v6.7.1-php8.1-apache.tar.gz 10.0.0.92:~ 1.2 导入镜像 [root@elk92 ~]# docker load < weixiang-wordpress-v6.7.1-php8.1-apache.tar.gz 2.编写docker-compose文件 [root@elk92 05-multiple-services-wordpress]# cat docker-compose.yaml services: mysql_server: image: mysql:8.0.36-oracle container_name: db ports: - "3306:3306" environment: MYSQL_ALLOW_EMPTY_PASSWORD: "yes" MYSQL_DATABASE: "wordpress" MYSQL_USER: "weixiang98" MYSQL_PASSWORD: "weixiang" command: ["--character-set-server=utf8", "--collation-server=utf8_bin", "--default-authentication-plugin=mysql_native_password"] volumes: - db:/var/lib/mysql:rw networks: - wordpress # 指定容器的重启策略 restart: always wordpress: # 依赖的服务要优先启动 depends_on: - mysql_server image: wordpress:6.7.1-php8.1-apache container_name: wp # 指定容器的重启策略 restart: unless-stopped ports: - "81:80" environment: - WORDPRESS_DB_HOST=mysql_server:3306 - WORDPRESS_DB_USER=weixiang98 - WORDPRESS_DB_PASSWORD=weixiang - WORDPRESS_DB_NAME=wordpress volumes: - wp:/var/www/html networks: - wordpress networks: wordpress: name: wordpress-net driver: bridge ipam: driver: default config: - subnet: 172.28.0.0/16 ip_range: 172.28.100.0/24 gateway: 172.28.100.254 volumes: db: name: mysql80 wp: name: wordpress [root@elk92 05-multiple-services-wordpress]# 3.运行容器 [root@elk92 05-multiple-services-wordpress]# docker-compose up -d [+] Building 0.0s (0/0) docker:default [+] Running 5/5 ✔ Network wordpress-net Created 0.1s ✔ Volume "mysql80" Created 0.0s ✔ Volume "wordpress" Created 0.0s ✔ Container db Started 0.0s ✔ Container wp Started 0.0s [root@elk92 05-multiple-services-wordpress]# [root@elk92 05-multiple-services-wordpress]# [root@elk92 05-multiple-services-wordpress]# docker-compose ps NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS db mysql:8.0.36-oracle "docker-entrypoint.s…" mysql_server 2 seconds ago Up 2 seconds 0.0.0.0:3306->3306/tcp, :::3306->3306/tcp, 33060/tcp wp wordpress:6.7.1-php8.1-apache "docker-entrypoint.s…" wordpress 2 seconds ago Up 1 second 0.0.0.0:81->80/tcp, :::81->80/tcp [root@elk92 05-multiple-services-wordpress]# [root@elk92 05-multiple-services-wordpress]# 4.访问测试 http://10.0.0.92:81/ 5.删除服务 [root@elk92 05-multiple-services-wordpress]# docker-compose down -t 0 -v [+] Running 5/5 ✔ Container wp Removed 0.2s ✔ Container db Removed 0.2s ✔ Volume mysql80 Removed 0.1s ✔ Volume wordpress Removed 0.1s ✔ Network wordpress-net Removed 0.2s [root@elk92 05-multiple-services-wordpress]#
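补充:depends_on的简单写法只保证"先启动",并不保证MySQL已经可以接受连接;可以结合healthcheck与condition: service_healthy,让wordpress等MySQL健康后再启动(compose支持的写法,以下为补充示意):
bash
# docker-compose.yaml 片段(示意)
services:
  mysql_server:
    image: mysql:8.0.36-oracle
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
      interval: 5s
      timeout: 3s
      retries: 30

  wordpress:
    image: wordpress:6.7.1-php8.1-apache
    depends_on:
      mysql_server:
        condition: service_healthy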
6、docker-compose部署ES和kibana环境及env变量文件使用
bash
参考链接: https://www.elastic.co/guide/en/elasticsearch/reference/8.18/docker.html 1.导入镜像 1.1 拷贝镜像 [root@elk91 ~]# scp weixiang-elasticsearch-v8.18.3.tar.gz weixiang-kibana-v8.18.3.tar.gz 10.0.0.92:~ 1.2 导入镜像 [root@elk92 ~]# docker load -i weixiang-elasticsearch-v8.18.3.tar.gz [root@elk92 ~]# docker load -i weixiang-kibana-v8.18.3.tar.gz 2.准备文件 [root@elk92 06-multiple-es-cluster]# ll total 24 drwxr-xr-x 2 root root 4096 Jul 4 14:54 ./ drwxr-xr-x 8 root root 4096 Jul 4 14:42 ../ -rw-r--r-- 1 root root 8284 Jul 4 14:54 docker-compose.yaml -rw-r--r-- 1 root root 757 Jul 4 14:49 .env [root@elk92 06-multiple-es-cluster]# [root@elk92 06-multiple-es-cluster]# [root@elk92 06-multiple-es-cluster]# cat .env # 配置环境变量文件 # Password for the 'elastic' user (at least 6 characters) ELASTIC_PASSWORD=yinzhengjie # Password for the 'kibana_system' user (at least 6 characters) KIBANA_PASSWORD=yinzhengjie # Version of Elastic products STACK_VERSION=8.18.3 # Set the cluster name CLUSTER_NAME=docker-cluster # Set to 'basic' or 'trial' to automatically start the 30-day trial LICENSE=basic #LICENSE=trial # Port to expose Elasticsearch HTTP API to the host # ES_PORT=9200 ES_PORT=127.0.0.1:19200 # Port to expose Kibana to the host # KIBANA_PORT=5601 KIBANA_PORT=15601 #KIBANA_PORT=80 # Increase or decrease based on the available host memory (in bytes) MEM_LIMIT=1073741824 # Project namespace (defaults to the current folder name if not set) #COMPOSE_PROJECT_NAME=myproject [root@elk92 06-multiple-es-cluster]# [root@elk92 06-multiple-es-cluster]# [root@elk92 06-multiple-es-cluster]# [root@elk92 06-multiple-es-cluster]# cat docker-compose.yaml version: "2.2" services: setup: # 初始化容器 (只运行一次) image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION} volumes: - certs:/usr/share/elasticsearch/config/certs # 指定运行用户 user: "0" command: > bash -c ' if [ x${ELASTIC_PASSWORD} == x ]; then echo "Set the ELASTIC_PASSWORD environment variable in the .env file"; exit 1; elif [ x${KIBANA_PASSWORD} == x ]; then echo "Set the KIBANA_PASSWORD environment variable in the .env file"; exit 1; fi; if [ ! -f config/certs/ca.zip ]; then echo "Creating CA"; bin/elasticsearch-certutil ca --silent --pem -out config/certs/ca.zip; unzip config/certs/ca.zip -d config/certs; fi; if [ ! -f config/certs/certs.zip ]; then echo "Creating certs"; echo -ne \ "instances:\n"\ " - name: es01\n"\ " dns:\n"\ " - es01\n"\ " - localhost\n"\ " ip:\n"\ " - 127.0.0.1\n"\ " - name: es02\n"\ " dns:\n"\ " - es02\n"\ " - localhost\n"\ " ip:\n"\ " - 127.0.0.1\n"\ " - name: es03\n"\ " dns:\n"\ " - es03\n"\ " - localhost\n"\ " ip:\n"\ " - 127.0.0.1\n"\ > config/certs/instances.yml; bin/elasticsearch-certutil cert --silent --pem -out config/certs/certs.zip --in config/certs/instances.yml --ca-cert config/certs/ca/ca.crt --ca-key config/certs/ca/ca.key; unzip config/certs/certs.zip -d config/certs; fi; echo "Setting file permissions" chown -R root:root config/certs; find . -type d -exec chmod 750 \{\} \;; find . 
-type f -exec chmod 640 \{\} \;; echo "Waiting for Elasticsearch availability"; until curl -s --cacert config/certs/ca/ca.crt https://es01:9200 | grep -q "missing authentication credentials"; do sleep 30; done; echo "Setting kibana_system password"; until curl -s -X POST --cacert config/certs/ca/ca.crt -u "elastic:${ELASTIC_PASSWORD}" -H "Content-Type: application/json" https://es01:9200/_security/user/kibana_system/_password -d "{\"password\":\"${KIBANA_PASSWORD}\"}" | grep -q "^{}"; do sleep 10; done; echo "All done!"; ' healthcheck: # 健康检查 test: ["CMD-SHELL", "[ -f config/certs/es01/es01.crt ]"] interval: 1s timeout: 5s retries: 120 es01: depends_on: # es01依赖setup成功完成 setup: condition: service_healthy image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION} volumes: - certs:/usr/share/elasticsearch/config/certs - esdata01:/usr/share/elasticsearch/data ports: - ${ES_PORT}:9200 environment: - node.name=es01 - cluster.name=${CLUSTER_NAME} - cluster.initial_master_nodes=es01,es02,es03 - discovery.seed_hosts=es02,es03 - ELASTIC_PASSWORD=${ELASTIC_PASSWORD} - bootstrap.memory_lock=true - xpack.security.enabled=true - xpack.security.http.ssl.enabled=true - xpack.security.http.ssl.key=certs/es01/es01.key - xpack.security.http.ssl.certificate=certs/es01/es01.crt - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt - xpack.security.transport.ssl.enabled=true - xpack.security.transport.ssl.key=certs/es01/es01.key - xpack.security.transport.ssl.certificate=certs/es01/es01.crt - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt - xpack.security.transport.ssl.verification_mode=certificate - xpack.license.self_generated.type=${LICENSE} - xpack.ml.use_auto_machine_memory_percent=true # 是否配置内存限制 mem_limit: ${MEM_LIMIT} # 配置系统限制 ulimits: memlock: soft: -1 # 内存锁定软限制 (无限制) hard: -1 # 内存锁定硬限制 (无限制) # 配置健康检查 healthcheck: test: [ "CMD-SHELL", "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'", ] interval: 10s timeout: 10s retries: 120 es02: depends_on: # es02依赖es01健康 - es01 image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION} volumes: - certs:/usr/share/elasticsearch/config/certs - esdata02:/usr/share/elasticsearch/data environment: - node.name=es02 - cluster.name=${CLUSTER_NAME} - cluster.initial_master_nodes=es01,es02,es03 - discovery.seed_hosts=es01,es03 - ELASTIC_PASSWORD=${ELASTIC_PASSWORD} - bootstrap.memory_lock=true - xpack.security.enabled=true - xpack.security.http.ssl.enabled=true - xpack.security.http.ssl.key=certs/es02/es02.key - xpack.security.http.ssl.certificate=certs/es02/es02.crt - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt - xpack.security.transport.ssl.enabled=true - xpack.security.transport.ssl.key=certs/es02/es02.key - xpack.security.transport.ssl.certificate=certs/es02/es02.crt - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt - xpack.security.transport.ssl.verification_mode=certificate - xpack.license.self_generated.type=${LICENSE} - xpack.ml.use_auto_machine_memory_percent=true mem_limit: ${MEM_LIMIT} ulimits: memlock: soft: -1 hard: -1 healthcheck: test: [ "CMD-SHELL", "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'", ] interval: 10s timeout: 10s retries: 120 es03: depends_on: # es03依赖es02健康 - es02 image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION} volumes: - certs:/usr/share/elasticsearch/config/certs - 
esdata03:/usr/share/elasticsearch/data environment: - node.name=es03 - cluster.name=${CLUSTER_NAME} - cluster.initial_master_nodes=es01,es02,es03 - discovery.seed_hosts=es01,es02 - ELASTIC_PASSWORD=${ELASTIC_PASSWORD} - bootstrap.memory_lock=true - xpack.security.enabled=true - xpack.security.http.ssl.enabled=true - xpack.security.http.ssl.key=certs/es03/es03.key - xpack.security.http.ssl.certificate=certs/es03/es03.crt - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt - xpack.security.transport.ssl.enabled=true - xpack.security.transport.ssl.key=certs/es03/es03.key - xpack.security.transport.ssl.certificate=certs/es03/es03.crt - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt - xpack.security.transport.ssl.verification_mode=certificate - xpack.license.self_generated.type=${LICENSE} - xpack.ml.use_auto_machine_memory_percent=true mem_limit: ${MEM_LIMIT} ulimits: memlock: soft: -1 hard: -1 healthcheck: test: [ "CMD-SHELL", "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'", ] interval: 10s timeout: 10s retries: 120 kibana: depends_on: # 所有三个ES节点 (es01, es02, es03) 必须健康。 es01: condition: service_healthy es02: condition: service_healthy es03: condition: service_healthy image: docker.elastic.co/kibana/kibana:${STACK_VERSION} volumes: - certs:/usr/share/kibana/config/certs - kibanadata:/usr/share/kibana/data ports: - ${KIBANA_PORT}:5601 environment: - SERVERNAME=kibana - ELASTICSEARCH_HOSTS=https://es01:9200 - ELASTICSEARCH_USERNAME=kibana_system - ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD} - ELASTICSEARCH_SSL_CERTIFICATEAUTHORITIES=config/certs/ca/ca.crt mem_limit: ${MEM_LIMIT} healthcheck: test: [ "CMD-SHELL", "curl -s -I http://localhost:5601 | grep -q 'HTTP/1.1 302 Found'", ] interval: 10s timeout: 10s retries: 120 volumes: certs: driver: local # 数据直接存储在 Docker 宿主机上 esdata01: driver: local esdata02: driver: local esdata03: driver: local kibanadata: driver: local [root@elk92 06-multiple-es-cluster]# 3.启动服务 [root@elk92 06-multiple-es-cluster]# docker-compose up -d [+] Building 0.0s (0/0) docker:default [+] Running 11/11 ✔ Network 06-multiple-es-cluster_default Created 0.1s ✔ Volume "06-multiple-es-cluster_esdata03" Created 0.0s ✔ Volume "06-multiple-es-cluster_kibanadata" Created 0.0s ✔ Volume "06-multiple-es-cluster_certs" Created 0.0s ✔ Volume "06-multiple-es-cluster_esdata01" Created 0.0s ✔ Volume "06-multiple-es-cluster_esdata02" Created 0.0s ✔ Container 06-multiple-es-cluster-setup-1 Healthy 0.2s ✔ Container 06-multiple-es-cluster-es01-1 Healthy 0.0s ✔ Container 06-multiple-es-cluster-es02-1 Healthy 0.0s ✔ Container 06-multiple-es-cluster-es03-1 Healthy 0.0s ✔ Container 06-multiple-es-cluster-kibana-1 Started 0.0s [root@elk92 06-multiple-es-cluster]# [root@elk92 06-multiple-es-cluster]# [root@elk92 06-multiple-es-cluster]# docker-compose ps NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS 06-multiple-es-cluster-es01-1 docker.elastic.co/elasticsearch/elasticsearch:8.18.3 "/bin/tini -- /usr/l…" es01 2 minutes ago Up About a minute (healthy) 9300/tcp, 127.0.0.1:19200->9200/tcp 06-multiple-es-cluster-es02-1 docker.elastic.co/elasticsearch/elasticsearch:8.18.3 "/bin/tini -- /usr/l…" es02 2 minutes ago Up About a minute (healthy) 9200/tcp, 9300/tcp 06-multiple-es-cluster-es03-1 docker.elastic.co/elasticsearch/elasticsearch:8.18.3 "/bin/tini -- /usr/l…" es03 2 minutes ago Up About a minute (healthy) 9200/tcp, 9300/tcp 06-multiple-es-cluster-kibana-1 docker.elastic.co/kibana/kibana:8.18.3 "/bin/tini 
-- /usr/l…" kibana 2 minutes ago Up 23 seconds (health: starting) 0.0.0.0:15601->5601/tcp, :::15601->5601/tcp [root@elk92 06-multiple-es-cluster]# 4.访问测试 http://10.0.0.92:15601/ 5.修改kibana的语言 [root@elk92 06-multiple-es-cluster]# docker-compose exec -it kibana bash kibana@4914a063e7e1:~$ kibana@4914a063e7e1:~$ echo >> config/kibana.yml kibana@4914a063e7e1:~$ kibana@4914a063e7e1:~$ echo i18n.locale: "zh-CN" >> config/kibana.yml kibana@4914a063e7e1:~$ kibana@4914a063e7e1:~$ cat config/kibana.yml # # ** THIS IS AN AUTO-GENERATED FILE ** # # Default Kibana configuration for docker target server.host: "0.0.0.0" server.shutdownTimeout: "5s" elasticsearch.hosts: [ "http://elasticsearch:9200" ] monitoring.ui.container.elasticsearch.enabled: true i18n.locale: "zh-CN" kibana@4914a063e7e1:~$ [root@elk92 06-multiple-es-cluster]# docker-compose restart kibana # 重启kibana服务 [+] Restarting 1/1 ✔ Container 06-multiple-es-cluster-kibana-1 Started 2.1s [root@elk92 06-multiple-es-cluster]# [root@elk92 06-multiple-es-cluster]# docker-compose ps NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS 06-multiple-es-cluster-es01-1 docker.elastic.co/elasticsearch/elasticsearch:8.18.3 "/bin/tini -- /usr/l…" es01 10 minutes ago Up 10 minutes (healthy) 9300/tcp, 127.0.0.1:19200->9200/tcp 06-multiple-es-cluster-es02-1 docker.elastic.co/elasticsearch/elasticsearch:8.18.3 "/bin/tini -- /usr/l…" es02 10 minutes ago Up 10 minutes (healthy) 9200/tcp, 9300/tcp 06-multiple-es-cluster-es03-1 docker.elastic.co/elasticsearch/elasticsearch:8.18.3 "/bin/tini -- /usr/l…" es03 10 minutes ago Up 10 minutes (healthy) 9200/tcp, 9300/tcp 06-multiple-es-cluster-kibana-1 docker.elastic.co/kibana/kibana:8.18.3 "/bin/tini -- /usr/l…" kibana 10 minutes ago Up 2 seconds (health: starting) 0.0.0.0:15601->5601/tcp, :::15601->5601/tcp [root@elk92 06-multiple-es-cluster]# 6.停止服务 [root@elk92 06-multiple-es-cluster]# docker-compose down -t 0 -v [+] Running 11/11 ✔ Container 06-multiple-es-cluster-kibana-1 Removed 0.4s ✔ Container 06-multiple-es-cluster-es03-1 Removed 0.5s ✔ Container 06-multiple-es-cluster-es02-1 Removed 0.3s ✔ Container 06-multiple-es-cluster-es01-1 Removed 0.3s ✔ Container 06-multiple-es-cluster-setup-1 Removed 0.0s ✔ Volume 06-multiple-es-cluster_esdata01 Removed 0.1s ✔ Volume 06-multiple-es-cluster_esdata03 Removed 0.0s ✔ Volume 06-multiple-es-cluster_esdata02 Removed 0.1s ✔ Volume 06-multiple-es-cluster_kibanadata Removed 0.1s ✔ Volume 06-multiple-es-cluster_certs Removed 0.1s ✔ Network 06-multiple-es-cluster_default Removed 0.2s [root@elk92 06-multiple-es-cluster]#
7、Services引用secrets实战案例
bash
secrets 部分实现了 安全的敏感信息管理,是 Docker 的安全特性之一。它的核心作用是 将敏感数据(如密码)与容器镜像和 Compose文 件分离,避免明文存储敏感信息 - Services引用secrets实战案例 参考链接: https://docs.docker.com/compose/how-tos/use-secrets/ 1.编写docker-compose文件 [root@elk92 07-multiple-wordpress-secrets]# ll total 20 drwxr-xr-x 2 root root 4096 Jul 4 16:17 ./ drwxr-xr-x 9 root root 4096 Jul 4 15:54 ../ -rw-r--r-- 1 root root 12 Jul 4 16:11 db_password.txt -rw-r--r-- 1 root root 27 Jul 4 15:58 db_root_password.txt -rw-r--r-- 1 root root 1799 Jul 4 16:17 docker-compose.yaml [root@elk92 07-multiple-wordpress-secrets]# [root@elk92 07-multiple-wordpress-secrets]# [root@elk92 07-multiple-wordpress-secrets]# cat docker-compose.yaml services: mysql_server: image: mysql:8.0.36-oracle container_name: db ports: - "3306:3306" environment: # MYSQL_ALLOW_EMPTY_PASSWORD: "yes" MYSQL_ROOT_PASSWORD_FILE: /run/secrets/db_root_password MYSQL_DATABASE: "wordpress" MYSQL_USER: "weixiang98" #MYSQL_PASSWORD: "weixiang" MYSQL_PASSWORD_FILE: /run/secrets/db_password command: ["--character-set-server=utf8", "--collation-server=utf8_bin", "--default-authentication-plugin=mysql_native_password"] volumes: - db:/var/lib/mysql:rw networks: - wordpress restart: always # 定义服务引用secrets资源,并将其挂载到"/run/secrets/"目录下。 secrets: - db_root_password - db_password wordpress: depends_on: - mysql_server image: wordpress:6.7.1-php8.1-apache container_name: wp restart: unless-stopped ports: - "81:80" environment: - WORDPRESS_DB_HOST=mysql_server:3306 - WORDPRESS_DB_USER=weixiang98 # - WORDPRESS_DB_PASSWORD=weixiang - WORDPRESS_DB_PASSWORD_FILE=/run/secrets/db_password - WORDPRESS_DB_NAME=wordpress volumes: - wp:/var/www/html networks: - wordpress secrets: - db_password networks: wordpress: name: wordpress-net driver: bridge ipam: driver: default config: - subnet: 172.28.0.0/16 ip_range: 172.28.100.0/24 gateway: 172.28.100.254 # 定义secrets secrets: # 定义secret的名称便于Service进行引用 db_password: # 指定secret的文件名称。直接将变量写到该文件中即可。但是经过测试并不起作用。 file: db_password.txt db_root_password: file: db_root_password.txt volumes: db: name: mysql80 wp: name: wordpress [root@elk92 07-multiple-wordpress-secrets]# [root@elk92 07-multiple-wordpress-secrets]# [root@elk92 07-multiple-wordpress-secrets]# cat db_password.txt yinzhengjie [root@elk92 07-multiple-wordpress-secrets]# [root@elk92 07-multiple-wordpress-secrets]# [root@elk92 07-multiple-wordpress-secrets]# cat db_root_password.txt weixiang [root@elk92 07-multiple-wordpress-secrets]# 2.测试验证 [root@elk92 07-multiple-wordpress-secrets]# docker-compose exec -it mysql_server bash bash-4.4# ls /run/secrets/ db_password db_root_password bash-4.4# bash-4.4# cat /run/secrets/db_password yinzhengjie bash-4.4# bash-4.4# exit [root@elk92 07-multiple-wordpress-secrets]# [root@elk92 07-multiple-wordpress-secrets]# [root@elk92 07-multiple-wordpress-secrets]# docker-compose exec -it wordpress bash root@cbe02fc95b05:/var/www/html# root@cbe02fc95b05:/var/www/html# ls /run/secrets/ db_password root@cbe02fc95b05:/var/www/html# root@cbe02fc95b05:/var/www/html# cat /run/secrets/db_password yinzhengjie root@cbe02fc95b05:/var/www/html# 温馨提示: 尽管我们传递变量到容器中,但是mysql容器并没有重置root密码。可能是MySQL官方人员在写root密码是没有修改基于文件的注入时的变量。 后期解决思路,可以手动修改密码,进入容器后去测试,但是就得不尝试了,因为需要运维人员手动干预。
8、docker-compose部署zabbix案例
bash
1.导入镜像 1.1 拷贝镜像 [root@elk91 ~]# scp weixiang-zabbix-* 10.0.0.92:~ 1.2 导入镜像 [root@elk92 ~]# for i in `ls -1 weixiang-zabbix-*`;do docker load -i $i;done 2.编写docker-compose文件 [root@elk92 08-multiple-zabbix]# cat docker-compose.yaml services: mysql-server: image: mysql:8.0.36-oracle container_name: mysql-server restart: unless-stopped environment: MYSQL_ROOT_PASSWORD: "123456" MYSQL_DATABASE: zabbix MYSQL_USER: weixiang98 MYSQL_PASSWORD: weixiang networks: - zabbix-net command: ["--character-set-server=utf8", "--collation-server=utf8_bin", "--default-authentication-plugin=mysql_native_password"] zabbix-java-gateway: container_name: zabbix-java-gateway image: zabbix/zabbix-java-gateway:alpine-7.2-latest restart: unless-stopped networks: - zabbix-net zabbix-server: container_name: zabbix-server-mysql # 依赖哪个服务 depends_on: - mysql-server image: zabbix/zabbix-server-mysql:alpine-7.2-latest restart: unless-stopped environment: DB_SERVER_HOST: mysql-server MYSQL_DATABASE: zabbix MYSQL_USER: weixiang98 MYSQL_PASSWORD: weixiang MYSQL_ROOT_PASSWORD: "123456" ZBX_JAVAGATEWAY: zabbix-java-gateway networks: - zabbix-net ports: - "10051:10051" zabbix-web-nginx-mysql: container_name: zabbix-web-nginx-mysql depends_on: - zabbix-server image: zabbix/zabbix-web-nginx-mysql:alpine-7.2-latest ports: - "8080:8080" restart: unless-stopped environment: DB_SERVER_HOST: mysql-server MYSQL_DATABASE: zabbix MYSQL_USER: weixiang98 MYSQL_PASSWORD: weixiang MYSQL_ROOT_PASSWORD: "123456" networks: - zabbix-net networks: zabbix-net: name: yinzhengjie-zabbix ipam: driver: default config: - subnet: 172.20.100.0/16 gateway: 172.20.100.254 [root@elk92 08-multiple-zabbix]# 3.测试验证 [root@elk92 08-multiple-zabbix]# docker-compose up -d [+] Building 0.0s (0/0) docker:default [+] Running 5/5 ✔ Network yinzhengjie-zabbix Created 0.1s ✔ Container zabbix-java-gateway Started 0.0s ✔ Container mysql-server Started 0.0s ✔ Container zabbix-server-mysql Started 0.0s ✔ Container zabbix-web-nginx-mysql Started 0.0s [root@elk92 08-multiple-zabbix]# [root@elk92 08-multiple-zabbix]# docker-compose ps NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS mysql-server mysql:8.0.36-oracle "docker-entrypoint.s…" mysql-server 7 seconds ago Up 6 seconds 3306/tcp, 33060/tcp zabbix-java-gateway zabbix/zabbix-java-gateway:alpine-7.2-latest "docker-entrypoint.s…" zabbix-java-gateway 7 seconds ago Up 6 seconds 10052/tcp zabbix-server-mysql zabbix/zabbix-server-mysql:alpine-7.2-latest "/usr/bin/docker-ent…" zabbix-server 7 seconds ago Up 6 seconds 0.0.0.0:10051->10051/tcp, :::10051->10051/tcp zabbix-web-nginx-mysql zabbix/zabbix-web-nginx-mysql:alpine-7.2-latest "docker-entrypoint.sh" zabbix-web-nginx-mysql 7 seconds ago Up 5 seconds (health: starting) 0.0.0.0:8080->8080/tcp, :::8080->8080/tcp, 8443/tcp [root@elk92 08-multiple-zabbix]# [root@elk92 08-multiple-zabbix]# 4.访问webUI http://10.0.0.92:8080/ - 今日作业: - 完成课堂的所有练习并整理思维导图; - 使用docker-compose部署3个服务,要求如下: - 启动一个nginx,2个tomcat; - nginx代理2个tomcat的请求 - 将3个容器加入到同一个网络中,网段为: 172.31.100.0/24 - 要求windows能够访问到nginx的页面,nginx调度到后端的tomcat实例,基于rr算法; - 扩展作业: - 使用docker-compose部署minio集群; - 使用docker-compose部署RabbitMQ集群; - 在互联网中找一个JAVA,Golang前后端分离的项目使用docker-compose部署出来;
6、docker-compose编译镜像
bash
1.环境准备 [root@elk92 09-build-images]# ll total 20 drwxr-xr-x 2 root root 4096 Jul 7 09:15 ./ drwxr-xr-x 11 root root 4096 Jul 7 09:09 ../ -rw-r--r-- 1 root root 374 Jul 7 09:15 docker-compose.yaml -rw-r--r-- 1 root root 71 Jul 7 09:09 f1.dockerfile -rw-r--r-- 1 root root 74 Jul 7 09:15 f2.dockerfile # 创建配置文件 [root@elk92 09-build-images]# cat docker-compose.yaml services: web01: image: www.weixiang.com/apps/web:v0.1 # 定义编译的相关参数 build: # 指定编译的上下文环境路径 context: . # 指定Dockerfile文件 dockerfile: f1.dockerfile web02: image: www.weixiang.com/apps/web:v0.2 build: context: . dockerfile: f2.dockerfile command: ["sleep","10d"] [root@elk92 09-build-images]# # 创建dockfile文件 [root@elk92 09-build-images]# cat f1.dockerfile FROM alpine RUN mkdir /opt/old CMD ["tail","-f","/etc/hosts"] [root@elk92 09-build-images]# cat f2.dockerfile FROM alpine RUN mkdir weixiang CMD ["tail","-f","/etc/hostname"] 2.编译镜像 [root@elk92 09-build-images]# docker-compose build [root@elk92 09-build-images]# docker image ls www.weixiang.com/apps/web REPOSITORY TAG IMAGE ID CREATED SIZE www.weixiang.com/apps/web v0.2 5d0e12f29304 10 seconds ago 7.8MB www.weixiang.com/apps/web v0.1 0d0edfc43d1f 10 seconds ago 7.8MB [root@elk92 09-build-images]# 3.启动测试 [root@elk92 09-build-images]# docker-compose up -d [+] Building 0.0s (0/0) docker:default [+] Running 3/3 ✔ Network 09-build-images_default Created 0.1s ✔ Container 09-build-images-web02-1 Started 0.0s ✔ Container 09-build-images-web01-1 Started 0.0s [root@elk92 09-build-images]# [root@elk92 09-build-images]# docker-compose ps NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS 09-build-images-web01-1 www.weixiang.com/apps/web:v0.1 "tail -f /etc/hosts" web01 6 seconds ago Up 4 seconds 09-build-images-web02-1 www.weixiang.com/apps/web:v0.2 "sleep 10d" web02 6 seconds ago Up 4 seconds [root@elk92 09-build-images]#
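补充:修改Dockerfile后若要强制重新编译并让新镜像在启动时生效,可以使用以下选项(docker-compose内置参数):
bash
# 不使用缓存重新编译所有定义了build的服务
docker-compose build --no-cache

# 启动前先重新编译镜像
docker-compose up -d --build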
7、docker-registry私有镜像仓库部署
bash
1.什么是docker-registry docker-registry是docker官方开源的一款轻量级私有镜像仓库。 什么是私有仓库: 不对全世界开放的镜像仓库,一般是公司内部使用的私有镜像仓库。 为什么需要私有仓库: (1)安全性问题: 官方镜像仓库大家都可以访问,若都放在官方,那自然大家都能拿到你们公司的自建镜像。 (2)访问速度: 官方的服务器在国外,访问速度可想而知,有时候甚至官方网站都打不开。 主流的私有仓库: docker registry: 是一个轻量级的私有镜像仓库,基本上不占用内存,很适合学习环境中使用。 上传的镜像会被进行压缩处理,没有提供较好的WebUI,维护起来比较麻烦,尤其是删除镜像。 harbor: 基于官方的"docker registry"进行二次开发,是一个时候企业级使用的镜像仓库。 2.部署实战 2.1 拉取镜像 [root@elk93 ~]# docker pull registry:3.0.0 3.0.0: Pulling from library/registry f18232174bc9: Pull complete e5a9c19e7b9d: Pull complete e8a894506e86: Pull complete e1822bac1992: Pull complete b5da7f963a9e: Pull complete Digest: sha256:1fc7de654f2ac1247f0b67e8a459e273b0993be7d2beda1f3f56fbf1001ed3e7 Status: Downloaded newer image for registry:3.0.0 docker.io/library/registry:3.0.0 [root@elk93 ~]# SVIP导入镜像 [root@elk93 ~]# wget http://192.168.21.253/Resources/Docker/images/weixiang-registry-v3.0.0.tar.gz [root@elk93 ~]# docker load < weixiang-registry-v3.0.0.tar.gz 2.2 运行服务 [root@elk92 ~]# docker run -d --network host --restart=always --name weixiang-registry -v /var/lib/registry registry:3.0.0 228abce3cabed0fed08a07593443d7d45233732b0cebf0d4cccad2e4e6d3b3b9 # --network host # 使用宿主机的网络命名空间(容器直接共享主机网络) # --restart=always # 设置容器自动重启策略(无论退出状态如何都重启) # -v /var/lib/registry # 挂载数据卷(将宿主机目录绑定到容器内的/var/lib/registry) [root@elk93 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 228abce3cabe registry:3.0.0 "/entrypoint.sh /etc…" 6 seconds ago Up 5 seconds weixiang-registry [root@elk93 ~]# [root@elk93 ~]# ss -ntl | grep 5000 LISTEN 0 4096 *:5000 *:* [root@elk93 ~]# 3.基本使用 3.1 访问webUI http://10.0.0.92:5000/v2/_catalog 3.2 给镜像打标签 [root@elk93 ~]# docker image ls mysql REPOSITORY TAG IMAGE ID CREATED SIZE mysql 8.0.36-oracle f5f171121fa3 15 months ago 603MB [root@elk92 ~]# docker tag mysql:8.0.36-oracle elk92.weixiang.com:5000/weixiang-db/mysql:8.0.36-oracle # mysql:8.0.36-oracle 源镜像(本地已有的镜像) # elk92.weixiang.com:5000 私有仓库地址(域名+端口) # /weixiang-db/mysql 项目/命名空间路径(必须与推送路径一致) # :8.0.36-oracle 镜像标签 [root@elk92 ~]# docker image ls elk92.weixiang.com:5000/weixiang-db/mysql:8.0.36-oracle REPOSITORY TAG IMAGE ID CREATED SIZE elk92.weixiang.com:5000/weixiang-db/mysql 8.0.36-oracle f5f171121fa3 15 months ago 603MB 3.3 配置不安全的仓库地址【表示使用http协议】 [root@elk92 ~]# cat /etc/docker/daemon.json { "registry-mirrors": ["https://tuv7rqqq.mirror.aliyuncs.com","https://docker.mirrors.ustc.edu.cn/","https://hub-mirror.c.163.com/","https://reg-mirror.qiniu.com"], "insecure-registries": ["elk92.weixiang.com:5000"] # 使用http协议 } [root@elk92 ~]# systemctl daemon-reload [root@elk92 ~]# systemctl restart docker [root@elk92 ~]# docker info | grep Registries -A 2 Insecure Registries: elk93.weixiang.com:5000 127.0.0.0/8 3.4 配置hosts解析 [root@elk92 ~]# echo 10.0.0.92 elk92.weixiang.com >> /etc/hosts 3.4 推送镜像到registry仓库 [root@elk92 ~]# docker push elk92.weixiang.com:5000/weixiang-db/mysql:8.0.36-oracle # 如果是在91节点推送,那么92节点的registry仓库要在运行状态,必须在该服务器上运行仓库服务才能接收镜像,但必须运行兼容 Docker # Registry HTTP API V2 协议的服务,支持 HTTP/HTTPS 和标准认证方式 The push refers to repository [localhost:5000/weixiang-db/mysql] 318dde184d61: Pushed 1c0ff7ed67c4: Pushed 876b8cd855eb: Pushed 84d659420bad: Pushed 9513d2aedd12: Pushed eaa1e85de732: Pushed a6909c467615: Pushed 5b76076a2dd4: Pushed fb5c92e924ab: Pushed 152c1ecea280: Pushed fc037c17567d: Pushed 8.0.36-oracle: digest: sha256:c57363379dee26561c2e554f82e70704be4c8129bd0d10e29252cc0a34774004 size: 2618 [root@elk93 ~]#


bash
3.4 客户端拉取镜像 [root@elk91 ~]# echo 10.0.0.92 elk92.weixiang.com >> /etc/hosts [root@elk91 ~]# cat /etc/docker/daemon.json { "registry-mirrors": ["https://tuv7rqqq.mirror.aliyuncs.com","https://docker.mirrors.ustc.edu.cn/","https://hub-mirror.c.163.com/","https://reg-mirror.qiniu.com"], "insecure-registries": ["elk92.weixiang.com:5000"] } [root@elk91 ~]# docker info | grep Registries -A 2 Insecure Registries: elk92.weixiang.com:5000 127.0.0.0/8 # 拉取镜像 [root@elk91 ~]# docker pull elk92.weixiang.com:5000/weixiang-db/mysql:8.0.36-oracle 8.0.36-oracle: Pulling from weixiang-db/mysql Digest: sha256:c57363379dee26561c2e554f82e70704be4c8129bd0d10e29252cc0a34774004 Status: Downloaded newer image for elk93.weixiang.com:5000/weixiang-db/mysql:8.0.36-oracle elk93.weixiang.com:5000/weixiang-db/mysql:8.0.36-oracle [root@elk91 ~]# [root@elk91 ~]# docker image ls elk92.weixiang.com:5000/weixiang-db/mysql:8.0.36-oracle REPOSITORY TAG IMAGE ID CREATED SIZE elk92.weixiang.com:5000/weixiang-db/mysql 8.0.36-oracle f5f171121fa3 15 months ago 603MB [root@elk92 ~]# 3.5 删除镜像 # 删除前查看大小 [root@elk92 09-build-images]# docker exec -it weixiang-registry sh /var/lib/registry/docker/registry/v2 # du -sh * 166M blobs 92.0K repositories [root@elk92 ~]# docker exec weixiang-registry rm -rf /var/lib/registry/docker/registry/v2/repositories/weixiang-db # 删除元数据信息 [root@elk92 ~]# docker exec weixiang-registry registry garbage-collect /etc/distribution/config.yml # 回收数据 # 回收后 [root@elk92 09-build-images]# docker exec -it weixiang-registry sh /var/lib/registry/docker/registry/v2 # du -sh * 3.5M blobs 92.0K repositories docker-registry存在的bug解决方案 1.推送新镜像到registry仓库 [root@elk92 ~]# docker tag alpine:latest elk92.weixiang.com:5000/weixiang-linux/alpine [root@elk92 ~]# docker push elk92.weixiang.com:5000/weixiang-linux/alpine Using default tag: latest The push refers to repository [elk92.weixiang.com:5000/weixiang-linux/alpine] 63ca1fbb43ae: Pushed latest: digest: sha256:33735bd63cf84d7e388d9f6d297d348c523c044410f553bd878c6d7829612735 size: 528 [root@elk93 ~]# 2.但是推送一个已经删除的镜像到registry仓库会存在问题 [root@elk92 ~]# docker push elk92.weixiang.com:5000/weixiang-db/mysql:8.0.36-oracle # 说是已经推送成功,但是并没有成功上传!【此时返回的是缓存】 The push refers to repository [elk93.weixiang.com:5000/weixiang-db/mysql] 318dde184d61: Layer already exists 1c0ff7ed67c4: Layer already exists 876b8cd855eb: Layer already exists 84d659420bad: Layer already exists 9513d2aedd12: Layer already exists eaa1e85de732: Layer already exists a6909c467615: Layer already exists 5b76076a2dd4: Layer already exists fb5c92e924ab: Layer already exists 152c1ecea280: Layer already exists fc037c17567d: Layer already exists 8.0.36-oracle: digest: sha256:c57363379dee26561c2e554f82e70704be4c8129bd0d10e29252cc0a34774004 size: 2618 [root@elk93 ~]# 3.查看webUI【发现能看到镜像 】 http://10.0.0.92:5000/v2/_catalog 4.客户端拉取失败 [root@elk91 ~]# docker pull elk92.weixiang.com:5000/weixiang-db/mysql:8.0.36-oracle Error response from daemon: manifest for elk93.weixiang.com:5000/weixiang-db/mysql:8.0.36-oracle not found: manifest unknown: manifest unknown 5.解决方案 重启docker-registries容器即可。 [root@elk92 ~]# docker restart weixiang-registry weixiang-registry [root@elk93 ~]# [root@elk92 ~]# docker push elk92.weixiang.com:5000/weixiang-db/mysql:8.0.36-oracle The push refers to repository [elk93.weixiang.com:5000/weixiang-db/mysql] 318dde184d61: Pushed 1c0ff7ed67c4: Pushed 876b8cd855eb: Pushed 84d659420bad: Pushed 9513d2aedd12: Pushed eaa1e85de732: Pushed a6909c467615: Pushed 5b76076a2dd4: Pushed fb5c92e924ab: Pushed 152c1ecea280: Pushed 
fc037c17567d: Pushed 8.0.36-oracle: digest: sha256:c57363379dee26561c2e554f82e70704be4c8129bd0d10e29252cc0a34774004 size: 2618 [root@elk93 ~]# 1、拉取镜像 2、基于镜像运行容器
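补充:排查上述"推送显示成功但客户端拉取失败"的问题时,可以直接通过Docker Registry的v2 HTTP API确认服务端是否真的存在对应的manifest,而不是只看push的输出。下面是一个最小化的检查示例(仓库地址、镜像名沿用本文环境,实际使用时请按需替换)。

bash
# 1.查看仓库中现有的镜像列表(catalog)
curl -s http://elk92.weixiang.com:5000/v2/_catalog

# 2.查看指定镜像的tag列表
curl -s http://elk92.weixiang.com:5000/v2/weixiang-db/mysql/tags/list

# 3.确认指定tag的manifest是否真实存在(返回200才说明服务端确实有数据)
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
  http://elk92.weixiang.com:5000/v2/weixiang-db/mysql/manifests/8.0.36-oracle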

12、harbor企业级镜像仓库实战

1、部署harbor镜像仓库
bash
1.harbor概述 harbor是VMware公司开源的一款企业级镜像仓库,底层基于docker-compose来管理harbor服务。 官网地址: https://github.com/goharbor/harbor 2.部署harbor实战 2.1 下载harbor软件包 [root@elk92 ~]# wget http://192.168.21.253/Resources/Docker/softwares/harbor-offline-installer-v2.13.1.tgz 2.2 解压软件包 [root@elk92 ~]# tar xf harbor-offline-installer-v2.13.1.tgz -C /usr/local/ 2.3 修改harbor的配置文件 [root@elk92 ~]# cd /usr/local/harbor/ [root@elk92 harbor]# [root@elk92 harbor]# cp harbor.yml{.tmpl,} [root@elk92 harbor]# [root@elk92 harbor]# vim harbor.yml ... # hostname: reg.mydomain.com hostname: 10.0.0.91 ... ## https related config #https: # # https port for harbor, default is 443 # port: 443 # # The path of cert and key files for nginx # certificate: /your/certificate/path # private_key: /your/private/key/path # # enable strong ssl ciphers (default: false) # # strong_ssl_ciphers: false ... # harbor_admin_password: Harbor12345 harbor_admin_password: 1 ... # data_volume: /data data_volume: /weixiang/data/harbor ... 2.4 安装harbor服务 [root@elk92 harbor]# ./install.sh ... [Step 5]: starting Harbor ... [+] Building 0.0s (0/0) docker:default [+] Running 10/10 ✔ Network harbor_harbor Created 0.1s ✔ Container harbor-log Started 0.0s ✔ Container redis Started 0.0s ✔ Container registryctl Started 0.0s ✔ Container harbor-portal Started 0.0s ✔ Container registry Started 0.0s ✔ Container harbor-db Started 0.0s ✔ Container harbor-core Started 0.0s ✔ Container harbor-jobservice Started 0.0s ✔ Container nginx Started 0.0s ✔ ----Harbor has been installed and started successfully.---- [root@elk92 harbor]# [root@elk92 harbor]# ll total 650932 drwxr-xr-x 3 root root 4096 Jul 7 10:43 ./ drwxr-xr-x 14 root root 4096 Jul 7 10:38 ../ drwxr-xr-x 3 root root 4096 Jul 7 10:43 common/ -rw-r--r-- 1 root root 3646 May 22 15:48 common.sh -rw-r--r-- 1 root root 5998 Jul 7 10:43 docker-compose.yml -rw-r--r-- 1 root root 666471629 May 22 15:48 harbor.v2.13.1.tar.gz -rw-r--r-- 1 root root 14784 Jul 7 10:40 harbor.yml -rw-r--r-- 1 root root 14688 May 22 15:48 harbor.yml.tmpl -rwxr-xr-x 1 root root 1975 Jul 7 10:42 install.sh* -rw-r--r-- 1 root root 11347 May 22 15:48 LICENSE -rwxr-xr-x 1 root root 2211 May 22 15:48 prepare* [root@elk92 harbor]# [root@elk92 harbor]# docker-compose ps -a NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS harbor-core goharbor/harbor-core:v2.13.1 "/harbor/entrypoint.…" core About a minute ago Up About a minute (healthy) harbor-db goharbor/harbor-db:v2.13.1 "/docker-entrypoint.…" postgresql About a minute ago Up About a minute (healthy) harbor-jobservice goharbor/harbor-jobservice:v2.13.1 "/harbor/entrypoint.…" jobservice About a minute ago Up About a minute (healthy) harbor-log goharbor/harbor-log:v2.13.1 "/bin/sh -c /usr/loc…" log About a minute ago Up About a minute (healthy) 127.0.0.1:1514->10514/tcp harbor-portal goharbor/harbor-portal:v2.13.1 "nginx -g 'daemon of…" portal About a minute ago Up About a minute (healthy) nginx goharbor/nginx-photon:v2.13.1 "nginx -g 'daemon of…" proxy About a minute ago Up About a minute (healthy) 0.0.0.0:80->8080/tcp, :::80->8080/tcp redis goharbor/redis-photon:v2.13.1 "redis-server /etc/r…" redis About a minute ago Up About a minute (healthy) registry goharbor/registry-photon:v2.13.1 "/home/harbor/entryp…" registry About a minute ago Up About a minute (healthy) registryctl goharbor/harbor-registryctl:v2.13.1 "/home/harbor/start.…" registryctl About a minute ago Up About a minute (healthy) [root@elk91 harbor]# 2.5 访问webUI http://10.0.0.92/harbor/projects 初始用户名: admin 初始化密码: 1

image
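harbor安装完成后,除了用docker-compose ps查看容器状态,也可以通过健康检查接口确认各组件是否就绪。下面是一个简单的检查示例(以Harbor 2.x提供的/api/v2.0/health接口为例,地址请按实际环境替换)。

bash
# 查看harbor各组件的健康状态(返回JSON,components中每一项应为healthy)
curl -s http://10.0.0.92/api/v2.0/health

# 结合docker-compose确认容器状态,除表头外若还有其它行,说明有容器未处于healthy状态
cd /usr/local/harbor && docker-compose ps | grep -v healthy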

2、harbor实现镜像的基础管理
bash
1.创建项目

image

image

bash
2.客户端节点配置harbor仓库 [root@elk91 ~]# cat /etc/docker/daemon.json { "registry-mirrors": ["https://tuv7rqqq.mirror.aliyuncs.com","https://docker.mirrors.ustc.edu.cn/","https://hub-mirror.c.163.com/","https://reg-mirror.qiniu.com"], "insecure-registries": ["elk92.weixiang.com:5000","10.0.0.92"] } [root@elk91 ~]# systemctl daemon-reload [root@elk91 ~]# [root@elk91 ~]# systemctl restart docker.service 3.给镜像打标签 [root@elk91 ~]# docker tag mysql:8.0.36-oracle 10.0.0.92/weixiang98/mysql:8.0.36-oracle [root@elk91 ~]# 4.登录服务端harbor仓库 [root@elk91 ~]# docker login 10.0.0.92 # 交互式登录 Username: admin Password: WARNING! Your password will be stored unencrypted in /root/.docker/config.json. Configure a credential helper to remove this warning. See https://docs.docker.com/engine/reference/commandline/login/#credentials-store Login Succeeded [root@elk91 ~]# [root@elk91 ~]# docker login -u admin -p 1 10.0.0.92 # 非交互式登录 WARNING! Using --password via the CLI is insecure. Use --password-stdin. WARNING! Your password will be stored unencrypted in /root/.docker/config.json. Configure a credential helper to remove this warning. See https://docs.docker.com/engine/reference/commandline/login/#credentials-store # docker pull 通常不需要认证 允许公开读取(可配置) # docker push 必须认证 涉及写入,需严格权限控制 [root@elk92 ~]# 5.推送镜像到harbor仓库 [root@elk92 ~]# docker push 10.0.0.92/weixiang98/mysql:8.0.36-oracle The push refers to repository [10.0.0.92/weixiang98/mysql] 318dde184d61: Pushed 1c0ff7ed67c4: Pushed 876b8cd855eb: Pushed 84d659420bad: Pushed 9513d2aedd12: Pushed eaa1e85de732: Pushed a6909c467615: Pushed 5b76076a2dd4: Pushed fb5c92e924ab: Pushed 152c1ecea280: Pushed fc037c17567d: Pushed 8.0.36-oracle: digest: sha256:c57363379dee26561c2e554f82e70704be4c8129bd0d10e29252cc0a34774004 size: 2618 [root@elk92 ~]# 6.拉取镜像 [root@elk92 ~]# docker pull 10.0.0.91/weixiang98/mysql:8.0.36-oracle 8.0.36-oracle: Pulling from weixiang98/mysql Digest: sha256:c57363379dee26561c2e554f82e70704be4c8129bd0d10e29252cc0a34774004 Status: Image is up to date for 10.0.0.91/weixiang98/mysql:8.0.36-oracle 10.0.0.91/weixiang98/mysql:8.0.36-oracle [root@elk92 ~]#

image
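补充:上面的交互式/非交互式登录都会提示密码以明文形式保存,而且-p方式还会把密码留在shell历史中。生产环境更推荐通过--password-stdin从文件读入密码,下面是一个示例(密码文件路径为假设值)。

bash
# 将harbor密码保存到仅root可读的文件中(路径仅为示例)
echo '1' > /root/.harbor_passwd
chmod 600 /root/.harbor_passwd

# 通过标准输入传递密码,避免密码出现在命令行参数和history中
docker login -u admin --password-stdin 10.0.0.92 < /root/.harbor_passwd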

bash
7.删除镜像

image

bash
8.删除项目 【项目必须为空才能删除,即删除项目之前需要先删除该项目下的所有镜像(页面操作见下图);如果镜像较多,也可以参考下面的API示例批量清理。】
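如果项目下镜像较多,逐个在页面上删除比较繁琐,可以借助Harbor的REST API先列出并删除项目下的仓库,再删除项目。下面是一个思路示例(以Harbor v2 API为例,项目名weixiang98、账号密码等均为本文环境的示例值,接口细节请以对应版本的官方API文档为准)。

bash
HARBOR=10.0.0.92
PROJECT=weixiang98

# 1.列出项目下的所有仓库名称
curl -s -u admin:1 "http://${HARBOR}/api/v2.0/projects/${PROJECT}/repositories" | grep -o '"name":"[^"]*"'

# 2.删除项目下名为mysql的仓库(如果仓库名本身含有/,需要进行URL编码)
curl -s -u admin:1 -X DELETE "http://${HARBOR}/api/v2.0/projects/${PROJECT}/repositories/mysql"

# 3.仓库清空后,再到webUI(或继续调用API)删除项目本身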

image

3、docker-registry迁移至harbor

bash
1.配置仓库

16db8a090069429101187aff2dabdff4

bash
2.新建复制规则

f7464678ec9bb7a6bd21f04193169a86

bash
3.启动复制规则

aea08bd6354fe13529ccd15914e9d129

bash
4.测试验证

image

image

bash
# 可以拉取 [root@elk91 ~]# docker pull 10.0.0.92/weixiang-linux/alpine:latest latest: Pulling from weixiang-linux/alpine Digest: sha256:33735bd63cf84d7e388d9f6d297d348c523c044410f553bd878c6d7829612735 Status: Downloaded newer image for 10.0.0.92/weixiang-linux/alpine:latest 10.0.0.92/weixiang-linux/alpine:latest
bash
5.设置项目为公开

image

4、harbor同步数据
bash
- 1.部署harbor环境 1.1 准备软件包 [root@elk92 ~]# scp harbor-offline-installer-v2.13.1.tgz 10.0.0.91:~ 1.2 解压软件包 [root@elk91 ~]# tar xf harbor-offline-installer-v2.13.1.tgz -C /usr/local/ 1.3 修改harbor的配置文件 [root@elk91 ~]# cd /usr/local/harbor/ [root@elk91 harbor]# [root@elk91 harbor]# cp harbor.yml{.tmpl,} [root@elk91 harbor]# [root@elk91 harbor]# vim harbor.yml ... #hostname: reg.mydomain.com hostname: 10.0.0.91 ... #https: # # https port for harbor, default is 443 # port: 443 # # The path of cert and key files for nginx # certificate: /your/certificate/path # private_key: /your/private/key/path # # enable strong ssl ciphers (default: false) # # strong_ssl_ciphers: false ... # harbor_admin_password: Harbor12345 harbor_admin_password: 1 ... # data_volume: /data data_volume: /weixiang/data/harbor ... 1.4 安装harbor服务 [root@elk91 harbor]# systemctl disable --now nginx [root@elk91 harbor]# ./install.sh

配置harbor从91节点同步

image

image

bash
彩蛋: harbor出现问题解决小技巧 [root@elk91 harbor]# docker-compose down -t 0 [+] Running 10/10 ✔ Container harbor-jobservice Removed 0.2s ✔ Container registryctl Removed 0.3s ✔ Container nginx Removed 0.0s ✔ Container harbor-portal Removed 0.3s ✔ Container harbor-core Removed 0.2s ✔ Container redis Removed 0.3s ✔ Container harbor-db Removed 0.3s ✔ Container registry Removed 0.3s ✔ Container harbor-log Removed 0.2s ✔ Network harbor_harbor Removed 0.2s [root@elk91 harbor]# [root@elk91 harbor]# [root@elk91 harbor]# docker-compose up -d [+] Building 0.0s (0/0) docker:default [+] Running 10/10 ✔ Network harbor_harbor Created 0.1s ✔ Container harbor-log Started 0.0s ✔ Container redis Started 0.0s ✔ Container registryctl Started 0.0s ✔ Container registry Started 0.0s ✔ Container harbor-db Started 0.0s ✔ Container harbor-portal Started 0.0s ✔ Container harbor-core Started 0.0s ✔ Container harbor-jobservice Started 0.0s ✔ Container nginx Started 0.0s [root@elk91 harbor]# [root@elk91 harbor]# docker-compose ps -a NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS harbor-core goharbor/harbor-core:v2.13.1 "/harbor/entrypoint.…" core 21 seconds ago Up 19 seconds (health: starting) harbor-db goharbor/harbor-db:v2.13.1 "/docker-entrypoint.…" postgresql 21 seconds ago Up 20 seconds (health: starting) harbor-jobservice goharbor/harbor-jobservice:v2.13.1 "/harbor/entrypoint.…" jobservice 21 seconds ago Up 18 seconds (health: starting) harbor-log goharbor/harbor-log:v2.13.1 "/bin/sh -c /usr/loc…" log 21 seconds ago Up 21 seconds (health: starting) 127.0.0.1:1514->10514/tcp harbor-portal goharbor/harbor-portal:v2.13.1 "nginx -g 'daemon of…" portal 21 seconds ago Up 19 seconds (health: starting) nginx goharbor/nginx-photon:v2.13.1 "nginx -g 'daemon of…" proxy 21 seconds ago Up 18 seconds (health: starting) 0.0.0.0:80->8080/tcp, :::80->8080/tcp redis goharbor/redis-photon:v2.13.1 "redis-server /etc/r…" redis 21 seconds ago Up 20 seconds (health: starting) registry goharbor/registry-photon:v2.13.1 "/home/harbor/entryp…" registry 21 seconds ago Up 19 seconds (health: starting) registryctl goharbor/harbor-registryctl:v2.13.1 "/home/harbor/start.…" registryctl 21 seconds ago Up 20 seconds (health: starting) [root@elk91 harbor]#

image

bash
等待约一分钟,待复制任务执行完成后刷新页面即可看到同步结果

image

5、harbor的高可用解决方案
bash
方案一: 多个harbor共享存储,例如挂载NFS等共享存储卷。
方案二: 仓库复制。(官方推荐)
方案三: keepalived提供VIP,实现故障时的访问入口切换。
1、仓库复制(官方推荐)

新建项目

image

bash
# 请为本地已有的 alpine:latest 镜像创建一个新的别名(标签),这个别名指向私有仓库 10.0.0.92 中的 weixiang-xixi/alpine 镜像位置" [root@elk91 harbor]# docker tag alpine:latest 10.0.0.92/weixiang-xixi/alpine 为什么需要这个步骤? Docker 要求推送到私有仓库的镜像必须: 包含完整的仓库地址前缀(10.0.0.92/) 包含项目名称(weixiang-xixi/) 包含镜像名称(alpine) [root@elk91 harbor]# docker push 10.0.0.92/weixiang-xixi/alpine Using default tag: latest The push refers to repository [10.0.0.92/weixiang-xixi/alpine] 63ca1fbb43ae: Pushed latest: digest: sha256:33735bd63cf84d7e388d9f6d297d348c523c044410f553bd878c6d7829612735 size: 528 # 经过下方页面显示,91成功获取到92新建的项目

image

image

92自动获取91项目

image

image

image

bash
通过keepalived为上面的两个harbor节点提供统一的VIP访问入口。

1.安装keepalived
[root@elk91 ~]# apt -y install keepalived
[root@elk92 ~]# apt -y install keepalived

2.91节点修改keepalived配置
[root@elk91 ~]# ifconfig
...
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.0.91  netmask 255.255.255.0  broadcast 10.0.0.255
        inet6 fe80::20c:29ff:fee8:8b7c  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:e8:8b:7c  txqueuelen 1000  (Ethernet)
        RX packets 1149700  bytes 1334270651 (1.3 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1026632  bytes 1117756007 (1.1 GB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

[root@elk91 ~]# cat > /etc/keepalived/keepalived.conf <<'EOF'
! Configuration File for keepalived
global_defs {
   router_id 10.0.0.91
}
vrrp_script chk_nginx {
    script "/etc/keepalived/check_port.sh 8443"
    interval 2
    weight -20
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 251
    priority 100
    advert_int 1
    mcast_src_ip 10.0.0.91
    nopreempt
    authentication {
        auth_type PASS
        auth_pass yinzhengjie_k8s
    }
    track_script {
         chk_nginx
    }
    virtual_ipaddress {
        10.0.0.230
    }
}
EOF

3.92节点修改keepalived配置
[root@elk92 harbor]# ifconfig
...
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.0.92  netmask 255.255.255.0  broadcast 10.0.0.255
        inet6 fe80::20c:29ff:fe0d:67d5  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:0d:67:d5  txqueuelen 1000  (Ethernet)
        RX packets 917723  bytes 1096507658 (1.0 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 476754  bytes 434552251 (434.5 MB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

[root@elk92 ~]# cat > /etc/keepalived/keepalived.conf <<'EOF'
! Configuration File for keepalived
global_defs {
   router_id 10.0.0.92
}
vrrp_script chk_nginx {
    script "/etc/keepalived/check_port.sh 8443"
    interval 2
    weight -20
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 251
    priority 100
    advert_int 1
    mcast_src_ip 10.0.0.92
    nopreempt
    authentication {
        auth_type PASS
        auth_pass yinzhengjie_k8s
    }
    track_script {
         chk_nginx
    }
    virtual_ipaddress {
        10.0.0.230
    }
}
EOF

4.启动keepalived
[root@elk91 ~]# systemctl enable --now keepalived
[root@elk92 harbor]# systemctl enable --now keepalived

5.测试验证
http://10.0.0.230/

6.停止keepalived观察VIP是否漂移

image

bash
[root@elk91 harbor]# systemctl stop keepalived

image
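补充:上面keepalived配置中引用的/etc/keepalived/check_port.sh脚本正文里没有给出,下面是按其调用方式(传入端口号作为参数)假设的一个最小实现示例,检测失败时返回非0,从而触发weight -20的降权;实际脚本内容请以自己的环境为准(另外本例中harbor监听的是80/443端口,检测端口需按实际情况调整)。

bash
cat > /etc/keepalived/check_port.sh <<'EOF'
#!/bin/bash
# 用法: check_port.sh <端口号>,检测本机指定端口是否处于监听状态
PORT=$1
if [ -z "$PORT" ]; then
    echo "Usage: $0 <port>"
    exit 1
fi

# 统计该端口的监听条目数,为0说明服务异常,返回非0让keepalived对本节点降权
if [ "$(ss -lnt | grep -c ":${PORT} ")" -eq 0 ]; then
    exit 1
fi
exit 0
EOF
chmod +x /etc/keepalived/check_port.sh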

6、harbor仓库密码base64编码存储路径
bash
harbor仓库密码base64编码存储路径 1.启动harbor仓库 [root@elk92 ~]# cd /usr/local/harbor/ [root@elk92 harbor]# [root@elk92 harbor]# ll total 650932 drwxr-xr-x 3 root root 4096 Jul 7 10:43 ./ drwxr-xr-x 14 root root 4096 Jul 7 10:38 ../ drwxr-xr-x 3 root root 4096 Jul 7 10:43 common/ -rw-r--r-- 1 root root 3646 May 22 15:48 common.sh -rw-r--r-- 1 root root 5998 Jul 7 10:43 docker-compose.yml -rw-r--r-- 1 root root 666471629 May 22 15:48 harbor.v2.13.1.tar.gz -rw-r--r-- 1 root root 14784 Jul 7 10:40 harbor.yml -rw-r--r-- 1 root root 14688 May 22 15:48 harbor.yml.tmpl -rwxr-xr-x 1 root root 1975 Jul 7 10:42 install.sh* -rw-r--r-- 1 root root 11347 May 22 15:48 LICENSE -rwxr-xr-x 1 root root 2211 May 22 15:48 prepare* [root@elk92 harbor]# [root@elk92 harbor]# docker-compose down -t 0 [+] Running 10/10 ✔ Container nginx Removed 0.0s ✔ Container harbor-jobservice Removed 0.4s ✔ Container registryctl Removed 0.4s ✔ Container harbor-portal Removed 0.4s ✔ Container harbor-core Removed 0.2s ✔ Container redis Removed 0.3s ✔ Container harbor-db Removed 0.2s ✔ Container registry Removed 0.3s ✔ Container harbor-log Removed 0.2s ✔ Network harbor_harbor Removed 0.2s [root@elk92 harbor]# [root@elk92 harbor]# docker-compose up -d [+] Building 0.0s (0/0) docker:default [+] Running 10/10 ✔ Network harbor_harbor Created 0.1s ✔ Container harbor-log Started 0.0s ✔ Container registry Started 0.0s ✔ Container harbor-db Started 0.0s ✔ Container registryctl Started 0.0s ✔ Container harbor-portal Started 0.1s ✔ Container redis Started 0.0s ✔ Container harbor-core Started 0.0s ✔ Container harbor-jobservice Started 0.0s ✔ Container nginx Started 0.0s [root@elk92 harbor]# [root@elk92 harbor]# [root@elk92 harbor]# docker-compose ps -a NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS harbor-core goharbor/harbor-core:v2.13.1 "/harbor/entrypoint.…" core 8 seconds ago Up 6 seconds (health: starting) harbor-db goharbor/harbor-db:v2.13.1 "/docker-entrypoint.…" postgresql 8 seconds ago Up 6 seconds (health: starting) harbor-jobservice goharbor/harbor-jobservice:v2.13.1 "/harbor/entrypoint.…" jobservice 8 seconds ago Up 4 seconds (health: starting) harbor-log goharbor/harbor-log:v2.13.1 "/bin/sh -c /usr/loc…" log 8 seconds ago Up 7 seconds (health: starting) 127.0.0.1:1514->10514/tcp harbor-portal goharbor/harbor-portal:v2.13.1 "nginx -g 'daemon of…" portal 8 seconds ago Up 6 seconds (health: starting) nginx goharbor/nginx-photon:v2.13.1 "nginx -g 'daemon of…" proxy 8 seconds ago Up 5 seconds (health: starting) 0.0.0.0:80->8080/tcp, :::80->8080/tcp redis goharbor/redis-photon:v2.13.1 "redis-server /etc/r…" redis 8 seconds ago Up 6 seconds (health: starting) registry goharbor/registry-photon:v2.13.1 "/home/harbor/entryp…" registry 8 seconds ago Up 6 seconds (health: starting) registryctl goharbor/harbor-registryctl:v2.13.1 "/home/harbor/start.…" registryctl 8 seconds ago Up 6 seconds (health: starting) [root@elk92 harbor]# 2.查看密码存储文件 [root@elk93 ~]# docker login 10.0.0.230 Username: admin Password: WARNING! Your password will be stored unencrypted in /root/.docker/config.json. Configure a credential helper to remove this warning. 
See https://docs.docker.com/engine/reference/commandline/login/#credentials-store Login Succeeded [root@elk93 ~]# [root@elk93 ~]# cat /root/.docker/config.json ; echo { "auths": { "10.0.0.230": { "auth": "YWRtaW46MQ==" }, "10.0.0.92": { "auth": "YWRtaW46MQ==" } } } [root@elk93 ~]# 3.解码数据 [root@elk93 ~]# echo YWRtaW46MQ== | base64 -d ;echo admin:1 [root@elk93 ~]# 4.操作完后记得退出登录【安全起见】 [root@elk93 ~]# cat /root/.docker/config.json;echo { "auths": { "10.0.0.230": { "auth": "YWRtaW46MQ==" }, "10.0.0.92": { "auth": "YWRtaW46MQ==" } } } [root@elk93 ~]# [root@elk93 ~]# docker logout 10.0.0.230 Removing login credentials for 10.0.0.230 [root@elk93 ~]# [root@elk93 ~]# cat /root/.docker/config.json;echo { "auths": { "10.0.0.92": { "auth": "YWRtaW46MQ==" } } } [root@elk93 ~]# [root@elk93 ~]# docker logout 10.0.0.92 Removing login credentials for 10.0.0.92 [root@elk93 ~]# [root@elk93 ~]# cat /root/.docker/config.json;echo { "auths": {} } [root@elk93 ~]#
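可以自己动手验证一下auth字段的生成规则:它就是"用户名:密码"做base64编码的结果,因此这种存储方式只是编码而不是加密,务必注意保护config.json文件。

bash
# 编码: 用户名:密码 -> base64(-n避免把换行符也编进去)
echo -n 'admin:1' | base64
# 输出: YWRtaW46MQ==

# 解码: base64 -> 用户名:密码
echo 'YWRtaW46MQ==' | base64 -d ; echo
# 输出: admin:1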

13、将本地镜像推送到第三方阿里云服务器

登录账号并创建仓库

8539a8da9a59207ae10b37c5534336e6

32cce67318fd95b1a6ed9bdedb04ecbe_720

4bf5c49622cb20e6a3efa2976756c6eb


b25e3b82b0ea79b12e2c0efb2f1fcf93

image

bash
# 登录阿里云镜像仓库服务器 [root@elk92 harbor]# docker login --username=z13406161615 crpi-cmgicwpae1gewpy2.cn-guangzhou.personal.cr.aliyuncs.com Password: # 密码weixiang.123 WARNING! Your password will be stored unencrypted in /root/.docker/config.json. Configure a credential helper to remove this warning. See https://docs.docker.com/engine/reference/commandline/login/#credentials-store Login Succeeded # 将镜像推送到Registry [root@elk92 harbor]# docker tag alpine:latest crpi-cmgicwpae1gewpy2.cn-guangzhou.personal.cr.aliyuncs.com/zhangweixiang/weixiang98:v1 [root@elk92 harbor]# docker push crpi-cmgicwpae1gewpy2.cn-guangzhou.personal.cr.aliyuncs.com/zhangweixiang/weixiang98:v1 The push refers to repository [crpi-cmgicwpae1gewpy2.cn-guangzhou.personal.cr.aliyuncs.com/zhangweixiang/weixiang98] 63ca1fbb43ae: Pushed v1: digest: sha256:33735bd63cf84d7e388d9f6d297d348c523c044410f553bd878c6d7829612735 size: 528 # 返回阿里云查看

image

bash
# 删除镜像、删除仓库后退出登录
[root@elk92 harbor]# docker logout crpi-cmgicwpae1gewpy2.cn-guangzhou.personal.cr.aliyuncs.com
Removing login credentials for crpi-cmgicwpae1gewpy2.cn-guangzhou.personal.cr.aliyuncs.com

image

image

14、docker hub官方仓库使用

注册账号并登录

7ee8538754c384f2f75353e701376fc5

image

image

bash
# 登录官方 【使用你自己注册的docker官方账号】 [root@elk92 harbor]# docker login -u weixiang987 Password: # 密码weixiang.987 WARNING! Your password will be stored unencrypted in /root/.docker/config.json. Configure a credential helper to remove this warning. See https://docs.docker.com/engine/reference/commandline/login/#credentials-store Login Succeeded # 推送镜像到官网 [root@elk92 harbor]# docker tag alpine:latest weixiang987/weixiang-weixiang989:tagname [root@elk92 harbor]# docker push weixiang987/weixiang-weixiang989:tagname The push refers to repository [docker.io/weixiang987/weixiang-weixiang989] 63ca1fbb43ae: Pushing [==================================================>] 8.082MB # 查看docker hub官方WebUI

image

image

bash
# 第三方下载官方镜像 [root@elk91 harbor]# docker pull weixiang987/weixiang-weixiang989:tagname tagname: Pulling from weixiang987/weixiang-weixiang989 Digest: sha256:33735bd63cf84d7e388d9f6d297d348c523c044410f553bd878c6d7829612735 Status: Downloaded newer image for weixiang987/weixiang-weixiang989:tagname docker.io/weixiang987/weixiang-weixiang989:tagname # 删除官方的仓库 # 退出仓库 [root@elk92 harbor]# docker logout weixiang987/weixiang-weixiang989:tagname Removing login credentials for weixiang987

image

bash
常见的错误Q1:
[root@elk93 ~]# docker push yinzhengjie2019/weixiang-weixiang98:xixi
The push refers to repository [docker.io/yinzhengjie2019/weixiang-weixiang98]
Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
[root@elk93 ~]#

解决方案:
    配置docker代理。

[root@elk93 ~]# systemctl cat docker
# /lib/systemd/system/docker.service
[Unit]
Description=weixiang linux Docke Engine
Documentation=https://docs.docker.com,https://www.weixiang.com
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/dockerd

# 配置docker代理
Environment="HTTP_PROXY=http://10.0.0.1:7890"
Environment="HTTPS_PROXY=http://10.0.0.1:7890"

[Install]
WantedBy=multi-user.target
[root@elk93 ~]#
[root@elk93 ~]# docker info | grep Proxy
HTTP Proxy: http://10.0.0.1:7890
HTTPS Proxy: http://10.0.0.1:7890
[root@elk93 ~]#


常见的错误Q2:
[root@elk93 ~]# docker push yinzhengjie2019/weixiang-weixiang98:xixi
The push refers to repository [docker.io/yinzhengjie2019/weixiang-weixiang98]
63ca1fbb43ae: Preparing
denied: requested access to the resource is denied
[root@elk93 ~]#

解决方案:
    需要先登录Docker Hub官方仓库才能推送镜像。

[root@elk93 ~]# docker login -u yinzhengjie2019
Password:
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
[root@elk93 ~]#
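除了像上面那样直接改docker.service主配置,也可以用systemd的drop-in目录单独维护代理配置,这样升级或重装docker时不容易被覆盖。下面是一个示例(代理地址10.0.0.1:7890沿用本文环境的示例值)。

bash
mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/http-proxy.conf <<'EOF'
[Service]
Environment="HTTP_PROXY=http://10.0.0.1:7890"
Environment="HTTPS_PROXY=http://10.0.0.1:7890"
# 访问本地harbor等私有仓库时不走代理
Environment="NO_PROXY=localhost,127.0.0.1,10.0.0.92,harbor.yinzhengjie.com"
EOF

systemctl daemon-reload
systemctl restart docker.service
docker info | grep -i proxy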

15、containerd基本使用

1、containerd安装
bash
1.什么是Containerd 所谓的Containerd其实就是docker公司开源的一款轻量级容器管理工具,由docker engine剥离出来的项目。 符合OCI开发规范,Containerd项目在2017捐献给了CNCF组织。 2.安装Containerd服务 [root@elk93 ~]# cat /lib/systemd/system/containerd.service # /lib/systemd/system/docker.service [Unit] Description=weixiang linux Containerd Server Documentation=https://docs.docker.com,https://www.weixiang.com Wants=network-online.target [Service] Type=notify ExecStart=/weixiang/softwares/docker/containerd [Install] WantedBy=multi-user.target [root@elk93 ~]# [root@elk93 ~]# systemctl daemon-reload [root@elk93 ~]# [root@elk93 ~]# systemctl enable --now containerd.service Created symlink /etc/systemd/system/multi-user.target.wants/containerd.service → /lib/systemd/system/containerd.service. [root@elk93 ~]# [root@elk93 ~]# ctr version Client: Version: v1.6.20 Revision: 2806fc1057397dbaeefbea0e4e17bddfbd388f38 Go version: go1.19.7 Server: Version: v1.6.20 Revision: 2806fc1057397dbaeefbea0e4e17bddfbd388f38 UUID: 21e84e97-e26d-4d59-9450-8eb986c8ff4d [root@elk93 ~]# [root@elk93 ~]#
2、Containerd的名称空间管理【用来隔离容器,镜像,任务等资源信息】
bash
3.1 查看名称空间 [root@elk93 ~]# ctr ns ls NAME LABELS [root@elk93 ~]# [root@elk93 ~]# 3.2 创建名称空间 [root@elk93 ~]# ctr ns create weixiang [root@elk93 ~]# [root@elk93 ~]# ctr ns ls NAME LABELS weixiang [root@elk93 ~]# 3.3 为名称空间打标签 [root@elk93 ~]# ctr ns label weixiang class=weixiang98 [root@elk93 ~]# [root@elk93 ~]# ctr ns ls NAME LABELS weixiang class=weixiang98 [root@elk93 ~]# [root@elk93 ~]# ctr ns label weixiang class="" [root@elk93 ~]# [root@elk93 ~]# ctr ns ls NAME LABELS weixiang [root@elk93 ~]# 3.4 移除名称空间 [root@elk93 ~]# ctr ns remove weixiang weixiang [root@elk93 ~]# [root@elk93 ~]# ctr ns ls NAME LABELS [root@elk93 ~]#
3、ctr镜像管理
bash
4.1 查看镜像 [root@elk93 ~]# ctr image ls REF TYPE DIGEST SIZE PLATFORMS LABELS [root@elk93 ~]# [root@elk93 ~]# 4.2 拉取镜像 【支持官方的镜像拉取】 [root@elk93 ~]# ctr image pull registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 [root@elk93 ~]# ctr image ls REF TYPE DIGEST SIZE PLATFORMS LABELS registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - [root@elk93 ~]# [root@elk93 ~]# HTTP_PROXY=http://10.0.0.1:10808 HTTPS_PROXY=http://10.0.0.1:10808 ctr image pull docker.io/library/nginx:1.29.0-alpine [root@elk93 ~]# [root@elk93 ~]# ctr image ls REF TYPE DIGEST SIZE PLATFORMS LABELS docker.io/library/nginx:1.29.0-alpine application/vnd.oci.image.index.v1+json sha256:b2e814d28359e77bd0aa5fed1939620075e4ffa0eb20423cc557b375bd5c14ad 21.4 MiB linux/386,linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8,linux/ppc64le,linux/riscv64,linux/s390x,unknown/unknown - registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - [root@elk93 ~]# 4.3 拉取镜像到指定名称空间 [root@elk93 ~]# ctr ns ls NAME LABELS default [root@elk93 ~]# [root@elk93 ~]# ctr -n weixiang image pull registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 # 如果指定的名称空间不存在会自动创建。 [root@elk93 ~]# [root@elk93 ~]# ctr ns ls NAME LABELS default weixiang [root@elk93 ~]# 4.4 查看指定名称空间的镜像 [root@elk93 ~]# ctr -n weixiang image ls REF TYPE DIGEST SIZE PLATFORMS LABELS registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 application/vnd.docker.distribution.manifest.v2+json sha256:3ac38ee6161e11f2341eda32be95dcc6746f587880f923d2d24a54c3a525227e 9.6 MiB linux/amd64 - [root@elk93 ~]# [root@elk93 ~]# ctr -n default image ls REF TYPE DIGEST SIZE PLATFORMS LABELS docker.io/library/nginx:1.29.0-alpine application/vnd.oci.image.index.v1+json sha256:b2e814d28359e77bd0aa5fed1939620075e4ffa0eb20423cc557b375bd5c14ad 21.4 MiB linux/386,linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8,linux/ppc64le,linux/riscv64,linux/s390x,unknown/unknown - registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - [root@elk93 ~]# [root@elk93 ~]# ctr image ls REF TYPE DIGEST SIZE PLATFORMS LABELS docker.io/library/nginx:1.29.0-alpine application/vnd.oci.image.index.v1+json sha256:b2e814d28359e77bd0aa5fed1939620075e4ffa0eb20423cc557b375bd5c14ad 21.4 MiB linux/386,linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8,linux/ppc64le,linux/riscv64,linux/s390x,unknown/unknown - registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - [root@elk93 ~]# 4.5 导出镜像 [root@elk93 ~]# ctr -n weixiang image export weixiang-xiuxian-v2.tar.gz registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 [root@elk93 ~]# [root@elk93 ~]# ll -h weixiang-xiuxian-v2.tar.gz -rw-r--r-- 1 root root 9.7M Jul 8 10:52 weixiang-xiuxian-v2.tar.gz [root@elk93 ~]# 4.6 删除镜像 [root@elk93 ~]# ctr -n weixiang image rm registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 [root@elk93 ~]# [root@elk93 ~]# ctr -n weixiang image ls REF TYPE DIGEST SIZE PLATFORMS LABELS [root@elk93 ~]# 4.7 导入镜像 
[root@elk93 ~]# ctr -n weixiang image ls REF TYPE DIGEST SIZE PLATFORMS LABELS [root@elk93 ~]# [root@elk93 ~]# ctr -n weixiang image import weixiang-xiuxian-v2.tar.gz unpacking registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 (sha256:3ac38ee6161e11f2341eda32be95dcc6746f587880f923d2d24a54c3a525227e)...done [root@elk93 ~]# [root@elk93 ~]# ctr -n weixiang image ls REF TYPE DIGEST SIZE PLATFORMS LABELS registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 application/vnd.docker.distribution.manifest.v2+json sha256:3ac38ee6161e11f2341eda32be95dcc6746f587880f923d2d24a54c3a525227e 9.6 MiB linux/amd64 - [root@elk93 ~]# 4.8 删除名称空间时该名称必须为空 [root@elk93 ~]# ctr -n weixiang image ls REF TYPE DIGEST SIZE PLATFORMS LABELS registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 application/vnd.docker.distribution.manifest.v2+json sha256:3ac38ee6161e11f2341eda32be95dcc6746f587880f923d2d24a54c3a525227e 9.6 MiB linux/amd64 - [root@elk93 ~]# [root@elk93 ~]# ctr ns remove weixiang ERRO[0000] unable to delete weixiang error="namespace \"weixiang\" must be empty, but it still has images, blobs, snapshots on \"overlayfs\" snapshotter: failed precondition" ctr: unable to delete weixiang: namespace "weixiang" must be empty, but it still has images, blobs, snapshots on "overlayfs" snapshotter: failed precondition [root@elk93 ~]# [root@elk93 ~]# ctr -n weixiang image rm registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 [root@elk93 ~]# [root@elk93 ~]# ctr -n weixiang image ls REF TYPE DIGEST SIZE PLATFORMS LABELS [root@elk93 ~]# [root@elk93 ~]# ctr ns remove weixiang weixiang [root@elk93 ~]# [root@elk93 ~]# ctr ns ls NAME LABELS default [root@elk93 ~]#
4、容器管理
bash
5.1 查看容器列表 [root@elk93 ~]# ctr containers ls CONTAINER IMAGE RUNTIME [root@elk93 ~]# 5.2 创建容器 [root@elk93 ~]# ctr containers create --env SCHOOL=weixiang --env CLASS=weixiang98 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 c1 [root@elk93 ~]# [root@elk93 ~]# ctr containers ls CONTAINER IMAGE RUNTIME c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 io.containerd.runc.v2 [root@elk93 ~]# 5.3 查看容器的详细信息 [root@elk93 ~]# ctr containers info c1 { "ID": "c1", "Labels": { "com.docker.compose.project": "dockerfile", "com.docker.compose.service": "apps_v1", "com.docker.compose.version": "2.23.0", "io.containerd.image.config.stop-signal": "SIGQUIT", "maintainer": "NGINX Docker Maintainers \u003cdocker-maint@nginx.com\u003e" }, "Image": "registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1", "Runtime": { ... 5.4 删除容器 [root@elk93 ~]# ctr containers rm c1 [root@elk93 ~]# [root@elk93 ~]# ctr containers ls CONTAINER IMAGE RUNTIME [root@elk93 ~]# 6.task管理容器 6.1 启动容器 [root@elk93 ~]# ctr tasks start -d c1 /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/ /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh [root@elk93 ~]# 6.2 查看task [root@elk93 ~]# ctr tasks ls TASK PID STATUS c1 4603 RUNNING [root@elk93 ~]# 6.3 链接容器 [root@elk93 ~]# ctr tasks exec -t --exec-id $RANDOM c1 sh / # ifconfig -a lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # / # ps -ef PID USER TIME COMMAND 1 root 0:00 nginx: master process nginx -g daemon off; 24 nginx 0:00 nginx: worker process 25 nginx 0:00 nginx: worker process 35 root 0:00 sh 47 root 0:00 sh 53 root 0:00 ps -ef / # 6.4 停止任务 [root@elk93 ~]# ctr task ls TASK PID STATUS c1 4603 RUNNING [root@elk93 ~]# [root@elk93 ~]# ctr task kill c1 [root@elk93 ~]# [root@elk93 ~]# ctr task ls TASK PID STATUS c1 4603 STOPPED [root@elk93 ~]# [root@elk93 ~]# ctr tasks exec -t --exec-id $RANDOM c1 sh # 如果容器已经停止运行,则无法链接。 ctr: cannot exec in a stopped state: unknown [root@elk93 ~]# 6.5 删除已经停止的容器 [root@elk92 harbor]# ctr tasks start c1 -d /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/ /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/ [root@elk92 harbor]# ctr tasks ls TASK PID STATUS c1 117636 RUNNING [root@elk93 ~]# [root@elk93 ~]# ctr task rm c1 [root@elk93 ~]# [root@elk93 ~]# ctr task ls TASK PID STATUS [root@elk93 ~]# [root@elk93 ~]# ctr task rm c1 # 容器不存在则报错! 
ERRO[0000] failed to load task from c1 error="no running task found: task c1 not found: not found" ctr: no running task found: task c1 not found: not found [root@elk93 ~]# [root@elk93 ~]# ctr c ls # 删除task并不会删除容器 CONTAINER IMAGE RUNTIME c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 io.containerd.runc.v2 [root@elk93 ~]# 6.6 查看容器的进程信息列表 [root@elk93 ~]# ctr task ls TASK PID STATUS c1 5019 RUNNING [root@elk93 ~]# [root@elk93 ~]# ctr task ps c1 PID INFO 5019 - 5048 - 5049 - [root@elk93 ~]# [root@elk93 ~]# kill -9 5019 # 其实主进程ID就是宿主机的PID [root@elk93 ~]# [root@elk93 ~]# ctr task ls TASK PID STATUS c1 5019 STOPPED [root@elk93 ~]# 6.7 容器的暂停和恢复 [root@elk93 ~]# ctr task ls TASK PID STATUS c1 5256 RUNNING [root@elk93 ~]# [root@elk93 ~]# ctr task pause c1 [root@elk93 ~]# [root@elk93 ~]# ctr task ls TASK PID STATUS c1 5256 PAUSED [root@elk93 ~]# [root@elk93 ~]# ctr tasks exec -t --exec-id $RANDOM c1 sh ctr: cannot exec in a paused state: unknown [root@elk93 ~]# [root@elk93 ~]# ctr task resume c1 [root@elk93 ~]# [root@elk93 ~]# ctr task ls TASK PID STATUS c1 5256 RUNNING [root@elk93 ~]# [root@elk93 ~]# ctr tasks exec -t --exec-id $RANDOM c1 sh / # ifconfig lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # / # env CLASS=weixiang98 SHLVL=1 HOME=/root PKG_RELEASE=1 NGINX_VERSION=1.20.1 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin NJS_VERSION=0.5.3 SCHOOL=weixiang PWD=/ / # 6.8 启动容器时使用宿主机网络 [root@elk93 ~]# ctr c create --net-host registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 xiuxian [root@elk93 ~]# ctr c ls CONTAINER IMAGE RUNTIME c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 io.containerd.runc.v2 xiuxian registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 io.containerd.runc.v2 [root@elk93 ~]# [root@elk93 ~]# ctr task start xiuxian -d /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/ [root@elk93 ~]# 6.9 访问测试 [root@elk93 ~]# ctr tasks exec -t --exec-id $RANDOM xiuxian sh / # ifconfig -a docker0 Link encap:Ethernet HWaddr 02:42:CA:D4:F1:F2 inet addr:172.17.0.1 Bcast:172.17.255.255 Mask:255.255.0.0 UP BROADCAST MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) eth0 Link encap:Ethernet HWaddr 00:0C:29:BB:92:91 inet addr:10.0.0.93 Bcast:10.0.0.255 Mask:255.255.255.0 inet6 addr: fe80::20c:29ff:febb:9291/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:286647 errors:0 dropped:0 overruns:0 frame:0 TX packets:230373 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:237348649 (226.3 MiB) TX bytes:151225885 (144.2 MiB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:2886 errors:0 dropped:0 overruns:0 frame:0 TX packets:2886 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:170996 (166.9 KiB) TX bytes:170996 (166.9 KiB) / # [root@elk93 ~]# ctr tasks pause xiuxian [root@elk93 ~]# [root@elk93 ~]# ctr tasks resume xiuxian [root@elk93 ~]# 可以单独开一个终端测试: [root@elk92 
harbor]# curl 10.0.0.93
<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8"/>
    <title>yinzhengjie apps v1</title>
    <style>
       div img {
          width: 900px;
          height: 600px;
          margin: 0;
       }
    </style>
</head>

<body>
    <h1 style="color: green">凡人修仙传 v1 </h1>
    <div>
        <img src="1.jpg">
    <div>
</body>
</html>
[root@elk92 harbor]#


- Containerd和docker容器的区别:
    - docker的功能性更强,支持更丰富的5种网络模式,而Containerd仅支持host网络,如果需要其他网络模式,则需要单独安装CNI插件;
    - docker容器可以直接启动,而Containerd是先创建容器,再基于容器启动任务(task);
    - 综上所述,我们不建议单独使用Containerd,如果需要使用容器的话,则还是比较推荐使用docker(ctr也提供了run子命令可以一步完成创建并启动,见下面的示例)。
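补充:ctr的run子命令可以把"创建容器+启动task"合并成一步,语义上更接近docker run,下面是一个简单示例(镜像和容器名沿用本文环境)。

bash
# 一步完成创建容器并启动任务,-d表示后台运行,--net-host使用宿主机网络
ctr run -d --net-host registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 xiuxian02

# 查看容器和任务状态
ctr c ls
ctr t ls

# 清理: 先停止任务,再删除容器(删除容器时会一并清理已停止的任务)
ctr t kill xiuxian02
ctr c rm xiuxian02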
5、containerd连接harbor服务
bash
1.harbor创建项目

image

bash
2.给镜像打标签 [root@elk93 ~]# ctr i ls REF TYPE DIGEST SIZE PLATFORMS LABELS docker.io/library/nginx:1.29.0-alpine application/vnd.oci.image.index.v1+json sha256:b2e814d28359e77bd0aa5fed1939620075e4ffa0eb20423cc557b375bd5c14ad 21.4 MiB linux/386,linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8,linux/ppc64le,linux/riscv64,linux/s390x,unknown/unknown - registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - [root@elk93 ~]# [root@elk93 ~]# ctr i tag registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 10.0.0.91/weixiang-containerd/xiuxian:v1 10.0.0.92/weixiang-containerd/xiuxian:v1 [root@elk93 ~]# [root@elk93 ~]# ctr i ls REF TYPE DIGEST SIZE PLATFORMS LABELS 10.0.0.92/weixiang-containerd/xiuxian:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - docker.io/library/nginx:1.29.0-alpine application/vnd.oci.image.index.v1+json sha256:b2e814d28359e77bd0aa5fed1939620075e4ffa0eb20423cc557b375bd5c14ad 21.4 MiB linux/386,linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8,linux/ppc64le,linux/riscv64,linux/s390x,unknown/unknown - registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - [root@elk93 ~]# 3.推送镜像 [root@elk93 ~]# ctr i push 10.0.0.91/weixiang-containerd/xiuxian:v1 --plain-http -u admin:1 manifest-sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c: done |++++++++++++++++++++++++++++++++++++++| config-sha256:f28fd43be4ad41fc768dcc3629f8479d1443df01ada10ac9a771314e4fdef599: done |++++++++++++++++++++++++++++++++++++++| elapsed: 0.4 s total: 9.4 Ki (23.3 KiB/s) [root@elk93 ~]#

image

bash
4.测试验证 [root@elk93 ~]# ctr -n test i pull 10.0.0.91/weixiang-containerd/xiuxian:v1 --plain-http 10.0.0.92/weixiang-containerd/xiuxian:v1: resolved |++++++++++++++++++++++++++++++++++++++| manifest-sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:2dd61e30a21aeb966df205382a40dcbcf45af975cc0cb836d555b9cd0ad760f5: done |++++++++++++++++++++++++++++++++++++++| config-sha256:f28fd43be4ad41fc768dcc3629f8479d1443df01ada10ac9a771314e4fdef599: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:5758d4e389a3f662e94a85fb76143dbe338b64f8d2a65f45536a9663b05305ad: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:51d66f6290217acbf83f15bc23a88338819673445804b1461b2c41d4d0c22f94: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:ff9c6add3f30f658b4f44732bef1dd44b6d3276853bba31b0babc247f3eba0dc: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:dcc43d9a97b44cf3b3619f2c185f249891b108ab99abcc58b19a82879b00b24b: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:5dcfac0f2f9ca3131599455f5e79298202c7e1b5e0eb732498b34e9fe4cb1173: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:2c6e86e57dfd729d8240ceab7c18bd1e5dd006b079837116bc1c3e1de5e1971a: done |++++++++++++++++++++++++++++++++++++++| elapsed: 0.4 s total: 6.7 Mi (16.7 MiB/s) unpacking linux/amd64 sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c... done: 420.30215ms [root@elk93 ~]# [root@elk93 ~]# ctr -n test i ls REF TYPE DIGEST SIZE PLATFORMS LABELS 10.0.0.92/weixiang-containerd/xiuxian:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - [root@elk93 ~]#
6、Containerd实现数据持久化
bash
# 1.创建容器并指定挂载 [root@elk93 ~]# ctr container create --mount type=bind,src=/yinzhengjie/games,dst=/usr/local/nginx/html,options=rbind:rw registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 xiuxian01 # 2.查看容器 [root@elk93 ~]# ctr c ls CONTAINER IMAGE RUNTIME c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 io.containerd.runc.v2 xiuxian registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 io.containerd.runc.v2 xiuxian01 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 io.containerd.runc.v2 [root@elk93 ~]# # 3.启动task [root@elk93 ~]# mkdir -pv /yinzhengjie/games mkdir: created directory '/yinzhengjie' mkdir: created directory '/yinzhengjie/games' [root@elk93 ~]# [root@elk93 ~]# ctr t start xiuxian01 -d /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/ /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh [root@elk93 ~]# [root@elk93 ~]# ctr t ls TASK PID STATUS xiuxian01 5893 RUNNING c1 5256 RUNNING xiuxian 5446 RUNNING [root@elk93 ~]# # 4.链接容器写入测试数据 [root@elk93 ~]# ls -l /yinzhengjie/games/ total 0 [root@elk93 ~]# [root@elk93 ~]# ctr t exec -t --exec-id $RANDOM xiuxian01 sh / # ls /usr/local/nginx/html / # / # echo www.weixiang.com > /usr/local/nginx/html/index.html / # / # ls -l /usr/local/nginx/html total 4 -rw-r--r-- 1 root root 18 Jul 8 07:12 index.html / # / # [root@elk93 ~]# [root@elk93 ~]# ls -l /yinzhengjie/games/ total 4 -rw-r--r-- 1 root root 18 Jul 8 15:12 index.html [root@elk93 ~]# [root@elk93 ~]# cat /yinzhengjie/games/index.html www.weixiang.com [root@elk93 ~]# [root@elk93 ~]# # 5.删除容器 [root@elk93 ~]# ctr t ls TASK PID STATUS c1 5256 RUNNING xiuxian 5446 RUNNING xiuxian01 5893 RUNNING [root@elk93 ~]# [root@elk93 ~]# ctr t kill xiuxian01 # 停止任务,让容器处于停止状态 [root@elk93 ~]# [root@elk93 ~]# ctr t ls TASK PID STATUS c1 5256 RUNNING xiuxian 5446 RUNNING xiuxian01 5893 STOPPED [root@elk93 ~]# [root@elk93 ~]# ctr c ls CONTAINER IMAGE RUNTIME c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 io.containerd.runc.v2 xiuxian registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 io.containerd.runc.v2 xiuxian01 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 io.containerd.runc.v2 [root@elk93 ~]# [root@elk93 ~]# ctr c rm xiuxian01 # 删除容器也会一并删除停止的任务 [root@elk93 ~]# [root@elk93 ~]# ctr c ls CONTAINER IMAGE RUNTIME c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 io.containerd.runc.v2 xiuxian registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 io.containerd.runc.v2 [root@elk93 ~]# [root@elk93 ~]# ctr t ls TASK PID STATUS c1 5256 RUNNING xiuxian 5446 RUNNING [root@elk93 ~]# # 6.验证数据是否丢失 [root@elk93 ~]# ll /yinzhengjie/games/ total 12 drwxr-xr-x 2 root root 4096 Jul 8 15:12 ./ drwxr-xr-x 3 root root 4096 Jul 8 15:09 ../ -rw-r--r-- 1 root root 18 Jul 8 15:12 index.html [root@elk93 ~]# [root@elk93 ~]# cat /yinzhengjie/games/index.html # 很明显,删除容器时数据并不丢失!! www.weixiang.com
7、docker集成Containerd
bash
# 1.为什么要使用Docker集成Containerd实现容器管理 目前Containerd主要任务还在于解决容器运行时的问题,对于其周边生态还不完善。 所以可以借助Docker结合Containerd来实现Docker完整的功能应用。 # 2.启动Containerd服务 [root@elk92 ~]# cat > /lib/systemd/system/containerd.service <<EOF [Unit] Description=weixiang linux Containerd Server Documentation=https://docs.docker.com,https://www.weixiang.com Wants=network-online.target [Service] Type=notify ExecStart=/weixiang/softwares/docker/containerd\ [Install] WantedBy=multi-user.target EOF [root@elk92 ~]# systemctl daemon-reload [root@elk92 ~]# [root@elk92 ~]# systemctl enable --now containerd.service Created symlink /etc/systemd/system/multi-user.target.wants/containerd.service → /lib/systemd/system/containerd.service. [root@elk92 ~]# [root@elk92 ~]# ll /run/containerd/containerd.sock srw-rw---- 1 root root 0 Jul 8 15:23 /run/containerd/containerd.sock= [root@elk92 ~]# # 3.修改docker启动脚本 [root@elk92 ~]# cat /lib/systemd/system/docker.service [Unit] Description=weixiang linux Docke Engine Documentation=https://docs.docker.com,https://www.weixiang.com Wants=network-online.target [Service] Type=notify ExecStart=/usr/bin/dockerd --containerd /run/containerd/containerd.sock --debug [Install] WantedBy=multi-user.target [root@elk92 ~]# [root@elk92 ~]# systemctl daemon-reload [root@elk92 ~]# [root@elk92 ~]# systemctl restart docker.service [root@elk92 ~]# [root@elk92 ~]# ctr namespace ls NAME LABELS moby [root@elk92 ~]# [root@elk92 ~]# ctr -n moby i ls REF TYPE DIGEST SIZE PLATFORMS LABELS [root@elk92 ~]# # 4.启动容器 [root@elk92 ~]# docker run --restart unless-stopped -dp 88:80 --name yinzhengjie-games registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 884a7a2d7def95906b9fc79fe7f143e698223d58f024277839666d4dc51a6952 [root@elk92 ~]# [root@elk92 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 884a7a2d7def registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 "/docker-entrypoint.…" 6 seconds ago Up 4 seconds 0.0.0.0:88->80/tcp, :::88->80/tcp yinzhengjie-games [root@elk92 ~]# [root@elk92 ~]# docker ps -lq --no-trunc 884a7a2d7def95906b9fc79fe7f143e698223d58f024277839666d4dc51a6952 [root@elk92 ~]# # 5.验证docker底层调用了Containerd [root@elk92 ~]# ctr -n moby c ls | grep 78fd3f9a5e38bd41115a65c61fc8b6b6df27e368b5e972f60dafc46f06be3dc5 884a7a2d7def95906b9fc79fe7f143e698223d58f024277839666d4dc51a6952 - io.containerd.runc.v2 [root@elk92 ~]# [root@elk92 ~]# ctr -n moby t ls | grep 78fd3f9a5e38bd41115a65c61fc8b6b6df27e368b5e972f60dafc46f06be3dc5 884a7a2d7def95906b9fc79fe7f143e698223d58f024277839666d4dc51a6952 145835 RUNNING [root@elk92 ~]# [root@elk92 ~]# [root@elk92 ~]# ctr -n moby t exec --exec-id $RANDOM -t 78fd3f9a5e38bd41115a65c61fc8b6b6df27e368b5e972f60dafc46f06be3dc5 sh / # ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:02 inet addr:172.17.0.2 Bcast:172.17.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:40 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:5524 (5.3 KiB) TX bytes:0 (0.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # [root@elk92 ~]# [root@elk92 ~]# docker exec -it yinzhengjie-games sh / # ifconfig eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:02 inet addr:172.17.0.2 Bcast:172.17.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 
Metric:1 RX packets:50 errors:0 dropped:0 overruns:0 frame:0 TX packets:7 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:6222 (6.0 KiB) TX bytes:1017 (1017.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # 综上操作,不难发现,docker底层可以调用Containerd,并丰富Containerd的功能。
8、harbor的https实战
bash
1.准备证书 [root@elk91 ~]# wget http://192.168.21.253/Resources/Docker/SSL/harbor.yinzhengjie.com_nginx.zip 2.解压证书 [root@elk91 ~]# unzip harbor.yinzhengjie.com_nginx.zip -d /usr/local/harbor/ Archive: harbor.yinzhengjie.com_nginx.zip creating: /usr/local/harbor/harbor.yinzhengjie.com_nginx/ inflating: /usr/local/harbor/harbor.yinzhengjie.com_nginx/harbor.yinzhengjie.com.csr inflating: /usr/local/harbor/harbor.yinzhengjie.com_nginx/harbor.yinzhengjie.com_bundle.crt inflating: /usr/local/harbor/harbor.yinzhengjie.com_nginx/harbor.yinzhengjie.com_bundle.pem inflating: /usr/local/harbor/harbor.yinzhengjie.com_nginx/harbor.yinzhengjie.com.key [root@elk91 ~]# 3.修改harbor 的配置文件 [root@elk91 ~]# cd /usr/local/harbor/ [root@elk91 harbor]# [root@elk91 harbor]# ll total 650936 drwxr-xr-x 4 root root 4096 Jul 8 16:21 ./ drwxr-xr-x 14 root root 4096 Jul 7 10:38 ../ drwxr-xr-x 3 root root 4096 Jul 7 10:43 common/ -rw-r--r-- 1 root root 3646 May 22 15:48 common.sh -rw-r--r-- 1 root root 5998 Jul 7 10:43 docker-compose.yml -rw-r--r-- 1 root root 666471629 May 22 15:48 harbor.v2.13.1.tar.gz drwxrwxrwx 2 root root 4096 Jul 8 16:19 harbor.yinzhengjie.com_nginx/ -rw-r--r-- 1 root root 14784 Jul 7 10:40 harbor.yml -rw-r--r-- 1 root root 14688 May 22 15:48 harbor.yml.tmpl -rwxr-xr-x 1 root root 1975 Jul 7 10:42 install.sh* -rw-r--r-- 1 root root 11347 May 22 15:48 LICENSE -rwxr-xr-x 1 root root 2211 May 22 15:48 prepare* [root@elk91 harbor]# [root@elk91 harbor]# vim harbor.yml ... # hostname: 10.0.0.91 hostname: harbor.yinzhengjie.com ... https: # https port for harbor, default is 443 port: 443 # The path of cert and key files for nginx certificate: /usr/local/harbor/harbor.yinzhengjie.com_nginx/harbor.yinzhengjie.com_bundle.crt private_key: /usr/local/harbor/harbor.yinzhengjie.com_nginx/harbor.yinzhengjie.com.key 4.重新安装harbor [root@elk91 harbor]# ./install.sh [root@elk91 harbor]# [root@elk91 harbor]# ss -ntl | egrep "80|443" LISTEN 0 4096 0.0.0.0:443 0.0.0.0:* LISTEN 0 4096 0.0.0.0:80 0.0.0.0:* LISTEN 0 4096 [::]:443 [::]:* LISTEN 0 4096 [::]:80 [::]:* [root@elk91 harbor]# [root@elk91 harbor]# 5.windows添加解析 10.0.0.91 linux.weixiang.info 6.访问验证 https://linux.weixiang.info/harbor/projects

image
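证书配置完成后,可以用openssl在命令行快速确认nginx(443端口)实际下发的证书域名和有效期是否符合预期,避免等到docker拉取报错才发现证书问题。下面是一个检查示例(域名以本文环境为例)。

bash
# 查看harbor 443端口返回的证书的使用者、颁发者和有效期
echo | openssl s_client -connect 10.0.0.91:443 -servername harbor.yinzhengjie.com 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates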

9、docker访问https的harbor
bash
1.添加hosts解析 [root@elk92 ~]# echo 10.0.0.91 linux.weixiang.info >> /etc/hosts [root@elk92 ~]# tail -1 /etc/hosts 10.0.0.91 harbor.yinzhengjie.com [root@elk92 ~]# 2.拉取镜像测试 [root@elk92 ~]# docker pull linux.weixiang.info/weixiang-xixi/alpine:latest latest: Pulling from weixiang-xixi/alpine Digest: sha256:33735bd63cf84d7e388d9f6d297d348c523c044410f553bd878c6d7829612735 Status: Downloaded newer image for harbor.yinzhengjie.com/weixiang-xixi/alpine:latest harbor.yinzhengjie.com/weixiang-xixi/alpine:latest [root@elk92 ~]# 3.推送镜像测试 [root@elk92 ~]# docker login -u admin -p 1 harbor.yinzhengjie.com WARNING! Using --password via the CLI is insecure. Use --password-stdin. WARNING! Your password will be stored unencrypted in /root/.docker/config.json. Configure a credential helper to remove this warning. See https://docs.docker.com/engine/reference/commandline/login/#credentials-store Login Succeeded [root@elk92 ~]# [root@elk92 ~]# docker push harbor.yinzhengjie.com/weixiang-xixi/alpine:v1 The push refers to repository [harbor.yinzhengjie.com/weixiang-xixi/alpine] 63ca1fbb43ae: Layer already exists v1: digest: sha256:33735bd63cf84d7e388d9f6d297d348c523c044410f553bd878c6d7829612735 size: 528 4.webUI验证 略,见视频。 # 可能会出现的错误 1、 [root@elk92 ~]# docker pull harbor.yinzhengjie.com/weixiang-xixi/alpine:latest Error response from daemon: Get "https://harbor.yinzhengjie.com/v2/": dial tcp: lookup harbor.yinzhengjie.com on 127.0.0.53:53: no such host 解决方案: linux服务器配置解析即可。 2、 配置https证书进行拉取测试的时候一直报错,cannot validate certificate for 10.0.0.91 because it doesnt contain any IP SANs,经过查阅资料发现是Harbor的HTTPS证书配置与Docker客户端的验证机制不匹配通过修改harbor.yml文件中的 hostname: linux.weixiang.info,改成自己证书的域名,然后重新./install.sh访问通过

image
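补充:如果harbor用的是自签证书,除了把hostname改成证书对应的域名之外,另一个常见做法是把签发该证书的CA下发给docker客户端,这样无需配置insecure-registries。下面是一个思路示例(证书文件名、路径为假设值,请以实际签发的CA文件为准)。

bash
# docker会自动读取/etc/docker/certs.d/<仓库域名>/ca.crt作为访问该仓库时信任的CA
mkdir -p /etc/docker/certs.d/harbor.yinzhengjie.com

# 将签发harbor证书的CA(或证书链)拷贝过来,文件名必须是ca.crt(路径为示例值)
scp 10.0.0.91:/usr/local/harbor/harbor.yinzhengjie.com_nginx/harbor.yinzhengjie.com_bundle.crt \
    /etc/docker/certs.d/harbor.yinzhengjie.com/ca.crt

# 无需重启docker,直接重新登录、拉取测试即可
docker login -u admin harbor.yinzhengjie.com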

10、containerd访问https的harbor
bash
1.打镜像标签 [root@elk93 ~]# ctr i ls REF TYPE DIGEST SIZE PLATFORMS LABELS 10.0.0.92/weixiang-containerd/xiuxian:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - docker.io/library/nginx:1.29.0-alpine application/vnd.oci.image.index.v1+json sha256:b2e814d28359e77bd0aa5fed1939620075e4ffa0eb20423cc557b375bd5c14ad 21.4 MiB linux/386,linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8,linux/ppc64le,linux/riscv64,linux/s390x,unknown/unknown - registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - [root@elk93 ~]# [root@elk93 ~]# ctr i tag registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 harbor.yinzhengjie.com/library/xiuxian:v1 linux.weixiang.info/library/xiuxian:v1 [root@elk93 ~]# [root@elk93 ~]# ctr i ls REF TYPE DIGEST SIZE PLATFORMS LABELS 10.0.0.92/weixiang-containerd/xiuxian:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - docker.io/library/nginx:1.29.0-alpine application/vnd.oci.image.index.v1+json sha256:b2e814d28359e77bd0aa5fed1939620075e4ffa0eb20423cc557b375bd5c14ad 21.4 MiB linux/386,linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8,linux/ppc64le,linux/riscv64,linux/s390x,unknown/unknown - harbor.yinzhengjie.com/library/xiuxian:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 - [root@elk93 ~]# 2.添加hosts解析 [root@elk93 ~]# echo 10.0.0.91 linux.weixiang.info >> /etc/hosts [root@elk93 ~]# tail -1 /etc/hosts 10.0.0.91 linux.weixiang.info [root@elk93 ~]# 3.推送镜像到harbor仓库 [root@elk93 ~]# ctr i push linux.weixiang.info/library/xiuxian:v1 -u admin:1 manifest-sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c: done |++++++++++++++++++++++++++++++++++++++| config-sha256:f28fd43be4ad41fc768dcc3629f8479d1443df01ada10ac9a771314e4fdef599: done |++++++++++++++++++++++++++++++++++++++| elapsed: 0.4 s total: 9.4 Ki (23.4 KiB/s) [root@elk93 ~]# 4.访问webUI验证 https://linux.weixiang.info/harbor/projects/1/repositories

image
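补充:如果推送时遇到x509证书校验失败(例如证书不是权威机构颁发的),ctr可以临时用--skip-verify(-k)跳过TLS校验,下面是一个示意,仅建议在测试环境使用。

bash
# 跳过TLS证书校验推送镜像
ctr i push --skip-verify -u admin:1 linux.weixiang.info/library/xiuxian:v1

# 拉取同理
ctr -n test i pull --skip-verify linux.weixiang.info/library/xiuxian:v1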

bash
5.拉取镜像
[root@elk92 ~]# ctr -n weixiang i pull linux.weixiang.info/library/xiuxian:v1
[root@elk92 ~]#
[root@elk92 ~]# ctr ns ls
NAME     LABELS
moby
weixiang
[root@elk92 ~]#
[root@elk92 ~]# ctr -n weixiang i ls
REF                                       TYPE                                                 DIGEST                                                                  SIZE    PLATFORMS   LABELS
harbor.yinzhengjie.com/library/xiuxian:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 -


彩蛋:
    如果你的证书不是权威机构颁发的,可能按照这种方式推送会有问题,我们需要跳过证书校验,push的时候使用"--skip-verify, -k"跳过证书校验。


- 今日内容回顾:
    - 第三方仓库阿里云 **
    - docker hub官方仓库 *
    - containerd的基本使用 ***
        - 名称空间
        - 镜像
        - 容器
        - 存储卷
        - 对接harbor
    - harbor的https实战 *****
    - docker和Containerd的区别 *****
        - 1.containerd存在名称空间和task的概念;
        - 2.containerd没有网络,仅支持host,none两种模式;
        - 3.docker可以集成Containerd,实现Containerd的功能扩容;

- 今日作业:
    - 完成课堂的所有练习并整理思维导图;

扩展作业:
    在不集成docker环境的前提下,独立完成containerd的网络插件部署,让其创建的容器能够有自己的网卡。

- 明天环境准备:
    10.0.0.250 harbor.weixiang.com ---> 1c 2G
    10.0.0.231 master231 ---> 2C4G
    10.0.0.232 worker232 ---> 2C4G
    10.0.0.233 worker233 ---> 2C4G

    磁盘建议: 100GB

12、kubernetes

1、kubernetes安装与介绍

1、Kubernetes集群架构

image

bash
# k8s集群架构由三部分组成,分别是master、worker和网络插件(CNI)。

master又分为:
    1、api-server: 集群管理的唯一入口点,负责认证、授权、准入控制、请求验证、数据读写(状态存储在etcd中),所有用户操作(kubectl)、其他控制平面组件、Worker Node都通过API Server与集群交互。
    2、scheduler: 负责调度新创建的Pod(容器组)到合适的Worker Node上运行,它只做决策,不实际运行Pod。
    3、Controller Manager: 负责监视集群状态,通过各类控制器(controller)的控制回路把集群维持在期望状态,维护K8S集群。
    4、etcd: 负责集群数据存储,API Server是唯一能直接与etcd交互的组件。

worker又分为:
    1、kubelet: 管理容器的生命周期,并上报容器状态和节点状态到api-server。
    2、kube-proxy: 实现容器访问,用于代理容器的访问入口。

CNI: 网络插件,解决Kubernetes集群中复杂的容器网络通信问题。

1.什么是Kubernetes
    Kubernetes简称K8S,是一款用于编排容器的工具,支持单点,集群模式。
    官网地址: https://kubernetes.io/

2.K8S架构
    - master: ---> control plane 控制平面,管理K8S集群。
        - etcd: 默认使用https协议,用于数据存储。
        - api-server: 默认使用https协议,用于K8S集群管理访问控制入口。
        - controller manager: 用于维护K8S集群。
        - scheduler: 负责调度相关工作。

    - slave: ---> worker node 工作节点,负责干活的。
        - kubelet: 管理容器的生命周期,并上报容器状态和节点状态到api-server。
        - kube-proxy: 实现容器访问,用于代理容器的访问入口。

    - CNI: Container Network Interface
        - Flannel
        - Calico
        - cilium
        - Canal
        - ...
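集群搭建完成后(参考后文的kubeadm初始化步骤),可以用下面的命令直观地对照上述架构,确认各控制平面组件和节点的运行情况。

bash
# 查看控制平面组件对应的Pod(api-server、controller-manager、scheduler、etcd等)
kubectl get pods -n kube-system -o wide

# 查看各节点状态(kubelet是否正常上报)
kubectl get nodes -o wide

# 查看组件健康状态(1.23中仍可用,后续版本已废弃)
kubectl get componentstatuses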
2、Kubernetes的三种网络类型

image

bash
- k8s组件网络
    即物理机的IP地址网段,例如ens33网卡所在的网段。

- CNI: Pod(容器)网络
    为容器提供网络,可以跨主机通信,也可以在不同网段的主机之间通信;缺点是Pod的IP不固定,如果主机宕机,scheduler会把Pod调度到新的节点,直接写死Pod IP的调用方只能手动修改配置文件中的IP指向。

- Service: 服务网络
    为一组Pod提供统一的访问入口,作用类似于负载均衡器;访问方只需要关联具体的服务(比如mysql),不用关心后端Pod的IP地址是哪个。
3、Kubernetes的部署方式
bash
1.官方默认都有两种部署方式: (在生产环境中都可以使用,且都支持高可用环境。咱们学习过程中,建议选择kubeadm。) - 二进制部署K8S集群 手动部署K8S各个组件,配置文件,启动脚本及证书生成,kubeconfig文件。 配置繁琐,对新手不友好,尤其是证书管理。但是可以自定义配置信息,老手部署的话2小时起步,新手20+小时 - kubeadm部署K8S集群 是官方提供的一种快速部署K8S各组件的部署方式,如果镜像准备就绪的情况下,基于容器的方式部署。 需要提前安装kubelet,docker或者containerd,kubeadm组件。 配置简单,适合新手。新手在镜像准备好的情况下,仅需要2分钟部署完毕。 2.第三方提供的部署方式: 国内公司: - 青云科技: kubesphere ---》kubekey 底层基于kubeadm快速部署K8S,提供了丰富的图形化管理界面。 - kuboard 底层基于kubeadm快速部署K8S,提供了丰富的图形化管理界面。 - kubeasz 底层基于二进制方式部署,结合ansible的playbook实现的快速部署管理K8S集群。 国外的产品: - rancher: 和国内的kubesphere很相似,也是K8S发行商,提供了丰富的图形化管理界面。 还基于K8S研发出来了K3S,号称轻量级的K8S。 云厂商: - 阿里云的ACK的SAAS产品 - 腾讯云的TKE的SAAS产品 - 华为云的CCE的SAAS产品 - ucloud的UK8S的SAAS产品 - 亚马逊的Amazon EKS的SAAS产品 - 京东云,百度云的SAAS产品等。 其他部署方式: - minikube: 适合在windows部署K8S,适合开发环境搭建的使用。不建议生产环境部署。 - kind: 可以部署多套K8S环境,轻量级的命令行管理工具。 - yum: 不推荐,版本支持较低,默认是1.5.2。 CNCF技术蓝图: https://landscape.cncf.io/ 3.二进制部署和kubeadm部署的区别: 相同点: 都可以部署K8S高可用集群。 不同点: - 1.部署难度: kubeadm简单. - 2.部署时间: kubeadm短时间。 - 3.证书管理: 二进制需要手动生成,而kubeadm自建三套10年的CA证书,各组件证书有效期为1年。 - 4.软件安装: kubeadm需要单独安装kubeadm,kubectl和kubelet组件,由kubelet组件启动K8S其他相关Pod,而二进制需要安装除了kubeadm的其他K8S组件。 4.Kubernetes的版本选择 首先,是K8S 1.23.17版本,该版本的第一个rc版本是2021年初,最后一个版本是23年年初结束。 其次,部署K8S 1.24,时间允许的话,我们会基于kubekey的方式部署K8S 1.28版本,而后部署kubesphere来管理多套K8S集群。 最后,我们以二进制部署K8S集群搭建最新版本讲解。
4、k8s集群环境准备
bash
- k8s集群环境准备 1.环境准备 推荐阅读: https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/install-kubeadm/ 环境准备: 硬件配置: 2core 4GB 磁盘: 100GB+ [系统盘] 磁盘b: 300GB [Rook待用] 磁盘C: 500GB [Rook待用] 磁盘D: 1TB [Rook待用] 操作系统: Ubuntu 22.04.04 LTS IP和主机名: 10.0.0.231 master231 10.0.0.232 worker232 10.0.0.233 worker233 所有节点能够上网,机器必须"干净"。 温馨提示: 虚拟机的windows宿主机存储路径磁盘最少保证200GB剩余空间。 2.关闭swap分区 swapoff -a && sysctl -w vm.swappiness=0 # 临时关闭 sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab # 基于配置文件关闭 3.确保各个节点MAC地址或product_uuid唯一 ifconfig eth0 | grep ether | awk '{print $2}' cat /sys/class/dmi/id/product_uuid 温馨提示: 一般来讲,硬件设备会拥有唯一的地址,但是有些虚拟机的地址可能会重复。 Kubernetes使用这些值来唯一确定集群中的节点。 如果这些值在每个节点上不唯一,可能会导致安装失败。 4.检查网络节点是否互通 简而言之,就是检查你的k8s集群各节点是否互通,可以使用ping命令来测试。 ping baidu.com -c 10 5.允许iptable检查桥接流量 cat <<EOF | tee /etc/modules-load.d/k8s.conf br_netfilter EOF cat <<EOF | tee /etc/sysctl.d/k8s.conf net.bridge.bridge-nf-call-ip6tables = 1 net.bridge.bridge-nf-call-iptables = 1 net.ipv4.ip_forward = 1 EOF sysctl --system 6 检查端口是否被占用 参考链接: https://kubernetes.io/zh-cn/docs/reference/networking/ports-and-protocols/ 检查master节点和worker节点的各组件端口是否被占用。 7.所有节点修改cgroup的管理进程为systemd 7.1 安装docker环境 略,见视频。 wget http://192.168.21.253/Resources/Docker/scripts/weixiang-autoinstall-docker-docker-compose.tar.gz tar xf weixiang-autoinstall-docker-docker-compose.tar.gz ./install-docker.sh i 7.2 检查cgroup驱动是否是systemd [root@master231 ~]# docker info | grep "Cgroup Driver:" Cgroup Driver: systemd [root@master231 ~]# [root@worker232 ~]# docker info | grep "Cgroup Driver:" Cgroup Driver: systemd [root@worker232 ~]# [root@worker233 ~]# docker info | grep "Cgroup Driver:" Cgroup Driver: systemd [root@worker233 ~]# 8.所有节点安装kubeadm,kubelet,kubectl 8.1 软件包说明 你需要在每台机器上安装以下的软件包: kubeadm: 用来初始化K8S集群的工具。 kubelet: 在集群中的每个节点上用来启动Pod和容器等。 kubectl: 用来与K8S集群通信的命令行工具。 kubeadm不能帮你安装或者管理kubelet或kubectl,所以你需要确保它们与通过kubeadm安装的控制平面(master)的版本相匹配。 如果不这样做,则存在发生版本偏差的风险,可能会导致一些预料之外的错误和问题。 然而,控制平面与kubelet间的相差一个次要版本不一致是支持的,但kubelet的版本不可以超过"API SERVER"的版本。 例如,1.7.0版本的kubelet可以完全兼容1.8.0版本的"API SERVER",反之则不可以。 8.2 K8S所有节点配置软件源(建议拷贝2次)【线上同学】 apt-get update && apt-get install -y apt-transport-https curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add - cat <<EOF >/etc/apt/sources.list.d/kubernetes.list deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main EOF apt-get update 8.3 查看一下当前环境支持的k8s版本【线上同学】 [root@master231 ~]# apt-cache madison kubeadm kubeadm | 1.28.2-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages kubeadm | 1.28.1-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages kubeadm | 1.28.0-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages ... kubeadm | 1.23.17-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages kubeadm | 1.23.16-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages kubeadm | 1.23.15-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages kubeadm | 1.23.14-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages ... 
8.4 所有节点安装 kubelet kubeadm kubectl apt-get -y install kubelet=1.23.17-00 kubeadm=1.23.17-00 kubectl=1.23.17-00 SVIP:【可以直接跳过8.2-8.4】 我已经把软件包打包好了,放在了nginx站点目录。 wget http://192.168.21.253/Resources/Kubernetes/softwares/k8s-v1.23.17.tar.gz tar xf k8s-v1.23.17.tar.gz cd var/cache/apt/archives/ && dpkg -i *.deb 8.5 检查各组件版本 [root@master231 archives]# kubeadm version kubeadm version: &version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.17", GitCommit:"953be8927218ec8067e1af2641e540238ffd7576", GitTreeState:"clean", BuildDate:"2023-02-22T13:33:14Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"} [root@master231 archives]# [root@master231 archives]# kubectl version Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.17", GitCommit:"953be8927218ec8067e1af2641e540238ffd7576", GitTreeState:"clean", BuildDate:"2023-02-22T13:34:27Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"} The connection to the server localhost:8080 was refused - did you specify the right host or port? [root@master231 archives]# [root@master231 archives]# kubelet --version Kubernetes v1.23.17 [root@master231 archives]# [root@master231 archives]# [root@worker232 archives]# kubeadm version kubeadm version: &version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.17", GitCommit:"953be8927218ec8067e1af2641e540238ffd7576", GitTreeState:"clean", BuildDate:"2023-02-22T13:33:14Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"} [root@worker232 archives]# [root@worker232 archives]# kubectl version Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.17", GitCommit:"953be8927218ec8067e1af2641e540238ffd7576", GitTreeState:"clean", BuildDate:"2023-02-22T13:34:27Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"} The connection to the server localhost:8080 was refused - did you specify the right host or port? [root@worker232 archives]# [root@worker232 archives]# kubelet --version Kubernetes v1.23.17 [root@worker232 archives]# [root@worker232 archives]# [root@worker233 archives]# kubeadm version kubeadm version: &version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.17", GitCommit:"953be8927218ec8067e1af2641e540238ffd7576", GitTreeState:"clean", BuildDate:"2023-02-22T13:33:14Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"} [root@worker233 archives]# [root@worker233 archives]# kubectl version Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.17", GitCommit:"953be8927218ec8067e1af2641e540238ffd7576", GitTreeState:"clean", BuildDate:"2023-02-22T13:34:27Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"} The connection to the server localhost:8080 was refused - did you specify the right host or port? [root@worker233 archives]# [root@worker233 archives]# kubelet --version Kubernetes v1.23.17 [root@worker233 archives]# [root@worker233 archives]# 参考链接: https://kubernetes.io/zh/docs/tasks/tools/install-kubectl-linux/ 9.检查时区 [root@master231 ~]# ln -svf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime '/etc/localtime' -> '/usr/share/zoneinfo/Asia/Shanghai' # -v:命令在执行成功时会打印出它做了什么 # -f: 如果目标位置 /etc/localtime 已经存在一个文件或链接,-f 参数会强制先删除已有的文件,然后再创建新的链接。 [root@master231 ~]# [root@master231 ~]# ll /etc/localtime lrwxrwxrwx 1 root root 33 Feb 10 11:26 /etc/localtime -> /usr/share/zoneinfo/Asia/Shanghai [root@master231 ~]# [root@master231 ~]# date -R Thu, 22 May 2025 10:18:16 +0800 [root@master231 ~]# 10.拍快照 init 0
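补充:上面关闭swap、加载br_netfilter、内核参数和时区这几步在所有节点都要重复执行,可以整理成一个参考脚本统一跑一遍(命令均取自上文步骤,脚本名 init-k8s-node.sh 为示例):

#!/bin/bash
# 参考示例: 汇总上文的节点初始化动作, 在所有节点各执行一次
set -e

# 关闭swap分区(临时 + 配置文件)
swapoff -a && sysctl -w vm.swappiness=0
sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab

# 加载br_netfilter模块(modules-load.d只在开机时生效, 这里顺手modprobe一次)
cat <<EOF | tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
modprobe br_netfilter

# 允许iptables检查桥接流量并开启转发
cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system

# 设置时区
ln -svf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime

echo "节点初始化完成, 接下来安装docker和kubeadm/kubelet/kubectl"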
5、基于kubeadm组件初始化K8S的master组件
bash
1.提前导入镜像 [root@master231 ~]# wget http://192.168.21.253/Resources/Kubernetes/K8S%20Cluster/kubeadm/weixiang-master-1.23.17.tar.gz [root@master231 ~]# docker load -i weixiang-master-1.23.17.tar.gz [root@master231 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE registry.aliyuncs.com/google_containers/kube-apiserver v1.23.17 62bc5d8258d6 23 months ago 130MB registry.aliyuncs.com/google_containers/kube-controller-manager v1.23.17 1dab4fc7b6e0 23 months ago 120MB registry.aliyuncs.com/google_containers/kube-scheduler v1.23.17 bc6794cb54ac 23 months ago 51.9MB registry.aliyuncs.com/google_containers/kube-proxy v1.23.17 f21c8d21558c 23 months ago 111MB registry.aliyuncs.com/google_containers/etcd 3.5.6-0 fce326961ae2 2 years ago 299MB registry.aliyuncs.com/google_containers/coredns v1.8.6 a4ca41631cc7 3 years ago 46.8MB registry.aliyuncs.com/google_containers/pause 3.6 6270bb605e12 3 years ago 683kB [root@master231 ~]# 2.使用kubeadm初始化master节点 [root@master231 ~]# kubeadm init --kubernetes-version=v1.23.17 --image-repository registry.aliyuncs.com/google_containers --pod-network-cidr=10.100.0.0/16 --service-cidr=10.200.0.0/16 --service-dns-domain=weixiang.com #1 --kubernetes-version=v1.23.17: 指定要安装的 Kubernetes 版本。 #2 --image-repository registry.aliyuncs.com/google_containers: 这个非常关键。Kubernetes 的核心组件镜像是存放在 # Google 的镜像仓库(k8s.gcr.io)上的,在中国大陆访问很困难。这个参数将镜像仓库地址替换为阿里云的镜像加速器,确保你 # 能顺利拉取镜像 #3 --pod-network-cidr=10.100.0.0/16: 指定 Pod 网络的 IP 地址范围。集群中每个 Pod(运行容器的最小单元)都会从这个 # 地址池里获得一个 IP 地址。这个地址段必须与你后续安装的网络插件(如 Calico、Flannel)配置的地址段一致。 #4 --service-cidr=10.200.0.0/16: 指定 Service 网络的 IP 地址范围。Service 是为一组 Pod 提供一个稳定访问入口的抽 # 象层,它也会有一个虚拟 IP(ClusterIP),这个 IP 从该地址池分配。 #5 --service-dns-domain=weixiang.com: 自定义集群内部的 DNS 域名。默认是 cluster.local。设置后,你的服务可以通 # 过 服务名.命名空间.svc.weixiang.com 的形式被访问。 ... Your Kubernetes control-plane has initialized successfully! To start using your cluster, you need to run the following as a regular user: mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config #1 mkdir -p $HOME/.kube: 在当前用户的家目录下创建一个名为 .kube 的隐藏文件夹。这是 kubectl 默认存放配置的地方 #2 sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config: 将主配置文件(它属于 root)拷贝到你用户的配置目录,并重命名为 config。 #3 sudo chown $(id -u):$(id -g) $HOME/.kube/config: 这一步至关重要。因为上一步是用 sudo 拷贝的,所以新文件的所有 # 者是 root。这条命令将 config 文件的所有者改回为当前用户($(id -u)是当前用户ID,$(id -g)是当前用户组ID),这样 # 你就不需要每次用 sudo 来运行 kubectl 了。 Alternatively, if you are the root user, you can run: export KUBECONFIG=/etc/kubernetes/admin.conf You should now deploy a pod network to the cluster. 
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at: https://kubernetes.io/docs/concepts/cluster-administration/addons/ Then you can join any number of worker nodes by running the following on each as root: kubeadm join 10.0.0.231:6443 --token nrb0nv.pvw1w3fbskraz3bd \ --discovery-token-ca-cert-hash sha256:2e59d5d84d2fad847601a21917719c93efce1620746da1f48e358a9174c1c152 [root@master231 ~]# 相关参数说明: 使用kubeadm初始化集群时,可能会出现如下的输出信息: [init] 使用初始化的K8S版本。 [preflight] 主要是做安装K8S集群的前置工作,比如下载镜像,这个时间取决于你的网速。 [certs] 生成证书文件,默认存储在"/etc/kubernetes/pki"目录哟。 [kubeconfig] 生成K8S集群的默认配置文件,默认存储在"/etc/kubernetes"目录哟。 [kubelet-start] 启动kubelet, 环境变量默认写入:"/var/lib/kubelet/kubeadm-flags.env" 配置文件默认写入:"/var/lib/kubelet/config.yaml" [control-plane] 使用静态的目录,默认的资源清单存放在:"/etc/kubernetes/manifests"。 此过程会创建静态Pod,包括"kube-apiserver""kube-controller-manager""kube-scheduler" [etcd] 创建etcd的静态Pod,默认的资源清单存放在:""/etc/kubernetes/manifests" [wait-control-plane] 等待kubelet从资源清单目录"/etc/kubernetes/manifests"启动静态Pod。 [apiclient] 等待所有的master组件正常运行。 [upload-config] 创建名为"kubeadm-config"的ConfigMap在"kube-system"名称空间中。 [kubelet] 创建名为"kubelet-config-1.22"的ConfigMap在"kube-system"名称空间中,其中包含集群中kubelet的配置 [upload-certs] 跳过此节点,详情请参考”--upload-certs" [mark-control-plane] 标记控制面板,包括打标签和污点,目的是为了标记master节点。 [bootstrap-token] 创建token口令,例如:"kbkgsa.fc97518diw8bdqid"。 如下图所示,这个口令将来在加入集群节点时很有用,而且对于RBAC控制也很有用处哟。 [kubelet-finalize] 更新kubelet的证书文件信息 [addons] 添加附加组件,例如: CoreDNS"和"kube-proxy” 3.拷贝授权文件,用于管理K8S集群 [root@master231 ~]# mkdir -p $HOME/.kube [root@master231 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config [root@master231 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config 4.查看master组件是否正常工作 [root@master231 ~]# kubectl get componentstatuses Warning: v1 ComponentStatus is deprecated in v1.19+ NAME STATUS MESSAGE ERROR controller-manager Healthy ok etcd-0 Healthy {"health":"true","reason":""} scheduler Healthy ok [root@master231 ~]# [root@master231 ~]# [root@master231 ~]# kubectl get cs Warning: v1 ComponentStatus is deprecated in v1.19+ NAME STATUS MESSAGE ERROR scheduler Healthy ok controller-manager Healthy ok etcd-0 Healthy {"health":"true","reason":""} [root@master231 ~]# 5.查看工作节点 [root@master231 ~]# kubectl get nodes NAME STATUS ROLES AGE VERSION master231 NotReady control-plane,master 3m13s v1.23.17 [root@master231 ~]# [root@master231 ~]# kubectl get no NAME STATUS ROLES AGE VERSION master231 NotReady control-plane,master 3m15s v1.23.17 [root@master231 ~]# [root@master231 ~]# kubectl get no -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME master231 NotReady control-plane,master 3m23s v1.23.17 10.0.0.231 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic docker://20.10.24 [root@master231 ~]# 6.master初始化不成功解决问题的方法 可能存在的原因: - 由于没有禁用swap分区导致无法完成初始化; - 每个2core以上的CPU导致无法完成初始化; - 没有手动导入镜像; 解决方案: - 1.检查上面的是否有上面的情况 free -h lscpu - 2.重置当前节点环境 [root@master231 ~]# kubeadm reset -f - 3.再次尝试初始化master节点
6、基于kubeadm部署worker组件
bash
1.提前导入镜像 [root@worker232 ~]# wget http://192.168.21.253/Resources/Kubernetes/K8S%20Cluster/kubeadm/weixiang-slave-1.23.17.tar.gz [root@worker232 ~]# docker load -i weixiang-slave-1.23.17.tar.gz [root@worker232 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE registry.aliyuncs.com/google_containers/kube-proxy v1.23.17 f21c8d21558c 2 years ago 111MB registry.aliyuncs.com/google_containers/coredns v1.8.6 a4ca41631cc7 3 years ago 46.8MB registry.aliyuncs.com/google_containers/pause 3.6 6270bb605e12 3 years ago 683kB [root@worker232 ~]# [root@worker233 ~]# wget http://192.168.21.253/Resources/Kubernetes/K8S%20Cluster/kubeadm/weixiang-slave-1.23.17.tar.gz [root@worker233 ~]# docker load -i weixiang-slave-1.23.17.tar.gz [root@worker233 ~]# docker image ls REPOSITORY TAG IMAGE ID CREATED SIZE registry.aliyuncs.com/google_containers/kube-proxy v1.23.17 f21c8d21558c 2 years ago 111MB registry.aliyuncs.com/google_containers/coredns v1.8.6 a4ca41631cc7 3 years ago 46.8MB registry.aliyuncs.com/google_containers/pause 3.6 6270bb605e12 3 years ago 683kB [root@worker233 ~]# 2.将worker节点加入到master集群(注意,不要复制我的,而是根据你上一步master生成的token加入集群) [root@worker232 ~]# kubeadm join 10.0.0.231:6443 --token nrb0nv.pvw1w3fbskraz3bd \ --discovery-token-ca-cert-hash sha256:2e59d5d84d2fad847601a21917719c93efce1620746da1f48e358a9174c1c152 # 使用下面这个 kubeadm join 10.1.24.13:6443 --token j22cr0.04cqj6phehu4rvct \ --discovery-token-ca-cert-hash sha256:0a3262eddefaeda176117e50757807ebc5a219648bda6afad21bc1d5f921caca [root@worker233 ~]# kubeadm join 10.0.0.231:6443 --token nrb0nv.pvw1w3fbskraz3bd \ --discovery-token-ca-cert-hash sha256:2e59d5d84d2fad847601a21917719c93efce1620746da1f48e358a9174c1c152 3.验证worker节点是否加入成功 [root@master231 ~]# kubectl get no NAME STATUS ROLES AGE VERSION master231 NotReady control-plane,master 8m1s v1.23.17 worker232 NotReady <none> 42s v1.23.17 worker233 NotReady <none> 38s v1.23.17 [root@master231 ~]# [root@master231 ~]# kubectl get no -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME master231 NotReady control-plane,master 8m2s v1.23.17 10.0.0.231 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic docker://20.10.24 worker232 NotReady <none> 43s v1.23.17 10.0.0.232 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic docker://20.10.24 worker233 NotReady <none> 39s v1.23.17 10.0.0.233 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic docker://20.10.24 [root@master231 ~]#
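补充:kubeadm 生成的 token 默认24小时过期,如果后续还要加新节点或者忘记了 join 命令,可以在 master 上重新生成,无需重新初始化集群:

# 查看现有token
kubeadm token list
# 重新生成token并直接打印完整的join命令
kubeadm token create --print-join-command
# 也可以手动计算 --discovery-token-ca-cert-hash 的值
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | \
  openssl rsa -pubin -outform der 2>/dev/null | \
  openssl dgst -sha256 -hex | sed 's/^.* //'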
7、部署CNI插件之Flannel
bash
1.线上学员部署【需要docker能够翻墙】 kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml 2.SVIP学员使用我给的资源清单 2.1 # 所有节点导入镜像 !! wget http://192.168.21.253/Resources/Kubernetes/K8S%20Cluster/CNI/flannel/images/v0.27.0/weixiang-flannel-v0.27.0.tar.gz docker load -i weixiang-flannel-v0.27.0.tar.gz 2.2 修改Pod网段 [root@master231 ~]# wget http://192.168.21.253/Resources/Kubernetes/K8S%20Cluster/CNI/flannel/kube-flannel-v0.27.0.yml [root@master231 ~]# grep 16 kube-flannel-v0.27.0.yml "Network": "10.244.0.0/16", [root@master231 ~]# [root@master231 ~]# [root@master231 ~]# sed -i '/16/s#244#100#' kube-flannel-v0.27.0.yml [root@master231 ~]# [root@master231 ~]# grep 16 kube-flannel-v0.27.0.yml "Network": "10.100.0.0/16", [root@master231 ~]# [root@master231 ~]# 2.3 部署服务组件 [root@master231 ~]# kubectl apply -f kube-flannel-v0.27.0.yml namespace/kube-flannel created serviceaccount/flannel created clusterrole.rbac.authorization.k8s.io/flannel created clusterrolebinding.rbac.authorization.k8s.io/flannel created configmap/kube-flannel-cfg created daemonset.apps/kube-flannel-ds created [root@master231 ~]# [root@master231 ~]# kubectl get pods -A -o wide NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES kube-flannel kube-flannel-ds-5hbns 1/1 Running 0 14s 10.0.0.231 master231 <none> <none> kube-flannel kube-flannel-ds-dzffl 1/1 Running 0 14s 10.0.0.233 worker233 <none> <none> kube-flannel kube-flannel-ds-h5kwh 1/1 Running 0 14s 10.0.0.232 worker232 <none> <none> kube-system coredns-6d8c4cb4d-k52qr 1/1 Running 0 50m 10.100.0.3 master231 <none> <none> kube-system coredns-6d8c4cb4d-rvzd9 1/1 Running 0 50m 10.100.0.2 master231 <none> <none> kube-system etcd-master231 1/1 Running 0 50m 10.0.0.231 master231 <none> <none> kube-system kube-apiserver-master231 1/1 Running 0 50m 10.0.0.231 master231 <none> <none> kube-system kube-controller-manager-master231 1/1 Running 0 50m 10.0.0.231 master231 <none> <none> kube-system kube-proxy-588bm 1/1 Running 0 43m 10.0.0.232 worker232 <none> <none> kube-system kube-proxy-9bb67 1/1 Running 0 50m 10.0.0.231 master231 <none> <none> kube-system kube-proxy-n9mv6 1/1 Running 0 42m 10.0.0.233 worker233 <none> <none> kube-system kube-scheduler-master231 1/1 Running 0 50m 10.0.0.231 master231 <none> <none> [root@master231 ~]# 2.4 检查节点是否就绪 [root@master231 ~]# kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME master231 Ready control-plane,master 53m v1.23.17 10.0.0.231 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic docker://20.10.24 worker232 Ready <none> 46m v1.23.17 10.0.0.232 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic docker://20.10.24 worker233 Ready <none> 46m v1.23.17 10.0.0.233 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic docker://20.10.24 [root@master231 ~]# # 可能会出现的错误 E0709 03:27:12.515335 1 main.go:359] Error registering network: failed to acquire lease: subnet "10.244.0.0/16" specified in the flannel net config doesnt contain "10.100.0.0/24" PodCIDR of the "master231" node 错误原因: pod网段和Flannel网段不一致。 - 验证CNI网络插件是否正常,curl下别的ip是否可以连接 1.下载资源清单 [root@master231 ~]# wget http://192.168.21.253/Resources/Kubernetes/K8S%20Cluster/CNI/flannel/weixiang-network-cni-test.yaml 2.应用资源清单 [root@master231 ~]# cat weixiang-network-cni-test.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-v1 spec: nodeName: worker232 containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 name: xiuxian --- apiVersion: v1 kind: Pod metadata: name: xiuxian-v2 spec: 
nodeName: worker233 containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 name: xiuxian [root@master231 ~]# [root@master231 ~]# kubectl apply -f weixiang-network-cni-test.yaml pod/xiuxian-v1 created pod/xiuxian-v2 created [root@master231 ~]# 3.访问测试 [root@master231 ~]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-v1 1/1 Running 0 14s 10.100.1.2 worker232 <none> <none> xiuxian-v2 1/1 Running 0 14s 10.100.2.2 worker233 <none> <none> [root@master231 ~]# [root@master231 ~]# curl 10.100.1.2 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 ~]# [root@master231 ~]# curl 10.100.2.2 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v2</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: red">凡人修仙传 v2 </h1> <div> <img src="2.jpg"> <div> </body> </html> [root@master231 ~]# 4.删除pod [root@master231 ~]# kubectl delete -f weixiang-network-cni-test.yaml pod "xiuxian-v1" deleted pod "xiuxian-v2" deleted [root@master231 ~]# [root@master231 ~]# kubectl get pods -o wide No resources found in default namespace. [root@master231 ~]# - kubectl工具实现自动补全功能 1.添加环境变量 [root@master231 ~]# kubectl completion bash > ~/.kube/completion.bash.inc [root@master231 ~]# [root@master231 ~]# echo source '$HOME/.kube/completion.bash.inc' >> ~/.bashrc [root@master231 ~]# [root@master231 ~]# source ~/.bashrc [root@master231 ~]# 2.验证自动补全功能 [root@master231 ~]# kubectl # 连续按2次tab键测试能否出现命令 alpha auth cordon diff get patch run version annotate autoscale cp drain help plugin scale wait api-resources certificate create edit kustomize port-forward set api-versions cluster-info debug exec label proxy taint apply completion delete explain logs replace top attach config describe expose options rollout uncordon [root@master231 ~]#
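补充:出现上面 flannel 网段不一致的报错时,可以先核对各节点分配到的 PodCIDR 与 kube-flannel 配置里的 Network 是否都在初始化时 --pod-network-cidr 指定的网段内:

# 查看每个节点被分配的PodCIDR(本环境应落在10.100.0.0/16内)
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
# 查看flannel实际使用的网段
kubectl -n kube-flannel get cm kube-flannel-cfg -o yaml | grep Network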
8、harbor基于自建证书https实战
bash
1.环境准备 harbor250 10.0.0.250 硬件配置: 1C,2G+,100GB+ 2.安装docker和docker-compose [root@harbor250 ~]# wget http://192.168.21.253/Resources/Docker/scripts/weixiang-autoinstall-docker-docker-compose.tar.gz [root@harbor250 ~]# tar xf weixiang-autoinstall-docker-docker-compose.tar.gz [root@harbor250 ~]# ./install-docker.sh i 3.安装并解压harbor安装包 3.1 下载harbor软件包 [root@harbor250 ~]# wget http://192.168.21.253/Resources/Docker/softwares/harbor-offline-installer-v2.13.1.tgz 3.2 解压harbor安装包 [root@harbor250 ~]# tar xf harbor-offline-installer-v2.13.1.tgz -C /usr/local/ 4.配置CA证书 4.1 进入到harbor程序的根目录 [root@harbor250 ~]# cd /usr/local/harbor/ [root@harbor250 harbor]# [root@harbor250 harbor]# ll total 636508 drwxr-xr-x 2 root root 4096 May 22 09:05 ./ drwxr-xr-x 11 root root 4096 May 22 09:05 ../ -rw-r--r-- 1 root root 3646 Jan 16 22:10 common.sh -rw-r--r-- 1 root root 651727378 Jan 16 22:11 harbor.v2.12.2.tar.gz -rw-r--r-- 1 root root 14288 Jan 16 22:10 harbor.yml.tmpl -rwxr-xr-x 1 root root 1975 Jan 16 22:10 install.sh* -rw-r--r-- 1 root root 11347 Jan 16 22:10 LICENSE -rwxr-xr-x 1 root root 2211 Jan 16 22:10 prepare* [root@harbor250 harbor]# 4.2 创建证书存放目录 [root@harbor250 harbor]# apt -y install tree [root@harbor250 harbor]# mkdir -pv certs/{ca,harbor-server,docker-client} mkdir: created directory 'certs' mkdir: created directory 'certs/ca' mkdir: created directory 'certs/harbor-server' mkdir: created directory 'certs/docker-client' [root@harbor250 harbor]# [root@harbor250 harbor]# tree certs/ certs/ ├── ca ├── docker-client └── harbor-server 3 directories, 0 files [root@harbor250 harbor]# 4.3 创建CA的私钥 [root@harbor250 harbor]# cd certs/ [root@harbor250 certs]# [root@harbor250 certs]# openssl genrsa -out ca/ca.key 4096 [root@harbor250 certs]# [root@harbor250 certs]# tree . ├── ca │   └── ca.key ├── docker-client └── harbor-server 3 directories, 1 file [root@harbor250 certs]# 4.4 基于自建的CA私钥创建CA证书(注意,证书签发的域名范围) [root@harbor250 certs]# openssl req -x509 -new -nodes -sha512 -days 3650 \ -subj "/C=CN/ST=Beijing/L=Beijing/O=example/OU=Personal/CN=weixiang.com" \ -key ca/ca.key \ -out ca/ca.crt [root@harbor250 certs]# tree . ├── ca │   ├── ca.crt │   └── ca.key ├── docker-client └── harbor-server 3 directories, 2 files [root@harbor250 certs]# 4.5 查看自建证书信息 [root@harbor250 certs]# openssl x509 -in ca/ca.crt -noout -text Certificate: Data: Version: 3 (0x2) Serial Number: 02:ea:3b:33:6a:55:85:d9:0e:76:7f:cd:6c:67:1e:57:bf:0e:7f:f4 Signature Algorithm: sha512WithRSAEncryption Issuer: C = CN, ST = Beijing, L = Beijing, O = example, OU = Personal, CN = weixiang.com Validity Not Before: May 22 01:07:50 2025 GMT Not After : May 20 01:07:50 2035 GMT Subject: C = CN, ST = Beijing, L = Beijing, O = example, OU = Personal, CN = weixiang.com ...
9、配置harbor服务端证书
bash
5.1 生成harbor服务器的私钥 [root@harbor250 certs]# openssl genrsa -out harbor-server/harbor250.weixiang.com.key 4096 [root@harbor250 certs]# [root@harbor250 certs]# tree . ├── ca │ ├── ca.crt │ └── ca.key ├── docker-client └── harbor-server └── harbor250.weixiang.com.key 3 directories, 3 files [root@harbor250 certs]# 5.2 harbor服务器基于私钥签发证书认证请求(csr文件),让自建CA认证 [root@harbor250 certs]# openssl req -sha512 -new \ -subj "/C=CN/ST=Beijing/L=Beijing/O=example/OU=Personal/CN=harbor250.weixiang.com" \ -key harbor-server/harbor250.weixiang.com.key \ -out harbor-server/harbor250.weixiang.com.csr [root@harbor250 certs]# tree . ├── ca │ ├── ca.crt │ └── ca.key ├── docker-client └── harbor-server ├── harbor250.weixiang.com.csr └── harbor250.weixiang.com.key 3 directories, 4 files [root@harbor250 certs]# 5.3 生成 x509 v3 的扩展文件用于认证 [root@harbor250 certs]# cat > harbor-server/v3.ext <<-EOF authorityKeyIdentifier=keyid,issuer basicConstraints=CA:FALSE keyUsage = digitalSignature, nonRepudiation, keyEncipherment, dataEncipherment extendedKeyUsage = serverAuth subjectAltName = @alt_names [alt_names] DNS.1=harbor250.weixiang.com EOF [root@harbor250 certs]# tree . ├── ca │ ├── ca.crt │ └── ca.key ├── docker-client └── harbor-server ├── harbor250.weixiang.com.csr ├── harbor250.weixiang.com.key └── v3.ext 3 directories, 5 files [root@harbor250 certs]# 5.4 基于 x509 v3 的扩展文件认证签发harbor server证书 [root@harbor250 certs]# openssl x509 -req -sha512 -days 3650 \ -extfile harbor-server/v3.ext \ -CA ca/ca.crt -CAkey ca/ca.key -CAcreateserial \ -in harbor-server/harbor250.weixiang.com.csr \ -out harbor-server/harbor250.weixiang.com.crt [root@harbor250 certs]# tree . ├── ca │ ├── ca.crt │ └── ca.key ├── docker-client └── harbor-server ├── harbor250.weixiang.com.crt ├── harbor250.weixiang.com.csr ├── harbor250.weixiang.com.key └── v3.ext 3 directories, 6 files [root@harbor250 certs]# 5.5 修改harbor的配置文件使用自建证书 [root@harbor250 certs]# cp ../harbor.yml{.tmpl,} [root@harbor250 certs]# [root@harbor250 certs]# vim ../harbor.yml ... hostname: harbor250.weixiang.com https: ... certificate: /usr/local/harbor/certs/harbor-server/harbor250.weixiang.com.crt private_key: /usr/local/harbor/certs/harbor-server/harbor250.weixiang.com.key ... harbor_admin_password: 1 ... data_volume: /var/lib/harbor ... 5.6 安装harbor服务 [root@harbor250 certs]# ../install.sh [Step 0]: checking if docker is installed ... Note: docker version: 20.10.24 [Step 1]: checking docker-compose is installed ... Note: docker-compose version: 2.23.0 [Step 2]: loading Harbor images ... ... [Step 5]: starting Harbor ... [+] Building 0.0s (0/0) docker:default [+] Running 10/10 ✔ Network harbor_harbor Created 0.0s ✔ Container harbor-log Started 0.0s ✔ Container harbor-db Started 0.0s ✔ Container registryctl Started 0.1s ✔ Container harbor-portal Started 0.1s ✔ Container redis Started 0.1s ✔ Container registry Started 0.1s ✔ Container harbor-core Started 0.0s ✔ Container nginx Started 0.0s ✔ Container harbor-jobservice Started 0.0s ✔ ----Harbor has been installed and started successfully.---- [root@harbor250 certs]# [root@harbor250 certs]# ss -ntl | grep 80 LISTEN 0 4096 0.0.0.0:80 0.0.0.0:* LISTEN 0 4096 [::]:80 [::]:* [root@harbor250 certs]# 6. 访问harbor的WebUI 6.1 在windows添加hosts文件解析如下: 10.0.0.250 harbor250.weixiang.com 6.2 访问测试: https://harbor250.weixiang.com/harbor/projects/1/repositories
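补充:如果浏览器访问时提示证书异常,可以先在 harbor250 上用 openssl 确认服务端证书确实由自建 CA 签发,并且 SAN 里包含访问用的域名:

cd /usr/local/harbor/certs
# 验证证书链: 输出OK说明证书由ca.crt签发
openssl verify -CAfile ca/ca.crt harbor-server/harbor250.weixiang.com.crt
# 查看证书有效期和SAN(应包含harbor250.weixiang.com)
openssl x509 -in harbor-server/harbor250.weixiang.com.crt -noout -dates
openssl x509 -in harbor-server/harbor250.weixiang.com.crt -noout -text | grep -A1 "Subject Alternative Name"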
10、K8S节点配置docker客户端证书实战
bash
1.生成docker客户端证书 [root@harbor250 certs]# cp ca/ca.crt harbor-server/harbor250.weixiang.com.key docker-client/ [root@harbor250 certs]# [root@harbor250 certs]# cp harbor-server/harbor250.weixiang.com.crt docker-client/harbor250.weixiang.com.cert [root@harbor250 certs]# [root@harbor250 certs]# tree . ├── ca │ ├── ca.crt │ └── ca.key ├── docker-client │ ├── ca.crt │ ├── harbor250.weixiang.com.cert │ └── harbor250.weixiang.com.key └── harbor-server ├── harbor250.weixiang.com.crt ├── harbor250.weixiang.com.csr ├── harbor250.weixiang.com.key └── v3.ext 3 directories, 9 files [root@harbor250 certs]# 2.k8s所有节点添加hosts文件解析 echo 8.148.236.36 harbor250.weixiang.com >> /etc/hosts 3.K8S所有节点拷贝证书文件 mkdir -pv /etc/docker/certs.d/harbor250.weixiang.com/ scp 8.148.236.36:/usr/local/harbor/certs/docker-client/* /etc/docker/certs.d/harbor250.weixiang.com/ 4.测试登录 [root@master231 ~]# docker login -u admin -p 1 harbor250.weixiang.com WARNING! Using --password via the CLI is insecure. Use --password-stdin. WARNING! Your password will be stored unencrypted in /root/.docker/config.json. Configure a credential helper to remove this warning. See https://docs.docker.com/engine/reference/commandline/login/#credentials-store Login Succeeded [root@master231 ~]# [root@worker232 ~]# docker login -u admin -p 1 harbor250.weixiang.com WARNING! Using --password via the CLI is insecure. Use --password-stdin. WARNING! Your password will be stored unencrypted in /root/.docker/config.json. Configure a credential helper to remove this warning. See https://docs.docker.com/engine/reference/commandline/login/#credentials-store Login Succeeded [root@worker232 ~]# [root@worker233 ~]# docker login -u admin -p 1 harbor250.weixiang.com WARNING! Using --password via the CLI is insecure. Use --password-stdin. WARNING! Your password will be stored unencrypted in /root/.docker/config.json. Configure a credential helper to remove this warning. See https://docs.docker.com/engine/reference/commandline/login/#credentials-store Login Succeeded [root@worker233 ~]# - 将镜像推送到harbor仓库 1.harbor新建项目 新建一个名为"weixiang-xiuxian"的公开项目。 2.给镜像打标签 [root@worker232 ~]# docker tag registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 3.推送镜像到harbor仓库 [root@worker232 ~]# docker push harbor250.weixiang.com/weixiang-xiuxian/apps:v1 The push refers to repository [harbor250.weixiang.com/weixiang-xiuxian/apps] 8e2be8913e57: Pushed 9d5b000ce7c7: Pushed b8dbe22b95f7: Pushed c39c1c35e3e8: Pushed 5f66747c8a72: Pushed 15d7cdc64789: Pushed 7fcb75871b21: Pushed v1: digest: sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c size: 1778 [root@worker232 ~]# 4.harbor的WebUI验证 https://harbor250.weixiang.com/harbor/projects/2/repositories

(图:harbor WebUI 中 weixiang-xiuxian/apps 仓库已出现推送成功的 v1 镜像)

bash
5.拉取镜像测试 [root@worker233 ~]# docker pull harbor250.weixiang.com/weixiang-xiuxian/apps:v1 v1: Pulling from weixiang-xiuxian/apps 5758d4e389a3: Already exists 51d66f629021: Already exists ff9c6add3f30: Already exists dcc43d9a97b4: Already exists 5dcfac0f2f9c: Already exists 2c6e86e57dfd: Already exists 2dd61e30a21a: Pull complete Digest: sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c Status: Downloaded newer image for harbor250.weixiang.com/weixiang-xiuxian/apps:v1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@worker233 ~]# [root@worker233 ~]# docker image ls harbor250.weixiang.com/weixiang-xiuxian/apps:v1 REPOSITORY TAG IMAGE ID CREATED SIZE harbor250.weixiang.com/weixiang-xiuxian/apps v1 f28fd43be4ad 17 months ago 23MB [root@worker233 ~]# 可能会出现的错误1: E0709 03:27:12.515335 1 main.go:359] Error registering network: failed to acquire lease: subnet "10.244.0.0/16" specified in the flannel net config doesn't contain "10.100.0.0/24" PodCIDR of the "master231" node 错误原因': pod网段和Flannel网段不一致。 可能会出现的错误2: [root@master231 ~]# docker login -u admin -p 1 harbor250.weixiang.com WARNING! Using --password via the CLI is insecure. Use --password-stdin. Error response from daemon: Get "https://harbor250.weixiang.com/v2/": dial tcp: lookup harbor250.weixiang.com on 127.0.0.53:53: no such host [root@master231 ~]# 错误原因: 没有配置hosts文件解析。 可能会出现的错误3: [root@master231 ~]# docker login -u admin -p 1 harbor250.weixiang.com WARNING! Using --password via the CLI is insecure. Use --password-stdin. Error response from daemon: Get "https://harbor250.weixiang.com/v2/": x509: certificate signed by unknown authority [root@master231 ~]# 可能会出现的错误4: 如果出现密码登录一直错误的问题,需要删除之前的数据卷 cd /usr/local/harbor docker-compose down -v # 删除所有数据 rm -rf /var/lib/harbor # 删除数据卷 vim harbor.yml # 修改密码 ./install.sh # 重新安装 其他错误总结: https://www.cnblogs.com/yinzhengjie/p/18645161
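补充:错误3(x509: certificate signed by unknown authority)一般是 K8S 节点上 /etc/docker/certs.d 下没有放置证书,或者目录名与访问域名不一致,可以在 harbor250 上用一个小循环重新分发(节点IP按自己环境调整,前提是已打通ssh免密或能输入密码):

for host in 10.0.0.231 10.0.0.232 10.0.0.233; do
  ssh root@${host} "mkdir -p /etc/docker/certs.d/harbor250.weixiang.com"
  scp /usr/local/harbor/certs/docker-client/* root@${host}:/etc/docker/certs.d/harbor250.weixiang.com/
done
# 分发后可在任意节点确认证书链(输出中 verify return code 应为 0)
# openssl s_client -connect harbor250.weixiang.com:443 -CAfile /etc/docker/certs.d/harbor250.weixiang.com/ca.crt < /dev/null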
11、k8s部署服务的三种方式
bash
- 1.响应式创建资源 直接基于命令行操作,就可以完成资源的管理。 - 2.声明式创建资源 需要先写yaml资源清单,应用资源清单后才能生效。 - 3.API管理资源【运维开发,K8S二次开发工程师】 需要编程语言导入client-go库,基于官方提供的API操作K8S集群。 响应式创建资源 1.创建pod [root@master231 ~]# kubectl run xiuxian --image=harbor250.weixiang.com/weixiang-xiuxian/apps:v1 --env=SCHOOL=weixiang --env=CLASS=weixiang98 --port=80 pod/xiuxian created # kubectl run:创建并运行一个 Kubernetes Pod名称为xiuxian # --image=harbor250.weixiang.com/weixiang-xiuxian/apps:v1 指定容器镜像地址: # harbor250.weixiang.com:私有镜像仓库(Harbor) 2.查看pod列表 [root@master231 ~]# kubectl get pods NAME READY STATUS RESTARTS AGE xiuxian 1/1 Running 0 37s [root@master231 ~]# [root@master231 ~]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian 1/1 Running 0 6s 10.100.2.4 worker233 <none> <none> # -o wide 输出详细格式(wide 格式) 3.连接容器 [root@master231 ~]# kubectl exec -it xiuxian -- sh / # ifconfig # -- 分隔符(表示 kubectl 参数结束) eth0 Link encap:Ethernet HWaddr 1E:48:24:4E:68:EB inet addr:10.100.2.4 Bcast:10.100.2.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1 RX packets:15 errors:0 dropped:0 overruns:0 frame:0 TX packets:1 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2160 (2.1 KiB) TX bytes:42 (42.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # / # env | egrep "CLASS|SCHOOL" CLASS=weixiang98 SCHOOL=weixiang / # / # netstat -untalp Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 1/nginx: master pro tcp 0 0 :::80 :::* LISTEN 1/nginx: master pro / # 4.访问测试 [root@master231 ~]# curl 10.100.2.4 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> # 为什么每个节点都能curl通 因为各个节点上安装了cni插件,并且指定网段为--pod-network-cidr=10.100.0.0/16,所以当容器被创建后因为ip在这个网段 ,所以每个节点都可以访问到这个网段的pods 5.删除资源 [root@master231 ~]# kubectl delete pod xiuxian pod "xiuxian" deleted [root@master231 ~]# [root@master231 ~]# kubectl get pods -o wide No resources found in default namespace. [root@master231 ~]# - 资源清单的组成 apiVersion: 生命API的版本信息。 举例: - myapp/alpha - myapp/betav1 - myapp/betav2 - myapp/v1 kind: k8s的资源类型。 linux一切皆文件,而k8s一切皆资源。比如我们见到过的:cs(ComponentStatus),no(Node),po(Pod),... metadata: 【关注点】 资源的元数据信息,用来声明其所在的名称空间,资源的名称,资源注解,标签等信息。 spec: 【关注点】 期望资源的运行状态。比如我们想要让Pod调度到哪个节点,环境变量传递,资源限制,运行的镜像。 status: 实际运行的状态,由K8S组件自行维护。 参考命令: [root@master231 ~]# kubectl api-resources 1. 核心工作负载资源 资源名称 缩写 APIVERSION 作用说明 pods po v1 容器运行的基本单元 deployments deploy apps/v1 管理 Pod 副本的无状态应用 daemonsets ds apps/v1 确保每个节点运行 Pod 副本 statefulsets sts apps/v1 有状态应用的 Pod 管理 cronjobs cj batch/v1 定时任务 jobs - batch/v1 一次性任务 2. 网络与访问控制 资源名称 缩写 APIVERSION 作用说明 services svc v1 暴露应用的服务 ingresses ing networking.k8s.io/v1 HTTP/HTTPS 流量路由 networkpolicies netpol networking.k8s.io/v1 网络访问控制策略 endpoints ep v1 服务后端 Pod 的 IP 地址列表 3. 配置与存储 资源名称 缩写 APIVERSION 作用说明 configmaps cm v1 存储非敏感配置数据 secrets - v1 存储敏感数据(密码/密钥) persistentvolumeclaims pvc v1 存储资源请求 persistentvolumes pv v1 集群存储资源 storageclasses sc storage.k8s.io/v1 存储类型定义 4. 
集群管理 资源名称 缩写 APIVERSION 作用说明 nodes no v1 集群节点信息 namespaces ns v1 资源隔离的逻辑分区 serviceaccounts sa v1 Pod 身份认证 resourcequotas quota v1 命名空间资源配额 - 声明式创建资源 1.创建工作目录 [root@master231 ~]# mkdir -pv /weixiang/manifests/pods mkdir: created directory '/weixiang/manifests/pods' [root@master231 ~]# [root@master231 ~]# cd /weixiang/manifests/pods [root@master231 pods]# 2.编写资源清单 [root@master231 pods]# cat 01-pods-xiuxian.yaml # 指定API的版本号 apiVersion: v1 # 指定资源的类型 kind: Pod # 定义元数据信息 metadata: # 指定资源的名称 name: xiuxian # 给资源打标签 labels: apps: xiuxian school: weixiang class: weixiang98 # 定义期望资源状态 spec: # 调度到指定节点的名称 kubectl get pods -o wide可以看到有哪个节点 nodeName: worker233 # 定义Pod运行的容器 containers: # 指定容器的名称 - name: c1 # 指定容器的镜像 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 pods]# 3.创建资源 [root@master231 pods]# kubectl create -f 01-pods-xiuxian.yaml pod/xiuxian created [root@master231 pods]# [root@master231 pods]# kubectl create -f 01-pods-xiuxian.yaml # 已经存在的资源不能重复创建!! Error from server (AlreadyExists): error when creating "01-pods-xiuxian.yaml": pods "xiuxian" already exists [root@master231 pods]# 4.查看资源 [root@master231 pods]# kubectl get -f 01-pods-xiuxian.yaml -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian 1/1 Running 0 54s 10.100.2.5 worker233 <none> <none> [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian 1/1 Running 0 56s 10.100.2.5 worker233 <none> <none> [root@master231 pods]# 5.查看资源的标签 [root@master231 pods]# kubectl get pods -o wide --show-labels NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS xiuxian 1/1 Running 0 96s 10.100.2.5 worker233 <none> <none> apps=xiuxian,class=weixiang98,school=weixiang # -o:--output 的缩写 # wide:输出格式会显示额外列信息(如 Pod IP、节点名等) # --show-labels:在输出中添加一列显示资源的所有标签(labels) [root@master231 pods]# kubectl get -f 01-pods-xiuxian.yaml -o wide --show-labels NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS xiuxian 1/1 Running 0 102s 10.100.2.5 worker233 <none> <none> apps=xiuxian,class=weixiang98,school=weixiang [root@master231 pods]# 6.删除资源 [root@master231 pods]# kubectl delete pod -l apps=xiuxian pod "xiuxian" deleted [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide --show-labels No resources found in default namespace. [root@master231 pods]# [root@master231 pods]#
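补充:响应式和声明式可以结合起来用,先用 --dry-run=client -o yaml 让 kubectl 帮我们生成清单骨架,再按需修改后声明式管理(文件名为示例):

kubectl run xiuxian --image=harbor250.weixiang.com/weixiang-xiuxian/apps:v1 --port=80 \
  --dry-run=client -o yaml > 11-pods-xiuxian-dryrun.yaml
# 查看生成的骨架, 按需补充labels、nodeName等字段后再apply
cat 11-pods-xiuxian-dryrun.yaml
kubectl apply -f 11-pods-xiuxian-dryrun.yaml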

2、pod

1、pod的容器类型
bash
1.什么是Pod 所谓的Pod是K8S集群调度的最小单元,所谓的最小单元就是不可拆分。 Pod是一组容器的集合,其包含三种容器类型: (启动顺序依次是:基础架构容器,初始化容器,业务容器,对于用户而言,只需要额外关注后两者的容器类型即可。 ) - 基础架构容器(registry.aliyuncs.com/google_containers/pause:3.6) 为Pod提供基础linux名称空间(ipc,net,time,user)共享。 基础架构容器无需运维人员部署,而是有kubelet组件自行维护。 - 初始化容器 可选的容器类型,一般情况下,为业务容器做初始化工作。可以定义多个初始化容器。 - 业务容器 用户的实际业务。可以定义多个业务容器。 2.验证基础架构和业务容器 2.1 编写资源清单修改容器的启动命令 [root@master231 pods]# cat > 02-pods-xiuxian-command.yaml << EOF apiVersion: v1 kind: Pod metadata: name: xiuxian-command labels: apps: xiuxian school: weixiang class: weixiang98 spec: nodeName: worker233 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 # 修改容器的启动命令,相当于替换了Dockerfile的ENTRYPOINT指令 command: ["sleep","10d"] EOF [root@master231 pods]# [root@master231 pods]# kubectl apply -f 02-pods-xiuxian-command.yaml pod/xiuxian-command created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-command 1/1 Running 0 4s 10.100.2.7 worker233 <none> <none> [root@master231 pods]# [root@master231 pods]# curl 10.100.2.7 curl: (7) Failed to connect to 10.100.2.7 port 80 after 0 ms: Connection refused [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-command -- sh / # ps -ef PID USER TIME COMMAND 1 root 0:00 sleep 10d 8 root 0:00 sh 15 root 0:00 ps -ef / # 2.2 验证pod底层的容器 [root@worker233 ~]# docker ps -a | grep xiuxian-command cfc0d156b4ba f28fd43be4ad "sleep 10d" 3 minutes ago Up 3 minutes k8s_c1_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_0 5cde11c1c532 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 3 minutes ago Up 3 minutes k8s_POD_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_0 [root@worker233 ~]# [root@worker233 ~]# docker inspect -f {{.State.Pid}} cfc0d156b4ba 25242 [root@worker233 ~]# [root@worker233 ~]# docker inspect -f {{.State.Pid}} 5cde11c1c532 25126 [root@worker233 ~]# [root@worker233 ~]# [root@worker233 ~]# docker inspect -f {{.State.Pid}} k8s_c1_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_0 25242 [root@worker233 ~]# [root@worker233 ~]# docker inspect -f {{.State.Pid}} k8s_POD_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_0 25126 [root@worker233 ~]# [root@worker233 ~]# ll /proc/25242/ns total 0 dr-x--x--x 2 root root 0 Jul 10 09:24 ./ dr-xr-xr-x 9 root root 0 Jul 10 09:24 ../ lrwxrwxrwx 1 root root 0 Jul 10 09:24 cgroup -> 'cgroup:[4026532758]' lrwxrwxrwx 1 root root 0 Jul 10 09:24 ipc -> 'ipc:[4026532660]' lrwxrwxrwx 1 root root 0 Jul 10 09:24 mnt -> 'mnt:[4026532755]' lrwxrwxrwx 1 root root 0 Jul 10 09:24 net -> 'net:[4026532662]' lrwxrwxrwx 1 root root 0 Jul 10 09:24 pid -> 'pid:[4026532757]' lrwxrwxrwx 1 root root 0 Jul 10 09:28 pid_for_children -> 'pid:[4026532757]' lrwxrwxrwx 1 root root 0 Jul 10 09:28 time -> 'time:[4026531834]' lrwxrwxrwx 1 root root 0 Jul 10 09:28 time_for_children -> 'time:[4026531834]' lrwxrwxrwx 1 root root 0 Jul 10 09:28 user -> 'user:[4026531837]' lrwxrwxrwx 1 root root 0 Jul 10 09:24 uts -> 'uts:[4026532756]' [root@worker233 ~]# [root@worker233 ~]# ll /proc/25126/ns total 0 dr-x--x--x 2 65535 65535 0 Jul 10 09:24 ./ dr-xr-xr-x 9 65535 65535 0 Jul 10 09:24 ../ lrwxrwxrwx 1 65535 65535 0 Jul 10 09:29 cgroup -> 'cgroup:[4026532751]' lrwxrwxrwx 1 65535 65535 0 Jul 10 09:24 ipc -> 'ipc:[4026532660]' lrwxrwxrwx 1 65535 65535 0 Jul 10 09:29 mnt -> 'mnt:[4026532658]' lrwxrwxrwx 1 65535 65535 0 Jul 10 09:24 net -> 'net:[4026532662]' lrwxrwxrwx 1 65535 65535 0 Jul 10 09:29 pid -> 
'pid:[4026532661]' lrwxrwxrwx 1 65535 65535 0 Jul 10 09:29 pid_for_children -> 'pid:[4026532661]' lrwxrwxrwx 1 65535 65535 0 Jul 10 09:29 time -> 'time:[4026531834]' lrwxrwxrwx 1 65535 65535 0 Jul 10 09:29 time_for_children -> 'time:[4026531834]' lrwxrwxrwx 1 65535 65535 0 Jul 10 09:29 user -> 'user:[4026531837]' lrwxrwxrwx 1 65535 65535 0 Jul 10 09:29 uts -> 'uts:[4026532659]'
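补充:既然同一个Pod里的容器共享基础架构容器的net名称空间,那么容器之间直接走127.0.0.1就能互访。下面是一个验证这一点的参考清单(文件名、Pod名、容器名均为示例):

cat > 02-1-pods-xiuxian-share-net.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: xiuxian-share-net
spec:
  nodeName: worker233
  containers:
  # c1 正常运行nginx, 监听80端口
  - name: c1
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1
  # c2 只睡眠, 用于进入后验证
  - name: c2
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1
    command: ["sleep","10d"]
EOF
kubectl apply -f 02-1-pods-xiuxian-share-net.yaml
# 进入c2, 能看到c1监听的80端口; 镜像基于alpine/busybox, 自带wget, 通过127.0.0.1可直接访问c1的nginx
kubectl exec -it xiuxian-share-net -c c2 -- netstat -ntl
kubectl exec -it xiuxian-share-net -c c2 -- wget -qO- 127.0.0.1
kubectl delete -f 02-1-pods-xiuxian-share-net.yaml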
2、删除容器对Pod的IP地址变化
bash
3.删除业务容器并不会导致Pod的IP地址变化 3.1 删除业务容器 [root@worker233 ~]# docker ps -a | grep xiuxian-command cfc0d156b4ba f28fd43be4ad "sleep 10d" 6 minutes ago Up 6 minutes k8s_c1_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_0 5cde11c1c532 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 6 minutes ago Up 6 minutes k8s_POD_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_0 [root@worker233 ~]# [root@worker233 ~]# docker rm -fv cfc0d156b4ba cfc0d156b4ba [root@worker233 ~]# [root@worker233 ~]# docker ps -a | grep xiuxian-command 26e023ae8f5f f28fd43be4ad "sleep 10d" 2 seconds ago Up 2 seconds k8s_c1_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_1 5cde11c1c532 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 6 minutes ago Up 6 minutes k8s_POD_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_0 [root@worker233 ~]# 3.2 删除容器后,查看Pod会有重启【重新创建新的容器】的次数 [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-command 1/1 Running 1 7m9s 10.100.2.7 worker233 <none> <none> [root@master231 pods]# 4.删除基础架构容器则IP地址会发生变化 4.1 删除基础架构容器 [root@worker233 ~]# docker ps -a | grep xiuxian-command 26e023ae8f5f f28fd43be4ad "sleep 10d" 2 minutes ago Up 2 minutes k8s_c1_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_1 5cde11c1c532 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 9 minutes ago Up 9 minutes k8s_POD_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_0 [root@worker233 ~]# [root@worker233 ~]# docker rm -fv 5cde11c1c532 5cde11c1c532 [root@worker233 ~]# [root@worker233 ~]# docker ps -a | grep xiuxian-command a0f0334d5349 f28fd43be4ad "sleep 10d" 10 seconds ago Up 10 seconds k8s_c1_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_2 fe2f8af61ab8 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 10 seconds ago Up 10 seconds k8s_POD_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_0 26e023ae8f5f f28fd43be4ad "sleep 10d" 3 minutes ago Exited (137) 10 seconds ago k8s_c1_xiuxian-command_default_81089a63-e5ba-4a5a-aeb4-edf243016d45_1 [root@worker233 ~]# 4.2 删除基础架构容器后,发现业务容器也会随之重启,会回去新的ip地址 [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-command 1/1 Running 2 (6s ago) 9m55s 10.100.2.8 worker233 <none> <none> [root@master231 pods]# 5.初始化容器,业务容器和基础架构容器启动顺序验证 5.1 编写资源清单 [root@master231 pods]# cat 03-pods-xiuxian-initContainers.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-initcontainers labels: apps: xiuxian school: weixiang class: weixiang98 spec: nodeName: worker233 # 定义初始化容器 initContainers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 command: ["sleep","30s"] - name: c2 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 command: ["sleep","50s"] # 定义业务容器 containers: - name: c3 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 command: ["sleep","10d"] [root@master231 pods]# 5.2 创建资源 [root@master231 pods]# kubectl apply -f 03-pods-xiuxian-initContainers.yaml pod/xiuxian-initcontainers created [root@master231 pods]# [root@master231 pods]# kubectl get -f 03-pods-xiuxian-initContainers.yaml NAME READY STATUS RESTARTS AGE xiuxian-initcontainers 0/1 Init:0/2 0 8s [root@master231 pods]# 5.3 查看对应的pod产生的容器 [root@worker233 ~]# docker ps -a | grep xiuxian-initcontainers f9a9175fc48f f28fd43be4ad "sleep 30s" 24 seconds ago Up 24 seconds k8s_c1_xiuxian-initcontainers_default_9c9afafe-fbd7-49d8-9cd6-9b3a7e3739d9_0 3ed2b4be086a 
registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 24 seconds ago Up 24 seconds k8s_POD_xiuxian-initcontainers_default_9c9afafe-fbd7-49d8-9cd6-9b3a7e3739d9_0 [root@worker233 ~]# 5.4 再次查看资源 [root@master231 pods]# kubectl get -f 03-pods-xiuxian-initContainers.yaml # 可以观察到初始化容器已经完成1个,还有一个未完成。总公有2个初始化容器。 NAME READY STATUS RESTARTS AGE xiuxian-initcontainers 0/1 Init:1/2 0 45s [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-initcontainers -- sh # 很明显,默认链接的是业务容器。但此时业务容器未启动! Defaulted container "c3" out of: c3, c1 (init), c2 (init) error: unable to upgrade connection: container not found ("c3") [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-initcontainers -c c1 -- sh # 因此c1容器已经执行完毕,此时退出了 error: unable to upgrade connection: container not found ("c1") [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-initcontainers -c c2 -- sh # c2容器正在运行中 / # / # ifconfig eth0 Link encap:Ethernet HWaddr E6:6A:C3:53:E9:9A inet addr:10.100.2.9 Bcast:10.100.2.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1 RX packets:13 errors:0 dropped:0 overruns:0 frame:0 TX packets:1 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:1900 (1.8 KiB) TX bytes:42 (42.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # / # [root@master231 pods]# 5.5 再次查看资源,所有的初始化容器都执行完毕 [root@master231 pods]# kubectl get -f 03-pods-xiuxian-initContainers.yaml -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-initcontainers 1/1 Running 0 83s 10.100.2.9 worker233 <none> <none> [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-initcontainers -c c2 -- sh # 无法进入初始化容器,因为所有的 初始化容器都已经执行完毕。 error: unable to upgrade connection: container not found ("c2") [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-initcontainers -- sh # 很明显,此时链接默认的业务容器是可以进入的。因为c1、c2都执行完毕了 Defaulted container "c3" out of: c3, c1 (init), c2 (init) / # ifconfig eth0 Link encap:Ethernet HWaddr E6:6A:C3:53:E9:9A inet addr:10.100.2.9 Bcast:10.100.2.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1 RX packets:13 errors:0 dropped:0 overruns:0 frame:0 TX packets:1 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:1900 (1.8 KiB) TX bytes:42 (42.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # / # [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-initcontainers -c c3 -- sh # 当然,也可以使用-c选项链接业务容器。 / # / # ps -ef PID USER TIME COMMAND 1 root 0:00 sleep 10d 14 root 0:00 sh 21 root 0:00 ps -ef / # / # ifconfig eth0 Link encap:Ethernet HWaddr E6:6A:C3:53:E9:9A inet addr:10.100.2.9 Bcast:10.100.2.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1 RX packets:13 errors:0 dropped:0 overruns:0 frame:0 TX packets:1 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:1900 (1.8 KiB) TX bytes:42 (42.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 
frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / #
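补充:初始化容器阶段如果一直卡在 Init:x/y,可以用 -c 指定容器名查看对应初始化容器的状态和日志来排错:

# 查看各初始化容器的状态(Waiting/Running/Terminated)
kubectl describe pod xiuxian-initcontainers | grep -A 10 "Init Containers"
# 查看指定初始化容器的日志(本例的sleep没有输出, 真实业务中常用来定位初始化卡住的原因)
kubectl logs xiuxian-initcontainers -c c1
kubectl logs xiuxian-initcontainers -c c2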
3、部署MySQL服务到K8S集群
bash
1.新建harbor项目 建议项目名称为"weixiang-db" 2.导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Docker/images/WordPress/weixiang-mysql-v8.0.36-oracle.tar.gz [root@worker233 ~]# docker load -i weixiang-mysql-v8.0.36-oracle.tar.gz 3.推送镜像到harbor仓库 [root@worker233 ~]# docker tag mysql:8.0.36-oracle harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle # mysql:8.0.36-oracle:源镜像,本地已存在的 Docker 镜像 # harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle:目标镜像:新标签格式 # harbor250.weixiang.com:私有镜像仓库地址 # weixiang-db:Harbor 仓库中的项目名称 [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle 4.编写资源清单 [root@master231 pods]# cat 04-pods-mysql-env.yaml apiVersion: v1 kind: Pod metadata: name: mysql-env-args labels: apps: db spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle # 向容器传递环境变量 env: # 指定变量的名称 - name: MYSQL_ALLOW_EMPTY_PASSWORD # 指定变量的值 value: "yes" # 指定变量的名称 - name: MYSQL_DATABASE value: "wordpress" - name: MYSQL_USER value: weixiang98 - name: MYSQL_PASSWORD value: weixiang # 向容器传递启动参数,相当于替换了Dockerfile的CMD指令。 args: - "--character-set-server=utf8" - "--collation-server=utf8_bin" - "--default-authentication-plugin=mysql_native_password" [root@master231 pods]# 5.部署服务并查看IP地址 [root@master231 pods]# kubectl apply -f 04-pods-mysql-env.yaml pod/mysql-env-args created [root@master231 pods]# [root@master231 pods]# kubectl get pods -l apps=db --show-labels NAME READY STATUS RESTARTS AGE LABELS mysql-env-args 1/1 Running 0 18s apps=db [root@master231 pods]# [root@master231 pods]# [root@master231 pods]# kubectl get pods -l apps=db --show-labels -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS mysql-env-args 1/1 Running 0 22s 10.100.2.10 worker233 <none> <none> apps=db [root@master231 pods]# 6.链接MySQL服务 [root@master231 pods]# kubectl exec -it mysql-env-args -- mysql wordpress Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 9 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> SELECT DATABASE(); +------------+ | DATABASE() | +------------+ | wordpress | +------------+ 1 row in set (0.00 sec) mysql> mysql> SHOW DATABASES; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | | wordpress | +--------------------+ 5 rows in set (0.00 sec) mysql> mysql> SELECT USER(); +----------------+ | USER() | +----------------+ | root@localhost | +----------------+ 1 row in set (0.00 sec) mysql> mysql> SELECT user,host,plugin FROM mysql.user; +------------------+-----------+-----------------------+ | user | host | plugin | +------------------+-----------+-----------------------+ | weixiang98 | % | mysql_native_password | | root | % | mysql_native_password | | mysql.infoschema | localhost | caching_sha2_password | | mysql.session | localhost | caching_sha2_password | | mysql.sys | localhost | caching_sha2_password | | root | localhost | mysql_native_password | +------------------+-----------+-----------------------+ 6 rows in set (0.00 sec) mysql>
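补充:可以顺手验证 env 和 args 传进去的配置确实生效(字符集、校对规则和默认认证插件):

# 查看传入容器的环境变量
kubectl exec mysql-env-args -- env | grep MYSQL
# 确认args指定的启动参数已生效
kubectl exec -it mysql-env-args -- mysql -e "SHOW VARIABLES LIKE 'character_set_server'; SHOW VARIABLES LIKE 'collation_server'; SHOW VARIABLES LIKE 'default_authentication_plugin';"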

3、服务的暴露方式

bash
# 1、hostNetwork
#    主机网络模式,Pod直接使用宿主机的网络命名空间,容器内看到的网卡和IP就是宿主机的,
#    容器端口直接占用宿主机端口,适合临时调试,转发效率最高
# 2、hostPort
#    将容器端口映射(DNAT)到Pod所在宿主机的指定端口,只能通过该节点的IP访问
# 3、port-forward
#    端口转发,通过kubectl客户端与apiserver建立隧道,将本地端口转发到集群内部的Pod,
#    不依赖Service或网络策略,常用于临时访问和调试
1、在k8s集群部署WordPress
bash
1.导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Docker/images/WordPress/weixiang-wordpress-v6.7.1-php8.1-apache.tar.gz [root@worker233 ~]# docker load -i weixiang-wordpress-v6.7.1-php8.1-apache.tar.gz 2.新建项目 建议项目名称为"weixiang-wp" 3.推送镜像到harbor仓库 [root@worker233 ~]# docker tag wordpress:6.7.1-php8.1-apache harbor250.weixiang.com/weixiang-wp/wordpress:6.7.1-php8.1-apache [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-wp/wordpress:6.7.1-php8.1-apache 4.查看Pod的IP地址 [root@master231 pods]# kubectl get pods --show-labels -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS mysql-env-args 1/1 Running 0 39m 10.100.2.10 worker233 <none> <none> apps=db [root@master231 pods]# 5.编写资源清单 [root@master231 pods]# cat 05-pods-wordpress-hostNetwork.yaml apiVersion: v1 kind: Pod metadata: name: wordpress-hostnetwork labels: apps: wp spec: # 使用宿主机的网络,因为要让外面访问,会使用物理机的网 hostNetwork: true containers: - name: wp image: harbor250.weixiang.com/weixiang-wp/wordpress:6.7.1-php8.1-apache env: - name: WORDPRESS_DB_HOST value: "10.100.2.10" # mysql的地址 - name: WORDPRESS_DB_NAME value: "wordpress" - name: WORDPRESS_DB_USER value: weixiang98 - name: WORDPRESS_DB_PASSWORD value: weixiang [root@master231 pods]# [root@master231 pods]# kubectl apply -f 05-pods-wordpress-hostNetwork.yaml pod/wordpress-hostnetwork created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide --show-labels NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS mysql-env-args 1/1 Running 0 41m 10.100.2.10 worker233 <none> <none> apps=db wordpress-hostnetwork 1/1 Running 0 6s 10.0.0.233 worker233 <none> <none> apps=wp 6.测试验证 http://10.0.0.233/wp-admin/ 7.查看数据库内容 [root@master231 pods]# kubectl exec -it mysql-env-args -- mysql wordpress Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 29 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> SHOW TABLES; +-----------------------+ | Tables_in_wordpress | +-----------------------+ | wp_commentmeta | | wp_comments | | wp_links | | wp_options | | wp_postmeta | | wp_posts | | wp_term_relationships | | wp_term_taxonomy | | wp_termmeta | | wp_terms | | wp_usermeta | | wp_users | +-----------------------+ 12 rows in set (0.00 sec) mysql> 8.删除所有的pod [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES mysql-env-args 1/1 Running 0 44m 10.100.2.10 worker233 <none> <none> wordpress-hostnetwork 1/1 Running 0 2m43s 10.0.0.233 worker233 <none> <none> [root@master231 pods]# [root@master231 pods]# kubectl delete pods --all pod "mysql-env-args" deleted pod "wordpress-hostnetwork" deleted [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide No resources found in default namespace. [root@master231 pods]#
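补充:上面 WORDPRESS_DB_HOST 写死的是 MySQL 的 Pod IP,下次部署时可以用 jsonpath 直接取出来再填进清单,减少手工抄写出错;Pod 重建后 IP 会变,清单也要跟着改,这也是后面需要用 Service 提供固定访问入口的原因:

# 取出MySQL Pod当前的IP, 再填到wordpress清单的WORDPRESS_DB_HOST里
kubectl get pod mysql-env-args -o jsonpath='{.status.podIP}{"\n"}'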
2、k8s部署jenkins实战案例基于hostPort暴露
bash
1.导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Kubernetes/Project/DevOps/weixiang-jenkins-v2.479.1-alpine-jdk21.tar.gz [root@worker233 ~]# docker load -i weixiang-jenkins-v2.479.1-alpine-jdk21.tar.gz 2.新建harbor项目 推荐项目名称为"weixiang-devops" 3.推送镜像到harbor仓库 [root@worker233 ~]# docker tag jenkins/jenkins:2.479.1-alpine-jdk21 harbor250.weixiang.com/weixiang-devops/jenkins:2.479.1-alpine-jdk21 [root@worker233 ~]# [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-devops/jenkins:2.479.1-alpine-jdk21 4.编写资源清单 [root@master231 pods]# cat 06-pods-jenkins-hostPort.yaml apiVersion: v1 kind: Pod metadata: name: jenkins-hostport labels: apps: jenkins spec: containers: - name: jenkins image: harbor250.weixiang.com/weixiang-devops/jenkins:2.479.1-alpine-jdk21 # 定义容器暴露的端口 ports: # jenkins的监听端口 - containerPort: 8080 # 宿主机暴露的端口 hostPort: 18080 [root@master231 pods]# [root@master231 pods]# kubectl apply -f 06-pods-jenkins-hostPort.yaml pod/jenkins-hostport created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide -l apps=jenkins NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES jenkins-hostport 1/1 Running 0 12s 10.100.2.11 worker233 <none> <none> [root@master231 pods]# 5.测试验证 [root@worker233 ~]# iptables-save | grep 18080 | grep to-destination -A CNI-DN-c9420656d7ac17e547d6b -p tcp -m tcp --dport 18080 -j DNAT --to-destination 10.100.2.11:8080 [root@worker233 ~]# 浏览器访问: 【需要查看初始密码】 http://10.0.0.233:18080/login [root@master231 pods]# kubectl exec jenkins-hostport -- cat /var/jenkins_home/secrets/initialAdminPassword e262dca42e874eb0aa442d9fdbad484a [root@master231 pods]# 6.删除资源 [root@master231 pods]# kubectl delete pod -l apps=jenkins pod "jenkins-hostport" deleted [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide No resources found in default namespace. [root@master231 pods]#
3、基于port-forward暴露sonarQube服务
bash
1.导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Kubernetes/Project/DevOps/weixiang-sonarqube-v9.9.7-community.tar.gz [root@worker233 ~]# docker load -i weixiang-sonarqube-v9.9.7-community.tar.gz 2.推送镜像到harbor仓库 [root@worker233 ~]# docker tag sonarqube:9.9.7-community harbor250.weixiang.com/weixiang-devops/sonarqube:9.9.7-community [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-devops/sonarqube:9.9.7-community 3.编写资源清单 [root@master231 pods]# cat 07-pods-sonarqube.yaml apiVersion: v1 kind: Pod metadata: name: sonarqube labels: apps: sonarqube spec: containers: - name: sonar image: harbor250.weixiang.com/weixiang-devops/sonarqube:9.9.7-community ports: # 容器内服务监听端口 - containerPort: 9000 [root@master231 pods]# [root@master231 pods]# kubectl apply -f 07-pods-sonarqube.yaml pod/sonarqube created [root@master231 pods]# [root@master231 pods]# kubectl get pods -l apps=sonarqube -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES sonarqube 1/1 Running 0 9s 10.100.2.12 worker233 <none> <none> [root@master231 pods]# 4.配置端口转发 # 将集群内部的Pod服务暴露给外部访问 [root@master231 pods]# kubectl port-forward po/sonarqube --address=0.0.0.0 9999:9000 # po/sonarqube:指定要转发的资源类型和名称(po 是 pod 的缩写) # --address=0.0.0.0 监听所有网络接口(开放给所有IP访问) # 9999:9000 端口映射(本地端口:Pod端口) 5.测试验证 http://10.0.0.231:9999/ 默认的用户名和密码均为: admin 5.删除资源 [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES sonarqube 1/1 Running 0 7m52s 10.100.2.12 worker233 <none> <none> [root@master231 pods]# [root@master231 pods]# kubectl delete pods sonarqube pod "sonarqube" deleted [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide No resources found in default namespace.
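补充:port-forward 是前台进程,Ctrl+C 即停止转发,而且流量要经过 kubectl 与 apiserver 之间的隧道,只适合临时调试;不想占着终端时可以像下面这样临时放到后台(以本节的 sonarqube 为例):

# 仅调试用: 放到后台运行, 用完记得kill掉
nohup kubectl port-forward po/sonarqube --address=0.0.0.0 9999:9000 >/tmp/pf-sonarqube.log 2>&1 &
# 查找并结束后台的转发进程
ps -ef | grep "port-forward po/sonarqube" | grep -v grep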
3、k8s部署ES单点案例
bash
1.导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Docker/images/ElasticStack/7.17.25/weixiang-elasticsearch-v7.17.25.tar.gz [root@worker233 ~]# docker load -i weixiang-elasticsearch-v7.17.25.tar.gz 2.新建harbor项目 推荐项目名称为"weixiang-elasticstack" 3.推送镜像到harbor仓库 [root@worker233 ~]# docker tag docker.elastic.co/elasticsearch/elasticsearch:7.17.25 harbor250.weixiang.com/weixiang-elasticstack/elasticsearch:7.17.25 [root@worker233 ~]# [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-elasticstack/elasticsearch:7.17.25 4.编写资源清单 [root@master231 pods]# cat 08-pods-elasticsearch.yaml apiVersion: v1 kind: Pod metadata: name: elasticsearch labels: apps: es spec: containers: - name: es image: harbor250.weixiang.com/weixiang-elasticstack/elasticsearch:7.17.25 ports: - containerPort: 9200 # 为端口起名称,注意端口名称必须唯一。 name: http - containerPort: 9300 name: tcp env: - name: discovery.type value: "single-node" - name: node.name value: "elk91" - name: cluster.name value: "weixiang-weixiang98-single" - name: ES_JAVA_OPTS value: "-Xms512m -Xmx512m" [root@master231 pods]# 5.测试验证 [root@master231 pods]# kubectl apply -f 08-pods-elasticsearch.yaml pod/elasticsearch created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide -l apps=es NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES elasticsearch 1/1 Running 0 7s 10.100.2.13 worker233 <none> <none> [root@master231 pods]# [root@master231 pods]# curl http://10.100.2.13:9200 { "name" : "elk91", "cluster_name" : "weixiang-weixiang98-single", "cluster_uuid" : "VFd4ygajTtiMMi8Lc80yjw", "version" : { "number" : "7.17.25", "build_flavor" : "default", "build_type" : "docker", "build_hash" : "f9b6b57d1d0f76e2d14291c04fb50abeb642cfbf", "build_date" : "2024-10-16T22:06:36.904732810Z", "build_snapshot" : false, "lucene_version" : "8.11.3", "minimum_wire_compatibility_version" : "6.8.0", "minimum_index_compatibility_version" : "6.0.0-beta1" }, "tagline" : "You Know, for Search" } [root@master231 pods]# [root@master231 pods]# curl http://10.100.2.13:9200/_cat/nodes 10.100.2.13 28 97 33 0.58 0.24 0.11 cdfhilmrstw * elk91 [root@master231 pods]# 6.对外暴露端口 [root@master231 pods]# kubectl port-forward po/elasticsearch --address=0.0.0.0 9200:9200 7.windows访问测试 http://10.0.0.231:9200/_cat/nodes
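补充:除了 _cat/nodes,还可以看一下集群健康状态和索引列表,单节点ES没有分配副本分片时 status 一般为 green 或 yellow:

curl http://10.100.2.13:9200/_cluster/health?pretty
curl http://10.100.2.13:9200/_cat/indices?v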
4、k8s部署kibana对接ES
bash
1.导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Docker/images/ElasticStack/7.17.25/weixiang-kibana-v7.17.25.tar.gz [root@worker233 ~]# docker load -i weixiang-kibana-v7.17.25.tar.gz 2.推送镜像到harbor仓库 [root@worker233 ~]# docker tag docker.elastic.co/kibana/kibana:7.17.25 harbor250.weixiang.com/weixiang-elasticstack/kibana:7.17.25 [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-elasticstack/kibana:7.17.25 3.编写资源清单 [root@master231 pods]# cat 09-pods-kibana.yaml apiVersion: v1 kind: Pod metadata: name: kibana labels: apps: kibana spec: containers: - name: kibana image: harbor250.weixiang.com/weixiang-elasticstack/kibana:7.17.25 ports: - containerPort: 5601 name: webui env: - name: ELASTICSEARCH_HOSTS value: http://10.100.2.13:9200 # 配置Kibana连接的Elasticsearch 地址 [root@master231 pods]# [root@master231 pods]# kubectl apply -f 09-pods-kibana.yaml pod/kibana created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES elasticsearch 1/1 Running 0 8m54s 10.100.2.13 worker233 <none> <none> kibana 1/1 Running 0 5s 10.100.2.14 worker233 <none> <none> [root@master231 pods]# 4.暴露服务 [root@master231 pods]# kubectl port-forward po/kibana --address=0.0.0.0 5601:5601 Forwarding from 0.0.0.0:5601 -> 5601 5.测试验证 http://10.0.0.231:5601/ 6.删除资源 [root@master231 pods]# kubectl get pods NAME READY STATUS RESTARTS AGE elasticsearch 1/1 Running 0 10m kibana 1/1 Running 0 110s [root@master231 pods]# [root@master231 pods]# kubectl delete pods --all pod "elasticsearch" deleted pod "kibana" deleted [root@master231 pods]# [root@master231 pods]# kubectl get pods No resources found in default namespace. [root@master231 pods]#
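需要注意:清单里的 ELASTICSEARCH_HOSTS 写死了 ES Pod 的 IP(10.100.2.13),而 Pod 重建后 IP 会变化。在尚未引入 Service 之前,可以参考下面的示意做法,在创建 kibana 前动态取一次 ES 的当前 IP(假设先把清单中的地址改成占位符 http://ES_IP:9200,sed 用法仅作演示):

bash
# 取出elasticsearch Pod当前的IP
ES_IP=$(kubectl get pod elasticsearch -o jsonpath='{.status.podIP}')
echo ${ES_IP}

# 假设09-pods-kibana.yaml中的地址已改为占位符http://ES_IP:9200,创建前动态替换为真实IP
sed "s#http://ES_IP:9200#http://${ES_IP}:9200#" 09-pods-kibana.yaml | kubectl apply -f -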
5、一个Pod运行多个容器
bash
1.编写资源清单 [root@master231 pods]# cat 10-pods-xiuxian-multiple-command.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-mutiple labels: apps: xiuxian spec: # 定义容器列表 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 ports: - containerPort: 80 - name: c2 # 注意,容器的名称不能重复 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 command: - tail - -f - /etc/hosts [root@master231 pods]# 2.部署服务 [root@master231 pods]# kubectl apply -f 10-pods-xiuxian-multiple-command.yaml pod/xiuxian-mutiple created [root@master231 pods]# [root@master231 pods]# kubectl get pods NAME READY STATUS RESTARTS AGE xiuxian-mutiple 2/2 Running 0 6s [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-mutiple 2/2 Running 0 57s 10.100.1.4 worker232 <none> <none> [root@master231 pods]# 3.链接容器测试 [root@master231 pods]# kubectl exec -it -c c1 xiuxian-mutiple -- sh / # netstat -untalp Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 1/nginx: master pro tcp 0 0 :::80 :::* LISTEN 1/nginx: master pro / # / # ps -ef PID USER TIME COMMAND 1 root 0:00 nginx: master process nginx -g daemon off; 32 nginx 0:00 nginx: worker process 33 nginx 0:00 nginx: worker process 34 root 0:00 sh 41 root 0:00 ps -ef / # [root@master231 pods]# kubectl exec -it -c c2 xiuxian-mutiple -- sh / # netstat -untalp Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN - tcp 0 0 :::80 :::* LISTEN - / # / # ps -ef PID USER TIME COMMAND 1 root 0:00 tail -f /etc/hosts 7 root 0:00 sh 15 root 0:00 ps -ef / # 4.worker节点测试验证 # 此命令可以看到两个容器的启动命令 [root@worker232 ~]# docker ps -a --no-trunc | grep xiuxian-mutiple 1436e3780de57a3c47014b66adf27aa4b974ad0d6ad2a2917347bd11f130e83f sha256:f28fd43be4ad41fc768dcc3629f8479d1443df01ada10ac9a771314e4fdef599 "tail -f /etc/hosts" 30 seconds ago Up 29 seconds k8s_c2_xiuxian-mutiple_default_b9bb6246-b8c1-4c02-9332-16749d73e14b_0 9ce6ff35cbbfd05c7072cfc7c484bad499d3bd506bd875c21a14015a14b626a6 sha256:f28fd43be4ad41fc768dcc3629f8479d1443df01ada10ac9a771314e4fdef599 "/docker-entrypoint.sh nginx -g 'daemon off;'" 30 seconds ago Up 29 seconds k8s_c1_xiuxian-mutiple_default_b9bb6246-b8c1-4c02-9332-16749d73e14b_0 558e01a31ee1030725e819e05559ac1e8dccbfe8b16368da04cc28cfe8317929 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 30 seconds ago Up 29 seconds k8s_POD_xiuxian-mutiple_default_b9bb6246-b8c1-4c02-9332-16749d73e14b_0 [root@worker232 ~]# 5.删除资源 [root@master231 pods]# kubectl delete -f 10-pods-xiuxian-multiple-command.yaml pod "xiuxian-mutiple" deleted [root@master231 pods]# # 当你在 Dockerfile 中同时定义了 ENTRYPOINT 和 CMD 时,CMD 的内容会作为参数传递给 ENTRYPOINT。最终的启动命令是 ENTRYPOINT + CMD # Kubernetes Pod 定义中的 spec.containers[].command 字段直接对应且覆盖 Docker 的 ENTRYPOINT # Kubernetes Pod 定义中的 spec.containers[].args 字段直接对应且覆盖 Docker 的 CMD
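为了更直观地对照 command/args 与 ENTRYPOINT/CMD 的关系,这里补充一个极简示意清单(名称 demo-command-args 为假设,镜像沿用上面的 apps:v1,文中并未实际执行):

bash
cat > /tmp/demo-command-args.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: demo-command-args
spec:
  containers:
  # 只定义args:相当于只覆盖CMD,最终启动命令为 镜像ENTRYPOINT + args
  - name: only-args
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1
    args: ["nginx", "-g", "daemon off;"]
  # 同时定义command和args:相当于覆盖ENTRYPOINT和CMD,最终启动命令为 command + args
  - name: command-and-args
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1
    command: ["sleep"]
    args: ["3600"]
EOF

kubectl apply -f /tmp/demo-command-args.yaml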

4、故障排查技巧

1、故障排查技巧describe
bash
首先describe看一下报错信息,然后看一下具体的日志信息,然后修改启动命令启动试一下 1.什么是describe describe可以查看k8s任意资源的详细信息。 我们可以通过详细信息查看资源的运行状态及事件信息。 2.实战案例 2.1 准备资源清单 [root@master231 pods]# cat 11-pods-Troubleshooting-describe.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-describe labels: apps: xiuxian spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111 ports: - containerPort: 80 [root@master231 pods]# 2.2 创建资源 [root@master231 pods]# kubectl apply -f 11-pods-Troubleshooting-describe.yaml pod/xiuxian-describe created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-describe 0/1 ImagePullBackOff 0 4s 10.100.1.5 worker232 <none> <none> # ImagePullBackOff镜像拉取失败 2.3 查看资源的详细信息 [root@master231 pods]# kubectl describe pod xiuxian-describe Name: xiuxian-describe Namespace: default Priority: 0 Node: worker232/10.0.0.232 Start Time: Fri, 11 Jul 2025 09:08:22 +0800 Labels: apps=xiuxian Annotations: <none> Status: Pending IP: 10.100.1.5 IPs: IP: 10.100.1.5 Containers: c1: Container ID: Image: harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111 Image ID: Port: 80/TCP Host Port: 0/TCP State: Waiting Reason: ErrImagePull Ready: False Restart Count: 0 Environment: <none> Mounts: /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xc94x (ro) Conditions: Type Status Initialized True Ready False ContainersReady False PodScheduled True Volumes: kube-api-access-xc94x: Type: Projected (a volume that contains injected data from multiple sources) TokenExpirationSeconds: 3607 ConfigMapName: kube-root-ca.crt ConfigMapOptional: <nil> DownwardAPI: true QoS Class: BestEffort Node-Selectors: <none> Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s node.kubernetes.io/unreachable:NoExecute op=Exists for 300s Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 64s default-scheduler Successfully assigned default/xiuxian-describe to worker232 Normal Pulling 19s (x3 over 63s) kubelet Pulling image "harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111" Warning Failed 19s (x3 over 63s) kubelet Failed to pull image "harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111": rpc error: code = Unknown desc = Error response from daemon: received unexpected HTTP status: 502 Bad Gateway Warning Failed 19s (x3 over 63s) kubelet Error: ErrImagePull Normal BackOff 5s (x4 over 62s) kubelet Back-off pulling image "harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111" Warning Failed 5s (x4 over 62s) kubelet Error: ImagePullBackOff [root@master231 pods]# 报错信息:(注意观察: Reason字段,发现了Failed关键字。) 报错为"code = Unknown desc = Error response from daemon: received unexpected HTTP status: 502 Bad Gateway"。 问题原因: harbor服务可能挂掉了。 解决方案: 重启harbor。 [root@harbor250.weixiang.com harbor]# docker-compose down -t 0 [+] Running 10/10 ✔ Container nginx Removed 0.2s ✔ Container harbor-jobservice Removed 0.0s ✔ Container registryctl Removed 0.0s ✔ Container harbor-portal Removed 0.2s ✔ Container harbor-core Removed 0.2s ✔ Container registry Removed 0.0s ✔ Container redis Removed 0.0s ✔ Container harbor-db Removed 0.0s ✔ Container harbor-log Removed 0.2s ✔ Network harbor_harbor Removed 0.2s [root@harbor250.weixiang.com harbor]# [root@harbor250.weixiang.com harbor]# docker-compose up -d [+] Building 0.0s (0/0) docker:default [+] Running 10/10 ✔ Network harbor_harbor Created 0.1s ✔ Container harbor-log Started 0.0s ✔ Container registryctl Started 0.0s ✔ Container harbor-db 
Started 0.0s ✔ Container harbor-portal Started 0.0s ✔ Container redis Started 0.0s ✔ Container registry Started 0.0s ✔ Container harbor-core Started 0.0s ✔ Container harbor-jobservice Started 0.0s ✔ Container nginx Started 0.0s [root@harbor250.weixiang.com harbor]# 2.4 再次查看资源的详细信息 [root@master231 pods]# kubectl delete -f 11-pods-Troubleshooting-describe.yaml pod "xiuxian-describe" deleted [root@master231 pods]# [root@master231 pods]# kubectl apply -f 11-pods-Troubleshooting-describe.yaml # 删除之前的案例重新创建后发现有新的报错信息。 pod/xiuxian-describe created [root@master231 pods]# [root@master231 pods]# kubectl describe pod xiuxian-describe Name: xiuxian-describe Namespace: default Priority: 0 Node: worker233/10.0.0.233 Start Time: Fri, 11 Jul 2025 09:14:41 +0800 Labels: apps=xiuxian Annotations: <none> Status: Pending IP: 10.100.2.18 IPs: IP: 10.100.2.18 Containers: c1: Container ID: Image: harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111 Image ID: Port: 80/TCP Host Port: 0/TCP State: Waiting Reason: ImagePullBackOff Ready: False Restart Count: 0 Environment: <none> Mounts: /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8b7d7 (ro) Conditions: Type Status Initialized True Ready False ContainersReady False PodScheduled True Volumes: kube-api-access-8b7d7: Type: Projected (a volume that contains injected data from multiple sources) TokenExpirationSeconds: 3607 ConfigMapName: kube-root-ca.crt ConfigMapOptional: <nil> DownwardAPI: true QoS Class: BestEffort Node-Selectors: <none> Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s node.kubernetes.io/unreachable:NoExecute op=Exists for 300s Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 6s default-scheduler Successfully assigned default/xiuxian-describe to worker233 Normal Pulling 5s kubelet Pulling image "harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111" Warning Failed 5s kubelet Failed to pull image "harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111": rpc error: code = Unknown desc = Error response from daemon: unknown: artifact weixiang-xiuxian/apps:v11111111111 not found Warning Failed 5s kubelet Error: ErrImagePull Normal BackOff 3s (x2 over 4s) kubelet Back-off pulling image "harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111" Warning Failed 3s (x2 over 4s) kubelet Error: ImagePullBackOff [root@master231 pods]# 报错信息: (注意观察: Reason字段,发现了Failed关键字。) 报错信息为: Failed to pull image "harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111": rpc error: code = Unknown desc = Error response from daemon: unknown: artifact weixiang-xiuxian/apps:v11111111111 not found 解决方案: - 尝试手动拉取镜像? - 如果手动拉取失败,请考虑是不是私有镜像仓库,需要登录认证权限才能拉取? - 如果不是上面2个问题,则说明镜像仓库真的不存在该镜像,则需要检查资源清单,镜像名称是否写错了。 [root@worker233 ~]# docker login harbor250.weixiang.com Authenticating with existing credentials... WARNING! Your password will be stored unencrypted in /root/.docker/config.json. Configure a credential helper to remove this warning. See https://docs.docker.com/engine/reference/commandline/login/#credentials-store Login Succeeded [root@worker233 ~]# [root@worker233 ~]# docker pull harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111 Error response from daemon: unknown: artifact weixiang-xiuxian/apps:v11111111111 not found [root@worker233 ~]# 2.5 删除资源 [root@master231 pods]# kubectl delete -f 11-pods-Troubleshooting-describe.yaml pod "xiuxian-describe" deleted [root@master231 pods]# -
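除了 describe,也可以直接过滤事件来快速定位原因(示意命令,Pod 名称沿用上面的 xiuxian-describe):

bash
# 只查看与指定Pod相关的事件
kubectl get events --field-selector involvedObject.name=xiuxian-describe

# 按时间排序查看default名称空间的全部事件,观察最近发生了什么
kubectl get events --sort-by='.metadata.creationTimestamp'

# 只查看Warning级别的事件,镜像拉取失败等异常都会出现在这里
kubectl get events --field-selector type=Warning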
2、故障排查技巧之cp
bash
- 故障排查技巧之cp 1.什么是cp 所谓的cp就是和docker类似,用于将文件从容器和宿主机之间进行数据的互相拷贝。 2.实战案例 2.1 将Pod的容器文件拷贝到宿主机 [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-mutiple 2/2 Running 0 41m 10.100.1.6 worker232 <none> <none> [root@master231 pods]# [root@master231 pods]# kubectl cp xiuxian-mutiple:/docker-entrypoint.sh -c c1 /tmp/docker-entrypoint.sh # xiuxian-mutiple: pod名称 # /docker-entrypoint.sh: 根目录下的docker-entrypoint.sh文件 tar: removing leading '/' from member names [root@master231 pods]# [root@master231 pods]# ll /tmp/docker-entrypoint.sh -rw-r--r-- 1 root root 1202 Jul 11 10:01 /tmp/docker-entrypoint.sh [root@master231 pods]# [root@master231 pods]# [root@master231 pods]# wc -l /tmp/docker-entrypoint.sh 38 /tmp/docker-entrypoint.sh [root@master231 pods]# 2.2 将宿主机的文件拷贝到Pod中的容器 [root@master231 pods]# kubectl cp /etc/hosts xiuxian-mutiple:/hosts -c c1 [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-mutiple -c c1 -- sh / # ls -l / total 76 drwxr-xr-x 2 root root 4096 Nov 12 2021 bin drwxr-xr-x 5 root root 360 Jul 11 01:21 dev drwxr-xr-x 1 root root 4096 Nov 13 2021 docker-entrypoint.d -rwxrwxr-x 1 root root 1202 Nov 13 2021 docker-entrypoint.sh drwxr-xr-x 1 root root 4096 Jul 11 01:21 etc drwxr-xr-x 2 root root 4096 Nov 12 2021 home -rw-r--r-- 1 root root 261 Jul 11 02:02 hosts drwxr-xr-x 1 root root 4096 Nov 12 2021 lib drwxr-xr-x 5 root root 4096 Nov 12 2021 media drwxr-xr-x 2 root root 4096 Nov 12 2021 mnt drwxr-xr-x 2 root root 4096 Nov 12 2021 opt dr-xr-xr-x 297 root root 0 Jul 11 01:21 proc drwx------ 1 root root 4096 Jul 11 02:00 root drwxr-xr-x 1 root root 4096 Jul 11 01:21 run drwxr-xr-x 2 root root 4096 Nov 12 2021 sbin drwxr-xr-x 2 root root 4096 Nov 12 2021 srv dr-xr-xr-x 13 root root 0 Jul 11 01:21 sys drwxrwxrwt 1 root root 4096 Nov 13 2021 tmp drwxr-xr-x 1 root root 4096 Nov 12 2021 usr drwxr-xr-x 1 root root 4096 Nov 12 2021 var / # / # cat /hosts 127.0.0.1 localhost 127.0.1.1 yinzhengjie # The following lines are desirable for IPv6 capable hosts ::1 ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters 10.0.0.250 harbor250.weixiang.com / #
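需要注意,kubectl cp 底层是通过在容器里执行 tar 实现的(上面输出的 "tar: removing leading '/' from member names" 就来自容器内的 tar),因此要求目标容器里有 tar 命令;如果镜像里没有 tar,可以参考下面的示意写法,用 exec 配合管道或重定向代替:

bash
# 与kubectl cp导出文件等价的写法:容器内打包,宿主机解包
kubectl exec xiuxian-mutiple -c c1 -- tar cf - /docker-entrypoint.sh | tar xf - -C /tmp/

# 镜像内没有tar时,单个文本文件可直接用cat配合重定向拷出
kubectl exec xiuxian-mutiple -c c1 -- cat /docker-entrypoint.sh > /tmp/docker-entrypoint.sh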
3、故障排查技巧之exec
bash
1.什么是exec 所谓的exec和docker类似,可以直接在pod指定的容器中执行命令。 2.实战案例 2.1 执行命令 [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-mutiple 2/2 Running 0 43m 10.100.1.6 worker232 <none> <none> [root@master231 pods]# [root@master231 pods]# kubectl exec xiuxian-mutiple -c c1 -- ls -l / total 76 drwxr-xr-x 2 root root 4096 Nov 12 2021 bin drwxr-xr-x 5 root root 360 Jul 11 01:21 dev drwxr-xr-x 1 root root 4096 Nov 13 2021 docker-entrypoint.d -rwxrwxr-x 1 root root 1202 Nov 13 2021 docker-entrypoint.sh drwxr-xr-x 1 root root 4096 Jul 11 01:21 etc drwxr-xr-x 2 root root 4096 Nov 12 2021 home -rw-r--r-- 1 root root 261 Jul 11 02:02 hosts drwxr-xr-x 1 root root 4096 Nov 12 2021 lib drwxr-xr-x 5 root root 4096 Nov 12 2021 media drwxr-xr-x 2 root root 4096 Nov 12 2021 mnt drwxr-xr-x 2 root root 4096 Nov 12 2021 opt dr-xr-xr-x 296 root root 0 Jul 11 01:21 proc drwx------ 1 root root 4096 Jul 11 02:00 root drwxr-xr-x 1 root root 4096 Jul 11 01:21 run drwxr-xr-x 2 root root 4096 Nov 12 2021 sbin drwxr-xr-x 2 root root 4096 Nov 12 2021 srv dr-xr-xr-x 13 root root 0 Jul 11 01:21 sys drwxrwxrwt 1 root root 4096 Nov 13 2021 tmp drwxr-xr-x 1 root root 4096 Nov 12 2021 usr drwxr-xr-x 1 root root 4096 Nov 12 2021 var [root@master231 pods]# [root@master231 pods]# kubectl exec xiuxian-mutiple -c c1 -- ps -ef PID USER TIME COMMAND 1 root 0:00 nginx: master process nginx -g daemon off; 32 nginx 0:00 nginx: worker process 33 nginx 0:00 nginx: worker process 101 root 0:00 ps -ef [root@master231 pods]# [root@master231 pods]# kubectl exec xiuxian-mutiple -c c2 -- ps -ef PID USER TIME COMMAND 1 root 0:00 tail -f /etc/hosts 18 root 0:00 ps -ef [root@master231 pods]# 2.2 链接容器进行交互 [root@master231 pods]# kubectl exec -it xiuxian-mutiple -c c2 -- sh / # ifconfig eth0 Link encap:Ethernet HWaddr 82:A2:6D:0D:8D:44 inet addr:10.100.1.6 Bcast:10.100.1.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1 RX packets:24 errors:0 dropped:0 overruns:0 frame:0 TX packets:8 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2788 (2.7 KiB) TX bytes:1059 (1.0 KiB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / #
4、故障排查技巧之command&args
bash
1.什么是command * args 其中command相当与替换了Dockerfile的ENTRYPOINT指令。 而args相当于替换了Dockerfile的CMD指令。 说白了,就是用来替换启动命令的,尤其是在容器无法启动时,我们将容器启动后进行故障排查就很有必要。 2.实战案例 2.1 编写资源清单 [root@master231 pods]# cat 10-pods-xiuxian-multiple-command.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-mutiple labels: apps: xiuxian spec: # 定义容器列表 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 ports: - containerPort: 80 - name: c2 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 command: - tail - -f - /etc/hosts [root@master231 pods]# [root@master231 pods]# 2.2 测试验证问题 [root@master231 pods]# kubectl exec -it xiuxian-mutiple -c c1 -- sh / # ifconfig eth0 Link encap:Ethernet HWaddr 82:A2:6D:0D:8D:44 inet addr:10.100.1.6 Bcast:10.100.1.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1 RX packets:24 errors:0 dropped:0 overruns:0 frame:0 TX packets:8 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2788 (2.7 KiB) TX bytes:1059 (1.0 KiB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) / # / # ps -ef PID USER TIME COMMAND 1 root 0:00 nginx: master process nginx -g daemon off; 32 nginx 0:00 nginx: worker process 33 nginx 0:00 nginx: worker process 108 root 0:00 sh 115 root 0:00 ps -ef / # / # netstat -untlap Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 1/nginx: master pro tcp 0 0 :::80 :::* LISTEN 1/nginx: master pro / # / # [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-mutiple -c c\2 -- sh / # / # ps -ef PID USER TIME COMMAND 1 root 0:00 tail -f /etc/hosts 24 root 0:00 sh 30 root 0:00 ps -ef / # / # netstat -untalp Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN - tcp 0 0 :::80 :::* LISTEN - / # / # nginx 2025/07/11 02:10:51 [emerg] 32#32: bind() to 0.0.0.0:80 failed (98: Address in use) nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address in use) 2025/07/11 02:10:51 [notice] 32#32: try again to bind() after 500ms 2025/07/11 02:10:51 [emerg] 32#32: bind() to 0.0.0.0:80 failed (98: Address in use) nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address in use) 2025/07/11 02:10:51 [notice] 32#32: try again to bind() after 500ms 2025/07/11 02:10:51 [emerg] 32#32: bind() to 0.0.0.0:80 failed (98: Address in use) nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address in use) 2025/07/11 02:10:51 [notice] 32#32: try again to bind() after 500ms 2025/07/11 02:10:51 [emerg] 32#32: bind() to 0.0.0.0:80 failed (98: Address in use) nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address in use) 2025/07/11 02:10:51 [notice] 32#32: try again to bind() after 500ms 2025/07/11 02:10:51 [emerg] 32#32: bind() to 0.0.0.0:80 failed (98: Address in use) nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address in use) 2025/07/11 02:10:51 [notice] 32#32: try again to bind() after 500ms 2025/07/11 02:10:51 [emerg] 32#32: still could not bind() nginx: [emerg] still could not bind() / # 2.3 解决问题 / # vi /etc/nginx/conf.d/default.conf ... server { listen 81; ... 
/ # nginx -t nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: configuration file /etc/nginx/nginx.conf test is successful / # / # nginx 2025/07/11 02:12:42 [notice] 40#40: using the "epoll" event method 2025/07/11 02:12:42 [notice] 40#40: nginx/1.20.1 2025/07/11 02:12:42 [notice] 40#40: built by gcc 10.2.1 20201203 (Alpine 10.2.1_pre1) 2025/07/11 02:12:42 [notice] 40#40: OS: Linux 5.15.0-119-generic 2025/07/11 02:12:42 [notice] 40#40: getrlimit(RLIMIT_NOFILE): 524288:524288 2025/07/11 02:12:42 [notice] 41#41: start worker processes / # 2025/07/11 02:12:42 [notice] 41#41: start worker process 42 2025/07/11 02:12:42 [notice] 41#41: start worker process 43 / # / # ps -ef PID USER TIME COMMAND 1 root 0:00 tail -f /etc/hosts 24 root 0:00 sh 41 root 0:00 nginx: master process nginx 42 nginx 0:00 nginx: worker process 43 nginx 0:00 nginx: worker process 44 root 0:00 ps -ef / # / # netstat -tnulp Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:81 0.0.0.0:* LISTEN 36/nginx: master pr tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN - tcp 0 0 0.0.0.0:82 0.0.0.0:* LISTEN 83/nginx: master pr tcp 0 0 :::80 :::* LISTEN - # 修改nginx端口为82,如果重启nginx的话,之前81的端口进程不会停止,而82的端口会跟着重启,需要先把nginx的进程给全部杀死后再启动nginx,这样旧的进程就会被终止 / # killall nginx / # netstat -tnulp Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN - tcp 0 0 0.0.0.0:82 0.0.0.0:* LISTEN 95/nginx: master pr tcp 0 0 :::80 :::* LISTEN - / #
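command/args 在排障时最典型的用法,就是像上面 c2 容器那样,把启动命令临时替换成一个不会退出的进程(tail、sleep 等),让容器先保持运行,再 exec 进去手动执行原始启动命令观察报错。下面是一个可以直接套用的示意清单(名称 xiuxian-debug 为假设):

bash
cat > /tmp/debug-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: xiuxian-debug
spec:
  containers:
  - name: c1
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1
    # 临时替换启动命令,避免业务进程启动失败导致容器反复重启
    command: ["sleep", "3600"]
EOF

kubectl apply -f /tmp/debug-pod.yaml

# 进入容器后手动执行原始的启动命令,观察具体报错信息
kubectl exec -it xiuxian-debug -- sh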
5、故障排查技巧之explain
bash
1.什么是explain 所谓的explain 就是k8s为使用人员提供的文档,对相关字段进行补充说明的。 一般我们在编写资源清单时,遇到问题就可以使用它来纠正。 2.实战案例 [root@master231 pods]# cat 12-pods-Troubleshooting-explain.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-explain labels: apps: xiuxian spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 ports: - containerPort: 80 xixi: haha [root@master231 pods]# [root@master231 pods]# kubectl apply -f 12-pods-Troubleshooting-explain.yaml error: error validating "12-pods-Troubleshooting-explain.yaml": error validating data: ValidationError(Pod.spec.containers[0]): unknown field "xixi" in io.k8s.api.core.v1.Container; if you choose to ignore these errors, turn validation off with --validate=false [root@master231 pods]# 报错信息: 关键报错信息为: unknown field "xixi" ... # 未知的字段 3.使用explain查看各级字段包含的子字段 3.1 语法格式 kubectl explain <Type>[.field][.field][.field][.field]... 3.2 相关数据类型说明 <string> 字符串类型,表示值只能是单个字符串,特殊字符则需要使用双引号""引起来。 <Object> 对象,表示有下级字段。且都是键值对。 <boolean> 表示值是一个布尔类型,有效值为: true或者false。 <[]Object> 说白了,就是有多个<Object>对象,每个<Object>使用"-"进行区分。使用时采用下标索引的方式引用:0,1,2,... -required- 这个关键字表示强制的意思,说白了,就是在定义资源清单时,必须引用该字段。 <integer> 整型,表示的是一个整数类型。 <map[string]string> 对应Golang编程语言的map类型,其中key是String,value是String类型。 3.3 测试样例 [root@master231 pods]# kubectl explain Pod [root@master231 pods]# kubectl explain Pod.spec [root@master231 pods]# kubectl explain Pod.spec.containers.ports [root@master231 pods]# kubectl explain Pod.metadata.labels
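explain 还支持 --recursive 选项,可以一次性展开某个字段下的全部子字段,便于快速查找字段名(示意命令):

bash
# 递归列出Pod.spec下的全部字段名,这里只看前30行
kubectl explain pod.spec --recursive | head -n 30

# 查看某个具体字段的说明,例如容器的环境变量
kubectl explain pod.spec.containers.env

# 查看集群支持的资源类型及缩写,配合explain使用
kubectl api-resources | head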
6、故障排查技巧之logs
bash
1.什么是logs 和docker类似,我们可以使用kubectl工具在命令行中查看Pod容器日志的信息。 2.实战案例 2.1 创建Pod [root@master231 pods]# cat 10-pods-xiuxian-multiple-command.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-mutiple labels: apps: xiuxian spec: # 定义容器列表 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 ports: - containerPort: 80 - name: c2 # 注意,容器的名称不能重复问 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 command: - tail - -f - /etc/hosts [root@master231 pods]# [root@master231 pods]# kubectl apply -f 10-pods-xiuxian-multiple-command.yaml pod/xiuxian-mutiple created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-mutiple 2/2 Running 0 14s 10.100.1.6 worker232 <none> <none> [root@master231 pods]# 2.2 实时查看指定的容器 [root@master231 pods]# kubectl logs xiuxian-mutiple error: a container name must be specified for pod xiuxian-mutiple, choose one of: [c1 c2] [root@master231 pods]# [root@master231 pods]# kubectl logs xiuxian-mutiple -c c1 -f /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/ /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh 10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf 10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh /docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh /docker-entrypoint.sh: Configuration complete; ready for start up 2025/07/11 01:21:45 [notice] 1#1: using the "epoll" event method 2025/07/11 01:21:45 [notice] 1#1: nginx/1.20.1 2025/07/11 01:21:45 [notice] 1#1: built by gcc 10.2.1 20201203 (Alpine 10.2.1_pre1) 2025/07/11 01:21:45 [notice] 1#1: OS: Linux 5.15.0-119-generic 2025/07/11 01:21:45 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 524288:524288 2025/07/11 01:21:45 [notice] 1#1: start worker processes 2025/07/11 01:21:45 [notice] 1#1: start worker process 32 2025/07/11 01:21:45 [notice] 1#1: start worker process 33 2.3 访问测试 [root@master231 ~]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-mutiple 2/2 Running 0 2m6s 10.100.1.6 worker232 <none> <none> [root@master231 ~]# [root@master231 ~]# curl 10.100.1.6 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 ~]# 2.4 查看最近1m中的日志信息 [root@master231 ~]# kubectl logs --since 1m xiuxian-mutiple -c c1 -f 10.100.0.0 - - [11/Jul/2025:01:23:57 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 2.5 查看日志时显示时间 [root@master231 ~]# kubectl logs --since 4m xiuxian-mutiple -c c1 -f 10.100.0.0 - - [11/Jul/2025:01:23:57 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" [root@master231 ~]# [root@master231 ~]# kubectl logs --since 4m xiuxian-mutiple -c c1 -f --timestamps 2025-07-11T01:23:57.625315150Z 10.100.0.0 - - [11/Jul/2025:01:23:57 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-"
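kubectl logs 还有几个常用选项,可以按需组合使用(示意命令,Pod 与标签沿用上面的例子):

bash
# 只看最近20行日志
kubectl logs xiuxian-mutiple -c c1 --tail=20

# 一次性查看Pod内所有容器的日志
kubectl logs xiuxian-mutiple --all-containers=true

# 按标签查看一组Pod的日志,并在每行前加上来源前缀
kubectl logs -l apps=xiuxian --all-containers=true --prefix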
7、查看Pod容器重启前的日志
bash
1.如何查容器重启之前的日志呢? 可以通过kubectl logs -p 选项来查看。 2.实战案例 2.1 创建资源 [root@master231 01-pods]# cat > 01-pods-xiuxian.yaml <<EOF # 指定API的版本号 apiVersion: v1 # 指定资源的类型 kind: Pod # 定义元数据信息 metadata: # 指定资源的名称 name: xiuxian # 给资源打标签 labels: apps: xiuxian school: weixiang class: weixiang98 # 定义期望资源状态 spec: # 调度到指定节点的名称 nodeName: worker233 # 定义Pod运行的容器 containers: # 指定容器的名称 - name: c1 # 指定容器的镜像 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 EOF [root@master231 01-pods]# [root@master231 01-pods]# kubectl apply -f 01-pods-xiuxian.yaml pod/xiuxian created [root@master231 01-pods]# [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian 1/1 Running 0 4s 10.100.2.79 worker233 <none> <none> [root@master231 01-pods]# [root@master231 01-pods]# curl 10.100.2.79 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 01-pods]# curl 10.100.2.79 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 01-pods]# curl 10.100.2.79 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 01-pods]# [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian 1/1 Running 0 28s 10.100.2.79 worker233 <none> <none> [root@master231 01-pods]# [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian 1/1 Running 0 29s 10.100.2.79 worker233 <none> <none> [root@master231 01-pods]# [root@master231 01-pods]# kubectl logs -f xiuxian /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/ /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh 10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf 10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh /docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh /docker-entrypoint.sh: Configuration complete; ready for start up 2025/07/15 00:52:35 [notice] 1#1: using the "epoll" event method 2025/07/15 00:52:35 [notice] 1#1: nginx/1.20.1 2025/07/15 00:52:35 [notice] 1#1: built by gcc 10.2.1 20201203 (Alpine 10.2.1_pre1) 2025/07/15 00:52:35 [notice] 1#1: OS: Linux 5.15.0-143-generic 2025/07/15 00:52:35 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 524288:524288 2025/07/15 00:52:35 [notice] 1#1: start worker processes 2025/07/15 00:52:35 [notice] 1#1: start worker process 32 2025/07/15 00:52:35 [notice] 1#1: start worker process 33 10.100.0.0 - - [15/Jul/2025:00:52:59 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [15/Jul/2025:00:53:00 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 
10.100.0.0 - - [15/Jul/2025:00:53:00 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 2.2 停止容器发现会自动拉起新的容器 [root@worker233 ~]# docker ps -a | grep xiuxian 786f6f44fd09 f28fd43be4ad "/docker-entrypoint.…" 17 seconds ago Up 16 seconds k8s_c1_xiuxian_default_495fe7ae-bb8b-48ca-bb15-df53d85a6725_0 995141f83e53 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 17 seconds ago Up 16 seconds k8s_POD_xiuxian_default_495fe7ae-bb8b-48ca-bb15-df53d85a6725_0 [root@worker233 ~]# [root@worker233 ~]# [root@worker233 ~]# docker kill 786f6f44fd09 786f6f44fd09 [root@worker233 ~]# [root@worker233 ~]# docker ps -a | grep xiuxian 5fa57c37a3f2 f28fd43be4ad "/docker-entrypoint.…" 1 second ago Up Less than a second k8s_c1_xiuxian_default_495fe7ae-bb8b-48ca-bb15-df53d85a6725_1 786f6f44fd09 f28fd43be4ad "/docker-entrypoint.…" 43 seconds ago Exited (137) 1 second ago k8s_c1_xiuxian_default_495fe7ae-bb8b-48ca-bb15-df53d85a6725_0 995141f83e53 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 43 seconds ago Up 43 seconds k8s_POD_xiuxian_default_495fe7ae-bb8b-48ca-bb15-df53d85a6725_0 [root@worker233 ~]# 2.3 查看重启前的日志 [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian 1/1 Running 1 (11s ago) 54s 10.100.2.79 worker233 <none> <none> [root@master231 01-pods]# [root@master231 01-pods]# kubectl logs -f xiuxian # 默认查看的是当前容器的日志 /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/ /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh 10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf 10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh /docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh /docker-entrypoint.sh: Configuration complete; ready for start up 2025/07/15 00:53:17 [notice] 1#1: using the "epoll" event method 2025/07/15 00:53:17 [notice] 1#1: nginx/1.20.1 2025/07/15 00:53:17 [notice] 1#1: built by gcc 10.2.1 20201203 (Alpine 10.2.1_pre1) 2025/07/15 00:53:17 [notice] 1#1: OS: Linux 5.15.0-143-generic 2025/07/15 00:53:17 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 524288:524288 2025/07/15 00:53:17 [notice] 1#1: start worker processes 2025/07/15 00:53:17 [notice] 1#1: start worker process 32 2025/07/15 00:53:17 [notice] 1#1: start worker process 33 ^C [root@master231 01-pods]# [root@master231 01-pods]# [root@master231 01-pods]# kubectl logs -f xiuxian -p # 查看重启之前的日志使用‘-p’选项 /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/ /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh 10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf 10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh /docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh /docker-entrypoint.sh: Configuration complete; ready for start up 2025/07/15 00:52:35 [notice] 1#1: using the "epoll" event method 2025/07/15 00:52:35 [notice] 1#1: nginx/1.20.1 2025/07/15 00:52:35 [notice] 1#1: built by 
gcc 10.2.1 20201203 (Alpine 10.2.1_pre1) 2025/07/15 00:52:35 [notice] 1#1: OS: Linux 5.15.0-143-generic 2025/07/15 00:52:35 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 524288:524288 2025/07/15 00:52:35 [notice] 1#1: start worker processes 2025/07/15 00:52:35 [notice] 1#1: start worker process 32 2025/07/15 00:52:35 [notice] 1#1: start worker process 33 10.100.0.0 - - [15/Jul/2025:00:52:59 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [15/Jul/2025:00:53:00 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [15/Jul/2025:00:53:00 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" [root@master231 01-pods]# 2.4 删除容器 [root@worker233 ~]# docker ps -a | grep xiuxian 5fa57c37a3f2 f28fd43be4ad "/docker-entrypoint.…" 1 second ago Up Less than a second k8s_c1_xiuxian_default_495fe7ae-bb8b-48ca-bb15-df53d85a6725_1 786f6f44fd09 f28fd43be4ad "/docker-entrypoint.…" 43 seconds ago Exited (137) 1 second ago k8s_c1_xiuxian_default_495fe7ae-bb8b-48ca-bb15-df53d85a6725_0 995141f83e53 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 43 seconds ago Up 43 seconds k8s_POD_xiuxian_default_495fe7ae-bb8b-48ca-bb15-df53d85a6725_0 [root@worker233 ~]# [root@worker233 ~]# [root@worker233 ~]# docker rm 786f6f44fd09 786f6f44fd09 [root@worker233 ~]# [root@worker233 ~]# docker ps -a | grep xiuxian 5fa57c37a3f2 f28fd43be4ad "/docker-entrypoint.…" 35 seconds ago Up 34 seconds k8s_c1_xiuxian_default_495fe7ae-bb8b-48ca-bb15-df53d85a6725_1 995141f83e53 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" About a minute ago Up About a minute k8s_POD_xiuxian_default_495fe7ae-bb8b-48ca-bb15-df53d85a6725_0 [root@worker233 ~]# [root@worker233 ~]# 2.5 再次链接测试 [root@master231 01-pods]# kubectl logs -f xiuxian -p # 由于容器已经不存在,则无法查看重启之前的日志!【说白了,-p选项依赖于容器必须存在。】 unable to retrieve container logs for docker://786f6f44fd09e9028f117d79020665507a2c831cc2752de9253da46e6668df9a[root@master231 01-pods]# [root@master231 01-pods]# ‍```
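补充一个小技巧(示意命令):-p 查看的是上一次容器实例的日志,可以先用 jsonpath 看一下 restartCount 和上次退出原因,再决定是否需要加 -p。

bash
# 查看容器的重启次数
kubectl get pod xiuxian -o jsonpath='{.status.containerStatuses[0].restartCount}'; echo

# 查看上一次容器退出的原因(如OOMKilled、Error等)
kubectl get pod xiuxian -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'; echo

# 重启次数大于0时,再用-p查看重启前的日志
kubectl logs xiuxian -p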

5、Pod创建流程快速上手

bash
- 流程速记:
  管理员应用资源清单 -> 请求到达api-server -> 配置写入etcd -> scheduler调度Pod到对应的节点,api-server将调度结果写回etcd -> api-server将调度任务下发给相应节点的kubelet -> kubelet开始创建Pod(先启动基础架构容器,再执行初始化容器,最后启动业务容器),并持续上报容器状态给api-server,由api-server存入etcd。
  镜像来源:docker hub官方、阿里云|华为云|腾讯云|elastic官方镜像、自建的harbor仓库。
  如果worker节点挂掉,意味着kubelet不正常工作,则controller manager开始介入,重新创建一个新的Pod,而后又由scheduler负责后续的调度工作。

1.用户提交请求
  通过kubectl apply命令或API提交Pod的YAML文件到Kubernetes API-Server。

2.API-Server处理
  验证请求的合法性,认证、授权后,将Pod配置写入etcd存储。

3.Scheduler调度介入
  Scheduler根据资源需求、节点亲和性等规则,调度Pod到对应的节点。

4.Kubelet创建Pod
  目标节点的Kubelet监听到Pod分配,调用容器运行时(如 Docker/containerd)发起创建请求,拉取镜像、创建容器、挂载存储卷、设置网络(通过 CNI 插件)。

5.控制器管理(如适用)
  若Pod由控制器(如 Deployment)创建,控制器会持续监控并确保实际状态与期望状态一致。
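可以用下面的示意命令直观地观察这一流程:一个终端 watch Pod 状态,另一个终端 watch 事件,再应用任意一个资源清单,即可看到调度、拉取镜像、启动容器的全过程。

bash
# 终端1:实时观察Pod状态从Pending到Running的变化
kubectl get pods -o wide -w

# 终端2:实时观察scheduler、kubelet上报的事件
kubectl get events -w

# 终端3:应用任意一个资源清单,例如前面创建过的01-pods-xiuxian.yaml
kubectl apply -f 01-pods-xiuxian.yaml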

6、存储卷

1、Pod容器数据持久化方案之emptyDir
bash
1.什么是emptyDir 所谓的emptyDir是一个空目录,用于将容器的数据做临时的存储。 其应用场景多用于一个Pod内多个容器共享数据的场景。 其特点就是随着Pod删除,其数据丢失。 2.实战案例 2.1 编写资源清单 [root@master231 pods]# cat 13-pods-volumes-emptyDir.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-emptydir labels: apps: xiuxian spec: # 定义存储卷 volumes: - name: data # 声明存储卷类型是一个临时的空目录。 emptyDir: {} containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 # 挂载存储卷 volumeMounts: # 指定存储卷的名称 - name: data # 存储卷挂载到容器的路径,有点类似于: docker run -v data:/usr/share/nginx/html/ ... mountPath: /usr/share/nginx/html/ - name: c2 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 command: ["tail","-f","/etc/hosts"] volumeMounts: - name: data # 如果挂载路径不存在,则会自动创建。有点类似于: docker run -v data:/weixiang-weixiang98 mountPath: /weixiang-weixiang98 # 以上c1跟c2容器基于存储卷可以实现数据共享 [root@master231 pods]# [root@master231 pods]# kubectl apply -f 13-pods-volumes-emptyDir.yaml pod/xiuxian-emptydir created [root@master231 pods]# [root@master231 pods]# kubectl get pods -l apps=xiuxian -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-emptydir 2/2 Running 0 6s 10.100.1.7 worker232 <none> <none> [root@master231 pods]# 2.2 测试验证 [root@master231 pods]# curl 10.100.1.7 <html> <head><title>403 Forbidden</title></head> <body> <center><h1>403 Forbidden</h1></center> <hr><center>nginx/1.20.1</center> </body> </html> [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-emptydir -c c1 -- sh / # echo www.weixiang.com > /usr/share/nginx/html/index.html / # [root@master231 pods]# [root@master231 pods]# curl 10.100.1.7 www.weixiang.com [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-emptydir -c c2 -- sh / # / # ls -l /weixiang-weixiang98/ total 4 -rw-r--r-- 1 root root 18 Jul 11 03:58 index.html / # / # cat /weixiang-weixiang98/index.html # 很明显,c1和c2本身就网络共享,现在我们也实现数据共享。 www.weixiang.com / # / # echo "weixiang98 very good!" > /weixiang-weixiang98/index.html / # / # cat /weixiang-weixiang98/index.html weixiang98 very good! / # [root@master231 pods]# [root@master231 pods]# curl 10.100.1.7 weixiang98 very good! 
[root@master231 pods]# 2.3 删除数据验证 2.3.1 删除容器 [root@worker232 ~]# docker ps -a | grep xiuxian-emptydir faf08df525ae f28fd43be4ad "tail -f /etc/hosts" 3 minutes ago Up 3 minutes k8s_c2_xiuxian-emptydir_default_b7f366d6-c2b8-4f28-86bd-609a84984670_0 526b47279694 f28fd43be4ad "/docker-entrypoint.…" 3 minutes ago Up 3 minutes k8s_c1_xiuxian-emptydir_default_b7f366d6-c2b8-4f28-86bd-609a84984670_0 7a6e0d3f941d registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 3 minutes ago Up 3 minutes k8s_POD_xiuxian-emptydir_default_b7f366d6-c2b8-4f28-86bd-609a84984670_0 [root@worker232 ~]# [root@worker232 ~]# docker rm -f faf08df525ae 526b47279694 faf08df525ae 526b47279694 [root@worker232 ~]# [root@worker232 ~]# docker ps -a | grep xiuxian-emptydir a30d3f8690a0 f28fd43be4ad "tail -f /etc/hosts" 2 seconds ago Up 1 second k8s_c2_xiuxian-emptydir_default_b7f366d6-c2b8-4f28-86bd-609a84984670_1 dc325a7573f6 f28fd43be4ad "/docker-entrypoint.…" 2 seconds ago Up 1 second k8s_c1_xiuxian-emptydir_default_b7f366d6-c2b8-4f28-86bd-609a84984670_1 7a6e0d3f941d registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 3 minutes ago Up 3 minutes k8s_POD_xiuxian-emptydir_default_b7f366d6-c2b8-4f28-86bd-609a84984670_0 [root@worker232 ~]# 2.3.2 验证数据是否丢失 [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-emptydir 2/2 Running 2 4m17s 10.100.1.7 worker232 <none> <none> [root@master231 pods]# [root@master231 pods]# curl 10.100.1.7 weixiang98 very good! [root@master231 pods]# [root@master231 pods]# 2.4 验证emptyDir底层的数据存储路径 2.4.1 验证kubelet底层存储pod的数据目录 [root@worker232 ~]# cat /var/lib/kubelet/pods/b7f366d6-c2b8-4f28-86bd-609a84984670/volumes/kubernetes.io~empty-dir/data/index.html weixiang98 very good! [root@worker232 ~]# [root@worker232 ~]# echo xixi > /var/lib/kubelet/pods/b7f366d6-c2b8-4f28-86bd-609a84984670/volumes/kubernetes.io~empty-dir/data/index.html [root@worker232 ~]# [root@worker232 ~]# cat /var/lib/kubelet/pods/b7f366d6-c2b8-4f28-86bd-609a84984670/volumes/kubernetes.io~empty-dir/data/index.html xixi [root@worker232 ~]# [root@worker232 ~]# 格式: /var/lib/kubelet/pods/<POD_ID>/volumes/kubernetes.io~empty-dir 2.4.2 测试验证pod数据是否发生变化 [root@master231 pods]# curl 10.100.1.7 xixi 2.5 删除pod观察数据是否丢失 2.5.1 删除pod [root@master231 pods]# kubectl delete pods xiuxian-emptydir pod "xiuxian-emptydir" deleted [root@master231 pods]# 2.5.2 worker节点测试验证 [root@worker232 ~]# cat /var/lib/kubelet/pods/b7f366d6-c2b8-4f28-86bd-609a84984670/volumes/kubernetes.io~empty-dir/data/index.html cat: /var/lib/kubelet/pods/b7f366d6-c2b8-4f28-86bd-609a84984670/volumes/kubernetes.io~empty-dir/data/index.html: No such file or directory [root@worker232 ~]# [root@worker232 ~]# ll /var/lib/kubelet/pods/b7f366d6-c2b8-4f28-86bd-609a84984670/ ls: cannot access '/var/lib/kubelet/pods/b7f366d6-c2b8-4f28-86bd-609a84984670/': No such file or directory [root@worker232 ~]#
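emptyDir 还支持把临时目录放到内存(tmpfs)并限制容量,适合高频读写的临时缓存场景,下面是示意写法(名称与 sizeLimit 数值均为假设):

bash
cat > /tmp/emptydir-memory.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: xiuxian-emptydir-memory
spec:
  volumes:
  - name: cache
    emptyDir:
      # 使用内存(tmpfs)作为存储介质,读写快,但会计入容器内存用量
      medium: Memory
      # 限制临时目录的容量上限
      sizeLimit: 128Mi
  containers:
  - name: c1
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1
    volumeMounts:
    - name: cache
      mountPath: /cache
EOF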


1、filebeat采集修仙业务日志案例
bash
- 1.部署ES单点和kibana,要求windows能够正常访问kibana页面;
- 2.部署修仙业务,windows客户端可以正常访问修仙业务;
- 3.将修仙业务的容器日志采集到ES节点,通过kibana可以正常查看;

镜像参考: http://192.168.21.253/Resources/Docker/images/ElasticStack/7.17.25/
2、部署ES和kibana
bash
[root@master231 pods]# cat 14-pods-es-kibana.yaml apiVersion: v1 kind: Pod metadata: name: elasticsearch-kibana spec: hostNetwork: true containers: - name: es image: harbor250.weixiang.com/weixiang-elasticstack/elasticsearch:7.17.25 ports: - containerPort: 9200 name: http - containerPort: 9300 name: tcp env: - name: discovery.type value: "single-node" - name: node.name value: "elk91" - name: cluster.name value: "weixiang-weixiang98-single" - name: ES_JAVA_OPTS value: "-Xms512m -Xmx512m" - name: kibana image: harbor250.weixiang.com/weixiang-elasticstack/kibana:7.17.25 ports: - containerPort: 5601 name: webui env: - name: ELASTICSEARCH_HOSTS value: http://127.0.0.1:9200 [root@master231 pods]# [root@master231 pods]# kubectl apply -f 14-pods-es-kibana.yaml pod/elasticsearch-kibana created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES elasticsearch-kibana 2/2 Running 0 4s 10.0.0.233 worker233 <none> <none> [root@master231 pods]#
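该 Pod 配置了 hostNetwork: true,容器直接复用 worker233 的宿主机网络,9200 和 5601 会监听在节点 IP 上,因此无需再做 port-forward,可以用下面的示意命令验证:

bash
# 在worker233上确认ES与kibana端口已监听在宿主机网络
ss -ntl | egrep '9200|5601'

# 在任意节点直接通过节点IP访问
curl -s http://10.0.0.233:9200/_cat/nodes
curl -sI http://10.0.0.233:5601/ | head -n 1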
3、部署filebeat集成修仙业务
bash
2.1 推送镜像到harbor仓库 [root@worker233 ~]# wget http://192.168.21.253/Resources/Docker/images/ElasticStack/7.17.25/weixiang-filebeat-v7.17.25.tar.gz [root@worker233 ~]# docker load -i weixiang-filebeat-v7.17.25.tar.gz [root@worker233 ~]# docker tag docker.elastic.co/beats/filebeat:7.17.25 harbor250.weixiang.com/weixiang-elasticstack/filebeat:7.17.25 [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-elasticstack/filebeat:7.17.25 2.2 编写资源清单 [root@master231 pods]# cat 15-pods-xiuxian-filebeat.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian labels: apps: xiuxian spec: # 定义存储卷 volumes: - name: log # 声明存储卷类型是一个临时的空目录。 emptyDir: {} containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 # 挂载存储卷 volumeMounts: # 指定存储卷的名称 - name: log # 存储卷挂载到容器的路径 mountPath: /var/log/nginx - name: c2 image: harbor250.weixiang.com/weixiang-elasticstack/filebeat:7.17.25 volumeMounts: - name: log mountPath: /data command: - tail - -f - /etc/hosts [root@master231 pods]# [root@master231 pods]# kubectl apply -f 15-pods-xiuxian-filebeat.yaml pod/xiuxian created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES elasticsearch-kibana 2/2 Running 0 5m12s 10.0.0.233 worker233 <none> <none> xiuxian 2/2 Running 0 11s 10.100.1.8 worker232 <none> <none> [root@master231 pods]# 2.3 客户端访问测试 [root@master231 ~]# for i in `seq 10`; do curl 10.100.1.8 ; done 2.4 查看修仙业务的日志 [root@master231 pods]# kubectl exec -it xiuxian -c c1 -- sh / # ls -l /var/log/nginx/ total 4 -rw-r--r-- 1 root root 0 Jul 11 06:56 access.log -rw-r--r-- 1 root root 507 Jul 11 06:56 error.log / # / # ls -l /var/log/nginx/access.log -rw-r--r-- 1 root root 0 Jul 11 06:56 /var/log/nginx/access.log / # / # tail -f /var/log/nginx/access.log 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 2.5 filebeat采集日志 [root@master231 pods]# kubectl exec -it xiuxian -c c2 -- bash filebeat@xiuxian:~$ filebeat@xiuxian:~$ ls -l /data/ total 8 -rw-r--r-- 1 root root 910 Jul 11 06:58 access.log -rw-r--r-- 1 root root 507 Jul 11 06:56 error.log filebeat@xiuxian:~$ filebeat@xiuxian:~$ cat /data/access.log 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" 
"curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:06:58:01 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" filebeat@xiuxian:~$ filebeat@xiuxian:~$ filebeat version filebeat version 7.17.25 (amd64), libbeat 7.17.25 [ef6504bc5cb524dfe5000d367f8d775dc7e82473 built 2024-10-15 15:24:12 +0000 UTC] filebeat@xiuxian:~$ filebeat@xiuxian:~$ cat > /tmp/xiuxian-to-es.yaml <<EOF filebeat.inputs: - type: log paths: - /data/access.log* output.elasticsearch: hosts: - 10.0.0.233:9200 index: "weixiang98-efk-xiuxian-%{+yyyy.MM.dd}" setup.ilm.enabled: false setup.template.name: "weixiang-weixiang98" setup.template.pattern: "weixiang98*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 0 EOF filebeat@xiuxian:~$ filebeat -e -c /tmp/xiuxian-to-es.yaml 2.6 kibana查看日志 http://10.0.0.233:5601/ 3.删除资源 [root@master231 pods]# kubectl delete -f 14-pods-es-kibana.yaml -f 15-pods-xiuxian-filebeat.yaml pod "elasticsearch-kibana" deleted pod "xiuxian" deleted [root@master231 pods]#
2、存储卷案例之hostPath
bash
1.什么是hostPath hostPath存储卷用于Pod容器访问worker节点的任意工作目录。 典型的用法就是同步时间。 2.hostPath数据持久化案例 2.1 编写资源清单 [root@master231 pods]# cat 16-pods-xiuxian-hostPath.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-hostpath labels: apps: xiuxian spec: volumes: - name: log hostPath: path: /data containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 volumeMounts: - name: log mountPath: /var/log/nginx - name: c2 image: harbor250.weixiang.com/weixiang-elasticstack/filebeat:7.17.25 volumeMounts: - name: log mountPath: /data command: - tail - -f - /etc/hosts [root@master231 pods]# 2.2 创建资源 [root@master231 pods]# kubectl apply -f 16-pods-xiuxian-hostPath.yaml pod/xiuxian-hostpath created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-hostpath 2/2 Running 0 4s 10.100.1.9 worker232 <none> <none> [root@master231 pods]# [root@master231 pods]# for i in `seq 5`;do curl 10.100.1.9;done 2.3 测试验证 [root@worker232 ~]# cat /data/access.log 10.100.0.0 - - [11/Jul/2025:07:38:09 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:07:38:09 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:07:38:09 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:07:38:09 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:07:38:09 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" [root@worker232 ~]# 2.4 删除pod资源 [root@master231 pods]# kubectl delete pod xiuxian-hostpath pod "xiuxian-hostpath" deleted [root@master231 pods]# [root@master231 pods]# kubectl get pods No resources found in default namespace. [root@master231 pods]# 2.5 不难发现,数据不丢失 [root@worker232 ~]# ll /data total 16 drwxr-xr-x 2 root root 4096 Jul 11 15:37 ./ drwxr-xr-x 22 root root 4096 Jul 11 15:37 ../ -rw-r--r-- 1 root root 455 Jul 11 15:38 access.log -rw-r--r-- 1 root root 1260 Jul 11 15:39 error.log [root@worker232 ~]# [root@worker232 ~]# cat /data/access.log 10.100.0.0 - - [11/Jul/2025:07:38:09 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:07:38:09 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:07:38:09 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:07:38:09 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" 10.100.0.0 - - [11/Jul/2025:07:38:09 +0000] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-" [root@worker232 ~]#
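hostPath 还支持 type 字段,用于约束宿主机路径的类型,例如 DirectoryOrCreate 表示目录不存在时自动创建,可以避免因宿主机目录缺失导致挂载失败,示意写法如下(名称与路径为假设):

bash
cat > /tmp/hostpath-type.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: xiuxian-hostpath-type
spec:
  volumes:
  - name: log
    hostPath:
      path: /data/nginx-log
      # 目录不存在时自动创建;其它可选值还有Directory、File、FileOrCreate等
      type: DirectoryOrCreate
  containers:
  - name: c1
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1
    volumeMounts:
    - name: log
      mountPath: /var/log/nginx
EOF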
1、hostPath时间同步
bash
3.1 未配置时区时间是不正确的 [root@master231 pods]# cat 17-pods-xiuxian-hostPath-timezone.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-hostpath-timezonen labels: apps: xiuxian spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 pods]# [root@master231 pods]# kubectl apply -f 17-pods-xiuxian-hostPath-timezone.yaml pod/xiuxian-hostpath-timezone created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-hostpath-timezone 1/1 Running 0 4s 10.100.1.10 worker232 <none> <none> [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-hostpath-timezone -- date -R Fri, 11 Jul 2025 07:42:39 +0000 [root@master231 pods]# [root@master231 pods]# kubectl delete -f 17-pods-xiuxian-hostPath-timezone.yaml pod "xiuxian-hostpath-timezone" deleted [root@master231 pods]# 3.2 配置时区 [root@master231 pods]# cat 17-pods-xiuxian-hostPath-timezone.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-hostpath-timezone labels: apps: xiuxian spec: volumes: - name: log hostPath: path: /etc/localtime containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 volumeMounts: - name: log mountPath: /etc/localtime [root@master231 pods]# [root@master231 pods]# kubectl apply -f 17-pods-xiuxian-hostPath-timezone.yaml pod/xiuxian-hostpath-timezone created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-hostpath-timezone 1/1 Running 0 4s 10.100.1.11 worker232 <none> <none> [root@master231 pods]# [root@master231 pods]# kubectl exec xiuxian-hostpath-timezone -- date -R Fri, 11 Jul 2025 15:44:22 +0800 [root@master231 pods]#
3、存储卷案例之nfs
bash
1.什么是nfs nfs表示网络文件系统,存在客户端和服务端,需要单独部署服务端。 K8S在使用nfs时,集群应该安装nfs的相关模块。 nfs的应用场景: - 1.实现跨节点不同Pod的数据共享; - 2.实现跨节点存储数据; 2.Ubuntu系统部署nfs-server 2.1 K8S集群所有节点安装nfs驱动 apt -y install nfs-kernel-server 2.2 创建服务端的共享目录 [root@master231 pods]# mkdir -pv /yinzhengjie/data/nfs-server 2.3 配置nfs-server端 [root@master231 pods]# tail -1 /etc/exports /yinzhengjie/data/nfs-server *(rw,no_root_squash) [root@master231 pods]# 2.4 重启配置生效 [root@master231 pods]# systemctl enable --now nfs-server [root@master231 pods]# [root@master231 pods]# systemctl restart nfs-server [root@master231 pods]# [root@master231 pods]# exportfs /yinzhengjie/data/nfs-server <world> [root@master231 pods]# 2.5 客户端worker232验证测试 [root@worker232 ~]# mount -t nfs 10.1.24.13:/yinzhengjie/data/nfs-server /mnt/ [root@worker232 ~]# [root@worker232 ~]# df -h | grep mnt 10.0.0.231:/yinzhengjie/data/nfs-server 48G 11G 35G 24% /mnt [root@worker232 ~]# [root@worker232 ~]# cp /etc/fstab /mnt/ [root@worker232 ~]# [root@worker232 ~]# ll /mnt/ total 12 drwxr-xr-x 2 root root 4096 Jul 11 16:12 ./ drwxr-xr-x 22 root root 4096 Jul 11 15:37 ../ -rw-r--r-- 1 root root 658 Jul 11 16:12 fstab [root@worker232 ~]# [root@worker232 ~]# umount /mnt [root@worker232 ~]# [root@worker232 ~]# df -h | grep mnt [root@worker232 ~]# 2.6 客户端work233验证测试 [root@worker233 ~]# mount -t nfs 10.1.24.13:/yinzhengjie/data/nfs-server /opt/ [root@worker233 ~]# [root@worker233 ~]# df -h | grep opt 10.0.0.231:/yinzhengjie/data/nfs-server 48G 11G 35G 24% /opt [root@worker233 ~]# [root@worker233 ~]# ll /opt/ total 12 drwxr-xr-x 2 root root 4096 Jul 11 16:12 ./ drwxr-xr-x 22 root root 4096 Jul 11 16:10 ../ -rw-r--r-- 1 root root 658 Jul 11 16:12 fstab [root@worker233 ~]# [root@worker233 ~]# [root@worker233 ~]# umount /opt [root@worker233 ~]# [root@worker233 ~]# df -h | grep opt [root@worker233 ~]# 3.k8s使用nfs存储卷案例 3.1 编写资源清单 [root@master231 pods]# cat 18-pods-xiuxian-nfs.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-apps-v1 labels: apps: xiuxian spec: volumes: - name: data # 指定存储卷 nfs: # NFS服务器 server: 10.1.24.13 # nfs的共享路径 path: /yinzhengjie/data/nfs-server nodeName: worker232 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 volumeMounts: - name: data mountPath: /usr/share/nginx/html/ --- apiVersion: v1 kind: Pod metadata: name: xiuxian-apps-v2 labels: apps: xiuxian spec: volumes: - name: data nfs: server: 10.1.24.13 path: /yinzhengjie/data/nfs-server nodeName: worker233 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 volumeMounts: - name: data mountPath: /usr/share/nginx/html/ # 实现效果:worker232 和 worker233 上的两个 Pod(xiuxian-apps-v1 和 xiuxian-apps-v2)均将容器内的 /usr/share/ # nginx/html/ 目录挂载到 同一个 NFS 服务器路径(10.0.0.231:/yinzhengjie/data/nfs-server 3.2 创建测试 [root@master231 pods]# kubectl apply -f 18-pods-xiuxian-nfs.yaml pod/xiuxian-apps-v1 created pod/xiuxian-apps-v2 created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-apps-v1 1/1 Running 0 5s 10.100.1.14 worker232 <none> <none> xiuxian-apps-v2 1/1 Running 0 5s 10.100.2.23 worker233 <none> <none> [root@master231 pods]# [root@master231 pods]# curl 10.100.1.14 <html> <head><title>403 Forbidden</title></head> <body> <center><h1>403 Forbidden</h1></center> <hr><center>nginx/1.20.1</center> </body> </html> [root@master231 pods]# [root@master231 pods]# curl 10.100.2.23 <html> <head><title>403 Forbidden</title></head> <body> <center><h1>403 Forbidden</h1></center> 
<hr><center>nginx/1.20.1</center> </body> </html> [root@master231 pods]# 3.3 准备数据 [root@master231 pods]# echo www.weixiang.com > /yinzhengjie/data/nfs-server/index.html [root@master231 pods]# [root@master231 pods]# curl 10.100.1.14 www.weixiang.com [root@master231 pods]# [root@master231 pods]# curl 10.100.2.23 www.weixiang.com [root@master231 pods]# [root@master231 pods]# 3.4 验证数据 [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-apps-v1 1/1 Running 0 93s 10.100.1.14 worker232 <none> <none> xiuxian-apps-v2 1/1 Running 0 93s 10.100.2.23 worker233 <none> <none> [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-apps-v1 -- sh / # echo xixi > /usr/share/nginx/html/index.html / # [root@master231 pods]# [root@master231 pods]# curl 10.100.1.14 xixi [root@master231 pods]# [root@master231 pods]# curl 10.100.2.23 xixi [root@master231 pods]# [root@master231 pods]# kubectl exec -it xiuxian-apps-v2 -- sh / # echo haha > /usr/share/nginx/html/index.html / # [root@master231 pods]# [root@master231 pods]# curl 10.100.2.23 haha [root@master231 pods]# [root@master231 pods]# curl 10.100.1.14 haha [root@master231 pods]# 3.5 删除资源 [root@master231 pods]# kubectl delete -f 18-pods-xiuxian-nfs.yaml pod "xiuxian-apps-v1" deleted pod "xiuxian-apps-v2" deleted [root@master231 pods]#
4、三种存储卷区别
| 特性 | emptyDir | hostPath | NFS |
| --- | --- | --- | --- |
| 数据存储位置 | Pod 所在节点的内存或磁盘 | Pod 所在节点的指定路径 | 远程 NFS 服务器 |
| 生命周期 | 随 Pod 删除而销毁 | 与节点生命周期一致(持久化) | 持久化(除非手动删除) |
| 跨节点共享 | ❌ 仅限同一 Pod 内容器共享 | ❌ 仅限同一节点上的 Pod 共享 | ✅ 支持跨节点共享 |
| 性能 | 高(内存模式极快) | 高(本地磁盘 I/O) | 低(依赖网络带宽) |
| 配置复杂度 | 简单(无需预配置) | 中等(需确保主机路径存在) | 复杂(需部署 NFS 服务器) |
| 典型用途 | 临时缓存、容器间共享中间数据 | 访问主机文件、日志持久化 | 跨节点共享静态资源、持久化存储 |
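为了让上表更直观,下面给出一个把三种存储卷写在同一个Pod里的对比示意(仅为演示用的草案,卷名、挂载路径均为假设值):
bash
# 示意:同一个Pod中三种存储卷的声明方式对比
apiVersion: v1
kind: Pod
metadata:
  name: volumes-demo
spec:
  volumes:
  - name: cache               # emptyDir:数据随Pod删除而销毁
    emptyDir: {}
  - name: host-log            # hostPath:使用Pod所在节点的本地路径
    hostPath:
      path: /var/log
  - name: share               # nfs:跨节点共享的远程存储
    nfs:
      server: 10.1.24.13
      path: /yinzhengjie/data/nfs-server
  containers:
  - name: c1
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1
    volumeMounts:
    - name: cache
      mountPath: /cache
    - name: host-log
      mountPath: /hostlog
    - name: share
      mountPath: /share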

7、使用k8s部署wordpress

bash
使用k8s部署wordpress,要求如下: - mysql部署到worker233节点; - wordpress部署到worker232节点,且windows可以正常访问wordpress; - 测试验证,删除MySQL和WordPress容器后数据不丢失,实现秒级恢复。 1.准备工作目录 [root@master231 ~]# rm -f /yinzhengjie/data/nfs-server/* [root@master231 ~]# [root@master231 ~]# ll /yinzhengjie/data/nfs-server/ total 8 drwxr-xr-x 2 root root 4096 Jul 11 16:50 ./ drwxr-xr-x 3 root root 4096 Jul 11 16:10 ../ [root@master231 ~]# [root@master231 ~]# mkdir -pv /yinzhengjie/data/nfs-server/case-demo/wordpres/{wp,db} mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo' mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/wordpres' mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/wordpres/wp' mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/wordpres/db' [root@master231 ~]# 2.编写资源清单 [root@master231 pods]# cat 19-pods-mysql-wordpress-nfs.yaml apiVersion: v1 kind: Pod metadata: name: wordpress-db labels: apps: db spec: volumes: - name: data nfs: server: 10.1.24.13 path: /yinzhengjie/data/nfs-server/case-demo/wordpres/db nodeName: worker233 hostNetwork: true containers: - name: db image: harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle env: - name: MYSQL_ROOT_PASSWORD value: "123456" - name: MYSQL_DATABASE value: "wordpress" - name: MYSQL_USER value: weixiang98 - name: MYSQL_PASSWORD value: weixiang volumeMounts: - name: data mountPath: /var/lib/mysql --- apiVersion: v1 kind: Pod metadata: name: wordpress-wp labels: apps: wp spec: volumes: - name: data nfs: server: 10.1.24.13 path: /yinzhengjie/data/nfs-server/case-demo/wordpres/wp hostNetwork: true nodeName: worker232 containers: - name: wp image: harbor250.weixiang.com/weixiang-wp/wordpress:6.7.1-php8.1-apache env: - name: WORDPRESS_DB_HOST value: "10.1.24.4" - name: WORDPRESS_DB_NAME value: "wordpress" - name: WORDPRESS_DB_USER value: weixiang98 - name: WORDPRESS_DB_PASSWORD value: weixiang volumeMounts: - name: data mountPath: /var/www/html 3.创建资源 [root@master231 pods]# kubectl apply -f 19-pods-mysql-wordpress-nfs.yaml pod/wordpress-db created pod/wordpress-wp created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES wordpress-db 1/1 Running 0 5s 10.0.0.233 worker233 <none> <none> wordpress-wp 1/1 Running 0 5s 10.0.0.232 worker232 <none> <none> [root@master231 pods]# 4.访问测试并发表文章 http://43.139.47.66/ 5.删除容器测试验证 5.1 删除数据库容器发现数据不丢失 [root@worker233 ~]# docker ps -a | grep wordpress-db 637bc42d4de2 f5f171121fa3 "docker-entrypoint.s…" 4 minutes ago Up 4 minutes k8s_db_wordpress-db_default_5a662d1b-77ed-49d1-8168-498b90cfa360_0 dd5884226070 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 4 minutes ago Up 4 minutes k8s_POD_wordpress-db_default_5a662d1b-77ed-49d1-8168-498b90cfa360_0 [root@worker233 ~]# [root@worker233 ~]# docker rm -f 637bc42d4de2 637bc42d4de2 [root@worker233 ~]# [root@worker233 ~]# docker ps -a | grep wordpress-db c55cec1ef17d f5f171121fa3 "docker-entrypoint.s…" 1 second ago Up 1 second k8s_db_wordpress-db_default_5a662d1b-77ed-49d1-8168-498b90cfa360_1 dd5884226070 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 4 minutes ago Up 4 minutes k8s_POD_wordpress-db_default_5a662d1b-77ed-49d1-8168-498b90cfa360_0 [root@worker233 ~]# 5.2 删除WordPress容器发现数据不丢失 [root@worker232 ~]# docker ps -a | grep wordpress-wp b3dd17bb94fb 13ffff361078 "docker-entrypoint.s…" 5 minutes ago Up 5 minutes k8s_wp_wordpress-wp_default_06bda2ad-91dc-4212-a236-0e9f20010c41_0 93e2024e8fa9 
registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 5 minutes ago Up 5 minutes k8s_POD_wordpress-wp_default_06bda2ad-91dc-4212-a236-0e9f20010c41_0 5934470870ac registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 10 minutes ago Exited (0) 9 minutes ago k8s_POD_wordpress-wp_default_d7a27bd6-3feb-4ad2-86aa-8ece34cf1033_0 [root@worker232 ~]# [root@worker232 ~]# docker rm -f b3dd17bb94fb b3dd17bb94fb [root@worker232 ~]# [root@worker232 ~]# docker ps -a | grep wordpress-wp 51f380baa48b 13ffff361078 "docker-entrypoint.s…" 1 second ago Up Less than a second k8s_wp_wordpress-wp_default_06bda2ad-91dc-4212-a236-0e9f20010c41_1 93e2024e8fa9 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 5 minutes ago Up 5 minutes k8s_POD_wordpress-wp_default_06bda2ad-91dc-4212-a236-0e9f20010c41_0 5934470870ac registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 10 minutes ago Exited (0) 9 minutes ago k8s_POD_wordpress-wp_default_d7a27bd6-3feb-4ad2-86aa-8ece34cf1033_0 [root@worker232 ~]# 5.3 删除Pod并重新创建 [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES wordpress-db 1/1 Running 1 6m58s 10.0.0.233 worker233 <none> <none> wordpress-wp 1/1 Running 1 6m58s 10.0.0.232 worker232 <none> <none> [root@master231 pods]# [root@master231 pods]# kubectl delete -f 19-pods-mysql-wordpress-nfs.yaml pod "wordpress-db" deleted pod "wordpress-wp" deleted [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide No resources found in default namespace. [root@master231 pods]# [root@master231 pods]# kubectl apply -f 19-pods-mysql-wordpress-nfs.yaml pod/wordpress-db created pod/wordpress-wp created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES wordpress-db 1/1 Running 0 1s 10.0.0.233 worker233 <none> <none> wordpress-wp 1/1 Running 0 1s 10.0.0.232 worker232 <none> <none> [root@master231 pods]# 5.4 再次访问之前的数据 发现数据并没有丢失。
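补充:之所以删除容器、删除Pod后数据都不丢,是因为数据实际落在NFS服务端的共享目录里。可以在NFS服务端直接确认,示意如下:
bash
# 在master231(NFS服务端)查看共享目录,应能看到MySQL的数据文件和WordPress的程序文件
ls /yinzhengjie/data/nfs-server/case-demo/wordpres/db/
ls /yinzhengjie/data/nfs-server/case-demo/wordpres/wp/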

8、Pod的重启策略

bash
1.什么是重启策略 所谓的Pod重启策略其实就是Pod内容器退出时是否重新创建新的容器并启动。 这一点要和docker的重启策略区分开来,因为Pod的重启策略'重启'指的是创建新的容器。 2.常见的重启策略 - Always 无论容器是否正常,异常退出,始终重启容器。 - Never 无论容器是否正常,异常退出,始终不重启容器。 - OnFailure 容器如果正常退出,则不重启容器。 容器如果异常退出,则重启容器。 温馨提示: 如果未指定重启策略,则默认的重启策略是ALways。 3.实战案例 3.1 Never案例 [root@master231 pods]# cat 20-pods-restartPolicy-Never.yaml apiVersion: v1 kind: Pod metadata: name: pods-restartpolicy-never spec: # 指定容器的重启策略 restartPolicy: Never containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 command: - sleep - "10" [root@master231 pods]# [root@master231 pods]# kubectl apply -f 20-pods-restartPolicy-Never.yaml pod/pods-restartpolicy-never created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pods-restartpolicy-never 1/1 Running 0 6s 10.100.2.24 worker233 <none> <none> [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pods-restartpolicy-never 0/1 Completed 0 40s 10.100.2.24 worker233 <none> <none> [root@master231 pods]# # 验证 [root@worker233 ~]# docker ps -a |grep pods-restartpolicy-never 3.2 OnFailure案例 [root@master231 pods]# cat 21-pods-restartPolicy-OnFailure.yaml apiVersion: v1 kind: Pod metadata: name: pods-restartpolicy-onfailure spec: # 指定容器的重启策略 restartPolicy: OnFailure containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 command: - sleep - "10d" [root@master231 pods]# [root@master231 pods]# kubectl apply -f 21-pods-restartPolicy-OnFailure.yaml pod/pods-restartpolicy-onfailure created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pods-restartpolicy-onfailure 1/1 Running 0 3s 10.100.2.26 worker233 <none> <none> [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pods-restartpolicy-onfailure 1/1 Running 0 25s 10.100.2.26 worker233 <none> <none> [root@master231 pods]# 去相应的worker节点kill容器,发现会自动重启 [root@worker233 ~]# docker ps -a | grep pods-restartpolicy-onfailure 86aae01618b3 f28fd43be4ad "sleep 10d" About a minute ago Up About a minute k8s_c1_pods-restartpolicy-onfailure_default_c6d974a2-b77b-492b-a4f5-9aad2912abb6_0 786dc0c0adad registry.aliyuncs.com/google_containers/pause:3.6 "/pause" About a minute ago Up About a minute k8s_POD_pods-restartpolicy-onfailure_default_c6d974a2-b77b-492b-a4f5-9aad2912abb6_0 [root@worker233 ~]# [root@worker233 ~]# docker kill 86aae01618b3 86aae01618b3 [root@worker233 ~]# [root@worker233 ~]# docker ps -a | grep pods-restartpolicy-onfailure bcaba4f5c9f9 f28fd43be4ad "sleep 10d" 9 seconds ago Up 9 seconds k8s_c1_pods-restartpolicy-onfailure_default_c6d974a2-b77b-492b-a4f5-9aad2912abb6_1 86aae01618b3 f28fd43be4ad "sleep 10d" About a minute ago Exited (137) 9 seconds ago k8s_c1_pods-restartpolicy-onfailure_default_c6d974a2-b77b-492b-a4f5-9aad2912abb6_0 786dc0c0adad registry.aliyuncs.com/google_containers/pause:3.6 "/pause" About a minute ago Up About a minute k8s_POD_pods-restartpolicy-onfailure_default_c6d974a2-b77b-492b-a4f5-9aad2912abb6_0 [root@worker233 ~]# 接下来,在去master节点查看: [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pods-restartpolicy-onfailure 1/1 Running 1 (36s ago) 105s 10.100.2.26 worker233 <none> <none> [root@master231 pods]# 3.3 Always案例 [root@master231 pods]# cat 
22-pods-restartPolicy-Always.yaml apiVersion: v1 kind: Pod metadata: name: pods-restartpolicy-always spec: # 指定容器的重启策略 restartPolicy: Always containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 command: - sleep - "10d" [root@master231 pods]# [root@master231 pods]# kubectl apply -f 22-pods-restartPolicy-Always.yaml pod/pods-restartpolicy-always created [root@master231 pods]# [root@master231 pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pods-restartpolicy-always 1/1 Running 0 4s 10.100.2.27 worker233 <none> <none> [root@master231 pods]#
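补充:Always与OnFailure的核心区别在于容器正常退出(exit code = 0)时是否重启。可以把上面Always案例中的command改成短暂休眠(例如sleep 10)再观察:容器每次正常退出后仍会被拉起,RESTARTS不断增加,重启过于频繁时Pod可能进入CrashLoopBackOff状态。以下是验证思路的示意:
bash
# 将Always案例的command改为"sleep 10"后重新apply,并持续观察重启次数
kubectl apply -f 22-pods-restartPolicy-Always.yaml
kubectl get pods pods-restartpolicy-always -o wide -w
# 预期:即使容器以0状态码退出,RESTARTS也会持续增加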
| 重启策略 | 容器正常退出时(exit code = 0) | 容器异常退出时(exit code ≠ 0) | 典型应用场景 |
| --- | --- | --- | --- |
| Always | 重启 | 重启 | 需要持续运行的服务(如 Web 服务器) |
| Never | ❌ 不重启 | ❌ 不重启 | 一次性任务(如数据备份/迁移) |
| OnFailure | ❌ 不重启 | 重启 | 批处理任务(失败后需重试) |

9、K8s控制器

1、k8s常见的控制器之rc
bash
1.什么是rc控制器 rc控制器全称为"replicationcontrollers",rc的作用就是控制指定Pod副本数量始终存活。 当pod数量不足时,会根据模板来创建新的Pod。删除rc资源时会级联删除相应的pod。 rc是基于标签关联pod的。 2.实战案例 2.1 编写资源清单 [root@master231 replicationcontrollers]# cat 01-rc-xiuxian.yaml apiVersion: v1 kind: ReplicationController metadata: name: rc-xiuxian spec: # 指定Pod副本数量,若不指定,则默认为1副本。 replicas: 3 # rc控制器基于标签关联Pod selector: apps: v1 # 匹配Pod的标签 # 定义Pod的模板 template: metadata: name: pods-restartpolicy-always labels: apps: v1 # 必须与selector匹配 school: weixiang spec: restartPolicy: Always containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 command: - sleep - "10d" [root@master231 replicationcontrollers]# 2.2 创建资源 [root@master231 replicationcontrollers]# kubectl apply -f 01-rc-xiuxian.yaml replicationcontroller/rc-xiuxian created [root@master231 replicationcontrollers]# 2.3 查看资源 [root@master231 replicationcontrollers]# kubectl get pods -o wide -l apps=v1 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES rc-xiuxian-6hpng 1/1 Running 0 35s 10.100.1.18 worker232 <none> <none> rc-xiuxian-ltfhb 1/1 Running 0 35s 10.100.1.17 worker232 <none> <none> rc-xiuxian-z5nph 1/1 Running 0 35s 10.100.2.28 worker233 <none> <none> [root@master231 replicationcontrollers]# 2.4 删除所有的Pod会自动创建新的Pod [root@master231 replicationcontrollers]# kubectl delete pods --all pod "rc-xiuxian-6hpng" deleted pod "rc-xiuxian-ltfhb" deleted pod "rc-xiuxian-z5nph" deleted [root@master231 replicationcontrollers]# [root@master231 replicationcontrollers]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES rc-xiuxian-2mjl9 1/1 Running 0 8s 10.100.2.31 worker233 <none> <none> rc-xiuxian-ht9zt 1/1 Running 0 8s 10.100.2.32 worker233 <none> <none> rc-xiuxian-qs652 1/1 Running 0 8s 10.100.1.20 worker232 <none> <none> [root@master231 replicationcontrollers]# 2.5 删除rc资源会自动删除pod【级联删除】 [root@master231 replicationcontrollers]# kubectl get rc NAME DESIRED CURRENT READY AGE rc-xiuxian 3 3 3 4m18s [root@master231 replicationcontrollers]# [root@master231 replicationcontrollers]# kubectl delete rc rc-xiuxian replicationcontroller "rc-xiuxian" deleted [root@master231 replicationcontrollers]# [root@master231 replicationcontrollers]# kubectl get pods NAME READY STATUS RESTARTS AGE rc-xiuxian-2mjl9 1/1 Terminating 0 2m24s rc-xiuxian-ht9zt 1/1 Terminating 0 2m24s rc-xiuxian-qs652 1/1 Terminating 0 2m24s [root@master231 replicationcontrollers]# [root@master231 replicationcontrollers]# kubectl get pods No resources found in default namespace. 
[root@master231 replicationcontrollers]# 课堂练习之redis案例 使用rc控制器部署5个redis服务副本。 参考镜像: http://192.168.21.253/Resources/Kubernetes/Case-Demo/weixiang-redis-6.0.5.tar.gz 1.导入镜像 [root@worker232 ~]# wget http://192.168.21.253/Resources/Kubernetes/Case-Demo/weixiang-redis-6.0.5.tar.gz [root@worker232 ~]# docker load -i weixiang-redis-6.0.5.tar.gz 2.推送镜像到harbor仓库 [root@worker232 ~]# docker tag redis:6.0.5 harbor250.weixiang.com/weixiang-db/redis:6.0.5 [root@worker232 ~]# docker push harbor250.weixiang.com/weixiang-db/redis:6.0.5 3.编写资源清单 [root@master231 replicationcontrollers]# cat 02-rc-redis.yaml apiVersion: v1 kind: ReplicationController metadata: name: rc-redis spec: # 指定副本的数量 replicas: 5 # rc控制器基于标签关联Pod selector: apps: redis template: metadata: labels: apps: redis spec: containers: - name: redis-server image: harbor250.weixiang.com/weixiang-redis/redis:6.0.5 [root@master231 replicationcontrollers]# 4.创建资源 [root@master231 replicationcontrollers]# kubectl apply -f 02-rc-redis.yaml replicationcontroller/rc-redis created [root@master231 replicationcontrollers]# [root@master231 replicationcontrollers]# kubectl get pods -l apps=redis NAME READY STATUS RESTARTS AGE rc-redis-6jqvh 1/1 Running 0 11s rc-redis-9qdkn 1/1 Running 0 11s rc-redis-klvw5 1/1 Running 0 11s rc-redis-rxbgt 1/1 Running 0 11s rc-redis-xqlx9 1/1 Running 0 11s [root@master231 replicationcontrollers]# 5.测试验证 [root@master231 replicationcontrollers]# kubectl exec -it rc-redis-6jqvh -- redis-cli 127.0.0.1:6379> KEYS * (empty array) 127.0.0.1:6379> set school weixiang OK 127.0.0.1:6379> KEYS * 1) "school" 127.0.0.1:6379> 127.0.0.1:6379> get school "weixiang" 127.0.0.1:6379> 6.删除rc资源 [root@master231 replicationcontrollers]# kubectl delete -f 02-rc-redis.yaml replicationcontroller "rc-redis" deleted [root@master231 replicationcontrollers]# [root@master231 replicationcontrollers]# kubectl get pods -l apps=redis No resources found in default namespace. [root@master231 replicationcontrollers]#
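补充:rc的副本数除了修改资源清单,也可以用kubectl scale在线调整,后面的rs、deploy同理。下面是以上文rc-redis为例的示意(需在资源未删除时执行):
bash
# 将rc-redis的副本数从5缩容到3,再扩容回5
kubectl scale rc rc-redis --replicas=3
kubectl get pods -l apps=redis
kubectl scale rc rc-redis --replicas=5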
2、k8s常见的控制器之rs
bash
1.什么是rs 所谓的rs全称为"replicasets",表示副本集,说白了,也是K8S集群用于控指定Pod副本数量的资源。 rs相比于rc更加轻量级,且功能更丰富。 2.实战案例 2.1 rs能够实现rc的功能 [root@master231 replicasets]# cat 01-rs-xiuxian.yaml apiVersion: apps/v1 kind: ReplicaSet metadata: name: rs-xiuxian spec: # 指定副本的数量 replicas: 3 # rc控制器基于标签关联Pod selector: # 基于标签匹配Pod matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 replicasets]# [root@master231 replicasets]# kubectl apply -f 01-rs-xiuxian.yaml replicaset.apps/rs-xiuxian created [root@master231 replicasets]# [root@master231 replicasets]# [root@master231 replicasets]# kubectl get pods -o wide -l apps=v1 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES rs-xiuxian-gb6sl 1/1 Running 0 6s 10.100.1.27 worker232 <none> <none> rs-xiuxian-hbxxd 1/1 Running 0 6s 10.100.2.38 worker233 <none> <none> rs-xiuxian-lncz2 1/1 Running 0 6s 10.100.1.26 worker232 <none> <none> [root@master231 replicasets]# [root@master231 replicasets]# kubectl delete pods -l apps=v1 pod "rs-xiuxian-gb6sl" deleted pod "rs-xiuxian-hbxxd" deleted pod "rs-xiuxian-lncz2" deleted [root@master231 replicasets]# [root@master231 replicasets]# kubectl get pods -o wide -l apps=v1 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES rs-xiuxian-srq46 1/1 Running 0 3s 10.100.2.39 worker233 <none> <none> rs-xiuxian-st48r 1/1 Running 0 3s 10.100.1.28 worker232 <none> <none> rs-xiuxian-z69bg 1/1 Running 0 3s 10.100.2.40 worker233 <none> <none> [root@master231 replicasets]# [root@master231 replicasets]# kubectl delete -f 01-rs-matchLabels-xiuxian.yaml replicaset.apps "rs-xiuxian" deleted [root@master231 replicasets]# [root@master231 replicasets]# kubectl get pods -o wide No resources found in default namespace. 
[root@master231 replicasets]# 2.2 rs优于rc的功能 [root@master231 replicasets]# kubectl run xiuxian01 --image=harbor250.weixiang.com/weixiang-xiuxian/apps:v1 -l apps=v1 pod/xiuxian01 created [root@master231 replicasets]# [root@master231 replicasets]# kubectl run xiuxian02 --image=harbor250.weixiang.com/weixiang-xiuxian/apps:v2 -l apps=v2 pod/xiuxian02 created [root@master231 replicasets]# [root@master231 replicasets]# kubectl get pods -o wide --show-labels NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS xiuxian01 1/1 Running 0 11s 10.100.2.42 worker233 <none> <none> apps=v1 xiuxian02 1/1 Running 0 5s 10.100.1.30 worker232 <none> <none> apps=v2 [root@master231 replicasets]# [root@master231 replicasets]# [root@master231 replicasets]# cat 02-rc-matchExpressions-xiuxian.yaml apiVersion: apps/v1 kind: ReplicaSet metadata: name: rs-xiuxain-matchexpressions spec: replicas: 5 # 需要维持的 Pod 副本数量 selector: # 定义标签表达式 matchExpressions: # 指定标签的key - key: apps # 指定标签的value values: - v1 - v2 - v3 # 指定key和value之间的关系,有效值为: In, NotIn, Exists and DoesNotExist # In: # 表示key的值必须在values中任意一个。 # NotIn: # 和In相反,说白了,就是key的值不能再values定义的列表中。 # Exists: # 表示存在key,value任意,因此可以省略不写。 # DoesNotExist: # 表示不存在key,value任意,因此可以省略不写。 operator: In # 操作符:标签值必须在指定列表中 template: metadata: labels: apps: v3 # 这个是指定标签,也就是--show-labels能查出来的内容 spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 [root@master231 replicasets]# [root@master231 replicasets]# [root@master231 replicasets]# kubectl apply -f 02-rc-matchExpressions-xiuxian.yaml replicaset.apps/rs-xiuxain-matchexpressions created [root@master231 replicasets]# [root@master231 replicasets]# kubectl get pods -o wide --show-labels NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS rs-xiuxain-matchexpressions-vthpd 1/1 Running 0 6s 10.100.2.44 worker233 <none> <none> apps=v3 rs-xiuxain-matchexpressions-vzff7 1/1 Running 0 6s 10.100.1.31 worker232 <none> <none> apps=v3 rs-xiuxain-matchexpressions-w7vps 1/1 Running 0 6s 10.100.2.43 worker233 <none> <none> apps=v3 xiuxian01 1/1 Running 0 78s 10.100.2.42 worker233 <none> <none> apps=v1 xiuxian02 1/1 Running 0 72s 10.100.1.30 worker232 <none> <none> apps=v2 [root@master231 replicasets]# [root@master231 replicasets]# [root@master231 replicasets]# kubectl delete -f 02-rc-matchExpressions-xiuxian.yaml replicaset.apps "rs-xiuxain-matchexpressions" deleted [root@master231 replicasets]# [root@master231 replicasets]# kubectl get pods -o wide --show-labels No resources found in default namespace. [root@master231 replicasets]# # labels: apps: v3就是创建LABELS的标签,跟image没关系,image只是用这个镜像创建容器
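补充:上面注释中提到的Exists、DoesNotExist操作符只判断标签key是否存在,不需要values字段。下面是一个使用Exists的ReplicaSet示意(名称、副本数为演示用的假设值;注意Exists会把环境中所有带apps标签的已有Pod一并纳入管理,实验时建议保持环境干净):
bash
# 示意:只要Pod带有apps这个标签key(值任意)就会被该rs关联
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: rs-xiuxian-exists
spec:
  replicas: 2
  selector:
    matchExpressions:
    - key: apps          # 只要求存在apps标签,因此无需values字段
      operator: Exists
  template:
    metadata:
      labels:
        apps: demo
    spec:
      containers:
      - name: c1
        image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1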
3、RC与RS的区别
bash
- 选择器:rc 只支持等值匹配(key=value),只要 Pod 带有 selector 中指定的全部标签就会被选中(Pod 上额外的标签不影响匹配);rs 更灵活,matchExpressions 支持 In、NotIn、Exists、DoesNotExist 等操作符。
- apiVersion:rc 是 v1,rs 是 apps/v1。
- rs 是 rc 的替代品,实际工作中一般不直接创建 rs,而是通过下文的 Deployment 间接使用。
注意点
bash
- rc和rs都不支持声明式更新 1.什么是声明式更新 就是在应用资源清单后,自动更新Pod服务。 rc和rs都需要运维人员参与,才能实现自动更新。 2.实战案例 2.1 验证rs [root@master231 replicasets]# kubectl get rs -o yaml | grep "image:" - image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 [root@master231 replicasets]# [root@master231 replicasets]# kubectl apply -f 02-rc-matchExpressions-xiuxian.yaml replicaset.apps/rs-xiuxain-matchexpressions configured [root@master231 replicasets]# [root@master231 replicasets]# kubectl get rs -o yaml | grep "image:" - image: harbor250.weixiang.com/weixiang-xiuxian/apps:v2 [root@master231 replicasets]# [root@master231 replicasets]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES rs-xiuxain-matchexpressions-4b79z 1/1 Running 0 73s 10.100.2.46 worker233 <none> <none> rs-xiuxain-matchexpressions-5sl6d 1/1 Running 0 73s 10.100.2.45 worker233 <none> <none> rs-xiuxain-matchexpressions-8jxsc 1/1 Running 0 73s 10.100.2.47 worker233 <none> <none> rs-xiuxain-matchexpressions-rbrc8 1/1 Running 0 73s 10.100.1.33 worker232 <none> <none> rs-xiuxain-matchexpressions-xh4wt 1/1 Running 0 73s 10.100.1.32 worker232 <none> <none> # 得出来的结果是v3 [root@master231 replicasets]# curl 10.100.2.45 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v3</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: pink">凡人修仙传 v3 </h1> <div> <img src="3.jpg"> <div> </body> </html> # 修改配置文件为v2

image

bash
# 虽然查看结果是v2 [root@master231 replicasets]# kubectl get rs -o yaml | grep "image:" - image: harbor250.weixiang.com/weixiang-xiuxian/apps:v2 # 但是curl出来的结果依旧是v3 [root@master231 replicasets]# curl 10.100.1.116 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v3</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: pink">凡人修仙传 v3 </h1> <div> <img src="3.jpg"> <div> </body> </html> # 手动删除后才能实现更新 [root@master231 replicasets]# kubectl delete pods --all pod "rs-xiuxain-matchexpressions-4b79z" deleted pod "rs-xiuxain-matchexpressions-5sl6d" deleted pod "rs-xiuxain-matchexpressions-8jxsc" deleted pod "rs-xiuxain-matchexpressions-rbrc8" deleted pod "rs-xiuxain-matchexpressions-xh4wt" deleted [root@master231 replicasets]# [root@master231 replicasets]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES rs-xiuxain-matchexpressions-2j8zd 1/1 Running 0 6s 10.100.1.35 worker232 <none> <none> rs-xiuxain-matchexpressions-7r8zq 1/1 Running 0 6s 10.100.2.48 worker233 <none> <none> rs-xiuxain-matchexpressions-fm24z 1/1 Running 0 6s 10.100.1.36 worker232 <none> <none> rs-xiuxain-matchexpressions-v8phc 1/1 Running 0 6s 10.100.1.34 worker232 <none> <none> rs-xiuxain-matchexpressions-wfjxk 1/1 Running 0 6s 10.100.2.49 worker233 <none> <none> [root@master231 replicasets]# # 查看结果是v2 [root@master231 replicasets]# curl 10.100.2.48 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v2</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: red">凡人修仙传 v2 </h1> <div> <img src="2.jpg"> <div> </body> </html> 2.2 验证rc [root@master231 replicationcontrollers]# kubectl get rc rc-xiuxian -o yaml | grep "image: " image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 replicationcontrollers]# [root@master231 replicationcontrollers]# kubectl edit rc rc-xiuxian replicationcontroller/rc-xiuxian edited [root@master231 replicationcontrollers]# [root@master231 replicationcontrollers]# kubectl get rc rc-xiuxian -o yaml | grep "image: " # rc的配置已经更新了 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v2 [root@master231 replicationcontrollers]# [root@master231 replicationcontrollers]# kubectl get pods -o yaml | grep "image:" | sort | uniq # pod还是之前的老版本镜像 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 replicationcontrollers]# [root@master231 replicationcontrollers]# kubectl delete pods --all # 手动删除pod后 pod "rc-xiuxian-8g2th" deleted pod "rc-xiuxian-969nv" deleted pod "rc-xiuxian-j8288" deleted [root@master231 replicationcontrollers]# [root@master231 replicationcontrollers]# kubectl get pods -o yaml | grep "image:" | sort | uniq # 新创建的pod变成了v2版本 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v2 [root@master231 replicationcontrollers]#

4、k8s常见的控制器之deploy
bash
# deploy实现声明式更新,核心作用可总结为:通过声明式配置,自动化管理Pod的部署、更新、回滚和扩缩容,确保应用始终以期望的状态运行 # 级联删除,删除控制器,语法kubectl delete deployment 资源名称 [root@master231 deployments]# kubectl delete deployment rs-xiuxain-matchexpressions 1.什么是deploy 所谓的deploy全称为"deployments",顾名思义,就是用来部署服务如(Web服务、API服务等)。 通常用deployments来部署一些有状态,无状态服务。也是工作中用到最多的控制器。 deployments并不像rs,rc那样直接控制Pod副本,而是底层调用了rs控制器来完成pod副本的控制。 相比于rs,rc而言,deployments还有一个非常重要的特性,就是支持声明式更新,自动实现滚动更新策略。 2.实战案例 2.1 支持标签匹配 [root@master231 deployments]# cat 01-deploy-matchLabels-xiuxian.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 deployments]# [root@master231 deployments]# kubectl apply -f 01-deploy-matchLabels-xiuxian.yaml deployment.apps/deploy-xiuxian created [root@master231 deployments]# [root@master231 deployments]# kubectl get deploy,rs,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-xiuxian 3/3 3 3 10s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps=v1 NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-xiuxian-7b574d64b 3 3 3 10s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps=v1,pod-template-hash=7b574d64b NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-xiuxian-7b574d64b-4phkx 1/1 Running 0 10s 10.100.1.40 worker232 <none> <none> pod/deploy-xiuxian-7b574d64b-kh4jn 1/1 Running 0 10s 10.100.1.41 worker232 <none> <none> pod/deploy-xiuxian-7b574d64b-tnwjm 1/1 Running 0 10s 10.100.2.53 worker233 <none> <none> [root@master231 deployments]# kubectl delete -f 01-deploy-matchLabels-xiuxian.yaml deployment.apps "deploy-xiuxian" deleted [root@master231 deployments]# kubectl get deploy,rs,po -o wide No resources found in default namespace. # 这张图足以说明deploy底层是rs 1、命名规则:所有ReplicaSet的名称都包含Deployment的全名(rs-xiuxain-matchexpressions)并追加一个唯一哈希值(如 77b4d9ccf9)。 关键点:这种命名规则是Deployment自动生成的,明确表明这些ReplicaSet是由该Deployment创建和管理的。 2、Deployment 的选择器:apps in (v1,v2,v3)(宽泛匹配)。 ReplicaSet 的选择器:在Deployment的选择器基础上追加了唯一哈希标签(pod-template-hash=77b4d9ccf9)。 ReplicaSet 的选择器是Deployment选择器的子集,确保ReplicaSet管理的Pod一定属于该Deployment。 3、ReplicaSet的副本数由Deployment控制,都是5片

image

bash
2.2 支持标签表达式 [root@master231 deployments]# cat 02-deploy-matchExpressions-xiuxian.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxain-matchexpressions spec: replicas: 5 selector: matchExpressions: - key: apps values: - v1 - v2 - v3 operator: In template: metadata: labels: apps: v3 spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 deployments]# [root@master231 deployments]# [root@master231 deployments]# kubectl apply -f 02-deploy-matchExpressions-xiuxian.yaml deployment.apps/deploy-xiuxain-matchexpressions created [root@master231 deployments]# [root@master231 deployments]# kubectl get deploy,rs,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-xiuxain-matchexpressions 5/5 5 5 10s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps in (v1,v2,v3) NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-xiuxain-matchexpressions-79bd78ffb4 5 5 5 10s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps in (v1,v2,v3),pod-template-hash=79bd78ffb4 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-xiuxain-matchexpressions-79bd78ffb4-6b7sb 1/1 Running 0 10s 10.100.2.54 worker233 <none> <none> pod/deploy-xiuxain-matchexpressions-79bd78ffb4-fh75g 1/1 Running 0 10s 10.100.1.42 worker232 <none> <none> pod/deploy-xiuxain-matchexpressions-79bd78ffb4-s2cxb 1/1 Running 0 10s 10.100.2.56 worker233 <none> <none> pod/deploy-xiuxain-matchexpressions-79bd78ffb4-t9gcz 1/1 Running 0 10s 10.100.1.43 worker232 <none> <none> pod/deploy-xiuxain-matchexpressions-79bd78ffb4-vjrj5 1/1 Running 0 10s 10.100.2.55 worker233 <none> <none> [root@master231 deployments]# [root@master231 deployments]# [root@master231 deployments]# kubectl delete -f 02-deploy-matchExpressions-xiuxian.yaml deployment.apps "deploy-xiuxain-matchexpressions" deleted [root@master231 deployments]# [root@master231 deployments]# kubectl get deploy,rs,po -o wide No resources found in default namespace. 
[root@master231 deployments]# [root@master231 deployments]# 3.验证Deploy支持声明式更新 3.1 部署旧的服务 [root@master231 deployments]# kubectl apply -f 01-deploy-matchLabels-xiuxian.yaml deployment.apps/deploy-xiuxian created [root@master231 deployments]# [root@master231 deployments]# kubectl get deploy,rs,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-xiuxian 3/3 3 3 19s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps=v1 NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-xiuxian-7b574d64b 3 3 3 19s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps=v1,pod-template-hash=7b574d64b NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-xiuxian-7b574d64b-6v8p8 1/1 Running 0 19s 10.100.2.57 worker233 <none> <none> pod/deploy-xiuxian-7b574d64b-cxb6z 1/1 Running 0 19s 10.100.1.44 worker232 <none> <none> pod/deploy-xiuxian-7b574d64b-xlcqg 1/1 Running 0 19s 10.100.2.58 worker233 <none> <none> [root@master231 deployments]# [root@master231 deployments]# curl 10.100.2.57 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 deployments]# 3.2 更新服务 [root@master231 deployments]# grep image 01-deploy-matchLabels-xiuxian.yaml image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 deployments]# [root@master231 deployments]# sed -i '/image/s#v1#v2#' 01-deploy-matchLabels-xiuxian.yaml [root@master231 deployments]# [root@master231 deployments]# grep image 01-deploy-matchLabels-xiuxian.yaml image: harbor250.weixiang.com/weixiang-xiuxian/apps:v2 [root@master231 deployments]# [root@master231 deployments]# kubectl apply -f 01-deploy-matchLabels-xiuxian.yaml deployment.apps/deploy-xiuxian configured [root@master231 deployments]# 3.3 测试验证 [root@master231 deployments]# kubectl get deploy,rs,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-xiuxian 3/3 3 3 90s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v2 apps=v1 NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-xiuxian-55d9fd6bcf 3 3 3 5s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v2 apps=v1,pod-template-hash=55d9fd6bcf replicaset.apps/deploy-xiuxian-7b574d64b 0 0 0 90s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps=v1,pod-template-hash=7b574d64b NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-xiuxian-55d9fd6bcf-7972d 1/1 Running 0 5s 10.100.1.45 worker232 <none> <none> pod/deploy-xiuxian-55d9fd6bcf-9hlqk 1/1 Running 0 3s 10.100.2.60 worker233 <none> <none> pod/deploy-xiuxian-55d9fd6bcf-x6m89 1/1 Running 0 4s 10.100.2.59 worker233 <none> <none> [root@master231 deployments]# [root@master231 deployments]# curl 10.100.1.45 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v2</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: red">凡人修仙传 v2 </h1> <div> <img src="2.jpg"> <div> </body> </html> [root@master231 deployments]# 3.4 验证回滚 [root@master231 deployments]# grep image 01-deploy-matchLabels-xiuxian.yaml image: harbor250.weixiang.com/weixiang-xiuxian/apps:v2 [root@master231 deployments]# [root@master231 deployments]# sed -i '/image/s#v2#v1#' 01-deploy-matchLabels-xiuxian.yaml [root@master231 deployments]# [root@master231 
deployments]# grep image 01-deploy-matchLabels-xiuxian.yaml image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 deployments]# [root@master231 deployments]# kubectl apply -f 01-deploy-matchLabels-xiuxian.yaml deployment.apps/deploy-xiuxian configured [root@master231 deployments]# [root@master231 deployments]# kubectl apply -f 01-deploy-matchLabels-xiuxian.yaml deployment.apps/deploy-xiuxian unchanged [root@master231 deployments]# kubectl apply -f 01-deploy-matchLabels-xiuxian.yaml deployment.apps/deploy-xiuxian unchanged [root@master231 deployments]# [root@master231 deployments]# kubectl get deploy,rs,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-xiuxian 3/3 3 3 3m16s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps=v1 NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-xiuxian-55d9fd6bcf 0 0 0 111s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v2 apps=v1,pod-template-hash=55d9fd6bcf replicaset.apps/deploy-xiuxian-7b574d64b 3 3 3 3m16s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps=v1,pod-template-hash=7b574d64b NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-xiuxian-7b574d64b-m8g4d 1/1 Running 0 14s 10.100.2.61 worker233 <none> <none> pod/deploy-xiuxian-7b574d64b-ncxlr 1/1 Running 0 13s 10.100.2.62 worker233 <none> <none> pod/deploy-xiuxian-7b574d64b-nqhsz 1/1 Running 0 15s 10.100.1.46 worker232 <none> <none> [root@master231 deployments]# [root@master231 deployments]# [root@master231 deployments]# curl 10.100.2.62 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 deployments]# 3.5 升级新版本 [root@master231 deployments]# grep image 01-deploy-matchLabels-xiuxian.yaml image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 deployments]# [root@master231 deployments]# sed -i '/image/s#v1#v3#' 01-deploy-matchLabels-xiuxian.yaml [root@master231 deployments]# [root@master231 deployments]# grep image 01-deploy-matchLabels-xiuxian.yaml image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 [root@master231 deployments]# [root@master231 deployments]# kubectl apply -f 01-deploy-matchLabels-xiuxian.yaml deployment.apps/deploy-xiuxian configured [root@master231 deployments]# [root@master231 deployments]# kubectl get deploy,rs,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-xiuxian 3/3 3 3 5m50s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v3 apps=v1 NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-xiuxian-55d9fd6bcf 0 0 0 4m25s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v2 apps=v1,pod-template-hash=55d9fd6bcf replicaset.apps/deploy-xiuxian-7b574d64b 0 0 0 5m50s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps=v1,pod-template-hash=7b574d64b replicaset.apps/deploy-xiuxian-8445b8c95b 3 3 3 3s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v3 apps=v1,pod-template-hash=8445b8c95b NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-xiuxian-7b574d64b-ncxlr 1/1 Terminating 0 2m47s 10.100.2.62 worker233 <none> <none> pod/deploy-xiuxian-8445b8c95b-ckl24 1/1 Running 0 2s 10.100.2.63 worker233 <none> <none> pod/deploy-xiuxian-8445b8c95b-jr4kn 1/1 Running 0 3s 10.100.1.47 worker232 <none> <none> 
pod/deploy-xiuxian-8445b8c95b-wbdt4 1/1 Running 0 1s 10.100.1.48 worker232 <none> <none> [root@master231 deployments]# 温馨提示: - 1.不难发现,当我们第一次使用新的镜像部署时,则会产生一个新的rs资源,如果该镜像之前部署过,则使用老的rs资源。 - 2.Deployment在部署服务时会产生rs资源,但是rs的资源数量上限默认是10个,超过10个时,系统会自动删除旧的rs,换句话说,就是Deploy只保留最近10个rs版本; - 3.如果需要修改Deploy保留rs的数量,则可以通过"Deployment.spec.revisionHistoryLimit"来设置,不指定则默认为10个。
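补充:除了像上面那样把清单改回旧镜像再apply,Deployment也自带回滚命令,历史版本正是保存在那些旧的rs里。以下为以deploy-xiuxian为例的示意:
bash
# 查看Deployment的历史版本(对应保留下来的旧rs)
kubectl rollout history deployment/deploy-xiuxian

# 回滚到上一个版本;也可以用--to-revision指定回滚到某个具体版本号
kubectl rollout undo deployment/deploy-xiuxian
kubectl rollout undo deployment/deploy-xiuxian --to-revision=1

# 查看滚动更新/回滚的进度
kubectl rollout status deployment/deploy-xiuxian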

1、Deploy部署RabbitMQ
bash
使用deployment控制器将RabbitMQ部署到K8S集群,并在windows界面能够正常访问其WebUI 参考镜像: http://192.168.21.253/Resources/Kubernetes/Case-Demo/weixiang-rabbitmq-v4.1.2-management-alpine.tar.gz 1.导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Kubernetes/Case-Demo/weixiang-rabbitmq-v4.1.2-management-alpine.tar.gz [root@worker233 ~]# docker load -i weixiang-rabbitmq-v4.1.2-management-alpine.tar.gz 2.推送镜像到harbor仓库 [root@worker233 ~]# docker tag rabbitmq:4.1.2-management-alpine harbor250.weixiang.com/weixiang-mq/rabbitmq:4.1.2-management-alpine [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-mq/rabbitmq:4.1.2-management-alpine 3.k8s编写资源清单 [root@master231 deployments]# cat 03-deploy-RabbitMQ.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-rabbitmq spec: replicas: 1 selector: matchLabels: apps: rabbitmq template: metadata: labels: apps: rabbitmq spec: containers: - name: rabbitmq image: harbor250.weixiang.com/weixiang-mq/rabbitmq:4.1.2-management-alpine ports: - containerPort: 15672 name: webui [root@master231 deployments]# 4.创建资源 [root@master231 deployments]# kubectl apply -f 03-deploy-RabbitMQ.yaml deployment.apps/deploy-rabbitmq created [root@master231 deployments]# [root@master231 deployments]# kubectl get deploy,rs,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-rabbitmq 1/1 1 1 8s rabbitmq harbor250.weixiang.com/weixiang-mq/rabbitmq:4.1.2-management-alpine apps=rabbitmq NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-rabbitmq-5684db7b8c 1 1 1 8s rabbitmq harbor250.weixiang.com/weixiang-mq/rabbitmq:4.1.2-management-alpine apps=rabbitmq,pod-template-hash=5684db7b8c NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-rabbitmq-5684db7b8c-5nlb8 1/1 Running 0 8s 10.100.2.64 worker233 <none> <none> [root@master231 deployments]# 5.暴露服务 [root@master231 deployments]# kubectl port-forward deploy/deploy-rabbitmq --address=0.0.0.0 8080:15672 Forwarding from 0.0.0.0:8080 -> 15672 # deploy/deploy-rabbitmq <资源类型>/<资源名称> 6.访问测试 http://10.0.0.231:8080/ 默认的用户名和密码为: "guest" 7.删除资源 [root@master231 deployments]# kubectl delete -f 03-deploy-RabbitMQ.yaml deployment.apps "deploy-rabbitmq" deleted [root@master231 deployments]#
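温馨提示:kubectl port-forward是前台进程,终端退出后转发随之失效,只适合临时调试。若需要相对持久地暴露WebUI,可以参考后文Service章节的NodePort方式,下面是基于kubectl expose的一个示意(Service名称为假设值):
bash
# 为deploy-rabbitmq创建一个NodePort类型的Service,把15672端口暴露到各节点
kubectl expose deployment deploy-rabbitmq --name=svc-rabbitmq --type=NodePort --port=15672 --target-port=15672

# 查看分配到的NodePort,然后用"节点IP:NodePort"在浏览器访问WebUI
kubectl get svc svc-rabbitmq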
5、k8s常见的控制器之jobs
bash
- k8s常见的控制器中jobs 1.什么是jobs 所谓的jobs就是k8s集群用于处理一次性任务的控制器。 2.实战案例 2.1 导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Kubernetes/Case-Demo/weixiang-perl.5.34.tar.gz [root@worker233 ~]# docker load -i weixiang-perl.5.34.tar.gz 2.2 推送镜像到harbor仓库 [root@worker233 ~]# docker tag perl:5.34.0 harbor250.weixiang.com/weixiang-perl/perl:5.34.0 [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-perl/perl:5.34.0 2.3 编写资源清单 [root@master231 jobs]# cat 01-jobs-pi.yaml apiVersion: batch/v1 kind: Job metadata: name: jobs-pi spec: # 定义Pod的模板 template: spec: containers: - name: pi # image: perl:5.34.0 image: harbor250.weixiang.com/weixiang-perl/perl:5.34.0 command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"] restartPolicy: Never # 如果容器执行失败,则重试的次数,重试会重新创建新的Pod backoffLimit: 4 [root@master231 jobs]# 2.4 创建资源 [root@master231 jobs]# kubectl apply -f 01-jobs-pi.yaml job.batch/jobs-pi created [root@master231 jobs]# [root@master231 jobs]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES jobs-pi-cp5wz 1/1 Running 0 3s 10.100.2.66 worker233 <none> <none> [root@master231 jobs]# [root@master231 jobs]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES jobs-pi-cp5wz 0/1 Completed 0 14s 10.100.2.66 worker233 <none> <none> # 一次性job运行成功会显示Completed 2.5 查看日志 [root@master231 jobs]# kubectl logs jobs-pi-cp5wz 3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067982148 08651328230664709384460955058223172535940812848111745028410270193852110555964462294895493038196442881097566 59334461284756482337867831652712019091456485669234603486104543266482133936072602491412737245870066063155881 74881520920962829254091715364367892590360011330530548820466521384146951941511609433057270365759591953092186 11738193261179310511854807446237996274956735188575272489122793818301194912983367336244065664308602139494639 52247371907021798609437027705392171762931767523846748184676694051320005681271452635608277857713427577896091 73637178721468440901224953430146549585371050792279689258923542019956112129021960864034418159813629774771309 96051870721134999999837297804995105973173281609631859502445945534690830264252230825334468503526193118817101 00031378387528865875332083814206171776691473035982534904287554687311595628638823537875937519577818577805321 71226806613001927876611195909216420198938095257201065485863278865936153381827968230301952035301852968995773 62259941389124972177528347913151557485724245415069595082953311686172785588907509838175463746493931925506040 09277016711390098488240128583616035637076601047101819429555961989467678374494482553797747268471040475346462 0804668425906949129331367702898915210475216205696602405803815019351125338243003558764024749647326391419927 26042699227967823547816360093417216412199245863150302861829745557067498385054945885869269956909272107975093 02955321165344987202755960236480665499119881834797753566369807426542527862551818417574672890977772793800081 64706001614524919217321721477235014144197356854816136115735255213347574184946843852332390739414333454776241 68625189835694855620992192221842725502542568876717904946016534668049886272327917860857843838279679766814541 00953883786360950680064225125205117392984896084128488626945604241965285022210661186306744278622039194945047 1237137869609563643719172874677646575739624138908658326459958133904780275901 [root@master231 jobs]#
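补充:Job除了backoffLimit,还常用completions(总共需要成功的次数)和parallelism(同一时刻并行的Pod数)来控制批处理任务。下面是基于上文pi镜像的一个示意(名称、数值均为演示用的假设):
bash
apiVersion: batch/v1
kind: Job
metadata:
  name: jobs-pi-parallel
spec:
  completions: 3        # 任务总共需要成功执行3次
  parallelism: 2        # 同一时刻最多并行运行2个Pod
  backoffLimit: 4
  template:
    spec:
      containers:
      - name: pi
        image: harbor250.weixiang.com/weixiang-perl/perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(200)"]
      restartPolicy: Never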
6、k8s常见的控制器之cj
bash
1.什么是cj 所谓的cj全称为"cronjobs",其并不直接管理pod,而是底层周期性调用Job控制器。 参考链接: https://kubernetes.io/zh-cn/docs/concepts/workloads/controllers/cron-jobs/ 2.实战案例 2.1 导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Kubernetes/Case-Demo/busybox/weixiang-busybox-v1.28.tar.gz [root@worker233 ~]# docker load -i weixiang-busybox-v1.28.tar.gz [root@worker233 ~]# docker tag busybox:1.28 harbor250.weixiang.com/weixiang-linux/busybox:1.28 [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-linux/busybox:1.28 2.2 编写资源清单 [root@master231 06-cronjobs]# cat 01-cj-demo.yaml apiVersion: batch/v1 # 使用的 Kubernetes API 版本,batch/v1 是 CronJob 和 Job 的稳定 API 版本。 kind: CronJob # 定义的对象类型是 CronJob。 metadata: name: hello # 这个 CronJob 对象在 Kubernetes 集群中的名字叫 "hello"。 spec: schedule: "* * * * *" # 指定调度的周期,"* * * * *" 表示每分钟运行一次。 jobTemplate: # 定义Job控制器的模板 spec: template: spec: volumes: # 定义Pod可以使用的存储卷。 - name: log # 定义了一个名为 "log" 的卷。 hostPath: # 卷的类型是hostPath,表示使用宿主节点上的文件或目录。 path: /etc/localtime # 指定宿主机上的路径为 `/etc/localtime`。这个文件包含了宿主机的时区信息。 containers: - name: hello image: harbor250.weixiang.com/weixiang-busybox/busybox:1.28 volumeMounts: # 卷挂载 (Volume Mounts) - name: log # 指定要挂载上面定义的名为 "log" 的卷。 mountPath: /etc/localtime # 将卷挂载到容器内的路径 `/etc/localtime`。 # 目的:将宿主机的时区文件覆盖容器内的时区文件。这通常是为了确保容器内的时间(尤其是 `date` 命令的输出)与宿主机(以及你所在的时区)一致,避免容器使用默认的 UTC 时间。 command: - /bin/sh - -c - date; echo Hello from the Kubernetes cluster restartPolicy: OnFailure # 定义 Pod 内容器失败时的重启策略。 [root@master231 06-cronjobs]# [root@master231 06-cronjobs]# kubectl apply -f 01-cj-demo.yaml cronjob.batch/hello created [root@master231 06-cronjobs]# 2.3 测试验证 [root@master231 06-cronjobs]# kubectl get cj,jobs,po -o wide NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE CONTAINERS IMAGES SELECTOR cronjob.batch/hello * * * * * False 0 20s 81s hello harbor250.weixiang.com/weixiang-linux/busybox:1.28 <none> NAME COMPLETIONS DURATION AGE CONTAINERS IMAGES SELECTOR job.batch/hello-29208002 1/1 3s 80s hello harbor250.weixiang.com/weixiang-linux/busybox:1.28 controller-uid=1cb1f886-4241-45f5-87d9-e469c3540d07 job.batch/hello-29208003 1/1 3s 20s hello harbor250.weixiang.com/weixiang-linux/busybox:1.28 controller-uid=430c44ec-b8e8-4cc7-8840-9267b96895e8 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/hello-29208002-xmfrb 0/1 Completed 0 80s 10.100.1.50 worker232 <none> <none> pod/hello-29208003-mf955 0/1 Completed 0 20s 10.100.2.67 worker233 <none> <none> [root@master231 06-cronjobs]# [root@master231 06-cronjobs]# kubectl logs hello-29208002-xmfrb Mon Jul 14 16:02:00 CST 2025 Hello from the Kubernetes cluster [root@master231 06-cronjobs]# [root@master231 06-cronjobs]# kubectl logs hello-29208003-mf955 Mon Jul 14 16:03:00 CST 2025 Hello from the Kubernetes cluster [root@master231 06-cronjobs]# [root@master231 06-cronjobs]# kubectl delete -f 01-cj-demo.yaml cronjob.batch "hello" deleted [root@master231 06-cronjobs]# [root@master231 06-cronjobs]# kubectl get cj,jobs,po -o wide No resources found in default namespace. [root@master231 06-cronjobs]#
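补充:CronJob默认只保留最近3个成功Job和1个失败Job的记录,可通过successfulJobsHistoryLimit、failedJobsHistoryLimit调整;suspend字段可以临时暂停调度。下面是在上文hello案例基础上补充这几个字段的示意(数值为演示用的假设):
bash
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "* * * * *"
  successfulJobsHistoryLimit: 5   # 保留最近5个成功的Job(默认3)
  failedJobsHistoryLimit: 2       # 保留最近2个失败的Job(默认1)
  suspend: false                  # 置为true可临时暂停调度,已在运行的Job不受影响
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: harbor250.weixiang.com/weixiang-busybox/busybox:1.28
            command: ["/bin/sh", "-c", "date; echo Hello from the Kubernetes cluster"]
          restartPolicy: OnFailure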
7、k8s常见的控制器之ds
bash
1.什么是ds 所谓的ds全称为"daemonsets",作用就是在每个worker节点上部署且仅有一个Pod。 DaemonSet 的一些典型用法: - 1.在每个节点上运行集群守护进程 - 2.在每个节点上运行日志收集守护进程 - 3.在每个节点上运行监控守护进程 2.实战案例 [root@master231 07-daemonsets]# cat 01-ds-xiuxian.yaml apiVersion: apps/v1 kind: DaemonSet metadata: name: ds-xiuxian spec: selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 07-daemonsets]# [root@master231 07-daemonsets]# kubectl apply -f 01-ds-xiuxian.yaml daemonset.apps/ds-xiuxian created [root@master231 07-daemonsets]# [root@master231 07-daemonsets]# kubectl get ds,po -o wide NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR daemonset.apps/ds-xiuxian 2 2 2 2 2 <none> 5s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps=xiuxian NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/ds-xiuxian-9hfms 1/1 Running 0 5s 10.100.2.70 worker233 <none> <none> pod/ds-xiuxian-qqrzg 1/1 Running 0 5s 10.100.1.52 worker232 <none> <none> [root@master231 07-daemonsets]# kubectl get nodes # 本案例有3个worker节点,但是ds仅在2个节点部署成功,原因是这个ds没有配置'污点容忍' NAME STATUS ROLES AGE VERSION master231 Ready control-plane,master 5d5h v1.23.17 worker232 Ready <none> 5d5h v1.23.17 worker233 Ready <none> 5d5h v1.23.17
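补充:上面提到master231因为有污点(taint)没有运行ds的Pod。如果希望DaemonSet也覆盖控制平面节点,可以在Pod模板中增加污点容忍,示意如下(针对kubeadm在1.23版本默认给master打的node-role.kubernetes.io/master污点,属环境假设):
bash
# 示意:带污点容忍的DaemonSet,可同时调度到worker与master节点
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ds-xiuxian
spec:
  selector:
    matchLabels:
      apps: xiuxian
  template:
    metadata:
      labels:
        apps: xiuxian
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master    # 1.24+版本的集群对应node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      containers:
      - name: c1
        image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1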
8、MySQL数据备份恢复案例
bash
- 1.使用deployment部署WordPress服务并创建业务数据; [root@master231 01-wordpress-backup]# cat 01-deploy-wordpress.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-wordpress spec: replicas: 1 selector: matchLabels: apps: wp template: metadata: labels: apps: wp spec: hostNetwork: true containers: - name: db image: harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle env: - name: MYSQL_ROOT_PASSWORD value: "123456" - name: MYSQL_DATABASE value: "wordpress" - name: MYSQL_USER value: weixiang98 - name: MYSQL_PASSWORD value: weixiang - name: wp image: harbor250.weixiang.com/weixiang-wp/wordpress:6.7.1-php8.1-apache env: - name: WORDPRESS_DB_HOST value: "127.0.0.1" - name: WORDPRESS_DB_NAME value: "wordpress" - name: WORDPRESS_DB_USER value: weixiang98 - name: WORDPRESS_DB_PASSWORD value: weixiang [root@master231 01-wordpress-backup]# [root@master231 01-wordpress-backup]# kubectl apply -f 01-deploy-wordpress.yaml deployment.apps/deploy-wordpress created [root@master231 01-wordpress-backup]# [root@master231 count]# kubectl get deploy,rs,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-wordpress 1/1 1 1 2m56s db,wp harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle,harbor250.weixiang.com/weixiang-wp/wordpress:6.7.1-php8.1-apache apps=wp NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-wordpress-58879d5974 1 1 1 2m56s db,wp harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle,harbor250.weixiang.com/weixiang-wp/wordpress:6.7.1-php8.1-apache apps=wp,pod-template-hash=58879d5974 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-wordpress-58879d5974-4vn7z 2/2 Running 0 2m56s 10.1.20.5 worker232 <none> <none> #访问10.1.20.5的公网ip

image

创建数据

image

bash
- 2.使用cj控制器周期性备份MySQL数据库和WordPress数据; [root@master231 01-wordpress-backup]# mkdir /yinzhengjie/data/nfs-server/case-demo/backup [root@master231 01-wordpress-backup]# [root@master231 01-wordpress-backup]# ll /yinzhengjie/data/nfs-server/case-demo/backup/ total 8 drwxr-xr-x 2 root root 4096 Jul 14 17:05 ./ drwxr-xr-x 4 root root 4096 Jul 14 17:05 ../ [root@master231 01-wordpress-backup]# [root@master231 01-wordpress-backup]# cat 02-cj-backup-mysql.yaml apiVersion: batch/v1 kind: CronJob metadata: name: cj-db-backup # 任务名称 spec: schedule: "* * * * *" # 定时规则(每分钟执行一次) jobTemplate: spec: template: spec: volumes: - name: log hostPath: path: /etc/localtime # 挂载主机时区文件(保持容器时间与主机一致) - name: backup nfs: server: 10.0.0.231 # NFS 服务器地址 path: /yinzhengjie/data/nfs-server/case-demo/backup/ # NFS共享路径 containers: - name: db image: harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle volumeMounts: - name: log mountPath: /etc/localtime # 时区文件挂载路径 - name: backup mountPath: /data # NFS 存储挂载路径(备份文件写入此处) command: - /bin/sh - -c - mysqldump -h 10.0.0.233 -p123456 wordpress > /data/wordpress-`date +%F-%T`.sql restartPolicy: OnFailure # 任务失败时自动重启容器 [root@master231 01-wordpress-backup]# [root@master231 01-wordpress-backup]# [root@master231 01-wordpress-backup]# kubectl apply -f 02-cj-backup-mysql.yaml cronjob.batch/cj-db-backup created [root@master231 01-wordpress-backup]# - 3.删除MySQL数据库的内容及wordpress数据,请根据备份恢复数据; 3.1 破坏数据 [root@master231 01-wordpress-backup]# kubectl exec -it deploy-wordpress-58879d5974-c8cbp -c db -- mysql -p123456 mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 64 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> SHOW DATABASES; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | | wordpress | +--------------------+ 5 rows in set (0.00 sec) mysql> DROP DATABASE wordpress; Query OK, 12 rows affected (0.04 sec) mysql> mysql> SHOW DATABASES; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | +--------------------+ 4 rows in set (0.00 sec) mysql> 3.2 恢复数据 [root@master231 01-wordpress-backup]# cat 03-jobs-backup-mysql.yaml apiVersion: batch/v1 kind: Job metadata: name: jobs-recover spec: template: spec: volumes: - name: log hostPath: path: /etc/localtime - name: backup nfs: server: 10.0.0.231 path: /yinzhengjie/data/nfs-server/case-demo/backup/ containers: - name: db image: harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle volumeMounts: - name: log mountPath: /etc/localtime - name: backup mountPath: /data command: - /bin/sh - -c - mysql -h 10.0.0.233 -p123456 -e "CREATE DATABASE IF NOT EXISTS wordpress"; mysql -h 10.0.0.233 -p123456 wordpress < /data/wordpress-2025-07-14-17:10:00.sql restartPolicy: OnFailure [root@master231 01-wordpress-backup]# [root@master231 01-wordpress-backup]# [root@master231 01-wordpress-backup]# kubectl apply -f 03-jobs-backup-mysql.yaml job.batch/jobs-recover created [root@master231 01-wordpress-backup]# [root@master231 01-wordpress-backup]# 3.3 测试验证 略,见视频。 3.4 记得删除资源 [root@master231 01-wordpress-backup]# kubectl delete -f . 
deployment.apps "deploy-wordpress" deleted cronjob.batch "cj-db-backup" deleted job.batch "jobs-recover" deleted [root@master231 01-wordpress-backup]#

10、名称空间

1、namespace
bash
- 名称空间namespace实战 1.什么是名称空间 名称空间namespace是k8s用来隔离K8S集群资源的。 K8S一切皆资源,但不是所有的资源都支持名称空间隔离的。 不支持名称空间隔离的资源我们称之为全局资源,支持名称空间隔离的我们称之为局部资源。 [root@master231 manifests]# kubectl api-resources | wc -l 57 [root@master231 manifests]# [root@master231 manifests]# kubectl api-resources NAME SHORTNAMES APIVERSION NAMESPACED KIND bindings v1 true Binding componentstatuses cs v1 false ComponentStatus configmaps cm v1 true ConfigMap endpoints ep v1 true Endpoints events ev v1 true Event limitranges limits v1 true LimitRange namespaces ns v1 false Namespace nodes no v1 false Node persistentvolumeclaims pvc v1 true PersistentVolumeClaim persistentvolumes pv v1 false PersistentVolume pods po v1 true Pod podtemplates v1 true PodTemplate replicationcontrollers rc v1 true ReplicationController ... 相关资源说明: NAME: 资源的名称。 SHORTNAMES: 资源的简写名称。 APIVERSION: 指定资源的API版本号。 NAMESPACED: 该资源是否支持名称空间。 KIND: 资源的类型。 2.名称空间的基本管理 2.1 查看名称空间列表 [root@master231 manifests]# kubectl get ns NAME STATUS AGE default Active 5d22h kube-flannel Active 5d21h kube-node-lease Active 5d22h kube-public Active 5d22h kube-system Active 5d22h [root@master231 manifests]# 2.2 创建名称空间 [root@master231 manifests]# kubectl create namespace weixiang namespace/weixiang created [root@master231 manifests]# [root@master231 manifests]# kubectl get ns NAME STATUS AGE default Active 5d22h kube-flannel Active 5d21h kube-node-lease Active 5d22h kube-public Active 5d22h kube-system Active 5d22h weixiang Active 1s [root@master231 manifests]# 2.3 删除名称空间 [root@master231 manifests]# kubectl get ns NAME STATUS AGE default Active 5d22h kube-flannel Active 5d21h kube-node-lease Active 5d22h kube-public Active 5d22h kube-system Active 5d22h weixiang Active 1s [root@master231 manifests]# [root@master231 manifests]# kubectl delete ns weixiang namespace "weixiang" deleted [root@master231 manifests]# [root@master231 manifests]# kubectl get ns NAME STATUS AGE default Active 5d22h kube-flannel Active 5d21h kube-node-lease Active 5d22h kube-public Active 5d22h kube-system Active 5d22h [root@master231 manifests]# 温馨提示: 删除名称空间,则意味着该名称空间下的所有资源都被删除。 3.查看指定名称空间的资源 3.1 查看指定名称空间的pod [root@master231 manifests]# kubectl get ns NAME STATUS AGE default Active 5d22h kube-flannel Active 5d21h kube-node-lease Active 5d22h kube-public Active 5d22h kube-system Active 5d22h [root@master231 manifests]# [root@master231 manifests]# kubectl get pods --namespace default NAME READY STATUS RESTARTS AGE xiuxian 1/1 Running 1 15m [root@master231 manifests]# [root@master231 manifests]# kubectl get pods -n kube-system -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES coredns-6d8c4cb4d-k52qr 1/1 Running 5 (35m ago) 5d22h 10.100.0.13 master231 <none> <none> coredns-6d8c4cb4d-rvzd9 1/1 Running 5 (35m ago) 5d22h 10.100.0.12 master231 <none> <none> etcd-master231 1/1 Running 5 (35m ago) 5d22h 10.0.0.231 master231 <none> <none> kube-apiserver-master231 1/1 Running 5 (35m ago) 5d22h 10.0.0.231 master231 <none> <none> kube-controller-manager-master231 1/1 Running 6 (35m ago) 5d22h 10.0.0.231 master231 <none> <none> kube-proxy-588bm 1/1 Running 5 (35m ago) 5d22h 10.0.0.232 worker232 <none> <none> kube-proxy-9bb67 1/1 Running 5 (35m ago) 5d22h 10.0.0.231 master231 <none> <none> kube-proxy-n9mv6 1/1 Running 5 (35m ago) 5d22h 10.0.0.233 worker233 <none> <none> kube-scheduler-master231 1/1 Running 6 (35m ago) 5d22h 10.0.0.231 master231 <none> <none> [root@master231 manifests]# [root@master231 manifests]# kubectl get pods -o wide # 如果不使用-n选项,则表示默认使用'default'名称空间。 NAME READY STATUS RESTARTS AGE IP 
NODE NOMINATED NODE READINESS GATES xiuxian 1/1 Running 1 15m 10.100.2.79 worker233 <none> <none> [root@master231 manifests]# 3.2 查看所有名称空间的资源 [root@master231 manifests]# kubectl get pods -o wide -A # 查看所有的名称空间的pod资源 NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES default xiuxian 1/1 Running 1 16m 10.100.2.79 worker233 <none> <none> kube-flannel kube-flannel-ds-5hbns 1/1 Running 5 (36m ago) 5d21h 10.0.0.231 master231 <none> <none> kube-flannel kube-flannel-ds-dzffl 1/1 Running 5 (14h ago) 5d21h 10.0.0.233 worker233 <none> <none> kube-flannel kube-flannel-ds-h5kwh 1/1 Running 5 (14h ago) 5d21h 10.0.0.232 worker232 <none> <none> kube-system coredns-6d8c4cb4d-k52qr 1/1 Running 5 (36m ago) 5d22h 10.100.0.13 master231 <none> <none> kube-system coredns-6d8c4cb4d-rvzd9 1/1 Running 5 (36m ago) 5d22h 10.100.0.12 master231 <none> <none> kube-system etcd-master231 1/1 Running 5 (36m ago) 5d22h 10.0.0.231 master231 <none> <none> kube-system kube-apiserver-master231 1/1 Running 5 (36m ago) 5d22h 10.0.0.231 master231 <none> <none> kube-system kube-controller-manager-master231 1/1 Running 6 (36m ago) 5d22h 10.0.0.231 master231 <none> <none> kube-system kube-proxy-588bm 1/1 Running 5 (36m ago) 5d22h 10.0.0.232 worker232 <none> <none> kube-system kube-proxy-9bb67 1/1 Running 5 (36m ago) 5d22h 10.0.0.231 master231 <none> <none> kube-system kube-proxy-n9mv6 1/1 Running 5 (36m ago) 5d22h 10.0.0.233 worker233 <none> <none> kube-system kube-scheduler-master231 1/1 Running 6 (36m ago) 5d22h 10.0.0.231 master231 <none> <none> [root@master231 manifests]# [root@master231 manifests]# kubectl get pods -o wide --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES default xiuxian 1/1 Running 1 16m 10.100.2.79 worker233 <none> <none> kube-flannel kube-flannel-ds-5hbns 1/1 Running 5 (36m ago) 5d21h 10.0.0.231 master231 <none> <none> kube-flannel kube-flannel-ds-dzffl 1/1 Running 5 (14h ago) 5d21h 10.0.0.233 worker233 <none> <none> kube-flannel kube-flannel-ds-h5kwh 1/1 Running 5 (14h ago) 5d21h 10.0.0.232 worker232 <none> <none> kube-system coredns-6d8c4cb4d-k52qr 1/1 Running 5 (36m ago) 5d22h 10.100.0.13 master231 <none> <none> kube-system coredns-6d8c4cb4d-rvzd9 1/1 Running 5 (36m ago) 5d22h 10.100.0.12 master231 <none> <none> kube-system etcd-master231 1/1 Running 5 (36m ago) 5d22h 10.0.0.231 master231 <none> <none> kube-system kube-apiserver-master231 1/1 Running 5 (36m ago) 5d22h 10.0.0.231 master231 <none> <none> kube-system kube-controller-manager-master231 1/1 Running 6 (36m ago) 5d22h 10.0.0.231 master231 <none> <none> kube-system kube-proxy-588bm 1/1 Running 5 (36m ago) 5d22h 10.0.0.232 worker232 <none> <none> kube-system kube-proxy-9bb67 1/1 Running 5 (36m ago) 5d22h 10.0.0.231 master231 <none> <none> kube-system kube-proxy-n9mv6 1/1 Running 5 (36m ago) 5d22h 10.0.0.233 worker233 <none> <none> kube-system kube-scheduler-master231 1/1 Running 6 (36m ago) 5d22h 10.0.0.231 master231 <none> <none> [root@master231 manifests]# [root@master231 manifests]# kubectl get ds -A # 查看所有名称空间的ds资源 NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE kube-flannel kube-flannel-ds 3 3 3 3 3 <none> 5d21h kube-system kube-proxy 3 3 3 3 3 kubernetes.io/os=linux 5d22h [root@master231 manifests]# 3.3 创建资源到指定的名称空间 [root@master231 08-namespaces]# cat 01-ns-deploy-xiuxian.yaml apiVersion: v1 kind: Namespace metadata: name: weixiang --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian # 指定名称空间,如果不指定,则默认放在default名称空间哟 
namespace: weixiang spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 [root@master231 08-namespaces]# [root@master231 08-namespaces]# kubectl apply -f 01-ns-deploy-xiuxian.yaml namespace/weixiang created deployment.apps/deploy-xiuxian created [root@master231 08-namespaces]# [root@master231 08-namespaces]# kubectl get pods NAME READY STATUS RESTARTS AGE xiuxian 1/1 Running 1 19m [root@master231 08-namespaces]# [root@master231 08-namespaces]# kubectl get deployments.apps No resources found in default namespace. [root@master231 08-namespaces]# [root@master231 08-namespaces]# kubectl get ns NAME STATUS AGE default Active 5d22h kube-flannel Active 5d21h kube-node-lease Active 5d22h kube-public Active 5d22h kube-system Active 5d22h weixiang Active 27s [root@master231 08-namespaces]# [root@master231 08-namespaces]# kubectl -n weixiang get deploy,rs,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-xiuxian 3/3 3 3 28s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v3 apps=v1 NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-xiuxian-8445b8c95b 3 3 3 28s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v3 apps=v1,pod-template-hash=8445b8c95b NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-xiuxian-8445b8c95b-2r5qb 1/1 Running 0 28s 10.100.1.75 worker232 <none> <none> pod/deploy-xiuxian-8445b8c95b-b4sbs 1/1 Running 0 28s 10.100.2.80 worker233 <none> <none> pod/deploy-xiuxian-8445b8c95b-nqgbz 1/1 Running 0 28s 10.100.1.74 worker232 <none> <none> [root@master231 08-namespaces]# 3.4 删除名称空间则意味着该名称空间下的所有资源都被删除 [root@master231 08-namespaces]# kubectl -n weixiang get deploy,rs,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-xiuxian 3/3 3 3 99s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v3 apps=v1 NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-xiuxian-8445b8c95b 3 3 3 99s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v3 apps=v1,pod-template-hash=8445b8c95b NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-xiuxian-8445b8c95b-2r5qb 1/1 Running 0 99s 10.100.1.75 worker232 <none> <none> pod/deploy-xiuxian-8445b8c95b-b4sbs 1/1 Running 0 99s 10.100.2.80 worker233 <none> <none> pod/deploy-xiuxian-8445b8c95b-nqgbz 1/1 Running 0 99s 10.100.1.74 worker232 <none> <none> [root@master231 08-namespaces]# [root@master231 08-namespaces]# kubectl delete namespaces weixiang namespace "weixiang" deleted [root@master231 08-namespaces]# [root@master231 08-namespaces]# [root@master231 08-namespaces]# kubectl -n weixiang get deploy,rs,po -o wide No resources found in weixiang namespace. [root@master231 08-namespaces]# [root@master231 08-namespaces]# kubectl get ns NAME STATUS AGE default Active 5d22h kube-flannel Active 5d21h kube-node-lease Active 5d22h kube-public Active 5d22h kube-system Active 5d22h [root@master231 08-namespaces]#

11、Service⭐

bash
Service的四种类型:
- ClusterIP
  仅适用于K8S集群内部的服务相互访问。
- NodePort
  在ClusterIP基础之上,会在所有的工作节点上增加NAT转发规则,从而让K8S集群外部用户能够访问到K8S集群内部的Pod服务。
- LoadBalancer
  一般用于云厂商的K8S环境;如果是自建的K8S集群,则需要部署第三方组件(MetalLB或者OpenELB)才能实现此功能。
- ExternalName
  将K8S集群外部的服务映射到K8S集群内部的Service,说白了就是后端服务不在K8S集群内部,其底层逻辑就是一条CNAME别名记录。
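补充一个快速查看各Service类型的小示例(仅为演示思路,svc-xiuxian为后文创建的Service,custom-columns/jsonpath为kubectl标准用法):

```bash
# 以自定义列的方式列出所有名称空间的Service及其类型、ClusterIP
kubectl get svc -A -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,TYPE:.spec.type,CLUSTER-IP:.spec.clusterIP

# 只取出某一个Service的类型字段
kubectl get svc svc-xiuxian -o jsonpath='{.spec.type}{"\n"}'
```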
1、Service的类型之ClusterIP⭐
bash
- Server的类型之ClusterIP 1.什么是Service Service是K8S集群用来代理Pod作为统一的访问入口。 Service的底层实现依赖于kube-proxy组件的工作模式实现。 说白了,我们可以将Service理解为一个4层的负载均衡器。 Service为Pod提供了三个功能: - 统一的访问入口 - 服务发现【后端的Pod的IP地址发生变化时会自动更新】 - 负载均衡【流量会均衡的达到后端的Pod】 2.使用Service代理Pod 2.1 编写资源清单 [root@master231 09-service]# cat 01-ns-deploy-svc-xiuxian.yaml apiVersion: v1 kind: Namespace metadata: name: weixiang --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian namespace: weixiang spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxian # 若不指定名称空间,则Service默认会在default名称空间下基于标签选择器查找pod namespace: weixiang spec: # 指定Service的类型 type: ClusterIP # 基于标签关联后端的Pod selector: apps: v1 # 配置端口映射 ports: # 端口使用的协议,支持TCP,UDP,SCTP,若不指定则默认为TCP - protocol: TCP # 指定Service的端口号 port: 88 # 指定关联后端Pod的端口号 targetPort: 80 [root@master231 09-service]# 2.2 测试验证【统一的访问入口】 [root@master231 09-service]# kubectl apply -f 01-ns-deploy-svc-xiuxian.yaml namespace/weixiang created deployment.apps/deploy-xiuxian created service/svc-xiuxian created [root@master231 09-service]# [root@master231 09-service]# kubectl get deploy,rs,svc,po -o wide # 不难发现,在默认名称空间下找到我们创建的资源。 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 5d23h <none> [root@master231 09-service]# [root@master231 09-service]# kubectl get deploy,rs,svc,po -o wide -n weixiang # 查看资源必须指定名称空间。 NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-xiuxian 3/3 3 3 14s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps=v1 NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-xiuxian-7b574d64b 3 3 3 14s c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1 apps=v1,pod-template-hash=7b574d64b NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/svc-xiuxian ClusterIP 10.200.139.208 <none> 88/TCP 14s apps=v1 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-xiuxian-7b574d64b-4qfmb 1/1 Running 0 13s 10.100.1.82 worker232 <none> <none> pod/deploy-xiuxian-7b574d64b-nc579 1/1 Running 0 13s 10.100.2.84 worker233 <none> <none> pod/deploy-xiuxian-7b574d64b-qgzz4 1/1 Running 0 13s 10.100.1.83 worker232 <none> <none> [root@master231 09-service]# # 查看deploy控制器信息

image

bash
[root@master231 09-service]# kubectl -n weixiang describe svc svc-xiuxian Name: svc-xiuxian Namespace: weixiang Labels: <none> Annotations: <none> Selector: apps=v1 Type: ClusterIP IP Family Policy: SingleStack IP Families: IPv4 IP: 10.200.139.208 # 这是Service的IP地址 IPs: 10.200.139.208 Port: <unset> 88/TCP # 这是Service的端口。 TargetPort: 80/TCP Endpoints: 10.100.1.82:80,10.100.1.83:80,10.100.2.84:80 # 很明显,这些IP地址就是我们的Pod地址 Session Affinity: None Events: <none> # 查看service信息

image

bash
# Pod是临时的,可能因故障、扩缩容或滚动更新导致IP变化,ClusterIP Service提供一个固定虚拟IP(VIP),流量通过该IP自动负载均衡到后端Pod # 无论后端Pod如何变化,其他服务只需访问该 IP 2.3 访问Service测试,访问service的ip地址 [root@master231 09-service]# curl 10.200.139.208:88 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 09-service]# 2.4 Service能够自动发现Pod的IP变化【服务发现】 [root@master231 09-service]# kubectl -n weixiang describe svc svc-xiuxian | grep Endpoints Endpoints: 10.100.1.82:80,10.100.1.83:80,10.100.2.84:80 [root@master231 09-service]# [root@master231 09-service]# kubectl delete pods --all -n weixiang pod "deploy-xiuxian-7b574d64b-4qfmb" deleted pod "deploy-xiuxian-7b574d64b-nc579" deleted pod "deploy-xiuxian-7b574d64b-qgzz4" deleted [root@master231 09-service]# [root@master231 09-service]# kubectl -n weixiang get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-7b574d64b-447t8 1/1 Running 0 8s 10.100.1.84 worker232 <none> <none> deploy-xiuxian-7b574d64b-72vmd 1/1 Running 0 8s 10.100.2.85 worker233 <none> <none> deploy-xiuxian-7b574d64b-hxq26 1/1 Running 0 8s 10.100.2.86 worker233 <none> <none> # 可以看到变化的ip在service里面可以看到 [root@master231 09-service]# kubectl -n weixiang describe svc svc-xiuxian | grep Endpoints Endpoints: 10.100.1.84:80,10.100.2.85:80,10.100.2.86:80 [root@master231 09-service]# [root@master231 09-service]# curl 10.200.139.208:88 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 09-service]# 2.5 验证Service可以实现负载均衡【负载均衡】 [root@master231 09-service]# kubectl -n weixiang get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-7b574d64b-447t8 1/1 Running 0 108s 10.100.1.84 worker232 <none> <none> deploy-xiuxian-7b574d64b-72vmd 1/1 Running 0 108s 10.100.2.85 worker233 <none> <none> deploy-xiuxian-7b574d64b-hxq26 1/1 Running 0 108s 10.100.2.86 worker233 <none> <none> [root@master231 09-service]# [root@master231 09-service]# kubectl -n weixiang exec -it deploy-xiuxian-7b574d64b-447t8 -- sh / # echo AAAAA > /usr/share/nginx/html/index.html / # [root@master231 09-service]# [root@master231 09-service]# kubectl -n weixiang exec -it deploy-xiuxian-7b574d64b-72vmd -- sh / # echo BBBB > /usr/share/nginx/html/index.html / # [root@master231 09-service]# [root@master231 09-service]# kubectl -n weixiang exec -it deploy-xiuxian-7b574d64b-hxq26 -- sh / # echo CCC > /usr/share/nginx/html/index.html / # [root@master231 09-service]# [root@master231 09-service]# for i in `seq 10`; do curl 10.200.139.208:88;done BBBB BBBB BBBB AAAAA BBBB CCC CCC BBBB AAAAA AAAAA 2.6 删除资源 [root@master231 09-service]# kubectl delete -f 01-ns-deploy-svc-xiuxian.yaml namespace "weixiang" deleted deployment.apps "deploy-xiuxian" deleted service "svc-xiuxian" deleted [root@master231 09-service]#
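补充:上面describe输出里的"Session Affinity: None"表示默认不做会话保持。如果业务需要同一客户端的请求固定落在同一个Pod,可以参考下面的示意(假设上面的资源尚未删除,svc地址仍为10.200.139.208:88;patch的字段为标准的Service规格):

```bash
# 为Service开启基于客户端IP的会话保持(默认超时10800秒)
kubectl -n weixiang patch svc svc-xiuxian -p '{"spec":{"sessionAffinity":"ClientIP"}}'

# 再次压测时,同一个客户端的请求应当始终落在同一个Pod上
for i in `seq 10`; do curl 10.200.139.208:88; done

# 恢复默认的轮询行为
kubectl -n weixiang patch svc svc-xiuxian -p '{"spec":{"sessionAffinity":"None"}}'
```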

image

bash
Deployment--Selector: apps=v1   # 表示该Deployment管理所有标签为apps=v1的ReplicaSet和Pod
Pod--Labels: apps=v1            # 被Service的selector匹配,使流量可以路由到这些Pod
Service--Selector: apps=v1      # 仅需匹配Pod的apps=v1标签,即可将流量转发到这些Pod

Deployment → ReplicaSet → Pod:通过 apps=v1 和 pod-template-hash 实现层级控制
Service → Pod:仅通过 apps=v1 关联,实现流量路由

Deployment的标签选择器(Selector)和Service的标签选择器(Selector)虽然都使用了apps=v1,但二者是独立工作的,彼此之间没有直接关联。
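可以用下面几条命令直观验证这种标签关联关系(基于本节的apps=v1标签和weixiang名称空间,仅为演示):

```bash
# 查看带有apps=v1标签的Pod及其完整标签(可以看到额外的pod-template-hash)
kubectl -n weixiang get pods -l apps=v1 --show-labels

# 查看Service实际关联到的后端地址,与上面Pod的IP一一对应
kubectl -n weixiang get endpoints svc-xiuxian

# 对比Deployment和ReplicaSet各自的选择器
kubectl -n weixiang get deploy,rs -o wide
```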
2、Service的类型之NodePort⭐
bash
1.编写资源清单 [root@master231 09-service]# cat 02-deploy-svc-NodePort.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxian spec: # 指定Service类型,若不指定,默认为: ClusterIP type: NodePort selector: apps: v1 ports: - protocol: TCP port: 88 targetPort: 80 # 如果Service类型是NodePort,则可以定义NodePort的端口号。 # 若不指定端口,则默认会在一个端口范围(30000-32767)内随时生成。 nodePort: 30080 [root@master231 09-service]# 2.创建资源 [root@master231 09-service]# kubectl apply -f 02-deploy-svc-NodePort.yaml deployment.apps/deploy-xiuxian created service/svc-xiuxian created [root@master231 09-service]# [root@master231 09-service]# kubectl get svc svc-xiuxian NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc-xiuxian NodePort 10.200.132.219 <none> 88:30080/TCP 9s [root@master231 09-service]# [root@master231 09-service]# [root@master231 09-service]# kubectl get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 6d svc-xiuxian NodePort 10.200.132.219 <none> 88:30080/TCP 13s [root@master231 09-service]# [root@master231 09-service]# kubectl describe service svc-xiuxian Name: svc-xiuxian Namespace: default Labels: <none> Annotations: <none> Selector: apps=v1 Type: NodePort IP Family Policy: SingleStack IP Families: IPv4 IP: 10.200.132.219 IPs: 10.200.132.219 Port: <unset> 88/TCP TargetPort: 80/TCP NodePort: <unset> 30080/TCP Endpoints: 10.100.1.91:80,10.100.1.92:80,10.100.2.90:80 Session Affinity: None External Traffic Policy: Cluster Events: <none> [root@master231 09-service]# [root@master231 09-service]# [root@master231 09-service]# kubectl get pods -o wide -l apps=v1 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-7b574d64b-6p6mz 1/1 Running 0 35s 10.100.1.92 worker232 <none> <none> deploy-xiuxian-7b574d64b-cm4z5 1/1 Running 0 35s 10.100.1.91 worker232 <none> <none> deploy-xiuxian-7b574d64b-xtm48 1/1 Running 0 35s 10.100.2.90 worker233 <none> <none> [root@master231 09-service]# 3.访问Service测试 [root@master231 09-service]# curl 10.200.132.219:88 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 09-service]# [root@master231 09-service]# curl 10.0.0.231:30080 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 09-service]# [root@master231 09-service]# curl 10.0.0.232:30080 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 09-service]# [root@master231 09-service]# curl 10.0.0.233:30080 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 09-service]# # 
为什么所有节点都能访问?
kube-proxy运行在每个节点上,负责监听NodePort并转发流量。即使Pod不在某个节点上,该节点的kube-proxy仍会把请求转发到正确的Pod:当访问 节点IP:NodePort 时,流量先进入节点的nodePort(本例为30080),kube-proxy根据iptables/ipvs规则将请求转发到Service的 ClusterIP:Port(本例为10.200.132.219:88),Service再负载均衡到后端Pod(如10.100.1.91:80)。

温馨提示:
这意味着我们可以在浏览器中通过"节点IP:NodePort"访问Pod服务,例如内网的 http://10.0.0.231:30080/ ;如果节点有公网IP(如 http://106.55.44.37:30080/ ),同样可以从公网访问。
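如果忘记了随机分配(或手动指定)的NodePort端口,可以用下面的方式快速取出端口号再拼接访问地址(jsonpath为kubectl标准用法,IP请换成自己环境的节点地址):

```bash
# 取出svc-xiuxian的NodePort端口号并保存到变量
NODE_PORT=$(kubectl get svc svc-xiuxian -o jsonpath='{.spec.ports[0].nodePort}')
echo ${NODE_PORT}

# 任意一个节点的IP加上该端口均可访问
curl 10.0.0.232:${NODE_PORT}
```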

image

bash
4.验证底层的实现逻辑【所有的worker节点均有NAT规则】 [root@master231 09-service]# iptables-save | grep 30088 -A KUBE-NODEPORTS -p tcp -m comment --comment "default/svc-xiuxian" -m tcp --dport 30088 -j KUBE-SVC-ZXVCLWXCD4F2GESE -A KUBE-SVC-ZXVCLWXCD4F2GESE -p tcp -m comment --comment "default/svc-xiuxian" -m tcp --dport 30088 -j KUBE-MARK-MASQ

image

image

bash
完整流量流程:
当外部请求到达 节点IP:30088 时:
1. 进入KUBE-NODEPORTS链 → 匹配30088端口 → 跳转到KUBE-SVC-XXX链;
2. 在服务链中:
   - 先打上KUBE-MARK-MASQ标记(为SNAT做准备);
   - 再按负载均衡概率跳转到具体Pod对应的链(如KUBE-SEP-XXXX);
3. 最终将流量DNAT(目标地址转换)到后端的 Pod IP:Port。
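排查时可以按同样的思路手动跟踪NAT规则链,下面是一个通用的查询套路(链名中的哈希后缀每个环境都不同,端口和链名请按自己环境的输出替换,此处沿用下文30080端口的示例):

```bash
# 1.从KUBE-NODEPORTS链找到该NodePort对应的KUBE-SVC-XXX服务链
iptables -t nat -nL KUBE-NODEPORTS | grep 30080

# 2.查看该服务链的负载均衡规则,可以看到若干KUBE-SEP-XXX端点链
iptables -t nat -nL KUBE-SVC-ZXVCLWXCD4F2GESE

# 3.查看某个端点链,确认最终DNAT到哪个Pod IP:Port
iptables -t nat -nL KUBE-SEP-L5ZU5BM2QM3R5ZBC
```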
bash
[root@master231 09-service]# iptables-save | grep KUBE-SVC-ZXVCLWXCD4F2GESE :KUBE-SVC-ZXVCLWXCD4F2GESE - [0:0] -A KUBE-NODEPORTS -p tcp -m comment --comment "default/svc-xiuxian" -m tcp --dport 30080 -j KUBE-SVC-ZXVCLWXCD4F2GESE -A KUBE-SERVICES -d 10.200.132.219/32 -p tcp -m comment --comment "default/svc-xiuxian cluster IP" -m tcp --dport 88 -j KUBE-SVC-ZXVCLWXCD4F2GES -A KUBE-SVC-ZXVCLWXCD4F2GESE ! -s 10.100.0.0/16 -d 10.200.132.219/32 -p tcp -m comment --comment "default/svc-xiuxian cluster IP" -m tcp --dport 88 -j KUBE-MARK-MASQ -A KUBE-SVC-ZXVCLWXCD4F2GESE -p tcp -m comment --comment "default/svc-xiuxian" -m tcp --dport 30080 -j KUBE-MARK-MASQ -A KUBE-SVC-ZXVCLWXCD4F2GESE -m comment --comment "default/svc-xiuxian" -m statistic --mode random --probability 0.33333333349 -j KUBE-SEP-L5ZU5BM2QM3R5ZBC -A KUBE-SVC-ZXVCLWXCD4F2GESE -m comment --comment "default/svc-xiuxian" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-REOST7VWSBSVUBUR -A KUBE-SVC-ZXVCLWXCD4F2GESE -m comment --comment "default/svc-xiuxian" -j KUBE-SEP-YXKRLFDNZ7SMMFTT [root@master231 09-service]# [root@master231 09-service]# iptables-save | grep KUBE-SEP-L5ZU5BM2QM3R5ZBC :KUBE-SEP-L5ZU5BM2QM3R5ZBC - [0:0] -A KUBE-SEP-L5ZU5BM2QM3R5ZBC -s 10.100.1.91/32 -m comment --comment "default/svc-xiuxian" -j KUBE-MARK-MASQ -A KUBE-SEP-L5ZU5BM2QM3R5ZBC -p tcp -m comment --comment "default/svc-xiuxian" -m tcp -j DNAT --to-destination 10.100.1.91:80 -A KUBE-SVC-ZXVCLWXCD4F2GESE -m comment --comment "default/svc-xiuxian" -m statistic --mode random --probability 0.33333333349 -j KUBE-SEP-L5ZU5BM2QM3R5ZBC [root@master231 09-service]# [root@master231 09-service]# iptables-save | grep KUBE-SEP-REOST7VWSBSVUBUR :KUBE-SEP-REOST7VWSBSVUBUR - [0:0] -A KUBE-SEP-REOST7VWSBSVUBUR -s 10.100.1.92/32 -m comment --comment "default/svc-xiuxian" -j KUBE-MARK-MASQ -A KUBE-SEP-REOST7VWSBSVUBUR -p tcp -m comment --comment "default/svc-xiuxian" -m tcp -j DNAT --to-destination 10.100.1.92:80 -A KUBE-SVC-ZXVCLWXCD4F2GESE -m comment --comment "default/svc-xiuxian" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-REOST7VWSBSVUBUR [root@master231 09-service]# [root@master231 09-service]# iptables-save | grep KUBE-SEP-YXKRLFDNZ7SMMFTT :KUBE-SEP-YXKRLFDNZ7SMMFTT - [0:0] -A KUBE-SEP-YXKRLFDNZ7SMMFTT -s 10.100.2.90/32 -m comment --comment "default/svc-xiuxian" -j KUBE-MARK-MASQ -A KUBE-SEP-YXKRLFDNZ7SMMFTT -p tcp -m comment --comment "default/svc-xiuxian" -m tcp -j DNAT --to-destination 10.100.2.90:80 -A KUBE-SVC-ZXVCLWXCD4F2GESE -m comment --comment "default/svc-xiuxian" -j KUBE-SEP-YXKRLFDNZ7SMMFTT `
1、案例一、部署pod
bash
13课堂练习: 使用deploy部署3个pod,要求如下: - 两个pod提供tomcat服务, 一个pod提供nginx代理tomcat; - 删除tomcat,或者nginx的Pod保证数据不丢失; - 要求windows通过nginx能够访问到tomcat业务; 1.推送镜像到harbor仓库 [root@worker233 ~]# wget http://192.168.21.253/Resources/Kubernetes/Case-Demo/tomcat/weixiang-tomcat-v9.0.87.tar.gz [root@worker233 ~]# docker load -i weixiang-tomcat-v9.0.87.tar.gz [root@worker233 ~]# docker tag registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/tomcat:9.0.87-jdk17 harbor250.weixiang.com/weixiang-tomcat/tomcat:9.0.87-jdk17 [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-tomcat/tomcat:9.0.87-jdk17 2.准备数据目录 [root@master231 ~]# mkdir -pv /yinzhengjie/data/nfs-server/case-demo/lb/{tomcat01,tomcat02,nginx} mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo' mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/lb' mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/lb/tomcat01' mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/lb/tomcat02' mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/lb/nginx' [root@master231 ~]# 3.编写资源清单 [root@master231 02-lb-nginx-tomcat]# cat 01-deploy-tomcat-nginx.yaml apiVersion: apps/v1 # 指定使用的Kubernetes API版本,表示使用apps组的v1版本API kind: Deployment # 定义资源类型,这里是Deployment,表示这是一个部署资源 metadata: # 资源的元数据 name: deploy-tomcat01 # 部署的名称,这里是deploy-tomcat01 spec: # 部署规格 (spec) replicas: 1 # Deployment控制器会确保始终有1个Pod在运行 selector: # 控制器基于标签关联Pod模板中的标签,也就是tomcat01 matchLabels: apps: tomcat01 template: # 定义Pod的模板 metadata: # Pod的元数据 labels: # 为Pod设置的标签,这里设置了apps: tomcat01,与上面的selector匹配 apps: tomcat01 spec: # pod规格定义 volumes: - name: data # 定义了一个名为data的存储卷 nfs: # 使用NFS(Network File System)作为存储类型 server: 10.1.24.13 # NFS服务器的 IP 地址 (10.1.24.13) path: /yinzhengjie/data/nfs-server/case-demo/lb/tomcat01 # NFS服务器上的共享路径 containers: - name: c1 # 定义了一个名为 c1 的容器 image: harbor250.weixiang.com/weixiang-tomcat/tomcat:9.0.87-jdk17 volumeMounts: # 将存储卷挂载到容器内部 - name: data # 引用之前定义的 data 卷 mountPath: /usr/local/tomcat/webapps/ROOT # 在容器内的挂载路径 (/usr/local/tomcat/webapps/ROOT) --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-tomcat02 spec: replicas: 1 selector: matchLabels: apps: tomcat02 template: metadata: labels: apps: tomcat02 spec: volumes: - name: data nfs: server: 10.1.24.13 path: /yinzhengjie/data/nfs-server/case-demo/lb/tomcat02 containers: - name: c1 image: harbor250.weixiang.com/weixiang-tomcat/tomcat:9.0.87-jdk17 volumeMounts: - name: data mountPath: /usr/local/tomcat/webapps/ROOT --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-nginx spec: replicas: 1 selector: matchLabels: apps: nginx template: metadata: labels: apps: nginx spec: volumes: - name: data nfs: server: 10.1.24.13 path: /yinzhengjie/data/nfs-server/case-demo/lb/nginx containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 volumeMounts: - name: data mountPath: /etc/nginx/conf.d [root@master231 02-lb-nginx-tomcat]# 4.准备tomcat测试数据 [root@master231 02-lb-nginx-tomcat]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-nginx-846784cc45-tnvbr 1/1 Running 0 4m41s 10.100.1.11 worker232 <none> <none> deploy-tomcat01-6fb8b5888d-kfsws 1/1 Running 0 4m41s 10.100.1.12 worker232 <none> <none> deploy-tomcat02-694ff867f4-l25wm 1/1 Running 0 4m41s 10.100.2.13 worker233 <none> <none> [root@master231 02-lb-nginx-tomcat]# [root@master231 02-lb-nginx-tomcat]# kubectl exec -it deploy-tomcat01-6fb8b5888d-kfsws -- bash root@deploy-tomcat01-6fb8b5888d-kfsws:/usr/local/tomcat# echo tomcat01 > 
webapps/ROOT/index.html root@deploy-tomcat01-6fb8b5888d-kfsws:/usr/local/tomcat# exit [root@master231 02-lb-nginx-tomcat]# [root@master231 02-lb-nginx-tomcat]# kubectl exec -it deploy-tomcat02-694ff867f4-l25wm -- bash root@deploy-tomcat02-694ff867f4-l25wm:/usr/local/tomcat# echo tomcat02 > webapps/ROOT/index.html root@deploy-tomcat02-694ff867f4-l25wm:/usr/local/tomcat# exit [root@master231 02-lb-nginx-tomcat]# 5.创建svc关联nginx和tomcat并测试验证 [root@master231 02-lb-nginx-tomcat]# cat 02-svc-tomcat-nginx.yaml apiVersion: v1 kind: Service metadata: name: svc-lb spec: type: NodePort # NodePort-允许从集群外部访问服务 selector: apps: nginx # 选择带有此标签的Pod ports: - protocol: TCP port: 80 # Service在集群内部的端口 nodePort: 30080 # 节点上暴露的端口(范围 30000-32767) --- apiVersion: v1 kind: Service metadata: name: svc-tomcat01 spec: type: ClusterIP # ClusterIP - 默认类型,仅在集群内部可访问 selector: apps: tomcat01 # 选择器,选择带有此标签的 Pod ports: - protocol: TCP port: 8080 # 8080,仅限集群内部访问 --- apiVersion: v1 kind: Service metadata: name: svc-tomcat02 spec: type: ClusterIP selector: apps: tomcat02 ports: - protocol: TCP port: 8080 [root@master231 02-lb-nginx-tomcat]# [root@master231 02-lb-nginx-tomcat]# kubectl apply -f 02-svc-tomcat-nginx.yaml service/svc-lb created service/svc-tomcat01 created service/svc-tomcat02 created [root@master231 02-lb-nginx-tomcat]# [root@master231 02-lb-nginx-tomcat]# kubectl get -f 02-svc-tomcat-nginx.yaml NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc-lb NodePort 10.200.115.173 <none> 80:30080/TCP 4s svc-tomcat01 ClusterIP 10.200.64.189 <none> 8080/TCP 4s svc-tomcat02 ClusterIP 10.200.249.153 <none> 8080/TCP 4s [root@master231 02-lb-nginx-tomcat]# [root@master231 02-lb-nginx-tomcat]# curl 10.200.64.189:8080 tomcat01 [root@master231 02-lb-nginx-tomcat]# [root@master231 02-lb-nginx-tomcat]# curl 10.200.249.153:8080 tomcat02 [root@master231 02-lb-nginx-tomcat]# 6.修改nginx的配置文件 [root@master231 02-lb-nginx-tomcat]# kubectl exec -it deploy-nginx-846784cc45-tnvbr -- sh / # / # cat > /etc/nginx/conf.d/default.conf <<EOF upstream tomcat { server svc-tomcat01:8080; server svc-tomcat02:8080; } server { listen 80; listen [::]:80; server_name localhost; location / { proxy_pass http://tomcat; } error_page 500 502 503 504 /50x.html; location = /50x.html { root /usr/share/nginx/html; } } EOF / # nginx -t nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: configuration file /etc/nginx/nginx.conf test is successful / # / # nginx -s reload 2025/07/15 07:29:22 [notice] 33#33: signal process started / # 7.测试验证 [root@master231 02-lb-nginx-tomcat]# kubectl get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 6d4h svc-lb NodePort 10.200.115.173 <none> 80:30080/TCP 9m14s svc-tomcat01 ClusterIP 10.200.64.189 <none> 8080/TCP 9m14s svc-tomcat02 ClusterIP 10.200.249.153 <none> 8080/TCP 9m14s [root@master231 02-lb-nginx-tomcat]# [root@master231 02-lb-nginx-tomcat]# for i in `seq 10`; do curl 10.200.115.173;done tomcat02 tomcat01 tomcat02 tomcat01 tomcat02 tomcat01 tomcat02 tomcat01 tomcat02 tomcat01 [root@master231 02-lb-nginx-tomcat]# 或者: http://10.0.0.231:30080/ 8.删除pod验证配置是否丢失 [root@master231 02-lb-nginx-tomcat]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-nginx-7847679966-5tl77 1/1 Running 0 2m21s 10.100.2.17 worker233 <none> <none> deploy-tomcat01-6fb8b5888d-bdtq8 1/1 Running 0 4m8s 10.100.1.15 worker232 <none> <none> deploy-tomcat02-694ff867f4-klbch 1/1 Running 0 4m8s 10.100.2.16 worker233 <none> <none> [root@master231 
02-lb-nginx-tomcat]# [root@master231 02-lb-nginx-tomcat]# kubectl delete pods --all pod "deploy-nginx-7847679966-5tl77" deleted pod "deploy-tomcat01-6fb8b5888d-bdtq8" deleted pod "deploy-tomcat02-694ff867f4-klbch" deleted [root@master231 02-lb-nginx-tomcat]# [root@master231 02-lb-nginx-tomcat]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-nginx-7847679966-8hrqc 1/1 Running 0 31s 10.100.1.16 worker232 <none> <none> deploy-tomcat01-6fb8b5888d-qqptg 1/1 Running 0 31s 10.100.2.18 worker233 <none> <none> deploy-tomcat02-694ff867f4-wbh6d 1/1 Running 0 31s 10.100.1.17 worker232 <none> <none> [root@master231 02-lb-nginx-tomcat]# [root@master231 02-lb-nginx-tomcat]# for i in `seq 10`; do curl 10.200.115.173;done tomcat01 tomcat02 tomcat01 tomcat02 tomcat01 tomcat02 tomcat01 tomcat02 tomcat01 tomcat02 [root@master231 02-lb-nginx-tomcat]# [root@master231 02-lb-nginx-tomcat]# kubectl get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 6d4h svc-lb NodePort 10.200.115.173 <none> 80:30080/TCP 11m svc-tomcat01 ClusterIP 10.200.64.189 <none> 8080/TCP 11m svc-tomcat02 ClusterIP 10.200.249.153 <none> 8080/TCP 11m [root@master231 02-lb-nginx-tomcat]# [root@master231 02-lb-nginx-tomcat]#
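需要注意的是,上面第6步是kubectl exec进容器里修改nginx配置,之所以删除Pod后配置不丢,是因为/etc/nginx/conf.d挂载了NFS卷。一个更直接的做法示意如下(假设NFS服务端就在master231上,导出目录即资源清单中挂载给nginx的那个目录,仅为思路演示):

```bash
# 直接把配置文件写到NFS共享目录(该目录被挂载为nginx Pod的/etc/nginx/conf.d)
cat > /yinzhengjie/data/nfs-server/case-demo/lb/nginx/default.conf <<'EOF'
upstream tomcat {
    server svc-tomcat01:8080;
    server svc-tomcat02:8080;
}
server {
    listen       80;
    server_name  localhost;
    location / {
        proxy_pass http://tomcat;
    }
}
EOF

# 滚动重建nginx的Pod,使其启动时就加载NFS上的配置,无需再exec进容器reload
kubectl rollout restart deployment deploy-nginx
```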
2、案例二、基于Service暴露wordpress
bash
课堂练习: 基于Service暴露wordpress
1.使用deploy部署MySQL和WordPress;
2.WordPress关联MySQL时使用svc进行关联;
3.windows能够正常访问wordpress;
4.删除MySQL和wordpress确保数据不丢失。
bash
参考案例: 1.创建工作目录 [root@master231 ~]# mkdir -pv /yinzhengjie/data/nfs-server/case-demo/wordpres/{wp,db} mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/wordpres' mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/wordpres/wp' mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/wordpres/db' [root@master231 ~]# [root@master231 ~]# tree /yinzhengjie/data/nfs-server/case-demo/wordpres /yinzhengjie/data/nfs-server/case-demo/wordpres ├── db └── wp 2 directories, 0 files 2.编写资源清单 [root@master231 03-wordpress]# cat 01-deploy-svc-db-wp.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-db spec: replicas: 1 selector: matchLabels: apps: db template: metadata: labels: apps: db spec: volumes: - name: data nfs: server: 10.0.0.231 path: /yinzhengjie/data/nfs-server/case-demo/wordpres/db containers: - name: db image: harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle env: - name: MYSQL_ROOT_PASSWORD value: "123456" - name: MYSQL_DATABASE value: "wordpress" - name: MYSQL_USER value: weixiang98 - name: MYSQL_PASSWORD value: weixiang volumeMounts: - name: data mountPath: /var/lib/mysql --- apiVersion: v1 kind: Service metadata: name: svc-db spec: type: ClusterIP selector: apps: db ports: - protocol: TCP port: 3306 --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-wp spec: replicas: 1 selector: matchLabels: apps: wp template: metadata: labels: apps: wp spec: volumes: - name: data nfs: server: 10.0.0.231 path: /yinzhengjie/data/nfs-server/case-demo/wordpres/wp containers: - name: wp image: harbor250.weixiang.com/weixiang-wp/wordpress:6.7.1-php8.1-apache env: - name: WORDPRESS_DB_HOST # 这里要指定数据库的svc value: "svc-db" - name: WORDPRESS_DB_NAME value: "wordpress" - name: WORDPRESS_DB_USER value: weixiang98 - name: WORDPRESS_DB_PASSWORD value: weixiang volumeMounts: - name: data mountPath: /var/www/html --- apiVersion: v1 kind: Service metadata: name: svc-wp spec: type: NodePort selector: apps: wp ports: - protocol: TCP port: 80 targetPort: 80 nodePort: 30090 [root@master231 03-wordpress]# [root@master231 03-wordpress]# [root@master231 03-wordpress]# kubectl apply -f 01-deploy-svc-db-wp.yaml deployment.apps/deploy-db created service/svc-db created deployment.apps/deploy-wp created service/svc-wp created [root@master231 03-wordpress]# [root@master231 03-wordpress]# kubectl get deploy,svc,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-db 1/1 1 1 24s db harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle apps=db deployment.apps/deploy-wp 1/1 1 1 24s wp harbor250.weixiang.com/weixiang-wp/wordpress:6.7.1-php8.1-apache apps=wp NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 6d5h <none> service/svc-db ClusterIP 10.200.93.103 <none> 3306/TCP 24s apps=db service/svc-wp NodePort 10.200.141.252 <none> 80:30090/TCP 24s apps=wp NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-db-5559f94ffc-7gjq7 1/1 Running 0 24s 10.100.2.19 worker233 <none> <none> pod/deploy-wp-5b9799fdfb-fz2sg 1/1 Running 0 24s 10.100.1.18 worker232 <none> <none> [root@master231 03-wordpress]# 3.测试验证 http://10.0.0.231:30090/ 4.删除资源 [root@master231 03-wordpress]# kubectl delete -f 01-deploy-svc-db-wp.yaml deployment.apps "deploy-db" deleted service "svc-db" deleted deployment.apps "deploy-wp" deleted service "svc-wp" deleted [root@master231 03-wordpress]#
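部署完成后,可以从WordPress的Pod里验证对svc-db的服务发现和端口连通性(命令基于debian基础镜像里常见的工具,仅为排查思路示意):

```bash
# 验证svc-db能否被CoreDNS解析为ClusterIP
kubectl exec deploy/deploy-wp -- getent hosts svc-db

# 验证3306端口是否可达(利用bash内置的/dev/tcp写法,无需额外安装工具)
kubectl exec deploy/deploy-wp -- bash -c 'echo > /dev/tcp/svc-db/3306 && echo "db is reachable"'
```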
3、Service底层组件kube-proxy工作模式实战
bash
1.Service的底层工作模式 Service底层基于kube-proxy组件实现代理。 而kube-proxy组件支持iptables,ipvs两种工作模式。 2.看kube-proxy的Pod日志查看默认的代理模式 [root@master231 ~]# kubectl get ds -A -o wide NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR kube-flannel kube-flannel-ds 3 3 3 3 3 <none> 6d5h kube-flannel ghcr.io/flannel-io/flannel:v0.27.0 app=flannel,k8s-app=flannel kube-system kube-proxy 3 3 3 3 3 kubernetes.io/os=linux 6d6h kube-proxy registry.aliyuncs.com/google_containers/kube-proxy:v1.23.17 k8s-app=kube-proxy [root@master231 ~]# [root@master231 ~]# [root@master231 ~]# kubectl get pods -A -l k8s-app=kube-proxy -o wide NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES kube-system kube-proxy-588bm 1/1 Running 1 (106m ago) 6d5h 10.0.0.232 worker232 <none> <none> kube-system kube-proxy-9bb67 1/1 Running 1 (106m ago) 6d6h 10.0.0.231 master231 <none> <none> kube-system kube-proxy-n9mv6 1/1 Running 1 (106m ago) 6d5h 10.0.0.233 worker233 <none> <none> [root@master231 ~]# [root@master231 ~]# kubectl -n kube-system logs -f kube-proxy-588bm I0715 06:58:12.048801 1 node.go:163] Successfully retrieved node IP: 10.0.0.232 I0715 06:58:12.048880 1 server_others.go:138] "Detected node IP" address="10.0.0.232" I0715 06:58:12.049031 1 server_others.go:572] "Unknown proxy mode, assuming iptables proxy" proxyMode="" I0715 06:58:12.109240 1 server_others.go:206] "Using iptables Proxier" # 很明显,此处使用了iptables的代理模式 ...

image

bash
3.验证底层的确是基于iptables实现的【了解即可,可读性极差!】 [root@master231 ~]# kubectl get svc -A NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE default kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 6d6h kube-system kube-dns ClusterIP 10.200.0.10 <none> 53/UDP,53/TCP,9153/TCP 6d6h [root@master231 ~]# [root@master231 ~]# [root@master231 ~]# iptables-save | grep 10.200.0.10 -A KUBE-SERVICES -d 10.200.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-SVC-JD5MR3NA4I4DYORP ... [root@master231 ~]# [root@master231 ~]# iptables-save | grep KUBE-SVC-JD5MR3NA4I4DYORP ... -A KUBE-SVC-JD5MR3NA4I4DYORP -m comment --comment "kube-system/kube-dns:metrics" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-FXNZTSPPNYZOSYBX -A KUBE-SVC-JD5MR3NA4I4DYORP -m comment --comment "kube-system/kube-dns:metrics" -j KUBE-SEP-74VI23WBSL7TQRXB [root@master231 ~]# [root@master231 ~]# [root@master231 ~]# iptables-save | grep KUBE-SEP-FXNZTSPPNYZOSYBX ... -A KUBE-SEP-FXNZTSPPNYZOSYBX -p tcp -m comment --comment "kube-system/kube-dns:metrics" -m tcp -j DNAT --to-destination 10.100.0.4:9153 [root@master231 ~]# [root@master231 ~]# iptables-save | grep KUBE-SEP-74VI23WBSL7TQRXB ... -A KUBE-SEP-74VI23WBSL7TQRXB -p tcp -m comment --comment "kube-system/kube-dns:metrics" -m tcp -j DNAT --to-destination 10.100.0.5:9153 [root@master231 ~]# [root@master231 ~]# kubectl -n kube-system describe svc kube-dns Name: kube-dns Namespace: kube-system Labels: k8s-app=kube-dns kubernetes.io/cluster-service=true kubernetes.io/name=CoreDNS Annotations: prometheus.io/port: 9153 prometheus.io/scrape: true Selector: k8s-app=kube-dns Type: ClusterIP IP Family Policy: SingleStack IP Families: IPv4 IP: 10.200.0.10 IPs: 10.200.0.10 Port: dns 53/UDP TargetPort: 53/UDP Endpoints: 10.100.0.4:53,10.100.0.5:53 Port: dns-tcp 53/TCP TargetPort: 53/TCP Endpoints: 10.100.0.4:53,10.100.0.5:53 Port: metrics 9153/TCP TargetPort: 9153/TCP Endpoints: 10.100.0.4:9153,10.100.0.5:9153 Session Affinity: None Events: <none> [root@master231 ~]# [root@master231 ~]# kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES coredns-6d8c4cb4d-k52qr 1/1 Running 1 (110m ago) 6d6h 10.100.0.4 master231 <none> <none> coredns-6d8c4cb4d-rvzd9 1/1 Running 1 (110m ago) 6d6h 10.100.0.5 master231 <none> <none> [root@master231 ~]# 4.修改kube-proxy的代理模式 [root@master231 ~]# kubectl get configmap kube-proxy -n kube-system -o yaml | \ sed -e "s/strictARP: false/strictARP: true/" | \ sed -e 's#mode: ""#mode: "ipvs"#' | \ kubectl apply -f - -n kube-system 5.删除pod使得配置生效 [root@master231 ~]# kubectl get pods -A -l k8s-app=kube-proxy -o wide NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES kube-system kube-proxy-588bm 1/1 Running 1 (115m ago) 6d6h 10.0.0.232 worker232 <none> <none> kube-system kube-proxy-9bb67 1/1 Running 1 (115m ago) 6d6h 10.0.0.231 master231 <none> <none> kube-system kube-proxy-n9mv6 1/1 Running 1 (115m ago) 6d6h 10.0.0.233 worker233 <none> <none> [root@master231 ~]# [root@master231 ~]# kubectl -n kube-system delete pods -l k8s-app=kube-proxy pod "kube-proxy-588bm" deleted pod "kube-proxy-9bb67" deleted pod "kube-proxy-n9mv6" deleted [root@master231 ~]# [root@master231 ~]# kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES kube-proxy-g5sfd 1/1 Running 0 16s 10.0.0.233 worker233 <none> <none> kube-proxy-m6mxj 1/1 Running 0 16s 10.0.0.232 
worker232 <none> <none> kube-proxy-q5bqh 1/1 Running 0 16s 10.0.0.231 master231 <none> <none> [root@master231 ~]# # 中间遇到一个错,配置完后查看一直是iptables # 解决办法:加载IPVS相关模块,所有节点执行 sudo modprobe ip_vs sudo modprobe ip_vs_rr sudo modprobe ip_vs_wrr sudo modprobe ip_vs_sh sudo modprobe nf_conntrack # 检查模块是否加载成功 lsmod | grep ip_vs ... # 创建配置文件 cat <<EOF | sudo tee /etc/modules-load.d/ipvs.conf ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack EOF # 重启系统模块加载服务 sudo systemctl restart systemd-modules-load # 重新删除pod验证生效 [root@master231 ~]# kubectl -n kube-system logs -f kube-proxy-g5sfd I0715 08:54:25.173976 1 node.go:163] Successfully retrieved node IP: 10.0.0.233 I0715 08:54:25.174278 1 server_others.go:138] "Detected node IP" address="10.0.0.233" I0715 08:54:25.194880 1 server_others.go:269] "Using ipvs Proxier" # 很明显,此处的Service其使用的代理模式为ipvs。 I0715 08:54:25.194912 1 server_others.go:271] "Creating dualStackProxier for ipvs"

image

bash
6.验证ipvs的实现逻辑【可读性较好,且性能更强。】 [root@master231 ~]# apt -y install ipvsadm [root@master231 ~]# kubectl get svc -A NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE default kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 6d6h kube-system kube-dns ClusterIP 10.200.0.10 <none> 53/UDP,53/TCP,9153/TCP 6d6h [root@master231 ~]# [root@master231 ~]# ipvsadm -ln | grep 10.200.0.10 -A 2 TCP 10.200.0.10:53 rr -> 10.100.0.4:53 Masq 1 0 0 -> 10.100.0.5:53 Masq 1 0 0 TCP 10.200.0.10:9153 rr -> 10.100.0.4:9153 Masq 1 0 0 -> 10.100.0.5:9153 Masq 1 0 0 UDP 10.200.0.10:53 rr -> 10.100.0.4:53 Masq 1 0 0 -> 10.100.0.5:53 Masq 1 0 0 [root@master231 ~]# [root@master231 ~]# kubectl -n kube-system describe svc kube-dns Name: kube-dns Namespace: kube-system Labels: k8s-app=kube-dns kubernetes.io/cluster-service=true kubernetes.io/name=CoreDNS Annotations: prometheus.io/port: 9153 prometheus.io/scrape: true Selector: k8s-app=kube-dns Type: ClusterIP IP Family Policy: SingleStack IP Families: IPv4 IP: 10.200.0.10 IPs: 10.200.0.10 Port: dns 53/UDP TargetPort: 53/UDP Endpoints: 10.100.0.4:53,10.100.0.5:53 Port: dns-tcp 53/TCP TargetPort: 53/TCP Endpoints: 10.100.0.4:53,10.100.0.5:53 Port: metrics 9153/TCP TargetPort: 9153/TCP Endpoints: 10.100.0.4:9153,10.100.0.5:9153 Session Affinity: None Events: <none> [root@master231 ~]#
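除了看Pod日志,还可以通过kube-proxy暴露的本地metrics端口直接确认当前代理模式,或者只查看某一个虚拟服务的ipvs规则(10249为kube-proxy默认的metrics端口,如有自定义配置请按实际环境调整):

```bash
# 在任意节点本地确认kube-proxy当前的代理模式,预期输出ipvs或iptables
curl -s 127.0.0.1:10249/proxyMode

# 只查看某一个虚拟服务(ClusterIP:Port)的ipvs转发规则
ipvsadm -ln -t 10.200.0.10:53
```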
4、metallb实现LoadBalancer⭐
bash
# 概述: LoadBalancer是Kubernetes Service 的一种类型,主要用于在公有云环境中自动配置外部负载均衡器,将外部流量分发到集群内的Pod。 它是NodePort的扩展,提供了更高级的云平台集成能力。 1.metallb概述 如果我们需要在自己的Kubernetes中暴露LoadBalancer的应用,那么Metallb是一个不错的解决方案。 Metallb官网地址: https://metallb.universe.tf/installation/ https://metallb.universe.tf/configuration/_advanced_bgp_configuration/ 如果想要做替代产品,也可以考虑国内kubesphere开源的OpenELB组件来代替。 参考链接: https://www.cnblogs.com/yinzhengjie/p/18962461 2.部署Metallb 2.1 配置kube-proxy代理模式为ipvs 2.2 导入镜像 http://192.168.21.253/Resources/Kubernetes/Add-ons/metallb/v0.15.2/ 2.3 下载metallb组件的资源清单 [root@master231 metallb]# wget https://raw.githubusercontent.com/metallb/metallb/v0.15.2/config/manifests/metallb-native.yaml SVIP: http://192.168.21.253/Resources/Kubernetes/Add-ons/metallb/v0.15.2/metallb-native.yaml 2.4 部署Metallb [root@master231 metallb]# kubectl apply -f metallb-native.yaml 2.5 创建存储池 [root@master231 metallb]# cat > metallb-ip-pool.yaml <<EOF apiVersion: metallb.io/v1beta1 kind: IPAddressPool metadata: name: jasonyin2020 namespace: metallb-system spec: addresses: # 注意改为你自己为MetalLB分配的IP地址,改地址,建议设置为你windows能够访问的网段。【建议设置你的虚拟机Vmnet8网段】 # - 10.0.0.150-10.0.0.180 - 10.1.12.20-10.1.12.30 --- apiVersion: metallb.io/v1beta1 kind: L2Advertisement metadata: name: yinzhengjie namespace: metallb-system spec: ipAddressPools: - jasonyin2020 EOF [root@master231 metallb]# kubectl apply -f metallb-ip-pool.yaml ipaddresspool.metallb.io/jasonyin2020 created l2advertisement.metallb.io/yinzhengjie created [root@master231 metallb]# [root@master231 metallb]# kubectl get ipaddresspools.metallb.io -A NAMESPACE NAME AUTO ASSIGN AVOID BUGGY IPS ADDRESSES metallb-system jasonyin2020 true false ["10.0.0.150-10.0.0.180"] [root@master231 metallb]# [root@master231 metallb]# 2.6 创建LoadBalancer的Service测试验证 [root@master231 04-xiuxian-loadBalancer]# cat >deploy-svc-LoadBalancer.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxian spec: type: LoadBalancer selector: apps: v1 ports: - port: 80 EOF [root@master231 04-xiuxian-loadBalancer]# [root@master231 04-xiuxian-loadBalancer]# kubectl apply -f deploy-svc-LoadBalancer.yaml deployment.apps/deploy-xiuxian created service/svc-xiuxian created [root@master231 04-xiuxian-loadBalancer]# [root@master231 04-xiuxian-loadBalancer]# kubectl get svc,po -o wide NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 6d6h <none> service/svc-xiuxian LoadBalancer 10.200.225.218 10.0.0.150 80:30426/TCP 6s apps=v1 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-xiuxian-7b574d64b-hb44z 1/1 Running 0 6s 10.100.1.24 worker232 <none> <none> pod/deploy-xiuxian-7b574d64b-kw96j 1/1 Running 0 6s 10.100.2.25 worker233 <none> <none> pod/deploy-xiuxian-7b574d64b-vtvbz 1/1 Running 0 6s 10.100.1.25 worker232 <none> <none> [root@master231 04-xiuxian-loadBalancer]# curl 10.0.0.150 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 04-xiuxian-loadBalancer]#
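如果LoadBalancer类型Service的EXTERNAL-IP一直处于pending状态,可以按下面的思路快速排查(命令均为对前文已部署组件的常规检查,地址池名称以自己创建的为准):

```bash
# 1.确认metallb的controller和speaker都处于Running状态
kubectl -n metallb-system get pods -o wide

# 2.确认地址池和二层通告资源已创建
kubectl -n metallb-system get ipaddresspools,l2advertisements

# 3.查看Service事件,正常会出现IPAllocated/nodeAssigned等记录
kubectl describe svc svc-xiuxian | grep -A 5 Events

# 4.取出已分配的外部IP
kubectl get svc svc-xiuxian -o jsonpath='{.status.loadBalancer.ingress[0].ip}{"\n"}'
```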

image

5、存储卷之DownwardAPI实战
bash
1.DownwardAPI DownwardAPI自身并非一种独立的API资源类型。 DownwardAPI只是一种将Pod的metadata、spec或status中的字段值注入到其内部Container里的方式。 DownwardAPI提供了两种方式用于将POD的信息注入到容器内部 - 环境变量: 用于单个变量,可以将POD信息和容器信息直接注入容器内部. - Volume挂载: 将 POD 信息生成为文件,直接挂载到容器内部中去 2.环境变量方式使用DownwardAPI 2.1 参数说明 fieldRef有效值: - metadata.name - metadata.namespace, - `metadata.labels['<KEY>']` - `metadata.annotations['<KEY>']` - spec.nodeName - spec.serviceAccountName - status.hostIP - status.podIP - status.podIPs resourceFieldRef有效值: - limits.cpu - limits.memory - limits.ephemeral-storage - requests.cpu - requests.memory - requests.ephemeral-storage 参考示例: [root@master231 10-downwardAPI]# kubectl explain po.spec.containers.env.valueFrom 2.2 实战案例 [root@master231 10-downwardAPI]# cat 01-deploy-downwardAPI-env.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian-env spec: replicas: 3 # 指定要运行3个Pod副本 selector: matchLabels: # 定义 Deployment 如何找到要管理的 Pod(匹配标签 apps: v1) apps: v1 template: # 指定pod模板 metadata: labels: apps: v1 spec: # Deployment 规格 volumes: - name: data # 定义一个名为data的临时卷(emptyDir),生命周期与Pod相同 emptyDir: {} # 定义空目录 initContainers: # 定义初始化容器,在主容器启动前运行 - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 env: - name: PodNAME # 定义变量的名称 valueFrom: # 定义值从哪里来 fieldRef: # 表示值从一个字段来 fieldPath: metadata.name # 指定字段的路径 - name: PodIP # 定义变量的名称 valueFrom: # 定义值从哪里来 fieldRef: # 表示值从一个字段来 fieldPath: status.podIP # 指定字段的路径 - name: PodNS valueFrom: fieldRef: fieldPath: metadata.namespace volumeMounts: # 挂载data卷到 /data 目录 - name: data mountPath: /data command: - /bin/sh - -c # 允许你直接在命令行中指定要执行的命令。 执行命令,将Pod信息写入/data/index.html 文件 - 'echo "Pod_Name: $PodNAME, Pod_IP: $PodIP, Pod_Namespace: $PodNS" > /data/index.html' containers: - name: c2 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 volumeMounts: - name: data mountPath: /usr/share/nginx/html 整体工作流程 这个 Deployment 创建了 3 个 Pod 副本,每个 Pod 包含: 一个初始化容器 一个主容器 一个共享的临时卷 [root@master231 10-downwardAPI]# [root@master231 10-downwardAPI]# kubectl apply -f 01-deploy-downwardAPI-env.yaml deployment.apps/deploy-xiuxian-env created [root@master231 10-downwardAPI]# [root@master231 10-downwardAPI]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-env-f79845c84-2c2n4 1/1 Running 0 3s 10.100.1.39 worker232 <none> <none> deploy-xiuxian-env-f79845c84-jg84z 1/1 Running 0 3s 10.100.1.40 worker232 <none> <none> deploy-xiuxian-env-f79845c84-qktv9 1/1 Running 0 3s 10.100.2.36 worker233 <none> <none> [root@master231 10-downwardAPI]# [root@master231 10-downwardAPI]# curl 10.100.1.39 Pod_Name: deploy-xiuxian-env-f79845c84-2c2n4, Pod_IP: 10.100.1.39, Pod_Namespace: default [root@master231 10-downwardAPI]# [root@master231 10-downwardAPI]# curl 10.100.1.40 Pod_Name: deploy-xiuxian-env-f79845c84-jg84z, Pod_IP: 10.100.1.40, Pod_Namespace: default [root@master231 10-downwardAPI]# [root@master231 10-downwardAPI]# curl 10.100.2.36 Pod_Name: deploy-xiuxian-env-f79845c84-qktv9, Pod_IP: 10.100.2.36, Pod_Namespace: default [root@master231 10-downwardAPI]#
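上面演示的是环境变量方式,下面补充一个Volume挂载方式的最小示意,把Pod的标签和容器内存限制以文件形式注入容器(文件名、Pod名称为演示用的假设,镜像沿用前文):

```bash
# downwardAPI卷方式的最小示意(假设文件名为02-pod-downwardAPI-volume.yaml)
cat > 02-pod-downwardAPI-volume.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: xiuxian-downwardapi-volume
  labels:
    apps: v1
spec:
  volumes:
  - name: podinfo
    downwardAPI:
      items:
      - path: "labels"                # 将Pod的标签写入/etc/podinfo/labels
        fieldRef:
          fieldPath: metadata.labels
      - path: "mem_limit"             # 将容器的内存限制写入/etc/podinfo/mem_limit
        resourceFieldRef:
          containerName: c1
          resource: limits.memory
  containers:
  - name: c1
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3
    resources:
      limits:
        memory: 128Mi
    volumeMounts:
    - name: podinfo
      mountPath: /etc/podinfo
EOF

kubectl apply -f 02-pod-downwardAPI-volume.yaml
kubectl exec xiuxian-downwardapi-volume -- cat /etc/podinfo/labels /etc/podinfo/mem_limit
```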
6、ExternalName⭐
bash
1.ExternalName简介
ExternalName的主要作用就是将K8S集群外部的服务映射到K8S集群内部。
ExternalName类型的Service是没有CLUSTER-IP地址的。
ExternalName类型的Service本质上是在集群DNS(CoreDNS)中添加一条CNAME记录,它的生效范围是整个集群的DNS查询,不依赖Pod标签或选择器。

image

bash
2.实战案例 2.1 创建svc [root@master231 09-service]# cat 03-deploy-svc-ExternalName.yaml apiVersion: v1 kind: Service metadata: name: svc-blog # Service 名称 spec: type: ExternalName # 关键:Service 类型为 ExternalName externalName: baidu.com # 指向的外部域名,如果在公司内部,没有dns解析的域名,则需要配置coreDNS的A记录 # 功能说明: 1、它不会创建任何实际的负载均衡或代理,而是通过DNS CNAME记录将集群内的服务名称(svc-blog)映射到外部域名(baidu.com) 2、当集群内的Pod访问svc-blog时,DNS查询会直接返回baidu.com的解析结果。 [root@master231 09-service]# kubectl apply -f 03-deploy-svc-ExternalName.yaml service/svc-blog created [root@master231 09-service]# [root@master231 09-service]# kubectl get -f 03-deploy-svc-ExternalName.yaml NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc-blog ExternalName <none> baidu.com <none> 3s [root@master231 09-service]# [root@master231 09-service]# [root@master231 09-service]# kubectl get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 6d23h svc-blog ExternalName <none> baidu.com <none> 9s svc-xiuxian LoadBalancer 10.200.225.218 10.0.0.150 80:30426/TCP 16h [root@master231 09-service]# 2.2 测试验证 [root@master231 09-service]# kubectl get pods NAME READY STATUS RESTARTS AGE deploy-xiuxian-env-f79845c84-2c2n4 1/1 Running 0 34m deploy-xiuxian-env-f79845c84-jg84z 1/1 Running 0 34m deploy-xiuxian-env-f79845c84-qktv9 1/1 Running 0 34m [root@master231 09-service]# [root@master231 09-service]# kubectl exec -it deploy-xiuxian-env-f79845c84-2c2n4 -- sh Defaulted container "c2" out of: c2, c1 (init) / # / # nslookup -type=a svc-blog Server: 10.200.0.10 Address: 10.200.0.10:53 ** server can't find svc-blog.weixiang.com: NXDOMAIN ** server can't find svc-blog.svc.weixiang.com: NXDOMAIN svc-blog.default.svc.weixiang.com canonical name = baidu.com Name: baidu.com Address: 182.61.201.211 Name: baidu.com Address: 182.61.244.181 / # / # ping svc-blog -c 3 PING svc-blog (182.61.201.211): 56 data bytes 64 bytes from 182.61.201.211: seq=0 ttl=127 time=5.941 ms 64 bytes from 182.61.201.211: seq=1 ttl=127 time=6.429 ms 64 bytes from 182.61.201.211: seq=2 ttl=127 time=7.593 ms --- svc-blog ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 5.941/6.654/7.593 ms / #
7、coreDNS
bash
1.coreDNS组件概述 所谓的coreDNS其实就是K8S的DNS解决方案之一,早期的DNS解决方案实施skyDNS组件。 coreDNS组件在k8s 1.9+版本才被纳入K8S官方使用的DNS。kubeadm部署的方式默认就内置了该组件。 CoreDNS的作用有以下几点: - 1.可以将svc的名称解析为CLusterIP; - 2.可以为Pod调度提供流量的负载均衡; - 3.可以配置内网的DNS解析记录(自定义域名解析); 2.验证CoreDNS组件 2.1 查看CoreDNS相关的Pod [root@master231 ~]# kubectl get deploy -n kube-system -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR coredns 2/2 2 2 6d23h coredns registry.aliyuncs.com/google_containers/coredns:v1.8.6 k8s-app=kube-dns [root@master231 ~]# [root@master231 ~]# kubectl -n kube-system get pods -o wide -l k8s-app=kube-dns NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES coredns-6d8c4cb4d-k52qr 1/1 Running 3 (124m ago) 6d23h 10.100.0.9 master231 <none> <none> coredns-6d8c4cb4d-rvzd9 1/1 Running 3 (124m ago) 6d23h 10.100.0.8 master231 <none> <none> [root@master231 ~]# 2.2 查看CoreDNS的Service [root@master231 ~]# kubectl -n kube-system get svc # 基本资源定义 apiVersion: v1 kind: Service metadata: name: kube-dns # Service 名称,集群内DNS服务的固定名称 namespace: kube-system # 部署在kube-system命名空间 labels: k8s-app: kube-dns # 标识为Kubernetes DNS组件 kubernetes.io/cluster-service: "true" # 标记为集群核心服务 kubernetes.io/name: CoreDNS # 实际使用的DNS实现为CoreDNS annotations: prometheus.io/port: "9153" # Prometheus监控指标的暴露端口 prometheus.io/scrape: "true" # 允许Prometheus自动抓取指标 spec: selector: k8s-app: kube-dns # 选择器,关联到CoreDNS的Pod(通过此标签选择后端Pod) type: ClusterIP # 服务类型,仅集群内部可访问 # IP配置(Kubernetes 1.20+特性) ipFamilyPolicy: SingleStack # 单协议栈(仅IPv4) ipFamilies: [IPv4] # 使用的IP协议族 clusterIP: 10.200.0.10 # 固定的集群内部VIP,所有Pod的DNS请求会发往此地址 ports: - name: dns # UDP端口,用于标准DNS查询 port: 53 # 服务暴露端口 protocol: UDP # DNS协议通常优先使用UDP targetPort: 53 # 容器内CoreDNS实际监听端口 - name: dns-tcp # TCP端口,用于大型DNS查询(当UDP报文过大时回退) port: 53 protocol: TCP targetPort: 53 - name: metrics # Prometheus监控指标端口 port: 9153 protocol: TCP targetPort: 9153 # 后端Endpoint列表(由Endpoints Controller自动维护) endpoints: - 10.100.0.8:53 # 实际运行CoreDNS的Pod IP+端口 - 10.100.0.9:53 # 通常有多个副本实现高可用 [root@master231 ~]# 2.3 验证CoreDNS组件解析功能 [root@master231 ~]# kubectl get svc -A NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE default kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 7d default svc-blog ExternalName <none> baidu.com <none> 15m default svc-xiuxian LoadBalancer 10.200.225.218 10.0.0.150 80:30426/TCP 17h kube-system kube-dns ClusterIP 10.200.0.10 <none> 53/UDP,53/TCP,9153/TCP 7d metallb-system metallb-webhook-service ClusterIP 10.200.251.95 <none> 443/TCP 17h [root@master231 ~]# # 查看dns是哪个解析服务器的 [root@master231 ~/count/09-service]#kubectl exec -it deploy-xiuxian-7b574d64b-72tbq -- sh / # cat /etc/resolv.conf nameserver 10.200.0.10 search default.svc.weixiang.com svc.weixiang.com weixiang.com options ndots:5 / # [root@master231 ~]# dig @10.200.0.10 kubernetes.default.svc.weixiang.com +short 10.200.0.1 # dig:DNS 查询工具 # @10.200.0.10:指定 DNS 服务器的 IP 地址(这里是 Kubernetes 集群的 DNS 服务 IP,通常是 kube-dns 或 CoreDNS 的 ClusterIP) # kubernetes.default.svc.weixiang.com:要查询的域名(格式为 <服务名>.<命名空间>.svc.<集群域名>)。 # +short:简化输出,只返回结果(不显示详细调试信息) [root@master231 ~]# dig @10.200.0.10 svc-blog.default.svc.weixiang.com +short baidu.com. 
182.61.201.211 182.61.244.181 # svc-blog.default.svc.weixiang.com:要查询的域名 [root@master231 ~]# dig @10.200.0.10 svc-xiuxian.default.svc.weixiang.com +short 10.200.225.218 [root@master231 ~]# [root@master231 ~]# dig @10.200.0.10 metallb-webhook-service.metallb-system.svc.weixiang.com +short 10.200.251.95 [root@master231 ~]# [root@master231 ~]# dig @10.200.0.10 kube-dns.kube-system.svc.weixiang.com +short 10.200.0.10 [root@master231 ~]# [root@master231 ~]# DNS解析的A记录格式: <Service_NAME>.<NAMESPACE>.svc.weixiang.com 温馨提示: 工作中可能不是weixiang.com,而是取决于你工作中初始化指定的域名信息,kubeadm部署方式可以通过'--service-dns-domain'实现。 若没有自定义域名,则默认值为"cluster.local"
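前面提到CoreDNS还可以配置内网自定义域名解析,下面是一个通过hosts插件添加A记录的最小示意(做法是编辑kube-system下的coredns ConfigMap,示例中的域名/IP取自前文的harbor主机,请按自己环境替换):

```bash
# 编辑CoreDNS的Corefile,在 ".:53 { ... }" 中kubernetes插件之前加入hosts块
kubectl -n kube-system edit configmap coredns

#    hosts {
#        10.0.0.250 harbor250.weixiang.com
#        fallthrough
#    }

# 重建CoreDNS的Pod使配置生效,然后验证解析
kubectl -n kube-system rollout restart deployment coredns
dig @10.200.0.10 harbor250.weixiang.com +short
```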
8、Kubernetes Service 类型对比表
| 特性 | ClusterIP | NodePort | LoadBalancer | ExternalName |
| --- | --- | --- | --- | --- |
| IP 分配 | 分配集群内部虚拟 IP (VIP) | 分配 ClusterIP + 节点端口 (30000-32767) | 分配 ClusterIP + NodePort + 云平台外部 LB IP | 无 IP,仅 DNS CNAME 记录 |
| 访问范围 | 仅集群内部 | 集群内部 + 节点IP外部访问 | 集群内部 + 外部负载均衡器 IP | 集群内部通过 DNS 解析外部服务 |
| DNS 行为 | 返回 ClusterIP (A 记录) | 返回 ClusterIP (A 记录) | 返回 LB IP (A 记录) | 返回外部域名 (CNAME 记录) |
| 流量代理目标 | 代理到匹配标签的 Pod | 代理到匹配标签的 Pod | 代理到匹配标签的 Pod | 不代理,直接跳转到外部域名 |
| 是否依赖 selector | 是(通常) | 是(通常) | 是(通常) | 否 |
| 典型 YAML 片段 | `type: ClusterIP` + `selector: {app: my-app}` | `type: NodePort` + `nodePort: 30080` | `type: LoadBalancer`(可加云厂商 annotations) | `type: ExternalName` + `externalName: foo.bar.com` |
| 适用场景 | 服务间内部通信;前端访问后端服务 | 开发测试环境;非云环境临时暴露服务 | 生产环境(云平台);高可用外部访问 | 集成外部服务到集群 DNS;环境隔离 |
| 性能与扩展性 | 低延迟,无外部跳转 | 需经过节点网络,性能中等 | 依赖云 LB,支持自动扩展 | 依赖外部服务性能 |
| 安全控制 | 通过 NetworkPolicy 限制 Pod 间访问 | 暴露节点端口,需额外防火墙规则 | 云平台集成安全组/ACL | 无内置安全机制 |
| 云平台依赖 | 无关 | 无关 | 强依赖(AWS/GCP/Azure 等) | 无关 |
| 本地环境支持 | 完全支持 | 完全支持 | 需 MetalLB 等工具模拟 | 完全支持 |
| 成本 | 无额外成本 | 无额外成本 | 云 LB 按小时计费 | 无额外成本 |
10、hostNetwork指定dnsPolicy解析策略
bash
1.验证对比hostNetwork的DNS解析策略测试 [root@master231 01-pods]# cat 23-pods-hostNetwork-dnsPolicy.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-v1 labels: apps: v1 spec: hostNetwork: true containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 01-pods]# [root@master231 01-pods]# kubectl apply -f 23-pods-hostNetwork-dnsPolicy.yaml pod/xiuxian-v1 created [root@master231 01-pods]# [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-v1 1/1 Running 0 2s 10.0.0.233 worker233 <none> <none> [root@master231 01-pods]# [root@master231 01-pods]# kubectl get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 7d svc-blog ExternalName <none> baidu.com <none> 23m svc-xiuxian LoadBalancer 10.200.225.218 10.0.0.150 80:30426/TCP 17h [root@master231 01-pods]# [root@master231 01-pods]# [root@master231 01-pods]# kubectl exec -it deploy-xiuxian-env-f79845c84-5nxcb -- sh Defaulted container "c2" out of: c2, c1 (init) / # ping svc-blog -c 3 PING svc-blog (182.61.201.211): 56 data bytes 64 bytes from 182.61.201.211: seq=0 ttl=127 time=6.966 ms 64 bytes from 182.61.201.211: seq=1 ttl=127 time=5.816 ms 64 bytes from 182.61.201.211: seq=2 ttl=127 time=5.841 ms --- svc-blog ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 5.816/6.207/6.966 ms / # / # / # cat /etc/resolv.conf nameserver 10.200.0.10 search default.svc.weixiang.com svc.weixiang.com weixiang.com options ndots:5 / # / # ping svc-blog.default.svc.weixiang.com -c 1 PING svc-blog.default.svc.weixiang.com (182.61.244.181): 56 data bytes 64 bytes from 182.61.244.181: seq=0 ttl=127 time=23.145 ms # 直接查询完整域名svc-blog.default.svc.weixiang.com --- svc-blog.default.svc.weixiang.com ping statistics --- 1 packets transmitted, 1 packets received, 0% packet loss round-trip min/avg/max = 23.145/23.145/23.145 ms / # / # ping svc-blog.default.svc.weixiang -c 1 ping: bad address 'svc-blog.default.svc.weixiang' / # / # ping svc-blog.default.svc -c 1 PING svc-blog.default.svc (182.61.244.181): 56 data bytes 64 bytes from 182.61.244.181: seq=0 ttl=127 time=22.971 ms --- svc-blog.default.svc ping statistics --- 1 packets transmitted, 1 packets received, 0% packet loss round-trip min/avg/max = 22.971/22.971/22.971 ms / # [root@master231 01-pods]# kubectl exec -it xiuxian-v1 -- sh / # ping svc-blog -c 3 ping: bad address 'svc-blog' / # / # cat /etc/resolv.conf nameserver 223.5.5.5 nameserver 223.6.6.6 search / # 2.指定DNS的策略 [root@master231 01-pods]# kubectl delete -f 23-pods-hostNetwork-dnsPolicy.yaml pod "xiuxian-v1" deleted [root@master231 01-pods]# [root@master231 01-pods]# cat 23-pods-hostNetwork-dnsPolicy.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-v1 labels: apps: v1 spec: hostNetwork: true # 指定DNS的解析策略,DNS解析优先使用K8S集群的CoreDNS组件。 dnsPolicy: ClusterFirstWithHostNet containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 01-pods]# [root@master231 01-pods]# kubectl apply -f 23-pods-hostNetwork-dnsPolicy.yaml pod/xiuxian-v1 created [root@master231 01-pods]# [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-env-f79845c84-5nxcb 1/1 Running 0 8m31s 10.100.2.37 worker233 <none> <none> deploy-xiuxian-env-f79845c84-lbj5m 1/1 Running 0 8m31s 10.100.1.42 worker232 <none> <none> deploy-xiuxian-env-f79845c84-xxgfn 1/1 Running 0 8m31s 10.100.1.41 
worker232 <none> <none> xiuxian-v1 1/1 Running 0 6s 10.0.0.233 worker233 <none> <none> [root@master231 01-pods]# [root@master231 01-pods]# kubectl exec -it xiuxian-v1 -- sh / # cat /etc/resolv.conf nameserver 10.200.0.10 # Kubernetes DNS 服务(CoreDNS)的 ClusterIP search default.svc.weixiang.com svc.weixiang.com weixiang.com # 搜索域 options ndots:5 # 当查询的域名包含至少 5 个点时才直接查询,否则依次尝试搜索域 / # / # ping svc-blog -c 3 PING svc-blog (182.61.201.211): 56 data bytes 64 bytes from 182.61.201.211: seq=0 ttl=128 time=5.676 ms 64 bytes from 182.61.201.211: seq=1 ttl=128 time=6.173 ms 64 bytes from 182.61.201.211: seq=2 ttl=128 time=8.042 ms --- svc-blog ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 5.676/6.630/8.042 ms / # 温馨提示: 如果你既不向使用CoreDNS服务器,也不想使用公网的DNS服务器,而是使用你自己搭建的DNS服务器。 可以参考配置: "kubectl explain po.spec.dnsConfig"
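顺着上面"kubectl explain po.spec.dnsConfig"的提示,补充一个使用自建DNS服务器的dnsConfig最小示意(其中10.0.0.254为假设的内网DNS地址,文件名也为演示用的假设):

```bash
cat > 24-pods-dnsConfig.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: xiuxian-dnsconfig
spec:
  # None表示完全忽略K8S默认生成的DNS配置,只使用下面dnsConfig中的内容
  dnsPolicy: None
  dnsConfig:
    nameservers:
    - 10.0.0.254            # 假设的自建DNS服务器地址
    searches:
    - weixiang.com
    options:
    - name: ndots
      value: "2"
  containers:
  - name: c1
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1
EOF

kubectl apply -f 24-pods-dnsConfig.yaml
kubectl exec xiuxian-dnsconfig -- cat /etc/resolv.conf
```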
11、endpoints端点服务
bash
1.什么是endpoints 所谓的endpoints简称为ep,除了ExternalName外的其他svc类型,每个svc都会关联一个ep资源。 当删除Service资源时,会自动删除与Service同名称的endpoints资源。 如果想要映射k8s集群外部的服务,可以先定义一个ep资源,而后再创建一个同名称的svc资源即可。 # 用于记录 Service 背后实际 Pod 的 IP 和端口列表 1、服务发现,当 Pod 发生扩缩容、重启或迁移时,Endpoints 自动更新 2、Service 的 ClusterIP 本身不承载流量,实际流量由 kube-proxy 通过 Endpoints 列表转发到具体 Pod 3、支持负载均衡(默认轮询) 2.验证svc关联相应的ep资源 2.1 查看svc和ep的关联性 [root@master231 01-pods]# kubectl get svc,ep NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 7d1h service/svc-blog ExternalName <none> baidu.com <none> 75m service/svc-xiuxian NodePort 10.200.225.218 <none> 88:30080/TCP 18h service/svc-xiuxian-lb LoadBalancer 10.200.132.29 10.0.0.150 88:32116/TCP 9s NAME ENDPOINTS AGE endpoints/kubernetes 10.0.0.231:6443 7d1h endpoints/svc-xiuxian 10.0.0.233:80,10.100.1.41:80,10.100.1.42:80 + 4 more... 18h endpoints/svc-xiuxian-lb 10.0.0.233:80,10.100.1.41:80,10.100.1.42:80 + 4 more... 9s [root@master231 01-pods]# [root@master231 01-pods]# kubectl describe svc kubernetes Name: kubernetes Namespace: default Labels: component=apiserver provider=kubernetes Annotations: <none> Selector: <none> Type: ClusterIP IP Family Policy: SingleStack IP Families: IPv4 IP: 10.200.0.1 IPs: 10.200.0.1 Port: https 443/TCP TargetPort: 6443/TCP Endpoints: 10.0.0.231:6443 Session Affinity: None Events: <none> [root@master231 01-pods]# 2.2 删除svc时会自动删除同名称的ep资源 [root@master231 01-pods]# kubectl get svc,ep NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 7d1h service/svc-blog ExternalName <none> baidu.com <none> 76m service/svc-xiuxian NodePort 10.200.225.218 <none> 88:30080/TCP 18h service/svc-xiuxian-lb LoadBalancer 10.200.132.29 10.0.0.150 88:32116/TCP 89s NAME ENDPOINTS AGE endpoints/kubernetes 10.0.0.231:6443 7d1h endpoints/svc-xiuxian 10.0.0.233:80,10.100.1.41:80,10.100.1.42:80 + 4 more... 18h endpoints/svc-xiuxian-lb 10.0.0.233:80,10.100.1.41:80,10.100.1.42:80 + 4 more... 89s [root@master231 01-pods]# [root@master231 01-pods]# kubectl delete svc svc-xiuxian service "svc-xiuxian" deleted [root@master231 01-pods]# [root@master231 01-pods]# kubectl get svc,ep NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 7d1h service/svc-blog ExternalName <none> baidu.com <none> 76m service/svc-xiuxian-lb LoadBalancer 10.200.132.29 10.0.0.150 88:32116/TCP 98s NAME ENDPOINTS AGE endpoints/kubernetes 10.0.0.231:6443 7d1h endpoints/svc-xiuxian-lb 10.0.0.233:80,10.100.1.41:80,10.100.1.42:80 + 4 more... 
98s [root@master231 01-pods]# 3.endpoint实战案例 3.1 在K8S集群外部部署MySQL数据库 [root@harbor250.weixiang.com ~]# docker run --network host -d --name mysql-server -e MYSQL_ALLOW_EMPTY_PASSWORD="yes" -e MYSQL_DATABASE=wordpress -e MYSQL_USER=weixiang98 -e MYSQL_PASSWORD=weixiang harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle [root@harbor250.weixiang.com ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES b96ca94105e7 harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle "docker-entrypoint.s…" 29 seconds ago Up 28 seconds mysql-server [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# ss -ntl | grep 3306 LISTEN 0 70 *:33060 *:* LISTEN 0 151 *:3306 *:* [root@harbor250.weixiang.com ~]# 3.2 k8s集群内部部署wordpress [root@master231 11-endpoints]# cat 01-deploy-svc-ep-wordpress.yaml apiVersion: v1 kind: Endpoints metadata: name: svc-db # 与 Service 同名 subsets: - addresses: - ip: 10.0.0.250 # 外部数据库的真实 IP ports: - port: 3306 --- apiVersion: v1 kind: Service metadata: name: svc-db # 与 Endpoints 同名 spec: type: ClusterIP ports: - protocol: TCP port: 3306 --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-wp spec: replicas: 1 selector: matchLabels: apps: wp template: metadata: labels: apps: wp spec: volumes: - name: data nfs: server: 10.0.0.231 path: /yinzhengjie/data/nfs-server/case-demo/wordpres/wp containers: - name: wp image: harbor250.weixiang.com/weixiang-wp/wordpress:6.7.1-php8.1-apache env: - name: WORDPRESS_DB_HOST value: "svc-db" - name: WORDPRESS_DB_NAME value: "wordpress" - name: WORDPRESS_DB_USER value: weixiang98 - name: WORDPRESS_DB_PASSWORD value: weixiang volumeMounts: - name: data mountPath: /var/www/html --- apiVersion: v1 kind: Service metadata: name: svc-wp spec: type: NodePort selector: apps: wp ports: - protocol: TCP port: 80 targetPort: 80 nodePort: 30090 [root@master231 11-endpoints]# [root@master231 11-endpoints]# kubectl apply -f 01-deploy-svc-ep-wordpress.yaml endpoints/svc-db created service/svc-db created deployment.apps/deploy-wp created service/svc-wp created [root@master231 11-endpoints]# [root@master231 11-endpoints]# kubectl get svc svc-db svc-wp NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc-db ClusterIP 10.200.58.173 <none> 3306/TCP 99s svc-wp NodePort 10.200.7.64 <none> 80:30090/TCP 99s [root@master231 11-endpoints]# [root@master231 11-endpoints]# kubectl get pods -o wide -l apps=wp NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-wp-5b9799fdfb-9gxnq 1/1 Running 0 110s 10.100.1.45 worker232 <none> <none> [root@master231 11-endpoints]# 3.3 访问测试 http://10.0.0.231:30090/ 最后一定要验证下数据库是否有数据 [root@harbor250.weixiang.com ~]# docker exec -it mysql-server mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 8 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. 
mysql> mysql> SHOW DATABASES; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | | wordpress | +--------------------+ 5 rows in set (0.00 sec) mysql> USE wordpress; Database changed mysql> mysql> SHOW TABLES; Empty set (0.00 sec) mysql> mysql> mysql> SHOW TABLES; +-----------------------+ | Tables_in_wordpress | +-----------------------+ | wp_commentmeta | | wp_comments | | wp_links | | wp_options | | wp_postmeta | | wp_posts | | wp_term_relationships | | wp_term_taxonomy | | wp_termmeta | | wp_terms | | wp_usermeta | | wp_users | +-----------------------+ 12 rows in set (0.00 sec) mysql> mysql> # Endpoints 的核心作用: 1.它是 Service 的“地址簿”和“联络清单” 📖📞:告诉 Service 应该把收到的请求(网络流量)具体转发给哪个真实的服务器(IP)和哪 个具体的入口(Port)。 2.连接抽象和现实 🌉:Service 提供了一个好记的名字(比如 svc-db)作为访问入口(抽象层)。Endpoints 则把这个名字背后真正干 活的服务器的实际位置(IP:Port)(现实层)告诉 Service。 3.解耦的关键:你的WordPress程序只需要知道找svc-db这个Service 就能访问数据库。它完全不需要关心数据库到底是在Kubernetes 集群内部的一个 Pod 里,还是在集群外部的一台物理机/虚拟机(10.0.0.250)上。这个“具体在哪”的信息,就是由 Endpoints 悄悄提供给 Service 的。
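补充一个验证思路(示意,沿用上文的 svc-db 与 deploy-wp,命令仅供参考):svc-db 没有定义 selector,kube-proxy 的转发目标完全来自与其同名的 Endpoints,可以按下面的方式确认两者的关联关系。
bash
# 1.确认 svc-db 没有 selector(输出应为空)
kubectl get svc svc-db -o jsonpath='{.spec.selector}'; echo

# 2.查看与 Service 同名的 Endpoints,其中记录的是外部数据库的真实地址 10.0.0.250:3306
kubectl get ep svc-db -o wide

# 3.在集群内的 Pod 中通过 Service 名称解析外部数据库(若镜像内没有 getent,可换用其他工具)
kubectl exec -it deploy/deploy-wp -- sh -c 'getent hosts svc-db'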
12、kubeadm底层使用了静态Pod技术
bash
1.什么是静态Pod 所谓的静态Pod,指的是kubelet加载一个配置目录,识别Pod类型的资源清单,从而自动创建pod的一种技术。 2.查看静态pod目录 [root@master231 11-endpoints]# grep staticPodPath /var/lib/kubelet/config.yaml staticPodPath: /etc/kubernetes/manifests [root@master231 11-endpoints]# [root@master231 11-endpoints]# ll /etc/kubernetes/manifests/ total 24 drwxr-xr-x 2 root root 4096 Jul 9 10:40 ./ drwxr-xr-x 4 root root 4096 Jul 9 10:40 ../ -rw------- 1 root root 2280 Jul 9 10:40 etcd.yaml -rw------- 1 root root 4025 Jul 9 10:40 kube-apiserver.yaml -rw------- 1 root root 3546 Jul 9 10:40 kube-controller-manager.yaml -rw------- 1 root root 1465 Jul 9 10:40 kube-scheduler.yaml [root@master231 11-endpoints]# [root@master231 11-endpoints]# 3.拷贝Pod资源清单到相应的目录 [root@master231 01-pods]# cp 01-pods-xiuxian.yaml /etc/kubernetes/manifests/ [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-master231 1/1 Running 0 11s 10.100.0.10 master231 <none> <none> [root@master231 01-pods]# [root@master231 01-pods]# scp 01-pods-xiuxian.yaml 10.0.0.232:/etc/kubernetes/manifests/ [root@master231 01-pods]# scp 01-pods-xiuxian.yaml 10.0.0.233:/etc/kubernetes/manifests/ [root@master231 01-pods]# [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-master231 1/1 Running 0 60s 10.100.0.10 master231 <none> <none> xiuxian-worker232 1/1 Running 0 6s 10.100.1.46 worker232 <none> <none> xiuxian-worker233 1/1 Running 0 1s 10.100.2.39 worker233 <none> <none> [root@master231 01-pods]# 4.移动走资源清单 [root@worker233 ~]# mv /etc/kubernetes/manifests/01-pods-xiuxian.yaml /mnt/ [root@worker233 ~]# [root@worker233 ~]# ll /etc/kubernetes/manifests/ total 8 drwxr-xr-x 2 root root 4096 Jul 16 14:36 ./ drwxr-xr-x 4 root root 4096 Jul 9 10:48 ../ [root@worker233 ~]# [root@worker233 ~]# 5.发现Pod被删除 [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-master231 1/1 Running 0 3m38s 10.100.0.10 master231 <none> <none> xiuxian-worker232 1/1 Running 0 3m 10.100.1.47 worker232 <none> <none> [root@master231 01-pods]# 6.删除pod发现会自动创建 [root@master231 01-pods]# kubectl delete pods --all pod "xiuxian-master231" deleted pod "xiuxian-worker232" deleted [root@master231 01-pods]# [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-master231 0/1 Pending 0 2s <none> master231 <none> <none> xiuxian-worker232 0/1 Pending 0 2s <none> worker232 <none> <none> [root@master231 01-pods]# [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-master231 1/1 Running 0 4s 10.100.0.10 master231 <none> <none> xiuxian-worker232 1/1 Running 0 4s 10.100.1.47 worker232 <none> <none> [root@master231 01-pods]# 7.只能识别Pod资源清单 [root@master231 04-deployments]# cp 01-deploy-matchLabels-xiuxian.yaml /etc/kubernetes/manifests/ [root@master231 04-deployments]# [root@master231 04-deployments]# ll /etc/kubernetes/manifests/ total 32 drwxr-xr-x 2 root root 4096 Jul 16 14:38 ./ drwxr-xr-x 4 root root 4096 Jul 9 10:40 ../ -rw-r--r-- 1 root root 301 Jul 16 14:38 01-deploy-matchLabels-xiuxian.yaml -rw-r--r-- 1 root root 504 Jul 16 14:33 01-pods-xiuxian.yaml -rw------- 1 root root 2280 Jul 9 10:40 etcd.yaml -rw------- 1 root root 4025 Jul 9 10:40 kube-apiserver.yaml -rw------- 1 root root 3546 Jul 9 10:40 kube-controller-manager.yaml -rw------- 1 root root 1465 Jul 9 10:40 
kube-scheduler.yaml [root@master231 04-deployments]# [root@master231 04-deployments]# head /etc/kubernetes/manifests/* ==> /etc/kubernetes/manifests/01-deploy-matchLabels-xiuxian.yaml <== apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: ==> /etc/kubernetes/manifests/01-pods-xiuxian.yaml <== # 指定API的版本号 apiVersion: v1 # 指定资源的类型 kind: Pod # 定义元数据信息 metadata: # 指定资源的名称 name: xiuxian # 给资源打标签 labels: ==> /etc/kubernetes/manifests/etcd.yaml <== apiVersion: v1 kind: Pod metadata: annotations: kubeadm.kubernetes.io/etcd.advertise-client-urls: https://10.0.0.231:2379 creationTimestamp: null labels: component: etcd tier: control-plane name: etcd ==> /etc/kubernetes/manifests/kube-apiserver.yaml <== apiVersion: v1 kind: Pod metadata: annotations: kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 10.0.0.231:6443 creationTimestamp: null labels: component: kube-apiserver tier: control-plane name: kube-apiserver ==> /etc/kubernetes/manifests/kube-controller-manager.yaml <== apiVersion: v1 kind: Pod metadata: creationTimestamp: null labels: component: kube-controller-manager tier: control-plane name: kube-controller-manager namespace: kube-system spec: ==> /etc/kubernetes/manifests/kube-scheduler.yaml <== apiVersion: v1 kind: Pod metadata: creationTimestamp: null labels: component: kube-scheduler tier: control-plane name: kube-scheduler namespace: kube-system spec: [root@master231 04-deployments]# [root@master231 04-deployments]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-master231 1/1 Running 0 66s 10.100.0.10 master231 <none> <none> xiuxian-worker232 1/1 Running 0 66s 10.100.1.47 worker232 <none> <none> [root@master231 04-deployments]# 总结: - kubeadm之所以能够快速部署k8s集群,底层就用到了静态Pod技术; - 静态Pod的启动前提是必须先启动kubelet服务; - kubelet底层调用了docker|containerd作为容器运行时;
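下面补充一个直接用资源清单创建静态Pod的最小化示意(文件名与镜像仅为举例,假设节点仍使用默认的 /etc/kubernetes/manifests 目录):
bash
# 在任意节点上直接写入一个Pod清单,kubelet周期性扫描该目录后会自动创建对应的镜像Pod(mirror pod)
cat > /etc/kubernetes/manifests/static-nginx.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: static-nginx        # kubelet会自动追加 "-节点名" 后缀,如 static-nginx-master231
spec:
  containers:
  - name: web
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1
    ports:
    - containerPort: 80
EOF

# 无需 kubectl apply,稍等片刻即可看到Pod;删除该文件则Pod被自动回收
kubectl get pods -o wide | grep static-nginx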
13、修改Service的NodePort端口范围
bash
1.修改api-server的配置文件 [root@master231 ~]# vim /etc/kubernetes/manifests/kube-apiserver.yaml [root@master231 ~]# [root@master231 ~]# cat /etc/kubernetes/manifests/kube-apiserver.yaml apiVersion: v1 kind: Pod metadata: ... name: kube-apiserver namespace: kube-system spec: containers: - command: - kube-apiserver - --advertise-address=10.0.0.231 - --service-node-port-range=3000-50000 ... 2.让kubelet热加载静态Pod目录文件 [root@master231 ~]# mv /etc/kubernetes/manifests/kube-apiserver.yaml /opt/ [root@master231 ~]# mv /opt/kube-apiserver.yaml /etc/kubernetes/manifests/ [root@master231 ~]# 3.创建测试案例 [root@master231 09-service]# cat 02-deploy-svc-NodePort.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxian spec: # 指定Service类型,若不指定,默认为: ClusterIP type: NodePort selector: apps: v1 ports: - protocol: TCP port: 88 targetPort: 80 # 如果Service类型是NodePort,则可以定义NodePort的端口号。 # 若不指定端口,则默认会在一个端口范围(30000-32767)内随时生成。 #nodePort: 30080 nodePort: 8080 [root@master231 09-service]# [root@master231 09-service]# kubectl apply -f 02-deploy-svc-NodePort.yaml deployment.apps/deploy-xiuxian created service/svc-xiuxian created [root@master231 09-service]# [root@master231 09-service]# kubectl get -f 02-deploy-svc-NodePort.yaml NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/deploy-xiuxian 3/3 3 3 3s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/svc-xiuxian NodePort 10.200.92.117 <none> 88:8080/TCP 3s [root@master231 09-service]# [root@master231 09-service]# curl 10.0.0.233:8080 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 09-service]#
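可以按下面的思路验证新端口范围已生效(示意,kube-apiserver 的Pod名称按 "kube-apiserver-<节点名>" 的惯例假设为 kube-apiserver-master231):
bash
# 1.确认 kube-apiserver 已带上新的端口范围参数
kubectl -n kube-system get pod kube-apiserver-master231 -o yaml | grep service-node-port-range

# 2.确认低于默认范围(30000-32767)的 nodePort 已被接受,预期输出 8080
kubectl get svc svc-xiuxian -o jsonpath='{.spec.ports[0].nodePort}'; echo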

11-2、服务暴露方式对比

bash
在 Kubernetes 集群里,每个 Pod 都有自己独立的、集群内部的 IP 地址。比如 10.244.1.5、10.244.2.10 等。这个IP地址从集群外部 是无法直接访问的。我们下面讨论的所有方式,都是为了解决“如何从集群外部访问到集群内部 Pod 里的应用程序”这个问题。 让我们把这些访问方式分为两大类: 1.直接暴露节点(不推荐,特殊场景使用): hostNetwork, hostPort 2.通过 K8s 抽象层(推荐,标准方式): port-forward, Service 第一类:直接暴露节点(绕过 K8s 网络抽象)
直接暴露节点,hostNetwork: true
bash
1. hostNetwork: true 是什么? 这可能是最“暴力”的方式。它直接让 Pod 放弃自己的独立网络空间,和它所在的宿主机(Node)共享同一个网络空间。 怎么工作的? 当你在 Pod 的 spec 中设置 hostNetwork: true,这个Pod里的所有进程就直接监听在宿主机的网络接口上了。在Pod内部执行ifconfig 或ip addr,你会看到宿主机的 IP 地址(比如 10.0.0.231),而不是 Pod 自己的 IP。 如何访问(用你的例子)? 假设你有一个 Nginx Pod,设置了 hostNetwork: true,并且它监听 80 端口。 如果这个 Pod 被调度到了 10.0.0.231 这台节点上。 那么你就可以直接通过 http://10.0.0.231:80 来访问。 如果它被调度到了 10.0.0.232,你就得访问 http://10.0.0.232:80。 致命缺点: 端口冲突:如果 10.0.0.231 这台节点上已经有其他程序(或者另一个 hostNetwork Pod)占用了 80 端口,你的 Nginx Pod 将无法启动。 无法扩展:你无法在同一个节点上启动两个监听相同端口的 hostNetwork Pod。 访问不稳定:你必须知道 Pod 被调度到了哪个节点才能访问,如果 Pod 重启并被调度到另一个节点,访问地址就变了。 使用场景: 一些需要直接管理节点网络或者对网络性能要求极高的系统级组件,比如网络插件(Calico, Flannel)或者监控代理(Node Exporter)可能会使用。普通业务应用绝对不要使用。
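下面是一个 hostNetwork 的最小化清单示意(镜像沿用上文的 apps:v1,名称仅为举例):
bash
cat > pod-hostnetwork.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: xiuxian-hostnetwork
spec:
  hostNetwork: true                   # Pod直接共享宿主机的网络命名空间
  dnsPolicy: ClusterFirstWithHostNet  # 共享宿主机网络时仍优先使用集群DNS
  containers:
  - name: c1
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1   # 容器内nginx监听80端口
EOF
kubectl apply -f pod-hostnetwork.yaml

# 访问方式:直接使用Pod被调度到的节点IP,例如 curl http://10.0.0.232:80
kubectl get pods xiuxian-hostnetwork -o wide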
直接暴露节点,hostPort
bash
2. hostPort 是什么? 这是一个稍微“温柔”一点的方式。它不共享整个网络空间,只是将 Pod 内的一个端口精确地映射到宿主机上的一个端口。 怎么工作的? 当你在 Pod 的 spec.containers.ports 中设置 hostPort,K8s 会通过 kube-proxy 在 Pod 所在的节点上配置 iptables 规则,将 [节点IP]:[hostPort] 的流量转发到 [Pod内部IP]:[containerPort]。 如何访问(用你的例子)? 假设你有一个 Nginx Pod,监听 80 端口,你为它配置了 hostPort: 8080。 如果这个 Pod 被调度到了 10.0.0.231 这台节点上。 那么你就可以通过 http://10.0.0.231:8080 来访问。流量会从节点的 8080 端口被转发到 Pod 的 80 端口。 如果它被调度到了 10.0.0.232,你就得访问 http://10.0.0.232:8080。 缺点 端口冲突:同样存在端口冲突问题。你不能在同一个节点上启动两个映射了相同 hostPort 的 Pod。 访问不稳定:和 hostNetwork 一样,访问地址依赖于 Pod 所在的节点。 使用场景: 极少使用。可能用于一些需要固定端口暴露的单节点守护进程,但通常有更好的替代方案。同样,普通业务应用不推荐使用。 第二类:通过 K8s 抽象层(标准、推荐的方式) 这类方式利用了 Kubernetes 的核心设计理念:通过抽象来解耦。你不直接关心 Pod 在哪个节点上,K8s 会为你处理好这一切。
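补充上面 hostPort 的说明,给出一个最小化清单示意(端口号 8080 仅为举例):
bash
cat > pod-hostport.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: xiuxian-hostport
spec:
  containers:
  - name: c1
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1
    ports:
    - containerPort: 80   # 容器内监听的端口
      hostPort: 8080      # 映射到Pod所在节点的8080端口
EOF
kubectl apply -f pod-hostport.yaml

# 访问方式:节点IP + hostPort,例如 curl http://10.0.0.233:8080
kubectl get pods xiuxian-hostport -o wide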
port-forward
bash
3. kubectl port-forward 是什么? 这是一个临时的、用于开发和调试的工具,它在你本地的开发机和集群里的某个 Pod 之间建立一个安全的隧道。 怎么工作的? 你在本地机器上执行 kubectl port-forward <pod-name> [本地端口]:[Pod端口]。 kubectl 客户端连接到 K8s API Server。 API Server 指示 Pod 所在节点的 kubelet。 kubelet 在 Pod 和你的本地机器之间建立一个数据流通道。 流量路径是:你的浏览器 -> 你的本地端口 -> kubectl 客户端 -> API Server -> Kubelet -> Pod。 如何访问(用你的例子)? 假设你有一个 Nginx Pod 叫 my-nginx-pod-xyz,它监听 80 端口。 你在你的笔记本电脑上执行命令:kubectl port-forward my-nginx-pod-xyz 9999:80。 然后,你打开你笔记本电脑上的浏览器,访问 http://localhost:9999 或者 http://127.0.0.1:9999。 注意:这里你完全不需要关心 Pod 在哪个节点上,也不需要使用 10.0.0.231 这些 IP。访问地址永远是你的 localhost。 优点 简单安全:无需修改任何 K8s 配置,快速访问 Pod。 绕过防火墙:只要你能连上 K8s API Server,就能访问 Pod。 缺点 临时性:一旦你的 kubectl 命令停止(比如你按了 Ctrl+C 或者关了终端),隧道就断了。 不适合生产:它不提供负载均衡,且依赖于你的本地机器,绝不能用于对外提供服务。 使用场景 开发和调试。比如:你想用本地的数据库客户端连接到集群里的数据库 Pod,或者临时查看一下某个 Web 应用的后台界面。
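常用的 port-forward 命令示意(Pod名称沿用上文举例,实际使用时替换为真实名称):
bash
# 把本地9999端口转发到Pod的80端口,命令需保持前台运行
kubectl port-forward pod/my-nginx-pod-xyz 9999:80

# 也可以直接转发到Service或Deployment,无需关心具体Pod名称
kubectl port-forward svc/svc-xiuxian 9999:88
kubectl port-forward deploy/deploy-xiuxian 9999:80

# 另开一个终端在本机访问
curl http://127.0.0.1:9999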
ClusterIP
bash
Service 是 K8s 中最重要的网络抽象。它的核心思想是:为一组功能相同的 Pod 提供一个统一、稳定的入口地址,并在这组 Pod 之间进行负载均衡。 当 Pod 创建、销毁、重启、漂移时,它们的 IP 地址会变。但 Service 的 IP 地址和 DNS 名称是固定不变的。 Service 有四种类型,我们逐一来看: 这是 Service 的默认类型。它会创建一个仅在集群内部可以访问的虚拟 IP(即 ClusterIP)。 怎么工作的? K8s 会分配一个虚拟 IP(比如 10.96.100.200)。集群内的 DNS 会为这个 Service 创建一个条目(比如 my-nginx-service.default.svc.cluster.local)。当集群内的其他 Pod 访问这个 Service 的 IP 或 DNS 名时,kube-proxy 会通过 iptables 或 IPVS 规则,将请求负载均衡地转发到背后某一个健康的 Nginx Pod 上。 如何访问(用你的例子)? 无法从集群外部直接访问。你不能在浏览器里输入 http://10.96.100.200。 你只能在集群内的另一个 Pod里通过 curl http://my-nginx-service 或者 curl http://10.96.100.200 来访问。 使用场景 集群内部服务之间的通信。比如,你的 "Web前端" Pod 需要调用 "用户服务" Pod,就应该通过一个 ClusterIP 类型的 Service 来实现。
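一个 ClusterIP 类型Service的最小化清单示意(名称 my-nginx-service 沿用上文举例,selector 假设匹配前文 apps=v1 标签的Pod):
bash
cat > svc-clusterip.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: my-nginx-service
spec:
  type: ClusterIP          # 不写type时默认就是ClusterIP
  selector:
    apps: v1               # 关联携带 apps=v1 标签的一组Pod
  ports:
  - protocol: TCP
    port: 80               # Service自身暴露的端口
    targetPort: 80         # 转发到Pod的端口
EOF
kubectl apply -f svc-clusterip.yaml

# 只能在集群内部访问,例如进入某个Pod中测试(alpine镜像内可用busybox自带的wget)
kubectl exec -it deploy/deploy-xiuxian -- wget -qO- http://my-nginx-service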
NodePort
bash
是什么? 它在 ClusterIP 的基础上,在集群中每一个节点(Node)上都打开一个相同的、固定的端口(默认范围是 30000-32767),并将流量导向这个 Service。 怎么工作的? 当你创建一个 NodePort Service 时,会发生两件事: 首先,它会创建一个 ClusterIP Service(和上面一样)。 然后,kube-proxy 会在所有节点上(10.0.0.231, 10.0.0.232, 10.0.0.233)都监听一个指定的 NodePort(比如 30080)。 如何访问(用你的例子)? 假设你创建的 NodePort Service 分配到的端口是 30080。 你可以通过任何一个节点的 IP + NodePort 来访问你的服务: http://10.0.0.231:30080 http://10.0.0.232:30080 http://10.0.0.233:30080 无论你访问哪个节点,K8s 都会把流量负载均衡到后端的 Pod 上,即使 Pod 不在那个节点上! 这是 NodePort 最关键的特性。 缺点 端口范围受限,且端口号较大,不直观。 如果节点 IP 发生变化,访问地址也得变。 它本身没有提供一个高可用的入口,如果访问的那个节点挂了,你需要手动换一个节点 IP。 使用场景 当你需要一个快速、临时的对外入口,并且不在意端口号和高可用性时。 作为更高级的负载均衡器(如 LoadBalancer 或外部 LB)的后端。
LoadBalancer
bash
是什么? 这是将服务暴露给外部世界的标准、生产级方式。它建立在 NodePort 之上。 怎么工作的? 当你创建一个 LoadBalancer Service 时: 它会自动创建一个 NodePort Service。 它会调用云服务商(如 AWS, GCP, Azure)的 API,去创建一个外部负载均衡器(External Load Balancer)。 云服务商的 LB 会被分配一个公网 IP 地址,并且会自动配置好,将流量转发到你集群中所有节点的 NodePort 上。 如何访问(用你的例子)? 假设云服务商给你分配的外部 IP 是 54.1.2.3。 你只需要访问 http://54.1.2.3(通常是 80 端口)。 用户流量 -> 外部 LB (54.1.2.3) -> (随机一个节点的 NodePort,如 10.0.0.232:30080) -> Service (ClusterIP) -> 某个 Pod。 你完全不需要关心节点的 IP 和 NodePort 端口,云厂商的 LB 帮你搞定了一切,包括高可用。 注意 这种方式依赖于云服务商环境。如果你是在自己的物理机房(On-Premise)搭建 K8s,你需要安装类似 MetalLB 或 Porter 这样的项目来提供 LoadBalancer 的功能。 使用场景 所有需要从公网访问的生产级应用。
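前文出现的 svc-xiuxian-lb 大致等价于下面的清单(示意,自建机房需先部署 MetalLB 之类的组件,EXTERNAL-IP 由其分配):
bash
cat > svc-lb.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: svc-xiuxian-lb
spec:
  type: LoadBalancer
  selector:
    apps: v1
  ports:
  - protocol: TCP
    port: 88            # 对外暴露的端口
    targetPort: 80      # Pod内监听的端口
EOF

# 如需创建可执行 kubectl apply -f svc-lb.yaml(若已存在同名Service则为幂等更新)
# EXTERNAL-IP 分配成功后直接访问该IP即可,例如 curl http://10.0.0.150:88
kubectl get svc svc-xiuxian-lb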
ExternalName
bash
是什么? 这是一个特例,它不涉及端口和 IP 转发。它只是在集群内部 DNS 中创建了一个别名(CNAME 记录)。 怎么工作的? 你创建一个 ExternalName Service,指向一个集群外部的域名,比如 database.mycompany.com。 如何访问(用你的例子)? 这不是用来把服务暴露出去的,而是让集群内部的 Pod 去访问外部服务的。 比如,你集群里的一个 Pod 需要访问外部的 RDS 数据库。你可以创建一个名为 my-db 的 ExternalName Service,指向 RDS 的域名。 这样,Pod 里的代码就可以用 mysql.connect("my-db") 来连接,而不用硬编码外部域名。以后如果 RDS 域名换了,你只需要修改 Service,而不用改动应用代码。
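一个 ExternalName 类型Service的最小化清单示意(域名 database.mycompany.com 沿用上文举例):
bash
cat > svc-externalname.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: my-db
spec:
  type: ExternalName
  externalName: database.mycompany.com   # 集群内解析 my-db 时返回该域名的CNAME记录
EOF
kubectl apply -f svc-externalname.yaml

# 在集群内的Pod中解析 my-db 即可得到外部域名,应用代码无需硬编码外部地址
kubectl exec -it deploy/deploy-xiuxian -- nslookup my-db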

12、KubeKey部署

1、KubeKey底层基于kubeadm快速部署K8S集群
bash
1.安装依赖软件包 apt -y install socat conntrack ebtables ipset 2.设置区域 [root@k8s66 ~]# export KKZONE=cn [root@k8s66 ~]# 3.下载kubekey文件 [root@k8s66 ~]# wget https://github.com/kubesphere/kubekey/releases/download/v3.1.10/kubekey-v3.1.10-linux-amd64.tar.gz SVIP: [root@k8s66 ~]# wget http://192.168.21.253/Resources/Kubernetes/Project/kubesphere/kubersphere-on-linux/kube-key-v3.1.10_k8s-v1.28.12/kubekey-v3.1.10-linux-amd64.tar.gz [root@k8s66 ~]# ll kubekey-v3.1.10-linux-amd64.tar.gz -rw-r--r-- 1 root root 37244306 Jul 16 15:29 kubekey-v3.1.10-linux-amd64.tar.gz [root@k8s66 ~]# [root@k8s66 ~]# ll -h kubekey-v3.1.10-linux-amd64.tar.gz -rw-r--r-- 1 root root 36M Jul 16 15:29 kubekey-v3.1.10-linux-amd64.tar.gz [root@k8s66 ~]# 4.解压软件包 [root@k8s66 ~]# tar xf kubekey-v3.1.10-linux-amd64.tar.gz [root@k8s66 ~]# [root@k8s66 ~]# ll kk -rwxr-xr-x 1 root root 82044655 Jun 12 10:58 kk* [root@k8s66 ~]# 5.生成安装配置文件模板 [root@k8s66 ~]# ./kk create config --with-kubernetes v1.28.12 Generate KubeKey config file successfully [root@k8s66 ~]# [root@k8s66 ~]# ll config-sample.yaml -rw-r--r-- 1 root root 1070 Jul 16 15:32 config-sample.yaml [root@k8s66 ~]# 6.修改安装配置文件 [root@k8s66 ~]# cat config-sample.yaml apiVersion: kubekey.kubesphere.io/v1alpha2 kind: Cluster metadata: name: sample spec: hosts: - {name: k8s66, address: 10.1.24.13, internalAddress: 10.1.24.13, user: root, password: "1"} - {name: k8s77, address: 10.1.20.5, internalAddress: 10.1.20.5, user: root, password: "1"} - {name: k8s88, address: 10.1.24.4, internalAddress: 10.1.24.4, user: root, password: "1"} roleGroups: etcd: - k8s66 - k8s77 - k8s88 control-plane: - k8s66 - k8s88 worker: - k8s66 - k8s77 - k8s88 controlPlaneEndpoint: ## Internal loadbalancer for apiservers # internalLoadbalancer: haproxy domain: lb.kubesphere.local address: "" port: 6443 kubernetes: version: v1.28.12 clusterName: cluster.local autoRenewCerts: true containerManager: containerd etcd: type: kubekey network: plugin: calico kubePodsCIDR: 10.100.0.0/16 kubeServiceCIDR: 10.200.0.0/16 ## multus support. https://github.com/k8snetworkplumbingwg/multus-cni multusCNI: enabled: false registry: privateRegistry: "" namespaceOverride: "" registryMirrors: [] insecureRegistries: [] addons: [] [root@k8s66 ~]# 7.安装k8s集群 [root@k8s66 ~]# ./kk create cluster -f config-sample.yaml Warning: When there are at least two nodes in the control-plane, you should set the value of the LB address or enable the internal loadbalancer, or set 'controlPlaneEndpoint.externalDNS' to 'true' if the 'controlPlaneEndpoint.domain' can be resolved in your dns server. _ __ _ _ __ | | / / | | | | / / | |/ / _ _| |__ ___| |/ / ___ _ _ | \| | | | _ \ / _ \ \ / _ \ | | | | |\ \ |_| | |_) | __/ |\ \ __/ |_| | \_| \_/\__,_|_.__/ \___\_| \_/\___|\__, | __/ | |___/ 15:39:16 CST [GreetingsModule] Greetings 15:39:16 CST message: [k8s88] Greetings, KubeKey! 15:39:17 CST message: [k8s66] Greetings, KubeKey! 15:39:17 CST message: [k8s77] Greetings, KubeKey! 
15:39:17 CST success: [k8s88] 15:39:17 CST success: [k8s66] 15:39:17 CST success: [k8s77] 15:39:17 CST [NodePreCheckModule] A pre-check on nodes 15:39:18 CST success: [k8s66] 15:39:18 CST success: [k8s88] 15:39:18 CST success: [k8s77] 15:39:18 CST [ConfirmModule] Display confirmation form +-------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+ | name | sudo | curl | openssl | ebtables | socat | ipset | ipvsadm | conntrack | chrony | docker | containerd | nfs client | ceph client | glusterfs client | time | +-------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+ | k8s66 | y | y | y | y | y | y | | y | | | | | | | CST 15:39:18 | | k8s77 | y | y | y | y | y | y | | y | | | | | | | CST 15:39:18 | | k8s88 | y | y | y | y | y | y | | y | | | | | | | CST 15:39:18 | +-------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+ This is a simple check of your environment. Before installation, ensure that your machines meet all requirements specified at https://github.com/kubesphere/kubekey#requirements-and-recommendations Install k8s with specify version: v1.28.12 Continue this installation? [yes/no]: y # 输入字母'y'后按回车键 ... downloading amd64 kubeadm v1.28.12 ... % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 46.3M 100 46.3M 0 0 6029k 0 0:00:07 0:00:07 --:--:-- 9368k 15:40:25 CST message: [localhost] downloading amd64 kubelet v1.28.12 ... % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 105M 100 105M 0 0 8259k 0 0:00:13 0:00:13 --:--:-- 11.5M 15:40:39 CST message: [localhost] downloading amd64 kubectl v1.28.12 ... % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 47.3M 100 47.3M 0 0 9685k 0 0:00:05 0:00:05 --:--:-- 10.1M 15:40:44 CST message: [localhost] ... 15:46:17 CST Pipeline[CreateClusterPipeline] execute successfully Installation is complete. 
Please check the result using the command: kubectl get pod -A [root@k8s66 ~]# 8.检查集群POD [root@k8s66 ~]# kubectl get nodes NAME STATUS ROLES AGE VERSION k8s66 Ready control-plane,worker 11m v1.28.12 k8s77 Ready worker 10m v1.28.12 k8s88 Ready control-plane,worker 10m v1.28.12 [root@k8s66 ~]# [root@k8s66 ~]# [root@k8s66 ~]# kubectl get pod -A NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-kube-controllers-568d4f5458-64zhn 1/1 Running 0 10m kube-system calico-node-7b6cv 1/1 Running 0 10m kube-system calico-node-ch9fk 1/1 Running 0 10m kube-system calico-node-s286v 1/1 Running 0 10m kube-system coredns-57946b76b-6x4zr 1/1 Running 0 10m kube-system coredns-57946b76b-nfhhg 1/1 Running 0 10m kube-system kube-apiserver-k8s66 1/1 Running 0 10m kube-system kube-apiserver-k8s88 1/1 Running 0 10m kube-system kube-controller-manager-k8s66 1/1 Running 0 10m kube-system kube-controller-manager-k8s88 1/1 Running 0 10m kube-system kube-proxy-8qmks 1/1 Running 0 10m kube-system kube-proxy-h44bt 1/1 Running 0 10m kube-system kube-proxy-q8hf9 1/1 Running 0 10m kube-system kube-scheduler-k8s66 1/1 Running 0 10m kube-system kube-scheduler-k8s88 1/1 Running 0 10m kube-system nodelocaldns-6sqt7 1/1 Running 0 10m kube-system nodelocaldns-q26n2 1/1 Running 0 10m kube-system nodelocaldns-r4srh 1/1 Running 0 10m [root@k8s66 ~]# - kubesphere图形化管理K8S集群 1.安装依赖软件包 apt -y install socat conntrack ebtables ipset 2.下载软件包 [root@k8s66 ~]# wget http://192.168.21.253/Resources/Kubernetes/Project/kubesphere/kubersphere-on-linux/kube-key-v3.1.10_k8s-v1.28.12/weixiang-ks-core.tar.gz [root@k8s66 ~]# tar xf weixiang-ks-core.tar.gz 3.基于helm安装kubesphere [root@k8s66 ~]# helm upgrade --install -n kubesphere-system --create-namespace my-ks-server ./ks-core Release "my-ks-server" does not exist. Installing it now. NAME: my-ks-server LAST DEPLOYED: Wed Jul 16 16:16:33 2025 NAMESPACE: kubesphere-system STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: Thank you for choosing KubeSphere Helm Chart. Please be patient and wait for several seconds for the KubeSphere deployment to complete. 1. Wait for Deployment Completion Confirm that all KubeSphere components are running by executing the following command: kubectl get pods -n kubesphere-system 2. Access the KubeSphere Console Once the deployment is complete, you can access the KubeSphere console using the following URL: http://10.0.0.66:30880 3. Login to KubeSphere Console Use the following credentials to log in: Account: admin Password: P@88w0rd NOTE: It is highly recommended to change the default password immediately after the first login. For additional information and details, please visit https://kubesphere.io. 
[root@k8s66 ~]# [root@k8s66 ~]# kubectl get all -n kubesphere-system NAME READY STATUS RESTARTS AGE pod/extensions-museum-75c98f6748-9gc77 1/1 Running 0 53s pod/ks-apiserver-7bf8c9dd79-xldrr 1/1 Running 0 53s pod/ks-console-7878cbb7c8-5bwfb 1/1 Running 0 13s pod/ks-controller-manager-6d4bf8d6c-5rsjn 1/1 Running 0 53s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/extensions-museum ClusterIP 10.200.117.127 <none> 443/TCP 53s service/ks-apiserver ClusterIP 10.200.59.70 <none> 80/TCP 53s service/ks-console NodePort 10.200.37.161 <none> 80:30880/TCP 53s service/ks-controller-manager ClusterIP 10.200.31.18 <none> 443/TCP 53s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/extensions-museum 1/1 1 1 53s deployment.apps/ks-apiserver 1/1 1 1 53s deployment.apps/ks-console 1/1 1 1 53s deployment.apps/ks-controller-manager 1/1 1 1 53s NAME DESIRED CURRENT READY AGE replicaset.apps/extensions-museum-75c98f6748 1 1 1 53s replicaset.apps/ks-apiserver-7bf8c9dd79 1 1 1 53s replicaset.apps/ks-console-74d76fbb9d 0 0 0 53s replicaset.apps/ks-console-7878cbb7c8 1 1 1 13s replicaset.apps/ks-controller-manager-6d4bf8d6c 1 1 1 53s [root@k8s66 ~]# 4.登录kubesphere http://10.0.0.66:30880 默认的用户名和密码: Account: admin Password: P@88w0rd 5.修改初始密码 推荐密码: "Linux98@2025" - kubesphere图形化基本使用

2、kubesphere之图形化基本使用
01、创建项目

image

[root@k8s66 ~]# kubectl get ns NAME STATUS AGE default Active 31m kube-node-lease Active 31m kube-public Active 31m kube-system Active 31m kubekey-system Active 30m kubesphere-controls-system Active 17m kubesphere-system Active 27m [root@k8s66 ~]# kubectl get ns NAME STATUS AGE default Active 33m kube-node-lease Active 33m kube-public Active 33m kube-system Active 33m kubekey-system Active 32m kubesphere-controls-system Active 19m kubesphere-system Active 28m weixiang Active 45s registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1

image

02、创建守护进程集

点击刚创建的项目进入

image-20250716181605681

image-20250716181811375

image-20250716181849739

image-20250716182258773

image-20250716182131697

image-20250716182347375

image-20250716182239081

image-20250716182508628

image-20250716182556300

image-20250716182608654

image-20250716182727480

03、增加副本数

image-20250716182942924

[root@k8s66 ~]# kubectl get deploy,rs,po -o wide -n weixiang NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/weixiang-xiuxian 3/3 3 3 2m15s c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 app=weixiang-xiuxian NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/weixiang-xiuxian-5555885d4 3 3 3 2m15s c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 app=weixiang-xiuxian,pod-template-hash=5555885d4 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/weixiang-xiuxian-5555885d4-gh92n 1/1 Running 0 2m15s 10.100.236.10 k8s66 <none> <none> pod/weixiang-xiuxian-5555885d4-njpsw 1/1 Running 0 2m15s 10.100.91.7 k8s77 <none> <none> pod/weixiang-xiuxian-5555885d4-q7xqz 1/1 Running 0 2m15s 10.100.64.5 k8s88 <none> <none> [root@k8s66 ~]# kubectl get deploy,rs,po -o wide -n weixiang NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/weixiang-xiuxian 5/5 5 5 4m9s c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 app=weixiang-xiuxian NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/weixiang-xiuxian-5555885d4 5 5 5 4m9s c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 app=weixiang-xiuxian,pod-template-hash=5555885d4 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/weixiang-xiuxian-5555885d4-829gh 1/1 Running 0 43s 10.100.64.6 k8s88 <none> <none> pod/weixiang-xiuxian-5555885d4-g7tfw 1/1 Running 0 43s 10.100.91.8 k8s77 <none> <none> pod/weixiang-xiuxian-5555885d4-gh92n 1/1 Running 0 4m9s 10.100.236.10 k8s66 <none> <none> pod/weixiang-xiuxian-5555885d4-njpsw 1/1 Running 0 4m9s 10.100.91.7 k8s77 <none> <none> pod/weixiang-xiuxian-5555885d4-q7xqz 1/1 Running 0 4m9s 10.100.64.5 k8s88 <none> <none>

image-20250716183131216

image-20250716183140910

[root@k8s66 ~]# kubectl get deploy,rs,po -o wide -n weixiang NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/weixiang-xiuxian 5/5 5 5 75m c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 app=weixiang-xiuxian NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/weixiang-xiuxian-5555885d4 5 5 5 75m c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 app=weixiang-xiuxian,pod-template-hash=5555885d4 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/weixiang-xiuxian-5555885d4-829gh 1/1 Running 1 (15m ago) 72m 10.100.64.8 k8s88 <none> <none> pod/weixiang-xiuxian-5555885d4-g7tfw 1/1 Running 1 (15m ago) 72m 10.100.91.11 k8s77 <none> <none> pod/weixiang-xiuxian-5555885d4-gh92n 1/1 Running 1 (15m ago) 75m 10.100.236.12 k8s66 <none> <none> pod/weixiang-xiuxian-5555885d4-njpsw 1/1 Running 1 (15m ago) 75m 10.100.91.12 k8s77 <none> <none> pod/weixiang-xiuxian-5555885d4-q7xqz 1/1 Running 1 (15m ago) 75m 10.100.64.7 k8s88 <none> <none> [root@k8s66 ~]# kubectl describe pod/weixiang-xiuxian-5555885d4-829gh -n weixiang Name: weixiang-xiuxian-5555885d4-829gh Namespace: weixiang Priority: 0 Service Account: default Node: k8s88/10.0.0.88 Start Time: Wed, 16 Jul 2025 18:29:36 +0800 Labels: app=weixiang-xiuxian pod-template-hash=5555885d4 Annotations: cni.projectcalico.org/containerID: bf54c1af36032864165e03e4820f2275810b7a53b68312463eaafc470b94173d cni.projectcalico.org/podIP: 10.100.64.8/32 cni.projectcalico.org/podIPs: 10.100.64.8/32 kubesphere.io/creator: admin kubesphere.io/imagepullsecrets: {} Status: Running IP: 10.100.64.8 IPs: IP: 10.100.64.8 Controlled By: ReplicaSet/weixiang-xiuxian-5555885d4 Containers: c1: Container ID: containerd://7bc5aad513a039a9cfeb9b3aff8bcbf3ba5230c3383f4a2555bdc04628d7ae45 Image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 Image ID: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps@sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c Port: 80/TCP Host Port: 0/TCP State: Running Started: Wed, 16 Jul 2025 19:26:50 +0800 Last State: Terminated Reason: Unknown Exit Code: 255 Started: Wed, 16 Jul 2025 18:29:37 +0800 Finished: Wed, 16 Jul 2025 19:26:15 +0800 Ready: True Restart Count: 1 Environment: <none> Mounts: /etc/localtime from host-time (ro) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-46glv (ro) Conditions: Type Status Initialized True Ready True ContainersReady True PodScheduled True Volumes: host-time: Type: HostPath (bare host directory volume) Path: /etc/localtime HostPathType: kube-api-access-46glv: Type: Projected (a volume that contains injected data from multiple sources) TokenExpirationSeconds: 3607 ConfigMapName: kube-root-ca.crt ConfigMapOptional: <nil> DownwardAPI: true QoS Class: BestEffort Node-Selectors: <none> Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s node.kubernetes.io/unreachable:NoExecute op=Exists for 300s Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 72m default-scheduler Successfully assigned weixiang/weixiang-xiuxian-5555885d4-829gh to k8s88 Normal Pulled 72m kubelet Container image "registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1" already present on machine Normal Created 72m kubelet Created container c1 Normal Started 72m kubelet Started container c1 Normal SandboxChanged 15m (x2 over 15m) kubelet Pod sandbox changed, it will be killed and re-created. 
Normal Pulled 15m kubelet Container image "registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1" already present on machine Normal Created 15m kubelet Created container c1 Normal Started 15m kubelet Started container c1
04、查看详细事件

image-20250716194249221

[root@k8s66 ~]# kubectl get deploy,rs,po -o wide -n weixiang NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/weixiang-xiuxian 5/5 5 5 78m c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 app=weixiang-xiuxian NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/weixiang-xiuxian-5555885d4 5 5 5 78m c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 app=weixiang-xiuxian,pod-template-hash=5555885d4 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/weixiang-xiuxian-5555885d4-829gh 1/1 Running 1 (17m ago) 74m 10.100.64.8 k8s88 <none> <none> pod/weixiang-xiuxian-5555885d4-g7tfw 1/1 Running 1 (18m ago) 74m 10.100.91.11 k8s77 <none> <none> pod/weixiang-xiuxian-5555885d4-gh92n 1/1 Running 1 (18m ago) 78m 10.100.236.12 k8s66 <none> <none> pod/weixiang-xiuxian-5555885d4-njpsw 1/1 Running 1 (18m ago) 78m 10.100.91.12 k8s77 <none> <none> pod/weixiang-xiuxian-5555885d4-q7xqz 1/1 Running 1 (17m ago) 78m 10.100.64.7 k8s88 <none> <none> #开启另一个窗口测试 [root@k8s66 ~]# curl 10.100.236.12 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@k8s66 ~]# kubectl logs -f deployment.apps/weixiang-xiuxian -n weixiang Found 5 pods, using pod/weixiang-xiuxian-5555885d4-gh92n /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/ /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh 10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf 10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh /docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh /docker-entrypoint.sh: Configuration complete; ready for start up 2025/07/16 19:26:48 [notice] 1#1: using the "epoll" event method 2025/07/16 19:26:48 [notice] 1#1: nginx/1.20.1 2025/07/16 19:26:48 [notice] 1#1: built by gcc 10.2.1 20201203 (Alpine 10.2.1_pre1) 2025/07/16 19:26:48 [notice] 1#1: OS: Linux 5.15.0-119-generic 2025/07/16 19:26:48 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576 2025/07/16 19:26:48 [notice] 1#1: start worker processes 2025/07/16 19:26:48 [notice] 1#1: start worker process 32 2025/07/16 19:26:48 [notice] 1#1: start worker process 33 10.0.0.66 - - [16/Jul/2025:19:46:13 +0800] "GET / HTTP/1.1" 200 357 "-" "curl/7.81.0" "-"
05、查看容器日志

image-20250716194755085

image-20250716194835535

image-20250716194847056

06、进入容器终端

image-20250716194925606

[root@k8s66 ~]# kubectl exec -it weixiang-xiuxian-5555885d4-gh92n -n weixiang -- sh / # ls -l

image-20250716195327850

image-20250716195350320

07、部署Svc服务

image

bash
部署(Deployment):用于管理 无状态应用(如 Web 服务、API 服务) 有状态副本集(StatefulSet):用于管理 有状态应用(如数据库、消息队列) 守护进程集(DaemonSet):确保 每个节点(或匹配的节点)上运行一个 Pod 副本
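下面给出一个与页面创建的 ds-xiuxian 大致等价的资源清单示意(标签等字段为假设,实际以页面生成的YAML为准):
bash
cat > ds-xiuxian.yaml <<'EOF'
apiVersion: apps/v1
kind: DaemonSet            # 每个节点(或匹配到的节点)各运行一个Pod副本,无需指定replicas
metadata:
  name: ds-xiuxian
  namespace: weixiang
spec:
  selector:
    matchLabels:
      app: ds-xiuxian
  template:
    metadata:
      labels:
        app: ds-xiuxian
    spec:
      containers:
      - name: c1
        image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1
EOF
kubectl apply -f ds-xiuxian.yaml
kubectl get ds,pods -n weixiang -o wide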

image-20250716195546000

image-20250716195601531

image-20250716195645838

image-20250716195716376

image-20250716195801987

image-20250716195848879

08、创建ds类型服务
[root@k8s66 ~]# kubectl get ds,deploy,rs,po -n weixiang NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE daemonset.apps/ds-xiuxian 3 3 3 3 3 <none> 67s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/weixiang-xiuxian 5/5 5 5 97m NAME DESIRED CURRENT READY AGE replicaset.apps/weixiang-xiuxian-5555885d4 5 5 5 97m NAME READY STATUS RESTARTS AGE pod/ds-xiuxian-875ck 1/1 Running 0 67s pod/ds-xiuxian-gbwp4 1/1 Running 0 67s pod/ds-xiuxian-qh92w 1/1 Running 0 67s pod/weixiang-xiuxian-5555885d4-829gh 1/1 Running 1 (37m ago) 93m pod/weixiang-xiuxian-5555885d4-g7tfw 1/1 Running 1 (37m ago) 93m pod/weixiang-xiuxian-5555885d4-gh92n 1/1 Running 1 (37m ago) 97m pod/weixiang-xiuxian-5555885d4-njpsw 1/1 Running 1 (37m ago) 97m pod/weixiang-xiuxian-5555885d4-q7xqz 1/1 Running 1 (37m ago) 97m

image-20250716195933772

image-20250716200004560

image-20250716200129422

image-20250716200144323

image-20250716200216885

image-20250716200210054

image-20250716200452977

image-20250716200536442

image-20250716200556543

image-20250716200632464

image-20250716200747400

image-20250716200811780

3、Containerd对接harbor自建证书案例及wordpress部署
1、containerd对接harbor准备工作
bash
# 做host解析 # 拷贝证书 # 拉取镜像 1.添加hosts解析 [root@k8s66 ~]# echo 8.148.236.36 harbor250.weixiang.com >> /etc/hosts [root@k8s77 ~]# echo 10.0.0.250 harbor250.weixiang.com >> /etc/hosts [root@k8s88 ~]# echo 110.0.0.250 harbor250.weixiang.com >> /etc/hosts 2.拷贝证书文件 【自建证书,ubuntu系统没有自建ca证书文件,我们只需要将证书文件拷贝过去即可。】 [root@harbor250.weixiang.com certs]# pwd /usr/local/harbor/certs [root@harbor250.weixiang.com certs]# [root@harbor250.weixiang.com certs]# scp ca/ca.crt 10.1.12.3:/etc/ssl/certs/ [root@harbor250.weixiang.com certs]# scp ca/ca.crt 10.1.12.4:/etc/ssl/certs/ [root@harbor250.weixiang.com certs]# scp ca/ca.crt 10.1.12.15:/etc/ssl/certs/ 3.测试验证 [root@k8s88 ~]# ctr -n k8s.io i pull harbor250.weixiang.com/weixiang-xiuxian/apps:v1 harbor250.weixiang.com/weixiang-xiuxian/apps:v1: resolved |++++++++++++++++++++++++++++++++++++++| manifest-sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:2dd61e30a21aeb966df205382a40dcbcf45af975cc0cb836d555b9cd0ad760f5: done |++++++++++++++++++++++++++++++++++++++| config-sha256:f28fd43be4ad41fc768dcc3629f8479d1443df01ada10ac9a771314e4fdef599: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:5758d4e389a3f662e94a85fb76143dbe338b64f8d2a65f45536a9663b05305ad: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:51d66f6290217acbf83f15bc23a88338819673445804b1461b2c41d4d0c22f94: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:ff9c6add3f30f658b4f44732bef1dd44b6d3276853bba31b0babc247f3eba0dc: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:dcc43d9a97b44cf3b3619f2c185f249891b108ab99abcc58b19a82879b00b24b: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:5dcfac0f2f9ca3131599455f5e79298202c7e1b5e0eb732498b34e9fe4cb1173: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:2c6e86e57dfd729d8240ceab7c18bd1e5dd006b079837116bc1c3e1de5e1971a: done |++++++++++++++++++++++++++++++++++++++| elapsed: 0.1 s total: 0.0 B (0.0 B/s) unpacking linux/amd64 sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c... done: 12.675239ms [root@k8s88 ~]# [root@k8s88 ~]# ctr -n k8s.io i ls | grep xiuxian harbor250.weixiang.com/weixiang-xiuxian/apps:v1 application/vnd.docker.distribution.manifest.v2+json sha256:3bee216f250cfd2dbda1744d6849e27118845b8f4d55dda3ca3c6c1227cc2e5c 9.6 MiB linux/amd64 io.cri-containerd.image=managed [root@k8s88 ~]# 4.kubesphere部署db
2、kubesphere部署db

image

​​

image

image

image

image

image

bash
# 手动拉取测试 [root@k8s66 ~]#ctr -n k8s.io i pull harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle [root@k8s66 ~]#ctr -n k8s.io i pull harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle: resolved |++++++++++++++++++++++++++++++++++++++| manifest-sha256:c57363379dee26561c2e554f82e70704be4c8129bd0d10e29252cc0a34774004: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:d2433cba0951b4278a867dc36ff9ca8ce6405dc72cdd4e90cd71cadb4b9448a9: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:cf55ff1c80afe9b3de87da1a45d54a12cefebb238605357f8f6039a442e17749: done |++++++++++++++++++++++++++++++++++++++| config-sha256:f5f171121fa3e572eb30770e3c9a6ca240e822fdaea4e2f44882de402c8ce9d4: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:13702d9fe3c31adcdb0b085079474a26bce3991b1485688db0aadbd826debb0a: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:bd37f6d992035c9959b83f8f96b15cac9d66542a608f378e6e97c17830b72d80: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:83bcc87284a1da178d692381d8e66cd94663aba8402e75d63a41496dc6554924: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:c38d8660e1fa1d6fc47ea2236ac9b43e158d804e6f8eeb99cf97a54f4a181199: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:4eaae1e844acce67a77bab16b33f4e674cee523bf63fe968431a61d873e1dbe3: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:5196e1e87d8faf6bd8f3f1345cf3c351d6257786a1544e5df427b38196cbf906: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:7e1bc321f421360243b3a183ec0b93f6e82619bd649ece77a275f7913391c4c8: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:6586d096303c7e53b0fbaee28c83a1fbdb727694d2ad358bc3a6b24ce975bbd6: done |++++++++++++++++++++++++++++++++++++++| layer-sha256:bddd54b9c54941036b01188f0ffda84f03bb0804655111c31437d64fb6eb6942: done |++++++++++++++++++++++++++++++++++++++| elapsed: 1271.6s total: 162.3 (1

image

image

3、kubesphere部署service

image

image

4、kubesphere部署wp

image

image

image

image

image

image

bash
可以看出服务已经全部部署完成了

image

5、部署wp的service让外部访问

image

image

image

image

bash
访问测试

13、容器镜像拉取策略

1.什么是镜像拉取策略
bash
所谓的镜像拉取策略,指的是Pod在部署服务前如何处理镜像的问题。
官方给出了三种策略:
- Never
    如果本地有镜像则尝试启动容器。
    如果本地没有镜像,也不会去远程仓库拉取镜像。
- IfNotPresent
    如果本地有镜像则尝试启动容器。
    如果本地没有镜像,则会去远程仓库拉取镜像。
- Always
    如果本地有镜像,则将本地镜像的digest(摘要)与远程仓库的digest进行对比:相同则使用本地缓存镜像,不同则重新拉取远程仓库的镜像。
    如果本地没有镜像,则会去远程仓库拉取镜像。

温馨提示:
- 1.默认的镜像拉取策略取决于标签:如果标签的值为"latest",则默认拉取策略为"Always";如果标签的值非"latest",则默认拉取策略为"IfNotPresent"。
- 2.生产环境中尽量不要使用latest标签,因为这个标签始终指向最新的镜像。
2.实战案例
bash
2.1 编译镜像并推送到harbor仓库 [root@worker232 ~]# cat Dockerfile FROM harbor250.weixiang.com/weixiang-xiuxian/apps:v1 LABEL school=weixiang \ class=weixiang98 RUN mkdir /weixiang && \ touch /weixiang/xixi.log [root@worker232 ~]# [root@worker232 ~]# docker build -t harbor250.weixiang.com/weixiang-test/demo:v1 . [root@worker232 ~]# [root@worker232 ~]# docker login -u admin -p 1 harbor250.weixiang.com [root@worker232 ~]# [root@worker232 ~]# docker push harbor250.weixiang.com/weixiang-test/demo:v1 # 2.2 测试验证 设置为never [root@master231 01-pods]# cat 24-pods-imagePullPolicy.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-imagepullpolicy002 labels: apps: v1 spec: hostNetwork: true dnsPolicy: ClusterFirstWithHostNet nodeName: worker233 # 把镜像调度到233节点 containers: - name: c1 image: harbor250.weixiang.com/weixiang-test/demo:v1 # 指定镜像的拉取策略 imagePullPolicy: Never # imagePullPolicy: IfNotPresent # imagePullPolicy: Always [root@master231 01-pods]# [root@master231 01-pods]# kubectl apply -f 24-pods-imagePullPolicy.yaml pod/xiuxian-imagepullpolicy002 created [root@master231 01-pods]# [root@master231 ~/count/pods]#kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-7b574d64b-cw9hz 1/1 Running 1 (6h40m ago) 12h 10.100.1.55 worker232 <none> <none> deploy-xiuxian-7b574d64b-lck7b 1/1 Running 1 (6h39m ago) 12h 10.100.2.67 worker233 <none> <none> deploy-xiuxian-7b574d64b-mjl8f 1/1 Running 1 (6h39m ago) 12h 10.100.2.69 worker233 <none> <none> xiuxian-imagepullpolicy002 0/1 ErrImageNeverPull 0 16s 10.1.24.4 worker233 <none> <none> xiuxian-v1 1/1 Running 1 (6h40m ago) 21h 10.1.20.5 worker232 <none> <none> - kubesphere指定容器镜像拉取策略

image

bash
# 修改IfNotPresent策略 apiVersion: v1 kind: Pod metadata: name: xiuxian-imagepullpolicy002 labels: apps: v1 spec: hostNetwork: true dnsPolicy: ClusterFirstWithHostNet nodeName: worker233 containers: - name: c1 image: harbor250.weixiang.com/weixiang-test/demo:v1 # 指定镜像的拉取策略 # imagePullPolicy: Never imagePullPolicy: IfNotPresent # imagePullPolicy: Always # 运行成功 [root@master231 ~/count/pods]#kubectl apply -f 24-pods-imagePullPolicy.yaml pod/xiuxian-imagepullpolicy002 created [root@master231 ~/count/pods]#kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-imagepullpolicy002 1/1 Running 0 6s 10.1.24.4 worker233 <none> <none> # 233节点已经拉取到镜像

image

3.kubesphere实现拉取策略

image

image

bash
优先使用本地镜像:IfNotPresent 每次拉取镜像:Always 仅使用本地镜像:Never

14、configMap资源管理

1、基础介绍
bash
1.什么是configMap 所谓的configMap简称"cm",是用来存储应用程序的配置信息。 configMap的数据底层存储在etcd中,我们可以借助configMap实现配置中心,服务发现等应用场景。 容器怎么用,以存储卷的形式挂载 2.cm的基础管理 [root@master231 12-configmaps]# cat 01-cm-demo.yaml apiVersion: v1 kind: ConfigMap # 声明这是一个 ConfigMap 资源,整个文件三个key metadata: name: cm-games # 给这个 ConfigMap 命名为 "cm-games" data: # 数据部分 # 指定键值对,一个key对应的是一个具体的值。 school: weixiang # school 是键,weixiang 是值 class: weixiang98 # class 是键,weixiang98 是值 # 指定键值对,一个key对应的是一个文件内容 my.cnf: | # my.cnf是键名 [mysqld] datadir=/weixiang/data/mysql80 basedir=/weixiang/softwares/mysql80 port=3306 socket=/tmp/mysql80.sock [client] socket=/tmp/mysql80.sock [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl apply -f 01-cm-demo.yaml configmap/cm-games created [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl get cm NAME DATA AGE cm-games 3 4s kube-root-ca.crt 1 8d [root@master231 12-configmaps]# [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl describe cm cm-games Name: cm-games Namespace: default Labels: <none> Annotations: <none> Data ==== my.cnf: # 这个是key ---- [mysqld] datadir=/weixiang/data/mysql80 basedir=/weixiang/softwares/mysql80 port=3306 socket=/tmp/mysql80.sock [client] socket=/tmp/mysql80.sock school: # 这个是key ---- weixiang class: # 这个是key ---- weixiang98 BinaryData ==== Events: <none> [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl delete cm cm-games configmap "cm-games" deleted [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl get cm NAME DATA AGE kube-root-ca.crt 1 8d [root@master231 12-configmaps]#
2、pod使用cm之volume引用方式(推荐)
bash
[root@master231 12-configmaps]# kubectl apply -f 01-cm-demo.yaml configmap/cm-games created [root@master231 12-configmaps]# [root@master231 12-configmaps]# cat 02-deploy-cm-volumes.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: # Pod 的规格定义 volumes: # 定义存储卷列表 - name: data # 存储卷名称为 "data" configMap: # 指定存储卷的类型为configMap name: cm-games # 指定引用cm的名称,是之前定义的 "cm-games" items: # 指定引用cm的具体key - key: my.cnf # 引用 ConfigMap 中的 my.cnf 键 path: mysql.conf # 指定将 my.cnf 的内容挂载为容器中的 mysql.conf 文件 - key: school # 引用 ConfigMap 中的 school 键 path: school # 指定将 school 的内容挂载为容器中的 school 文件 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 volumeMounts: # 定义容器的挂载卷 - name: data mountPath: /weixiang # 将存储卷挂载到容器内的 /weixiang 目录 这个配置将会: 1、创建一个名为 deploy-xiuxian 的 Deployment 2、创建 3 个相同的 Pod 3、每个 Pod 中会: 将 ConfigMap cm-games 中的 my.cnf 内容挂载为 /weixiang/mysql.conf 文件 将 ConfigMap cm-games 中的 school 内容挂载为 /weixiang/school 文件 这样容器内的应用程序就可以通过 /weixiang/mysql.conf 和 /weixiang/school 访问这些配置数据了。 [root@master231 12-configmaps]# kubectl apply -f 02-deploy-cm-volumes.yaml deployment.apps/deploy-xiuxian created [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-66b58d94fd-4q9f4 1/1 Running 0 4s 10.100.1.58 worker232 <none> <none> deploy-xiuxian-66b58d94fd-ck2wh 1/1 Running 0 4s 10.100.1.59 worker232 <none> <none> deploy-xiuxian-66b58d94fd-w6rp2 1/1 Running 0 4s 10.100.2.46 worker233 <none> <none> [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl exec -it deploy-xiuxian-66b58d94fd-4q9f4 -- sh / # ls -l /weixiang/ total 0 lrwxrwxrwx 1 root root 17 Jul 17 03:28 mysql.conf -> ..data/mysql.conf lrwxrwxrwx 1 root root 13 Jul 17 03:28 school -> ..data/school / # / # cat /weixiang/mysql.conf [mysqld] datadir=/weixiang/data/mysql80 basedir=/weixiang/softwares/mysql80 port=3306 socket=/tmp/mysql80.sock [client] socket=/tmp/mysql80.sock / # [root@master231 12-configmaps]# kubectl delete -f 02-deploy-cm-volumes.yaml deployment.apps "deploy-xiuxian" deleted [root@master231 12-configmaps]#

image

3、pod使用cm之env引用方式(不推荐!)
bash
[root@master231 12-configmaps]# cat 03-deploy-cm-env.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian-volume-env spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: volumes: - name: data configMap: name: cm-games items: - key: my.cnf path: mysql.conf - key: school path: school containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 volumeMounts: - name: data mountPath: /weixiang env: - name: weixiang-SCHOOL # 值从某个地方引用 valueFrom: # 值从一个cm资源引用 configMapKeyRef: # 指定cm资源的名称 name: cm-games # 引用cm的具体key key: school - name: weixiang-class # 值从某个地方引用 valueFrom: # 值从一个cm资源引用 configMapKeyRef: # 指定cm资源的名称 name: cm-games # 引用cm的具体key key: class [root@master231 12-configmaps]# [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl apply -f 03-deploy-cm-env.yaml deployment.apps/deploy-xiuxian-volume-env created [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-volume-env-66c69dfcb9-d7msg 1/1 Running 0 4s 10.100.2.47 worker233 <none> <none> deploy-xiuxian-volume-env-66c69dfcb9-hrttp 1/1 Running 0 4s 10.100.1.60 worker232 <none> <none> deploy-xiuxian-volume-env-66c69dfcb9-vwhmv 1/1 Running 0 4s 10.100.1.61 worker232 <none> <none> [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl exec -it deploy-xiuxian-volume-env-66c69dfcb9-d7msg -- env | grep -i weixiang weixiang-SCHOOL=weixiang weixiang-class=weixiang98
4、基于kubesphere实现configMap的管理和挂载使用
bash
首先创建配置字典

image

image

image

bash
[root@k8s66 ~]#kubectl get cm -n weixiang NAME DATA AGE kube-root-ca.crt 1 9m8s weixiang-xixi 3 51s # 看到三条数据

image

点击 YAML 标签页可以看到具体信息

image

新建部署测试

image

image

image

image

image

bash
页面上拉取镜像失败,只能在节点上手动拉取
[root@k8s77 ~]#ctr -n k8s.io i pull harbor250.weixiang.com/weixiang-xiuxian/apps:v3

image

bash
进入终端测试

image

5、configMap应用案例之nginx的端口修改SubPath
bash
1.编写资源清单 [root@master231 05-configmap-nginx]# cat 01-deploy-xiuxian.yaml apiVersion: v1 kind: ConfigMap metadata: name: nginx-conf data: nginx.conf: | user nginx; worker_processes auto; error_log /var/log/nginx/error.log notice; pid /var/run/nginx.pid; events { worker_connections 4096; } http { include /etc/nginx/mime.types; default_type application/octet-stream; #log_format main '$remote_addr - $remote_user [$time_local] "$request" ' # '$status $body_bytes_sent "$http_referer" ' # '"$http_user_agent" "$http_x_forwarded_for"'; log_format weixiang_nginx_json '{"@timestamp":"$time_iso8601",' '"host":"$server_addr",' '"clientip":"$remote_addr",' '"SendBytes":$body_bytes_sent,' '"responsetime":$request_time,' '"upstreamtime":"$upstream_response_time",' '"upstreamhost":"$upstream_addr",' '"http_host":"$host",' '"uri":"$uri",' '"domain":"$host",' '"xff":"$http_x_forwarded_for",' '"referer":"$http_referer",' '"tcp_xff":"$proxy_protocol_addr",' '"http_user_agent":"$http_user_agent",' '"status":"$status"}'; access_log /var/log/nginx/access.log weixiang_nginx_json; sendfile on; #tcp_nopush on; keepalive_timeout 65; #gzip on; include /etc/nginx/conf.d/*.conf; } default.conf: | server { listen 81; listen [::]:81; server_name localhost; location / { root /usr/share/nginx/html; index index.html index.htm; } error_page 500 502 503 504 /50x.html; location = /50x.html { root /usr/share/nginx/html; } } --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: volumes: - name: main # 第一个存储卷名为main configMap: # 存储卷格式为configMap name: nginx-conf # 引用的ConfigMap名称 items: # 指定引用cm的具体key - key: nginx.conf # 引用 nginx.conf 键 path: nginx.conf # 挂载为 nginx.conf 文件 - name: subconf # 第二个存储卷,名为 subconf configMap: name: nginx-conf items: - key: default.conf # 引用 default.conf 键 path: default.conf # 挂载为 default.conf 文件 containers: - name: c1 volumeMounts: - name: subconf # 挂载的是subconf mountPath: /etc/nginx/conf.d/ # 没有使用subPath,则mountPath挂载点类型默认是目录. - name: main mountPath: /etc/nginx/nginx.conf subPath: nginx.conf # 注意,当subPath等于存储卷的path时,则mountPath挂载点不在是一个目录,而是一个文件! [root@master231 05-configmap-nginx]# 1. ConfigMap 挂载本质 ConfigMap 始终以目录形式存储,即使只包含单个文件,在节点上也会生成目录结构(如 /var/lib/kubelet/.../configmap-volume/)。 2. 挂载到目录(无 subPath) 无 items 参数:完全覆盖目标目录,仅保留 ConfigMap 中的文件(删除目录原有内容)。 有 items 参数: 若修改文件名(如 path: custom.conf):仅添加新文件,不覆盖目录原有文件。 若保持原名(如 path: default.conf):覆盖目标目录中的同名文件,保留其他文件。 3. 挂载到文件(必须用 subPath) 必须通过 subPath 指定从 ConfigMap 中提取的文件名(如 subPath: nginx.conf)。 仅替换目标文件,不影响所在目录的其他内容。 4. 
黄金法则 需完全清空目录 → 不用 items。 需保留非冲突文件 → 用 items 并修改文件名。 需精确替换单个文件 → 用 subPath + 文件路径。 [root@master231 05-configmap-nginx]# kubectl apply -f 01-deploy-xiuxian.yaml configmap/nginx-conf created deployment.apps/deploy-xiuxian created [root@master231 05-configmap-nginx]# [root@master231 05-configmap-nginx]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-78dcbc5d79-c5vgt 1/1 Running 0 4s 10.100.2.57 worker233 <none> <none> deploy-xiuxian-78dcbc5d79-jvbj5 1/1 Running 0 4s 10.100.1.77 worker232 <none> <none> deploy-xiuxian-78dcbc5d79-lv2tv 1/1 Running 0 4s 10.100.1.78 worker232 <none> <none> [root@master231 05-configmap-nginx]# 2.测试验证 [root@master231 05-configmap-nginx]# curl 10.100.2.57:81 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 05-configmap-nginx]# [root@master231 05-configmap-nginx]# kubectl logs -f deploy-xiuxian-78dcbc5d79-c5vgt ... {"@timestamp":"2025-07-17T06:54:45+00:00","host":"10.100.2.57","clientip":"10.100.0.0","SendBytes":357,"responsetime":0.000,"upstreamtime":"-","upstreamhost":"-","http_host":"10.100.2.57","uri":"/index.html","domain":"10.100.2.57","xff":"-","referer":"-","tcp_xff":"-","http_user_agent":"curl/7.81.0","status":"200"}

image

6、configMap应用案例之MySQL主从同步
bash
- configMap应用案例之MySQL主从同步
参考链接: https://www.cnblogs.com/yinzhengjie/p/18974046
使用镜像: harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle
容器启动参数(页面中以逗号分隔填写): --character-set-server=utf8,--collation-server=utf8_bin,--default-authentication-plugin=mysql_native_password
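页面上创建的配置字典大致等价于下面的资源清单(示意,server_id 取值参考后文 SELECT @@server_id 的输出,完整配置以参考链接为准):
bash
cat > cm-mysql.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-master-conf
data:
  my.cnf: |
    [mysqld]
    server_id=100        # 主库server_id,与后文 SELECT @@server_id 输出一致
    log_bin=binlog       # 主库开启二进制日志
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-slave-conf
data:
  my.cnf: |
    [mysqld]
    server_id=200        # 从库server_id
    relay_log=relay-log  # 从库中继日志,对应后文 Relay_Log_File 的输出
EOF
kubectl apply -f cm-mysql.yaml -n weixiang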
1、页面创建
bash
创建配置字典

image

image

新建工作负载

image

image

image

bash
创建后报错,需要指定subPath

image

image

bash
需要指定 subPath

image

bash
# 进入配置文件中进行修改 [root@k8s66 ~]#kubectl edit deployment -n weixiang deploy-master

image

部署slave服务

image

image

bash
这个保存后也会遇到和上面一样的问题(需要指定subPath)
# 进入资源清单中进行修改后生效
[root@k8s66 ~]#kubectl edit deployment -n weixiang deploy-slave

image

创建service服务

image

image

完成创建

image

2、部署服务
bash
[root@k8s66 ~]#kubectl get pods -o wide -n weixiang --show-labels | grep master deploy-master-7d856d45-xcxtw 1/1 Running 0 3h1m 10.100.91.21 k8s77 <none> <none> app=deploy-master,pod-template-hash=7d856d45 [root@k8s66 ~]#kubectl get pods -o wide -l "app in (deploy-master,deploy-slave)" -n weixiang NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-master-7d856d45-xcxtw 1/1 Running 0 3h2m 10.100.91.21 k8s77 <none> <none> deploy-slave-7b674cb6d9-l9wbp 1/1 Running 0 8m36s 10.100.91.23 k8s77 <none> <none> [root@k8s66 ~]#
3、配置主从同步,主库操作
bash
[root@master241 01-mysql-master-slave]# kubectl exec -it deploy-master-7cb4bd8f69-9v74l -- mysql -pyinzhengjie mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 9 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> SELECT @@server_id; +-------------+ | @@server_id | +-------------+ | 100 | +-------------+ 1 row in set (0.00 sec) mysql> CREATE USER jasonyin IDENTIFIED BY 'yinzhengjie'; Query OK, 0 rows affected (0.01 sec) mysql> GRANT REPLICATION SLAVE ON *.* TO 'jasonyin'; Query OK, 0 rows affected (0.00 sec) mysql> SHOW GRANTS FOR jasonyin; +--------------------------------------------------+ | Grants for jasonyin@% | +--------------------------------------------------+ | GRANT REPLICATION SLAVE ON *.* TO `jasonyin`@`%` | +--------------------------------------------------+ 1 row in set (0.00 sec) mysql> SHOW MASTER STATUS; +---------------+----------+--------------+------------------+-------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +---------------+----------+--------------+------------------+-------------------+ | binlog.000002 | 668 | | | | +---------------+----------+--------------+------------------+-------------------+ 1 row in set (0.00 sec) # 进入master终端进行操作

image

image

4、从库和主库建立链接
bash
[root@master241 ~]# kubectl exec -it deploy-slave-b56d49979-vcxst -- mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 8 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> SELECT @@server_id; +-------------+ | @@server_id | +-------------+ | 200 | +-------------+ 1 row in set (0.00 sec) mysql> CHANGE MASTER TO -> MASTER_HOST='svc-mysql', -> MASTER_USER='jasonyin', -> MASTER_PASSWORD='yinzhengjie', -> MASTER_PORT=3306, -> MASTER_LOG_FILE='binlog.000002', -> MASTER_LOG_POS=668, -> MASTER_CONNECT_RETRY=10; Query OK, 0 rows affected, 10 warnings (0.01 sec) mysql> START SLAVE; Query OK, 0 rows affected, 1 warning (0.01 sec) mysql> SHOW SLAVE STATUS\G *************************** 1. row *************************** Slave_IO_State: Waiting for source to send event Master_Host: svc-mysql Master_User: jasonyin Master_Port: 3306 Connect_Retry: 10 Master_Log_File: binlog.000002 Read_Master_Log_Pos: 668 Relay_Log_File: relay-log    .000002 Relay_Log_Pos: 323 Relay_Master_Log_File: binlog.000002 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 668 Relay_Log_Space: 538 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 100 Master_UUID: 027807bd-5d5b-11f0-aa95-6aee3b288e24 Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: Master_public_key_path: Get_master_public_key: 0 Network_Namespace: 1 row in set, 1 warning (0.00 sec) mysql>

image

5、验证主从

image

image

15、secret

1.什么是secret
bash
A Secret is similar to a ConfigMap (cm): both store data, but a Secret is meant for sensitive data such as credentials, X.509 certificates, and login information. Kubernetes base64-encodes the data and stores it in etcd.
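A minimal sketch of that behaviour (the secret name and literal values here are arbitrary examples): base64 is an encoding, not encryption, so anyone who can read the Secret object can recover the value.
bash
# Create a Secret imperatively from literal key/value pairs:
kubectl create secret generic demo-secret \
  --from-literal=username=admin \
  --from-literal=password=yinzhengjie

# The stored value is only base64-encoded, not encrypted:
kubectl get secret demo-secret -o jsonpath='{.data.password}' | base64 -d ; echo

# Clean up the example secret:
kubectl delete secret demo-secret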
2. Hands-on Secret management
bash
2.1 编写资源清单 [root@master231 13-secrets]# cat 01-secrets-demo.yaml apiVersion: v1 kind: Secret metadata: name: login-info stringData: # 以明文形式存储非二进制数据(实际存储时会自动base64编码) host: 10.0.0.250 port: "3306" dbName: wordpress username: admin password: yinzhengjie # stringData 字段允许直接写明文值,Kubernetes 会自动转换为 base64 编码 # 与ConfigMap 的区别:Secret 专为敏感数据设计(会特殊标记) [root@master231 13-secrets]# 2.2 测试验证 [root@master231 13-secrets]# kubectl apply -f 01-secrets-demo.yaml secret/login-info created [root@master231 13-secrets]# [root@master231 13-secrets]# kubectl get -f 01-secrets-demo.yaml NAME TYPE DATA AGE login-info Opaque 5 4s [root@master231 13-secrets]# [root@master231 13-secrets]# kubectl describe secrets login-info Name: login-info Namespace: default Labels: <none> Annotations: <none> Type: Opaque Data ==== host: 10 bytes password: 11 bytes port: 4 bytes username: 5 bytes dbName: 9 bytes [root@master231 13-secrets]# [root@master231 13-secrets]# kubectl get secrets login-info -o yaml apiVersion: v1 data: dbName: d29yZHByZXNz host: MTAuMC4wLjI1MA== password: eWluemhlbmdqaWU= port: MzMwNg== username: YWRtaW4= kind: Secret metadata: annotations: kubectl.kubernetes.io/last-applied-configuration: | {"apiVersion":"v1","kind":"Secret","metadata":{"annotations":{},"name":"login-info","namespace":"default"},"stringData":{"dbName":"wordpress","host":"10.0.0.250","password":"yinzhengjie","port":"3306","username":"admin"}} creationTimestamp: "2025-07-17T08:23:22Z" name: login-info namespace: default resourceVersion: "112029" uid: 14999a32-bd2b-4ffd-8fc1-7180f23bb4fd type: Opaque [root@master231 13-secrets]# [root@master231 13-secrets]# [root@master231 13-secrets]# echo YWRtaW4= | base64 -d; echo admin [root@master231 13-secrets]# [root@master231 13-secrets]# echo eWluemhlbmdqaWU= | base64 -d; echo yinzhengjie [root@master231 13-secrets]# [root@master231 13-secrets]# kubectl delete -f 01-secrets-demo.yaml secret "login-info" deleted [root@master231 13-secrets]# [root@master231 13-secrets]# kubectl get secrets NAME TYPE DATA AGE default-token-vgjrs kubernetes.io/service-account-token 3 8d [root@master231 13-secrets]# # 3.pod引用secret的两种方式 [root@master231 13-secrets]# cat 02-deploy-secret-volume-env.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian-secrets-volume-env spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: volumes: - name: data secret: # 指定存储卷的类型是secret资源 secretName: login-info # 引用的 Secret 名称 items: # 指定secret要引用的key,若不指定则默认引用所有的key - key: host # 从 Secret 中选择 key=host path: host.conf # 存为文件 /weixiang/host.conf - key: port # 选择 key=port path: port.txt # 存为文件 /weixiang/port.txt - key: dbName # 选择 key=dbName path: db.log # 存为文件 /weixiang/db.log containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 volumeMounts: - name: data mountPath: /weixiang env: - name: weixiang-USER # 定义环境变量名 valueFrom: secretKeyRef: # 值从一个secrets资源引用 name: login-info # 指定secrets资源的名称 key: username # 引用的key(值为 admin) - name: weixiang-PWD # 另一个环境变量 valueFrom: secretKeyRef: name: login-info key: password # 引用的 key(值为 yinzhengjie) [root@master231 13-secrets]# [root@master231 13-secrets]# [root@master231 13-secrets]# kubectl apply -f 01-secrets-demo.yaml -f 02-deploy-secret-volume-env.yaml secret/login-info created deployment.apps/deploy-xiuxian-secrets-volume-env created [root@master231 13-secrets]# [root@master231 13-secrets]# [root@master231 13-secrets]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES 
deploy-xiuxian-secrets-volume-env-d8bd575f7-gq9pc 1/1 Running 0 13s 10.100.1.79 worker232 <none> <none> deploy-xiuxian-secrets-volume-env-d8bd575f7-m8zxp 1/1 Running 0 13s 10.100.2.58 worker233 <none> <none> deploy-xiuxian-secrets-volume-env-d8bd575f7-rhwxr 1/1 Running 0 13s 10.100.1.80 worker232 <none> <none> [root@master231 13-secrets]# [root@master231 13-secrets]# kubectl exec -it deploy-xiuxian-secrets-volume-env-d8bd575f7-gq9pc -- sh / # ls -l /weixiang/ total 0 lrwxrwxrwx 1 root root 13 Jul 17 08:30 db.log -> ..data/db.log lrwxrwxrwx 1 root root 16 Jul 17 08:30 host.conf -> ..data/host.conf lrwxrwxrwx 1 root root 15 Jul 17 08:30 port.txt -> ..data/port.txt / # / # cat /weixiang/db.log ; echo wordpress / # / # cat /weixiang/host.conf ; echo 10.0.0.250 / # / # cat /weixiang/port.txt ; echo 3306 / # / # / # env | grep -i weixiang weixiang-PWD=yinzhengjie weixiang-USER=admin / #
3、Managing Secrets in KubeSphere
bash
Create a Secret (shown as "保密字典" in the KubeSphere UI)

image

image

image

bash
Create a workload

image

image

image

image

image

bash
Open a terminal in the container to verify

image

4、Secret case study: pulling images from a private Harbor registry
bash
# 拉取私有仓库镜像需要认证 [root@master231 13-secrets]# cat 03-deploy-secret-harbor.yaml apiVersion: v1 kind: Secret metadata: name: harbor-admin # Secret名称,后续Deployment 会引用 type: kubernetes.io/dockerconfigjson # 指定类型为 Docker 仓库认证 stringData: # 注意,你的环境要将auth字段的用户名和密码做一个替换为'echo -n admin:1 | base64'的输出结果。 .dockerconfigjson: '{"auths":{"harbor250.weixiang.com":{"username":"admin","password":"1","email":"admin@weixiang.com","auth":"YWRtaW46MQ=="}}}' --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian-secrets-harbor spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: volumes: - name: data secret: secretName: login-info # 引用的另一个 Secret(存储数据库配置) items: - key: host path: host.conf - key: port path: port.txt - key: dbName path: db.log # 指定Pod容器拉取镜像的认证信息 imagePullSecrets: - name: "harbor-admin" # 指定上面我们定义的secret名称,将来拉取镜像的时候使用该名称来取 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 # 私有镜像地址 imagePullPolicy: Always # 总是从仓库拉取(避免使用本地缓存) volumeMounts: - name: data mountPath: /weixiang # Secret 文件挂载路径 env: - name: weixiang-USER # 环境变量:从 Secret 注入用户名 valueFrom: secretKeyRef: name: login-info key: username - name: weixiang-PWD # 环境变量:从 Secret 注入密码 valueFrom: secretKeyRef: name: login-info key: password [root@master231 13-secrets]# [root@master231 13-secrets]# [root@master231 13-secrets]# kubectl apply -f 03-deploy-secret-harbor.yaml secret/harbor-admin created deployment.apps/deploy-xiuxian-secrets-harbor created [root@master231 13-secrets]# [root@master231 13-secrets]# [root@master231 13-secrets]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-secrets-harbor-584f65957f-2pfsr 1/1 Running 0 14s 10.100.1.88 worker232 <none> <none> deploy-xiuxian-secrets-harbor-584f65957f-6m9pn 1/1 Running 0 14s 10.100.1.87 worker232 <none> <none> deploy-xiuxian-secrets-harbor-584f65957f-x499g 1/1 Running 0 14s 10.100.2.62 worker233 <none> <none> [root@master231 13-secrets]#
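For reference, the same harbor-admin Secret can also be produced imperatively; kubectl builds the .dockerconfigjson payload and the auth field for you. A sketch using the same registry and credentials as the manifest above:
bash
# Print the equivalent manifest without creating anything:
kubectl create secret docker-registry harbor-admin \
  --docker-server=harbor250.weixiang.com \
  --docker-username=admin \
  --docker-password=1 \
  --docker-email=admin@weixiang.com \
  --dry-run=client -o yaml
# Drop '--dry-run=client -o yaml' to actually create the Secret instead of printing it.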

Change the password to an incorrect value and test again

image

image

bash
[root@master231 ~/count/13-secrets]#kubectl describe pods deploy-xiuxian-secrets-harbor-584f65957f-ckqzp

image

5、Harbor authentication with a Secret in KubeSphere

image

image

image

image

bash
Create a new workload to test

image

image

image

image

image

image

image

16、Using ServiceAccounts (sa)

1、ServiceAccount basics
bash
1.什么是sa 'sa'"ServiceAccount"的简称,表示服务账号,是Pod用来进行身份认证的资源。当 Pod 里的应用程序需要访问 Kubernetes API 来执行操作时(比如,查询其他 Pod 的信息、创建或删除资源等),它就需要一个身份,这个身份就是 Service Account。 SA 本身只是一个身份,它没有任何权限 你需要通过创建 Role (角色) 或 ClusterRole (集群角色),在其中定义具体的权限(比如,可以对哪些资源的哪些动词进行操作,如 get, list, watch, create 等)。 然后,通过创建 RoleBinding (角色绑定) 或 ClusterRoleBinding (集群角色绑定),将这个 SA 和之前创建的 Role/ClusterRole “绑定”在一起 公式:SA (谁) + Role (能做什么) + RoleBinding (在哪里能做) = 完整的授权 "k8s 1.23-"创建sa时会自动创建secret和token信息。从"k8s 1.24+"版本创建sa时不会创建secret和token信息。 2.sa的基本管理 2.1 响应式管理sa [root@master231 ~]# kubectl api-resources | grep sa serviceaccounts sa v1 true ServiceAccount [root@master231 ~]# [root@master231 ~]# [root@master231 ~]# kubectl create serviceaccount myuser serviceaccount/myuser created [root@master231 ~]# [root@master231 ~]# kubectl get sa NAME SECRETS AGE default 1 8d myuser 1 7s [root@master231 ~]# [root@master231 ~]# kubectl describe sa myuser Name: myuser Namespace: default Labels: <none> Annotations: <none> Image pull secrets: <none> Mountable secrets: myuser-token-xswx2 Tokens: myuser-token-xswx2 Events: <none> [root@master231 ~]# [root@master231 ~]# [root@master231 ~]# kubectl get secrets myuser-token-xswx2 NAME TYPE DATA AGE myuser-token-xswx2 kubernetes.io/service-account-token 3 34s [root@master231 ~]# [root@master231 ~]# kubectl delete sa myuser serviceaccount "myuser" deleted [root@master231 ~]# [root@master231 ~]# kubectl get secrets myuser-token-xswx2 Error from server (NotFound): secrets "myuser-token-xswx2" not found [root@master231 ~]# [root@master231 ~]# 2.2 声明式管理sa [root@master231 14-serviceAccount]# kubectl create sa xixi -o yaml --dry-run=client > 01-sa-xixi.yaml [root@master231 14-serviceAccount]# [root@master231 14-serviceAccount]# cat 01-sa-xixi.yaml apiVersion: v1 kind: ServiceAccount metadata: name: xixi [root@master231 14-serviceAccount]# [root@master231 14-serviceAccount]# kubectl apply -f 01-sa-xixi.yaml serviceaccount/xixi created [root@master231 14-serviceAccount]# [root@master231 14-serviceAccount]# kubectl get sa NAME SECRETS AGE default 1 8d xixi 1 3s [root@master231 14-serviceAccount]# [root@master231 14-serviceAccount]# kubectl delete -f 01-sa-xixi.yaml serviceaccount "xixi" deleted [root@master231 14-serviceAccount]# [root@master231 14-serviceAccount]# kubectl get sa NAME SECRETS AGE default 1 8d [root@master231 14-serviceAccount]#
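A hedged sketch of the "SA + Role + RoleBinding" formula described above, reusing the demo ServiceAccount name xixi (create it first if it no longer exists); the role and binding names are arbitrary examples.
bash
# Grant the ServiceAccount read-only access to Pods in the default namespace:
kubectl create role pod-reader --verb=get,list,watch --resource=pods
kubectl create rolebinding xixi-pod-reader \
  --role=pod-reader \
  --serviceaccount=default:xixi

# Verify what the ServiceAccount is now allowed to do:
kubectl auth can-i list pods --as=system:serviceaccount:default:xixi     # yes
kubectl auth can-i delete pods --as=system:serviceaccount:default:xixi   # no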
2、Pod pulls images from Harbor via a ServiceAccount that references a Secret (extended)
bash
1.harbor仓库创建账号信息 参考案例 用户名: weixiang98 密码: Linux98@2025 邮箱:weixiang98@weixiang.com 2.将私有项目添加创建的weixiang98用户 3.编写资源清单 [root@master231 14-serviceAccount]# cat 02-deploy-sa-secrets-harbor.yaml apiVersion: v1 kind: Secret metadata: name: harbor-weixiang98 type: kubernetes.io/dockerconfigjson stringData: # 解码: # echo bGludXg5ODpMaW51eDk4QDIwMjU= | base64 -d # 编码: # echo -n weixiang98:Linux98@2025 | base64 .dockerconfigjson: '{"auths":{"harbor250.weixiang.com":{"username":"weixiang98","password":"Linux98@2025","email":"weixiang98@weixiang.com","auth":"bGludXg5ODpMaW51eDk4QDIwMjU="}}}' --- apiVersion: v1 kind: ServiceAccount metadata: name: sa-weixiang98 # 让sa绑定secret,将来与sa认证时,会使用secret的认证信息。 imagePullSecrets: - name: harbor-weixiang98 --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian-sa-secrets-harbor spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: #imagePullSecrets: #- name: "harbor-weixiang98" # 指定服务账号,该字段官方已经弃用。推荐使用'serviceAccountName',如果不指定,则默认名称为"default"的sa。 # serviceAccount: sa-weixiang98 # 推荐使用该字段来指定sa的认证信息 serviceAccountName: sa-weixiang98 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 imagePullPolicy: Always [root@master231 14-serviceAccount]# [root@master231 14-serviceAccount]# [root@master231 14-serviceAccount]# kubectl apply -f 02-deploy-sa-secrets-harbor.yaml secret/harbor-weixiang98 created serviceaccount/sa-weixiang98 created deployment.apps/deploy-xiuxian-sa-secrets-harbor created [root@master231 14-serviceAccount]# [root@master231 14-serviceAccount]# kubectl get pods -o wide

image

bash
[root@master231 ~/count/14-serviceAccount]#kubectl apply -f 02-deploy-sa-secrets-harbor.yaml secret/harbor-weixiang98 created serviceaccount/sa-weixiang98 created deployment.apps/deploy-xiuxian-sa-secrets-harbor created # 显示镜像拉取失败 [root@master231 ~/count/14-serviceAccount]#kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-sa-secrets-harbor-66b99db97d-dsrsx 0/1 ImagePullBackOff 0 6s 10.100.1.67 worker232 <none> <none> deploy-xiuxian-sa-secrets-harbor-66b99db97d-pvwvc 0/1 ImagePullBackOff 0 6s 10.100.1.69 worker232 <none> <none> deploy-xiuxian-sa-secrets-harbor-66b99db97d-xzcrj 0/1 ErrImagePull 0 6s 10.100.2.79 worker233 # 查看详细信息 [root@master231 ~/count/14-serviceAccount]#kubectl describe pods deploy-xiuxian-sa-secrets-harbor-66b99db97d-dsrsx

image

Create the user in Harbor and add it to the private project

image

image

bash
Pull the image again; this time it succeeds

image
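A related shortcut, sketched here rather than taken from the lab steps: the pull secret can also be attached to the namespace's default ServiceAccount, so Pods that do not set serviceAccountName inherit it automatically.
bash
# Attach the existing pull secret to the default ServiceAccount of the namespace:
kubectl patch serviceaccount default \
  -p '{"imagePullSecrets": [{"name": "harbor-weixiang98"}]}'

# Confirm the association:
kubectl get sa default -o jsonpath='{.imagePullSecrets}' ; echo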

17、Built-in access control mechanisms of the API Server

1、Overview
bash
API Server的访问方式: - 集群外部: https://IP:Port 比如: https://10.0.0.231:6443 - 集群内部: https://kubernetes.default.svc.weixiang.com 比如: 直接基于名为"kubernetes"的svc访问即可 [root@master231 ~]# kubectl get svc kubernetes NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 6d22h [root@master231 ~]# [root@master231 ~]# kubectl describe svc kubernetes | grep Endpoints Endpoints: 10.0.0.231:6443 [root@master231 ~]# API Server内置了插件化的访问控制机制(每种访问控制机制均有一组专用的插件栈) 认证(Authentication): 核验请求者身份的合法性,进行身份识别,验证客户端身份。 身份核验过程遵循“或”逻辑,且任何一个插件核验成功后都将不再进行后续的插件验证。 均不成功,则失败,或以“匿名者”身份访问,建议禁用“匿名者”。 授权(Authorization): 核验请求的操作是否获得许可,验证客户端是否有权限操作资源对象。 鉴权过程遵循“或”逻辑,且任何一个插件对操作的许可授权后都将不再进行后续的插件验证。 均未许可,则拒绝请求的操作 准入控制(Admission Control): 检查操作内容是否合规,仅同"写"请求相关,负责实现"检验"字段类型是否合法及和补全默认字段。 内容合规性检查过程遵循“与”逻辑,且无论成败,每次的操作请求都要经由所有插件的检验。 将数据写入etcd前,负责检查内容的有效性,因此仅对“写”操作有效。 分两类:validating(校验)和 mutating(补全或订正)。 2.身份认证策略 - X.509客户端证书认证: 在双向TLS通信中,客户端持有数字证书信任的CA,需要在kube-apiserver程序启动时,通过--client-ca-file选项传递。 认证通过后,客户端数字证书中的CN(Common Name)即被识别为用户名,而O(Organization)被识别为组名。 kubeadm部署的K8s集群,默认使用"/etc/kubernetes/pki/ca.crt"(各组件间颁发数字证书的CA)进行客户端认证。 - 持有者令牌(token): - 1.静态令牌文件(Static Token File): 令牌信息保存于文本文件中,由kube-apiserver在启动时通过--token-auth-file选项加载。 加载完成后的文件变动,仅能通过重启程序进行重载,因此,相关的令牌会长期有效。 客户端在HTTP请求中,通过“Authorization: Bearer TOKEN”标头附带令牌令牌以完成认证。 - 2.Bootstrap令牌: 一般用于加入集群时使用,尤其是在集群的扩容场景时会用到。 - 3.Service Account令牌: 该认证方式将由kube-apiserver程序内置直接启用,它借助于经过签名的Bearer Token来验证请求。 签名时使用的密钥可以由--service-account-key-file选项指定,也可以默认使用API Server的tls私钥 用于将Pod认证到API Server之上,以支持集群内的进程与API Server通信。 K8s可使用ServiceAccount准入控制器自动为Pod关联ServiceAccount。 - 4.OIDC(OpenID Connect)令牌: 有点类似于"微信""支付宝"认证的逻辑,自建的话需要配置认证中心。 OAuth2认证机制,通常由底层的IaaS服务所提供。 - 5.Webhook令牌: 基于web的形式进行认证,比如之前配置的"钉钉机器人""微信机器人"等; 是一种用于验证Bearer Token的回调机制,能够扩展支持外部的认证服务,例如LDAP等。 - 身份认证代理(Authenticating Proxy): 由kube-apiserver从请求报文的特定HTTP标头中识别用户身份,相应的标头名称可由特定的选项配置指定。 kube-apiserver应该基于专用的CA来验证代理服务器身份。 - 匿名请求: 生产环境中建议禁用匿名认证。 3.Kubernetes上的用户 “用户”即服务请求者的身份指代,一般使用身份标识符进行识别,比如用户名,用户组,服务账号,匿名用户等。 Kubernetes系统的用户大体可分Service Account,User Account和Anonymous Account。 Service Account: Kubernetes内置的资源类型,用于Pod内的进程访问API Server时使用的身份信息。 引用格式: "system:serviceaccount:NAMESPACE:SA_NAME" User Account: 用户账户,指非Pod类的客户端访问API Server时使用的身份标识,一般是现实中的"人"。 API Server没有为这类账户提供保存其信息的资源类型,相关的信息通常保存于外部的文件(特指"kubeconfig"文件)或认证系统中。 身份核验操作可由API Server进行,也可能是由外部身份认证服务完成。 可以手动定义证书,其中O字段表示组,CN字段表示用户名。 Anonymous Account: 不能被识别为Service Account,也不能被识别为User Account的用户。 这类账户K8S系统称之为"system:anonymous",即“匿名用户”。
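As a hedged illustration of the in-cluster ServiceAccount token path described above (not part of the original lab): every Pod gets its ServiceAccount token and the cluster CA mounted at a fixed path, so a container that has curl can talk to the API Server directly. Whether the request is then allowed depends on the RBAC rules bound to that ServiceAccount.
bash
# Run inside any container that has curl:
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
CACERT=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

# KUBERNETES_SERVICE_HOST/PORT are injected into every Pod by kubelet:
curl --cacert ${CACERT} \
     -H "Authorization: Bearer ${TOKEN}" \
     https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/api/v1/namespaces/default/pods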
2、Static token file authentication test
bash
1.模拟生成token 1.1 方式1 : [root@master231 ~]# echo "$(openssl rand -hex 3).$(openssl rand -hex 8)" 01b202.d5c4210389cbff08 [root@master231 ~]# [root@master231 ~]# echo "$(openssl rand -hex 3).$(openssl rand -hex 8)" 497804.9fc391f505052952 [root@master231 ~]# [root@master231 ~]# [root@master231 ~]# echo "$(openssl rand -hex 3).$(openssl rand -hex 8)" 8fd32c.0868709b9e5786a8 [root@master231 ~]# 1.2 方式2 : [root@master231 ~]# kubeadm token generate jvt496.ls43vufojf45q73i [root@master231 ~]# [root@master231 ~]# kubeadm token generate qo7azt.y27gu4idn5cunudd [root@master231 ~]# [root@master231 ~]# [root@master231 ~]# kubeadm token generate mic1bd.mx3vohsg05bjk5rr [root@master231 ~]# 2.创建csv文件 [root@master231 ~]# cd /etc/kubernetes/pki/ [root@master231 pki]# [root@master231 pki]# cat > token.csv <<EOF 01b202.d5c4210389cbff08,yinzhengjie,10001,k8s 497804.9fc391f505052952,jasonyin,10002,k8s 8fd32c.0868709b9e5786a8,linux97,10003,k3s jvt496.ls43vufojf45q73i,weixiang98,10004,k3s qo7azt.y27gu4idn5cunudd,linux99,10005,k3s mic1bd.mx3vohsg05bjk5rr,linux100,10006,k3s 6e5wc3.hgeezdh5dr0yuq8t,gangzi,10007,k8s EOF 温馨提示: 文件格式为CSV,每行定义一个用户,由“令牌、用户名、用户ID和所属的用户组”四个字段组成,用户组为可选字段 具体格式: token,user,uid,"group1,group2,group3",但K8S 1.23.17版本中实际测试貌似多个组使用逗号分割存在一定问题。 3.修改api-server参数加载token文件 [root@master231 pki]# vim /etc/kubernetes/manifests/kube-apiserver.yaml ... spec: containers: - command: - kube-apiserver - --token-auth-file=/etc/kubernetes/pki/token.csv ... volumeMounts: ... - mountPath: /etc/kubernetes/pki/token.csv name: yinzhengjie-static-token-file readOnly: true ... volumes: ... - hostPath: path: /etc/kubernetes/pki/token.csv type: File name: yinzhengjie-static-token-file ... [root@master231 pki]# mv /etc/kubernetes/manifests/kube-apiserver.yaml /opt/ [root@master231 pki]# [root@master231 pki]# mv /opt/kube-apiserver.yaml /etc/kubernetes/manifests/ [root@master231 pki]# [root@master231 pki]# kubectl get pods -n kube-system -l component=kube-apiserver -o wide # 最少要等待30s+ NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES kube-apiserver-master231 1/1 Running 0 32s 10.0.0.231 master231 <none> <none> [root@master231 pki]# 4.kubectl使用token认证并指定api-server的证书 [root@worker232 ~]# kubectl --server=https://10.1.24.13:6443 --certificate-authority=/etc/kubernetes/pki/ca.crt --token=01b202.d5c4210389cbff08 get nodes Error from server (Forbidden): nodes is forbidden: User "yinzhengjie" cannot list resource "nodes" in API group "" at the cluster scope # --server: https://10.1.24.13:6443 # 6443: 这是 API Server: 默认监听的安全端口 # --certificate-authority=/etc/kubernetes/pki/ca.crt: 指定了证书颁发机构(CA)的证书文件 # --token=01b202.d5c4210389cbff08: 这个参数提供了用于身份认证的令牌 [root@worker232 ~]# [root@worker232 ~]# kubectl --server=https://10.1.24.13:6443 --certificate-authority=/etc/kubernetes/pki/ca.crt --token=jvt496.ls43vufojf45q73i get nodes Error from server (Forbidden): nodes is forbidden: User "weixiang98" cannot list resource "nodes" in API group "" at the cluster scope [root@worker232 ~]# [root@worker232 ~]# kubectl --server=https://10.1.24.13:6443 --certificate-authority=/etc/kubernetes/pki/ca.crt --token=oldboy.yinzhengjiejason get nodes error: You must be logged in to the server (Unauthorized) # 未认证! 
[root@worker232 ~]# [root@worker232 ~]# kubectl --server=https://10.1.24.13:6443 --certificate-authority=/etc/kubernetes/pki/ca.crt get nodes # 不使用token登录,判定为匿名用户 Please enter Username: admin Please enter Password: Error from server (Forbidden): nodes is forbidden: User "system:anonymous" cannot list resource "nodes" in API group "" at the cluster scope [root@worker232 ~]# kubectl --server=https://10.1.24.13:6443 --certificate-authority=/etc/kubernetes/pki/ca.crt --token=6e5wc3.hgeezdh5dr0yuq8t get nodes Error from server (Forbidden): nodes is forbidden: User "gangzi" cannot list resource "nodes" in API group "" at the cluster scope 5.curl基于token认证案例 [root@harbor250.weixiang.com ~]# curl -k https://106.55.44.37:6443;echo # 如果不指定认证信息,将被识别为匿名"system:anonymous"用户。 { "kind": "Status", "apiVersion": "v1", "metadata": {}, "status": "Failure", "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"", "reason": "Forbidden", "details": {}, "code": 403 } [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# curl -k -H "Authorization: Bearer 01b202.d5c4210389cbff08" https://10.1.24.13:6443/api/v1/pods;echo { "kind": "Status", "apiVersion": "v1", "metadata": {}, "status": "Failure", "message": "pods is forbidden: User \"yinzhengjie\" cannot list resource \"pods\" in API group \"\" at the cluster scope", "reason": "Forbidden", "details": { "kind": "pods" }, "code": 403 } [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# curl -k -H "Authorization: Bearer jvt496.ls43vufojf45q73i" https://10.1.24.13:6443/api/v1/pods;echo { "kind": "Status", "apiVersion": "v1", "metadata": {}, "status": "Failure", "message": "pods is forbidden: User \"weixiang98\" cannot list resource \"pods\" in API group \"\" at the cluster scope", "reason": "Forbidden", "details": { "kind": "pods" }, "code": 403 } [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# curl -k -H "Authorization: Bearer oldboy.yinzhengjiejason" https://10.0.0.231:6443/api/v1/pods;echo { "kind": "Status", "apiVersion": "v1", "metadata": {}, "status": "Failure", "message": "Unauthorized", "reason": "Unauthorized", "code": 401 } [root@harbor250.weixiang.com ~]#
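The 403 Forbidden responses above confirm that authentication succeeded but authorization has not been configured yet. A minimal sketch of granting one of the token users read-only access to nodes (run on the master with admin credentials; the role and binding names are arbitrary):
bash
# Define a ClusterRole that only allows reading nodes, and bind it to the user "yinzhengjie":
kubectl create clusterrole node-reader --verb=get,list,watch --resource=nodes
kubectl create clusterrolebinding yinzhengjie-node-reader \
  --clusterrole=node-reader \
  --user=yinzhengjie

# The earlier request should now succeed instead of returning Forbidden:
kubectl --server=https://10.1.24.13:6443 \
        --certificate-authority=/etc/kubernetes/pki/ca.crt \
        --token=01b202.d5c4210389cbff08 get nodes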
3、X.509 client certificate authentication test
bash
1.客户端节点创建证书签发请求 1.1 创建证书签署请求的密钥 # 233节点生成私钥 [root@worker233 ~]# openssl genrsa -out jiege.key 2048 [root@worker233 ~]# [root@worker233 ~]# ll jiege.key -rw------- 1 root root 1704 Apr 14 10:43 jiege.key [root@worker233 ~]# 1.2 创建证书签署请求 # 使用一个已存在的私钥(jiege.key)来生成一个证书签名请求(Certificate Signing Request, CSR)文件(jiege.csr) [root@worker233 ~]# openssl req -new -key jiege.key -out jiege.csr -subj "/CN=jiege/O=weixiang" # openssl req:用密码学工具包创建新的 CSR # -new:这个选项告诉 req 子命令,我们要创建一个新的证书签名请求 # -key jiege.key:指定了生成此 CSR 所需的私钥文件。 # -out jiege.csr:指定了生成的输出文件的名称 # -subj "/CN=jiege/O=weixiang":主体信息,随意更改 [root@worker233 ~]# ll jiege* -rw-r--r-- 1 root root 911 Apr 14 10:43 jiege.csr -rw------- 1 root root 1704 Apr 14 10:43 jiege.key [root@worker233 ~]# 1.3 将证书签署请求证书使用base64编码【改成你自己的证书内容,我的跟你不一样!】 [root@worker233 ~]# cat jiege.csr | base64 | tr -d '\n';echo LS0tLS1CRUdJTiBDRVJUSUZJQ0FURSBSRVFVRVNULS0tLS0KTUlJQ2FUQ0NBVkVDQVFBd0pERU9NQXdHQTFVRUF3d0ZhbWxsWjJVeEVqQVFCZ05WQkFvTUNXOXNaR0p2ZVdWawpkVENDQVNJd0RRWUpLb1pJaHZjTkFRRUJCUUFEZ2dFUEFEQ0NBUW9DZ2dFQkFKeVJaR3ZOYUdudjN3dk5CQ0llCmtDQ1l2SWZSTlF4QXU3a1VrK21Cb1JlcUdqUTJOZUljYkd1ai9QWDZMYy9IUldTVnVKeWJpYkZNRDh2UzVsU1oKenpKUjhnVC9ubWFTd25zMVhTaFBaQU5SaFNlczdGWGlSUXpxcjd4a1BaWTVtNkJCRUlTQUdkMElOSG12NXRERApENk0yS3EwR3pYMWJUMVA5NzU5aUhGY3R0Q3lVcVRraUJYL1ZueHc1c1VuNFpUSE5PVWRRVk1kcFNkQVVZMWc3CkxrSEt0VkRWRVNzNUh6OTBvREhmcGc1RW1ISElFZFhZZllOcDc3eHR4S0ZES3VRU1JzUEZCeGtwUzZaamU0TmsKUnY3Ujh1ckppSCtBRWJIcTR2NFZHc2ZLaU1aOURCaGJSbnR2a0k5WTRFR250WWUwVUUwYlc2K25Ua1VveUFFMQp1RE1DQXdFQUFhQUFNQTBHQ1NxR1NJYjNEUUVCQ3dVQUE0SUJBUUE1bHB1QW1FVEJBdE5uQTdIMU5pVWRQcy9MCk1PbUJhZkVUYjRxMmszZmNncTVmRmM3NzZOa1B2NEhsYW9yZW03YkVPSk5ZMVlFd3ppQlYxMHF6RWg3Q0dXM1gKMFJwZkpjRGh4TnMzMVZkVlVuY3RRUzAyMzZNdFBEeE1PNjZlYU81dFUzcWt0cDJkN2N5RXBFRyswU2VXU3JMNAozTkpxd3pRUDNRNnJaNk9ONGw1bm9uNFRSNnM1b2VYNDVCMUhldnpjck1pc1pSM2ZZYWFWQmZxcjhLL0l2b25IClh4SlBBMkY1MHk4TkV0MDNjZWxFK3Q5LzdDUURvaUR1M2pBSVl5bVpUUmxGV3EyRno2dmFrWlBWOTMyNm5pQjMKK0N6czYrZysveEpTQlNaV3d0b3F2OHR4SjJPdjBYNDRmdFo5WlpMd2ZVeFduRzNNME1Uc0dmekluN3dRCi0tLS0tRU5EIENFUlRJRklDQVRFIFJFUVVFU1QtLS0tLQo= 2.服务端签发证书 1.2 为客户端创建csr资源证书签发请求资源清单 [root@master231 ~]# cat > csr-jiege.yaml << EOF apiVersion: certificates.k8s.io/v1 kind: CertificateSigningRequest metadata: name: jiege-csr spec: # 将客户端证书签发请求使用base64编码存储,拷贝上一步生成的base64代码替换即可。 request:eEVqQVFCZ05WQkFvTUNXOXNaR0p2ZVdWawpkVENDQVNJd0RRWUpLb1pJaHZjTkFRRUJCUUFEZ2dFUEFEQ0NBUW9DZ2dFQkFKeVJaR3ZOYUdudjN3dk5CQ0llCmtDQ1l2SWZSTlF4QXU3a1VrK21Cb1JlcUdqUTJOZUljYkd1ai9QWDZMYy9IUldTVnVKeWJpYkZNRDh2UzVsU1oKenpKUjhnVC9ubWFTd25zMVhTaFBaQU5SaFNlczdGWGlSUXpxcjd4a1BaWTVtNkJCRUlTQUdkMElOSG12NXRERApENk0yS3EwR3pYMWJUMVA5NzU5aUhGY3R0Q3lVcVRraUJYL1ZueHc1c1VuNFpUSE5PVWRRVk1kcFNkQVVZMWc3CkxrSEt0VkRWRVNzNUh6OTBvREhmcGc1RW1ISElFZFhZZllOcDc3eHR4S0ZES3VRU1JzUEZCeGtwUzZaamU0TmsKUnY3Ujh1ckppSCtBRWJIcTR2NFZHc2ZLaU1aOURCaGJSbnR2a0k5WTRFR250WWUwVUUwYlc2K25Ua1VveUFFMQp1RE1DQXdFQUFhQUFNQTBHQ1NxR1NJYjNEUUVCQ3dVQUE0SUJBUUE1bHB1QW1FVEJBdE5uQTdIMU5pVWRQcy9MCk1PbUJhZkVUYjRxMmszZmNncTVmRmM3NzZOa1B2NEhsYW9yZW03YkVPSk5ZMVlFd3ppQlYxMHF6RWg3Q0dXM1gKMFJwZkpjRGh4TnMzMVZkVlVuY3RRUzAyMzZNdFBEeE1PNjZlYU81dFUzcWt0cDJkN2N5RXBFRyswU2VXU3JMNAozTkpxd3pRUDNRNnJaNk9ONGw1bm9uNFRSNnM1b2VYNDVCMUhldnpjck1pc1pSM2ZZYWFWQmZxcjhLL0l2b25IClh4SlBBMkY1MHk4TkV0MDNjZWxFK3Q5LzdDUURvaUR1M2pBSVl5bVpUUmxGV3EyRno2dmFrWlBWOTMyNm5pQjMKK0N6czYrZysveEpTQlNaV3d0b3F2OHR4SjJPdjBYNDRmdFo5WlpMd2ZVeFduRzNNME1Uc0dmekluN3dRCi0tLS0tRU5EIENFUlRJRklDQVRFIFJFUVVFU1QtLS0tLQo= # 指定颁发证书的请求类型,仅支持如下三种,切均可以由kube-controllmanager中的“csrsigning”控制器发出。 # "kubernetes.io/kube-apiserver-client": # 
颁发用于向kube-apiserver进行身份验证的客户端证书。 # 对该签名者的请求是Kubernetes控制器管理器从不自动批准。 # # "kubernetes.io/kube-apiserver-client-kubelet": # 颁发kubelets用于向kube-apiserver进行身份验证的客户端证书。 # 对该签名者的请求可以由kube-controllermanager中的“csrapproving”控制器自动批准。 # # "kubernetes.io/kubelet-serving": # 颁发kubelets用于服务TLS端点的证书,kube-apiserver可以连接到这些端点安全。 # 对该签名者的请求永远不会被kube-controllmanager自动批准。 signerName: kubernetes.io/kube-apiserver-client # 指定证书的过期时间,此处我设置的是24h(3600*24=86400) expirationSeconds: 86400 # 指定在颁发的证书中请求的一组密钥用法。 # 对TLS客户端证书的请求通常请求: # “数字签名(digital signature)”、“密钥加密(key encipherment)”、“客户端身份验证(client auth)”。 # 对TLS服务证书的请求通常请求: # “密钥加密(key encipherment)”、“数字签名(digital signature)”、“服务器身份验证(server auth)”。 # # 有效值的值为: "signing", "digital signature", "content commitment", "key encipherment","key agreement", # "data encipherment", "cert sign", "crl sign", "encipher only", "decipher only", "any", "server auth", # "client auth", "code signing", "email protection", "s/mime", "ipsec end system", "ipsec tunnel","ipsec user", # "timestamping", "ocsp signing", "microsoft sgc", "netscape sgc"。 usages: - client auth EOF 1.2 创建证书签发请求 [root@master231 15-CertificateSigningRequest]# kubectl get csr No resources found [root@master231 15-CertificateSigningRequest]# [root@master231 15-CertificateSigningRequest]# kubectl apply -f csr-jiege.yaml certificatesigningrequest.certificates.k8s.io/jiege-csr created [root@master231 15-CertificateSigningRequest]# [root@master231 15-CertificateSigningRequest]# kubectl get csr NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION jiege-csr 1s kubernetes.io/kube-apiserver-client kubernetes-admin 24h Pending [root@master231 15-CertificateSigningRequest]# 1.3 服务端手动签发证书 [root@master231 15-CertificateSigningRequest]# kubectl certificate approve jiege-csr certificatesigningrequest.certificates.k8s.io/jiege-csr approved [root@master231 15-CertificateSigningRequest]# [root@master231 15-CertificateSigningRequest]# kubectl get csr NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION jiege-csr 34s kubernetes.io/kube-apiserver-client kubernetes-admin 24h Approved,Issued [root@master231 15-CertificateSigningRequest]# 1.4 获取签发后的证书 [root@master231 15-CertificateSigningRequest]# kubectl get csr jiege-csr -o jsonpath='{.status.certificate}' | base64 -d > /opt/jiege.crt [root@master231 15-CertificateSigningRequest]# [root@master231 15-CertificateSigningRequest]# ll /opt/jiege.crt -rw-r--r-- 1 root root 1115 Jul 18 10:36 /opt/jiege.crt [root@master231 15-CertificateSigningRequest]# [root@master231 15-CertificateSigningRequest]# cat /opt/jiege.crt -----BEGIN CERTIFICATE----- MIIDCTCCAfGgAwIBAgIQTN2oV0V6lhvCUNuJ7RapMzANBgkqhkiG9w0BAQsFADAV MRMwEQYDVQQDEwprdWJlcm5ldGVzMB4XDTI1MDcxODAzMjcyOVoXDTI1MDcxOTAz MjcyOVowJDESMBAGA1UEChMJb2xkYm95ZWR1MQ4wDAYDVQQDEwVqaWVnZTCCASIw DQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAJ4K8sFPHEdOhLmZ0kf/wYN6BH+g IXkeZYW1wXgV07XZSaCyMOd3QnHHoVqsj3MZEgP2BL/iDoaAYlelnfdX88XvG33a irip5pIEOSlIjt4KDxUG4FPTK+o9UnjVUFm1NkztygPGmp6ZuC4s3qZYL0qx5dv6 CcIw9+WAhjotr4vFAuwBJFlfAc/X/QSa31+1gzAft2+2fTXmpYf/tnccmnSDRsGR E/mivNghL/HOLqMfX4pQJVy5kybrkPTR0ZlYQwZ6iJZALPUq1HoHN4E74jPkgYaT +6brVe2xNymd3AtJfdlKY/rYw+NrcjU8l5ZpoPrpvGfRP0fMFmUKflD4WJsCAwEA AaNGMEQwEwYDVR0lBAwwCgYIKwYBBQUHAwIwDAYDVR0TAQH/BAIwADAfBgNVHSME GDAWgBQ9Pc+SKeFCzROqyIK9kyMQaKTnozANBgkqhkiG9w0BAQsFAAOCAQEASfgp vxnUnuDgbVGgv/gGpybyiNFocOmFLf9P+IwMCt0+ruPgoOuT2oMTPEJiJwCLIxhI RFyactx5JmVa+WdtJyZtKcuF1RV8RKSXuqIf7rQVqB6zZilPMCvq68o3J9tVkTsM 7VEE0wO4t4nMmxuG2UGgZiv/zvI1TGJ7RZuEu/RjCnYRym1/vWeyAEuNCB2O6hUq 
sk/owjWidBXcON/xvtSWVTYBE+W2+VpgWsjryIRR1G7lUu1H4hqrFGaUPQFo0+h6 CR1SEgEHbTAqoiIMFRSjq6O2LJrWRaxuI22ipNv2xjWFDZYmGS/AZ+Ynm6E68/NF YMFuvlMEi4RuCaDgXw== -----END CERTIFICATE----- [root@master231 15-CertificateSigningRequest]# 1.5 将证书拷贝到客户端节点,便于后续使用 [root@master231 ~]# scp /opt/jiege.crt 10.1.24.4:~ 3.客户端测试验证 3.1 查看本地证书文件 [root@worker233 ~]# ll jiege.* -rw-r--r-- 1 root root 1115 Jun 3 10:22 jiege.crt -rw-r--r-- 1 root root 911 Jun 3 10:12 jiege.csr -rw------- 1 root root 1704 Jun 3 10:11 jiege.key [root@worker233 ~]# 3.2 访问api-server [root@worker233 ~]# kubectl -s https://10.1.24.13:6443 --client-key jiege.key --client-certificate jiege.crt --insecure-skip-tls-verify get nodes # --client-key jiege.key:之前是token访问,现在直接用文件进行访问 Error from server (Forbidden): nodes is forbidden: User "jiege" cannot list resource "nodes" in API group "" at the cluster scope [root@worker233 ~]# [root@worker233 ~]# [root@worker233 ~]# kubectl -s https://10.1.24.13:6443 --client-key jiege.key --client-certificate jiege.crt --insecure-skip-tls-verify get pods Error from server (Forbidden): pods is forbidden: User "jiege" cannot list resource "pods" in API group "" in the namespace "default" [root@worker233 ~]#
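A small optional follow-up, shown as a sketch: inspect the identity encoded in the issued certificate (the CN becomes the username, the O the group) and probe permissions with kubectl auth can-i.
bash
# Confirm subject and validity of the certificate signed by the cluster CA:
openssl x509 -in jiege.crt -noout -subject -dates

# Ask the API Server whether this identity may list pods (prints "no" until RBAC is bound):
kubectl -s https://10.1.24.13:6443 \
        --client-key jiege.key --client-certificate jiege.crt \
        --insecure-skip-tls-verify \
        auth can-i list pods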
4、The kubeconfig approach (important)
1、kubeconfig overview
bash
kubeconfig概述: 1.kubeconfig是YAML格式的文件,用于存储身份认证信息,以便于客户端加载并认证到API Server。 2.它的主要作用是“配置一切,简化连接”。如果没有 kubeconfig,你每次执行 kubectl 命令时,手动加上 --server、--token 或 --client-certificate 等一大堆参数。kubeconfig 把这些参数全部预先配置好,让你只需简单地执行 kubectl get pods 就能工作 kubeconfig保存有认证到一至多个Kubernetes集群的相关配置信息,并允许管理员按需在各配置间灵活切换 # clusters:集群列表 Kubernetes集群访问端点(API Server)列表。 说白了,就是可以定义多个K8S集群列表。 # users:用户列表 认证到API Server的身份凭据列表。 说白了,可以定义多个用户列表,这个用户可以是token,或者x509证书凭据。 # contexts:上下文列表 将每一个user同可认证到的cluster建立关联的上下文列表。 说白了,就是将多个用户和对应的集群进行关联,将来使用哪个用户,就会去关联的集群进行访问认证。也可以定义多个上下文的关系。 # current-context: 当前默认使用的context。 2、查看Kubeconfig证书文件内容 [root@master231 ~]# kubectl config view apiVersion: v1 clusters: # 定义了所有可以连接的 Kubernetes 集群列表 - cluster: # 列表中的第一个(也是唯一一个)集群定义。 certificate-authority-data: DATA+OMITTED # 用于验证API Server证书的CA(证书颁发机构)证书,实际文件中这里是一长串Base64 字符串。 server: https://10.0.0.231:6443 # 所有 kubectl 命令将要发送到的目标地址 name: kubernetes # 集群的别名为kubernetes,可以在context中引用它 contexts: # 定义了“上下文”列表,上下文是将一个“用户”和一个“集群”绑定在一起的组合。 - context: # 列表中的第一个(也是唯一一个)上下文定义。 cluster: kubernetes # 引用了上面 clusters 部分中定义的那个集群kubernetes user: kubernetes-admin # 引用了下面 users 部分中定义的那个用户。 name: kubernetes-admin@kubernetes # 这个上下文的别名,通常采用 "用户名@集群名" 的格式 current-context: kubernetes-admin@kubernetes # 当前上下文使用kubernetes-admin@kubernetes kind: Config # 指定文件类型 preferences: {} users: # 列表中的第一个(也是唯一一个)用户定义。 - name: kubernetes-admin # 用户的别名为kubernetes-admin,可以在context中引用它,通常表示这是集群的管理员账户 user: client-certificate-data: REDACTED # 用户的客户端证书(Base64 编码) client-key-data: REDACTED # 与客户端证书配对的私钥(Base64 编码) # 查看所有内容 [root@master231 ~]# kubectl config view --raw apiVersion: v1 clusters: - cluster: certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJMU1EUXdOekF6TURBd05Gb1hEVE0xTURRd05UQXpNREF3TkZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTTl4Cmh0RHhVQVJsUGo0NlFEa1Rwd3dPWnJsN2d1bG5IUzRYN1Y1S1pFN3cyZVZRakJXUmpRMENnSzNjMFFBa3hoT1YKWXl4Y1pSbVg2U3FkRFZOWFBNQVZzSmNUeDd4VkRWNk9DYVQxSjRkZmcxVWNGTTNidXM5R3VMMzBITVBRYVEvaApyN2RrcnkxTUlLaVh3MUU5SkFSc05PMnhnamJBMHJEWlpIOXRRRlpwMlpUa1BNU1AzMG5WTWJvNWh3MHZLUGplCnoxNlB6Q3JwUjJIRkZrc0dXRmI3SnVobHlkWmpDaVQwOFJPY3N5ZERUTVFXZWZBdTNEcUJvMHpOSmtrcVovaVAKWkFFZ29DNXZ2MEg2N0Q4SEJxSzArRmUrZjJCaUs1SGNoYkF1WndwWjNkQ0pMTXVmU3FSWkNVVmFtTW56dWlaRApQTmVJbmdPSCtsMWZReTFad0pzQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZCRms1eStsM2RFMUhtT3lkSUYybDlDMDgvbk9NQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBQmxjZ0l1YUsxSVZydVBTVzk2SwpkTTZ6V294WmJlaVpqTWdpd2Q2R3lSL0JBdjI2QzB5V1piZjFHY3A4TlBISDJLdlhscTliUGpSODZSUkNpRFQ4Ci9VZGlTWVpQejByNnJrcTVCZ2x1Rk5XNlRTTXJyRndEVDlubVh0d0pZdzVQU29sS0JHQjIvaThaVTVwL3FkQUMKZ2Z3bU1sY3NPV3ZFUVV5bTVUYmZiWVU3NStxODJsNjY5ZGpGenh2VHFEWEIvZ0hoK1JvRXVaRTNSdjd5Slc1MwpMbkVhVWZSYjRCcmxGclFrKzlPRXZKMUF5UTE0LzcwTjlhVlJXZVZpTkxyQVdJTTNnajN1WmVHMk5yMXdic1ozCjM3VDF5MSs3TVlRcUpiUWRleUpyUVRyaGNjMXlRWTJIOEpaOXBqOERhNVVpSjlkQ1ZMeEtJSlFMeTV4b0RXaTgKL2hvPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== server: https://10.0.0.231:6443 name: kubernetes contexts: - context: cluster: kubernetes user: kubernetes-admin name: kubernetes-admin@kubernetes current-context: kubernetes-admin@kubernetes kind: Config preferences: {} users: - name: kubernetes-admin user: client-certificate-data: 
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURJVENDQWdtZ0F3SUJBZ0lJUWFsb3k5Q3ltaHN3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TlRBME1EY3dNekF3TURSYUZ3MHlOakEwTURjd016QXdNRFphTURReApGekFWQmdOVkJBb1REbk41YzNSbGJUcHRZWE4wWlhKek1Sa3dGd1lEVlFRREV4QnJkV0psY201bGRHVnpMV0ZrCmJXbHVNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQXJrT29uQlNkeEhGZHZSTnYKVW9WbUFUdU1lVDB1T3VUalk0eU9meXY4UElsRGVEeGdtdXp5OXBjK0xzdkNFUXJGRHhSL1hVOW8vZzF3NTJFcwpvSXAvQjdhdzl2anZ1M2FidVBrRS9Kc2xwWi9GdjFMdnNoZE1BYWh6ZkZzVmIxUVMxTjVxcjJBZzhaQXp3SmJJCjlGYXhIMzE2WktwaU1GZW1ubGJMVVVYbG9QeVVjSkdEcGRNa3F1ME8vTDIvbGMvNVBqNkpRZWdrUVNXN1ZHUTgKTkcxR29TcVljekhtZkVZdE14WEF0TVNQMTRFR0pCZjBqMG5sd1R3QU92SkJCZWNmQnRoSU5Zek14d2dNYzFJSApXSnkyU1R0Mkd4VkpybVlYSkpNdU5rNkpmeWlxUklBMzNQQ0FOdS9DcHRtV2FGT2lsZXVFUVhrdy9VajdHMDhECm5YZ2ZHd0lEQVFBQm8xWXdWREFPQmdOVkhROEJBZjhFQkFNQ0JhQXdFd1lEVlIwbEJBd3dDZ1lJS3dZQkJRVUgKQXdJd0RBWURWUjBUQVFIL0JBSXdBREFmQmdOVkhTTUVHREFXZ0JRUlpPY3ZwZDNSTlI1anNuU0JkcGZRdFBQNQp6akFOQmdrcWhraUc5dzBCQVFzRkFBT0NBUUVBUnBDMWVUeVI4NXNtZnNJUWozemdzT0NxQWxIcW5Ub2xCNm0wCk14VjdVTGVCVmZoNmg3a3F5cVBzelorczM1MHJxNHpyczI2Qy8xSVBwK1p3MEhvVm9zdmNOSkZvMW0wY2lpUlMKUjVqSXU0Q1Rpd2R0aWZSUUd2SmhmQVFMZmNZd1JlTHJVQUg0YmxYRUZibkorM2FyeHZPQ1B3NThjL2lJTm9XWQpBenlZUElEZHJTSjFCTlZGYkVhSjhYR1ZSYW0rSGRkNHM1bExieGYzWFlPT0o0eWNha29pdWFQN3RUNmw3MXZ2CnAwNS9nOHA3R3NsV1R0cWFFa3JXbW5yUVlXN1Z1M015cWE0M1l4dFFMa2hvVzNad2lseEc1TVo4ZXd1NXdvWlQKQUgrRzB3MkNhbzk4NEVIUFBnL2tQOFVPTGRCZWhjVUgwU2J6YXBBMjJDZ3luN0ozZEE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb3dJQkFBS0NBUUVBcmtPb25CU2R4SEZkdlJOdlVvVm1BVHVNZVQwdU91VGpZNHlPZnl2OFBJbERlRHhnCm11enk5cGMrTHN2Q0VRckZEeFIvWFU5by9nMXc1MkVzb0lwL0I3YXc5dmp2dTNhYnVQa0UvSnNscFovRnYxTHYKc2hkTUFhaHpmRnNWYjFRUzFONXFyMkFnOFpBendKYkk5RmF4SDMxNlpLcGlNRmVtbmxiTFVVWGxvUHlVY0pHRApwZE1rcXUwTy9MMi9sYy81UGo2SlFlZ2tRU1c3VkdROE5HMUdvU3FZY3pIbWZFWXRNeFhBdE1TUDE0RUdKQmYwCmowbmx3VHdBT3ZKQkJlY2ZCdGhJTll6TXh3Z01jMUlIV0p5MlNUdDJHeFZKcm1ZWEpKTXVOazZKZnlpcVJJQTMKM1BDQU51L0NwdG1XYUZPaWxldUVRWGt3L1VqN0cwOERuWGdmR3dJREFRQUJBb0lCQUhZUGdIdTl1K1VLcU9jZgo4NXVFcE1iUkFTcGlPSi9OMGYvdmlkcStnZlRCU2VSN2d6ZHlzR2cvcnZFbE9pVXhscS9Rd3prRVE2MWFqZE0wCkVuZnhYSDV0VnhiN0wrOWhPNzdsZG10czhPUjBpaFJFcS8rTHFRSzJqUWNDN2xLdU10UGttNEtWTGJ4NlpaVmsKa21CM0d5aXFhZkVwUGJ4aXBZOUFYaDZCckVDVHZ4VGYxUElOcVlkT1JEcjl5S2hFUjZRV2tHTlJzZjZYUFR6MwpRRytMYVRzbERtbW1NL1JickU1V1dlUTJSQlJnWVJjU2hQYmh3cUZGZXhhN2dkVmtRQVFOY21WUW5weHdXcDNCCnZCUWh0MTh6Z2tKbXUwN215aWdjZE9Gak0vdFdTd0ZkSVhZKzBrNHVZNWtmL1dackNRQ0YzUXBrZld6L0pGbEkKNU9VS2VJRUNnWUVBd284d0pTd1BoUTNZWDJDQzgwcWdRNDhiZWlVZFgyN0tjSlRWa0hYSkhheHZEczRvTXo5agpRV0FPaFB2NGdXM0tFYUlnUDN4K3kwa3lzeHFXNVVMdERvVHVyVE45cWQ0L012bVJFZEdjcys0OWNXSkRSTDRTCnZUR2dZQWZvR3hCS21qZjcwR0Zqdlp1VjJtMGl6QTJlNXRubWFpam8xeDRuaGxWc1BCVkJBYVVDZ1lFQTVVdkEKNHNFbkFUQVdBTlRFeVU2R2JXY0JpN0F5KzdYTUkvcGphMmZiRjN1RjRrNTZpZGtTVmNPeTlhUTVVOUZKeWdkWAo4d05CbDdyZldQVGVOd3BBc3RMVkZwd3gvQzRxQ3U4SEE1dXRZSW9wcFRUd3FRWG1pS0tQQVh4bUg2aDNRZElxCnQvL1dnejh2N0E2RTc4V1Q1UmJOZk9XS0lBVlh5UE5oMGo3SlFiOENnWUJCeExtWHR6OC8wU0JWallCMjBjRS8KVlQ4S21VVkduMk1iajVScUV3YjdXdkRuNWxTOGppNzFTSTFmOHZWY2UwcVZqMktyVTJCaFE4czV0RUZTR3IrYgo2dC9yK0w0QUVEcjQ5bGhOMTdmTE16dmQra09YRjFHcVZ2NUp1Q0tFRTR2RWVpeExrc0J1dGd1QUhPaG9aaXBUCkMxSFNqU1c0b2w3bUVEWllVUzc2YVFLQmdRRGt5c2JITzdYZ3NJdHovdG53aUNNSUxOelU5bGFZNUppeVdaZzAKUnFmTmNacHc2cC9JeGtsT1BIeG9NSnBuTVJDd3ZzMGFGV2l3cm0xSHhPV3FBOWYwMXZ4Nm1CWWtMQ2dWU3RZegoybldRTzZ3OFJXdlJLNnNSTVNzQ2I0OHpEWlVabjB5eTFsdkVFQnVRTGhpbGF2OGNlcmxGWTRDRVhQQnYrYkhrCjZITkczd0tCZ0dPekxRZnorMEFoaXJTZTZTZllmanQrMkdVSGc3U21Uc
jZjNm9jTnlpZGNSQks5Q25jcENiOW4KeVZ2SktzSkNuY2FvTCsra2M1aE1YWEJxendEQzNweVlFOWR2UFRiNXFOa1Z3UEJqa0VMcEsyaXhsRUlYRUc1cApJdjVxeVJWTit1QU9PMm5zNWJXQTUwTUpHK1JjSUZrQUphcUR1R1dMWFNZdmdVOVdPREpZCi0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg== [root@master231 ~]#
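A sketch for inspecting the credentials embedded in a kubeconfig, assuming the kubeadm default where kubernetes-admin is the first (and only) user entry:
bash
# Extract the admin client certificate from the kubeconfig and read its subject and expiry:
kubectl config view --raw -o jsonpath='{.users[0].user.client-certificate-data}' \
  | base64 -d \
  | openssl x509 -noout -subject -dates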
2、⭐Generate a kubeconfig for users authenticated by static tokens
bash
1 在一个指定的 kubeconfig 文件中,创建一个名为myk8s 的集群 [root@worker232 ~]# kubectl config set-cluster myk8s --embed-certs=true --certificate-authority=/etc/kubernetes/pki/ca.crt --server="https://10.1.24.13:6443" --kubeconfig=./yinzhengjie-k8s.conf Cluster "myk8s" set. # kubectl config: 这是 kubectl 用于管理 kubeconfig 文件的一系列子命令的入口 # set-cluster: 这是具体的动作 # myk8s:集群名称 # --embed-certs=true:kubectl会读取 --certificate-authority指定的证书文件(/etc/kubernetes/pki/ca.crt),将其内容进行 Base64 编码,然后直接存入 kubeconfig 文件中 # --server="https://10.1.24.13:6443":集群的API Server的地址和端口 # ./yinzhengjie-k8s.conf:表示在当前目录下名为 yinzhengjie-k8s.conf 的文件 [root@worker232 ~]# [root@worker232 ~]# cat ./yinzhengjie-k8s.conf apiVersion: v1 clusters: - cluster: certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJMU1EUXdOekF6TURBd05Gb1hEVE0xTURRd05UQXpNREF3TkZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTTl4Cmh0RHhVQVJsUGo0NlFEa1Rwd3dPWnJsN2d1bG5IUzRYN1Y1S1pFN3cyZVZRakJXUmpRMENnSzNjMFFBa3hoT1YKWXl4Y1pSbVg2U3FkRFZOWFBNQVZzSmNUeDd4VkRWNk9DYVQxSjRkZmcxVWNGTTNidXM5R3VMMzBITVBRYVEvaApyN2RrcnkxTUlLaVh3MUU5SkFSc05PMnhnamJBMHJEWlpIOXRRRlpwMlpUa1BNU1AzMG5WTWJvNWh3MHZLUGplCnoxNlB6Q3JwUjJIRkZrc0dXRmI3SnVobHlkWmpDaVQwOFJPY3N5ZERUTVFXZWZBdTNEcUJvMHpOSmtrcVovaVAKWkFFZ29DNXZ2MEg2N0Q4SEJxSzArRmUrZjJCaUs1SGNoYkF1WndwWjNkQ0pMTXVmU3FSWkNVVmFtTW56dWlaRApQTmVJbmdPSCtsMWZReTFad0pzQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZCRms1eStsM2RFMUhtT3lkSUYybDlDMDgvbk9NQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBQmxjZ0l1YUsxSVZydVBTVzk2SwpkTTZ6V294WmJlaVpqTWdpd2Q2R3lSL0JBdjI2QzB5V1piZjFHY3A4TlBISDJLdlhscTliUGpSODZSUkNpRFQ4Ci9VZGlTWVpQejByNnJrcTVCZ2x1Rk5XNlRTTXJyRndEVDlubVh0d0pZdzVQU29sS0JHQjIvaThaVTVwL3FkQUMKZ2Z3bU1sY3NPV3ZFUVV5bTVUYmZiWVU3NStxODJsNjY5ZGpGenh2VHFEWEIvZ0hoK1JvRXVaRTNSdjd5Slc1MwpMbkVhVWZSYjRCcmxGclFrKzlPRXZKMUF5UTE0LzcwTjlhVlJXZVZpTkxyQVdJTTNnajN1WmVHMk5yMXdic1ozCjM3VDF5MSs3TVlRcUpiUWRleUpyUVRyaGNjMXlRWTJIOEpaOXBqOERhNVVpSjlkQ1ZMeEtJSlFMeTV4b0RXaTgKL2hvPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== server: https://10.0.0.231:6443 name: myk8s contexts: null current-context: "" kind: Config preferences: {} users: null [root@worker232 ~]# [root@worker232 ~]# ll yinzhengjie-k8s.conf -rw------- 1 root root 1663 Apr 14 11:26 yinzhengjie-k8s.conf [root@worker232 ~]#

image

bash
2.查看集群信息 [root@worker232 ~]# kubectl config get-clusters --kubeconfig=./yinzhengjie-k8s.conf NAME myk8s [root@worker232 ~]# 3.查看令牌认证文件 [root@master231 ~]# cat /etc/kubernetes/pki/token.csv 01b202.d5c4210389cbff08,yinzhengjie,10001,k8s 497804.9fc391f505052952,jasonyin,10002,k8s 8fd32c.0868709b9e5786a8,linux97,10003,k3s jvt496.ls43vufojf45q73i,weixiang98,10004,k3s qo7azt.y27gu4idn5cunudd,linux99,10005,k3s mic1bd.mx3vohsg05bjk5rr,linux100,10006,k3s [root@master231 ~]# 4.创建用户信息 [root@worker232 ~]# kubectl config set-credentials yinzhengjie --token="01b202.d5c4210389cbff08" --kubeconfig=./yinzhengjie-k8s.conf User "yinzhengjie" set. # 创建yinzhengjie的用户 # --token="01b202.d5c4210389cbff08":token方式进行认证,确保有这个token # --kubeconfig=./yinzhengjie-k8s.conf:要操作的 kubeconfig 文件是当前目录下的 yinzhengjie-k8s.conf [root@worker232 ~]# [root@worker232 ~]# kubectl config set-credentials jasonyin --token="497804.9fc391f505052952" --kubeconfig=./yinzhengjie-k8s.conf User "jasonyin" set. [root@worker232 ~]# [root@worker232 ~]# cat yinzhengjie-k8s.conf apiVersion: v1 clusters: - cluster: certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJMU1EUXdOekF6TURBd05Gb1hEVE0xTURRd05UQXpNREF3TkZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTTl4Cmh0RHhVQVJsUGo0NlFEa1Rwd3dPWnJsN2d1bG5IUzRYN1Y1S1pFN3cyZVZRakJXUmpRMENnSzNjMFFBa3hoT1YKWXl4Y1pSbVg2U3FkRFZOWFBNQVZzSmNUeDd4VkRWNk9DYVQxSjRkZmcxVWNGTTNidXM5R3VMMzBITVBRYVEvaApyN2RrcnkxTUlLaVh3MUU5SkFSc05PMnhnamJBMHJEWlpIOXRRRlpwMlpUa1BNU1AzMG5WTWJvNWh3MHZLUGplCnoxNlB6Q3JwUjJIRkZrc0dXRmI3SnVobHlkWmpDaVQwOFJPY3N5ZERUTVFXZWZBdTNEcUJvMHpOSmtrcVovaVAKWkFFZ29DNXZ2MEg2N0Q4SEJxSzArRmUrZjJCaUs1SGNoYkF1WndwWjNkQ0pMTXVmU3FSWkNVVmFtTW56dWlaRApQTmVJbmdPSCtsMWZReTFad0pzQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZCRms1eStsM2RFMUhtT3lkSUYybDlDMDgvbk9NQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBQmxjZ0l1YUsxSVZydVBTVzk2SwpkTTZ6V294WmJlaVpqTWdpd2Q2R3lSL0JBdjI2QzB5V1piZjFHY3A4TlBISDJLdlhscTliUGpSODZSUkNpRFQ4Ci9VZGlTWVpQejByNnJrcTVCZ2x1Rk5XNlRTTXJyRndEVDlubVh0d0pZdzVQU29sS0JHQjIvaThaVTVwL3FkQUMKZ2Z3bU1sY3NPV3ZFUVV5bTVUYmZiWVU3NStxODJsNjY5ZGpGenh2VHFEWEIvZ0hoK1JvRXVaRTNSdjd5Slc1MwpMbkVhVWZSYjRCcmxGclFrKzlPRXZKMUF5UTE0LzcwTjlhVlJXZVZpTkxyQVdJTTNnajN1WmVHMk5yMXdic1ozCjM3VDF5MSs3TVlRcUpiUWRleUpyUVRyaGNjMXlRWTJIOEpaOXBqOERhNVVpSjlkQ1ZMeEtJSlFMeTV4b0RXaTgKL2hvPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== server: https://10.0.0.231:6443 name: myk8s contexts: null current-context: "" kind: Config preferences: {} users: - name: jasonyin user: token: 497804.9fc391f505052952 - name: yinzhengjie user: token: 01b202.d5c4210389cbff08 [root@worker232 ~]# 5.查看用户信息 [root@worker232 ~]# kubectl config get-users --kubeconfig=./yinzhengjie-k8s.conf NAME jasonyin yinzhengjie [root@worker232 ~]#

image

bash
6. 定义上下文 [root@worker232 ~]# kubectl config set-context yinzhengjie@myk8s --user=yinzhengjie --cluster=myk8s --kubeconfig=./yinzhengjie-k8s.conf Context "yinzhengjie@myk8s" created. # 在yinzhengjie-k8s.conf文件中,创建一个名为yinzhengjie@myk8s的上下文,这个上下文将用户 yinzhengjie和集群myk8s绑定在一起 # --user=yinzhengjie: 指定这个上下文使用哪个用户 # --cluster=myk8s: 指定这个上下文连接哪个集群 # --kubeconfig=./yinzhengjie-k8s.conf: 再次指定操作的是当前目录下的 yinzhengjie-k8s.conf 文件 [root@worker232 ~]# kubectl config set-context jasonyin@myk8s --user=jasonyin --cluster=myk8s --kubeconfig=./yinzhengjie-k8s.conf Context "jasonyin@myk8s" created. [root@worker232 ~]# [root@worker232 ~]# cat yinzhengjie-k8s.conf apiVersion: v1 clusters: - cluster: certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJMU1EUXdOekF6TURBd05Gb1hEVE0xTURRd05UQXpNREF3TkZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTTl4Cmh0RHhVQVJsUGo0NlFEa1Rwd3dPWnJsN2d1bG5IUzRYN1Y1S1pFN3cyZVZRakJXUmpRMENnSzNjMFFBa3hoT1YKWXl4Y1pSbVg2U3FkRFZOWFBNQVZzSmNUeDd4VkRWNk9DYVQxSjRkZmcxVWNGTTNidXM5R3VMMzBITVBRYVEvaApyN2RrcnkxTUlLaVh3MUU5SkFSc05PMnhnamJBMHJEWlpIOXRRRlpwMlpUa1BNU1AzMG5WTWJvNWh3MHZLUGplCnoxNlB6Q3JwUjJIRkZrc0dXRmI3SnVobHlkWmpDaVQwOFJPY3N5ZERUTVFXZWZBdTNEcUJvMHpOSmtrcVovaVAKWkFFZ29DNXZ2MEg2N0Q4SEJxSzArRmUrZjJCaUs1SGNoYkF1WndwWjNkQ0pMTXVmU3FSWkNVVmFtTW56dWlaRApQTmVJbmdPSCtsMWZReTFad0pzQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZCRms1eStsM2RFMUhtT3lkSUYybDlDMDgvbk9NQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBQmxjZ0l1YUsxSVZydVBTVzk2SwpkTTZ6V294WmJlaVpqTWdpd2Q2R3lSL0JBdjI2QzB5V1piZjFHY3A4TlBISDJLdlhscTliUGpSODZSUkNpRFQ4Ci9VZGlTWVpQejByNnJrcTVCZ2x1Rk5XNlRTTXJyRndEVDlubVh0d0pZdzVQU29sS0JHQjIvaThaVTVwL3FkQUMKZ2Z3bU1sY3NPV3ZFUVV5bTVUYmZiWVU3NStxODJsNjY5ZGpGenh2VHFEWEIvZ0hoK1JvRXVaRTNSdjd5Slc1MwpMbkVhVWZSYjRCcmxGclFrKzlPRXZKMUF5UTE0LzcwTjlhVlJXZVZpTkxyQVdJTTNnajN1WmVHMk5yMXdic1ozCjM3VDF5MSs3TVlRcUpiUWRleUpyUVRyaGNjMXlRWTJIOEpaOXBqOERhNVVpSjlkQ1ZMeEtJSlFMeTV4b0RXaTgKL2hvPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== server: https://10.0.0.231:6443 name: myk8s contexts: - context: cluster: myk8s user: jasonyin name: jasonyin@myk8s - context: cluster: myk8s user: yinzhengjie name: yinzhengjie@myk8s current-context: "" kind: Config preferences: {} users: - name: jasonyin user: token: 497804.9fc391f505052952 - name: yinzhengjie user: token: 01b202.d5c4210389cbff08 [root@worker232 ~]# 7. 查看上下文列表 [root@worker232 ~]# kubectl config get-contexts --kubeconfig=./yinzhengjie-k8s.conf CURRENT NAME CLUSTER AUTHINFO NAMESPACE jasonyin@myk8s myk8s jasonyin yinzhengjie@myk8s myk8s yinzhengjie 8.定义当前使用的上下文 [root@worker232 ~]# kubectl config use-context yinzhengjie@myk8s --kubeconfig=./yinzhengjie-k8s.conf # use-context yinzhengjie@myk8s: 定义默认使用的上下文为yinzhengjie@myk8s Switched to context "yinzhengjie@myk8s". 
[root@worker232 ~]# cat yinzhengjie-k8s.conf apiVersion: v1 clusters: # 定义所有集群列表 - cluster: # 第一个(也是唯一一个)集群定义 certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJMU1EY3dPVEF6TWpVeE1sb1hEVE0xTURjd056QXpNalV4TWxvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTmdiCkNsbmdtd1p3YUhISnJXeTkzTlpxZXVhZ0RqMDhaZTlMbnZqT2NSWTljbjlwVTFrNys0YkluNG4xNXd0Y2pibnUKdUw4T200SE5lTHJTcmpHVGpRcHpDUEJpMG1mSjBRbXZ6aWhINFpBZG1FejhKQWpnZStqeTJXRmhKWk9tU21yeAoyT2tIc0l2VncxM3Fkdkt4SGdtbUNkYmpaeEhtNkhRYmdNRVcwOEVreGtXVXAzTnFCbjFRMTd6eTFpc2pYcDRkCk8xUkN1ekQydmwrbjVjRHhNc29rTjREYTl1UGdsR2IvUmNrd0tXTnJmSFNNN1R2SktZaXFZWXBocWRueS9qUjMKNzczTVowNVUzNG5nUUcvYXo1Y2U4cW5hVk4yblNFaG1CMlZJYlBNaHJDajZuNTN4cVJ6YzRZWVREbG9GSlFmTQpuR3Q5SWtvMnd0cS92YjJBSHQ4Q0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZEMDl6NUlwNFVMTkU2cklncjJUSXhCb3BPZWpNQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBSkVsSWFzajR6ZkZ3Y1dRYWxrcQplRS83NFlZTk1Rcjl2cmtkSzhrRWdMcmt1bDN3VkZTbzAvM2JBQ0ZEbzllSUVSZ1VWcmUyQzExQ2gyaWM3a3E2ClhUWDNsR3JrK2FlZGtXSWd3NmtKVjcvUEVMZ0ZrTnI1c2Y5RmpYYm1Md2xTR1hSQXNBdlpVRlQ5SCsrZHk5OG8KU0tPcWpNd3MrbDk3a2krajVtWDBxNklDM0xub0lVVGxpVXJGdHJxdndUMDhXVGtSeDJSYlBQOWN3NTdFd1ZsMApHMlE1LzhaZTd0QWswNklUblQ1NlA1Tm02OEtzdDNYNVl2MWhyMEF5R2UxcEZ0QVgxcTdpRjVKcjMyUTZ5azBwCnF1dkRvM2JGMEd0VlloRWlvN0NZOVlncG1wS01iSjNkVm1OS3d1U2puV2V6b2p1c0FLY0hObGN0eHhxOWRPTEsKcVo4PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== server: https://10.1.24.13:6443 # 访问的目标地址 name: myk8s # 集群别名 contexts: # 定义上下文列表,将用户跟集群绑定在一起 - context: # 列表中第一个上下文定义 cluster: myk8s # 引用上面clusters部分定义的集群别名 user: jasonyin # 引用下面users部分定义的用户名称 name: jasonyin@myk8s # 这个上下文的别名 - context: # 列表中第二个上下文定义 cluster: myk8s # 引用上面clusters部分定义的集群别名 user: yinzhengjie # 引用下面users部分定义的用户名称 name: yinzhengjie@myk8s # 这个上下文的别名 current-context: yinzhengjie@myk8s # 当前上下文使用yinzhengjie@myk8s kind: Config # 指定文件类型 preferences: {} users: - name: jasonyin # 列表中第一个用户名 user: token: 497804.9fc391f505052952 # 使用token方式验证 - name: yinzhengjie # 列表中第二个用户名 user: token: 01b202.d5c4210389cbff08 # 使用token方式验证 [root@worker232 ~]# 9. 
查看当前使用的上下文 [root@worker232 ~]# kubectl config current-context --kubeconfig=./yinzhengjie-k8s.conf yinzhengjie@myk8s [root@worker232 ~]# [root@worker232 ~]# kubectl config get-contexts --kubeconfig=./yinzhengjie-k8s.conf CURRENT NAME CLUSTER AUTHINFO NAMESPACE jasonyin@myk8s myk8s jasonyin * yinzhengjie@myk8s myk8s yinzhengjie [root@worker232 ~]# 10.打印kubeconfig信息,默认会使用“REDACTED”或者“DATA+OMITTED”关键字隐藏证书信息 [root@worker232 ~]# kubectl config view --kubeconfig=./yinzhengjie-k8s.conf apiVersion: v1 clusters: - cluster: certificate-authority-data: DATA+OMITTED server: https://10.0.0.231:6443 name: myk8s contexts: - context: cluster: myk8s user: jasonyin name: jasonyin@myk8s - context: cluster: myk8s user: yinzhengjie name: yinzhengjie@myk8s current-context: yinzhengjie@myk8s kind: Config preferences: {} users: - name: jasonyin user: token: REDACTED - name: yinzhengjie user: token: REDACTED [root@worker232 ~]# [root@worker232 ~]# kubectl config view --kubeconfig=./yinzhengjie-k8s.conf --raw apiVersion: v1 clusters: - cluster: certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJMU1EUXdOekF6TURBd05Gb1hEVE0xTURRd05UQXpNREF3TkZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTTl4Cmh0RHhVQVJsUGo0NlFEa1Rwd3dPWnJsN2d1bG5IUzRYN1Y1S1pFN3cyZVZRakJXUmpRMENnSzNjMFFBa3hoT1YKWXl4Y1pSbVg2U3FkRFZOWFBNQVZzSmNUeDd4VkRWNk9DYVQxSjRkZmcxVWNGTTNidXM5R3VMMzBITVBRYVEvaApyN2RrcnkxTUlLaVh3MUU5SkFSc05PMnhnamJBMHJEWlpIOXRRRlpwMlpUa1BNU1AzMG5WTWJvNWh3MHZLUGplCnoxNlB6Q3JwUjJIRkZrc0dXRmI3SnVobHlkWmpDaVQwOFJPY3N5ZERUTVFXZWZBdTNEcUJvMHpOSmtrcVovaVAKWkFFZ29DNXZ2MEg2N0Q4SEJxSzArRmUrZjJCaUs1SGNoYkF1WndwWjNkQ0pMTXVmU3FSWkNVVmFtTW56dWlaRApQTmVJbmdPSCtsMWZReTFad0pzQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZCRms1eStsM2RFMUhtT3lkSUYybDlDMDgvbk9NQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBQmxjZ0l1YUsxSVZydVBTVzk2SwpkTTZ6V294WmJlaVpqTWdpd2Q2R3lSL0JBdjI2QzB5V1piZjFHY3A4TlBISDJLdlhscTliUGpSODZSUkNpRFQ4Ci9VZGlTWVpQejByNnJrcTVCZ2x1Rk5XNlRTTXJyRndEVDlubVh0d0pZdzVQU29sS0JHQjIvaThaVTVwL3FkQUMKZ2Z3bU1sY3NPV3ZFUVV5bTVUYmZiWVU3NStxODJsNjY5ZGpGenh2VHFEWEIvZ0hoK1JvRXVaRTNSdjd5Slc1MwpMbkVhVWZSYjRCcmxGclFrKzlPRXZKMUF5UTE0LzcwTjlhVlJXZVZpTkxyQVdJTTNnajN1WmVHMk5yMXdic1ozCjM3VDF5MSs3TVlRcUpiUWRleUpyUVRyaGNjMXlRWTJIOEpaOXBqOERhNVVpSjlkQ1ZMeEtJSlFMeTV4b0RXaTgKL2hvPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== server: https://10.0.0.231:6443 name: myk8s contexts: - context: cluster: myk8s user: jasonyin name: jasonyin@myk8s - context: cluster: myk8s user: yinzhengjie name: yinzhengjie@myk8s current-context: yinzhengjie@myk8s kind: Config preferences: {} users: - name: jasonyin user: token: 497804.9fc391f505052952 - name: yinzhengjie user: token: 01b202.d5c4210389cbff08 [root@worker232 ~]# 11 客户端进行认证 [root@worker232 ~]# kubectl get pods --kubeconfig=./yinzhengjie-k8s.conf Error from server (Forbidden): pods is forbidden: User "yinzhengjie" cannot list resource "pods" in API group "" in the namespace "default" [root@worker232 ~]# [root@worker232 ~]# kubectl get pods --kubeconfig=./yinzhengjie-k8s.conf --context=jasonyin@myk8s # --context=jasonyin@myk8s:指定上下文为jasonyin@myk8s Error from server (Forbidden): pods is forbidden: User "jasonyin" cannot list resource "pods" in API group "" in the namespace "default"

3、Priority order in which kubectl loads kubeconfig
bash
1.基于KUBECONFIG变量 [root@worker232 ~]# export KUBECONFIG=/root/yinzhengjie-k8s.conf # 执行时会自动查找它KUBECONFIG变量 [root@worker232 ~]# kubectl get pods Error from server (Forbidden): pods is forbidden: User "yinzhengjie" cannot list resource "pods" in API group "" in the namespace "default" [root@worker232 ~]# [root@worker232 ~]# kubectl get pods --context=jasonyin@myk8s Error from server (Forbidden): pods is forbidden: User "jasonyin" cannot list resource "pods" in API group "" in the namespace "default" 2.指定kubeconfig文件,优先级高于KUBECONFIG变量 2.1 拷贝kubeconfig文件 [root@master231 ~]# scp /etc/kubernetes/admin.conf 10.1.20.5:~ 2.2 测试验证 [root@worker232 ~]# env | grep KUBECONFIG KUBECONFIG=/root/yinzhengjie-k8s.conf [root@worker232 ~]# [root@worker232 ~]# kubectl get nodes Error from server (Forbidden): nodes is forbidden: User "yinzhengjie" cannot list resource "nodes" in API group "" at the cluster scope [root@worker232 ~]# [root@worker232 ~]# kubectl get nodes --kubeconfig=admin.conf NAME STATUS ROLES AGE VERSION master231 Ready control-plane,master 12d v1.23.17 worker232 Ready <none> 12d v1.23.17 worker233 NotReady <none> 5d18h v1.23.17 3.指定kubeconfig文件,优先级高于"~/.kube/config"文件 3.1 拷贝kubeconfig文件 [root@worker232 ~]# scp yinzhengjie-k8s.conf 10.01.24.13:~ 3.2 测试验证 [root@master231 ~]# env | grep KUBECONFIG [root@master231 ~]# ll ~/.kube/config -rw------- 1 root root 5638 May 22 10:59 /root/.kube/config [root@master231 ~]# [root@master231 ~]# kubectl get nodes --kubeconfig=yinzhengjie-k8s.conf Error from server (Forbidden): nodes is forbidden: User "yinzhengjie" cannot list resource "nodes" in API group "" at the cluster scope 4."~/.kube/config"和KUBECONFIG变量的优先级比较 4.1 准备kubeconfig文件 [root@worker232 ~]# scp yinzhengjie-k8s.conf 10.1.24.13:~ 4.2 配置环境变量 [root@master231 ~]# env | grep KUBECONFIG [root@master231 ~]# [root@master231 ~]# export KUBECONFIG=/root/yinzhengjie-k8s.conf [root@master231 ~]# [root@master231 ~]# env | grep KUBECONFIG KUBECONFIG=/root/yinzhengjie-k8s.conf [root@master231 ~]# [root@master231 ~]# ll ~/.kube/config -rw------- 1 root root 5638 May 22 10:59 /root/.kube/config [root@master231 ~]# 4.3 测试验证 [root@master231 ~]# kubectl get nodes Error from server (Forbidden): nodes is forbidden: User "yinzhengjie" cannot list resource "nodes" in API group "" at the cluster scope [root@master231 ~]# 4.4 删除变量 [root@master231 ~]# unset KUBECONFIG [root@master231 ~]# env | grep KUBECONFIG [root@master231 ~]# [root@master231 ~]# kubectl get nodes NAME STATUS ROLES AGE VERSION master231 Ready control-plane,master 12d v1.23.17 worker232 Ready <none> 12d v1.23.17 worker233 NotReady <none> 5d18h v1.23.17 [root@master231 ~]# 5.综上所述,kubectl加载kubeconfig文件的优先级总结 - 1.使用--kubeconfig的优先级最大,直接无视后面的两个配置文件; - 2.使用KUBECONFIG变量的优先级次之; - 3.使用默认的"~/.kube/config"最后加载; # 也就是把配置文件拷贝到/.kube/config会自动加载 [root@worker232 ~]# cp yinzhengjie-k8s.conf ~/.kube/config [root@worker232 ~]# ll .kube/ total 16 drwxr-x--- 3 root root 4096 Jul 20 11:51 ./ drwx------ 8 root root 4096 Jul 20 11:17 ../ drwxr-x--- 4 root root 4096 Jul 18 10:57 cache/ -rw------- 1 root root 1941 Jul 20 11:49 config # 验证,什么参数也不用加 [root@worker232 ~]# kubectl get pods No resources found in default namespace.
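A related trick worth knowing (a sketch, not part of the original steps): the KUBECONFIG variable accepts a colon-separated list of files, which kubectl merges at runtime; --flatten writes the merged view to a single file.
bash
# Merge the admin config and the custom config in memory:
export KUBECONFIG=/root/.kube/config:/root/yinzhengjie-k8s.conf
kubectl config get-contexts                       # contexts from both files are visible

# Persist the merged result to one self-contained file, then clean up:
kubectl config view --flatten > /tmp/merged-kubeconfig
unset KUBECONFIG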
4、⭐Generate a kubeconfig for an X.509 certificate user
bash
1 准备证书 [root@worker233 ~]# ll jiege.* -rw-r--r-- 1 root root 1115 Apr 14 10:58 jiege.crt -rw-r--r-- 1 root root 911 Apr 14 10:43 jiege.csr -rw------- 1 root root 1704 Apr 14 10:43 jiege.key [root@worker233 ~]# 2 添加证书用户 [root@worker233 ~]# kubectl config set-credentials jiege --client-certificate=/root/jiege.crt --client-key=/root/jiege.key --embed-certs=true --kubeconfig=./yinzhengjie-k8s.conf User "jiege" set. [root@worker233 ~]# [root@worker233 ~]# ll yinzhengjie-k8s.conf -rw------- 1 root root 3935 Apr 14 11:39 yinzhengjie-k8s.conf [root@worker233 ~]# [root@worker233 ~]# cat yinzhengjie-k8s.conf apiVersion: v1 clusters: null contexts: null current-context: "" kind: Config preferences: {} users: - name: jiege user: client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURDakNDQWZLZ0F3SUJBZ0lSQUtxMEY4YXlpUGlFMkdHUWtpYUN4ZWN3RFFZSktvWklodmNOQVFFTEJRQXcKRlRFVE1CRUdBMVVFQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TlRBME1UUXdNalE0TWpoYUZ3MHlOVEEwTVRVdwpNalE0TWpoYU1DUXhFakFRQmdOVkJBb1RDVzlzWkdKdmVXVmtkVEVPTUF3R0ExVUVBeE1GYW1sbFoyVXdnZ0VpCk1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLQW9JQkFRRFRvc2doNmVpeU1CVklBNFVWaEpFSWllb0YKSmRxeGNNRDlCVVFLRGs2WUZCQ2xZUE9sd0xPek8xNU1vNk1OZXEwTGIrOFBhQWdMQml4ZERXTTR6TE1yQmxMQgpiL2x3SGkrV3Z5MnQvU1E4WU5MV09HYnhyUCtQVjJ3dUw4OWEyNHBwVk9teFFrdVExcC9XMFJHM3Zxd1RvVnd5ClRzTnlpa0VqZ0xLbXlLZWVVMWFNS3NldTV6TUNNajFYbldRNk5ZMHB3VzcxR0dxbnZ1MjF2VEpqMUllRTRmSjAKd29IMWNpL3ZsS0Y5bERvaUFPNkFJR2VRMEZPbGlNZWkwMHppVVY1aHVPQUZaOEt4eU1oVEZ6eWVjSkl6aFUwcwpLaDdZdWZ4NVR4YUx6ZmNjdk5mUHFMZDQwbmdtUjFlMjQ0aitCclVJcGVDMkVYMHcwV3pBYTdHdjBWWTlBZ01CCkFBR2pSakJFTUJNR0ExVWRKUVFNTUFvR0NDc0dBUVVGQndNQ01Bd0dBMVVkRXdFQi93UUNNQUF3SHdZRFZSMGoKQkJnd0ZvQVVFV1RuTDZYZDBUVWVZN0owZ1hhWDBMVHorYzR3RFFZSktvWklodmNOQVFFTEJRQURnZ0VCQURVbQpSVzRoRm83cjlreEszK1FuaENQL0lzVjNGZXltQkN5WUdUWVJoUlJOTCtEQldadlhTTUxuSkppNXRsZkFNSmNtCnY2MWN4MDY0cDRXM25TSG1aU04rODUySUR1alBwWjRXeTJ1VmIwVXR6MUtkM1RBVmJTNGdWTnVRMEgvaGs1aXEKSm9Zelh0WjdiQU4xSEgyQ3RjMUlpSGlNYzBHV1djcUtQQWtzZmNrTjR2Z2lYUDNZVTRFS1lJdXBtVWV4czBLbApoRXVHNUp3aGtLVStYWFZqNm1CWDdrNnBIT3Z3SG5lNEJDRW1sT2lIYnRXU3ZPd2poUTB1ZEJ6OEFKUWYxYVJjCkkyMW5oK2dCekpDdk5oOUpLVXpkemVMSFpld0g2dzB1YndJdEUvWDV3S3l6UmNwMUpweGZoZm1TZW00elRKbnMKS2JnV3pOUzYvUHp0ak90NWV4az0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo= client-key-data: 
LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2UUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktjd2dnU2pBZ0VBQW9JQkFRRFRvc2doNmVpeU1CVkkKQTRVVmhKRUlpZW9GSmRxeGNNRDlCVVFLRGs2WUZCQ2xZUE9sd0xPek8xNU1vNk1OZXEwTGIrOFBhQWdMQml4ZApEV000ekxNckJsTEJiL2x3SGkrV3Z5MnQvU1E4WU5MV09HYnhyUCtQVjJ3dUw4OWEyNHBwVk9teFFrdVExcC9XCjBSRzN2cXdUb1Z3eVRzTnlpa0VqZ0xLbXlLZWVVMWFNS3NldTV6TUNNajFYbldRNk5ZMHB3VzcxR0dxbnZ1MjEKdlRKajFJZUU0Zkowd29IMWNpL3ZsS0Y5bERvaUFPNkFJR2VRMEZPbGlNZWkwMHppVVY1aHVPQUZaOEt4eU1oVApGenllY0pJemhVMHNLaDdZdWZ4NVR4YUx6ZmNjdk5mUHFMZDQwbmdtUjFlMjQ0aitCclVJcGVDMkVYMHcwV3pBCmE3R3YwVlk5QWdNQkFBRUNnZ0VBTnI0TWRubENyNVN3YklnOGpHeFY5NWQwNlEvNW1aeEl6eW5saDVSYjBBcWcKbzZhSVgzK1ErL09IV051YStZbVo2VE55NnRGR0ExUDlkYlJZemdCazkrUVMwK1phNXgxbndkNkJ1bGVZWCtYTApvNDNEVXhBa3FyYzZURmdoa3FibkRvZmdTdkdUQ2t2NTNGOEg3amRyMjBnSnlSbUdoTUl1UnppcS9XazVza0h6CjFWQzRvdWl1Qk1yTStzcXhOWVNmYnJGK3pXV3R1QW05RzBkejVWRzdKSGRIOUEyMHFCeW5uNkF2VU5zempvdm8KYk9jVDVMenc5eGtOKzRjNnlXd3JWdzRRb3hCUWdUVi9Cd0l3bjlqZnB2eXRqaGp4bW9kVEoxcEJZT0ZMb0Q3WQp1YlVoVHdtL1Q1SmZXT0wyR09nZjNOempYeFlVS056WmhvMXJVMVEzSVFLQmdRRHVoV3NwQmdRY2dGZU9OWEhBCjdBQ1MrQldvWFZTeU1TMWdodUF1cFZmakVWL2ZHa3kvOVMyYkZMUVhPaFMrSCtuZUNlZHdUVzZKVC9rNitxYVkKbkVqaGpMenJsTWY3YUt1QkdFUnpZTmc0S2pUekdlOFViaURRRFE2MlRtMDk1eVhVN0lTSjJnS1Vad0RWY0ROUApVR3lBOWFEMHF4aGp1WkJOVFpwaG94MzhId0tCZ1FEakpRRGpscC9uRVFEMFpScm56WFJVSmc4ZTdFUGF6dVBlCkRSYUlrSjFCSzlTRjlBd0pid2hNNkVwRUxWbjNWSnpSZ2JVSENDdnhhbzB0WTFxaldaN1RocTFQb3I4aXQ1RUQKSlE4VG9UMzkrdDgwR0N4T1lZWC8zUUlHcThKa1lGSGtiekhJek9wK1B0UEJESXNIMkdXRWxKUVVrMWo1bG1pWAptdEorRVV4aUl3S0JnUUMwb2FkZ251UzRMTjJobllteS8wY0VCZ3Bvd1oxbGdPYUxaamthT2k4UGo5WFo0RkhsClFTaXplLzlTWTdMWHROVm9TSG5UeTEvOWJ1b2dwemRJOVhvZ0RYUDR1R2ltVlVNa2RadEpBVHRkZFdFNkJSYlEKa3dJWWJQc0tSdVJsNzhudnNOcENoeTVTOHBwb0NSdGlZbFo1Wndyb256WE9OL1kzQktENGRnNDhJd0tCZ0NzMwpYaHp2Q290WEE5eDc1QXVZWG5xb0p4WldFMjd0RUJPdVg4d3AzNUdIdWs2bUtTZ2VWUEQwL1RSTmdLRjdHcjhOCnM1aWI2R2h0UW1FUlZ5eGZIOFhWQ09KdTczaTJma09mNkdkdXRURythbnNwNGp3amQvQS9aMlJIaDV1N2E3bFAKb3FRMndLSzJaMm1DYm0xV3NiSHc1dCtuVFRWbmRZenFxd1BMWE1JTEFvR0FMK21ldGNiejlSSFN1d0NUTW5lRQo0dFFxanBqM3o1OUQwWFR0dytiRG4vYnRBalF1VHJqa2k2Ums2a0E2bG1YYXRKQ2Z3WnVNdTd5L0MyUThUS1hjCjVWcUt1cGNhdnpHTWkzeVJrcmlmSEhpb2V1NGpXNlQyYk1XcDRuUTRoV050cEx1blF5aXNCeGpOZEMzZzBONmEKb2M4eXBOL3ZUVHFGdVB6Q3l2VmxUWEU9Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K [root@worker233 ~]# 3 查看用户列表 [root@worker233 ~]# kubectl config get-users --kubeconfig=./yinzhengjie-k8s.conf NAME jiege [root@worker233 ~]# 4.创建一个集群 [root@worker233 ~]# kubectl config set-cluster myk8s --embed-certs=false --certificate-authority=/etc/kubernetes/pki/ca.crt --server="https://10.1.24.13:6443" --kubeconfig=./yinzhengjie-k8s.conf Cluster "myk8s" set. 
[root@worker233 ~]# [root@worker233 ~]# ll /etc/kubernetes/pki/ca.crt -rw-r--r-- 1 root root 1099 Apr 10 14:50 /etc/kubernetes/pki/ca.crt [root@worker233 ~]# [root@worker233 ~]# cat yinzhengjie-k8s.conf apiVersion: v1 clusters: - cluster: certificate-authority: /etc/kubernetes/pki/ca.crt server: https://10.0.0.231:6443 name: myk8s contexts: null current-context: "" kind: Config preferences: {} users: - name: jiege user: client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURDakNDQWZLZ0F3SUJBZ0lSQUtxMEY4YXlpUGlFMkdHUWtpYUN4ZWN3RFFZSktvWklodmNOQVFFTEJRQXcKRlRFVE1CRUdBMVVFQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TlRBME1UUXdNalE0TWpoYUZ3MHlOVEEwTVRVdwpNalE0TWpoYU1DUXhFakFRQmdOVkJBb1RDVzlzWkdKdmVXVmtkVEVPTUF3R0ExVUVBeE1GYW1sbFoyVXdnZ0VpCk1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLQW9JQkFRRFRvc2doNmVpeU1CVklBNFVWaEpFSWllb0YKSmRxeGNNRDlCVVFLRGs2WUZCQ2xZUE9sd0xPek8xNU1vNk1OZXEwTGIrOFBhQWdMQml4ZERXTTR6TE1yQmxMQgpiL2x3SGkrV3Z5MnQvU1E4WU5MV09HYnhyUCtQVjJ3dUw4OWEyNHBwVk9teFFrdVExcC9XMFJHM3Zxd1RvVnd5ClRzTnlpa0VqZ0xLbXlLZWVVMWFNS3NldTV6TUNNajFYbldRNk5ZMHB3VzcxR0dxbnZ1MjF2VEpqMUllRTRmSjAKd29IMWNpL3ZsS0Y5bERvaUFPNkFJR2VRMEZPbGlNZWkwMHppVVY1aHVPQUZaOEt4eU1oVEZ6eWVjSkl6aFUwcwpLaDdZdWZ4NVR4YUx6ZmNjdk5mUHFMZDQwbmdtUjFlMjQ0aitCclVJcGVDMkVYMHcwV3pBYTdHdjBWWTlBZ01CCkFBR2pSakJFTUJNR0ExVWRKUVFNTUFvR0NDc0dBUVVGQndNQ01Bd0dBMVVkRXdFQi93UUNNQUF3SHdZRFZSMGoKQkJnd0ZvQVVFV1RuTDZYZDBUVWVZN0owZ1hhWDBMVHorYzR3RFFZSktvWklodmNOQVFFTEJRQURnZ0VCQURVbQpSVzRoRm83cjlreEszK1FuaENQL0lzVjNGZXltQkN5WUdUWVJoUlJOTCtEQldadlhTTUxuSkppNXRsZkFNSmNtCnY2MWN4MDY0cDRXM25TSG1aU04rODUySUR1alBwWjRXeTJ1VmIwVXR6MUtkM1RBVmJTNGdWTnVRMEgvaGs1aXEKSm9Zelh0WjdiQU4xSEgyQ3RjMUlpSGlNYzBHV1djcUtQQWtzZmNrTjR2Z2lYUDNZVTRFS1lJdXBtVWV4czBLbApoRXVHNUp3aGtLVStYWFZqNm1CWDdrNnBIT3Z3SG5lNEJDRW1sT2lIYnRXU3ZPd2poUTB1ZEJ6OEFKUWYxYVJjCkkyMW5oK2dCekpDdk5oOUpLVXpkemVMSFpld0g2dzB1YndJdEUvWDV3S3l6UmNwMUpweGZoZm1TZW00elRKbnMKS2JnV3pOUzYvUHp0ak90NWV4az0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo= client-key-data: 
LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2UUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktjd2dnU2pBZ0VBQW9JQkFRRFRvc2doNmVpeU1CVkkKQTRVVmhKRUlpZW9GSmRxeGNNRDlCVVFLRGs2WUZCQ2xZUE9sd0xPek8xNU1vNk1OZXEwTGIrOFBhQWdMQml4ZApEV000ekxNckJsTEJiL2x3SGkrV3Z5MnQvU1E4WU5MV09HYnhyUCtQVjJ3dUw4OWEyNHBwVk9teFFrdVExcC9XCjBSRzN2cXdUb1Z3eVRzTnlpa0VqZ0xLbXlLZWVVMWFNS3NldTV6TUNNajFYbldRNk5ZMHB3VzcxR0dxbnZ1MjEKdlRKajFJZUU0Zkowd29IMWNpL3ZsS0Y5bERvaUFPNkFJR2VRMEZPbGlNZWkwMHppVVY1aHVPQUZaOEt4eU1oVApGenllY0pJemhVMHNLaDdZdWZ4NVR4YUx6ZmNjdk5mUHFMZDQwbmdtUjFlMjQ0aitCclVJcGVDMkVYMHcwV3pBCmE3R3YwVlk5QWdNQkFBRUNnZ0VBTnI0TWRubENyNVN3YklnOGpHeFY5NWQwNlEvNW1aeEl6eW5saDVSYjBBcWcKbzZhSVgzK1ErL09IV051YStZbVo2VE55NnRGR0ExUDlkYlJZemdCazkrUVMwK1phNXgxbndkNkJ1bGVZWCtYTApvNDNEVXhBa3FyYzZURmdoa3FibkRvZmdTdkdUQ2t2NTNGOEg3amRyMjBnSnlSbUdoTUl1UnppcS9XazVza0h6CjFWQzRvdWl1Qk1yTStzcXhOWVNmYnJGK3pXV3R1QW05RzBkejVWRzdKSGRIOUEyMHFCeW5uNkF2VU5zempvdm8KYk9jVDVMenc5eGtOKzRjNnlXd3JWdzRRb3hCUWdUVi9Cd0l3bjlqZnB2eXRqaGp4bW9kVEoxcEJZT0ZMb0Q3WQp1YlVoVHdtL1Q1SmZXT0wyR09nZjNOempYeFlVS056WmhvMXJVMVEzSVFLQmdRRHVoV3NwQmdRY2dGZU9OWEhBCjdBQ1MrQldvWFZTeU1TMWdodUF1cFZmakVWL2ZHa3kvOVMyYkZMUVhPaFMrSCtuZUNlZHdUVzZKVC9rNitxYVkKbkVqaGpMenJsTWY3YUt1QkdFUnpZTmc0S2pUekdlOFViaURRRFE2MlRtMDk1eVhVN0lTSjJnS1Vad0RWY0ROUApVR3lBOWFEMHF4aGp1WkJOVFpwaG94MzhId0tCZ1FEakpRRGpscC9uRVFEMFpScm56WFJVSmc4ZTdFUGF6dVBlCkRSYUlrSjFCSzlTRjlBd0pid2hNNkVwRUxWbjNWSnpSZ2JVSENDdnhhbzB0WTFxaldaN1RocTFQb3I4aXQ1RUQKSlE4VG9UMzkrdDgwR0N4T1lZWC8zUUlHcThKa1lGSGtiekhJek9wK1B0UEJESXNIMkdXRWxKUVVrMWo1bG1pWAptdEorRVV4aUl3S0JnUUMwb2FkZ251UzRMTjJobllteS8wY0VCZ3Bvd1oxbGdPYUxaamthT2k4UGo5WFo0RkhsClFTaXplLzlTWTdMWHROVm9TSG5UeTEvOWJ1b2dwemRJOVhvZ0RYUDR1R2ltVlVNa2RadEpBVHRkZFdFNkJSYlEKa3dJWWJQc0tSdVJsNzhudnNOcENoeTVTOHBwb0NSdGlZbFo1Wndyb256WE9OL1kzQktENGRnNDhJd0tCZ0NzMwpYaHp2Q290WEE5eDc1QXVZWG5xb0p4WldFMjd0RUJPdVg4d3AzNUdIdWs2bUtTZ2VWUEQwL1RSTmdLRjdHcjhOCnM1aWI2R2h0UW1FUlZ5eGZIOFhWQ09KdTczaTJma09mNkdkdXRURythbnNwNGp3amQvQS9aMlJIaDV1N2E3bFAKb3FRMndLSzJaMm1DYm0xV3NiSHc1dCtuVFRWbmRZenFxd1BMWE1JTEFvR0FMK21ldGNiejlSSFN1d0NUTW5lRQo0dFFxanBqM3o1OUQwWFR0dytiRG4vYnRBalF1VHJqa2k2Ums2a0E2bG1YYXRKQ2Z3WnVNdTd5L0MyUThUS1hjCjVWcUt1cGNhdnpHTWkzeVJrcmlmSEhpb2V1NGpXNlQyYk1XcDRuUTRoV050cEx1blF5aXNCeGpOZEMzZzBONmEKb2M4eXBOL3ZUVHFGdVB6Q3l2VmxUWEU9Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K [root@worker233 ~]# 5 配置上下文 [root@worker233 ~]# kubectl config set-context jiege@myk8s --user=jiege --cluster=myk8s --kubeconfig=./yinzhengjie-k8s.conf Context "jiege@myk8s" created. 
[root@worker233 ~]# [root@worker233 ~]# cat yinzhengjie-k8s.conf apiVersion: v1 clusters: - cluster: certificate-authority: /etc/kubernetes/pki/ca.crt server: https://10.0.0.231:6443 name: myk8s contexts: - context: cluster: myk8s user: jiege name: jiege@myk8s current-context: "" kind: Config preferences: {} users: - name: jiege user: client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURDakNDQWZLZ0F3SUJBZ0lSQUtxMEY4YXlpUGlFMkdHUWtpYUN4ZWN3RFFZSktvWklodmNOQVFFTEJRQXcKRlRFVE1CRUdBMVVFQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TlRBME1UUXdNalE0TWpoYUZ3MHlOVEEwTVRVdwpNalE0TWpoYU1DUXhFakFRQmdOVkJBb1RDVzlzWkdKdmVXVmtkVEVPTUF3R0ExVUVBeE1GYW1sbFoyVXdnZ0VpCk1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLQW9JQkFRRFRvc2doNmVpeU1CVklBNFVWaEpFSWllb0YKSmRxeGNNRDlCVVFLRGs2WUZCQ2xZUE9sd0xPek8xNU1vNk1OZXEwTGIrOFBhQWdMQml4ZERXTTR6TE1yQmxMQgpiL2x3SGkrV3Z5MnQvU1E4WU5MV09HYnhyUCtQVjJ3dUw4OWEyNHBwVk9teFFrdVExcC9XMFJHM3Zxd1RvVnd5ClRzTnlpa0VqZ0xLbXlLZWVVMWFNS3NldTV6TUNNajFYbldRNk5ZMHB3VzcxR0dxbnZ1MjF2VEpqMUllRTRmSjAKd29IMWNpL3ZsS0Y5bERvaUFPNkFJR2VRMEZPbGlNZWkwMHppVVY1aHVPQUZaOEt4eU1oVEZ6eWVjSkl6aFUwcwpLaDdZdWZ4NVR4YUx6ZmNjdk5mUHFMZDQwbmdtUjFlMjQ0aitCclVJcGVDMkVYMHcwV3pBYTdHdjBWWTlBZ01CCkFBR2pSakJFTUJNR0ExVWRKUVFNTUFvR0NDc0dBUVVGQndNQ01Bd0dBMVVkRXdFQi93UUNNQUF3SHdZRFZSMGoKQkJnd0ZvQVVFV1RuTDZYZDBUVWVZN0owZ1hhWDBMVHorYzR3RFFZSktvWklodmNOQVFFTEJRQURnZ0VCQURVbQpSVzRoRm83cjlreEszK1FuaENQL0lzVjNGZXltQkN5WUdUWVJoUlJOTCtEQldadlhTTUxuSkppNXRsZkFNSmNtCnY2MWN4MDY0cDRXM25TSG1aU04rODUySUR1alBwWjRXeTJ1VmIwVXR6MUtkM1RBVmJTNGdWTnVRMEgvaGs1aXEKSm9Zelh0WjdiQU4xSEgyQ3RjMUlpSGlNYzBHV1djcUtQQWtzZmNrTjR2Z2lYUDNZVTRFS1lJdXBtVWV4czBLbApoRXVHNUp3aGtLVStYWFZqNm1CWDdrNnBIT3Z3SG5lNEJDRW1sT2lIYnRXU3ZPd2poUTB1ZEJ6OEFKUWYxYVJjCkkyMW5oK2dCekpDdk5oOUpLVXpkemVMSFpld0g2dzB1YndJdEUvWDV3S3l6UmNwMUpweGZoZm1TZW00elRKbnMKS2JnV3pOUzYvUHp0ak90NWV4az0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo= client-key-data: 
LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2UUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktjd2dnU2pBZ0VBQW9JQkFRRFRvc2doNmVpeU1CVkkKQTRVVmhKRUlpZW9GSmRxeGNNRDlCVVFLRGs2WUZCQ2xZUE9sd0xPek8xNU1vNk1OZXEwTGIrOFBhQWdMQml4ZApEV000ekxNckJsTEJiL2x3SGkrV3Z5MnQvU1E4WU5MV09HYnhyUCtQVjJ3dUw4OWEyNHBwVk9teFFrdVExcC9XCjBSRzN2cXdUb1Z3eVRzTnlpa0VqZ0xLbXlLZWVVMWFNS3NldTV6TUNNajFYbldRNk5ZMHB3VzcxR0dxbnZ1MjEKdlRKajFJZUU0Zkowd29IMWNpL3ZsS0Y5bERvaUFPNkFJR2VRMEZPbGlNZWkwMHppVVY1aHVPQUZaOEt4eU1oVApGenllY0pJemhVMHNLaDdZdWZ4NVR4YUx6ZmNjdk5mUHFMZDQwbmdtUjFlMjQ0aitCclVJcGVDMkVYMHcwV3pBCmE3R3YwVlk5QWdNQkFBRUNnZ0VBTnI0TWRubENyNVN3YklnOGpHeFY5NWQwNlEvNW1aeEl6eW5saDVSYjBBcWcKbzZhSVgzK1ErL09IV051YStZbVo2VE55NnRGR0ExUDlkYlJZemdCazkrUVMwK1phNXgxbndkNkJ1bGVZWCtYTApvNDNEVXhBa3FyYzZURmdoa3FibkRvZmdTdkdUQ2t2NTNGOEg3amRyMjBnSnlSbUdoTUl1UnppcS9XazVza0h6CjFWQzRvdWl1Qk1yTStzcXhOWVNmYnJGK3pXV3R1QW05RzBkejVWRzdKSGRIOUEyMHFCeW5uNkF2VU5zempvdm8KYk9jVDVMenc5eGtOKzRjNnlXd3JWdzRRb3hCUWdUVi9Cd0l3bjlqZnB2eXRqaGp4bW9kVEoxcEJZT0ZMb0Q3WQp1YlVoVHdtL1Q1SmZXT0wyR09nZjNOempYeFlVS056WmhvMXJVMVEzSVFLQmdRRHVoV3NwQmdRY2dGZU9OWEhBCjdBQ1MrQldvWFZTeU1TMWdodUF1cFZmakVWL2ZHa3kvOVMyYkZMUVhPaFMrSCtuZUNlZHdUVzZKVC9rNitxYVkKbkVqaGpMenJsTWY3YUt1QkdFUnpZTmc0S2pUekdlOFViaURRRFE2MlRtMDk1eVhVN0lTSjJnS1Vad0RWY0ROUApVR3lBOWFEMHF4aGp1WkJOVFpwaG94MzhId0tCZ1FEakpRRGpscC9uRVFEMFpScm56WFJVSmc4ZTdFUGF6dVBlCkRSYUlrSjFCSzlTRjlBd0pid2hNNkVwRUxWbjNWSnpSZ2JVSENDdnhhbzB0WTFxaldaN1RocTFQb3I4aXQ1RUQKSlE4VG9UMzkrdDgwR0N4T1lZWC8zUUlHcThKa1lGSGtiekhJek9wK1B0UEJESXNIMkdXRWxKUVVrMWo1bG1pWAptdEorRVV4aUl3S0JnUUMwb2FkZ251UzRMTjJobllteS8wY0VCZ3Bvd1oxbGdPYUxaamthT2k4UGo5WFo0RkhsClFTaXplLzlTWTdMWHROVm9TSG5UeTEvOWJ1b2dwemRJOVhvZ0RYUDR1R2ltVlVNa2RadEpBVHRkZFdFNkJSYlEKa3dJWWJQc0tSdVJsNzhudnNOcENoeTVTOHBwb0NSdGlZbFo1Wndyb256WE9OL1kzQktENGRnNDhJd0tCZ0NzMwpYaHp2Q290WEE5eDc1QXVZWG5xb0p4WldFMjd0RUJPdVg4d3AzNUdIdWs2bUtTZ2VWUEQwL1RSTmdLRjdHcjhOCnM1aWI2R2h0UW1FUlZ5eGZIOFhWQ09KdTczaTJma09mNkdkdXRURythbnNwNGp3amQvQS9aMlJIaDV1N2E3bFAKb3FRMndLSzJaMm1DYm0xV3NiSHc1dCtuVFRWbmRZenFxd1BMWE1JTEFvR0FMK21ldGNiejlSSFN1d0NUTW5lRQo0dFFxanBqM3o1OUQwWFR0dytiRG4vYnRBalF1VHJqa2k2Ums2a0E2bG1YYXRKQ2Z3WnVNdTd5L0MyUThUS1hjCjVWcUt1cGNhdnpHTWkzeVJrcmlmSEhpb2V1NGpXNlQyYk1XcDRuUTRoV050cEx1blF5aXNCeGpOZEMzZzBONmEKb2M4eXBOL3ZUVHFGdVB6Q3l2VmxUWEU9Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K [root@worker233 ~]# 6.查看上下文列表 [root@worker233 ~]# kubectl config get-contexts --kubeconfig=./yinzhengjie-k8s.conf CURRENT NAME CLUSTER AUTHINFO NAMESPACE jiege@myk8s myk8s jiege [root@worker233 ~]# 7.查看kubeconfig信息 [root@worker233 ~]# kubectl --kubeconfig=./yinzhengjie-k8s.conf config view apiVersion: v1 clusters: - cluster: certificate-authority: /etc/kubernetes/pki/ca.crt server: https://10.0.0.231:6443 name: myk8s contexts: - context: cluster: myk8s user: jiege name: jiege@myk8s current-context: "" kind: Config preferences: {} users: - name: jiege user: client-certificate-data: REDACTED client-key-data: REDACTED [root@worker233 ~]# 8.客户端测试验证 [root@worker233 ~]# kubectl get pods --kubeconfig=./yinzhengjie-k8s.conf The connection to the server localhost:8080 was refused - did you specify the right host or port? [root@worker233 ~]# [root@worker233 ~]# kubectl get pods --kubeconfig=./yinzhengjie-k8s.conf --context=jiege@myk8s Error from server (Forbidden): pods is forbidden: User "jiege" cannot list resource "pods" in API group "" in the namespace "default" [root@worker233 ~]# 9.配置默认上下文 [root@worker233 ~]# kubectl config use-context jiege@myk8s --kubeconfig=./yinzhengjie-k8s.conf Switched to context "jiege@myk8s". 
[root@worker233 ~]# [root@worker233 ~]# cat yinzhengjie-k8s.conf apiVersion: v1 clusters: - cluster: certificate-authority: /etc/kubernetes/pki/ca.crt server: https://10.0.0.231:6443 name: myk8s contexts: - context: cluster: myk8s user: jiege name: jiege@myk8s current-context: jiege@myk8s kind: Config preferences: {} users: - name: jiege user: client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURDakNDQWZLZ0F3SUJBZ0lSQUtxMEY4YXlpUGlFMkdHUWtpYUN4ZWN3RFFZSktvWklodmNOQVFFTEJRQXcKRlRFVE1CRUdBMVVFQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TlRBME1UUXdNalE0TWpoYUZ3MHlOVEEwTVRVdwpNalE0TWpoYU1DUXhFakFRQmdOVkJBb1RDVzlzWkdKdmVXVmtkVEVPTUF3R0ExVUVBeE1GYW1sbFoyVXdnZ0VpCk1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLQW9JQkFRRFRvc2doNmVpeU1CVklBNFVWaEpFSWllb0YKSmRxeGNNRDlCVVFLRGs2WUZCQ2xZUE9sd0xPek8xNU1vNk1OZXEwTGIrOFBhQWdMQml4ZERXTTR6TE1yQmxMQgpiL2x3SGkrV3Z5MnQvU1E4WU5MV09HYnhyUCtQVjJ3dUw4OWEyNHBwVk9teFFrdVExcC9XMFJHM3Zxd1RvVnd5ClRzTnlpa0VqZ0xLbXlLZWVVMWFNS3NldTV6TUNNajFYbldRNk5ZMHB3VzcxR0dxbnZ1MjF2VEpqMUllRTRmSjAKd29IMWNpL3ZsS0Y5bERvaUFPNkFJR2VRMEZPbGlNZWkwMHppVVY1aHVPQUZaOEt4eU1oVEZ6eWVjSkl6aFUwcwpLaDdZdWZ4NVR4YUx6ZmNjdk5mUHFMZDQwbmdtUjFlMjQ0aitCclVJcGVDMkVYMHcwV3pBYTdHdjBWWTlBZ01CCkFBR2pSakJFTUJNR0ExVWRKUVFNTUFvR0NDc0dBUVVGQndNQ01Bd0dBMVVkRXdFQi93UUNNQUF3SHdZRFZSMGoKQkJnd0ZvQVVFV1RuTDZYZDBUVWVZN0owZ1hhWDBMVHorYzR3RFFZSktvWklodmNOQVFFTEJRQURnZ0VCQURVbQpSVzRoRm83cjlreEszK1FuaENQL0lzVjNGZXltQkN5WUdUWVJoUlJOTCtEQldadlhTTUxuSkppNXRsZkFNSmNtCnY2MWN4MDY0cDRXM25TSG1aU04rODUySUR1alBwWjRXeTJ1VmIwVXR6MUtkM1RBVmJTNGdWTnVRMEgvaGs1aXEKSm9Zelh0WjdiQU4xSEgyQ3RjMUlpSGlNYzBHV1djcUtQQWtzZmNrTjR2Z2lYUDNZVTRFS1lJdXBtVWV4czBLbApoRXVHNUp3aGtLVStYWFZqNm1CWDdrNnBIT3Z3SG5lNEJDRW1sT2lIYnRXU3ZPd2poUTB1ZEJ6OEFKUWYxYVJjCkkyMW5oK2dCekpDdk5oOUpLVXpkemVMSFpld0g2dzB1YndJdEUvWDV3S3l6UmNwMUpweGZoZm1TZW00elRKbnMKS2JnV3pOUzYvUHp0ak90NWV4az0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo= client-key-data: 
LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2UUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktjd2dnU2pBZ0VBQW9JQkFRRFRvc2doNmVpeU1CVkkKQTRVVmhKRUlpZW9GSmRxeGNNRDlCVVFLRGs2WUZCQ2xZUE9sd0xPek8xNU1vNk1OZXEwTGIrOFBhQWdMQml4ZApEV000ekxNckJsTEJiL2x3SGkrV3Z5MnQvU1E4WU5MV09HYnhyUCtQVjJ3dUw4OWEyNHBwVk9teFFrdVExcC9XCjBSRzN2cXdUb1Z3eVRzTnlpa0VqZ0xLbXlLZWVVMWFNS3NldTV6TUNNajFYbldRNk5ZMHB3VzcxR0dxbnZ1MjEKdlRKajFJZUU0Zkowd29IMWNpL3ZsS0Y5bERvaUFPNkFJR2VRMEZPbGlNZWkwMHppVVY1aHVPQUZaOEt4eU1oVApGenllY0pJemhVMHNLaDdZdWZ4NVR4YUx6ZmNjdk5mUHFMZDQwbmdtUjFlMjQ0aitCclVJcGVDMkVYMHcwV3pBCmE3R3YwVlk5QWdNQkFBRUNnZ0VBTnI0TWRubENyNVN3YklnOGpHeFY5NWQwNlEvNW1aeEl6eW5saDVSYjBBcWcKbzZhSVgzK1ErL09IV051YStZbVo2VE55NnRGR0ExUDlkYlJZemdCazkrUVMwK1phNXgxbndkNkJ1bGVZWCtYTApvNDNEVXhBa3FyYzZURmdoa3FibkRvZmdTdkdUQ2t2NTNGOEg3amRyMjBnSnlSbUdoTUl1UnppcS9XazVza0h6CjFWQzRvdWl1Qk1yTStzcXhOWVNmYnJGK3pXV3R1QW05RzBkejVWRzdKSGRIOUEyMHFCeW5uNkF2VU5zempvdm8KYk9jVDVMenc5eGtOKzRjNnlXd3JWdzRRb3hCUWdUVi9Cd0l3bjlqZnB2eXRqaGp4bW9kVEoxcEJZT0ZMb0Q3WQp1YlVoVHdtL1Q1SmZXT0wyR09nZjNOempYeFlVS056WmhvMXJVMVEzSVFLQmdRRHVoV3NwQmdRY2dGZU9OWEhBCjdBQ1MrQldvWFZTeU1TMWdodUF1cFZmakVWL2ZHa3kvOVMyYkZMUVhPaFMrSCtuZUNlZHdUVzZKVC9rNitxYVkKbkVqaGpMenJsTWY3YUt1QkdFUnpZTmc0S2pUekdlOFViaURRRFE2MlRtMDk1eVhVN0lTSjJnS1Vad0RWY0ROUApVR3lBOWFEMHF4aGp1WkJOVFpwaG94MzhId0tCZ1FEakpRRGpscC9uRVFEMFpScm56WFJVSmc4ZTdFUGF6dVBlCkRSYUlrSjFCSzlTRjlBd0pid2hNNkVwRUxWbjNWSnpSZ2JVSENDdnhhbzB0WTFxaldaN1RocTFQb3I4aXQ1RUQKSlE4VG9UMzkrdDgwR0N4T1lZWC8zUUlHcThKa1lGSGtiekhJek9wK1B0UEJESXNIMkdXRWxKUVVrMWo1bG1pWAptdEorRVV4aUl3S0JnUUMwb2FkZ251UzRMTjJobllteS8wY0VCZ3Bvd1oxbGdPYUxaamthT2k4UGo5WFo0RkhsClFTaXplLzlTWTdMWHROVm9TSG5UeTEvOWJ1b2dwemRJOVhvZ0RYUDR1R2ltVlVNa2RadEpBVHRkZFdFNkJSYlEKa3dJWWJQc0tSdVJsNzhudnNOcENoeTVTOHBwb0NSdGlZbFo1Wndyb256WE9OL1kzQktENGRnNDhJd0tCZ0NzMwpYaHp2Q290WEE5eDc1QXVZWG5xb0p4WldFMjd0RUJPdVg4d3AzNUdIdWs2bUtTZ2VWUEQwL1RSTmdLRjdHcjhOCnM1aWI2R2h0UW1FUlZ5eGZIOFhWQ09KdTczaTJma09mNkdkdXRURythbnNwNGp3amQvQS9aMlJIaDV1N2E3bFAKb3FRMndLSzJaMm1DYm0xV3NiSHc1dCtuVFRWbmRZenFxd1BMWE1JTEFvR0FMK21ldGNiejlSSFN1d0NUTW5lRQo0dFFxanBqM3o1OUQwWFR0dytiRG4vYnRBalF1VHJqa2k2Ums2a0E2bG1YYXRKQ2Z3WnVNdTd5L0MyUThUS1hjCjVWcUt1cGNhdnpHTWkzeVJrcmlmSEhpb2V1NGpXNlQyYk1XcDRuUTRoV050cEx1blF5aXNCeGpOZEMzZzBONmEKb2M4eXBOL3ZUVHFGdVB6Q3l2VmxUWEU9Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K [root@worker233 ~]# 10.再次测试 [root@worker233 ~]# kubectl config current-context --kubeconfig=./yinzhengjie-k8s.conf jiege@myk8s [root@worker233 ~]# [root@worker233 ~]# kubectl config get-contexts --kubeconfig=./yinzhengjie-k8s.conf CURRENT NAME CLUSTER AUTHINFO NAMESPACE * jiege@myk8s myk8s jiege [root@worker233 ~]# [root@worker233 ~]# kubectl get pods --kubeconfig=./yinzhengjie-k8s.conf Error from server (Forbidden): pods is forbidden: User "jiege" cannot list resource "pods" in API group "" in the namespace "default" [root@worker233 ~]# 11.配置KUBECONFIG环境变量 [root@worker233 ~]# export KUBECONFIG=/root/yinzhengjie-k8s.conf [root@worker233 ~]# [root@worker233 ~]# kubectl get pods Error from server (Forbidden): pods is forbidden: User "jiege" cannot list resource "pods" in API group "" in the namespace "default" [root@worker233 ~]# # 直接把conf文件移动,也可以查看 [root@worker233 ~]# mv yinzhengjie-k8s.conf .kube/config
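# 补充: 上文的jiege.crt/jiege.key是提前准备好的,下面给出生成此类客户端证书的一种参考做法(仅为示意草案):
# 假设在master231上操作,集群CA位于/etc/kubernetes/pki;CN、O取值仅为示例,K8S的x509认证会把CN当作用户名,O当作用户组。
# 1.生成私钥与证书签发请求
openssl genrsa -out jiege.key 2048
openssl req -new -key jiege.key -out jiege.csr -subj "/O=oldboyedu/CN=jiege"

# 2.使用集群CA签发客户端证书(有效期按需调整)
openssl x509 -req -in jiege.csr -CA /etc/kubernetes/pki/ca.crt -CAkey /etc/kubernetes/pki/ca.key \
    -CAcreateserial -out jiege.crt -days 365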
5、⭐K8S默认基于sa进行认证
bash
1.为何需要Service Account Kubernetes原生(kubernetes-native)托管运行于Kubernetes之上,通常需要直接与API Server进行交互以获取必要的信息。 API Server同样需要对这类来自于Pod资源中客户端程序进行身份验证,Service Account也就是设计专用于这类场景的账号。 ServiceAccount是API Server支持的标准资源类型之一。 - 1.基于资源对象保存ServiceAccount的数据; - 2.认证信息保存于ServiceAccount对象专用的Secret中(v1.23-版本) - 3.隶属名称空间级别,专供集群上的Pod中的进程访问API Server时使用; 2.Pod使用ServiceAccount方式 在Pod上使用Service Account通常有两种方式: 自动设定: Service Account通常由API Server自动创建并通过ServiceAccount准入控制器自动关联到集群中创建的每个Pod上。 自定义: 在Pod规范上,使用serviceAccountName指定要使用的特定ServiceAccount。 Kubernetes基于三个组件完成Pod上serviceaccount的自动化,分别对应: ServiceAccount Admission Controller,Token Controller,ServiceAccount Controller。 - ServiceAccount Admission Controller: API Server准入控制器插件,主要负责完成Pod上的ServiceAccount的自动化。 为每个名称空间自动生成一个"default"的sa,若用户未指定sa,则默认使用"default"。 - Token Controller: 为每一个sa分配一个token的组件,已经集成到Controller manager的组件中。 - ServiceAccount Controller: 为sa生成对应的数据信息,已经集成到Controller manager的组件中。 温馨提示: 需要用到特殊权限时,可为Pod指定要使用的自定义ServiceAccount资源对象 3.ServiceAccount Token的不同实现方式 ServiceAccount使用专用的Secret对象(Kubernetes v1.23-)存储相关的敏感信息 - 1.Secret对象的类型标识为“kubernetes.io/service-account-token” - 2.该Secret对象会自动附带认证到API Server用到的Token,也称为ServiceAccount Token ServiceAccount Token的不同实现方式 - 1.Kubernetes v1.20- 系统自动生成专用的Secret对象,并基于secret卷插件关联至相关的Pod; Secret中会自动附带Token且永久有效(安全性低,如果将来获取该token可以长期登录)。 - 2.Kubernetes v1.21-v1.23: 系统自动生成专用的Secret对象,并通过projected卷插件关联至相关的Pod; Pod不会使用Secret上的Token,被弃用后,在未来版本就不在创建该token。 而是由Kubelet向TokenRequest API请求生成,默认有效期为一年,且每小时更新一次; - 3.Kubernetes v1.24+: 系统不再自动生成专用的Secret对象。 而是由Kubelet负责向TokenRequest API请求生成Token,默认有效期为一年,且每小时更新一次; 4.创建sa并让pod引用指定的sa [root@master231 pods]# cat 14-pods-sa.yaml apiVersion: v1 kind: ServiceAccount metadata: name: oldboy --- apiVersion: v1 kind: Pod metadata: name: weixiang-pods-sa spec: # 使用sa的名称 serviceAccountName: oldboy containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 [root@master231 pods]# [root@master231 pods]# kubectl apply -f 14-pods-sa.yaml serviceaccount/oldboy created pod/weixiang-pods-sa created [root@master231 pods]# [root@master231 pods]# kubectl get -f 14-pods-sa.yaml NAME SECRETS AGE serviceaccount/oldboy 1 4s NAME READY STATUS RESTARTS AGE pod/weixiang-pods-sa 1/1 Running 0 4s [root@master231 pods]# 5.验证pod使用sa的验证身份 [root@master231 auth]# kubectl exec -it weixiang-pods-sa -- sh / # ls -l /var/run/secrets/kubernetes.io/serviceaccount total 0 lrwxrwxrwx 1 root root 13 Feb 23 04:13 ca.crt -> ..data/ca.crt lrwxrwxrwx 1 root root 16 Feb 23 04:13 namespace -> ..data/namespace lrwxrwxrwx 1 root root 12 Feb 23 04:13 token -> ..data/token / # / # TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token) / # / # curl -k -H "Authorization: Bearer ${TOKEN}" https://kubernetes.default.svc.weixiang.com;echo { "kind": "Status", "apiVersion": "v1", "metadata": {}, "status": "Failure", "message": "forbidden: User \"system:serviceaccount:default:oldboy\" cannot get path \"/\"", "reason": "Forbidden", "details": {}, "code": 403 } / # 6.其他测试实例 [root@master231 14-serviceAccount]# cat 02-deploy-sa-secrets-harbor.yaml apiVersion: v1 kind: Secret metadata: name: harbor-weixiang98 type: kubernetes.io/dockerconfigjson stringData: # 解码: # echo bGludXg5ODpMaW51eDk4QDIwMjU= | base64 -d # 编码: # echo -n weixiang98:Linux98@2025 | base64 .dockerconfigjson: '{"auths":{"harbor250.weixiang.com":{"username":"weixiang98","password":"Linux98@2025","email":"weixiang98@weixiang.com","auth":"bGludXg5ODpMaW51eDk4QDIwMjU="}}}' --- apiVersion: v1 kind: ServiceAccount metadata: name: sa-weixiang98 # 
让sa绑定secret,将来与sa认证时,会使用secret的认证信息。 imagePullSecrets: - name: harbor-weixiang98 --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian-sa-secrets-harbor spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: #imagePullSecrets: #- name: "harbor-weixiang98" # 指定服务账号,该字段官方已经弃用。推荐使用'serviceAccountName',如果不指定,则默认名称为"default"的sa。 # serviceAccount: sa-weixiang98 # 推荐使用该字段来指定sa的认证信息 serviceAccountName: sa-weixiang98 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 imagePullPolicy: Always [root@master231 14-serviceAccount]# [root@master231 ~]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-sa-secrets-harbor-66b99db97d-7wvz9 1/1 Running 0 149m 10.100.1.95 worker232 <none> <none> deploy-xiuxian-sa-secrets-harbor-66b99db97d-8kq29 1/1 Running 0 149m 10.100.1.96 worker232 <none> <none> deploy-xiuxian-sa-secrets-harbor-66b99db97d-cqgd6 1/1 Running 0 149m 10.100.2.67 worker233 <none> <none> [root@master231 ~]# [root@master231 ~]# kubectl exec -it deploy-xiuxian-sa-secrets-harbor-66b99db97d-7wvz9 -- sh / # ls -l /var/run/secrets/kubernetes.io/serviceaccount total 0 lrwxrwxrwx 1 root root 13 Jul 18 01:32 ca.crt -> ..data/ca.crt lrwxrwxrwx 1 root root 16 Jul 18 01:32 namespace -> ..data/namespace lrwxrwxrwx 1 root root 12 Jul 18 01:32 token -> ..data/token / # / # TOKEN=/var/run/secrets/kubernetes.io/serviceaccount/token / # / # TOKEN=`cat /var/run/secrets/kubernetes.io/serviceaccount/token` / # curl -k -H "Authorization: Bearer ${TOKEN}" https://kubernetes.default.svc.weixiang.com;echo { "kind": "Status", "apiVersion": "v1", "metadata": {}, "status": "Failure", "message": "forbidden: User \"system:serviceaccount:default:sa-weixiang98\" cannot get path \"/\"", "reason": "Forbidden", "details": {}, "code": 403 } / #
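# 补充: 可以在Pod内直接解码挂载的token,确认签发者(iss)与过期时间(exp),印证上面不同版本token机制的差异。
# 以下仅为参考片段,假设容器内有base64命令;JWT为base64url编码且可能缺少padding,解码末尾报错不影响查看内容。
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
echo $TOKEN | cut -d. -f2 | base64 -d 2>/dev/null; echo

# v1.24+的kubectl还提供了create token子命令,可在集群侧为sa手动签发短期token(本文集群为1.23,此处仅作了解)
kubectl create token oldboy --duration=1h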
6、⭐启用authorization模式


bash
- RBAC基础概念图解 # 实体(Entity): 在RBAC也称为Subject,通常指的是User、Group或者是ServiceAccount; # 角色(Role): 承载资源操作权限的容器。 1.资源(Resource): 在RBAC中也称为Object,指代Subject期望操作的目标,例如Service,Deployments,ConfigMap,Secret、Pod等资源。 仅限于"/api/v1/...""/apis/<group>/<version>/..."起始的路径; 其它路径对应的端点均被视作“非资源类请求(Non-Resource Requests)”,例如"/api""/healthz"等端点; 2.动作(Actions): Subject可以于Object上执行的特定操作,具体的可用动作取决于Kubernetes的定义。 资源型对象: 只读操作:get、list、watch等。 读写操作:create、update、patch、delete、deletecollection等。 非资源型端点仅支持"get"操作。 # 角色绑定(Role Binding): 将角色关联至实体上,它能够将角色具体的操作权限赋予给实体。 角色的类型: 1.Namespace级别: 称为Role,定义名称空间范围内的资源操作权限集合。 2.Namespace和Cluster级别: 称为ClusterRole,定义集群范围内的资源操作权限集合,包括集群级别及名称空间级别的资源对象。 角色绑定的类型: Cluster级别: 称为ClusterRoleBinding,可以将实体(User、Group或ServiceAccount)关联至ClusterRole。 Namespace级别: 称为RoleBinding,可以将实体关联至ClusterRole或Role。 即便将Subject使用RoleBinding关联到了ClusterRole上,该角色赋予到Subject的权限也会降级到RoleBinding所属的Namespace范围之内。 - ClusterRole 启用RBAC鉴权模块时,API Server会自动创建一组ClusterRole和ClusterRoleBinding对象 多数都以“system:”为前缀,也有几个面向用户的ClusterRole未使用该前缀,如cluster-admin、admin等。 它们都默认使用“kubernetes.io/bootstrapping: rbac-defaults”这一标签。 默认的ClusterRole大体可以分为5个类别。 API发现相关的角色: 包括system:basic-user、system:discovery和system:public-info-viewer。 面向用户的角色: 包括cluster-admin、admin、edit和view。 核心组件专用的角色: 包括system:kube-scheduler、system:volume-scheduler、system:kube-controller-manager、system:node和system:node-proxier等。 其它组件专用的角色: 包括system:kube-dns、system:node-bootstrapper、system:node-problem-detector和system:monitoring等。 内置控制器专用的角色: 专为内置的控制器使用的角色,具体可参考官网文档。 查看面向用户的四个角色 1.查看角色列表 [root@master231 ~]# kubectl get clusterrole | egrep -v "^system|calico|flannel|metallb|kubeadm|tigera" NAME CREATED AT admin 2025-07-09T02:40:56Z cluster-admin 2025-07-09T02:40:56Z edit 2025-07-09T02:40:56Z view 2025-07-09T02:40:56Z [root@master231 ~]# 相关角色说明: 从权限角度而言,其中cluster-admin相当于Linux的root用户。 其次是admin,edit和view权限依次减小。其中view权限最小,对于的大多数资源只能读取。 2.查看集群角色'cluster-admin'相当于Linux的root用户。 [root@master231 pods]# kubectl get clusterrole cluster-admin -o yaml apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole # 定义资源类型为 ClusterRole(集群角色) metadata: annotations: # 表示此角色会自动更新 rbac.authorization.kubernetes.io/autoupdate: "true" creationTimestamp: "2025-05-22T02:57:49Z" labels: kubernetes.io/bootstrapping: rbac-defaults # 表示这是 Kubernetes 引导过程创建的默认 RBAC 角色 name: cluster-admin # 角色名称:cluster-admin,这是 Kubernetes 预定义的最高权限角色 resourceVersion: "86" uid: 7324f527-e03e-425a-a4ba-3f4dca4a6017 rules: - apiGroups: # 第一条规则:授予对所有 API 组和资源的完全权限 - '*' # 通配符表示所有 API 组 resources: # 适用的资源类型 - '*' # 通配符表示所有资源类型 verbs: # 允许的操作 - '*' # 通配符表示所有操作(get, list, create, update, delete 等) - nonResourceURLs: # # 第二条规则:授予对非资源 URL 的完全访问权限 - '*' verbs: - '*' - 验证管理员绑定的cluster-admin角色 1.查看集群角色绑定 [root@master231 pods]# kubectl get clusterrolebindings cluster-admin -o yaml apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding # 指定了要创建的对象的类型 # ClusterRoleBinding:这是一种集群级别的绑定。它的作用是将一个ClusterRole(集群角色)绑定到一个或多个 subjects(主体,如用户、用户组或服务账户)。 metadata: annotations: rbac.authorization.kubernetes.io/autoupdate: "true" creationTimestamp: "2025-05-22T02:57:49Z" labels: kubernetes.io/bootstrapping: rbac-defaults # 表明该对象是 Kubernetes 集群在初始化(bootstrapping)时创建的默认 RBAC 配置之一。 name: cluster-admin # 这个 ClusterRoleBinding 对象的名字。这个名字在集群中必须是唯一的。 resourceVersion: "148" uid: 96e93f0a-c1fc-4740-b879-72fc2fd30d3e roleRef: # 角色引用,这部分指定了要授予什么权限 apiGroup: rbac.authorization.k8s.io # 被引用的角色所属的API组,这里同样是 RBAC API 组 kind: ClusterRole # 被引用的对象类型是一个 ClusterRole。ClusterRole 定义了一组集群范围的权限 name: cluster-admin # 被引用的 ClusterRole 的名字。 subjects: # 指定了将权限授予谁,subjects 
是一个列表,可以包含多个用户、用户组或服务账户 - apiGroup: rbac.authorization.k8s.io # 主体所属的 API 组 kind: Group # 主体的类型是一个用户组 (Group) name: system:masters # 用户组的名称 2.查看管理员证书 [root@master231 pods]# kubectl config view --raw -o jsonpath='{.users[0].user.client-certificate-data}' | base64 -d > /tmp/xixi.cert [root@master231 pods]# openssl x509 -in /tmp/xixi.cert -text Certificate: Data: Version: 3 (0x2) Serial Number: 4201491904469302487 (0x3a4eb2a06c927cd7) Signature Algorithm: sha256WithRSAEncryption Issuer: CN = kubernetes Validity Not Before: May 22 02:57:42 2025 GMT Not After : May 22 02:57:44 2026 GMT Subject: O = system:masters, CN = kubernetes-admin Subject Public Key Info: ...
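# 补充: 排查RBAC权限时,kubectl auth can-i比反复切换kubeconfig更方便,既可检查当前身份,
# 也可以通过--as/--as-group模拟其它实体(模拟需要当前身份具备impersonate权限,kubeadm默认的管理员即可)。
kubectl auth can-i '*' '*'                          # 以cluster-admin身份执行,预期返回yes
kubectl auth can-i list pods --as=jiege             # 模拟普通用户jiege
kubectl auth can-i --list --as=jiege -n default     # 列出jiege在default名称空间的全部权限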


Role授权给一个用户类型实战

bash
- Role授权给一个用户类型实战 1.未授权前测试 [root@worker233 ~]# kubectl get pods Error from server (Forbidden): pods is forbidden: User "jiege" cannot list resource "pods" in API group "" in the namespace "default" [root@worker233 ~]# [root@worker233 ~]# ll ~/.kube/config -rw------- 1 root root 4115 Jul 18 11:27 /root/.kube/config [root@worker233 ~]# [root@worker233 ~]# kubectl config view apiVersion: v1 clusters: - cluster: certificate-authority: /etc/kubernetes/pki/ca.crt server: https://10.0.0.231:6443 name: myk8s contexts: - context: cluster: myk8s user: jiege name: jiege@myk8s current-context: jiege@myk8s kind: Config preferences: {} users: - name: jiege user: client-certificate-data: REDACTED client-key-data: REDACTED [root@worker233 ~]# 2.创建Role [root@master231 ~]# kubectl create role reader --resource=po,svc --verb=get,watch,list -o yaml --dry-run=client # 将来可用于声明式 #kubectl create role reader:创建一个role对象reader #--resource=po,svc:指定这个角色能操作的资源是 pods,svc # --verb=get,watch,list:指定允许对这些资源执行的操作是只读的 apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: creationTimestamp: null name: reader rules: - apiGroups: - "" resources: - pods - services verbs: - get - watch - list [root@master231 ~]# [root@master231 ~]# kubectl create role reader --resource=po,svc --verb=get,watch,list # 响应式创建 role.rbac.authorization.k8s.io/reader created [root@master231 ~]# [root@master231 ~]# kubectl get role NAME CREATED AT reader 2025-07-18T07:00:52Z [root@master231 ~]# 3.创建角色绑定 [root@master231 ~]# kubectl create rolebinding jiege-as-reader --user=jiege --role=reader -o yaml --dry-run=client # 将名为 reader 的 Role(角色/权限集)授予给一个名为 jiege 的用户。这个授权仅在当前的 Kubernetes 命名空间中生效。 # jiege-as-reader:[用户]-as-[角色] # --user=jiege:--user: 表明主体(Subject)的类型是用户(User) apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: creationTimestamp: null name: jiege-as-reader roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: reader subjects: - apiGroup: rbac.authorization.k8s.io kind: User name: jiege [root@master231 ~]# [root@master231 ~]# kubectl create rolebinding jiege-as-reader --user=jiege --role=reader rolebinding.rbac.authorization.k8s.io/jiege-as-reader created [root@master231 ~]# [root@master231 ~]# kubectl get rolebindings jiege-as-reader NAME ROLE AGE jiege-as-reader Role/reader 13s [root@master231 ~]# [root@master231 ~]# kubectl get rolebindings jiege-as-reader -o wide NAME ROLE AGE USERS GROUPS SERVICEACCOUNTS jiege-as-reader Role/reader 16s jiege [root@master231 ~]# 4.授权后再次验证 [root@worker233 ~]# kubectl get pods NAME READY STATUS RESTARTS AGE deploy-xiuxian-sa-secrets-harbor-66b99db97d-7wvz9 1/1 Running 0 5h30m deploy-xiuxian-sa-secrets-harbor-66b99db97d-8kq29 1/1 Running 0 5h30m deploy-xiuxian-sa-secrets-harbor-66b99db97d-cqgd6 1/1 Running 0 5h30m [root@worker233 ~]# [root@worker233 ~]# kubectl get pods,svc -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-xiuxian-sa-secrets-harbor-66b99db97d-7wvz9 1/1 Running 0 5h31m 10.100.1.95 worker232 <none> <none> pod/deploy-xiuxian-sa-secrets-harbor-66b99db97d-8kq29 1/1 Running 0 5h31m 10.100.1.96 worker232 <none> <none> pod/deploy-xiuxian-sa-secrets-harbor-66b99db97d-cqgd6 1/1 Running 0 5h31m 10.100.2.67 worker233 <none> <none> NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 9d <none> service/svc-blog ExternalName <none> baidu.com <none> 2d4h <none> service/svc-db ClusterIP 10.200.58.173 <none> 3306/TCP 2d3h <none> service/svc-wp NodePort 10.200.7.64 <none> 80:30090/TCP 2d3h 
apps=wp service/svc-xiuxian NodePort 10.200.92.117 <none> 88:8080/TCP 2d apps=v1 service/svc-xiuxian-lb LoadBalancer 10.200.132.29 10.0.0.150 88:32116/TCP 2d3h apps=v1 [root@worker233 ~]# [root@worker233 ~]# [root@worker233 ~]# kubectl get deploy -o wide Error from server (Forbidden): deployments.apps is forbidden: User "jiege" cannot list resource "deployments" in API group "apps" in the namespace "default" [root@worker233 ~]# [root@worker233 ~]# kubectl get pods -A # 值得注意的是,尽管我们能够查看default的Pod,但不能查看所有名称空间的Pod,如果你想要查看所有名称空间的Pod,请使用CLusterRole。 Error from server (Forbidden): pods is forbidden: User "jiege" cannot list resource "pods" in API group "" at the cluster scope [root@worker233 ~]# 5.修改权限 方式一: (响应式) [root@master231 ~]# kubectl create role reader --resource=po,svc,deploy --verb=get,watch,list -o yaml --dry-run=client | kubectl apply -f - Warning: resource roles/reader is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically. role.rbac.authorization.k8s.io/reader configured [root@master231 ~]# 方式二: (声明式) [root@master231 ~]# kubectl create role reader --resource=po,svc,deploy --verb=get,watch,list -o yaml --dry-run=client apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: creationTimestamp: null name: reader rules: - apiGroups: - "" resources: - pods - services verbs: - get - watch - list - apiGroups: - apps resources: - deployments verbs: - get - watch - list [root@master231 ~]# [root@master231 ~]# kubectl create role reader --resource=po,svc,deploy --verb=get,watch,list -o yaml --dry-run=client > 01-Role-jiege.yaml [root@master231 ~]# [root@master231 ~]# kubectl apply -f 01-Role-jiege.yaml role.rbac.authorization.k8s.io/reader configured [root@master231 ~]# 6.测试验证 [root@worker233 ~]# kubectl get deploy -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deploy-xiuxian-sa-secrets-harbor 3/3 3 3 5h32m c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v3 apps=v1 [root@worker233 ~]# [root@worker233 ~]# kubectl delete deploy deploy-xiuxian-sa-secrets-harbor Error from server (Forbidden): deployments.apps "deploy-xiuxian-sa-secrets-harbor" is forbidden: User "jiege" cannot delete resource "deployments" in API group "apps" in the namespace "default" [root@worker233 ~]#
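# 补充: 如果希望把上面的Role和RoleBinding统一改为声明式维护,可以整理成一份清单。
# 以下内容与前文响应式创建的对象一致,文件名仅为示例:
cat > jiege-reader-rbac.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: reader
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - services
  verbs:
  - get
  - watch
  - list
- apiGroups:
  - apps
  resources:
  - deployments
  verbs:
  - get
  - watch
  - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: jiege-as-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: reader
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: jiege
EOF
kubectl apply -f jiege-reader-rbac.yaml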

ClusterRole授权给一个用户组类型

bash
- ClusterRole授权给一个用户组类型 1.授权前测试 [root@worker232 ~]# kubectl config view --kubeconfig=./yinzhengjie-k8s.conf apiVersion: v1 clusters: - cluster: certificate-authority-data: DATA+OMITTED server: https://10.0.0.231:6443 name: myk8s contexts: - context: cluster: myk8s user: jasonyin name: jasonyin@myk8s - context: cluster: myk8s user: yinzhengjie name: yinzhengjie@myk8s current-context: yinzhengjie@myk8s kind: Config preferences: {} users: - name: jasonyin user: token: REDACTED - name: yinzhengjie user: token: REDACTED [root@worker232 ~]# [root@worker232 ~]# kubectl get pods --kubeconfig=./yinzhengjie-k8s.conf Error from server (Forbidden): pods is forbidden: User "yinzhengjie" cannot list resource "pods" in API group "" in the namespace "default" [root@worker232 ~]# 2.创建集群角色 [root@master231 ~]# kubectl create clusterrole reader --resource=deploy,rs,pods --verb=get,watch,list -o yaml --dry-run=client # 创建个集群角色叫reader # --resource=deploy,rs,pods:指定的资源deploy,rs,pods # --verb=get,watch,list: # -o yaml:定了命令的输出格式为 YAML,执行文件后才会创建 # --dry-run: 表示这只是一个“演习”,并不会真的执行任何操作。 apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: creationTimestamp: null name: reader rules: - apiGroups: - "" resources: - pods verbs: - get - watch - list - apiGroups: - apps resources: - deployments - replicasets verbs: - get - watch - list [root@master231 ~]# [root@master231 ~]# kubectl create clusterrole reader --resource=deploy,rs,pods --verb=get,watch,list clusterrole.rbac.authorization.k8s.io/reader created [root@master231 ~]# kubectl get clusterrole reader NAME CREATED AT reader 2025-07-18T07:09:19Z [root@master231 ~]# 3.将集群角色绑定给k8s组 [root@master231 ~]# cat /etc/kubernetes/pki/token.csv 01b202.d5c4210389cbff08,yinzhengjie,10001,k8s 497804.9fc391f505052952,jasonyin,10002,k8s 8fd32c.0868709b9e5786a8,linux97,10003,k3s jvt496.ls43vufojf45q73i,weixiang98,10004,k3s qo7azt.y27gu4idn5cunudd,linux99,10005,k3s mic1bd.mx3vohsg05bjk5rr,linux100,10006,k3s [root@master231 ~]# [root@master231 ~]# kubectl create clusterrolebinding k8s-as-reader --clusterrole=reader --group=k8s -o yaml --dry-run=client # 将一个名为 reader 的 ClusterRole(集群角色)授予给一个名为 k8s 的用户组(Group) apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: creationTimestamp: null name: k8s-as-reader roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: reader subjects: - apiGroup: rbac.authorization.k8s.io kind: Group name: k8s [root@master231 ~]# [root@master231 ~]# kubectl create clusterrolebinding k8s-as-reader --clusterrole=reader --group=k8s clusterrolebinding.rbac.authorization.k8s.io/k8s-as-reader created [root@master231 ~]# [root@master231 ~]# kubectl get clusterrolebindings k8s-as-reader NAME ROLE AGE k8s-as-reader ClusterRole/reader 10s [root@master231 ~]# [root@master231 ~]# kubectl get clusterrolebindings k8s-as-reader -o wide NAME ROLE AGE USERS GROUPS SERVICEACCOUNTS k8s-as-reader ClusterRole/reader 25s k8s [root@master231 ~]# 4.基于kubeconfig测试 [root@worker232 ~]# kubectl get deploy,rs,pod -o wide --kubeconfig=./yinzhengjie-k8s.conf --context=gangzi@myk8s NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-xiuxian-sa-secrets-harbor 3/3 3 3 5h39m c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v3 apps=v1 NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-xiuxian-sa-secrets-harbor-66b99db97d 3 3 3 5h39m c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v3 apps=v1,pod-template-hash=66b99db97d NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES 
pod/deploy-xiuxian-sa-secrets-harbor-66b99db97d-7wvz9 1/1 Running 0 5h39m 10.100.1.95 worker232 <none> <none> pod/deploy-xiuxian-sa-secrets-harbor-66b99db97d-8kq29 1/1 Running 0 5h39m 10.100.1.96 worker232 <none> <none> pod/deploy-xiuxian-sa-secrets-harbor-66b99db97d-cqgd6 1/1 Running 0 5h39m 10.100.2.67 worker233 <none> <none> [root@worker232 ~]# [root@worker232 ~]# [root@worker232 ~]# kubectl get deploy,rs,pod -o wide --kubeconfig=./yinzhengjie-k8s.conf NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-xiuxian-sa-secrets-harbor 3/3 3 3 5h39m c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v3 apps=v1 NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR replicaset.apps/deploy-xiuxian-sa-secrets-harbor-66b99db97d 3 3 3 5h39m c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v3 apps=v1,pod-template-hash=66b99db97d NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-xiuxian-sa-secrets-harbor-66b99db97d-7wvz9 1/1 Running 0 5h39m 10.100.1.95 worker232 <none> <none> pod/deploy-xiuxian-sa-secrets-harbor-66b99db97d-8kq29 1/1 Running 0 5h39m 10.100.1.96 worker232 <none> <none> pod/deploy-xiuxian-sa-secrets-harbor-66b99db97d-cqgd6 1/1 Running 0 5h39m 10.100.2.67 worker233 <none> <none> [root@worker232 ~]# 5.基于token测试 [root@worker232 ~]# kubectl --server=https://10.1.24.13:6443 --token=01b202.d5c4210389cbff08 --certificate-authority=/etc/kubernetes/pki/ca.crt get deploy -o wide -A NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR default deploy-xiuxian-sa-secrets-harbor 3/3 3 3 5h41m c1 harbor250.weixiang.com/weixiang-xiuxian/apps:v3 apps=v1 kube-system coredns 2/2 2 2 9d coredns registry.aliyuncs.com/google_containers/coredns:v1.8.6 k8s-app=kube-dns metallb-system controller 1/1 1 1 2d22h controller quay.io/metallb/controller:v0.15.2 app=metallb,component=controller [root@worker232 ~]# [root@worker232 ~]# [root@worker232 ~]# kubectl --server=https://10.1.24.13:6443 --token=jvt496.ls43vufojf45q73i --certificate-authority=/etc/kubernetes/pki/ca.crt get deploy,rs,po -o wide -A # 很明显,weixiang98属于k3s分组,不属于K8S组的,因此无法访问! 
Error from server (Forbidden): deployments.apps is forbidden: User "weixiang98" cannot list resource "deployments" in API group "apps" at the cluster scope Error from server (Forbidden): replicasets.apps is forbidden: User "weixiang98" cannot list resource "replicasets" in API group "apps" at the cluster scope Error from server (Forbidden): pods is forbidden: User "weixiang98" cannot list resource "pods" in API group "" at the cluster scope [root@worker232 ~]# [root@worker232 ~]# 6.更新权限 6.1 资源无法删除 [root@worker232 ~]# kubectl --server=https://10.1.24.13:6443 --token=01b202.d5c4210389cbff08 --certificate-authority=/etc/kubernetes/pki/ca.crt get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-sa-secrets-harbor-66b99db97d-7wvz9 1/1 Running 0 5h42m 10.100.1.95 worker232 <none> <none> deploy-xiuxian-sa-secrets-harbor-66b99db97d-8kq29 1/1 Running 0 5h42m 10.100.1.96 worker232 <none> <none> deploy-xiuxian-sa-secrets-harbor-66b99db97d-cqgd6 1/1 Running 0 5h42m 10.100.2.67 worker233 <none> <none> [root@worker232 ~]# [root@worker232 ~]# kubectl --server=https://10.1.24.13:6443 --token=01b202.d5c4210389cbff08 - certificate-authority=/etc/kubernetes/pki/ca.crt delete pods --all Error from server (Forbidden): pods "deploy-xiuxian-sa-secrets-harbor-66b99db97d-7wvz9" is forbidden: User "yinzhengjie" cannot delete resource "pods" in API group "" in the namespace "default" Error from server (Forbidden): pods "deploy-xiuxian-sa-secrets-harbor-66b99db97d-8kq29" is forbidden: User "yinzhengjie" cannot delete resource "pods" in API group "" in the namespace "default" Error from server (Forbidden): pods "deploy-xiuxian-sa-secrets-harbor-66b99db97d-cqgd6" is forbidden: User "yinzhengjie" cannot delete resource "pods" in API group "" in the namespace "default" [root@worker232 ~]# [root@worker232 ~]# kubectl --server=https://10.1.24.13:6443 --token=01b202.d5c4210389cbff08 --certificate-authority=/etc/kubernetes/pki/ca.crt get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-sa-secrets-harbor-66b99db97d-7wvz9 1/1 Running 0 5h42m 10.100.1.95 worker232 <none> <none> deploy-xiuxian-sa-secrets-harbor-66b99db97d-8kq29 1/1 Running 0 5h42m 10.100.1.96 worker232 <none> <none> deploy-xiuxian-sa-secrets-harbor-66b99db97d-cqgd6 1/1 Running 0 5h42m 10.100.2.67 worker233 <none> <none> [root@worker232 ~]# 6.2 允许删除操作 [root@master231 ~]# kubectl create clusterrole reader --resource=deploy,rs,pods --verb=get,watch,list,delete -o yaml --dry-run=client | kubectl apply -f - Warning: resource clusterroles/reader is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically. 
clusterrole.rbac.authorization.k8s.io/reader configured [root@master231 ~]# 7.验证删除权限是否生效 [root@worker232 ~]# kubectl --server=https://10.1.24.13:6443 --token=01b202.d5c4210389cbff08 --certificate-authority=/etc/kubernetes/pki/ca.crt delete pods --all pod "deploy-xiuxian-sa-secrets-harbor-66b99db97d-7wvz9" deleted pod "deploy-xiuxian-sa-secrets-harbor-66b99db97d-8kq29" deleted pod "deploy-xiuxian-sa-secrets-harbor-66b99db97d-cqgd6" deleted [root@worker232 ~]# [root@worker232 ~]# kubectl --server=https://10.1.24.13:6443 --token=01b202.d5c4210389cbff08 --certificate-authority=/etc/kubernetes/pki/ca.crt get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-sa-secrets-harbor-66b99db97d-8vjcf 1/1 Running 0 4s 10.100.2.69 worker233 <none> <none> deploy-xiuxian-sa-secrets-harbor-66b99db97d-k2qxh 1/1 Running 0 4s 10.100.1.97 worker232 <none> <none> deploy-xiuxian-sa-secrets-harbor-66b99db97d-rbssn 1/1 Running 0 4s 10.100.2.68 worker233 <none> <none> [root@worker232 ~]#
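# 补充: 除了携带token访问,也可以在master231上用身份模拟(impersonation)快速验证组权限是否符合预期,
# 前提是当前身份具备impersonate权限,kubeadm默认的管理员即可。
kubectl auth can-i list deployments --as=yinzhengjie --as-group=k8s   # yinzhengjie属于k8s组,预期返回yes
kubectl auth can-i list deployments --as=weixiang98 --as-group=k3s    # weixiang98属于k3s组,未绑定角色,预期返回no
kubectl get pods --as=yinzhengjie --as-group=k8s                      # 也可以直接以模拟身份执行请求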
7、ClusterRole授权给一个ServiceAccount类型
bash
1.导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Kubernetes/RBAC/python-v3.9.16.tar.gz [root@worker233 ~]# docker load -i python-v3.9.16.tar.gz [root@worker233 ~]# docker tag python:3.9.16-alpine3.16 harbor250.weixiang.com/weixiang-python/python:3.9.16-alpine3.16 [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-python/python:3.9.16-alpine3.16 2.编写资源清单 [root@master231 16-rbac]# cat > weixiang-sa-rbac.yaml <<EOF apiVersion: v1 kind: ServiceAccount metadata: name: oldboy # 服务账户名称 namespace: default # 虽然没有显式指定,但这是默认命名空间 --- # 定义 Deployment 部署配置 apiVersion: apps/v1 kind: Deployment metadata: name: xiuxian # 部署名称 spec: replicas: 1 # 只运行1个副本 selector: matchLabels: app: xiuxian # 选择器标签,用于匹配Pod template: metadata: labels: app: xiuxian # Pod标签,必须与selector匹配 spec: nodeName: worker232 # 强制调度到worker232节点 serviceAccountName: oldboy # 使用上面创建的ServiceAccount containers: - image: harbor250.weixiang.com/weixiang-python/python:3.9.16-alpine3.16 command: # 容器启动命令 - tail - -f - /etc/hosts # 保持容器运行的简单命令 name: apps # 容器名称 --- # 定义 ClusterRole(集群角色),设置权限规则 apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: reader-oldboy # 集群角色名称 rules: # 第一条规则:针对核心API组(空字符串表示核心API组)的资源权限 - apiGroups: - "" # 核心API组(如Pod、Service等) resources: - pods # 对Pod资源的权限 - services # 对Service资源的权限 verbs: # 允许的操作 - get # 查看 - watch # 监听 - list # 列表 - delete # 删除 # 第二条规则:针对apps API组的资源权限 - apiGroups: - apps # 包含Deployment等资源 resources: - deployments # 对Deployment资源的权限 verbs: - get - watch - list - delete --- # 创建 ClusterRoleBinding(集群角色绑定),将角色绑定到ServiceAccount apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: reader-oldboy-bind # 绑定名称 roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole # 绑定到集群角色 name: reader-oldboy # 指定要绑定的ClusterRole名称 subjects: - kind: ServiceAccount # 绑定到服务账户 name: oldboy # 服务账户名称 namespace: default # 服务账户所在的命名空间 EOF 3.创建资源 [root@master231 16-rbac]# kubectl apply -f weixiang-sa-rbac.yaml serviceaccount/oldboy created deployment.apps/xiuxian created clusterrole.rbac.authorization.k8s.io/reader-oldboy created clusterrolebinding.rbac.authorization.k8s.io/reader-oldboy-bind created [root@master231 16-rbac]# [root@master231 16-rbac]# kubectl get pods -o wide -l app=xiuxian NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-77b9d95d56-4dxpr 1/1 Running 0 6s 10.100.1.99 worker232 <none> <none> [root@master231 16-rbac]# [root@master231 16-rbac]# 4.安装依赖包 [root@master231 sa]# kubectl exec -it xiuxian-77b9d95d56-4dxpr -- sh / # / # python -V Python 3.9.16 / # / # pip install kubernetes -i https://pypi.tuna.tsinghua.edu.cn/simple/ ... Successfully installed cachetools-5.5.2 certifi-2025.1.31 charset-normalizer-3.4.1 durationpy-0.9 google-auth-2.38.0 idna-3.10 kubernetes-32.0.1 oauthlib-3.2.2 pyasn1-0.6.1 pyasn1-modules-0.4.2 python-dateutil-2.9.0.post0 pyyaml-6.0.2 requests-2.32.3 requests-oauthlib-2.0.0 rsa-4.9 six-1.17.0 urllib3-2.4.0 websocket-client-1.8.0 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv WARNING: You are using pip version 22.0.4; however, version 25.0.1 is available. You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command. 
/ # 5.编写python脚本 / # cat > view-k8s-resources.py <<EOF from kubernetes import client, config with open('/var/run/secrets/kubernetes.io/serviceaccount/token') as f: token = f.read() configuration = client.Configuration() configuration.host = "https://kubernetes" # APISERVER地址 configuration.ssl_ca_cert="/var/run/secrets/kubernetes.io/serviceaccount/ca.crt" # CA证书 configuration.verify_ssl = True # 启用证书验证 configuration.api_key = {"authorization": "Bearer " + token} # 指定Token字符串 client.Configuration.set_default(configuration) apps_api = client.AppsV1Api() core_api = client.CoreV1Api() try: print("###### Deployment列表 ######") #列出default命名空间所有deployment名称 for dp in apps_api.list_namespaced_deployment("default").items: print(dp.metadata.name) except: print("没有权限访问Deployment资源!") try: #列出default命名空间所有pod名称 print("###### Pod列表 ######") for po in core_api.list_namespaced_pod("default").items: print(po.metadata.name) except: print("没有权限访问Pod资源!") EOF 3.运行python脚本 / # python3 view-k8s-resources.py ###### Deployment列表 ###### xiuxian ###### Pod列表 ###### xiuxian-6dffdd86b-m8f2h / # 4.更新权限 [root@master231 16-rbac]# kubectl get clusterrolebinding reader-oldboy-bind -o wide NAME ROLE AGE USERS GROUPS SERVICEACCOUNTS reader-oldboy-bind ClusterRole/reader-oldboy 7m45s default/oldboy [root@master231 16-rbac]# [root@master231 16-rbac]# [root@master231 16-rbac]# kubectl delete clusterrolebinding reader-oldboy-bind clusterrolebinding.rbac.authorization.k8s.io "reader-oldboy-bind" deleted [root@master231 16-rbac]# [root@master231 16-rbac]# kubectl get clusterrolebinding reader-oldboy-bind -o wide Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io "reader-oldboy-bind" not found [root@master231 16-rbac]# 5.再次测试验证 / # python view-k8s-resources.py ###### Deployment列表 ###### 没有权限访问Deployment资源! ###### Pod列表 ###### 没有权限访问Pod资源! / #
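# 补充: 不依赖python客户端,也可以在Pod内直接用curl携带sa的token访问APIServer验证权限(示意片段)。
# APIServer地址取自Pod内自动注入的环境变量;python:3.9.16-alpine3.16镜像默认可能没有curl,可先apk add curl或改用wget。
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
CA=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
APISERVER=https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}

# 有权限时返回资源列表,无权限时返回403,与上面python脚本的效果一致
curl -s --cacert $CA -H "Authorization: Bearer $TOKEN" $APISERVER/apis/apps/v1/namespaces/default/deployments
curl -s --cacert $CA -H "Authorization: Bearer $TOKEN" $APISERVER/api/v1/namespaces/default/pods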
8、kubectl管理多套K8S集群
bash
1.环境准备 [root@worker233 ~]# rm -f ~/.kube/config [root@worker233 ~]# 2.准备kubeconfig文件 [root@worker233 ~]# cat ~/.kube/config apiVersion: v1 clusters: - cluster: certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURCVENDQWUyZ0F3SUJBZ0lJVkpobTE1cXdRb1l3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TlRBM01UWXdOelF3TkROYUZ3MHpOVEEzTVRRd056UTFORE5hTUJVeApFekFSQmdOVkJBTVRDbXQxWW1WeWJtVjBaWE13Z2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLCkFvSUJBUUMraTl6OHJtL3l6d0N2WGVHUUpqMks4ZkRNbnZ5UDQ3THAwU3B4NWNrdzdBRHowZS81WTAwMW1hR1gKVGYzMk4wd2ZQL3ZXSlJXNDJIWk91QTB3UGpVZHJMVmZ3a2hhcHR2UnEyNE01YmIxSzU1eEtVQ0dOSm9YT2N6bwp5Z0VCR05tY3grMEdXeTBtZ291UmZKS1dFcTEwSmRaOWs4czNBTVYyK1J5ZFZYVHFjbENXZ3FZZWEvazJ5K3RHCjZsdnR4VlFrS2dtcWlWRWVGQTRUWFJEcXhBSVdrSjRBa2lrM1Y1RXBOQm4weWY3UUtpTm15YlZYQ1JxNWpTd3AKdCt6MjZmekRINVJnNXFWcHkwWkdwVkhwOVcwTHR6Y1hiMlJ0a2ZRallLYjZzWVhQSkwySmo0cXRLNGhTalpOcwpici9CT25qeTBDV1dsZElpOXlRTEQ2dWExRkpqQWdNQkFBR2pXVEJYTUE0R0ExVWREd0VCL3dRRUF3SUNwREFQCkJnTlZIUk1CQWY4RUJUQURBUUgvTUIwR0ExVWREZ1FXQkJUSEdkbXZRbDZOTGFqYkdnN2xjeERabW1RSmtEQVYKQmdOVkhSRUVEakFNZ2dwcmRXSmxjbTVsZEdWek1BMEdDU3FHU0liM0RRRUJDd1VBQTRJQkFRQW5Ta3JlQXF1ZQpWR2Q3d1pCVzlDd2psNlFpNFVncXZDWTdjRldMV0w2UWVTMVdkTW1EbXVDeVdWSmpwWWZtMTdqOEJ5cUs0T1htCkc5a2pIbjdHY1Jnb0pPRG9oR3RWemd2d3REZHd4Z2NWbDk3R3VNeWtRS0dEU09ZT05jZVZjYTE2a3g2d3paMU4KNTc3eW1LV3daRExYZDZ3RkV0VTU4RmdjcFJLUmc4cmpmYkpqNnFjcWY3WWRaeEx2N2JuZGdwZjV2ZTh6L3k0Nwo5Qkt3YXZJUkpRRjc2VkN2MnNtbXRLd3ROdmtRUk9TNVpXY3JlMW01dUdVdXdCZkgxcytnWlp2T0JQWmdjR3NkCkVjV2d3V2VnL3VkL0N5Wms1WmVnc0FuVXkrdEJmc21RVHl2UGJxQnhoenpYZi91dDZUQVdYMHZFem5oQUdUWEEKUFZjank0SGgwYTFxCi0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K server: https://10.0.0.66:6443 name: k8s66 - cluster: certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJMU1EY3dPVEF5TkRBME9Wb1hEVE0xTURjd056QXlOREEwT1Zvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTmNIClR5NVFnUFhQRDhZTWpvWHpuRnBIKzljSCtZczhrQ3lZQVhTQThvUnFrdHFTdWMxV3hYdEpLMjIrb1l0WXYyeGYKbEJNYWkwV2ZkaEp1NGtUSUVDNDhiVUNmNHV0R2U3MXNiM1QwelVtUlV6eDJMcW9OZ3p2cm92UTF3ams0MldmWAp5MHgvVkgyYnJaSENjZkx4MHQ1bmExNnZUcllaZjc2Nk1iSTBhbkhSdGVpRVR1eGdHZVA1bEk5ekdGS3RoN2diClJsbFZIL21rV0pZOGdhVmVldWpwUzYwS0tzSmJJNjhodW5QTVN5STlrZTFPN0ROdzJCc3dFczUxV1hCME5HWlIKdmNXZTRjWW51UkJWcFU4ZFFBSDRmam5iZzRWSGZ2ZTIvMDhNR2VnSmJJTEI4MUF0VzkwRjVWWlB6eXRzNUYzUQp5UU8zQjNIU3VVMTdQay8yVWFVQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZEazVYUEVSV1ZNS016YmFTNGZIOEQ2ZU1uZm1NQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBTk5oMk1LMGtyaEtuc1NjamQ2OApkU1g1TUdBWUowVkxVL0lUZmUxdFpJc2U0bXVBR2hmd0lIb1hVczNDMENrSzJLczdFaUcydkU3WWY3UndzTEVICjE3V21ZT1BsZXBjSmdobmFaZjF3enpFQklPcE13Q3cwSGM4WS96dm1wUjVpR0pmbGh3Mk9jcVVvWHQ5QnVySzQKUG90bXY1bkpmMUE5WW05clBHcTh6R29TWFFuMDdwSFVmSkJRaTNucE9zWlFPeEs1RE1KZDdTMko0TUJEeU5MZApSVlRqWFoyU2hrSnh3blBvMi9Fb2RGb3l1SHRnNU96MHNQalBXVjlSeFV1Zkc2N21KYmNZSEU5MmxCQkY5YUUzCjNqekIveVFFV09TSkJEaUcybENWVFFzNnV4WmxFRE9sZFVkNU1pcjJBZDB4N1lxc1hFbnFnS1hlcGd5dWdXaDEKTVBFPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== server: https://10.0.0.231:6443 name: k8s231 contexts: - context: cluster: k8s231 user: master231 name: xixi - context: cluster: k8s66 user: master66 name: haha current-context: xixi kind: Config preferences: {} users: - name: master66 user: client-certificate-data: 
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURJVENDQWdtZ0F3SUJBZ0lJRGZzTUs0Nm1SdUF3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TlRBM01UWXdOelF3TkROYUZ3MHlOakEzTVRZd056UTFORFJhTURReApGekFWQmdOVkJBb1REbk41YzNSbGJUcHRZWE4wWlhKek1Sa3dGd1lEVlFRREV4QnJkV0psY201bGRHVnpMV0ZrCmJXbHVNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQXZwSUZZQVZ3eHh4NlcxblMKcHlLWWk5MUx5RWJsSVhyWmhJV2QxZkUyK0J3VU92cnNvNUl1cjNVc2VwRVVmY0IvUE1hV0Z0aVVRSnlSY0VKUQpiZk9LZ3lVWXBEck5iTHBEaWJXUXlUSU1XQlBVNXhVZk5BdmN2RC9qNE4rczdzdUtWbEpzLzNFazhIZW1pNXYrClI4a2x3dVlvejlacmI1ajM2WjJyckttNDlDMmFHYU9VczUxL0xoeGgyL25BZVFWa2k5a1R2bWhUWVJVRGhONDkKYVdXMWhiUEJ0SktuWjZRUHdrV1R1MHA2cWhIcFp0M1BYODN5T3RJdFArWUhXTXJjN3JYYW94N1d4RjAxbU9EeQo3N01HODlqNVI1VEtuVDdQL3NaQ1NnUmx0QVFWTlVoN1dRcnpJOWhnb2VmRm5Tb1JKL3ZIUXNmbkI4T3B1NXYzCmRuN1Nzd0lEQVFBQm8xWXdWREFPQmdOVkhROEJBZjhFQkFNQ0JhQXdFd1lEVlIwbEJBd3dDZ1lJS3dZQkJRVUgKQXdJd0RBWURWUjBUQVFIL0JBSXdBREFmQmdOVkhTTUVHREFXZ0JUSEdkbXZRbDZOTGFqYkdnN2xjeERabW1RSgprREFOQmdrcWhraUc5dzBCQVFzRkFBT0NBUUVBVWxVMVZCQk1aYWl4S0lBUGhRZEdRdWtFUnlPRGNpZjV3VW9YCmlNSFA4S25oNDFTV2JSV3FtSlhUQW1FN2hlRS93aDBUN2ZwWUhyd29rR1ZrMTh4dWpqL3QxV0d1bkQ3d1lWSXcKbkRBTFE4VDVLYWxrSldQczBTRERhd2pXY2ZaTUNrbkFpQzErMDVaeWZYVnRKdmpXS1RjWkVPSDlzSTdPOFFSdQpVNnBWTldERlFPa2dUbUQ5NzlBUGtzbmVEN1RnZ09ma0oyZGxWQ0JYbkJXeTJXZUFIZ0ZyTzJLSFhFWURDVjF6CnprT3dsNFJsN21HMmtocGRqd1JhTVdrMlBGemxkNncyS2h3UHZLMXBhWktwbXFoaUZmTkdSdUdyWnpMQWhiS00KY1B5N0NFa09oRUl4WDluNzV0RW9DN3habjNaK2F1c1NmKzd0VXBjMUhpYzJzSGx3Tnc9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcFFJQkFBS0NBUUVBdnBJRllBVnd4eHg2VzFuU3B5S1lpOTFMeUVibElYclpoSVdkMWZFMitCd1VPdnJzCm81SXVyM1VzZXBFVWZjQi9QTWFXRnRpVVFKeVJjRUpRYmZPS2d5VVlwRHJOYkxwRGliV1F5VElNV0JQVTV4VWYKTkF2Y3ZEL2o0TitzN3N1S1ZsSnMvM0VrOEhlbWk1ditSOGtsd3VZb3o5WnJiNWozNloycnJLbTQ5QzJhR2FPVQpzNTEvTGh4aDIvbkFlUVZraTlrVHZtaFRZUlVEaE40OWFXVzFoYlBCdEpLblo2UVB3a1dUdTBwNnFoSHBadDNQClg4M3lPdEl0UCtZSFdNcmM3clhhb3g3V3hGMDFtT0R5NzdNRzg5ajVSNVRLblQ3UC9zWkNTZ1JsdEFRVk5VaDcKV1Fyekk5aGdvZWZGblNvUkovdkhRc2ZuQjhPcHU1djNkbjdTc3dJREFRQUJBb0lCQVFDNTczRE5JaUhLTTYveQpSMjV2NGlKaWZKVS9JR1RaU0tySVVUSVJsdHRpTXN0T1RKcnNjV21aaWNMUEI1U3RxVTY3dHFxa09jWlVVQzdiCkQraTNqcUo3bUlzMVdhWXF5b1d3Ni9VTFNRaWdPeUZFSWVpaStGdnpWSWI3Vm1HOVQ5eDJvczkwWUNtQmNjeU8KZExJaHlsRk1teElBengxZDhpc1l4ZHpGaDRkTXZTMk8yMUdFZTdFK1VXSnJxWSt4QWdPT0VCOXBLQldIUFRXYQpmaHBDUlNuZHVXSGhwY3VmQWlTM1dtZTRFMDNiRGd5KzkwU1NJeGlxcy9wYzFKU3dVcDYxdDJmTjFIZ2JsTmJRCmwwTEVVQ0lYSUtmUFA0S09IM1RWb0RLVHh4QlpIZERLK0V0engxZndPY015RnRWSk4yVnIyS2ZJalRkL0dBaVMKNERCY1Z5OEJBb0dCQU00cHFtTmpyWlFjVGY0RnFYNy81d1F1aHFHa0JySGdaRktwTW11TC9TWUZWMmZQR3JGSwpNZXh2Y0VEclJQLzhsVy9SdC9RcXQwdkE1VVVIWUxuZnY2ZjF3V3phendIaTZUT1FzSjd2cTFWZTBtVSt3QUFyCllKU2Fvc2xSQWNCKzVjU04vMDJRTVp6Y1gzRDRROVkwZVdDenFha3M4Z244K0lRSE54cGx3K0NCQW9HQkFPeWoKYlB4N1pCdFlNY04zVlQ0bTRmSEE4N3RCVFZOM3lJZFpyam9YbTBrQno2Q2RmMjlhRG9qTXloZmJzQTBvb05jUgozdk56aTZHYXorZXZZUGpod29LU2ZyM3YyWmNyWlMxWlRnSldTNWV3cGJaVlc2djFpSWNnVkgzTURrUm1pcklwClFkdlByeEtRaWViUHRjMDExWXdzVC9VREN1d3BraUpERDkvNVRKa3pBb0dCQUtJazh2V29kK1I5UG5vRFNnT3cKa0traFRwSGl4OEV1a3JqUWllODc4cVhzblludndUYWUvQlRRc0tEZWFTNU1JZHdJUFgyeit1V0JtTkJwdFJGNwpnT2xBeUJndEg1S0VQSlZwdnYvQjBDY2NwSzBzWlNXOCtBRG9mZytIdnJEL0hRY3lCeEdoenVOb0QwaHllaWdHCnJVLzQvZjhvY2xTWXVYeGRrR0VhbUt3QkFvR0JBTFlEeWl1MkdUenpMR3ZGMlJwU1BBNkF0TGgrdFNTVGlQbWEKNEdrc2lUT3hvZXRQMlpwanNiUkZtTmliRVNJOTh1RWFqTnZETDA2aFRuTk5zWkFkemtneXVDd09WZFp4K1lQVgpJaXlvQmNMcWk3dmdrZitGZjNzNFFlVDAxTENHRXY2UXYyaDhxWlBrK3owKzNQNjMvc2F2dXR5aGQ0QkpSVzczCkdEblZmcmFEQW9HQVBtdXVBUjAzWll2VEZxeVZzdUNQWWRza3RJOC8rQ
ThuUVcvSktrUlJoaTR5c1FPL3grVGsKdDZESjdad2FJTXpybFFIL2JxRnZGdlR4cmQzWVc3bzhNRkplaGxHaUpWckEvS1ZWNFZiT0FzWVpiSGRzSHE0cQptdTg3Rmc1TWwxa0x4V2NqNmpZZVJlcmRjTThDQ3doU0phR0duWjVTZW9LLzExb1BYNkZvbVhJPQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo= - name: master231 user: client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURJVENDQWdtZ0F3SUJBZ0lJQ1g5TDltb3BUakl3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TlRBM01Ea3dNalF3TkRsYUZ3MHlOakEzTURrd01qUXdOVEJhTURReApGekFWQmdOVkJBb1REbk41YzNSbGJUcHRZWE4wWlhKek1Sa3dGd1lEVlFRREV4QnJkV0psY201bGRHVnpMV0ZrCmJXbHVNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQXI3UmlJVVMrbExLOVlDZFQKams0Qm5wS1lsenc3aW9RTzhwbTR1QVFrZ1hyNktmQnpSdnliZHpSdHBaUmxwTUVYNXFhNUxLQy9VMjIwbERDQQpqcllVbS94NEI4ME1wV0k0Q3dOSXp6VG5tWXZDelhSZnNtcFErdHpsTUlic21HMTZpTGtxeSt6YUlIeGplRzhKClN1ci9qL2dUVGtxckFManV3d0xITjJCTUwwZ0M1NHlKUWQ2L3RrTmtNZU14dG1LcWFSOFgyM2NwaE9kenBTMSsKZXIrOTdOMmpUT2FzeWoybXg0WEtWb2dhMnkwdHpWV3lrVVhTNldkQ2dqMkRUd1VIRVdJUHhoaU50TVdSMC9jLwpJbUJjMnRIcHovMnZ4WHpkNUJpRG9RVnE3UEdWRGw5SnhDSlBhQ1ZrWlBrUy9zaVhROFBsVHlRcWo2ZzBMTWYyClM2ZHFyUUlEQVFBQm8xWXdWREFPQmdOVkhROEJBZjhFQkFNQ0JhQXdFd1lEVlIwbEJBd3dDZ1lJS3dZQkJRVUgKQXdJd0RBWURWUjBUQVFIL0JBSXdBREFmQmdOVkhTTUVHREFXZ0JRNU9WenhFVmxUQ2pNMjJrdUh4L0ErbmpKMwo1akFOQmdrcWhraUc5dzBCQVFzRkFBT0NBUUVBbkhaN2NCWnBaRkVWaGN0U09LSm5oRkNmcHBTeHQxT014M2ZDCllWS0lkeUl5VzdMMzBDNlM1eStyK3J6S1prUExEQkYySC9KMUs4U2lQNDRYSDhwOVVKT0lOREtmUWEvaFpxbFQKUHpMMWJVekF0N0RUN3FZZlovbWY5elpoV0p3a20zRDR5L0ptTHVudkpNR0R6WE93NXRMc0J3NVpacm96djdCKwo5LzRYVDEyWnp6WXU1ay84M280cjcwWjdOTWpOeGl0RlFJNWduMWVvREJoZlQ5SGhIMURRK1QvanU2ZGpkdSsvCmZPWG0zSXE1UmRXM2ZyRVFldks3R2VBTFhHQmdSSnBaZnVIb1J4ZkY4SnJWVGY5NWgxeTZ5UG41UnBwVXFndWkKV01qRE41ZkhmK2RwRTVQVGdRREVsbHRMWXhzWVFWUWsyYkUwNEdoKzRBQlJwZHZ3aXc9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== client-key-data: 
LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb2dJQkFBS0NBUUVBcjdSaUlVUytsTEs5WUNkVGprNEJucEtZbHp3N2lvUU84cG00dUFRa2dYcjZLZkJ6ClJ2eWJkelJ0cFpSbHBNRVg1cWE1TEtDL1UyMjBsRENBanJZVW0veDRCODBNcFdJNEN3Tkl6elRubVl2Q3pYUmYKc21wUSt0emxNSWJzbUcxNmlMa3F5K3phSUh4amVHOEpTdXIvai9nVFRrcXJBTGp1d3dMSE4yQk1MMGdDNTR5SgpRZDYvdGtOa01lTXh0bUtxYVI4WDIzY3BoT2R6cFMxK2VyKzk3TjJqVE9hc3lqMm14NFhLVm9nYTJ5MHR6Vld5CmtVWFM2V2RDZ2oyRFR3VUhFV0lQeGhpTnRNV1IwL2MvSW1CYzJ0SHB6LzJ2eFh6ZDVCaURvUVZxN1BHVkRsOUoKeENKUGFDVmtaUGtTL3NpWFE4UGxUeVFxajZnMExNZjJTNmRxclFJREFRQUJBb0lCQUdUbm5mZG5LRi96S2tpLwpMUUFya2tpRC9mZlVrb2toQUFpWStYbE1mNzZRWm55UloxS2NIWmhlMXAxaDFPSENOcnl5Z09POVNVTmJYSHBDClJJSXcxVE1qMGJjQkxrTUlYaEVOQ05KZW1xY1dtWUF1VmdyN2ZaZ2tKY2N4cFV5QWl2OWIvOVR2Si9IQ0hjbjcKVW9YZzRYWEh6U2FJUVI2QUZqYU9NT1IvZkNFbU1TNGU2UzhtUFYyb2N6dlJPTzcxYmtIYm5NM1A3UHVsVUVPawpVenpCZXphMjlmTU1xclIvUDNkNHNCb2tLQ2V4UnMwQU1NNXN4VHRnaGJRTWZCQnB0OU12QWp4SGZubm5jSHpsCm9LZG1SZ3pqQUFuYTAwNkMxMFZWSGtIWDZYNmJEKzE4NkFyNzVNM0JsdCtDb2JMY1doYndXc092UVJyUlJMemEKQVRSRmpTRUNnWUVBek9MVDJ4ajBIV1QyanVZTkdmbG4yZWZZV2gwUmJzNU9LNVVXNkZDNlRRbmJPSVM0ZDZxOQpYcmt4RGFlM04xbmdCUzlaZ2NjOEx6UWtLU0dWSE1nS3VjNGw4RnpGWDE3MWUwbXpQN09vTmM3U01FZ3hlREV2ClNLdGx6T0d3SGsxNThudVRJcHVtUEZGQUZzcWJDbUNpdWhtVnB6NWRaZnBJTVhaUkRudTlxYVVDZ1lFQTI0bmQKeWZqWVg5MVJHVUlYem9rcTlvdXJud21XSldWWldoRkxnV3NGZGluajFxTm4yc1g5VUE2TU1EZytzN2R2SXg2UApWNmhMOWFSSlZEeVNxVDg2RXZPS0FlajB3TWFQY3Awdi95ZEh4UlJHd2JHUmtGNHhqNWFnNk5SV0NCZFJnQnlXCi9qNzNFUVJyVjU4VDJnUGg0VTRhMVFMZDJSQXpJT0RFU0ZrMW5ta0NnWUJ5bHhLQ1ljeDJmRGRoNk15L0VEekQKSk9aZVVBK2w5NERFNDFleWl5UUhYbEhicEc4L2pxRG5UNUJkNE1XYUVZdzNtaW5uYWJVQmVab1gzdzUwMEhVZgpRbXI1cWdsQnMreDhEZFpROUh4SnkrakcxRG5HelV0eXkxbmVZd09MankxN0x4NDFwdlFzbkF6S01uclFMUWdXCkthVUhxdHUxNDJ0cExwRmJGbDRYZVFLQmdHbURaMjlOQkdGK3N4MmFvR3FKam5hVVJsWFhlNnhaZTRwSVNhdlgKemZZdXgrdysrUWt5b3o2NDN6UEZ0STBYbW5pY2xYUWgxUEFvbDMyKzV4WWs1enA0aGxuSXB1bUlCU1dtMm95ZApTbWMwQ1pYS1RCWEF6NzBkUGhUcENMZzJ6TnJ2NHJvcmRQOWV5bUNBZWtBTUlhSHhzZit5c3dGQ1FmQ0pWbzBYCkl5Z1JBb0dBSGNzOGtUc05rdVd3bkxTVjJpdWxuUEtoUm5LaklYRGdoSi8vb20xMktxUnA5MFBTeUVxV0F3MloKRTFadXQyUGJBN2dhSFp3RTNPT1JhdmhvUWlNdmxUcXl3bXRaZzc2Q3I1ZHBVbkQ3T3BDQkhtOGs3aEE3RWpEcApES0RHRmdSRFJmdVlLcGxWTHg4dXUxZ2ZIcEQ2cFBQOTh3cXpYNUFSS01YOTM0Y2dlMlk9Ci0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg== [root@worker233 ~]# [root@worker233 ~]# [root@worker233 ~]# kubectl config view apiVersion: v1 clusters: - cluster: certificate-authority-data: DATA+OMITTED server: https://10.0.0.231:6443 name: k8s231 - cluster: certificate-authority-data: DATA+OMITTED server: https://10.0.0.66:6443 name: k8s66 contexts: - context: cluster: k8s66 user: master66 name: haha - context: cluster: k8s231 user: master231 name: xixi current-context: xixi kind: Config preferences: {} users: - name: master231 user: client-certificate-data: REDACTED client-key-data: REDACTED - name: master66 user: client-certificate-data: REDACTED client-key-data: REDACTED [root@worker233 ~]# 3.测试验证 [root@worker233 ~]# kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME master231 Ready control-plane,master 9d v1.23.17 10.0.0.231 <none> Ubuntu 22.04.4 LTS 5.15.0-143-generic docker://20.10.24 worker232 Ready <none> 9d v1.23.17 10.0.0.232 <none> Ubuntu 22.04.4 LTS 5.15.0-143-generic docker://20.10.24 worker233 Ready <none> 9d v1.23.17 10.0.0.233 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic docker://20.10.24 [root@worker233 ~]# [root@worker233 ~]# [root@worker233 ~]# kubectl get nodes -o wide --context=haha NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE 
KERNEL-VERSION CONTAINER-RUNTIME k8s66 Ready control-plane,worker 2d v1.28.12 10.0.0.66 <none> Ubuntu 22.04.4 LTS 5.15.0-143-generic containerd://1.7.13 k8s77 Ready worker 2d v1.28.12 10.0.0.77 <none> Ubuntu 22.04.4 LTS 5.15.0-143-generic containerd://1.7.13 k8s88 Ready control-plane,worker 2d v1.28.12 10.0.0.88 <none> Ubuntu 22.04.4 LTS 5.15.0-143-generic containerd://1.7.13 [root@worker233 ~]# [root@worker233 ~]# kubectl --context=haha get ns NAME STATUS AGE default Active 2d kube-node-lease Active 2d kube-public Active 2d kube-system Active 2d kubekey-system Active 2d kubesphere-controls-system Active 2d kubesphere-system Active 2d weixiang Active 2d [root@worker233 ~]#

18、扩容

bash
- metrics-server环境部署及故障排查案例 1.什么metrics-s erver metrics-server为K8S集群的"kubectl top"命令提供数据监控,也提供了"HPA(HorizontalPodAutoscaler)"的使用。 同时,还为dashboard提供可视化的操作节点。 [root@master231 ~]# kubectl top pods error: Metrics API not available [root@master231 ~]# [root@master231 ~]# kubectl top nodes error: Metrics API not available [root@master231 ~]# 彩蛋: hpa和vpa的区别? - hpa:(水平扩容) 表示Pod数量资源不足时,可以自动增加Pod副本数量,以抵抗流量过多的情况,降低负载。 - vpa: (垂直扩容) 表示可以动态调整容器的资源上限,比如一个Pod一开始是200Mi内存,如果资源达到定义的阈值,就可以扩展内存,但不会增加pod副本数量。 典型的区别在于vpa具有一定的资源上限问题,因为pod是K8S集群调度的最小单元,不可拆分,因此这个将来扩容时,取决于单节点的资源上限。 部署文档 https://github.com/kubernetes-sigs/metrics-server 彩蛋: metrics-server组件本质上是从kubelet组件获取监控数据 [root@master231 pki]# pwd /etc/kubernetes/pki [root@master231 pki]# [root@master231 pki]# ll apiserver-kubelet-client.* -rw-r--r-- 1 root root 1164 Apr 7 11:00 apiserver-kubelet-client.crt -rw------- 1 root root 1679 Apr 7 11:00 apiserver-kubelet-client.key [root@master231 pki]# [root@master231 pki]# curl -s -k --key apiserver-kubelet-client.key --cert apiserver-kubelet-client.crt https://10.0.0.231:10250/metrics/resource | wc -l 102 [root@master231 pki]# [root@master231 pki]# curl -s -k --key apiserver-kubelet-client.key --cert apiserver-kubelet-client.crt https://10.0.0.232:10250/metrics/resource | wc -l 67 [root@master231 pki]# [root@master231 pki]# curl -s -k --key apiserver-kubelet-client.key --cert apiserver-kubelet-client.crt https://10.0.0.233:10250/metrics/resource | wc -l 57 [root@master231 pki]# 2.部署metrics-server组件 2.1 下载资源清单 [root@master231 ~]# wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/high-availability-1.21+.yaml SVIP: [root@master231 ~]# wget http://192.168.21.253/Resources/Kubernetes/Add-ons/metrics-server/0.7.x/high-availability-1.21%2B.yaml 2.2 编辑配置文件 [root@master231 ~]# vim high-availability-1.21+.yaml ... 114 apiVersion: apps/v1 115 kind: Deployment 116 metadata: ... 144 - args: 145 - --kubelet-insecure-tls # 不要验证Kubelets提供的服务证书的CA。不配置则会报错x509。 ... ... 
image: registry.aliyuncs.com/google_containers/metrics-server:v0.7.2 2.3 部署metrics-server组件 [root@master231 ~]# kubectl apply -f high-availability-1.21+.yaml serviceaccount/metrics-server created clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created clusterrole.rbac.authorization.k8s.io/system:metrics-server created rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created service/metrics-server created deployment.apps/metrics-server created poddisruptionbudget.policy/metrics-server created apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created [root@master231 ~]# 镜像下载地址: http://192.168.21.253/Resources/Kubernetes/Add-ons/metrics-server/0.7.x/ 2.4 查看镜像是否部署成功 [root@master231 02-metrics-server]# kubectl get pods,svc -n kube-system -l k8s-app=metrics-server -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/metrics-server-57c6f647bb-6n74n 1/1 Running 0 71s 10.100.1.102 worker232 <none> <none> pod/metrics-server-57c6f647bb-f5t5j 1/1 Running 0 71s 10.100.2.73 worker233 <none> <none> NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/metrics-server ClusterIP 10.200.24.39 <none> 443/TCP 71s k8s-app=metrics-server [root@master231 02-metrics-server]# [root@master231 02-metrics-server]# kubectl -n kube-system describe svc metrics-server Name: metrics-server Namespace: kube-system Labels: k8s-app=metrics-server Annotations: <none> Selector: k8s-app=metrics-server Type: ClusterIP IP Family Policy: SingleStack IP Families: IPv4 IP: 10.200.24.39 IPs: 10.200.24.39 Port: https 443/TCP TargetPort: https/TCP Endpoints: 10.100.1.102:10250,10.100.2.73:10250 Session Affinity: None Events: <none> [root@master231 02-metrics-server]# 2.5 验证metrics组件是否正常工作 [root@master231 02-metrics-server]# kubectl top node NAME CPU(cores) CPU% MEMORY(bytes) MEMORY% master231 73m 3% 1331Mi 17% worker232 31m 1% 831Mi 10% worker233 30m 1% 912Mi 11% [root@master231 02-metrics-server]# [root@master231 02-metrics-server]# kubectl top pods -A NAMESPACE NAME CPU(cores) MEMORY(bytes) default xiuxian-77b9d95d56-4dxpr 1m 0Mi kube-flannel kube-flannel-ds-5hbns 5m 12Mi kube-flannel kube-flannel-ds-dzffl 4m 12Mi kube-flannel kube-flannel-ds-h5kwh 5m 12Mi kube-system coredns-6d8c4cb4d-k52qr 1m 11Mi kube-system coredns-6d8c4cb4d-rvzd9 1m 11Mi kube-system etcd-master231 9m 108Mi kube-system kube-apiserver-master231 29m 216Mi kube-system kube-controller-manager-master231 13m 53Mi kube-system kube-proxy-g5sfd 5m 17Mi kube-system kube-proxy-m6mxj 5m 17Mi kube-system kube-proxy-q5bqh 5m 17Mi kube-system kube-scheduler-master231 2m 16Mi kube-system metrics-server-57c6f647bb-6n74n 2m 14Mi kube-system metrics-server-57c6f647bb-f5t5j 2m 14Mi metallb-system controller-644c958987-57rzz 1m 14Mi metallb-system speaker-4t849 3m 15Mi metallb-system speaker-gnh9v 3m 16Mi metallb-system speaker-kp99v 3m 15Mi [root@master231 02-metrics-server]# [root@master231 02-metrics-server]# kubectl top pods NAME CPU(cores) MEMORY(bytes) xiuxian-77b9d95d56-4dxpr 1m 0Mi [root@master231 02-metrics-server]# - 水平Pod伸缩hpa实战 1.什么是hpa hpa是k8s集群内置的资源,全称为"HorizontalPodAutoscaler"。 可以自动实现Pod水平伸缩,说白了,在业务高峰期可以自动扩容Pod副本数量,在集群的低谷期,可以自动缩容Pod副本数量。 2.hpa 2.1 导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Kubernetes/Add-ons/metrics-server/weixiang-linux-tools-v0.1-stress.tar.gz [root@worker233 ~]# docker load -i 
weixiang-linux-tools-v0.1-stress.tar.gz [root@worker233 ~]# docker tag jasonyin2020/weixiang-linux-tools:v0.1 harbor250.weixiang.com/weixiang-test/stress:v0.1 [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-test/stress:v0.1 2.2 编写资源清单 [root@master231 17-HorizontalPodAutoscaler]# cat > 01-deploy-hpa.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-stress spec: replicas: 1 selector: matchLabels: app: stress template: metadata: labels: app: stress spec: containers: # - image: jasonyin2020/weixiang-linux-tools:v0.1 - image: harbor250.weixiang.com/weixiang-test/stress:v0.1 name: weixiang-linux-tools args: - tail - -f - /etc/hosts # 配置容器的资源限制 resources: # 容器的期望资源 requests: # CPU限制,1 =1000m cpu: 0.2 # 内存限制 memory: 300Mi # 容器的资源上限 limits: cpu: 0.5 memory: 500Mi --- apiVersion: autoscaling/v1 kind: HorizontalPodAutoscaler metadata: name: stress-hpa spec: maxReplicas: 5 minReplicas: 2 scaleTargetRef: apiVersion: apps/v1 kind: Deployment name: deploy-stress targetCPUUtilizationPercentage: 95 EOF 2.3 创建资源 [root@master231 17-HorizontalPodAutoscaler]# kubectl apply -f 01-deploy-hpa.yaml deployment.apps/deploy-stress created horizontalpodautoscaler.autoscaling/stress-hpa created [root@master231 17-HorizontalPodAutoscaler]# 2.4 测试验证 [root@master231 17-HorizontalPodAutoscaler]# kubectl get deploy,hpa,po -o wide # 第一次查看发现Pod副本数量只有1个 NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-stress 2/2 2 2 44s weixiang-linux-tools harbor250.weixiang.com/weixiang-test/stress:v0.1 app=stress NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE horizontalpodautoscaler.autoscaling/stress-hpa Deployment/deploy-stress <unknown>/95% 2 5 2 44s NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-stress-74688d4f56-g4zcc 1/1 Running 0 44s 10.100.1.103 worker232 <none> <none> pod/deploy-stress-74688d4f56-kz92b 1/1 Running 0 29s 10.100.2.74 worker233 <none> <none> [root@master231 17-HorizontalPodAutoscaler]# [root@master231 17-HorizontalPodAutoscaler]# 彩蛋:(响应式创建hpa) [root@master231 horizontalpodautoscalers]# kubectl autoscale deploy deploy-stress --min=2 --max=5 --cpu-percent=95 -o yaml --dry-run=client 2.5 压力测试 [root@master231 ~]# kubectl exec deploy-stress-74688d4f56-g4zcc -- stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 10m stress: info: [7] dispatching hogs: 8 cpu, 4 io, 2 vm, 0 hdd 2.6 查看Pod副本数量 [root@master231 17-HorizontalPodAutoscaler]# kubectl get deploy,hpa,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-stress 3/3 3 3 2m47s weixiang-linux-tools harbor250.weixiang.com/weixiang-test/stress:v0.1 app=stress NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE horizontalpodautoscaler.autoscaling/stress-hpa Deployment/deploy-stress 125%/95% 2 5 3 2m47s NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-stress-74688d4f56-5rpqm 1/1 Running 0 17s 10.100.1.104 worker232 <none> <none> pod/deploy-stress-74688d4f56-g4zcc 1/1 Running 0 2m47s 10.100.1.103 worker232 <none> <none> pod/deploy-stress-74688d4f56-kz92b 1/1 Running 0 2m32s 10.100.2.74 worker233 <none> <none> [root@master231 17-HorizontalPodAutoscaler]# [root@master231 17-HorizontalPodAutoscaler]# 2.7 再次压测 [root@master231 ~]# kubectl exec deploy-stress-74688d4f56-5rpqm -- stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 10m stress: info: [6] dispatching hogs: 8 cpu, 4 io, 2 vm, 0 hdd [root@master231 ~]# kubectl exec deploy-stress-74688d4f56-kz92b -- stress --cpu 8 --io 4 --vm 2 
--vm-bytes 128M --timeout 10m stress: info: [7] dispatching hogs: 8 cpu, 4 io, 2 vm, 0 hdd 2.6 发现最多有5个Pod创建 [root@master231 17-HorizontalPodAutoscaler]# kubectl get deploy,hpa,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-stress 5/5 5 5 4m28s weixiang-linux-tools harbor250.weixiang.com/weixiang-test/stress:v0.1 app=stress NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE horizontalpodautoscaler.autoscaling/stress-hpa Deployment/deploy-stress 245%/95% 2 5 5 4m28s NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-stress-74688d4f56-4q775 1/1 Running 0 28s 10.100.2.75 worker233 <none> <none> pod/deploy-stress-74688d4f56-5rpqm 1/1 Running 0 118s 10.100.1.104 worker232 <none> <none> pod/deploy-stress-74688d4f56-9ndkh 1/1 Running 0 28s 10.100.1.105 worker232 <none> <none> pod/deploy-stress-74688d4f56-g4zcc 1/1 Running 0 4m28s 10.100.1.103 worker232 <none> <none> pod/deploy-stress-74688d4f56-kz92b 1/1 Running 0 4m13s 10.100.2.74 worker233 <none> <none> [root@master231 17-HorizontalPodAutoscaler]# 2.7 取消压测后【参考笔记】 需要等待10min会自动缩容Pod数量到2个。 [root@master231 horizontalpodautoscalers]# kubectl get deploy,hpa,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-stress 2/2 2 2 20m weixiang-linux-tools harbor250.weixiang.com/weixiang-casedemo/weixiang-linux-tools:v0.1 app=stress NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE horizontalpodautoscaler.autoscaling/stress-hpa Deployment/deploy-stress 0%/95% 2 5 5 20m NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-stress-5d7c796c97-dnlzj 1/1 Running 0 15m 10.100.203.180 worker232 <none> <none> pod/deploy-stress-5d7c796c97-f9rff 1/1 Running 0 20m 10.100.203.150 worker232 <none> <none> pod/deploy-stress-5d7c796c97-ld8s9 1/1 Terminating 0 14m 10.100.140.123 worker233 <none> <none> pod/deploy-stress-5d7c796c97-rzgsm 1/1 Terminating 0 20m 10.100.140.121 worker233 <none> <none> pod/deploy-stress-5d7c796c97-zxgp6 1/1 Terminating 0 16m 10.100.140.122 worker233 <none> <none> [root@master231 horizontalpodautoscalers]# [root@master231 horizontalpodautoscalers]# [root@master231 horizontalpodautoscalers]# kubectl get deploy,hpa,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-stress 2/2 2 2 21m weixiang-linux-tools harbor250.weixiang.com/weixiang-casedemo/weixiang-linux-tools:v0.1 app=stress NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE horizontalpodautoscaler.autoscaling/stress-hpa Deployment/deploy-stress 0%/95% 2 5 2 21m NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/deploy-stress-5d7c796c97-dnlzj 1/1 Running 0 16m 10.100.203.180 worker232 <none> <none> pod/deploy-stress-5d7c796c97-f9rff 1/1 Running 0 21m 10.100.203.150 worker232 <none> <none> [root@master231 horizontalpodautoscalers]#
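The HPA above uses the legacy autoscaling/v1 API, which can only target CPU utilization. As a hedged sketch (the object name is illustrative, the target reuses the deploy-stress Deployment from this section, and autoscaling/v2 is GA on Kubernetes 1.23+), the same autoscaler expressed in autoscaling/v2 can also react to memory pressure:

bash
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stress-hpa-v2        # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deploy-stress      # Deployment created earlier in this section
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 95
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80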

19、管理多套集群的方式

bash
# 1.kubectl + kubeconfig
  原生支持:直接使用 kubectl 命令,通过切换 kubeconfig 文件管理不同集群。
  轻量级:无需额外组件,适合熟悉命令行的高级用户。
# 2.kubesphere(企业级全栈平台)
  可视化操作:提供完整的 Web 控制台。
  多租户管理:支持基于角色的多集群权限控制。
# 3.Kubernetes Dashboard(官方 UI)
  多集群支持:需手动切换 kubeconfig 或为每个集群单独部署 Dashboard。
# 4.kuboard(国产轻量级)
  中文友好:界面和文档对中文用户友好。
  多集群支持:支持通过一个界面管理多个集群。
  功能全面:提供监控、日志、存储管理等功能。
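To complement option 1, the commands below are the stock kubectl sub-commands for working with several clusters from one shell; the context names xixi and haha come from the kubeconfig shown in the previous section, while the extra kubeconfig path is purely illustrative:

bash
# list every context defined in the active kubeconfig
kubectl config get-contexts
# switch the default context
kubectl config use-context haha
# target another cluster for a single command without switching the default
kubectl get nodes --context=xixi
# merge several kubeconfig files for the current shell (second path is illustrative)
export KUBECONFIG=$HOME/.kube/config:/tmp/other-cluster.conf
kubectl config view --flatten > /tmp/merged.conf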
1、Dashboard可视化管理K8S
bash
- 部署附加组件Dashboard可视化管理K8S 1.什么是dashboard极速入门 dashboard是一款图形化管理K8S集群的图形化解决方案。 参考链接: https://github.com/kubernetes/dashboard/releases?page=9 2.Dashboard部署实战案例 2.1 下载资源清单 [root@master231 03-dashboard]# wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.1/aio/deploy/recommended.yaml SVIP: [root@master231 03-dashboard]# wget http://192.168.21.253/Resources/Kubernetes/Add-ons/dashboard/01-dashboard.yaml 2.2 导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Kubernetes/Add-ons/dashboard/weixiang-dashboard-v2.5.1.tar.gz [root@worker233 ~]# docker load -i weixiang-dashboard-v2.5.1.tar.gz [root@worker233 ~]# docker tag harbor.weixiang.com/weixiang-add-ons/dashboard:v2.5.1 harbor250.weixiang.com/weixiang-add-ons/dashboard:v2.5.1 [root@worker233 ~]# [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-add-ons/dashboard:v2.5.1 [root@worker233 ~]# wget http://192.168.21.253/Resources/Kubernetes/Add-ons/dashboard/weixiang-metrics-scraper-v1.0.7.tar.gz [root@worker233 ~]# docker load < weixiang-metrics-scraper-v1.0.7.tar.gz [root@worker233 ~]# docker tag harbor.weixiang.com/weixiang-add-ons/metrics-scraper:v1.0.7 harbor250.weixiang.com/weixiang-add-ons/metrics-scraper:v1.0.7 [root@worker233 ~]# [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-add-ons/metrics-scraper:v1.0.7 2.3.修改资源清单 [root@master231 03-dashboard]# vim 01-dashboard.yaml ... - 1.将8443的svc的类型改为LoadBalancer; - 2.修改2个镜像的名称即可;

image

bash
2.4.部署服务 [root@master231 03-dashboard]# kubectl apply -f 01-dashboard.yaml namespace/kubernetes-dashboard created serviceaccount/kubernetes-dashboard created service/kubernetes-dashboard created secret/kubernetes-dashboard-certs created secret/kubernetes-dashboard-csrf created secret/kubernetes-dashboard-key-holder created configmap/kubernetes-dashboard-settings created role.rbac.authorization.k8s.io/kubernetes-dashboard created clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created deployment.apps/kubernetes-dashboard created service/dashboard-metrics-scraper created deployment.apps/dashboard-metrics-scraper created [root@master231 03-dashboard]# 2.5.查看资源 [root@master231 ~/count/03-dashboard]#kubectl get pods,svc -n kubernetes-dashboard -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/dashboard-metrics-scraper-9d986c98c-pp8zs 1/1 Running 0 2m11s 10.100.2.143 worker233 <none> <none> pod/kubernetes-dashboard-5ccf77bb87-mb9jj 1/1 Running 0 2m11s 10.100.0.15 master231 <none> <none> NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/dashboard-metrics-scraper ClusterIP 10.200.208.135 <none> 8000/TCP 2m11s k8s-app=dashboard-metrics-scraper service/kubernetes-dashboard LoadBalancer 10.200.180.79 43.139.47.66 443:8443/TCP 2m11s k8s-app=kubernetes-dashboard # 这里有个问题,配置的MetalLB IP 池已耗尽,当时是配置了1个公网ip,导致EXTERNAL-IP一直是pending, # 编辑kubectl edit ipaddresspools -n metallb-system jasonyin2020,添加其他公网ip,完成

image

bash
2.6.访问Dashboard https://43.139.47.66:8443/#/login 输入神秘代码: "thisisunsafe". 可以看到以下界面

image

bash
3.基于token登录实战 3.1 创建sa [root@master231 03-dashboard]# kubectl create serviceaccount weixiang98 -o yaml --dry-run=client > 02-sa.yaml [root@master231 03-dashboard]# [root@master231 03-dashboard]# vim 02-sa.yaml [root@master231 03-dashboard]# [root@master231 03-dashboard]# cat 02-sa.yaml apiVersion: v1 kind: ServiceAccount metadata: name: weixiang98 [root@master231 03-dashboard]# [root@master231 03-dashboard]# kubectl apply -f 02-sa.yaml serviceaccount/weixiang98 created [root@master231 03-dashboard]# [root@master231 03-dashboard]# kubectl get sa weixiang98 NAME SECRETS AGE weixiang98 1 10s [root@master231 03-dashboard]# 3.2 将sa和内置集群角色绑定 [root@master231 03-dashboard]# kubectl create clusterrolebinding dashboard-weixiang98 --clusterrole=cluster-admin --serviceaccount=default:weixiang98 -o yaml --dry-run=client > 03-clusterrolebinding-sa.yaml [root@master231 03-dashboard]# [root@master231 03-dashboard]# vim 03-clusterrolebinding-sa.yaml [root@master231 03-dashboard]# [root@master231 03-dashboard]# cat 03-clusterrolebinding-sa.yaml apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: dashboard-weixiang98 roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: cluster-admin subjects: - kind: ServiceAccount name: weixiang98 namespace: default [root@master231 03-dashboard]# [root@master231 03-dashboard]# kubectl apply -f 03-clusterrolebinding-sa.yaml clusterrolebinding.rbac.authorization.k8s.io/dashboard-weixiang98 created [root@master231 03-dashboard]# [root@master231 03-dashboard]# kubectl get clusterrolebinding dashboard-weixiang98 -o wide NAME ROLE AGE USERS GROUPS SERVICEACCOUNTS dashboard-weixiang98 ClusterRole/cluster-admin 41s default/weixiang98 [root@master231 03-dashboard]# 3.3 浏览器使用token登录【注意,你的token和我的不一样】 [root@master231 03-dashboard]# kubectl get secrets `kubectl get sa weixiang98 -o jsonpath='{.secrets[0].name}'` -o jsonpath='{.data.token}' |base64 -d ; echo eyJhbGciOiJSUzI1NiIsImtpZCI6Ikw0aG01SjlnR1RwS1EwRmpNcE5lQ3BjU1k3LW1ULXlXLVM3U0hfXzRwTTgifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkZWZhdWx0Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImxpbnV4OTgtdG9rZW4tOWc3NzYiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoibGludXg5OCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjdjMmIyYTY3LWZhM2EtNDY1MC1hMWI5LTkwNDIwNzA1MmM0OCIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpkZWZhdWx0OmxpbnV4OTgifQ.Y2QmV6xLoX9hEZg16aIjXsBuTvjchdwhOlwmKc9PdBMPkNj6x9xCHNstLn8NstTJ46cXoKEeah_2EaINsOSoLxHHvDJw1qlvwKmNjz_6desCTh9A-XQXGu4v8U1lvJ83wqO09zMtRfc5-4OvkoEkm9Po76qfFBgouF9fvFNVnCG4WMQIqwEphGh3f51oGRQgFA8tOiINsWz0Gx0iqbXo5lct-Jdlj2dRk0-3t85pEk0BTpLCMmKoIw21F0WcsNQRx7u7-sgASRsK_I_Bmxsw5SsUiwuKKnpNrLrAIQkWhk_E6YqONcJvgB9NhvgWKB94-tg76-_-cjeqmxJkj1L16g [root@master231 03-dashboard]#
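The token extraction above depends on the ServiceAccount token Secret that Kubernetes 1.23 still creates automatically; from 1.24 onward that Secret is no longer generated. On newer clusters a short-lived token can be requested instead (sketch, reusing the weixiang98 ServiceAccount created above):

bash
# Kubernetes >= 1.24: issue a time-limited token for the ServiceAccount
kubectl create token weixiang98 --duration=24h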

image


image

image


使用Kubeconfig授权登录实战

bash
4.使用Kubeconfig授权登录实战(彩蛋) 4.1.创建kubeconfig文件 cat > weixiang-generate-context-conf.sh <<'EOF' #!/bin/bash # auther: Jason Yin # 获取secret的名称 SECRET_NAME=`kubectl get sa weixiang98 -o jsonpath='{.secrets[0].name}'` # 指定API SERVER的地址 API_SERVER=106.55.44.37:6443 # 指定kubeconfig配置文件的路径名称 KUBECONFIG_NAME=/tmp/weixiang-k8s-dashboard-admin.conf # 获取用户的tocken TOCKEN=`kubectl get secrets $SECRET_NAME -o jsonpath={.data.token} | base64 -d` # 在kubeconfig配置文件中设置群集项 kubectl config set-cluster weixiang-k8s-dashboard-cluster --server=$API_SERVER --kubeconfig=$KUBECONFIG_NAME # 在kubeconfig中设置用户项 kubectl config set-credentials weixiang-k8s-dashboard-user --token=$TOCKEN --kubeconfig=$KUBECONFIG_NAME # 配置上下文,即绑定用户和集群的上下文关系,可以将多个集群和用户进行绑定哟~ kubectl config set-context weixiang-admin --cluster=weixiang-k8s-dashboard-cluster --user=weixiang-k8s-dashboard-user --kubeconfig=$KUBECONFIG_NAME # 配置当前使用的上下文 kubectl config use-context weixiang-admin --kubeconfig=$KUBECONFIG_NAME EOF bash weixiang-generate-context-conf.sh 4.2 查看kubeconfig的配置文件结构 [root@master231 03-dashboard]# kubectl config view --kubeconfig=/tmp/weixiang-k8s-dashboard-admin.conf apiVersion: v1 clusters: - cluster: server: 106.55.44.37:6443 name: weixiang-k8s-dashboard-cluster contexts: - context: cluster: weixiang-k8s-dashboard-cluster user: weixiang-k8s-dashboard-user name: weixiang-admin current-context: weixiang-admin kind: Config preferences: {} users: - name: weixiang-k8s-dashboard-user user: token: REDACTED [root@master231 03-dashboard]# 3.3.注销并基于kubeconfig文件登录 5.Dashboard的基本使用

image

2、基于docker的方式部署kuboard
bash
- 基于docker的方式部署kuboard 参考链接: https://kuboard.cn/install/v3/install-built-in.html github社区地址: https://github.com/eip-work/kuboard-press 1.导入镜像 [root@harbor250.weixiang.com ~]# wget http://192.168.21.253/Resources/Kubernetes/Project/kuboard/kuboard-on-docker/weixiang-kuboard-v3.tar.gz [root@harbor250.weixiang.com ~]# docker load -i weixiang-kuboard-v3.tar.gz 2.基于docker运行kuboard [root@harbor250.weixiang.com ~]# docker run -d \ --restart=unless-stopped \ --name=kuboard \ -p 88:80/tcp \ -p 10081:10081/tcp \ -e KUBOARD_ENDPOINT="http://0.0.0.0:80" \ -e KUBOARD_AGENT_SERVER_TCP_PORT="10081" \ -v /root/kuboard-data:/data \ swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3 [root@harbor250.weixiang.com ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 826bee1c48be swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3 "/entrypoint.sh" 53 seconds ago Up 52 seconds 443/tcp, 0.0.0.0:10081->10081/tcp, :::10081->10081/tcp, 0.0.0.0:88->80/tcp, :::88->80/tcp kuboard [root@harbor250.weixiang.com ~]# 3.访问kuboard http://8.148.236.36:88/ 用户名: admin 密 码: Kuboard123 4.添加K8S集群 4.1 获取token [root@master231 04-kuboard]# cat << EOF > kuboard-create-token.yaml --- apiVersion: v1 kind: Namespace metadata: name: kuboard --- apiVersion: v1 kind: ServiceAccount metadata: name: kuboard-admin namespace: kuboard --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: kuboard-admin-crb roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: cluster-admin subjects: - kind: ServiceAccount name: kuboard-admin namespace: kuboard --- apiVersion: v1 kind: Secret type: kubernetes.io/service-account-token metadata: annotations: kubernetes.io/service-account.name: kuboard-admin name: kuboard-admin-token namespace: kuboard EOF kubectl apply -f kuboard-create-token.yaml echo -e "\033[1;34m将下面这一行红色输出结果填入到 kuboard 界面的 Token 字段:\033[0m" echo -e "\033[31m$(kubectl -n kuboard get secret $(kubectl -n kuboard get secret kuboard-admin-token | grep kuboard-admin-token | awk '{print $1}') -o go-template='{{.data.token}}' | base64 -d)\033[0m"

image

image

bash
4.2 使用token登录


image

bash
5.基本使用


image

image

image

image

image

image

image

bash
- kuboard管理多套K8S集群案例 1.查看kubeconfig文件内容 [root@k8s66 ~]# cat ~/.kube/config 2.将文件内容拷贝到Kuboard 3.测试验证

image

image

image

20、pod调度

1、nodeName
bash
- 玩转Pod调度基础之nodeName 1.什么是nodeName 所谓的nodeName指的是节点的名称,值得注意的,该节点必须在etcd数据库中有记录。 2.测试案例 [root@master231 18-scheduler]# kubectl get nodes NAME STATUS ROLES AGE VERSION master231 Ready control-plane,master 13d v1.23.17 worker232 Ready <none> 13d v1.23.17 worker233 Ready <none> 13d v1.23.17 [root@master231 18-scheduler]# [root@master231 18-scheduler]# cat 01-scheduler-nodeName.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-nodename spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: # 指定节点调度,但是调度的节点必须在etcd中有记录。 nodeName: master231 # 强制将所有这 3 个副本都部署到 `master231` 这一个节点上 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 01-scheduler-nodeName.yaml deployment.apps/deploy-nodename created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-nodename-79c77b9ff8-8ssfd 1/1 Running 0 14s 10.100.0.19 master231 <none> <none> deploy-nodename-79c77b9ff8-ph8fj 1/1 Running 0 14s 10.100.0.21 master231 <none> <none> deploy-nodename-79c77b9ff8-znzf2 1/1 Running 0 14s 10.100.0.20 master231 <none> <none> [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl delete -f 01-scheduler-nodeName.yaml deployment.apps "deploy-nodename" deleted [root@master231 18-scheduler]#
2、hostPort
bash
- 玩转Pod调度基础之hostPort 1.什么是hostPort hostPort是指的主机端口,如果pod调度的节点端口被占用,则新的Pod不会调度到该节点。 2.实战案例 [root@master231 18-scheduler]# cat 02-scheduler-hostPort.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-hostport spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 ports: - containerPort: 80 # 将pod的80端口映射到调度worker节点的90端口。 # 这意味着其他pod如果也想要使用90端口是不可以的,因为端口被占用就无法完成调度! hostPort: 90 [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 02-scheduler-hostPort.yaml deployment.apps/deploy-hostport created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-hostport-565858676b-6ncs4 1/1 Running 0 8s 10.100.1.119 worker232 <none> <none> deploy-hostport-565858676b-974lr 0/1 Pending 0 8s <none> <none> <none> <none> deploy-hostport-565858676b-vqhnv 1/1 Running 0 8s 10.100.2.86 worker233 <none> <none> [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl delete -f 02-scheduler-hostPort.yaml deployment.apps "deploy-hostport" deleted [root@master231 18-scheduler]#
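A quick way to confirm the hostPort mapping is to curl the scheduled nodes on port 90 from any host that can reach them; the node IPs below come from the transcript above, and the response should be the apps:v3 page:

bash
# the Running replicas landed on worker232 and worker233
curl -s http://10.0.0.232:90
curl -s http://10.0.0.233:90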
3、hostNetwork
bash
玩转Pod调度基础之hostNetwork 1.什么是hostNetwork 所谓的hostNetwork让容器使用宿主机网络。当宿主机的端口和Pod端口冲突时,则不会往该节点调度。 2.实战案例 [root@master231 18-scheduler]# cat 03-scheduler-hostNetwork.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-hostnetwork spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: # 使用宿主机网络通常情况下会搭配dnsPolicy和可选的'containerPort'字段。 hostNetwork: true dnsPolicy: ClusterFirstWithHostNet # 优先尝试使用集群的 DNS 服务来解析域名。这个Pod就既能访问外部网络(通过宿主机的 DNS),也能方便地发现和访问集群内的其他服务 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 ports: - containerPort: 80 # hostNetwork: true会让pod绑定到宿主机的80端口 [root@master231 18-scheduler]# [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 03-scheduler-hostNetwork.yaml deployment.apps/deploy-hostnetwork created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-hostnetwork-68f6858c4b-dkblh 0/1 Pending 0 3s <none> <none> <none> <none> deploy-hostnetwork-68f6858c4b-mj8n5 1/1 Running 0 3s 10.0.0.232 worker232 <none> <none> deploy-hostnetwork-68f6858c4b-w6mkw 1/1 Running 0 3s 10.0.0.233 worker233 <none> <none> [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl delete -f 03-scheduler-hostNetwork.yaml deployment.apps "deploy-hostnetwork" deleted [root@master231 18-scheduler]#
4、resources
bash
- 玩转Pod调度基础之resources 1.什么是resources 所谓的resources就是配置容器的资源限制。 可以通过requests配置容器的期望资源,可以通过limits配置容器的资源使用上限。 2.实战案例 [root@master231 18-scheduler]# cat 04-scheduler-resources.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-resources spec: replicas: 10 selector: matchLabels: app: stress template: metadata: labels: app: stress spec: containers: # - image: jasonyin2020/weixiang-linux-tools:v0.1 - image: harbor250.weixiang.com/weixiang-test/stress:v0.1 name: weixiang-linux-tools args: - tail - -f - /etc/hosts # 配置容器的资源限制 resources: # 1.若节点不满足期望资源,则无法调度到该节点; # 2.调度到该节点后也不会立刻吃掉期望的资源。 # 3.若没有定义requests字段,则默认等效于limits的配置; requests: cpu: 0.5 memory: 300Mi # memory: 10G # 1.定义容器资源使用的上限; # 2.如果没有定义limits资源,则默认使用调度到该节点的所有资源。 limits: cpu: 1.5 memory: 500Mi [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 04-scheduler-resources.yaml deployment.apps/deploy-resources created [root@master231 18-scheduler]# [root@master231 ~/count/03-dashboard]#kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-resources-7496f54d5b-424w9 1/1 Running 0 53m 10.100.1.139 worker232 <none> <none> deploy-resources-7496f54d5b-8jzgp 1/1 Running 0 53m 10.100.1.137 worker232 <none> <none> deploy-resources-7496f54d5b-bkqfm 1/1 Running 0 53m 10.100.2.147 worker233 <none> <none> deploy-resources-7496f54d5b-c7n4w 0/1 Pending 0 53m <none> <none> <none> <none> deploy-resources-7496f54d5b-cjs8w 0/1 Pending 0 53m <none> <none> <none> <none> deploy-resources-7496f54d5b-k8wzp 1/1 Running 0 53m 10.100.1.138 worker232 <none> <none> deploy-resources-7496f54d5b-mcwhv 1/1 Running 0 53m 10.100.2.148 worker233 <none> <none> deploy-resources-7496f54d5b-phfvx 1/1 Running 0 53m 10.100.2.146 worker233 <none> <none> deploy-resources-7496f54d5b-rjbgp 0/1 Pending 0 53m <none> <none> <none> <none> deploy-resources-7496f54d5b-zsf7n 0/1 Pending 0 53m <none> <none> <none> <none> # 根据配置文件,每个pod占0.5cpu,系统一共4个cpu,按理说应该是8个Running状态的,但是别的pod还占0.1cpu,因为每个节点cpu是2核,所以每个节点只能有3个,总共6个Running [root@master231 18-scheduler]# kubectl delete -f 04-scheduler-resources.yaml deployment.apps "deploy-resources" deleted [root@master231 18-scheduler]# [root@worker232 ~]# docker ps -a |grep deploy-resources-7496f54d5b-424w9 1c3d017f8c14 da6fdb7c9168 "tail -f /etc/hosts" 53 minutes ago Up 53 minutes k8s_weixiang-linux-tools_deploy-resources-7496f54d5b-424w9_default_fd8df59d-ce5d-4b52-8e34-1fd183631c96_0 353d92850f47 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 53 minutes ago Up 53 minutes k8s_POD_deploy-resources-7496f54d5b-424w9_default_fd8df59d-ce5d-4b52-8e34-1fd183631c96_0 [root@worker232 ~]# docker stats 1c3d017f8c14
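The Pending replicas in this example are simply the scheduler running out of allocatable CPU once each node's existing requests are added up. The arithmetic can be checked per node (sketch, node name taken from this cluster):

bash
# total CPU/memory the scheduler may hand out on this node
kubectl describe node worker232 | grep -A 8 'Allocatable'
# requests/limits already committed; new pods must fit into what is left
kubectl describe node worker232 | grep -A 8 'Allocated resources'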
5、Taints污点
bash
- 玩转Pod调度基础之Taints 1.什么是Taints Taints表示污点,作用在worker工作节点上。 污点类型大概分为三类: - NoSchedule (不可调度): 不在接受新的Pod调度,已经调度到该节点的Pod不会被驱逐。 - PreferNoSchedule(尽量不调度): 优先将Pod调度到其他节点,当其他节点不可调度时,再往该节点调度。 - NoExecute (驱逐): 不在接受新的Pod调度,且已经调度到该节点的Pod会被立刻驱逐。 污点的格式: key[=value]:effect 2.污点的基础管理 2.1 查看污点 [root@master231 18-scheduler]# kubectl describe nodes | grep Taints Taints: node-role.kubernetes.io/master:NoSchedule Taints: <none> Taints: <none> [root@master231 18-scheduler]# 温馨提示: <none>表示该节点没有污点。 2.2 给指定节点打污点 [root@master231 18-scheduler]# kubectl taint node --all school=weixiang:PreferNoSchedule node/master231 tainted node/worker232 tainted node/worker233 tainted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule school=weixiang:PreferNoSchedule Unschedulable: false -- Taints: school=weixiang:PreferNoSchedule Unschedulable: false Lease: -- Taints: school=weixiang:PreferNoSchedule Unschedulable: false Lease: [root@master231 18-scheduler]# 2.3 修改污点【只能修改value字段,修改effect则表示创建了一个新的污点类型】 [root@master231 18-scheduler]# kubectl taint node worker233 school=laonanhai:PreferNoSchedule --overwrite node/worker233 modified [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule school=weixiang:PreferNoSchedule Unschedulable: false -- Taints: school=weixiang:PreferNoSchedule Unschedulable: false Lease: -- Taints: school=laonanhai:PreferNoSchedule Unschedulable: false Lease: [root@master231 18-scheduler]# 2.3 删除污点 [root@master231 18-scheduler]# kubectl taint node --all school- node/master231 untainted node/worker232 untainted node/worker233 untainted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: [root@master231 18-scheduler]# 3.测试污点类型案例 3.1 添加污点 [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl taint node worker233 school:NoSchedule node/worker233 tainted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: -- Taints: school:NoSchedule Unschedulable: false Lease: [root@master231 18-scheduler]# 3.2 测试案例 [root@master231 18-scheduler]# cat 05-scheduler-taints.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-taints spec: replicas: 5 selector: matchLabels: app: stress template: metadata: labels: app: stress spec: containers: - image: harbor250.weixiang.com/weixiang-test/stress:v0.1 name: weixiang-linux-tools args: - tail - -f - /etc/hosts resources: requests: cpu: 0.5 memory: 300Mi limits: cpu: 1.5 memory: 500Mi [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 05-scheduler-taints.yaml deployment.apps/deploy-taints created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-taints-7496f54d5b-9qfch 0/1 Pending 0 4s <none> 
<none> <none> <none> deploy-taints-7496f54d5b-llmqv 1/1 Running 0 4s 10.100.1.138 worker232 <none> <none> deploy-taints-7496f54d5b-n4vx4 0/1 Pending 0 4s <none> <none> <none> <none> deploy-taints-7496f54d5b-nvm7h 1/1 Running 0 4s 10.100.1.139 worker232 <none> <none> deploy-taints-7496f54d5b-t9bxt 1/1 Running 0 4s 10.100.1.137 worker232 <none> <none> [root@master231 18-scheduler]# 3.3 修改污点类型 [root@master231 18-scheduler]# kubectl taint node worker233 school:PreferNoSchedule node/worker233 tainted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl taint node worker233 school:NoSchedule- node/worker233 untainted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: -- Taints: school:PreferNoSchedule Unschedulable: false Lease: [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-taints-7496f54d5b-9qfch 1/1 Running 0 2m32s 10.100.2.99 worker233 <none> <none> deploy-taints-7496f54d5b-llmqv 1/1 Running 0 2m32s 10.100.1.138 worker232 <none> <none> deploy-taints-7496f54d5b-n4vx4 1/1 Running 0 2m32s 10.100.2.100 worker233 <none> <none> deploy-taints-7496f54d5b-nvm7h 1/1 Running 0 2m32s 10.100.1.139 worker232 <none> <none> deploy-taints-7496f54d5b-t9bxt 1/1 Running 0 2m32s 10.100.1.137 worker232 <none> <none> [root@master231 18-scheduler]# 3.4 再次修改污点类型 [root@master231 18-scheduler]# kubectl taint node worker233 school=weixiang:NoExecute node/worker233 tainted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-taints-7496f54d5b-4rpbg 0/1 Pending 0 1s <none> <none> <none> <none> deploy-taints-7496f54d5b-9qfch 1/1 Terminating 0 4m35s 10.100.2.99 worker233 <none> <none> deploy-taints-7496f54d5b-llmqv 1/1 Running 0 4m35s 10.100.1.138 worker232 <none> <none> deploy-taints-7496f54d5b-n4vx4 1/1 Terminating 0 4m35s 10.100.2.100 worker233 <none> <none> deploy-taints-7496f54d5b-nvm7h 1/1 Running 0 4m35s 10.100.1.139 worker232 <none> <none> deploy-taints-7496f54d5b-t2pcv 0/1 Pending 0 1s <none> <none> <none> <none> deploy-taints-7496f54d5b-t9bxt 1/1 Running 0 4m35s 10.100.1.137 worker232 <none> <none> [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-taints-7496f54d5b-4rpbg 0/1 Pending 0 33s <none> <none> <none> <none> deploy-taints-7496f54d5b-llmqv 1/1 Running 0 5m7s 10.100.1.138 worker232 <none> <none> deploy-taints-7496f54d5b-nvm7h 1/1 Running 0 5m7s 10.100.1.139 worker232 <none> <none> deploy-taints-7496f54d5b-t2pcv 0/1 Pending 0 33s <none> <none> <none> <none> deploy-taints-7496f54d5b-t9bxt 1/1 Running 0 5m7s 10.100.1.137 worker232 <none> <none> [root@master231 18-scheduler]# 3.5.删除测试 [root@master231 18-scheduler]# kubectl delete -f 05-scheduler-taints.yaml deployment.apps "deploy-taints" deleted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: -- Taints: school=weixiang:NoExecute school:PreferNoSchedule Unschedulable: false [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl taint node 
worker233 school- node/worker233 untainted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: [root@master231 18-scheduler]# 4.kubesphere操作污点


image

| 特性 (Feature) | NoSchedule (不可调度) | PreferNoSchedule (尽量不调度) | NoExecute (驱逐) |
| --- | --- | --- | --- |
| 对新 Pod 的影响 | 拒绝调度(除非有容忍) | 降低调度优先级(软拒绝) | 拒绝调度(除非有容忍) |
| 对已运行 Pod 的影响 | 无影响 | 无影响 | 驱逐(除非有容忍) |
| 严厉程度 | 中等 | 低 | 高 |
| 核心作用 | 预留资源,规划性隔离 | 引导流量,优化调度 | 故障转移,强制性清空 |
| 常见场景 | 专用节点(GPU/DB),计划性维护 | 跨区容灾,任务优先级分组 | 节点NotReady,网络分区,紧急维护 |
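Besides grepping kubectl describe, taints can also be listed per node in a compact form (a small helper sketch using standard kubectl output options):

bash
# one row per node, showing only the taint keys
kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
# full taint objects (key/value/effect) per node
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'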
6、tolerations
bash
- 玩转Pod调度基础之tolerations 1.什么是tolerations tolerations是污点容忍,用该技术可以让Pod调度到一个具有污点的节点。 指的注意的是,一个Pod如果想要调度到某个worker节点,则必须容忍该worker的所有污点。 2.实战案例 2.1 环境准备就绪 [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: [root@master231 18-scheduler]# [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl taint node --all school=weixiang:NoSchedule node/master231 tainted node/worker232 tainted node/worker233 tainted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl taint node worker233 class=weixiang98:NoExecute node/worker233 tainted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule school=weixiang:NoSchedule Unschedulable: false -- Taints: school=weixiang:NoSchedule Unschedulable: false Lease: -- Taints: class=weixiang98:NoExecute school=weixiang:NoSchedule Unschedulable: false [root@master231 18-scheduler]# 2.2 测试验证 [root@master231 18-scheduler]# cat 06-scheduler-tolerations.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-tolerations spec: replicas: 10 selector: matchLabels: app: stress template: metadata: labels: app: stress spec: # 配置污点容忍 tolerations: # 指定污点的key,如果不定义,则默认匹配所有的key。 - key: school # 指定污点的value,如果不定义,则默认匹配所有的value。 value: weixiang # 指定污点的effect类型,如果不定义,则默认匹配所有的effect类型。 effect: NoSchedule - key: class value: weixiang98 effect: NoExecute - key: node-role.kubernetes.io/master effect: NoSchedule # # 注意,operator表示key和value的关系,有效值为: Exists and Equal,默认值为: Equal。 # # 如果将operator的值设置为: Exists,且不定义key,value,effect时,表示无视污点。 #- operator: Exists containers: - image: harbor250.weixiang.com/weixiang-test/stress:v0.1 name: weixiang-linux-tools args: - tail - -f - /etc/hosts resources: requests: cpu: 0.5 memory: 300Mi limits: cpu: 1.5 memory: 500Mi [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 06-scheduler-tolerations.yaml deployment.apps/deploy-tolerations created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-tolerations-8667479cb9-85zfb 0/1 Pending 0 4s <none> <none> <none> <none> deploy-tolerations-8667479cb9-bn7r6 1/1 Running 0 4s 10.100.2.110 worker233 <none> <none> deploy-tolerations-8667479cb9-d24mp 1/1 Running 0 4s 10.100.1.149 worker232 <none> <none> deploy-tolerations-8667479cb9-d4gqx 1/1 Running 0 4s 10.100.2.111 worker233 <none> <none> deploy-tolerations-8667479cb9-dk64w 1/1 Running 0 4s 10.100.2.108 worker233 <none> <none> deploy-tolerations-8667479cb9-gfqb9 1/1 Running 0 4s 10.100.1.148 worker232 <none> <none> deploy-tolerations-8667479cb9-n985j 1/1 Running 0 4s 10.100.0.26 master231 <none> <none> deploy-tolerations-8667479cb9-s88np 1/1 Running 0 4s 10.100.1.147 worker232 <none> <none> deploy-tolerations-8667479cb9-swnbk 1/1 Running 0 4s 10.100.0.25 master231 <none> <none> deploy-tolerations-8667479cb9-t52qj 1/1 Running 0 4s 10.100.2.109 worker233 <none> <none> [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl delete -f 06-scheduler-tolerations.yaml deployment.apps "deploy-tolerations" deleted [root@master231 18-scheduler]# 3.3 删除污点 [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule school=weixiang:NoSchedule 
Unschedulable: false -- Taints: school=weixiang:NoSchedule Unschedulable: false Lease: -- Taints: class=weixiang98:NoExecute school=weixiang:NoSchedule Unschedulable: false [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl taint node --all school- node/master231 untainted node/worker232 untainted node/worker233 untainted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl taint node worker233 class- node/worker233 untainted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: [root@master231 18-scheduler]#
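For NoExecute taints there is one extra field worth knowing: tolerationSeconds, which lets an already-running Pod stay on a freshly tainted node for a grace period before being evicted. A minimal sketch reusing the class=weixiang98 taint and the xiuxian image from this chapter (the Pod name is illustrative):

bash
apiVersion: v1
kind: Pod
metadata:
  name: toleration-seconds-demo   # illustrative name
spec:
  tolerations:
  - key: class
    value: weixiang98
    operator: Equal
    effect: NoExecute
    # keep running for 60s after the taint appears, then get evicted
    tolerationSeconds: 60
  containers:
  - name: c1
    image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1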
7、cordon
bash
- 玩转Pod调度基础之cordon 1.什么是cordon cordon标记节点不可调度,一般用于集群维护。 cordon的底层实现逻辑其实就给节点打污点。 2.实战案例 [root@master231 18-scheduler]# kubectl cordon worker233 node/worker233 cordoned [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get nodes NAME STATUS ROLES AGE VERSION master231 Ready control-plane,master 13d v1.23.17 worker232 Ready <none> 13d v1.23.17 worker233 Ready,SchedulingDisabled <none> 13d v1.23.17 [root@master231 18-scheduler]# [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: -- Taints: node.kubernetes.io/unschedulable:NoSchedule Unschedulable: true Lease: [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 05-scheduler-taints.yaml deployment.apps/deploy-taints created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-taints-7496f54d5b-5h2xq 1/1 Running 0 6s 10.100.1.153 worker232 <none> <none> deploy-taints-7496f54d5b-794cp 1/1 Running 0 6s 10.100.1.154 worker232 <none> <none> deploy-taints-7496f54d5b-jsk5b 1/1 Running 0 6s 10.100.1.152 worker232 <none> <none> deploy-taints-7496f54d5b-n7q7m 0/1 Pending 0 6s <none> <none> <none> <none> deploy-taints-7496f54d5b-png9p 0/1 Pending 0 6s <none> <none> <none> <none> [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl delete -f 05-scheduler-taints.yaml deployment.apps "deploy-taints" deleted [root@master231 18-scheduler]#
8、uncordon
bash
- 玩转Pod调度基础之uncordon 1.什么uncordon uncordon的操作和cordon操作相反,表示取消节点不可调度功能。 2.实战案例 [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: -- Taints: node.kubernetes.io/unschedulable:NoSchedule Unschedulable: true Lease: [root@master231 18-scheduler]# [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl uncordon worker233 node/worker233 uncordoned [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get nodes NAME STATUS ROLES AGE VERSION master231 Ready control-plane,master 13d v1.23.17 worker232 Ready <none> 13d v1.23.17 worker233 Ready <none> 13d v1.23.17 [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 05-scheduler-taints.yaml deployment.apps/deploy-taints created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-taints-7496f54d5b-bfnrh 1/1 Running 0 3s 10.100.2.116 worker233 <none> <none> deploy-taints-7496f54d5b-csds7 1/1 Running 0 3s 10.100.2.117 worker233 <none> <none> deploy-taints-7496f54d5b-hqrnm 1/1 Running 0 3s 10.100.2.118 worker233 <none> <none> deploy-taints-7496f54d5b-k8n66 1/1 Running 0 3s 10.100.1.156 worker232 <none> <none> deploy-taints-7496f54d5b-pdnkr 1/1 Running 0 3s 10.100.1.155 worker232 <none> <none> [root@master231 18-scheduler]#
9、drain
bash
- 玩转Pod调度基础之drain 1.什么是 drain 所谓drain其实就是将所在节点的pod进行驱逐的操作,说白了,就是将当前节点的Pod驱逐到其他节点运行。 在驱逐Pod时,需要忽略ds控制器创建的pod。 驱逐的主要应用场景是集群的缩容。 drain底层调用的cordon。 2.测试案例 2.1 部署测试服务 [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-taints-7496f54d5b-bfnrh 1/1 Running 0 4m19s 10.100.2.116 worker233 <none> <none> deploy-taints-7496f54d5b-csds7 1/1 Running 0 4m19s 10.100.2.117 worker233 <none> <none> deploy-taints-7496f54d5b-hqrnm 1/1 Running 0 4m19s 10.100.2.118 worker233 <none> <none> deploy-taints-7496f54d5b-k8n66 1/1 Running 0 4m19s 10.100.1.156 worker232 <none> <none> deploy-taints-7496f54d5b-pdnkr 1/1 Running 0 4m19s 10.100.1.155 worker232 <none> <none> [root@master231 18-scheduler]# 2.2 驱逐worker233节点的Pod [root@master231 18-scheduler]# kubectl drain worker233 --ignore-daemonsets --delete-emptydir-data node/worker233 already cordoned WARNING: ignoring DaemonSet-managed Pods: kube-flannel/kube-flannel-ds-6k42r, kube-system/kube-proxy-g5sfd, metallb-system/speaker-8q9ft evicting pod kube-system/metrics-server-57c6f647bb-54f7w evicting pod default/deploy-taints-7496f54d5b-csds7 evicting pod default/deploy-taints-7496f54d5b-bfnrh evicting pod default/deploy-taints-7496f54d5b-hqrnm pod/metrics-server-57c6f647bb-54f7w evicted pod/deploy-taints-7496f54d5b-bfnrh evicted pod/deploy-taints-7496f54d5b-csds7 evicted pod/deploy-taints-7496f54d5b-hqrnm evicted node/worker233 drained [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide # 目前来说,其他节点无法完成调度时则处于Pending状态。 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-taints-7496f54d5b-9mkj5 0/1 Pending 0 58s <none> <none> <none> <none> deploy-taints-7496f54d5b-k8n66 1/1 Running 0 6m16s 10.100.1.156 worker232 <none> <none> deploy-taints-7496f54d5b-p2tfw 0/1 Pending 0 58s <none> <none> <none> <none> deploy-taints-7496f54d5b-pdnkr 1/1 Running 0 6m16s 10.100.1.155 worker232 <none> <none> deploy-taints-7496f54d5b-vrmfg 1/1 Running 0 58s 10.100.1.157 worker232 <none> <none> [root@master231 18-scheduler]# 2.3 验证drain底层调用cordon [root@master231 18-scheduler]# kubectl get nodes NAME STATUS ROLES AGE VERSION master231 Ready control-plane,master 13d v1.23.17 worker232 Ready <none> 13d v1.23.17 worker233 Ready,SchedulingDisabled <none> 13d v1.23.17 [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl describe nodes | grep Taints -A 2 Taints: node-role.kubernetes.io/master:NoSchedule Unschedulable: false Lease: -- Taints: <none> Unschedulable: false Lease: -- Taints: node.kubernetes.io/unschedulable:NoSchedule Unschedulable: true Lease: [root@master231 18-scheduler]#
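Note that kubectl drain performs evictions, so it honours PodDisruptionBudgets: if evicting a Pod would drop a workload below its budget, the drain blocks on that Pod and keeps retrying. A hedged sketch of a PDB protecting the stress Deployment used above (the PDB name is illustrative):

bash
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: stress-pdb          # illustrative name
spec:
  minAvailable: 2           # always keep at least 2 matching Pods running
  selector:
    matchLabels:
      app: stress           # matches the deploy-taints Pod template labels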
10、nodeSelector
bash
- 玩转Pod调度之nodeSelector 1.什么是nodeSelector 所谓的nodeSelector,可以让Pod调度基于标签匹配符合的工作节点。 2.实战案例 2.1 响应式给节点打标签 [root@master231 18-scheduler]# kubectl label nodes --all school=weixiang node/master231 labeled node/worker232 labeled node/worker233 labeled [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl label nodes worker232 school=laonanhai --overwrite node/worker232 labeled [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get nodes --show-labels | grep school master231 Ready control-plane,master 13d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master231,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,school=weixiang worker232 Ready <none> 13d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker232,kubernetes.io/os=linux,school=laonanhai worker233 Ready <none> 15h v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker233,kubernetes.io/os=linux,school=weixiang [root@master231 18-scheduler]# 2.2 部署测试服务,将Pod调度到school=weixiang的节点。 [root@master231 18-scheduler]# cat 07-scheduler-nodeSelector.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-tolerations-nodeselector spec: replicas: 5 selector: matchLabels: app: stress template: metadata: labels: app: stress spec: nodeSelector: # 基于节点的标签匹配Pod可以调度到哪些worker节点上。 school: weixiang # 指定标签值 tolerations: # 允许 Pod 容忍(调度到)带有 node-role.kubernetes.io/master 污点的节点 - key: node-role.kubernetes.io/master effect: NoSchedule containers: - image: harbor250.weixiang.com/weixiang-test/stress:v0.1 name: weixiang-linux-tools args: - tail - -f - /etc/hosts resources: requests: cpu: 0.5 memory: 300Mi limits: cpu: 1.5 memory: 500Mi [root@master231 18-scheduler]# [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 07-scheduler-nodeSelector.yaml deployment.apps/deploy-tolerations-nodeselector created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-tolerations-nodeselector-67bb8d6899-5795w 1/1 Running 0 3s 10.100.0.31 master231 <none> <none> deploy-tolerations-nodeselector-67bb8d6899-g8jnl 1/1 Running 0 3s 10.100.2.10 worker233 <none> <none> deploy-tolerations-nodeselector-67bb8d6899-jhm5q 1/1 Running 0 3s 10.100.2.11 worker233 <none> <none> deploy-tolerations-nodeselector-67bb8d6899-shbcn 1/1 Running 0 3s 10.100.0.32 master231 <none> <none> deploy-tolerations-nodeselector-67bb8d6899-t7sk5 1/1 Running 0 3s 10.100.2.12 worker233 <none> <none> [root@master231 18-scheduler]#
ds结合nodeSelector
bash
部署nginx服务,分别部署到master231和worker232节点,每个节点有且仅有一个Pod副本。 1.给相应的节点打标签 [root@master231 18-scheduler]# kubectl label nodes --all class=weixiang98 node/master231 labeled node/worker232 labeled node/worker233 labeled [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl label nodes worker233 class- node/worker233 unlabeled [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get nodes -l class=weixiang98 NAME STATUS ROLES AGE VERSION master231 Ready control-plane,master 13d v1.23.17 worker232 Ready <none> 13d v1.23.17 [root@master231 18-scheduler]# 2.编写资源清单 [root@master231 18-scheduler]# cat 08-scheduler-nodeSelector.yaml apiVersion: apps/v1 kind: DaemonSet metadata: name: ds-xiuxian-nodeselector spec: selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: tolerations: - key: node-role.kubernetes.io/master effect: NoSchedule nodeSelector: class: weixiang98 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 08-scheduler-nodeSelector.yaml daemonset.apps/ds-xiuxian-nodeselector created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ds-xiuxian-nodeselector-l56hs 1/1 Running 0 4s 10.100.0.33 master231 <none> <none> ds-xiuxian-nodeselector-zbfkc 1/1 Running 0 4s 10.100.1.167 worker232 <none> <none> [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl delete -f 08-scheduler-nodeSelector.yaml daemonset.apps "ds-xiuxian-nodeselector" deleted [root@master231 18-scheduler]#
11、nodeAffinity
bash
- 玩转Pod调度之nodeAffinity 1.什么是nodeAffinity nodeAffinity的作用和nodeSelector类似,但功能更强大。 2.测试案例 2.1 修改节点的标签 [root@master231 18-scheduler]# kubectl label nodes master231 school=yitiantian --overwrite node/master231 labeled [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get nodes --show-labels | grep school master231 Ready control-plane,master 13d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,class=weixiang98,kubernetes.io/arch=amd64,kubernetes.io/hostname=master231,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,school=yitiantian worker232 Ready <none> 13d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,class=weixiang98,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker232,kubernetes.io/os=linux,school=laonanhai worker233 Ready <none> 16h v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker233,kubernetes.io/os=linux,school=weixiang [root@master231 18-scheduler]# 2.2 测试案例 [root@master231 18-scheduler]# cat 09-scheduler-nodeAffinity.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-nodeaffinity spec: replicas: 5 selector: matchLabels: app: stress template: metadata: labels: app: stress spec: # 配置Pod的粘性(亲和性) affinity: # 配置Pod更倾向于哪些节点进行调度,匹配条件基于节点标签实现。 nodeAffinity: # 硬限制要求,必须满足 requiredDuringSchedulingIgnoredDuringExecution: # 基于节点标签匹配 nodeSelectorTerms: # 基于表达式匹配节点标签 - matchExpressions: # 指定节点标签的key - key: school # 指定节点标签的value values: - laonanhai - weixiang # 指定key和values之间的关系。 operator: In tolerations: - key: node-role.kubernetes.io/master effect: NoSchedule containers: - image: harbor250.weixiang.com/weixiang-test/stress:v0.1 name: weixiang-linux-tools args: - tail - -f - /etc/hosts resources: requests: cpu: 0.5 memory: 300Mi limits: cpu: 1.5 memory: 500Mi [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 09-scheduler-nodeAffinity.yaml deployment.apps/deploy-nodeaffinity created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-nodeaffinity-6764d77cc4-4v9t9 1/1 Running 0 3s 10.100.2.18 worker233 <none> <none> deploy-nodeaffinity-6764d77cc4-6l9gp 1/1 Running 0 3s 10.100.2.16 worker233 <none> <none> deploy-nodeaffinity-6764d77cc4-nlzvc 1/1 Running 0 3s 10.100.1.168 worker232 <none> <none> deploy-nodeaffinity-6764d77cc4-pjfzt 1/1 Running 0 3s 10.100.1.169 worker232 <none> <none> deploy-nodeaffinity-6764d77cc4-wlx2c 1/1 Running 0 3s 10.100.2.17 worker233 <none> <none> [root@master231 18-scheduler]#
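requiredDuringSchedulingIgnoredDuringExecution is the hard form; nodeAffinity also has a soft form, preferredDuringSchedulingIgnoredDuringExecution, where matching nodes only receive a higher score instead of being the only candidates. A minimal sketch of the affinity block, reusing the school label from above:

bash
# drop-in replacement for the affinity block in 09-scheduler-nodeAffinity.yaml
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 50              # 1-100; higher weight = stronger preference
      preference:
        matchExpressions:
        - key: school
          operator: In
          values:
          - laonanhai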
12、podAffinity
bash
玩转Pod调度之podAffinity 1.什么是podAffinity 所谓的podAffinity指的是某个Pod调度到特定的拓扑域(暂时理解为'机房')后,后续的所有Pod都往该拓扑域调度。 2.实战案例 # 删除标签 [root@master231 ~/count/18-scheduler]#kubectl label nodes --all dc- 2.1 给节点打标签 [root@master231 18-scheduler]# kubectl label nodes master231 dc=jiuxianqiao node/master231 labeled [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl label nodes worker232 dc=lugu node/worker232 labeled [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl label nodes worker233 dc=zhaowei node/worker233 labeled [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get nodes --show-labels | grep dc master231 Ready control-plane,master 13d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,class=weixiang98,dc=jiuxianqiao,kubernetes.io/arch=amd64,kubernetes.io/hostname=master231,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,school=yitiantian worker232 Ready <none> 13d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,class=weixiang98,dc=lugu,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker232,kubernetes.io/os=linux,school=laonanhai worker233 Ready <none> 17h v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dc=zhaowei,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker233,kubernetes.io/os=linux,school=weixiang [root@master231 18-scheduler]# 2.2 编写资源清单 [root@master231 18-scheduler]# cat 10-scheduler-podAffinity.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-podaffinity spec: replicas: 5 selector: matchLabels: app: stress template: metadata: labels: app: stress spec: affinity: # 配置Pod的亲和性 podAffinity: # 硬限制要求,必须满足 requiredDuringSchedulingIgnoredDuringExecution: # 指定拓扑域的key - topologyKey: dc # 指定标签选择器关联Pod labelSelector: matchExpressions: - key: app values: - stress operator: In tolerations: - key: node-role.kubernetes.io/master effect: NoSchedule containers: - image: harbor250.weixiang.com/weixiang-test/stress:v0.1 name: weixiang-linux-tools args: - tail - -f - /etc/hosts [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 10-scheduler-podAffinity.yaml deployment.apps/deploy-podaffinity created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-podaffinity-9cdf99867-28q45 1/1 Running 0 32s 10.100.1.173 worker232 <none> <none> deploy-podaffinity-9cdf99867-95g9m 1/1 Running 0 32s 10.100.1.172 worker232 <none> <none> deploy-podaffinity-9cdf99867-ftp7k 1/1 Running 0 32s 10.100.1.174 worker232 <none> <none> deploy-podaffinity-9cdf99867-jpqgh 1/1 Running 0 32s 10.100.1.175 worker232 <none> <none> deploy-podaffinity-9cdf99867-xqgrz 1/1 Running 0 32s 10.100.1.171 worker232 <none> <none> [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl delete -f 10-scheduler-podAffinity.yaml deployment.apps "deploy-podaffinity" deleted [root@master231 18-scheduler]#
13、PodAntiAffinity
bash
- 玩转Pod调度之PodAntiAffinity 1.什么是PodAntiAffinity 所谓的PodAntiAffinity和PodAffinity的作用相反,表示pod如果调度到某个拓扑域后,后续的Pod不会往该拓扑域调度。 2.实战案例 2.1 查看标签 [root@master231 18-scheduler]# kubectl get nodes --show-labels | grep dc master231 Ready control-plane,master 13d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,class=weixiang98,dc=jiuxianqiao,kubernetes.io/arch=amd64,kubernetes.io/hostname=master231,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,school=yitiantian worker232 Ready <none> 13d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,class=weixiang98,dc=lugu,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker232,kubernetes.io/os=linux,school=laonanhai worker233 Ready <none> 17h v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dc=zhaowei,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker233,kubernetes.io/os=linux,school=weixiang [root@master231 18-scheduler]# 2.2 编写资源清单测试验证 [root@master231 18-scheduler]# cat 11-scheduler-podAntiAffinity.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-pod-anti-affinity spec: replicas: 5 selector: matchLabels: app: stress template: metadata: labels: app: stress spec: affinity: # 配置Pod的反亲和性 podAntiAffinity: # 硬限制要求,必须满足 requiredDuringSchedulingIgnoredDuringExecution: # 指定拓扑域的key - topologyKey: dc # 指定标签选择器关联Pod labelSelector: matchExpressions: - key: app values: - stress operator: In tolerations: - key: node-role.kubernetes.io/master effect: NoSchedule containers: - image: harbor250.weixiang.com/weixiang-test/stress:v0.1 name: weixiang-linux-tools args: - tail - -f - /etc/hosts [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl apply -f 11-scheduler-podAntiAffinity.yaml deployment.apps/deploy-pod-anti-affinity created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-pod-anti-affinity-79f6759c97-5p8r6 0/1 Pending 0 3s <none> <none> <none> <none> deploy-pod-anti-affinity-79f6759c97-7snq4 1/1 Running 0 3s 10.100.1.176 worker232 <none> <none> deploy-pod-anti-affinity-79f6759c97-cj9bm 1/1 Running 0 3s 10.100.0.36 master231 <none> <none> deploy-pod-anti-affinity-79f6759c97-dcng4 0/1 Pending 0 3s <none> <none> <none> <none> deploy-pod-anti-affinity-79f6759c97-kkjm6 1/1 Running 0 3s 10.100.2.19 worker233 <none> <none> [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl delete -f 11-scheduler-podAntiAffinity.yaml deployment.apps "deploy-pod-anti-affinity" deleted [root@master231 18-scheduler]# 2.3 修改节点的标签 [root@master231 18-scheduler]# kubectl label nodes worker233 dc=lugu --overwrite node/worker233 unlabeled [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get nodes --show-labels | grep dc master231 Ready control-plane,master 13d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,class=weixiang98,dc=jiuxianqiao,kubernetes.io/arch=amd64,kubernetes.io/hostname=master231,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,school=yitiantian worker232 Ready <none> 13d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,class=weixiang98,dc=lugu,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker232,kubernetes.io/os=linux,school=laonanhai worker233 Ready <none> 17h v1.23.17 
beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dc=lugu,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker233,kubernetes.io/os=linux,school=weixiang [root@master231 18-scheduler]# 2.4 再次测试验证 [root@master231 18-scheduler]# kubectl apply -f 11-scheduler-podAntiAffinity.yaml deployment.apps/deploy-pod-anti-affinity created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-pod-anti-affinity-79f6759c97-fq77v 0/1 Pending 0 6s <none> <none> <none> <none> deploy-pod-anti-affinity-79f6759c97-m72lz 0/1 Pending 0 6s <none> <none> <none> <none> deploy-pod-anti-affinity-79f6759c97-rcpfl 1/1 Running 0 6s 10.100.2.20 worker233 <none> <none> deploy-pod-anti-affinity-79f6759c97-wq76t 1/1 Running 0 6s 10.100.0.37 master231 <none> <none> deploy-pod-anti-affinity-79f6759c97-xtncp 0/1 Pending 0 6s <none> <none> <none> <none> [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl delete -f 11-scheduler-podAntiAffinity.yaml deployment.apps "deploy-pod-anti-affinity" deleted [root@master231 18-scheduler]# 2.5 再次修改标签 【将3个节点都指向同一个拓扑域】 [root@master231 18-scheduler]# kubectl label nodes master231 dc=lugu --overwrite node/master231 unlabeled [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get nodes --show-labels | grep dc master231 Ready control-plane,master 13d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,class=weixiang98,dc=lugu,kubernetes.io/arch=amd64,kubernetes.io/hostname=master231,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,school=yitiantian worker232 Ready <none> 13d v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,class=weixiang98,dc=lugu,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker232,kubernetes.io/os=linux,school=laonanhai worker233 Ready <none> 17h v1.23.17 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,dc=lugu,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker233,kubernetes.io/os=linux,school=weixiang [root@master231 18-scheduler]# 2.6 测试验证 [root@master231 18-scheduler]# kubectl apply -f 11-scheduler-podAntiAffinity.yaml deployment.apps/deploy-pod-anti-affinity created [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-pod-anti-affinity-79f6759c97-bzjfd 0/1 Pending 0 3s <none> <none> <none> <none> deploy-pod-anti-affinity-79f6759c97-gpbm4 0/1 Pending 0 3s <none> <none> <none> <none> deploy-pod-anti-affinity-79f6759c97-jvbrv 1/1 Running 0 3s 10.100.2.21 worker233 <none> <none> deploy-pod-anti-affinity-79f6759c97-tfrbb 0/1 Pending 0 3s <none> <none> <none> <none> deploy-pod-anti-affinity-79f6759c97-xj8t8 0/1 Pending 0 3s <none> <none> <none> <none> [root@master231 18-scheduler]# [root@master231 18-scheduler]#
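生产环境中PodAntiAffinity更常见的用法是把topologyKey设置为kubernetes.io/hostname,让同一个应用的副本尽量分散到不同节点,避免单点故障。下面是一个最小化的示意片段(仅展示affinity部分,标签沿用app=stress):
bash
# 示意片段:每个节点(hostname拓扑域)最多运行一个携带app=stress标签的Pod
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: kubernetes.io/hostname
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - stress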
14、k8s集群的缩容与扩容
缩容
bash
- k8s集群的缩容 1.k8s集群的缩容的流程 A.将已经调度到待下线节点的Pod驱逐到其他节点; B.停止kubelet进程,避免kubelet实时上报数据给apiServer; C.如果是二进制部署的话,可以将kube-proxy组件停止;【可选】 D.待下线节点重置环境,重新安装操作系统(防止数据泄露),然后再将服务器用做其他处理; E.master节点移除待下线节点; 2.实操案例 2.1 驱逐已经调度到节点的Pod [root@master231 18-scheduler]# kubectl get nodes NAME STATUS ROLES AGE VERSION master231 Ready control-plane,master 13d v1.23.17 worker232 Ready <none> 13d v1.23.17 worker233 Ready,SchedulingDisabled <none> 13d v1.23.17 [root@master231 18-scheduler]# 2.2 停止kubelet进程 [root@worker233 ~]# systemctl stop kubelet.service 2.3 重新安装操作系统 2.4 移除待下线节点 [root@master231 18-scheduler]# kubectl get nodes NAME STATUS ROLES AGE VERSION master231 Ready control-plane,master 13d v1.23.17 worker232 Ready <none> 13d v1.23.17 worker233 NotReady,SchedulingDisabled <none> 13d v1.23.17 [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl delete nodes worker233 node "worker233" deleted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get nodes NAME STATUS ROLES AGE VERSION master231 Ready control-plane,master 13d v1.23.17 worker232 Ready <none> 13d v1.23.17 [root@master231 18-scheduler]#
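上面流程中"将已经调度到待下线节点的Pod驱逐到其他节点"这一步,通常用cordon和drain完成,节点状态里的SchedulingDisabled就是这一步的效果。示意命令如下,参数可按需调整:
bash
# 1.标记节点不可调度,对应Ready,SchedulingDisabled状态
kubectl cordon worker233

# 2.驱逐该节点上的Pod;DaemonSet管理的Pod无法驱逐需要忽略,emptyDir数据会被删除需要显式确认
kubectl drain worker233 --ignore-daemonsets --delete-emptydir-data --force

# 3.如果放弃下线,可以恢复调度
kubectl uncordon worker233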
扩容

e6f99ce9b2cad86d42e0fa343c5ac6eb_720

bash
- kubelet首次加入集群bootstrap原理图解 参考链接: https://kubernetes.io/zh-cn/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/ - k8s集群的扩容实战案例 1.扩容流程 A.创建token; B.待加入节点安装docker,kubectl,kubeadm,kubelet等相关组件; C.kubeadm join即可 2.实战案例 2.1 master节点创建token [root@master231 18-scheduler]# kubeadm token create oldboy.yinzhengjiejason --print-join-command --ttl 0 kubeadm join 10.1.24.13:6443 --token oldboy.yinzhengjiejason --discovery-token-ca-cert-hash sha256:0a3262eddefaeda176117e50757807ebc5a219648bda6afad21bc1d5f921caca [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubeadm token list TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS oldboy.yinzhengjiejason <forever> <never> authentication,signing <none> system:bootstrappers:kubeadm:default-node-token [root@master231 18-scheduler]# 2.2 客户端安装相关软件包 wget http://192.168.21.253/Resources/Docker/scripts/weixiang-autoinstall-docker-docker-compose.tar.gz tar xf weixiang-autoinstall-docker-docker-compose.tar.gz ./install-docker.sh i wget http://192.168.21.253/Resources/Kubernetes/softwares/k8s-v1.23.17.tar.gz tar xf k8s-v1.23.17.tar.gz cd var/cache/apt/archives/ && dpkg -i *.deb 主要安装docker,kubectl,kubeadm,kubelet等相关组件。 2.3 待加入节点假如集群【使用第一步生成的token】 确认数据已删除,没有删除用以下命令 [root@worker233 ~]# kubeadm reset -f [root@worker233 ~]# kubeadm join 10.1.24.13:6443 --token oldboy.yinzhengjiejason --discovery-token-ca-cert-hash sha256:0a3262eddefaeda176117e50757807ebc5a219648bda6afad21bc1d5f921caca [preflight] Running pre-flight checks [preflight] Reading configuration from the cluster... [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml' W0722 16:54:18.692995 335932 utils.go:69] The recommended value for "resolvConf" in "KubeletConfiguration" is: /run/systemd/resolve/resolv.conf; the provided value is: /run/systemd/resolve/resolv.conf [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml" [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env" [kubelet-start] Starting the kubelet [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap... This node has joined the cluster: * Certificate signing request was sent to apiserver and a response was received. * The Kubelet was informed of the new secure connection details. Run 'kubectl get nodes' on the control-plane to see this node join the cluster. 
[root@worker233 ~]# 2.4 master节点验证 [root@master231 18-scheduler]# kubectl get nodes NAME STATUS ROLES AGE VERSION master231 Ready control-plane,master 13d v1.23.17 worker232 Ready <none> 13d v1.23.17 worker233 Ready <none> 29s v1.23.17 [root@master231 18-scheduler]# 2.5 验证客户端证书信息 [root@worker233 ~]# ll /var/lib/kubelet total 48 drwxr-xr-x 8 root root 4096 Jul 22 16:54 ./ drwxr-xr-x 65 root root 4096 Jul 15 15:06 ../ -rw-r--r-- 1 root root 1016 Jul 22 16:54 config.yaml -rw------- 1 root root 62 Jul 22 16:54 cpu_manager_state drwxr-xr-x 2 root root 4096 Jul 22 16:54 device-plugins/ -rw-r--r-- 1 root root 122 Jul 22 16:54 kubeadm-flags.env -rw------- 1 root root 61 Jul 22 16:54 memory_manager_state drwxr-xr-x 2 root root 4096 Jul 22 16:54 pki/ drwxr-x--- 2 root root 4096 Jul 22 16:54 plugins/ drwxr-x--- 2 root root 4096 Jul 22 16:54 plugins_registry/ drwxr-x--- 2 root root 4096 Jul 22 16:54 pod-resources/ drwxr-x--- 8 root root 4096 Jul 22 16:54 pods/ [root@worker233 ~]# [root@worker233 ~]# ll /var/lib/kubelet/pki/ total 20 drwxr-xr-x 2 root root 4096 Jul 22 16:54 ./ drwxr-xr-x 8 root root 4096 Jul 22 16:54 ../ -rw------- 1 root root 1114 Jul 22 16:54 kubelet-client-2025-07-22-16-54-19.pem lrwxrwxrwx 1 root root 59 Jul 22 16:54 kubelet-client-current.pem -> /var/lib/kubelet/pki/kubelet-client-2025-07-22-16-54-19.pem -rw-r--r-- 1 root root 2258 Jul 22 16:54 kubelet.crt -rw------- 1 root root 1675 Jul 22 16:54 kubelet.key [root@worker233 ~]# 2.6 master节点查看证书签发请求 [root@master231 18-scheduler]# kubectl get csr NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION csr-szn6d 83s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:oldboy <none> Approved,Issued [root@master231 18-scheduler]# [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get csr csr-szn6d -o yaml apiVersion: certificates.k8s.io/v1 kind: CertificateSigningRequest metadata: creationTimestamp: "2025-07-22T08:54:19Z" generateName: csr- name: csr-szn6d resourceVersion: "216756" uid: 1e600d47-0c23-4736-8322-ec00a5881a06 spec: groups: - system:bootstrappers - system:bootstrappers:kubeadm:default-node-token - system:authenticated request: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURSBSRVFVRVNULS0tLS0KTUlIeU1JR1pBZ0VBTURjeEZUQVRCZ05WQkFvVERITjVjM1JsYlRwdWIyUmxjekVlTUJ3R0ExVUVBeE1WYzNsegpkR1Z0T201dlpHVTZkMjl5YTJWeU1qTXpNRmt3RXdZSEtvWkl6ajBDQVFZSUtvWkl6ajBEQVFjRFFnQUU1QitECi9QeG5qZkVnVWVJZi94NlMzTWVuYzQyZ3RQZkRHZzIzMmpPMEIvMGR1d2k5QlVBNW1hcGFlL3hmUUlKV1kzdHQKTmNnRW54RzZZRDVzR1B6alVhQUFNQW9HQ0NxR1NNNDlCQU1DQTBnQU1FVUNJUUNoSTRPbTNueTVLRjU5K3BDSApNaENhREQwQ1F2MFMyVDU5T0owNHowNEQwZ0lnRE5JN1RmZTF3QmFlbjRsSSt3VmZZeGxkSE9yUFRsaFlpcWZXCnBkWXRjeGM9Ci0tLS0tRU5EIENFUlRJRklDQVRFIFJFUVVFU1QtLS0tLQo= signerName: kubernetes.io/kube-apiserver-client-kubelet usages: - digital signature - key encipherment - client auth username: system:bootstrap:oldboy status: certificate: 
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUNZakNDQVVxZ0F3SUJBZ0lSQU16dFFHeEdJVU5HN2kzWThMNkFFQXd3RFFZSktvWklodmNOQVFFTEJRQXcKRlRFVE1CRUdBMVVFQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TlRBM01qSXdPRFE1TVRsYUZ3MHlOakEzTWpJdwpPRFE1TVRsYU1EY3hGVEFUQmdOVkJBb1RESE41YzNSbGJUcHViMlJsY3pFZU1Cd0dBMVVFQXhNVmMzbHpkR1Z0Ck9tNXZaR1U2ZDI5eWEyVnlNak16TUZrd0V3WUhLb1pJemowQ0FRWUlLb1pJemowREFRY0RRZ0FFNUIrRC9QeG4KamZFZ1VlSWYveDZTM01lbmM0Mmd0UGZER2cyMzJqTzBCLzBkdXdpOUJVQTVtYXBhZS94ZlFJSldZM3R0TmNnRQpueEc2WUQ1c0dQempVYU5XTUZRd0RnWURWUjBQQVFIL0JBUURBZ1dnTUJNR0ExVWRKUVFNTUFvR0NDc0dBUVVGCkJ3TUNNQXdHQTFVZEV3RUIvd1FDTUFBd0h3WURWUjBqQkJnd0ZvQVVPVGxjOFJGWlV3b3pOdHBMaDhmd1BwNHkKZCtZd0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFJVGh2cDAzcS9zK2hSZkdFaCtVK0xmd2JtSU9lZkxCOVZYZAp0QW1zZnVCMmF4VVdOOWU2MCsxVzJMdHRYVzV6RC9FS1NpcFNlYUtBRGI5aTJvWDVka3RqVzRKMVNscC8wallPClhIWmh0VWUrcndOdk53ZXdBMHFsbVY0YTlZZnpqSUo5bmlhazh5NEhlamNpczdHZ1BsaFJKSCswL0ZqcGdja0sKa3BFNTlEV0MxVE1KRU1XNVdlRUlEcHpUK01oVW9yZkhnV1JCaTFvMTRxN0c1SHdOcmJ1VWNMc0pkYTV1a0x5WQpIbkRBNWJRRzFlQ21WYTZHUkVQQnBESTM3bVh2TVN5Wm1Ud3RoSE1UbzJuSHI3eXpSS1ZRbnI5aWttTk51WUYrCnJTUEJ2Snc5QTk1YUdKN3JkR3ZTSnk5cmVvalFCUm1UZ1pnVitzQmZ2aXVhNVlqSG5kQT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo= conditions: - lastTransitionTime: "2025-07-22T08:54:19Z" lastUpdateTime: "2025-07-22T08:54:19Z" message: Auto approving kubelet client certificate after SubjectAccessReview. reason: AutoApproved status: "True" type: Approved [root@master231 18-scheduler]# 2.8 移除token [root@master231 18-scheduler]# kubeadm token list TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS oldboy.yinzhengjiejason <forever> <never> authentication,signing <none> system:bootstrappers:kubeadm:default-node-token [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubeadm token delete oldboy bootstrap token "oldboy" deleted [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubeadm token list [root@master231 18-scheduler]# [root@master231 18-scheduler]# [root@master231 18-scheduler]# kubectl get secrets -n kube-system bootstrap-token-oldboy Error from server (NotFound): secrets "bootstrap-token-oldboy" not found [root@master231 18-scheduler]# [root@master231 18-scheduler]# - 今日作业: - 完成课堂的所有练习并整理思维导图; - 将nginx和tomcat案例部署到k8s集群,将tomcat部署到master231和worker233节点,将nginx部署到worker232节点; - 扩展作业: - 部署Loki+Grafana案例日志采集方案到K8S集群。 镜像参考: http://192.168.21.253/Resources/Kubernetes/Case-Demo/Loki/ http://192.168.21.253/Resources/Kubernetes/Case-Demo/Grafana/
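扩容时如果只记录了token而没有保存--discovery-token-ca-cert-hash,可以在master节点基于CA证书重新计算出来,下面是常见的计算方式(示意):
bash
# 基于CA公钥计算sha256哈希,结果填到kubeadm join的--discovery-token-ca-cert-hash sha256:<哈希>参数中
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | \
  openssl rsa -pubin -outform der 2>/dev/null | \
  openssl dgst -sha256 -hex | sed 's/^.* //'

# 也可以直接重新生成一条完整的join命令
kubeadm token create --print-join-command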

21、地址池被耗尽问题

bash
# 查看MetalLB 配置 [root@master231 ~/count/inCluster]#kubectl get ipaddresspools -n metallb-system NAME AUTO ASSIGN AVOID BUGGY IPS ADDRESSES jasonyin2020 true false ["106.55.44.37/32","43.139.47.66/32","43.139.77.96/32"] [root@master231 ~/count/inCluster]#kubectl edit ipaddresspool jasonyin2020 -n metallb-system # Please edit the object below. Lines beginning with a '#' will be ignored, # and an empty file will abort the edit. If an error occurs while saving this file will be # reopened with the relevant failures. # apiVersion: metallb.io/v1beta1 kind: IPAddressPool metadata: annotations: kubectl.kubernetes.io/last-applied-configuration: | {"apiVersion":"metallb.io/v1beta1","kind":"IPAddressPool","metadata":{"annotations":{},"name":"jasonyin2020","namespace":"metallb-system"},"spec":{"addresses":["106.55.44.37/32"]}} creationTimestamp: "2025-07-22T02:05:49Z" generation: 2 name: jasonyin2020 namespace: metallb-system resourceVersion: "1283509" uid: 274d9617-1228-46b5-905a-b8438a5c10a1 spec: addresses: - 106.55.44.37/32 - 43.139.47.66/32 # 分配了三个公网 - 43.139.77.96/32 autoAssign: true avoidBuggyIPs: false status: assignedIPv4: 3 # 已经用了三个 assignedIPv6: 0 availableIPv4: 0 availableIPv6: 0 # 查看 [root@master231 ~/count/inCluster]#kubectl get service -A | grep LoadBalancer default svc-kibana LoadBalancer 10.200.191.241 <pending> 5601:25713/TCP 34m default svc-xiuxian LoadBalancer 10.200.34.15 106.55.44.37 80:15188/TCP 2d18h default xixixix LoadBalancer 10.200.201.112 43.139.77.96 80:25934/TCP 19h kubernetes-dashboard kubernetes-dashboard LoadBalancer 10.200.180.79 43.139.47.66 443:8443/TCP # 修改 kubectl edit service svc-xiuxian -n default 在打开的编辑器中,找到 type: LoadBalancer,修改为 type: NodePort,然后保存退出。 # 验证 kubectl get svc svc-kibana -n default
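除了把部分Service改回NodePort释放IP,也可以直接给IPAddressPool追加可分配地址。下面是一个示意(其中新增的192.168.10.240/29为假设的地址段,需要替换成自己实际可用的IP):
bash
# merge方式patch会整体替换addresses列表,因此要把原有地址一并写上
kubectl -n metallb-system patch ipaddresspool jasonyin2020 --type merge \
  -p '{"spec":{"addresses":["106.55.44.37/32","43.139.47.66/32","43.139.77.96/32","192.168.10.240/29"]}}'

# 确认可分配地址数量是否增加
kubectl -n metallb-system get ipaddresspool jasonyin2020 -o yaml | grep -i available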

22、Pod创建流程详解

e50d19adc82faaac6706a1f207320c14_720

bash
Pod的创建,删除,修改流程: 1.执行kubectl命令时会加载"~/.kube/config",从而识别到apiserver的地址,端口及认证证书; 2.apiserver进行证书认证,鉴权,语法检查,若成功则可以进行数据的读取或者写入; 3.若用户是写入操作(创建,修改,删除)则需要修改etcd数据库的信息; 4.如果创建Pod,此时scheduler负责Pod调度,将Pod调度到合适的worker节点,并将结果返回给ApiServer,由apiServer负责存储到etcd中; 5.kubelet组件会周期性上报给apiServer节点,包括Pod内的容器资源(cpu,memory,disk,gpu,...)及worker宿主机节点状态,apiServer并将结果存储到etcd中,若有该节点的任务也会直接返回给该节点进行调度; 6.kubelet开始调用CRI接口创建容器(依次创建pause,initContainers,containers); 7.在运行过程中,若Pod容器,正常或者异常退出时,kubelet会根据重启策略是否重启容器(Never,Always,OnFailure); 8.若一个节点挂掉,则需要controller manager介入维护,比如Pod副本数量缺失,则需要创建watch事件,要求控制器的副本数要达到标准,从而要创建新的Pod,此过程重复步骤4-6。
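上述流程可以通过事件(events)直观地观察到scheduler、kubelet各自的动作,下面是一个示意(镜像沿用前文harbor仓库中的修仙镜像):
bash
# 创建一个测试Pod
kubectl run test-flow --image=harbor250.weixiang.com/weixiang-xiuxian/apps:v1

# 查看调度与启动过程中的事件:Scheduled、Pulling/Pulled、Created、Started等
kubectl describe pod test-flow | grep -A 10 Events
kubectl get events --sort-by='.lastTimestamp' | tail -10

# 清理测试Pod
kubectl delete pod test-flow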

23、项目

1、EFK架构分析k8s集群日志
bash
1.k8s日志采集方案 边车模式(sidecar): 可以在原有的容器基础上添加一个新的容器,新的容器成为边车容器,该容器可以负责日志采集,监控,流量代理等功能。 优点: 不需要修改原有的架构,就可以实现新的功能。 缺点: 相对来说比较消耗资源。 守护进程(ds): 每个工作节点仅有一个pod。 优点: 相对边车模式更加节省资源。 缺点: 需要学习K8S的RBAC认证体系。 产品内置: 说白了缺啥功能直接让开发实现即可。 优点: 运维人员省事,无需安装任何组件。 缺点: 推动较慢,因为大多数开发都是业务开发。要么就需要单独的运维开发人员来解决。

边车模式
bash
2.部署准备环境ES和kibana 2.1 编写资源清单 [root@master231 inCluster]# cat 01-deploy-es-kibana.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-es-kibana spec: replicas: 1 selector: matchLabels: apps: elasticstack template: metadata: labels: apps: elasticstack spec: containers: - name: es # image: docker.elastic.co/elasticsearch/elasticsearch:7.17.25 image: harbor250.weixiang.com/weixiang-elasticstack/elasticsearch:7.17.25 ports: - containerPort: 9200 name: http - containerPort: 9300 name: tcp env: - name: discovery.type value: "single-node" - name: node.name value: "weixiang98-es" - name: cluster.name value: "weixiang-weixiang98-single" - name: ES_JAVA_OPTS value: "-Xms512m -Xmx512m" - name: kibana # image: docker.elastic.co/kibana/kibana:7.17.25 image: harbor250.weixiang.com/weixiang-elasticstack/kibana:7.17.25 ports: - containerPort: 5601 name: webui env: - name: ELASTICSEARCH_HOSTS value: http://127.0.0.1:9200 - name: I18N_LOCALE value: "zh-CN" [root@master231 inCluster]# 2.2 创建资源 [root@master231 inCluster]# kubectl apply -f 01-deploy-es-kibana.yaml deployment.apps/deploy-es-kibana created [root@master231 inCluster]# [root@master231 inCluster]# kubectl get pods -o wide -l apps=elasticstack NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-es-kibana-f46f6d948-xwjfv 2/2 Running 0 41s 10.100.2.22 worker233 <none> <none> [root@master231 inCluster]# 2.3 编写Service资源暴露kibana [root@master231 inCluster]# cat 02-svc-elasticstack.yaml apiVersion: v1 kind: Service metadata: name: svc-kibana spec: ports: - port: 5601 selector: apps: elasticstack type: LoadBalancer [root@master231 inCluster]# 2.4 创建svc资源 [root@master231 inCluster]# kubectl apply -f 02-svc-elasticstack.yaml service/svc-kibana created [root@master231 inCluster]# [root@master231 inCluster]# kubectl get svc svc-kibana NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc-kibana LoadBalancer 10.200.195.114 10.0.0.152 5601:10282/TCP 5s [root@master231 inCluster]# 2.5 验证测试 http://10.0.0.152:5601/ - 基于sidecar模式实现日志采集 1.编写资源清单 [root@master231 inCluster]# cat 03-deploy-xiuxian.yaml apiVersion: v1 kind: Service metadata: name: svc-es spec: ports: - port: 9200 selector: apps: elasticstack --- apiVersion: v1 kind: ConfigMap metadata: name: cm-filebeat data: main: | filebeat.config.modules: path: ${path.config}/modules.d/*.yml reload.enabled: true output.elasticsearch: hosts: - svc-es:9200 index: "weixiang98-k8s-modeules-nginx-accesslog-%{+yyyy.MM.dd}" setup.ilm.enabled: false setup.template.name: "weixiang-weixiang98" setup.template.pattern: "weixiang98*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 3 index.number_of_replicas: 0 nginx.yml: | - module: nginx access: enabled: true var.paths: ["/data/access.log"] error: enabled: false var.paths: ["/data/error.log"] ingress_controller: enabled: false --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: volumes: - name: data emptyDir: {} - name: main configMap: name: cm-filebeat items: - key: main path: modules-to-es.yaml - key: nginx.yml path: nginx.yml containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 volumeMounts: - name: data mountPath: /var/log/nginx - name: c2 image: harbor250.weixiang.com/weixiang-elasticstack/filebeat:7.17.25 volumeMounts: - name: data mountPath: /data - name: main mountPath: /config/modules-to-es.yaml subPath: modules-to-es.yaml - name: main mountPath: 
/usr/share/filebeat/modules.d/nginx.yml subPath: nginx.yml command: - /bin/bash - -c - "filebeat -e -c /config/modules-to-es.yaml --path.data /tmp/xixi" [root@master231 inCluster]# 2.测试验证 [root@master231 inCluster]# kubectl apply -f 03-deploy-xiuxian.yaml service/svc-es created configmap/cm-filebeat created deployment.apps/deploy-xiuxian created [root@master231 inCluster]# [root@master231 inCluster]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-es-kibana-5b999c888d-lxbkx 2/2 Running 0 3h4m 10.100.2.24 worker233 <none> <none> deploy-xiuxian-f977d4d55-ft2mb 2/2 Running 0 3s 10.100.2.32 worker233 <none> <none> deploy-xiuxian-f977d4d55-g5mpl 2/2 Running 0 3s 10.100.2.33 worker233 <none> <none> deploy-xiuxian-f977d4d55-r84n5 2/2 Running 0 3s 10.100.1.185 worker232 <none> <none> [root@master231 inCluster]# [root@master231 inCluster]# [root@master231 inCluster]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-es-kibana-5b999c888d-lxbkx 2/2 Running 0 3h4m 10.100.2.24 worker233 <none> <none> deploy-xiuxian-f977d4d55-ft2mb 2/2 Running 0 7s 10.100.2.32 worker233 <none> <none> deploy-xiuxian-f977d4d55-g5mpl 2/2 Running 0 7s 10.100.2.33 worker233 <none> <none> deploy-xiuxian-f977d4d55-r84n5 2/2 Running 0 7s 10.100.1.185 worker232 <none> <none> [root@master231 inCluster]# [root@master231 inCluster]# [root@master231 inCluster]# curl 10.100.2.32 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v3</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: pink">凡人修仙传 v3 </h1> <div> <img src="3.jpg"> <div> </body> </html> [root@master231 inCluster]# [root@master231 inCluster]# curl 10.100.2.33 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v3</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: pink">凡人修仙传 v3 </h1> <div> <img src="3.jpg"> <div> </body> </html> [root@master231 inCluster]# [root@master231 inCluster]# curl 10.100.1.185 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v3</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: pink">凡人修仙传 v3 </h1> <div> <img src="3.jpg"> <div> </body> </html> [root@master231 inCluster]# [root@master231 inCluster]# curl 10.100.1.185/weixiang.html <html> <head><title>404 Not Found</title></head> <body> <center><h1>404 Not Found</h1></center> <hr><center>nginx/1.20.1</center> </body> </html> [root@master231 inCluster]# 3.kibana查看数据 略,见视频。 4.删除环境 [root@master231 inCluster]# kubectl delete -f 03-deploy-xiuxian.yaml service "svc-es" deleted configmap "cm-filebeat" deleted deployment.apps "deploy-xiuxian" deleted [root@master231 inCluster]#
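除了在kibana里查看,也可以直接调用ES的_cat接口确认索引是否生成,下面是一个示意(索引前缀取自上面清单中的index配置;如果es容器里没有curl,也可以在master节点直接访问ES所在Pod的9200端口):
bash
# 进入es容器确认索引是否创建、文档数是否增长
kubectl exec deploy/deploy-es-kibana -c es -- \
  curl -s http://127.0.0.1:9200/_cat/indices?v | grep weixiang98-k8s-modeules-nginx

# 访问一次业务Pod制造访问日志,再重复上面的命令观察docs.count变化
curl -s 10.100.2.32 >/dev/null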
守护进程ds控制器采集日志
bash
1.编写资源清单 [root@master231 inCluster]# cat 04-deploy-ds-filebeat.yaml apiVersion: v1 kind: Service metadata: name: svc-es spec: ports: - port: 9200 selector: apps: elasticstack --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 --- apiVersion: v1 kind: ServiceAccount metadata: name: filebeat --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: filebeat subjects: - kind: ServiceAccount name: filebeat namespace: default roleRef: kind: ClusterRole name: filebeat apiGroup: rbac.authorization.k8s.io --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: filebeat labels: k8s-app: filebeat rules: - apiGroups: [""] resources: - namespaces - pods - nodes verbs: - get - watch - list --- apiVersion: v1 kind: ConfigMap metadata: name: filebeat-config data: filebeat.yml: |- filebeat.config: inputs: path: ${path.config}/inputs.d/*.yml reload.enabled: true modules: path: ${path.config}/modules.d/*.yml reload.enabled: true output.elasticsearch: hosts: ['http://svc-es:9200'] # hosts: ['svc-es.default.svc.weixiang.com:9200'] index: 'weixiang-k8s-ds-%{+yyyy.MM.dd}' # 配置索引模板 setup.ilm.enabled: false setup.template.name: "weixiang-k8s-ds" setup.template.pattern: "weixiang-k8s-ds*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 5 index.number_of_replicas: 0 --- apiVersion: v1 kind: ConfigMap metadata: name: filebeat-inputs data: kubernetes.yml: |- - type: docker containers.ids: - "*" processors: - add_kubernetes_metadata: in_cluster: true --- apiVersion: apps/v1 kind: DaemonSet metadata: name: filebeat spec: selector: matchLabels: k8s-app: filebeat template: metadata: labels: k8s-app: filebeat spec: tolerations: - key: node-role.kubernetes.io/master effect: NoSchedule operator: Exists serviceAccountName: filebeat terminationGracePeriodSeconds: 30 containers: - name: filebeat image: harbor250.weixiang.com/weixiang-elasticstack/filebeat:7.17.25 args: [ "-c", "/etc/filebeat.yml", "-e", ] securityContext: runAsUser: 0 resources: limits: memory: 200Mi requests: cpu: 100m memory: 100Mi volumeMounts: - name: config mountPath: /etc/filebeat.yml readOnly: true subPath: filebeat.yml - name: inputs mountPath: /usr/share/filebeat/inputs.d readOnly: true - name: varlibdockercontainers mountPath: /var/lib/docker/containers readOnly: true volumes: - name: config configMap: defaultMode: 0600 name: filebeat-config - name: varlibdockercontainers hostPath: path: /var/lib/docker/containers - name: inputs configMap: defaultMode: 0600 name: filebeat-inputs [root@master231 inCluster]# 2.创建资源 [root@master231 inCluster]# [root@master231 inCluster]# kubectl apply -f 04-deploy-ds-filebeat.yaml service/svc-es created deployment.apps/deploy-xiuxian created serviceaccount/filebeat created clusterrolebinding.rbac.authorization.k8s.io/filebeat created clusterrole.rbac.authorization.k8s.io/filebeat created configmap/filebeat-config created configmap/filebeat-inputs created daemonset.apps/filebeat created [root@master231 inCluster]# [root@master231 inCluster]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-es-kibana-5b999c888d-lxbkx 2/2 Running 0 4h4m 10.100.2.24 worker233 <none> <none> deploy-xiuxian-8445b8c95b-4bwl4 1/1 Running 0 9s 10.100.2.37 worker233 <none> <none> deploy-xiuxian-8445b8c95b-lc77c 1/1 Running 0 
9s 10.100.1.189 worker232 <none> <none> deploy-xiuxian-8445b8c95b-xgc9r 1/1 Running 0 9s 10.100.2.36 worker233 <none> <none> filebeat-hthsr 1/1 Running 0 9s 10.100.0.39 master231 <none> <none> filebeat-pr2tk 1/1 Running 0 9s 10.100.2.38 worker233 <none> <none> filebeat-s82t4 1/1 Running 0 9s 10.100.1.190 worker232 <none> <none> [root@master231 inCluster]# 3.访问测试 [root@master231 inCluster]# curl 10.100.1.189 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v3</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: pink">凡人修仙传 v3 </h1> <div> <img src="3.jpg"> <div> </body> </html> [root@master231 inCluster]# [root@master231 inCluster]# curl 10.100.2.37/weixiang.html <html> <head><title>404 Not Found</title></head> <body> <center><h1>404 Not Found</h1></center> <hr><center>nginx/1.20.1</center> </body> </html> [root@master231 inCluster]# 4.kibana查询数据 参考KQL: kubernetes.pod.name : "deploy-xiuxian-8445b8c95b-lc77c" or kubernetes.pod.name: "deploy-xiuxian-8445b8c95b-4bwl4" and kubernetes.namespace : "default" 5.删除资源 [root@master231 inCluster]# kubectl delete -f 04-deploy-ds-filebeat.yaml service "svc-es" deleted deployment.apps "deploy-xiuxian" deleted serviceaccount "filebeat" deleted clusterrolebinding.rbac.authorization.k8s.io "filebeat" deleted clusterrole.rbac.authorization.k8s.io "filebeat" deleted configmap "filebeat-config" deleted configmap "filebeat-inputs" deleted daemonset.apps "filebeat" deleted [root@master231 inCluster]#
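如果kibana里查不到数据,可以先看filebeat自身的日志和RBAC授权是否正常,示意如下:
bash
# 查看DaemonSet中某个filebeat Pod的日志,重点关注连接ES失败、权限报错等关键字
kubectl logs ds/filebeat --tail=50

# 确认ServiceAccount是否有list pods的权限(add_kubernetes_metadata依赖该权限)
kubectl auth can-i list pods --as=system:serviceaccount:default:filebeat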
EFK架构存储到ES加密集群
bash
将ES部署到K8S集群外部,且要求ES集群支持https协议,并使用api-key方式进行认证写入 1.启动ES集群 [root@master231 inCluster]# curl https://10.0.0.91:9200/_cat/nodes -k -u elastic:123456 10.0.0.93 79 46 64 1.75 0.71 0.26 cdfhilmrstw * elk93 10.0.0.92 71 60 63 3.14 1.19 0.43 cdfhilmrstw - elk92 10.0.0.91 76 69 72 4.07 1.70 0.63 cdfhilmrstw - elk91 [root@master231 inCluster]# 2.kibana创建api-key http://10.0.0.91:5601/ 复制你自己的api-key,而后基于api-key认证即可: vpVtNpgB2uL8EiIjlFY4:ZREiR0aERCGNxZJmy49SDg 3.无需将ES部署到K8S集群内部 [root@master231 inCluster]# kubectl delete -f 01-deploy-es-kibana.yaml deployment.apps "deploy-es-kibana" deleted [root@master231 inCluster]# [root@master231 inCluster]# kubectl get pods -o wide No resources found in default namespace. [root@master231 inCluster]# 4.编写资源清单 [root@master231 inCluster]# cat 05-deploy-ds-ep-filebeat.yaml apiVersion: v1 kind: Endpoints metadata: name: svc-es subsets: - addresses: - ip: 10.0.0.91 - ip: 10.0.0.92 - ip: 10.0.0.93 ports: - port: 9200 name: https - port: 9300 name: tcp --- apiVersion: v1 kind: Service metadata: name: svc-es spec: type: ClusterIP ports: - port: 9200 name: https - port: 9300 name: tcp --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v3 --- apiVersion: v1 kind: ServiceAccount metadata: name: filebeat --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: filebeat subjects: - kind: ServiceAccount name: filebeat namespace: default roleRef: kind: ClusterRole name: filebeat apiGroup: rbac.authorization.k8s.io --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: filebeat labels: k8s-app: filebeat rules: - apiGroups: [""] resources: - namespaces - pods - nodes verbs: - get - watch - list --- apiVersion: v1 kind: ConfigMap metadata: name: filebeat-config data: filebeat.yml: |- filebeat.config: inputs: path: ${path.config}/inputs.d/*.yml reload.enabled: true modules: path: ${path.config}/modules.d/*.yml reload.enabled: true output.elasticsearch: hosts: ['https://svc-es:9200'] index: 'weixiang-k8s-ds-ep-%{+yyyy.MM.dd}' api_key: "vpVtNpgB2uL8EiIjlFY4:ZREiR0aERCGNxZJmy49SDg" # 跳过证书校验 ssl.verification_mode: none # 配置索引模板 setup.ilm.enabled: false setup.template.name: "weixiang-k8s-ds" setup.template.pattern: "weixiang-k8s-ds*" setup.template.overwrite: false setup.template.settings: index.number_of_shards: 5 index.number_of_replicas: 0 --- apiVersion: v1 kind: ConfigMap metadata: name: filebeat-inputs data: kubernetes.yml: |- - type: docker containers.ids: - "*" processors: - add_kubernetes_metadata: in_cluster: true --- apiVersion: apps/v1 kind: DaemonSet metadata: name: filebeat spec: selector: matchLabels: k8s-app: filebeat template: metadata: labels: k8s-app: filebeat spec: tolerations: - key: node-role.kubernetes.io/master effect: NoSchedule operator: Exists serviceAccountName: filebeat terminationGracePeriodSeconds: 30 containers: - name: filebeat image: harbor250.weixiang.com/weixiang-elasticstack/filebeat:7.17.25 args: [ "-c", "/etc/filebeat.yml", "-e", ] securityContext: runAsUser: 0 resources: limits: memory: 200Mi requests: cpu: 100m memory: 100Mi volumeMounts: - name: config mountPath: /etc/filebeat.yml readOnly: true subPath: filebeat.yml - name: inputs mountPath: /usr/share/filebeat/inputs.d readOnly: true - name: varlibdockercontainers mountPath: /var/lib/docker/containers readOnly: true volumes: - name: config configMap: 
defaultMode: 0600 name: filebeat-config - name: varlibdockercontainers hostPath: path: /var/lib/docker/containers - name: inputs configMap: defaultMode: 0600 name: filebeat-inputs [root@master231 inCluster]# [root@master231 inCluster]# wc -l 05-deploy-ds-ep-filebeat.yaml 198 05-deploy-ds-ep-filebeat.yaml [root@master231 inCluster]# 5.测试验证 [root@master231 inCluster]# kubectl apply -f 05-deploy-ds-ep-filebeat.yaml endpoints/svc-es created service/svc-es created deployment.apps/deploy-xiuxian created serviceaccount/filebeat created clusterrolebinding.rbac.authorization.k8s.io/filebeat created clusterrole.rbac.authorization.k8s.io/filebeat created configmap/filebeat-config created configmap/filebeat-inputs created daemonset.apps/filebeat created [root@master231 inCluster]# [root@master231 inCluster]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-8445b8c95b-9v2t4 1/1 Running 0 6s 10.100.2.48 worker233 <none> <none> deploy-xiuxian-8445b8c95b-d5xgw 1/1 Running 0 6s 10.100.1.197 worker232 <none> <none> deploy-xiuxian-8445b8c95b-kspm8 1/1 Running 0 6s 10.100.2.49 worker233 <none> <none> filebeat-ck75m 1/1 Running 0 5s 10.100.0.43 master231 <none> <none> filebeat-dgjvz 1/1 Running 0 5s 10.100.1.198 worker232 <none> <none> filebeat-thzkw 1/1 Running 0 5s 10.100.2.50 worker233 <none> <none> [root@master231 inCluster]# [root@master231 inCluster]# curl 10.100.1.197 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v3</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: pink">凡人修仙传 v3 </h1> <div> <img src="3.jpg"> <div> </body> </html> [root@master231 inCluster]# [root@master231 inCluster]# curl 10.100.1.197/weixiang98.html <html> <head><title>404 Not Found</title></head> <body> <center><h1>404 Not Found</h1></center> <hr><center>nginx/1.20.1</center> </body> </html> [root@master231 inCluster]# 6.kibana查看数据 关键字段: kubernetes.namespace kubernetes.pod.name kubernetes.pod.ip message 7.删除资源 [root@master231 inCluster]# kubectl delete -f 05-deploy-ds-ep-filebeat.yaml endpoints "svc-es" deleted service "svc-es" deleted deployment.apps "deploy-xiuxian" deleted serviceaccount "filebeat" deleted clusterrolebinding.rbac.authorization.k8s.io "filebeat" deleted clusterrole.rbac.authorization.k8s.io "filebeat" deleted configmap "filebeat-config" deleted configmap "filebeat-inputs" deleted daemonset.apps "filebeat" deleted [root@master231 inCluster]#
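filebeat配置里的api_key是"id:api_key"格式,如果想在命令行手动验证这个key是否可用,可以把它整体做base64编码后放到Authorization请求头里直接访问ES(示意,key沿用上文的示例值):
bash
# 1.将 id:api_key 整体base64编码
echo -n "vpVtNpgB2uL8EiIjlFY4:ZREiR0aERCGNxZJmy49SDg" | base64 -w0

# 2.用编码结果访问https的ES集群,-k跳过自签证书校验
curl -k -H "Authorization: ApiKey <上一步输出的base64字符串>" https://10.0.0.91:9200/_cat/indices?v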
2、jenkins集成K8S集群实现CICD

8ab9e619186ff0d8b33eb0184e30be57

1、部署jenkins
bash
- 部署jenkins环境 1.准备环境 10.0.0.211 jenkins211 CPU : 1C 内存: 2G DISK: 50G 2.安装JDK 2.1 下载jdk [root@jenkins211 ~]# wget http://192.168.21.253/Resources/Kubernetes/Project/DevOps/Jenkins/jdk-17_linux-x64_bin.tar.gz 2.2 解压软件包 [root@jenkins211 ~]# tar xf jdk-17_linux-x64_bin.tar.gz -C /usr/local/ 2.3 配置环境变量 [root@jenkins211 ~]# cat /etc/profile.d/jdk.sh #!/bin/bash export JAVA_HOME=/usr/local/jdk-17.0.8 export PATH=$PATH:$JAVA_HOME/bin [root@jenkins211 ~]# [root@jenkins211 ~]# source /etc/profile.d/jdk.sh [root@jenkins211 ~]# [root@jenkins211 ~]# java -version java version "17.0.8" 2023-07-18 LTS Java(TM) SE Runtime Environment (build 17.0.8+9-LTS-211) Java HotSpot(TM) 64-Bit Server VM (build 17.0.8+9-LTS-211, mixed mode, sharing) [root@jenkins211 ~]# 3.安装jenkins 2.0 安装jenkins的依赖 [root@jenkins211 ~]# apt install fontconfig 2.1 下载jenkins [root@jenkins211 ~]# wget http://192.168.21.253/Resources/Kubernetes/Project/DevOps/Jenkins/jenkins-v2.479.3/jenkins_2.479.3_all.deb 2.2 安装jenkins [root@jenkins211 ~]# dpkg -i jenkins_2.479.3_all.deb 2.3 修改jenkins的启动脚本 [root@jenkins211 ~]# vim /lib/systemd/system/jenkins.service ... 36 #User=jenkins 37 #Group=jenkins 38 39 User=root 40 Group=root 41 # Directory where Jenkins stores its configuration and workspaces 42 Environment="JENKINS_HOME=/var/lib/jenkins" 43 Environment="JAVA_HOME=/usr/local/jdk-17.0.8" 2.4 启动jenkins [root@jenkins211 ~]# systemctl daemon-reload [root@jenkins211 ~]# [root@jenkins211 ~]# systemctl restart jenkins.service [root@jenkins211 ~]# [root@jenkins211 ~]# ss -nlt | grep 8080 LISTEN 0 50 *:8080 *:* [root@jenkins211 ~]# 2.5 访问jenkins的WebUI http://118.89.55.174:8080/

0eb25225aa526c899d22aa6b9f47973a_720

ac3c37da49c4042682d0fa4abe344f82

bash
2.6 基于密码访问登录 [root@jenkins211 ~]# cat /var/lib/jenkins/secrets/initialAdminPassword 417305a1be944bb38b8c217c01ba1040 [root@jenkins211 ~]# 2.7 修改admin用户密码

a7dfc458db5074264a5a6aaf003c87ce

1275b7a3a8f2cc659d0b579ebd29f9e1

bash
4.安装jenkins的插件 4.1 下载jenkins的插件包 [root@jenkins211 ~]# wget http://192.168.21.253/Resources/Kubernetes/Project/DevOps/Jenkins/jenkins-v2.479.3/weixiang-jenkins-2.479.3-plugins.tar.gz 4.2 解压插件包 [root@jenkins211 ~]# ll /var/lib/jenkins/plugins/ total 8 drwxr-xr-x 2 root root 4096 Jul 24 09:34 ./ drwxr-xr-x 8 jenkins jenkins 4096 Jul 24 09:36 ../ [root@jenkins211 ~]# [root@jenkins211 ~]# tar xf weixiang-jenkins-2.479.3-plugins.tar.gz -C /var/lib/jenkins/plugins/ [root@jenkins211 ~]# [root@jenkins211 ~]# ll /var/lib/jenkins/plugins/ | wc -l 227 [root@jenkins211 ~]# 温馨提示: 如果想要手动尝试一把,估计得一小时起步。 参考链接: https://www.cnblogs.com/yinzhengjie/p/9589319.html https://www.cnblogs.com/yinzhengjie/p/18563962 4.3 重启jenkins http://10.0.0.211:8080/restart

d7bb61e1144c40a4b710a4d3b5fab1b6

2647d5a81962e6ca790a32200d88742a

2、部署gitlab到K8S集群,要求数据持久化
bash
参考镜像: http://192.168.21.253/Resources/Kubernetes/Project/DevOps/images/weixiang-gitlab-ce-v17.5.2.tar.gz 1.导入镜像 [root@worker233 ~]# wget http://192.168.21.253/Resources/Kubernetes/Project/DevOps/images/weixiang-gitlab-ce-v17.5.2.tar.gz [root@worker233 ~]# docker load -i weixiang-gitlab-ce-v17.5.2.tar.gz [root@worker233 ~]# docker tag gitlab/gitlab-ce:17.5.2-ce.0 harbor250.weixiang.com/weixiang-devops/gitlab-ce:17.5.2-ce.0 [root@worker233 ~]# docker push harbor250.weixiang.com/weixiang-devops/gitlab-ce:17.5.2-ce.0 2.创建nfs共享目录 [root@master231 ~]# mkdir -pv /yinzhengjie/data/nfs-server/case-demo/gitlab/{data,logs,conf} mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/gitlab' mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/gitlab/data' mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/gitlab/logs' mkdir: created directory '/yinzhengjie/data/nfs-server/case-demo/gitlab/conf' [root@master231 ~]# 3.编写资源清单 [root@master231 02-jenkins]# cat 01-deploy-svc-gitlab.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-gitlab spec: replicas: 1 selector: matchLabels: apps: gitlab template: metadata: labels: apps: gitlab spec: volumes: - name: data nfs: server: 10.0.0.231 path: /yinzhengjie/data/nfs-server/case-demo/gitlab/data - name: conf nfs: server: 10.0.0.231 path: /yinzhengjie/data/nfs-server/case-demo/gitlab/conf - name: logs nfs: server: 10.0.0.231 path: /yinzhengjie/data/nfs-server/case-demo/gitlab/logs containers: - name: c1 image: harbor250.weixiang.com/weixiang-devops/gitlab-ce:17.5.2-ce.0 ports: - containerPort: 22 name: ssh - containerPort: 80 name: http - containerPort: 443 name: https volumeMounts: - name: logs mountPath: /var/log/gitlab - name: data mountPath: /var/opt/gitlab - name: conf mountPath: /etc/gitlab --- apiVersion: v1 kind: Service metadata: name: svc-gitlab spec: type: LoadBalancer selector: apps: gitlab ports: # 定义端口映射规则 - protocol: TCP # 协议类型 port: 80 name: http - protocol: TCP port: 443 name: https - protocol: TCP port: 22 name: sshd [root@master231 02-jenkins]# [root@master231 02-jenkins]# 4.创建资源 [root@master231 02-jenkins]# kubectl apply -f 01-deploy-svc-gitlab.yaml deployment.apps/deploy-gitlab created service/svc-gitlab created [root@master231 02-jenkins]# [root@master231 ~/count/02-jenkins]#kubectl get svc svc-gitlab NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc-gitlab LoadBalancer 10.200.131.141 43.139.47.66 80:23275/TCP,443:9812/TCP,22:3432/TCP 67m [root@master231 02-jenkins]# [root@master231 02-jenkins]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-gitlab-c84979449-f5rn9 1/1 Running 0 9m16s 10.100.2.53 worker233 <none> <none> [root@master231 02-jenkins]# [root@master231 02-jenkins]# 5.查看gitlab的初始密码 [root@master231 02-jenkins]# kubectl logs deploy-gitlab-c84979449-f5rn9 | grep /etc/gitlab/initial_root_password Password stored to /etc/gitlab/initial_root_password. This file will be cleaned up in first reconfigure run after 24 hours. [root@master231 ~/count/02-jenkins]#kubectl exec deploy-gitlab-647d45cb7c-hs6r7 -- cat /etc/gitlab/initial_root_password # WARNING: This value is valid only in the following conditions # 1. If provided manually (either via `GITLAB_ROOT_PASSWORD` environment variable or via `gitlab_rails['initial_root_password']` setting in `gitlab.rb`, it was provided before database was seeded for the first time (usually, the first reconfigure run). # 2. Password hasn't been changed manually, either via UI or via command line. 
# # If the password shown here doesn't work, you must reset the admin password following https://docs.gitlab.com/ee/security/reset_user_password.html#reset-your-root-password. Password: TlDLGaLMsPw5VhotNNlLtYSos1pU9bhfQgHWL1YEFnQ= TlDLGaLMsPw5VhotNNlLtYSos1pU9bhfQgHWL1YEFnQ= # NOTE: This file will be automatically deleted in the first reconfigure run after 24 hours. 6.访问测试 http://10.0.0.153/ 用上一步的密码进行登录,用户名为: root 修改密码: 推荐密码为: Linux98@2025
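如果initial_root_password文件已被清理(超过24小时)或密码不可用,可以按上面提示的官方文档,在gitlab容器里用rake任务重置root密码,下面是一个示意:
bash
# 进入gitlab容器交互式重置root密码,按提示输入两次新密码即可
kubectl exec -it deploy/deploy-gitlab -- gitlab-rake "gitlab:password:reset[root]"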

image

8126ca069462020caef88e39a424aad1

e9b5e301c68542ac2a64ed4afe9ebb5b

3、模拟开发人员将代码推送到gitlab
bash
1.准备代码 [root@harbor250.weixiang.com ~]# wget http://192.168.21.253/Resources/Kubernetes/Project/DevOps/Jenkins/weixiang-yiliao.zip [root@harbor250.weixiang.com ~]# mkdir code [root@harbor250.weixiang.com ~]# unzip weixiang-yiliao.zip -d code/ [root@harbor250.weixiang.com ~]# cd code/ [root@harbor250.weixiang.com code]# [root@harbor250.weixiang.com code]# ll total 224 drwxr-xr-x 5 root root 4096 Jul 24 11:39 ./ drwx------ 9 root root 4096 Jul 24 11:39 ../ -rw-r--r-- 1 root root 16458 Jun 13 2019 about.html -rw-r--r-- 1 root root 20149 Jun 13 2019 album.html -rw-r--r-- 1 root root 19662 Jun 13 2019 article_detail.html -rw-r--r-- 1 root root 18767 Jun 13 2019 article.html -rw-r--r-- 1 root root 18913 Jun 13 2019 comment.html -rw-r--r-- 1 root root 16465 Jun 13 2019 contact.html drwxr-xr-x 2 root root 4096 Sep 19 2022 css/ drwxr-xr-x 5 root root 4096 Sep 19 2022 images/ -rw-r--r-- 1 root root 29627 Jun 29 2019 index.html drwxr-xr-x 2 root root 4096 Sep 19 2022 js/ -rw-r--r-- 1 root root 24893 Jun 13 2019 product_detail.html -rw-r--r-- 1 root root 20672 Jun 13 2019 product.html [root@harbor250.weixiang.com code]# [root@harbor250.weixiang.com code]# 2.编写Dockerfile [root@harbor250.weixiang.com code]# cat Dockerfile FROM harbor250.weixiang.com/weixiang-xiuxian/apps:v1 MAINTAINER Jason Yin LABEL school=weixiang \ class=weixiang98 ADD . /usr/share/nginx/html EXPOSE 80 WORKDIR /usr/share/nginx/html [root@harbor250.weixiang.com code]# 3.gitlab创建项目 推荐项目名称为"weixiang-yiliao"
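把代码推送到gitlab之前,也可以先在本地验证Dockerfile能否正常构建、页面能否正常访问,避免到流水线阶段才发现问题(示意,镜像tag为临时测试用):
bash
# 本地构建并临时运行测试容器
docker build -t yiliao:test .
docker run -d --name yiliao-test -p 8888:80 yiliao:test

# 验证首页能否正常返回,确认无误后清理
curl -s http://127.0.0.1:8888/ | head -5
docker rm -f yiliao-test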

dba023d5e9039cd60195d5f8a493b677

4c4cb3647dce5a5ee741fa504da480ec_720

218d9c6eafdc0e3166ea58ee3b6ee67a_720

bash
4.开发人员初始化项目并添加远程仓库 [root@harbor250.weixiang.com code]# git init . [root@harbor250.weixiang.com code]# git remote add origin http://43.139.77.96/root/weixiang-yiliao.git [root@harbor250.weixiang.com code]# [root@harbor250.weixiang.com code]# git remote -v origin http://10.0.0.153/root/weixiang-yiliao.git (fetch) origin http://10.0.0.153/root/weixiang-yiliao.git (push) [root@harbor250.weixiang.com code]# 5.推送代码到远程仓库 [root@harbor250.weixiang.com code]# git add . [root@harbor250.weixiang.com code]# git commit -m 'k8s yiliao demo' [root@harbor250.weixiang.com code]# git push origin master Username for 'http://10.0.0.153': root Password for 'http://root@10.0.0.153': # 此处输入密码不会提示你,建议直接复制密码回车即可。Linux98@2025 Enumerating objects: 91, done. Counting objects: 100% (91/91), done. Delta compression using up to 2 threads Compressing objects: 100% (91/91), done. Writing objects: 100% (91/91), 1.48 MiB | 7.69 MiB/s, done. Total 91 (delta 11), reused 0 (delta 0), pack-reused 0 remote: remote: To create a merge request for master, visit: remote: http://deploy-gitlab-c84979449-f5rn9/root/weixiang-yiliao/-/merge_requests/new?merge_request%5Bsource_branch%5D=master remote: To http://10.0.0.153/root/weixiang-yiliao.git * [new branch] master -> master [root@harbor250.weixiang.com code]# 6.gitlab代码仓库查看代码是否推送成功

ddc4f2c04280cf11589922d542b0c0ce_720

4、模拟开发人员将代码推送到gitee
bash
- 模拟开发人员将代码推送到gitee 1.创建gitee账号 略。 参考链接: https://gitee.com/signup 2.gitlab创建项目 略,见视频。推荐项目名称为"weixiang-yiliao"

dfcc5a0a94271a332aa3002c532f6a83

65c51b5530bbf0b0ceb83d1217d1f475_720

bash
3.Git 全局设置 [root@harbor250.weixiang.com code]# git config --global user.name "尹正杰" [root@harbor250.weixiang.com code]# git config --global user.email "8669059+yinzhengjie@user.noreply.gitee.com" [root@harbor250.weixiang.com code]# 4.开发人员初始化项目并添加远程仓库 [root@harbor250.weixiang.com code]# git init . [root@harbor250.weixiang.com code]# git remote add gitee https://gitee.com/yinzhengjie/weixiang-yiliao.git [root@harbor250.weixiang.com code]# [root@harbor250.weixiang.com code]# git remote -v gitee https://gitee.com/yinzhengjie/weixiang-yiliao.git (fetch) gitee https://gitee.com/yinzhengjie/weixiang-yiliao.git (push) origin http://10.0.0.153/root/weixiang-yiliao.git (fetch) origin http://10.0.0.153/root/weixiang-yiliao.git (push) [root@harbor250.weixiang.com code]# 5.推送代码到远程仓库 [root@harbor250.weixiang.com code]# git add . [root@harbor250.weixiang.com code]# git commit -m 'k8s yiliao demo' [root@harbor250.weixiang.com code]# git push gitee master Username for 'https://gitee.com': 15321095200 Password for 'https://15321095200@gitee.com': # 此处输入密码不会提示你,建议直接复制密码回车即可。 Enumerating objects: 91, done. Counting objects: 100% (91/91), done. Delta compression using up to 2 threads Compressing objects: 100% (91/91), done. Writing objects: 100% (91/91), 1.48 MiB | 1.86 MiB/s, done. Total 91 (delta 12), reused 0 (delta 0), pack-reused 0 remote: Powered by GITEE.COM [1.1.5] remote: Set trace flag 0c41858d To https://gitee.com/yinzhengjie/weixiang-yiliao.git * [new branch] master -> master [root@harbor250.weixiang.com code]# 6.gitee代码仓库查看代码是否推送成功

b477e3a53c9af367757767aa9f75d58a_720



5、jenkins和gitlab及gitee联调
bash
1.jenkins创建项目 建议项目名称为'weixiang-weixiang98-yiliao',建议构建'自由风格类型任务'

413528a785607fc6a9e15fdb0853be31_720

bash
2.添加gitee或者gitlab账号信息 [root@harbor250.weixiang.com ~/code]#git remote -v origin http://43.139.77.96:36515/root/weixiang-yiliao.git (fetch) origin http://43.139.77.96:36515/root/weixiang-yiliao.git (push)

6842e0dffd48ade847d79416217d8b8d_720

bash
3.编写shell脚本 pwd ls -l 4.立即构建 5.查看日志

image

6、jenkins构建镜像并推送到harbor仓库
bash
- jenkins构建镜像并推送到harbor仓库 1.安装docker环境 [root@jenkins211 ~]# wget http://192.168.21.253/Resources/Docker/scripts/weixiang-autoinstall-docker-docker-compose.tar.gz [root@jenkins211 ~]# tar xf weixiang-autoinstall-docker-docker-compose.tar.gz [root@jenkins211 ~]# ./install-docker.sh i 2.添加解析 [root@jenkins211 ~]# echo 8.148.236.36 harbor250.weixiang.com >> /etc/hosts [root@jenkins211 ~]# [root@jenkins211 ~]# tail -1 /etc/hosts 10.0.0.250 harbor250.weixiang.com [root@jenkins211 ~]# 3.拷贝证书文件 [root@jenkins211 ~]# scp -r 10.1.12.3:/etc/docker/certs.d/ /etc/docker/ [root@jenkins211 ~]# apt -y install tree [root@jenkins211 ~]# tree /etc/docker/certs.d/ /etc/docker/certs.d/ └── harbor250.weixiang.com ├── ca.crt ├── harbor250.weixiang.com.cert └── harbor250.weixiang.com.key 1 directory, 3 files [root@jenkins211 ~]# 4.修改jenkins的脚本内容 docker build -t harbor250.weixiang.com/weixiang-cicd/yiliao:v0.1 . docker login -u admin -p 1 harbor250.weixiang.com docker push harbor250.weixiang.com/weixiang-cicd/yiliao:v0.1
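上面脚本中docker login -u admin -p 1把密码直接写在了命令行,会留在shell历史和构建日志里。一个常见的改进是把密码保存为Jenkins凭据并注入环境变量,再通过--password-stdin传入(示意,HARBOR_PASSWD为假设的环境变量名):
bash
# 通过标准输入传递密码,避免明文出现在命令行参数中
echo "${HARBOR_PASSWD}" | docker login -u admin --password-stdin harbor250.weixiang.com
docker push harbor250.weixiang.com/weixiang-cicd/yiliao:v0.1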

image

bash
5.立即构建,要保证harbor仓库有weixiang-cicd项目

image

bash
6.验证harbor的WebUI项目是否创建 主要观察'weixiang-cicd'项目是否有镜像。

image

7、jenkins部署服务医疗服务到集群
bash
1.安装kubectl客户端工具 [root@jenkins211 ~]# wget http://192.168.21.253/Resources/Kubernetes/Project/DevOps/Jenkins/kubectl-1.23.17 [root@jenkins211 ~]# chmod +x kubectl-1.23.17 [root@jenkins211 ~]# mv kubectl-1.23.17 /usr/local/bin/kubectl [root@jenkins211 ~]# ll /usr/local/bin/kubectl -rwxr-xr-x 1 root root 45174784 Sep 4 2023 /usr/local/bin/kubectl* [root@jenkins211 ~]# 2.准备认证文件 [root@jenkins211 ~]# mkdir -p ~/.kube/ [root@jenkins211 ~]# scp 106.55.44.37:/root/.kube/config ~/.kube/ [root@jenkins211 ~]# kubectl version Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.17", GitCommit:"953be8927218ec8067e1af2641e540238ffd7576", GitTreeState:"clean", BuildDate:"2023-02-22T13:34:27Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"} Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.17", GitCommit:"953be8927218ec8067e1af2641e540238ffd7576", GitTreeState:"clean", BuildDate:"2023-02-22T13:27:46Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"} [root@jenkins211 ~]# 3.修改jenkins的配置 docker build -t harbor250.weixiang.com/weixiang-cicd/yiliao:v0.1 . docker login -u admin -p 1 harbor250.weixiang.com docker push harbor250.weixiang.com/weixiang-cicd/yiliao:v0.1 kubectl get deployments deploy-yiliao &>/dev/null if [ $? -eq 0 ]; then cat <<EOF | kubectl apply -f - apiVersion: apps/v1 kind: Deployment metadata: name: deploy-yiliao spec: replicas: 3 selector: matchLabels: apps: yiliao template: metadata: labels: apps: yiliao spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-cicd/yiliao:v0.1 --- apiVersion: v1 kind: Service metadata: name: svc-yiliao spec: type: LoadBalancer selector: apps: yiliao ports: - protocol: TCP port: 80 EOF fi kubectl get pods -o wide -l apps=yiliao kubectl get svc svc-yiliao kubectl describe svc svc-yiliao | grep Endpoints 4.立即构建
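注意上面脚本的判断是"deployment已存在才apply",首次部署时kubectl get会失败,if分支不会执行;由于kubectl apply本身是幂等的,更稳妥的组织方式是"不存在则apply、存在则更新镜像",示意如下(清单文件路径为假设,与后面升级小节保存的路径一致):
bash
#!/bin/bash
# 假设已把上面的Deployment+Service清单保存为/weixiang/manifests/yiliao/deploy-svc-yiliao.yaml
if kubectl get deployments deploy-yiliao &>/dev/null; then
    # 已存在:仅更新镜像
    kubectl set image deploy deploy-yiliao c1=harbor250.weixiang.com/weixiang-cicd/yiliao:v0.1
else
    # 不存在:首次创建
    kubectl apply -f /weixiang/manifests/yiliao/deploy-svc-yiliao.yaml
fi
kubectl get pods -o wide -l apps=yiliao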

image

bash
温馨提示: 查看构建信息日志部分内容如下: + kubectl get pods -o wide -l apps=yiliao NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-yiliao-9f765b8d8-688hn 1/1 Running 0 2m22s 10.100.2.59 worker233 <none> <none> deploy-yiliao-9f765b8d8-6ztrf 1/1 Running 0 2m22s 10.100.2.58 worker233 <none> <none> deploy-yiliao-9f765b8d8-bkb2n 1/1 Running 0 2m22s 10.100.1.204 worker232 <none> <none> + kubectl get svc svc-yiliao NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc-yiliao LoadBalancer 10.200.208.97 10.0.0.154 80:33356/TCP 2m22s + kubectl describe svc svc-yiliao + grep Endpoints Endpoints: 10.100.1.204:80,10.100.2.58:80,10.100.2.59:80 Finished: SUCCESS 5.访问测试 http://10.0.0.154/

image

8、jenkins实现升级
bash
- jenkins实现升级案例 1.准备资源清单 [root@jenkins211 ~]# mkdir -pv /weixiang/manifests/yiliao [root@jenkins211 ~]# cat > /weixiang/manifests/yiliao/deploy-svc-yiliao.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-yiliao spec: replicas: 3 selector: matchLabels: apps: yiliao template: metadata: labels: apps: yiliao spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-cicd/yiliao:v0.1 --- apiVersion: v1 kind: Service metadata: name: svc-yiliao spec: type: LoadBalancer selector: apps: yiliao ports: - protocol: TCP port: 80 EOF 2.修改gitee或者gitlab代码 修改的index.html文件, <h1>学IT来老男孩,月薪过万不是梦,官网地址: https://www.oldboyoedu.com </h1>

image

bash
3.修改jenkins的配置 #!/bin/bash docker build -t harbor250.weixiang.com/weixiang-cicd/yiliao:${version} . docker login -u admin -p 1 harbor250.weixiang.com docker push harbor250.weixiang.com/weixiang-cicd/yiliao:${version} #kubectl get deployments deploy-yiliao &>/dev/null #if [ $? -eq 0 ]; then #kubectl set image deploy deploy-yiliao c1=harbor250.weixiang.com/weixiang-cicd/yiliao:${version} #else #kubectl apply -f /weixiang/manifests/yiliao/deploy-svc-yiliao.yaml #fi kubectl set image deploy deploy-yiliao c1=harbor250.weixiang.com/weixiang-cicd/yiliao:${version} kubectl get pods -o wide -l apps=yiliao kubectl get svc svc-yiliao kubectl describe svc svc-yiliao | grep Endpoints 4.参数化构建
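参数化构建触发升级后,可以在脚本末尾(或手动)用rollout子命令确认滚动更新是否完成,并记录历史版本,示意如下:
bash
# 等待滚动更新完成,超时或失败会以非0退出,可直接作为流水线的健康检查
kubectl rollout status deploy/deploy-yiliao --timeout=120s

# 查看历史版本,便于后续回滚时确认revision编号
kubectl rollout history deploy/deploy-yiliao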

image

bash
5.验证测试 访问10.0.0.154即可。
9、jenkins实现回滚案例
bash
- jenkins实现回滚案例 1.克隆项目

image

bash
2.修改jenkins的配置 kubectl set image deploy deploy-yiliao c1=harbor250.weixiang.com/weixiang-cicd/yiliao:${version} 3.立即构建测试
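除了用旧的镜像tag重新set image,也可以直接用rollout undo回退到上一个或指定的revision,效果类似,示意如下:
bash
# 回滚到上一个版本
kubectl rollout undo deploy/deploy-yiliao

# 回滚到指定的历史版本,revision编号来自kubectl rollout history
kubectl rollout undo deploy/deploy-yiliao --to-revision=1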
10、jenkins基于pipeline构建实战
bash
- jenkins基于pipeline构建实战 1.编写pipeline pipeline { agent any stages { stage('pull code') { steps { git credentialsId: '83da8426-9aa0-42be-bef8-9688e5fa54f8', url: 'http://10.0.0.153/root/weixiang-yiliao.git' } } stage('build image') { steps { sh '''docker build -t harbor250.weixiang.com/weixiang-cicd/yiliao:v0.2 .''' } } stage('push image') { steps { sh '''docker login -u admin -p 1 harbor250.weixiang.com docker push harbor250.weixiang.com/weixiang-cicd/yiliao:v0.2''' } } stage('deploy or update image') { steps { sh '''kubectl set image deploy deploy-yiliao c1=harbor250.weixiang.com/weixiang-cicd/yiliao:v0.2 kubectl get pods -o wide -l apps=yiliao kubectl get svc svc-yiliao kubectl describe svc svc-yiliao | grep Endpoints''' } } } } 2.立即构建 略,见视频 - jenkins基于pipeline构建之jenkinsfile实战 略,见视频。

e815af76825f0d99ade633d796a26bc9

cdae2eb2d837aaa99220aafc5e0fd949

11、jenkins配置钉钉机器人自动发信息
bash
在jenkins里面安装钉钉(DingTalk)插件,安装完成后重启jenkins。

image

image

image

bash
下载钉钉客户端

image

bash
填写钉钉的webhook及关键字信息

image

bash
如上图所示,我们可以添加钉钉的webhook信息及钉钉设置的关键字。配置完成后点击"Submit"提交任务。 如下图所示,钉钉测试消息发送成功了。

image

配置钉钉告警

bash
// 最终成品 Pipeline 脚本 pipeline { agent any stages { stage('执行并判断结果') { steps { script { try { // --- 您的核心构建/测试步骤放在这里 --- echo "开始执行核心任务..." // 1️⃣ 用来测试【成功】场景的命令 (当前使用) sh 'echo "任务执行成功!"; exit 0' // 2️⃣ 用来测试【失败】场景的命令 (如需测试,请注释掉上面一行,并取消下面一行的注释) // sh 'echo "任务执行失败!"; exit 1' // ------------------------------------------------ // ⬇️ 成功后发送的通知 ⬇️ // ------------------------------------------------ def successMessage = """### ✅ [${env.JOB_NAME}] Jenkins 任务成功 - **任务状态**: **成功 (SUCCESS)** - **构建编号**: #${env.BUILD_NUMBER} - **完成时间**: ${new Date().format("yyyy-MM-dd HH:mm:ss")} - **详情链接**: [点击查看构建详情](${env.BUILD_URL}) > 由 **张维祥** 的 Jenkins 任务自动触发。""" echo "任务成功,准备发送钉钉通知..." dingtalk ( robot: 'zhangweixiang-jenkins', text: [successMessage], type: 'MARKDOWN' ) } catch (any) { // ------------------------------------------------ // ⬇️ 失败后发送的通知 ⬇️ // ------------------------------------------------ def failureMessage = """### 🚨 [${env.JOB_NAME}] Jenkins 任务失败 - **任务状态**: **失败 (FAILURE)** - **构建编号**: #${env.BUILD_NUMBER} - **失败链接**: [点击查看失败日志](${env.BUILD_URL}) > 请 **张维祥** 尽快检查问题!""" echo "任务失败,准备发送钉钉通知..." dingtalk ( robot: 'zhangweixiang-jenkins', text: [failureMessage], type: 'MARKDOWN' ) currentBuild.result = 'FAILURE' error "任务执行失败" } } } } } } # 把上面内容添加到下方红框

image

保存,立即构建

image

24、Flannel的工作原理图解

1、Flannel的工作原理图解

55b5584956a728c60875e8d8fe4f5a7a_720

2、同节点各Pod实现数据通信原理

image

同节点各Pod实现数据通信原理:同一个worker节点(本例中两个Pod都被调度到worker233)上运行两个跑不同业务的Pod,Pod1访问Pod2的流量会经过cni0网桥,因此在cni0上抓包可以抓到这部分流量。

bash
1.1 环境准备 [root@master231 01-pods]# cat 26-pods-on-single-node.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-v1 labels: apps: v1 spec: nodeName: worker233 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 --- apiVersion: v1 kind: Pod metadata: name: xiuxian-v2 labels: apps: v2 spec: nodeName: worker233 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v2 [root@master231 01-pods]# [root@master231 01-pods]# kubectl apply -f 26-pods-on-single-node.yaml pod/xiuxian-v1 created pod/xiuxian-v2 created [root@master231 01-pods]# [root@master231 01-pods]# kubectl get pods -o wide -l "apps in (v1,v2)" NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-v1 1/1 Running 0 17s 10.100.2.89 worker233 <none> <none> xiuxian-v2 1/1 Running 0 17s 10.100.2.90 worker233 <none> <none> [root@master231 01-pods]# curl 10.100.2.89 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 01-pods]# [root@master231 01-pods]# curl 10.100.2.90 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v2</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: red">凡人修仙传 v2 </h1> <div> <img src="2.jpg"> <div> </body> </html> [root@master231 01-pods]# [root@master231 01-pods]# kubectl exec xiuxian-v1 -- ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0@if11: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP link/ether 52:80:67:75:61:ec brd ff:ff:ff:ff:ff:ff inet 10.100.2.89/24 brd 10.100.2.255 scope global eth0 valid_lft forever preferred_lft forever [root@master231 01-pods]# [root@master231 01-pods]# [root@master231 01-pods]# kubectl exec xiuxian-v2 -- ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP link/ether b2:2d:fa:6e:c7:39 brd ff:ff:ff:ff:ff:ff inet 10.100.2.90/24 brd 10.100.2.255 scope global eth0 valid_lft forever preferred_lft forever [root@master231 ~/count/01-pods]#kubectl exec -it xiuxian-v2 -- sh / # ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0@if49: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP link/ether 8e:eb:47:44:e4:db brd ff:ff:ff:ff:ff:ff inet 10.100.2.72/24 brd 10.100.2.255 scope global eth0 valid_lft forever preferred_lft forever / # route -n Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 0.0.0.0 10.100.2.1 0.0.0.0 UG 0 0 0 eth0 10.100.0.0 10.100.2.1 255.255.0.0 UG 0 0 0 eth0 10.100.2.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0 / # [root@master231 ~/count/01-pods]#kubectl exec -it xiuxian-v1 -- sh / # route -n Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 0.0.0.0 10.100.2.1 0.0.0.0 UG 0 0 0 eth0 10.100.0.0 10.100.2.1 255.255.0.0 UG 0 0 0 eth0 10.100.2.0 
0.0.0.0 255.255.255.0 U 0 0 0 eth0 / #

image

bash
1.2 去worker节点测试验证:因为Pod调度到了worker233,所以到worker233上查看对应的veth网卡和IP

image

bash
[root@worker233 ~]# ip a ... 41: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 3a:85:78:71:bf:e7 brd ff:ff:ff:ff:ff:ff inet 10.100.2.1/24 scope global cni0 valid_lft forever preferred_lft forever inet6 fe80::3885:78ff:fe71:bfe7/64 scope link valid_lft forever preferred_lft forever 49: vethf5a8e2fd@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default link/ether 22:43:7e:82:de:c1 brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet6 fe80::2043:7eff:fe82:dec1/64 scope link valid_lft forever preferred_lft forever 50: veth932dec82@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default link/ether 46:03:a7:9c:d1:ff brd ff:ff:ff:ff:ff:ff link-netnsid 2 inet6 fe80::4403:a7ff:fe9c:d1ff/64 scope link valid_lft forever preferred_lft forever 1.3 验证桥接网卡cni0 [root@worker233 ~]# apt install bridge-utils [root@worker233 ~]# brctl show cni0 bridge name bridge id STP enabled interfaces cni0 8000.3a857871bfe7 no veth932dec82 # cni的接口对应50的网卡名称,也就是231pod里面的网卡 vethf5a8e2fd # cni的接口对应49的网卡名称,也就是231另一个pod里面的网卡 1.4 抓包测试 1.4.1 ping测试 [root@master231 01-pods]# kubectl exec -it xiuxian-v1 -- sh / # ping 10.100.2.90 -c 3 PING 10.100.2.90 (10.100.2.90): 56 data bytes 64 bytes from 10.100.2.90: seq=0 ttl=64 time=0.058 ms 64 bytes from 10.100.2.90: seq=1 ttl=64 time=0.073 ms 64 bytes from 10.100.2.90: seq=2 ttl=64 time=0.074 ms --- 10.100.2.90 ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 0.058/0.068/0.074 ms / # 1.4.2 查看抓包结果 [root@worker233 ~]# tcpdump -i cni0 icmp tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on cni0, link-type EN10MB (Ethernet), snapshot length 262144 bytes 09:24:38.106960 IP 10.100.2.89 > 10.100.2.90: ICMP echo request, id 13824, seq 0, length 64 09:24:38.106997 IP 10.100.2.90 > 10.100.2.89: ICMP echo reply, id 13824, seq 0, length 64 09:24:39.107475 IP 10.100.2.89 > 10.100.2.90: ICMP echo request, id 13824, seq 1, length 64 09:24:39.107509 IP 10.100.2.90 > 10.100.2.89: ICMP echo reply, id 13824, seq 1, length 64 09:24:40.108017 IP 10.100.2.89 > 10.100.2.90: ICMP echo request, id 13824, seq 2, length 64 09:24:40.108052 IP 10.100.2.90 > 10.100.2.89: ICMP echo reply, id 13824, seq 2, length 64 1.5 验证cni0网卡可以实现同节点不同pod数据通信 1.5.1 移除网卡后抓包失败 [root@worker233 ~]# ifconfig cni0 cni0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450 inet 10.100.2.1 netmask 255.255.255.0 broadcast 10.100.2.255 inet6 fe80::80a9:59ff:fe3d:396f prefixlen 64 scopeid 0x20<link> ether 82:a9:59:3d:39:6f txqueuelen 1000 (Ethernet) RX packets 9494 bytes 2392135 (2.3 MB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 10900 bytes 2614807 (2.6 MB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 [root@worker233 ~]# ip link delete cni0 [root@worker233 ~]# [root@worker233 ~]# ifconfig cni0 cni0: error fetching interface information: Device not found 1.5.2 再次ping 测试 [root@master231 01-pods]# kubectl exec -it xiuxian-v1 -- sh / # / # ping 10.100.2.90 -c 3 PING 10.100.2.90 (10.100.2.90): 56 data bytes --- 10.100.2.90 ping statistics --- 3 packets transmitted, 0 packets received, 100% packet loss / # # 如上所示,ping失败,如果发现同节点pod无法互通,先检查cni0网卡是否还在,不在就不通 1.6 添加cni0网卡 1.6.1 添加cni0网卡 [root@worker233 ~]# ip link add cni0 type bridge [root@worker233 ~]# ip link set dev cni0 up [root@worker233 ~]# ip addr add 10.100.2.1/24 dev cni0 [root@worker233 ~]# [root@worker233 ~]# ifconfig cni0 cni0: 
flags=4099<UP,BROADCAST,MULTICAST> mtu 1500 inet 10.100.2.1 netmask 255.255.255.0 broadcast 0.0.0.0 inet6 fe80::5c2f:cff:fecb:94c8 prefixlen 64 scopeid 0x20<link> ether 3a:28:99:ca:1f:85 txqueuelen 1000 (Ethernet) RX packets 0 bytes 0 (0.0 B) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 0 bytes 0 (0.0 B) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 [root@worker233 ~]# 1.6.2 将网卡设备添加到cni0网桥 [root@worker233 ~]# apt -y install bridge-utils [root@worker233 ~]# brctl show cni0 bridge name bridge id STP enabled interfaces cni0 8000.3a2899ca1f85 no [root@worker233 ~]# [root@worker233 ~]# [root@worker233 ~]# brctl addif cni0 veth3c6c6472 # 添加pod ip a 对应的网卡 [root@worker233 ~]# brctl addif cni0 veth19d9ae5e [root@worker233 ~]# [root@worker233 ~]# brctl show cni0 bridge name bridge id STP enabled interfaces cni0 8000.3a2899ca1f85 no vethbe473b53 vethc961a27f [root@worker233 ~]# 1.6.3 再次ping测试 [root@master231 01-pods]# kubectl exec -it xiuxian-v1 -- sh / # / # ping 10.100.2.90 -c 3 PING 10.100.2.90 (10.100.2.90): 56 data bytes 64 bytes from 10.100.2.90: seq=0 ttl=64 time=0.305 ms 64 bytes from 10.100.2.90: seq=1 ttl=64 time=0.070 ms 64 bytes from 10.100.2.90: seq=2 ttl=64 time=0.080 ms --- 10.100.2.90 ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 0.070/0.151/0.305 ms / #
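
补充一个定位veth的小技巧:当怀疑某个Pod的veth没有挂在cni0网桥上时,可以先在Pod内读出eth0对端网卡的编号,再到宿主机上按编号找到对应的veth名称。以下为示意命令,假设容器内可以执行cat且网卡名为eth0:

bash
# 1.在Pod内查看eth0对端(宿主机侧veth)的接口编号
[root@master231 01-pods]# kubectl exec xiuxian-v1 -- cat /sys/class/net/eth0/iflink

# 2.在worker233上按编号找到对应的veth名称(把IDX替换为上一步的输出)
[root@worker233 ~]# IDX=49
[root@worker233 ~]# ip -o link | awk -F': ' -v idx=$IDX '$1==idx {print $2}'

# 3.确认该veth是否仍挂在cni0网桥上,不在则用brctl addif重新加入
[root@worker233 ~]# brctl show cni0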
3、不同节点各Pod实现数据通信原理
bash
2.1 环境准备 [root@master231 01-pods]# cat 27-pods-on-multiple-node.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-v1 labels: apps: v1 spec: nodeName: worker232 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 --- apiVersion: v1 kind: Pod metadata: name: xiuxian-v2 labels: apps: v2 spec: nodeName: worker233 containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v2 [root@master231 01-pods]# [root@master231 01-pods]# kubectl apply -f 27-pods-on-multiple-node.yaml pod/xiuxian-v1 created pod/xiuxian-v2 created [root@master231 01-pods]# [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-v1 1/1 Running 0 3s 10.100.1.223 worker232 <none> <none> xiuxian-v2 1/1 Running 0 3s 10.100.2.91 worker233 <none> <none> [root@master231 01-pods]# 2.2 ping测试 [root@master231 01-pods]# kubectl exec -it xiuxian-v1 -- sh / # / # ping 10.100.2.91 -c 3 PING 10.100.2.91 (10.100.2.91): 56 data bytes 64 bytes from 10.100.2.91: seq=0 ttl=62 time=0.607 ms 64 bytes from 10.100.2.91: seq=1 ttl=62 time=0.437 ms 64 bytes from 10.100.2.91: seq=2 ttl=62 time=0.375 ms --- 10.100.2.91 ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 0.375/0.473/0.607 ms / # 2.3 抓包测试 [root@worker233 ~]# tcpdump -i eth0 -nn | grep overlay tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes 09:49:14.541026 IP 10.0.0.232.50003 > 10.0.0.233.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:14.541163 IP 10.0.0.233.44620 > 10.0.0.232.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:15.541631 IP 10.0.0.232.50003 > 10.0.0.233.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:15.541734 IP 10.0.0.233.44620 > 10.0.0.232.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:16.542202 IP 10.0.0.232.50003 > 10.0.0.233.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:16.542272 IP 10.0.0.233.44620 > 10.0.0.232.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:16.954555 IP 10.0.0.231.50448 > 10.0.0.232.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:16.954672 IP 10.0.0.231.50448 > 10.0.0.232.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:16.954682 IP 10.0.0.231.50448 > 10.0.0.232.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:16.954795 IP 10.0.0.231.50448 > 10.0.0.232.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:16.954806 IP 10.0.0.231.50448 > 10.0.0.232.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:16.954944 IP 10.0.0.232.43147 > 10.0.0.231.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:16.955180 IP 10.0.0.232.43147 > 10.0.0.231.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:16.955231 IP 10.0.0.232.43147 > 10.0.0.231.8472: OTV, flags [I] (0x08), overlay 0, instance 1 09:49:16.955291 IP 10.0.0.232.43147 > 10.0.0.231.8472: OTV, flags [I] (0x08), overlay 0, instance 1 [root@worker233 ~]# tcpdump -i flannel.1 -nn icmp tcpdump: verbose output suppressed, use -v[v]... 
for full protocol decode listening on flannel.1, link-type EN10MB (Ethernet), snapshot length 262144 bytes 09:50:25.609115 IP 10.100.1.223 > 10.100.2.91: ICMP echo request, id 11264, seq 0, length 64 09:50:25.609196 IP 10.100.2.91 > 10.100.1.223: ICMP echo reply, id 11264, seq 0, length 64 09:50:26.609518 IP 10.100.1.223 > 10.100.2.91: ICMP echo request, id 11264, seq 1, length 64 09:50:26.609585 IP 10.100.2.91 > 10.100.1.223: ICMP echo reply, id 11264, seq 1, length 64 09:50:27.609855 IP 10.100.1.223 > 10.100.2.91: ICMP echo request, id 11264, seq 2, length 64 09:50:27.609916 IP 10.100.2.91 > 10.100.1.223: ICMP echo reply, id 11264, seq 2, length 64 # 可以看到在Flannel的VXLAN模式下,每个节点上 ip neigh show dev flannel.1 的输出内容几乎是完全一样的。 # 原因在于,flannel.1 接口的邻居表(ARP表)不是通过传统的ARP广播/请求动态学习的,而是由flanneld进程根据整个集群的状态,主动、 # 静态地写入的。

image
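
如果想进一步确认VXLAN隧道的细节,可以在任一节点上查看flannel.1的VTEP属性,以及由flanneld静态写入的邻居表(ARP)和FDB转发表,这也是排查跨节点Pod不通时的常用命令:

bash
# 查看flannel.1的VXLAN属性(VNI、本端VTEP地址、UDP端口8472等)
[root@worker233 ~]# ip -d link show flannel.1

# 查看flanneld写入的邻居表:对端节点Pod网段网关IP与对端flannel.1的MAC的对应关系
[root@worker233 ~]# ip neigh show dev flannel.1

# 查看FDB表:对端flannel.1的MAC与对端节点物理IP(VTEP)的对应关系
[root@worker233 ~]# bridge fdb show dev flannel.1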

4、flannel的工作模式切换
bash
- flannel的工作模式切换 1 Flannel的工作模式 - udp: 早期支持的一种工作模式,由于性能差,目前官方已弃用。 - vxlan: 将源数据报文进行封装为二层报文(需要借助物理网卡转发),进行跨主机转发。 工作模式: 1.Pod A (10.244.1.5) 发送数据包给 Pod B (10.244.2.6) 2.数据包经由cbr0网桥,根据Node 1上的路由规则,被转发到 flannel.1 这个特殊的VXLAN虚拟设备 3.flannel.1 (VTEP - VXLAN Tunnel End Point) 是一个内核级的设备。它看到目标Pod IP (10.244.2.6) 属于Node 2的子网。 4.内核直接将原始的以太网帧封装进一个UDP包里(VXLAN格式),目标IP是Node 2的主机IP 192.168.1.102,目标UDP端口是8472 5.这个封装后的包通过Node 1的物理网卡eth0发往Node 2 6.Node 2的内核收到这个UDP包,识别出是VXLAN包,直接在内核层面进行解封装,还原出原始的以太网帧。 7.内核将原始帧转发给cbr0网桥,最终送达Pod B。 - host-gw: 将容器网络的路由信息写到宿主机的路由表上。尽管效率高,但不支持跨网段。 - directrouting: 将vxlan和host-gw工作模式工作。 工作流程: flanneld 进程会判断目标节点是否与当前节点在同一个二层子网中 如果目标节点在同一子网: flanneld 会采用 host-gw 模式。它会在主机路由表里添加一条指向目标节点IP的路由规则。通信走的是最高效的路由方式。 如果目标节点不在同一子网 (例如,集群跨越了不同的数据中心或VPC): flanneld 会自动回退到 vxlan 模式。它会通过 flannel.1 设备对数据包进行VXLAN封装,确保数据包可以跨越三层网络进行传输。 2 切换flannel的工作模式为"vxlan" 2.1.修改配置文件 [root@master231 ~]# vim kube-flannel-v0.27.0.yml ... net-conf.json: | { "Network": "10.100.0.0/16", "Backend": { "Type": "vxlan" } } 2.2.重新创建资源 [root@master231 cni]# kubectl delete -f kube-flannel-v0.27.0.yml [root@master231 cni]# kubectl apply -f kube-flannel-v0.27.0.yml 2.3.检查网络 [root@worker232 ~]# route -n Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 0.0.0.0 10.0.0.254 0.0.0.0 UG 0 0 0 eth0 10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0 10.100.0.0 10.100.0.0 255.255.255.0 UG 0 0 0 flannel.1 10.100.1.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0 10.100.2.0 10.100.2.0 255.255.255.0 UG 0 0 0 flannel.1 172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0 # 所有去往其他节点Pod网段(10.100.0.0/24 和 10.100.2.0/24)的流量,都被指向了 flannel.1 这个设备。 [root@worker232 ~]# ip route default via 10.0.0.254 dev eth0 proto static 10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.232 10.100.0.0/24 via 10.100.0.0 dev flannel.1 onlink 10.100.1.0/24 dev cni0 proto kernel scope link src 10.100.1.1 10.100.2.0/24 via 10.100.2.0 dev flannel.1 onlink 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown [root@worker232 ~]# [root@worker232 ~]# ifconfig cni0 cni0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450 inet 10.100.1.1 netmask 255.255.255.0 broadcast 10.100.1.255 inet6 fe80::3828:99ff:feca:1f85 prefixlen 64 scopeid 0x20<link> ether 3a:28:99:ca:1f:85 txqueuelen 1000 (Ethernet) RX packets 33297 bytes 18665209 (18.6 MB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 36440 bytes 7487563 (7.4 MB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 [root@worker232 ~]# 3 切换flannel的工作模式为"host-gw" 3.1.修改配置文件 [root@master231 cni]# vim kube-flannel.yml ... 
net-conf.json: | { "Network": "10.100.0.0/16", "Backend": { "Type": "host-gw" } } 3.2.重新创建资源 [root@master231 ~]# kubectl delete -f kube-flannel-v0.27.0.yml [root@master231 ~]# kubectl apply -f kube-flannel-v0.27.0.yml 3.3.检查网络 [root@worker232 ~]# ip route default via 10.0.0.254 dev eth0 proto static 10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.232 10.100.0.0/24 via 10.0.0.231 dev eth0 10.100.1.0/24 dev cni0 proto kernel scope link src 10.100.1.1 10.100.2.0/24 via 10.0.0.233 dev eth0 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown [root@worker232 ~]# [root@worker232 ~]# route -n Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 0.0.0.0 10.0.0.254 0.0.0.0 UG 0 0 0 eth0 10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0 10.100.0.0 10.0.0.231 255.255.255.0 UG 0 0 0 eth0 # 任何要去往 10.100.0.0/24 这个Pod网段的数据包,都应该通过eth0网卡,发给网关10.0.0.231。这里的10.0.0.231正是另一个节点的主机IP 10.100.1.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0 10.100.2.0 10.0.0.233 255.255.255.0 UG 0 0 0 eth0 # 10.100.2.0 10.0.0.233 ... eth0: 同理,要去往 10.100.2.0/24 这个Pod网段,数据包需要发给 10.0.0.233 这个主机。 172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0 # 一条直接将“目标Pod网段”指向“目标Node主机IP”的路由规则 [root@worker232 ~]# ifconfig cni0 cni0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450 inet 10.100.1.1 netmask 255.255.255.0 broadcast 10.100.1.255 inet6 fe80::3828:99ff:feca:1f85 prefixlen 64 scopeid 0x20<link> ether 3a:28:99:ca:1f:85 txqueuelen 1000 (Ethernet) RX packets 34671 bytes 19358694 (19.3 MB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 37867 bytes 7869178 (7.8 MB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 [root@worker232 ~]# 4 切换flannel的工作模式为"Directrouting"(推荐配置) 4.1.修改配置文件 [root@master231 cni]# vim kube-flannel.yml ... net-conf.json: | { "Network": "10.100.0.0/16", "Backend": { "Type": "vxlan", "Directrouting": true } } 4.2.重新创建资源 [root@master231 ~]# kubectl delete -f kube-flannel-v0.27.0.yml [root@master231 ~]# kubectl apply -f kube-flannel-v0.27.0.yml 4.3.检查网络 [root@worker232 ~]# route -n Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 0.0.0.0 10.0.0.254 0.0.0.0 UG 0 0 0 eth0 10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0 10.100.0.0 10.0.0.231 255.255.255.0 UG 0 0 0 eth0 10.100.1.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0 10.100.2.0 10.0.0.233 255.255.255.0 UG 0 0 0 eth0 172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
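
除了修改YAML后delete/apply,也可以直接编辑flannel的ConfigMap再滚动重启DaemonSet来切换后端。下面命令中的命名空间kube-flannel与资源名kube-flannel-cfg、kube-flannel-ds取决于所用清单版本,属于假设值,请以kubectl get的实际输出为准;另外上游文档中VXLAN的直连路由选项拼写为DirectRouting:

bash
# 1.编辑net-conf.json中的Backend配置(例如"Type": "vxlan", "DirectRouting": true)
[root@master231 ~]# kubectl -n kube-flannel edit cm kube-flannel-cfg

# 2.滚动重启flannel的DaemonSet使新配置生效
[root@master231 ~]# kubectl -n kube-flannel rollout restart ds/kube-flannel-ds

# 3.在worker节点确认路由表是否符合预期
[root@worker232 ~]# route -n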

25、探针

1、探针介绍
bash
容器探针(Probe)探测类型,检查机制及探测结果 1.探针(Probe)的探测类型 livenessProbe: 健康状态检查,周期性检查服务是否存活,检查结果失败,将"重启"容器(停止源容器并重新创建新容器)。 如果容器没有提供健康状态检查,则默认状态为Success。 readinessProbe: 可用性检查,周期性检查服务是否可用,从而判断容器是否就绪。 若检测Pod服务不可用,会将Pod标记为未就绪状态,而svc的ep列表会将Addresses的地址移动到NotReadyAddresses列表。 若检测Pod服务可用,则ep会将Pod地址从NotReadyAddresses列表重新添加到Addresses列表中。 如果容器没有提供可用性检查,则默认状态为Success。 startupProbe: (1.16+之后的版本才支持) 如果提供了启动探针,则所有其他探针都会被禁用,直到此探针成功为止。 如果启动探测失败,kubelet将杀死容器,而容器依其重启策略进行重启。 如果容器没有提供启动探测,则默认状态为 Success。 对于starup探针是一次性检测,容器启动时进行检测,检测成功后,才会调用其他探针,且此探针不在生效。 2.探针(Probe)检查机制: exec: 执行一段命令,根据返回值判断执行结果。返回值为0或非0,有点类似于"echo $?"。 httpGet: 发起HTTP请求,根据返回的状态码来判断服务是否正常。 200: 返回状态码成功 301: 永久跳转 302: 临时跳转 401: 验证失败 403: 权限被拒绝 404: 文件找不到 413: 文件上传过大 500: 服务器内部错误 502: 无效的请求 504: 后端应用网关响应超时 ... tcpSocket: 测试某个TCP端口是否能够链接,类似于telnet,nc等测试工具。 grpc: k8s 1.19+版本才支持,1.23依旧属于一个alpha阶段。 3.探测结果: 每次探测都将获得以下三种结果之一: Success(成功) 容器通过了诊断。 Failure(失败) 容器未通过诊断。 Unknown(未知) 诊断失败,因此不会采取任何行动。 参考链接: https://kubernetes.io/zh/docs/concepts/workloads/pods/pod-lifecycle/#types-of-probe https://kubernetes.io/zh-cn/docs/concepts/workloads/pods/pod-lifecycle/#probe-check-methods https://kubernetes.io/zh-cn/docs/concepts/workloads/pods/pod-lifecycle/#probe-outcome
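
各探针字段的含义与默认值可以直接用kubectl explain在集群里查看,不必翻阅文档:

bash
[root@master231 ~]# kubectl explain pod.spec.containers.livenessProbe
[root@master231 ~]# kubectl explain pod.spec.containers.readinessProbe.httpGet
[root@master231 ~]# kubectl explain pod.spec.containers.startupProbe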


2、livenessProbe探针之exec探测方式
bash
[root@master231 probe]# cat > 01-deploy-livenessProbe-exec.yaml << ''EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-livenessprobe-exec spec: replicas: 5 selector: matchLabels: # 指定了一组标签。Deployment会管理所有带xiuxian标签的Pod。 apps: xiuxian template: metadata: labels: # 为Pod设置标签。这个标签(apps: xiuxian)被上面的selector用来识别和管理这个Pod apps: xiuxian spec: restartPolicy: Always # 无论容器因何种原因退出,都会被重启 containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 name: c1 command: - /bin/sh - -c # -c选项表示后面的字符串是一个完整的命令 - touch /tmp/weixiang-linux-healthy; sleep 20; rm -f /tmp/weixiang-linux-healthy; sleep 600 # --- 存活探针 (Liveness Probe) --- livenessProbe: # 用于检测容器是否仍在运行且能响应。如果探测失败,Kubelet会杀死并重启该容器。 exec: # 使用exec的方式去做健康检查,exec是执行一段命令 command: # 自定义检查的命令 - cat - /tmp/weixiang-linux-healthy # 如果文件存在,命令成功(退出码0),表示探测成功。反之则失败 periodSeconds: 1 # 指定探针的探测周期,每1秒探测一次,当检测服务成功后,该值会被重置! failureThreshold: 3 # 指定了探测失败的阈值, 默认值是3次 successThreshold: 1 # 检测服务失败次数的累加值, initialDelaySeconds: 30 # 指定容器启动后,延迟30秒再开始第一次探测 timeoutSeconds: 1 # 一次检测周期超时的秒数,默认值是1秒,最小值为1. EOF 不健康并且进行了重启,因为阈值为30秒,加上3次探针检测,最少33秒才会检测失败,加上api-server的上报时间,大约64秒左右才会重启, 因为上面把文件删了,依据livenessProbe探测类型所以pod会重启

image
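
可以用下面的命令观察重启的发生过程(Pod名称为占位符,请替换为kubectl get pods的实际输出):

bash
# 持续观察RESTARTS计数的变化
[root@master231 probe]# kubectl get pods -l apps=xiuxian -w

# 通过事件确认是Liveness探测失败触发的重启
[root@master231 probe]# kubectl describe pod <pod名称> | grep -A 10 Events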

3、livenessProbe探针之httpGet探测方式
bash
[root@master231 probe]# cat > 02-deploy-livenessProbe-httpGet.yaml <<''EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-livenessprobe-httpget spec: replicas: 5 selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: restartPolicy: Always containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 name: c1 ports: # 声明了容器对外暴露的端口 - containerPort: 80 # containerPort 是容器实际监听的端口号 name: web # 为这个端口指定一个名称,方便在其他地方(如 Service)引用 - containerPort: 22 # 容器还监听 22 端口 name: ssh # 为 22 端口命名为 "ssh"。 livenessProbe: # 健康状态检查,周期性检查服务是否存活,检查结果失败,将重启容器。 httpGet: # 使用httpGet的方式去做健康检查 port: 80 # 指定访问的端口号 path: /index.html # 检测指定的访问路径 failureThreshold: 5 # 探测失败的阈值 initialDelaySeconds: 10 # 容器启动后,延迟 10 秒再开始第一次探测 periodSeconds: 2 # 探测的周期。每隔 2 秒执行一次存活探测 successThreshold: 1 timeoutSeconds: 1 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxian spec: type: ClusterIP # 表示为 Service 分配一个只能在集群内部访问的虚拟 IP clusterIP: "10.200.0.200" # clusterIP 允许你手动指定一个 ClusterIP。这个 IP 必须在集群 Service 的 IP 地址范围内 selector: apps: xiuxian ports: - port: 80 # 是 Service 自身(即 ClusterIP)监听的端口。其他 Pod 将通过 "10.200.0.200:80" 来访问这个服务。 targetPort: web # 引用containerPort: 80端口的名字 EOF # 间隔1s探测一次

image

bash
测试方式:
    在集群内部访问svc的地址,删除首页文件后再访问,会出现短暂的403错误;随后livenessProbe探测失败,容器被重启并恢复正常。

# 右边窗口的检测命令
[root@master231 ~]# while true; do curl 10.200.0.200;sleep 0.1;done

image

image

image

bash
Kubelet 重启容器: 一旦达到失败阈值,Kubelet 会杀死 pod-A 中的 Nginx 容器,并根据 restartPolicy: Always 立即重新启动它。 容器恢复: 新启动的容器是基于原始镜像 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 创建的。 这个原始镜像里包含了 index.html 文件! 所以,新启动的容器是“出厂设置”,是完全健康的。 新容器启动后,livenessProbe 的 initialDelaySeconds: 10 开始计时。10 秒后,Kubelet 开始探测,这次访问 index.html 会成功,pod-A 被标记为健康。
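
上面的过程可以用下面的命令手动复现。Pod名称请替换为实际值;首页路径 /usr/share/nginx/html/index.html 为该nginx镜像的常见默认路径,属于假设:

bash
# 1.删除某个Pod容器内的首页文件,使httpGet探测返回403
[root@master231 probe]# kubectl exec <pod名称> -- rm -f /usr/share/nginx/html/index.html

# 2.观察事件,出现Liveness probe failed相关告警后容器被重启
[root@master231 probe]# kubectl describe pod <pod名称> | grep -A 10 Events

# 3.新容器基于原始镜像创建,index.html恢复,svc再次可以访问
[root@master231 probe]# curl 10.200.0.200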
4、livenessProbe探针之tcpSocket探测方式
bash
[root@master231 probe]# cat > 03-deploy-livenessProbe-tcpSocket.yaml <<''EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-livenessprobe-tcpsocket spec: replicas: 5 selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: restartPolicy: Always containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 name: c1 livenessProbe: # 健康状态检查,周期性检查服务是否存活,检查结果失败,将重启容器。 tcpSocket: # 使用tcpSocket的方式去做健康检查 port: 80 # 检查容器的 80 端口是否在监听并能够接受连接 failureThreshold: 3 initialDelaySeconds: 30 periodSeconds: 10 successThreshold: 1 timeoutSeconds: 1 EOF 测试方式: 可以在集群内部访问某个Pod的IP地址,而后进入该pod修改nginx的端口配置并热加载,15s内会自动重启。

image

image

image

image
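
上面描述的测试过程大致如下。nginx配置文件路径 /etc/nginx/conf.d/default.conf 以该镜像实际情况为准,属于假设:

bash
# 1.进入某个Pod,把监听端口从80改成8080并热加载
[root@master231 probe]# kubectl exec -it <pod名称> -- sh
/ # sed -i 's/listen.*80;/listen 8080;/' /etc/nginx/conf.d/default.conf
/ # nginx -s reload

# 2.tcpSocket探测80端口失败,连续失败3次后容器被重启,RESTARTS计数加1
[root@master231 probe]# kubectl get pods -w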

5、livenessProbe探针之grpc探测方式
bash
[root@master231 probe]# cat > 04-deploy-livenessProbe-grpc.yaml << ''EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-livenessprobe-grpc spec: replicas: 5 selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: restartPolicy: Always containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/etcd:3.5.10 name: web imagePullPolicy: IfNotPresent command: - /opt/bitnami/etcd/bin/etcd - --data-dir=/tmp/etcd - --listen-client-urls=http://0.0.0.0:2379 - --advertise-client-urls=http://127.0.0.1:2379 - --log-level=debug ports: - containerPort: 2379 livenessProbe: # 对grpc端口发起grpc调用,目前属于alpha测试阶段,如果真的想要使用,请在更高版本关注,比如k8s 1.24+ # 在1.23.17版本中,如果检测失败,会触发警告,但不会重启容器只是会有警告事件。 grpc: port: 2379 # 指定服务,但是服务名称我是瞎写的,实际工作中会有开发告诉你 service: /health failureThreshold: 3 initialDelaySeconds: 10 periodSeconds: 1 successThreshold: 1 timeoutSeconds: 1 EOF # 不会触发重启

image

6、readinessProbe探针之exec探测方式
bash
[root@master231 probe]# cat > 05-deploy-readinessprobe-livenessProbe-exec.yaml <<''EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-livenessprobe-readinessprobe-exec spec: revisionHistoryLimit: 1 strategy: type: "RollingUpdate" rollingUpdate: maxUnavailable: 1 maxSurge: 2 replicas: 3 selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: restartPolicy: Always containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 imagePullPolicy: Always ports: - containerPort: 80 command: - /bin/sh - -c - nginx; touch /tmp/weixiang-linux-healthy; sleep 30; rm -f /tmp/weixiang-linux-healthy; sleep 600 livenessProbe: exec: command: - cat - /tmp/weixiang-linux-healthy failureThreshold: 3 initialDelaySeconds: 65 periodSeconds: 1 successThreshold: 1 timeoutSeconds: 1 readinessProbe: # 可用性检查,周期性检查服务是否可用,从而判断容器是否就绪. exec: # 使用exec的方式去做健康检查 command: # 自定义检查的命令 - cat - /tmp/weixiang-linux-healthy # 同样是通过检查文件是否存在来判断。 failureThreshold: 3 # 连续失败 3 次后,判定容器未就绪。 initialDelaySeconds: 15 # 容器启动后,延迟 15 秒开始第一次就绪探测。这个时间点健康文件是存在的。 periodSeconds: 1 successThreshold: 1 timeoutSeconds: 1 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxain spec: clusterIP: "10.200.20.25" selector: apps: xiuxian ports: - port: 80 EOF 测试方式: [root@master231 sts]# while true; do curl 10.200.20.25 ; sleep 0.1;done # 刚启动15秒后开始READY就绪,这个时间点是能curl通的(图一) # 30秒后文件被删除,之后是curl不通的

image

image

bash
未就绪的Pod地址会被放到Endpoints的NotReadyAddresses列表里面
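
可以用下面的命令持续观察就绪状态变化时,Pod地址在Addresses与NotReadyAddresses之间的迁移:

bash
# 持续观察Endpoints可用地址的变化(健康文件被删除后,地址会被移出Addresses)
[root@master231 probe]# kubectl get endpoints svc-xiuxain -w

# 查看地址当前位于Addresses还是NotReadyAddresses列表
[root@master231 probe]# kubectl describe endpoints svc-xiuxain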

image

image

image

7、readinessProbe探针之httpGet探测方式
bash
[root@master231 probe]# cat > 06-deploy-readinessProbe-livenessProbe-httpGet.yaml << ''EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-livenessprobe-readinessprobe-httpget spec: revisionHistoryLimit: 1 strategy: type: "RollingUpdate" rollingUpdate: maxUnavailable: 1 maxSurge: 2 replicas: 3 selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: restartPolicy: Always containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 imagePullPolicy: Always ports: - containerPort: 80 command: - /bin/sh - -c - touch /tmp/weixiang-linux-healthy; sleep 30; rm -f /tmp/weixiang-linux-healthy; sleep 600 livenessProbe: exec: command: - cat - /tmp/weixiang-linux-healthy failureThreshold: 3 initialDelaySeconds: 180 periodSeconds: 1 successThreshold: 1 timeoutSeconds: 1 # 可用性检查,周期性检查服务是否可用,从而判断容器是否就绪. readinessProbe: # 使用httpGet的方式去做健康检查 httpGet: # 指定访问的端口号 port: 80 path: /index.html failureThreshold: 3 initialDelaySeconds: 15 periodSeconds: 1 successThreshold: 1 timeoutSeconds: 1 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxain spec: clusterIP: 10.200.20.25 selector: apps: xiuxian ports: - port: 80 EOF 测试方式: [root@master231 ~]# while true;do curl 10.200.20.25;sleep 0.5;done # 报错,因为80端口没起来 [root@master231 ~/count/probe]#curl 10.200.20.25 curl: (7) Failed to connect to 10.200.20.25 port 80 after 0 ms: Couldnt connect to server # 手动把某一个容器的nginx启动 [root@master231 ~/count/probe]#kubectl exec -it deploy-livenessprobe-readinessprobe-httpget-55d9cb49f9-7lc8d -- nginx 2025/07/25 14:22:30 [notice] 9#9: using the "epoll" event method 2025/07/25 14:22:30 [notice] 9#9: nginx/1.20.1 2025/07/25 14:22:30 [notice] 9#9: built by gcc 10.2.1 20201203 (Alpine 10.2.1_pre1) 2025/07/25 14:22:30 [notice] 9#9: OS: Linux 6.8.0-51-generic 2025/07/25 14:22:30 [notice] 9#9: getrlimit(RLIMIT_NOFILE): 524288:524288 2025/07/25 14:22:30 [notice] 9#15: signal 1 (SIGHUP) received, reconfiguring 2025/07/25 14:22:30 [notice] 15#15: start worker processes # 可以看到状态已经Running了 [root@master231 ~/count/probe]#kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-livenessprobe-readinessprobe-httpget-55d9cb49f9-7lc8d 1/1 Running 0 2m33s 10.100.2.95 worker233 <none> <none> deploy-livenessprobe-readinessprobe-httpget-55d9cb49f9-7sd8l 0/1 Running 0 2m33s 10.100.1.204 worker232 <none> <none> deploy-livenessprobe-readinessprobe-httpget-55d9cb49f9-b5fmf 0/1 Running 0 2m33s 10.100.2.96 worker233 <none> <none> # 查看ep的地址池,发现可用地址有容器的ip [root@master231 ~/count/probe]#kubectl describe ep svc-xiuxain Name: svc-xiuxain Namespace: default Labels: <none> Annotations: endpoints.kubernetes.io/last-change-trigger-time: 2025-07-25T14:22:31Z Subsets: Addresses: 10.100.2.95 NotReadyAddresses: 10.100.1.204,10.100.2.96 Ports: Name Port Protocol ---- ---- -------- <unset> 80 TCP Events: <none> # 发现顺利curl通了 [root@master231 ~/count/probe]#curl 10.200.20.25 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html>
8、readinessProbe探针之tcpSocket探测方式
bash
[root@master231 probe]# cat > 07-deploy-readinessProbe-livenessProbe-tcpSocket.yaml <<''EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-livenessprobe-readinessprobe-tcpsocket spec: revisionHistoryLimit: 1 strategy: type: "RollingUpdate" rollingUpdate: maxUnavailable: 1 maxSurge: 2 replicas: 3 selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: restartPolicy: Always containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 imagePullPolicy: Always ports: - containerPort: 80 command: - /bin/sh - -c - touch /tmp/weixiang-linux-healthy; sleep 30; rm -f /tmp/weixiang-linux-healthy; sleep 600 livenessProbe: exec: command: - cat - /tmp/weixiang-linux-healthy failureThreshold: 3 initialDelaySeconds: 300 periodSeconds: 1 successThreshold: 1 timeoutSeconds: 1 # 可用性检查,周期性检查服务是否可用,从而判断容器是否就绪. readinessProbe: # 使用tcpSocket的方式去做健康检查 tcpSocket: # 探测80端口是否存活 port: 80 failureThreshold: 3 initialDelaySeconds: 10 periodSeconds: 1 successThreshold: 1 timeoutSeconds: 1 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxain spec: clusterIP: 10.200.20.25 selector: apps: xiuxian ports: - port: 80 EOF # pod没有就绪,因为nginx没起 [root@master231 ~/count/probe]#kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-livenessprobe-readinessprobe-tcpsocket-7db74bdb6f-hpjbv 0/1 Running 0 23s 10.100.2.98 worker233 <none> <none> deploy-livenessprobe-readinessprobe-tcpsocket-7db74bdb6f-ljnhm 0/1 Running 0 23s 10.100.1.205 worker232 <none> <none> deploy-livenessprobe-readinessprobe-tcpsocket-7db74bdb6f-ntk6r 0/1 Running 0 23s 10.100.2.97 worker233 <none> <none> # 手动启动nginx [root@master231 ~/count/probe]#kubectl exec -it deploy-livenessprobe-readinessprobe-tcpsocket-7db74bdb6f-hpjbv -- nginx 2025/07/25 14:38:48 [notice] 10#10: using the "epoll" event method 2025/07/25 14:38:48 [notice] 10#10: nginx/1.20.1 2025/07/25 14:38:48 [notice] 10#10: built by gcc 10.2.1 20201203 (Alpine 10.2.1_pre1) 2025/07/25 14:38:48 [notice] 10#10: OS: Linux 6.8.0-51-generic 2025/07/25 14:38:48 [notice] 10#10: getrlimit(RLIMIT_NOFILE): 524288:524288 2025/07/25 14:38:48 [notice] 16#16: start worker processes 2025/07/25 14:38:48 [notice] 16#16: start worker process 17 2025/07/25 14:38:48 [notice] 16#16: start worker process 18 # 测试 [root@master231 ~/count/probe]#while true;do curl 10.200.20.25;sleep 0.5;done <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style>

image

9、startupProbe启动探针实战
bash
[root@master231 probe]# cat > 08-deploy-startupProbe-httpGet.yaml <<''EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-livenessprobe-readinessprobe-startupprobe-httpget spec: revisionHistoryLimit: 1 strategy: type: "RollingUpdate" rollingUpdate: maxUnavailable: 1 maxSurge: 2 replicas: 3 selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: volumes: - name: data emptyDir: {} # 初始化容器仅在Pod创建时执行一次,容器重启时并不会调用初始化容器。 initContainers: - name: init01 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 volumeMounts: - name: data mountPath: /weixiang command: - /bin/sh - -c - echo "liveness probe test page" >> /weixiang/huozhe.html - name: init02 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 volumeMounts: - name: data mountPath: /weixiang command: - /bin/sh - -c - echo "readiness probe test page" >> /weixiang/weixiang.html - name: init03 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 volumeMounts: - name: data mountPath: /weixiang command: - /bin/sh - -c - echo "startup probe test page" >> /weixiang/start.html containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 volumeMounts: - name: data mountPath: /usr/share/nginx/html # 周期性:判断服务是否健康,若检查不通过,将Pod直接重启。 livenessProbe: httpGet: port: 80 path: /huozhe.html failureThreshold: 3 initialDelaySeconds: 5 periodSeconds: 1 successThreshold: 1 timeoutSeconds: 1 # 周期性: 判断服务是否就绪,若检查不通过,将Pod标记为未就绪状态。 readinessProbe: httpGet: port: 80 path: /weixiang.html failureThreshold: 3 initialDelaySeconds: 10 periodSeconds: 3 successThreshold: 1 timeoutSeconds: 1 # 一次性: 容器启动时做检查,若检查不通过,直接杀死容器。并进行重启! # startupProbe探针通过后才回去执行readinessProbe和livenessProbe哟~ startupProbe: httpGet: port: 80 path: /start.html failureThreshold: 3 # 尽管上面的readinessProbe和livenessProbe数据已经就绪,但必须等待startupProbe的检测成功后才能执行。 initialDelaySeconds: 35 periodSeconds: 3 successThreshold: 1 timeoutSeconds: 1 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxain spec: clusterIP: 10.200.20.25 selector: apps: xiuxian ports: - port: 80 EOF # 根据startupProbe的优先级,38秒后pod才进入就绪状态

image

bash
测试验证: [root@master231 sts]# while true; do curl 10.200.20.25/huozhe.html ; sleep 0.1;done liveness probe test page liveness probe test page liveness probe test page liveness probe test page liveness probe test page liveness probe test page liveness probe test page liveness probe test page liveness probe test page ^C [root@master231 sts]# while true; do curl 10.200.20.25/weixiang.html ; sleep 0.1;done readiness probe test page readiness probe test page readiness probe test page readiness probe test page readiness probe test page readiness probe test page readiness probe test page readiness probe test page readiness probe test page readiness probe test page ^C [root@master231 sts]# while true; do curl 10.200.20.25/start.html ; sleep 0.1;done startup probe test page startup probe test page startup probe test page startup probe test page startup probe test page startup probe test page startup probe test page ^C [root@master231 sts]# 彩蛋:查看容器重启之前的上一个日志信息。 [root@master231 probe]# kubectl logs -f deploy-livenessprobe-readinessprobe-startupprobe-httpget-96k7bw ... 10.0.0.233 - - [20/Apr/2025:06:54:34 +0000] "GET /start.html HTTP/1.1" 200 24 "-" "kube-probe/1.23" "-" 10.0.0.233 - - [20/Apr/2025:06:55:31 +0000] "GET /weixiang.html HTTP/1.1" 200 26 "-" "kube-probe/1.23" "-" 10.0.0.233 - - [20/Apr/2025:06:55:31 +0000] "GET /huozhe.html HTTP/1.1" 200 25 "-" "kube-probe/1.23" "-" 10.0.0.233 - - [20/Apr/2025:06:55:32 +0000] "GET /huozhe.html HTTP/1.1" 200 25 "-" "kube-probe/1.23" "-" 10.0.0.233 - - [20/Apr/2025:06:55:33 +0000] "GET /huozhe.html HTTP/1.1" 200 25 "-" "kube-probe/1.23" "-" 10.0.0.233 - - [20/Apr/2025:06:55:34 +0000] "GET /huozhe.html HTTP/1.1" 200 25 "-" "kube-probe/1.23" "-" 10.0.0.233 - - [20/Apr/2025:06:55:34 +0000] "GET /weixiang.html HTTP/1.1" 200 26 "-" "kube-probe/1.23" "-" 10.0.0.233 - - [20/Apr/2025:06:55:35 +0000] "GET /huozhe.html HTTP/1.1" 200 25 "-" "kube-probe/1.23" "-" 10.0.0.233 - - [20/Apr/2025:06:55:36 +0000] "GET /huozhe.html HTTP/1.1" 200 25 "-" "kube-probe/1.23" "-" ... # 通过日志可以看出,startupProbe先执行,执行一次

image

26、容器的生命周期lifecycle

bash
- postStart : 容器启动后操作的事情。 - preStop: 容器停止之前做的事情。 Pod的优雅终止及容器的生命周期postStart和preStop [root@master231 19-probe]# cat 09-deploy-lifecycle-postStart-preStop.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-lifecycle spec: replicas: 1 selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: volumes: - name: data hostPath: path: /weixiang-linux # 在pod优雅终止时,定义延迟发送kill信号的时间,此时间可用于pod处理完未处理的请求等状况。 # 默认单位是秒,若不设置默认值为30s。 terminationGracePeriodSeconds: 60 #terminationGracePeriodSeconds: 3 containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 volumeMounts: - name: data mountPath: /data # 定义容器的生命周期。 lifecycle: # 容器启动(command)之后做的事情,如果此函数未执行完成,容器始终处于:' ContainerCreating'状态。 postStart: exec: # command: ["tail","-f","/etc/hosts"] command: - "/bin/sh" - "-c" - "sleep 30;echo \"postStart at $(date +%F_%T)\" >> /data/postStart.log" # 容器停止之前做的事情,这个时间受限于: terminationGracePeriodSeconds preStop: exec: command: - "/bin/sh" - "-c" - "sleep 20;echo \"preStop at $(date +%F_%T)\" >> /data/preStop.log" [root@master231 19-probe]#

bash
# 修改优雅停止时间为3s
[root@master231 ~/count/probe]#vim 09-deploy-lifecycle-postStart-preStop.yaml
...
terminationGracePeriodSeconds: 3
...

# 没有生成preStop.log,因为宽限期只有3秒,preStop钩子里的sleep 20还没执行完,容器就被强制终止了

image
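
可以按下面的步骤在Pod所在节点的hostPath目录里验证这一点(Pod名称为示意值):

bash
# 1.等容器运行超过30秒(postStart钩子执行完)后,删除Pod触发preStop钩子
[root@master231 probe]# kubectl delete pod <pod名称>

# 2.到Pod所在节点查看hostPath目录:postStart.log存在,preStop.log因宽限期只有3秒而没来得及生成
[root@worker233 ~]# ls -l /weixiang-linux/
[root@worker233 ~]# cat /weixiang-linux/postStart.log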

27、kubelet启动容器的原理

bash
- kubelet启动容器的原理图解【了解】 1.kubelet创建Pod的全流程: - 1.kubelet调用CRI接口创建容器,底层支持docker|containerd作为容器运行时; - 2.底层基于runc(符合OCI规范)创建容器: - 3.优先创建pause基础镜像; - 4.创建初始化容器 - 5.业务容器,业务容器如果定义了优雅终止,探针则顺序如下: - 5.1 启动命令【COMMAND】 - 5.2 启动postStart; - 5.3 Probe - StartupProbe - LivenessProbe | readinessProbe - 5.4 启动PreStop 受限于优雅终止时间(默认30s)。 2.测试案例 [root@master231 pods]# cat > 10-deploy-shaonao-workflow.yaml <<'EOF' apiVersion: apps/v1 kind: Deployment metadata: name: deploy-lifecycle spec: replicas: 1 selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: volumes: - name: data hostPath: path: /weixiang-shaonao - name: dt hostPath: path: /etc/localtime initContainers: - name: init01 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 volumeMounts: - name: data mountPath: /weixiang - name: dt mountPath: /etc/localtime command: - "/bin/sh" - "-c" - "echo \"initContainer at $(date +%F_%T)\" > /weixiang/haha.log" terminationGracePeriodSeconds: 3 containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 command: - /bin/sh - -c - "echo \"command at $(date +%F_%T)\" >> /usr/share/nginx/html/haha.log; sleep 600" volumeMounts: - name: data mountPath: /usr/share/nginx/html - name: dt mountPath: /etc/localtime imagePullPolicy: IfNotPresent livenessProbe: exec: command: - "/bin/sh" - "-c" - "echo \"livenessProbe at $(date +%F_%T)\" >> /usr/share/nginx/html/haha.log" failureThreshold: 3 initialDelaySeconds: 0 periodSeconds: 3 successThreshold: 1 timeoutSeconds: 1 readinessProbe: exec: command: - "/bin/sh" - "-c" - "echo \"readinessProbe at $(date +%F_%T)\" >> /usr/share/nginx/html/haha.log" failureThreshold: 3 initialDelaySeconds: 0 periodSeconds: 3 successThreshold: 1 timeoutSeconds: 1 startupProbe: exec: command: - "/bin/sh" - "-c" - "echo \"startupProbe at $(date +%F_%T)\" >> /usr/share/nginx/html/haha.log" failureThreshold: 3 initialDelaySeconds: 0 periodSeconds: 3 successThreshold: 1 timeoutSeconds: 1 lifecycle: postStart: exec: command: - "/bin/sh" - "-c" - "sleep 10;echo \"postStart at $(date +%F_%T)\" >> /usr/share/nginx/html/haha.log" preStop: exec: command: - "/bin/sh" - "-c" - "echo \"preStop at $(date +%F_%T)\" >> /usr/share/nginx/html/haha.log;sleep 30" EOF 3.测试验证 [root@worker233 ~]# tail -100f /weixiang-shaonao/haha.log initContainer at 2025-04-20_07:34:44 command at 2025-04-20_07:34:45 postStart at 2025-04-20_07:34:55 startupProbe at 2025-04-20_07:34:56 readinessProbe at 2025-04-20_07:34:56 livenessProbe at 2025-04-20_07:34:59 readinessProbe at 2025-04-20_07:34:59 readinessProbe at 2025-04-20_07:35:02 livenessProbe at 2025-04-20_07:35:02 livenessProbe at 2025-04-20_07:35:05 ... preStop at 2025-04-20_07:36:29

28、Pod的5个阶段及容器的3种状态

bash
参考链接: https://kubernetes.io/zh-cn/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase - Pod的5个阶段 Pending(挂起): kubectl get pods xiuxian-mutiple -o wide Pod 已被 Kubernetes 系统接受,但有一个或者多个容器尚未创建亦未运行。 此阶段包括等待Pod被调度的时间和通过网络下载镜像的时间。 Running(运行中): Pod 已经绑定到了某个节点,Pod 中所有的容器都已被创建。 至少有一个容器仍在运行,或者正处于启动或重启状态。 Succeeded(成功): Pod 中的所有容器都已成功结束,并且不会再重启。 Failed(失败): Pod 中的所有容器都已终止,并且至少有一个容器是因为失败终止。 也就是说,容器以非 0 状态退出或者被系统终止,且未被设置为自动重启。 Unknown(未知): 因为某些原因无法取得 Pod 的状态。 这种情况通常是因为与 Pod 所在主机通信失败。 - 容器3种状态 Waiting(等待): 如果容器并不处在 Running 或 Terminated 状态之一,它就处在 Waiting 状态。 处于 Waiting 状态的容器仍在运行它完成启动所需要的操作:例如, 从某个容器镜像仓库拉取容器镜像,或者向容器应用 Secret 数据等等。 当你使用 kubectl 来查询包含 Waiting 状态的容器的 Pod 时,你也会看到一个 Reason 字段,其中给出了容器处于等待状态的原因。 Running(运行中): Running 状态表明容器正在执行状态并且没有问题发生。 如果配置了 postStart 回调,那么该回调已经执行且已完成。 如果你使用 kubectl 来查询包含 Running 状态的容器的 Pod 时, 你也会看到关于容器进入 Running 状态的信息。 Terminated(已终止): 处于 Terminated 状态的容器开始执行后,或者运行至正常结束或者因为某些原因失败。 如果你使用 kubectl 来查询包含 Terminated 状态的容器的 Pod 时, 你会看到容器进入此状态的原因、退出代码以及容器执行期间的起止时间。 实战案例: [root@master231 01-pods]# cat 11-pods-Troubleshooting-describe.yaml apiVersion: v1 kind: Pod metadata: name: xiuxian-describe labels: apps: xiuxian spec: containers: - name: c1 image: harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111 ports: - containerPort: 80 [root@master231 01-pods]# [root@master231 01-pods]# kubectl apply -f 11-pods-Troubleshooting-describe.yaml pod/xiuxian-describe created [root@master231 01-pods]# [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-describe 0/1 ContainerCreating 0 3s <none> worker233 <none> <none> [root@master231 01-pods]# [root@master231 01-pods]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-describe 0/1 ErrImagePull 0 19s 10.100.2.140 worker233 <none> <none> [root@master231 01-pods]# [root@master231 01-pods]# kubectl describe po xiuxian-describe Name: xiuxian-describe # Pod的名称 Namespace: default # Pod的名称空间 Priority: 0 # 调度的优先级 Node: worker233/10.0.0.233 # 调度的节点及对应的IP地址 Start Time: Fri, 25 Jul 2025 16:12:30 +0800 # 启动时间 Labels: apps=xiuxian # 标签 Annotations: <none> # 资源注解 Status: Pending # Pod的阶段 IP: 10.100.2.140 # Pod的IP地址 IPs: # IP地址列表。 IP: 10.100.2.140 Containers: # 容器信息 c1: # 容器的名称 Container ID: # 容器ID Image: harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111 # 容器镜像 Image ID: # 镜像ID Port: 80/TCP # 暴露的度哪款 Host Port: 0/TCP # 绑定的主机端口 State: Waiting # 容器的状态。 Reason: ErrImagePull # 容器处于该状态的原因。 Ready: False # 容器是否就绪。 Restart Count: 0 # 容器的重启次数。 Environment: <none> # 容器传递的环境变量。 Mounts: # 容器的挂载信息 /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-77vgs (ro) Conditions: # 容器的状态条件 Type Status Initialized True Ready False ContainersReady False PodScheduled True Volumes: # 存储卷信息 kube-api-access-77vgs: Type: Projected (a volume that contains injected data from multiple sources) TokenExpirationSeconds: 3607 ConfigMapName: kube-root-ca.crt ConfigMapOptional: <nil> DownwardAPI: true QoS Class: BestEffort Node-Selectors: <none> # 节点选择器。 Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s # 污点容忍。 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s Events: # 事件信息 Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 22s default-scheduler Successfully assigned default/xiuxian-describe to worker233 Normal Pulling 21s kubelet Pulling image "harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111" Warning Failed 6s kubelet Failed to pull image 
"harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111": rpc error: code = Unknown desc = Error response from daemon: Get "https://harbor250.weixiang.com/v2/": net/http: request canceled (Client.Timeout exceeded while awaiting headers) Warning Failed 6s kubelet Error: ErrImagePull Normal BackOff 5s kubelet Back-off pulling image "harbor250.weixiang.com/weixiang-xiuxian/apps:v11111111111" Warning Failed 5s kubelet Error: ImagePullBackOff [root@master231 01-pods]#

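除了describe,也可以用jsonpath直接取出Pod的阶段和容器的状态字段:

bash
# 查看Pod所处的阶段(Pending/Running/Succeeded/Failed/Unknown)
[root@master231 01-pods]# kubectl get pod xiuxian-describe -o jsonpath='{.status.phase}{"\n"}'

# 查看容器的状态(Waiting/Running/Terminated)及其原因
[root@master231 01-pods]# kubectl get pod xiuxian-describe -o jsonpath='{.status.containerStatuses[0].state}{"\n"}'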
29、存储卷进阶

image

bash
volumes不直接挂载到后端存储,而是挂载一个持久卷声明PVC,只需声明需要多大的存储资源(例如期望3Gi、上限4Gi);但需要手动创建一个PV关联到后端存储,PVC可以自动与PV进行绑定。
官方又提出了存储类(StorageClass,简称SC)的概念:PVC直接向SC申请,申请后会自动创建PV,但需要单独部署SC的供应组件;存储类可以有多个,也可以指定默认的存储类。
如果存在一个满足所有条件的静态PV,同时也存在一个被标记为(default)的StorageClass,那么未指定storageClassName的PVC会优先绑定到那个已经存在的静态PV上。
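
可以用下面的命令确认集群中哪个存储类被标记为默认,以及某个PVC最终绑定到的是静态PV还是动态创建的PV:

bash
# 名称后带(default)标记的即为默认存储类
[root@master231 ~]# kubectl get sc

# 查看PVC绑定的PV;STORAGECLASS列为空通常说明绑定的是手动创建的静态PV
[root@master231 ~]# kubectl get pvc
[root@master231 ~]# kubectl get pv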

1、pv,pvc,sc之间的关系
bash
- pv
    pv用于和后端存储对接的资源,关联后端存储。

- sc
    sc可以动态创建pv的资源,关联后端存储。

- pvc
    可以向pv或者sc进行资源请求,获取特定的存储。

pod只需要在存储卷声明使用哪个pvc即可。
2、手动创建pv和pvc及pod引用
bash
1.手动创建pv 1.1 创建工作目录 [root@master231 ~]# mkdir -pv /yinzhengjie/data/nfs-server/pv/linux/pv00{1,2,3} mkdir: created directory '/yinzhengjie/data/nfs-server/pv' mkdir: created directory '/yinzhengjie/data/nfs-server/pv/linux' mkdir: created directory '/yinzhengjie/data/nfs-server/pv/linux/pv001' mkdir: created directory '/yinzhengjie/data/nfs-server/pv/linux/pv002' mkdir: created directory '/yinzhengjie/data/nfs-server/pv/linux/pv003' [root@master231 ~]# [root@master231 ~]# tree /yinzhengjie/data/nfs-server/pv/linux /yinzhengjie/data/nfs-server/pv/linux ├── pv001 ├── pv002 └── pv003 3 directories, 0 files [root@master231 ~]# 1.2 编写资源清单 [root@master231 20-persistentvolumes]# cat > 01-manual-pv.yaml <<''EOF apiVersion: v1 kind: PersistentVolume metadata: name: weixiang-linux-pv01 labels: school: weixiang spec: # 声明PV的访问模式,常用的有"ReadWriteOnce","ReadOnlyMany"和"ReadWriteMany": # ReadWriteOnce:(简称:"RWO") # 只允许单个worker节点读写存储卷,但是该节点的多个Pod是可以同时访问该存储卷的。 # ReadOnlyMany:(简称:"ROX") # 允许多个worker节点进行只读存储卷。 # ReadWriteMany:(简称:"RWX") # 允许多个worker节点进行读写存储卷。 # ReadWriteOncePod:(简称:"RWOP") # 该卷可以通过单个Pod以读写方式装入。 # 如果您想确保整个集群中只有一个pod可以读取或写入PVC,请使用ReadWriteOncePod访问模式。 # 这仅适用于CSI卷和Kubernetes版本1.22+。 accessModes: - ReadWriteMany # 声明存储卷的类型为nfs nfs: path: /yinzhengjie/data/nfs-server/pv/linux/pv001 server: 10.1.24.13 # 指定存储卷的回收策略,常用的有"Retain"和"Delete" # Retain: # "保留回收"策略允许手动回收资源。 # 删除PersistentVolumeClaim时,PersistentVolume仍然存在,并且该卷被视为"已释放"。 # 在管理员手动回收资源之前,使用该策略其他Pod将无法直接使用。 # Delete: # 对于支持删除回收策略的卷插件,k8s将删除pv及其对应的数据卷数据。 # Recycle: # 对于"回收利用"策略官方已弃用。相反,推荐的方法是使用动态资源调配。 # 如果基础卷插件支持,回收回收策略将对卷执行基本清理(rm -rf /thevolume/*),并使其再次可用于新的声明。 persistentVolumeReclaimPolicy: Retain # 声明存储的容量 capacity: storage: 2Gi --- apiVersion: v1 kind: PersistentVolume metadata: name: weixiang-linux-pv02 labels: school: weixiang spec: accessModes: - ReadWriteMany nfs: path: /yinzhengjie/data/nfs-server/pv/linux/pv002 server: 10.1.24.13 persistentVolumeReclaimPolicy: Retain capacity: storage: 5Gi --- apiVersion: v1 kind: PersistentVolume metadata: name: weixiang-linux-pv03 labels: school: weixiang spec: accessModes: - ReadWriteMany nfs: path: /yinzhengjie/data/nfs-server/pv/linux/pv003 server: 10.1.24.13 persistentVolumeReclaimPolicy: Retain capacity: storage: 10Gi EOF 1.3 创建pv [root@master231 20-persistentvolumes]# kubectl apply -f 01-manual-pv.yaml persistentvolume/weixiang-linux-pv01 created persistentvolume/weixiang-linux-pv02 created persistentvolume/weixiang-linux-pv03 created [root@master231 20-persistentvolumes]# [root@master231 20-persistentvolumes]# kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE weixiang-linux-pv01 2Gi RWX Retain Available 5s weixiang-linux-pv02 5Gi RWX Retain Available 5s weixiang-linux-pv03 10Gi RWX Retain Available 5s [root@master231 20-persistentvolumes]# 相关资源说明: NAME : pv的名称 CAPACITY : pv的容量 ACCESS MODES: pv的访问模式 RECLAIM POLICY: pv的回收策略。 STATUS : pv的状态。 CLAIM: pv被哪个pvc使用。 STORAGECLASS sc的名称。 REASON pv出错时的原因。 AGE 创建的时间。 2.手动创建pvc [root@master231 21-persistentvolumeclaims]# cat > 01-manual-pvc.yaml <<''EOF apiVersion: v1 kind: PersistentVolumeClaim metadata: name: pvc001 spec: # 声明要是用的pv # volumeName: weixiang-linux-pv03 # 声明资源的访问模式 accessModes: - ReadWriteMany # 声明资源的使用量 resources: limits: storage: 4Gi requests: storage: 3Gi EOF [root@master231 21-persistentvolumeclaims]# kubectl get pvc,pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE persistentvolume/weixiang-linux-pv01 2Gi RWX Retain Available 5m31s 
persistentvolume/weixiang-linux-pv02 5Gi RWX Retain Available 5m31s persistentvolume/weixiang-linux-pv03 10Gi RWX Retain Available 5m31s [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# kubectl apply -f 01-manual-pvc.yaml persistentvolumeclaim/pvc001 created [root@master231 21-persistentvolumeclaims]# # pvc自动关联到pv02上,虽然配置文件没指定,但是pv02的容量跟pvc的期望容量是最接近的 [root@master231 21-persistentvolumeclaims]# kubectl get pvc,pv NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE persistentvolumeclaim/pvc001 Bound weixiang-linux-pv02 5Gi RWX 1s NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE persistentvolume/weixiang-linux-pv01 2Gi RWX Retain Available 5m38s persistentvolume/weixiang-linux-pv02 5Gi RWX Retain Bound default/pvc001 5m38s persistentvolume/weixiang-linux-pv03 10Gi RWX Retain Available 5m38s [root@master231 21-persistentvolumeclaims]# 3.Pod引用pvc [root@master231 21-persistentvolumeclaims]# cat 02-deploy-pvc.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-pvc-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: volumes: - name: data persistentVolumeClaim: # 声明存储卷的类型是pvc claimName: pvc001 # 声明pvc的名称 - name: dt hostPath: path: /etc/localtime initContainers: # 定义了初始化容器列表。 - name: init01 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 volumeMounts: - name: data # 引用上面定义的名为 "data" 的存储卷 (即 pvc001) mountPath: /weixiang # 指定将 "data" 卷挂载到容器内的 /weixiang 目录 - name: dt mountPath: /etc/localtime command: - /bin/sh - -c - date -R > /weixiang/index.html ; echo www.weixiang.com >> /weixiang/index.html containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 volumeMounts: - name: data mountPath: /usr/share/nginx/html - name: dt mountPath: /etc/localtime [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# kubectl apply -f 02-deploy-pvc.yaml deployment.apps/deploy-pvc-xiuxian created [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-pvc-xiuxian-86f6d8d54d-24b28 1/1 Running 0 54s 10.100.2.145 worker233 <none> <none> deploy-pvc-xiuxian-86f6d8d54d-gklnh 1/1 Running 0 54s 10.100.2.146 worker233 <none> <none> deploy-pvc-xiuxian-86f6d8d54d-lq246 1/1 Running 0 54s 10.100.1.6 worker232 <none> <none> [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# curl 10.100.2.146 Sat, 26 Jul 2025 10:52:27 +0800 www.weixiang.com [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# curl 10.100.2.145 Sat, 26 Jul 2025 10:52:27 +0800 www.weixiang.com [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# curl 10.100.1.6 Sat, 26 Jul 2025 10:52:27 +0800 www.weixiang.com [root@master231 21-persistentvolumeclaims]# 4.基于Pod找到后端的pv 4.1 找到pvc的名称 [root@master231 21-persistentvolumeclaims]# kubectl describe pod deploy-pvc-xiuxian-86f6d8d54d-24b28 Name: deploy-pvc-xiuxian-86f6d8d54d-24b28 Namespace: default ... Volumes: data: Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace) ClaimName: pvc001 # 找到pvc的名称 ReadOnly: false ... 
4.2 基于pvc找到与之关联的pv [root@master231 21-persistentvolumeclaims]# kubectl get pvc pvc001 NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE pvc001 Bound weixiang-linux-pv02 5Gi RWX 10m [root@master231 21-persistentvolumeclaims]# # 根据上面找到的pv名称weixiang-linux-pv02 [root@master231 21-persistentvolumeclaims]# kubectl get pv weixiang-linux-pv02 NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE weixiang-linux-pv02 5Gi RWX Retain Bound default/pvc001 16m 4.3 查看pv的详细信息 [root@master231 21-persistentvolumeclaims]# kubectl describe pv weixiang-linux-pv02 Name: weixiang-linux-pv02 Labels: school=weixiang Annotations: pv.kubernetes.io/bound-by-controller: yes Finalizers: [kubernetes.io/pv-protection] StorageClass: Status: Bound Claim: default/pvc001 Reclaim Policy: Retain Access Modes: RWX VolumeMode: Filesystem Capacity: 5Gi Node Affinity: <none> Message: Source: # 重点关注字段 Type: NFS (an NFS mount that lasts the lifetime of a pod) Server: 10.0.0.231 Path: /yinzhengjie/data/nfs-server/pv/linux/pv002 ReadOnly: false Events: <none> [root@master231 21-persistentvolumeclaims]# 4.4 验证数据的内容 [root@master231 21-persistentvolumeclaims]# ll /yinzhengjie/data/nfs-server/pv/linux/pv002 total 12 drwxr-xr-x 2 root root 4096 Jul 26 10:52 ./ drwxr-xr-x 5 root root 4096 Jul 26 10:37 ../ -rw-r--r-- 1 root root 50 Jul 26 10:52 index.html [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# cat /yinzhengjie/data/nfs-server/pv/linux/pv002/index.html Sat, 26 Jul 2025 10:52:27 +0800 www.weixiang.com [root@master231 21-persistentvolumeclaims]#
3、简单陈述下CSI,CRI,CNI的区别?
bash
CSI: Container Storage Interface,容器存储接口。凡是实现了该接口的存储系统,K8S都可以把数据存储到该系统中。
CRI: Container Runtime Interface,容器运行时接口。凡是实现了该接口的容器运行时,K8S底层都可以调用它来管理容器。
CNI: Container Network Interface,容器网络接口。凡是实现了该接口的网络插件,K8S的Pod都可以使用其提供的网络。
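
可以在集群里分别确认这三类接口当前对接的实现:

bash
# CRI:查看各节点使用的容器运行时(CONTAINER-RUNTIME列)
[root@master231 ~]# kubectl get nodes -o wide

# CNI:查看节点上安装的CNI插件配置文件
[root@worker233 ~]# ls /etc/cni/net.d/

# CSI:查看集群中已注册的CSI驱动
[root@master231 ~]# kubectl get csidrivers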
4、基于nfs4.9.0版本实现动态存储类
bash
推荐阅读: https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/docs/install-csi-driver-v4.9.0.md https://kubernetes.io/docs/concepts/storage/storage-classes/#nfs 1.删除资源 [root@master231 21-persistentvolumeclaims]# ll total 16 drwxr-xr-x 2 root root 4096 Jul 26 10:52 ./ drwxr-xr-x 26 root root 4096 Jul 26 10:43 ../ -rw-r--r-- 1 root root 309 Jul 26 10:45 01-manual-pvc.yaml -rw-r--r-- 1 root root 1070 Jul 26 10:52 02-deploy-pvc.yaml [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# kubectl delete -f . persistentvolumeclaim "pvc001" deleted deployment.apps "deploy-pvc-xiuxian" deleted [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# cd ../20-persistentvolumes/ [root@master231 20-persistentvolumes]# ll total 12 drwxr-xr-x 2 root root 4096 Jul 26 10:39 ./ drwxr-xr-x 26 root root 4096 Jul 26 10:43 ../ -rw-r--r-- 1 root root 2489 Jul 26 10:39 01-manual-pv.yaml [root@master231 20-persistentvolumes]# [root@master231 20-persistentvolumes]# kubectl delete -f 01-manual-pv.yaml persistentvolume "weixiang-linux-pv01" deleted persistentvolume "weixiang-linux-pv02" deleted persistentvolume "weixiang-linux-pv03" deleted [root@master231 20-persistentvolumes]# [root@master231 20-persistentvolumes]# 2.克隆代码 [root@master231 nfs]# git clone https://github.com/kubernetes-csi/csi-driver-nfs.git 如果下载不了的SVIP: [root@master231 manifests]# mkdir 22-storageclasses [root@master231 manifests]# cd 22-storageclasses [root@master231 22-storageclasses]# [root@master231 22-storageclasses]# wget http://192.168.21.253/Resources/Kubernetes/sc/nfs/code/csi-driver-nfs-4.9.0.tar.gz [root@master231 22-storageclasses]# [root@master231 22-storageclasses]# tar xf csi-driver-nfs-4.9.0.tar.gz [root@master231 22-storageclasses]# [root@master231 22-storageclasses]# rm -f csi-driver-nfs-4.9.0.tar.gz [root@master231 22-storageclasses]# [root@master231 22-storageclasses]# ll total 12 drwxr-xr-x 3 root root 4096 Jul 26 11:15 ./ drwxr-xr-x 27 root root 4096 Jul 26 11:14 ../ drwxrwxr-x 13 root root 4096 Sep 1 2024 csi-driver-nfs-4.9.0/ [root@master231 22-storageclasses]# [root@master231 22-storageclasses]# cd csi-driver-nfs-4.9.0/ [root@master231 csi-driver-nfs-4.9.0]# [root@master231 csi-driver-nfs-4.9.0]# ls CHANGELOG cmd deploy go.mod LICENSE OWNERS_ALIASES RELEASE.md support.md charts code-of-conduct.md Dockerfile go.sum Makefile pkg release-tools test cloudbuild.yaml CONTRIBUTING.md docs hack OWNERS README.md SECURITY_CONTACTS vendor [root@master231 csi-driver-nfs-4.9.0]# 3.安装nfs动态存储类 [root@master231 csi-driver-nfs-4.9.0]# ./deploy/install-driver.sh v4.9.0 local use local deploy Installing NFS CSI driver, version: v4.9.0 ... serviceaccount/csi-nfs-controller-sa created serviceaccount/csi-nfs-node-sa created clusterrole.rbac.authorization.k8s.io/nfs-external-provisioner-role created clusterrolebinding.rbac.authorization.k8s.io/nfs-csi-provisioner-binding created csidriver.storage.k8s.io/nfs.csi.k8s.io created deployment.apps/csi-nfs-controller created daemonset.apps/csi-nfs-node created NFS CSI driver installed successfully. 
[root@master231 csi-driver-nfs-4.9.0]# 4.验证是否安装成功 [root@master231 csi-driver-nfs]# kubectl -n kube-system get pod -o wide -l app NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES csi-nfs-controller-5c5c695fb-6psv8 4/4 Running 0 4s 10.0.0.232 worker232 <none> <none> csi-nfs-node-bsmr7 3/3 Running 0 3s 10.0.0.232 worker232 <none> <none> csi-nfs-node-ghtvt 3/3 Running 0 3s 10.0.0.231 master231 <none> <none> csi-nfs-node-s4dm5 3/3 Running 0 3s 10.0.0.233 worker233 <none> <none> [root@master231 csi-driver-nfs]# 温馨提示: 此步骤如果镜像下载不下来,则可以到我的仓库下载"http://192.168.21.253/Resources/Kubernetes/sc/nfs/images/"。 [root@master231 ~/count/22-storageclasses]#docker load -i weixiang-csi-nfs-node-v4.9.0.tar.gz [root@master231 ~/count/22-storageclasses]#docker load -i weixiang-csi-nfs-controller-v4.9.0.tar.gz 5.创建存储类 [root@master231 csi-driver-nfs-4.9.0]# mkdir /yinzhengjie/data/nfs-server/sc/ [root@master231 csi-driver-nfs-4.9.0]# [root@master231 csi-driver-nfs-4.9.0]# cat deploy/v4.9.0/storageclass.yaml ... parameters: server: 10.0.0.231 share: /yinzhengjie/data/nfs-server/sc/ ... [root@master231 csi-driver-nfs-4.9.0]# [root@master231 csi-driver-nfs-4.9.0]# kubectl apply -f deploy/v4.9.0/storageclass.yaml storageclass.storage.k8s.io/nfs-csi created [root@master231 csi-driver-nfs-4.9.0]# [root@master231 csi-driver-nfs-4.9.0]# kubectl get sc NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE nfs-csi nfs.csi.k8s.io Delete Immediate false 3s [root@master231 csi-driver-nfs-4.9.0]# 7.创建pvc测试 [root@master231 21-persistentvolumeclaims]# cat 03-pvc-sc.yaml apiVersion: v1 kind: PersistentVolumeClaim metadata: name: weixiang-linux-pvc-sc spec: # 声明要是用的pv # volumeName: weixiang-linux-pv03 # 声明使用的存储类(storage class,简称sc) storageClassName: nfs-csi # 声明资源的访问模式 accessModes: - ReadWriteMany # 声明资源的使用量 resources: limits: storage: 2Mi requests: storage: 1Mi [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# kubectl apply -f 03-pvc-sc.yaml persistentvolumeclaim/weixiang-linux-pvc-sc created [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# kubectl get pvc NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE weixiang-linux-pvc-sc Bound pvc-9527d300-e94b-48fc-8827-dc8e63adcd52 1Mi RWX nfs-csi 3s [root@master231 21-persistentvolumeclaims]# 8.pod引用pvc [root@master231 21-persistentvolumeclaims]# cat 02-deploy-pvc.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-pvc-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: volumes: - name: data # 声明存储卷的类型是pvc persistentVolumeClaim: # 声明pvc的名称 claimName: weixiang-linux-pvc-sc # 所有由这个Deployment创建的Pod,都去挂载这个weixiang-linux-pvc-sc的PVC - name: dt hostPath: path: /etc/localtime initContainers: - name: init01 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 volumeMounts: - name: data # 挂载那个共享的PVC到/weixiang mountPath: /weixiang - name: dt mountPath: /etc/localtime command: - /bin/sh - -c - date -R > /weixiang/index.html ; echo www.weixiang.com >> /weixiang/index.html containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 volumeMounts: - name: data # # 挂载那个已经被 init 容器写好内容的共享 PVC mountPath: /usr/share/nginx/html - name: dt mountPath: /etc/localtime [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# kubectl apply -f 02-deploy-pvc.yaml deployment.apps/deploy-pvc-xiuxian created [root@master231 21-persistentvolumeclaims]# [root@master231 
21-persistentvolumeclaims]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-pvc-xiuxian-65d4b9bf97-c66gd 1/1 Running 0 5s 10.100.2.148 worker233 <none> <none> deploy-pvc-xiuxian-65d4b9bf97-dmx8s 1/1 Running 0 5s 10.100.1.7 worker232 <none> <none> deploy-pvc-xiuxian-65d4b9bf97-zkkhk 1/1 Running 0 5s 10.100.2.147 worker233 <none> <none> [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# curl 10.100.2.148 Sat, 26 Jul 2025 11:23:57 +0800 www.weixiang.com [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# curl 10.100.1.7 Sat, 26 Jul 2025 11:23:57 +0800 www.weixiang.com [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# curl 10.100.2.147 Sat, 26 Jul 2025 11:23:57 +0800 www.weixiang.com [root@master231 21-persistentvolumeclaims]# 9.验证pod的后端存储数据 [root@master231 21-persistentvolumeclaims]# kubectl describe pod deploy-pvc-xiuxian-65d4b9bf97-c66gd | grep ClaimName ClaimName: weixiang-linux-pvc-sc [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# kubectl get pvc weixiang-linux-pvc-sc NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE weixiang-linux-pvc-sc Bound pvc-9527d300-e94b-48fc-8827-dc8e63adcd52 1Mi RWX nfs-csi 2m23s [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# kubectl describe pv pvc-9527d300-e94b-48fc-8827-dc8e63adcd52 | grep Source -A 5 Source: Type: CSI (a Container Storage Interface (CSI) volume source) Driver: nfs.csi.k8s.io FSType: VolumeHandle: 10.0.0.231#yinzhengjie/data/nfs-server/sc#pvc-9527d300-e94b-48fc-8827-dc8e63adcd52## ReadOnly: false [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# ll /yinzhengjie/data/nfs-server/sc/pvc-9527d300-e94b-48fc-8827-dc8e63adcd52/ total 12 drwxr-xr-x 2 root root 4096 Jul 26 11:23 ./ drwxr-xr-x 3 root root 4096 Jul 26 11:22 ../ -rw-r--r-- 1 root root 50 Jul 26 11:23 index.html [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# cat /yinzhengjie/data/nfs-server/sc/pvc-9527d300-e94b-48fc-8827-dc8e63adcd52/index.html Sat, 26 Jul 2025 11:23:57 +0800 www.weixiang.com [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]#
5、K8S配置默认的存储类及多个存储类定义
bash
1.响应式配置默认存储类 [root@master231 22-storageclasses]# kubectl get sc NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE nfs-csi nfs.csi.k8s.io Delete Immediate false 7m12s [root@master231 22-storageclasses]# [root@master231 22-storageclasses]# kubectl patch sc nfs-csi -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}' storageclass.storage.k8s.io/nfs-csi patched [root@master231 22-storageclasses]# [root@master231 22-storageclasses]# # 可以看到default字段 [root@master231 22-storageclasses]# kubectl get sc NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE nfs-csi (default) nfs.csi.k8s.io Delete Immediate false 7m25s [root@master231 22-storageclasses]# 2.响应式取消默认存储类 [root@master231 22-storageclasses]# kubectl get sc NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE nfs-csi (default) nfs.csi.k8s.io Delete Immediate false 7m25s [root@master231 22-storageclasses]# [root@master231 22-storageclasses]# kubectl patch sc nfs-csi -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}' storageclass.storage.k8s.io/nfs-csi patched [root@master231 22-storageclasses]# # 可以看到default字段已经没了 [root@master231 22-storageclasses]# kubectl get sc NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE nfs-csi nfs.csi.k8s.io Delete Immediate false 7m44s [root@master231 22-storageclasses]# 3.声明式配置多个存储类 [root@master231 storageclasses]# cat > sc-multiple.yaml <<''EOF apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: weixiang-sc-xixi # 配置资源注解【一般用于为资源定义配置信息】 annotations: storageclass.kubernetes.io/is-default-class: "false" # "false" 表示它不是默认的。只有当 PVC 明确指定 'storageClassName: weixiang-sc-xixi' 时,才会使用它。 provisioner: nfs.csi.k8s.io # 表示使用 NFS 的 CSI (容器存储接口) 驱动来创建存储卷 parameters: server: 10.0.0.231 share: /yinzhengjie/data/nfs-server/sc-xixi reclaimPolicy: Delete # 当 PVC 被删除时,对应的 PV 和 NFS 服务器上的物理数据都会被自动删除 volumeBindingMode: Immediate # 一旦 PVC 被创建,供应者就立即创建 PV 并进行绑定,而不管是否有 Pod 要使用它 mountOptions: - nfsvers=4.1 --- apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: weixiang-sc-haha annotations: storageclass.kubernetes.io/is-default-class: "true" # 请把这个存储类作为整个集群的默认选项。 provisioner: nfs.csi.k8s.io parameters: server: 10.0.0.231 share: /yinzhengjie/data/nfs-server/sc-haha EOF [root@master231 22-storageclasses]# kubectl apply -f sc-multiple.yaml storageclass.storage.k8s.io/weixiang-sc-xixi created storageclass.storage.k8s.io/weixiang-sc-haha created [root@master231 22-storageclasses]# [root@master231 22-storageclasses]# kubectl get sc NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE nfs-csi nfs.csi.k8s.io Delete Immediate false 9m36s weixiang-sc-haha (default) nfs.csi.k8s.io Delete Immediate false 9s weixiang-sc-xixi nfs.csi.k8s.io Delete Immediate false 9s [root@master231 22-storageclasses]# 4.准备目录 [root@master231 22-storageclasses]# mkdir -pv /yinzhengjie/data/nfs-server/sc-{xixi,haha} mkdir: created directory '/yinzhengjie/data/nfs-server/sc-xixi' mkdir: created directory '/yinzhengjie/data/nfs-server/sc-haha' [root@master231 22-storageclasses]# [root@master231 22-storageclasses]# 5.测试验证 [root@master231 21-persistentvolumeclaims]# cat 04-pvc-sc-default.yaml apiVersion: v1 kind: PersistentVolumeClaim metadata: name: pvc-default spec: accessModes: - ReadWriteMany resources: limits: storage: 2Mi requests: storage: 1Mi [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# kubectl apply -f 
04-pvc-sc-default.yaml persistentvolumeclaim/pvc-default created [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# kubectl get pvc pvc-default NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE pvc-default Bound pvc-6b2ff6b5-bb38-461d-92ae-27986827d46a 1Mi RWX weixiang-sc-haha 4s [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# 6.pod引用pvc [root@master231 21-persistentvolumeclaims]# kubectl apply -f 02-deploy-pvc.yaml deployment.apps/deploy-pvc-xiuxian configured [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# cat 02-deploy-pvc.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-pvc-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: volumes: - name: data # 声明存储卷的类型是pvc persistentVolumeClaim: # 声明pvc的名称 # claimName: pvc001 # claimName: weixiang-linux-pvc-sc claimName: pvc-default - name: dt hostPath: path: /etc/localtime initContainers: - name: init01 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 volumeMounts: - name: data mountPath: /weixiang - name: dt mountPath: /etc/localtime command: - /bin/sh - -c - date -R > /weixiang/index.html ; echo www.weixiang.com >> /weixiang/index.html containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 volumeMounts: - name: data mountPath: /usr/share/nginx/html - name: dt mountPath: /etc/localtime [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# 7.验证后端存储 [root@master231 21-persistentvolumeclaims]# cat > /usr/local/bin/get-pv.sh <<'EOF' #!/bin/bash POD_NAME=$1 PVC_NAME=`kubectl describe pod $POD_NAME | grep ClaimName | awk '{print $2}'` PV_NAME=`kubectl get pvc ${PVC_NAME} | awk 'NR==2{print $3}'` kubectl describe pv $PV_NAME | grep Source -A 5 EOF [root@master231 21-persistentvolumeclaims]# chmod +x /usr/local/bin/get-pv.sh [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-pvc-xiuxian-679d6d8d5c-28xtd 1/1 Running 0 44s 10.100.2.150 worker233 <none> <none> deploy-pvc-xiuxian-679d6d8d5c-8lbsb 1/1 Running 0 49s 10.100.2.149 worker233 <none> <none> deploy-pvc-xiuxian-679d6d8d5c-j4lp8 1/1 Running 0 46s 10.100.1.8 worker232 <none> <none> [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# curl 10.100.2.150 Sat, 26 Jul 2025 11:33:05 +0800 www.weixiang.com [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# curl 10.100.2.149 Sat, 26 Jul 2025 11:33:05 +0800 www.weixiang.com [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# curl 10.100.1.8 Sat, 26 Jul 2025 11:33:05 +0800 www.weixiang.com [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# get-pv.sh deploy-pvc-xiuxian-679d6d8d5c-28xtd Source: Type: CSI (a Container Storage Interface (CSI) volume source) Driver: nfs.csi.k8s.io FSType: VolumeHandle: 10.0.0.231#yinzhengjie/data/nfs-server/sc-haha#pvc-6b2ff6b5-bb38-461d-92ae-27986827d46a## ReadOnly: false [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# ll /yinzhengjie/data/nfs-server/sc-haha/pvc-6b2ff6b5-bb38-461d-92ae-27986827d46a/ total 12 drwxr-xr-x 2 root root 4096 Jul 26 11:33 ./ drwxr-xr-x 3 root root 4096 Jul 26 11:31 ../ 
-rw-r--r-- 1 root root 50 Jul 26 11:33 index.html [root@master231 21-persistentvolumeclaims]# [root@master231 21-persistentvolumeclaims]# cat /yinzhengjie/data/nfs-server/sc-haha/pvc-6b2ff6b5-bb38-461d-92ae-27986827d46a/index.html Sat, 26 Jul 2025 11:33:05 +0800 www.weixiang.com [root@master231 21-persistentvolumeclaims]#
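小技巧(补充):
  下面两条命令可以快速确认当前集群的默认存储类,属于示意写法,以实际环境为准。
  # 带"(default)"标记的即为默认存储类
  kubectl get sc | awk '/\(default\)/{print $1}'
  # 也可以直接查看对应注解的取值,输出true表示是默认存储类
  kubectl get sc weixiang-sc-haha -o jsonpath='{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}'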

30、sts控制器

bash
参考链接: https://kubernetes.io/zh-cn/docs/concepts/workloads/controllers/statefulset/ https://kubernetes.io/zh-cn/docs/tutorials/stateful-application/basic-stateful-set/
1、StatefulSets概述
bash
以Nginx为例,任意一个Nginx副本挂掉,其处理逻辑都是相同的,即仅需重新创建一个Pod副本即可,这类服务我们称之为无状态服务。
以MySQL主从同步为例,master、slave两个库任意一个挂掉,其处理逻辑是不相同的,这类服务我们称之为有状态服务。

有状态服务面临的难题:
  (1)启动/停止顺序;
  (2)Pod实例的数据是独立存储;
  (3)需要固定的IP地址或者主机名;

StatefulSet一般用于有状态服务,StatefulSets对于需要满足以下一个或多个需求的应用程序很有价值:
  (1)稳定唯一的网络标识符。
  (2)稳定独立持久的存储。
  (3)有序优雅的部署和缩放。
  (4)有序自动的滚动更新。

稳定的网络标识:
  其本质对应的是一个service资源,只不过这个service没有定义VIP,我们称之为headless service,即"无头服务"。
  通过"headless service"来维护Pod的网络身份,会为每个Pod分配一个数字编号并且按照编号顺序部署。
  综上所述,无头服务("headless service")要求满足以下两点:
    (1)将svc资源的clusterIP字段设置为None,即"clusterIP: None";
    (2)将sts资源的serviceName字段声明为无头服务的名称;

独享存储:
  StatefulSet的存储卷使用volumeClaimTemplates(存储卷申请模板)创建。
  当sts资源使用volumeClaimTemplates创建PVC时,同样会为每个Pod分配并创建带唯一编号的pvc,每个pvc绑定对应的pv,从而保证每个Pod都有独立的存储。
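补充示例:
  为了便于理解"稳定的网络标识",这里先给出域名格式的小例子,下一小节会有完整演示。本环境的集群域名为weixiang.com,以实际环境为准。
  # sts的每个Pod都会获得一条固定的DNS记录,格式如下:
  #   <Pod名称>.<无头服务名称>.<名称空间>.svc.<集群域名>
  # 例如: sts-xiuxian-0.svc-headless.default.svc.weixiang.com
  # 同一名称空间内可以使用短域名: sts-xiuxian-0.svc-headless
  # Pod被重建后IP会变,但上述域名保持不变,这就是"稳定唯一的网络标识符"。
  # 查看当前集群的域名(它决定了Pod完整域名的后缀)
  kubectl -n kube-system get configmap coredns -o jsonpath='{.data.Corefile}' | grep -w kubernetes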
2、StatefulSets控制器-网络唯一标识之headless
bash
2.1 编写资源清单 [root@master231 23-statefulsets]# cat > 01-statefulset-headless-network.yaml <<EOF apiVersion: v1 kind: Service metadata: name: svc-headless spec: ports: - port: 80 name: web # 将clusterIP字段设置为None表示为一个无头服务,即svc将不会分配VIP。 clusterIP: None selector: app: nginx --- apiVersion: apps/v1 kind: StatefulSet metadata: name: sts-xiuxian spec: selector: matchLabels: app: nginx # 声明无头服务 serviceName: svc-headless replicas: 3 template: metadata: labels: app: nginx spec: containers: - name: nginx image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 imagePullPolicy: Always EOF [root@master231 23-statefulsets]# kubectl apply -f 01-statefulset-headless-network.yaml service/svc-headless created statefulset.apps/sts-xiuxian created [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl get sts,svc,po -o wide NAME READY AGE CONTAINERS IMAGES statefulset.apps/sts-xiuxian 3/3 26s nginx registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 27h <none> service/svc-headless ClusterIP None <none> 80/TCP 26s app=nginx NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/sts-xiuxian-0 1/1 Running 0 26s 10.100.2.151 worker233 <none> <none> pod/sts-xiuxian-1 1/1 Running 0 24s 10.100.1.9 worker232 <none> <none> pod/sts-xiuxian-2 1/1 Running 0 22s 10.100.2.152 worker233 <none> <none> [root@master231 23-statefulsets]# 2.2 测试验证 [root@master231 23-statefulsets]# kubectl exec -it sts-xiuxian-0 -- sh / # ping sts-xiuxian-1 -c 3 ping: bad address 'sts-xiuxian-1' / # / # ping sts-xiuxian-1.svc-headless -c 3 PING sts-xiuxian-1.svc-headless (10.100.1.9): 56 data bytes 64 bytes from 10.100.1.9: seq=0 ttl=62 time=0.418 ms 64 bytes from 10.100.1.9: seq=1 ttl=62 time=0.295 ms 64 bytes from 10.100.1.9: seq=2 ttl=62 time=0.290 ms --- sts-xiuxian-1.svc-headless ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 0.290/0.334/0.418 ms / # / # ping sts-xiuxian-2.svc-headless.default.svc.weixiang.com -c 3 PING sts-xiuxian-2.svc-headless.default.svc.weixiang.com (10.100.2.152): 56 data bytes 64 bytes from 10.100.2.152: seq=0 ttl=64 time=0.076 ms 64 bytes from 10.100.2.152: seq=1 ttl=64 time=0.067 ms 64 bytes from 10.100.2.152: seq=2 ttl=64 time=0.168 ms --- sts-xiuxian-2.svc-headless.default.svc.weixiang.com ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 0.067/0.103/0.168 ms / # [root@master231 23-statefulsets]# kubectl delete pods -l app=nginx pod "sts-xiuxian-0" deleted pod "sts-xiuxian-1" deleted pod "sts-xiuxian-2" deleted [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl get sts,svc,po -o wide NAME READY AGE CONTAINERS IMAGES statefulset.apps/sts-xiuxian 3/3 2m28s nginx registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 27h <none> service/svc-headless ClusterIP None <none> 80/TCP 2m28s app=nginx NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/sts-xiuxian-0 1/1 Running 0 5s 10.100.2.153 worker233 <none> <none> pod/sts-xiuxian-1 1/1 Running 0 3s 10.100.1.10 worker232 <none> <none> pod/sts-xiuxian-2 1/1 Running 0 2s 10.100.2.154 worker233 <none> <none> [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl exec -it sts-xiuxian-0 -- sh / # ping sts-xiuxian-1.svc-headless -c 3 
PING sts-xiuxian-1.svc-headless (10.100.1.10): 56 data bytes 64 bytes from 10.100.1.10: seq=0 ttl=62 time=0.455 ms 64 bytes from 10.100.1.10: seq=1 ttl=62 time=0.455 ms 64 bytes from 10.100.1.10: seq=2 ttl=62 time=0.371 ms --- sts-xiuxian-1.svc-headless ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 0.371/0.427/0.455 ms / # / # ping sts-xiuxian-2.svc-headless -c 3 PING sts-xiuxian-2.svc-headless (10.100.2.154): 56 data bytes 64 bytes from 10.100.2.154: seq=0 ttl=64 time=0.095 ms 64 bytes from 10.100.2.154: seq=1 ttl=64 time=0.068 ms 64 bytes from 10.100.2.154: seq=2 ttl=64 time=0.084 ms --- sts-xiuxian-2.svc-headless ping statistics --- 3 packets transmitted, 3 packets received, 0% packet loss round-trip min/avg/max = 0.068/0.082/0.095 ms / # [root@master231 23-statefulsets]# kubectl delete -f 01-statefulset-headless-network.yaml service "svc-headless" deleted statefulset.apps "sts-xiuxian" deleted [root@master231 23-statefulsets]# [root@master231 23-statefulsets]#
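扩展验证(补充):
  在删除上述资源清单之前,还可以验证无头服务的DNS解析结果是后端所有Pod的IP而不是某个VIP。下面的写法仅作示意,假设节点能够拉取busybox:1.28镜像。
  # 用一个临时Pod解析无头服务,预期会返回3个Pod的IP
  kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 -- nslookup svc-headless
  # 也可以在sts的Pod里解析某个具体Pod的域名
  kubectl exec -it sts-xiuxian-0 -- nslookup sts-xiuxian-1.svc-headless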
3、StatefulSets控制器-独享存储
bash
3.1 编写资源清单 [root@master231 statefulsets]# cat > 02-statefulset-headless-volumeClaimTemplates.yaml <<''EOF apiVersion: v1 kind: Service metadata: name: svc-headless spec: ports: - port: 80 name: web clusterIP: None selector: app: nginx --- apiVersion: apps/v1 kind: StatefulSet metadata: name: sts-xiuxian spec: selector: matchLabels: app: nginx serviceName: svc-headless replicas: 3 # 卷申请模板,会为每个Pod去创建唯一的pvc并与之关联哟! volumeClaimTemplates: - metadata: name: data spec: accessModes: [ "ReadWriteOnce" ] # 声明咱们自定义的动态存储类,即sc资源。 storageClassName: "weixiang-sc-xixi" resources: requests: storage: 2Gi template: metadata: labels: app: nginx spec: containers: - name: nginx image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 ports: - containerPort: 80 name: xiuxian volumeMounts: - name: data mountPath: /usr/share/nginx/html --- apiVersion: v1 kind: Service metadata: name: svc-sts-xiuxian spec: type: ClusterIP clusterIP: 10.200.0.200 selector: app: nginx ports: - port: 80 targetPort: xiuxian EOF 3.2 测试验证 [root@master231 23-statefulsets]# kubectl apply -f 02-statefulset-headless-volumeClaimTemplates.yaml service/svc-headless created statefulset.apps/sts-xiuxian created service/svc-sts-xiuxian created [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl get sts,svc,po -o wide NAME READY AGE CONTAINERS IMAGES statefulset.apps/sts-xiuxian 3/3 30s nginx registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 27h <none> service/svc-headless ClusterIP None <none> 80/TCP 30s app=nginx service/svc-sts-xiuxian ClusterIP 10.200.0.200 <none> 80/TCP 30s app=nginx NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/sts-xiuxian-0 1/1 Running 0 30s 10.100.2.155 worker233 <none> <none> pod/sts-xiuxian-1 1/1 Running 0 27s 10.100.1.11 worker232 <none> <none> pod/sts-xiuxian-2 1/1 Running 0 23s 10.100.2.156 worker233 <none> <none> [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl exec -it sts-xiuxian-0 -- sh / # echo AAA > /usr/share/nginx/html/index.html / # [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl exec -it sts-xiuxian-1 -- sh / # echo BBB > /usr/share/nginx/html/index.html / # [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl exec -it sts-xiuxian-2 -- sh / # echo CCC > /usr/share/nginx/html/index.html / # [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# for i in `seq 10`;do curl 10.200.0.200;done CCC BBB AAA CCC BBB AAA CCC BBB AAA CCC [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl delete pods --all pod "sts-xiuxian-0" deleted pod "sts-xiuxian-1" deleted pod "sts-xiuxian-2" deleted [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# for i in `seq 10`;do curl 10.200.0.200;done CCC BBB AAA CCC BBB AAA CCC BBB AAA CCC [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl get pvc,pv NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE persistentvolumeclaim/data-sts-xiuxian-0 Bound pvc-e30008e7-b904-4f4e-a533-feb7010ffe3c 2Gi RWO weixiang-sc-xixi 3m3s persistentvolumeclaim/data-sts-xiuxian-1 Bound pvc-39895bf3-937b-49ee-8b1d-e11ab2c18cc0 2Gi RWO weixiang-sc-xixi 3m persistentvolumeclaim/data-sts-xiuxian-2 Bound pvc-d1a08646-6a09-424e-837b-44c2a21b8f7b 2Gi RWO weixiang-sc-xixi 2m56s NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE 
persistentvolume/pvc-39895bf3-937b-49ee-8b1d-e11ab2c18cc0 2Gi RWO Delete Bound default/data-sts-xiuxian-1 weixiang-sc-xixi 3m persistentvolume/pvc-d1a08646-6a09-424e-837b-44c2a21b8f7b 2Gi RWO Delete Bound default/data-sts-xiuxian-2 weixiang-sc-xixi 2m56s persistentvolume/pvc-e30008e7-b904-4f4e-a533-feb7010ffe3c 2Gi RWO Delete Bound default/data-sts-xiuxian-0 weixiang-sc-xixi 3m3s [root@master231 23-statefulsets]# 3.3 验证后端存储 [root@master231 statefulsets]# kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES sts-xiuxian-0 1/1 Running 0 4m25s 10.100.203.171 worker232 <none> <none> sts-xiuxian-1 1/1 Running 0 4m23s 10.100.140.99 worker233 <none> <none> sts-xiuxian-2 1/1 Running 0 4m22s 10.100.160.178 master231 <none> <none> [root@master231 statefulsets]# [root@master231 23-statefulsets]# kubectl get pvc -l app=nginx | awk 'NR>=2{print $3}' | xargs kubectl describe pv | grep VolumeHandle VolumeHandle: 10.0.0.231#yinzhengjie/data/nfs-server/sc-xixi#pvc-e30008e7-b904-4f4e-a533-feb7010ffe3c## VolumeHandle: 10.0.0.231#yinzhengjie/data/nfs-server/sc-xixi#pvc-39895bf3-937b-49ee-8b1d-e11ab2c18cc0## VolumeHandle: 10.0.0.231#yinzhengjie/data/nfs-server/sc-xixi#pvc-d1a08646-6a09-424e-837b-44c2a21b8f7b## [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# cat /yinzhengjie/data/nfs-server/sc-xixi/pvc-e30008e7-b904-4f4e-a533-feb7010ffe3c/index.html AAA [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# cat /yinzhengjie/data/nfs-server/sc-xixi/pvc-39895bf3-937b-49ee-8b1d-e11ab2c18cc0/index.html BBB [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# cat /yinzhengjie/data/nfs-server/sc-xixi/pvc-d1a08646-6a09-424e-837b-44c2a21b8f7b/index.html CCC [root@master231 23-statefulsets]#
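温馨提示(补充):
  volumeClaimTemplates生成的PVC命名规则为"<模板名>-<sts名>-<编号>",例如data-sts-xiuxian-0。删除sts并不会自动删除这些PVC和后端数据(这是为了防止误删),清理实验环境时需要手动删除,下面是示意写法。
  # 查看sts关联的PVC
  kubectl get pvc -l app=nginx
  # 删除资源清单后再手动清理PVC
  kubectl delete -f 02-statefulset-headless-volumeClaimTemplates.yaml
  kubectl delete pvc -l app=nginx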
4、sts的分段更新
bash
4.1.编写资源清单 [root@master231 23-statefulsets]# cat > 03-statefuleset-updateStrategy-partition.yaml <<EOF apiVersion: v1 kind: Service metadata: name: sts-headless spec: ports: - port: 80 name: web clusterIP: None selector: app: web --- apiVersion: apps/v1 kind: StatefulSet metadata: name: weixiang-sts-web spec: # 指定sts资源的更新策略 updateStrategy: # 配置滚动更新 rollingUpdate: # 当编号小于3时不更新,说白了,就是Pod编号大于等于3的Pod会被更新! partition: 3 selector: matchLabels: app: web serviceName: sts-headless replicas: 5 template: metadata: labels: app: web spec: containers: - name: c1 ports: - containerPort: 80 name: xiuxian image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 --- apiVersion: v1 kind: Service metadata: name: weixiang-sts-svc spec: selector: app: web ports: - port: 80 targetPort: xiuxian EOF 4.2.验证 [root@master231 23-statefulsets]# kubectl apply -f 03-statefuleset-updateStrategy-partition.yaml service/sts-headless created statefulset.apps/weixiang-sts-web created service/weixiang-sts-svc created [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES weixiang-sts-web-0 1/1 Running 0 7s 10.100.2.159 worker233 <none> <none> weixiang-sts-web-1 1/1 Running 0 6s 10.100.1.13 worker232 <none> <none> weixiang-sts-web-2 1/1 Running 0 4s 10.100.2.160 worker233 <none> <none> weixiang-sts-web-3 1/1 Running 0 4s 10.100.1.14 worker232 <none> <none> weixiang-sts-web-4 1/1 Running 0 2s 10.100.2.161 worker233 <none> <none> [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl get pods -l app=web -o yaml | grep "\- image:" - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# grep hangzhou 03-statefuleset-updateStrategy-partition.yaml image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# sed -i '/hangzhou/s#v1#v2#' 03-statefuleset-updateStrategy-partition.yaml [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# grep hangzhou 03-statefuleset-updateStrategy-partition.yaml image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl apply -f 03-statefuleset-updateStrategy-partition.yaml service/sts-headless unchanged statefulset.apps/weixiang-sts-web configured service/weixiang-sts-svc unchanged [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES weixiang-sts-web-0 1/1 Running 0 75s 10.100.2.159 worker233 <none> <none> weixiang-sts-web-1 1/1 Running 0 74s 10.100.1.13 worker232 <none> <none> weixiang-sts-web-2 1/1 Running 0 72s 10.100.2.160 worker233 <none> <none> weixiang-sts-web-3 1/1 Running 0 6s 10.100.1.15 worker232 <none> <none> weixiang-sts-web-4 1/1 Running 0 7s 10.100.2.162 worker233 <none> <none> [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl get pods -l app=web -o yaml | grep "\- image:" - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 - image: 
registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl delete -f 03-statefuleset-updateStrategy-partition.yaml service "sts-headless" deleted statefulset.apps "weixiang-sts-web" deleted service "weixiang-sts-svc" deleted [root@master231 23-statefulsets]#
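补充示例:
  分段更新常用于金丝雀发布:先只更新编号大于等于partition的Pod,观察没问题后再全量放开。在删除上述资源清单之前,可以用下面的示意命令把partition调回0,让剩余的Pod也完成滚动更新。
  # 将partition调回0,编号小于3的Pod也会按从大到小的顺序依次更新
  kubectl patch sts weixiang-sts-web -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":0}}}}'
  # 观察滚动更新进度
  kubectl rollout status sts weixiang-sts-web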

5、基于sts部署zookeeper集群
bash
参考链接: https://kubernetes.io/zh-cn/docs/tutorials/stateful-application/zookeeper/ 5.1 K8S所有节点导入镜像 wget http://192.168.21.253/Resources/Kubernetes/Case-Demo/zookeeper/weixiang-kubernetes-zookeeper-v1.0-3.4.10.tar.gz docker load -i weixiang-kubernetes-zookeeper-v1.0-3.4.10.tar.gz 5.2 编写资源清单 [root@master231 23-statefulsets]# cat > 04-sts-zookeeper-cluster.yaml << 'EOF' apiVersion: v1 kind: Service metadata: name: zk-hs labels: app: zk spec: ports: - port: 2888 name: server - port: 3888 name: leader-election clusterIP: None selector: app: zk --- apiVersion: v1 kind: Service metadata: name: zk-cs labels: app: zk spec: ports: - port: 2181 name: client selector: app: zk --- apiVersion: policy/v1 # 此类型用于定义可以对一组Pod造成的最大中断,说白了就是最大不可用的Pod数量。 # 一般情况下,对于分布式集群而言,假设集群故障容忍度为N,则集群最少需要2N+1个Pod。 kind: PodDisruptionBudget metadata: name: zk-pdb spec: # 匹配Pod selector: matchLabels: app: zk # 最大不可用的Pod数量。这意味着将来zookeeper集群,最少要2*1 +1 = 3个Pod数量。 maxUnavailable: 1 --- apiVersion: apps/v1 kind: StatefulSet metadata: name: zk spec: selector: matchLabels: app: zk serviceName: zk-hs replicas: 3 updateStrategy: type: RollingUpdate podManagementPolicy: OrderedReady template: metadata: labels: app: zk spec: tolerations: - key: node-role.kubernetes.io/master operator: Exists effect: NoSchedule affinity: podAntiAffinity: requiredDuringSchedulingIgnoredDuringExecution: - labelSelector: matchExpressions: - key: "app" operator: In values: - zk topologyKey: "kubernetes.io/hostname" containers: - name: kubernetes-zookeeper imagePullPolicy: IfNotPresent image: "registry.k8s.io/kubernetes-zookeeper:1.0-3.4.10" resources: requests: memory: "1Gi" cpu: "0.5" ports: - containerPort: 2181 name: client - containerPort: 2888 name: server - containerPort: 3888 name: leader-election command: - sh - -c - "start-zookeeper \ --servers=3 \ --data_dir=/var/lib/zookeeper/data \ --data_log_dir=/var/lib/zookeeper/data/log \ --conf_dir=/opt/zookeeper/conf \ --client_port=2181 \ --election_port=3888 \ --server_port=2888 \ --tick_time=2000 \ --init_limit=10 \ --sync_limit=5 \ --heap=512M \ --max_client_cnxns=60 \ --snap_retain_count=3 \ --purge_interval=12 \ --max_session_timeout=40000 \ --min_session_timeout=4000 \ --log_level=INFO" readinessProbe: exec: command: - sh - -c - "zookeeper-ready 2181" initialDelaySeconds: 10 timeoutSeconds: 5 livenessProbe: exec: command: - sh - -c - "zookeeper-ready 2181" initialDelaySeconds: 10 timeoutSeconds: 5 volumeMounts: - name: datadir mountPath: /var/lib/zookeeper securityContext: runAsUser: 1000 fsGroup: 1000 volumeClaimTemplates: - metadata: name: datadir spec: accessModes: [ "ReadWriteOnce" ] resources: requests: storage: 10Gi EOF 5.3 实时观察Pod状态 [root@master231 23-statefulsets]# kubectl apply -f 04-sts-zookeeper-cluster.yaml service/zk-hs created service/zk-cs created poddisruptionbudget.policy/zk-pdb created statefulset.apps/zk created [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl get pods -o wide -w -l app=zk NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES zk-0 0/1 Pending 0 0s <none> <none> <none> <none> zk-0 0/1 Pending 0 1s <none> worker233 <none> <none> zk-0 0/1 ContainerCreating 0 1s <none> worker233 <none> <none> zk-0 0/1 ContainerCreating 0 3s <none> worker233 <none> <none> zk-0 0/1 Running 0 7s 10.100.140.125 worker233 <none> <none> zk-0 1/1 Running 0 22s 10.100.140.125 worker233 <none> <none> zk-1 0/1 Pending 0 0s <none> <none> <none> <none> zk-1 0/1 Pending 0 0s <none> master231 <none> <none> zk-1 0/1 ContainerCreating 0 0s <none> master231 <none> 
<none> zk-1 0/1 ContainerCreating 0 1s <none> master231 <none> <none> zk-1 0/1 Running 0 5s 10.100.160.189 master231 <none> <none> zk-1 1/1 Running 0 21s 10.100.160.189 master231 <none> <none> zk-2 0/1 Pending 0 0s <none> <none> <none> <none> zk-2 0/1 Pending 0 0s <none> worker232 <none> <none> zk-2 0/1 ContainerCreating 0 0s <none> worker232 <none> <none> zk-2 0/1 ContainerCreating 0 1s <none> worker232 <none> <none> zk-2 0/1 Running 0 5s 10.100.203.188 worker232 <none> <none> zk-2 1/1 Running 0 21s 10.100.203.188 worker232 <none> <none> ... 5.4 检查后端的存储 [root@master231 23-statefulsets]# kubectl get po,pvc -l app=zk NAME READY STATUS RESTARTS AGE pod/zk-0 1/1 Running 0 72s pod/zk-1 1/1 Running 0 61s pod/zk-2 1/1 Running 0 50s NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE persistentvolumeclaim/datadir-zk-0 Bound pvc-8fa25151-0d0f-4493-97f6-62ff64bd6f09 10Gi RWO weixiang-sc-haha 3m32s persistentvolumeclaim/datadir-zk-1 Bound pvc-630f592d-caf0-47e9-bfaa-9a9397c8344c 10Gi RWO weixiang-sc-haha 3m20s persistentvolumeclaim/datadir-zk-2 Bound pvc-11cfccc3-7509-445b-ba7b-9072b20227b1 10Gi RWO weixiang-sc-haha 3m8s [root@master231 23-statefulsets]# 5.5.验证集群是否正常 [root@master231 23-statefulsets]# for i in 0 1 2; do kubectl exec zk-$i -- hostname; done zk-0 zk-1 zk-2 [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# for i in 0 1 2; do echo "myid zk-$i";kubectl exec zk-$i -- cat /var/lib/zookeeper/data/myid; done myid zk-0 1 myid zk-1 2 myid zk-2 3 [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# for i in 0 1 2; do kubectl exec zk-$i -- hostname -f; done zk-0.zk-hs.default.svc.weixiang.com zk-1.zk-hs.default.svc.weixiang.com zk-2.zk-hs.default.svc.weixiang.com [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl exec zk-0 -- cat /opt/zookeeper/conf/zoo.cfg #This file was autogenerated DO NOT EDIT clientPort=2181 dataDir=/var/lib/zookeeper/data dataLogDir=/var/lib/zookeeper/data/log tickTime=2000 initLimit=10 syncLimit=5 maxClientCnxns=60 minSessionTimeout=4000 maxSessionTimeout=40000 autopurge.snapRetainCount=3 autopurge.purgeInteval=12 server.1=zk-0.zk-hs.default.svc.weixiang.com:2888:3888 server.2=zk-1.zk-hs.default.svc.weixiang.com:2888:3888 server.3=zk-2.zk-hs.default.svc.weixiang.com:2888:3888 [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl exec zk-1 -- cat /opt/zookeeper/conf/zoo.cfg #This file was autogenerated DO NOT EDIT clientPort=2181 dataDir=/var/lib/zookeeper/data dataLogDir=/var/lib/zookeeper/data/log tickTime=2000 initLimit=10 syncLimit=5 maxClientCnxns=60 minSessionTimeout=4000 maxSessionTimeout=40000 autopurge.snapRetainCount=3 autopurge.purgeInteval=12 server.1=zk-0.zk-hs.default.svc.weixiang.com:2888:3888 server.2=zk-1.zk-hs.default.svc.weixiang.com:2888:3888 server.3=zk-2.zk-hs.default.svc.weixiang.com:2888:3888 [root@master231 23-statefulsets]# [root@master231 23-statefulsets]# kubectl exec zk-2 -- cat /opt/zookeeper/conf/zoo.cfg #This file was autogenerated DO NOT EDIT clientPort=2181 dataDir=/var/lib/zookeeper/data dataLogDir=/var/lib/zookeeper/data/log tickTime=2000 initLimit=10 syncLimit=5 maxClientCnxns=60 minSessionTimeout=4000 maxSessionTimeout=40000 autopurge.snapRetainCount=3 autopurge.purgeInteval=12 server.1=zk-0.zk-hs.default.svc.weixiang.com:2888:3888 server.2=zk-1.zk-hs.default.svc.weixiang.com:2888:3888 server.3=zk-2.zk-hs.default.svc.weixiang.com:2888:3888 [root@master231 23-statefulsets]# 5.6 创建数据测试 5.6.1 在一个Pod写入数据 [root@master231 
23-statefulsets]# kubectl exec -it zk-1 -- zkCli.sh ... [zk: localhost:2181(CONNECTED) 0] ls / [zookeeper] [zk: localhost:2181(CONNECTED) 1] [zk: localhost:2181(CONNECTED) 1] [zk: localhost:2181(CONNECTED) 1] create /school weixiang Created /school [zk: localhost:2181(CONNECTED) 2] [zk: localhost:2181(CONNECTED) 2] create /school/linux97 XIXI Created /school/linux97 [zk: localhost:2181(CONNECTED) 3] [zk: localhost:2181(CONNECTED) 3] ls / [zookeeper, school] [zk: localhost:2181(CONNECTED) 4] [zk: localhost:2181(CONNECTED) 4] ls /school [linux97] [zk: localhost:2181(CONNECTED) 5] 5.6.2 在另一个Pod查看下数据 [root@master231 23-statefulsets]# kubectl exec -it zk-2 -- zkCli.sh ... [zk: localhost:2181(CONNECTED) 0] ls / [zookeeper, school] [zk: localhost:2181(CONNECTED) 1] get /school/linux97 XIXI cZxid = 0x100000003 ctime = Mon Jun 09 03:10:51 UTC 2025 mZxid = 0x100000003 mtime = Mon Jun 09 03:10:51 UTC 2025 pZxid = 0x100000003 cversion = 0 dataVersion = 0 aclVersion = 0 ephemeralOwner = 0x0 dataLength = 4 numChildren = 0 [zk: localhost:2181(CONNECTED) 2] 5.7 查看start-zookeeper 脚本逻辑 [root@master231 23-statefulsets]# kubectl exec -it zk-0 -- bash zookeeper@zk-0:/$ zookeeper@zk-0:/$ which start-zookeeper /usr/bin/start-zookeeper zookeeper@zk-0:/$ zookeeper@zk-0:/$ wc -l /usr/bin/start-zookeeper 320 /usr/bin/start-zookeeper zookeeper@zk-0:/$ zookeeper@zk-0:/$ cat /usr/bin/start-zookeeper ;echo #!/usr/bin/env bash # Copyright 2017 The Kubernetes Authors. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # # #Usage: start-zookeeper [OPTIONS] # Starts a ZooKeeper server based on the supplied options. # --servers The number of servers in the ensemble. The default # value is 1. # --data_dir The directory where the ZooKeeper process will store its # snapshots. The default is /var/lib/zookeeper/data. # --data_log_dir The directory where the ZooKeeper process will store its # write ahead log. The default is # /var/lib/zookeeper/data/log. # --conf_dir The directoyr where the ZooKeeper process will store its # configuration. The default is /opt/zookeeper/conf. # --client_port The port on which the ZooKeeper process will listen for # client requests. The default is 2181. # --election_port The port on which the ZooKeeper process will perform # leader election. The default is 3888. # --server_port The port on which the ZooKeeper process will listen for # requests from other servers in the ensemble. The # default is 2888. # --tick_time The length of a ZooKeeper tick in ms. The default is # 2000. # --init_limit The number of Ticks that an ensemble member is allowed # to perform leader election. The default is 10. # --sync_limit The maximum session timeout that the ensemble will # allows a client to request. The default is 5. # --heap The maximum amount of heap to use. The format is the # same as that used for the Xmx and Xms parameters to the # JVM. e.g. --heap=2G. The default is 2G. # --max_client_cnxns The maximum number of client connections that the # ZooKeeper process will accept simultaneously. The # default is 60. 
# --snap_retain_count The maximum number of snapshots the ZooKeeper process # will retain if purge_interval is greater than 0. The # default is 3. # --purge_interval The number of hours the ZooKeeper process will wait # between purging its old snapshots. If set to 0 old # snapshots will never be purged. The default is 0. # --max_session_timeout The maximum time in milliseconds for a client session # timeout. The default value is 2 * tick time. # --min_session_timeout The minimum time in milliseconds for a client session # timeout. The default value is 20 * tick time. # --log_level The log level for the zookeeeper server. Either FATAL, # ERROR, WARN, INFO, DEBUG. The default is INFO. USER=`whoami` HOST=`hostname -s` DOMAIN=`hostname -d` LOG_LEVEL=INFO DATA_DIR="/var/lib/zookeeper/data" DATA_LOG_DIR="/var/lib/zookeeper/log" LOG_DIR="/var/log/zookeeper" CONF_DIR="/opt/zookeeper/conf" CLIENT_PORT=2181 SERVER_PORT=2888 ELECTION_PORT=3888 TICK_TIME=2000 INIT_LIMIT=10 SYNC_LIMIT=5 HEAP=2G MAX_CLIENT_CNXNS=60 SNAP_RETAIN_COUNT=3 PURGE_INTERVAL=0 SERVERS=1 function print_usage() { echo "\ Usage: start-zookeeper [OPTIONS] Starts a ZooKeeper server based on the supplied options. --servers The number of servers in the ensemble. The default value is 1. --data_dir The directory where the ZooKeeper process will store its snapshots. The default is /var/lib/zookeeper/data. --data_log_dir The directory where the ZooKeeper process will store its write ahead log. The default is /var/lib/zookeeper/data/log. --conf_dir The directoyr where the ZooKeeper process will store its configuration. The default is /opt/zookeeper/conf. --client_port The port on which the ZooKeeper process will listen for client requests. The default is 2181. --election_port The port on which the ZooKeeper process will perform leader election. The default is 3888. --server_port The port on which the ZooKeeper process will listen for requests from other servers in the ensemble. The default is 2888. --tick_time The length of a ZooKeeper tick in ms. The default is 2000. --init_limit The number of Ticks that an ensemble member is allowed to perform leader election. The default is 10. --sync_limit The maximum session timeout that the ensemble will allows a client to request. The default is 5. --heap The maximum amount of heap to use. The format is the same as that used for the Xmx and Xms parameters to the JVM. e.g. --heap=2G. The default is 2G. --max_client_cnxns The maximum number of client connections that the ZooKeeper process will accept simultaneously. The default is 60. --snap_retain_count The maximum number of snapshots the ZooKeeper process will retain if purge_interval is greater than 0. The default is 3. --purge_interval The number of hours the ZooKeeper process will wait between purging its old snapshots. If set to 0 old snapshots will never be purged. The default is 0. --max_session_timeout The maximum time in milliseconds for a client session timeout. The default value is 2 * tick time. --min_session_timeout The minimum time in milliseconds for a client session timeout. The default value is 20 * tick time. --log_level The log level for the zookeeeper server. Either FATAL, ERROR, WARN, INFO, DEBUG. The default is INFO. " } function create_data_dirs() { if [ ! -d $DATA_DIR ]; then mkdir -p $DATA_DIR chown -R $USER:$USER $DATA_DIR fi if [ ! -d $DATA_LOG_DIR ]; then mkdir -p $DATA_LOG_DIR chown -R $USER:USER $DATA_LOG_DIR fi if [ ! -d $LOG_DIR ]; then mkdir -p $LOG_DIR chown -R $USER:$USER $LOG_DIR fi if [ ! 
-f $ID_FILE ] && [ $SERVERS -gt 1 ]; then echo $MY_ID >> $ID_FILE fi } function print_servers() { for (( i=1; i<=$SERVERS; i++ )) do echo "server.$i=$NAME-$((i-1)).$DOMAIN:$SERVER_PORT:$ELECTION_PORT" done } function create_config() { rm -f $CONFIG_FILE echo "#This file was autogenerated DO NOT EDIT" >> $CONFIG_FILE echo "clientPort=$CLIENT_PORT" >> $CONFIG_FILE echo "dataDir=$DATA_DIR" >> $CONFIG_FILE echo "dataLogDir=$DATA_LOG_DIR" >> $CONFIG_FILE echo "tickTime=$TICK_TIME" >> $CONFIG_FILE echo "initLimit=$INIT_LIMIT" >> $CONFIG_FILE echo "syncLimit=$SYNC_LIMIT" >> $CONFIG_FILE echo "maxClientCnxns=$MAX_CLIENT_CNXNS" >> $CONFIG_FILE echo "minSessionTimeout=$MIN_SESSION_TIMEOUT" >> $CONFIG_FILE echo "maxSessionTimeout=$MAX_SESSION_TIMEOUT" >> $CONFIG_FILE echo "autopurge.snapRetainCount=$SNAP_RETAIN_COUNT" >> $CONFIG_FILE echo "autopurge.purgeInteval=$PURGE_INTERVAL" >> $CONFIG_FILE if [ $SERVERS -gt 1 ]; then print_servers >> $CONFIG_FILE fi cat $CONFIG_FILE >&2 } function create_jvm_props() { rm -f $JAVA_ENV_FILE echo "ZOO_LOG_DIR=$LOG_DIR" >> $JAVA_ENV_FILE echo "JVMFLAGS=\"-Xmx$HEAP -Xms$HEAP\"" >> $JAVA_ENV_FILE } function create_log_props() { rm -f $LOGGER_PROPS_FILE echo "Creating ZooKeeper log4j configuration" echo "zookeeper.root.logger=CONSOLE" >> $LOGGER_PROPS_FILE echo "zookeeper.console.threshold="$LOG_LEVEL >> $LOGGER_PROPS_FILE echo "log4j.rootLogger=\${zookeeper.root.logger}" >> $LOGGER_PROPS_FILE echo "log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender" >> $LOGGER_PROPS_FILE echo "log4j.appender.CONSOLE.Threshold=\${zookeeper.console.threshold}" >> $LOGGER_PROPS_FILE echo "log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout" >> $LOGGER_PROPS_FILE echo "log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n" >> $LOGGER_PROPS_FILE } optspec=":hv-:" while getopts "$optspec" optchar; do case "${optchar}" in -) case "${OPTARG}" in servers=*) SERVERS=${OPTARG##*=} ;; data_dir=*) DATA_DIR=${OPTARG##*=} ;; data_log_dir=*) DATA_LOG_DIR=${OPTARG##*=} ;; log_dir=*) LOG_DIR=${OPTARG##*=} ;; conf_dir=*) CONF_DIR=${OPTARG##*=} ;; client_port=*) CLIENT_PORT=${OPTARG##*=} ;; election_port=*) ELECTION_PORT=${OPTARG##*=} ;; server_port=*) SERVER_PORT=${OPTARG##*=} ;; tick_time=*) TICK_TIME=${OPTARG##*=} ;; init_limit=*) INIT_LIMIT=${OPTARG##*=} ;; sync_limit=*) SYNC_LIMIT=${OPTARG##*=} ;; heap=*) HEAP=${OPTARG##*=} ;; max_client_cnxns=*) MAX_CLIENT_CNXNS=${OPTARG##*=} ;; snap_retain_count=*) SNAP_RETAIN_COUNT=${OPTARG##*=} ;; purge_interval=*) PURGE_INTERVAL=${OPTARG##*=} ;; max_session_timeout=*) MAX_SESSION_TIMEOUT=${OPTARG##*=} ;; min_session_timeout=*) MIN_SESSION_TIMEOUT=${OPTARG##*=} ;; log_level=*) LOG_LEVEL=${OPTARG##*=} ;; *) echo "Unknown option --${OPTARG}" >&2 exit 1 ;; esac;; h) print_usage exit ;; v) echo "Parsing option: '-${optchar}'" >&2 ;; *) if [ "$OPTERR" != 1 ] || [ "${optspec:0:1}" = ":" ]; then echo "Non-option argument: '-${OPTARG}'" >&2 fi ;; esac done MIN_SESSION_TIMEOUT=${MIN_SESSION_TIMEOUT:- $((TICK_TIME*2))} MAX_SESSION_TIMEOUT=${MAX_SESSION_TIMEOUT:- $((TICK_TIME*20))} ID_FILE="$DATA_DIR/myid" CONFIG_FILE="$CONF_DIR/zoo.cfg" LOGGER_PROPS_FILE="$CONF_DIR/log4j.properties" JAVA_ENV_FILE="$CONF_DIR/java.env" if [[ $HOST =~ (.*)-([0-9]+)$ ]]; then NAME=${BASH_REMATCH[1]} ORD=${BASH_REMATCH[2]} else echo "Fialed to parse name and ordinal of Pod" exit 1 fi MY_ID=$((ORD+1)) create_config && create_jvm_props && create_log_props && create_data_dirs && exec zkServer.sh start-foreground zookeeper@zk-0:/$ 
zookeeper@zk-0:/$ [root@master231 statefulsets]# [root@master231 statefulsets]# kubectl delete -f 04-sts-zookeeper.yaml service "zk-hs" deleted service "zk-cs" deleted poddisruptionbudget.policy "zk-pdb" deleted statefulset.apps "zk" deleted [root@master231 statefulsets]#

温馨提示:
  业界对于直接使用sts控制器部署有状态服务普遍比较谨慎:虽然它就是为有状态服务设计的,但很多团队并不直接用它。
  于是CoreOS公司又研发出了Operator框架(可以理解为sts+CRD的组合),大家可以基于该框架部署各种有状态服务。
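温馨提示(补充验证与清理):
  在删除sts之前,可以先确认集群的角色分布(1个leader+2个follower)。下面是示意写法:zkServer.sh status能否直接执行取决于镜像内的配置路径,如果报错,可以改用查日志的方式。
  # 方式一: 通过zkServer.sh查看角色(假设它能读到/opt/zookeeper/conf/zoo.cfg)
  for i in 0 1 2; do echo "===== zk-$i ====="; kubectl exec zk-$i -- zkServer.sh status; done
  # 方式二: 从Pod日志中查看最近一次的角色状态
  for i in 0 1 2; do echo "===== zk-$i ====="; kubectl logs zk-$i | grep -E "LEADING|FOLLOWING" | tail -1; done
  # 删除sts后,volumeClaimTemplates创建的PVC仍会保留,彻底清理需要手动删除
  kubectl delete pvc -l app=zk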

31、helm

1、helm环境快速部署实战
bash
- helm环境快速部署实战 1.helm概述 helm有点类似于Linux的yum,apt工具,帮助我们管理K8S集群的资源清单。 Helm 帮助您管理 Kubernetes 应用—— Helm Chart,即使是最复杂的 Kubernetes 应用程序,都可以帮助您定义,安装和升级。 Helm Chart 易于创建、发版、分享和发布,所以停止复制粘贴,开始使用 Helm 吧。 Helm 是 CNCF 的毕业项目,由 Helm 社区维护。 Helm尽量使用v3版本 官方文档: https://helm.sh/zh/ 2.helm的架构版本选择 2019年11月Helm团队发布V3版本,相比v2版本最大变化是将Tiller删除,并大部分代码重构。 helm v3相比helm v2还做了很多优化,比如不同命名空间资源同名的情况在v3版本是允许的,我们在生产环境中使用建议大家使用v3版本,不仅仅是因为它版本功能较强,而且相对来说也更加稳定了。 官方地址: https://helm.sh/docs/intro/install/ github地址: https://github.com/helm/helm/releases 3.安装helm wget https://get.helm.sh/helm-v3.18.4-linux-amd64.tar.gz SVIP: [root@master231 ~]# wget http://192.168.21.253/Resources/Kubernetes/Add-ons/helm/softwares/helm-v3.18.4-linux-amd64.tar.gz [root@master231 ~]# tar xf helm-v3.18.4-linux-amd64.tar.gz -C /usr/local/bin/ linux-amd64/helm --strip-components=1 [root@master231 ~]# [root@master231 ~]# ll /usr/local/bin/helm -rwxr-xr-x 1 1001 fwupd-refresh 59715768 Jul 9 04:36 /usr/local/bin/helm* [root@master231 ~]# [root@master231 ~]# helm version version.BuildInfo{Version:"v3.18.4", GitCommit:"d80839cf37d860c8aa9a0503fe463278f26cd5e2", GitTreeState:"clean", GoVersion:"go1.24.4"} [root@master231 ~]# 4.配置helm的自动补全功能 [root@master231 ~]# helm completion bash > /etc/bash_completion.d/helm [root@master231 ~]# source /etc/bash_completion.d/helm [root@master231 ~]# echo 'source /etc/bash_completion.d/helm' >> ~/.bashrc - helm的Chart基本管理 什么是Chart:包含 Chart.yaml, values.yaml, templates/ 等文件的目录。它是一个预先打包好的应用模板或蓝图 1.创建Chart [root@master231 add-ons]# mkdir 05-helm [root@master231 add-ons]# cd 05-helm [root@master231 05-helm]# ll total 8 drwxr-xr-x 2 root root 4096 Jul 28 09:29 ./ drwxr-xr-x 7 root root 4096 Jul 28 09:29 ../ [root@master231 05-helm]# [root@master231 05-helm]# [root@master231 05-helm]# helm create weixiang-weixiang98 Creating weixiang-weixiang98 [root@master231 05-helm]# [root@master231 05-helm]# ll total 12 drwxr-xr-x 3 root root 4096 Jul 28 09:29 ./ drwxr-xr-x 7 root root 4096 Jul 28 09:29 ../ drwxr-xr-x 4 root root 4096 Jul 28 09:29 weixiang-weixiang98/ [root@master231 05-helm]# 2.查看Chart结构 [root@master231 helm-Chart]# tree weixiang-weixiang98/ weixiang-weixiang98/ ├── charts # 包含chart依赖的其他chart ├── Chart.yaml # 包含了chart信息的YAML文件 ├── templates # 模板目录, 当和values 结合时,可生成有效的Kubernetes manifest文件 │ ├── deployment.yaml # deployment资源清单模板。 │ ├── _helpers.tpl # 自定义模板 │ ├── hpa.yaml # hpa资源清单模板。 │ ├── ingress.yaml # Ingress资源清单模板。 │ ├── NOTES.txt # 可选: 包含简要使用说明的纯文本文件 │ ├── serviceaccount.yaml # sa资源清单模板。 │ ├── service.yaml # svc资源清单模板。 │ └── tests # 测试目录 │ └── test-connection.yaml └── values.yaml # chart 默认的配置值 你可以把这个 Helm Chart 想象成一个“应用的安装包”。values.yaml 是这个安装包的“配置文件”,templates/ 目录是“安装脚本”,而 Chart.yaml 则是这个安装包的“说明标签”。当执行 helm install 时,Helm 会读取 values.yaml 里的配置,填充到 templates/ 目录 下的模板文件中,最终生成一系列标准的 Kubernetes YAML 清单文件,并应用到你的集群里。 3 directories, 10 files [root@master231 helm-Chart]# 参考链接: https://helm.sh/zh/docs/topics/charts/#chart-%E6%96%87%E4%BB%B6%E7%BB%93%E6%9E%84 3.修改默认的values.yaml [root@master231 05-helm]# egrep "repository:|tag:" weixiang-weixiang98/values.yaml repository: nginx tag: "" [root@master231 05-helm]# [root@master231 05-helm]# sed -i "/repository\:/s#nginx#registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps#" weixiang-weixiang98/values.yaml [root@master231 05-helm]# [root@master231 05-helm]# sed -ri '/tag\:/s#tag: ""#tag: v1#' weixiang-weixiang98/values.yaml [root@master231 05-helm]# [root@master231 05-helm]# egrep "repository:|tag:" weixiang-weixiang98/values.yaml repository: 
registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps # 修改镜像为修仙(xiuxian)应用的镜像地址 tag: v1 # 指定镜像tag为v1 [root@master231 05-helm]#
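补充示例:
  安装之前可以先做语法检查和本地渲染,确认values.yaml的修改确实生效,以下命令为示意写法。
  # 检查Chart的语法是否有问题
  helm lint weixiang-weixiang98
  # 本地渲染出最终的K8S资源清单但不真正安装,确认镜像地址和tag符合预期
  helm template xiuxian weixiang-weixiang98 | grep "image:"
  # 或者使用dry-run方式模拟安装
  helm install xiuxian weixiang-weixiang98 --dry-run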


bash
4.基于Chart安装服务发行Release # 使用 weixiang-weixiang98 这个应用的“安装包”(Chart),在 Kubernetes 集群里安装一个实例,并给这个实例起个名字叫 xiuxian [root@master231 05-helm]# helm install xiuxian weixiang-weixiang98 NAME: xiuxian LAST DEPLOYED: Mon Jul 28 09:34:21 2025 NAMESPACE: default STATUS: deployed REVISION: 1 NOTES: 1. Get the application URL by running these commands: export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=weixiang-weixiang98,app.kubernetes.io/instance=xiuxian" -o jsonpath="{.items[0].metadata.name}") export CONTAINER_PORT=$(kubectl get pod --namespace default $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}") echo "Visit http://127.0.0.1:8080 to use your application" kubectl --namespace default port-forward $POD_NAME 8080:$CONTAINER_PORT [root@master231 05-helm]# [root@master231 05-helm]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION xiuxian default 1 2025-07-28 09:34:21.622905261 +0800 CST deployed weixiang-weixiang98-0.1.0 1.16.0 [root@master231 05-helm]# 5.查看服务 [root@master231 05-helm]# kubectl get deploy,svc,pods NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/xiuxian-weixiang-weixiang98 1/1 1 1 84s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 2d22h service/xiuxian-weixiang-weixiang98 ClusterIP 10.200.231.250 <none> 80/TCP 84s NAME READY STATUS RESTARTS AGE pod/xiuxian-weixiang-weixiang98-68c6c66bb6-srwkx 1/1 Running 0 84s [root@master231 05-helm]# [root@master231 05-helm]# curl 10.200.231.250 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 05-helm]# 6.卸载服务 [root@master231 05-helm]# kubectl get deploy,svc,pods NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/xiuxian-weixiang-weixiang98 1/1 1 1 2m1s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 2d22h service/xiuxian-weixiang-weixiang98 ClusterIP 10.200.231.250 <none> 80/TCP 2m1s NAME READY STATUS RESTARTS AGE pod/xiuxian-weixiang-weixiang98-68c6c66bb6-srwkx 1/1 Running 0 2m1s [root@master231 05-helm]# [root@master231 05-helm]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION xiuxian default 1 2025-07-28 09:34:21.622905261 +0800 CST deployed weixiang-weixiang98-0.1.0 1.16.0 [root@master231 05-helm]# [root@master231 05-helm]# helm uninstall xiuxian release "xiuxian" uninstalled [root@master231 05-helm]# [root@master231 05-helm]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION [root@master231 05-helm]# [root@master231 05-helm]# kubectl get deploy,svc,pods NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 2d22h [root@master231 05-helm]#
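补充示例:
  在卸载Release之前,还可以用下面几条命令查看Release的详细信息,便于排错,属于示意写法。
  # 查看Release的状态
  helm status xiuxian
  # 查看该Release渲染后的完整资源清单
  helm get manifest xiuxian
  # 查看用户自定义的values(未覆盖的默认值不会显示)
  helm get values xiuxian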

2、helm的两种升级方式案例
bash
- helm的两种升级方式案例 1.安装旧的服务 [root@master231 05-helm]# helm install xiuxian weixiang-weixiang98 # 安装一个名为 weixiang-weixiang98 的应用包(Chart),并且将这次安装命名为 xiuxian。 E0728 10:15:47.679502 103409 memcache.go:287] "Unhandled Error" err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request" logger="UnhandledError" E0728 10:15:47.689877 103409 memcache.go:121] "Unhandled Error" err="couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request" logger="UnhandledError" NAME: xiuxian LAST DEPLOYED: Mon Jul 28 10:15:47 2025 NAMESPACE: default STATUS: deployed REVISION: 1 NOTES: 1. Get the application URL by running these commands: export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=weixiang-weixiang98,app.kubernetes.io/instance=xiuxian" -o jsonpath="{.items[0].metadata.name}") export CONTAINER_PORT=$(kubectl get pod --namespace default $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}") echo "Visit http://127.0.0.1:8080 to use your application" kubectl --namespace default port-forward $POD_NAME 8080:$CONTAINER_PORT [root@master231 05-helm]# [root@master231 05-helm]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION xiuxian default 1 2025-07-28 10:15:47.695220923 +0800 CST deployed weixiang-weixiang98-0.1.0 1.16.0 [root@master231 05-helm]# [root@master231 05-helm]# kubectl get deploy,svc,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/xiuxian-weixiang-weixiang98 1/1 1 1 2m17s weixiang-weixiang98 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 app.kubernetes.io/instance=xiuxian,app.kubernetes.io/name=weixiang-weixiang98 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 2d23h <none> service/xiuxian-weixiang-weixiang98 ClusterIP 10.200.67.55 <none> 80/TCP 2m17s app.kubernetes.io/instance=xiuxian,app.kubernetes.io/name=weixiang-weixiang98 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/xiuxian-weixiang-weixiang98-68c6c66bb6-cpcz7 1/1 Running 0 2m17s 10.100.2.168 worker233 <none> <none> [root@master231 05-helm]# [root@master231 05-helm]# [root@master231 05-helm]# curl 10.200.67.55 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 05-helm]# 2.修改要升级的相关参数【当然,你也可以做其他的修改哟~】 [root@master231 05-helm]# egrep "replicaCount|tag:" weixiang-weixiang98/values.yaml replicaCount: 1 tag: v1 [root@master231 05-helm]# [root@master231 05-helm]# sed -i '/replicaCount/s#1#3#' weixiang-weixiang98/values.yaml [root@master231 05-helm]# [root@master231 05-helm]# sed -i "/tag:/s#v1#v2#" weixiang-weixiang98/values.yaml [root@master231 05-helm]# [root@master231 05-helm]# egrep "replicaCount|tag:" weixiang-weixiang98/values.yaml replicaCount: 3 tag: v2 [root@master231 05-helm]# [root@master231 05-helm]# 3.基于文件方式升级 [root@master231 05-helm]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION xiuxian default 1 2025-07-28 10:15:47.695220923 +0800 CST deployed weixiang-weixiang98-0.1.0 1.16.0 [root@master231 05-helm]# [root@master231 05-helm]# helm upgrade xiuxian -f weixiang-weixiang98/values.yaml weixiang-weixiang98/ # 使用的配置来自weixiang-weixiang98/values.yaml # 应用程序的模板定义在 weixiang-weixiang98/ 这个 Helm Chart 目录中 
Release "xiuxian" has been upgraded. Happy Helming! NAME: xiuxian LAST DEPLOYED: Mon Jul 28 10:21:07 2025 NAMESPACE: default STATUS: deployed REVISION: 2 NOTES: 1. Get the application URL by running these commands: export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=weixiang-weixiang98,app.kubernetes.io/instance=xiuxian" -o jsonpath="{.items[0].metadata.name}") export CONTAINER_PORT=$(kubectl get pod --namespace default $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}") echo "Visit http://127.0.0.1:8080 to use your application" kubectl --namespace default port-forward $POD_NAME 8080:$CONTAINER_PORT [root@master231 05-helm]# [root@master231 05-helm]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION xiuxian default 2 2025-07-28 10:21:07.155788336 +0800 CST deployed weixiang-weixiang98-0.1.0 1.16.0 [root@master231 05-helm]# 4.验证升级效果 [root@master231 05-helm]# kubectl get deploy,svc,po NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/xiuxian-weixiang-weixiang98 3/3 3 3 5m35s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 2d23h service/xiuxian-weixiang-weixiang98 ClusterIP 10.200.67.55 <none> 80/TCP 5m35s NAME READY STATUS RESTARTS AGE pod/xiuxian-weixiang-weixiang98-5b48f4cb6c-fvwfv 1/1 Running 0 15s pod/xiuxian-weixiang-weixiang98-5b48f4cb6c-jjc6c 1/1 Running 0 13s pod/xiuxian-weixiang-weixiang98-5b48f4cb6c-scrpn 1/1 Running 0 12s [root@master231 05-helm]# [root@master231 05-helm]# curl 10.200.67.55 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v2</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: red">凡人修仙传 v2 </h1> <div> <img src="2.jpg"> <div> </body> </html> [root@master231 05-helm]# 5.基于环境变量方式升级 [root@master231 05-helm]# helm upgrade xiuxian --set replicaCount=5,image.tag=v3 weixiang-weixiang98 Release "xiuxian" has been upgraded. Happy Helming! NAME: xiuxian LAST DEPLOYED: Mon Jul 28 10:24:33 2025 NAMESPACE: default STATUS: deployed REVISION: 3 NOTES: 1. 
Get the application URL by running these commands: export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=weixiang-weixiang98,app.kubernetes.io/instance=xiuxian" -o jsonpath="{.items[0].metadata.name}") export CONTAINER_PORT=$(kubectl get pod --namespace default $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}") echo "Visit http://127.0.0.1:8080 to use your application" kubectl --namespace default port-forward $POD_NAME 8080:$CONTAINER_PORT [root@master231 05-helm]# 6.再次验证升级效果 [root@master231 05-helm]# kubectl get deploy,svc,po NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/xiuxian-weixiang-weixiang98 5/5 5 5 8m53s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 2d23h service/xiuxian-weixiang-weixiang98 ClusterIP 10.200.67.55 <none> 80/TCP 8m53s NAME READY STATUS RESTARTS AGE pod/xiuxian-weixiang-weixiang98-6989979888-574ss 1/1 Running 0 5s pod/xiuxian-weixiang-weixiang98-6989979888-bsmp5 1/1 Running 0 6s pod/xiuxian-weixiang-weixiang98-6989979888-kd5j2 1/1 Running 0 7s pod/xiuxian-weixiang-weixiang98-6989979888-nrwfq 1/1 Running 0 7s pod/xiuxian-weixiang-weixiang98-6989979888-shdnk 1/1 Running 0 7s [root@master231 05-helm]# [root@master231 05-helm]# curl 10.200.67.55 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v3</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: pink">凡人修仙传 v3 </h1> <div> <img src="3.jpg"> <div> </body> </html> [root@master231 05-helm]#
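温馨提示(补充):
  升级后可以确认本次Release实际生效的自定义values;另外使用--set方式升级时,默认是基于Chart自带的values.yaml重新计算的,如果想在上一次Release的values基础上叠加修改,可以加上--reuse-values参数。以下命令为示意写法。
  # 查看用户自定义的values
  helm get values xiuxian
  # 查看包含默认值在内的完整values
  helm get values xiuxian --all
  # 在上一次values的基础上只修改副本数
  helm upgrade xiuxian weixiang-weixiang98 --set replicaCount=5 --reuse-values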
3、helm的回滚实战
bash
- helm的回滚实战 1.查看RELEASE历史版本 [root@master231 05-helm]# helm history xiuxian REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION 1 Mon Jul 28 10:15:47 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Install complete 2 Mon Jul 28 10:21:07 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Upgrade complete 3 Mon Jul 28 10:24:33 2025 deployed weixiang-weixiang98-0.1.0 1.16.0 Upgrade complete [root@master231 05-helm]# [root@master231 05-helm]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION xiuxian default 3 2025-07-28 10:24:33.168320177 +0800 CST deployed weixiang-weixiang98-0.1.0 1.16.0 [root@master231 05-helm]# [root@master231 05-helm]# 2.回滚到上一个版本 [root@master231 05-helm]# helm rollback xiuxian Rollback was a success! Happy Helming! [root@master231 05-helm]# [root@master231 05-helm]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION xiuxian default 4 2025-07-28 10:27:07.456398093 +0800 CST deployed weixiang-weixiang98-0.1.0 1.16.0 [root@master231 05-helm]# [root@master231 05-helm]# helm history xiuxian REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION 1 Mon Jul 28 10:15:47 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Install complete 2 Mon Jul 28 10:21:07 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Upgrade complete 3 Mon Jul 28 10:24:33 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Upgrade complete 4 Mon Jul 28 10:27:07 2025 deployed weixiang-weixiang98-0.1.0 1.16.0 Rollback to 2 [root@master231 05-helm]# 3.验证测试回滚效果 [root@master231 05-helm]# kubectl get deploy,svc,po NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/xiuxian-weixiang-weixiang98 3/3 3 3 11m NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 2d23h service/xiuxian-weixiang-weixiang98 ClusterIP 10.200.67.55 <none> 80/TCP 11m NAME READY STATUS RESTARTS AGE pod/xiuxian-weixiang-weixiang98-5b48f4cb6c-2wgvt 1/1 Running 0 32s pod/xiuxian-weixiang-weixiang98-5b48f4cb6c-4r4vx 1/1 Running 0 33s pod/xiuxian-weixiang-weixiang98-5b48f4cb6c-mctq7 1/1 Running 0 30s [root@master231 05-helm]# [root@master231 05-helm]# curl 10.200.67.55 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v2</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: red">凡人修仙传 v2 </h1> <div> <img src="2.jpg"> <div> </body> </html> [root@master231 05-helm]# 4.注意再次回滚到上一个版本并验证结果 [root@master231 05-helm]# helm rollback xiuxian Rollback was a success! Happy Helming! 
[root@master231 05-helm]# [root@master231 05-helm]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION xiuxian default 5 2025-07-28 10:29:10.472334011 +0800 CST deployed weixiang-weixiang98-0.1.0 1.16.0 [root@master231 05-helm]# [root@master231 05-helm]# helm history xiuxian REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION 1 Mon Jul 28 10:15:47 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Install complete 2 Mon Jul 28 10:21:07 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Upgrade complete 3 Mon Jul 28 10:24:33 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Upgrade complete 4 Mon Jul 28 10:27:07 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Rollback to 2 5 Mon Jul 28 10:29:10 2025 deployed weixiang-weixiang98-0.1.0 1.16.0 Rollback to 3 [root@master231 05-helm]# [root@master231 05-helm]# kubectl get deploy,svc,po NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/xiuxian-weixiang-weixiang98 5/5 5 5 13m NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 2d23h service/xiuxian-weixiang-weixiang98 ClusterIP 10.200.67.55 <none> 80/TCP 13m NAME READY STATUS RESTARTS AGE pod/xiuxian-weixiang-weixiang98-6989979888-2l9rt 1/1 Running 0 15s pod/xiuxian-weixiang-weixiang98-6989979888-dq2d4 1/1 Running 0 15s pod/xiuxian-weixiang-weixiang98-6989979888-hg5pr 1/1 Running 0 15s pod/xiuxian-weixiang-weixiang98-6989979888-k82x6 1/1 Running 0 13s pod/xiuxian-weixiang-weixiang98-6989979888-qwsmf 1/1 Running 0 13s [root@master231 05-helm]# [root@master231 05-helm]# curl 10.200.67.55 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v3</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: pink">凡人修仙传 v3 </h1> <div> <img src="3.jpg"> <div> </body> </html> [root@master231 05-helm]# 5.回滚到指定版本 [root@master231 05-helm]# helm rollback xiuxian 1 Rollback was a success! Happy Helming! 
[root@master231 05-helm]# [root@master231 05-helm]# helm history xiuxian REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION 1 Mon Jul 28 10:15:47 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Install complete 2 Mon Jul 28 10:21:07 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Upgrade complete 3 Mon Jul 28 10:24:33 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Upgrade complete 4 Mon Jul 28 10:27:07 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Rollback to 2 5 Mon Jul 28 10:29:10 2025 superseded weixiang-weixiang98-0.1.0 1.16.0 Rollback to 3 6 Mon Jul 28 10:30:01 2025 deployed weixiang-weixiang98-0.1.0 1.16.0 Rollback to 1 [root@master231 05-helm]# [root@master231 05-helm]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION xiuxian default 6 2025-07-28 10:30:01.204347883 +0800 CST deployed weixiang-weixiang98-0.1.0 1.16.0 [root@master231 05-helm]# 6.再次验证回滚的效果 [root@master231 05-helm]# kubectl get deploy,svc,pod NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/xiuxian-weixiang-weixiang98 1/1 1 1 14m NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 2d23h service/xiuxian-weixiang-weixiang98 ClusterIP 10.200.67.55 <none> 80/TCP 14m NAME READY STATUS RESTARTS AGE pod/xiuxian-weixiang-weixiang98-68c6c66bb6-dd7st 1/1 Running 0 13s [root@master231 05-helm]# [root@master231 05-helm]# curl 10.200.67.55 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 05-helm]#
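温馨提示: 回滚前可以先用"--dry-run"预览变更,正式回滚时可以加"--wait"等待资源就绪;下面是一个最小示意(仍以上文的xiuxian为例,其中回滚目标版本2只是举例,实际应先用helm history确认)。

```bash
# 1.查看历史,确认要回滚到的REVISION(此处的2仅为示例)
helm history xiuxian

# 2.仅预览回滚将产生的变更,不真正执行
helm rollback xiuxian 2 --dry-run

# 3.正式回滚,等待资源就绪(最长5分钟),失败时清理本次新建的资源
helm rollback xiuxian 2 --wait --timeout 5m --cleanup-on-fail
```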
4、helm的公有仓库管理
bash
- helm的公有仓库管理及es-exporter环境部署案例 1 主流的Chart仓库概述 互联网公开Chart仓库,可以直接使用他们制作好的Chart包: 微软仓库: http://mirror.azure.cn/kubernetes/charts/ 阿里云仓库: https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts 2 添加共有仓库 [root@master231 05-helm]# helm repo add azure http://mirror.azure.cn/kubernetes/charts/ "azure" has been added to your repositories [root@master231 05-helm]# [root@master231 05-helm]# helm repo add weixiang-aliyun https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts "weixiang-aliyun" has been added to your repositories [root@master231 05-helm]# 3.查看本地的仓库列表 [root@master231 05-helm]# helm repo list NAME URL azure http://mirror.azure.cn/kubernetes/charts/ weixiang-aliyun https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts [root@master231 05-helm]# 4.更新本地的仓库信息 [root@master231 05-helm]# helm repo update Hang tight while we grab the latest from your chart repositories... ...Successfully got an update from the "weixiang-aliyun" chart repository ...Successfully got an update from the "azure" chart repository Update Complete. ⎈Happy Helming!⎈ [root@master231 05-helm]# 5.搜索我们关心的"Chart" [root@master231 05-helm]# helm search repo elasticsearch # 此处的“elasticsearch”可以换成你想要搜索的Chart关键字即可。 NAME CHART VERSION APP VERSION DESCRIPTION azure/elasticsearch 1.32.5 6.8.6 DEPRECATED Flexible and powerful open source, d... azure/elasticsearch-curator 2.2.3 5.7.6 DEPRECATED A Helm chart for Elasticsearch Curator azure/elasticsearch-exporter 3.7.1 1.1.0 DEPRECATED Elasticsearch stats exporter for Pro... azure/fluentd-elasticsearch 2.0.7 2.3.2 DEPRECATED! - A Fluentd Helm chart for Kubernet... weixiang-aliyun/elasticsearch-exporter 0.1.2 1.0.2 Elasticsearch stats exporter for Prometheus azure/apm-server 2.1.7 7.0.0 DEPRECATED The server receives data from the El... azure/dmarc2logstash 1.3.1 1.0.3 DEPRECATED Provides a POP3-polled DMARC XML rep... azure/elastabot 1.2.1 1.1.0 DEPRECATED A Helm chart for Elastabot - a Slack... azure/elastalert 1.5.1 0.2.4 DEPRECATED ElastAlert is a simple framework for... azure/fluentd 2.5.3 v2.4.0 DEPRECATED A Fluentd Elasticsearch Helm chart f... azure/kibana 3.2.8 6.7.0 DEPRECATED - Kibana is an open source data visu... weixiang-aliyun/elastalert 0.1.1 0.1.21 ElastAlert is a simple framework for alerting o... weixiang-aliyun/kibana 0.2.2 6.0.0 Kibana is an open source data visualization plu... [root@master231 05-helm]# [root@master231 05-helm]# [root@master231 05-helm]# helm search repo elasticsearch -l # 显示所有的版本信息列表 NAME CHART VERSION APP VERSION DESCRIPTION ... weixiang-aliyun/elasticsearch-exporter 0.1.2 1.0.2 Elasticsearch stats exporter for Prometheus weixiang-aliyun/elasticsearch-exporter 0.1.1 1.0.2 Elasticsearch stats exporter for Prometheus ... 
6.查看Chart的详细信息 [root@master231 05-helm]# helm show chart weixiang-aliyun/elasticsearch-exporter # 若不指定,默认显示最新版本信息。 apiVersion: v1 appVersion: 1.0.2 description: Elasticsearch stats exporter for Prometheus keywords: - metrics - elasticsearch - monitoring maintainers: - email: sven.mueller@commercetools.com name: svenmueller name: elasticsearch-exporter sources: - https://github.com/justwatchcom/elasticsearch_exporter version: 0.1.2 [root@master231 05-helm]# [root@master231 05-helm]# helm show chart weixiang-aliyun/elasticsearch-exporter --version 0.1.1 # 指定Chart版本信息 apiVersion: v1 appVersion: 1.0.2 description: Elasticsearch stats exporter for Prometheus keywords: - metrics - elasticsearch - monitoring maintainers: - email: sven.mueller@commercetools.com name: svenmueller name: elasticsearch-exporter sources: - https://github.com/justwatchcom/elasticsearch_exporter version: 0.1.1 [root@master231 05-helm]# 7.拉取Chart [root@master231 05-helm]# helm pull weixiang-aliyun/elasticsearch-exporter # 若不指定,拉取最新的Chart [root@master231 05-helm]# [root@master231 05-helm]# ll elasticsearch* -rw-r--r-- 1 root root 3761 Jul 28 10:52 elasticsearch-exporter-0.1.2.tgz [root@master231 05-helm]# [root@master231 05-helm]# helm pull weixiang-aliyun/elasticsearch-exporter --version 0.1.1 # 拉取指定Chart版本 [root@master231 05-helm]# [root@master231 05-helm]# ll elasticsearch* -rw-r--r-- 1 root root 3718 Jul 28 10:53 elasticsearch-exporter-0.1.1.tgz -rw-r--r-- 1 root root 3761 Jul 28 10:52 elasticsearch-exporter-0.1.2.tgz [root@master231 05-helm]# 8.解压Chart包 [root@master231 05-helm]# tar xf elasticsearch-exporter-0.1.2.tgz [root@master231 05-helm]# tree elasticsearch-exporter elasticsearch-exporter ├── Chart.yaml ├── README.md ├── templates │   ├── cert-secret.yaml │   ├── deployment.yaml │   ├── _helpers.tpl │   ├── NOTES.txt │   └── service.yaml └── values.yaml 1 directory, 8 files [root@master231 05-helm]# [root@master231 05-helm]# grep apiVersion elasticsearch-exporter/templates/deployment.yaml apiVersion: apps/v1beta2 [root@master231 05-helm]# [root@master231 05-helm]# sed -ri '/apiVersion/s#(apps\/v1)beta2#\1#' elasticsearch-exporter/templates/deployment.yaml [root@master231 05-helm]# grep apiVersion elasticsearch-exporter/templates/deployment.yaml apiVersion: apps/v1 [root@master231 05-helm]# 8.基于Chart安装服务发行Release [root@master231 05-helm]# helm install myes-exporter elasticsearch-exporter NAME: myes-exporter LAST DEPLOYED: Mon Jul 28 10:56:18 2025 NAMESPACE: default STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: 1. 
Get the application URL by running these commands: export POD_NAME=$(kubectl get pods --namespace default -l "app=myes-exporter-elasticsearch-exporter" -o jsonpath="{.items[0].metadata.name}") echo "Visit http://127.0.0.1:9108/metrics to use your application" kubectl port-forward $POD_NAME 9108:9108 --namespace default [root@master231 05-helm]# [root@master231 05-helm]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION myes-exporter default 1 2025-07-28 10:56:18.011736108 +0800 CST deployed elasticsearch-exporter-0.1.2 1.0.2 xiuxian default 6 2025-07-28 10:30:01.204347883 +0800 CST deployed weixiang-weixiang98-0.1.0 1.16.0 [root@master231 05-helm]# [root@master231 05-helm]# kubectl get deploy,svc,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/myes-exporter-elasticsearch-exporter 1/1 1 1 2m13s elasticsearch-exporter justwatch/elasticsearch_exporter:1.0.2 app=elasticsearch-exporter,release=myes-exporter deployment.apps/xiuxian-weixiang-weixiang98 1/1 1 1 42m weixiang-weixiang98 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 app.kubernetes.io/instance=xiuxian,app.kubernetes.io/name=weixiang-weixiang98 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 2d23h <none> service/myes-exporter-elasticsearch-exporter ClusterIP 10.200.210.74 <none> 9108/TCP 2m13s app=elasticsearch-exporter,release=myes-exporter service/xiuxian-weixiang-weixiang98 ClusterIP 10.200.67.55 <none> 80/TCP 42m app.kubernetes.io/instance=xiuxian,app.kubernetes.io/name=weixiang-weixiang98 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/myes-exporter-elasticsearch-exporter-5b68b7f954-ctzgw 1/1 Running 0 3s 10.100.2.184 worker233 <none> <none> pod/xiuxian-weixiang-weixiang98-68c6c66bb6-dd7st 1/1 Running 0 28m 10.100.2.182 worker233 <none> <none> [root@master231 05-helm]# [root@master231 05-helm]# curl -s 10.200.210.74:9108/metrics | tail process_open_fds 7 # HELP process_resident_memory_bytes Resident memory size in bytes. # TYPE process_resident_memory_bytes gauge process_resident_memory_bytes 6.7584e+06 # HELP process_start_time_seconds Start time of the process since unix epoch in seconds. # TYPE process_start_time_seconds gauge process_start_time_seconds 1.75367150895e+09 # HELP process_virtual_memory_bytes Virtual memory size in bytes. # TYPE process_virtual_memory_bytes gauge process_virtual_memory_bytes 1.0010624e+07 [root@master231 05-helm]# 温馨提示: 如果镜像拉取不成功,可以在我的仓库中找到即可。 http://192.168.21.253/Resources/Kubernetes/Add-ons/helm/ 9.删除第三方仓库 [root@master231 05-helm]# helm repo list NAME URL azure http://mirror.azure.cn/kubernetes/charts/ weixiang-aliyun https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts [root@master231 05-helm]# [root@master231 05-helm]# helm repo remove weixiang-aliyun "weixiang-aliyun" has been removed from your repositories [root@master231 05-helm]# [root@master231 05-helm]# helm repo list NAME URL azure http://mirror.azure.cn/kubernetes/charts/ [root@master231 05-helm]# [root@master231 05-helm]# helm repo remove azure "azure" has been removed from your repositories [root@master231 05-helm]# [root@master231 05-helm]# helm repo list Error: no repositories to show [root@master231 05-helm]#
5、helm高级
bash
1.安装helm环境 略,见视频。 2.创建Chart [root@master231 05-helm]# helm create weixiang-linux Creating weixiang-linux [root@master231 05-helm]# [root@master231 05-helm]# ll weixiang-linux total 32 drwxr-xr-x 4 root root 4096 Jul 30 11:37 ./ drwxr-xr-x 5 root root 4096 Jul 30 11:37 ../ drwxr-xr-x 2 root root 4096 Jul 30 11:37 charts/ -rw-r--r-- 1 root root 1151 Jul 30 11:37 Chart.yaml -rw-r--r-- 1 root root 349 Jul 30 11:37 .helmignore drwxr-xr-x 3 root root 4096 Jul 30 11:37 templates/ -rw-r--r-- 1 root root 4301 Jul 30 11:37 values.yaml [root@master231 05-helm]# [root@master231 05-helm]# tree weixiang-linux/ weixiang-linux/ ├── charts ├── Chart.yaml ├── templates │   ├── deployment.yaml │   ├── _helpers.tpl │   ├── hpa.yaml │   ├── ingress.yaml │   ├── NOTES.txt │   ├── serviceaccount.yaml │   ├── service.yaml │   └── tests │   └── test-connection.yaml └── values.yaml 3 directories, 10 files [root@master231 05-helm]# 3.清空目录结构 [root@master231 05-helm]# rm -rf weixiang-linux/templates/* [root@master231 05-helm]# [root@master231 05-helm]# > weixiang-linux/values.yaml [root@master231 05-helm]# [root@master231 05-helm]# tree weixiang-linux/ weixiang-linux/ ├── charts ├── Chart.yaml ├── templates └── values.yaml 2 directories, 2 files [root@master231 05-helm]# 4.准备资源清单 [root@master231 05-helm]# cat > weixiang-linux/templates/deployments.yaml << EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: app: xiuxian template: metadata: labels: app: xiuxian spec: volumes: - name: data configMap: name: cm-xiuxian items: - key: default.conf path: default.conf containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 ports: - containerPort: 80 volumeMounts: - name: data mountPath: /etc/nginx/conf.d/default.conf subPath: default.conf name: c1 livenessProbe: failureThreshold: 8 httpGet: path: / port: 80 initialDelaySeconds: 10 periodSeconds: 10 timeoutSeconds: 15 readinessProbe: failureThreshold: 3 httpGet: path: / port: 80 periodSeconds: 1 timeoutSeconds: 15 resources: requests: cpu: 0.2 memory: 200Mi limits: cpu: 0.5 memory: 300Mi EOF [root@master231 05-helm]# cat > weixiang-linux/templates/configmaps.yaml << EOF apiVersion: v1 kind: ConfigMap metadata: name: cm-xiuxian data: default.conf: | server { listen 80; server_name localhost; location / { root /usr/share/nginx/html; index index.html index.htm; } error_page 500 502 503 504 /50x.html; location = /50x.html { root /usr/share/nginx/html; } } EOF [root@master231 05-helm]# cat > weixiang-linux/templates/hpa.yaml << EOF apiVersion: autoscaling/v1 kind: HorizontalPodAutoscaler metadata: name: deploy-xiuxian spec: maxReplicas: 5 minReplicas: 2 scaleTargetRef: apiVersion: apps/v1 kind: Deployment name: deploy-xiuxian EOF [root@master231 05-helm]# cat > weixiang-linux/templates/service.yaml << EOF apiVersion: v1 kind: Service metadata: name: svc-xiuxian spec: clusterIP: 10.200.20.25 ports: - port: 80 selector: app: xiuxian type: NodePort EOF [root@master231 05-helm]# cat > weixiang-linux/templates/ingress.yaml << EOF apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: ing-xiuxian spec: ingressClassName: traefik-server rules: - host: xiuxian.weixiang.com http: paths: - backend: service: name: svc-xiuxian port: number: 80 path: / pathType: Prefix EOF [root@master231 05-helm]# tree weixiang-linux weixiang-linux ├── charts ├── Chart.yaml ├── templates │   ├── configmaps.yaml │   ├── deployments.yaml │   ├── hpa.yaml │   ├── ingress.yaml │   └── service.yaml └── values.yaml 2 
directories, 7 files [root@master231 05-helm]# 5.安装测试 [root@master231 05-helm]# helm install myapp weixiang-linux -n kube-public NAME: myapp LAST DEPLOYED: Wed Jul 30 11:41:03 2025 NAMESPACE: kube-public STATUS: deployed REVISION: 1 TEST SUITE: None [root@master231 05-helm]# [root@master231 05-helm]# helm list -n kube-public NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION myapp kube-public 1 2025-07-30 11:41:03.600886497 +0800 CST deployed weixiang-linux-0.1.0 1.16.0 [root@master231 05-helm]# [root@master231 05-helm]# [root@master231 05-helm]# kubectl get deploy,svc,hpa,cm,po -n kube-public NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/deploy-xiuxian 3/3 3 3 28s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/svc-xiuxian NodePort 10.200.20.25 <none> 80:25965/TCP 28s NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE horizontalpodautoscaler.autoscaling/deploy-xiuxian Deployment/deploy-xiuxian <unknown>/80% 2 5 3 28s NAME DATA AGE configmap/cluster-info 1 21d configmap/cm-xiuxian 1 28s configmap/kube-root-ca.crt 1 21d NAME READY STATUS RESTARTS AGE pod/deploy-xiuxian-7f95b8844f-2snfs 1/1 Running 0 28s pod/deploy-xiuxian-7f95b8844f-kkfrh 1/1 Running 0 28s pod/deploy-xiuxian-7f95b8844f-rdpjh 1/1 Running 0 28s [root@master231 05-helm]#
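温馨提示: 在helm install之前,可以先用"helm lint"做语法检查,再用"helm template"在本地渲染出最终的资源清单,便于排查模板问题;以下为示意。

```bash
# 1.检查Chart的目录结构与模板语法
helm lint weixiang-linux

# 2.在本地渲染模板(不访问集群),确认生成的YAML是否符合预期
helm template myapp weixiang-linux > /tmp/myapp-rendered.yaml

# 3.也可以结合kubectl做服务端校验(--dry-run=server需要能访问apiserver)
helm template myapp weixiang-linux | kubectl apply --dry-run=server -f -
```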
1、Chart的基本信息定义
bash
1.卸载服务 [root@master231 05-helm]# helm list -n kube-public NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION myapp kube-public 1 2025-07-30 11:41:03.600886497 +0800 CST deployed weixiang-linux-0.1.0 1.16.0 [root@master231 05-helm]# [root@master231 05-helm]# helm -n kube-public uninstall myapp release "myapp" uninstalled [root@master231 05-helm]# [root@master231 05-helm]# helm list -n kube-public NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION [root@master231 05-helm]# 2.修改Chart.yaml文件 [root@master231 05-helm]# cat > weixiang-linux/Chart.yaml <<'EOF' # 指定Chart的版本,一般无需修改。 apiVersion: v2 # 指定Chart的名称 name: weixiang-linux # 表示Chart描述信息,描述此Chart的作用 description: weixiang Linux Kubernetes helm case demo。 # 指定Chart的类型,有效值为: application和library # application: # 此类型的Chart可以被独立部署,打包等。 # library: # 无法被独立部署。但可以被application类型的Chart进行引用。 type: application #type: library # 定义当前Chart版本,建议命令遵循: https://semver.org/ # 核心语法: MAJOR.MINOR.PATCH #MAJOR: 进行不兼容的API更改时的主要版本,对应的大版本变化。 # MINOR: 在大版本(MAJOR)架构基础之上新增各种功能。 # PATCH: 修复功能的各种BUG,说白了,就是各种打补丁。 version: 25.07.30 # 表示当前正在部署的Release发行版本。 appVersion: "v1.2.0" EOF 3.部署测试 [root@master231 05-helm]# helm -n kube-public install myapp weixiang-linux NAME: myapp LAST DEPLOYED: Wed Jul 30 11:47:09 2025 NAMESPACE: kube-public STATUS: deployed REVISION: 1 TEST SUITE: None [root@master231 05-helm]# [root@master231 05-helm]# helm list -n kube-public NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION myapp kube-public 1 2025-07-30 11:47:09.688571728 +0800 CST deployed weixiang-linux-25.07.30 v1.2.0 [root@master231 05-helm]# [root@master231 05-helm]# kubectl get deploy,svc,hpa,cm,po,ing -n kube-public NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/deploy-xiuxian 3/3 3 3 2m12s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/svc-xiuxian NodePort 10.200.20.25 <none> 80:3799/TCP 2m12s NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE horizontalpodautoscaler.autoscaling/deploy-xiuxian Deployment/deploy-xiuxian 0%/80% 2 5 3 2m12s NAME DATA AGE configmap/cluster-info 1 21d configmap/cm-xiuxian 1 2m12s configmap/kube-root-ca.crt 1 21d NAME READY STATUS RESTARTS AGE pod/deploy-xiuxian-7f95b8844f-pwdxl 1/1 Running 0 2m12s pod/deploy-xiuxian-7f95b8844f-qbdl2 1/1 Running 0 2m12s pod/deploy-xiuxian-7f95b8844f-tqs88 1/1 Running 0 2m12s NAME CLASS HOSTS ADDRESS PORTS AGE ingress.networking.k8s.io/ing-xiuxian traefik-server xiuxian.weixiang.com 10.0.0.152 80 2m12s [root@master231 05-helm]# - 自定义values变量引用 1.values.yaml配置文件说明 可以自定义字段在values.yaml文件中。 在templates目录下的所有资源清单均可以基于jija2语法引用values.yaml文件的自定义字段。 2.实战案例 2.1 修改values文件内容 [root@master231 05-helm]# cat > weixiang-linux/values.yaml <<EOF service: port: 90 EOF 2.2 templates目录下的资源清单进行引用 [root@master231 05-helm]# cat > weixiang-linux/templates/configmaps.yaml <<'EOF' apiVersion: v1 kind: ConfigMap metadata: name: cm-xiuxian data: default.conf: | server { listen {{ .Values.service.port }}; server_name localhost; location / { root /usr/share/nginx/html; index index.html index.htm; } error_page 500 502 503 504 /50x.html; location = /50x.html { root /usr/share/nginx/html; } } EOF [root@master231 05-helm]# cat > weixiang-linux/templates/deployments.yaml <<'EOF' apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: app: xiuxian template: metadata: labels: app: xiuxian spec: volumes: - name: data configMap: name: cm-xiuxian items: - key: default.conf path: default.conf containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 ports: - 
containerPort: {{ .Values.service.port }} volumeMounts: - name: data mountPath: /etc/nginx/conf.d/default.conf subPath: default.conf name: c1 livenessProbe: failureThreshold: 8 httpGet: path: / port: {{ .Values.service.port }} initialDelaySeconds: 10 periodSeconds: 10 timeoutSeconds: 15 readinessProbe: failureThreshold: 3 httpGet: path: / port: {{ .Values.service.port }} periodSeconds: 1 timeoutSeconds: 15 resources: requests: cpu: 0.2 memory: 200Mi limits: cpu: 0.5 memory: 300Mi EOF [root@master231 05-helm]# cat > weixiang-linux/templates/service.yaml <<'EOF' apiVersion: v1 kind: Service metadata: name: svc-xiuxian spec: clusterIP: 10.200.20.25 ports: - port: {{ .Values.service.port }} selector: app: xiuxian type: NodePort EOF [root@master231 05-helm]# cat > weixiang-linux/templates/ingress.yaml <<'EOF' apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: ing-xiuxian spec: ingressClassName: traefik-server rules: - host: xiuxian.weixiang.com http: paths: - backend: service: name: svc-xiuxian port: number: {{ .Values.service.port }} path: / pathType: Prefix EOF 2.3 安装服务 [root@master231 05-helm]# helm -n kube-public uninstall myapp release "myapp" uninstalled [root@master231 05-helm]# [root@master231 05-helm]# helm -n kube-public install myapp weixiang-linux NAME: myapp LAST DEPLOYED: Wed Jul 30 14:38:02 2025 NAMESPACE: kube-public STATUS: deployed REVISION: 1 TEST SUITE: None [root@master231 05-helm]# [root@master231 05-helm]# helm list -n kube-public NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION myapp kube-public 1 2025-07-30 14:38:02.820673468 +0800 CST deployed weixiang-linux-25.07.30 v1.2.0 [root@master231 05-helm]# 2.4 测试访问 [root@master231 05-helm]# kubectl get deploy,cm,svc,hpa,ing -n kube-public NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/deploy-xiuxian 3/3 3 3 29s NAME DATA AGE configmap/cluster-info 1 21d configmap/cm-xiuxian 1 30s configmap/kube-root-ca.crt 1 21d NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/svc-xiuxian NodePort 10.200.20.25 <none> 90:3259/TCP 29s NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE horizontalpodautoscaler.autoscaling/deploy-xiuxian Deployment/deploy-xiuxian <unknown>/80% 2 5 3 29s NAME CLASS HOSTS ADDRESS PORTS AGE ingress.networking.k8s.io/ing-xiuxian traefik-server xiuxian.weixiang.com 10.0.0.152 80 29s [root@master231 05-helm]# [root@master231 05-helm]# [root@master231 05-helm]# curl 10.200.20.25:90 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 05-helm]# 2.5 卸载服务 [root@master231 05-helm]# helm -n kube-public uninstall myapp release "myapp" uninstalled [root@master231 05-helm]# [root@master231 05-helm]# helm -n kube-public list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION [root@master231 05-helm]#
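温馨提示: templates目录下的清单引用values.yaml时使用的是Go template语法(写法与Jinja2类似但并不是同一套语法);安装或升级时还可以不改文件、直接在命令行覆盖values中的字段,以下为示意(以上文的service.port为例)。

```bash
# 1.用--set临时把端口覆盖为8080(优先级高于values.yaml中的service.port)
helm -n kube-public install myapp weixiang-linux --set service.port=8080

# 2.查看该Release实际生效的用户自定义values
helm -n kube-public get values myapp

# 3.升级时同样可以用-f指定另一个values文件进行覆盖(my-values.yaml为假设的文件名)
# helm -n kube-public upgrade myapp weixiang-linux -f my-values.yaml
```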
2、基于NOTES配置安装文档
bash
参考链接: https://helm.sh/zh/docs/chart_template_guide/notes_files/ 1.NOTES.txt概述 NOTES.txt功能主要是为chart添加安装说明。 该文件是纯文本,但会像模板一样处理, 所有正常的模板函数和对象都是可用的。 2.实战案例 2.1 添加NOTES.txt文件 [root@master231 05-helm]# cat > weixiang-linux/templates/NOTES.txt <<'EOF' ########################################################### # # 学IT来老男孩,月薪过万不是梦~ # # 官网地址: # https://www.weixiang.com # # 作者: 尹正杰 ########################################################### Duang~恭喜您,服务部署成功啦~ 当前的部署的信息如下: Chart名称: {{ .Chart.Name }} Chart版本: {{ .Chart.Version }} Release名称: {{ .Release.Name }} K8S集群内部可以使用如下命令测试: ClusterIP=$(kubectl -n kube-public get svc svc-xiuxian -o jsonpath='{.spec.clusterIP}') curl ${ClusterIP}:{{ .Values.service.port }} EOF 2.2 安装测试 [root@master231 05-helm]# helm -n kube-public install myapp weixiang-linux NAME: myapp LAST DEPLOYED: Wed Jul 30 14:43:27 2025 NAMESPACE: kube-public STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: ########################################################### # # 学IT来老男孩,月薪过万不是梦~ # # 官网地址: # https://www.weixiang.com # # 作者: 尹正杰 ########################################################### Duang~恭喜您,服务部署成功啦~ 当前的部署的信息如下: Chart名称: weixiang-linux Chart版本: 25.07.30 Release名称: myapp K8S集群内部可以使用如下命令测试: ClusterIP=$(kubectl -n kube-public get svc svc-xiuxian -o jsonpath='{.spec.clusterIP}') curl ${ClusterIP}:90 [root@master231 05-helm]# [root@master231 05-helm]# helm -n kube-public list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION myapp kube-public 1 2025-07-30 14:43:27.190196071 +0800 CST deployed weixiang-linux-25.07.30 v1.2.0 [root@master231 05-helm]# [root@master231 05-helm]# ClusterIP=$(kubectl -n kube-public get svc svc-xiuxian -o jsonpath='{.spec.clusterIP}') [root@master231 05-helm]# curl ${ClusterIP}:90 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 05-helm]# 3.卸载服务 [root@master231 05-helm]# helm -n kube-public list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION myapp kube-public 1 2025-07-30 14:43:27.190196071 +0800 CST deployed weixiang-linux-25.07.30 v1.2.0 [root@master231 05-helm]# [root@master231 05-helm]# helm -n kube-public uninstall myapp release "myapp" uninstalled [root@master231 05-helm]# [root@master231 05-helm]# helm -n kube-public list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION [root@master231 05-helm]#
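温馨提示: NOTES.txt只在install/upgrade时打印一次,在Release尚未卸载时,可以用"helm get notes"随时重新查看;以下为示意(假设Release仍叫myapp且部署在kube-public名称空间)。

```bash
# 1.重新渲染并输出指定Release的NOTES内容
helm -n kube-public get notes myapp

# 2.查看Release的整体信息(状态、NOTES等)
helm -n kube-public status myapp
```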
3、helm的函数基础体验
bash
1.helm函数概述 说白了,就是一系列处理文本的功能函数。 参考链接: https://helm.sh/zh/docs/chart_template_guide/function_list/ 2.实战案例 [root@master231 05-helm]# cat > weixiang-linux/values.yaml <<EOF service: port: 90 deployments: image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps tag: v1 env: school: weixiang class: weixiang98 probe: enable: true code: livenessProbe: failureThreshold: 8 httpGet: path: / port: 90 initialDelaySeconds: 10 periodSeconds: 10 timeoutSeconds: 15 readinessProbe: failureThreshold: 3 httpGet: path: / port: 90 periodSeconds: 1 timeoutSeconds: 15 resources: requests: cpu: 0.2 memory: 200Mi limits: cpu: 0.5 memory: 300Mi EOF [root@master231 05-helm]# cat > weixiang-linux/templates/deployments.yaml <<'EOF' apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: app: xiuxian template: metadata: labels: app: xiuxian spec: volumes: - name: data configMap: name: cm-xiuxian items: - key: default.conf path: default.conf containers: - image: "{{ .Values.deployments.image }}:{{ .Values.deployments.tag}}" env: - name: XIXI value: {{ .Values.deployments.env.school | upper }} - name: HAHA value: {{ .Values.deployments.env.class | title | indent 4 }} - name: HEHE value: {{ .Values.service.port | quote }} ports: - containerPort: {{ .Values.service.port }} volumeMounts: - name: data mountPath: /etc/nginx/conf.d/default.conf subPath: default.conf name: c1 {{- toYaml .Values.probe.code | nindent 8}} EOF 3.定义注释和函数引用 [root@master231 05-helm]# cat > weixiang-linux/templates/NOTES.txt <<EOF ########################################################### # # 学IT来老男孩,月薪过万不是梦~ # # 官网地址: # https://www.weixiang.com # # 作者: 尹正杰 ########################################################### {{/* 注释内容: 使用repeat函数可以指定字符特定的次数。 温馨提示: 在使用函数时,应该使用"{{}}"来进行引用哟~否则不生效! */}} {{ repeat 30 "*" }} Duang~恭喜您,服务部署成功啦~ 当前的部署的信息如下: Chart名称: {{ .Chart.Name }} Chart版本: {{ .Chart.Version }} Release名称: {{ .Release.Name }} K8S集群内部可以使用如下命令测试: ClusterIP=$(kubectl -n kube-public get svc svc-xiuxian -o jsonpath='{.spec.clusterIP}') curl ${ClusterIP}:{{ .Values.service.port }} EOF 4.测试验证 [root@master231 05-helm]# helm -n kube-public install myapp weixiang-linux --dry-run=client
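温馨提示: 除了上面用到的upper、title、quote、repeat、toYaml、nindent,常用的还有default(为缺失字段提供默认值)等函数;下面是一个最小示意片段(字段名沿用上文values.yaml,仅演示函数组合写法,可用helm template渲染验证)。

```bash
# 模板中常见的函数组合写法(片段示意,可放入任意templates/*.yaml中验证):
#   value: {{ .Values.deployments.env.school | default "weixiang" | upper | quote }}
#   {{- toYaml .Values.probe.code.resources | nindent 12 }}

# 渲染后确认函数效果(--show-only只输出指定的模板文件)
helm template myapp weixiang-linux --show-only templates/deployments.yaml
```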
4、helm的流程控制初体验
bash
参考链接: https://helm.sh/zh/docs/chart_template_guide/control_structures/ 1.流程控制 可以进行判断,遍历等操作。 2.实战案例 [root@master231 05-helm]# cat > weixiang-linux/values.yaml <<EOF service: port: 90 # kind: NodePort kind: LoadBalancer deployments: image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps tag: v1 env: school: weixiang class: weixiang98 probe: # enable: true enable: false code: livenessProbe: failureThreshold: 8 httpGet: path: / port: 90 initialDelaySeconds: 10 periodSeconds: 10 timeoutSeconds: 15 readinessProbe: failureThreshold: 3 httpGet: path: / port: 90 periodSeconds: 1 timeoutSeconds: 15 resources: requests: cpu: 0.2 memory: 200Mi limits: cpu: 0.5 memory: 300Mi jianshenTopics: - pulati - quanji - suxing - lashen - zengji - youyang - xiufu - youyong EOF [root@master231 05-helm]# cat > weixiang-linux/templates/deployments.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: app: xiuxian template: metadata: labels: app: xiuxian spec: volumes: - name: data configMap: name: cm-xiuxian items: - key: default.conf path: default.conf containers: - image: "{{ .Values.deployments.image }}:{{ .Values.deployments.tag}}" env: {{- with .Values.deployments.env }} - name: XIXI value: {{ .school | upper }} - name: HAHA value: {{ .class | title | indent 4 }} {{- end }} - name: HEHE value: {{ .Values.service.port | quote }} ports: - containerPort: {{ .Values.service.port }} volumeMounts: - name: data mountPath: /etc/nginx/conf.d/default.conf subPath: default.conf name: c1 {{- if .Values.probe.enable }} {{- toYaml .Values.probe.code | nindent 8}} {{- end }} EOF [root@master231 05-helm]# cat > weixiang-linux/templates/configmaps.yaml <<EOF apiVersion: v1 kind: ConfigMap metadata: name: cm-xiuxian data: default.conf: | server { listen {{ .Values.service.port }}; server_name localhost; location / { root /usr/share/nginx/html; index index.html index.htm; } error_page 500 502 503 504 /50x.html; location = /50x.html { root /usr/share/nginx/html; } } jianshenTopics: | {{- range .Values.jianshenTopics }} - {{ . | title | quote }} {{- end }} EOF [root@master231 05-helm]# cat > weixiang-linux/templates/service.yaml <<EOF apiVersion: v1 kind: Service metadata: name: svc-xiuxian spec: clusterIP: 10.200.20.25 {{- if eq .Values.service.kind "NodePort" }} type: {{ .Values.service.kind }} ports: - port: {{ .Values.service.port }} nodePort: 30090 {{- else if eq .Values.service.kind "LoadBalancer" }} type: {{ .Values.service.kind }} ports: - port: {{ .Values.service.port }} nodePort: 30110 {{- else }} type: {{ .Values.service.kind | default "ClusterIP" }} ports: - port: {{ .Values.service.port }} {{- end }} selector: app: xiuxian EOF 3.测试脚本 [root@master231 05-helm]# helm install myapp weixiang-linux --dry-run=client
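温馨提示: 验证if/with/range分支的渲染结果时,可以直接在命令行覆盖相关开关,不必反复修改values.yaml;以下为示意。

```bash
# 1.打开探针开关,确认if分支成立时探针配置被渲染出来
helm template myapp weixiang-linux --set probe.enable=true --show-only templates/deployments.yaml

# 2.切换Service类型,验证else if分支
helm template myapp weixiang-linux --set service.kind=NodePort --show-only templates/service.yaml

# 3.覆盖列表字段,验证range遍历(--set对列表使用{}语法)
helm template myapp weixiang-linux --set 'jianshenTopics={pulati,quanji,tiaosheng}' --show-only templates/configmaps.yaml
```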
5、helm的打包并推送到harbor仓库
bash
1.harbor创建项目 建议项目名称为 "weixiang-helm" 2.打包Chart [root@master231 Charts]# ll total 12 drwxr-xr-x 3 root root 4096 Jun 12 15:45 ./ drwxr-xr-x 32 root root 4096 Jun 12 15:44 ../ drwxr-xr-x 4 root root 4096 Jun 12 17:13 weixiang-linux/ [root@master231 Charts]# [root@master231 Charts]# helm package weixiang-linux Successfully packaged chart and saved it to: /weixiang/manifests/Charts/weixiang-linux-25.06.12.tgz [root@master231 Charts]# [root@master231 Charts]# ll total 16 drwxr-xr-x 3 root root 4096 Jun 12 17:20 ./ drwxr-xr-x 32 root root 4096 Jun 12 15:44 ../ drwxr-xr-x 4 root root 4096 Jun 12 17:13 weixiang-linux/ -rw-r--r-- 1 root root 2350 Jun 12 17:20 weixiang-linux-25.06.12.tgz [root@master231 Charts]# [root@master231 Charts]# tar tf weixiang-linux-25.06.12.tgz weixiang-linux/Chart.yaml weixiang-linux/values.yaml weixiang-linux/templates/NOTES.txt weixiang-linux/templates/_helpers.tpl weixiang-linux/templates/configmaps.yaml weixiang-linux/templates/deployments.yaml weixiang-linux/templates/hpa.yaml weixiang-linux/templates/ingress.yaml weixiang-linux/templates/service.yaml weixiang-linux/.helmignore [root@master231 Charts]# 3.跳过证书校验并配置认证信息 [root@master231 Charts]# helm push weixiang-linux-25.06.12.tgz oci://harbor250.weixiang.com/weixiang-helm --username admin --password 1 --insecure-skip-tls-verify Pushed: harbor250.weixiang.com/weixiang-helm/weixiang-linux:25.06.12 Digest: sha256:efe44993f6fd90b50bd86a49bbd85a97702e1a0fe8b8bebfe2925950ee4fbab6 [root@master231 Charts]# 4.harbor仓库验证 略,见视频。 [root@master231 Chart]# scp -p /usr/local/bin/helm 10.0.0.233:/usr/local/bin/ 5.拉取harbor仓库的Chart [root@worker233 ~]# helm pull oci://harbor250.weixiang.com/weixiang-helm/weixiang-linux --version 25.06.12 --insecure-skip-tls-verify Pulled: harbor250.weixiang.com/weixiang-helm/weixiang-linux:25.06.12 Digest: sha256:efe44993f6fd90b50bd86a49bbd85a97702e1a0fe8b8bebfe2925950ee4fbab6 [root@worker233 ~]# [root@worker233 ~]# ll weixiang-linux-25.06.12.tgz -rw-r--r-- 1 root root 2350 Jun 12 17:26 weixiang-linux-25.06.12.tgz [root@worker233 ~]# [root@worker233 ~]# tar xf weixiang-linux-25.06.12.tgz [root@worker233 ~]# [root@worker233 ~]# tree weixiang-linux weixiang-linux ├── Chart.yaml ├── templates │   ├── configmaps.yaml │   ├── deployments.yaml │   ├── _helpers.tpl │   ├── hpa.yaml │   ├── ingress.yaml │   ├── NOTES.txt │   └── service.yaml └── values.yaml 1 directory, 9 files [root@worker233 ~]# 6.打包注意事项 如果有一些文件不想要打包进去,则可以使用'.helmignore'文件进行忽略。达到优化的目的。 参考示例: http://192.168.14.253/Resources/Kubernetes/Add-ons/helm/weixiang-linux-25.04.22.tgz
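温馨提示: 除了在push/pull时通过"--username/--password"传递凭据,也可以先用"helm registry login"登录一次OCI仓库,后续命令复用凭据;以下为示意(沿用上文的harbor地址与账号,harbor为自签证书时需要--insecure或提前信任CA)。

```bash
# 1.登录harbor的OCI仓库(自签证书时加--insecure跳过校验)
helm registry login harbor250.weixiang.com --username admin --password 1 --insecure

# 2.登录后推送/拉取时不必再显式传账号密码
helm push weixiang-linux-25.06.12.tgz oci://harbor250.weixiang.com/weixiang-helm --insecure-skip-tls-verify
helm pull oci://harbor250.weixiang.com/weixiang-helm/weixiang-linux --version 25.06.12 --insecure-skip-tls-verify

# 3.退出登录
helm registry logout harbor250.weixiang.com
```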
6、helm自定义模板实战案例
bash
1.自定义模板文件 _helpers.tpl用于存放自定义的模板文件。 参考链接: https://helm.sh/zh/docs/chart_template_guide/named_templates/ 2.参考案例 [root@master231 05-helm]# cat > weixiang-linux/templates/_helpers.tpl <<EOF {{- define "weixiang-deploy" }} replicas: 3 selector: matchLabels: app: xiuxian template: metadata: labels: app: xiuxian spec: volumes: - name: data configMap: name: cm-xiuxian items: - key: default.conf path: default.conf containers: - image: "{{ .Values.deployments.image }}:{{ .Values.deployments.tag}}" env: {{- with .Values.deployments.env }} - name: XIXI value: {{ .school | upper }} - name: HAHA value: {{ .class | title | indent 4 }} {{- end }} - name: HEHE value: {{ .Values.service.port | quote }} ports: - containerPort: {{ .Values.service.port }} volumeMounts: - name: data mountPath: /etc/nginx/conf.d/default.conf subPath: default.conf name: c1 {{- if .Values.probe.enable }} {{- toYaml .Values.probe.code | nindent 8}} {{- end }} {{- end }} EOF [root@master231 05-helm]# cat > weixiang-linux/templates/deployments.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: {{- template "weixiang-deploy" . }} EOF 3.测试脚本 [root@master231 05-helm]# tree weixiang-linux weixiang-linux ├── charts ├── Chart.yaml ├── templates │   ├── configmaps.yaml │   ├── deployments.yaml │   ├── _helpers.tpl │   ├── hpa.yaml │   ├── ingress.yaml │   ├── NOTES.txt │   └── service.yaml └── values.yaml 2 directories, 9 files [root@master231 05-helm]# [root@master231 05-helm]# helm install myapp weixiang-linux --dry-run=client -n kube-public - helm的打包并推送到harbor仓库 1.harbor创建项目 建议项目名称为 "weixiang-helm" 2.打包Chart [root@master231 05-helm]# tree weixiang-linux weixiang-linux ├── charts ├── Chart.yaml ├── templates │   ├── configmaps.yaml │   ├── deployments.yaml │   ├── _helpers.tpl │   ├── hpa.yaml │   ├── ingress.yaml │   ├── NOTES.txt │   └── service.yaml └── values.yaml 2 directories, 9 files [root@master231 05-helm]# [root@master231 05-helm]# helm package weixiang-linux Successfully packaged chart and saved it to: /weixiang/manifests/add-ons/05-helm/weixiang-linux-25.07.30.tgz [root@master231 05-helm]# [root@master231 05-helm]# ll weixiang-linux-25.07.30.tgz -rw-r--r-- 1 root root 2538 Jul 30 16:06 weixiang-linux-25.07.30.tgz [root@master231 05-helm]# [root@master231 05-helm]# tar tf weixiang-linux-25.07.30.tgz weixiang-linux/Chart.yaml weixiang-linux/values.yaml weixiang-linux/templates/NOTES.txt weixiang-linux/templates/_helpers.tpl weixiang-linux/templates/configmaps.yaml weixiang-linux/templates/deployments.yaml weixiang-linux/templates/hpa.yaml weixiang-linux/templates/ingress.yaml weixiang-linux/templates/service.yaml weixiang-linux/.helmignore [root@master231 05-helm]# 3.跳过证书校验并配置认证信息 [root@master231 05-helm]# docker login -u admin -p 1 harbor250.weixiang.com WARNING! Using --password via the CLI is insecure. Use --password-stdin. WARNING! Your password will be stored unencrypted in /root/.docker/config.json. Configure a credential helper to remove this warning. 
See https://docs.docker.com/engine/reference/commandline/login/#credentials-store Login Succeeded [root@master231 05-helm]# [root@master231 05-helm]# helm push weixiang-linux-25.07.30.tgz oci://harbor250.weixiang.com/weixiang-helm --insecure-skip-tls-verify Pushed: harbor250.weixiang.com/weixiang-helm/weixiang-linux:25.07.30 Digest: sha256:176b8af7a95a1abb082ed89b5d1ce4aaba50e25c18006a6fdab1b05275095e7c [root@master231 05-helm]# 4.harbor仓库验证 略,见视频。 [root@master231 05-helm]# scp -p /usr/local/bin/helm 10.0.0.233:/usr/local/bin/ 5.拉取harbor仓库的Chart [root@worker233 ~]# helm pull oci://harbor250.weixiang.com/weixiang-helm/weixiang-linux --version 25.07.30 --insecure-skip-tls-verify Pulled: harbor250.weixiang.com/weixiang-helm/weixiang-linux:25.07.30 Digest: sha256:176b8af7a95a1abb082ed89b5d1ce4aaba50e25c18006a6fdab1b05275095e7c [root@worker233 ~]# [root@worker233 ~]# ll weixiang-linux-25.07.30.tgz -rw-r--r-- 1 root root 2538 Jul 30 16:13 weixiang-linux-25.07.30.tgz [root@worker233 ~]# [root@worker233 ~]# tar xf weixiang-linux-25.06.12.tgz [root@worker233 ~]# [root@worker233 ~]# apt -y install tree [root@worker233 ~]# [root@worker233 ~]# tree weixiang-linux weixiang-linux ├── Chart.yaml ├── templates │   ├── configmaps.yaml │   ├── deployments.yaml │   ├── _helpers.tpl │   ├── hpa.yaml │   ├── ingress.yaml │   ├── NOTES.txt │   └── service.yaml └── values.yaml 1 directory, 9 files [root@worker233 ~]# 6.打包注意事项 如果有一些文件不想要打包进去,则可以使用'.helmignore'文件进行忽略。达到优化的目的。
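温馨提示: 上面的deployments.yaml用的是template动作引用命名模板;实际工作中更推荐include函数,因为include的输出可以继续接管道函数(例如indent/nindent)控制缩进,而template不行;以下为改写示意(命名模板沿用上文_helpers.tpl中定义的weixiang-deploy,可用helm template渲染确认与template写法等价)。

```bash
cat > weixiang-linux/templates/deployments.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-xiuxian
spec:
  # include与template等价,但其结果还可以继续接管道;
  # 是否需要再接nindent,取决于_helpers.tpl中定义体自身是否已带缩进
  {{- include "weixiang-deploy" . }}
EOF

helm template myapp weixiang-linux --show-only templates/deployments.yaml
```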
7、kubeapps图形化管理Chart组件
bash
推荐阅读: https://github.com/vmware-tanzu/kubeapps/blob/main/site/content/docs/latest/tutorials/getting-started.md 官方的Chart【存在问题,需要去docker官方拉取数据】 https://github.com/bitnami/charts/tree/main/bitnami/kubeapps 温馨提示: 官方对于kubeapps的文档会去从docker官网拉取镜像,国内因素可能无法访问。 1.配置vpn代理 [root@master231 05-helm]# export http_proxy=http://10.0.0.1:7890 [root@master231 05-helm]# export https_proxy=http://10.0.0.1:7890 [root@master231 05-helm]# env | grep http -i https_proxy=http://10.0.0.1:7890 http_proxy=http://10.0.0.1:7890 [root@master231 05-helm]# 2.在线安装kubeapps [root@master231 05-helm]# helm install kubeapps --namespace kubeapps oci://registry-1.docker.io/bitnamicharts/kubeapps Pulled: registry-1.docker.io/bitnamicharts/kubeapps:18.0.1 Digest: sha256:3688c0296e86d23644519d23f68ac5554b473296349e610d87ac49be8d13ac97 WARNING: This chart is deprecated NAME: kubeapps LAST DEPLOYED: Wed Jul 30 16:25:02 2025 NAMESPACE: kubeapps STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: This Helm chart is deprecated The upstream project has been discontinued, therefore, this Helm chart will be deprecated as well. CHART NAME: kubeapps CHART VERSION: 18.0.1 APP VERSION: 2.12.1 Did you know there are enterprise versions of the Bitnami catalog? For enhanced secure software supply chain features, unlimited pulls from Docker, LTS support, or application customization, see Bitnami Premium or Tanzu Application Catalog. See https://www.arrow.com/globalecs/na/vendors/bitnami for more information.** Please be patient while the chart is being deployed ** Tip: Watch the deployment status using the command: kubectl get pods -w --namespace kubeapps Kubeapps can be accessed via port 80 on the following DNS name from within your cluster: kubeapps.kubeapps.svc.cluster.local To access Kubeapps from outside your K8s cluster, follow the steps below: 1. Get the Kubeapps URL by running these commands: echo "Kubeapps URL: http://127.0.0.1:8080" kubectl port-forward --namespace kubeapps service/kubeapps 8080:80 2. Open a browser and access Kubeapps using the obtained URL. WARNING: There are "resources" sections in the chart not set. Using "resourcesPreset" is not recommended for production. 
For production installations, please set the following values according to your workload needs: - apprepository.resources - dashboard.resources - frontend.resources - kubeappsapis.resources - postgresql.resources +info https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ [root@master231 05-helm]# 3.检查Pod状态 [root@master231 ~]# kubectl get pods -n kubeapps -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES apprepo-kubeapps-sync-bitnami-5298l-22xmk 1/1 Running 0 98s 10.100.140.105 worker233 <none> <none> apprepo-kubeapps-sync-bitnami-9ppdp-7768r 1/1 Running 0 98s 10.100.140.104 worker233 <none> <none> kubeapps-56ff8d8d86-vmvmd 1/1 Running 0 3m37s 10.100.140.102 worker233 <none> <none> kubeapps-56ff8d8d86-xwshm 1/1 Running 0 3m37s 10.100.203.189 worker232 <none> <none> kubeapps-internal-apprepository-controller-6dff9fd46d-v9tn4 1/1 Running 0 3m37s 10.100.203.188 worker232 <none> <none> kubeapps-internal-dashboard-7b6b84d96d-m5hsq 1/1 Running 0 3m37s 10.100.203.187 worker232 <none> <none> kubeapps-internal-dashboard-7b6b84d96d-vw4kd 1/1 Running 0 3m37s 10.100.140.101 worker233 <none> <none> kubeapps-internal-kubeappsapis-7dfb95f987-fv7hb 1/1 Running 0 3m37s 10.100.140.103 worker233 <none> <none> kubeapps-internal-kubeappsapis-7dfb95f987-h5vcr 1/1 Running 0 3m37s 10.100.203.186 worker232 <none> <none> kubeapps-postgresql-0 1/1 Running 0 3m37s 10.100.140.100 worker233 <none> <none> [root@master231 ~]# 5.开启端口转发 [root@master231 ~]# kubectl port-forward --namespace kubeapps service/kubeapps 8080:80 --address=0.0.0.0 Forwarding from 0.0.0.0:8080 -> 8080 6.创建sa并获取token [root@master231 Chart]# cat > sa-admin.yaml <<EOF apiVersion: v1 kind: ServiceAccount metadata: name: weixiang98 --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: cluster-weixiang98 roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: cluster-admin subjects: - kind: ServiceAccount name: weixiang98 namespace: default EOF [root@master231 Chart]# kubectl apply -f sa-admin.yaml serviceaccount/weixiang98 created clusterrolebinding.rbac.authorization.k8s.io/cluster-weixiang98 created [root@master231 Chart]# [root@master231 Chart]# kubectl get secrets `kubectl get sa weixiang98 -o jsonpath='{.secrets[0].name}'` -o jsonpath='{.data.token}' | base64 -d ;echo eyJhbGciOiJSUzI1NiIsImtpZCI6InU4RFAyREFKeWhLbGJKa20yUUN6d0lnck5GWWh1aV93OEtFMjIyM3k5blkifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkZWZhdWx0Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImxpbnV4OTYtdG9rZW4tdnc5bWYiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoibGludXg5NiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImZhNDQ2ZmMwLTE0MmItNGUwZC05MDhmLTliNDZhYTdkMjdiNiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpkZWZhdWx0OmxpbnV4OTYifQ.TCc3RsKqgF9Y7TzxJLnW1k4rpadLXcpBIruwC30eeTJmykcO9pgKXFUZNYy2-Hwf2IDL90m9n3EfT1e6om1dHiZ51arxr3JVIMOmv_A81Uj_kDNwwGcsqWuVzkibg_YrcRvcUY8pekT4MAo300_bMi0TI3QSZ8Z8_ADBn6wL7Mu1AgYctant5tFMGkvE8g2Sdt5UBabMMc37AUYWKXYx_kpDGBLJkWm8WzhOuc0WbvsFTUjiCLDMQotuJSmk_89zyirHWCLE_1SZe5mTrE-lbXbplYstQqsLdIvJVilfzVWqyj9kTOpDjapyMwkOeYjgy6aoUKxX5gvNb-Xc5254iQ [root@master231 Chart]# 7.登录kubeapps的WebUI http://10.0.0.231:8080/ 使用上一步的token进行登录即可.
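温馨提示: 上面通过读取ServiceAccount关联secret获取token的方式适用于K8S 1.23及更早版本;1.24+默认不再为ServiceAccount自动创建secret,可以改用"kubectl create token"签发临时token登录,示意如下。

```bash
# 为上文创建的ServiceAccount签发一个有时效的token(默认1小时,可用--duration调整)
kubectl create token weixiang98 --duration 24h
```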

32、Ingress

1、基于helm部署Ingress-nginx

(图: Ingress与IngressClass工作原理示意图)

bash
ingress是k8s内置的资源,可以理解为nginx.conf配置文件;ingressClass可以理解为nginx,只需要学习ingress怎么写,如图中所示,只需要定义域名到service的解析关系,由service自己找到后端的pods,然后由ingressClass把这个资源清单转换成nginx.conf,交由ingressClass内置的nginx去加载,ingressClass会自动监听80和443端口 - Ingress和Service的区别 1.Service实现的是四层代理,基于IP地址访问,不需要部署额外组件,原生支持; 2.Ingress实现的是七层代理,但需要部署附加组件(IngressClass)来解析Ingress资源清单; - 基于helm部署Ingress-nginx实战 1.Ingress-Nginx概述 Ingress-Nginx是K8S官方(社区)维护的一个Ingress Controller,而"nginx-ingress"则是Nginx官方编写的另一个Ingress Controller,二者并不是同一个项目。 注意,部署时要观察对比一下K8S和Ingress-Nginx对应的版本依赖关系哟。 github地址: https://github.com/kubernetes/ingress-nginx 安装文档: https://kubernetes.github.io/ingress-nginx/deploy/#installation-guide 官方文档中推荐了三种安装方式: - 使用"helm"安装; - 使用"kubectl apply"创建yaml资源清单的方式进行安装; - 使用第三方插件的方式进行安装; 2.添加第三方仓库 [root@master231 ingress-nginx]# helm repo add weixiang-ingress https://kubernetes.github.io/ingress-nginx "weixiang-ingress" has been added to your repositories [root@master231 ingress-nginx]# [root@master231 ingress-nginx]# helm repo list NAME URL weixiang-ingress https://kubernetes.github.io/ingress-nginx [root@master231 ingress-nginx]# 3.搜索Ingress-nginx的Chart [root@master231 ingress-nginx]# helm search repo ingress-nginx NAME CHART VERSION APP VERSION DESCRIPTION weixiang-ingress/ingress-nginx 4.13.0 1.13.0 Ingress controller for Kubernetes using NGINX a... [root@master231 ingress-nginx]# [root@master231 ingress-nginx]# helm search repo ingress-nginx -l NAME CHART VERSION APP VERSION DESCRIPTION weixiang-ingress/ingress-nginx 4.13.0 1.13.0 Ingress controller for Kubernetes using NGINX a... weixiang-ingress/ingress-nginx 4.12.4 1.12.4 Ingress controller for Kubernetes using NGINX a... weixiang-ingress/ingress-nginx 4.12.3 1.12.3 Ingress controller for Kubernetes using NGINX a... weixiang-ingress/ingress-nginx 4.12.2 1.12.2 Ingress controller for Kubernetes using NGINX a... ... 
[root@master231 ingress-nginx]# 4.下载指定的Chart [root@master231 ingress-nginx]# helm pull weixiang-ingress/ingress-nginx --version 4.2.5 [root@master231 ingress-nginx]# [root@master231 ingress-nginx]# ll total 52 drwxr-xr-x 2 root root 4096 Jun 10 11:40 ./ drwxr-xr-x 8 root root 4096 Jun 10 11:40 ../ -rw-r--r-- 1 root root 42132 Jun 10 11:40 ingress-nginx-4.2.5.tgz [root@master231 ingress-nginx]# [root@master231 ingress-nginx]# svip: [root@master231 ingress-nginx]# wget http://192.168.21.253/Resources/Kubernetes/Add-ons/ingress-nginx/ingress-nginx-4.2.5.tgz 5.解压软件包并修改配置参数 [root@master231 helm]# tar xf ingress-nginx-4.2.5.tgz [root@master231 helm]# [root@master231 helm]# sed -i '/registry:/s#registry.k8s.io#registry.cn-hangzhou.aliyuncs.com#g' ingress-nginx/values.yaml [root@master231 helm]# sed -i 's#ingress-nginx/controller#yinzhengjie-k8s/ingress-nginx#' ingress-nginx/values.yaml [root@master231 helm]# sed -i 's#ingress-nginx/kube-webhook-certgen#yinzhengjie-k8s/ingress-nginx#' ingress-nginx/values.yaml [root@master231 helm]# sed -i 's#v1.3.0#kube-webhook-certgen-v1.3.0#' ingress-nginx/values.yaml [root@master231 helm]# sed -ri '/digest:/s@^@#@' ingress-nginx/values.yaml [root@master231 helm]# sed -i '/hostNetwork:/s#false#true#' ingress-nginx/values.yaml [root@master231 helm]# sed -i '/dnsPolicy/s#ClusterFirst#ClusterFirstWithHostNet#' ingress-nginx/values.yaml [root@master231 helm]# sed -i '/kind/s#Deployment#DaemonSet#' ingress-nginx/values.yaml [root@master231 helm]# sed -i '/default:/s#false#true#' ingress-nginx/values.yaml 温馨提示: - 修改镜像为国内的镜像,否则无法下载海外镜像,除非你会FQ; - 如果使用我提供的镜像需要将digest注释掉,因为我的镜像是从海外同步过来的,被重新构建过,其digest不一致; - 建议大家使用宿主机网络效率最高,但是使用宿主机网络将来DNS解析策略会直接使用宿主机的解析; - 如果还想要继续使用K8S内部的svc名称解析,则需要将默认的"ClusterFirst"的DNS解析策略修改为"ClusterFirstWithHostNet"; - 建议将Deployment类型改为DaemonSet类型,可以确保在各个节点部署一个Pod,也可以修改"nodeSelector"字段让其调度到指定节点; - 如果仅有一个ingress controller,可以考虑将"ingressClassResource.default"设置为true,表示让其成为默认的ingress controller; 6.关闭 admissionWebhooks功能 [root@master231 ingress-nginx]# vim ingress-nginx/values.yaml ... admissionWebhooks: ... enabled: false # 关闭admissionWebhooks功能,避免后面使用Ingress时报错! 7.安装ingress-nginx [root@master231 ingress-nginx]# helm upgrade --install ingress-server ingress-nginx -n ingress-nginx --create-namespace Release "ingress-server" does not exist. Installing it now. NAME: ingress-server LAST DEPLOYED: Mon Jul 28 11:50:11 2025 NAMESPACE: ingress-nginx STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: The ingress-nginx controller has been installed. It may take a few minutes for the LoadBalancer IP to be available. 
You can watch the status by running 'kubectl --namespace ingress-nginx get services -o wide -w ingress-server-ingress-nginx-controller' An example Ingress that makes use of the controller: apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: example namespace: foo spec: ingressClassName: nginx rules: - host: www.example.com http: paths: - pathType: Prefix backend: service: name: exampleService port: number: 80 path: / # This section is only required if TLS is to be enabled for the Ingress tls: - hosts: - www.example.com secretName: example-tls If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided: apiVersion: v1 kind: Secret metadata: name: example-tls namespace: foo data: tls.crt: <base64 encoded cert> tls.key: <base64 encoded key> type: kubernetes.io/tls [root@master231 ingress-nginx]# 8.验证Ingress-nginx是否安装成功 [root@master231 ingress-nginx]# helm list -n ingress-nginx NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION ingress-server ingress-nginx 1 2025-07-28 11:50:11.629524463 +0800 CST deployed ingress-nginx-4.2.5 1.3.1 [root@master231 ingress-nginx]# [root@master231 ingress-nginx]# kubectl get ingressclass,deploy,svc,po -n ingress-nginx -o wide NAME CONTROLLER PARAMETERS AGE ingressclass.networking.k8s.io/nginx k8s.io/ingress-nginx <none> 61s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/ingress-server-ingress-nginx-controller LoadBalancer 10.200.35.157 10.0.0.150 80:32384/TCP,443:31918/TCP 61s app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-server,app.kubernetes.io/name=ingress-nginx NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/ingress-server-ingress-nginx-controller-58m2j 1/1 Running 0 61s 10.0.0.233 worker233 <none> <none> pod/ingress-server-ingress-nginx-controller-hfrrc 1/1 Running 0 61s 10.0.0.232 worker232 <none> <none> [root@master231 ingress-nginx]# [root@master231 ingress-nginx]# 温馨提示: 如果镜像拉取失败,可以使用我的仓库地址导入即可。 http://192.168.21.253/Resources/Kubernetes/Add-ons/ingress-nginx/images/ 9.查看ingress-nginx所监听的端口号 [root@worker232 ~]# ss -ntl | egrep "443|80" LISTEN 0 4096 0.0.0.0:80 0.0.0.0:* LISTEN 0 4096 0.0.0.0:80 0.0.0.0:* LISTEN 0 4096 0.0.0.0:443 0.0.0.0:* LISTEN 0 4096 0.0.0.0:443 0.0.0.0:* LISTEN 0 4096 [::]:80 [::]:* LISTEN 0 4096 [::]:80 [::]:* LISTEN 0 4096 [::]:443 [::]:* LISTEN 0 4096 [::]:443 [::]:* [root@worker232 ~]# [root@worker233 ~]# ss -ntl | egrep "443|80" LISTEN 0 4096 0.0.0.0:80 0.0.0.0:* LISTEN 0 4096 0.0.0.0:80 0.0.0.0:* LISTEN 0 4096 0.0.0.0:443 0.0.0.0:* LISTEN 0 4096 0.0.0.0:443 0.0.0.0:* LISTEN 0 4096 [::]:80 [::]:* LISTEN 0 4096 [::]:80 [::]:* LISTEN 0 4096 [::]:443 [::]:* LISTEN 0 4096 [::]:443 [::]:* [root@worker233 ~]# 10.测试验证 [root@master231 ingress-nginx]# curl http://10.0.0.150/ <html> <head><title>404 Not Found</title></head> <body> <center><h1>404 Not Found</h1></center> <hr><center>nginx</center> </body> </html> [root@master231 ingress-nginx]#
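温馨提示: 部署完成后可以顺便确认nginx这个IngressClass是否已被标记为默认(对应上文把"ingressClassResource.default"改为true),以及controller的启动参数;以下为示意(DaemonSet名称以上文环境中实际创建的资源名为准)。

```bash
# 1.查看IngressClass是否带有默认类注解(期望输出true)
kubectl get ingressclass nginx -o jsonpath='{.metadata.annotations.ingressclass\.kubernetes\.io/is-default-class}{"\n"}'

# 2.查看controller Pod的启动参数
kubectl -n ingress-nginx get ds ingress-server-ingress-nginx-controller -o jsonpath='{.spec.template.spec.containers[0].args}' ; echo
```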


6、ingress的映射http案例
bash
1.为什么要学习Ingress NodePort在暴露服务时,会监听一个NodePort端口,且多个服务无法使用同一个端口的情况。 因此我们说Service可以理解为四层代理。说白了,就是基于IP:PORT的方式进行代理。 假设"v1.weixiang.com"的服务需要监听80端口,而"v2.weixiang.com""v3.weixiang.com"同时也需要监听80端口,svc就很难实现。 这个时候,我们可以借助Ingress来实现此功能,可以将Ingress看做七层代理,底层依旧基于svc进行路由。 而Ingress在K8S是内置的资源,表示主机到svc的解析规则,但具体实现需要安装附加组件(对应的是IngressClass),比如ingress-nginx,traefik等。 IngressClass和Ingress的关系优点类似于: nginx和nginx.conf的关系。 2.准备环境 [root@master231 25-ingresses]# cat > 01-deploy-svc-xiuxian.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian-v1 spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 ports: - containerPort: 80 --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian-v2 spec: replicas: 3 selector: matchLabels: apps: v2 template: metadata: labels: apps: v2 spec: containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 ports: - containerPort: 80 --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian-v3 spec: replicas: 3 selector: matchLabels: apps: v3 template: metadata: labels: apps: v3 spec: containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 ports: - containerPort: 80 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxian-v1 spec: type: ClusterIP selector: apps: v1 ports: - port: 80 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxian-v2 spec: type: ClusterIP selector: apps: v2 ports: - port: 80 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxian-v3 spec: type: ClusterIP selector: apps: v3 ports: - port: 80 EOF [root@master231 25-ingresses]# kubectl apply -f 01-deploy-svc-xiuxian.yaml deployment.apps/deploy-xiuxian-v1 created deployment.apps/deploy-xiuxian-v2 created deployment.apps/deploy-xiuxian-v3 created service/svc-xiuxian-v1 created service/svc-xiuxian-v2 created service/svc-xiuxian-v3 created [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl get pods -o wide --show-labels NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS deploy-xiuxian-v1-6bc556784f-28h62 1/1 Running 0 33s 10.100.2.186 worker233 <none> <none> apps=v1,pod-template-hash=6bc556784f deploy-xiuxian-v1-6bc556784f-4wc6d 1/1 Running 0 33s 10.100.2.185 worker233 <none> <none> apps=v1,pod-template-hash=6bc556784f deploy-xiuxian-v1-6bc556784f-ntmlq 1/1 Running 0 33s 10.100.1.32 worker232 <none> <none> apps=v1,pod-template-hash=6bc556784f deploy-xiuxian-v2-64bb8c9785-ck9jd 1/1 Running 0 33s 10.100.2.189 worker233 <none> <none> apps=v2,pod-template-hash=64bb8c9785 deploy-xiuxian-v2-64bb8c9785-cq5s6 1/1 Running 0 33s 10.100.2.188 worker233 <none> <none> apps=v2,pod-template-hash=64bb8c9785 deploy-xiuxian-v2-64bb8c9785-jtxn8 1/1 Running 0 33s 10.100.1.34 worker232 <none> <none> apps=v2,pod-template-hash=64bb8c9785 deploy-xiuxian-v3-698c86cf85-dm72r 1/1 Running 0 33s 10.100.2.187 worker233 <none> <none> apps=v3,pod-template-hash=698c86cf85 deploy-xiuxian-v3-698c86cf85-jpz5j 1/1 Running 0 33s 10.100.1.35 worker232 <none> <none> apps=v3,pod-template-hash=698c86cf85 deploy-xiuxian-v3-698c86cf85-kzp8g 1/1 Running 0 33s 10.100.1.33 worker232 <none> <none> apps=v3,pod-template-hash=698c86cf85 [root@master231 25-ingresses]# 3.编写Ingress规则 [root@master231 25-ingresses]# cat > 02-ingress-xiuxian.yaml <<''EOF apiVersion: networking.k8s.io/v1 kind: Ingress # 定义资源类型为Ingress(用于管理外部访问集群服务的路由规则) metadata: name: ingress-xiuxian # 指定Ingress资源的名称 
spec: ingressClassName: nginx # 指定IngressClass的名称 rules: # 定义解析规则 - host: v1.weixiang.com # 定义的是主机名 http: # 配置http协议 paths: # 配置访问路径 - pathType: Prefix # 配置匹配用户访问的类型,表示前缀匹配 path: / # 指定匹配的路径(匹配以/开头的所有路径) backend: # 配置后端的调度svc service: # 配置svc的名称及端口 name: svc-xiuxian-v1 # Service名称 port: number: 80 # Service端口号 - host: v2.weixiang.com http: paths: - pathType: Prefix backend: service: name: svc-xiuxian-v2 port: number: 80 path: / - host: v3.weixiang.com http: paths: - pathType: Prefix backend: service: name: svc-xiuxian-v3 port: number: 80 path: / EOF 4.创建Ingress规则 [root@master231 25-ingresses]# kubectl apply -f 02-ingress-xiuxian.yaml ingress.networking.k8s.io/ingress-xiuxian created [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl get ingress NAME CLASS HOSTS ADDRESS PORTS AGE ingress-xiuxian nginx v1.weixiang.com,v2.weixiang.com,v3.weixiang.com 80 4s [root@master231 25-ingresses]# [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl describe ingress ingress-xiuxian Name: ingress-xiuxian Labels: <none> Namespace: default Address: Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>) Rules: Host Path Backends ---- ---- -------- v1.weixiang.com / svc-xiuxian-v1:80 (10.100.1.32:80,10.100.2.185:80,10.100.2.186:80) v2.weixiang.com / svc-xiuxian-v2:80 (10.100.1.34:80,10.100.2.188:80,10.100.2.189:80) v3.weixiang.com / svc-xiuxian-v3:80 (10.100.1.33:80,10.100.1.35:80,10.100.2.187:80) Annotations: <none> Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Sync 12s nginx-ingress-controller Scheduled for sync Normal Sync 12s nginx-ingress-controller Scheduled for sync [root@master231 25-ingresses]# 5.windows添加解析记录 10.0.0.232 v1.weixiang.com v2.weixiang.com 10.0.0.233 v3.weixiang.com 或者: 10.0.0.150 v1.weixiang.com v2.weixiang.com v3.weixiang.com


bash
6.访问Ingress-class服务 http://v1.weixiang.com/ http://v2.weixiang.com/ http://v3.weixiang.com/ 7.Ingress和Ingress class底层原理验证 [root@master231 25-ingresses]# kubectl get pods -o wide -n ingress-nginx NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ingress-server-ingress-nginx-controller-58m2j 1/1 Running 0 174m 10.0.0.233 worker233 <none> <none> ingress-server-ingress-nginx-controller-hfrrc 1/1 Running 0 174m 10.0.0.232 worker232 <none> <none> [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl -n ingress-nginx exec -it ingress-server-ingress-nginx-controller-58m2j -- bash bash-5.1$ grep weixiang.com /etc/nginx/nginx.conf ## start server v1.weixiang.com server_name v1.weixiang.com ; ## end server v1.weixiang.com ## start server v2.weixiang.com server_name v2.weixiang.com ; ## end server v2.weixiang.com ## start server v3.weixiang.com server_name v3.weixiang.com ; ## end server v3.weixiang.com bash-5.1$ exit [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl delete ingress ingress-xiuxian ingress.networking.k8s.io "ingress-xiuxian" deleted [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl -n ingress-nginx exec -it ingress-server-ingress-nginx-controller-58m2j -- bash bash-5.1$ grep weixiang.com /etc/nginx/nginx.conf bash-5.1$ [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl delete ingress ingress-xiuxian ingress.networking.k8s.io "ingress-xiuxian" deleted [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl -n ingress-nginx exec -it ingress-server-ingress-nginx-controller-58m2j -- bash bash-5.1$ grep weixiang.com /etc/nginx/nginx.conf bash-5.1$
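温馨提示: 如果不方便修改Windows的hosts文件,在Ingress规则存在期间也可以直接在Linux上用curl的"--resolve"(或-H指定Host头)把域名临时解析到LoadBalancer的IP来做验证;以下以上文的10.0.0.150为例。

```bash
# 1.--resolve临时把域名解析到指定IP,无需修改/etc/hosts
curl --resolve v1.weixiang.com:80:10.0.0.150 http://v1.weixiang.com/
curl --resolve v2.weixiang.com:80:10.0.0.150 http://v2.weixiang.com/
curl --resolve v3.weixiang.com:80:10.0.0.150 http://v3.weixiang.com/

# 2.或者直接访问IP并手动指定Host头
curl -H 'Host: v1.weixiang.com' http://10.0.0.150/
```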

7、Ingress实现uri多路径匹配案例
bash
Ingress实现uri多路径匹配案例 1.编写资源清单 [root@master231 25-ingresses]# cat > 03-ingress-xiuxian-uri.yaml <<''EOF apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: ingress-xiuxian-uri spec: ingressClassName: nginx rules: - host: xiuxian.weixiang.com http: paths: - pathType: Prefix path: /v1 # 需要提前创建/v1的目录 backend: service: name: svc-xiuxian-v1 port: number: 80 - pathType: Prefix path: /v2 backend: service: name: svc-xiuxian-v2 port: number: 80 - pathType: Prefix path: /v3 backend: service: name: svc-xiuxian-v3 port: number: 80 EOF 2.创建Ingress资源 [root@master231 25-ingresses]# kubectl apply -f 03-ingress-xiuxian-uri.yaml ingress.networking.k8s.io/ingress-xiuxian-uri created [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl get ingress NAME CLASS HOSTS ADDRESS PORTS AGE ingress-xiuxian-uri nginx xiuxian.weixiang.com 80 4s [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl describe ingress ingress-xiuxian-uri Name: ingress-xiuxian-uri Labels: <none> Namespace: default Address: Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>) Rules: Host Path Backends ---- ---- -------- xiuxian.weixiang.com /v1 svc-xiuxian-v1:80 (10.100.1.37:80,10.100.2.191:80,10.100.2.192:80) /v2 svc-xiuxian-v2:80 (10.100.1.36:80,10.100.2.190:80,10.100.2.193:80) /v3 svc-xiuxian-v3:80 (10.100.1.38:80,10.100.1.39:80,10.100.2.194:80) Annotations: <none> Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Sync 9s nginx-ingress-controller Scheduled for sync Normal Sync 9s nginx-ingress-controller Scheduled for sync [root@master231 25-ingresses]# 3.准备测试数据 3.1 修改Pod副本数量 [root@master231 25-ingresses]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-v1-6bc556784f-28cx8 1/1 Running 0 2m9s 10.100.2.191 worker233 <none> <none> deploy-xiuxian-v1-6bc556784f-djwzs 1/1 Running 0 2m9s 10.100.2.192 worker233 <none> <none> deploy-xiuxian-v1-6bc556784f-t4p84 1/1 Running 0 2m9s 10.100.1.37 worker232 <none> <none> deploy-xiuxian-v2-64bb8c9785-mht58 1/1 Running 0 2m9s 10.100.2.193 worker233 <none> <none> deploy-xiuxian-v2-64bb8c9785-s8xqh 1/1 Running 0 2m9s 10.100.2.190 worker233 <none> <none> deploy-xiuxian-v2-64bb8c9785-s8zfh 1/1 Running 0 2m9s 10.100.1.36 worker232 <none> <none> deploy-xiuxian-v3-698c86cf85-c8jpb 1/1 Running 0 2m9s 10.100.2.194 worker233 <none> <none> deploy-xiuxian-v3-698c86cf85-h42pf 1/1 Running 0 2m9s 10.100.1.38 worker232 <none> <none> deploy-xiuxian-v3-698c86cf85-jzm8r 1/1 Running 0 2m9s 10.100.1.39 worker232 <none> <none> [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl scale deployment deploy-xiuxian-v1 --replicas=1 deployment.apps/deploy-xiuxian-v1 scaled [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl scale deployment deploy-xiuxian-v2 --replicas=1 deployment.apps/deploy-xiuxian-v2 scaled [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl scale deployment deploy-xiuxian-v3 --replicas=1 deployment.apps/deploy-xiuxian-v3 scaled [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-v1-6bc556784f-t4p84 1/1 Running 0 2m35s 10.100.1.37 worker232 <none> <none> deploy-xiuxian-v2-64bb8c9785-s8zfh 1/1 Running 0 2m35s 10.100.1.36 worker232 <none> <none> deploy-xiuxian-v3-698c86cf85-c8jpb 1/1 Running 0 2m35s 10.100.2.194 worker233 <none> <none> [root@master231 25-ingresses]# 3.2 修改测试数据 
[root@master231 25-ingresses]# kubectl exec -it deploy-xiuxian-v1-6bc556784f-t4p84 -- sh / # mkdir /usr/share/nginx/html/v1 / # / # / # echo "<h1 style='color: red;'>www.weixiang.com</h1>" > /usr/share/nginx/html/v1/index.html / # / # cat /usr/share/nginx/html/v1/index.html <h1 style='color: red;'>www.weixiang.com</h1> / # [root@master231 25-ingresses]# kubectl exec -it deploy-xiuxian-v2-64bb8c9785-s8zfh -- sh / # mkdir /usr/share/nginx/html/v2 / # echo "<h1 style='color: green;'>www.weixiang.com</h1>" > /usr/share/nginx/html/v2/index.html / # cat /usr/share/nginx/html/v2/index.html <h1 style='color: green;'>www.weixiang.com</h1> / # [root@master231 25-ingresses]# kubectl exec -it deploy-xiuxian-v3-698c86cf85-c8jpb -- sh / # mkdir /usr/share/nginx/html/v3 / # / # echo "<h1 style='color: pink;'>www.weixiang.com</h1>" > /usr/share/nginx/html/v3/index.html / # / # cat /usr/share/nginx/html/v3/index.html <h1 style='color: pink;'>www.weixiang.com</h1> / # 3.3 测试验证 http://xiuxian.weixiang.com/v1/ http://xiuxian.weixiang.com/v2/ http://xiuxian.weixiang.com/v3/
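# 补充思路(非上文的原始做法,仅供参考): 上面的例子要求后端容器内必须提前创建 /v1、/v2、/v3 目录。
# 如果希望后端仍然只提供根路径 / 的内容,可以借助 ingress-nginx 的 rewrite-target 注解把 /v1/xxx 重写为 /xxx。
# 下面是一个最小示意,文件名 03-ingress-xiuxian-uri-rewrite.yaml 为假设,svc-xiuxian-v1 沿用上文,
# 正则捕获组写法参照 ingress-nginx 官方文档的 rewrite 示例;如与上面的 03-ingress-xiuxian-uri.yaml 同时存在可能产生路径冲突,二选一即可。
cat > 03-ingress-xiuxian-uri-rewrite.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-xiuxian-uri-rewrite
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2
spec:
  ingressClassName: nginx
  rules:
  - host: xiuxian.weixiang.com
    http:
      paths:
      - pathType: ImplementationSpecific
        path: /v1(/|$)(.*)        # 访问 /v1/index.html 时,转发给后端的路径被重写为 /index.html
        backend:
          service:
            name: svc-xiuxian-v1
            port:
              number: 80
EOF
kubectl apply -f 03-ingress-xiuxian-uri-rewrite.yaml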

image

| 特性 | 前一个配置 (02-ingress-xiuxian.yaml) | 当前配置 (03-ingress-xiuxian-uri.yaml) |
| --- | --- | --- |
| 域名数量 | 3个不同域名 (v1/v2/v3.weixiang.com) | 1个域名 (xiuxian.weixiang.com) |
| 路径要求 | 所有路径(/)都路由到同一服务 | 根据路径前缀(/v1、/v2、/v3)路由 |
| DNS配置 | 需要3个DNS记录 | 只需要1个DNS记录 |
| 适用场景 | 多租户/完全独立的应用版本 | 同一应用的版本化路由 |

8、Ingress的https映射案例
bash
- ingress的映射https案例 1.生成证书文件【如果有生产环境的证书此步骤可以跳过】 [root@master231 25-ingresses]# openssl req -x509 -nodes -days 3650 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=www.yinzhengjie.com" [root@master231 25-ingresses]# [root@master231 25-ingresses]# ll tls.* -rw-r--r-- 1 root root 1139 Jul 28 15:47 tls.crt -rw------- 1 root root 1704 Jul 28 15:47 tls.key [root@master231 25-ingresses]# 2.将证书文件以secrets形式存储 [root@master231 25-ingresses]# kubectl create secret tls ca-secret --cert=tls.crt --key=tls.key secret/ca-secret created [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl get secrets ca-secret NAME TYPE DATA AGE ca-secret kubernetes.io/tls 2 10s [root@master231 25-ingresses]# 3.部署测试服务 [root@master231 25-ingresses]# cat > 04-deploy-apple.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deployment-apple spec: replicas: 3 selector: matchLabels: apps: apple template: metadata: labels: apps: apple spec: containers: - name: apple image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:apple ports: - containerPort: 80 --- apiVersion: v1 kind: Service metadata: name: svc-apple spec: selector: apps: apple ports: - protocol: TCP port: 80 targetPort: 80 EOF [root@master231 25-ingresses]# kubectl apply -f 04-deploy-apple.yaml deployment.apps/deployment-apple created service/svc-apple created [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl get pods -l apps=apple -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deployment-apple-5496cd9b6c-dxjqc 1/1 Running 0 9s 10.100.2.195 worker233 <none> <none> deployment-apple-5496cd9b6c-qwzhz 1/1 Running 0 9s 10.100.2.196 worker233 <none> <none> deployment-apple-5496cd9b6c-wxf5v 1/1 Running 0 9s 10.100.1.40 worker232 <none> <none> [root@master231 25-ingresses]# 4.配置Ingress添加TLS证书 [root@master231 25-ingresses]# cat > 05-ingress-tls.yaml <<EOF apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: ingress-tls-https # 如果指定了"ingressClassName"参数,就不需要在这里重复声明啦。 # 如果你的K8S 1.22- 版本,则使用注解的方式进行传参即可。 #annotations: # kubernetes.io/ingress.class: "nginx" spec: # 指定Ingress Class,要求你的K8S 1.22+ ingressClassName: nginx rules: - host: www.yinzhengjie.com http: paths: - backend: service: name: svc-apple port: number: 80 path: / pathType: ImplementationSpecific # 配置https证书 tls: - hosts: - www.yinzhengjie.com secretName: ca-secret EOF [root@master231 25-ingresses]# kubectl apply -f 05-ingress-tls.yaml ingress.networking.k8s.io/ingress-tls-https created [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl get ingress ingress-tls-https NAME CLASS HOSTS ADDRESS PORTS AGE ingress-tls-https nginx www.yinzhengjie.com 80, 443 13s [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl describe ingress ingress-tls-https Name: ingress-tls-https Labels: <none> Namespace: default Address: Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>) TLS: ca-secret terminates www.yinzhengjie.com Rules: Host Path Backends ---- ---- -------- www.yinzhengjie.com / svc-apple:80 (10.100.1.40:80,10.100.2.195:80,10.100.2.196:80) Annotations: <none> Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Sync 16s nginx-ingress-controller Scheduled for sync Normal Sync 16s nginx-ingress-controller Scheduled for sync 5.windows添加解析 10.0.0.233 www.yinzhengjie.com 6.访问测试 https://www.yinzhengjie.com/ 温馨提示: 如果google浏览器自建证书不认可,可以用鼠标在空白处单击左键,而后输入:"thisisunsafe",就会自动跳转。 当然,如果不想打这个代码,可以使用火狐浏览器打开即可。 [root@master231 
~/count/25-ingresses]#kubectl get ingressclass,deploy,svc,po -n ingress-nginx -o wide NAME CONTROLLER PARAMETERS AGE ingressclass.networking.k8s.io/nginx k8s.io/ingress-nginx <none> 4h4m NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/ingress-server-ingress-nginx-controller LoadBalancer 10.200.78.73 10.0.0.150 80:16154/TCP,443:9803/TCP 4h4m app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-server,app.kubernetes.io/name=ingress-nginx NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/ingress-server-ingress-nginx-controller-np9v2 1/1 Running 0 4h4m 10.1.20.5 worker232 <none> <none> pod/ingress-server-ingress-nginx-controller-wbs9b 1/1 Running 0 4h4m 10.1.24.4 worker233 <none>
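# 补充验证思路(以下命令并非上文原始操作记录,仅供排查参考):
# 假设以上文 svc 显示的 EXTERNAL-IP 10.0.0.150 作为入口地址(若实际通过节点IP 10.0.0.233 访问,替换即可),
# 在 Linux 上不改 hosts 也能用 curl 的 --resolve 验证 https 转发,-k 用于跳过自签证书校验:
curl -kv --resolve www.yinzhengjie.com:443:10.0.0.150 https://www.yinzhengjie.com/
# 也可以用 openssl 确认 Ingress 返回的证书就是上文 ca-secret 中的自签证书(CN=www.yinzhengjie.com):
echo | openssl s_client -connect 10.0.0.150:443 -servername www.yinzhengjie.com 2>/dev/null | openssl x509 -noout -subject -dates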

image

9、基于helm部署traefik使用指南
bash
-参考链接: https://doc.traefik.io/traefik/getting-started/install-traefik/#use-the-helm-chart 1.添加仓库 [root@master231 traefik]# helm repo add traefik https://traefik.github.io/charts "traefik" has been added to your repositories [root@master231 traefik]# [root@master231 traefik]# helm repo list NAME URL weixiang-ingress https://kubernetes.github.io/ingress-nginx traefik https://traefik.github.io/charts [root@master231 traefik]# 2.更新仓库信息 [root@master231 traefik]# helm repo update Hang tight while we grab the latest from your chart repositories... ...Successfully got an update from the "weixiang-ingress" chart repository ...Successfully got an update from the "traefik" chart repository Update Complete. ⎈Happy Helming!⎈ [root@master231 traefik]# 3.安装traefik [root@master231 traefik]# helm search repo traefik NAME CHART VERSION APP VERSION DESCRIPTION azure/traefik 1.87.7 1.7.26 DEPRECATED - A Traefik based Kubernetes ingress... traefik/traefik 36.0.0 v3.4.1 A Traefik based Kubernetes ingress controller traefik/traefik-crds 1.8.1 A Traefik based Kubernetes ingress controller traefik/traefik-hub 4.2.0 v2.11.0 Traefik Hub Ingress Controller traefik/traefik-mesh 4.1.1 v1.4.8 Traefik Mesh - Simpler Service Mesh traefik/traefikee 4.2.3 v2.12.4 Traefik Enterprise is a unified cloud-native ne... traefik/maesh 2.1.2 v1.3.2 Maesh - Simpler Service Mesh [root@master231 traefik]# [root@master231 traefik]# helm pull traefik/traefik # 指定下载的版本 [root@master231 ~]# helm pull traefik/traefik --version 36.3.0 # 卸载traefik [root@master231 ~]# helm uninstall traefik-server -n default [root@master231 traefik]# [root@master231 traefik]# ll total 260 drwxr-xr-x 2 root root 4096 Jul 28 16:13 ./ drwxr-xr-x 4 root root 4096 Jul 28 16:11 ../ -rw-r--r-- 1 root root 257573 Jul 28 16:13 traefik-36.3.0.tgz [root@master231 traefik]# [root@master231 traefik]# tar xf traefik-36.3.0.tgz [root@master231 traefik]# [root@master231 traefik]# ll total 264 drwxr-xr-x 3 root root 4096 Jul 28 16:14 ./ drwxr-xr-x 4 root root 4096 Jul 28 16:11 ../ drwxr-xr-x 4 root root 4096 Jul 28 16:14 traefik/ -rw-r--r-- 1 root root 257573 Jul 28 16:13 traefik-36.3.0.tgz [root@master231 traefik]# [root@master231 traefik]# helm install traefik-server traefik NAME: traefik-server LAST DEPLOYED: Mon Jul 28 16:14:30 2025 NAMESPACE: default STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: traefik-server with docker.io/traefik:v3.4.3 has been deployed successfully on default namespace ! 
[root@master231 traefik]# [root@master231 traefik]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION traefik-server default 1 2025-07-28 16:14:30.08425946 +0800 CST deployed traefik-36.3.0 v3.4.3 [root@master231 traefik]# 4.查看服务 [root@master231 traefik]# kubectl get ingressclass,deploy,svc,po -o wide NAME CONTROLLER PARAMETERS AGE ingressclass.networking.k8s.io/nginx k8s.io/ingress-nginx <none> 4h29m ingressclass.networking.k8s.io/traefik-server traefik.io/ingress-controller <none> 5m5s NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/traefik-server 1/1 1 1 5m5s traefik-server docker.io/traefik:v3.4.3 app.kubernetes.io/instance=traefik-server-default,app.kubernetes.io/name=traefik NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 3d5h <none> service/traefik-server LoadBalancer 10.200.224.167 10.0.0.152 80:12742/TCP,443:19345/TCP 5m5s app.kubernetes.io/instance=traefik-server-default,app.kubernetes.io/name=traefik NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/traefik-server-56846685f9-x5ws4 1/1 Running 0 5m5s 10.100.2.197 worker233 <none> <none> [root@master231 traefik]# 温馨提示: 如果无法下载镜像,则需要你手动下载。 SVIP直接来我的仓库获取: http://192.168.21.253/Resources/Kubernetes/Add-ons/traefik/ 5.创建测试案例 [root@master231 25-ingresses]# kubectl apply -f 01-deploy-svc-xiuxian.yaml deployment.apps/deploy-xiuxian-v1 created deployment.apps/deploy-xiuxian-v2 created deployment.apps/deploy-xiuxian-v3 created service/svc-xiuxian-v1 created service/svc-xiuxian-v2 created service/svc-xiuxian-v3 created [root@master231 25-ingresses]# [root@master231 25-ingresses]# cat 02-ingress-xiuxian.yaml apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: ingress-xiuxian spec: # 指定IngressClass的名称,不使用之前的Ingress-nginx,而是使用Traefik。 # ingressClassName: nginx ingressClassName: traefik-server rules: - host: v1.weixiang.com http: paths: - pathType: Prefix path: / backend: service: name: svc-xiuxian-v1 port: number: 80 - host: v2.weixiang.com http: paths: - pathType: Prefix backend: service: name: svc-xiuxian-v2 port: number: 80 path: / - host: v3.weixiang.com http: paths: - pathType: Prefix backend: service: name: svc-xiuxian-v3 port: number: 80 path: / [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl apply -f 02-ingress-xiuxian.yaml ingress.networking.k8s.io/ingress-xiuxian created [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl describe -f 02-ingress-xiuxian.yaml Name: ingress-xiuxian Labels: <none> Namespace: default Address: 10.0.0.152 Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>) Rules: Host Path Backends ---- ---- -------- v1.weixiang.com / svc-xiuxian-v1:80 (10.100.1.43:80,10.100.2.198:80,10.100.2.199:80) v2.weixiang.com / svc-xiuxian-v2:80 (10.100.1.41:80,10.100.1.42:80,10.100.2.200:80) v3.weixiang.com / svc-xiuxian-v3:80 (10.100.1.44:80,10.100.2.201:80,10.100.2.202:80) Annotations: <none> Events: <none> [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl get ingress ingress-xiuxian NAME CLASS HOSTS ADDRESS PORTS AGE ingress-xiuxian traefik-server v1.weixiang.com,v2.weixiang.com,v3.weixiang.com 10.0.0.152 80 79s [root@master231 25-ingresses]# [root@master231 25-ingresses]# curl -H "HOST: v1.weixiang.com" http://43.139.47.66/ <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> 
<body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 25-ingresses]# [root@master231 25-ingresses]# curl -H "HOST: v2.weixiang.com" http://10.0.0.152/ <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v2</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: red">凡人修仙传 v2 </h1> <div> <img src="2.jpg"> <div> </body> </html> [root@master231 25-ingresses]# [root@master231 25-ingresses]# curl -H "HOST: v3.weixiang.com" http://10.0.0.152/ <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v3</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: pink">凡人修仙传 v3 </h1> <div> <img src="3.jpg"> <div> </body> </html> [root@master231 25-ingresses]# - 彩蛋: traefik开启Dashboard 1.开启Dashboard参数 [root@master231 helm]# vim traefik/values.yaml ... 187 ingressRoute: 188 dashboard: 189 # -- Create an IngressRoute for the dashboard 190 # enabled: false 191 enabled: true 2.重新安装traefik [root@master231 traefik]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION traefik-server default 1 2025-07-28 16:14:30.08425946 +0800 CST deployed traefik-36.3.0 v3.4.3 [root@master231 traefik]# [root@master231 traefik]# helm uninstall traefik-server release "traefik-server" uninstalled [root@master231 traefik]# [root@master231 traefik]# helm install traefik-server traefik NAME: traefik-server LAST DEPLOYED: Mon Jul 28 16:29:05 2025 NAMESPACE: default STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: traefik-server with docker.io/traefik:v3.4.3 has been deployed successfully on default namespace ! [root@master231 traefik]# [root@master231 traefik]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION traefik-server default 1 2025-07-28 16:29:05.025166045 +0800 CST deployed traefik-36.3.0 v3.4.3 [root@master231 traefik]# [root@master231 traefik]# 3.创建svc关联Dashboard [root@master231 traefik]# kubectl get pods -l app.kubernetes.io/name=traefik -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES traefik-server-74654b469d-zrx9c 1/1 Running 0 20s 10.100.203.156 worker232 <none> <none> [root@master231 traefik]# [root@master231 traefik]# cat > 01-svc-ing-traefik-dashboard.yaml <<''EOF apiVersion: v1 kind: Service metadata: name: jiege-traefik-dashboard spec: ports: - name: dashboard # 为这个端口起一个有意义的名字 port: 8080 # 这是 Service 自身暴露给集群内部的端口 selector: app.kubernetes.io/name: traefik # 在集群中寻找所有带有 `app.kubernetes.io/name: traefik` 这个标签的 Pod type: ClusterIP # 仅在集群内部可以访问的虚拟 IP --- apiVersion: networking.k8s.io/v1 kind: Ingress # Ingress 是一套规则 metadata: name: ingress-traefik-dashboard # 这个 Ingress 规则的唯一名称 spec: ingressClassName: traefik-server # 它明确指定了哪个 Ingress Controller 应该来处理这个 Ingress 规则 rules: # 定义了一系列的路由规则。 - host: traefik.weixiang.com # 只有当 HTTP 请求的 Host 头部是 "traefik.weixiang.com" 时,下面的规则才会生效。 http: paths: - pathType: Prefix path: / backend: service: name: jiege-traefik-dashboard port: number: 8080 EOF # 当一个访问 v1.weixiang.com、v2.weixiang.com 或 v3.weixiang.com 的请求到达集群入口时,Traefik Ingress Controller 会处理它。 [root@master231 ~/count/traefik]#kubectl get ingress NAME CLASS HOSTS ADDRESS PORTS AGE ingress-xiuxian traefik-server v1.weixiang.com,v2.weixiang.com,v3.weixiang.com 80 93m [root@master231 traefik]# kubectl apply -f 01-svc-ing-traefik-dashboard.yaml service/jiege-traefik-dashboard created ingress.networking.k8s.io/ingress-traefik-dashboard created [root@master231 
traefik]# [root@master231 traefik]# kubectl get ingress ingress-traefik-dashboard NAME CLASS HOSTS ADDRESS PORTS AGE ingress-traefik-dashboard traefik-server traefik.weixiang.com 10.0.0.152 80 7s [root@master231 traefik]# 4.windows添加解析 10.0.0.152 traefik.weixiang.com 5.访问traefik的WebUI http://traefik.weixiang.com/dashboard/#/ 访问:http://43.139.77.96:46158/dashboard/#/

image

32、Traefik架构

1.Traefik核心概念

image

bash
- Traefik架构
	1.Traefik核心概念
	参考链接: https://docs.traefik.cn/basics
	
	Traefik是一个边缘路由器,它会拦截外部的请求并根据逻辑规则选择不同的操作方式,这些规则决定着这些请求到底该如何处理。
	Traefik提供自动发现能力,会实时检测服务,并自动更新路由规则。
	
	如上图所示,请求会依次经过"entrypoints(入口点)"、"frontends(前端)"和"backends(后端)":
		- entrypoints(入口点):
			请求从入口点进入,顾名思义,它们是Traefik的网络入口(监听端口、SSL、流量重定向等)。
			Entrypoints定义接收请求的端口,以及监听TCP还是UDP。
		- frontends(前端):
			之后流量会导向一个匹配的前端。前端是定义入口点到后端之间路由的地方。
			路由是通过请求字段(Host, Path, Headers...)来定义的,它可以匹配或否定一个请求。
		- backends(后端):
			前端会把请求发送到后端。后端可以由一台服务器,或按负载均衡策略配置的多台服务器组成。
			最后,服务器将请求转发到对应私有网络中的微服务。
	
	这涉及到以下几个重要核心组件:
		- Providers
			Providers是基础组件,Traefik的配置发现是通过它来实现的,它可以是协调器、容器引擎、云提供商或者键值存储。
			Traefik通过查询Providers的API来获取路由的相关信息,一旦检测到变化,就会动态地更新路由。
		- Routers
			Routers主要用于分析请求,并负责将这些请求连接到对应的服务上去。
			在这个过程中,Routers还可以使用Middlewares来更新请求,比如:在把请求发到服务之前添加一些Headers。
		- Middlewares
			Middlewares用来修改请求,或者根据请求做出一些判断(authentication、rate limiting、headers等)。
			中间件被附加到路由上,是一种在请求发送到你的服务之前(或者在服务的响应发送到客户端之前)调整请求的手段。
		- Services
			Services负责配置如何到达最终处理请求的实际服务。
	
	- Traefik支持的路由规则
		Traefik提供了三种创建路由规则的方法:
			- 原生Ingress
				K8S原生支持的资源。
			- CRD IngressRoute
				部署Traefik时安装的自定义资源。
			- Gateway API
				基于Gateway API来实现暴露,是K8S官方对Ingress的一种扩展实现。
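# 为了把上面的概念对应到K8S资源,下面给出一个最小的IngressRoute骨架示意(仅为示意,
# 其中的域名demo.example.com、中间件my-middleware、Service名称my-svc均为假设,并非上文已创建的对象):
#   entryPoints   -> 入口点(Traefik监听的端口)
#   routes/match  -> Routers(按Host/Path等字段匹配请求)
#   middlewares   -> Middlewares(转发前调整请求)
#   services      -> Services(最终承接流量的后端)
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: demo-ingressroute
spec:
  entryPoints:
  - web
  routes:
  - match: Host(`demo.example.com`) && PathPrefix(`/`)
    kind: Rule
    middlewares:
    - name: my-middleware
    services:
    - name: my-svc
      port: 80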
2、基于IngressRoute暴露Traefik的Dashboard
bash
1.基于Ingress暴露Traefik的Dashboard 2.基于IngressRoute暴露Traefik的Dashboard 2.1 保留一个Ingress解析记录避免实验干扰【注意解析域名可能存在冲突问题】 [root@master231 25-ingresses]# kubectl get ingress NAME CLASS HOSTS ADDRESS PORTS AGE ingress-traefik-dashboard traefik-server traefik.weixiang.com 10.0.0.152 80 17h ingress-xiuxian traefik-server v1.weixiang.com,v2.weixiang.com,v3.weixiang.com 10.0.0.152 80 17h [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl delete ingress --all ingress.networking.k8s.io "ingress-traefik-dashboard" deleted ingress.networking.k8s.io "ingress-xiuxian" deleted [root@master231 25-ingresses]# [root@master231 25-ingresses]# kubectl get ingress No resources found in default namespace. [root@master231 25-ingresses]# 2.2 编写资源清单 [root@master231 26-IngressRoute]# cat > 01-ingressroutes-traefik-dashboard.yaml <<'EOF' apiVersion: traefik.io/v1alpha1 kind: IngressRoute metadata: name: ingressroute-traefik spec: # 指定入口点 entryPoints: - web # 配置路由 routes: # 匹配访问的请求为: "www.yinzhengjie.com/" - match: Host(`www.yinzhengjie.com`) && PathPrefix(`/`) kind: Rule # 调度到后端的svc services: # 指定svc - name: jiege-traefik-dashboard # 指定端口 port: 8080 EOF 2.3 创建资源 [root@master231 26-IngressRoute]# kubectl apply -f 01-ingressroutes-traefik-dashboard.yaml ingressroute.traefik.io/ingressroute-traefik created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get ingressroute NAME AGE ingressroute-traefik 27s traefik-server-dashboard 17h [root@master231 26-IngressRoute]# 2.4 在window添加Traefik server的解析记录: 10.0.0.152 www.yinzhengjie.com 2.5 访问测试 http://www.yinzhengjie.com/dashboard/#/ -
3、Traefik的测试环境部署
bash
Traefik的测试环境部署 1.K8S所有节点导入镜像 wget http://192.168.21.253/Resources/Kubernetes/Add-ons/traefik/case-demo/weixiang-traefik-whoamiudp-v0.2.tar.gz wget http://192.168.21.253/Resources/Kubernetes/Add-ons/traefik/case-demo/weixiang-traefik-whoamitcp-v0.3.tar.gz wget http://192.168.21.253/Resources/Kubernetes/Add-ons/traefik/case-demo/weixiang-traefik-whoami-v1.11.tar.gz for i in `ls -1 weixiang-traefik-whoami*` ; do docker load -i $i;done 2.编写资源清单 [root@master231 ingresses]# cat > 02-deploy-svc-whoami.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-whoami spec: replicas: 2 selector: matchLabels: apps: whoami template: metadata: labels: apps: whoami spec: containers: - name: whoami image: docker.io/traefik/whoami:v1.11 imagePullPolicy: IfNotPresent ports: - containerPort: 80 --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-whoamitcp spec: replicas: 2 selector: matchLabels: apps: whoamitcp template: metadata: labels: apps: whoamitcp spec: containers: - name: whoamitcp image: docker.io/traefik/whoamitcp:v0.3 imagePullPolicy: IfNotPresent ports: - containerPort: 8080 protocol: TCP # TCP协议,不指定默认也是TCP --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-whoamiudp spec: replicas: 2 selector: matchLabels: apps: whoamiudp template: metadata: labels: apps: whoamiudp spec: containers: - name: whoamiudp image: docker.io/traefik/whoamiudp:v0.2 imagePullPolicy: IfNotPresent ports: - containerPort: 8080 protocol: UDP --- apiVersion: v1 kind: Service metadata: name: svc-whoami spec: ports: - name: http port: 80 selector: apps: whoami --- apiVersion: v1 kind: Service metadata: name: svc-whoamitcp spec: ports: - name: tcp port: 8080 selector: apps: whoamitcp --- apiVersion: v1 kind: Service metadata: name: svc-whoamiudp spec: ports: - name: udp port: 8080 protocol: UDP selector: apps: whoamiudp EOF 3.创建资源 [root@master231 26-IngressRoute]# kubectl apply -f 02-deploy-svc-whoami.yaml deployment.apps/deploy-whoami created deployment.apps/deploy-whoamitcp created deployment.apps/deploy-whoamiudp created service/svc-whoami created service/svc-whoamitcp created service/svc-whoamiudp created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-whoami-5565cc959d-fjbdb 1/1 Running 0 55s 10.100.2.212 worker233 <none> <none> deploy-whoami-5565cc959d-lmrn9 1/1 Running 0 55s 10.100.1.53 worker232 <none> <none> deploy-whoamitcp-764f9bc89-shxxg 1/1 Running 0 55s 10.100.2.211 worker233 <none> <none> deploy-whoamitcp-764f9bc89-tflq6 1/1 Running 0 55s 10.100.1.52 worker232 <none> <none> deploy-whoamiudp-f7657cc98-gjhdb 1/1 Running 0 55s 10.100.1.54 worker232 <none> <none> deploy-whoamiudp-f7657cc98-kld2q 1/1 Running 0 55s 10.100.2.213 worker233 <none> <none> traefik-server-56846685f9-7spv2 1/1 Running 1 (52m ago) 17h 10.100.2.207 worker233 <none> <none> [root@master231 26-IngressRoute]# 4.创建IngressRoute路由规则代理whoami程序 [root@master231 26-IngressRoute]# cat > 03-ingressroutes-whoami.yaml <<'EOF' apiVersion: traefik.io/v1alpha1 # 你必须先在集群中安装了 Traefik 的 CRDs,才能创建这种类型的资源 kind: IngressRoute # 定义了要创建的资源类型。这里是 "IngressRoute",Traefik 的专属路由资源 metadata: name: ingressroute-whoami # 此 IngressRoute 规则的唯一名称 namespace: default # 默认命名空间 spec: # spec: 定义了该 IngressRoute 的具体路由规则 entryPoints: # 指定这条路由规则应该应用于 Traefik 的哪个“入口点”,"入口点" 是 Traefik 启动时定义的网络监听端口,比如 'web' 通常代表 HTTP 的 80 端口,'websecure' 代表 HTTPS 的 443 端口。 - web # 这行配置表示,只有通过 'web' 入口点(即 HTTP 80 端口)进入 Traefik 的流量,才会应用下面的路由规则,这里只能写Traefik 
页面定义的入口点 routes: # routes: 定义了一系列的具体路由匹配规则。可以有多条。 - match: Host(`whoami.yinzhengjie.com`) && PathPrefix(`/`) # 这条规则匹配所有发往 "whoami.yinzhengjie.com/" 的 HTTP 请求。 kind: Rule # 定义了这条 route 的类型,"Rule" 是标准类型,表示这是一个常规的转发规则 services: - name: svc-whoami # 会将流量转发到名为 "svc-whoami" 的 Service。这个 Service 必须和 IngressRoute 在同一个命名空间(这里是 `default`)。 port: 80 EOF [root@master231 26-IngressRoute]# kubectl apply -f 03-ingressroutes-whoami.yaml ingressroute.traefik.io/ingressroute-whoami created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get ingressroute ingressroute-whoami NAME AGE ingressroute-whoami 8s [root@master231 26-IngressRoute]# 5.测试验证 [root@master231 26-IngressRoute]# kubectl get svc traefik-server NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE traefik-server LoadBalancer 10.200.250.150 10.0.0.152 80:46037/TCP,443:12808/TCP 17h [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# curl -H 'HOST: whoami.yinzhengjie.com' http://10.0.0.152 Hostname: deploy-whoami-5565cc959d-fjbdb IP: 127.0.0.1 IP: 10.100.2.212 RemoteAddr: 10.100.2.207:47122 GET / HTTP/1.1 Host: whoami.yinzhengjie.com User-Agent: curl/7.81.0 Accept: */* Accept-Encoding: gzip X-Forwarded-For: 10.0.0.231 X-Forwarded-Host: whoami.yinzhengjie.com X-Forwarded-Port: 80 X-Forwarded-Proto: http X-Forwarded-Server: traefik-server-56846685f9-7spv2 X-Real-Ip: 10.0.0.231 [root@master231 26-IngressRoute]# 6.windows测试 需要添加解析哟~ 10.0.0.152 whoami.yinzhengjie.com

image

image

4、IngressRoute路由规则
| 特性 | IngressRoute (HTTP/S) | IngressRouteTCP (TCP) | IngressRouteUDP (UDP) |
| --- | --- | --- | --- |
| 协议层 | L7 (应用层) | L4 (传输层) | L4 (传输层) |
| 处理协议 | HTTP, HTTPS | 任意 TCP 协议 | 任意 UDP 协议 |
| 核心路由依据 | Host, Path, Headers, Method, Query | SNI (for TLS), Client IP (with PROXY protocol) | 仅入口点 (一个入口点 -> 一个服务) |
| TLS 处理 | 终端 (Termination), 透传 (Passthrough) | 透传 (Passthrough), 终端 (Termination) | 不适用 |
| 中间件 | 支持 (非常丰富) | 支持 (有限,如 InFlightConn) | 不支持 |
| 典型应用 | Web 服务, API 网关 | 数据库, 消息队列, SSH | DNS, 游戏服务器, 流媒体 |
| 比喻 | 能拆开信件看内容的秘书 | 只看包裹单号不看内容的快递员 | 只管把桶里的水倒进另一个桶的搬运工 |
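下面给出三种CRD在"路由匹配"这一核心差异上的最小骨架对比,仅为示意(svc-demo、入口点mytcp/myudp等名称均为假设),完整可运行的案例见后面各小节。
bash
apiVersion: traefik.io/v1alpha1
kind: IngressRoute              # L7: 按Host/Path等应用层字段匹配
metadata:
  name: demo-http
spec:
  entryPoints: ["web"]
  routes:
  - match: Host(`demo.example.com`) && PathPrefix(`/`)
    kind: Rule
    services:
    - name: svc-demo
      port: 80
---
apiVersion: traefik.io/v1alpha1
kind: IngressRouteTCP           # L4 TCP: 非TLS场景只能写HostSNI(`*`)
metadata:
  name: demo-tcp
spec:
  entryPoints: ["mytcp"]
  routes:
  - match: HostSNI(`*`)
    services:
    - name: svc-demo
      port: 3306
---
apiVersion: traefik.io/v1alpha1
kind: IngressRouteUDP           # L4 UDP: 没有match,一个入口点直接对应一组服务
metadata:
  name: demo-udp
spec:
  entryPoints: ["myudp"]
  routes:
  - services:
    - name: svc-demo
      port: 53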

5、配置https路由规则之whoami案例
bash
1 配置https路由规则注意事项 如果我们需要使用https来访问我们这个应用的话,就需要监听websecure这个入口点,也就是通过443端口来访问。 用HTTPS访问应用必然就需要证书,这个证书可以是自签证书,也可以是权威机构颁发的证书。 2 创建证书并封装为secret资源 2.1.使用openssl自建证书 [root@master231 26-IngressRoute]# openssl req -x509 -nodes -days 365 --newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=whoamissl.yinzhengjie.com" 2.查看生成的证书文件 [root@master231 26-IngressRoute]# ll tls.* -rw-r--r-- 1 root root 1155 Jul 29 10:33 tls.crt -rw------- 1 root root 1704 Jul 29 10:33 tls.key [root@master231 26-IngressRoute]# 3.将证书封装为secrets资源 [root@master231 26-IngressRoute]# kubectl create secret tls whoami-tls --cert=tls.crt --key=tls.key secret/whoami-tls created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get secrets whoami-tls NAME TYPE DATA AGE whoami-tls kubernetes.io/tls 2 8s [root@master231 26-IngressRoute]# 4.创建https应用路由规则并访问测试 [root@master231 26-IngressRoute]# cat > 04-ingressroutes-whoami-https.yaml <<'EOF' apiVersion: traefik.io/v1alpha1 kind: IngressRoute metadata: name: ingressroute-whoami-https namespace: default spec: tls: secretName: whoami-tls # 指定包含 TLS 证书和私钥的 Kubernetes Secret 的名称 # 【重要前提】: 你必须事先在同一个命名空间(`default`)中创建了一个名为 "whoami-tls" 的 Secret。 # 这个 Secret 必须包含 `tls.crt` (证书) 和 `tls.key` (私钥) 这两个键。 # Traefik 会自动从这个 Secret 中获取证书,用于与客户端(如浏览器)进行 TLS 握手。 entryPoints: - websecure # 'websecure' 通常代表 HTTPS 的 443 端口,确保了只有加密的 HTTPS 请求才会匹配到这条路由 routes: - match: Host(`whoamissl.yinzhengjie.com`) # 精确匹配 "whoamissl.yinzhengjie.com" 时,此规则才生效 kind: Rule services: - name: svc-whoami port: 80 EOF [root@master231 26-IngressRoute]# kubectl apply -f 04-ingressroutes-whoami-https.yaml ingressroute.traefik.io/ingressroute-whoami-https created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get ingressroute ingressroute-whoami-https NAME AGE ingressroute-whoami-https 8s [root@master231 26-IngressRoute]# 5.访问测试 https://whoamissl.yinzhengjie.com/ 温馨提示: 记得在window添加Traefik解析记录: 10.0.0.152 whoamissl.yinzhengjie.com
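# 补充验证思路(以下命令并非上文原始操作记录,仅供参考;假设Traefik的EXTERNAL-IP仍为上文的10.0.0.152):
# 关键是通过-servername/--resolve传入SNI,让Traefik的websecure入口点匹配到whoamissl.yinzhengjie.com对应的whoami-tls证书。
echo | openssl s_client -connect 10.0.0.152:443 -servername whoamissl.yinzhengjie.com 2>/dev/null | openssl x509 -noout -subject
curl -k --resolve whoamissl.yinzhengjie.com:443:10.0.0.152 https://whoamissl.yinzhengjie.com/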
6、配置tcp路由规则之MySQL案例
bash
1 配置tcp路由规则注意事项 SNI为服务名称标识,是TLS协议的扩展,因此,只有TLS路由才能使用该规则指定域名。 但是,非TLS路由必须带有"*"(所有域)的规则来声明每个非TLS请求都将由路由进行处理。 2 重新部署Traefik 2.1.修改values.yaml配置文件【目的是为了添加'entryPoints'】 [root@master231 ~]# cd /weixiang/manifests/24-ingressClass/traefik/ [root@master231 traefik]# [root@master231 traefik]# vim traefik/values.yaml ... ports: mysql: port: 3306 2.2.卸载Traefik服务 [root@master231 traefik]# helm uninstall traefik-server release "traefik-server" uninstalled [root@master231 traefik]# 2.3.安装Traefik服务 [root@master231 traefik]# helm install traefik-server traefik NAME: traefik-server LAST DEPLOYED: Tue Jul 29 10:39:47 2025 NAMESPACE: default STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: traefik-server with docker.io/traefik:v3.4.3 has been deployed successfully on default namespace ! [root@master231 traefik]# 3.部署MySQL [root@master231 26-IngressRoute]# cat > 05-deploy-mysql.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-mysql spec: replicas: 1 selector: matchLabels: apps: mysql template: metadata: labels: apps: mysql spec: containers: - image: harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle name: db ports: - containerPort: 3306 env: - name: MYSQL_DATABASE value: wordpress - name: MYSQL_ALLOW_EMPTY_PASSWORD value: "yes" - name: MYSQL_USER value: admin - name: MYSQL_PASSWORD value: yinzhengjie args: - --character-set-server=utf8 - --collation-server=utf8_bin - --default-authentication-plugin=mysql_native_password --- apiVersion: v1 kind: Service metadata: name: svc-mysql spec: ports: - port: 3306 selector: apps: mysql EOF [root@master231 26-IngressRoute]# kubectl apply -f 05-deploy-mysql.yaml deployment.apps/deploy-mysql created service/svc-mysql created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get pods -o wide -l apps=mysql NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-mysql-869f7867d8-9995c 1/1 Running 0 10s 10.100.2.215 worker233 <none> <none> [root@master231 26-IngressRoute]# 4 创建路由规则 [root@master231 ingresses]# cat > 06-IngressRouteTCP-mysql.yaml <<'EOF' apiVersion: traefik.io/v1alpha1 kind: IngressRouteTCP # 创建一条 Traefik 专属的 TCP 路由规则。它告诉 Traefik 如何处理 metadata: name: ingressroutetcp-mysql namespace: default spec: # 使用自己定义的entryPoint。 entryPoints: - mysql # 要写在traefik/values.yaml 配置文件里 routes: - match: HostSNI(`*`) # 这个规则基本等同于“所有进入 'mysql' 入口点的 TCP 连接都匹配”,所有域名都匹配 services: - name: svc-mysql # 指定目标 Kubernetes Service 的名称为 "svc-mysql" port: 3306 EOF [root@master231 26-IngressRoute]# kubectl apply -f 06-IngressRouteTCP-mysql.yaml ingressroutetcp.traefik.io/ingressroutetcp-mysql created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get ingressroutetcps ingressroutetcp-mysql NAME AGE ingressroutetcp-mysql 7s [root@master231 26-IngressRoute]# 5.查看Traefik的Dashboard验证 6.修改Traefik的svc暴露端口 [root@master231 ingresses]# kubectl edit svc traefik-server ... spec: ... ports: - name: mysql port: 3306 ... 
[root@master231 26-IngressRoute]# kubectl get svc traefik-server NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE traefik-server LoadBalancer 10.200.111.95 10.0.0.152 3306:29995/TCP,80:25776/TCP,443:11156/TCP 6m50s [root@master231 26-IngressRoute]# 6.客户端访问测试 [root@harbor250.weixiang.com ~]# apt -y install mysql-client-core-8.0 [root@harbor250.weixiang.com ~]# echo 10.0.0.152 mysql.yinzhengjie.com >> /etc/hosts [root@harbor250.weixiang.com ~]# tail -1 /etc/hosts 10.0.0.152 mysql.yinzhengjie.com [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# mysql -h mysql.yinzhengjie.com Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 9 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2025, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> SHOW DATABASES; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | | wordpress | +--------------------+ 5 rows in set (0.01 sec) mysql> USE wordpress Database changed mysql> SHOW TABLES; Empty set (0.00 sec) mysql> mysql> CREATE TABLE student(id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(255) NOT NULL, hobby VARCHAR(255) NOT NULL); Query OK, 0 rows affected (0.01 sec) mysql> mysql> INSERT INTO student(name,hobby) VALUES ('YuWenZhi','Sleep'); Query OK, 1 row affected (0.01 sec) mysql> 7.服务端测试验证 [root@master231 26-IngressRoute]# kubectl exec -it deploy-mysql-869f7867d8-9995c -- mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 10 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> mysql> SHOW DATABASES; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | | wordpress | +--------------------+ 5 rows in set (0.00 sec) mysql> mysql> USE wordpress Database changed mysql> mysql> SHOW TABLES; +---------------------+ | Tables_in_wordpress | +---------------------+ | student | +---------------------+ 1 row in set (0.00 sec) mysql> SELECT * FROM student; Empty set (0.01 sec) mysql> mysql> SELECT * FROM student; +----+----------+-------+ | id | name | hobby | +----+----------+-------+ | 1 | YuWenZhi | Sleep | +----+----------+-------+ 1 row in set (0.00 sec) mysql>

image

bash
梳理逻辑
	1.客户端发起请求: 客户端访问 mysql.yinzhengjie.com,通过 /etc/hosts 解析到 IP 10.0.0.152。mysql 客户端向 10.0.0.152:3306 发起 TCP 连接。
	2.LoadBalancer Service 接收: 10.0.0.152 是 traefik-server 这个 LoadBalancer Service 对外暴露的 IP。它接收到 3306 端口的流量。
	3.NodePort 转发: Service 将这个流量转发到集群中某个节点(Node)上的 29995 端口(NodePort)。
	4.Kube-proxy 路由: 运行在节点上的 kube-proxy 组件,将节点 29995 端口的流量,路由到 Traefik Pod 内部监听的 3306 端口(在 values.yaml 文件中配置)。
	5.Traefik EntryPoint 捕获: Traefik Pod 内部的 3306 端口是一个被命名为 mysql 的入口点(EntryPoint),它成功接收到这个 TCP 连接。
	6.Traefik 路由匹配: Traefik 查找所有 IngressRouteTCP 规则,发现 ingressroutetcp-mysql 的 entryPoints 字段正好是 mysql,匹配成功。
	7.Traefik 转发到后端 Service: 根据规则,Traefik 将这个连接转发给 svc-mysql 这个 Service。
	8.后端 Service 转发到 Pod: svc-mysql 再根据自己的 selector,将连接最终转发给后端的 MySQL Pod。
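# 按上述链路逐跳验证的命令示意(仅为排查思路,并非上文原始操作记录,对象名称沿用上文):
kubectl get svc traefik-server -o wide                     # 第2~3步: 确认LoadBalancer的EXTERNAL-IP及3306对应的NodePort
kubectl get ingressroutetcp ingressroutetcp-mysql -o yaml  # 第5~6步: 确认该路由规则绑定的entryPoints为mysql
kubectl get endpoints svc-mysql                            # 第7~8步: 确认svc-mysql背后实际挂载的MySQL Pod地址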
7、配置tcp路由规则之Redis案例
bash
1 部署Redis [root@master231 26-IngressRoute]# cat > 07-deploy-redis.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-redis spec: replicas: 1 selector: matchLabels: apps: redis template: metadata: labels: apps: redis spec: containers: - image: harbor250.weixiang.com/weixiang-db/redis:6.0.5 name: db ports: - containerPort: 6379 --- apiVersion: v1 kind: Service metadata: name: svc-redis spec: ports: - port: 6379 selector: apps: redis EOF [root@master231 26-IngressRoute]# kubectl apply -f 07-deploy-redis.yaml deployment.apps/deploy-redis created service/svc-redis created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get pods -o wide -l apps=redis NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-redis-5dd745fbb9-pqj47 1/1 Running 0 11s 10.100.1.55 worker232 <none> <none> [root@master231 26-IngressRoute]# 2 重新部署Traefik 2.1.修改values.yaml配置文件【目的是为了添加'entryPoints'】 [root@master231 ~]# cd /weixiang/manifests/24-ingressClass/traefik/ [root@master231 traefik]# [root@master231 traefik]# vim traefik/values.yaml [root@master231 traefik]# ... ports: redis: port: 6379 2.2.卸载Traefik服务 [root@master231 traefik]# helm uninstall traefik-server release "traefik-server" uninstalled [root@master231 traefik]# 2.3.安装Traefik服务 [root@master231 traefik]# helm install traefik-server traefik NAME: traefik-server LAST DEPLOYED: Tue Jul 29 10:54:50 2025 NAMESPACE: default STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: traefik-server with docker.io/traefik:v3.4.3 has been deployed successfully on default namespace ! [root@master231 traefik]# 3 创建路由规则 [root@master231 traefik]# cat > 08-IngressRouteTCP-redis.yaml <<'EOF' apiVersion: traefik.io/v1alpha1 kind: IngressRouteTCP metadata: name: ingressroutetcp-redis namespace: default spec: # 使用自己定义的entryPoint。 entryPoints: - redis routes: - match: HostSNI(`*`) services: - name: svc-redis port: 6379 EOF [root@master231 26-IngressRoute]# kubectl apply -f 08-IngressRouteTCP-redis.yaml ingressroutetcp.traefik.io/ingressroutetcp-redis created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl describe ingressroutetcp ingressroutetcp-redis Name: ingressroutetcp-redis Namespace: default Labels: <none> Annotations: <none> API Version: traefik.io/v1alpha1 Kind: IngressRouteTCP Metadata: Creation Timestamp: 2025-07-29T02:56:02Z Generation: 1 Managed Fields: API Version: traefik.io/v1alpha1 Fields Type: FieldsV1 fieldsV1: f:metadata: f:annotations: .: f:kubectl.kubernetes.io/last-applied-configuration: f:spec: .: f:entryPoints: f:routes: Manager: kubectl-client-side-apply Operation: Update Time: 2025-07-29T02:56:02Z Resource Version: 509532 UID: ef60ad3f-feea-4900-af0a-7e0189b4d167 Spec: Entry Points: redis Routes: Match: HostSNI(`*`) Services: Name: svc-redis Port: 6379 Events: <none> [root@master231 26-IngressRoute]# 4.访问traefki的WebUI 略,见视频。 5.k8s修改Traefik的svc解析记录 [root@master231 26-IngressRoute]# kubectl edit svc traefik-server ... spec: ... ports: - name: redis port: 6379 ... 
[root@master231 26-IngressRoute]# kubectl get svc traefik-server NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE traefik-server LoadBalancer 10.200.86.241 10.0.0.152 6379:47944/TCP,80:43758/TCP,443:12991/TCP 3m11s [root@master231 26-IngressRoute]# 6.客户端测试 6.1 安装redis客户端工具 [root@harbor250.weixiang.com ~]# apt -y install redis-server 6.2 添加解析记录 [root@harbor250.weixiang.com ~]# echo 10.0.0.152 redis.yinzhengjie.com >> /etc/hosts [root@harbor250.weixiang.com ~]# tail -1 /etc/hosts 10.0.0.152 redis.yinzhengjie.com [root@harbor250.weixiang.com ~]# 6.3 写入测试数据 [root@harbor250.weixiang.com ~]# redis-cli --raw -n 5 -h redis.yinzhengjie.com redis.yinzhengjie.com:6379[5]> KEYS * redis.yinzhengjie.com:6379[5]> set school weixiang OK redis.yinzhengjie.com:6379[5]> set class weixiang98 OK redis.yinzhengjie.com:6379[5]> KEYS * school class redis.yinzhengjie.com:6379[5]> 7.服务端查看数据并验证 [root@master231 26-IngressRoute]# kubectl get pods -o wide -l apps=redis NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-redis-5dd745fbb9-pqj47 1/1 Running 0 5m38s 10.100.1.55 worker232 <none> <none> [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl exec -it deploy-redis-5dd745fbb9-pqj47 -- redis-cli 127.0.0.1:6379> KEYS * (empty array) 127.0.0.1:6379> 127.0.0.1:6379> SELECT 5 OK 127.0.0.1:6379[5]> 127.0.0.1:6379[5]> KEYS * 1) "school" 2) "class" 127.0.0.1:6379[5]> get class "weixiang98" 127.0.0.1:6379[5]> 127.0.0.1:6379[5]> get school "weixiang" 127.0.0.1:6379[5]>
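# 补充排查思路(命令仅为示意,并非上文原始操作记录): 如果客户端连不上,可以先确认redis这个
# entryPoint对应的端口是否已经出现在traefik-server这个svc的端口列表里,再继续排查路由规则:
kubectl get svc traefik-server -o jsonpath='{range .spec.ports[*]}{.name}{"\t"}{.port}{"\n"}{end}'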
8、使用IngressRouteTCP配置whoamitcp应用的代理
bash
[root@master231 26-IngressRoute]# kubectl get pods -o wide -l apps=whoamitcp NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-whoamitcp-764f9bc89-shxxg 1/1 Running 0 68m 10.100.2.211 worker233 <none> <none> deploy-whoamitcp-764f9bc89-tflq6 1/1 Running 0 68m 10.100.1.52 worker232 <none> <none> [root@master231 26-IngressRoute]# - 配置TCP路由规则之whoamitcp 1 重新部署Traefik 1.1 修改values.yaml配置文件【目的是为了添加'entryPoints'】 [root@master231 ~]# cd /weixiang/manifests/24-ingressClass/traefik/ [root@master231 traefik]# [root@master241 traefik]# vim traefik/values.yaml ... ports: tcpcase: port: 8082 # 注意哈,端口不能和其他的entryPoints端口重复,否则会启动失败。 protocol: TCP ... 1.2 卸载Traefik服务 [root@master231 traefik]# helm uninstall traefik-server release "traefik-server" uninstalled [root@master231 traefik]# [root@master231 traefik]# 1.3 安装Traefik服务 [root@master231 traefik]# helm install traefik-server traefik NAME: traefik-server LAST DEPLOYED: Tue Jul 29 11:36:46 2025 NAMESPACE: default STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: traefik-server with docker.io/traefik:v3.4.3 has been deployed successfully on default namespace ! [root@master231 traefik]# 1.4 查看Traefik的Dashboard验证 略,见视频。 2 创建路由规则 [root@master231 26-IngressRoute]# cat > 09-IngressRouteTCP-whoamitcp.yaml <<'EOF' apiVersion: traefik.io/v1alpha1 kind: IngressRouteTCP metadata: name: ingressroutetcp-whoamitcp namespace: default spec: entryPoints: - tcpcase routes: - match: HostSNI(`*`) services: - name: svc-whoamitcp port: 8080 EOF [root@master231 26-IngressRoute]# kubectl apply -f 09-IngressRouteTCP-whoamitcp.yaml ingressroutetcp.traefik.io/ingressroutetcp-whoamitcp created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get ingressroutetcps ingressroutetcp-whoamitcp NAME AGE ingressroutetcp-whoamitcp 10s [root@master231 26-IngressRoute]# 3.测试验证 1.1.查看whoamitcp的svc的ClusterIP [root@master231 26-IngressRoute]# kubectl edit svc traefik-server ... ports: - name: tcpcase nodePort: 18614 port: 8082 1.2.安装socat测试工具 [root@harbor250.weixiang.com ~]# apt -y install socat 1.3.访问测试 [root@harbor250.weixiang.com ~]# echo 10.0.0.152 tcpcase.yinzhengjie.com >> /etc/hosts [root@harbor250.weixiang.com ~]# tail -1 /etc/hosts 10.0.0.152 tcpcase.yinzhengjie.com [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# echo "WHO" | nc tcpcase.yinzhengjie.com 8082 Hostname: deploy-whoamitcp-764f9bc89-shxxg IP: 127.0.0.1 IP: 10.100.2.211 ^C [root@harbor250.weixiang.com ~]# echo "WHO" | nc tcpcase.yinzhengjie.com 8082 Hostname: deploy-whoamitcp-764f9bc89-tflq6 IP: 127.0.0.1 IP: 10.100.1.52
9、配置UDP路由规则之whoamiudp
bash
1 重新部署Traefik 1.1 修改values.yaml配置文件【目的是为了添加'entryPoints'】 [root@master231 ~]# cd /weixiang/manifests/24-ingressClass/traefik/ [root@master231 traefik]# [root@master231 traefik]# vim traefik/values.yaml ... ports: udpcase: port: 8081 protocol: UDP ... 1.2.卸载Traefik服务 [root@master231 traefik]# helm uninstall traefik-server release "traefik-server" uninstalled [root@master231 traefik]# 1.3.安装Traefik服务 [root@master231 traefik]# helm install traefik-server traefik NAME: traefik-server LAST DEPLOYED: Tue Jul 29 11:45:06 2025 NAMESPACE: default STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: traefik-server with docker.io/traefik:v3.4.3 has been deployed successfully on default namespace ! [root@master231 traefik]# 2.创建路由规则 [root@master231 26-IngressRoute]# cat > 10-IngressRouteUDP-whoamiudp.yaml <<EOF apiVersion: traefik.io/v1alpha1 kind: IngressRouteUDP metadata: name: ingressroutetcp-whoamiudp namespace: default spec: entryPoints: - udpcase routes: - services: - name: svc-whoamiudp port: 8080 EOF [root@master231 26-IngressRoute]# kubectl apply -f 10-IngressRouteUDP-whoamiudp.yaml ingressrouteudp.traefik.io/ingressroutetcp-whoamiudp created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get ingressrouteudps ingressroutetcp-whoamiudp NAME AGE ingressroutetcp-whoamiudp 10s [root@master231 26-IngressRoute]# 3.测试验证 3.1.查看whoamiudp的svc的ClusterIP [root@master231 26-IngressRoute]# kubectl get svc svc-whoamiudp -o wide NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR svc-whoamiudp ClusterIP 10.200.69.93 <none> 8080/UDP 113m apps=whoamiudp [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get pods -o wide -l apps=whoamiudp NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-whoamiudp-f7657cc98-gjhdb 1/1 Running 0 113m 10.100.1.54 worker232 <none> <none> deploy-whoamiudp-f7657cc98-kld2q 1/1 Running 0 113m 10.100.2.213 worker233 <none> <none> [root@master231 26-IngressRoute]# 2.安装socat测试工具 [root@worker232 ~]# apt -y install socat 3.3访问测试 [root@worker232 ~]# echo "WHO" | socat - udp4-datagram:10.200.69.93:8080 # 此处是SVC的地址和端口 Hostname: deploy-whoamiudp-f7657cc98-kld2q IP: 127.0.0.1 IP: 10.100.2.213 [root@worker232 ~]# [root@worker232 ~]# echo "WHO" | socat - udp4-datagram:10.200.69.93:8080 # 此处是SVC的地址和端口 Hostname: deploy-whoamiudp-f7657cc98-gjhdb IP: 127.0.0.1 IP: 10.100.1.54 [root@worker232 ~]# [root@worker232 ~]# echo "https://www.cnblogs.com/yinzhengjie" | socat - udp4-datagram:10.200.85.25:8080 # 如果输入的部署WHO,则会将你输入的返回给你。 Received: https://www.cnblogs.com/yinzhengjie [root@worker232 ~]# 3.4 修改svc的端口映射 [root@master231 26-IngressRoute]# cat /tmp/xixi.yaml apiVersion: v1 kind: Service metadata: name: traefik-server namespace: default spec: ports: - name: udpcase port: 8081 protocol: UDP targetPort: 8081 - name: web nodePort: 10824 port: 80 protocol: TCP targetPort: web - name: websecure nodePort: 20453 port: 443 protocol: TCP targetPort: websecure selector: app.kubernetes.io/instance: traefik-server-default app.kubernetes.io/name: traefik sessionAffinity: None # type: LoadBalancer type: NodePort [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl apply -f /tmp/xixi.yaml [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get svc traefik-server NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE traefik-server NodePort 10.200.11.72 <none> 8081:41741/UDP,80:10824/TCP,443:20453/TCP 8m57s [root@master231 26-IngressRoute]# 温馨提示: 
此处一定要将svc的类型由LoadBalancer切换为NodePort,因为LoadBalancer不支持多端口暴露不同的协议。 但是改成NodePort再改回LoadBalancer又是可以的,在k8s 1.23.17测试有效。 3.5 集群外部测试 [root@harbor250.weixiang.com ~]# echo 10.0.0.152 whoamiudp.weixiang.com >> /etc/hosts [root@harbor250.weixiang.com ~]# tail -1 /etc/hosts 10.0.0.152 whoamiudp.weixiang.com [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# echo "WHO" | socat - udp4-datagram:10.0.0.233:41741 Hostname: deploy-whoamiudp-f7657cc98-gjhdb IP: 127.0.0.1 IP: 10.100.1.54 [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# echo "WHO" | socat - udp4-datagram:10.0.0.233:41741 Hostname: deploy-whoamiudp-f7657cc98-kld2q IP: 127.0.0.1 IP: 10.100.2.213 [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# echo "https://www.cnblogs.com/yinzhengjie" | socat - udp4-datagram:10.0.0.231:41741 Received: https://www.cnblogs.com/yinzhengjie [root@harbor250.weixiang.com ~]#
10、Traefik中间件实战案例

image

1.中间件在Traefik的位置
bash
1.中间件在Traefik的位置
	连接到路由器(Router)的中间件,是在将请求发送到你的服务之前(或在将服务的响应发送到客户端之前)调整请求的一种手段。
	Traefik中有多个可用的中间件,有些可以修改请求、请求头,有些负责重定向,有些添加身份验证等等。
	使用相同协议的中间件可以组合成链,以适应每种情况。官方支持HTTP和TCP两种中间件。
	参考链接:
		https://doc.traefik.io/traefik/middlewares/overview/
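# 为说明"中间件可以组合成链",下面给出一个最小示意(仅为示意,其中的Middleware名称strip-api、add-header、
# 域名demo.example.com以及Service名称svc-demo均为假设,并非上文已创建的对象):
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: strip-api
spec:
  stripPrefix:
    prefixes:
    - /api                        # 转发到后端之前去掉/api前缀
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: add-header
spec:
  headers:
    customRequestHeaders:
      X-School: "weixiang"        # 转发到后端之前追加一个请求头
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: demo-chain
spec:
  entryPoints:
  - web
  routes:
  - match: Host(`demo.example.com`) && PathPrefix(`/api`)
    kind: Rule
    middlewares:                  # 按声明顺序依次生效,组合成中间件链
    - name: strip-api
    - name: add-header
    services:
    - name: svc-demo
      port: 80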
2、http中间件之ipallowlist实战案例
bash
‍ 2.http中间件之ipallowlist中间件实战案例 2.1 部署测试服务 [root@master231 26-IngressRoute]# cat > 11-deploy-xiuxian.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 1 selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 name: c1 ports: - containerPort: 80 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxian spec: ports: - port: 80 selector: apps: xiuxian EOF 2.2 测试验证 [root@master231 26-IngressRoute]# kubectl apply -f 11-deploy-xiuxian.yaml deployment.apps/deploy-xiuxian created service/svc-xiuxian created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get pods -o wide -l apps=xiuxian NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-xiuxian-565ff98ccb-zlqn7 1/1 Running 0 10s 10.100.2.220 worker233 <none> <none> [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get svc svc-xiuxian NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc-xiuxian ClusterIP 10.200.220.145 <none> 80/TCP 27s [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# curl 10.200.220.145 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 26-IngressRoute]# 2.3 ipWhiteList|ipallowlist应用案例 在实际工作中,有一些URL并不希望对外暴露,比如prometheus、grafana等,我们就可以通过白名单IP来过到要求,可以使用Traefix中的ipWhiteList中间件来完成。 ipWhiteList组件可以限制客户端的IP地址是否可以进行访问,如果在白名单则允许访问,目前官方支持http和tcp两种配置方式。 值得注意的是,ipwhitelist官方已经弃用,推荐使用ipallowlist来替代。 [root@master231 26-IngressRoute]# cat > 12-ipAllowList-IngressRoute.yaml <<'EOF' apiVersion: traefik.io/v1alpha1 kind: Middleware # 中间件是 Traefik 的一个核心概念,用于在请求到达最终服务前对其进行处理。 metadata: name: xiuxian-ipallowlist namespace: default spec: ipAllowList: # 指定了这个中间件的功能是“IP地址白名单”。 sourceRange: - 127.0.0.1 # 允许来自本机的回环地址的请求。 - 10.0.0.0/24 # 允许所有来源 IP 在 10.0.0.0/24 网段内的请求。 --- apiVersion: traefik.io/v1alpha1 kind: IngressRoute metadata: name: ingressroute-xiuxian spec: entryPoints: - web routes: - match: Host(`middleware.yinzhengjie.com`) && PathPrefix(`/`) kind: Rule services: - name: svc-xiuxian port: 80 middlewares: # middlewares 定义了在将请求转发到后端服务之前,需要应用的中间件列表。 - name: xiuxian-ipallowlist # # 这个名字必须与上面定义的 Middleware 资源的名称相匹配 namespace: default # namespace 指定了该中间件所在的命名空间 EOF [root@master231 26-IngressRoute]# kubectl apply -f 12-ipAllowList-IngressRoute.yaml middleware.traefik.io/xiuxian-ipallowlist created ingressroute.traefik.io/ingressroute-xiuxian created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get middleware NAME AGE xiuxian-ipallowlist 57s [root@master231 26-IngressRoute]# 2.4 测试访问 [root@harbor250.weixiang.com ~]# echo 43.139.47.66 middleware.yinzhengjie.com >> /etc/hosts [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# tail -1 /etc/hosts 10.0.0.152 middleware.yinzhengjie.com [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# curl middleware.yinzhengjie.com <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@harbor250.weixiang.com ~]# 2.5 将10.0.0.0/24网段移除 [root@master231 26-IngressRoute]# cat 
12-ipAllowList-IngressRoute.yaml apiVersion: traefik.io/v1alpha1 kind: Middleware metadata: name: xiuxian-ipallowlist namespace: default spec: ipAllowList: sourceRange: - 127.0.0.1 # - 10.0.0.0/24 --- apiVersion: traefik.io/v1alpha1 kind: IngressRoute metadata: name: ingressroute-xiuxian spec: entryPoints: - web routes: - match: Host(`middleware.yinzhengjie.com`) && PathPrefix(`/`) kind: Rule services: - name: svc-xiuxian port: 80 middlewares: - name: xiuxian-ipallowlist namespace: default [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl apply -f 12-ipAllowList-IngressRoute.yaml middleware.traefik.io/xiuxian-ipallowlist configured ingressroute.traefik.io/ingressroute-xiuxian unchanged [root@master231 26-IngressRoute]# 2.6 再次测试验证发现无法访问 [root@harbor250.weixiang.com ~]# curl middleware.yinzhengjie.com;echo Forbidden [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# curl -I middleware.yinzhengjie.com HTTP/1.1 403 Forbidden Date: Tue, 29 Jul 2025 07:04:16 GMT Content-Length: 9 [root@harbor250.weixiang.com ~]# 2.7 删除测试环境 [root@master231 26-IngressRoute]# kubectl delete -f 12-ipAllowList-IngressRoute.yaml middleware.traefik.io "xiuxian-ipallowlist" deleted ingressroute.traefik.io "ingressroute-xiuxian" deleted [root@master231 26-IngressRoute]# - 中间件存在问题补充 - 1.网络插件问题 - 2.配置暴露客户端的源IP地址 [root@master231 26-IngressRoute]# kubectl get svc traefik-server -o yaml | grep externalTrafficPolicy externalTrafficPolicy: Cluster [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get svc traefik-server -o yaml | sed '/externalTrafficPolicy/s#Cluster#Local#' | kubectl apply -f - service/traefik-server configured [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get svc traefik-server -o yaml | grep externalTrafficPolicy {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"meta.helm.sh/release-name":"traefik-server","meta.helm.sh/release-namespace":"default","metallb.io/ip-allocated-from-pool":"jasonyin2020"},"creationTimestamp":"2025-07-29T06:40:53Z","labels":{"app.kubernetes.io/instance":"traefik-server-default","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"traefik","helm.sh/chart":"traefik-36.3.0"},"name":"traefik-server","namespace":"default","resourceVersion":"537686","uid":"eb7665d9-1d20-419e-a265-c14292e83bbc"},"spec":{"allocateLoadBalancerNodePorts":true,"clusterIP":"10.200.11.72","clusterIPs":["10.200.11.72"],"externalTrafficPolicy":"Local","internalTrafficPolicy":"Cluster","ipFamilies":["IPv4"],"ipFamilyPolicy":"SingleStack","ports":[{"name":"udpcase","nodePort":41741,"port":8081,"protocol":"UDP","targetPort":8081},{"name":"web","nodePort":10824,"port":80,"protocol":"TCP","targetPort":"web"},{"name":"websecure","nodePort":20453,"port":443,"protocol":"TCP","targetPort":"websecure"}],"selector":{"app.kubernetes.io/instance":"traefik-server-default","app.kubernetes.io/name":"traefik"},"sessionAffinity":"None","type":"LoadBalancer"},"status":{"loadBalancer":{"ingress":[{"ip":"10.0.0.152"}]}}} externalTrafficPolicy: Local [root@master231 26-IngressRoute]# 参考输出: 10.100.2.223 - - [30/Jul/2025:01:01:46 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" # Cluster模式,是worker节点IP 10.100.2.223 - - [30/Jul/2025:01:01:46 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" 10.100.2.223 - - [30/Jul/2025:01:01:46 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" 10.100.2.223 - - [30/Jul/2025:01:01:46 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" 
10.100.2.223 - - [30/Jul/2025:01:01:46 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" ... 10.100.2.223 - - [30/Jul/2025:01:03:13 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.250" # Local模式,保留了客户端的真实IP地址。 10.100.2.223 - - [30/Jul/2025:01:03:13 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.250" 10.100.2.223 - - [30/Jul/2025:01:03:14 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.250" 10.100.2.223 - - [30/Jul/2025:01:03:14 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.250" 10.100.2.223 - - [30/Jul/2025:01:03:14 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.250"
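# 补充(与上面sed+kubectl apply的改法等价的一种写法示意,并非上文原始操作记录):
kubectl patch svc traefik-server -p '{"spec":{"externalTrafficPolicy":"Local"}}'
# 注意: Local模式下只有运行了Traefik Pod的节点才会转发该流量,好处是保留客户端真实IP;改回Cluster即可恢复跨节点转发:
kubectl patch svc traefik-server -p '{"spec":{"externalTrafficPolicy":"Cluster"}}'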
3、IngressRoute绑定多中间件案例
bash
3.IngressRoute绑定多中间件案例 3.1 编写资源清单 [root@master231 ingresses]# cat > 13-basicAuth-secrets-IngressRoute.yaml <<'EOF' apiVersion: v1 kind: Secret metadata: name: login-info namespace: default type: kubernetes.io/basic-auth stringData: username: JasonYin password: yinzhengjie --- apiVersion: traefik.io/v1alpha1 kind: Middleware metadata: name: login-auth spec: basicAuth: secret: login-info --- apiVersion: traefik.io/v1alpha1 kind: Middleware metadata: name: xiuxian-ipallowlist namespace: default spec: ipAllowList: sourceRange: - 127.0.0.1 - 10.0.0.0/24 --- apiVersion: traefik.io/v1alpha1 kind: IngressRoute metadata: name: ingressroute-xiuxian spec: entryPoints: - web routes: - match: Host(`auth.yinzhengjie.com`) && PathPrefix(`/`) kind: Rule services: - name: svc-xiuxian port: 80 middlewares: - name: login-auth namespace: default - name: xiuxian-ipallowlist namespace: default EOF 3.2 创建资源 [root@master231 26-IngressRoute]# kubectl apply -f 13-basicAuth-secrets-IngressRoute.yaml secret/login-info created middleware.traefik.io/login-auth created middleware.traefik.io/xiuxian-ipallowlist created ingressroute.traefik.io/ingressroute-xiuxian created [root@master231 26-IngressRoute]# 3.3 直接测试访问 [root@harbor250.weixiang.com ~]# echo 10.0.0.152 auth.yinzhengjie.com >> /etc/hosts [root@harbor250.weixiang.com ~]# tail -1 /etc/hosts 10.0.0.152 auth.yinzhengjie.com [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# curl auth.yinzhengjie.com 401 Unauthorized [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# curl -u jasonyin:yinzhengjie auth.yinzhengjie.com 401 Unauthorized [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# curl -u JasonYin:yinzhengjie auth.yinzhengjie.com <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@harbor250.weixiang.com ~]#
4、Traefik高级应用实战案例之负载均衡案例1
bash
1.测试环境准备 # 创建两个完全独立的、可被访问的 Web 应用 [root@master231 26-IngressRoute]# cat > 14-deploy-svc-cm-lb-web.yaml <<'EOF' apiVersion: v1 kind: ConfigMap metadata: name: cm-lb-web data: web01: | server { listen 81; listen [::]:81; server_name localhost; location / { root /usr/share/nginx/html; index index.html index.htm; } error_page 500 502 503 504 /50x.html; location = /50x.html { root /usr/share/nginx/html; } } web02: | server { listen 82; listen [::]:82; server_name localhost; location / { root /usr/share/nginx/html; index index.html index.htm; } error_page 500 502 503 504 /50x.html; location = /50x.html { root /usr/share/nginx/html; } } --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-web01 spec: replicas: 1 selector: matchLabels: apps: web01 template: metadata: labels: apps: web01 school: weixiang spec: volumes: - emptyDir: {} name: data - name: webconf configMap: name: cm-lb-web items: - key: web01 path: default.conf initContainers: - name: i1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 env: - name: PODNAME valueFrom: fieldRef: fieldPath: metadata.name - name: PODNS valueFrom: fieldRef: fieldPath: metadata.namespace - name: PODIP valueFrom: fieldRef: fieldPath: status.podIP volumeMounts: - name: data mountPath: /weixiang command: - /bin/sh - -c - 'echo "<h1> 【web01】 NameSpace: ${PODNS}, PodName: ${PODNAME}, PodIP:${PODIP}</h1>" > /weixiang/index.html' containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 volumeMounts: - name: data mountPath: /usr/share/nginx/html - name: webconf mountPath: /etc/nginx/conf.d/default.conf subPath: default.conf --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-web02 spec: replicas: 1 selector: matchLabels: apps: web02 template: metadata: labels: apps: web02 school: weixiang spec: volumes: - emptyDir: {} name: data - name: webconf configMap: name: cm-lb-web items: - key: web02 path: default.conf initContainers: - name: i1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 env: - name: PODNAME valueFrom: fieldRef: fieldPath: metadata.name - name: PODNS valueFrom: fieldRef: fieldPath: metadata.namespace - name: PODIP valueFrom: fieldRef: fieldPath: status.podIP volumeMounts: - name: data mountPath: /weixiang command: - /bin/sh - -c - 'echo "<h1> 【web02】 NameSpace: ${PODNS}, PodName: ${PODNAME}, PodIP:${PODIP}</h1>" > /weixiang/index.html' containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 volumeMounts: - name: data mountPath: /usr/share/nginx/html - name: webconf mountPath: /etc/nginx/conf.d/default.conf subPath: default.conf --- apiVersion: v1 kind: Service metadata: name: svc-web01 spec: ports: - port: 80 targetPort: 81 selector: apps: web01 type: ClusterIP --- apiVersion: v1 kind: Service metadata: name: svc-web02 spec: ports: - port: 80 targetPort: 82 selector: apps: web02 type: ClusterIP EOF # 这个文件内部,Service 通过 selector 字段与 Deployment 的 Pod 关联。Deployment 通过 volumes 和 volumeMounts 字段与 ConfigMap 关联,从而获得自定义的 Nginx 配置 [root@master231 26-IngressRoute]# kubectl apply -f 14-deploy-svc-cm-lb-web.yaml configmap/cm-lb-web created deployment.apps/deploy-web01 created deployment.apps/deploy-web02 created service/svc-web01 created service/svc-web02 created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get svc svc-web01 svc-web02 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc-web01 ClusterIP 10.200.185.165 <none> 80/TCP 22s svc-web02 ClusterIP 10.200.97.145 <none> 80/TCP 22s [root@master231 26-IngressRoute]# 
[root@master231 26-IngressRoute]# curl 10.200.185.165 <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# curl 10.200.97.145 <h1> 【web02】 NameSpace: default, PodName: deploy-web02-6cf97565db-74vlc, PodIP:10.100.2.221</h1> [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get pods -o wide -l school=weixiang NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-web01-6bc58b4f9c-zb52l 1/1 Running 0 53s 10.100.1.56 worker232 <none> <none> deploy-web02-6cf97565db-74vlc 1/1 Running 0 53s 10.100.2.221 worker233 <none> <none> [root@master231 26-IngressRoute]#
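补充:若想确认“Service → Pod → ConfigMap”这条关联链路是否真正生效,可以参考下面的验证思路(命令为示例性质,假设资源均按上文清单创建,且容器内可执行 cat):
# Endpoints 非空,说明 Service 的 selector 成功匹配到了 Pod
kubectl get endpoints svc-web01 svc-web02
# 确认 ConfigMap 以 subPath 方式覆盖了 nginx 的 default.conf:web01 应监听 81,web02 应监听 82
kubectl exec deploy/deploy-web01 -c c1 -- cat /etc/nginx/conf.d/default.conf
kubectl exec deploy/deploy-web02 -c c1 -- cat /etc/nginx/conf.d/default.conf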
5、Traefik高级应用实战案例之负载均衡案例2
bash
# 基于上面4实现 2.1 创建资源 # 这个文件是整个负载均衡的核心,它告诉 Traefik 如何处理 lb.yinzhengjie.com 的流量 [root@master231 26-IngressRoute]# cat > 15-IngressRoute-lb-xiuxian.yaml <<'EOF' apiVersion: traefik.io/v1alpha1 kind: IngressRoute metadata: name: ingressroute-lb namespace: default spec: entryPoints: # 通过 entryPoint(入口点)进行关联 - web # 安装 Traefik 时,values.yaml里定义 routes: - match: Host(`lb.yinzhengjie.com`) && PathPrefix(`/`) kind: Rule services: - name: svc-web01 port: 80 namespace: default - name: svc-web02 port: 80 namespace: default EOF 负载均衡的工作流程: 1.请求到达:你的测试机 (harbor250) 发送一个curl lb.yinzhengjie.com请求。由于你在/etc/hosts文件中将lb.yinzhengjie.com 指向了 10.0.0.152(这应该是 Traefik Ingress Controller 的入口 IP),所以请求被发送到了 Traefik。 # 这个10.0.0.152是创建Traefik的时候分配的LoadBalancer的ip就是所说的Traefik Ingress Controller服务的对外暴露地址 2.Traefik 匹配规则:Traefik收到请求后,发现请求的 Host 头部是 lb.yinzhengjie.com,这与ingressroute-lb这个IngressRoute 资源中定义的 match 规则完全匹配。 3.Traefik 查看后端服务列表: Traefik 看到这个规则下有两个后端服务: svc-web01 svc-web02 4.Traefik 执行负载均衡 默认策略: 在没有任何额外配置(如权重 weight)的情况下,Traefik 会对这两个服务采用轮询 (Round Robin) 的负载均衡策略。 第一次请求: Traefik 将请求转发给列表中的第一个服务,即 svc-web01。svc-web01 再将请求转发给后端的 deploy-web01 Pod,所以你看到了 "【web01】..." 的响应。 第二次请求: Traefik 按照轮询策略,将请求转发给列表中的第二个服务,即 svc-web02。svc-web02 再将请求转发给后端的 deploy-web02 Pod,所以你看到了 "【web02】..." 的响应。 第三次请求: 轮询回到列表的开头,请求再次被发往 svc-web01。 创建流程: # 先从 Helm Chart中获取 values.yaml 文件,修改该文件,确保名为 web 的入口点监听 80 端口;然后安装 Traefik,它随之创建了web #这个入口点;之后,我们创建的 IngressRoute 再通过 entryPoint 主动关联到 web,并根据其内部规则,将流量转发到后端的 Service, # Service 再将流量最终送达 Pods yaml文件、IngressRoute规则、service、pod [root@master231 26-IngressRoute]# kubectl apply -f 15-IngressRoute-lb-xiuxian.yaml ingressroute.traefik.io/ingressroute-lb created [root@master231 26-IngressRoute]# 2.2 测试验证 [root@harbor250.weixiang.com ~]# echo 10.0.0.152 lb.yinzhengjie.com >> /etc/hosts [root@harbor250.weixiang.com ~]# tail -1 /etc/hosts 10.0.0.152 lb.yinzhengjie.com [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# for i in `seq 10`; do curl -s lb.yinzhengjie.com;done <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web02】 NameSpace: default, PodName: deploy-web02-6cf97565db-74vlc, PodIP:10.100.2.221</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web02】 NameSpace: default, PodName: deploy-web02-6cf97565db-74vlc, PodIP:10.100.2.221</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web02】 NameSpace: default, PodName: deploy-web02-6cf97565db-74vlc, PodIP:10.100.2.221</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web02】 NameSpace: default, PodName: deploy-web02-6cf97565db-74vlc, PodIP:10.100.2.221</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web02】 NameSpace: default, PodName: deploy-web02-6cf97565db-74vlc, PodIP:10.100.2.221</h1> [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# for i in `seq 10`; do curl -s lb.yinzhengjie.com;done | sort | uniq -c 5 <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> 5 <h1> 【web02】 NameSpace: default, PodName: deploy-web02-6cf97565db-74vlc, PodIP:10.100.2.221</h1> [root@harbor250.weixiang.com ~]# 2.3 删除环境 [root@master231 26-IngressRoute]# kubectl delete -f 15-IngressRoute-lb-xiuxian.yaml ingressroute.traefik.io "ingressroute-lb" deleted [root@master231 26-IngressRoute]#
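小技巧:在上面 2.2 的测试验证环节,如果不方便修改 /etc/hosts,也可以直接向 Traefik 的 LoadBalancer IP 发请求并手动指定 Host 头,效果等价(下面仅为示例写法,IP 沿用上文的 10.0.0.152):
for i in `seq 10`; do curl -s -H "Host: lb.yinzhengjie.com" 10.0.0.152; done | sort | uniq -c
# 预期 web01 和 web02 各约 5 次,即默认的轮询效果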
6、Traefik高级应用实战案例之灰度发布案例
bash
3.1 创建资源 [root@master231 26-IngressRoute]# cat > 16-TraefikService-weighted.yaml <<'EOF' apiVersion: traefik.io/v1alpha1 kind: TraefikService # 用于实现高级的负载均衡策略,如加权、镜像等 metadata: name: traefikservices-wrr namespace: default spec: weighted: # 基于权重调度 services: # services 定义了参与加权负载均衡的后端服务列表。 - name: svc-web01 port: 80 weight: 4 # 定义调度到该svc的权重为4 kind: Service # 指定类型有效值为: Service(default), TraefikService,默认就是 'Service' - name: svc-web02 port: 80 weight: 1 # 定义调度到该svc的权重为4 EOF [root@master231 26-IngressRoute]# kubectl apply -f 16-TraefikService-weighted.yaml traefikservice.traefik.io/traefikservices-wrr created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# cat > 17-IngressRoute-TraefikService.yaml <<'EOF' apiVersion: traefik.io/v1alpha1 kind: IngressRoute metadata: name: ingressroute-lb-wrr namespace: default spec: entryPoints: - web routes: - match: Host(`lb.yinzhengjie.com`) && PathPrefix(`/`) kind: Rule services: # 指定TraefikService的名称 - name: traefikservices-wrr namespace: default # 注意,类型不再是k8s的Service,而是Traefik自实现的TraefikService kind: TraefikService EOF [root@master231 26-IngressRoute]# kubectl apply -f 17-IngressRoute-TraefikService.yaml ingressroute.traefik.io/ingressroute-lb-wrr created [root@master231 26-IngressRoute]# 3.2 测试验证 [root@harbor250.weixiang.com ~]# for i in `seq 10`; do curl -s lb.yinzhengjie.com;done <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web02】 NameSpace: default, PodName: deploy-web02-6cf97565db-74vlc, PodIP:10.100.2.221</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web02】 NameSpace: default, PodName: deploy-web02-6cf97565db-74vlc, PodIP:10.100.2.221</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# for i in `seq 10`; do curl -s lb.yinzhengjie.com; done | sort | uniq -c 8 <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> 2 <h1> 【web02】 NameSpace: default, PodName: deploy-web02-6cf97565db-74vlc, PodIP:10.100.2.221</h1> [root@harbor250.weixiang.com ~]# 3.3 删除测试案例 [root@master231 26-IngressRoute]# kubectl delete -f 16-TraefikService-weighted.yaml -f 17-IngressRoute-TraefikService.yaml traefikservice.traefik.io "traefikservices-wrr" deleted ingressroute.traefik.io "ingressroute-lb-wrr" deleted [root@master231 26-IngressRoute]#
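说明:上面 3.2 中 10 次请求得到 8:2,只是小样本下的近似结果;按 4:1 的权重配置,理论比例约为 80%:20%,样本越大越接近。可以用类似下面的示例命令加大样本量观察:
for i in `seq 100`; do curl -s lb.yinzhengjie.com; done | sort | uniq -c
# 预期 web01 约 80 次、web02 约 20 次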
7、Traefik高级应用实战案例之流量镜像|影子流量(Mirroring/traffic-shadow)
bash
4.1 编写资源清单 [root@master231 26-IngressRoute]# cat > 18-TraefikService-mirroring.yaml <<'EOF' apiVersion: traefik.io/v1alpha1 kind: TraefikService metadata: name: traefikservices-mirroring namespace: default spec: # 发送 100% 的请求到K8S名为"svc-web01"的Service。 mirroring: # mirroring 指定了这种 TraefikService 的类型是“流量镜像”。 kind: Service # kind 和 name 定义了主服务,即所有用户请求都会被正常发送到的地方。 name: svc-web01 port: 80 # 将其中20%的请求调度到k8s名为"svc-web02"的Service。 mirrors: # mirrors 定义了一个或多个镜像目标的列表。 - name: svc-web02 port: 80 # 是指将20%请求的流量复制一份发送给其它'svc-web02'服务,并且会忽略这部分请求的响应,这个功能在做一些压测或者问题复现的时候很有用。 percent: 20 --- apiVersion: traefik.io/v1alpha1 kind: IngressRoute metadata: name: ingressroute-mirror # IngressRoute 资源的名称 namespace: default spec: entryPoints: - web routes: - match: Host(`lb.yinzhengjie.com`) && PathPrefix(`/`) kind: Rule services: - name: traefikservices-mirroring # 指向 TraefikService 的名称 namespace: default kind: TraefikService EOF [root@master231 26-IngressRoute]# kubectl apply -f 18-TraefikService-mirroring.yaml traefikservice.traefik.io/traefikservices-mirroring created ingressroute.traefik.io/ingressroute-mirror created [root@master231 26-IngressRoute]# 4.2 测试验证 [root@harbor250.weixiang.com ~]# for i in `seq 10`; do curl -s lb.yinzhengjie.com; done <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> <h1> 【web01】 NameSpace: default, PodName: deploy-web01-6bc58b4f9c-zb52l, PodIP:10.100.1.56</h1> [root@harbor250.weixiang.com ~]# 4.3 查看后端日志 [root@master231 traefik]# kubectl get svc traefik-server NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE traefik-server LoadBalancer 10.200.11.72 10.0.0.152 8081:41741/UDP,80:10824/TCP,443:20453/TCP 130m [root@master231 traefik]# [root@master231 traefik]# kubectl describe svc traefik-server Name: traefik-server Namespace: default Labels: app.kubernetes.io/instance=traefik-server-default app.kubernetes.io/managed-by=Helm app.kubernetes.io/name=traefik helm.sh/chart=traefik-36.3.0 Annotations: meta.helm.sh/release-name: traefik-server meta.helm.sh/release-namespace: default metallb.io/ip-allocated-from-pool: jasonyin2020 Selector: app.kubernetes.io/instance=traefik-server-default,app.kubernetes.io/name=traefik Type: LoadBalancer IP Family Policy: SingleStack IP Families: IPv4 IP: 10.200.11.72 IPs: 10.200.11.72 LoadBalancer Ingress: 10.0.0.152 Port: udpcase 8081/UDP TargetPort: 8081/UDP NodePort: udpcase 41741/UDP Endpoints: 10.100.2.219:8081 Port: web 80/TCP TargetPort: web/TCP NodePort: web 10824/TCP Endpoints: 10.100.2.219:8000 Port: websecure 443/TCP TargetPort: websecure/TCP NodePort: websecure 20453/TCP Endpoints: 10.100.2.219:8443 Session Affinity: None External Traffic Policy: Cluster Events: <none> [root@master231 traefik]# [root@master231 traefik]# kubectl get pods -o wide -l 
app.kubernetes.io/instance=traefik-server-default NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES traefik-server-67b58d485-wj5rc 1/1 Running 0 130m 10.100.2.219 worker233 <none> <none> [root@master231 traefik]# [root@master231 26-IngressRoute]# kubectl get pods -o wide -l school=weixiang NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-web01-6bc58b4f9c-zb52l 1/1 Running 0 16m 10.100.1.56 worker232 <none> <none> deploy-web02-6cf97565db-74vlc 1/1 Running 0 16m 10.100.2.221 worker233 <none> <none> [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl logs -f deploy-web01-6bc58b4f9c-zb52l ... 10.100.2.219 - - [29/Jul/2025:08:50:34 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" 10.100.2.219 - - [29/Jul/2025:08:50:34 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" 10.100.2.219 - - [29/Jul/2025:08:50:34 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" 10.100.2.219 - - [29/Jul/2025:08:50:34 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" 10.100.2.219 - - [29/Jul/2025:08:50:34 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" 10.100.2.219 - - [29/Jul/2025:08:50:34 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" 10.100.2.219 - - [29/Jul/2025:08:50:34 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" 10.100.2.219 - - [29/Jul/2025:08:50:34 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" 10.100.2.219 - - [29/Jul/2025:08:50:34 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" 10.100.2.219 - - [29/Jul/2025:08:50:34 +0000] "GET / HTTP/1.1" 200 100 "-" "curl/7.81.0" "10.0.0.232" [root@master231 traefik]# kubectl logs -f deploy-web02-6cf97565db-74vlc ... 10.100.2.219 - - [29/Jul/2025:08:50:34 +0000] "GET / HTTP/1.1" 200 101 "-" "curl/7.81.0" "10.0.0.232" 10.100.2.219 - - [29/Jul/2025:08:50:34 +0000] "GET / HTTP/1.1" 200 101 "-" "curl/7.81.0" "10.0.0.232"
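补充:镜像流量的响应会被 Traefik 丢弃,所以客户端永远只看到主服务 web01 的返回;验证镜像是否生效,应该对比两个后端的访问日志条数。下面是一个统计思路(命令为示意,直接使用 deployment 名即可让 kubectl 自动选取 Pod):
kubectl logs deploy/deploy-web01 | grep -c 'GET / HTTP/1.1'   # 主服务:条数应接近请求总数
kubectl logs deploy/deploy-web02 | grep -c 'GET / HTTP/1.1'   # 镜像服务:条数应约为总数的 20%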
8、Traefik启用Gateway API


bash
- Trafik 启用Gateway API 1.什么是Gateway API 由于Ingress资源对象不能很好的满足网络需求,很多场景下Ingress控制器都需要通过定义annotations或者crd来进行功能扩展,这对于使用标准和支持是非常不利的,新推出的Gateway API旨在通过可扩展的面向角色的接口来增强服务网络。 Gateway API之前叫"Service API",是由SIG-NETWORK社区管理的开源项目。Gateway API是Kubernetes的官方项目,专注于Kubernete中的L4和L7路由。 该项目代表了下一代Kubernetes入口、负载平衡和服务网格API。从一开始,它就被设计成通用的、富有表现力的和面向角色的。 Gateway API是Kubernetes中的一个API资源集合,此API中的大部分配置都包含在路由层中,包括GatewayClass、Gateway、HTTPRoute、TCPRoute、Service等,这些资源共同为各种网络用例构建模型,为Ingress和Mesh提供了高级路由功能。 官网链接: https://gateway-api.sigs.k8s.io/ 2.Gateway API和Ingress的比较 Gateway API的改进比当前的Ingress资源对象有很多更好的设计: - 面向角色: Gateway由各种API资源组成,这些资源根据使用和配置Kubernetes服务网络的角色进行建模。 - 通用性: 和Ingress一样是一个具有众多实现的通用规范,Gateway API是一个被设计成由许多实现支持的规范标准。 - 更具表现力: Gateway API资源支持基于Header头的匹配、流量权重等核心功能,这些功能在Ingress中只能通过自定义注解才能实现。 - 可扩展性: Gateway API 允许自定义资源链接到 API 的各个层,这就允许在 API 结构的适当位置进行更精细的定制。 还有一些其他值得关注的功能: - GatewayClasses: 将负载均衡实现的类型形式化,这些类使用户可以很容易了解到通过Kubernetes资源可以获得什么样的能力。 - 共享网关和跨命名空间支持: 它们允许共享负载均衡器和VIP,允许独立的路由资源绑定到同一个网关,这使得团队可以安全地共享(包括跨命名空间)基础设施,而不需要直接协调。 - 规范化路由和后端: Gateway API支持类型化的路由资源和不同类型的后端。 这使得API可以灵活地支持各种协议(如: HTTP和gRPC)和各种后端服务(如: Kubernetes Service、存储桶或函数)。 - Gateway API的资源模型 在整个Gateway API中涉及到3个角色:基础设施提供商、集群管理员、应用开发人员,在某些场景下可能还会涉及到应用管理员等角色。 Gateway API 中定义了3种主要的资源模型:GatewayClass、Gateway、Route。 - GatewayClass 定义了一组共享相同配置和动作的网关。 每个GatewayClass 由一个控制器处理,是一个集群范围的资源,必须至少有一个GatewayClass被定义。 这与Ingress的IngressClass类似,在Ingress v1beta1版本中,与GatewayClass类似的是ingress-class注解。 而在Ingress V1版本中,最接近的就是IngressClass资源对象。 - Gateway 网关描述了如何将流量转化为集群内的服务,也就是说,它定义了一个请求,要求将流量从不了解Kubernetes的地方转换到集群内的服务。 例如,由云端负载均衡器、集群内代理或外部硬件负载均衡器发送到Kubernetes服务的流量。 它定义了对特定负载均衡器配置的请求,该配置实现了GatewayClass的配置和行为规范。 该资源可以由管理员直接创建,也可以由处理GatewayClass的控制器创建。 Gateway可以附加到一个或多个路由引用上,这些路由引用的作用是将流量的一个子集导向特定的服务。 - Route 路由资源定义了特定的规则,用于将请求从网关映射到Kubernetes服务。 从v1alpha2版本开始,API中包含四种Route路由资源类型。 对于其他未定义的协议,鼓励采用特定实现的自定义路由类型,当然未来也可能会添加新的路由类型。 主流的Route路由资源类型 - HTTPRoute 适用于HTTP或HTTPS连接,适用于我们想要检查HTTP请求并使用HTTP请求进行路由或修改的场景。 比如使用HTTP Headers头进行路由,或在请求过程中对它们进行修改。 - TLSRoute 用于TLS连接,通过SNI进行区分,它适用于希望使用SNI作为主要路由方法的地方。 并且对HTTP等更高级别协议的属性不感兴趣,连接的字节流不经任何检查就被代理到后端。 - TCPRoute 旨在用于将一个或多个端口映射到单个后端。 在这种情况下,没有可以用来选择同一端口的不同后端的判别器,所以每个TCPRoute在监听器上需要一个不同的端口。 你可以使用TLS,在这种情况下,未加密的字节流会被传递到后端,当然也可以不使用TLS,这样加密的字节流将传递到后端。 - UDPRoute 和TCPRoute类似,旨在用于将一个或多个端口映射到单个后端,只不过走的是UDP协议。 - Gateway API资源模型组合关系 GatewayClass、Gateway、xRoute和服务的组合定义了一个可实现的负载均衡器。如上图所示,说明了不同资源之间的关系。 使用反向代理实现的网关的典型客户端/网关 API 请求流程如下所示: - 1.客户端向"http://foo.example.com"发出请求; - 2.DNS将域名解析为Gateway网关地址; - 3.反向代理在监听器上接收请求,并使用"Host Header"来匹配HTTPRoute; - 4.(可选)反向代理可以根据"HTTPRoute"的匹配规则进行路由; - 5.(可选)反向代理可以根据"HTTPRoute"的过滤规则修改请求,即添加或删除headers; - 6.最后,反向代理根据"HTTPRoute""forwardTo"规则,将请求转发给集群中的一个或多个对象,即服务; 参考链接: https://gateway-api.sigs.k8s.io/concepts/api-overview/#combined-types
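为了直观体现 GatewayClass、Gateway、HTTPRoute 三者的引用关系,下面给出一个最小化的示意清单(仅用于理解概念,controllerName、域名、后端 Service 名均为假设值,实际取值以所使用的控制器为准):
cat > demo-gateway-api.yaml <<'EOF'
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: demo-gc
spec:
  # 声明由哪个控制器实现这一类网关(假设值)
  controllerName: example.com/gateway-controller
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: demo-gw
spec:
  gatewayClassName: demo-gc     # Gateway 通过 gatewayClassName 引用 GatewayClass
  listeners:
  - name: web
    protocol: HTTP
    port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: demo-route
spec:
  parentRefs:
  - name: demo-gw               # HTTPRoute 通过 parentRefs 挂载到 Gateway
  hostnames:
  - "demo.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: demo-svc            # 最终将流量转发给 K8S 的 Service(假设已存在)
      port: 80
EOF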
9、Traefik启用kubernetes Gateway功能
bash
- Traefik启用kubernetes Gateway功能 1.启用kubernetesGateway功能 [root@master231 ~]# cd /weixiang/manifests/24-ingressClass/traefik/ [root@master231 traefik]# [root@master231 traefik]# vim traefik/values.yaml ... providers: ... kubernetesGateway: ... enabled: true 2.升级服务 [root@master231 traefik]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION traefik-server default 1 2025-07-29 14:40:53.347245878 +0800 CST deployed traefik-36.3.0 v3.4.3 [root@master231 traefik]# [root@master231 traefik]# helm upgrade -f traefik/values.yaml traefik-server traefik # -f traefik/values.yaml: 使用 traefik/values.yaml 这个文件中的配置值来覆盖 Chart 中默认的配置 # traefik-server: 发布名称 # traefik: Chart 名称 Release "traefik-server" has been upgraded. Happy Helming! NAME: traefik-server LAST DEPLOYED: Wed Jul 30 10:10:30 2025 NAMESPACE: default STATUS: deployed REVISION: 2 TEST SUITE: None NOTES: traefik-server with docker.io/traefik:v3.4.3 has been deployed successfully on default namespace ! [root@master231 traefik]# [root@master231 traefik]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION traefik-server default 2 2025-07-30 10:10:30.877111413 +0800 CST deployed traefik-36.3.0 v3.4.3 [root@master231 traefik]# 3.查看gatewayclass资源 [root@master231 traefik]# kubectl get gatewayclasses # 1.创建了GatewayClass (网关类): NAME CONTROLLER ACCEPTED AGE traefik traefik.io/gateway-controller True 6s [root@master231 traefik]# [root@master231 traefik]# kubectl get gc NAME CONTROLLER ACCEPTED AGE traefik traefik.io/gateway-controller True 48s [root@master231 traefik]# 4.查看Traefik的WebUI验证 如上图所示,我们成功启用了Gateway API功能哟。 http://www.yinzhengjie.com/dashboard/#/
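补充:除了直接编辑 values.yaml 再 upgrade,也可以用 --set 覆盖的方式开启该 provider,二者效果等价(下面的取值路径与上文 values.yaml 中的配置一致,仅为示例写法):
helm upgrade traefik-server traefik --reuse-values --set providers.kubernetesGateway.enabled=true
helm get values traefik-server      # 确认当前生效的自定义值
kubectl get gatewayclasses          # 再次确认 traefik 的 GatewayClass 已创建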
10、通过Gateway API方式暴露traefik dashboard
bash
通过Gateway API方式暴露traefik dashboard 1 查看默认的entryPoint [root@master241 traefik]# vim traefik/values.yaml ... # 注意观察gateway定义的listeners,这是默认的entryPoint,也支持我们自行定义,但后面的案例要用到该配置 gateway: ... enabled: true ... listeners: web: port: 8000 ... protocol: HTTP ... 2.创建Gateway资源 # 2.创建网关,这个文件定义了流量的入口 (Entrypoint) [root@master231 27-gatewayAPI]# cat > 01-Gateway-Traefik-dashboard.yaml <<''EOF apiVersion: gateway.networking.k8s.io/v1 kind: Gateway metadata: name: http-gateway # Gateway 资源的唯一名称,在 HTTPRoute 中会引用它 spec: gatewayClassName: traefik # 绑定上面创建的gatewayclasses网关,请让 Traefik 控制器来管理这个 Gateway listeners: # 定义监听器列表,即 Gateway 开放的端口和协议 - protocol: HTTP # 监听的协议,这里是 HTTP port: 8000 # 监听的端口号。外部流量需要访问这个端口 name: web EOF # 它通过 gatewayClassName: traefik 与 GatewayClass 关联,它作为 HTTPRoute 的父级 (parent),被 HTTPRoute 通过名称引用 [root@master231 27-gatewayAPI]# kubectl apply -f 01-Gateway-Traefik-dashboard.yaml gateway.gateway.networking.k8s.io/http-gateway created [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# kubectl get gateway NAME CLASS ADDRESS PROGRAMMED AGE http-gateway traefik 10.0.0.152 True 10s traefik-gateway traefik 10.0.0.152 True 5m5s [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# kubectl get gtw NAME CLASS ADDRESS PROGRAMMED AGE http-gateway traefik 10.0.0.152 True 34s traefik-gateway traefik 10.0.0.152 True 5m29s [root@master231 27-gatewayAPI]# 3.创建HTTPRoute资源引用Gateway # 3.创建HTTPRoute,这个文件定义了路由规则 (Routing Rules) [root@master231 27-gatewayAPI]# cat > 02-HTTPRoute-Traefik-dashboard.yaml <<''EOF apiVersion: gateway.networking.k8s.io/v1 kind: HTTPRoute # 资源类型是 HTTPRoute metadata: name: traefik-dashboard-httproute # HTTPRoute 资源的唯一名称 labels: role: traefik-dashboard spec: hostnames: - "web.yinzhengjie.com" # 定义此路由规则适用于哪些主机名(域名) parentRefs: # 关键!定义此 HTTPRoute 挂载到哪个 Gateway 上。 - name: http-gateway # 注意哈,这里的名称要指定的是上面创建的Gateway rules: - matches: - path: type: PathPrefix # PathPrefix 表示匹配此前缀开头的任何路径 value: / # 匹配的路径值,'/' 表示匹配所有路径 timeouts: request: 100ms backendRefs: # 定义匹配成功后,流量应该被转发到哪里 - name: jiege-traefik-dashboard # 后端 K8s Service 的名称 port: 8080 # 流量要发往该 Service 的哪个端口 weight: 1 EOF # 它通过 parentRefs.name: http-gateway 与 Gateway 关联,表示“我的规则应用在 http-gateway 这个入口上” # 它通过 backendRefs.name: jiege-traefik-dashboard 与后端的 Service 关联,定义了流量的最终去向 [root@master231 27-gatewayAPI]# kubectl get svc jiege-traefik-dashboard NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE jiege-traefik-dashboard ClusterIP 10.200.57.133 <none> 8080/TCP 41h [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# kubectl apply -f 02-HTTPRoute-Traefik-dashboard.yaml httproute.gateway.networking.k8s.io/traefik-dashboard-httproute created [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# kubectl get httproutes NAME HOSTNAMES AGE traefik-dashboard-httproute ["web.yinzhengjie.com"] 8s [root@master231 27-gatewayAPI]# # 4.Service & Pods (服务和应用实例) 4.客户端访问测试 温馨提示: 需要在windows添加解析后就可以正常访问啦。 10.0.0.152 web.yinzhengjie.com 流程: 1.客户端:用户在浏览器输入 http://web.yinzhengjie.com:8000。 2.DNS/Hosts解析:操作系统将 web.yinzhengjie.com 解析为 10.0.0.152。 3.流量进入集群:请求被发送到 10.0.0.152 的 8000 端口,被 Traefik 接收。 4.Gateway 匹配:Traefik 发现有一个 Gateway 资源(http-gateway)正在监听 8000 端口。 5.HTTPRoute 匹配:Traefik 查找所有挂载到 http-gateway 上的 HTTPRoute 资源。它找到了 traefik-dashboard-httproute,因为它的 hostnames 字段匹配 web.yinzhengjie.com,路径 / 也匹配。 6.后端转发:根据 HTTPRoute 的 backendRefs 规则,Traefik 将请求转发到名为 jiege-traefik-dashboard 的 Service 的 8080 端口。 7.服务发现:Kubernetes 的 Service 将请求负载均衡到其后端的某个 Pod 上,最终由 Traefik Dashboard 应用处理并响应。 参考链接: http://web.yinzhengjie.com/dashboard/#/ 
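补充:除了在 windows 配置 hosts 后用浏览器访问,也可以在任意 Linux 测试机上用 curl 指定 Host 头快速验证路由是否生效(示例命令,端口以 traefik-server 这个 LoadBalancer Service 实际暴露的端口为准,本文环境为 80):
curl -s -o /dev/null -w "%{http_code}\n" -H "Host: web.yinzhengjie.com" http://10.0.0.152/dashboard/
kubectl describe httproute traefik-dashboard-httproute    # Status/Conditions 中 Accepted=True 表示路由已被 Gateway 接受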
彩蛋:名称空间删除不掉的解决方案: [root@master231 ~]# kubectl get ns NAME STATUS AGE default Active 21d ingress-nginx Active 47h kube-flannel Terminating 5d kube-node-lease Active 21d kube-public Active 21d kube-system Active 21d kubernetes-dashboard Active 8d kuboard Active 8d metallb-system Active 14d [root@master231 ~]# [root@master231 ~]# ETCDCTL_API=3 etcdctl --endpoints=https://10.0.0.231:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key del /registry/namespaces/kube-flannel 1 [root@master231 ~]# [root@master231 ~]# kubectl get ns NAME STATUS AGE default Active 21d ingress-nginx Active 47h kube-node-lease Active 21d kube-public Active 21d kube-system Active 21d kubernetes-dashboard Active 8d kuboard Active 8d metallb-system Active 14d [root@master231 ~]#
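补充:除了直接到 etcd 里删除 key,更常见、也更温和的做法是清空卡住的名称空间的 finalizers,让其正常走完删除流程。下面以 kube-flannel 为例给出示例写法(假设本机已安装 jq):
kubectl get ns kube-flannel -o json | jq '.spec.finalizers = []' > /tmp/kube-flannel-ns.json
kubectl replace --raw "/api/v1/namespaces/kube-flannel/finalize" -f /tmp/kube-flannel-ns.json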
11、通过Gateway API方式暴露WEB应用
bash
- 通过Gateway API方式暴露WEB应用 1 创建测试应用 [root@master231 gatewayAPI]# cat > 03-deploy-xiuxian.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 1 selector: matchLabels: apps: xiuxian template: metadata: labels: apps: xiuxian spec: containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 name: c1 ports: - containerPort: 80 name: web --- apiVersion: v1 kind: Service metadata: name: svc-xiuxian spec: ports: - port: 80 targetPort: web selector: apps: xiuxian EOF [root@master231 27-gatewayAPI]# kubectl apply -f 03-deploy-xiuxian.yaml deployment.apps/deploy-xiuxian created service/svc-xiuxian created [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# kubectl get svc svc-xiuxian NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc-xiuxian ClusterIP 10.200.7.46 <none> 80/TCP 4s [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# curl 10.200.7.46 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 27-gatewayAPI]# 2 创建Gateway资源并指定allowedRoutes [root@master231 27-gatewayAPI]# cat > 04-Gateway-xiuxian.yaml <<EOF apiVersion: gateway.networking.k8s.io/v1 kind: Gateway metadata: name: xiuxian-gateway spec: gatewayClassName: traefik listeners: - protocol: HTTP port: 8000 name: web # 注意哈,我们可以配置允许的路由类型哟,如果不定义,则默认允许所有的路由都可以访问该网关。 allowedRoutes: kinds: - kind: HTTPRoute namespaces: from: All selector: matchLabels: role: xiuxian EOF [root@master231 27-gatewayAPI]# kubectl apply -f 04-Gateway-xiuxian.yaml gateway.gateway.networking.k8s.io/xiuxian-gateway created [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# kubectl get gtw xiuxian-gateway NAME CLASS ADDRESS PROGRAMMED AGE xiuxian-gateway traefik 10.0.0.152 True 10s [root@master231 27-gatewayAPI]# 3 创建HTTPRoute资源引用Gateway [root@master231 27-gatewayAPI]# cat > 05-HTTPRoute-xiuxian.yaml <<EOF apiVersion: gateway.networking.k8s.io/v1 kind: HTTPRoute metadata: name: httproute-xiuxian labels: role: xiuxian spec: hostnames: - "xiuxian.yinzhengjie.com" parentRefs: - name: xiuxian-gateway rules: - matches: - path: type: PathPrefix value: / timeouts: request: 100ms backendRefs: - name: svc-xiuxian port: 80 weight: 1 EOF [root@master231 27-gatewayAPI]# kubectl apply -f 05-HTTPRoute-xiuxian.yaml httproute.gateway.networking.k8s.io/httproute-xiuxian created [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# kubectl get httproutes httproute-xiuxian NAME HOSTNAMES AGE httproute-xiuxian ["xiuxian.yinzhengjie.com"] 12s [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# 4.访问测试 温馨提示: 需要在windows添加解析后就可以正常访问啦。 10.0.0.152 xiuxian.yinzhengjie.com 参考链接: http://xiuxian.yinzhengjie.com/
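补充:allowedRoutes 相当于网关侧的准入条件,只有带 role: xiuxian 标签的 HTTPRoute 才会被该 Gateway 接受。可以用下面的示例命令确认两者的绑定状态(具体输出字段以实际版本为准):
kubectl describe gtw xiuxian-gateway          # Status 中 Listeners 下的 Attached Routes 应不为 0
kubectl describe httproute httproute-xiuxian  # Conditions 中 Accepted=True 表示已被网关接受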
12、Gateway API实现灰度发布案例
bash
- Gateway API实现灰度发布案例 1 准备测试案例 [root@master231 gatewayAPI]# cat > 06-deploy-apps.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-blog spec: replicas: 1 selector: matchLabels: apps: blog template: metadata: labels: apps: blog spec: containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 command: - /bin/sh - -c - 'echo https://www.cnblogs.com/yinzhengjie > /usr/share/nginx/html/index.html && nginx && tail -f /etc/hosts' name: c1 ports: - containerPort: 80 name: web --- apiVersion: v1 kind: Service metadata: name: svc-blog spec: ports: - port: 80 targetPort: web selector: apps: blog --- apiVersion: apps/v1 kind: Deployment metadata: name: deploy-bilibili spec: replicas: 1 selector: matchLabels: apps: bilibili template: metadata: labels: apps: bilibili spec: containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 command: - /bin/sh - -c - 'echo https://space.bilibili.com/600805398/lists > /usr/share/nginx/html/index.html && nginx && tail -f /etc/hosts' name: c1 ports: - containerPort: 80 name: web --- apiVersion: v1 kind: Service metadata: name: svc-bilibili spec: ports: - port: 80 targetPort: web selector: apps: bilibili EOF [root@master231 27-gatewayAPI]# kubectl apply -f 06-deploy-apps.yaml deployment.apps/deploy-blog created service/svc-blog created deployment.apps/deploy-bilibili created service/svc-bilibili created [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-bilibili-6c6bbd5ffb-2xptw 1/1 Running 0 3s 10.100.2.233 worker233 <none> <none> deploy-blog-565594b7d8-4t2xp 1/1 Running 0 3s 10.100.2.232 worker233 <none> <none> deploy-xiuxian-6b58c75548-gpjh4 1/1 Running 0 18m 10.100.2.231 worker233 <none> <none> traefik-server-6767c5db-bmfvt 1/1 Running 0 54m 10.100.1.66 worker232 <none> <none> [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# kubectl get svc svc-blog svc-bilibili NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE svc-blog ClusterIP 10.200.192.109 <none> 80/TCP 42s svc-bilibili ClusterIP 10.200.98.79 <none> 80/TCP 42s [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# curl 10.200.192.109 https://www.cnblogs.com/yinzhengjie [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# curl 10.200.98.79 https://space.bilibili.com/600805398/lists [root@master231 27-gatewayAPI]# 2. 
创建Gateway资源 [root@master231 27-gatewayAPI]# cat > 07-Gateway-xiuxian.yaml <<EOF apiVersion: gateway.networking.k8s.io/v1 kind: Gateway metadata: name: xiuxian-gateway spec: gatewayClassName: traefik listeners: - protocol: HTTP port: 8000 name: web allowedRoutes: kinds: - kind: HTTPRoute namespaces: from: All selector: matchLabels: role: xiuxian EOF [root@master231 27-gatewayAPI]# kubectl apply -f 07-Gateway-xiuxian.yaml gateway.gateway.networking.k8s.io/xiuxian-gateway configured [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# kubectl get gtw xiuxian-gateway NAME CLASS ADDRESS PROGRAMMED AGE xiuxian-gateway traefik 10.0.0.152 True 5m46s [root@master231 27-gatewayAPI]# 3 创建HTTPRoute资源引用Gateway [root@master231 27-gatewayAPI]# cat > 08-HTTPRoute-huidu.yaml <<EOF apiVersion: gateway.networking.k8s.io/v1 kind: HTTPRoute metadata: name: httproute-xiuxian-lb labels: role: xiuxian spec: hostnames: - "demo.yinzhengjie.com" parentRefs: - name: xiuxian-gateway rules: - matches: - path: type: PathPrefix value: / timeouts: request: 100ms backendRefs: - name: svc-bilibili port: 80 weight: 8 - name: svc-blog port: 80 weight: 2 EOF [root@master231 27-gatewayAPI]# kubectl apply -f 08-HTTPRoute-huidu.yaml httproute.gateway.networking.k8s.io/httproute-xiuxian-lb created [root@master231 27-gatewayAPI]# [root@master231 27-gatewayAPI]# kubectl get httproutes httproute-xiuxian-lb NAME HOSTNAMES AGE httproute-xiuxian-lb ["demo.yinzhengjie.com"] 9s [root@master231 27-gatewayAPI]# 4.测试验证 [root@master231 gatewayAPI]# for i in `seq 10`; do curl -s -H "HOST: demo.yinzhengjie.com" 10.0.0.152; done | sort | uniq -c 8 https://space.bilibili.com/600805398/lists 2 https://www.cnblogs.com/yinzhengjie [root@master231 gatewayAPI]# [root@master231 gatewayAPI]# for i in `seq 10`; do curl -s -H "HOST: demo.yinzhengjie.com" 10.0.0.152; done https://space.bilibili.com/600805398/lists https://space.bilibili.com/600805398/lists https://space.bilibili.com/600805398/lists https://www.cnblogs.com/yinzhengjie https://space.bilibili.com/600805398/lists https://space.bilibili.com/600805398/lists https://space.bilibili.com/600805398/lists https://space.bilibili.com/600805398/lists https://www.cnblogs.com/yinzhengjie https://space.bilibili.com/600805398/lists [root@master231 gatewayAPI]#
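补充:实际灰度发布通常是逐步调整权重(例如 8:2 → 5:5 → 0:10),不需要重建资源,直接 patch HTTPRoute 即可。下面是一个示例写法(JSON Patch 的路径基于上文清单中 backendRefs 的顺序):
kubectl patch httproute httproute-xiuxian-lb --type=json -p='[{"op":"replace","path":"/spec/rules/0/backendRefs/0/weight","value":5},{"op":"replace","path":"/spec/rules/0/backendRefs/1/weight","value":5}]'
for i in `seq 10`; do curl -s -H "HOST: demo.yinzhengjie.com" 10.0.0.152; done | sort | uniq -c   # 预期比例接近 5:5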
13、客户端访问Traefik流程

(图:客户端访问Traefik的数据平面与控制平面流程示意图)

bash
一、数据平面:请求的流转路径 这是客户端发起一个HTTP/HTTPS请求后,数据包在系统中的实际旅行路线。 1.客户端发起请求 (Client) 用户或外部系统(Client)向一个公共的IP地址和端口发起请求。这个IP地址是由Service (traefik-server)这个LoadBalancer类型的服务提供的。 2.K8s服务 (Service: traefik-server) 请求首先到达Kubernetes集群的入口,也就是这个名为 traefik-server 的 Service。 这个Service的类型是 LoadBalancer,意味着它通常会通过云服务商(或本地的MetalLB等)获得一个对外的、可访问的IP地址。 该Service的核心作用是负载均衡。它通过 Selector(选择器)来寻找带有特定labels(标签)的Pod。从图中看,它的Selector是: app.kubernetes.io/instance=traefik-server-default app.kubernetes.io/name=traefik 3.Kube-proxy (kube-ipvs0) 当请求到达Service的虚拟IP后,Kubernetes内部的网络组件 kube-proxy 会接管。 图中每个节点(master231, woker232, woker233)上都有一个 kube-ipvs0 的组件,这表明该集群使用IPVS模式进行服务路由。 kube-proxy 会根据Service的Selector找到所有匹配的Pod(也就是Traefik的Pod),然后通过IPVS规则,将请求从Service的虚拟IP高效地转发到其中一个健康的Traefik Pod的IP地址和端口上。图中箭头从Service指向了所有节点的kube-ipvs0,形象地说明了这是一个集群范围内的转发规则。 4.Traefik Pod (入口控制器) 请求最终被路由到了一个正在运行的Traefik Pod上(在图中,这个Pod恰好被调度在了woker233节点上)。 这个Pod内的traefik proxy进程接收到请求。 5.Traefik内部处理与转发 EntryPoint (入口点): 请求首先进入Traefik配置好的一个EntryPoint(例如,web入口通常是80端口,websecure是443端口)。 Router (路由匹配): Traefik会检查请求的特征(如域名、路径、Header等),并与它内部的路由规则(在Traefik v1中称为Frontends,v2+中称为Routers)进行匹配。 Middleware (中间件): 如果匹配的路由上配置了中间件(Middleware),例如鉴权、限流、Header修改等,Traefik会在这里执行这些操作。 Service & Backend (后端服务): 一旦路由匹配成功并通过了所有中间件,Traefik就会将请求转发给该路由所关联的后端服务(Backends/Services)。这个后端服务就是你在Ingress或IngressRoute中定义的Kubernetes Service,最终指向了你的业务应用Pod。 6.最终应用Pod (web01, web02) Traefik将请求转发给目标应用Pod(如图中的 web01 或 web02)。这些Pod处理请求并返回响应。响应会沿着原路返回给客户端。 二、控制平面:Traefik的“大脑” 这个流程解释了Traefik是如何动态地、自动地创建和更新其内部的路由规则的。 1.用户定义路由规则 开发者或运维人员通过编写YAML文件,来定义路由规则。这些规则可以是标准的Kubernetes Ingress资源,也可以是Traefik自己的CRD(Custom Resource Definition),如图中列出的: IngressRoute (推荐,功能更强) IngressRouteTCP / IngressRouteUDP (用于TCP/UDP流量) Middleware (定义中间件) TraefikService (高级负载均衡配置) 2.提交到K8s API Server 用户使用 kubectl apply -f 等命令将这些YAML文件提交到Kubernetes集群的api-server。 api-server会对这些资源进行验证,然后将它们持久化存储在etcd中。 3.Traefik自动发现 (Auto Discovery) Traefik Pod在启动时被配置为Kubernetes Ingress Controller。 它会持续地“监听”(Watch)api-server上特定资源的变化(就是第1步中提到的那些Ingress、IngressRoute等资源)。 如图中从api-server指向traefik proxy的箭头所示,Traefik正在从api-server获取配置信息。 4.动态更新路由 当Traefik监听到有新的IngressRoute被创建,或者现有的被修改/删除时,它会立即读取这些资源的内容。 根据这些内容,Traefik会在内存中动态地、实时地更新自己的路由配置(EntryPoints, Routers, Services等)。这个过程不需要重启Traefik Pod。
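补充:上述数据平面中 kube-proxy(IPVS 模式)的转发规则可以直接在节点上观察到,下面给出一个示例(VIP 以 traefik-server 实际的 CLUSTER-IP 为准,本文环境为 10.200.11.72):
ipvsadm -Ln | grep -A 3 '10.200.11.72:80 '   # 真实后端应是 Traefik Pod 的 IP:8000
kubectl get ep traefik-server                # 与 Endpoints 对照验证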

33、二进制安装k8s


1、K8S硬件环境准备
LVM卷扩展
bash
- LVM卷扩展
lvextend /dev/mapper/ubuntu--vg-ubuntu--lv -l +100%FREE
resize2fs /dev/mapper/ubuntu--vg-ubuntu--lv
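说明:resize2fs 仅适用于 ext2/3/4 文件系统;如果根逻辑卷使用的是 xfs,扩容后应改用 xfs_growfs(示例如下,挂载点以实际为准):
xfs_growfs /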
K8S集群各主机基础优化
bash
1.所有节点安装常用的软件包 apt update && apt -y install bind9-utils expect rsync jq psmisc net-tools lvm2 vim unzip rename tree 2.k8s-cluster241节点免密钥登录集群并同步数据 [root@k8s-cluster241 ~]# cat >> /etc/hosts <<'EOF' 10.0.0.240 apiserver-lb 10.1.12.3 k8s-cluster241 10.1.12.4 k8s-cluster242 10.1.12.15 k8s-cluster243 EOF 3.配置免密码登录其他节点 [root@k8s-cluster241 ~]# cat > password_free_login.sh <<'EOF' #!/bin/bash # auther: Jason Yin # 创建密钥对 ssh-keygen -t rsa -P "" -f /root/.ssh/id_rsa -q # 声明你服务器密码,建议所有节点的密码均一致,否则该脚本需要再次进行优化 export mypasswd=1 # 定义主机列表 k8s_host_list=(k8s-cluster241 k8s-cluster242 k8s-cluster243) # 配置免密登录,利用expect工具免交互输入 for i in ${k8s_host_list[@]};do expect -c " spawn ssh-copy-id -i /root/.ssh/id_rsa.pub root@$i expect { \"*yes/no*\" {send \"yes\r\"; exp_continue} \"*password*\" {send \"$mypasswd\r\"; exp_continue} }" done EOF bash password_free_login.sh 4.编写同步脚本 [root@k8s-cluster241 ~]# cat > /usr/local/sbin/data_rsync.sh <<'EOF' #!/bin/bash # Auther: Jason Yin if [ $# -lt 1 ];then echo "Usage: $0 /path/to/file(绝对路径) [mode: m|w]" exit fi if [ ! -e $1 ];then echo "[ $1 ] dir or file not find!" exit fi fullpath=`dirname $1` basename=`basename $1` cd $fullpath case $2 in WORKER_NODE|w) K8S_NODE=(k8s-cluster242 k8s-cluster243) ;; MASTER_NODE|m) K8S_NODE=(k8s-cluster242 k8s-cluster243) ;; *) K8S_NODE=(k8s-cluster242 k8s-cluster243) ;; esac for host in ${K8S_NODE[@]};do tput setaf 2 echo ===== rsyncing ${host}: $basename ===== tput setaf 7 rsync -az $basename `whoami`@${host}:$fullpath if [ $? -eq 0 ];then echo "命令执行成功!" fi done EOF chmod +x /usr/local/sbin/data_rsync.sh data_rsync.sh /etc/hosts 5.所有节点Linux基础环境优化 systemctl disable --now NetworkManager ufw swapoff -a && sysctl -w vm.swappiness=0 sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab ln -svf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime cat >> /etc/security/limits.conf <<'EOF' * soft nofile 655360 * hard nofile 131072 * soft nproc 655350 * hard nproc 655350 * soft memlock unlimited * hard memlock unlimited EOF sed -i 's@#UseDNS yes@UseDNS no@g' /etc/ssh/sshd_config sed -i 's@^GSSAPIAuthentication yes@GSSAPIAuthentication no@g' /etc/ssh/sshd_config cat > /etc/sysctl.d/k8s.conf <<'EOF' # 以下3个参数是containerd所依赖的内核参数 net.ipv4.ip_forward = 1 net.bridge.bridge-nf-call-iptables = 1 net.bridge.bridge-nf-call-ip6tables = 1 net.ipv6.conf.all.disable_ipv6 = 1 fs.may_detach_mounts = 1 vm.overcommit_memory=1 vm.panic_on_oom=0 fs.inotify.max_user_watches=89100 fs.file-max=52706963 fs.nr_open=52706963 net.netfilter.nf_conntrack_max=2310720 net.ipv4.tcp_keepalive_time = 600 net.ipv4.tcp_keepalive_probes = 3 net.ipv4.tcp_keepalive_intvl =15 net.ipv4.tcp_max_tw_buckets = 36000 net.ipv4.tcp_tw_reuse = 1 net.ipv4.tcp_max_orphans = 327680 net.ipv4.tcp_orphan_retries = 3 net.ipv4.tcp_syncookies = 1 net.ipv4.tcp_max_syn_backlog = 16384 net.ipv4.ip_conntrack_max = 65536 net.ipv4.tcp_max_syn_backlog = 16384 net.ipv4.tcp_timestamps = 0 net.core.somaxconn = 16384 EOF sysctl --system cat <<EOF >> ~/.bashrc PS1='[\[\e[34;1m\]\u@\[\e[0m\]\[\e[32;1m\]\H\[\e[0m\]\[\e[31;1m\] \W\[\e[0m\]]# ' EOF source ~/.bashrc free -h 6.所有节点安装ipvsadm以实现kube-proxy的负载均衡 apt -y install ipvsadm ipset sysstat conntrack cat > /etc/modules-load.d/ipvs.conf << 'EOF' ip_vs ip_vs_lc ip_vs_wlc ip_vs_rr ip_vs_wrr ip_vs_lblc ip_vs_lblcr ip_vs_dh ip_vs_sh ip_vs_fo ip_vs_nq ip_vs_sed ip_vs_ftp ip_vs_sh nf_conntrack br_netfilter ip_tables ip_set xt_set ipt_set ipt_rpfilter ipt_REJECT ipip EOF 7.重复所有节点并验证模块是否加载成功 reboot lsmod | grep --color=auto -e ip_vs -e nf_conntrack -e br_netfilter uname -r ifconfig free -h 参考示例: 
[root@k8s-cluster241 ~]# lsmod | grep --color=auto -e ip_vs -e nf_conntrack -e br_netfilter br_netfilter 32768 0 bridge 311296 1 br_netfilter ip_vs_ftp 16384 0 nf_nat 49152 1 ip_vs_ftp ip_vs_sed 16384 0 ip_vs_nq 16384 0 ip_vs_fo 16384 0 ip_vs_sh 16384 0 ip_vs_dh 16384 0 ip_vs_lblcr 16384 0 ip_vs_lblc 16384 0 ip_vs_wrr 16384 0 ip_vs_rr 16384 0 ip_vs_wlc 16384 0 ip_vs_lc 16384 0 ip_vs 176128 25 ip_vs_wlc,ip_vs_rr,ip_vs_dh,ip_vs_lblcr,ip_vs_sh,ip_vs_fo,ip_vs_nq,ip_vs_lblc,ip_vs_wrr,ip_v_lc,ip_vs_sed,ip_vs_ftp nf_conntrack 172032 2 nf_nat,ip_vs nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs nf_defrag_ipv4 16384 1 nf_conntrack libcrc32c 16384 5 nf_conntrack,nf_nat,btrfs,raid456,ip_vs [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# free -h total used free shared buff/cache available Mem: 7.7Gi 334Mi 7.1Gi 1.0Mi 340Mi 7.1Gi Swap: 0B 0B 0B [root@k8s-cluster241 ~]# [root@k8s-cluster242 ~]# lsmod | grep --color=auto -e ip_vs -e nf_conntrack -e br_netfilter br_netfilter 32768 0 bridge 311296 1 br_netfilter ip_vs_ftp 16384 0 nf_nat 49152 1 ip_vs_ftp ip_vs_sed 16384 0 ip_vs_nq 16384 0 ip_vs_fo 16384 0 ip_vs_sh 16384 0 ip_vs_dh 16384 0 ip_vs_lblcr 16384 0 ip_vs_lblc 16384 0 ip_vs_wrr 16384 0 ip_vs_rr 16384 0 ip_vs_wlc 16384 0 ip_vs_lc 16384 0 ip_vs 176128 24 ip_vs_wlc,ip_vs_rr,ip_vs_dh,ip_vs_lblcr,ip_vs_sh,ip_vs_fo,ip_vs_nq,ip_vs_lblcip_vs_wrr,ip_vs_lc,ip_vs_sed,ip_vs_ftp nf_conntrack 172032 2 nf_nat,ip_vs nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs nf_defrag_ipv4 16384 1 nf_conntrack libcrc32c 16384 5 nf_conntrack,nf_nat,btrfs,raid456,ip_vs [root@k8s-cluster242 ~]# [root@k8s-cluster242 ~]# free -h total used free shared buff/cache available Mem: 7.7Gi 315Mi 7.1Gi 1.0Mi 336Mi 7.2Gi Swap: 0B 0B 0B [root@k8s-cluster242 ~]# [root@k8s-cluster243 ~]# lsmod | grep --color=auto -e ip_vs -e nf_conntrack ip_vs_ftp 16384 0 nf_nat 49152 1 ip_vs_ftp ip_vs_sed 16384 0 ip_vs_nq 16384 0 ip_vs_fo 16384 0 ip_vs_sh 16384 0 ip_vs_dh 16384 0 ip_vs_lblcr 16384 0 ip_vs_lblc 16384 0 ip_vs_wrr 16384 0 ip_vs_rr 16384 0 ip_vs_wlc 16384 0 ip_vs_lc 16384 0 ip_vs 176128 25 ip_vs_wlc,ip_vs_rr,ip_vs_dh,ip_vs_lblcr,ip_vs_sh,ip_vs_fo,ip_vs_nq,ip_vs_lblc,ip_vs_wrr,ip_vs_lc,ip_v_sed,ip_vs_ftp nf_conntrack 172032 2 nf_nat,ip_vs nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs nf_defrag_ipv4 16384 1 nf_conntrack libcrc32c 16384 5 nf_conntrack,nf_nat,btrfs,raid456,ip_vs [root@k8s-cluster243 ~]# [root@k8s-cluster243 ~]# free -h total used free shared buff/cache available Mem: 7.7Gi 317Mi 7.1Gi 1.0Mi 331Mi 7.2Gi Swap: 0B 0B 0B [root@k8s-cluster243 ~]# 8.关机拍快照 快照名称'k8s操作系统环境准备就绪'
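补充:在关机拍快照之前,还可以顺手确认 k8s.conf 中几个关键内核参数已生效(示例命令,三项返回值均应为 1):
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables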
部署Containerd运行时
bash
1.如果有docker环境,请卸载【可跳过】 ./install-docker.sh r ip link del docker0 2.安装containerd wget http://192.168.21.253/Resources/Docker/Containerd/weixiang-autoinstall-containerd-v1.6.36.tar.gz tar xf weixiang-autoinstall-containerd-v1.6.36.tar.gz ./install-containerd.sh i 3.检查Containerd的版本 [root@k8s-cluster241 ~]# ctr version Client: Version: v1.6.36 Revision: 88c3d9bc5b5a193f40b7c14fa996d23532d6f956 Go version: go1.22.7 Server: Version: v1.6.36 Revision: 88c3d9bc5b5a193f40b7c14fa996d23532d6f956 UUID: 02339237-7847-4564-9733-1e6ac9618a33 [root@k8s-cluster241 ~]# [root@k8s-cluster242 ~]# ctr version Client: Version: v1.6.36 Revision: 88c3d9bc5b5a193f40b7c14fa996d23532d6f956 Go version: go1.22.7 Server: Version: v1.6.36 Revision: 88c3d9bc5b5a193f40b7c14fa996d23532d6f956 UUID: f9ba2325-9146-4217-b2a8-1e733fa7facf [root@k8s-cluster242 ~]# [root@k8s-cluster243 ~]# ctr version Client: Version: v1.6.36 Revision: 88c3d9bc5b5a193f40b7c14fa996d23532d6f956 Go version: go1.22.7 Server: Version: v1.6.36 Revision: 88c3d9bc5b5a193f40b7c14fa996d23532d6f956 UUID: 1c95864f-3aa1-412e-a0e7-dbfc62557c78
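补充:K8S 是通过 CRI 接口调用 containerd 的,除了 ctr version,还可以确认 cri 插件处于正常状态(示例命令;crictl 若未随安装脚本提供则需单独安装,套接字路径以实际配置为准):
systemctl is-active containerd
ctr plugins ls | grep cri        # io.containerd.grpc.v1.cri 的 STATUS 应为 ok
crictl --runtime-endpoint unix:///run/containerd/containerd.sock info | head -n 20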
etcd高可用集群部署
bash
1 下载etcd的软件包 wget https://github.com/etcd-io/etcd/releases/download/v3.5.22/etcd-v3.5.22-linux-amd64.tar.gz svip: [root@k8s-cluster241 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/Etcd/etcd-v3.5.22-linux-amd64.tar.gz 2 解压etcd的二进制程序包到PATH环境变量路径 [root@k8s-cluster241 ~]# tar -xf etcd-v3.5.22-linux-amd64.tar.gz -C /usr/local/bin etcd-v3.5.22-linux-amd64/etcd{,ctl} --strip-components=1 [root@k8s-cluster241 ~]# ll /usr/local/bin/etcd* -rwxr-xr-x 1 yinzhengjie yinzhengjie 24072344 Mar 28 06:58 /usr/local/bin/etcd* -rwxr-xr-x 1 yinzhengjie yinzhengjie 18419864 Mar 28 06:58 /usr/local/bin/etcdctl* [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# etcdctl version etcdctl version: 3.5.22 API version: 3.5 [root@k8s-cluster241 ~]# 3 将软件包下发到所有节点 [root@k8s-cluster241 ~]# scp /usr/local/bin/etcd* k8s-cluster242:/usr/local/bin [root@k8s-cluster241 ~]# scp /usr/local/bin/etcd* k8s-cluster243:/usr/local/bin 4.准备etcd的证书文件 4.1 安装cfssl证书管理工具 [root@k8s-cluster241 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/Etcd/weixiang-cfssl-v1.6.5.zip [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# unzip weixiang-cfssl-v1.6.5.zip Archive: weixiang-cfssl-v1.6.5.zip inflating: cfssl-certinfo_1.6.5_linux_amd64 inflating: cfssljson_1.6.5_linux_amd64 inflating: cfssl_1.6.5_linux_amd64 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# rename -v "s/_1.6.5_linux_amd64//g" cfssl* cfssl_1.6.5_linux_amd64 renamed as cfssl cfssl-certinfo_1.6.5_linux_amd64 renamed as cfssl-certinfo cfssljson_1.6.5_linux_amd64 renamed as cfssljson [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# mv cfssl* /usr/local/bin/ [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# chmod +x /usr/local/bin/cfssl* [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# ll /usr/local/bin/cfssl* -rwxr-xr-x 1 root root 11890840 Jun 15 2024 /usr/local/bin/cfssl* -rwxr-xr-x 1 root root 8413336 Jun 15 2024 /usr/local/bin/cfssl-certinfo* -rwxr-xr-x 1 root root 6205592 Jun 15 2024 /usr/local/bin/cfssljson* [root@k8s-cluster241 ~]# 4.2 创建证书存储目录 [root@k8s-cluster241 ~]# mkdir -pv /weixiang/{certs,pki}/etcd mkdir: created directory '/weixiang' mkdir: created directory '/weixiang/certs' mkdir: created directory '/weixiang/certs/etcd' mkdir: created directory '/weixiang/pki' mkdir: created directory '/weixiang/pki/etcd' [root@k8s-cluster241 ~]# 4.3 生成证书的CSR文件: 证书签发请求文件,配置了一些域名,公司,单位 [root@k8s-cluster241 ~]# cd /weixiang/pki/etcd [root@k8s-cluster241 etcd]# [root@k8s-cluster241 etcd]# cat > etcd-ca-csr.json <<EOF { "CN": "etcd", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "etcd", "OU": "Etcd Security" } ], "ca": { "expiry": "876000h" } } EOF 4.4 生成etcd CA证书和CA证书的key [root@k8s-cluster241 etcd]# cfssl gencert -initca etcd-ca-csr.json | cfssljson -bare /weixiang/certs/etcd/etcd-ca 2025/06/13 16:15:00 [INFO] generating a new CA key and certificate from CSR 2025/06/13 16:15:00 [INFO] generate received request 2025/06/13 16:15:00 [INFO] received CSR 2025/06/13 16:15:00 [INFO] generating key: rsa-2048 2025/06/13 16:15:00 [INFO] encoded CSR 2025/06/13 16:15:00 [INFO] signed certificate with serial number 469498848898639707673864924355340868165928121853 [root@k8s-cluster241 etcd]# [root@k8s-cluster241 etcd]# ll /weixiang/certs/etcd/etcd-ca* -rw-r--r-- 1 root root 1050 Jun 13 16:15 /weixiang/certs/etcd/etcd-ca.csr -rw------- 1 root root 1679 Jun 13 16:15 /weixiang/certs/etcd/etcd-ca-key.pem -rw-r--r-- 1 root root 1318 Jun 13 16:15 /weixiang/certs/etcd/etcd-ca.pem [root@k8s-cluster241 etcd]# 4.5 
生成etcd证书的有效期为100年 [root@k8s-cluster241 etcd]# cat > ca-config.json <<EOF { "signing": { "default": { "expiry": "876000h" }, "profiles": { "kubernetes": { "usages": [ "signing", "key encipherment", "server auth", "client auth" ], "expiry": "876000h" } } } } EOF 4.6 生成证书的CSR文件: 证书签发请求文件,配置了一些域名,公司,单位 [root@k8s-cluster241 etcd]# cat > etcd-csr.json <<EOF { "CN": "etcd", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "etcd", "OU": "Etcd Security" } ] } EOF 4.7 基于自建的ectd ca证书生成etcd的证书 [root@k8s-cluster241 etcd]# cfssl gencert \ -ca=/weixiang/certs/etcd/etcd-ca.pem \ -ca-key=/weixiang/certs/etcd/etcd-ca-key.pem \ -config=ca-config.json \ --hostname=127.0.0.1,k8s-cluster241,k8s-cluster242,k8s-cluster243,10.1.12.3,10.1.12.4,10.1.12.15 \ --profile=kubernetes \ etcd-csr.json | cfssljson -bare /weixiang/certs/etcd/etcd-server cfssl gencert \ -ca=/weixiang/certs/etcd/etcd-ca.pem \ -ca-key=/weixiang/certs/etcd/etcd-ca-key.pem \ -config=ca-config.json \ --hostname=127.0.0.1,node-exporter41,node-exporter42,k8s-cluster43,10.1.12.15,10.1.12.3,10.1.12.4 \ --profile=kubernetes \ etcd-csr.json | cfssljson -bare /weixiang/certs/etcd/etcd-server [root@k8s-cluster241 etcd]# ll /weixiang/certs/etcd/etcd-server* -rw-r--r-- 1 root root 1139 Jun 13 16:16 /weixiang/certs/etcd/etcd-server.csr -rw------- 1 root root 1679 Jun 13 16:16 /weixiang/certs/etcd/etcd-server-key.pem -rw-r--r-- 1 root root 1472 Jun 13 16:16 /weixiang/certs/etcd/etcd-server.pem [root@k8s-cluster241 etcd]# 4.8 将etcd证书拷贝到其他两个master节点 [root@k8s-cluster241 etcd]# data_rsync.sh /weixiang/certs/ ===== rsyncing k8s-cluster242: certs ===== 命令执行成功! ===== rsyncing k8s-cluster243: certs ===== 命令执行成功! [root@k8s-cluster241 etcd]# [root@k8s-cluster242 ~]# tree /weixiang/certs/etcd/ /weixiang/certs/etcd/ ├── etcd-ca.csr ├── etcd-ca-key.pem ├── etcd-ca.pem ├── etcd-server.csr ├── etcd-server-key.pem └── etcd-server.pem 0 directories, 6 files [root@k8s-cluster242 ~]# [root@k8s-cluster243 ~]# tree /weixiang/certs/etcd/ /weixiang/certs/etcd/ ├── etcd-ca.csr ├── etcd-ca-key.pem ├── etcd-ca.pem ├── etcd-server.csr ├── etcd-server-key.pem └── etcd-server.pem 0 directories, 6 files [root@k8s-cluster243 ~]# 5.创建etcd集群各节点配置文件 5.1 k8s-cluster241节点的配置文件 [root@k8s-cluster241 ~]# mkdir -pv /weixiang/softwares/etcd mkdir: created directory '/weixiang/softwares' mkdir: created directory '/weixiang/softwares/etcd' [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# cat > /weixiang/softwares/etcd/etcd.config.yml <<'EOF' name: 'k8s-cluster241' data-dir: /var/lib/etcd wal-dir: /var/lib/etcd/wal snapshot-count: 5000 heartbeat-interval: 100 election-timeout: 1000 quota-backend-bytes: 0 listen-peer-urls: 'https://10.1.12.3:2380' listen-client-urls: 'https://10.1.12.3:2379,http://127.0.0.1:2379' max-snapshots: 3 max-wals: 5 cors: initial-advertise-peer-urls: 'https://10.1.12.3:2380' advertise-client-urls: 'https://10.1.12.3:2379' discovery: discovery-fallback: 'proxy' discovery-proxy: discovery-srv: initial-cluster: 'k8s-cluster241=https://10.1.12.3:2380,k8s-cluster242=https://10.1.12.4:2380,k8s-cluster243=https://10.1.12.15:2380' initial-cluster-token: 'etcd-k8s-cluster' initial-cluster-state: 'new' strict-reconfig-check: false enable-v2: true enable-pprof: true proxy: 'off' proxy-failure-wait: 5000 proxy-refresh-interval: 30000 proxy-dial-timeout: 1000 proxy-write-timeout: 5000 proxy-read-timeout: 0 client-transport-security: cert-file: '/weixiang/certs/etcd/etcd-server.pem' key-file: '/weixiang/certs/etcd/etcd-server-key.pem' 
client-cert-auth: true trusted-ca-file: '/weixiang/certs/etcd/etcd-ca.pem' auto-tls: true peer-transport-security: cert-file: '/weixiang/certs/etcd/etcd-server.pem' key-file: '/weixiang/certs/etcd/etcd-server-key.pem' peer-client-cert-auth: true trusted-ca-file: '/weixiang/certs/etcd/etcd-ca.pem' auto-tls: true debug: false log-package-levels: log-outputs: [default] force-new-cluster: false EOF 5.2 k8s-cluster242节点的配置文件 [root@k8s-cluster242 ~]# mkdir -pv /weixiang/softwares/etcd mkdir: created directory '/weixiang/softwares' mkdir: created directory '/weixiang/softwares/etcd' [root@k8s-cluster242 ~]# [root@k8s-cluster242 ~]# cat > /weixiang/softwares/etcd/etcd.config.yml <<'EOF' name: 'k8s-cluster242' data-dir: /var/lib/etcd wal-dir: /var/lib/etcd/wal snapshot-count: 5000 heartbeat-interval: 100 election-timeout: 1000 quota-backend-bytes: 0 listen-peer-urls: 'https://10.1.12.4:2380' listen-client-urls: 'https://10.1.12.4:2379,http://127.0.0.1:2379' max-snapshots: 3 max-wals: 5 cors: initial-advertise-peer-urls: 'https://10.1.12.4:2380' advertise-client-urls: 'https://10.1.12.4:2379' discovery: discovery-fallback: 'proxy' discovery-proxy: discovery-srv: initial-cluster: 'k8s-cluster241=https://10.1.12.3:2380,k8s-cluster242=https://10.1.12.4:2380,k8s-cluster243=https://10.1.12.15:2380' initial-cluster-token: 'etcd-k8s-cluster' initial-cluster-state: 'new' strict-reconfig-check: false enable-v2: true enable-pprof: true proxy: 'off' proxy-failure-wait: 5000 proxy-refresh-interval: 30000 proxy-dial-timeout: 1000 proxy-write-timeout: 5000 proxy-read-timeout: 0 client-transport-security: cert-file: '/weixiang/certs/etcd/etcd-server.pem' key-file: '/weixiang/certs/etcd/etcd-server-key.pem' client-cert-auth: true trusted-ca-file: '/weixiang/certs/etcd/etcd-ca.pem' auto-tls: true peer-transport-security: cert-file: '/weixiang/certs/etcd/etcd-server.pem' key-file: '/weixiang/certs/etcd/etcd-server-key.pem' peer-client-cert-auth: true trusted-ca-file: '/weixiang/certs/etcd/etcd-ca.pem' auto-tls: true debug: false log-package-levels: log-outputs: [default] force-new-cluster: false EOF 5.3 k8s-cluster243节点的配置文件 [root@k8s-cluster243 ~]# mkdir -pv /weixiang/softwares/etcd mkdir: created directory '/weixiang/softwares' mkdir: created directory '/weixiang/softwares/etcd' [root@k8s-cluster243 ~]# [root@k8s-cluster243 ~]# cat > /weixiang/softwares/etcd/etcd.config.yml <<'EOF' name: 'k8s-cluster243' data-dir: /var/lib/etcd wal-dir: /var/lib/etcd/wal snapshot-count: 5000 heartbeat-interval: 100 election-timeout: 1000 quota-backend-bytes: 0 listen-peer-urls: 'https://10.1.12.15:2380' listen-client-urls: 'https://10.1.12.15:2379,http://127.0.0.1:2379' max-snapshots: 3 max-wals: 5 cors: initial-advertise-peer-urls: 'https://10.1.12.15:2380' advertise-client-urls: 'https://10.1.12.15:2379' discovery: discovery-fallback: 'proxy' discovery-proxy: discovery-srv: initial-cluster: 'k8s-cluster241=https://10.1.12.3:2380,k8s-cluster242=https://10.1.12.4:2380,k8s-cluster243=https://10.1.12.15:2380' initial-cluster-token: 'etcd-k8s-cluster' initial-cluster-state: 'new' strict-reconfig-check: false enable-v2: true enable-pprof: true proxy: 'off' proxy-failure-wait: 5000 proxy-refresh-interval: 30000 proxy-dial-timeout: 1000 proxy-write-timeout: 5000 proxy-read-timeout: 0 client-transport-security: cert-file: '/weixiang/certs/etcd/etcd-server.pem' key-file: '/weixiang/certs/etcd/etcd-server-key.pem' client-cert-auth: true trusted-ca-file: '/weixiang/certs/etcd/etcd-ca.pem' auto-tls: true peer-transport-security: cert-file: 
'/weixiang/certs/etcd/etcd-server.pem' key-file: '/weixiang/certs/etcd/etcd-server-key.pem' peer-client-cert-auth: true trusted-ca-file: '/weixiang/certs/etcd/etcd-ca.pem' auto-tls: true debug: false log-package-levels: log-outputs: [default] force-new-cluster: false EOF 6.所有节点编写etcd启动脚本 cat > /usr/lib/systemd/system/etcd.service <<'EOF' [Unit] Description=Jason Yins Etcd Service Documentation=https://coreos.com/etcd/docs/latest/ After=network.target [Service] Type=notify ExecStart=/usr/local/bin/etcd --config-file=/weixiang/softwares/etcd/etcd.config.yml Restart=on-failure RestartSec=10 LimitNOFILE=65536 [Install] WantedBy=multi-user.target Alias=etcd3.service EOF 7.所有节点启动etcd集群 systemctl daemon-reload && systemctl enable --now etcd systemctl status etcd 8.查看etcd集群状态 [root@k8s-cluster241 ~]# etcdctl --endpoints="https://10.1.12.3:2379,https://10.1.12.4:2379,https://10.1.12.15:2379" --cacert=/weixiang/certs/etcd/etcd-ca.pem --cert=/weixiang/certs/etcd/etcd-server.pem --key=/weixiang/certs/etcd/etcd-server-key.pem endpoint status --write-out=table +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | https://10.1.12.3:2379 | 566d563f3c9274ed | 3.5.21 | 25 kB | true | false | 2 | 9 | 9 | | | https://10.1.12.4:2379 | b83b69ba7d246b29 | 3.5.21 | 25 kB | false | false | 2 | 9 | 9 | | | https://10.1.12.15:2379 | 47b70f9ecb1f200 | 3.5.21 | 20 kB | false | false | 2 | 9 | 9 | | +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ [root@k8s-cluster241 ~]# 9.验证etcd高可用集群 9.1 停止leader节点 [root@k8s-cluster241 ~]# ss -ntl | egrep "2379|2380" LISTEN 0 16384 127.0.0.1:2379 0.0.0.0:* LISTEN 0 16384 10.1.12.3:2379 0.0.0.0:* LISTEN 0 16384 10.1.12.3:2380 0.0.0.0:* [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# systemctl stop etcd [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# ss -ntl | egrep "2379|2380" [root@k8s-cluster241 ~]# 9.2 查看现有集群环境,发现新leader诞生 [root@k8s-cluster241 ~]# etcdctl --endpoints="https://10.1.12.3:2379,https://10.1.12.4:2379,https://10.1.12.15:2379" --cacert=/weixiang/certs/etcd/etcd-ca.pem --cert=/weixiang/certs/etcd/etcd-server.pem --key=/weixiang/certs/etcd/etcd-server-key.pem endpoint status --write-out=table {"level":"warn","ts":"2025-06-13T16:24:51.098673+0800","logger":"etcd-client","caller":"v3@v3.5.21/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0002f45a0/10.1.12.3:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 10.1.12.3:2379: connect: connection refused\""} Failed to get the status of endpoint https://10.1.12.3:2379 (context deadline exceeded) +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ 
| https://10.1.12.4:2379 | b83b69ba7d246b29 | 3.5.21 | 25 kB | true | false | 3 | 10 | 10 | | | https://10.1.12.15:2379 | 47b70f9ecb1f200 | 3.5.21 | 20 kB | false | false | 3 | 10 | 10 | | +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# etcdctl --endpoints="https://10.1.12.4:2379,https://10.1.12.15:2379" --cacert=/weixiang/certs/etcd/etcd-ca.pem --cert=/weixiang/certs/etcd/etcd-server.pem --key=/weixiang/certs/etcd/etcd-server-key.pem endpoint status --write-out=table +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | https://10.1.12.4:2379 | b83b69ba7d246b29 | 3.5.21 | 25 kB | true | false | 3 | 10 | 10 | | | https://10.1.12.15:2379 | 47b70f9ecb1f200 | 3.5.21 | 20 kB | false | false | 3 | 10 | 10 | | +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ [root@k8s-cluster241 ~]# 9.3 再将之前的leader起来 [root@k8s-cluster241 ~]# ss -ntl | egrep "2379|2380" [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# systemctl start etcd [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# ss -ntl | egrep "2379|2380" LISTEN 0 16384 127.0.0.1:2379 0.0.0.0:* LISTEN 0 16384 10.1.12.3:2379 0.0.0.0:* LISTEN 0 16384 10.1.12.3:2380 0.0.0.0:* [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# etcdctl --endpoints="https://10.1.12.3:2379,https://10.1.12.4:2379,https://10.1.12.15:2379" --cacert=/weixiang/certs/etcd/etcd-ca.pem --cert=/weixiang/certs/etcd/etcd-server.pem --key=/weixiang/certs/etcd/etcd-server-key.pem endpoint status --write-out=table +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | https://10.1.12.3:2379 | 566d563f3c9274ed | 3.5.21 | 25 kB | false | false | 3 | 11 | 11 | | | https://10.1.12.4:2379 | b83b69ba7d246b29 | 3.5.21 | 25 kB | true | false | 3 | 11 | 11 | | | https://10.1.12.15:2379 | 47b70f9ecb1f200 | 3.5.21 | 20 kB | false | false | 3 | 11 | 11 | | +-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ [root@k8s-cluster241 ~]# 10.添加别名 10.1 添加别名 [root@k8s-cluster241 ~]# vim .bashrc ... alias etcdctl='etcdctl --endpoints="10.1.12.3:2379,10.1.12.4:2379,10.1.12.15:2379" --cacert=/weixiang/certs/etcd/etcd-ca.pem --cert=/weixiang/certs/etcd/etcd-server.pem --key=/weixiang/certs/etcd/etcd-server-key.pem ' ... 
[root@k8s-cluster241 ~]# source .bashrc [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# etcdctl endpoint status --write-out=table +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | 10.1.12.3:2379 | 566d563f3c9274ed | 3.5.21 | 25 kB | false | false | 3 | 11 | 11 | | | 10.1.12.4:2379 | b83b69ba7d246b29 | 3.5.21 | 25 kB | true | false | 3 | 11 | 11 | | | 10.1.12.15:2379 | 47b70f9ecb1f200 | 3.5.21 | 20 kB | false | false | 3 | 11 | 11 | | +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# data_rsync.sh .bashrc ===== rsyncing k8s-cluster242: .bashrc ===== 命令执行成功! ===== rsyncing k8s-cluster243: .bashrc ===== 命令执行成功! [root@k8s-cluster241 ~]# 10.2 测试验证 【需要断开重连】 [root@k8s-cluster242 ~]# source ~/.bashrc [root@k8s-cluster242 ~]# [root@k8s-cluster242 ~]# etcdctl endpoint status --write-out=table +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | 10.1.12.3:2379 | 566d563f3c9274ed | 3.5.21 | 25 kB | false | false | 3 | 11 | 11 | | | 10.1.12.4:2379 | b83b69ba7d246b29 | 3.5.21 | 25 kB | true | false | 3 | 11 | 11 | | | 10.1.12.15:2379 | 47b70f9ecb1f200 | 3.5.21 | 20 kB | false | false | 3 | 11 | 11 | | +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ [root@k8s-cluster242 ~]# [root@k8s-cluster243 ~]# source ~/.bashrc [root@k8s-cluster243 ~]# [root@k8s-cluster243 ~]# etcdctl endpoint status --write-out=table +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | 10.1.12.3:2379 | 566d563f3c9274ed | 3.5.21 | 25 kB | false | false | 3 | 11 | 11 | | | 10.1.12.4:2379 | b83b69ba7d246b29 | 3.5.21 | 25 kB | true | false | 3 | 11 | 11 | | | 10.1.12.15:2379 | 47b70f9ecb1f200 | 3.5.21 | 20 kB | false | false | 3 | 11 | 11 | | +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ [root@k8s-cluster243 ~]# 11.关机拍快照 快照名称: 'etcd环境准备就绪'。 温馨提示: 拍快照之前先重启测试下效果,观察etcd集群是否可用。然后在关机拍快照。 [root@k8s-cluster241 ~]# etcdctl endpoint status --write-out=table +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | 
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | 10.1.12.3:2379 | 566d563f3c9274ed | 3.5.21 | 25 kB | true | false | 5 | 16 | 16 | | | 10.1.12.4:2379 | b83b69ba7d246b29 | 3.5.21 | 25 kB | false | false | 5 | 16 | 16 | | | 10.1.12.15:2379 | 47b70f9ecb1f200 | 3.5.21 | 20 kB | false | false | 5 | 16 | 16 | | +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ [root@k8s-cluster241 ~]#
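补充:日常巡检时可以把上面的检查命令组合成一个小脚本。下面是一个仅供参考的示例草稿(端点与证书路径沿用上文,请按实际环境修改),用于快速查看各节点健康状态及leader分布:

```bash
#!/bin/bash
# etcd_health.sh: 依次检查各etcd节点的健康状态(示例脚本,路径为假设值)
ENDPOINTS="https://10.1.12.3:2379 https://10.1.12.4:2379 https://10.1.12.15:2379"
CERTS="--cacert=/weixiang/certs/etcd/etcd-ca.pem --cert=/weixiang/certs/etcd/etcd-server.pem --key=/weixiang/certs/etcd/etcd-server-key.pem"

# 逐个节点检查健康状态及响应耗时
for ep in ${ENDPOINTS}; do
    etcdctl --endpoints="${ep}" ${CERTS} endpoint health
done

# 汇总查看整个集群的状态表(leader、版本、DB大小等)
etcdctl --endpoints="${ENDPOINTS// /,}" ${CERTS} endpoint status --write-out=table
```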
etcd的基本使用
bash
1.etcd基础操作概述
etcd的操作和zookeeper、Redis的操作类似,存储的数据都是键值对。

2.etcd增删改查基础操作
2.1 写入数据,KEY为school,value为weixiang
# 向 etcd 数据库中写入一条数据,其中键(Key)是 school,值(Value)是 weixiang
[root@k8s-cluster241 ~]# etcdctl put school weixiang
OK
[root@k8s-cluster241 ~]#
[root@k8s-cluster241 ~]# etcdctl put /class weixiang98
OK
[root@k8s-cluster241 ~]#
[root@k8s-cluster241 ~]# etcdctl put classroom 教室1
OK
[root@k8s-cluster241 ~]#
[root@k8s-cluster241 ~]# etcdctl put /etc/hosts 10.0.0.141 ceph141
OK
[root@k8s-cluster241 ~]#
[root@k8s-cluster241 ~]# etcdctl put /weixiang/docker/registry/harbor 企业级镜像仓库
OK
[root@k8s-cluster241 ~]#

2.2 查看数据
# 会同时打印出键和值,各占一行
[root@k8s-cluster241 ~]# etcdctl get school
school
weixiang

# 只获取键 school 的键本身
[root@k8s-cluster241 ~]# etcdctl get school --keys-only
school

# 只获取键 school 的值
[root@k8s-cluster241 ~]# etcdctl get school --print-value-only
weixiang

# 查询以 / 这个键为起点的键名
[root@k8s-cluster241 ~]# etcdctl get / --prefix --keys-only
/class
/etc/hosts
/weixiang/docker/registry/harbor

# 打印出所有以 / 开头的键所对应的值。
[root@k8s-cluster241 ~]# etcdctl get / --prefix --print-value-only
weixiang98
10.0.0.141
企业级镜像仓库

# 会匹配 etcd 中的所有键
[root@k8s-cluster241 ~]# etcdctl get "" --prefix --keys-only
/class
/etc/hosts
/weixiang/docker/registry/harbor
classroom
school

# 查找所有的值
[root@k8s-cluster241 ~]# etcdctl get "" --prefix --print-value-only
weixiang98
10.0.0.141
企业级镜像仓库
教室1
weixiang

# 把所有键和它们对应的值都打印出来
[root@k8s-cluster241 ~]# etcdctl get "" --prefix
/class
weixiang98
/etc/hosts
10.0.0.141
/weixiang/docker/registry/harbor
企业级镜像仓库
classroom
教室1
school
weixiang

2.3 修改数据
[root@k8s-cluster241 ~]# etcdctl get school --print-value-only
weixiang
[root@k8s-cluster241 ~]#
# 修改school的值为IT
[root@k8s-cluster241 ~]# etcdctl put school IT
OK
[root@k8s-cluster241 ~]#
[root@k8s-cluster241 ~]# etcdctl get school --print-value-only
IT
[root@k8s-cluster241 ~]#

2.4 删除数据
[root@k8s-cluster241 ~]# etcdctl get "" --prefix --keys-only
/class
/etc/hosts
/weixiang/docker/registry/harbor
classroom
school
[root@k8s-cluster241 ~]#
[root@k8s-cluster241 ~]# etcdctl del school
1
[root@k8s-cluster241 ~]#
[root@k8s-cluster241 ~]# etcdctl del / --prefix
3
[root@k8s-cluster241 ~]#
[root@k8s-cluster241 ~]# etcdctl get "" --prefix --keys-only
classroom
[root@k8s-cluster241 ~]#
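补充:如果希望在做实验(例如下一节的备份恢复)前后快速比对数据,可以参考下面的导出脚本草稿(输出文件名为假设值),利用etcdctl自带的json输出把当前全部键值落盘:

```bash
#!/bin/bash
# dump_etcd_kv.sh: 将etcd中全部键值以JSON格式导出到文件(示例脚本)
OUTPUT="/tmp/etcd-kv-$(date +%F-%H%M%S).json"

# --prefix匹配所有键,-w json以JSON格式输出(键和值均为base64编码)
etcdctl get "" --prefix -w json > "${OUTPUT}"

echo "已导出到 ${OUTPUT},共 $(grep -o '"key"' "${OUTPUT}" | wc -l) 个键"
```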
etcd集群数据备份和恢复
bash
推荐阅读: https://etcd.io/docs/v3.5/op-guide/recovery/ https://etcd.io/docs/v3.5/op-guide/ https://etcd.io/docs/v3.5/learning/ https://etcd.io/docs/v3.5/upgrades/ 1 准备测试数据【数据随机创建即可,用于模拟备份环节】 [root@k8s-cluster241 ~]# etcdctl get "" --prefix classroom 教室1 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# etcdctl put /weixiang/weixiang98 嘻嘻 OK [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# etcdctl put /weixiang/linux99 哈哈 OK [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# etcdctl get "" --prefix /weixiang/weixiang98 嘻嘻 /weixiang/linux99 哈哈 classroom 教室1 [root@k8s-cluster241 ~]# 2 创建快照用于备份数据 [root@k8s-cluster241 ~]# \etcdctl snapshot save /tmp/weixiang-etcd-`date +%F`.backup {"level":"info","ts":"2025-07-31T14:32:49.948793+0800","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/tmp/weixiang-etcd-2025-07-31.backup.part"} {"level":"info","ts":"2025-07-31T14:32:49.949734+0800","logger":"client","caller":"v3@v3.5.22/maintenance.go:212","msg":"opened snapshot stream; downloading"} {"level":"info","ts":"2025-07-31T14:32:49.949764+0800","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"127.0.0.1:2379"} {"level":"info","ts":"2025-07-31T14:32:49.952514+0800","logger":"client","caller":"v3@v3.5.22/maintenance.go:220","msg":"completed snapshot read; closing"} {"level":"info","ts":"2025-07-31T14:32:49.953291+0800","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"127.0.0.1:2379","size":"20 kB","took":"now"} {"level":"info","ts":"2025-07-31T14:32:49.953364+0800","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/tmp/weixiang-etcd-2025-07-31.backup"} Snapshot saved at /tmp/weixiang-etcd-2025-07-31.backup [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# ll /tmp/weixiang-etcd-`date +%F`.backup -rw------- 1 root root 20512 Jul 31 14:32 /tmp/weixiang-etcd-2025-07-31.backup [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# file /tmp/weixiang-etcd-`date +%F`.backup /tmp/weixiang-etcd-2025-07-31.backup: data [root@k8s-cluster241 ~]# 3.查看快照的基本信息 [root@k8s-cluster241 ~]# etcdctl snapshot status /tmp/weixiang-etcd-`date +%F`.backup -w table # 查看备份快照的状态 Deprecated: Use `etcdutl snapshot status` instead. +---------+----------+------------+------------+ | HASH | REVISION | TOTAL KEYS | TOTAL SIZE | +---------+----------+------------+------------+ | e546d7a | 11 | 20 | 20 kB | +---------+----------+------------+------------+ [root@k8s-cluster241 ~]# 4.将快照拷贝到其他两个集群节点 [root@k8s-cluster241 ~]# scp /tmp/weixiang-etcd-`date +%F`.backup k8s-cluster242:/tmp [root@k8s-cluster241 ~]# scp /tmp/weixiang-etcd-`date +%F`.backup k8s-cluster243:/tmp 5 删除所有数据【搞破坏】 [root@k8s-cluster241 ~]# etcdctl get "" --prefix /weixiang/weixiang98 嘻嘻 /weixiang/linux99 哈哈 classroom 教室1 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# etcdctl del "" --prefix 3 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# etcdctl get "" --prefix [root@k8s-cluster241 ~]# 6 停止etcd集群 [root@k8s-cluster241 ~]# systemctl stop etcd [root@k8s-cluster242 ~]# systemctl stop etcd [root@k8s-cluster243 ~]# systemctl stop etcd 7.各节点恢复数据 【恢复的数据目录必须为空】 [root@k8s-cluster241 ~]# etcdctl snapshot restore /tmp/weixiang-etcd-`date +%F`.backup --data-dir=/var/lib/etcd-2025 Deprecated: Use `etcdutl snapshot restore` instead. 
2025-07-31T14:37:26+08:00 info snapshot/v3_snapshot.go:265 restoring snapshot {"path": "/tmp/weixiang-etcd-2025-07-31.backup", "wal-dir": "/var/lib/etcd-2025/member/wal", "data-dir": "/var/lib/etcd-2025", "snap-dir": "/var/lib/etcd-2025/member/snap", "initial-memory-map-size": 0} 2025-07-31T14:37:26+08:00 info membership/store.go:138 Trimming membership information from the backend... 2025-07-31T14:37:26+08:00 info membership/cluster.go:421 added member {"cluster-id": "cdf818194e3a8c32", "local-member-id": "0", "added-peer-id": "8e9e05c52164694d", "added-peer-peer-urls": ["http://localhost:2380"], "added-peer-is-learner": false} 2025-07-31T14:37:26+08:00 info snapshot/v3_snapshot.go:293 restored snapshot {"path": "/tmp/weixiang-etcd-2025-07-31.backup", "wal-dir": "/var/lib/etcd-2025/member/wal", "data-dir": "/var/lib/etcd-2025", "snap-dir": "/var/lib/etcd-2025/member/snap", "initial-memory-map-size": 0} [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# ll /var/lib/etcd-2025 total 12 drwx------ 3 root root 4096 Jul 31 14:37 ./ drwxr-xr-x 63 root root 4096 Jul 31 14:37 ../ drwx------ 4 root root 4096 Jul 31 14:37 member/ [root@k8s-cluster241 ~]# [root@k8s-cluster242 ~]# etcdctl snapshot restore /tmp/weixiang-etcd-`date +%F`.backup --data-dir=/var/lib/etcd-2025 Deprecated: Use `etcdutl snapshot restore` instead. 2025-07-31T14:37:53+08:00 info snapshot/v3_snapshot.go:265 restoring snapshot {"path": "/tmp/weixiang-etcd-2025-07-31.backup", "wal-dir": "/var/lib/etcd-2025/member/wal", "data-dir": "/var/lib/etcd-2025", "snap-dir": "/var/lib/etcd-2025/member/snap", "initial-memory-map-size": 0} 2025-07-31T14:37:53+08:00 info membership/store.go:138 Trimming membership information from the backend... 2025-07-31T14:37:53+08:00 info membership/cluster.go:421 added member {"cluster-id": "cdf818194e3a8c32", "local-member-id": "0", "added-peer-id": "8e9e05c52164694d", "added-peer-peer-urls": ["http://localhost:2380"], "added-peer-is-learner": false} 2025-07-31T14:37:53+08:00 info snapshot/v3_snapshot.go:293 restored snapshot {"path": "/tmp/weixiang-etcd-2025-07-31.backup", "wal-dir": "/var/lib/etcd-2025/member/wal", "data-dir": "/var/lib/etcd-2025", "snap-dir": "/var/lib/etcd-2025/member/snap", "initial-memory-map-size": 0} [root@k8s-cluster242 ~]# [root@k8s-cluster242 ~]# ll /var/lib/etcd-2025 total 12 drwx------ 3 root root 4096 Jul 31 14:37 ./ drwxr-xr-x 63 root root 4096 Jul 31 14:37 ../ drwx------ 4 root root 4096 Jul 31 14:37 member/ [root@k8s-cluster242 ~]# [root@k8s-cluster243 ~]# etcdctl snapshot restore /tmp/weixiang-etcd-`date +%F`.backup --data-dir=/var/lib/etcd-2025 Deprecated: Use `etcdutl snapshot restore` instead. 2025-07-31T14:38:11+08:00 info snapshot/v3_snapshot.go:265 restoring snapshot {"path": "/tmp/weixiang-etcd-2025-07-31.backup", "wal-dir": "/var/lib/etcd-2025/member/wal", "data-dir": "/var/lib/etcd-2025", "snap-dir": "/var/lib/etcd-2025/member/snap", "initial-memory-map-size": 0} 2025-07-31T14:38:11+08:00 info membership/store.go:138 Trimming membership information from the backend... 
2025-07-31T14:38:11+08:00 info membership/cluster.go:421 added member {"cluster-id": "cdf818194e3a8c32", "local-member-id": "0", "added-peer-id": "8e9e05c52164694d", "added-peer-peer-urls": ["http://localhost:2380"], "added-peer-is-learner": false} 2025-07-31T14:38:11+08:00 info snapshot/v3_snapshot.go:293 restored snapshot {"path": "/tmp/weixiang-etcd-2025-07-31.backup", "wal-dir": "/var/lib/etcd-2025/member/wal", "data-dir": "/var/lib/etcd-2025", "snap-dir": "/var/lib/etcd-2025/member/snap", "initial-memory-map-size": 0} [root@k8s-cluster243 ~]# [root@k8s-cluster243 ~]# ll /var/lib/etcd-2025 total 12 drwx------ 3 root root 4096 Jul 31 14:38 ./ drwxr-xr-x 63 root root 4096 Jul 31 14:38 ../ drwx------ 4 root root 4096 Jul 31 14:38 member/ [root@k8s-cluster243 ~]# 6 将恢复后的数据目录作为新的数据目录 [root@k8s-cluster241 ~]# grep "/var/lib/etcd" /weixiang/softwares/etcd/etcd.config.yml data-dir: /var/lib/etcd wal-dir: /var/lib/etcd/wal [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# sed -ri "s#(/var/lib/etcd)#\1-2025#g" /weixiang/softwares/etcd/etcd.config.yml [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# grep "/var/lib/etcd" /weixiang/softwares/etcd/etcd.config.yml data-dir: /var/lib/etcd-2025 wal-dir: /var/lib/etcd-2025/wal [root@k8s-cluster241 ~]# [root@k8s-cluster242 ~]# sed -ri "s#(/var/lib/etcd)#\1-2025#g" /weixiang/softwares/etcd/etcd.config.yml [root@k8s-cluster242 ~]# grep "/var/lib/etcd" /weixiang/softwares/etcd/etcd.config.yml data-dir: /var/lib/etcd-2025 wal-dir: /var/lib/etcd-2025/wal [root@k8s-cluster242 ~]# [root@k8s-cluster243 ~]# sed -ri "s#(/var/lib/etcd)#\1-2025#g" /weixiang/softwares/etcd/etcd.config.yml [root@k8s-cluster243 ~]# grep "/var/lib/etcd" /weixiang/softwares/etcd/etcd.config.yml data-dir: /var/lib/etcd-2025 wal-dir: /var/lib/etcd-2025/wal [root@k8s-cluster243 ~]# 7 启动etcd集群 [root@k8s-cluster241 ~]# systemctl start etcd [root@k8s-cluster242 ~]# systemctl start etcd [root@k8s-cluster243 ~]# systemctl start etcd 8 验证数据是否恢复 [root@k8s-cluster243 ~]# etcdctl get "" --prefix --keys-only /weixiang/weixiang98 /weixiang/linux99 classroom [root@k8s-cluster243 ~]# etcdctl get "" --prefix /weixiang/weixiang98 嘻嘻 /weixiang/linux99 哈哈 classroom 教室1 [root@k8s-cluster243 ~]# 9.测试数据是否可以正常读写 [root@k8s-cluster242 ~]# etcdctl put xixi 哈哈 OK [root@k8s-cluster242 ~]# [root@k8s-cluster242 ~]# etcdctl get "" --prefix --keys-only /weixiang/weixiang98 /weixiang/linux99 classroom xixi [root@k8s-cluster242 ~]#
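补充:生产环境建议把快照备份做成定时任务。下面是一个仅供参考的脚本草稿(备份目录、保留天数均为假设值);注意脚本中不会展开.bashrc里的别名,因此需要显式传入端点与证书参数:

```bash
#!/bin/bash
# etcd_backup.sh: 备份etcd快照并清理过期文件(示例脚本,路径为假设值)
BACKUP_DIR="/weixiang/backup/etcd"
KEEP_DAYS=7
ENDPOINT="https://10.1.12.3:2379"
CERTS="--cacert=/weixiang/certs/etcd/etcd-ca.pem --cert=/weixiang/certs/etcd/etcd-server.pem --key=/weixiang/certs/etcd/etcd-server-key.pem"

mkdir -p "${BACKUP_DIR}"

# 生成带时间戳的快照文件(snapshot save只需指向任意一个健康节点)
etcdctl --endpoints="${ENDPOINT}" ${CERTS} snapshot save "${BACKUP_DIR}/etcd-$(date +%F-%H%M%S).backup"

# 清理超过保留天数的旧快照
find "${BACKUP_DIR}" -name "etcd-*.backup" -mtime +${KEEP_DAYS} -delete
```

可配合crontab定时执行,例如:`0 2 * * * /bin/bash /usr/local/bin/etcd_backup.sh >> /var/log/etcd_backup.log 2>&1`(脚本路径为假设值)。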
etcd-workbench图形化管理etcd集群
bash
参考链接: https://tzfun.github.io/etcd-workbench/ https://github.com/tzfun/etcd-workbench/blob/master/README_ZH.md https://github.com/tzfun/etcd-workbench-web/blob/master/server/src/main/resources/etcd-workbench.conf 1.拉取镜像 [root@k8s-cluster243 ~]# ctr i pull tzfun/etcd-workbench:1.1.4 SVIP: [root@k8s-cluster243 ~]# wget http://192.168.21.253/Resources/Prometheus/images/etcd-workbench/weixiang-etcd-workbench-v1.1.4.tar.gz [root@k8s-cluster243 ~]# ctr ns ls NAME LABELS [root@k8s-cluster243 ~]# [root@k8s-cluster243 ~]# ctr i import weixiang-etcd-workbench-v1.1.4.tar.gz unpacking docker.io/tzfun/etcd-workbench:1.1.4 (sha256:ddfe5e61fc4b54e02e2d7a75d209a6aeb72c8bd0b993de2b2934671371b9b93f)...done [root@k8s-cluster243 ~]# [root@k8s-cluster243 ~]# ctr ns ls NAME LABELS default [root@k8s-cluster243 ~]# [root@k8s-cluster243 ~]# ctr i ls REF TYPE DIGEST SIZE PLATFORMS LABELS docker.io/tzfun/etcd-workbench:1.1.4 application/vnd.docker.distribution.manifest.v2+json sha256:ddfe5e61fc4b54e02e2d7a75d209a6aeb72c8bd0b993de2b2934671371b9b93f 310.2 MiB linux/amd64 - [root@k8s-cluster243 ~]# 2.运行etcd-workbench [root@k8s-cluster243 ~]# ctr container create --net-host docker.io/tzfun/etcd-workbench:1.1.4 etcd-workbench [root@k8s-cluster243 ~]# [root@k8s-cluster243 ~]# ctr c ls CONTAINER IMAGE RUNTIME etcd-workbench docker.io/tzfun/etcd-workbench:1.1.4 io.containerd.runc.v2 [root@k8s-cluster243 ~]# [root@k8s-cluster243 ~]# ctr t start etcd-workbench _____ ____ ____ ____ ____ ____ ____ |_ _||_ _| |_ _||_ \ / _||_ \ / _| | | \ \ / / | \/ | | \/ | _ | | \ \ / / | |\ /| | | |\ /| | | |__' | \ ' / _| |_\/_| |_ _| |_\/_| |_ `.____.' \_/ |_____||_____||_____||_____| Powered by JVMM https://github.com/tzfun/jvmm ' Framework version: 2.4.2 2025-07-31 06:52:28.881+0000 INFO org.beifengtz.etcd.server.EtcdServer Load configuration successfully 2025-07-31 06:52:29.350+0000 INFO org.beifengtz.etcd.server.service.HttpService Please access http://10.0.0.243:8002 2025-07-31 06:52:29.350+0000 INFO org.beifengtz.etcd.server.service.HttpService Http server service started on 8002 in 317 ms 2025-07-31 06:52:29.351+0000 INFO org.beifengtz.etcd.server.EtcdServer Etcd workbench version: 1.1.4 2025-07-31 06:52:29.352+0000 INFO org.beifengtz.etcd.server.EtcdServer Etcd workbench build hash: 6020e3f 3.访问etcd-workbench的webUI http://10.0.0.243:8002 如果基于docker启用认证的话,则可以使用用户名和密码登录即可。 4.拷贝证书到windows系统 cert-file: '/weixiang/certs/etcd/etcd-server.pem' key-file: '/weixiang/certs/etcd/etcd-server-key.pem' trusted-ca-file: '/weixiang/certs/etcd/etcd-ca.pem' 5.上传证书并测试 略,见视频。 彩蛋: 基于Docker部署支持认证功能: 1.准备配置文件 [root@harbor250.weixiang.com ~]# cat > etcd-workbench.conf <<'EOF' [server] # 服务监听的端口 port = 8002 # 链接超时时间 etcdExecuteTimeoutMillis = 3000 # 数据存储目录 dataDir = ./data [auth] # 启用认证功能 enable = true # 指定用户名和密码 user = admin:yinzhengjie [log] # 指定日志的级别 level = INFO # 日志存储目录 file = ./logs # 日志文件的名称 fileName = etcd-workbench # 指定日志的滚动大小 fileLimitSize = 100 # 日志打印的位置 printers = std,file EOF 2.启动容器 [root@harbor250.weixiang.com ~]# docker run -d -v /root/etcd-workbench.conf:/usr/tzfun/etcd-workbench/etcd-workbench.conf --name etcd-workbench --network host tzfun/etcd-workbench:1.1.4 88e4dc60963e92f988a617727e7cf76db3e0d565096859ca63549bed7883fc46 [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# ss -ntl | grep 8002 LISTEN 0 4096 *:8002 *:* [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 88e4dc60963e tzfun/etcd-workbench:1.1.4 "/bin/sh -c 'java …" 9 seconds ago Up 
8 seconds etcd-workbench [root@harbor250.weixiang.com ~]#
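补充:如果更习惯用docker-compose管理,上面的docker run命令也可以改写成如下的compose草稿(仅为等价改写示例,需要宿主机已安装docker compose插件,配置文件挂载路径沿用上文):

```bash
# 生成与上面docker run等价的compose文件(示例,文件位置为假设值)
cat > docker-compose.yaml <<'EOF'
services:
  etcd-workbench:
    image: tzfun/etcd-workbench:1.1.4
    container_name: etcd-workbench
    network_mode: host
    restart: unless-stopped
    volumes:
      - /root/etcd-workbench.conf:/usr/tzfun/etcd-workbench/etcd-workbench.conf
EOF

# 启动并查看状态
docker compose up -d
docker compose ps
```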


2、安装
1、下载程序包
bash
1.下载K8S程序包 wget https://dl.k8s.io/v1.33.3/kubernetes-server-linux-amd64.tar.gz svip: [root@k8s-cluster241 ~]# wget http://192.168.21.253/Resources/Kubernetes/K8S%20Cluster/binary/v1.33/kubernetes-server-linux-amd64.tar.gz 2.解压指定的软件包 [root@k8s-cluster241 ~]# tar xf kubernetes-server-linux-amd64.tar.gz --strip-components=3 -C /usr/local/bin kubernetes/server/bin/kube{let,ctl,-apiserver,-controller-manager,-scheduler,-proxy} [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# ll /usr/local/bin/kube* -rwxr-xr-x 1 root root 97960120 May 15 16:40 /usr/local/bin/kube-apiserver* -rwxr-xr-x 1 root root 90759352 May 15 16:40 /usr/local/bin/kube-controller-manager* -rwxr-xr-x 1 root root 60121272 May 15 16:40 /usr/local/bin/kubectl* -rwxr-xr-x 1 root root 81690916 May 15 16:40 /usr/local/bin/kubelet* -rwxr-xr-x 1 root root 70594744 May 15 16:40 /usr/local/bin/kube-proxy* -rwxr-xr-x 1 root root 69603512 May 15 16:40 /usr/local/bin/kube-scheduler* [root@k8s-cluster241 ~]# 3.查看kubelet的版本 [root@k8s-cluster241 ~]# kube-apiserver --version Kubernetes v1.33.3 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kube-proxy --version Kubernetes v1.33.3 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kube-controller-manager --version Kubernetes v1.33.3 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kube-scheduler --version Kubernetes v1.33.3 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubelet --version Kubernetes v1.33.3 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl version Client Version: v1.33.3 Kustomize Version: v5.6.0 The connection to the server localhost:8080 was refused - did you specify the right host or port? [root@k8s-cluster241 ~]# 4.分发软件包 [root@k8s-cluster241 ~]# for i in `ls -1 /usr/local/bin/kube*`;do data_rsync.sh $i ;done 5.其他节点验证 [root@k8s-cluster242 ~]# ll /usr/local/bin/kube* -rwxr-xr-x 1 root root 97968312 Jul 16 02:18 /usr/local/bin/kube-apiserver* -rwxr-xr-x 1 root root 90767544 Jul 16 02:18 /usr/local/bin/kube-controller-manager* -rwxr-xr-x 1 root root 60129464 Jul 16 02:18 /usr/local/bin/kubectl* -rwxr-xr-x 1 root root 81703204 Jul 16 02:18 /usr/local/bin/kubelet* -rwxr-xr-x 1 root root 70602936 Jul 16 02:18 /usr/local/bin/kube-proxy* -rwxr-xr-x 1 root root 69611704 Jul 16 02:18 /usr/local/bin/kube-scheduler* [root@k8s-cluster242 ~]# [root@k8s-cluster243 ~]# ll /usr/local/bin/kube* -rwxr-xr-x 1 root root 97968312 Jul 16 02:18 /usr/local/bin/kube-apiserver* -rwxr-xr-x 1 root root 90767544 Jul 16 02:18 /usr/local/bin/kube-controller-manager* -rwxr-xr-x 1 root root 60129464 Jul 16 02:18 /usr/local/bin/kubectl* -rwxr-xr-x 1 root root 81703204 Jul 16 02:18 /usr/local/bin/kubelet* -rwxr-xr-x 1 root root 70602936 Jul 16 02:18 /usr/local/bin/kube-proxy* -rwxr-xr-x 1 root root 69611704 Jul 16 02:18 /usr/local/bin/kube-scheduler* [root@k8s-cluster243 ~]# - 生成k8s组件相关证书 温馨提示: 建议大家在做此步骤之前,拍个快照。 1.生成证书的CSR文件: 证书签发请求文件,配置了一些域名,公司,单位 [root@k8s-cluster241 ~]# mkdir -pv /weixiang/pki/k8s && cd /weixiang/pki/k8s mkdir: created directory '/weixiang/pki/k8s' [root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# cat > k8s-ca-csr.json <<EOF { "CN": "kubernetes", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "Kubernetes", "OU": "Kubernetes-manual" } ], "ca": { "expiry": "876000h" } } EOF 2.生成K8S证书 [root@k8s-cluster241 k8s]# mkdir -pv /weixiang/certs/k8s/ mkdir: created directory '/weixiang/certs/k8s/' [root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# cfssl gencert -initca k8s-ca-csr.json | cfssljson -bare /weixiang/certs/k8s/k8s-ca 
2025/07/31 16:01:22 [INFO] generating a new CA key and certificate from CSR 2025/07/31 16:01:22 [INFO] generate received request 2025/07/31 16:01:22 [INFO] received CSR 2025/07/31 16:01:22 [INFO] generating key: rsa-2048 2025/07/31 16:01:22 [INFO] encoded CSR 2025/07/31 16:01:22 [INFO] signed certificate with serial number 597707726810272531705749021812232151373399052669 [root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# ll /weixiang/certs/k8s/k8s-ca* -rw-r--r-- 1 root root 1070 Jul 31 16:01 /weixiang/certs/k8s/k8s-ca.csr -rw------- 1 root root 1679 Jul 31 16:01 /weixiang/certs/k8s/k8s-ca-key.pem -rw-r--r-- 1 root root 1363 Jul 31 16:01 /weixiang/certs/k8s/k8s-ca.pem [root@k8s-cluster241 k8s]# 3.生成k8s证书的有效期为100年 [root@k8s-cluster241 k8s]# cat > k8s-ca-config.json <<EOF { "signing": { "default": { "expiry": "876000h" }, "profiles": { "kubernetes": { "usages": [ "signing", "key encipherment", "server auth", "client auth" ], "expiry": "876000h" } } } } EOF 4.生成apiserver证书的CSR文件: 证书签发请求文件,配置了一些域名,公司,单位 [root@k8s-cluster241 k8s]# cat > apiserver-csr.json <<EOF { "CN": "kube-apiserver", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "Kubernetes", "OU": "Kubernetes-manual" } ] } EOF 5.基于自建ca证书生成apiServer的证书文件 [root@k8s-cluster241 k8s]# cfssl gencert \ -ca=/weixiang/certs/k8s/k8s-ca.pem \ -ca-key=/weixiang/certs/k8s/k8s-ca-key.pem \ -config=k8s-ca-config.json \ --hostname=10.200.0.1,10.0.0.240,kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.weixiang,kubernetes.default.svc.weixiang.com,10.1.12.3,10.1.12.4,10.1.12.15 \ --profile=kubernetes \ apiserver-csr.json | cfssljson -bare /weixiang/certs/k8s/apiserver 2025/07/31 16:03:22 [INFO] generate received request 2025/07/31 16:03:22 [INFO] received CSR 2025/07/31 16:03:22 [INFO] generating key: rsa-2048 2025/07/31 16:03:22 [INFO] encoded CSR 2025/07/31 16:03:22 [INFO] signed certificate with serial number 673748347141635635906292205037564855770852883027 [root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# ll /weixiang/certs/k8s/apiserver* -rw-r--r-- 1 root root 1293 Jul 31 16:03 /weixiang/certs/k8s/apiserver.csr -rw------- 1 root root 1679 Jul 31 16:03 /weixiang/certs/k8s/apiserver-key.pem -rw-r--r-- 1 root root 1688 Jul 31 16:03 /weixiang/certs/k8s/apiserver.pem [root@k8s-cluster241 k8s]# 温馨提示: "10.200.0.1"为咱们的svc网段的第一个地址,您需要根据自己的场景稍作修改。 "10.0.0.240"是负载均衡器的VIP地址。 "kubernetes,...,kubernetes.default.svc.weixiang.com"对应的是apiServer的svc解析的A记录。 "10.0.0.41,...,10.0.0.43"对应的是K8S集群的地址。 5 生成聚合证书的用于自建ca的CSR文件 聚合证书的作用就是让第三方组件(比如metrics-server等)能够拿这个证书文件和apiServer进行通信。 [root@k8s-cluster241 k8s]# cat > front-proxy-ca-csr.json <<EOF { "CN": "kubernetes", "key": { "algo": "rsa", "size": 2048 } } EOF 6 生成聚合证书的自建ca证书 [root@node-exporter41 pki]# cfssl gencert -initca front-proxy-ca-csr.json | cfssljson -bare /weixiang/certs/k8s/front-proxy-ca 2025/07/31 16:04:04 [INFO] generating a new CA key and certificate from CSR 2025/07/31 16:04:04 [INFO] generate received request 2025/07/31 16:04:04 [INFO] received CSR 2025/07/31 16:04:04 [INFO] generating key: rsa-2048 2025/07/31 16:04:04 [INFO] encoded CSR 2025/07/31 16:04:04 [INFO] signed certificate with serial number 510482660017855825441334359944401305341828692486 [root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# ll /weixiang/certs/k8s/front-proxy-ca* -rw-r--r-- 1 root root 891 Jul 31 16:04 /weixiang/certs/k8s/front-proxy-ca.csr -rw------- 1 root root 1679 Jul 31 16:04 
/weixiang/certs/k8s/front-proxy-ca-key.pem -rw-r--r-- 1 root root 1094 Jul 31 16:04 /weixiang/certs/k8s/front-proxy-ca.pem [root@k8s-cluster241 k8s]# 7.生成聚合证书的用于客户端的CSR文件 [root@k8s-cluster241 k8s]# cat > front-proxy-client-csr.json <<EOF { "CN": "front-proxy-client", "key": { "algo": "rsa", "size": 2048 } } EOF 8 基于聚合证书的自建ca证书签发聚合证书的客户端证书 [root@k8s-cluster241 k8s]# cfssl gencert \ -ca=/weixiang/certs/k8s/front-proxy-ca.pem \ -ca-key=/weixiang/certs/k8s/front-proxy-ca-key.pem \ -config=k8s-ca-config.json \ -profile=kubernetes \ front-proxy-client-csr.json | cfssljson -bare /weixiang/certs/k8s/front-proxy-client 2025/07/31 16:04:26 [INFO] generate received request 2025/07/31 16:04:26 [INFO] received CSR 2025/07/31 16:04:26 [INFO] generating key: rsa-2048 2025/07/31 16:04:26 [INFO] encoded CSR 2025/07/31 16:04:26 [INFO] signed certificate with serial number 107619448349148361013766691468796306114314872515 2025/07/31 16:04:26 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for websites. For more information see the Baseline Requirements for the Issuance and Management of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org); specifically, section 10.2.3 ("Information Requirements"). [root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# ll /weixiang/certs/k8s/front-proxy-client* -rw-r--r-- 1 root root 903 Jul 31 16:04 /weixiang/certs/k8s/front-proxy-client.csr -rw------- 1 root root 1679 Jul 31 16:04 /weixiang/certs/k8s/front-proxy-client-key.pem -rw-r--r-- 1 root root 1188 Jul 31 16:04 /weixiang/certs/k8s/front-proxy-client.pem [root@k8s-cluster241 k8s]# 9.生成kube-proxy的csr文件 [root@k8s-cluster241 k8s]# cat > kube-proxy-csr.json <<EOF { "CN": "system:kube-proxy", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "system:kube-proxy", "OU": "Kubernetes-manual" } ] } EOF 10.创建kube-proxy需要的证书文件 [root@k8s-cluster241 k8s]# cfssl gencert \ -ca=/weixiang/certs/k8s/k8s-ca.pem \ -ca-key=/weixiang/certs/k8s/k8s-ca-key.pem \ -config=k8s-ca-config.json \ -profile=kubernetes \ kube-proxy-csr.json | cfssljson -bare /weixiang/certs/k8s/kube-proxy 2025/06/16 09:02:34 [INFO] generate received request 2025/06/16 09:02:34 [INFO] received CSR 2025/06/16 09:02:34 [INFO] generating key: rsa-2048 2025/06/16 09:02:34 [INFO] encoded CSR 2025/06/16 09:02:34 [INFO] signed certificate with serial number 428484742950452765379969700730076331479633468881 2025/06/16 09:02:34 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for websites. For more information see the Baseline Requirements for the Issuance and Management of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org); specifically, section 10.2.3 ("Information Requirements"). 
[root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# ll /weixiang/certs/k8s/kube-proxy* -rw-r--r-- 1 root root 1045 Jun 16 09:02 /weixiang/certs/k8s/kube-proxy.csr -rw------- 1 root root 1675 Jun 16 09:02 /weixiang/certs/k8s/kube-proxy-key.pem -rw-r--r-- 1 root root 1464 Jun 16 09:02 /weixiang/certs/k8s/kube-proxy.pem [root@k8s-cluster241 k8s]# 11.生成controller-manager的CSR文件 [root@k8s-cluster241 k8s]# cat > controller-manager-csr.json <<EOF { "CN": "system:kube-controller-manager", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "system:kube-controller-manager", "OU": "Kubernetes-manual" } ] } EOF 12.生成controller-manager证书文件 [root@k8s-cluster241 k8s]# cfssl gencert \ -ca=/weixiang/certs/k8s/k8s-ca.pem \ -ca-key=/weixiang/certs/k8s/k8s-ca-key.pem \ -config=k8s-ca-config.json \ -profile=kubernetes \ controller-manager-csr.json | cfssljson -bare /weixiang/certs/k8s/controller-manager 2025/07/31 16:05:40 [INFO] generate received request 2025/07/31 16:05:40 [INFO] received CSR 2025/07/31 16:05:40 [INFO] generating key: rsa-2048 2025/07/31 16:05:41 [INFO] encoded CSR 2025/07/31 16:05:41 [INFO] signed certificate with serial number 701177347536747357211350863493315844960643128712 2025/07/31 16:05:41 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for websites. For more information see the Baseline Requirements for the Issuance and Management of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org); specifically, section 10.2.3 ("Information Requirements"). [root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# ll /weixiang/certs/k8s/controller-manager* -rw-r--r-- 1 root root 1082 Jul 31 16:05 /weixiang/certs/k8s/controller-manager.csr -rw------- 1 root root 1679 Jul 31 16:05 /weixiang/certs/k8s/controller-manager-key.pem -rw-r--r-- 1 root root 1501 Jul 31 16:05 /weixiang/certs/k8s/controller-manager.pem [root@k8s-cluster241 k8s]# 13.生成scheduler的CSR文件 [root@k8s-cluster241 k8s]# cat > scheduler-csr.json <<EOF { "CN": "system:kube-scheduler", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "system:kube-scheduler", "OU": "Kubernetes-manual" } ] } EOF 14.生成scheduler证书文件 [root@k8s-cluster241 k8s]# cfssl gencert \ -ca=/weixiang/certs/k8s/k8s-ca.pem \ -ca-key=/weixiang/certs/k8s/k8s-ca-key.pem \ -config=k8s-ca-config.json \ -profile=kubernetes \ scheduler-csr.json | cfssljson -bare /weixiang/certs/k8s/scheduler 2025/07/31 16:06:47 [INFO] generate received request 2025/07/31 16:06:47 [INFO] received CSR 2025/07/31 16:06:47 [INFO] generating key: rsa-2048 2025/07/31 16:06:47 [INFO] encoded CSR 2025/07/31 16:06:47 [INFO] signed certificate with serial number 212792112494494126026675764617349696176153259100 2025/07/31 16:06:47 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for websites. For more information see the Baseline Requirements for the Issuance and Management of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org); specifically, section 10.2.3 ("Information Requirements"). 
[root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# ll /weixiang/certs/k8s/scheduler* -rw-r--r-- 1 root root 1058 Jul 31 16:06 /weixiang/certs/k8s/scheduler.csr -rw------- 1 root root 1675 Jul 31 16:06 /weixiang/certs/k8s/scheduler-key.pem -rw-r--r-- 1 root root 1476 Jul 31 16:06 /weixiang/certs/k8s/scheduler.pem [root@k8s-cluster241 k8s]# 15.生成管理员的CSR文件 [root@k8s-cluster241 k8s]# cat > admin-csr.json <<EOF { "CN": "admin", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "system:masters", "OU": "Kubernetes-manual" } ] } EOF 16.生成k8s集群管理员证书 [root@k8s-cluster241 k8s]# cfssl gencert \ -ca=/weixiang/certs/k8s/k8s-ca.pem \ -ca-key=/weixiang/certs/k8s/k8s-ca-key.pem \ -config=k8s-ca-config.json \ -profile=kubernetes \ admin-csr.json | cfssljson -bare /weixiang/certs/k8s/admin 2025/07/31 16:07:15 [INFO] generate received request 2025/07/31 16:07:15 [INFO] received CSR 2025/07/31 16:07:15 [INFO] generating key: rsa-2048 2025/07/31 16:07:15 [INFO] encoded CSR 2025/07/31 16:07:15 [INFO] signed certificate with serial number 401352708867554717353193111824639183965095055782 2025/07/31 16:07:15 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for websites. For more information see the Baseline Requirements for the Issuance and Management of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org); specifically, section 10.2.3 ("Information Requirements"). [root@k8s-cluster241 k8s]# [root@k8s-cluster241 k8s]# ll /weixiang/certs/k8s/admin* -rw-r--r-- 1 root root 1025 Jul 31 16:07 /weixiang/certs/k8s/admin.csr -rw------- 1 root root 1679 Jul 31 16:07 /weixiang/certs/k8s/admin-key.pem -rw-r--r-- 1 root root 1444 Jul 31 16:07 /weixiang/certs/k8s/admin.pem [root@k8s-cluster241 k8s]# 17.创建ServiceAccount账号证书【api-server和controller manager组件可以基于该私钥签署所颁发的ID令牌(token)。】 [root@k8s-cluster241 k8s]# openssl genrsa -out /weixiang/certs/k8s/sa.key 2048 [root@k8s-cluster241 k8s]# openssl rsa -in /weixiang/certs/k8s/sa.key -pubout -out /weixiang/certs/k8s/sa.pub [root@k8s-cluster241 k8s]# ll /weixiang/certs/k8s/sa* -rw------- 1 root root 1704 Jun 16 09:15 /weixiang/certs/k8s/sa.key -rw-r--r-- 1 root root 451 Jun 16 09:15 /weixiang/certs/k8s/sa.pub [root@k8s-cluster241 k8s]# 温馨提示: 细心地小伙伴可能已经发现了缺少kubelet相关证书,当然我们也可以考虑创建出来,但也可以直接使用bootstrap token的方式认证。 [root@k8s-cluster241 k8s]# tree /weixiang/ /weixiang/ ├── certs │   ├── etcd │   │   ├── etcd-ca.csr │   │   ├── etcd-ca-key.pem │   │   ├── etcd-ca.pem │   │   ├── etcd-server.csr │   │   ├── etcd-server-key.pem │   │   └── etcd-server.pem │   └── k8s │   ├── admin.csr │   ├── admin-key.pem │   ├── admin.pem │   ├── apiserver.csr │   ├── apiserver-key.pem │   ├── apiserver.pem │   ├── controller-manager.csr │   ├── controller-manager-key.pem │   ├── controller-manager.pem │   ├── front-proxy-ca.csr │   ├── front-proxy-ca-key.pem │   ├── front-proxy-ca.pem │   ├── front-proxy-client.csr │   ├── front-proxy-client-key.pem │   ├── front-proxy-client.pem │   ├── k8s-ca.csr │   ├── k8s-ca-key.pem │   ├── k8s-ca.pem │   ├── kube-proxy.csr │   ├── kube-proxy-key.pem │   ├── kube-proxy.pem │   ├── sa.key │   ├── sa.pub │   ├── scheduler.csr │   ├── scheduler-key.pem │   └── scheduler.pem ├── pki │   ├── etcd │   │   ├── ca-config.json │   │   ├── etcd-ca-csr.json │   │   └── etcd-csr.json │   └── k8s │   ├── admin-csr.json │   ├── apiserver-csr.json │   ├── controller-manager-csr.json │   ├── front-proxy-ca-csr.json │   ├── 
front-proxy-client-csr.json │   ├── k8s-ca-config.json │   ├── k8s-ca-csr.json │   ├── kube-proxy-csr.json │   └── scheduler-csr.json └── softwares └── etcd └── etcd.config.yml 8 directories, 45 files [root@k8s-cluster241 k8s]#
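补充:证书签发完成后,建议抽查关键证书的有效期与SAN列表是否符合预期,下面是一个验证示例(命令仅供参考,证书路径沿用上文):

```bash
# 查看apiserver证书的有效期
openssl x509 -in /weixiang/certs/k8s/apiserver.pem -noout -dates

# 查看apiserver证书的SAN列表,确认包含VIP、svc第一个地址及各master节点IP
openssl x509 -in /weixiang/certs/k8s/apiserver.pem -noout -text | grep -A1 "Subject Alternative Name"

# 查看CA证书的到期时间(上文签发时指定了876000h,约100年)
openssl x509 -in /weixiang/certs/k8s/k8s-ca.pem -noout -enddate
```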
2、生成k8s组件相关Kubeconfig文件并同步到其他master节点
bash
1. 创建一个kubeconfig目录 [root@k8s-cluster241 k8s]# mkdir -pv /weixiang/certs/kubeconfig mkdir: created directory '/weixiang/certs/kubeconfig' [root@k8s-cluster241 k8s]# 2.生成了controller-manager组件的kubeconfig文件 2.1 设置一个集群 [root@k8s-cluster241 k8s]# kubectl config set-cluster yinzhengjie-k8s \ --certificate-authority=/weixiang/certs/k8s/k8s-ca.pem \ --embed-certs=true \ --server=https://10.0.0.240:8443 \ --kubeconfig=/weixiang/certs/kubeconfig/kube-controller-manager.kubeconfig 2.2 设置一个用户项 [root@k8s-cluster241 k8s]# kubectl config set-credentials system:kube-controller-manager \ --client-certificate=/weixiang/certs/k8s/controller-manager.pem \ --client-key=/weixiang/certs/k8s/controller-manager-key.pem \ --embed-certs=true \ --kubeconfig=/weixiang/certs/kubeconfig/kube-controller-manager.kubeconfig 2.3 设置一个上下文环境 [root@k8s-cluster241 k8s]# kubectl config set-context system:kube-controller-manager@kubernetes \ --cluster=yinzhengjie-k8s \ --user=system:kube-controller-manager \ --kubeconfig=/weixiang/certs/kubeconfig/kube-controller-manager.kubeconfig 2.4 使用默认的上下文 [root@k8s-cluster241 k8s]# kubectl config use-context system:kube-controller-manager@kubernetes \ --kubeconfig=/weixiang/certs/kubeconfig/kube-controller-manager.kubeconfig 2.5 查看kubeconfig资源结构 [root@k8s-cluster241 k8s]# kubectl config view --kubeconfig=/weixiang/certs/kubeconfig/kube-controller-manager.kubeconfig apiVersion: v1 clusters: - cluster: certificate-authority-data: DATA+OMITTED server: https://10.0.0.240:8443 name: yinzhengjie-k8s contexts: - context: cluster: yinzhengjie-k8s user: system:kube-controller-manager name: system:kube-controller-manager@kubernetes current-context: system:kube-controller-manager@kubernetes kind: Config preferences: {} users: - name: system:kube-controller-manager user: client-certificate-data: DATA+OMITTED client-key-data: DATA+OMITTED [root@k8s-cluster241 k8s]#
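温馨提示:下面scheduler、kube-proxy和管理员的kubeconfig生成步骤与controller-manager完全相同,只是用户名、客户端证书和上下文名称不同。若想减少重复操作,可以参考下面封装的函数草稿(函数名为自定义示例,VIP与证书路径沿用上文),后续小节仍按手工步骤演示:

```bash
#!/bin/bash
# gen_kubeconfig: 按"集群->用户->上下文->默认上下文"四步生成kubeconfig(示例函数)
# 用法: gen_kubeconfig <用户名> <证书文件前缀> <上下文名> <kubeconfig文件路径>
gen_kubeconfig() {
    local user="$1" cert_prefix="$2" context="$3" kubeconfig="$4"

    kubectl config set-cluster yinzhengjie-k8s \
      --certificate-authority=/weixiang/certs/k8s/k8s-ca.pem \
      --embed-certs=true \
      --server=https://10.0.0.240:8443 \
      --kubeconfig="${kubeconfig}"

    kubectl config set-credentials "${user}" \
      --client-certificate=/weixiang/certs/k8s/${cert_prefix}.pem \
      --client-key=/weixiang/certs/k8s/${cert_prefix}-key.pem \
      --embed-certs=true \
      --kubeconfig="${kubeconfig}"

    kubectl config set-context "${context}" \
      --cluster=yinzhengjie-k8s \
      --user="${user}" \
      --kubeconfig="${kubeconfig}"

    kubectl config use-context "${context}" --kubeconfig="${kubeconfig}"
}

# 示例:生成scheduler的kubeconfig,等价于下文3.1~3.4的手工步骤
gen_kubeconfig system:kube-scheduler scheduler system:kube-scheduler@kubernetes \
    /weixiang/certs/kubeconfig/kube-scheduler.kubeconfig
```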
3、生成scheduler证书及kubeconfig文件
bash
生成scheduler证书及kubeconfig文件 3.1 设置一个集群 [root@k8s-cluster241 k8s]# kubectl config set-cluster yinzhengjie-k8s \ --certificate-authority=/weixiang/certs/k8s/k8s-ca.pem \ --embed-certs=true \ --server=https://10.0.0.240:8443 \ --kubeconfig=/weixiang/certs/kubeconfig/kube-scheduler.kubeconfig 3.2 设置一个用户项 [root@k8s-cluster241 k8s]# kubectl config set-credentials system:kube-scheduler \ --client-certificate=/weixiang/certs/k8s/scheduler.pem \ --client-key=/weixiang/certs/k8s/scheduler-key.pem \ --embed-certs=true \ --kubeconfig=/weixiang/certs/kubeconfig/kube-scheduler.kubeconfig 3.3 设置一个上下文环境 [root@k8s-cluster241 k8s]# kubectl config set-context system:kube-scheduler@kubernetes \ --cluster=yinzhengjie-k8s \ --user=system:kube-scheduler \ --kubeconfig=/weixiang/certs/kubeconfig/kube-scheduler.kubeconfig 3.4 使用默认的上下文 [root@k8s-cluster241 k8s]# kubectl config use-context system:kube-scheduler@kubernetes \ --kubeconfig=/weixiang/certs/kubeconfig/kube-scheduler.kubeconfig 3.5 查看kubeconfig资源结构 [root@k8s-cluster241 k8s]# kubectl config view --kubeconfig=/weixiang/certs/kubeconfig/kube-scheduler.kubeconfig apiVersion: v1 clusters: - cluster: certificate-authority-data: DATA+OMITTED server: https://10.0.0.240:8443 name: yinzhengjie-k8s contexts: - context: cluster: yinzhengjie-k8s user: system:kube-scheduler name: system:kube-scheduler@kubernetes current-context: system:kube-scheduler@kubernetes kind: Config preferences: {} users: - name: system:kube-scheduler user: client-certificate-data: DATA+OMITTED client-key-data: DATA+OMITTED [root@k8s-cluster241 k8s]#
4、生成kube-proxy证书及kubeconfig文件
bash
4.1 设置集群 [root@k8s-cluster241 k8s]# kubectl config set-cluster yinzhengjie-k8s \ --certificate-authority=/weixiang/certs/k8s/k8s-ca.pem \ --embed-certs=true \ --server=https://10.0.0.240:8443 \ --kubeconfig=/weixiang/certs/kubeconfig/kube-proxy.kubeconfig 4.2 设置用户 [root@k8s-cluster241 k8s]# kubectl config set-credentials system:kube-proxy \ --client-certificate=/weixiang/certs/k8s/kube-proxy.pem \ --client-key=/weixiang/certs/k8s/kube-proxy-key.pem \ --embed-certs=true \ --kubeconfig=/weixiang/certs/kubeconfig/kube-proxy.kubeconfig 4.3 设置上下文 [root@k8s-cluster241 k8s]# kubectl config set-context kube-proxy@kubernetes \ --cluster=yinzhengjie-k8s \ --user=system:kube-proxy \ --kubeconfig=/weixiang/certs/kubeconfig/kube-proxy.kubeconfig 4.4 设置默认上下文 [root@k8s-cluster241 k8s]# kubectl config use-context kube-proxy@kubernetes \ --kubeconfig=/weixiang/certs/kubeconfig/kube-proxy.kubeconfig 4.5 查看kubeconfig资源结构 [root@k8s-cluster241 k8s]# kubectl config view --kubeconfig=/weixiang/certs/kubeconfig/kube-proxy.kubeconfig apiVersion: v1 clusters: - cluster: certificate-authority-data: DATA+OMITTED server: https://10.0.0.240:8443 name: yinzhengjie-k8s contexts: - context: cluster: yinzhengjie-k8s user: system:kube-proxy name: kube-proxy@kubernetes current-context: kube-proxy@kubernetes kind: Config preferences: {} users: - name: system:kube-proxy user: client-certificate-data: DATA+OMITTED client-key-data: DATA+OMITTED [root@k8s-cluster241 k8s]#
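补充:若想确认kubeconfig中内嵌的客户端证书身份是否正确(例如kube-proxy对应的CN应为system:kube-proxy),可以参考下面的检查方式(示例命令,假设users列表中只有一个用户):

```bash
# 提取kubeconfig内嵌的客户端证书,查看其Subject与到期时间
kubectl config view --raw \
  --kubeconfig=/weixiang/certs/kubeconfig/kube-proxy.kubeconfig \
  -o jsonpath='{.users[0].user.client-certificate-data}' | base64 -d | \
  openssl x509 -noout -subject -enddate
```

如果输出的subject中CN不是system:kube-proxy,说明证书与kubeconfig的对应关系有误,需要回到前面的步骤重新生成。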
5、配置k8s集群管理员证书及kubeconfig文件
bash
5.1 设置一个集群 [root@k8s-cluster241 k8s]# kubectl config set-cluster yinzhengjie-k8s \ --certificate-authority=/weixiang/certs/k8s/k8s-ca.pem \ --embed-certs=true \ --server=https://10.0.0.240:8443 \ --kubeconfig=/weixiang/certs/kubeconfig/kube-admin.kubeconfig 5.2 设置一个用户项 [root@k8s-cluster241 k8s]# kubectl config set-credentials kube-admin \ --client-certificate=/weixiang/certs/k8s/admin.pem \ --client-key=/weixiang/certs/k8s/admin-key.pem \ --embed-certs=true \ --kubeconfig=/weixiang/certs/kubeconfig/kube-admin.kubeconfig 5.3 设置一个上下文环境 [root@k8s-cluster241 k8s]# kubectl config set-context kube-admin@kubernetes \ --cluster=yinzhengjie-k8s \ --user=kube-admin \ --kubeconfig=/weixiang/certs/kubeconfig/kube-admin.kubeconfig 5.4 使用默认的上下文 [root@k8s-cluster241 k8s]# kubectl config use-context kube-admin@kubernetes \ --kubeconfig=/weixiang/certs/kubeconfig/kube-admin.kubeconfig 5.5 查看kubeconfig资源结构 [root@k8s-cluster241 k8s]# kubectl config view --kubeconfig=/weixiang/certs/kubeconfig/kube-admin.kubeconfig apiVersion: v1 clusters: - cluster: certificate-authority-data: DATA+OMITTED server: https://10.0.0.240:8443 name: yinzhengjie-k8s contexts: - context: cluster: yinzhengjie-k8s user: kube-admin name: kube-admin@kubernetes current-context: kube-admin@kubernetes kind: Config preferences: {} users: - name: kube-admin user: client-certificate-data: DATA+OMITTED client-key-data: DATA+OMITTED [root@k8s-cluster241 k8s]#
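补充:管理员kubeconfig生成后,可以将其复制为kubectl的默认配置,后续执行kubectl无需再指定--kubeconfig参数(示例命令;注意此时apiserver与负载均衡尚未部署完成,需等后续组件启动后才能真正连通集群):

```bash
# kubectl默认读取~/.kube/config
mkdir -p /root/.kube
cp /weixiang/certs/kubeconfig/kube-admin.kubeconfig /root/.kube/config

# 确认默认上下文已指向kube-admin@kubernetes
kubectl config current-context
```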
6、分发证书并启动ApiServer组件服务
bash
6.将K8S组件证书拷贝到其他两个master节点 6.1 拷贝kubeconfig文件【如果controller manager,scheduler和ApiServer不在同一个节点,则证书可以不用拷贝,因为证书已经写入了kubeconfig文件】 [root@k8s-cluster241 k8s]# data_rsync.sh /weixiang/certs/kubeconfig/ ===== rsyncing k8s-cluster242: kubeconfig ===== 命令执行成功! ===== rsyncing k8s-cluster243: kubeconfig ===== 命令执行成功! [root@k8s-cluster241 k8s]# 6.2 拷贝证书文件【由于我们的环境api-server和controller manager,scheduler并没有单独部署,因此所有节点都得有证书,以供api-server启动时使用。】 [root@k8s-cluster241 ~]# data_rsync.sh /weixiang/certs/k8s/ ===== rsyncing k8s-cluster242: k8s ===== 命令执行成功! ===== rsyncing k8s-cluster243: k8s ===== 命令执行成功! [root@k8s-cluster241 ~]# 6.2 查看所有节点kubeconfig目录组织结构 [root@k8s-cluster241 k8s]# tree /weixiang/ /weixiang/ ├── certs │   ├── etcd │   │   ├── etcd-ca.csr │   │   ├── etcd-ca-key.pem │   │   ├── etcd-ca.pem │   │   ├── etcd-server.csr │   │   ├── etcd-server-key.pem │   │   └── etcd-server.pem │   ├── k8s │   │   ├── admin.csr │   │   ├── admin-key.pem │   │   ├── admin.pem │   │   ├── apiserver.csr │   │   ├── apiserver-key.pem │   │   ├── apiserver.pem │   │   ├── controller-manager.csr │   │   ├── controller-manager-key.pem │   │   ├── controller-manager.pem │   │   ├── front-proxy-ca.csr │   │   ├── front-proxy-ca-key.pem │   │   ├── front-proxy-ca.pem │   │   ├── front-proxy-client.csr │   │   ├── front-proxy-client-key.pem │   │   ├── front-proxy-client.pem │   │   ├── k8s-ca.csr │   │   ├── k8s-ca-key.pem │   │   ├── k8s-ca.pem │   │   ├── kube-proxy.csr │   │   ├── kube-proxy-key.pem │   │   ├── kube-proxy.pem │   │   ├── sa.key │   │   ├── sa.pub │   │   ├── scheduler.csr │   │   ├── scheduler-key.pem │   │   └── scheduler.pem │   └── kubeconfig │   ├── kube-admin.kubeconfig │   ├── kube-controller-manager.kubeconfig │   ├── kube-proxy.kubeconfig │   └── kube-scheduler.kubeconfig ├── pki │   ├── etcd │   │   ├── ca-config.json │   │   ├── etcd-ca-csr.json │   │   └── etcd-csr.json │   └── k8s │   ├── admin-csr.json │   ├── apiserver-csr.json │   ├── controller-manager-csr.json │   ├── front-proxy-ca-csr.json │   ├── front-proxy-client-csr.json │   ├── k8s-ca-config.json │   ├── k8s-ca-csr.json │   ├── kube-proxy-csr.json │   └── scheduler-csr.json └── softwares └── etcd └── etcd.config.yml 9 directories, 49 files [root@k8s-cluster241 k8s]# [root@k8s-cluster242 ~]# tree /weixiang/ /weixiang/ ├── certs │   ├── etcd │   │   ├── etcd-ca.csr │   │   ├── etcd-ca-key.pem │   │   ├── etcd-ca.pem │   │   ├── etcd-server.csr │   │   ├── etcd-server-key.pem │   │   └── etcd-server.pem │   ├── k8s │   │   ├── admin.csr │   │   ├── admin-key.pem │   │   ├── admin.pem │   │   ├── apiserver.csr │   │   ├── apiserver-key.pem │   │   ├── apiserver.pem │   │   ├── controller-manager.csr │   │   ├── controller-manager-key.pem │   │   ├── controller-manager.pem │   │   ├── front-proxy-ca.csr │   │   ├── front-proxy-ca-key.pem │   │   ├── front-proxy-ca.pem │   │   ├── front-proxy-client.csr │   │   ├── front-proxy-client-key.pem │   │   ├── front-proxy-client.pem │   │   ├── k8s-ca.csr │   │   ├── k8s-ca-key.pem │   │   ├── k8s-ca.pem │   │   ├── kube-proxy.csr │   │   ├── kube-proxy-key.pem │   │   ├── kube-proxy.pem │   │   ├── sa.key │   │   ├── sa.pub │   │   ├── scheduler.csr │   │   ├── scheduler-key.pem │   │   └── scheduler.pem │   └── kubeconfig │   ├── kube-admin.kubeconfig │   ├── kube-controller-manager.kubeconfig │   ├── kube-proxy.kubeconfig │   └── kube-scheduler.kubeconfig └── softwares └── etcd └── etcd.config.yml 6 directories, 37 files [root@k8s-cluster242 ~]# [root@k8s-cluster243 ~]# tree /weixiang/ 
/weixiang/ ├── certs │   ├── etcd │   │   ├── etcd-ca.csr │   │   ├── etcd-ca-key.pem │   │   ├── etcd-ca.pem │   │   ├── etcd-server.csr │   │   ├── etcd-server-key.pem │   │   └── etcd-server.pem │   ├── k8s │   │   ├── admin.csr │   │   ├── admin-key.pem │   │   ├── admin.pem │   │   ├── apiserver.csr │   │   ├── apiserver-key.pem │   │   ├── apiserver.pem │   │   ├── controller-manager.csr │   │   ├── controller-manager-key.pem │   │   ├── controller-manager.pem │   │   ├── front-proxy-ca.csr │   │   ├── front-proxy-ca-key.pem │   │   ├── front-proxy-ca.pem │   │   ├── front-proxy-client.csr │   │   ├── front-proxy-client-key.pem │   │   ├── front-proxy-client.pem │   │   ├── k8s-ca.csr │   │   ├── k8s-ca-key.pem │   │   ├── k8s-ca.pem │   │   ├── kube-proxy.csr │   │   ├── kube-proxy-key.pem │   │   ├── kube-proxy.pem │   │   ├── sa.key │   │   ├── sa.pub │   │   ├── scheduler.csr │   │   ├── scheduler-key.pem │   │   └── scheduler.pem │   └── kubeconfig │   ├── kube-admin.kubeconfig │   ├── kube-controller-manager.kubeconfig │   ├── kube-proxy.kubeconfig │   └── kube-scheduler.kubeconfig └── softwares └── etcd └── etcd.config.yml 6 directories, 37 files [root@k8s-cluster243 ~]# - 启动ApiServer组件服务 1 'k8s-cluster241'节点启动ApiServer 温馨提示: - "--advertise-address"是对应的master节点的IP地址; - "--service-cluster-ip-range"对应的是svc的网段 - "--service-node-port-range"对应的是svc的NodePort端口范围; - "--etcd-servers"指定的是etcd集群地址 配置文件参考链接: https://kubernetes.io/zh-cn/docs/reference/command-line-tools-reference/kube-apiserver/ 具体实操: 0.可以清空etcd数据【确保没有数据,如果如果有也没有关系,可跳过。】 [root@k8s-cluster241 ~]# etcdctl del "" --prefix 0 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# etcdctl get "" --prefix --keys-only [root@k8s-cluster241 ~]# 1.1 创建'k8s-cluster241'节点的配置文件 [root@k8s-cluster241 ~]# cat > /usr/lib/systemd/system/kube-apiserver.service << 'EOF' [Unit] Description=Jason Yin's Kubernetes API Server Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-apiserver \ --requestheader-allowed-names=front-proxy-client \ --v=2 \ --bind-address=0.0.0.0 \ --secure-port=6443 \ --allow_privileged=true \ --advertise-address=10.1.12.3 \ --service-cluster-ip-range=10.200.0.0/16 \ --service-node-port-range=3000-50000 \ --etcd-servers=https://10.1.12.3:2379,https://10.1.12.4:2379,https://10.1.12.15:2379 \ --etcd-cafile=/weixiang/certs/etcd/etcd-ca.pem \ --etcd-certfile=/weixiang/certs/etcd/etcd-server.pem \ --etcd-keyfile=/weixiang/certs/etcd/etcd-server-key.pem \ --client-ca-file=/weixiang/certs/k8s/k8s-ca.pem \ --tls-cert-file=/weixiang/certs/k8s/apiserver.pem \ --tls-private-key-file=/weixiang/certs/k8s/apiserver-key.pem \ --kubelet-client-certificate=/weixiang/certs/k8s/apiserver.pem \ --kubelet-client-key=/weixiang/certs/k8s/apiserver-key.pem \ --service-account-key-file=/weixiang/certs/k8s/sa.pub \ --service-account-signing-key-file=/weixiang/certs/k8s/sa.key \ --service-account-issuer=https://kubernetes.default.svc.weixiang.com \ --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \ --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota \ --authorization-mode=Node,RBAC \ --enable-bootstrap-token-auth=true \ --requestheader-client-ca-file=/weixiang/certs/k8s/front-proxy-ca.pem \ --proxy-client-cert-file=/weixiang/certs/k8s/front-proxy-client.pem \ --proxy-client-key-file=/weixiang/certs/k8s/front-proxy-client-key.pem \ --requestheader-allowed-names=aggregator \ 
--requestheader-group-headers=X-Remote-Group \ --requestheader-extra-headers-prefix=X-Remote-Extra- \ --requestheader-username-headers=X-Remote-User Restart=on-failure RestartSec=10s LimitNOFILE=65535 [Install] WantedBy=multi-user.target EOF 1.2 启动服务 systemctl daemon-reload && systemctl enable --now kube-apiserver systemctl status kube-apiserver ss -ntl | grep 6443 1.3 etcd数据库测试验证【发现启动api-server后,该组件就直接会和etcd进行数据交互。】 [root@k8s-cluster242 ~]# etcdctl get "" --prefix --keys-only | head /registry/apiregistration.k8s.io/apiservices/v1. /registry/apiregistration.k8s.io/apiservices/v1.admissionregistration.k8s.io /registry/apiregistration.k8s.io/apiservices/v1.apiextensions.k8s.io /registry/apiregistration.k8s.io/apiservices/v1.apps /registry/apiregistration.k8s.io/apiservices/v1.authentication.k8s.io [root@k8s-cluster242 ~]# [root@k8s-cluster242 ~]# etcdctl get "" --prefix --keys-only | wc -l 370 [root@k8s-cluster242 ~]# 2 'k8s-cluster242'节点启动ApiServer 温馨提示: 如果该节点api-server无法启动,请检查日志,尤其是证书文件需要api-server启动时要直接加载etcd,sa,api-server,客户端认证等相关证书。 如果证书没有同步,可以执行命令'data_rsync.sh /weixiang/certs/k8s/'自动进行同步。 具体实操: 2.1 创建'k8s-cluster242'节点的配置文件 [root@k8s-cluster242 ~]# cat > /usr/lib/systemd/system/kube-apiserver.service << 'EOF' [Unit] Description=Jason Yin's Kubernetes API Server Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-apiserver \ --requestheader-allowed-names=front-proxy-client \ --v=2 \ --bind-address=0.0.0.0 \ --secure-port=6443 \ --allow_privileged=true \ --advertise-address=10.1.12.4 \ --service-cluster-ip-range=10.200.0.0/16 \ --service-node-port-range=3000-50000 \ --etcd-servers=https://10.1.12.3:2379,https://10.1.12.4:2379,https://10.1.12.15:2379 \ --etcd-cafile=/weixiang/certs/etcd/etcd-ca.pem \ --etcd-certfile=/weixiang/certs/etcd/etcd-server.pem \ --etcd-keyfile=/weixiang/certs/etcd/etcd-server-key.pem \ --client-ca-file=/weixiang/certs/k8s/k8s-ca.pem \ --tls-cert-file=/weixiang/certs/k8s/apiserver.pem \ --tls-private-key-file=/weixiang/certs/k8s/apiserver-key.pem \ --kubelet-client-certificate=/weixiang/certs/k8s/apiserver.pem \ --kubelet-client-key=/weixiang/certs/k8s/apiserver-key.pem \ --service-account-key-file=/weixiang/certs/k8s/sa.pub \ --service-account-signing-key-file=/weixiang/certs/k8s/sa.key \ --service-account-issuer=https://kubernetes.default.svc.weixiang.com \ --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \ --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota \ --authorization-mode=Node,RBAC \ --enable-bootstrap-token-auth=true \ --requestheader-client-ca-file=/weixiang/certs/k8s/front-proxy-ca.pem \ --proxy-client-cert-file=/weixiang/certs/k8s/front-proxy-client.pem \ --proxy-client-key-file=/weixiang/certs/k8s/front-proxy-client-key.pem \ --requestheader-allowed-names=aggregator \ --requestheader-group-headers=X-Remote-Group \ --requestheader-extra-headers-prefix=X-Remote-Extra- \ --requestheader-username-headers=X-Remote-User Restart=on-failure RestartSec=10s LimitNOFILE=65535 [Install] WantedBy=multi-user.target EOF 2.2 启动服务 systemctl daemon-reload && systemctl enable --now kube-apiserver systemctl status kube-apiserver ss -ntl | grep 6443 3.'k8s-cluster243'节点启动ApiServer 具体实操: 3.1 创建'k8s-cluster243'节点的配置文件 [root@k8s-cluster243 ~]# cat > /usr/lib/systemd/system/kube-apiserver.service << 'EOF' [Unit] Description=Jason Yins Kubernetes API Server 
Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-apiserver \ --requestheader-allowed-names=front-proxy-client \ --v=2 \ --bind-address=0.0.0.0 \ --secure-port=6443 \ --allow_privileged=true \ --advertise-address=10.1.12.15 \ --service-cluster-ip-range=10.200.0.0/16 \ --service-node-port-range=3000-50000 \ --etcd-servers=https://10.1.12.3:2379,https://10.1.12.4:2379,https://10.1.12.15:2379 \ --etcd-cafile=/weixiang/certs/etcd/etcd-ca.pem \ --etcd-certfile=/weixiang/certs/etcd/etcd-server.pem \ --etcd-keyfile=/weixiang/certs/etcd/etcd-server-key.pem \ --client-ca-file=/weixiang/certs/k8s/k8s-ca.pem \ --tls-cert-file=/weixiang/certs/k8s/apiserver.pem \ --tls-private-key-file=/weixiang/certs/k8s/apiserver-key.pem \ --kubelet-client-certificate=/weixiang/certs/k8s/apiserver.pem \ --kubelet-client-key=/weixiang/certs/k8s/apiserver-key.pem \ --service-account-key-file=/weixiang/certs/k8s/sa.pub \ --service-account-signing-key-file=/weixiang/certs/k8s/sa.key \ --service-account-issuer=https://kubernetes.default.svc.weixiang.com \ --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \ --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota \ --authorization-mode=Node,RBAC \ --enable-bootstrap-token-auth=true \ --requestheader-client-ca-file=/weixiang/certs/k8s/front-proxy-ca.pem \ --proxy-client-cert-file=/weixiang/certs/k8s/front-proxy-client.pem \ --proxy-client-key-file=/weixiang/certs/k8s/front-proxy-client-key.pem \ --requestheader-allowed-names=aggregator \ --requestheader-group-headers=X-Remote-Group \ --requestheader-extra-headers-prefix=X-Remote-Extra- \ --requestheader-username-headers=X-Remote-User Restart=on-failure RestartSec=10s LimitNOFILE=65535 [Install] WantedBy=multi-user.target EOF 3.2 启动服务 systemctl daemon-reload && systemctl enable --now kube-apiserver systemctl status kube-apiserver ss -ntl | grep 6443 3.3 再次查看数据 [root@k8s-cluster242 ~]# etcdctl get "" --prefix --keys-only | wc -l 378 [root@k8s-cluster242 ~]#
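补充:三个节点的api-server启动后,除了检查6443端口,也可以直接请求健康检查接口确认服务可用(示例命令;/healthz默认允许匿名访问,若返回403可改用admin证书访问):

```bash
# 依次检查三个节点的apiserver健康状态,预期输出ok
for ip in 10.1.12.3 10.1.12.4 10.1.12.15; do
    echo -n "${ip}: "
    curl -sk "https://${ip}:6443/healthz"
    echo
done

# 也可以携带客户端证书查看更详细的就绪检查项
curl -s --cacert /weixiang/certs/k8s/k8s-ca.pem \
     --cert /weixiang/certs/k8s/admin.pem \
     --key /weixiang/certs/k8s/admin-key.pem \
     "https://10.1.12.3:6443/readyz?verbose" | head
```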
7、启动Controller Manager组件服务
bash
- 启动Controler Manager组件服务 1 所有节点创建配置文件 温馨提示: - "--cluster-cidr"是Pod的网段地址,我们可以自行修改。 配置文件参考链接: https://kubernetes.io/zh-cn/docs/reference/command-line-tools-reference/kube-controller-manager/ 所有节点的controller-manager组件配置文件相同: (前提是证书文件存放的位置也要相同哟!) cat > /usr/lib/systemd/system/kube-controller-manager.service << 'EOF' [Unit] Description=Jason Yin's Kubernetes Controller Manager Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-controller-manager \ --v=2 \ --root-ca-file=/weixiang/certs/k8s/k8s-ca.pem \ --cluster-signing-cert-file=/weixiang/certs/k8s/k8s-ca.pem \ --cluster-signing-key-file=/weixiang/certs/k8s/k8s-ca-key.pem \ --service-account-private-key-file=/weixiang/certs/k8s/sa.key \ --kubeconfig=/weixiang/certs/kubeconfig/kube-controller-manager.kubeconfig \ --leader-elect=true \ --use-service-account-credentials=true \ --node-monitor-grace-period=40s \ --node-monitor-period=5s \ --controllers=*,bootstrapsigner,tokencleaner \ --allocate-node-cidrs=true \ --cluster-cidr=10.100.0.0/16 \ --requestheader-client-ca-file=/weixiang/certs/k8s/front-proxy-ca.pem \ --node-cidr-mask-size=24 Restart=always RestartSec=10s [Install] WantedBy=multi-user.target EOF 2.启动controller-manager服务 systemctl daemon-reload systemctl enable --now kube-controller-manager systemctl status kube-controller-manager ss -ntl | grep 10257 - 启动Scheduler组件服务 1 所有节点创建配置文件 配置文件参考链接: https://kubernetes.io/zh-cn/docs/reference/command-line-tools-reference/kube-scheduler/ 所有节点的Scheduler组件配置文件相同: (前提是证书文件存放的位置也要相同哟!) cat > /usr/lib/systemd/system/kube-scheduler.service <<'EOF' [Unit] Description=Jason Yin's Kubernetes Scheduler Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-scheduler \ --v=2 \ --leader-elect=true \ --kubeconfig=/weixiang/certs/kubeconfig/kube-scheduler.kubeconfig Restart=always RestartSec=10s [Install] WantedBy=multi-user.target EOF 2.启动scheduler服务 systemctl daemon-reload systemctl enable --now kube-scheduler systemctl status kube-scheduler ss -ntl | grep 10259 - 高可用组件haproxy+keepalived安装及验证 1 所有master【k8s-cluster24[1-3]】节点安装高可用组件 温馨提示: - 对于高可用组件,其实我们也可以单独找两台虚拟机来部署,但我为了节省2台机器,就直接在master节点复用了。 - 如果在云上安装K8S则无安装高可用组件了,毕竟公有云大部分都是不支持keepalived的,可以直接使用云产品,比如阿里的"SLB",腾讯的"ELB"等SAAS产品; - 推荐使用ELB,SLB有回环的问题,也就是SLB代理的服务器不能反向访问SLB,但是腾讯云修复了这个问题; 具体实操: [root@k8s-cluster241 ~]# cat /etc/apt/sources.list # 默认注释了源码镜像以提高 apt update 速度,如有需要可自行取消注释 deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy main restricted universe multiverse deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse # 以下安全更新软件源包含了官方源与镜像站配置,如有需要可自行修改注释切换 deb http://security.ubuntu.com/ubuntu/ jammy-security main restricted universe multiverse # deb-src http://security.ubuntu.com/ubuntu/ jammy-security main restricted universe multiverse # 预发布软件源,不建议启用 # deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-proposed main restricted universe multiverse # # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-proposed main restricted universe multiverse [root@k8s-cluster241 ~]# 
[root@k8s-cluster241 ~]# apt update [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# apt-get -y install keepalived haproxy [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# data_rsync.sh /etc/apt/sources.list ===== rsyncing k8s-cluster242: sources.list ===== 命令执行成功! ===== rsyncing k8s-cluster243: sources.list ===== 命令执行成功! [root@k8s-cluster241 ~]# [root@k8s-cluster242 ~]# apt update [root@k8s-cluster242 ~]# [root@k8s-cluster242 ~]# apt-get -y install keepalived haproxy [root@k8s-cluster243 ~]# apt update [root@k8s-cluster243 ~]# apt-get -y install keepalived haproxy 2.所有master节点配置haproxy 温馨提示: - haproxy的负载均衡器监听地址我配置是8443,你可以修改为其他端口,haproxy会用来反向代理各个master组件的地址; - 如果你真的修改晴一定注意上面的证书配置的kubeconfig文件,也要一起修改,否则就会出现链接集群失败的问题; 具体实操: 2.1 备份配置文件 cp /etc/haproxy/haproxy.cfg{,`date +%F`} 2.2 所有节点的配置文件内容相同 cat > /etc/haproxy/haproxy.cfg <<'EOF' global maxconn 2000 ulimit-n 16384 log 127.0.0.1 local0 err stats timeout 30s defaults log global mode http option httplog timeout connect 5000 timeout client 50000 timeout server 50000 timeout http-request 15s timeout http-keep-alive 15s frontend monitor-haproxy bind *:9999 mode http option httplog monitor-uri /ruok frontend yinzhengjie-k8s bind 0.0.0.0:8443 bind 127.0.0.1:8443 mode tcp option tcplog tcp-request inspect-delay 5s default_backend yinzhengjie-k8s backend yinzhengjie-k8s mode tcp option tcplog option tcp-check balance roundrobin default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100 server k8s-cluster241 10.0.0.241:6443 check server k8s-cluster242 10.0.0.242:6443 check server k8s-cluster243 10.0.0.243:6443 check EOF 3.所有master节点配置keepalived 温馨提示: - 注意"interface"字段为你的物理网卡的名称,如果你的网卡是ens33,请将"eth0"修改为"ens33"哟; - 注意"mcast_src_ip"各master节点的配置均不相同,修改根据实际环境进行修改哟; - 注意"virtual_ipaddress"指定的是负载均衡器的VIP地址,这个地址也要和kubeconfig文件的Apiserver地址要一致哟; - 注意"script"字段的脚本用于检测后端的apiServer是否健康; - 注意"router_id"字段为节点ip,master每个节点配置自己的IP 具体实操: 3.1."k8s-cluster241"节点创建配置文件 [root@k8s-cluster241 ~]# ifconfig eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 10.0.0.241 netmask 255.255.255.0 broadcast 10.0.0.255 inet6 fe80::20c:29ff:fe4c:af81 prefixlen 64 scopeid 0x20<link> ether 00:0c:29:4c:af:81 txqueuelen 1000 (Ethernet) RX packets 955748 bytes 496681590 (496.6 MB) RX errors 0 dropped 24 overruns 0 frame 0 TX packets 873509 bytes 357603229 (357.6 MB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 ... [root@k8s-cluster241 ~]# cat > /etc/keepalived/keepalived.conf <<'EOF' ! Configuration File for keepalived global_defs { router_id 10.0.0.241 } vrrp_script chk_lb { script "/etc/keepalived/check_port.sh 8443" interval 2 weight -20 } vrrp_instance VI_1 { state MASTER interface eth0 virtual_router_id 251 priority 100 advert_int 1 mcast_src_ip 10.0.0.241 nopreempt authentication { auth_type PASS auth_pass yinzhengjie_k8s } track_script { chk_lb } virtual_ipaddress { 10.0.0.240 } } EOF 3.2."k8s-cluster242"节点创建配置文件 [root@k8s-cluster242 ~]# ifconfig eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 10.0.0.242 netmask 255.255.255.0 broadcast 10.0.0.255 inet6 fe80::20c:29ff:fe97:a11 prefixlen 64 scopeid 0x20<link> ether 00:0c:29:97:0a:11 txqueuelen 1000 (Ethernet) RX packets 544582 bytes 213769468 (213.7 MB) RX errors 0 dropped 24 overruns 0 frame 0 TX packets 453156 bytes 75867279 (75.8 MB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 ... [root@k8s-cluster242 ~]# cat > /etc/keepalived/keepalived.conf <<EOF ! 
Configuration File for keepalived global_defs { router_id 10.0.0.242 } vrrp_script chk_lb { script "/etc/keepalived/check_port.sh 8443" interval 2 weight -20 } vrrp_instance VI_1 { state MASTER interface eth0 virtual_router_id 251 priority 100 advert_int 1 mcast_src_ip 10.0.0.242 nopreempt authentication { auth_type PASS auth_pass yinzhengjie_k8s } track_script { chk_lb } virtual_ipaddress { 10.0.0.240 } } EOF 3.3."k8s-cluster243"节点创建配置文件 [root@k8s-cluster243 ~]# ifconfig eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 10.0.0.243 netmask 255.255.255.0 broadcast 10.0.0.255 inet6 fe80::20c:29ff:fec9:46ff prefixlen 64 scopeid 0x20<link> ether 00:0c:29:c9:46:ff txqueuelen 1000 (Ethernet) RX packets 918814 bytes 774698270 (774.6 MB) RX errors 0 dropped 24 overruns 0 frame 0 TX packets 495407 bytes 78301697 (78.3 MB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 ... [root@k8s-cluster243 ~]# cat > /etc/keepalived/keepalived.conf <<EOF ! Configuration File for keepalived global_defs { router_id 10.0.0.243 } vrrp_script chk_lb { script "/etc/keepalived/check_port.sh 8443" interval 2 weight -20 } vrrp_instance VI_1 { state MASTER interface eth0 virtual_router_id 251 priority 100 advert_int 1 mcast_src_ip 10.0.0.243 nopreempt authentication { auth_type PASS auth_pass yinzhengjie_k8s } track_script { chk_lb } virtual_ipaddress { 10.0.0.240 } } EOF 3.4.所有keepalived节点均需要创建健康检查脚本 cat > /etc/keepalived/check_port.sh <<'EOF' #!/bin/bash CHK_PORT=$1 if [ -n "$CHK_PORT" ];then PORT_PROCESS=`ss -lt|grep $CHK_PORT|wc -l` if [ $PORT_PROCESS -eq 0 ];then echo "Port $CHK_PORT Is Not Used,End." systemctl stop keepalived fi else echo "Check Port Cant Be Empty!" fi EOF chmod +x /etc/keepalived/check_port.sh 4.验证haproxy服务并验证 4.1 所有节点启动haproxy服务 systemctl enable --now haproxy systemctl restart haproxy systemctl status haproxy ss -ntl | egrep "8443|9999" 4.2 基于webUI进行验证 [root@k8s-cluster241 ~]# curl http://10.0.0.241:9999/ruok <html><body><h1>200 OK</h1> Service ready. </body></html> [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# curl http://10.0.0.242:9999/ruok <html><body><h1>200 OK</h1> Service ready. </body></html> [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# curl http://10.0.0.243:9999/ruok <html><body><h1>200 OK</h1> Service ready. </body></html> [root@k8s-cluster241 ~]# 5.启动keepalived服务并验证 5.1.所有节点启动keepalived服务 systemctl daemon-reload systemctl enable --now keepalived systemctl status keepalived 5.2 验证服务是否正常 [root@k8s-cluster241 ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 00:0c:29:4c:af:81 brd ff:ff:ff:ff:ff:ff altname enp2s1 altname ens33 inet 10.0.0.241/24 brd 10.0.0.255 scope global eth0 valid_lft forever preferred_lft forever inet 10.0.0.240/32 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::20c:29ff:fe4c:af81/64 scope link valid_lft forever preferred_lft forever 3: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000 link/ipip 0.0.0.0 brd 0.0.0.0 [root@k8s-cluster241 ~]# 5.3 基于telnet验证haporxy是否正常 [root@k8s-cluster242 ~]# telnet 10.0.0.240 8443 Trying 10.0.0.240... Connected to 10.0.0.240. Escape character is '^]'. [root@k8s-cluster243 ~]# ping 10.0.0.240 -c 3 PING 10.0.0.240 (10.0.0.240) 56(84) bytes of data. 
64 bytes from 10.0.0.240: icmp_seq=1 ttl=64 time=0.132 ms 64 bytes from 10.0.0.240: icmp_seq=2 ttl=64 time=0.211 ms 64 bytes from 10.0.0.240: icmp_seq=3 ttl=64 time=0.202 ms --- 10.0.0.240 ping statistics --- 3 packets transmitted, 3 received, 0% packet loss, time 2038ms rtt min/avg/max/mdev = 0.132/0.181/0.211/0.035 ms [root@k8s-cluster243 ~]# 5.4 将VIP节点的haproxy停止 [root@k8s-cluster241 ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 00:0c:29:4c:af:81 brd ff:ff:ff:ff:ff:ff altname enp2s1 altname ens33 inet 10.0.0.241/24 brd 10.0.0.255 scope global eth0 valid_lft forever preferred_lft forever inet 10.0.0.240/32 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::20c:29ff:fe4c:af81/64 scope link valid_lft forever preferred_lft forever 3: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000 link/ipip 0.0.0.0 brd 0.0.0.0 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# systemctl stop haproxy.service # 停止服务 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 00:0c:29:4c:af:81 brd ff:ff:ff:ff:ff:ff altname enp2s1 altname ens33 inet 10.0.0.241/24 brd 10.0.0.255 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::20c:29ff:fe4c:af81/64 scope link valid_lft forever preferred_lft forever 3: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000 link/ipip 0.0.0.0 brd 0.0.0.0 [root@k8s-cluster241 ~]# 5.5 观察VIP是否飘逸 [root@k8s-cluster242 ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 00:0c:29:97:0a:11 brd ff:ff:ff:ff:ff:ff altname enp2s1 altname ens33 inet 10.0.0.242/24 brd 10.0.0.255 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::20c:29ff:fe97:a11/64 scope link valid_lft forever preferred_lft forever 3: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000 link/ipip 0.0.0.0 brd 0.0.0.0 [root@k8s-cluster242 ~]# [root@k8s-cluster243 ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 00:0c:29:c9:46:ff brd ff:ff:ff:ff:ff:ff altname enp2s1 altname ens33 inet 10.0.0.243/24 brd 10.0.0.255 scope global eth0 valid_lft forever preferred_lft forever inet 10.0.0.240/32 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::20c:29ff:fec9:46ff/64 scope link valid_lft forever preferred_lft forever 3: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000 link/ipip 0.0.0.0 brd 0.0.0.0 
[root@k8s-cluster243 ~]# 5.6 如果长期ping测试的话会看到如下的提示 [root@k8s-cluster241 ~]# ping 10.0.0.240 PING 10.0.0.240 (10.0.0.240) 56(84) bytes of data. 64 bytes from 10.0.0.240: icmp_seq=1 ttl=64 time=0.016 ms 64 bytes from 10.0.0.240: icmp_seq=2 ttl=64 time=0.024 ms 64 bytes from 10.0.0.240: icmp_seq=3 ttl=64 time=0.024 ms 64 bytes from 10.0.0.240: icmp_seq=4 ttl=64 time=0.024 ms 64 bytes from 10.0.0.240: icmp_seq=5 ttl=64 time=0.020 ms 64 bytes from 10.0.0.240: icmp_seq=6 ttl=64 time=0.024 ms 64 bytes from 10.0.0.240: icmp_seq=7 ttl=64 time=0.022 ms 64 bytes from 10.0.0.240: icmp_seq=8 ttl=64 time=403 ms From 10.0.0.242 icmp_seq=9 Redirect Host(New nexthop: 10.0.0.240) 64 bytes from 10.0.0.240: icmp_seq=9 ttl=64 time=0.275 ms 64 bytes from 10.0.0.240: icmp_seq=10 ttl=64 time=0.202 ms 64 bytes from 10.0.0.240: icmp_seq=11 ttl=64 time=0.249 ms From 10.0.0.242 icmp_seq=12 Redirect Host(New nexthop: 10.0.0.240) 64 bytes from 10.0.0.240: icmp_seq=12 ttl=64 time=0.292 ms 64 bytes from 10.0.0.240: icmp_seq=13 ttl=64 time=0.314 ms 64 bytes from 10.0.0.240: icmp_seq=14 ttl=64 time=0.164 ms 64 bytes from 10.0.0.240: icmp_seq=15 ttl=64 time=0.158 ms 64 bytes from 10.0.0.240: icmp_seq=16 ttl=64 time=0.170 ms 64 bytes from 10.0.0.240: icmp_seq=17 ttl=64 time=0.152 ms 64 bytes from 10.0.0.240: icmp_seq=18 ttl=64 time=0.227 ms 64 bytes from 10.0.0.240: icmp_seq=19 ttl=64 time=0.141 ms 64 bytes from 10.0.0.240: icmp_seq=20 ttl=64 time=0.558 ms 64 bytes from 10.0.0.240: icmp_seq=21 ttl=64 time=0.173 ms 64 bytes from 10.0.0.240: icmp_seq=22 ttl=64 time=0.165 ms 64 bytes from 10.0.0.240: icmp_seq=23 ttl=64 time=0.178 ms 64 bytes from 10.0.0.240: icmp_seq=24 ttl=64 time=0.199 ms ^C --- 10.0.0.240 ping statistics --- 24 packets transmitted, 24 received, +2 errors, 0% packet loss, time 23510ms rtt min/avg/max/mdev = 0.016/16.932/402.600/80.417 ms [root@k8s-cluster241 ~]# 5.7 可以尝试将停止的服务启动观察VIP是否飘逸回来 很明显,经过测试验证,VIP并不会飘逸回来。 启动服务参考命令: systemctl start keepalived haproxy systemctl status keepalived haproxy - 配置K8S管理节点并自动补全功能 1.测试验证 [root@k8s-cluster241 ~]# kubectl cluster-info --kubeconfig=/weixiang/certs/kubeconfig/kube-admin.kubeconfig Kubernetes control plane is running at https://10.0.0.240:8443 To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'. [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl get cs --kubeconfig=/weixiang/certs/kubeconfig/kube-admin.kubeconfig Warning: v1 ComponentStatus is deprecated in v1.19+ NAME STATUS MESSAGE ERROR controller-manager Healthy ok scheduler Healthy ok etcd-0 Healthy ok [root@k8s-cluster241 ~]# 2.所有master节点操作 mkdir -p $HOME/.kube cp -i /weixiang/certs/kubeconfig/kube-admin.kubeconfig $HOME/.kube/config chown $(id -u):$(id -g) $HOME/.kube/config 3.测试验证 [root@k8s-cluster241 ~]# kubectl get cs Warning: v1 ComponentStatus is deprecated in v1.19+ NAME STATUS MESSAGE ERROR scheduler Healthy ok controller-manager Healthy ok etcd-0 Healthy ok [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl cluster-info Kubernetes control plane is running at https://10.0.0.240:8443 To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'. 
[root@k8s-cluster241 ~]# [root@k8s-cluster242 ~]# kubectl get cs Warning: v1 ComponentStatus is deprecated in v1.19+ NAME STATUS MESSAGE ERROR scheduler Healthy ok controller-manager Healthy ok etcd-0 Healthy ok [root@k8s-cluster242 ~]# [root@k8s-cluster243 ~]# kubectl get cs Warning: v1 ComponentStatus is deprecated in v1.19+ NAME STATUS MESSAGE ERROR scheduler Healthy ok controller-manager Healthy ok etcd-0 Healthy ok [root@k8s-cluster243 ~]# 4.所有master节点配置自动补全功能 kubectl completion bash > ~/.kube/completion.bash.inc echo source '$HOME/.kube/completion.bash.inc' >> ~/.bashrc source ~/.bashrc 5.测试验证自动补全功能 [root@k8s-cluster241 ~]# kubectl annotate (Update the annotations on a resource) api-resources (Print the supported API resources on the server) api-versions (Print the supported API versions on the server, in the form of "group/version") apply (Apply a configuration to a resource by file name or stdin) attach (Attach to a running container) auth (Inspect authorization) autoscale (Auto-scale a deployment, replica set, stateful set, or replication controller) certificate (Modify certificate resources) cluster-info (Display cluster information) completion (Output shell completion code for the specified shell (bash, zsh, fish, or powershell)) config (Modify kubeconfig files) cordon (Mark node as unschedulable) cp (Copy files and directories to and from containers) create (Create a resource from a file or from stdin) debug (Create debugging sessions for troubleshooting workloads and nodes) delete (Delete resources by file names, stdin, resources and names, or by resources and label selector) describe (Show details of a specific resource or group of resources) diff (Diff the live version against a would-be applied version) drain (Drain node in preparation for maintenance) edit (Edit a resource on the server) events (List events) exec (Execute a command in a container) explain (Get documentation for a resource) expose (Take a replication controller, service, deployment or pod and expose it as a new Kubernetes service) get (Display one or many resources) help (Help about any command) kustomize (Build a kustomization target from a directory or URL) --More--
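补充示例(假设性,仅供参考):在上面补全配置的基础上,如果习惯用别名 k 代替 kubectl,可以参考下面的写法让别名同样享有补全能力,其中别名 k 只是示例,可按个人习惯调整。

echo 'alias k=kubectl' >> ~/.bashrc
echo 'complete -o default -F __start_kubectl k' >> ~/.bashrc
source ~/.bashrc
# 验证:输入 "k get no" 后按 TAB,应能像 kubectl 一样补全资源名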
8、启动kubelet服务组件
bash
1.创建Bootstrapping自动颁发kubelet证书配置 温馨提示: - "--server"指向的是负载均衡器的IP地址,由负载均衡器对master节点进行反向代理哟。 - "--token"也可以自定义,但也要同时修改"bootstrap"的Secret的"token-id""token-secret"对应值哟; 1.1 设置集群 [root@k8s-cluster241 ~]# kubectl config set-cluster yinzhengjie-k8s \ --certificate-authority=/weixiang/certs/k8s/k8s-ca.pem \ --embed-certs=true \ --server=https://10.0.0.240:8443 \ --kubeconfig=/weixiang/certs/kubeconfig/bootstrap-kubelet.kubeconfig 1.2 创建用户 [root@k8s-cluster241 ~]# kubectl config set-credentials tls-bootstrap-token-user \ --token=oldboy.jasonyinzhengjie \ --kubeconfig=/weixiang/certs/kubeconfig/bootstrap-kubelet.kubeconfig 1.3 将集群和用户进行绑定 [root@k8s-cluster241 ~]# kubectl config set-context tls-bootstrap-token-user@kubernetes \ --cluster=yinzhengjie-k8s \ --user=tls-bootstrap-token-user \ --kubeconfig=/weixiang/certs/kubeconfig/bootstrap-kubelet.kubeconfig 1.4.配置默认的上下文 [root@k8s-cluster241 ~]# kubectl config use-context tls-bootstrap-token-user@kubernetes \ --kubeconfig=/weixiang/certs/kubeconfig/bootstrap-kubelet.kubeconfig 1.5 查看kubeconfig资源结构 [root@k8s-cluster241 ~]# kubectl config view --kubeconfig=/weixiang/certs/kubeconfig/bootstrap-kubelet.kubeconfig apiVersion: v1 clusters: - cluster: certificate-authority-data: DATA+OMITTED server: https://10.0.0.240:8443 name: yinzhengjie-k8s contexts: - context: cluster: yinzhengjie-k8s user: tls-bootstrap-token-user name: tls-bootstrap-token-user@kubernetes current-context: tls-bootstrap-token-user@kubernetes kind: Config preferences: {} users: - name: tls-bootstrap-token-user user: token: REDACTED [root@k8s-cluster241 ~]# 1.6 拷贝kubelet的Kubeconfig文件 [root@k8s-cluster241 ~]# data_rsync.sh /weixiang/certs/kubeconfig/bootstrap-kubelet.kubeconfig 2 创建bootstrap-secret授权 2.1 创建配bootstrap-secret文件用于授权 [root@k8s-cluster241 ~]# cat > bootstrap-secret.yaml <<EOF apiVersion: v1 kind: Secret metadata: name: bootstrap-token-oldboy namespace: kube-system type: bootstrap.kubernetes.io/token stringData: description: "The default bootstrap token generated by 'kubelet '." 
token-id: oldboy token-secret: jasonyinzhengjie usage-bootstrap-authentication: "true" usage-bootstrap-signing: "true" auth-extra-groups: system:bootstrappers:default-node-token,system:bootstrappers:worker,system:bootstrappers:ingress --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: kubelet-bootstrap roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:node-bootstrapper subjects: - apiGroup: rbac.authorization.k8s.io kind: Group name: system:bootstrappers:default-node-token --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: node-autoapprove-bootstrap roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:certificates.k8s.io:certificatesigningrequests:nodeclient subjects: - apiGroup: rbac.authorization.k8s.io kind: Group name: system:bootstrappers:default-node-token --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: node-autoapprove-certificate-rotation roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient subjects: - apiGroup: rbac.authorization.k8s.io kind: Group name: system:nodes --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: annotations: rbac.authorization.kubernetes.io/autoupdate: "true" labels: kubernetes.io/bootstrapping: rbac-defaults name: system:kube-apiserver-to-kubelet rules: - apiGroups: - "" resources: - nodes/proxy - nodes/stats - nodes/log - nodes/spec - nodes/metrics verbs: - "*" --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: system:kube-apiserver roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:kube-apiserver-to-kubelet subjects: - apiGroup: rbac.authorization.k8s.io kind: User name: kube-apiserver EOF 2.2.应用bootstrap-secret配置文件 [root@k8s-cluster241 ~]# etcdctl get "/" --prefix --keys-only | wc -l 476 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl apply -f bootstrap-secret.yaml secret/bootstrap-token-oldboy created clusterrolebinding.rbac.authorization.k8s.io/kubelet-bootstrap created clusterrolebinding.rbac.authorization.k8s.io/node-autoapprove-bootstrap created clusterrolebinding.rbac.authorization.k8s.io/node-autoapprove-certificate-rotation created clusterrole.rbac.authorization.k8s.io/system:kube-apiserver-to-kubelet created clusterrolebinding.rbac.authorization.k8s.io/system:kube-apiserver created [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# etcdctl get "/" --prefix --keys-only | wc -l 488 [root@k8s-cluster241 ~]# 3.部署worker节点之kubelet启动实战 温馨提示: - 在"10-kubelet.con"文件中使用"--kubeconfig"指定的"kubelet.kubeconfig"文件并不存在,这个证书文件后期会自动生成; - 对于"clusterDNS"是NDS地址,我们可以自定义,比如"10.200.0.254"; - “clusterDomain”对应的是域名信息,要和我们设计的集群保持一致,比如"yinzhengjie.com"; - "10-kubelet.conf"文件中的"ExecStart="需要写2次,否则可能无法启动kubelet; 具体实操: 3.1 所有节点创建工作目录 mkdir -p /var/lib/kubelet /var/log/kubernetes /etc/systemd/system/kubelet.service.d /etc/kubernetes/manifests/ 3.2 所有节点创建kubelet的配置文件 cat > /etc/kubernetes/kubelet-conf.yml <<'EOF' apiVersion: kubelet.config.k8s.io/v1beta1 kind: KubeletConfiguration address: 0.0.0.0 port: 10250 readOnlyPort: 10255 authentication: anonymous: enabled: false webhook: cacheTTL: 2m0s enabled: true x509: clientCAFile: /weixiang/certs/k8s/k8s-ca.pem authorization: mode: Webhook webhook: cacheAuthorizedTTL: 5m0s cacheUnauthorizedTTL: 30s cgroupDriver: systemd cgroupsPerQOS: true clusterDNS: - 10.200.0.254 clusterDomain: weixiang.com 
containerLogMaxFiles: 5 containerLogMaxSize: 10Mi contentType: application/vnd.kubernetes.protobuf cpuCFSQuota: true cpuManagerPolicy: none cpuManagerReconcilePeriod: 10s enableControllerAttachDetach: true enableDebuggingHandlers: true enforceNodeAllocatable: - pods eventBurst: 10 eventRecordQPS: 5 evictionHard: imagefs.available: 15% memory.available: 100Mi nodefs.available: 10% nodefs.inodesFree: 5% evictionPressureTransitionPeriod: 5m0s failSwapOn: true fileCheckFrequency: 20s hairpinMode: promiscuous-bridge healthzBindAddress: 127.0.0.1 healthzPort: 10248 httpCheckFrequency: 20s imageGCHighThresholdPercent: 85 imageGCLowThresholdPercent: 80 imageMinimumGCAge: 2m0s iptablesDropBit: 15 iptablesMasqueradeBit: 14 kubeAPIBurst: 10 kubeAPIQPS: 5 makeIPTablesUtilChains: true maxOpenFiles: 1000000 maxPods: 110 nodeStatusUpdateFrequency: 10s oomScoreAdj: -999 podPidsLimit: -1 registryBurst: 10 registryPullQPS: 5 resolvConf: /etc/kubernetes/resolv.conf rotateCertificates: true runtimeRequestTimeout: 2m0s serializeImagePulls: true staticPodPath: /etc/kubernetes/manifests streamingConnectionIdleTimeout: 4h0m0s syncFrequency: 1m0s volumeStatsAggPeriod: 1m0s EOF 3.3 所有节点配置kubelet service cat > /usr/lib/systemd/system/kubelet.service <<'EOF' [Unit] Description=JasonYin's Kubernetes Kubelet Documentation=https://github.com/kubernetes/kubernetes After=containerd.service Requires=containerd.service [Service] ExecStart=/usr/local/bin/kubelet Restart=always StartLimitInterval=0 RestartSec=10 [Install] WantedBy=multi-user.target EOF 3.4 所有节点配置kubelet service的配置文件 cat > /etc/systemd/system/kubelet.service.d/10-kubelet.conf <<'EOF' [Service] Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/weixiang/certs/kubeconfig/bootstrap-kubelet.kubeconfig --kubeconfig=/weixiang/certs/kubeconfig/kubelet.kubeconfig" Environment="KUBELET_CONFIG_ARGS=--config=/etc/kubernetes/kubelet-conf.yml" Environment="KUBELET_SYSTEM_ARGS=--container-runtime-endpoint=unix:///run/containerd/containerd.sock" Environment="KUBELET_EXTRA_ARGS=--node-labels=node.kubernetes.io/node='' " ExecStart= ExecStart=/usr/local/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_SYSTEM_ARGS $KUBELET_EXTRA_ARGS EOF 3.5 启动所有节点kubelet systemctl daemon-reload systemctl enable --now kubelet systemctl status kubelet 4. 
在所有master节点上查看nodes信息。 4.1 'k8s-cluster241'查看worker节点列表 [root@k8s-cluster241 ~]# kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME k8s-cluster241 NotReady <none> 100s v1.33.1 10.0.0.241 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic containerd://1.6.36 k8s-cluster242 NotReady <none> 8s v1.33.1 10.0.0.242 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic containerd://1.6.36 k8s-cluster243 NotReady <none> 2s v1.33.1 10.0.0.243 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic containerd://1.6.36 [root@k8s-cluster241 ~]# 4.2 'k8s-cluster242'查看worker节点列表 [root@k8s-cluster242 ~]# kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME k8s-cluster241 NotReady <none> 2m3s v1.33.1 10.0.0.241 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic containerd://1.6.36 k8s-cluster242 NotReady <none> 31s v1.33.1 10.0.0.242 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic containerd://1.6.36 k8s-cluster243 NotReady <none> 25s v1.33.1 10.0.0.243 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic containerd://1.6.36 [root@k8s-cluster242 ~]# 4.3 'k8s-cluster243'查看worker节点列表 [root@k8s-cluster243 ~]# kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME k8s-cluster241 NotReady <none> 2m15s v1.33.1 10.0.0.241 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic containerd://1.6.36 k8s-cluster242 NotReady <none> 43s v1.33.1 10.0.0.242 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic containerd://1.6.36 k8s-cluster243 NotReady <none> 37s v1.33.1 10.0.0.243 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic containerd://1.6.36 [root@k8s-cluster243 ~]# 5.可以查看到有相应的csr用户客户端的证书请求 [root@k8s-cluster243 ~]# kubectl get csr NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION csr-2qmw7 62s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:oldboy <none> Approved,Issued csr-qcvjf 2m40s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:oldboy <none> Approved,Issued csr-rrp6d 69s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:oldboy <none> Approved,Issued [root@k8s-cluster243 ~]# - 启动kube-proxy组件服务 1.所有节点创建kube-proxy.conf配置文件 cat > /etc/kubernetes/kube-proxy.yml << EOF apiVersion: kubeproxy.config.k8s.io/v1alpha1 kind: KubeProxyConfiguration bindAddress: 0.0.0.0 metricsBindAddress: 127.0.0.1:10249 clientConnection: acceptConnection: "" burst: 10 contentType: application/vnd.kubernetes.protobuf kubeconfig: /weixiang/certs/kubeconfig/kube-proxy.kubeconfig qps: 5 clusterCIDR: 10.100.0.0/16 configSyncPeriod: 15m0s conntrack: max: null maxPerCore: 32768 min: 131072 tcpCloseWaitTimeout: 1h0m0s tcpEstablishedTimeout: 24h0m0s enableProfiling: false healthzBindAddress: 0.0.0.0:10256 hostnameOverride: "" iptables: masqueradeAll: false masqueradeBit: 14 minSyncPeriod: 0s ipvs: masqueradeAll: true minSyncPeriod: 5s scheduler: "rr" syncPeriod: 30s mode: "ipvs" nodeProtAddress: null oomScoreAdj: -999 portRange: "" udpIdelTimeout: 250ms EOF 2.所有节点使用systemd管理kube-proxy cat > /usr/lib/systemd/system/kube-proxy.service << EOF [Unit] Description=Jason Yin's Kubernetes Proxy After=network.target [Service] ExecStart=/usr/local/bin/kube-proxy \ --config=/etc/kubernetes/kube-proxy.yml \ --v=2 Restart=on-failure LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF 3.启动kube-proxy服务之前查看etcd是数据统计 [root@k8s-cluster243 ~]# etcdctl get "/" --prefix --keys-only | wc -l 560 [root@k8s-cluster243 ~]# 4.所有节点启动kube-proxy systemctl daemon-reload && systemctl 
enable --now kube-proxy
systemctl status kube-proxy
ss -ntl | egrep "10256|10249"

5.启动kube-proxy服务之后再次查看etcd的数据统计【说明kube-proxy服务启动后,其实也会和api-server通信,并由api-server写入数据到etcd中】
[root@k8s-cluster243 ~]# etcdctl get "/" --prefix --keys-only | wc -l
566
[root@k8s-cluster243 ~]#

6.集群关机拍快照
略,见视频。

温馨提示:
	为了让大家后续可以更换不同的网络插件环境,建议大家在此步骤再拍一次快照,如果将来想要更换网络插件,直接还原快照即可。
	避免你未来为了卸载现有的CNI插件卸载不干净导致不必要的冲突。
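补充示例(假设性,仅供参考):kube-proxy 启动后,可以粗略确认它确实工作在上文配置的 ipvs 模式。其中 10249 对应上面 metricsBindAddress 的端口;如果该接口在你的版本中不可用,也可以直接安装 ipvsadm 观察内核中的转发规则。

# 查看 kube-proxy 当前代理模式,预期返回 ipvs
curl -s 127.0.0.1:10249/proxyMode
# 安装 ipvsadm 并查看 ipvs 规则(此时至少应能看到 kubernetes 这个 svc 的转发条目)
apt-get -y install ipvsadm
ipvsadm -Ln | head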
9、部署Calico实战
bash
部署Calico实战 参考链接: https://docs.tigera.io/calico/latest/getting-started/kubernetes/quickstart#step-2-install-calico 1.所有节点准备宿主机DNS解析文件 [root@k8s-cluster241 ~]# cat > /etc/kubernetes/resolv.conf <<EOF nameserver 223.5.5.5 options edns0 trust-ad search . EOF [root@k8s-cluster241 ~]# data_rsync.sh /etc/kubernetes/resolv.conf 2.下载资源清单 [root@k8s-cluster241 ~]# wget https://raw.githubusercontent.com/projectcalico/calico/v3.30.2/manifests/tigera-operator.yaml [root@k8s-cluster241 ~]# wget https://raw.githubusercontent.com/projectcalico/calico/v3.30.2/manifests/custom-resources.yaml 3.导入镜像 SVIP参考链接: http://192.168.21.253/Resources/Kubernetes/K8S%20Cluster/CNI/calico/calico-v3.30.2/ 彩蛋: 拉取镜像测试: [root@k8s-cluster242 ~]# export http_proxy=http://10.0.0.1:7890 export https_proxy=http://10.0.0.1:7890; ctr -n k8s.io i pull docker.io/calico/pod2daemon-flexvol:v3.30.2 导入镜像 : [root@k8s-cluster242 ~]# ctr -n k8s.io i import weixiang-tigera-operator-v1.38.3.tar.gz unpacking quay.io/tigera/operator:v1.38.3 (sha256:dbf1bad0def7b5955dc8e4aeee96e23ead0bc5822f6872518e685cd0ed484121)...done [root@k8s-cluster242 ~]# 3.安装Tigera Operator和自定义资源定义。 [root@k8s-cluster241 ~]# kubectl apply -f tigera-operator.yaml namespace/tigera-operator created serviceaccount/tigera-operator created clusterrole.rbac.authorization.k8s.io/tigera-operator-secrets created clusterrole.rbac.authorization.k8s.io/tigera-operator created clusterrolebinding.rbac.authorization.k8s.io/tigera-operator created rolebinding.rbac.authorization.k8s.io/tigera-operator-secrets created deployment.apps/tigera-operator created [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl get pods -A -o wide NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES tigera-operator tigera-operator-747864d56d-z6d2m 1/1 Running 0 5m52s 10.0.0.241 k8s-cluster241 <none> <none> [root@k8s-cluster241 ~]# 4.通过创建必要的自定义资源来安装Calico。 [root@k8s-cluster241 ~]# grep 16 custom-resources.yaml cidr: 192.168.0.0/16 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# sed -i '/16/s#192.168#10.100#' custom-resources.yaml [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# grep 16 custom-resources.yaml cidr: 10.100.0.0/16 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# grep blockSize custom-resources.yaml blockSize: 26 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# sed -i '/blockSize/s#26#24#' custom-resources.yaml [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# grep blockSize custom-resources.yaml blockSize: 24 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl create -f custom-resources.yaml installation.operator.tigera.io/default created apiserver.operator.tigera.io/default created goldmane.operator.tigera.io/default created whisker.operator.tigera.io/default created [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# 5.检查Pod是否就绪 [root@k8s-cluster241 ~]# kubectl get pods -A -o wide NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES calico-apiserver calico-apiserver-68b989c5c9-7wkpv 1/1 Running 0 68s 10.100.88.7 k8s-cluster243 <none> <none> calico-apiserver calico-apiserver-68b989c5c9-hzgqr 1/1 Running 0 68s 10.100.99.6 k8s-cluster242 <none> <none> calico-system calico-kube-controllers-74544dd8d8-sjc6g 1/1 Running 0 68s 10.100.88.8 k8s-cluster243 <none> <none> calico-system calico-node-p24ql 1/1 Running 0 62s 10.0.0.242 k8s-cluster242 <none> <none> calico-system calico-node-qlr7r 1/1 Running 0 64s 10.0.0.243 k8s-cluster243 <none> <none> calico-system calico-node-w6cmr 1/1 Running 0 67s 10.0.0.241 k8s-cluster241 <none> <none> 
calico-system calico-typha-6744756b8b-cxkxp 1/1 Running 0 67s 10.0.0.241 k8s-cluster241 <none> <none> calico-system calico-typha-6744756b8b-mbdww 1/1 Running 0 68s 10.0.0.243 k8s-cluster243 <none> <none> calico-system csi-node-driver-6lnsp 2/2 Running 0 65s 10.100.86.6 k8s-cluster241 <none> <none> calico-system csi-node-driver-p299m 2/2 Running 0 61s 10.100.88.10 k8s-cluster243 <none> <none> calico-system csi-node-driver-tstgf 2/2 Running 0 64s 10.100.99.8 k8s-cluster242 <none> <none> calico-system goldmane-56dbdbc4c8-hdhtv 1/1 Running 0 67s 10.100.88.9 k8s-cluster243 <none> <none> calico-system whisker-c75778546-mpnqz 2/2 Running 0 67s 10.100.99.7 k8s-cluster242 <none> <none> tigera-operator tigera-operator-747864d56d-9v67r 1/1 Running 0 67s 10.0.0.242 k8s-cluster242 <none> <none> [root@k8s-cluster241 ~]# 6.监控Calico Whisker中的网络流量 [root@k8s-cluster243 ~]# kubectl port-forward -n calico-system service/whisker 8081:8081 --address 0.0.0.0 Forwarding from 0.0.0.0:8081 -> 8081 7.浏览器访问测试 http://10.0.0.243:8081/flow-logs 8.检查节点的状态 [root@k8s-cluster241 ~]# kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME k8s-cluster241 Ready <none> 83m v1.33.3 10.0.0.241 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic containerd://1.6.36 k8s-cluster242 Ready <none> 83m v1.33.3 10.0.0.242 <none> Ubuntu 22.04.4 LTS 5.15.0-119-generic containerd://1.6.36 k8s-cluster243 Ready <none> 83m v1.33.3 10.0.0.243 <none> Ubuntu 22.04.4 LTS 5.15.0-151-generic containerd://1.6.36 [root@k8s-cluster241 ~]# - 测试CNI网络插件 1.编写资源清单 [root@k8s-cluster241 ~]# cat > weixiang-network-cni-test.yaml <<EOF apiVersion: v1 kind: Pod metadata: name: xiuxian-v1 labels: apps: v1 spec: nodeName: k8s-cluster242 containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 name: xiuxian --- apiVersion: v1 kind: Pod metadata: name: xiuxian-v2 labels: apps: v2 spec: nodeName: k8s-cluster243 containers: - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2 name: xiuxian EOF 2 创建资源并测试Pod跨节点访问 [root@k8s-cluster241 ~]# kubectl apply -f weixiang-network-cni-test.yaml pod/xiuxian-v1 created pod/xiuxian-v2 created [root@k8s-cluster241 ~]# 3.测试网络 [root@k8s-cluster241 ~]# kubectl get pods -o wide --show-labels NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS xiuxian-v1 1/1 Running 0 5s 10.100.99.10 k8s-cluster242 <none> <none> apps=v1 xiuxian-v2 1/1 Running 0 5s 10.100.88.12 k8s-cluster243 <none> <none> apps=v2 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# curl 10.100.99.10 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# curl 10.100.88.12 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v2</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: red">凡人修仙传 v2 </h1> <div> <img src="2.jpg"> <div> </body> </html> [root@k8s-cluster241 ~]# 4.暴露服务对集群外部【可跳过】 [root@k8s-cluster241 ~]# kubectl expose po xiuxian-v1 --port=80 service/xiuxian-v1 exposed [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 19h xiuxian-v1 ClusterIP 10.200.240.114 <none> 80/TCP 4s [root@k8s-cluster241 ~]# [root@k8s-cluster241 
~]# curl 10.200.240.114 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@k8s-cluster241 ~]# 温馨提示: 官方关于calico网络的Whisker组件流量监控是有出图效果的,可以模拟测试,但建议先部署CoreDNS组件。 - 附加组件CoreDNS部署实战 1 下载资源清单 参考链接: https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns/coredns [root@k8s-cluster241 ~]# wget http://192.168.21.253/Resources/Kubernetes/Add-ons/CoreDNS/coredns.yaml.base 2 修改资源清单模板的关键字段 [root@k8s-cluster241 ~]# sed -i '/__DNS__DOMAIN__/s#__DNS__DOMAIN__#weixiang.com#' coredns.yaml.base [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# sed -i '/__DNS__MEMORY__LIMIT__/s#__DNS__MEMORY__LIMIT__#200Mi#' coredns.yaml.base [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# sed -i '/__DNS__SERVER__/s#__DNS__SERVER__#10.200.0.254#' coredns.yaml.base [root@k8s-cluster241 ~]# 相关字段说明: __DNS__DOMAIN__ DNS自定义域名,要和你实际的K8S域名对应上。 __DNS__MEMORY__LIMIT__ CoreDNS组件的内存限制。 __DNS__SERVER__ DNS服务器的svc的CLusterIP地址。 3.部署CoreDNS组件 [root@k8s-cluster241 ~]# kubectl apply -f coredns.yaml.base serviceaccount/coredns created clusterrole.rbac.authorization.k8s.io/system:coredns created clusterrolebinding.rbac.authorization.k8s.io/system:coredns created configmap/coredns created deployment.apps/coredns created service/kube-dns created [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl -n kube-system get svc,po -o wide NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/kube-dns ClusterIP 10.200.0.254 <none> 53/UDP,53/TCP,9153/TCP 83s k8s-app=kube-dns NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/coredns-5578c9dc84-qvpqt 1/1 Running 0 82s 10.100.86.7 k8s-cluster241 <none> <none> [root@k8s-cluster241 ~]# 温馨提示: 如果镜像下载失败,可以手动导入。操作如下: wget http://192.168.21.253/Resources/Kubernetes/Add-ons/CoreDNS/weixiang-coredns-v1.12.0.tar.gz ctr -n k8s.io i import weixiang-coredns-v1.12.0.tar.gz 5.验证DNS服务 [root@k8s-cluster241 ~]# kubectl get svc -A NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE calico-apiserver calico-api ClusterIP 10.200.64.223 <none> 443/TCP 3h40m calico-system calico-kube-controllers-metrics ClusterIP None <none> 9094/TCP 3h19m calico-system calico-typha ClusterIP 10.200.103.220 <none> 5473/TCP 3h40m calico-system goldmane ClusterIP 10.200.245.83 <none> 7443/TCP 3h40m calico-system whisker ClusterIP 10.200.160.62 <none> 8081/TCP 3h40m default kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 22h default xiuxian-v1 ClusterIP 10.200.240.114 <none> 80/TCP 3h2m kube-system kube-dns ClusterIP 10.200.0.254 <none> 53/UDP,53/TCP,9153/TCP 110s [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# dig @10.200.0.254 kube-dns.kube-system.svc.weixiang.com +short 10.200.0.254 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# dig @10.200.0.254 kubernetes.default.svc.weixiang.com +short 10.200.0.1 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# dig @10.200.0.254 calico-api.calico-apiserver.svc.weixiang.com +short 10.200.64.223 [root@k8s-cluster241 ~]# 6.部署Pod验证默认的DNS服务器 [root@k8s-cluster241 ~]# kubectl apply -f weixiang-network-cni-test.yaml pod/xiuxian-v1 created pod/xiuxian-v2 created [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES xiuxian-v1 1/1 Running 0 3s 10.100.99.11 k8s-cluster242 <none> <none> xiuxian-v2 1/1 Running 0 3s 10.100.88.13 k8s-cluster243 <none> <none> 
[root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl exec -it xiuxian-v1 -- sh / # cat /etc/resolv.conf search default.svc.weixiang.com svc.weixiang.com weixiang.com nameserver 10.200.0.254 options ndots:5 / # 7.清除Pod环境 [root@k8s-cluster241 ~]# kubectl delete pods --all pod "xiuxian-v1" deleted pod "xiuxian-v2" deleted [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl get pods -o wide No resources found in default namespace. [root@k8s-cluster241 ~]# - 部署MetallB组件实现LoadBalancer 1 配置kube-proxy代理模式为ipvs [root@k8s-cluster241 ~]# grep mode /etc/kubernetes/kube-proxy.yml mode: "ipvs" [root@k8s-cluster241 ~]# [root@k8s-cluster242 ~]# grep mode /etc/kubernetes/kube-proxy.yml mode: "ipvs" [root@k8s-cluster242 ~]# [root@k8s-cluster243 ~]# grep mode /etc/kubernetes/kube-proxy.yml mode: "ipvs" [root@k8s-cluster243 ~]# 2 K8S集群所有节点导入镜像 wget http://192.168.21.253/Resources/Kubernetes/Add-ons/metallb/v0.15.2/weixiang-metallb-controller-v0.15.2.tar.gz wget http://192.168.21.253/Resources/Kubernetes/Add-ons/metallb/v0.15.2/weixiang-metallb-speaker-v0.15.2.tar.gz ctr -n k8s.io i import weixiang-metallb-controller-v0.15.2.tar.gz ctr -n k8s.io i import weixiang-metallb-speaker-v0.15.2.tar.gz 3.下载metallb组件的资源清单 [root@master231 metallb]# wget https://raw.githubusercontent.com/metallb/metallb/v0.15.2/config/manifests/metallb-native.yaml SVIP: [root@k8s-cluster241 ~]# wget http://192.168.21.253/Resources/Kubernetes/Add-ons/metallb/v0.15.2/metallb-native.yaml 4 部署Metallb [root@k8s-cluster241 ~]# kubectl apply -f metallb-native.yaml 5 创建IP地址池 [root@k8s-cluster241 ~]# cat > metallb-ip-pool.yaml <<EOF apiVersion: metallb.io/v1beta1 kind: IPAddressPool metadata: name: jasonyin2020 namespace: metallb-system spec: addresses: # 注意改为你自己为MetalLB分配的IP地址,改地址,建议设置为你windows能够访问的网段。【建议设置你的虚拟机Vmnet8网段】 - 10.0.0.150-10.0.0.180 --- apiVersion: metallb.io/v1beta1 kind: L2Advertisement metadata: name: yinzhengjie namespace: metallb-system spec: ipAddressPools: - jasonyin2020 EOF [root@k8s-cluster241 ~]# kubectl apply -f metallb-ip-pool.yaml ipaddresspool.metallb.io/jasonyin2020 created l2advertisement.metallb.io/yinzhengjie created [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl get ipaddresspools.metallb.io -A NAMESPACE NAME AUTO ASSIGN AVOID BUGGY IPS ADDRESSES metallb-system jasonyin2020 true false ["10.0.0.150-10.0.0.180"] [root@k8s-cluster241 ~]# 6 创建LoadBalancer的Service测试验证 [root@k8s-cluster241 ~]# cat > deploy-svc-LoadBalancer.yaml <<EOF apiVersion: apps/v1 kind: Deployment metadata: name: deploy-xiuxian spec: replicas: 3 selector: matchLabels: apps: v1 template: metadata: labels: apps: v1 spec: containers: - name: c1 image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 --- apiVersion: v1 kind: Service metadata: name: svc-xiuxian spec: type: LoadBalancer selector: apps: v1 ports: - port: 80 EOF [root@k8s-cluster241 ~]# kubectl apply -f deploy-svc-LoadBalancer.yaml deployment.apps/deploy-xiuxian created service/svc-xiuxian created [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl get deploy,svc,po -o wide NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR deployment.apps/deploy-xiuxian 3/3 3 3 9s c1 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v3 apps=v1 NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/kubernetes ClusterIP 10.200.0.1 <none> 443/TCP 22h <none> service/svc-xiuxian LoadBalancer 10.200.134.72 10.0.0.150 80:17862/TCP 9s apps=v1 service/xiuxian-v1 ClusterIP 10.200.240.114 <none> 80/TCP 3h12m apps=v1 NAME READY STATUS RESTARTS AGE IP NODE 
NOMINATED NODE READINESS GATES pod/deploy-xiuxian-5bc4d8c6d5-hnln8 1/1 Running 0 9s 10.100.88.14 k8s-cluster243 <none> <none> pod/deploy-xiuxian-5bc4d8c6d5-hw7w6 1/1 Running 0 9s 10.100.86.9 k8s-cluster241 <none> <none> pod/deploy-xiuxian-5bc4d8c6d5-jr5vf 1/1 Running 0 9s 10.100.99.12 k8s-cluster242 <none> <none> [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# curl 10.0.0.150 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v3</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: pink">凡人修仙传 v3 </h1> <div> <img src="3.jpg"> <div> </body> </html> [root@k8s-cluster241 ~]#
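补充示例(假设性,仅供参考):如果希望某个 LoadBalancer 类型的 Service 固定使用地址池中的某个 IP(例如 10.0.0.180),可以参考下面的写法。这里用的是 Kubernetes 原生的 spec.loadBalancerIP 字段(新版本已标记为弃用,MetalLB 也支持通过注解指定),地址必须落在上面定义的地址池范围内,Service 名称与 IP 均为示例。

cat > svc-xiuxian-fixed-ip.yaml <<EOF
apiVersion: v1
kind: Service
metadata:
  name: svc-xiuxian-fixed
spec:
  type: LoadBalancer
  # 假设性示例:从 MetalLB 地址池中申请固定地址
  loadBalancerIP: 10.0.0.180
  selector:
    apps: v1
  ports:
  - port: 80
EOF
kubectl apply -f svc-xiuxian-fixed-ip.yaml
kubectl get svc svc-xiuxian-fixed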
10、K8S附加组件helm部署
bash
1.下载helm软件包 wget https://get.helm.sh/helm-v3.18.4-linux-amd64.tar.gz SVIP: [root@k8s-cluster241 ~]# wget http://192.168.21.253/Resources/Kubernetes/Add-ons/helm/softwares/helm-v3.18.4-linux-amd64.tar.gz 2.解压软件包 [root@k8s-cluster241 ~]# tar xf helm-v3.18.4-linux-amd64.tar.gz -C /usr/local/bin/ linux-amd64/helm --strip-components=1 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# ll /usr/local/bin/helm -rwxr-xr-x 1 1001 fwupd-refresh 59715768 Jul 9 04:36 /usr/local/bin/helm* [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# helm version version.BuildInfo{Version:"v3.18.4", GitCommit:"d80839cf37d860c8aa9a0503fe463278f26cd5e2", GitTreeState:"clean", GoVersion:"go1.24.4"} [root@k8s-cluster241 ~]# 3.配置helm的自动补全功能 [root@k8s-cluster241 ~]# helm completion bash > /etc/bash_completion.d/helm [root@k8s-cluster241 ~]# source /etc/bash_completion.d/helm [root@k8s-cluster241 ~]# echo 'source /etc/bash_completion.d/helm' >> ~/.bashrc [root@k8s-cluster241 ~]# - 基于helm部署Dashboard 参考链接: https://github.com/kubernetes/dashboard 1.添加Dashboard的仓库地址 [root@k8s-cluster241 ~]# helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/ "kubernetes-dashboard" has been added to your repositories [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# helm repo list NAME URL kubernetes-dashboard https://kubernetes.github.io/dashboard/ [root@k8s-cluster241 ~]# 2.安装Dashboard [root@k8s-cluster241 ~]# helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard Release "kubernetes-dashboard" does not exist. Installing it now. Error: Get "https://github.com/kubernetes/dashboard/releases/download/kubernetes-dashboard-7.13.0/kubernetes-dashboard-7.13.0.tgz": dial tcp 20.205.243.166:443: connect: connection refused [root@k8s-cluster241 ~]# svip: [root@k8s-cluster241 ~]# wget http://192.168.21.253/Resources/Kubernetes/Add-ons/dashboard/helm/v7.13.0/kubernetes-dashboard-7.13.0.tgz [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# tar xf kubernetes-dashboard-7.13.0.tgz [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# ll kubernetes-dashboard total 56 drwxr-xr-x 4 root root 4096 Aug 1 14:55 ./ drwx------ 10 root root 4096 Aug 1 14:55 ../ -rw-r--r-- 1 root root 497 May 28 23:14 Chart.lock drwxr-xr-x 6 root root 4096 Aug 1 14:55 charts/ -rw-r--r-- 1 root root 982 May 28 23:14 Chart.yaml -rw-r--r-- 1 root root 948 May 28 23:14 .helmignore -rw-r--r-- 1 root root 8209 May 28 23:14 README.md drwxr-xr-x 10 root root 4096 Aug 1 14:55 templates/ -rw-r--r-- 1 root root 13729 May 28 23:14 values.yaml [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# helm upgrade --install mywebui kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard Release "mywebui" does not exist. Installing it now. NAME: mywebui LAST DEPLOYED: Fri Aug 1 14:55:46 2025 NAMESPACE: kubernetes-dashboard STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: ************************************************************************************************* *** PLEASE BE PATIENT: Kubernetes Dashboard may need a few minutes to get up and become ready *** ************************************************************************************************* Congratulations! You have just installed Kubernetes Dashboard in your cluster. To access Dashboard run: kubectl -n kubernetes-dashboard port-forward svc/mywebui-kong-proxy 8443:443 NOTE: In case port-forward command does not work, make sure that kong service name is correct. 
Check the services in Kubernetes Dashboard namespace using: kubectl -n kubernetes-dashboard get svc Dashboard will be available at: https://localhost:8443 [root@k8s-cluster241 ~]# 3.查看部署信息 [root@k8s-cluster241 ~]# helm -n kubernetes-dashboard list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION mywebui kubernetes-dashboard 1 2025-08-01 14:55:46.018206818 +0800 CST deployed kubernetes-dashboard-7.13.0 [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl -n kubernetes-dashboard get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES mywebui-kong-5bdcb94b79-86ftr 1/1 Running 0 9m37s 10.100.86.11 k8s-cluster241 <none> <none> mywebui-kubernetes-dashboard-api-74fbd85467-vspv9 1/1 Running 0 9m37s 10.100.88.16 k8s-cluster243 <none> <none> mywebui-kubernetes-dashboard-auth-69d4c5864b-zpzcj 1/1 Running 0 9m37s 10.100.99.13 k8s-cluster242 <none> <none> mywebui-kubernetes-dashboard-metrics-scraper-5c99c5ccc8-4f96n 1/1 Running 0 9m37s 10.100.86.10 k8s-cluster241 <none> <none> mywebui-kubernetes-dashboard-web-cd678f7dd-bmjgj 1/1 Running 0 9m37s 10.100.88.15 k8s-cluster243 <none> <none> [root@k8s-cluster241 ~]# SVIP镜像下载地址: http://192.168.21.253/Resources/Kubernetes/Add-ons/dashboard/helm/v7.13.0/images/ 4.修改svc的类型 [root@k8s-cluster241 ~]# kubectl get svc -n kubernetes-dashboard mywebui-kong-proxy NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE mywebui-kong-proxy ClusterIP 10.200.64.101 <none> 443/TCP 10m [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl edit svc -n kubernetes-dashboard mywebui-kong-proxy service/mywebui-kong-proxy edited [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl get svc -n kubernetes-dashboard mywebui-kong-proxy NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE mywebui-kong-proxy LoadBalancer 10.200.64.101 10.0.0.151 443:14400/TCP 10m [root@k8s-cluster241 ~]# 5.访问WebUI https://10.0.0.151/#/login 6.创建登录账号 6.1 创建sa [root@k8s-cluster241 ~]# kubectl create serviceaccount admin serviceaccount/admin created [root@k8s-cluster241 ~]# 6.2 将sa和CLuster-admin进行绑定 [root@k8s-cluster241 ~]# kubectl create clusterrolebinding dashboard-admin --serviceaccount=default:admin --clusterrole=cluster-admin clusterrolebinding.rbac.authorization.k8s.io/dashboard-admin created [root@k8s-cluster241 ~]# 6.3 获取账号的token并进行webUI的登录 [root@k8s-cluster241 ~]# kubectl create token admin eyJhbGciOiJSUzI1NiIsImtpZCI6IjFSTlY2dk5FS3BrdHkySFNnTW1nZFJSMXhibU83X0twWFhMUHBhZGRhV2sifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLm9sZGJveWVkdS5jb20iXSwiZXhwIjoxNzU0MDM1Njc1LCJpYXQiOjE3NTQwMzIwNzUsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5vbGRib3llZHUuY29tIiwianRpIjoiNjc1NTgyZDAtZTMyYS00NjkwLTllODQtZWMyMTJiY2JhYTM4Iiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJkZWZhdWx0Iiwic2VydmljZWFjY291bnQiOnsibmFtZSI6ImFkbWluIiwidWlkIjoiZDgwY2E0OTgtOTE0ZC00MjI4LWI3YmMtMTNlNjYyNjkzYmE1In19LCJuYmYiOjE3NTQwMzIwNzUsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpkZWZhdWx0OmFkbWluIn0.Pth4k-a23691RSdkrklTqwcfyoUyKM675q5Tkjpiw1IsWWoo1_tqm0oh7DTHqcMNtyTnQGvauLLLuKi8ANn2344z3wO_qGIl6wOL7X9qXS5stxhJUWYVA_tokcAoLgomERDy7xNFV03plJIW60g53yfP1oA7ng4z7g5AZArRy2Mf1tvkFTaiMtRK3Ovsnj9K-CGox3R3vpl1Qrkvmnrd-Z465-V61DLmrlyf6YRrSt7sLDIcjeoiEq0DKs4Jau-srAJTIdvJi0OSkVucYlxAyJx5fTPmW4LyFcsWe7tAQBZg-9p0Bu9Rr4scOAhxVDjuu7Rs4gbXLdX0iL-GkMVyfA [root@k8s-cluster241 ~]# - 部署metrics-server组件 1 下载资源清单 [root@k8s-cluster241 ~]# wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/high-availability-1.21+.yaml SVIP: [root@k8s-cluster241 ~]# wget 
http://192.168.21.253/Resources/Kubernetes/Add-ons/metrics-server/0.7.x/high-availability-1.21%2B.yaml 2 编辑配置文件 [root@k8s-cluster241 ~]# vim high-availability-1.21+.yaml ... 114 apiVersion: apps/v1 115 kind: Deployment 116 metadata: ... 144 - args: 145 - --kubelet-insecure-tls # 不要验证Kubelets提供的服务证书的CA。不配置则会报错x509。 ... ... image: registry.aliyuncs.com/google_containers/metrics-server:v0.7.2 3 部署metrics-server组件 [root@k8s-cluster241 ~]# kubectl apply -f high-availability-1.21+.yaml serviceaccount/metrics-server created clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created clusterrole.rbac.authorization.k8s.io/system:metrics-server created rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created service/metrics-server created deployment.apps/metrics-server created poddisruptionbudget.policy/metrics-server created apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created [root@k8s-cluster241 ~]# 镜像下载地址: http://192.168.21.253/Resources/Kubernetes/Add-ons/metrics-server/0.7.x/ 4 查看镜像是否部署成功 [root@k8s-cluster241 ~]# kubectl get pods,svc -n kube-system -l k8s-app=metrics-server -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod/metrics-server-79bdcb6569-lsbs6 1/1 Running 0 34s 10.100.99.14 k8s-cluster242 <none> <none> pod/metrics-server-79bdcb6569-mtgm8 1/1 Running 0 34s 10.100.86.12 k8s-cluster241 <none> <none> NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR service/metrics-server ClusterIP 10.200.157.24 <none> 443/TCP 34s k8s-app=metrics-server [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl -n kube-system describe svc metrics-server Name: metrics-server Namespace: kube-system Labels: k8s-app=metrics-server Annotations: <none> Selector: k8s-app=metrics-server Type: ClusterIP IP Family Policy: SingleStack IP Families: IPv4 IP: 10.200.157.24 IPs: 10.200.157.24 Port: https 443/TCP TargetPort: https/TCP Endpoints: 10.100.99.14:10250,10.100.86.12:10250 Session Affinity: None Internal Traffic Policy: Cluster Events: <none> [root@k8s-cluster241 ~]# 5 验证metrics组件是否正常工作 [root@k8s-cluster241 ~]# kubectl top node NAME CPU(cores) CPU(%) MEMORY(bytes) MEMORY(%) k8s-cluster241 100m 5% 2644Mi 33% k8s-cluster242 111m 5% 1968Mi 25% k8s-cluster243 80m 4% 2222Mi 28% [root@k8s-cluster241 ~]# [root@k8s-cluster241 ~]# kubectl top pod -A NAMESPACE NAME CPU(cores) MEMORY(bytes) calico-apiserver calico-apiserver-68b989c5c9-7wkpv 2m 46Mi calico-apiserver calico-apiserver-68b989c5c9-hzgqr 2m 37Mi calico-system calico-kube-controllers-74544dd8d8-sjc6g 3m 17Mi calico-system calico-node-p24ql 29m 154Mi calico-system calico-node-qlr7r 17m 154Mi calico-system calico-node-w6cmr 21m 152Mi calico-system calico-typha-6744756b8b-cxkxp 2m 21Mi calico-system calico-typha-6744756b8b-mbdww 2m 22Mi calico-system csi-node-driver-6lnsp 1m 6Mi calico-system csi-node-driver-p299m 1m 6Mi calico-system csi-node-driver-tstgf 1m 6Mi calico-system goldmane-56dbdbc4c8-hdhtv 6m 15Mi calico-system whisker-c75778546-mpnqz 0m 12Mi default deploy-xiuxian-5bc4d8c6d5-hnln8 0m 2Mi default deploy-xiuxian-5bc4d8c6d5-hw7w6 0m 2Mi default deploy-xiuxian-5bc4d8c6d5-jr5vf 0m 2Mi kube-system coredns-5578c9dc84-qvpqt 1m 13Mi kube-system metrics-server-79bdcb6569-lsbs6 2m 13Mi kube-system metrics-server-79bdcb6569-mtgm8 2m 14Mi kubernetes-dashboard mywebui-kong-5bdcb94b79-86ftr 2m 96Mi kubernetes-dashboard 
mywebui-kubernetes-dashboard-api-74fbd85467-vspv9 1m 12Mi kubernetes-dashboard mywebui-kubernetes-dashboard-auth-69d4c5864b-zpzcj 1m 6Mi kubernetes-dashboard mywebui-kubernetes-dashboard-metrics-scraper-5c99c5ccc8-4f96n 1m 7Mi kubernetes-dashboard mywebui-kubernetes-dashboard-web-cd678f7dd-bmjgj 1m 6Mi metallb-system controller-58fdf44d87-l7vgc 1m 22Mi metallb-system speaker-6qgsc 3m 17Mi metallb-system speaker-ctkb2 3m 17Mi metallb-system speaker-nqsp2 3m 16Mi tigera-operator tigera-operator-747864d56d-9v67r 2m 73Mi [root@k8s-cluster241 ~]#
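补充示例(假设性,仅供参考):metrics-server 本质上是通过聚合 API 暴露 metrics.k8s.io 接口,kubectl top 的数据即来源于此。排错时可以直接查看聚合 API 的注册状态和原始返回。

# 确认聚合 API 已注册且 AVAILABLE 为 True
kubectl get apiservices v1beta1.metrics.k8s.io
# 直接访问聚合 API 查看原始指标(JSON 格式,此处仅截取前 500 字节)
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | head -c 500
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/kube-system/pods" | head -c 500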

13、prometheus


1、什么是Prometheus

bash
所谓的Prometheus是一款开源的监控系统,可以监控主流的中间件、操作系统、硬件设备、网络设备等。
相比于zabbix而言,Prometheus的强项在于对容器和微服务的监控更加方便。
Prometheus是CNCF第二个毕业的项目,同样采用Go语言编写。

官网地址:
	https://prometheus.io/

官方架构图:
	https://prometheus.io/docs/introduction/overview/

github地址:
	https://github.com/prometheus/prometheus

2、安装部署

1、二进制部署Prometheus
bash
2.二进制部署Prometheus 2.1 下载软件包 wget https://github.com/prometheus/prometheus/releases/download/v3.5.0/prometheus-3.5.0.linux-amd64.tar.gz svip: [root@promethues-server31 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/Prometheus_Server/prometheus-3.5.0.linux-amd64.tar.gz 2.2 解压软件包 [root@promethues-server31 ~]# tar xf prometheus-3.5.0.linux-amd64.tar.gz -C /usr/local/ [root@promethues-server31 ~]# [root@promethues-server31 ~]# ll /usr/local/prometheus-3.5.0.linux-amd64/ total 302936 drwxr-xr-x 2 1001 fwupd-refresh 4096 Jul 15 00:38 ./ drwxr-xr-x 11 root root 4096 Aug 4 09:07 ../ -rw-r--r-- 1 1001 fwupd-refresh 11357 Jul 15 00:36 LICENSE -rw-r--r-- 1 1001 fwupd-refresh 3773 Jul 15 00:36 NOTICE -rwxr-xr-x 1 1001 fwupd-refresh 159425376 Jul 15 00:17 prometheus* -rw-r--r-- 1 1001 fwupd-refresh 1093 Jul 15 00:36 prometheus.yml -rwxr-xr-x 1 1001 fwupd-refresh 150746286 Jul 15 00:17 promtool* [root@promethues-server31 ~]# 2.3 运行Prometheus [root@promethues-server31 ~]# cd /usr/local/prometheus-3.5.0.linux-amd64/ [root@promethues-server31 prometheus-3.5.0.linux-amd64]# ./prometheus 2.4. 访问Prometheus的webUi http://10.0.0.31:9090 2.5 卸载服务 [root@promethues-server31 ~]# rm -rf /usr/local/prometheus-3.5.0.linux-amd64/ [root@promethues-server31 ~]#
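补充示例(假设性,仅供参考):前台执行 ./prometheus 便于临时调试,若选择保留二进制方式长期运行(即不执行上面的卸载步骤),通常会交给 systemd 托管。下面是一个最小化的 unit 文件草稿,安装路径、数据目录均为假设,请按实际环境调整;--web.enable-lifecycle 用于开启后文会用到的 /-/reload 热加载接口。

cat > /etc/systemd/system/prometheus.service <<'EOF'
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io
After=network.target

[Service]
# 假设二进制与配置文件位于上面的解压目录,数据目录为 /var/lib/prometheus
ExecStart=/usr/local/prometheus-3.5.0.linux-amd64/prometheus \
  --config.file=/usr/local/prometheus-3.5.0.linux-amd64/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus \
  --web.enable-lifecycle
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
mkdir -p /var/lib/prometheus
systemctl daemon-reload
systemctl enable --now prometheus
ss -ntl | grep 9090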
2、Prometheus-server一键部署脚本
bash
- Prometheus-server一键部署脚本 1.下载安装脚本 [root@prometheus-server31 ~]# wget http://192.168.21.253/Resources/Prometheus/Scripts/weixiang-install-prometheus-server-v2.53.4.tar.gz 2.解压软件包 [root@prometheus-server31 ~]# tar xf weixiang-install-prometheus-server-v2.53.4.tar.gz 3.安装Prometheus [root@prometheus-server31 ~]# ./install-prometheus-server.sh i 4.查看webUi http://106.55.44.37:9090/targets?search= 5.卸载服务 [root@prometheus-server31 ~]# ./install-prometheus-server.sh r
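补充示例(假设性,仅供参考):脚本安装完成后,除了查看 WebUI,也可以用 Prometheus 自带的健康检查接口快速确认服务状态,IP 请替换为你自己的 server 地址。

# 存活检查,正常返回 200 及 Healthy 字样的提示
curl http://106.55.44.37:9090/-/healthy
# 就绪检查,正常返回 200 及 Ready 字样的提示
curl http://106.55.44.37:9090/-/ready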
3、二进制部署node-exporter环境
bash
- 二进制部署node-exporter环境 1.下载软件包 [root@node-exporter41 ~]# wget https://github.com/prometheus/node_exporter/releases/download/v1.9.1/node_exporter-1.9.1.linux-amd64.tar.gz SVIP: [root@node-exporter41 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/node_exporter/node_exporter-1.9.1.linux-amd64.tar.gz 2.解压软件包 [root@node-exporter41 ~]# tar xf node_exporter-1.9.1.linux-amd64.tar.gz -C /usr/local/ [root@node-exporter41 ~]# 3.运行node-exporter [root@node-exporter41 ~]# cd /usr/local/node_exporter-1.9.1.linux-amd64/ [root@node-exporter41 node_exporter-1.9.1.linux-amd64]# [root@node-exporter41 node_exporter-1.9.1.linux-amd64]# ll total 21708 drwxr-xr-x 2 1001 1002 4096 Apr 1 23:23 ./ drwxr-xr-x 11 root root 4096 May 12 09:13 ../ -rw-r--r-- 1 1001 1002 11357 Apr 1 23:23 LICENSE -rwxr-xr-x 1 1001 1002 22204245 Apr 1 23:19 node_exporter* -rw-r--r-- 1 1001 1002 463 Apr 1 23:23 NOTICE [root@node-exporter41 node_exporter-1.9.1.linux-amd64]# [root@node-exporter41 node_exporter-1.9.1.linux-amd64]# ./node_exporter 4.访问node-exporter的WebUI [root@promethues-server31 ~]# curl -s http://118.89.55.174:9100/metrics | wc -l 1116 [root@promethues-server31 ~]# 5.卸载二进制部署的node-exporter [root@node-exporter41 ~]# rm -rf /usr/local/node_exporter-1.9.1.linux-amd64/ [root@node-exporter41 ~]#
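补充示例(假设性,仅供参考):保留二进制方式运行时,node_exporter 支持较多命令行参数,可以按需调整监听地址与采集器,参数取值仅作演示。

# 指定监听地址/端口(默认 :9100)
./node_exporter --web.listen-address=":9100"
# 按需开启默认未启用的采集器,例如 systemd 采集器
./node_exporter --collector.systemd
# 查看全部可用参数
./node_exporter --help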
4、node-exporter一键部署脚本
bash
- node-exporter一键部署脚本 1.下载脚本 [root@node-exporter41 ~]# wget http://192.168.21.253/Resources/Prometheus/Scripts/weixiang-install-node-exporter-v1.9.1.tar.gz 2.解压软件包 [root@node-exporter41 ~]# tar xf weixiang-install-node-exporter-v1.9.1.tar.gz [root@node-exporter41 ~]# 3.安装node-exporter [root@node-exporter41 ~]# ./install-node-exporter.sh i 4.访问WebUI http://118.89.55.174:9100/metrics

3、Prometheus监控Linux主机

bash
- Prometheus监控Linux主机案例 1.被监控的Linux主机安装node-exporter [root@node-exporter42 ~]# wget http://192.168.17.253/Resources/Prometheus/Scripts/weixiang-install-node-exporter-v1.9.1.tar.gz [root@node-exporter43 ~]# wget http://192.168.17.253/Resources/Prometheus/Scripts/weixiang-install-node-exporter-v1.9.1.tar.gz 2.解压软件包 [root@node-exporter42 ~]# tar xf weixiang-install-node-exporter-v1.9.1.tar.gz [root@node-exporter43 ~]# tar xf weixiang-install-node-exporter-v1.9.1.tar.gz 3.安装node-exporter [root@node-exporter42 ~]# ./install-node-exporter.sh i [root@node-exporter43 ~]# ./install-node-exporter.sh i 4.安装Prometheus server [root@prometheus-server31 ~]# ./install-prometheus-server.sh i 5.修改Prometheus server的配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... global: scrape_interval: 3s scrape_configs: ... - job_name: "weixiang-node-exporter" static_configs: - targets: - 10.1.12.15:9100 - 10.1.12.3:9100 - 10.1.12.4:9100 6.热加载配置文件 [root@prometheus-server31 ~]# curl -X POST 10.1.24.13:9090/-/reload [root@prometheus-server31 ~]# 7.检查Prometheus的WebUI http://106.55.44.37:9090/targets?search= 8.检查数据是否采集成功 node_cpu_seconds_total
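补充示例(假设性,仅供参考):修改 prometheus.yml 之后、热加载之前,建议先用自带的 promtool 校验配置语法,避免把错误配置加载进去。这里假设 promtool 与 prometheus.yml 在同一安装目录,路径与 reload 地址请按实际环境调整。

cd /weixiang/softwares/prometheus-2.53.4.linux-amd64/
./promtool check config prometheus.yml
# 输出 SUCCESS 即表示语法正确,再执行热加载
curl -X POST 10.1.24.13:9090/-/reload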


4、Prometheus的数据类型及函数

bash
1.gauge
	gauge数据类型表示当前的值,是一种所见即所得的情况。
	如上图所示,使用"node_boot_time_seconds"指标查看节点的启动时间,表示的是当前值。
	如下图所示,使用"go_info"指标查看go的版本信息,其返回值意义不大,这个时候标签的KEY和VALUE就能获取到我们想要的信息。

2.counter
	counter数据类型表示一个指标单调递增的计数器。
	一般可以结合rate查看QPS,比如:
		rate(prometheus_http_requests_total[1m])
	也可以结合increase查看增量,比如:
		increase(prometheus_http_requests_total[1m])
	查询平均访问时间:
		prometheus_http_request_duration_seconds_sum / prometheus_http_request_duration_seconds_count

3.histogram
	histogram数据类型表示直方图样本观测,通常用于查询"所有观察值的总和"、"请求持续时间"、"响应时间"等场景。
	上一个案例中,我们可以使用"prometheus_http_request_duration_seconds_sum / prometheus_http_request_duration_seconds_count"查询平均访问时间。
	但这种统计方式比较粗糙,用"请求的响应时间/请求的次数"算出来的是平均响应时间,并不能反映某个时间段内是否有故障。比如在"12:30~12:35"之间出现大面积服务无法响应,而其他时间段都正常提供服务,用上面的公式算出来的结果看不出延迟,因为这5分钟的延迟在24小时内被平均之后基本可以忽略,运维人员也就无法及时发现并处理问题,这对用户体验是比较差的。
	因此Prometheus的histogram数据类型可以采用分位值的方式对短时间范围内的数据进行采样,从而及时发现问题,这需要配合histogram_quantile函数来使用。

	举个例子:
		HTTP请求的延迟柱状图(下面的"0.95"表示的是分位值,你可以根据需求自行修改。)
		histogram_quantile(0.95,sum(rate(prometheus_http_request_duration_seconds_bucket[1m])) by (le))
		histogram_quantile(0.95,sum(rate(prometheus_http_request_duration_seconds_bucket{handler="/api/v1/query"}[5m])) by (le))

	输出格式请参考:
		https://www.cnblogs.com/yinzhengjie/p/18522782#二-histogram数据说明

4.summary
	相比于histogram需要结合histogram_quantile函数实时计算结果,summary数据类型给出的直接就是分位值的计算结果。

	输出格式请参考:
		https://www.cnblogs.com/yinzhengjie/p/18522782#三-summary数据说明

- prometheus的PromQL初体验之常见的操作符
1.精确匹配
	node_cpu_seconds_total{instance="10.0.0.42:9100",cpu="1"}

2.基于正则匹配
	node_cpu_seconds_total{instance="10.0.0.42:9100",cpu="1",mode=~"i.*"}

3.取反操作
	node_cpu_seconds_total{instance="10.0.0.42:9100",cpu!="1",mode=~"i.*"}

4.可以做算术运算
	100/5
	10+20

参考链接:
	https://prometheus.io/docs/prometheus/latest/querying/operators/

5、prometheus的PromQL初体验之常见的函数

bash
- prometheus的PromQL初体验之常见的函数 1.压力测试42节点 [root@node-exporter42 ~]# apt -y install stress [root@node-exporter42 ~]# stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 20m 2.计算CPU的使用率 (1 - sum(increase(node_cpu_seconds_total{mode="idle"}[1m])) by (instance) / sum(increase(node_cpu_seconds_total[1m])) by (instance)) * 100 3.每个节点自启动以来的运行时长(单位:分钟) (time() - node_boot_time_seconds) / 60 参考链接: https://prometheus.io/docs/prometheus/latest/querying/functions/ 参考案例: https://www.cnblogs.com/yinzhengjie/p/18799074 Prometheus的WebUI使用的两个痛点 - 1.临时性,查询数据是临时的,关闭页面重新打开后并不会保存,该页面主要是用来做临时调试的; - 2.需要掌握PromQL语法,对新手来说比较痛苦,需要阅读官方文档,有一定的学习能力,还要求有操作系统的基本功; 综上所述,Prometheus的WebUI对新手来说并不友好。

6、对接Grafana展示

1、安装Grafana
bash
- Grafana环境安装 参考链接: https://grafana.com/grafana/download/9.5.21 1.安装grafana的依赖包 [root@prometheus-server31 ~]# apt-get install -y adduser libfontconfig1 musl 2.下载grafana wget https://dl.grafana.com/enterprise/release/grafana-enterprise_9.5.21_amd64.deb SVIP: [root@prometheus-server31 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/Grafana/grafana-enterprise_9.5.21_amd64.deb 3.安装grafana [root@prometheus-server31 ~]# dpkg -i grafana-enterprise_9.5.21_amd64.deb 4.启动grafana [root@prometheus-server31 ~]# systemctl enable --now grafana-server [root@prometheus-server31 ~]# ss -ntl | grep 3000 LISTEN 0 4096 *:3000 *:* [root@prometheus-server31 ~]# 5.访问Grafana的WebUI http://106.55.44.37:3000/? 默认的用户名和密码均为: admin 首次登录会提示修改密码,可跳过。
2、配置Prometheus数据源
bash
- grafana配置Prometheus数据源 1.grafana添加Prometheus数据源 2.新建Dashboard目录 3.导入第三方Dashboard的ID 1860 查询模板ID站点: https://grafana.com/grafana/dashboards 4.查看Dashboard
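除了在 WebUI 里手动添加数据源,Grafana 也支持用 provisioning 文件声明数据源,适合批量部署。下面是一个最小示例(路径为 deb 包默认的 provisioning 目录,URL 按上文环境假设):

```bash
cat > /etc/grafana/provisioning/datasources/prometheus.yaml <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://10.0.0.31:9090
    isDefault: true
EOF

# provisioning 文件修改后需要重启 grafana 才能生效
systemctl restart grafana-server
```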

3b78c2f68e607501d0cc3c42e3c86318

c7b2a7298675c3e2bd66ce6bdea3f3e7

068b0b891e586824c080866dce68593c

382aa1cd19dd18b4ec52eea8df181c8b

481ff68912716d8f1535a6c8e8ce43f9

3、Prometheus监控服务的流程
bash
- Prometheus监控服务的流程 - 1.被监控端需要暴露metrics指标; - 2.prometheus server端需要配置要监控的目标(服务发现); - 3.热加载配置文件; - 4.检查Prometheus的WebUI验证配置是否生效; - 5.grafana导入模板ID; - 6.grafana的Dashboard出图展示; - 7.配置相应的告警规则;

3、Prometheus监控Windows主机
bash
- Prometheus监控window主机 参考链接: https://prometheus.io/docs/instrumenting/exporters/#hardware-related https://github.com/prometheus-community/windows_exporter https://grafana.com/grafana/dashboards/ 1.被监控端需要暴露metrics指标; 1.1 下载安装的软件包 https://github.com/prometheus-community/windows_exporter/releases/download/v0.31.2/windows_exporter-0.31.2-amd64.exe 1.2 运行软件包 【cmd窗口运行】 windows_exporter-0.31.2-amd64.exe 1.3 访问测试 http://10.0.0.1:9182/metrics 2.prometheus server端需要配置要监控的目标(服务发现); [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-windows-exporter" static_configs: - targets: - 10.0.0.1:9182 3.热加载配置文件; [root@prometheus-server31 ~]# curl -X POST 10.1.24.13:9090/-/reload [root@prometheus-server31 ~]# 4.检查Prometheus的WebUI验证配置是否生效; http://10.0.0.31:9090/targets?search= 5.grafana导入模板ID; 20763 14694 6.grafana的Dashboard出图展示;
4、Prometheus监控zookeeper集群
bash
- Prometheus监控zookeeper集群 1.Prometheus启用metrics接口 1.1 修改配置文件 [root@elk91 ~]# vim /usr/local/apache-zookeeper-3.8.4-bin/conf/zoo.cfg [root@elk91 ~]# [root@elk91 ~]# tail -5 /usr/local/apache-zookeeper-3.8.4-bin/conf/zoo.cfg # https://prometheus.io Metrics Exporter metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider metricsProvider.httpHost=0.0.0.0 metricsProvider.httpPort=7000 metricsProvider.exportJvmInfo=true [root@elk91 ~]# 1.2 同步配置文件到其他节点 [root@elk91 ~]# scp /usr/local/apache-zookeeper-3.8.4-bin/conf/zoo.cfg 10.0.0.92:/usr/local/apache-zookeeper-3.8.4-bin/conf [root@elk91 ~]# scp /usr/local/apache-zookeeper-3.8.4-bin/conf/zoo.cfg 10.0.0.93:/usr/local/apache-zookeeper-3.8.4-bin/conf 1.3 启动zookeeper集群 [root@elk91 ~]# zkServer.sh start [root@elk92 ~]# zkServer.sh start [root@elk93 ~]# zkServer.sh start 2.访问zookeeper的webUI http://118.89.55.174:7000/metrics http://10.0.0.92:7000/metrics http://10.0.0.93:7000/metrics 3.修改Prometheus的配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-zookeeper-exporter" static_configs: - targets: - 118.89.55.174:7000 - 81.71.98.206:7000 - 134.175.108.235:7000 4.热加载配置文件 [root@prometheus-server31 ~]# curl -X POST 118.89.55.174:9090/-/reload [root@prometheus-server31 ~]# 5.验证配置是否生效 http://10.0.0.31:9090/targets?search=

image

bash
6.Grafana导入模板ID 10465

image

5、Prometheus监控kafka集群
bash
- Prometheus监控kafka集群 1.启动kafka集群 [root@elk91 ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties [root@elk91 ~]# ss -ntl | grep 9092 LISTEN 0 50 *:9092 *:* [root@elk91 ~]# [root@elk92 ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties [root@elk92 ~]# ss -ntl | grep 9092 LISTEN 0 50 *:9092 *:* [root@elk92 ~]# [root@elk93 ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties [root@elk93 ~]# ss -ntl | grep 9092 LISTEN 0 50 *:9092 *:* [root@elk93 ~]# 2.下载kafka exporter wget https://github.com/danielqsj/kafka_exporter/releases/download/v1.9.0/kafka_exporter-1.9.0.linux-amd64.tar.gz SVIP: [root@elk91 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/kafka_exporter/kafka_exporter-1.9.0.linux-amd64.tar.gz 3.解压软件包 [root@elk91 ~]# tar xf kafka_exporter-1.9.0.linux-amd64.tar.gz -C /usr/local/bin/ kafka_exporter-1.9.0.linux-amd64/kafka_exporter --strip-components=1 [root@elk91 ~]# [root@elk91 ~]# ll /usr/local/bin/kafka_exporter -rwxr-xr-x 1 1001 fwupd-refresh 25099148 Feb 17 11:04 /usr/local/bin/kafka_exporter* [root@elk91 ~]# 4.启动kafka exporter [root@elk91 ~]# kafka_exporter --kafka.version="3.9.0" --kafka.server=10.1.12.15:9092 --web.listen-address=":9308" --web.telemetry-path="/metrics" 5.访问kafka的webUI http://118.89.55.174:9308/metrics 6.修改Prometheus的配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-kafka-exporter" static_configs: - targets: - 10.0.0.91:9308 7.热加载配置文件 [root@prometheus-server31 ~]# curl -X POST 106.55.44.37:9090/-/reload [root@prometheus-server31 ~]# 8.验证配置是否生效 http://10.0.0.31:9090/targets?search=
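kafka_exporter 启动后,可以先在本机确认关键指标是否已暴露(例如 kafka_brokers 应等于集群中存活 broker 的数量),再去 Prometheus 侧排查。示例如下(地址按上文假设):

```bash
# kafka_brokers 为存活 broker 数,kafka_topic_partitions 为各 topic 的分区数
curl -s http://10.0.0.91:9308/metrics | grep -E '^kafka_(brokers|topic_partitions)'
```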

image

bash
9.Grafana导入模板ID 21078 7589

image

6、Prometheus监控ElasticSearch集群
bash
1.检查ES集群是否正常 [root@prometheus-server31 ~]# curl https://10.0.0.91:9200/_cat/nodes -u elastic:123456 -k 10.0.0.91 62 61 1 0.05 0.06 0.07 cdfhilmrstw - elk91 10.0.0.93 73 60 1 0.10 0.10 0.08 cdfhilmrstw * elk93 10.0.0.92 76 43 2 0.00 0.00 0.00 cdfhilmrstw - elk92 [root@prometheus-server31 ~]# 2.下载ElasticSearch-exporter wget https://github.com/prometheus-community/elasticsearch_exporter/releases/download/v1.9.0/elasticsearch_exporter-1.9.0.linux-amd64.tar.gz SVIP: [root@elk92 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/elasticSearch_exporter/elasticsearch_exporter-1.9.0.linux-amd64.tar.gz 3.解压软件包 [root@elk92 ~]# tar xf elasticsearch_exporter-1.9.0.linux-amd64.tar.gz -C /usr/local/bin/ elasticsearch_exporter-1.9.0.linux-amd64/elasticsearch_exporter --strip-components=1 [root@elk92 ~]# [root@elk92 ~]# ll /usr/local/bin/elasticsearch_exporter -rwxr-xr-x 1 1001 fwupd-refresh 15069336 Mar 3 18:01 /usr/local/bin/elasticsearch_exporter* [root@elk92 ~]# 4.启动ElasticSearch-exporter [root@elk92 ~]# elasticsearch_exporter --es.uri="https://weixiang:123456@10.1.12.15:9200" --web.listen-address=:9114 --web.telemetry-path="/metrics" --es.ssl-skip-verify 5.访问ElasticSearch-exporter的webUI http://10.1.12.15:9114/metrics 6.Prometheus修改配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-es-exporter" static_configs: - targets: - 10.0.0.92:9114 [root@prometheus-server31 ~]# 7.热加载配置文件 [root@prometheus-server31 ~]# curl -X POST 10.0.0.31:9090/-/reload [root@prometheus-server31 ~]# 8.验证ES的配置是否生效 http://10.0.0.31:9090/targets?search=
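elasticsearch_exporter 启动后,可以先确认集群健康指标是否正常输出:elasticsearch_cluster_health_status 会按 green/yellow/red 各输出一条,值为 1 的即当前状态。示例如下(地址按上文假设):

```bash
curl -s http://10.0.0.92:9114/metrics | grep '^elasticsearch_cluster_health_status'
```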

image

bash
9.Grafana导入模板ID 14191

image

7、Prometheus监控redis案例
bash
1.1 导入镜像 [root@elk93 ~]# wget http://192.168.21.253/Resources/Prometheus/images/Redis/weixiang-redis-v7.4.2-alpine.tar.gz [root@elk93 ~]# docker load -i weixiang-redis-v7.4.2-alpine.tar.gz 1.2 启动redis [root@elk93 ~]# docker run -d --name redis-server --network host redis:7.4.2-alpine 9652086e8ba23206fe4ba1dd2182f2a72ca99e190ab1f5d7a64532f5c590fc0c [root@elk93 ~]# [root@elk93 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 9652086e8ba2 redis:7.4.2-alpine "docker-entrypoint.s…" 3 seconds ago Up 2 seconds redis-server [root@elk93 ~]# [root@elk93 ~]# ss -ntl | grep 6379 LISTEN 0 511 0.0.0.0:6379 0.0.0.0:* LISTEN 0 511 [::]:6379 [::]:* [root@elk93 ~]# 1.3 写入测试数据 [root@elk93 ~]# docker exec -it redis-server redis-cli -n 5 127.0.0.1:6379[5]> KEYS * (empty array) 127.0.0.1:6379[5]> set school weixiang OK 127.0.0.1:6379[5]> set class weixiang98 OK 127.0.0.1:6379[5]> 127.0.0.1:6379[5]> KEYS * 1) "class" 2) "school" 127.0.0.1:6379[5]> 2.下载redis exporter wget https://github.com/oliver006/redis_exporter/releases/download/v1.74.0/redis_exporter-v1.74.0.linux-amd64.tar.gz svip: [root@elk92 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/redis_exporter/redis_exporter-v1.74.0.linux-amd64.tar.gz 3.解压软件包 [root@elk92 ~]# tar xf redis_exporter-v1.74.0.linux-amd64.tar.gz -C /usr/local/bin/ redis_exporter-v1.74.0.linux-amd64/redis_exporter --strip-components=1 [root@elk92 ~]# [root@elk92 ~]# ll /usr/local/bin/redis_exporter -rwxr-xr-x 1 1001 fwupd-refresh 9642168 May 4 13:22 /usr/local/bin/redis_exporter* [root@elk92 ~]# 4.运行redis-exporter [root@elk92 ~]# redis_exporter -redis.addr redis://10.0.0.93:6379 -web.telemetry-path /metrics -web.listen-address :9121 5.访问redis-exporter的webUI http://134.175.108.235:9121/metrics 6.修改Prometheus的配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-redis-exporter" static_configs: - targets: - 10.0.0.92:9121 7.热加载配置文件 [root@prometheus-server31 ~]# curl -X POST 10.0.0.31:9090/-/reload [root@prometheus-server31 ~]# 8.验证配置是否生效 http://10.0.0.31:9090/targets?search=
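redis_exporter 启动后,可以先确认 redis_up 等基础指标(redis_up 为 1 表示 exporter 成功连上了 redis 实例),示例如下(地址按上文假设):

```bash
curl -s http://10.0.0.92:9121/metrics | grep -E '^redis_(up|memory_used_bytes|connected_clients)'
```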

image

bash
9.Grafana导入ID 11835 14091 14615 # 缺少插件。
8、Grafana插件安装
bash
1.Grafana插件概述 Grafana支持安装第三方插件。 例如,报错如下: 说明缺少插件 Panel plugin not found: natel-discrete-panel 默认的数据目录: [root@prometheus-server31 ~]# ll /var/lib/grafana/ total 1940 drwxr-xr-x 5 grafana grafana 4096 May 12 14:46 ./ drwxr-xr-x 61 root root 4096 May 12 10:38 ../ drwxr-x--- 3 grafana grafana 4096 May 12 10:38 alerting/ drwx------ 2 grafana grafana 4096 May 12 10:38 csv/ -rw-r----- 1 grafana grafana 1961984 May 12 14:46 grafana.db drwx------ 2 grafana grafana 4096 May 12 10:38 png/ [root@prometheus-server31 ~]# 2.Grafana插件管理 2.1 列出本地安装的插件 [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# grafana-cli plugins ls Error: ✗ stat /var/lib/grafana/plugins: no such file or directory [root@prometheus-server31 ~]# 2.2 安装指定的插件 [root@prometheus-server31 ~]# grafana-cli plugins install natel-discrete-panel ✔ Downloaded and extracted natel-discrete-panel v0.1.1 zip successfully to /var/lib/grafana/plugins/natel-discrete-panel Please restart Grafana after installing or removing plugins. Refer to Grafana documentation for instructions if necessary. [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# ll /var/lib/grafana/ total 1944 drwxr-xr-x 6 grafana grafana 4096 May 12 14:49 ./ drwxr-xr-x 61 root root 4096 May 12 10:38 ../ drwxr-x--- 3 grafana grafana 4096 May 12 10:38 alerting/ drwx------ 2 grafana grafana 4096 May 12 10:38 csv/ -rw-r----- 1 grafana grafana 1961984 May 12 14:48 grafana.db drwxr-xr-x 3 root root 4096 May 12 14:49 plugins/ drwx------ 2 grafana grafana 4096 May 12 10:38 png/ [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# ll /var/lib/grafana/plugins/ total 12 drwxr-xr-x 3 root root 4096 May 12 14:49 ./ drwxr-xr-x 6 grafana grafana 4096 May 12 14:49 ../ drwxr-xr-x 4 root root 4096 May 12 14:49 natel-discrete-panel/ [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# ll /var/lib/grafana/plugins/natel-discrete-panel/ total 180 drwxr-xr-x 4 root root 4096 May 12 14:49 ./ drwxr-xr-x 3 root root 4096 May 12 14:49 ../ -rw-r--r-- 1 root root 1891 May 12 14:49 CHANGELOG.md drwxr-xr-x 2 root root 4096 May 12 14:49 img/ -rw-r--r-- 1 root root 1079 May 12 14:49 LICENSE -rw-r--r-- 1 root root 2650 May 12 14:49 MANIFEST.txt -rw-r--r-- 1 root root 30629 May 12 14:49 module.js -rw-r--r-- 1 root root 808 May 12 14:49 module.js.LICENSE.txt -rw-r--r-- 1 root root 108000 May 12 14:49 module.js.map drwxr-xr-x 2 root root 4096 May 12 14:49 partials/ -rw-r--r-- 1 root root 1590 May 12 14:49 plugin.json -rw-r--r-- 1 root root 3699 May 12 14:49 README.md [root@prometheus-server31 ~]# 2.3 重启Grafana使得配置生效 [root@prometheus-server31 ~]# systemctl restart grafana-server.service [root@prometheus-server31 ~]# 2.4 查看插件是否生效 略,见视频

image

9、prometheus监控docker
bash
- prometheus监控主流的中间件之docker 参考链接: https://github.com/google/cadvisor 1.部署docker环境【建议41-43都安装】 wget http://192.168.21.253/Resources/Docker/scripts/weixiang-autoinstall-docker-docker-compose.tar.gz tar xf weixiang-autoinstall-docker-docker-compose.tar.gz ./install-docker.sh i wget http://192.168.21.253/Resources/Prometheus/images/cAdvisor/weixiang-cadvisor-v0.52.1.tar.gz docker load -i weixiang-cadvisor-v0.52.1.tar.gz 2.导入镜像【建议41-43都安装】 wget http://192.168.21.253/Resources/Docker/images/Linux/alpine-v3.20.2.tar.gz docker image load < alpine-v3.20.2.tar.gz 3.运行测试的镜像 [root@node-exporter41 ~]# docker run -id --name c1 alpine:3.20.2 344a3e936abe90cfb2e2e0e6e5f13e1117a79faa5afb939ae261794d3c5ee2b0 [root@node-exporter41 ~]# [root@node-exporter41 ~]# docker run -id --name c2 alpine:3.20.2 b2130c8f78f2df06f53d338161f3f9ad6a133c9c6b68ddb011884c788bb1b37d [root@node-exporter41 ~]# [root@node-exporter41 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES b2130c8f78f2 alpine:3.20.2 "/bin/sh" 5 seconds ago Up 4 seconds c2 344a3e936abe alpine:3.20.2 "/bin/sh" 8 seconds ago Up 8 seconds c1 [root@node-exporter41 ~]# [root@node-exporter42 ~]# docker run -id --name c3 alpine:3.20.2 f399c1aafd607bf0c18dff09c1839f923ee9db39b68edf5b216c618a363566a1 [root@node-exporter42 ~]# [root@node-exporter42 ~]# docker run -id --name c4 alpine:3.20.2 bff22c8d96f731cd44dfa55b60a9dd73d7292add33ea5b82314bf2352db115a7 [root@node-exporter42 ~]# [root@node-exporter42 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES bff22c8d96f7 alpine:3.20.2 "/bin/sh" 3 seconds ago Up 1 second c4 f399c1aafd60 alpine:3.20.2 "/bin/sh" 6 seconds ago Up 4 seconds c3 [root@node-exporter42 ~]# [root@node-exporter43 ~]# docker run -id --name c5 alpine:3.20.2 198464e1e9a3c7aefb361c3c7df3bfe8009b5ecd633aa19503321428d404008c [root@node-exporter43 ~]# [root@node-exporter43 ~]# docker run -id --name c6 alpine:3.20.2 b8ed9fcec61e017086864f8eb223cd6409d33f144e2fcfbf33acfd09860b0a06 [root@node-exporter43 ~]# 4.运行cAdVisor【建议41-43都安装】 docker run \ --volume=/:/rootfs:ro \ --volume=/var/run:/var/run:ro \ --volume=/sys:/sys:ro \ --volume=/var/lib/docker/:/var/lib/docker:ro \ --volume=/dev/disk/:/dev/disk:ro \ --network host \ --detach=true \ --name=cadvisor \ --privileged \ --device=/dev/kmsg \ gcr.io/cadvisor/cadvisor-amd64:v0.52.1 5.访问cAdvisor的webUI http://118.89.55.174:8080/docker/ http://81.71.98.206:8080/docker/ http://134.175.108.235:8080/docker/ [root@node-exporter41 ~]# curl -s http://134.175.108.235:8080/metrics | wc -l 3067 [root@node-exporter41 ~]#

image

通过容器ID可以看出它俩是同一个容器

image

image

image

bash
6.Prometheus监控容器节点 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-docker-cadVisor" static_configs: - targets: - 118.89.55.174:8080 - 81.71.98.206:8080 - 134.175.108.235:8080 7.热加载配置文件 [root@prometheus-server31 ~]# curl -X POST 106.55.44.37:9090/-/reload [root@prometheus-server31 ~]# 8.验证配置是否生效 http://10.0.0.31:9090/targets?search= 9.Grafana导入ID模板 10619 无法正确显示数据的优化案例: - 1.PromQL语句优化 count(last_over_time(container_last_seen{instance=~"$node:$port",job=~"$job",image!=""}[3s])) - 2.Value options 将'Calculation'字段设置为'Last *'即可。 - 3.保存Dashboard 若不保存,刷新页面后所有配置丢失!!!
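除了导入模板,也可以直接基于 cAdvisor 暴露的指标写几条常用的 PromQL 查看容器资源占用,下面是常用写法的草案(标签以实际采集结果为准):

```bash
# 各容器最近 1 分钟的 CPU 使用(单位:核)
sum(rate(container_cpu_usage_seconds_total{image!=""}[1m])) by (name)

# 各容器当前内存占用(单位:字节)
sum(container_memory_usage_bytes{image!=""}) by (name)

# 各容器网络接收速率
sum(rate(container_network_receive_bytes_total{image!=""}[1m])) by (name)
```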

image

image

image

10、prometheus监控mysql
bash
- prometheus监控主流的中间件之mysql 1.部署MySQL 1.1 导入MySQL镜像 [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Docker/images/WordPress/weixiang-mysql-v8.0.36-oracle.tar.gz [root@node-exporter43 ~]# docker load < weixiang-mysql-v8.0.36-oracle.tar.gz 1.2 运行MySQL服务 [root@node-exporter43 ~]# docker run -d --network host --name mysql-server --restart always -e MYSQL_DATABASE=prometheus -e MYSQL_USER=weixiang98 -e MYSQL_PASSWORD=yinzhengjie -e MYSQL_ALLOW_EMPTY_PASSWORD=yes mysql:8.0.36-oracle --character-set-server=utf8 --collation-server=utf8_bin --default-authentication-plugin=mysql_native_password 1.3 检查MySQL服务 [root@node-exporter43 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 16aa74bc9e03 mysql:8.0.36-oracle "docker-entrypoint.s…" 2 seconds ago Up 2 seconds mysql-server [root@node-exporter43 ~]# [root@node-exporter43 ~]# ss -ntl | grep 3306 LISTEN 0 151 *:3306 *:* LISTEN 0 70 *:33060 *:* [root@node-exporter43 ~]# 1.4 添加用户权限 [root@node-exporter43 ~]# docker exec -it mysql-server mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 13 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> mysql> SHOW GRANTS FOR weixiang98; +---------------------------------------------------------+ | Grants for weixiang98@% | +---------------------------------------------------------+ | GRANT USAGE ON *.* TO `weixiang98`@`%` | | GRANT ALL PRIVILEGES ON `prometheus`.* TO `weixiang98`@`%` | +---------------------------------------------------------+ 2 rows in set (0.00 sec) mysql> mysql> GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO weixiang98; Query OK, 0 rows affected (0.00 sec) mysql> mysql> SHOW GRANTS FOR weixiang98; +-------------------------------------------------------------------+ | Grants for weixiang98@% | +-------------------------------------------------------------------+ | GRANT SELECT, PROCESS, REPLICATION CLIENT ON *.* TO `weixiang98`@`%` | | GRANT ALL PRIVILEGES ON `prometheus`.* TO `weixiang98`@`%` | +-------------------------------------------------------------------+ 2 rows in set (0.00 sec) mysql> 2.下载MySQL-exporter wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.17.2/mysqld_exporter-0.17.2.linux-amd64.tar.gz SVIP: [root@node-exporter42 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/mysql_exporter/mysqld_exporter-0.17.2.linux-amd64.tar.gz 3.解压软件包 [root@node-exporter42 ~]# tar xf mysqld_exporter-0.17.2.linux-amd64.tar.gz -C /usr/local/bin/ mysqld_exporter-0.17.2.linux-amd64/mysqld_exporter --strip-components=1 [root@node-exporter42 ~]# [root@node-exporter42 ~]# ll /usr/local/bin/mysqld_exporter -rwxr-xr-x 1 1001 1002 18356306 Feb 26 15:16 /usr/local/bin/mysqld_exporter* [root@node-exporter42 ~]# 4.运行MySQL-exporter暴露MySQL的监控指标 [root@node-exporter42 ~]# cat .my.cnf [client] host = 10.0.0.43 port = 3306 user = weixiang98 password = yinzhengjie [root@node-exporter42 ~]# [root@node-exporter42 ~]# mysqld_exporter --config.my-cnf=/root/.my.cnf ... 
time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:239 msg="Starting mysqld_exporter" version="(version=0.17.2, branch=HEAD, revision=e84f4f22f8a11089d5f04ff9bfdc5fc042605773)" time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:240 msg="Build context" build_context="(go=go1.23.6, platform=linux/amd64, user=root@18b69b4b0fea, date=20250226-07:16:19, tags=unknown)" time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:252 msg="Scraper enabled" scraper=global_status time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:252 msg="Scraper enabled" scraper=global_variables time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:252 msg="Scraper enabled" scraper=slave_status time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:252 msg="Scraper enabled" scraper=info_schema.innodb_cmp time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:252 msg="Scraper enabled" scraper=info_schema.innodb_cmpmem time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:252 msg="Scraper enabled" scraper=info_schema.query_response_time time=2025-05-13T02:07:45.898Z level=INFO source=tls_config.go:347 msg="Listening on" address=[::]:9104 time=2025-05-13T02:07:45.898Z level=INFO source=tls_config.go:350 msg="TLS is disabled." http2=false address=[::]:9104 5.验证测试 [root@node-exporter41 ~]# curl -s http://106.55.44.37:9104/metrics | wc -l 2569 [root@node-exporter41 ~]# 6.修改Prometheus的配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-mysql-exporter" static_configs: - targets: - 10.0.0.42:9104 7.热加载配置文件 [root@prometheus-server31 ~]# curl -X POST 10.0.0.31:9090/-/reload [root@prometheus-server31 ~]# 8.验证配置是否生效 http://10.0.0.31:9090/targets?search= 9.Grafana导入ID模板 14057 17320
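上面的 mysqld_exporter 是前台运行的,生产上一般托管给 systemd。下面是一个最小化的 unit 草案(假设二进制与 /root/.my.cnf 位于上文所述位置,仅供参考):

```bash
cat > /etc/systemd/system/mysqld-exporter.service <<'EOF'
[Unit]
Description=Prometheus MySQL Exporter
After=network.target

[Service]
ExecStart=/usr/local/bin/mysqld_exporter --config.my-cnf=/root/.my.cnf
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now mysqld-exporter.service
ss -ntl | grep 9104
```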

image

image

11、Exporter是如何工作的
bash
首先,我们来看mysqld_exporter的工作原理: mysqld_exporter 本身不是MySQL的一部分。它是一个独立的、小型的Web服务器。 它的工作是: 连接到你指定的MySQL数据库(通过网络)。 执行一系列SQL查询(如 SHOW GLOBAL STATUS;, SHOW GLOBAL VARIABLES; 等)来获取MySQL的内部状态和指标。 转换这些查询结果,将其整理成Prometheus能够识别的特定格式(Metrics格式)。 暴露一个HTTP端点(默认是 :9104 端口),等待Prometheus服务器来抓取这些格式化后的数据。 为什么要把它们分开部署? 1. 资源隔离 (最重要) MySQL很宝贵: 数据库服务器(43号机)通常是业务核心,它的CPU、内存、I/O资源都应该优先保证数据库本身高效运行。 Exporter也耗资源: mysqld_exporter虽然小,但它在被Prometheus频繁抓取时,也需要消耗一定的CPU和内存来执行查询和处理数据。 避免竞争: 如果把exporter也装在43号机上,当监控压力大或者exporter自身出现bug(比如内存泄漏)时,它可能会抢占MySQL的资源,甚至导致MySQL性能下降或服务崩溃。将exporter部署在另一台机器(42号机)上,就彻底杜绝了这种风险。 2. 安全性更高 最小权限原则: 数据库服务器(43号机)应该尽可能少地暴露服务和安装额外的软件,以减少攻击面。 网络隔离: 在复杂的网络环境中,你可以将数据库服务器放在一个高度安全的内部网络区域,只允许特定的监控服务器(如42号机)通过防火墙访问它的3306端口。而Prometheus服务器只需要能访问到监控服务器(42号机)的9104端口即可,无需直接接触到数据库服务器。 3. 管理和维护更方便 集中管理Exporter: 你可以指定一台或几台专门的“监控机”(如此处的42号机),在这上面运行所有的Exporter,比如mysqld_exporter, redis_exporter, node_exporter等。 统一升级和配置: 当你需要升级、重启或修改所有Exporter的配置时,只需要登录这几台集中的监控机操作即可,而不需要去登录每一台业务服务器。这大大简化了运维工作。 4. 架构灵活性 这种模式不局限于物理机或虚拟机。在容器化环境(如Kubernetes)中,MySQL Pod 和 mysqld_exporter Pod 也通常是分开部署的两个不同的Pod,它们通过K8s的Service网络进行通信。原理是完全一样的。
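可以直接对比两边的输出来体会这个转换过程:mysqld_exporter 把 SHOW GLOBAL STATUS 之类的查询结果整理成了一行行 metrics。下面是一个对照示例(容器名与地址按上文环境假设):

```bash
# MySQL 侧的原始状态值(43 节点的容器)
docker exec mysql-server mysql -e 'SHOW GLOBAL STATUS LIKE "Threads_connected";'

# mysqld_exporter 转换后的指标(42 节点),mysql_up 为 1 表示连接数据库成功
curl -s http://10.0.0.42:9104/metrics | grep -E '^mysql_(up|global_status_threads_connected)'
```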
12、prometheus监控mongoDB
bash
- prometheus监控主流的中间件之mongoDB 1 导入mongoDB镜像 wget http://192.168.21.253/Resources/Prometheus/images/MongoDB/weixiang-mongoDB-v8.0.6-noble.tar.gz docker load -i weixiang-mongoDB-v8.0.6-noble.tar.gz 2 部署mongoDB服务 [root@node-exporter43 ~]# docker run -d --name mongodb-server --network host mongo:8.0.6-noble 4b0f00dea78bb571c216c344984ced026c1210c94db147fdc9e32f549e3135de [root@node-exporter43 ~]# [root@node-exporter43 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 8179c6077ec8 mongo:8.0.6-noble "docker-entrypoint.s…" 4 seconds ago Up 3 seconds mongodb-server [root@node-exporter43 ~]# [root@node-exporter43 ~]# ss -ntl | grep 27017 LISTEN 0 4096 0.0.0.0:27017 0.0.0.0:* [root@node-exporter43 ~]# 3 下载MongoDB的exporter https://github.com/percona/mongodb_exporter/releases/download/v0.43.1/mongodb_exporter-0.43.1.linux-amd64.tar.gz SVIP: [root@node-exporter42 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/MongoDB_exporter/mongodb_exporter-0.43.1.linux-amd64.tar.gz 4 解压软件包 [root@node-exporter42 ~]# tar xf mongodb_exporter-0.43.1.linux-amd64.tar.gz -C /usr/local/bin/ mongodb_exporter-0.43.1.linux-amd64/mongodb_exporter --strip-components=1 [root@node-exporter42 ~]# [root@node-exporter42 ~]# ll /usr/local/bin/mongodb_exporter -rwxr-xr-x 1 1001 geoclue 20467864 Dec 13 20:10 /usr/local/bin/mongodb_exporter* [root@node-exporter42 ~]# 5 运行mongodb-exporter [root@node-exporter42 ~]# mongodb_exporter --mongodb.uri=mongodb://10.1.12.4:27017 --log.level=info --collect-all time=2025-05-13T02:49:26.332Z level=INFO source=tls_config.go:347 msg="Listening on" address=[::]:9216 time=2025-05-13T02:49:26.332Z level=INFO source=tls_config.go:350 msg="TLS is disabled." http2=false address=[::]:9216 6 验证mongoDB-exporter的WebUI http://81.71.98.206:9216/metrics [root@node-exporter41 ~]# curl -s http://81.71.98.206:9216/metrics | wc -l 8847 [root@node-exporter41 ~]# 7.配置Prometheus监控mongoDB容器 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: weixiang-mongodb-exporter static_configs: - targets: - 81.71.98.206:9216 [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# curl -X POST http://106.55.44.37:9090/-/reload [root@prometheus-server31 ~]# 8 验证Prometheus配置是否生效 http://10.0.0.31:9090/targets?search= 可以进行数据的查询,推荐使用: mongodb_dbstats_dataSize 9 grafana导入模板ID 16504 由于我们的MongoDB版本较为新,grafana的社区模板更新的并不及时,因此可能需要我们自己定制化一些Dashboard。 参考链接: https://grafana.com/grafana/dashboards

image

image

13、prometheus监控nginx
bash
1 编译安装nginx 1.1 安装编译工具 [root@node-exporter41 ~]# cat /etc/apt/sources.list # 默认注释了源码镜像以提高 apt update 速度,如有需要可自行取消注释 deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy main restricted universe multiverse deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse # 以下安全更新软件源包含了官方源与镜像站配置,如有需要可自行修改注释切换 deb http://security.ubuntu.com/ubuntu/ jammy-security main restricted universe multiverse # deb-src http://security.ubuntu.com/ubuntu/ jammy-security main restricted universe multiverse # 预发布软件源,不建议启用 # deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-proposed main restricted universe multiverse # # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-proposed main restricted universe multiverse [root@node-exporter41 ~]# [root@node-exporter41 ~]# apt update [root@node-exporter41 ~]# [root@node-exporter41 ~]# apt -y install git wget gcc make zlib1g-dev build-essential libtool openssl libssl-dev 参考链接: https://mirrors.tuna.tsinghua.edu.cn/help/ubuntu/ 1.2 克隆nginx-module-vts模块 git clone https://gitee.com/jasonyin2020/nginx-module-vts.git 1.3 下载nginx软件包 wget https://nginx.org/download/nginx-1.28.0.tar.gz 1.4 解压nginx tar xf nginx-1.28.0.tar.gz 1.5 配置nginx cd nginx-1.28.0/ ./configure --prefix=/weixiang/softwares/nginx --with-http_ssl_module --with-http_v2_module --with-http_realip_module --without-http_rewrite_module --with-http_stub_status_module --without-http_gzip_module --with-file-aio --with-stream --with-stream_ssl_module --with-stream_realip_module --add-module=/root/nginx-module-vts 1.6 编译并安装nginx make -j 2 && make install 1.7 修改nginx的配置文件 vim /weixiang/softwares/nginx/conf/nginx.conf ... http { vhost_traffic_status_zone; upstream weixiang-promethues { server 10.0.0.31:9090; } ... server { ... location / { root html; # index index.html index.htm; proxy_pass http://weixiang-promethues; } location /status { vhost_traffic_status_display; vhost_traffic_status_display_format html; } } } 1.8 检查配置文件语法 /weixiang/softwares/nginx/sbin/nginx -t 1.9 启动nginx /weixiang/softwares/nginx/sbin/nginx 1.10 访问nginx的状态页面 http://118.89.55.174/status/format/prometheus 2 安装nginx-vtx-exporter 2.1 下载nginx-vtx-exporter wget https://github.com/sysulq/nginx-vts-exporter/releases/download/v0.10.8/nginx-vtx-exporter_0.10.8_linux_amd64.tar.gz SVIP: wget http://192.168.21.253/Resources/Prometheus/softwares/nginx_exporter/nginx-vtx-exporter_0.10.8_linux_amd64.tar.gz 2.2 解压软件包到path路径 [root@node-exporter42 ~]# tar xf nginx-vtx-exporter_0.10.8_linux_amd64.tar.gz -C /usr/local/bin/ nginx-vtx-exporter [root@node-exporter42 ~]# [root@node-exporter42 ~]# ll /usr/local/bin/nginx-vtx-exporter -rwxr-xr-x 1 1001 avahi 7950336 Jul 11 2023 /usr/local/bin/nginx-vtx-exporter* [root@node-exporter42 ~]# 2.3 运行nginx-vtx-exporter [root@node-exporter42 ~]# nginx-vtx-exporter -nginx.scrape_uri=http://10.0.0.41/status/format/json 3 配置prometheus采集nginx数据 3.1 修改配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... 
- job_name: "weixiang-nginx-vts-modules" metrics_path: "/status/format/prometheus" static_configs: - targets: - "10.0.0.41:80" - job_name: "weixiang-nginx-vts-exporter" static_configs: - targets: - "10.0.0.42:9913" 3.2 重新加载配置并验证配置是否生效 curl -X POST http://10.0.0.31:9090/-/reload 3.3 导入grafana模板 9785【编译安装时添加vts模块即可】 2949【编译时添加vts模块且需要安装nginx-exporter】 - prometheus监控主流的中间件之tomcat 1 部署tomcat-exporter 1.1 导入镜像 [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Prometheus/images/weixiang-tomcat-v9.0.87.tar.gz [root@node-exporter43 ~]# [root@node-exporter43 ~]# docker load -i weixiang-tomcat-v9.0.87.tar.gz 1.2 基于Dockerfile构建tomcat-exporter [root@node-exporter43 ~]# git clone https://gitee.com/jasonyin2020/tomcat-exporter.git [root@node-exporter43 ~]# cd tomcat-exporter/ [root@node-exporter43 tomcat-exporter]# [root@node-exporter43 tomcat-exporter]# ll total 44 drwxr-xr-x 5 root root 4096 May 13 11:55 ./ drwx------ 10 root root 4096 May 13 11:55 ../ -rw-r--r-- 1 root root 96 May 13 11:55 build.sh -rw-r--r-- 1 root root 503 May 13 11:55 Dockerfile drwxr-xr-x 8 root root 4096 May 13 11:55 .git/ drwxr-xr-x 2 root root 4096 May 13 11:55 libs/ -rw-r--r-- 1 root root 3407 May 13 11:55 metrics.war drwxr-xr-x 2 root root 4096 May 13 11:55 myapp/ -rw-r--r-- 1 root root 191 May 13 11:55 README.md -rw-r--r-- 1 root root 7604 May 13 11:55 server.xml [root@node-exporter43 tomcat-exporter]# [root@node-exporter43 tomcat-exporter]# bash build.sh 1.2 运行tomcat镜像 [root@node-exporter43 tomcat-exporter]# docker run -dp 18080:8080 --name tomcat-server registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/tomcat9-app:v1 5643c618db790e12b5ec658c362b3963a3db39914c826d6eef2fe55355f1d5d9 [root@node-exporter43 tomcat-exporter]# [root@node-exporter43 tomcat-exporter]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 5643c618db79 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/tomcat9-app:v1 "/usr/local/tomcat/b…" 4 seconds ago Up 4 seconds 8009/tcp, 8443/tcp, 0.0.0.0:18080->8080/tcp, :::18080->8080/tcp tomcat-server [root@node-exporter43 tomcat-exporter]# 1.3 访问tomcat应用 http://10.0.0.43:18080/metrics/ http://10.0.0.43:18080/myapp/ 2 配置prometheus监控tomcat应用 2.1 修改配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-tomcat-exporter" static_configs: - targets: - "10.0.0.43:18080" 2.2 重新加载配置并验证配置是否生效 curl -X POST http://10.0.0.31:9090/-/reload 2.3 导入grafana模板 由于官方的支持并不友好,可以在GitHub自行搜索相应的tomcat监控模板。 参考链接: https://github.com/nlighten/tomcat_exporter/blob/master/dashboard/example.json

image

14、prometheus监控tomcat
bash
1 部署tomcat-exporter 1.1 导入镜像 [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Prometheus/images/weixiang-tomcat-v9.0.87.tar.gz [root@node-exporter43 ~]# [root@node-exporter43 ~]# docker load -i weixiang-tomcat-v9.0.87.tar.gz 1.2 基于Dockerfile构建tomcat-exporter [root@node-exporter43 ~]# git clone https://gitee.com/jasonyin2020/tomcat-exporter.git [root@node-exporter43 ~]# cd tomcat-exporter/ [root@node-exporter43 tomcat-exporter]# [root@node-exporter43 tomcat-exporter]# ll total 44 drwxr-xr-x 5 root root 4096 May 13 11:55 ./ drwx------ 10 root root 4096 May 13 11:55 ../ -rw-r--r-- 1 root root 96 May 13 11:55 build.sh -rw-r--r-- 1 root root 503 May 13 11:55 Dockerfile drwxr-xr-x 8 root root 4096 May 13 11:55 .git/ drwxr-xr-x 2 root root 4096 May 13 11:55 libs/ -rw-r--r-- 1 root root 3407 May 13 11:55 metrics.war drwxr-xr-x 2 root root 4096 May 13 11:55 myapp/ -rw-r--r-- 1 root root 191 May 13 11:55 README.md -rw-r--r-- 1 root root 7604 May 13 11:55 server.xml [root@node-exporter43 tomcat-exporter]# [root@node-exporter43 tomcat-exporter]# bash build.sh 1.2 运行tomcat镜像 [root@node-exporter43 tomcat-exporter]# docker run -dp 18080:8080 --name tomcat-server registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/tomcat9-app:v1 5643c618db790e12b5ec658c362b3963a3db39914c826d6eef2fe55355f1d5d9 [root@node-exporter43 tomcat-exporter]# [root@node-exporter43 tomcat-exporter]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 5643c618db79 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/tomcat9-app:v1 "/usr/local/tomcat/b…" 4 seconds ago Up 4 seconds 8009/tcp, 8443/tcp, 0.0.0.0:18080->8080/tcp, :::18080->8080/tcp tomcat-server [root@node-exporter43 tomcat-exporter]# 1.3 访问tomcat应用 http://134.175.108.235:18080/metrics/ http://134.175.108.235:18080/myapp/

image

bash
2 配置prometheus监控tomcat应用 2.1 修改配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-tomcat-exporter" static_configs: - targets: - "10.0.0.43:18080" 2.2 重新加载配置并验证配置是否生效 curl -X POST http://10.0.0.31:9090/-/reload 2.3 导入grafana模板 由于官方的支持并不友好,可以在GitHub自行搜索相应的tomcat监控模板。 参考链接: https://github.com/nlighten/tomcat_exporter/blob/master/dashboard/example.json

image

image

15、Prometheus监控主流的中间件之etcd
bash
- Prometheus监控主流的中间件之etcd 参考链接: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config 1.prometheus端创建etcd证书目录 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# pwd /weixiang/softwares/prometheus-2.53.4.linux-amd64 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# mkdir -p certs/etcd [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.将etcd的自建证书拷贝prometheus服务器 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# scp 10.0.0.241:/weixiang/certs/etcd/etcd-{ca.pem,server-key.pem,server.pem} certs/etcd 3.Prometheus查看证书文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# apt install tree [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# tree certs/etcd/ certs/etcd/ ├── etcd-ca.pem ├── etcd-server-key.pem └── etcd-server.pem 0 directories, 3 files [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl -s --cacert certs/etcd/etcd-ca.pem --key certs/etcd/etcd-server-key.pem --cert certs/etcd/etcd-server.pem https://10.0.0.241:2379/metrics -k | wc -l 1717 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl -s --cacert certs/etcd/etcd-ca.pem --key certs/etcd/etcd-server-key.pem --cert certs/etcd/etcd-server.pem https://10.0.0.242:2379/metrics -k | wc -l 1718 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl -s --cacert certs/etcd/etcd-ca.pem --key certs/etcd/etcd-server-key.pem --cert certs/etcd/etcd-server.pem https://10.0.0.243:2379/metrics -k | wc -l 1714 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 4.修改Prometheus的配置文件【修改配置时,可以将中文注释删除,此处的中文注释是方便你理解的。】 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-etcd-cluster" # 使用https协议 scheme: https # 配置https证书相关信息 tls_config: # 指定CA的证书文件 ca_file: certs/etcd/etcd-ca.pem # 指定etcd服务的公钥文件 cert_file: certs/etcd/etcd-server.pem # 指定etcd服务的私钥文件 key_file: certs/etcd/etcd-server-key.pem static_configs: - targets: - 10.0.0.241:2379 - 10.0.0.242:2379 - 10.0.0.243:2379 5.检查配置文件是否正确 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: 1 rule files found SUCCESS: prometheus.yml is valid prometheus config file syntax Checking weixiang-linux96-rules.yml SUCCESS: 3 rules found [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 6.热加载配置文件 [root@prometheus-server31 ~]# curl -X POST http://10.0.0.31:9090/-/reload 7.检查配置是否生效 http://10.0.0.31:9090/targets 8.grafana导入模板ID 21473 3070 10323
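配置生效后,可以用几条 etcd 自带的指标快速确认集群状态(可作为后续告警规则的素材,指标名以实际 /metrics 输出为准):

```bash
# 每个成员是否能看到 leader,1 表示正常
etcd_server_has_leader

# leader 切换次数,频繁增长通常意味着集群不稳定
etcd_server_leader_changes_seen_total

# 后端数据库体积(字节)
etcd_mvcc_db_total_size_in_bytes
```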

7、服务发现

1、基于文件的服务发现案例
bash
- 基于文件的服务发现案例 静态配置:(static_configs) 修改Prometheus的配置文件时需要热加载配置文件或者重启服务生效。 动态配置:() 无需重启服务,可以监听本地的文件,或者通过注册中心,服务发现中心发现要监控的目标。 参考链接: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config 1.修改配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-file-sd" file_sd_configs: - files: - /tmp/xixi.json - /tmp/haha.yaml 2.热记载配置文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl -X POST 10.0.0.31:9090/-/reload [root@prometheus-server31 Prometheus]# cd /weixiang/softwares/prometheus-2.53.4.linux-amd64 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml WARNING: file "/tmp/xixi.json" for file_sd in scrape job "weixiang-file-sd" does not exist WARNING: file "/tmp/haha.yaml" for file_sd in scrape job "weixiang-file-sd" does not exist SUCCESS: prometheus.yml is valid prometheus config file syntax [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 3.修改json格式文件 [root@prometheus-server31 ~]# cat > /tmp/xixi.json <<EOF [ { "targets": [ "10.1.12.15:9100" ], "labels": { "school": "weixiang", "class": "weixiang98" } } ] EOF 4.验证是否自动监控目标 http://106.55.44.37:9090/targets?search= 5.再次编写yaml文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# cat > /tmp/haha.yaml <<EOF - targets: - '10.1.12.3:9100' - '10.1.12.4:9100' labels: address: ShaHe ClassRoom: JiaoShi05 EOF [root@promethues-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 6.验证是否自动监控目标 http://10.0.0.31:9090/targets?search= 7.Grafana导入模板ID 1860
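文件服务发现的关键点在于:后续只需要修改 /tmp/xixi.json、/tmp/haha.yaml 这类目标文件,Prometheus 会自动感知,无需再次热加载。下面的小实验可以验证这一点(新增的 10.1.12.5 仅为示例 IP):

```bash
# 直接向服务发现文件追加一个新目标,不需要 curl -X POST 热加载
cat >> /tmp/haha.yaml <<'EOF'
- targets:
    - '10.1.12.5:9100'
  labels:
    address: ShaHe
EOF

# 稍等片刻后,在 http://10.0.0.31:9090/targets?search= 即可看到新目标
```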

image

image

2、基于consul的服务发现案例
bash
- 基于consul的服务发现案例 官方文档: https://www.consul.io/ https://developer.hashicorp.com/consul/install#linux 1 部署consul集群 1.1 下载consul【41-43节点】 wget https://releases.hashicorp.com/consul/1.21.3/consul_1.21.3_linux_amd64.zip svip: wget http://192.168.21.253/Resources/Prometheus/softwares/Consul/consul_1.21.3_linux_amd64.zip 1.2 解压consul unzip consul_1.21.3_linux_amd64.zip -d /usr/local/bin/ 1.3 运行consul 集群 服务端43: consul agent -server -bootstrap -bind=10.1.12.4 -data-dir=/weixiang/softwares/consul -client=10.1.12.4 -ui 客户端42: consul agent -bind=10.1.12.3 -data-dir=/weixiang/softwares/consul -client=10.1.12.3 -ui -retry-join=10.1.12.4 客户端41: consul agent -server -bind=10.1.12.15 -data-dir=/weixiang/softwares/consul -client=10.1.12.15 -ui -retry-join=10.1.12.4 1.4 查看各节点的监听端口 ss -ntl | egrep "8300|8500" 1.5 访问console服务的WebUI http://134.175.108.235:8500/ui/dc1/nodes 2.使用consul实现自动发现 2.1 修改prometheus的配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-consul-seriver-discovery" # 配置基于consul的服务发现 consul_sd_configs: # 指定consul的服务器地址,若不指定,则默认值为"localhost:8500". - server: 10.0.0.43:8500 - server: 10.0.0.42:8500 - server: 10.0.0.41:8500 relabel_configs: # 匹配consul的源标签字段,表示服务名称 - source_labels: [__meta_consul_service] # 指定源标签的正则表达式,若不定义,默认值为"(.*)" regex: consul # 执行动作为删除,默认值为"replace",有效值有多种 # https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_action action: drop 2.2 检查配置文件是否正确 [root@prometheus-server31 ~]# cd /weixiang/softwares/prometheus-2.53.4.linux-amd64/ [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config ./prometheus.yml Checking ./prometheus.yml SUCCESS: ./prometheus.yml is valid prometheus config file syntax [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.3 重新加载配置 [root@prometheus-server31 ~]# curl -X POST http:/106.55.44.37:9090/-/reload [root@prometheus-server31 ~]# 2.4.被监控节点注册到console集群 2.4.1 注册节点 [root@grafana71 ~]# curl -X PUT -d '{"id":"prometheus-node42","name":"weixiang-prometheus-node42","address":"10.1.12.3","port":9100,"tags":["node-exporter"],"checks": [{"http":"http://10.1.12.3:9100","interval":"5m"}]}' http://10.1.12.4:8500/v1/agent/service/register [root@grafana71 ~]#

image

bash
2.4.2 注销节点 [root@grafana71 ~]# curl -X PUT http://10.1.12.4:8500/v1/agent/service/deregister/prometheus-node42 TODO---> 目前有个坑:注册时请求的是哪个consul节点,注销时也要向同一个节点发起注销请求,待解决...
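上面的服务发现配置使用 drop 把 consul 自身这个服务排除掉;也可以反过来用 keep,只保留注册时打了 node-exporter 标签(tag)的服务,避免误采集其他注册进来的服务。下面是一个草案(__meta_consul_tags 的取值以 consul 实际返回为准):

```bash
- job_name: "weixiang-consul-sd-keep-by-tag"
  consul_sd_configs:
    - server: 10.1.12.4:8500
  relabel_configs:
    # __meta_consul_tags 的取值形如 ",node-exporter,",前后各带一个分隔符
    - source_labels: [__meta_consul_tags]
      regex: ".*,node-exporter,.*"
      action: keep
```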

image

8、node-exporter的黑白名单

bash
- node-exporter的黑白名单 参考链接: https://github.com/prometheus/node_exporter 1.停止服务 [root@node-exporter41 ~]# systemctl stop node-exporter.service 2.配置黑名单 [root@node-exporter41 ~]# cd /weixiang/softwares/node_exporter-1.9.1.linux-amd64/ [root@node-exporter41 node_exporter-1.9.1.linux-amd64]# [root@node-exporter41 node_exporter-1.9.1.linux-amd64]# ./node_exporter --no-collector.cpu 3.配置白名单 [root@node-exporter41 node_exporter-1.9.1.linux-amd64]# ./node_exporter --collector.disable-defaults --collector.cpu --collector.uname 温馨提示: 相关指标测试 node_cpu_seconds_total ----》 cpu node_uname_info ----》 uname - Prometheus server实现黑白名单 1.黑名单 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie_k8s_exporter" params: exclude[]: - cpu static_configs: - targets: ["10.1.12.3:9100"] [root@promethues-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config ./prometheus.yml Checking ./prometheus.yml SUCCESS: ./prometheus.yml is valid prometheus config file syntax [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 过滤方式: node_cpu_seconds_total{job="yinzhengjie_k8s_exporter"}

image

image

bash
2.白名单 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie_dba_exporter" params: collect[]: - uname static_configs: - targets: ["10.0.0.41:9100"] [root@promethues-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config ./prometheus.yml Checking ./prometheus.yml SUCCESS: ./prometheus.yml is valid prometheus config file syntax [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 过滤方式: node_uname_info{job="yinzhengjie_dba_exporter"}
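服务端侧的黑白名单,本质上是 Prometheus 抓取时在 URL 上携带 collect[] / exclude[] 参数,可以直接用 curl 模拟验证(地址沿用上文示例):

```bash
# 模拟黑名单:携带 exclude[]=cpu 时,node_cpu 开头的指标条数应为 0
curl -s 'http://10.1.12.3:9100/metrics?exclude[]=cpu' | grep -c '^node_cpu'

# 模拟白名单:只采集 uname 收集器
curl -s 'http://10.0.0.41:9100/metrics?collect[]=uname' | grep '^node_uname_info'
```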

image

image

9、自定义Dashboard

bash
- 自定义Dashboard 1.相关语句 1.1 CPU一分钟内的使用率 (1 - sum(increase(node_cpu_seconds_total{mode="idle"}[1m])) by (instance) / sum(increase(node_cpu_seconds_total[1m])) by (instance)) * 100 1.2 服务器启动时间 (time() - node_boot_time_seconds{job="yinzhengjie_k8s_exporter"}) 1.3 CPU核心数 count(node_cpu_seconds_total{job="weixiang-file-sd",instance="10.1.12.3:9100",mode="idle"}) 1.4 内存总量 node_memory_MemTotal_bytes{instance="10.0.0.42:9100",job="weixiang-file-sd"} 2.自定义变量
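“自定义变量”部分可以用 Grafana 变量(Variables)的 Query 类型配合 label_values() 实现实例下拉框,下面是常见写法示例(job 名称沿用上文,仅供参考):

```bash
# Grafana 变量(Query 类型,数据源选择 Prometheus)的常见写法
# 变量 node:列出指定 job 下的所有 instance
label_values(node_uname_info{job="weixiang-file-sd"}, instance)

# 在面板的 PromQL 中引用该变量
node_memory_MemTotal_bytes{instance="$node"}
```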

可以照着之前别人做好的Dashboards来定制

1db82e256b3e8c20059bb605b29bf76e

上传Dashboard到Grafana官网

bash
Dashboard推送到Grafana官网:https://grafana.com/orgs/eb1360821977/dashboards/new 23842

10、Grafana的表格制作

bash
- Grafana的表格制作 参考链接: https://www.cnblogs.com/yinzhengjie/p/18538430 参考语句: avg(node_uname_info) by (instance,nodename,release) 压力测试: CPU压测: stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 10m 内存压测: stress --cpu 8 --io 4 --vm 4 --vm-bytes 512M --timeout 10m --vm-keep

image

11、Grafana授权

1、用户授权

image

image

2、团队授权

image

3、对指定的Dashboard授权

image

12、Prometheus的存储

1、Prometheus的本地存储
bash
- Prometheus的本地存储 相关参数说明: --web.enable-lifecycle 支持热加载模块。 --storage.tsdb.path=/weixiang/data/prometheus 指定数据存储的路径。 --storage.tsdb.retention.time=60d 指定数据的存储周期。 --web.listen-address=0.0.0.0:9090 配置监听地址。 --web.max-connections=65535 配置连接数。 --config.file 指定Prometheus的配置文件。
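把上述参数组合起来,一个典型的启动命令大致如下(路径与保留周期均为示例假设,请按实际环境调整):

```bash
/weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus \
  --config.file=/weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml \
  --storage.tsdb.path=/weixiang/data/prometheus \
  --storage.tsdb.retention.time=60d \
  --web.listen-address=0.0.0.0:9090 \
  --web.max-connections=65535 \
  --web.enable-lifecycle
```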
2、VictoriaMetrics远端存储
bash
- VictoriaMetrics单机版快速部署 1 VicoriaMetrics概述 VictoriaMetrics是一个快速、经济高效且可扩展的监控解决方案和时间序列数据库。 官网地址: https://victoriametrics.com/ 官方文档: https://docs.victoriametrics.com/ GitHub地址: https://github.com/VictoriaMetrics/VictoriaMetrics 部署文档: https://docs.victoriametrics.com/quick-start/ 2 部署victoriametrics 2.1 下载victoriametrics 版本选择建议使用93 LTS,因为使用97 LTS貌似需要企业授权,启动报错,发现如下信息: [root@prometheus-server33 ~]# journalctl -u victoria-metrics.service -f ... Nov 14 12:03:28 prometheus-server33 victoria-metrics-prod[16999]: 2024-11-14T04:03:28.576Z error VictoriaMetrics/lib/license/copyrights.go:33 VictoriaMetrics Enterprise license is required. Please obtain it at https://victoriametrics.com/products/enterprise/trial/ and pass it via either -license or -licenseFile command-line flags. See https://docs.victoriametrics.com/enterprise/ wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.93.16/victoria-metrics-linux-amd64-v1.93.16.tar.gz SVIP: [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/VictoriaMetrics/victoria-metrics-linux-amd64-v1.93.16.tar.gz 2.2 解压软件包 [root@node-exporter43 ~]# tar xf victoria-metrics-linux-amd64-v1.93.16.tar.gz -C /usr/local/bin/ [root@node-exporter43 ~]# [root@node-exporter43 ~]# ll /usr/local/bin/victoria-metrics-prod -rwxr-xr-x 1 yinzhengjie yinzhengjie 22216200 Jul 18 2024 /usr/local/bin/victoria-metrics-prod* [root@node-exporter43 ~]# 2.3 编写启动脚本 cat > /etc/systemd/system/victoria-metrics.service <<EOF [Unit] Description=weixiang Linux VictoriaMetrics Server Documentation=https://docs.victoriametrics.com/ After=network.target [Service] ExecStart=/usr/local/bin/victoria-metrics-prod \ -httpListenAddr=0.0.0.0:8428 \ -storageDataPath=/weixiang/data/victoria-metrics \ -retentionPeriod=3 [Install] WantedBy=multi-user.target EOF systemctl daemon-reload systemctl enable --now victoria-metrics.service systemctl status victoria-metrics 2.4 检查端口是否存活 [root@node-exporter43 ~]# ss -ntl | grep 8428 LISTEN 0 4096 0.0.0.0:8428 0.0.0.0:* [root@node-exporter43 ~]# 2.5 查看webUI http://10.0.0.43:8428/

image

image

bash
- prometheus配置VictoriaMetrics远端存储 1 修改prometheus的配置文件 [root@promethues-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-linux-VictoriaMetrics-node-exporter" static_configs: - targets: - "10.1.12.15:9100" - "10.1.12.3:9100" - "10.1.12.4:9100" # 在顶级字段中配置VictoriaMetrics地址 remote_write: - url: http://10.1.12.4:8428/api/v1/write 2 重新加载prometheus的配置 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# systemctl stop prometheus-server.service [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ll total 261356 drwxr-xr-x 4 1001 fwupd-refresh 4096 Mar 28 17:18 ./ drwxr-xr-x 3 root root 4096 Mar 26 09:45 ../ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 console_libraries/ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 consoles/ -rw-r--r-- 1 1001 fwupd-refresh 11357 Mar 18 23:05 LICENSE -rw-r--r-- 1 1001 fwupd-refresh 3773 Mar 18 23:05 NOTICE -rw-r--r-- 1 root root 135 Mar 28 15:09 weixiang-file-sd.json -rw-r--r-- 1 root root 148 Mar 28 15:10 weixiang-file-sd.yaml -rwxr-xr-x 1 1001 fwupd-refresh 137836884 Mar 18 22:52 prometheus* -rw-r--r-- 1 root root 5321 Mar 28 15:02 prometheus2025-03-28-AM -rw-r--r-- 1 1001 fwupd-refresh 3296 Mar 28 17:16 prometheus.yml -rw-r--r-- 1 root root 1205 Mar 27 10:05 prometheus.yml2025-03-26 -rw-r--r-- 1 root root 2386 Mar 28 10:06 prometheus.yml2025-03-27 -rwxr-xr-x 1 1001 fwupd-refresh 129719117 Mar 18 22:52 promtool* [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./prometheus 温馨提示: 为了避免实验干扰,我建议大家手动启动prometheus。 3 在VictoriaMetrics的WebUI查看数据 node_cpu_seconds_total{instance="10.1.12.15:9100"} 温馨提示: 如果此步骤没有数据,则不要做下面的步骤了,请先把数据搞出来。 4 配置grafana的数据源及URL 数据源是prometheus,但是URL得写VictoriaMetric的URL哟。 参考URL: http://134.175.108.235:8428
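也可以绕开 Grafana,直接调用 VictoriaMetrics 兼容 Prometheus 的查询接口确认数据确实写入了,示例如下(地址按上文 43 节点假设):

```bash
curl -s http://10.1.12.4:8428/api/v1/query \
  --data-urlencode 'query=node_cpu_seconds_total{instance="10.1.12.15:9100"}' | head -c 500
```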

image

image

bash
5 导入grafana的模板ID并选择数据源 1860

image

3、VictoriaMetrics集群架构远端存储
bash
- VictoriaMetrics集群架构远端存储 1 VictoriaMetrics集群架构概述 - 单点部署参考链接: https://docs.victoriametrics.com/quick-start/#starting-vm-single-from-a-binary - 集群部署参考链接: https://docs.victoriametrics.com/quick-start/#starting-vm-cluster-from-binaries https://docs.victoriametrics.com/cluster-victoriametrics/#architecture-overview 部署集群时软件包要下载对应的cluster集群版本: wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.93.16/victoria-metrics-linux-amd64-v1.93.16-cluster.tar.gz 软件包会提供3个程序,分别对应集群的3个组件 vmstorage: 存储原始数据,并返回给定标签过滤器在给定时间范围内的查询数据 vminsert: 接收写入的数据,并根据度量名称及其所有标签的一致性哈希在vmstorage节点之间分发 vmselect: 通过从所有配置的vmstorage节点获取所需数据来执行传入查询 2 VictoriaMetrics集群架构图解 见官网
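作为补充,下面给出集群版三个组件在单机上试跑的极简草案(端口为 VictoriaMetrics 文档中的默认约定:vmstorage 8400/8401/8482、vminsert 8480、vmselect 8481;具体命令与参数以官方文档为准,此处仅帮助理解架构):

```bash
# vmstorage:负责数据落盘
./vmstorage-prod -storageDataPath=/weixiang/data/vmstorage -retentionPeriod=3 &

# vminsert:接收写入并按一致性哈希分发到 vmstorage(8400 为 vminsert 到 vmstorage 的端口)
./vminsert-prod -storageNode=127.0.0.1:8400 &

# vmselect:负责查询(8401 为 vmselect 到 vmstorage 的端口)
./vmselect-prod -storageNode=127.0.0.1:8401 &

# Prometheus 的 remote_write 对接 vminsert(URL 中的 0 为租户 ID):
#   remote_write:
#     - url: http://127.0.0.1:8480/insert/0/prometheus/api/v1/write
# Grafana / 查询侧对接 vmselect:
#   http://127.0.0.1:8481/select/0/prometheus/
```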

13、Prometheus的标签管理

1、概念
bash
- Prometheus的标签管理 1.什么是标签 标签用于对数据分组和分类,利用标签可以将数据进行过滤筛选。 标签管理的常见场景: - 1.删除不必要的指标; - 2.从指标中删除敏感或不需要的标签; - 3.添加,编辑或修改指标的标签值或标签格式; 标签的分类: - 默认标签: Prometheus自身内置的标签,格式为"__LABEL__"。 典型的默认标签如下所示: - "__metrics_path__" - "__address__" - "__scheme__" - "__scrape_interval__" - "__scrape_timeout__" - "instance" - "job" - 应用标签: 应用本身内置,尤其是监控特定的服务,会有对应的应用标签,格式一般为"__meta_LABEL" 以consul服务为例,典型的应用标签如下所示: - "__meta_consul_address" - "__meta_consul_dc" - ...

2、自定义标签:
bash
- 自定义标签: 指的是用户自定义的标签,我们在定义targets时可以自定义。 2.标签主要有两种表现形式 - 私有标签: 以"__*"样式存在,用于获取监控目标的默认元数据属性,比如"__scheme__","__address__","__metrics_path__"等。 - 普通标签: 对监控指标进行各种灵活的管理操作,常见的操作有删除不必要的敏感数据,添加、编辑或修改指标的标签值或者标签格式等。 3.Prometheus对数据处理的流程 - 1.服务发现: 支持静态发现和动态发现,主要是找到对应的target。 - 2.配置: 加载"__scheme__","__address__","__metrics_path__"等信息。 - 3.重新标记: relabel_configs,主要针对要监控的target的标签。 - 4.抓取: 开始抓取数据。 - 5.重新标记: metric_relabel_configs,主要针对已经抓取回来的metrics的标签的操作。 4.为targets自定义打标签案例 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie-node-exporter-lable" static_configs: - targets: ["10.1.12.15:9100","10.1.12.3:9100","10.1.12.4:9100"] labels: auther: yinzhengjie school: weixiang class: weixiang98 热加载 curl -X POST 106.55.44.37:9090/-/reload 5.查看webUI

image

3、relabel_configs替换标签replace案例
bash
1.修改prometheus的配置文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie-node-exporter-relabel_configs" static_configs: - targets: ["10.1.12.15:9100","10.1.12.3:9100","10.1.12.4:9100"] labels: auther: yinzhengjie blog: https://www.cnblogs.com/yinzhengjie relabel_configs: # Prometheus 采集数据之前,对目标的标签(Label)进行重写或转换 # 指定正则表达式匹配成功的label进行标签管理的列表 - source_labels: - __scheme__ - __address__ - __metrics_path__ # 表示source_labels对应Label的名称或值进行匹配此处指定的正则表达式。 # 此处我们对数据进行了分组,后面replacement会使用"${1}"和"$2"进行引用。 regex: "(http|https)(.*)" # 第一个它匹配字符串开头的 "http" 或 "https"。在这里,它匹配到了 "http" 。第二个它匹配到了 "10.1.12.15:9100/metrics"。 # 指定用于连接多个source_labels为一个字符串的分隔符,若不指定,默认为分号";"。 # 假设源数据如下: # __address__="10.1.12.15:9100" # __metrics_path__="/metrics" # __scheme__="http" # 拼接后操作的结果为: "http10.1.24.13:9100/metrics" separator: "" # 在进行Label替换的时候,可以将原来的source_labels替换为指定修改后的label。 # 将来会新加一个标签,标签的名称为"yinzhengjie_prometheus_ep",值为replacement的数据。 target_label: "yinzhengjie_prometheus_ep" # 替换标签时,将target_label对应的值进行修改成此处的值,最终要生成的“替换值” replacement: "${1}://${2}" # 对Label或指标进行管理,场景的动作有replace|keep|drop|lablemap|labeldrop等,默认为replace。 action: replace 总结:整个流程串起来 对于目标 10.1.12.15:9100: 收集源:source_labels 拿到 "http", "10.1.12.15:9100", "/metrics"。 拼接字符串:separator: "" 将它们变成 "http10.1.12.15:9100/metrics"。 正则匹配:regex 匹配成功,捕获到 ${1}="http"${2}="10.1.12.15:9100/metrics"。 生成新值:replacement 使用捕获组生成了新字符串 "http://10.1.12.15:9100/metrics"。 应用结果:action: replace 创建了一个新标签 target_label,最终效果是给这个 target 增加了一个标签: yinzhengjie_prometheus_ep="http://10.1.12.15:9100/metrics"。 2.热加载配置 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl 10.0.0.31:9090/-/reload -X POST 3.webUI验证 略,见视频。 总结: 相对来说,relabel_configs和labels的作用类似,也是为实例打标签,只不过relabel_configs的功能性更强。 我们可以基于标签来对监控指标进行过滤。

image

bash
[root@promethues-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... separator: "%%" # 把分隔符改成%% ...

image

4、relabel_configs新增标签映射labelmap案例
bash
1.修改prometheus的配置文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie-node-exporter-relabel_configs-labeldrop" static_configs: - targets: ["10.1.12.15:9100","10.1.12.3:9100","10.1.12.4:9100"] relabel_configs: # 目标重写 - regex: "(job|app)" # 2.它会检查每一个标签的名称 (label name),看是否能匹配这个正则表达式 replacement: "${1}_yinzhengjie_labelmap_kubernetes" # 3. 标签名匹配成功,新的标签名是job_yinzhengjie_labelmap_kubernetes action: labelmap # 1.这个动作会遍历目标上所有的标签(labels)。 - regex: "(job|app)" action: labeldrop # 4.这个动作会再次遍历目标上所有的标签,删除regex匹配到的标签,job_yinzhengjie_labelmap_kubernetes 不匹配 (job|app)(因为正则没有 .*),所以这个标签被保留。 总结:整个流程串起来 初始状态:目标有一个标签 job="yinzhengjie..."。 labelmap 操作: 发现 job 标签名匹配 (job|app)。 复制 job 标签,并将其重命名为 job_yinzhengjie_labelmap_kubernetes。 此时目标同时拥有 job 和 job_yinzhengjie_labelmap_kubernetes 两个标签。 labeldrop 操作: 发现 job 标签名匹配 (job|app)。 将 job 标签删除。 最终结果:原始的 job 标签被成功地重命名为了 job_yinzhengjie_labelmap_kubernetes。 2.热加载配置 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl 10.0.0.31:9090/-/reload -X POST 3.webUI验证 略,见视频。

image

5、metric_relabel_configs修改metric标签案例
bash
1.修改prometheus的配置文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie-node-exporter-metric_relabel_configs" static_configs: - targets: ["10.1.12.15:9100","10.1.12.3:9100","10.1.12.4:9100"] metric_relabel_configs: # 指标重写 - source_labels: # 指定要操作的源头。在这里,__name__ 是一个非常特殊的内部标签,它代表指标的名称。 - __name__ regex: "node_cpu_.*" # 匹配所有以 node_cpu_ 开头的字符串 action: drop 总结:整个流程 Prometheus 向 10.1.12.15:9100 等目标发起 HTTP 请求,成功抓取到一大堆指标数据。 在将这些数据写入磁盘之前,metric_relabel_configs 规则开始生效。 它逐一检查每条指标的名称 (__name__)。 任何名称以 node_cpu_ 开头的指标,比如 node_cpu_seconds_total{cpu="0",mode="idle"},都会被直接丢弃。 其他不匹配的指标,如 node_memory_MemFree_bytes,则会被保留并正常存储。 2.热加载配置 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl 10.0.0.31:9090/-/reload -X POST 3.webUI验证

14、部署blackbox-exporter黑盒监控

1.blackbox-exporter概述
bash
一般用于监控网站是否监控,端口是否存活,证书有效期等。 blackbox exporter支持基于HTTP, HTTPS, DNS, TCP, ICMP, gRPC协议来对目标节点进行监控。 比如基于http协议我们可以探测一个网站的返回状态码为200判读服务是否正常。 比如基于TCP协议我们可以探测一个主机端口是否监听。 比如基于ICMP协议来ping一个主机的连通性。 比如基于gRPC协议来调用接口并验证服务是否正常工作。 比如基于DNS协议可以来检测域名解析。 2.下载blackbox-exporter wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.27.0/blackbox_exporter-0.27.0.linux-amd64.tar.gz SVIP: [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/blackbox_exporter/blackbox_exporter-0.27.0.linux-amd64.tar.gz 3.解压软件包 [root@node-exporter43 ~]# tar xf blackbox_exporter-0.27.0.linux-amd64.tar.gz -C /usr/local/ [root@node-exporter43 ~]# [root@node-exporter43 ~]# cd /usr/local/blackbox_exporter-0.27.0.linux-amd64/ [root@node-exporter43 blackbox_exporter-0.27.0.linux-amd64]# [root@node-exporter43 blackbox_exporter-0.27.0.linux-amd64]# ll total 30800 drwxr-xr-x 2 1001 1002 4096 Jun 30 20:48 ./ drwxr-xr-x 11 root root 4096 Aug 6 14:35 ../ -rwxr-xr-x 1 1001 1002 31509376 Jun 30 20:46 blackbox_exporter* -rw-r--r-- 1 1001 1002 1209 Jun 30 20:47 blackbox.yml -rw-r--r-- 1 1001 1002 11357 Jun 30 20:47 LICENSE -rw-r--r-- 1 1001 1002 94 Jun 30 20:47 NOTICE [root@node-exporter43 blackbox_exporter-0.27.0.linux-amd64]# [root@node-exporter43 blackbox_exporter-0.27.0.linux-amd64]# 4.启动blackbox服务 [root@node-exporter43 blackbox_exporter-0.27.0.linux-amd64]# ./blackbox_exporter 5.访问blackbox的WebUI http://134.175.108.235:9115/ 6.访问测试 http://134.175.108.235:9115/probe?target=www.weixiang.com&module=http_2xx http://134.175.108.235:9115/probe?target=prometheus.io&module=http_2xx
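除了浏览器访问,也可以直接 curl 探测接口并关注 probe_success、probe_http_status_code 等关键指标,示例如下(地址与目标域名沿用上文示例):

```bash
# probe_success 为 1 表示探测成功,probe_http_status_code 为目标返回的状态码
curl -s 'http://134.175.108.235:9115/probe?target=prometheus.io&module=http_2xx' | \
  grep -E '^probe_(success|http_status_code|duration_seconds)'
```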

image

image

image

2、Prometheus server整合blackbox实现网站监控
bash
1.修改Prometheus的配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... # 指定作业的名称,生成环境中,通常是指一类业务的分组配置。 - job_name: 'weixiang-blackbox-exporter-http' # 修改访问路径,若不修改,默认值为"/metrics" metrics_path: /probe # 配置URL的相关参数 params: # 此处表示使用的是blackbox的http模块,从而判断相应的返回状态码是否为200 module: [http_2xx] school: [weixiang] # 静态配置,需要手动指定监控目标 static_configs: # 需要监控的目标 - targets: # 支持https协议 - https://www.weixiang.com/ # 支持http协议 - http://10.1.12.15 # 支持http协议和自定义端口 - http://10.1.24.13:9090 # 对目标节点进行重新打标签配置 relabel_configs: # 指定源标签,此处的"__address__"表示内置的标签,存储的是被监控目标的IP地址 - source_labels: [__address__] # 指定目标标签,其实就是在"Endpoint"中加了一个target字段(用于指定监控目标), target_label: __param_target # 指定需要执行的动作,默认值为"replace",常用的动作有: replace, keep, and drop。 # 但官方支持十几种动作: https://prometheus.io/docs/prometheus/2.45/configuration/configuration/ # 将"__address__"传递给target字段。 action: replace - source_labels: [__param_target] target_label: instance - target_label: __address__ # 指定要替换的值,此处我指定为blackbox exporter的主机地址 replacement: 10.1.12.4:9115 [root@prometheus-server31 ~]# 2.热加载配置 [root@prometheus-server31 ~]# curl -X POST http://10.0.0.31:9090/-/reload 3.验证webUI http://10.0.0.31:9090/targets?search= 4.导入grafana的模板ID 7587 13659
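接入之后,常用下面几条 PromQL 判断站点可用性和 HTTPS 证书剩余有效期,可作为后续告警规则的表达式草案(job 名称沿用上文):

```bash
# 站点探测失败(可直接用于告警)
probe_success{job="weixiang-blackbox-exporter-http"} == 0

# HTTPS 证书剩余有效天数
(probe_ssl_earliest_cert_expiry{job="weixiang-blackbox-exporter-http"} - time()) / 86400

# 探测耗时(秒)
probe_duration_seconds{job="weixiang-blackbox-exporter-http"}
```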

image

image

3、prometheus基于blackbox的ICMP监控目标主机是否存活
bash
1 修改Prometheus配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: 'weixiang-blackbox-exporter-icmp' metrics_path: /probe params: # 如果不指定模块,则默认类型为"http_2xx",不能乱写!乱写监控不到服务啦! module: [icmp] static_configs: - targets: - 10.1.12.15 - 10.1.12.3 relabel_configs: - source_labels: [__address__] target_label: __param_target - source_labels: [__param_target] # 指定注意的是,如果instance不修改,则instance和"__address__"的值相同 # target_label: ip target_label: instance - target_label: __address__ replacement: 10.1.12.4:9115 2 检查配置文件是否正确 [root@prometheus-server31 ~]# cd /weixiang/softwares/prometheus-2.53.4.linux-amd64/ [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ll total 261348 drwxr-xr-x 5 1001 fwupd-refresh 4096 Mar 28 14:35 ./ drwxr-xr-x 3 root root 4096 Mar 26 09:45 ../ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 console_libraries/ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 consoles/ drwxr-xr-x 4 root root 4096 Mar 26 14:49 data/ -rw-r--r-- 1 1001 fwupd-refresh 11357 Mar 18 23:05 LICENSE -rw-r--r-- 1 1001 fwupd-refresh 3773 Mar 18 23:05 NOTICE -rwxr-xr-x 1 1001 fwupd-refresh 137836884 Mar 18 22:52 prometheus* -rw-r--r-- 1 1001 fwupd-refresh 4858 Mar 28 14:35 prometheus.yml -rw-r--r-- 1 root root 1205 Mar 27 10:05 prometheus.yml2025-03-26 -rw-r--r-- 1 root root 2386 Mar 28 10:06 prometheus.yml2025-03-27 -rwxr-xr-x 1 1001 fwupd-refresh 129719117 Mar 18 22:52 promtool* [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 3 重新加载配置 [root@prometheus-server31 ~]# curl -X POST http://10.0.0.31:9090/-/reload [root@prometheus-server31 ~]# 4 访问prometheus的WebUI http://10.0.0.31:9090/targets 5 访问blackbox的WebUI http://118.89.55.174:9115/ 6 grafana过滤jobs数据 基于"weixiang-blackbox-exporter-icmp"标签进行过滤。

image

image

4、prometheus基于blackbox的TCP案例监控端口是否存活
bash
1 修改Prometheus配置文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: 'weixiang-blackox-exporter-tcp' metrics_path: /probe params: module: [tcp_connect] static_configs: - targets: - 10.1.12.15:80 - 10.1.12.3:22 - 10.1.24.13:9090 relabel_configs: - source_labels: [__address__] target_label: __param_target - source_labels: [__param_target] target_label: instance - target_label: __address__ replacement: 10.1.12.4:9115 2 检查配置文件是否正确 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 3 重新加载配置文件 [root@prometheus-server31 ~]# curl -X POST http://10.0.0.31:9090/-/reload [root@prometheus-server31 ~]# 4 访问prometheus的WebUI http://10.0.0.31:9090/targets 5 访问blackbox exporter的WebUI http://10.0.0.41:9115/ 6 使用grafana查看数据 基于"weixiang-blackbox-exporter-tcp"标签进行过滤。

15、prometheus的联邦模式

1、为什么需要联邦模式
bash
1.为什么需要联邦模式
联邦模式的主要作用是减轻单台prometheus server的抓取与I/O压力,实现数据的分布式采集与存储。
其思路是: 由多个下级Prometheus分别抓取各自负责的目标,上级Prometheus只通过"/federate"接口拉取其中关心的指标进行汇总。
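
联邦的本质是上级 Prometheus 周期性抓取下级实例的 "/federate" 接口,该接口要求至少携带一个 match[] 参数。动手配置前可以先用 curl 验证该接口是否能返回数据(以下命令仅为示例,IP 取自下文的 32 节点)。

bash
# match[]指定要拉取的时间序列选择器,这里拉取所有带job标签的序列并只看前20行
curl -sG 'http://10.1.20.5:9090/federate' \
  --data-urlencode 'match[]={job=~".+"}' | head -n 20
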
2、联邦模式实战
bash
2.prometheus 31节点监控41节点 2.1 拷贝配置文件 [root@promethues-server31 ~]# scp weixiang-install-prometheus-server-v2.53.4.tar.gz 10.1.20.5:~ 2.2 安装Prometheus-server [root@prometheus-server32 ~]# tar xf weixiang-install-prometheus-server-v2.53.4.tar.gz [root@prometheus-server32 ~]# ./install-prometheus-server.sh i 2.3 修改prometheus的配置文件 [root@prometheus-server32 ~]# cd /weixiang/softwares/prometheus-2.53.4.linux-amd64/ [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# ll total 261324 drwxr-xr-x 4 1001 fwupd-refresh 4096 Mar 18 23:08 ./ drwxr-xr-x 3 root root 4096 Aug 6 15:51 ../ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 console_libraries/ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 consoles/ -rw-r--r-- 1 1001 fwupd-refresh 11357 Mar 18 23:05 LICENSE -rw-r--r-- 1 1001 fwupd-refresh 3773 Mar 18 23:05 NOTICE -rwxr-xr-x 1 1001 fwupd-refresh 137836884 Mar 18 22:52 prometheus* -rw-r--r-- 1 1001 fwupd-refresh 934 Mar 18 23:05 prometheus.yml -rwxr-xr-x 1 1001 fwupd-refresh 129719117 Mar 18 22:52 promtool* [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# tail -5 prometheus.yml - job_name: "weixiang-file-sd-yaml-node_exporter" file_sd_configs: - files: - /weixiang/softwares/prometheus-2.53.4.linux-amd64/sd/*.yaml [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# 2.4 创建服务发现的文件 [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# mkdir sd [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# vim sd/*.yaml [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# cat sd/*.yaml - targets: - '10.1.12.15:9100' labels: school: weixiang class: weixiang98 [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# 2.5 检查配置文件 [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# 2.6 热加载配置 [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# curl -X POST 10.1.20.5:9090/-/reload [root@prometheus-server32 prometheus-2.53.4.linux-amd64]#

image

bash
3. Prometheus 33节点监控42,43节点 3.1 启动consul集群 服务端43: consul agent -server -bootstrap -bind=10.1.12.4 -data-dir=/weixiang/softwares/consul -client=10.1.12.4 -ui 客户端42: consul agent -bind=10.1.12.3 -data-dir=/weixiang/softwares/consul -client=10.1.12.3 -ui -retry-join=10.1.12.4 客户端41: consul agent -server -bind=10.1.12.15 -data-dir=/weixiang/softwares/consul -client=10.1.12.15 -ui -retry-join=10.1.12.4 3.2 拷贝配置文件 [root@promethues-server31 ~]# scp weixiang-install-prometheus-server-v2.53.4.tar.gz 10.1.24.4:~ 3.3 安装Prometheus-server [root@prometheus-server33 ~]# tar xf weixiang-install-prometheus-server-v2.53.4.tar.gz [root@prometheus-server33 ~]# ./install-prometheus-server.sh i 3.4 修改prometheus的配置文件 [root@prometheus-server33 ~]# cd /weixiang/softwares/prometheus-2.53.4.linux-amd64/ [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# ll total 261328 drwxr-xr-x 4 1001 fwupd-refresh 4096 Mar 18 23:08 ./ drwxr-xr-x 3 root root 4096 Aug 6 15:55 ../ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 console_libraries/ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 consoles/ -rw-r--r-- 1 1001 fwupd-refresh 11357 Mar 18 23:05 LICENSE -rw-r--r-- 1 1001 fwupd-refresh 3773 Mar 18 23:05 NOTICE -rwxr-xr-x 1 1001 fwupd-refresh 137836884 Mar 18 22:52 prometheus* -rw-r--r-- 1 1001 fwupd-refresh 934 Mar 18 23:05 prometheus.yml -rwxr-xr-x 1 1001 fwupd-refresh 129719117 Mar 18 22:52 promtool* [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# vim prometheus.yml [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# tail prometheus.yml - job_name: "weixiang-consul-sd-node_exporter" consul_sd_configs: - server: 10.1.12.4:8500 - server: 10.1.12.3:8500 - server: 10.1.12.15:8500 relabel_configs: - source_labels: [__meta_consul_service] regex: consul action: drop [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# 3.5 检查配置文件 [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# 2.6 热加载配置 [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# curl -X POST 10.1.24.4:9090/-/reload [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# 2.7 42和43注册节点 [root@node-exporter41 ~]# curl -X PUT -d '{"id":"prometheus-node42","name":"weixiang-prometheus-node42","address":"10.1.12.3","port":9100,"tags":["node-exporter"],"checks": [{"http":"http://10.1.12.3:9100","interval":"5m"}]}' http://10.1.12.4:8500/v1/agent/service/register curl -X PUT -d '{"id":"prometheus-node43","name":"weixiang-prometheus-node43","address":"10.1.12.4","port":9100,"tags":["node-exporter"],"checks": [{"http":"http://10.1.12.4:9100","interval":"5m"}]}' http://10.1.12.4:8500/v1/agent/service/register 彩蛋: 注销服务 curl -X PUT http://10.1.12.4:8500/v1/agent/service/deregister/prometheus-node42 2.8 验证服务是否生效 http://43.139.77.96:9090/targets

image

bash
3.prometheus 31节点配置联邦模式实战 3.1 修改配置文件 [root@promethues-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-prometheus-federate-32" metrics_path: "/federate" # 用于解决标签的冲突问题,有效值为: true和false,默认值为false # 当设置为true时,将保留抓取的标签以忽略服务器自身的标签。说白了会覆盖原有标签。 # 当设置为false时,则不会覆盖原有标签,而是在标点前加了一个"exported_"前缀。 honor_labels: true params: "match[]": - '{job="promethues"}' - '{__name__=~"job:.*"}' - '{__name__=~"node.*"}' static_configs: - targets: - "10.1.20.5:9090" - job_name: "weixiang-prometheus-federate-33" metrics_path: "/federate" honor_labels: true params: "match[]": - '{job="promethues"}' - '{__name__=~"job:.*"}' - '{__name__=~"node.*"}' static_configs: - targets: - "10.1.24.4:9090" [root@promethues-server31 ~]# 3.2 检查配置文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 3.3 热加载配置文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl -X POST 10.1.24.13:9090/-/reload [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 3.4 验证服务是否生效 http://106.55.44.37:9090/

image

bash
基于如下的PromQL查询: node_cpu_seconds_total{job=~"weixiang.*sd.*"}[20s] 3.5 grafana导入模板ID 1860

image

image

3、prometheus监控consul应用
bash
1.下载consul exporter wget https://github.com/prometheus/consul_exporter/releases/download/v0.13.0/consul_exporter-0.13.0.linux-amd64.tar.gz svip: [root@prometheus-server33 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/Consul/consul_exporter/consul_exporter-0.13.0.linux-amd64.tar.gz 2.解压软件包 [root@prometheus-server33 ~]# tar xf consul_exporter-0.13.0.linux-amd64.tar.gz -C /usr/local/bin/ consul_exporter-0.13.0.linux-amd64/consul_exporter --strip-components=1 [root@prometheus-server33 ~]# [root@prometheus-server33 ~]# ll /usr/local/bin/consul_exporter -rwxr-xr-x 1 1001 1002 19294344 Nov 6 21:38 /usr/local/bin/consul_exporter* [root@prometheus-server33 ~]# 3.运行consul_exporter [root@prometheus-server33 ~]# consul_exporter --consul.server="http://10.0.0.41:8500" --web.telemetry-path="/metrics" --web.listen-address=:9107 4.访问console_exporter的webUI http://10.0.0.33:9107/metrics 5.prometheus server监控consul_exporter [root@promethues-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "weixiang-consul-exporter" static_configs: - targets: - 10.0.0.33:9107 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@promethues-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 6.热加载配置 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl -X POST 10.0.0.31:9090/-/reload [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 7.验证配置是否生效 http://10.0.0.31:9090/targets 8.Grafana导入模板ID 8919

image

4、联邦模式采集nginx
bash
1 编译安装nginx 1.1 安装编译工具 [root@node-exporter41 ~]# cat /etc/apt/sources.list # 默认注释了源码镜像以提高 apt update 速度,如有需要可自行取消注释 deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy main restricted universe multiverse deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse # 以下安全更新软件源包含了官方源与镜像站配置,如有需要可自行修改注释切换 deb http://security.ubuntu.com/ubuntu/ jammy-security main restricted universe multiverse # deb-src http://security.ubuntu.com/ubuntu/ jammy-security main restricted universe multiverse # 预发布软件源,不建议启用 # deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-proposed main restricted universe multiverse # # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-proposed main restricted universe multiverse [root@node-exporter41 ~]# [root@node-exporter41 ~]# apt update [root@node-exporter41 ~]# [root@node-exporter41 ~]# apt -y install git wget gcc make zlib1g-dev build-essential libtool openssl libssl-dev 参考链接: https://mirrors.tuna.tsinghua.edu.cn/help/ubuntu/ 1.2 克隆nginx-module-vts模块 git clone https://gitee.com/jasonyin2020/nginx-module-vts.git 1.3 下载nginx软件包 wget https://nginx.org/download/nginx-1.28.0.tar.gz 1.4 解压nginx tar xf nginx-1.28.0.tar.gz 1.5 配置nginx cd nginx-1.28.0/ ./configure --prefix=/weixiang/softwares/nginx --with-http_ssl_module --with-http_v2_module --with-http_realip_module --without-http_rewrite_module --with-http_stub_status_module --without-http_gzip_module --with-file-aio --with-stream --with-stream_ssl_module --with-stream_realip_module --add-module=/root/nginx-module-vts 1.6 编译并安装nginx make -j 2 && make install 1.7 修改nginx的配置文件 vim /weixiang/softwares/nginx/conf/nginx.conf ... http { vhost_traffic_status_zone; upstream weixiang-promethues { server 10.0.0.31:9090; } ... server { ... 
location / { root html; # index index.html index.htm; proxy_pass http://weixiang-promethues; } location /status { vhost_traffic_status_display; vhost_traffic_status_display_format html; } } } 1.8 检查配置文件语法 /weixiang/softwares/nginx/sbin/nginx -t 1.9 启动nginx /weixiang/softwares/nginx/sbin/nginx 1.10 访问nginx的状态页面 http://118.89.55.174/status/format/prometheus 2 安装nginx-vtx-exporter 2.1 下载nginx-vtx-exporter wget https://github.com/sysulq/nginx-vts-exporter/releases/download/v0.10.8/nginx-vtx-exporter_0.10.8_linux_amd64.tar.gz SVIP: wget http://192.168.21.253/Resources/Prometheus/softwares/nginx_exporter/nginx-vtx-exporter_0.10.8_linux_amd64.tar.gz 2.2 解压软件包到path路径 [root@node-exporter42 ~]# tar xf nginx-vtx-exporter_0.10.8_linux_amd64.tar.gz -C /usr/local/bin/ nginx-vtx-exporter [root@node-exporter42 ~]# [root@node-exporter42 ~]# ll /usr/local/bin/nginx-vtx-exporter -rwxr-xr-x 1 1001 avahi 7950336 Jul 11 2023 /usr/local/bin/nginx-vtx-exporter* [root@node-exporter42 ~]# 2.3 运行nginx-vtx-exporter [root@node-exporter42 ~]# nginx-vtx-exporter -nginx.scrape_uri=http://10.0.0.41/status/format/json 这是 10.1.20.5 上 prometheus.yml 的配置: # 全局配置 global: scrape_interval: 15s # 设置默认的抓取间隔为15秒 evaluation_interval: 15s # 设置默认的告警规则评估间隔为15秒 # scrape_timeout 默认是 10s # 告警管理器(Alertmanager)的配置,如果暂时不用可以忽略 alerting: alertmanagers: - static_configs: - targets: # - alertmanager:9093 # 抓取配置 (这是核心部分) scrape_configs: # Prometheus自身的监控 - job_name: 'prometheus' static_configs: - targets: ['localhost:9090'] # ========================================================== # 【新增】Nginx 监控的配置 # ========================================================== - job_name: 'nginx' # 使用 static_configs,因为你的 exporter 地址是固定的 static_configs: - targets: ['10.1.12.3:9913'] # <-- 你的 nginx-vtx-exporter 的地址和端口 labels: # 添加一些标签,方便查询和区分。非常重要! instance: 'nginx-server-10.1.12.15' # 标记这个数据是来自哪个Nginx实例 group: 'production' # 比如标记为生产环境 app: 'nginx' 配置联邦模式 # ... global 和 alerting 配置 ... scrape_configs: - job_name: 'prometheus' static_configs: - targets: ['localhost:9090'] - job_name: 'nginx' static_configs: - targets: ['10.1.12.3:9913'] labels: instance: 'nginx-server-10.1.12.15' # 重要:添加一个标签来区分这个数据源于哪个局部Prometheus cluster: 'cluster-A' # 假设你还有一个Tomcat Exporter - job_name: 'tomcat' static_configs: - targets: ['10.1.12.4:9404'] # 假设Tomcat Exporter在10.1.12.4上 labels: instance: 'tomcat-app-1' cluster: 'cluster-A' 10.1.24.13 上的 prometheus.yml # ... global 和 alerting 配置 ... scrape_configs: # 这个job用于从其他Prometheus实例拉取数据 - job_name: 'federate' scrape_interval: 60s # 联邦模式的采集间隔通常更长 honor_labels: true # 非常重要!保留从局部Prometheus拉取过来的标签(如job, instance) metrics_path: /federate # 指定联邦模式的端点 # params 用于过滤需要拉取的时间序列 # match[] 是一个选择器,这里我们选择拉取所有非Prometheus自身监控的指标 params: 'match[]': - '{job!="prometheus"}' # 不拉取局部Prometheus自身的指标 - '{__name__=~"nginx_.*|tomcat_.*"}' # 或者更精确地,只拉取你关心的指标 static_configs: - targets: - '10.1.20.5:9090' # 局部Prometheus A - '10.1.24.4:9090' # 局部Prometheus B
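
exporter 启动后,可以分别验证 nginx 的 vts 状态接口和 exporter 的指标接口是否正常(以下命令为示例,9913 为上文 exporter 的监听端口,指标名以实际输出为准)。

bash
# 确认nginx的vts模块能输出JSON(nginx-vtx-exporter依赖该接口)
curl -s http://10.0.0.41/status/format/json | head -c 200; echo

# 确认exporter已暴露以nginx开头的指标
curl -s http://10.1.12.3:9913/metrics | grep -c '^nginx'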

16、Grafana配置MySQL作为数据源

bash
[root@node-exporter43 ~]# docker run --name mysql-server \ > -p 3306:3306 \ > -e MYSQL_ROOT_PASSWORD=Sdms2018 \ > -e MYSQL_DATABASE=prometheus \ > -e MYSQL_USER=weixiang98 \ > -e MYSQL_PASSWORD=yinzhengjie \ > -v mysql-server-data:/var/lib/mysql \ > -d mysql:8.0.36-oracle \ > --default-authentication-plugin=mysql_native_password 4ed972eb108a6f143fe097ded438f635faa8c996969d5bac6da82934ed8515a2 1.查看MySQL数据库 [root@node-exporter43 ~]# ss -ntl | grep 3306 LISTEN 0 151 *:3306 *:* LISTEN 0 70 *:33060 *:* [root@node-exporter43 ~]# [root@node-exporter43 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 0204416718e1 mysql:8.0.36-oracle "docker-entrypoint.s…" 32 hours ago Up 8 hours mysql-server [root@node-exporter43 ~]# [root@node-exporter43 ~]# docker inspect mysql-server [ { "Id": "0204416718e13d2d68bc72f1d95371e3a9d14c1ce4f1d6f366fcaf9f3eea7ceb", ... "Env": [ "MYSQL_DATABASE=prometheus", "MYSQL_USER=weixiang98", "MYSQL_PASSWORD=yinzhengjie", "MYSQL_ALLOW_EMPTY_PASSWORD=yes", ... ], ... [root@node-exporter43 ~]# docker exec -it mysql-server mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 8 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> SHOW DATABASES; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | prometheus | | sys | +--------------------+ 5 rows in set (0.71 sec) mysql> mysql> USE prometheus Database changed mysql> SHOW TABLES; Empty set (0.01 sec) mysql> 2.修改Grafana的配置文件 [root@promethues-server31 ~]# vim /etc/grafana/grafana.ini ... [database] ... type = mysql host = 10.0.0.43:3306 name = prometheus user = weixiang98 password = yinzhengjie 扩展知识: [security] ... # 在首次启动 Grafana 时禁用管理员用户创建,说白了,就是不创建管理员用户(admin)。 ;disable_initial_admin_creation = false # 默认管理员用户,启动时创建,可以修改,若不指定,则默认为admin。 ;admin_user = admin # 指定默认的密码。 ;admin_password = admin # 默认的邮箱地址。 ;admin_email = admin@localhost 3.重启Grafana使得配置生效 [root@promethues-server31 ~]# systemctl restart grafana-server.service [root@promethues-server31 ~]# [root@promethues-server31 ~]# ss -ntl | grep 3000 LISTEN 0 4096 *:3000 *:* [root@promethues-server31 ~]# 4.验证MySQL [root@node-exporter43 ~]# docker exec -it mysql-server mysql prometheus Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 19 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. 
mysql> SELECT DATABASE(); +------------+ | DATABASE() | +------------+ | prometheus | +------------+ 1 row in set (0.00 sec) mysql> SHOW TABLES; +-------------------------------+ | Tables_in_prometheus | +-------------------------------+ | alert | | alert_configuration | | alert_configuration_history | | alert_image | | alert_instance | | alert_notification | | alert_notification_state | | alert_rule | | alert_rule_tag | | alert_rule_version | | annotation | | annotation_tag | | api_key | | builtin_role | | cache_data | | correlation | | dashboard | | dashboard_acl | | dashboard_provisioning | | dashboard_public | | dashboard_public_email_share | | dashboard_public_magic_link | | dashboard_public_session | | dashboard_public_usage_by_day | | dashboard_snapshot | | dashboard_tag | | dashboard_usage_by_day | | dashboard_usage_sums | | dashboard_version | | data_keys | | data_source | | data_source_acl | | data_source_cache | | data_source_usage_by_day | | entity_event | | file | | file_meta | | folder | | kv_store | | library_element | | library_element_connection | | license_token | | login_attempt | | migration_log | | ngalert_configuration | | org | | org_user | | permission | | playlist | | playlist_item | | plugin_setting | | preferences | | provenance_type | | query_history | | query_history_star | | quota | | recording_rules | | remote_write_targets | | report | | report_dashboards | | report_settings | | role | | secrets | | seed_assignment | | server_lock | | session | | setting | | short_url | | star | | tag | | team | | team_group | | team_member | | team_role | | temp_user | | test_data | | user | | user_auth | | user_auth_token | | user_dashboard_views | | user_role | | user_stats | +-------------------------------+ 82 rows in set (0.00 sec) mysql>

image

1、基于blackbox检测HTTPS证书有效期
bash
cat > blackbox.yml <<'EOF' modules: http_ssl_check: prober: http timeout: 10s http: # 我们不需要特别的http配置,但tls_config是最佳实践 # 它会验证证书链是否有效 tls_config: insecure_skip_verify: false EOF docker run -d \ --name blackbox-exporter \ -p 9115:9115 \ -v "$(pwd)/blackbox.yml:/config/blackbox.yml" \ prom/blackbox-exporter:latest \ --config.file=/config/blackbox.yml [root@node-exporter42 Blackbox]# curl 'http://10.1.12.3:9115/probe?module=http_ssl_check&target=https://kubernetes.io'

如果你看到一大堆以 probe_ 开头的 Prometheus 指标,特别是 probe_ssl_earliest_cert_expiry,并且 probe_success 的值为 1,那就说明 Blackbox Exporter 已经成功运行了。

bash
[root@prometheus-server32 prometheus-2.53.4.linux-amd64]# ./promtool check config /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml Checking /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml SUCCESS: /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml is valid prometheus config file syntax [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# curl -X POST 10.1.20.5:9090/-/reload [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# curl cip.cc
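
拿到 probe_ssl_earliest_cert_expiry 之后,通常会再配一条告警规则,在证书临近过期时提前通知。下面是一个示例规则(文件名、阈值均为假设,需在 prometheus.yml 的 rule_files 中引用后才会生效)。

bash
# ssl_cert_rules.yml(示例文件名)
groups:
- name: ssl-cert-rules
  rules:
  - alert: SSLCertExpiringSoon
    # 证书最早过期时间距离当前时间不足30天(单位: 秒)
    expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 30
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }} 的HTTPS证书将在30天内过期,请及时续签"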

17、pushgateway

1.什么是pushgateway
bash
pushgateway用于接收由脚本、定时任务等短生命周期任务主动推送(push)的指标,Prometheus再定期从pushgateway拉取这些数据,说白了,就是用来实现自定义监控。

2.部署pushgateway
wget https://github.com/prometheus/pushgateway/releases/download/v1.11.1/pushgateway-1.11.1.linux-amd64.tar.gz

SVIP:
[root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/pushgateway/pushgateway-1.11.1.linux-amd64.tar.gz

3.解压软件包
[root@node-exporter43 ~]# tar xf pushgateway-1.11.1.linux-amd64.tar.gz -C /usr/local/bin/ pushgateway-1.11.1.linux-amd64/pushgateway --strip-components=1
[root@node-exporter43 ~]# ll /usr/local/bin/pushgateway
-rwxr-xr-x 1 1001 1002 21394840 Apr  9 21:24 /usr/local/bin/pushgateway*

4.运行pushgateway
[root@node-exporter43 ~]# pushgateway --web.telemetry-path="/metrics" --web.listen-address=:9091 --persistence.file=/weixiang/data/pushgateway.data

5.访问pushgateway的WebUI
http://134.175.108.235:9091/#

6.模拟直播在线人数统计
6.1 使用curl工具推送测试数据到pushgateway
[root@node-exporter42 ~]# echo "student_online 35" | curl --data-binary @- http://10.1.12.4:9091/metrics/job/weixiang_student/instance/10.1.12.3

image

image

bash
6.2 pushgateway查询数据是否上传成功 [root@node-exporter43 ~]# curl -s http://10.1.12.4:9091/metrics | grep student_online # TYPE student_online untyped student_online{instance="10.1.12.3",job="weixiang_student"} 35 [root@node-exporter43 ~]# 7.Prometheus server监控pushgateway 7.1 修改prometheus的配置文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "weixiang-pushgateway" honor_labels: true params: static_configs: - targets: - 10.1.12.4:9091 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 7.2 热加载配置 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST 10.1.24.13:9090/-/reload [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 7.3 验证配置是否生效 http://10.0.0.31:9090/targets?search=

image

bash
7.4 查询特定指标 student_online 8.Grafana出图展示

image

bash
9.模拟直播间人数的变化 [root@node-exporter42 ~]# echo "student_online $RANDOM" | curl --data-binary @- http://10.1.12.4:9091/metrics/job/weixiang_student/instance/10.1.12.3 [root@node-exporter42 ~]# echo "student_online $RANDOM" | curl --data-binary @- http://10.1.12.4:9091/metrics/job/weixiang_student/instance/10.1.12.3

image
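
需要注意,pushgateway 中的数据不会自动过期,即使目标不再推送,旧值也会一直被 Prometheus 抓到。测试完成后可以用其 DELETE 接口清理指定分组的数据(示例如下)。

bash
# 删除job=weixiang_student、instance=10.1.12.3分组下的全部指标
curl -X DELETE http://10.1.12.4:9091/metrics/job/weixiang_student/instance/10.1.12.3

# 再次查询,确认指标已被清理
curl -s http://10.1.12.4:9091/metrics | grep student_online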

2.使用pushgateway监控TCP的十二种状态。
bash
Prometheus监控TCP的12种状态 1.监控TCP的12种状态 [root@node-exporter42 ~]# cat /usr/local/bin/tcp_status.sh #!/bin/bash pushgateway_url="http://10.1.12.4:9091/metrics/job/tcp_status" state="SYN-SENT SYN-RECV FIN-WAIT-1 FIN-WAIT-2 TIME-WAIT CLOSE CLOSE-WAIT LAST-ACK LISTEN CLOSING ESTAB UNKNOWN" for i in $state do count=`ss -tan |grep $i |wc -l` echo tcp_connections{state=\""$i"\"} $count >> /tmp/tcp.txt done; cat /tmp/tcp.txt | curl --data-binary @- $pushgateway_url rm -rf /tmp/tcp.txt [root@node-exporter42 ~]# 2.调用脚本 [root@node-exporter42 ~]# bash /usr/local/bin/tcp_status.sh [root@node-exporter42 ~]# 3.Prometheus查询数据 tcp_connections

image
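
脚本手动执行一次只会推送一次数据,生产环境中一般交给 crontab 周期调度,让 pushgateway 中的 TCP 状态持续刷新(下面的计划任务仅为示例,周期可按需调整)。

bash
# 每分钟采集并推送一次TCP连接状态
[root@node-exporter42 ~]# crontab -l
* * * * * /bin/bash /usr/local/bin/tcp_status.sh >/dev/null 2>&1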

bash
配置变量:
    变量名: state
    类型: Custom
    可选值: SYN-SENT,SYN-RECV,FIN-WAIT-1,FIN-WAIT-2,TIME-WAIT,CLOSE,CLOSE-WAIT,LAST-ACK,LISTEN,CLOSING,ESTAB,UNKNOWN

image

bash
4.Grafana出图展示 参考PromQL: tcp_connections{state="${state}"}


image

image

image

image

3.SRE运维开发实现自定义的exporter
bash
1.使用python程序自定义exporter案例 1.1 安装pip3工具包 [root@prometheus-server33 ~]# apt update [root@prometheus-server33 ~]# apt install -y python3-pip 1.2 安装实际环境中相关模块库 [root@prometheus-server33 ~]# pip3 install flask prometheus_client -i https://mirrors.aliyun.com/pypi/simple 1.3 编写代码 [root@prometheus-server33 ~]# cat > flask_metric.py <<'EOF' #!/usr/bin/python3 # auther: Jason Yin # blog: https://www.cnblogs.com/yinzhengjie/ from prometheus_client import start_http_server,Counter, Summary from flask import Flask, jsonify from wsgiref.simple_server import make_server import time app = Flask(__name__) # Create a metric to track time spent and requests made REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request') COUNTER_TIME = Counter("request_count", "Total request count of the host") @app.route("/apps") @REQUEST_TIME.time() def requests_count(): COUNTER_TIME.inc() return jsonify({"office": "https://www.weixiang.com"},{"auther":"Jason Yin"}) if __name__ == "__main__": print("启动老男孩教育程序: weixiang-linux-python-exporter, 访问路径: http://0.0.0.0:8001/apps,监控服务: http://0.0.0.0:8000") start_http_server(8000) httpd = make_server( '0.0.0.0', 8001, app ) httpd.serve_forever() EOF 1.4 启动python程序 [root@node-exporter42 ~]# python3 flask_metric.py 启动老男孩教育程序: weixiang-linux-python-exporter, 访问路径: http://0.0.0.0:8001/apps,监控服务: http://0.0.0.0:8000 1.5 客户端测试 [root@node-exporter43 ~]# cat > weixiang_curl_metrics.sh <<'EOF' #!/bin/bash URL=http://10.1.24.13:8001/apps while true;do curl_num=$(( $RANDOM%50+1 )) sleep_num=$(( $RANDOM%5+1 )) for c_num in `seq $curl_num`;do curl -s $URL &> /dev/null done sleep $sleep_num done EOF [root@node-exporter43 ~]# bash weixiang_curl_metrics.sh # 可以看到数据

image
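
压测脚本跑起来之后,可以直接访问 8000 端口确认自定义指标已经生成,指标名来自上面代码中 Counter 和 Summary 的定义(示例如下)。

bash
[root@node-exporter43 ~]# curl -s http://10.1.24.13:8000/metrics | egrep 'request_count_total|request_processing_seconds'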

bash
2.prometheus监控python自定义的exporter实战 2.1 编辑配置文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie_python_custom_metrics" static_configs: - targets: - 10.1.24.13:8000 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.2 检查配置文件语法 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.3 重新加载配置文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST http://10.0.0.31:9090/-/reload [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.4 验证prometheus是否采集到数据 http://10.0.0.31:9090/targets

image

bash
2.5 grafana作图展示 request_count_total 老男孩教育apps请求总数。 increase(request_count_total{job="yinzhengjie_python_custom_metrics"}[1m]) 老男孩教育每分钟请求数量曲线QPS。 irate(request_count_total{job="yinzhengjie_python_custom_metrics"}[1m]) 老男孩教育每分钟请求量变化率曲线 request_processing_seconds_sum{job="yinzhengjie_python_custom_metrics"} / request_processing_seconds_count{job="yinzhengjie_python_custom_metrics"} 老男孩教育每分钟请求处理平均耗时

image

image

18、Alertmanager

1、Alertmanager环境部署及子路由配置
bash
- Alertmanager环境部署及子路由配置 1.什么是altermanager Alertmanager是一款开源的告警工具包,可以和Prometheus集成。 2.下载Alertmanager wget https://github.com/prometheus/alertmanager/releases/download/v0.28.1/alertmanager-0.28.1.linux-amd64.tar.gz SVIP: [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/Alertmanager/alertmanager-0.28.1.linux-amd64.tar.gz 3.解压安装包 [root@node-exporter43 ~]# tar xf alertmanager-0.28.1.linux-amd64.tar.gz -C /usr/local/ [root@node-exporter43 ~]# 4.修改Alertmanager的配置文件 [root@node-exporter43 ~]# cd /usr/local/alertmanager-0.28.1.linux-amd64/ [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ll total 67932 drwxr-xr-x 2 1001 1002 4096 Mar 7 23:08 ./ drwxr-xr-x 12 root root 4096 Aug 7 11:17 ../ -rwxr-xr-x 1 1001 1002 38948743 Mar 7 23:06 alertmanager* -rw-r--r-- 1 1001 1002 356 Mar 7 23:07 alertmanager.yml -rwxr-xr-x 1 1001 1002 30582387 Mar 7 23:06 amtool* -rw-r--r-- 1 1001 1002 11357 Mar 7 23:07 LICENSE -rw-r--r-- 1 1001 1002 311 Mar 7 23:07 NOTICE [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml # 通用配置 global: resolve_timeout: 5m smtp_from: '1360821977@qq.com' smtp_smarthost: 'smtp.qq.com:465' smtp_auth_username: '1360821977@qq.com' smtp_auth_password: 'vsqtbsnvfgnobabd' smtp_require_tls: false smtp_hello: 'qq.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' send_resolved: true - to: '1304871040@qq.com' send_resolved: true - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true - to: '2011014877@qq.com' send_resolved: true - name: 'sre_system' email_configs: - to: '3220434114@qq.com' send_resolved: true - to: '2825483220@qq.com' send_resolved: true [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# 5.检查配置文件语法 [root@prometheus-server33 alertmanager-0.28.1.linux-amd64]# ./amtool check-config alertmanager.yml Checking 'alertmanager.yml' SUCCESS Found: - global config - route - 0 inhibit rules - 3 receivers - 0 templates [root@prometheus-server33 alertmanager-0.28.1.linux-amd64]# 6.启动Alertmanager服务 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./alertmanager 7.访问Alertmanager的WebUI http://134.175.108.235:9093/#/status

image
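
上面是前台启动 Alertmanager,便于观察日志;若需要常驻运行,可以参考如下 systemd 单元文件(路径取自上文的解压目录,仅为示例)。

bash
cat > /etc/systemd/system/alertmanager.service <<'EOF'
[Unit]
Description=Prometheus Alertmanager
After=network.target

[Service]
WorkingDirectory=/usr/local/alertmanager-0.28.1.linux-amd64
ExecStart=/usr/local/alertmanager-0.28.1.linux-amd64/alertmanager \
  --config.file=/usr/local/alertmanager-0.28.1.linux-amd64/alertmanager.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload && systemctl enable --now alertmanager.service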

bash
- Prometheus server集成Alertmanager实现告警功能 1. 修改配置文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# egrep -v "^#|^$" prometheus.yml global: scrape_interval: 3s evaluation_interval: 3s ... alerting: alertmanagers: - static_configs: - targets: - 10.0.0.43:9093 rule_files: - "weixiang-linux-rules.yml" ... scrape_configs: ... - job_name: "yinzhengjie_dba_exporter" static_configs: - targets: ["10.0.0.41:9100"] - job_name: "yinzhengjie_k8s_exporter" static_configs: - targets: ["10.0.0.42:9100"] - job_name: "yinzhengjie_bigdata_exporter" static_configs: - targets: ["10.0.0.43:9100"] [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2 修改告警规则 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# cat > weixiang-linux-rules.yml <<'EOF' groups: - name: weixiang-linux-rules-alert rules: - alert: weixiang-dba_exporter-alert expr: up{job="yinzhengjie_dba_exporter"} == 0 for: 1s labels: school: weixiang class: weixiang98 apps: dba annotations: summary: "{{ $labels.instance }} 数据库实例已停止运行超过 3s!" - alert: weixiang-k8s_exporter-alert expr: up{job="yinzhengjie_k8s_exporter"} == 0 for: 1s labels: school: weixiang class: weixiang98 apps: k8s annotations: summary: "{{ $labels.instance }} K8S服务器已停止运行超过 3s!" - alert: weixiang-bigdata_exporter-alert expr: up{job="yinzhengjie_bigdata_exporter"} == 0 for: 1s labels: school: weixiang class: weixiang98 apps: bigdata annotations: summary: "{{ $labels.instance }} 大数据服务器已停止运行超过 5s!" EOF 3.检查配置文件语法 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: 1 rule files found SUCCESS: prometheus.yml is valid prometheus config file syntax Checking weixiang-linux-rules.yml SUCCESS: 3 rules found [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 4.重新加载prometheus的配置 curl -X POST http://10.0.0.31:9090/-/reload 5.查看prometheus server的WebUI验证是否生效 http://106.55.44.37:9090/config

image

bash
http://106.55.44.37:9090/targets?search=

image

bash
http://106.55.44.37:9090/alerts?search=

image

bash
6.触发告警功能
[root@node-exporter41 ~]# systemctl stop node-exporter.service
[root@node-exporter41 ~]# ss -ntl | grep 9100
[root@node-exporter41 ~]#

[root@node-exporter42 ~]# systemctl stop node-exporter.service
[root@node-exporter42 ~]# ss -ntl | grep 9100
[root@node-exporter42 ~]#

[root@node-exporter43 ~]# systemctl stop node-exporter.service
[root@node-exporter43 ~]# ss -ntl | grep 9100
[root@node-exporter43 ~]#

7.查看Alertmanager的WebUI及邮箱接收者
http://106.55.44.37:9090/alerts?search=

# 已经变红了,邮箱能收到告警邮件

image

image

image

bash
# 恢复业务 [root@node-exporter41 ~]# systemctl start node-exporter.service [root@node-exporter42 ~]# ss -ntl | grep 9100 LISTEN 0 4096 *:9100 *:* [root@node-exporter42 ~]# systemctl start node-exporter.service [root@node-exporter42 ~]# ss -ntl | grep 9100 LISTEN 0 4096 *:9100 *:*

image

image

2、alertmanager自定义告警模板
bash
1 告警模板介绍 默认的告警信息界面有些简单,可以借助告警的模板信息,对告警信息进行丰富,需要借助于Alertmanager的模板功能来实现。 告警模板的使用流程如下: - 分析关键信息 - 定制模板内容 - Alertmanager加载模板文件 - 告警信息使用模板内容属性 模板文件使用标准Go模板语法,并暴露一些包含时间标签和值的变量。 - 标签引用: {{ $label.<label_name> }} - 指标样本值引用: {{ $value }} 为了显式效果,需要了解一些html相关技术,参考链接: https://www.w3school.com.cn/html/index.asp 2 altertmanger节点自定义告警模板参考案例 2.1 创建邮件模板文件工作目录 [root@prometheus-server43 alertmanager-0.28.1.linux-amd64]# mkdir -pv /weixiang/softwares/alertmanager/tmpl 2.2 创建模板实例,工作中可以考虑嵌入公司的logo [root@prometheus-server43 alertmanager-0.28.1.linux-amd64]# cat > /weixiang/softwares/alertmanager/tmpl/email.tmpl <<'EOF' {{ define "weixiang.html" }} <h1>老男孩IT教育欢迎您: https://www.weixiang.com/</h1> <table border="1"> <tr> <th>报警项</th> <th>实例</th> <th>报警阀值</th> <th>开始时间</th> </tr> {{ range $i, $alert := .Alerts }} <tr> <td>{{ index $alert.Labels "alertname" }}</td> <td>{{ index $alert.Labels "instance" }}</td> <td>{{ index $alert.Annotations "value" }}</td> <td>{{ $alert.StartsAt }}</td> </tr> {{ end }} </table> <img src="https://www.weixiang.com/static/images/header/logo.png"> {{ end }} EOF 2.3 alertmanager引用自定义模板文件 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml # 通用配置 global: resolve_timeout: 5m smtp_from: '13949913771@163.com' smtp_smarthost: 'smtp.163.com:465' smtp_auth_username: '13949913771@163.com' smtp_auth_password: 'UGTMVNtb2Xup2St4' smtp_require_tls: false smtp_hello: '163.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang.html" . }}' send_resolved: true - to: '1304871040@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang.html" . }}' - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang.html" . }}' - to: '2011014877@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang.html" . }}' - name: 'sre_system' email_configs: - to: '3220434114@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang.html" . }}' - to: '2825483220@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang.html" . 
}}' # 加载模板 templates: - '/weixiang/softwares/alertmanager/tmpl/*.tmpl' 2.4 alertmanager语法检查 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./amtool check-config ./alertmanager.yml Checking './alertmanager.yml' SUCCESS Found: - global config - route - 0 inhibit rules - 3 receivers - 1 templates SUCCESS [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# 2.5 重启Alertmanager程序 [root@prometheus-server33 alertmanager-0.28.1.linux-amd64]# ./alertmanager [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./alertmanager time=2025-08-07T06:50:21.818Z level=INFO source=main.go:191 msg="Starting Alertmanager" version="(version=0.28.1, branch=HEAD, revision=b2099eaa2c9ebc25edb26517cb9c732738e93910)" time=2025-08-07T06:50:21.818Z level=INFO source=main.go:192 msg="Build context" build_context="(go=go1.23.7, platform=linux/amd64, user=root@fa3ca569dfe4, date=20250307-15:05:18, tags=netgo)" time=2025-08-07T06:50:21.821Z level=INFO source=cluster.go:185 msg="setting advertise address explicitly" component=cluster addr=10.1.12.4 port=9094 time=2025-08-07T06:50:21.825Z level=INFO source=cluster.go:674 msg="Waiting for gossip to settle..." component=cluster interval=2s time=2025-08-07T06:50:21.854Z level=INFO source=coordinator.go:112 msg="Loading configuration file" component=configuration file=alertmanager.yml time=2025-08-07T06:50:21.855Z level=INFO source=coordinator.go:125 msg=Completed loading of configuration f 2.6 查看WebUi观察配置是否生效 http://134.175.108.235:9093/#/status 2.7 再次出发告警配置 停下服务再启动 2.8 如果value取不到值,可以考虑修改告警规则添加value字段即可(并重启服务) [root@promethues-server31 prometheus-2.53.4.linux-amd64]# cat weixiang-linux-rules.yml groups: - name: weixiang-linux-rules-alert rules: - alert: weixiang-dba_exporter-alert expr: up{job="yinzhengjie_dba_exporter"} == 0 for: 1s labels: school: weixiang class: weixiang98 apps: dba annotations: summary: "{{ $labels.instance }} 数据库实例已停止运行超过 3s!" # 这里注解部分增加了一个value的属性信息,会从Prometheus的默认信息中获取阈值 value: "{{ $value }}" - alert: weixiang-k8s_exporter-alert expr: up{job="yinzhengjie_k8s_exporter"} == 0 for: 1s labels: school: weixiang class: weixiang98 apps: k8s annotations: summary: "{{ $labels.instance }} K8S服务器已停止运行超过 3s!" value: "{{ $value }}" - alert: weixiang-bigdata_exporter-alert expr: up{job="yinzhengjie_bigdata_exporter"} == 0 for: 1s labels: school: weixiang class: weixiang98 apps: bigdata annotations: summary: "{{ $labels.instance }} 大数据服务器已停止运行超过 5s!" value: "{{ $value }}" [root@promethues-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST http://10.1.24.13:9090/-/reload

image

image

bash
# 恢复 [root@node-exporter41 ~]# systemctl start node-exporter.service

image

image

image

3、自定义告警模板案例2
bash
- 自定义告警模板案例2 1.定义模板 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat /weixiang/softwares/alertmanager/tmpl/xixi.tmp {{ define "xixi" }} <!DOCTYPE html> <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <style type="text/css"> body { font-family: 'Helvetica Neue', Arial, sans-serif; line-height: 1.6; color: #333; max-width: 700px; margin: 0 auto; padding: 20px; background-color: #f9f9f9; } .alert-card { border-radius: 8px; padding: 20px; margin-bottom: 20px; box-shadow: 0 2px 10px rgba(0,0,0,0.1); } .alert-critical { background: linear-gradient(135deg, #FFF6F6 0%, #FFEBEB 100%); border-left: 5px solid #FF5252; } .alert-resolved { background: linear-gradient(135deg, #F6FFF6 0%, #EBFFEB 100%); border-left: 5px solid #4CAF50; } .alert-title { font-size: 18px; font-weight: bold; margin-bottom: 15px; display: flex; align-items: center; } .alert-icon { width: 24px; height: 24px; margin-right: 10px; } .alert-field { margin-bottom: 8px; display: flex; } .field-label { font-weight: bold; min-width: 80px; color: #555; } .field-value { flex: 1; } .timestamp { color: #666; font-size: 13px; margin-top: 15px; text-align: right; } .divider { height: 1px; background: #eee; margin: 15px 0; } </style> </head> <body> {{- if gt (len .Alerts.Firing) 0 -}} <div class="alert-header alert-critical"> 告警触发 - 请立即处理! </div> <div> <img src="https://img95.699pic.com/element/40114/9548.png_860.png" width="200px" height="200px"> </div> {{- range $index, $alert := .Alerts -}} <div class="alert-card alert-critical"> <div class="alert-field"> <span class="field-label">告警名称:</span> <span class="field-value">{{ .Labels.alertname }}</span> </div> <div class="alert-field"> <span class="field-label">告警级别:</span> <span class="field-value">{{ .Labels.severity }}</span> </div> <div class="alert-field"> <span class="field-label">目标机器:</span> <span class="field-value">{{ .Labels.instance }}</span> </div> <div class="alert-field"> <span class="field-label">告警摘要:</span> <span class="field-value">{{ .Annotations.summary }}</span> </div> <div class="alert-field"> <span class="field-label">触发时间:</span> <span class="field-value">{{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}</span> </div> {{- if .Annotations.description }} <div class="divider"></div> <div class="alert-field"> <span class="field-label">详细描述:</span> <span class="field-value">{{ .Annotations.description }}</span> </div> {{- end }} </div> {{- end }} {{- end }} {{- if gt (len .Alerts.Resolved) 0 -}} {{- range $index, $alert := .Alerts -}} <div class="alert-card alert-resolved"> <div class="alert-title"> 告警恢复通知 </div> <div> <img src="https://tse2-mm.cn.bing.net/th/id/OIP-C.n7AyZv_wWXqFCc1mtlGhFgHaHa?rs=1&pid=ImgDetMain" width="300" height="300"> </div> <div class="alert-field"> <span class="field-label">告警名称:</span> <span class="field-value">{{ .Labels.alertname }}</span> </div> <div class="alert-field"> <span class="field-label">目标机器:</span> <span class="field-value">{{ .Labels.instance }}</span> </div> <div class="alert-field"> <span class="field-label">告警摘要:</span> <span class="field-value">[ {{ .Annotations.summary }}] 此告警已经恢复~</span> </div> <div class="alert-field"> <span class="field-label">恢复时间:</span> <span class="field-value">{{ (.EndsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}</span> </div> {{- if .Annotations.description }} <div class="alert-field"> <span class="field-label">详细描述:</span> <span class="field-value">{{ .Annotations.description }}</span> 
</div> {{- end }} </div> {{- end }} {{- end }} </body> </html> {{ end }} [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# 2.引用模板 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml # 通用配置 global: resolve_timeout: 5m smtp_from: '13949913771@163.com' smtp_smarthost: 'smtp.163.com:465' smtp_auth_username: '13949913771@163.com' smtp_auth_password: 'UGTMVNtb2Xup2St4' smtp_require_tls: false smtp_hello: '163.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' html: '{{ template "xixi" . }}' send_resolved: true - to: '1304871040@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' html: '{{ template "xixi" . }}' - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' html: '{{ template "xixi" . }}' - to: '2011014877@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' html: '{{ template "xixi" . }}' - name: 'sre_system' email_configs: - to: '3220434114@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' html: '{{ template "xixi" . }}' - to: '2825483220@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' html: '{{ template "xixi" . }}' # 加载模板 templates: - '/weixiang/softwares/alertmanager/tmpl/*.tmpl' [root@node-exporter43 alertmanager-0.28.1.linux-amd64]#
4、自定义告警模板案例3
bash
- 自定义告警模板案例3 1.定义模板 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat /weixiang/softwares/alertmanager/tmpl/weixiang.tmpl {{ define "weixiang" }} <!DOCTYPE html> <html> <head> <title>{{ if eq .Status "firing" }}&#128680; 告警触发{{ else }}&#9989; 告警恢复{{ end }}</title> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <style> @font-face { font-family: "EmojiFont"; src: local("Apple Color Emoji"), local("Segoe UI Emoji"), local("Noto Color Emoji"); } :root { --color-critical: #ff4444; --color-warning: #ffbb33; --color-resolved: #00c851; --color-info: #33b5e5; } body { font-family: 'Segoe UI', system-ui, sans-serif, "EmojiFont"; line-height: 1.6; color: #333; max-width: 800px; margin: 20px auto; padding: 0 20px; } .header { text-align: center; padding: 30px; border-radius: 15px; margin-bottom: 30px; background: {{ if eq .Status "firing" }}#fff0f0{{ else }}#f0fff4{{ end }}; border: 2px solid {{ if eq .Status "firing" }}var(--color-critical){{ else }}var(--color-resolved){{ end }}; } .status-badge { padding: 8px 16px; border-radius: 20px; font-weight: bold; display: inline-block; } .alert-table { width: 100%; border-collapse: separate; border-spacing: 0; background: white; border-radius: 10px; overflow: hidden; box-shadow: 0 2px 6px rgba(0,0,0,0.1); margin: 20px 0; } .alert-table th { background: #f8f9fa; padding: 16px; text-align: left; width: 130px; border-right: 2px solid #e9ecef; } .alert-table td { padding: 16px; border-bottom: 1px solid #e9ecef; } .timeline { display: flex; justify-content: space-between; margin: 15px 0; } .timeline-item { flex: 1; text-align: center; padding: 10px; background: #f8f9fa; border-radius: 8px; margin: 0 5px; } .alert-image { text-align: center; margin: 30px 0; } .alert-image img { width: {{ if eq .Status "firing" }}140px{{ else }}100px{{ end }}; opacity: 0.9; transition: all 0.3s ease; } .emoji { font-family: "EmojiFont", sans-serif; font-size: 1.3em; } .severity-critical { color: var(--color-critical); } .severity-warning { color: var(--color-warning); } </style> </head> <body> <div class="header"> <h1> {{ if eq .Status "firing" }} <span class="emoji">&#128680;</span> 告警触发通知 {{ else }} <span class="emoji">&#9989;</span> 告警恢复通知 {{ end }} </h1> </div> {{ if eq .Status "firing" }} <!-- 告警触发内容 --> <table class="alert-table"> <tr> <th><span class="emoji">&#128683;</span> 告警名称</th> <td>{{ range .Alerts }}<span class="emoji">&#128227;</span> {{ .Labels.alertname }}{{ end }}</td> </tr> <tr> <th><span class="emoji">&#9888;&#65039;</span> 严重等级</th> <td class="severity-{{ range .Alerts }}{{ .Labels.severity }}{{ end }}"> {{ range .Alerts }}<span class="emoji">&#9210;</span> {{ .Labels.severity | toUpper }}{{ end }} </td> </tr> <tr> <th><span class="emoji">&#128346;</span> 触发时间</th> <td>{{ range .Alerts }}<span class="emoji">&#128337;</span> {{ .StartsAt.Format "2006-01-02 15:04:05" }}{{ end }}</td> </tr> </table> {{ else }} <!-- 告警恢复内容 --> <table class="alert-table"> <tr> <th><span class="emoji">&#128227;</span> 恢复告警</th> <td>{{ range .Alerts }}<span class="emoji">&#128272;</span> {{ .Labels.alertname }}{{ end }}</td> </tr> <tr> <th><span class="emoji">&#9203;</span> 持续时间</th> <td> {{ range .Alerts }} {{ .StartsAt.Format "15:04:05" }} - {{ .EndsAt.Format "15:04:05" }} ({{ .EndsAt.Sub .StartsAt | printf "%.0f" }} 分钟) {{ end }} </td> </tr> <tr> <th><span class="emoji">&#9989;</span> 恢复时间</th> <td>{{ range .Alerts }}<span class="emoji">&#128338;</span> {{ .EndsAt.Format "2006-01-02 15:04:05" }}{{ end }}</td> </tr> </table> {{ end }} <!-- 公共信息部分 --> 
<table class="alert-table"> <tr> <th><span class="emoji">&#128187;&#65039;</span> 实例信息</th> <td>{{ range .Alerts }}<span class="emoji">&#127991;</span> {{ .Labels.instance }}{{ end }}</td> </tr> <tr> <th><span class="emoji">&#128221;</span> 告警详情</th> <td>{{ range .Alerts }}<span class="emoji">&#128204;</span> {{ .Annotations.summary }}{{ end }}</td> </tr> <tr> <th><span class="emoji">&#128196;</span> 详细描述</th> <td>{{ range .Alerts }}<span class="emoji">&#128209;</span> {{ .Annotations.description }}{{ end }}</td> </tr> </table> <div class="alert-image"> {{ if eq .Status "firing" }} <img src="https://img95.699pic.com/element/40114/9548.png_860.png" alt="告警图标"> {{ else }} <img src="https://tse2-mm.cn.bing.net/th/id/OIP-C.n7AyZv_wWXqFCc1mtlGhFgHaHa?rs=1&pid=ImgDetMain" alt="恢复图标"> {{ end }} </div> <div class="timeline"> <div class="timeline-item"> <div class="emoji">&#128678; 当前状态</div> {{ range .Alerts }} <strong>{{ if eq .Status "firing" }}<span class="emoji">&#128293;</span> FIRING{{ else }}<span class="emoji">&#9989;</span> RESOLVED{{ end }}</strong> {{ end }} </div> <div class="timeline-item"> <div class="emoji">&#128204; 触发次数</div> <strong>{{ len .Alerts }} 次</strong> </div> </div> </body> </html> {{ end }} [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# 2.引用模板 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml # 通用配置 global: resolve_timeout: 5m smtp_from: '13949913771@163.com' smtp_smarthost: 'smtp.163.com:465' smtp_auth_username: '13949913771@163.com' smtp_auth_password: 'UGTMVNtb2Xup2St4' smtp_require_tls: false smtp_hello: '163.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' send_resolved: true - to: '1304871040@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - to: '2011014877@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - name: 'sre_system' email_configs: - to: '3220434114@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - to: '2825483220@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' # 加载模板 templates: - '/weixiang/softwares/alertmanager/tmpl/*.tmpl' [root@node-exporter43 alertmanager-0.28.1.linux-amd64]#
5、Alertmanager集成钉钉插件实现告警
bash
参考链接: https://github.com/timonwong/prometheus-webhook-dingtalk/ 0.注册钉钉账号并添加钉钉机器人 略,见视频。 1.部署钉钉插件 1.1 下载钉钉插件 wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v2.1.0/prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz svip: [root@node-exporter42 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/Alertmanager/prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz 1.2 解压文件 [root@node-exporter42 ~]# tar xf prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz -C /usr/local/ [root@node-exporter42 ~]# [root@node-exporter42 ~]# ll /usr/local/prometheus-webhook-dingtalk-2.1.0.linux-amd64/ total 18752 drwxr-xr-x 3 3434 3434 4096 Apr 21 2022 ./ drwxr-xr-x 11 root root 4096 Aug 7 15:54 ../ -rw-r--r-- 1 3434 3434 1299 Apr 21 2022 config.example.yml drwxr-xr-x 4 3434 3434 4096 Apr 21 2022 contrib/ -rw-r--r-- 1 3434 3434 11358 Apr 21 2022 LICENSE -rwxr-xr-x 1 3434 3434 19172733 Apr 21 2022 prometheus-webhook-dingtalk* [root@node-exporter42 ~]# 1.3 修改配置文件 [root@node-exporter42 ~]# cd /usr/local/prometheus-webhook-dingtalk-2.1.0.linux-amd64/ [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# cp config{.example,}.yml [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# ll total 18756 drwxr-xr-x 3 3434 3434 4096 Aug 7 15:54 ./ drwxr-xr-x 11 root root 4096 Aug 7 15:54 ../ -rw-r--r-- 1 3434 3434 1299 Apr 21 2022 config.example.yml -rw-r--r-- 1 root root 1299 Aug 7 15:54 config.yml drwxr-xr-x 4 3434 3434 4096 Apr 21 2022 contrib/ -rw-r--r-- 1 3434 3434 11358 Apr 21 2022 LICENSE -rwxr-xr-x 1 3434 3434 19172733 Apr 21 2022 prometheus-webhook-dingtalk* [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# cat config.yml # 也可以直接即可 targets: linux97: # 对应的是dingding的webhook url: https://oapi.dingtalk.com/robot/send?access_token=08462ff18a9c5e739b98a5d7a716408b4ccd8255d19a3b26ae6b8dcb90c73384 # 对应的是"加签"的值,复制过来即可 secret: "SECf5414b69dd0f8a3a72b0bb929cf9271ef061aaea4c60e270cb15deb127339e4b" [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]#

image

bash
1.4 启动钉钉插件 [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# ./prometheus-webhook-dingtalk --web.listen-address="10.1.12.3:8060" ... ts=2025-08-07T07:58:59.946Z caller=main.go:113 component=configuration msg="Webhook urls for prometheus alertmanager" urls=http://10.0.0.42:8060/dingtalk/weixiang98/send 2.Alertmanager集成钉钉插件 2.1 修改Alertmanager的配置文件 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml # 通用配置 global: resolve_timeout: 5m smtp_from: '13949913771@163.com' smtp_smarthost: 'smtp.163.com:465' smtp_auth_username: '13949913771@163.com' smtp_auth_password: 'UGTMVNtb2Xup2St4' smtp_require_tls: false smtp_hello: '163.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' send_resolved: true - to: '1304871040@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - to: '2011014877@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - name: 'sre_system' webhook_configs: # 指向的是Prometheus的插件地址 - url: 'http://10.0.0.42:8060/dingtalk/weixiang98/send' http_config: {} max_alerts: 0 send_resolved: true #email_configs: #- to: '3220434114@qq.com' # send_resolved: true # headers: { Subject: "[WARN] weixiang98报警邮件" } # # html: '{{ template "weixiang.html" . }}' # # html: '{{ template "xixi" . }}' # html: '{{ template "weixiang" . }}' #- to: '2825483220@qq.com' # send_resolved: true # headers: { Subject: "[WARN] weixiang98报警邮件" } # # html: '{{ template "weixiang.html" . }}' # # html: '{{ template "xixi" . }}' # html: '{{ template "weixiang" . }}' # 加载模板 templates: - '/weixiang/softwares/alertmanager/tmpl/*.tmpl' [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./amtool check-config alertmanager.yml Checking 'alertmanager.yml' SUCCESS Found: - global config - route - 0 inhibit rules - 3 receivers - 1 templates SUCCESS [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# 2.2 启动Alertmanager [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./alertmanager 3.测试告警验证

image

6、Alertmanager的告警静默(Silence)
bash
- Alertmanager的告警静默(Silence)
1.告警静默(Silence)
一般用于系统维护等预期内的操作,这类操作会触发告警但没有必要通知。
比如系统升级需要8h,在这8h过程中,就可以先对相关告警做静默处理。

2.实战案例
# 停止三个业务
[root@node-exporter41 ~]# systemctl stop node-exporter.service

# 查看界面

image

bash
根据标签的key/value进行静默,只要带有这个标签的告警都会被静默。

image

image

image

image

bash
# 开启服务 [root@node-exporter43 ~]# systemctl start node-exporter.service # 钉钉收不到通知消息,静默测试成功

设置静默过期

image

bash
通知已收到

image
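
除了在 WebUI 上手动点击 Silence,也可以用 Alertmanager 自带的 amtool 命令行创建和查询静默,适合写进变更脚本里(以下命令为示例,匹配条件与时长按实际维护窗口调整)。

bash
[root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./amtool silence add apps=dba \
  --alertmanager.url=http://134.175.108.235:9093 \
  --duration=8h --author=sre --comment="系统升级维护窗口"

# 查询当前生效的静默
[root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./amtool silence query --alertmanager.url=http://134.175.108.235:9093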

7、Alertmanager的告警抑制(inhibit)
bash
- Alertmanager的告警抑制(inhibit) 1.什么是告警抑制 说白了,就是抑制告警,和静默不同的是,抑制的应用场景一般用于抑制符合条件的告警。 举个例子: 一个数据中心有800台服务器,每台服务器有50个监控项,假设一个意味着有4w个监控告警。 如果数据中心断电,理论上来说就会有4w条告警发送到你的手机,你是处理不过来的,所以我们只需要将数据中心断电的告警发出来即可。 2.Prometheus Server编写规则 2.1 修改Prometheus server的配置文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... rule_files: # - "weixiang-linux-rules.yml" - "weixiang-linux-rules-inhibit.yml" ... 2.2 编写告警规则 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# cat weixiang-linux-rules-inhibit.yml groups: - name: weixiang-linux-rules-alert-inhibit rules: - alert: weixiang-dba_exporter-alert expr: up{job="yinzhengjie_dba_exporter"} == 0 for: 3s labels: apps: dba severity: critical dc: beijing # 下面根据这里的dc值告警,这个告警了,k8s就不会告警了 annotations: summary: "{{ $labels.instance }} 数据库实例已停止运行超过 3s!" # 这里注释部分增加了一个value的属性信息,会从Prometheus的默认信息中获取阈值 value: "{{ $value }}" - alert: weixiang-k8s_exporter-alert expr: up{job="yinzhengjie_k8s_exporter"} == 0 for: 3s labels: apps: k8s severity: warning dc: beijing # 下面根据这里的dc值告警 annotations: summary: "{{ $labels.instance }} K8S服务器已停止运行超过 3s!" value: "{{ $value }}" - alert: weixiang-bigdata_exporter-alert expr: up{job="yinzhengjie_bigdata_exporter"} == 0 for: 5s labels: apps: bigdata severity: warning dc: shenzhen annotations: summary: "{{ $labels.instance }} 大数据服务器已停止运行超过 5s!" value: "{{ $value }}" [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.3 热加载配置文件使得生效 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST 106.55.44.37:9090/-/reload [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.4 验证prometheus的webUI配置是否剩下 http://106.55.44.37:9090/alerts?search= # 抑制规则已经生效

image

bash
3.Alertmanager配置告警抑制规则 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml ... ## 配置告警抑制规则 inhibit_rules: # 如果"dc"的值相同的前提条件下。 # 则当触发了"severity: critical"告警,就会抑制"severity: warning"的告警信息。 - source_match: severity: critical # 如果这个告警了 target_match: severity: warning # 这个就不会告了 equal: - dc # 根据这个值告警 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# 4.启动Alertmanager [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./alertmanager 5.检查Alertmanager的webuI http://134.175.108.235:9093/#/status 6.验证测试 # 三个服务都已停止
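
补充说明: 上面的source_match/target_match是旧写法,Alertmanager 0.22+(包括本文使用的0.28.1)推荐改用matchers风格,语义与上面等价,下面仅为写法示意:

bash
## 配置告警抑制规则(matchers写法)
inhibit_rules:
  - source_matchers:
      - severity="critical"    # 触发抑制的"源"告警
    target_matchers:
      - severity="warning"     # 被抑制的"目标"告警
    equal:
      - dc                     # 仅当两条告警的dc标签值相同时才抑制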

image

bash
k8s的告警会被抑制

image

bash
没有收到k8s的告警通知,因为该告警已被抑制

image

8、监控Linux系统的根目录使用大小

​1. 确认 Alertmanager 配置

bash
[root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml # 通用配置 global: resolve_timeout: 5m smtp_from: '1360821977@qq.com' smtp_smarthost: 'smtp.qq.com:465' smtp_auth_username: '1360821977@qq.com' smtp_auth_password: 'pnmwamrclxfpijfi' # 注意:密码等敏感信息请妥善保管 smtp_require_tls: false smtp_hello: 'qq.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' send_resolved: true - to: '1304871040@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - to: '2011014877@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - name: 'sre_system' webhook_configs: # 指向的是Prometheus的插件地址 - url: 'http://10.1.12.3:8060/dingtalk/weixiang98/send' http_config: {} max_alerts: 0 send_resolved: true #email_configs: #- to: '3220434114@qq.com' # send_resolved: true # headers: { Subject: "[WARN] weixiang98报警邮件" } # # html: '{{ template "weixiang.html" . }}' # # html: '{{ template "xixi" . }}' # html: '{{ template "weixiang" . }}' #- to: '2825483220@qq.com' # send_resolved: true # headers: { Subject: "[WARN] weixiang98报警邮件" } # # html: '{{ template "weixiang.html" . }}' # # html: '{{ template "xixi" . }}' # html: '{{ template "weixiang" . }}' # 加载模板 templates: - '/weixiang/softwares/alertmanager/tmpl/*.tmpl'
2. 创建 Prometheus 告警规则
bash
我们需要创建一个规则文件,定义如何计算根目录使用率以及何时触发告警。 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# cat disk_alert.rules.yml groups: - name: LinuxNodeRules rules: - alert: LinuxRootFilesystemUsageHigh # PromQL 表达式,计算根目录磁盘使用率 # (总空间 - 可用空间) / 总空间 * 100 # 过滤条件: 挂载点为根目录 (mountpoint="/"),文件系统类型为 ext4 或 xfs expr: | ( node_filesystem_size_bytes{mountpoint="/", fstype=~"ext4|xfs"} - node_filesystem_avail_bytes{mountpoint="/", fstype=~"ext4|xfs"} ) / node_filesystem_size_bytes{mountpoint="/", fstype=~"ext4|xfs"} * 100 > 90 # 持续时间:当条件持续满足5分钟后,才真正触发告警,防止抖动 for: 5m # 标签:可以用于告警路由或在告警信息中展示 labels: severity: critical # 注解:告警的详细信息,会显示在钉钉消息中 annotations: summary: "服务器根目录空间使用率过高" description: | - **告警级别**: {{ $labels.severity }} - **主机IP**: {{ $labels.instance }} - **挂载点**: {{ $labels.mountpoint }} - **当前使用率**: {{ $value | printf "%.2f" }}% - **详细信息**: 服务器 {{ $labels.instance }} 的根目录 (/) 空间使用率已超过90%,请立即处理! node_filesystem_size_bytes: 由 node_exporter 提供的文件系统总大小的指标。 node_filesystem_avail_bytes: 由 node_exporter 提供的文件系统可用空间大小的指标。 {mountpoint="/", fstype=~"ext4|xfs"}: 这是筛选条件,确保我们只监控根目录 (/),并且是常见的 ext4 或 xfs 文件系统,避免监控到虚拟文件系统。 > 90: 告警阈值,超过 90% 就满足条件。 for: 5m: 为了防止因为某个瞬间的读写导致使用率飙高而产生误报,设置了一个持续时间。只有当使用率持续 5 分钟都高于 90% 时,告警才会从 Pending 状态变为 Firing 状态,并发送通知。 annotations: 这部分内容会作为变量传递给钉钉模板,最终显示在你的钉钉消息里。{{ $labels.instance }} 会被替换为目标机器的 IP 和端口,{{ $value }} 会被替换为表达式计算出的当前值(即磁盘使用率)。
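
规则文件写好后,建议先用promtool校验语法,再让Prometheus加载(示例命令,文件路径按实际存放位置调整):

bash
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check rules disk_alert.rules.yml
# 输出 "SUCCESS: 1 rules found" 即表示规则文件语法正确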
3. 在 Prometheus 中加载告警规则文件

编辑你的 Prometheus 主配置文件 prometheus.yml,在 rule_files 部分添加你刚刚创建的文件路径。

bash
# ... 其他配置,如 global, scrape_configs ... # 告警规则文件路径 rule_files: # - "first_rules.yml" # 如果你还有其他规则文件,可以一并保留 - "/etc/prometheus/disk_alert.rules.yml" # ... 其他配置,如 alerting ...
4. 重启或重载 Prometheus
bash
# 热加载 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST http://10.1.24.13:9090/-/reload
5. 验证
bash
Prometheus UI 检查: 打开你的 Prometheus Web UI (例如 http://<your-prometheus-ip>:9090)。 点击顶部导航栏的 "Alerts"。你应该能看到我们刚刚创建的 LinuxRootFilesystemUsageHigh 规则。 如果没有任何机器的根目录使用率超过 90%,它的状态应该是绿色的 Inactive。 如果有机器超过了 90%,它的状态会先是黄色的 Pending (在 for: 5m 的等待期内),然后变成红色的 Firing。

image

bash
9.grafana导入模板ID 14191

image

7、Prometheus监控redis案例
bash
1.1 导入镜像 [root@elk93 ~]# wget http://192.168.21.253/Resources/Prometheus/images/Redis/weixiang-redis-v7.4.2-alpine.tar.gz [root@elk93 ~]# docker load -i weixiang-redis-v7.4.2-alpine.tar.gz 1.2 启动redis [root@elk93 ~]# docker run -d --name redis-server --network host redis:7.4.2-alpine 9652086e8ba23206fe4ba1dd2182f2a72ca99e190ab1f5d7a64532f5c590fc0c [root@elk93 ~]# [root@elk93 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 9652086e8ba2 redis:7.4.2-alpine "docker-entrypoint.s…" 3 seconds ago Up 2 seconds redis-server [root@elk93 ~]# [root@elk93 ~]# ss -ntl | grep 6379 LISTEN 0 511 0.0.0.0:6379 0.0.0.0:* LISTEN 0 511 [::]:6379 [::]:* [root@elk93 ~]# 1.3 写入测试数据 [root@elk93 ~]# docker exec -it redis-server redis-cli -n 5 127.0.0.1:6379[5]> KEYS * (empty array) 127.0.0.1:6379[5]> set school weixiang OK 127.0.0.1:6379[5]> set class weixiang98 OK 127.0.0.1:6379[5]> 127.0.0.1:6379[5]> KEYS * 1) "class" 2) "school" 127.0.0.1:6379[5]> 2.下载redis exporter wget https://github.com/oliver006/redis_exporter/releases/download/v1.74.0/redis_exporter-v1.74.0.linux-amd64.tar.gz svip: [root@elk92 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/redis_exporter/redis_exporter-v1.74.0.linux-amd64.tar.gz 3.解压软件包 [root@elk92 ~]# tar xf redis_exporter-v1.74.0.linux-amd64.tar.gz -C /usr/local/bin/ redis_exporter-v1.74.0.linux-amd64/redis_exporter --strip-components=1 [root@elk92 ~]# [root@elk92 ~]# ll /usr/local/bin/redis_exporter -rwxr-xr-x 1 1001 fwupd-refresh 9642168 May 4 13:22 /usr/local/bin/redis_exporter* [root@elk92 ~]# 4.运行redis-exporter [root@elk92 ~]# redis_exporter -redis.addr redis://10.0.0.93:6379 -web.telemetry-path /metrics -web.listen-address :9121 5.访问redis-exporter的webUI http://134.175.108.235:9121/metrics 6.修改Prometheus的配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-redis-exporter" static_configs: - targets: - 10.0.0.92:9121 7.热加载配置文件 [root@prometheus-server31 ~]# curl -X POST 10.0.0.31:9090/-/reload [root@prometheus-server31 ~]# 8.验证配置是否生效 http://10.0.0.31:9090/targets?search=
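
接入Prometheus之前,可以先确认redis_exporter与Redis的连接是否正常:redis_exporter会暴露一个redis_up指标,值为1表示连接成功(示例命令,IP按实际环境调整):

bash
curl -s http://10.0.0.92:9121/metrics | grep '^redis_up'
# 返回 "redis_up 1" 表示exporter已成功连接Redis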

image

bash
9.Grafana导入ID 11835 14091 14615 # 缺少插件。
8、Grafana插件安装
bash
1.Grafana插件概述 Grafana支持安装第三方插件。 例如,报错如下: 说明缺少插件 Panel plugin not found: natel-discrete-panel 默认的数据目录: [root@prometheus-server31 ~]# ll /var/lib/grafana/ total 1940 drwxr-xr-x 5 grafana grafana 4096 May 12 14:46 ./ drwxr-xr-x 61 root root 4096 May 12 10:38 ../ drwxr-x--- 3 grafana grafana 4096 May 12 10:38 alerting/ drwx------ 2 grafana grafana 4096 May 12 10:38 csv/ -rw-r----- 1 grafana grafana 1961984 May 12 14:46 grafana.db drwx------ 2 grafana grafana 4096 May 12 10:38 png/ [root@prometheus-server31 ~]# 2.Grafana插件管理 2.1 列出本地安装的插件 [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# grafana-cli plugins ls Error: ✗ stat /var/lib/grafana/plugins: no such file or directory [root@prometheus-server31 ~]# 2.2 安装指定的插件 [root@prometheus-server31 ~]# grafana-cli plugins install natel-discrete-panel ✔ Downloaded and extracted natel-discrete-panel v0.1.1 zip successfully to /var/lib/grafana/plugins/natel-discrete-panel Please restart Grafana after installing or removing plugins. Refer to Grafana documentation for instructions if necessary. [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# ll /var/lib/grafana/ total 1944 drwxr-xr-x 6 grafana grafana 4096 May 12 14:49 ./ drwxr-xr-x 61 root root 4096 May 12 10:38 ../ drwxr-x--- 3 grafana grafana 4096 May 12 10:38 alerting/ drwx------ 2 grafana grafana 4096 May 12 10:38 csv/ -rw-r----- 1 grafana grafana 1961984 May 12 14:48 grafana.db drwxr-xr-x 3 root root 4096 May 12 14:49 plugins/ drwx------ 2 grafana grafana 4096 May 12 10:38 png/ [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# ll /var/lib/grafana/plugins/ total 12 drwxr-xr-x 3 root root 4096 May 12 14:49 ./ drwxr-xr-x 6 grafana grafana 4096 May 12 14:49 ../ drwxr-xr-x 4 root root 4096 May 12 14:49 natel-discrete-panel/ [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# ll /var/lib/grafana/plugins/natel-discrete-panel/ total 180 drwxr-xr-x 4 root root 4096 May 12 14:49 ./ drwxr-xr-x 3 root root 4096 May 12 14:49 ../ -rw-r--r-- 1 root root 1891 May 12 14:49 CHANGELOG.md drwxr-xr-x 2 root root 4096 May 12 14:49 img/ -rw-r--r-- 1 root root 1079 May 12 14:49 LICENSE -rw-r--r-- 1 root root 2650 May 12 14:49 MANIFEST.txt -rw-r--r-- 1 root root 30629 May 12 14:49 module.js -rw-r--r-- 1 root root 808 May 12 14:49 module.js.LICENSE.txt -rw-r--r-- 1 root root 108000 May 12 14:49 module.js.map drwxr-xr-x 2 root root 4096 May 12 14:49 partials/ -rw-r--r-- 1 root root 1590 May 12 14:49 plugin.json -rw-r--r-- 1 root root 3699 May 12 14:49 README.md [root@prometheus-server31 ~]# 2.3 重启Grafana使得配置生效 [root@prometheus-server31 ~]# systemctl restart grafana-server.service [root@prometheus-server31 ~]# 2.4 查看插件是否生效 略,见视频

image

9、prometheus监控docker
bash
- prometheus监控主流的中间件之docker 参考链接: https://github.com/google/cadvisor 1.部署docker环境【建议41-43都安装】 wget http://192.168.21.253/Resources/Docker/scripts/weixiang-autoinstall-docker-docker-compose.tar.gz tar xf weixiang-autoinstall-docker-docker-compose.tar.gz ./install-docker.sh i wget http://192.168.21.253/Resources/Prometheus/images/cAdvisor/weixiang-cadvisor-v0.52.1.tar.gz docker load -i weixiang-cadvisor-v0.52.1.tar.gz 2.导入镜像【建议41-43都安装】 wget http://192.168.21.253/Resources/Docker/images/Linux/alpine-v3.20.2.tar.gz docker image load < alpine-v3.20.2.tar.gz 3.运行测试的镜像 [root@node-exporter41 ~]# docker run -id --name c1 alpine:3.20.2 344a3e936abe90cfb2e2e0e6e5f13e1117a79faa5afb939ae261794d3c5ee2b0 [root@node-exporter41 ~]# [root@node-exporter41 ~]# docker run -id --name c2 alpine:3.20.2 b2130c8f78f2df06f53d338161f3f9ad6a133c9c6b68ddb011884c788bb1b37d [root@node-exporter41 ~]# [root@node-exporter41 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES b2130c8f78f2 alpine:3.20.2 "/bin/sh" 5 seconds ago Up 4 seconds c2 344a3e936abe alpine:3.20.2 "/bin/sh" 8 seconds ago Up 8 seconds c1 [root@node-exporter41 ~]# [root@node-exporter42 ~]# docker run -id --name c3 alpine:3.20.2 f399c1aafd607bf0c18dff09c1839f923ee9db39b68edf5b216c618a363566a1 [root@node-exporter42 ~]# [root@node-exporter42 ~]# docker run -id --name c4 alpine:3.20.2 bff22c8d96f731cd44dfa55b60a9dd73d7292add33ea5b82314bf2352db115a7 [root@node-exporter42 ~]# [root@node-exporter42 ~]# docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES bff22c8d96f7 alpine:3.20.2 "/bin/sh" 3 seconds ago Up 1 second c4 f399c1aafd60 alpine:3.20.2 "/bin/sh" 6 seconds ago Up 4 seconds c3 [root@node-exporter42 ~]# [root@node-exporter43 ~]# docker run -id --name c5 alpine:3.20.2 198464e1e9a3c7aefb361c3c7df3bfe8009b5ecd633aa19503321428d404008c [root@node-exporter43 ~]# [root@node-exporter43 ~]# docker run -id --name c6 alpine:3.20.2 b8ed9fcec61e017086864f8eb223cd6409d33f144e2fcfbf33acfd09860b0a06 [root@node-exporter43 ~]# 4.运行cAdVisor【建议41-43都安装】 docker run \ --volume=/:/rootfs:ro \ --volume=/var/run:/var/run:ro \ --volume=/sys:/sys:ro \ --volume=/var/lib/docker/:/var/lib/docker:ro \ --volume=/dev/disk/:/dev/disk:ro \ --network host \ --detach=true \ --name=cadvisor \ --privileged \ --device=/dev/kmsg \ gcr.io/cadvisor/cadvisor-amd64:v0.52.1 5.访问cAdvisor的webUI http://118.89.55.174:8080/docker/ http://81.71.98.206:8080/docker/ http://134.175.108.235:8080/docker/ [root@node-exporter41 ~]# curl -s http://134.175.108.235:8080/metrics | wc -l 3067 [root@node-exporter41 ~]#
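
部署完成后,可以先确认cAdvisor已经暴露出容器指标,再到Prometheus里用PromQL查询容器的资源使用情况(以下为示例,IP与标签按实际环境调整):

bash
# 确认已暴露container_开头的容器指标
curl -s http://134.175.108.235:8080/metrics | grep -m 3 '^container_cpu_usage_seconds_total'

# Prometheus中常用的查询示例:
# 各容器最近1分钟的CPU使用(核数)
sum(rate(container_cpu_usage_seconds_total{image!=""}[1m])) by (name)
# 各容器的内存使用量(字节)
container_memory_usage_bytes{image!=""}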

image

通过容器ID可以看出它俩是同一个容器

image

image

image

bash
6.Prometheus监控容器节点 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-docker-cadVisor" static_configs: - targets: - 118.89.55.174:8080 - 81.71.98.206:8080 - 134.175.108.235:8080 7.热加载配置文件 [root@prometheus-server31 ~]# curl -X POST 106.55.44.37:9090/-/reload [root@prometheus-server31 ~]# 8.验证配置是否生效 http://10.0.0.31:9090/targets?search= 9.Grafana导入ID模板 10619 无法正确显示数据的优化案例: - 1.PromQL语句优化 count(last_over_time(container_last_seen{instance=~"$node:$port",job=~"$job",image!=""}[3s])) - 2.Value options 将'Calculation'字段设置为'Last *'即可。 - 3.保存Dashboard 若不保存,刷新页面后所有配置丢失!!!

image

image

image

10、prometheus监控mysql
bash
- prometheus监控主流的中间件之mysql 1.部署MySQL 1.1 导入MySQL镜像 [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Docker/images/WordPress/weixiang-mysql-v8.0.36-oracle.tar.gz [root@node-exporter43 ~]# docker load < weixiang-mysql-v8.0.36-oracle.tar.gz 1.2 运行MySQL服务 [root@node-exporter43 ~]# docker run -d --network host --name mysql-server --restart always -e MYSQL_DATABASE=prometheus -e MYSQL_USER=weixiang98 -e MYSQL_PASSWORD=yinzhengjie -e MYSQL_ALLOW_EMPTY_PASSWORD=yes mysql:8.0.36-oracle --character-set-server=utf8 --collation-server=utf8_bin --default-authentication-plugin=mysql_native_password 1.3 检查MySQL服务 [root@node-exporter43 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 16aa74bc9e03 mysql:8.0.36-oracle "docker-entrypoint.s…" 2 seconds ago Up 2 seconds mysql-server [root@node-exporter43 ~]# [root@node-exporter43 ~]# ss -ntl | grep 3306 LISTEN 0 151 *:3306 *:* LISTEN 0 70 *:33060 *:* [root@node-exporter43 ~]# 1.4 添加用户权限 [root@node-exporter43 ~]# docker exec -it mysql-server mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 13 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> mysql> SHOW GRANTS FOR weixiang98; +---------------------------------------------------------+ | Grants for weixiang98@% | +---------------------------------------------------------+ | GRANT USAGE ON *.* TO `weixiang98`@`%` | | GRANT ALL PRIVILEGES ON `prometheus`.* TO `weixiang98`@`%` | +---------------------------------------------------------+ 2 rows in set (0.00 sec) mysql> mysql> GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO weixiang98; Query OK, 0 rows affected (0.00 sec) mysql> mysql> SHOW GRANTS FOR weixiang98; +-------------------------------------------------------------------+ | Grants for weixiang98@% | +-------------------------------------------------------------------+ | GRANT SELECT, PROCESS, REPLICATION CLIENT ON *.* TO `weixiang98`@`%` | | GRANT ALL PRIVILEGES ON `prometheus`.* TO `weixiang98`@`%` | +-------------------------------------------------------------------+ 2 rows in set (0.00 sec) mysql> 2.下载MySQL-exporter wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.17.2/mysqld_exporter-0.17.2.linux-amd64.tar.gz SVIP: [root@node-exporter42 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/mysql_exporter/mysqld_exporter-0.17.2.linux-amd64.tar.gz 3.解压软件包 [root@node-exporter42 ~]# tar xf mysqld_exporter-0.17.2.linux-amd64.tar.gz -C /usr/local/bin/ mysqld_exporter-0.17.2.linux-amd64/mysqld_exporter --strip-components=1 [root@node-exporter42 ~]# [root@node-exporter42 ~]# ll /usr/local/bin/mysqld_exporter -rwxr-xr-x 1 1001 1002 18356306 Feb 26 15:16 /usr/local/bin/mysqld_exporter* [root@node-exporter42 ~]# 4.运行MySQL-exporter暴露MySQL的监控指标 [root@node-exporter42 ~]# cat .my.cnf [client] host = 10.0.0.43 port = 3306 user = weixiang98 password = yinzhengjie [root@node-exporter42 ~]# [root@node-exporter42 ~]# mysqld_exporter --config.my-cnf=/root/.my.cnf ... 
time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:239 msg="Starting mysqld_exporter" version="(version=0.17.2, branch=HEAD, revision=e84f4f22f8a11089d5f04ff9bfdc5fc042605773)" time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:240 msg="Build context" build_context="(go=go1.23.6, platform=linux/amd64, user=root@18b69b4b0fea, date=20250226-07:16:19, tags=unknown)" time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:252 msg="Scraper enabled" scraper=global_status time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:252 msg="Scraper enabled" scraper=global_variables time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:252 msg="Scraper enabled" scraper=slave_status time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:252 msg="Scraper enabled" scraper=info_schema.innodb_cmp time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:252 msg="Scraper enabled" scraper=info_schema.innodb_cmpmem time=2025-05-13T02:07:45.898Z level=INFO source=mysqld_exporter.go:252 msg="Scraper enabled" scraper=info_schema.query_response_time time=2025-05-13T02:07:45.898Z level=INFO source=tls_config.go:347 msg="Listening on" address=[::]:9104 time=2025-05-13T02:07:45.898Z level=INFO source=tls_config.go:350 msg="TLS is disabled." http2=false address=[::]:9104 5.验证测试 [root@node-exporter41 ~]# curl -s http://106.55.44.37:9104/metrics | wc -l 2569 [root@node-exporter41 ~]# 6.修改Prometheus的配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-mysql-exporter" static_configs: - targets: - 10.0.0.42:9104 7.热加载配置文件 [root@prometheus-server31 ~]# curl -X POST 10.0.0.31:9090/-/reload [root@prometheus-server31 ~]# 8.验证配置是否生效 http://10.0.0.31:9090/targets?search= 9.Grafana导入ID模板 14057 17320
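
补充: 与redis_exporter类似,mysqld_exporter也会暴露一个mysql_up指标表示与MySQL的连接状态,值为1表示连接成功(示例命令,IP按实际环境调整):

bash
curl -s http://10.0.0.42:9104/metrics | grep '^mysql_up'
# 返回 "mysql_up 1" 表示exporter已成功连接MySQL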

image

image

11、Exporter是如何工作的
bash
首先,我们来看mysqld_exporter的工作原理: mysqld_exporter 本身不是MySQL的一部分。它是一个独立的、小型的Web服务器。 它的工作是: 连接到你指定的MySQL数据库(通过网络)。 执行一系列SQL查询(如 SHOW GLOBAL STATUS;, SHOW GLOBAL VARIABLES; 等)来获取MySQL的内部状态和指标。 转换这些查询结果,将其整理成Prometheus能够识别的特定格式(Metrics格式)。 暴露一个HTTP端点(默认是 :9104 端口),等待Prometheus服务器来抓取这些格式化后的数据。 为什么要把它们分开部署? 1. 资源隔离 (最重要) MySQL很宝贵: 数据库服务器(43号机)通常是业务核心,它的CPU、内存、I/O资源都应该优先保证数据库本身高效运行。 Exporter也耗资源: mysqld_exporter虽然小,但它在被Prometheus频繁抓取时,也需要消耗一定的CPU和内存来执行查询和处理数据。 避免竞争: 如果把exporter也装在43号机上,当监控压力大或者exporter自身出现bug(比如内存泄漏)时,它可能会抢占MySQL的资源,甚至导致MySQL性能下降或服务崩溃。将exporter部署在另一台机器(42号机)上,就彻底杜绝了这种风险。 2. 安全性更高 最小权限原则: 数据库服务器(43号机)应该尽可能少地暴露服务和安装额外的软件,以减少攻击面。 网络隔离: 在复杂的网络环境中,你可以将数据库服务器放在一个高度安全的内部网络区域,只允许特定的监控服务器(如42号机)通过防火墙访问它的3306端口。而Prometheus服务器只需要能访问到监控服务器(42号机)的9104端口即可,无需直接接触到数据库服务器。 3. 管理和维护更方便 集中管理Exporter: 你可以指定一台或几台专门的“监控机”(如此处的42号机),在这上面运行所有的Exporter,比如mysqld_exporter, redis_exporter, node_exporter等。 统一升级和配置: 当你需要升级、重启或修改所有Exporter的配置时,只需要登录这几台集中的监控机操作即可,而不需要去登录每一台业务服务器。这大大简化了运维工作。 4. 架构灵活性 这种模式不局限于物理机或虚拟机。在容器化环境(如Kubernetes)中,MySQL Pod 和 mysqld_exporter Pod 也通常是分开部署的两个不同的Pod,它们通过K8s的Service网络进行通信。原理是完全一样的。
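
为了更直观地理解"Prometheus能够识别的特定格式",下面给出一小段指标暴露格式(text exposition format)的示意:# HELP/# TYPE是注释行,其余每行是"指标名{标签} 数值"。内容仅为格式示意,并非某个exporter的真实输出:

bash
# HELP mysql_up Whether the MySQL server is up.
# TYPE mysql_up gauge
mysql_up 1
# HELP mysql_global_status_threads_connected Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_threads_connected untyped
mysql_global_status_threads_connected 8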
12、prometheus监控mongoDB
bash
- prometheus监控主流的中间件之mongoDB 1 导入mongoDB镜像 wget http://192.168.21.253/Resources/Prometheus/images/MongoDB/weixiang-mongoDB-v8.0.6-noble.tar.gz docker load -i weixiang-mongoDB-v8.0.6-noble.tar.gz 2 部署mongoDB服务 [root@node-exporter43 ~]# docker run -d --name mongodb-server --network host mongo:8.0.6-noble 4b0f00dea78bb571c216c344984ced026c1210c94db147fdc9e32f549e3135de [root@node-exporter43 ~]# [root@node-exporter43 ~]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 8179c6077ec8 mongo:8.0.6-noble "docker-entrypoint.s…" 4 seconds ago Up 3 seconds mongodb-server [root@node-exporter43 ~]# [root@node-exporter43 ~]# ss -ntl | grep 27017 LISTEN 0 4096 0.0.0.0:27017 0.0.0.0:* [root@node-exporter43 ~]# 3 下载MongoDB的exporter https://github.com/percona/mongodb_exporter/releases/download/v0.43.1/mongodb_exporter-0.43.1.linux-amd64.tar.gz SVIP: [root@node-exporter42 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/MongoDB_exporter/mongodb_exporter-0.43.1.linux-amd64.tar.gz 4 解压软件包 [root@node-exporter42 ~]# tar xf mongodb_exporter-0.43.1.linux-amd64.tar.gz -C /usr/local/bin/ mongodb_exporter-0.43.1.linux-amd64/mongodb_exporter --strip-components=1 [root@node-exporter42 ~]# [root@node-exporter42 ~]# ll /usr/local/bin/mongodb_exporter -rwxr-xr-x 1 1001 geoclue 20467864 Dec 13 20:10 /usr/local/bin/mongodb_exporter* [root@node-exporter42 ~]# 5 运行mongodb-exporter [root@node-exporter42 ~]# mongodb_exporter --mongodb.uri=mongodb://10.1.12.4:27017 --log.level=info --collect-all time=2025-05-13T02:49:26.332Z level=INFO source=tls_config.go:347 msg="Listening on" address=[::]:9216 time=2025-05-13T02:49:26.332Z level=INFO source=tls_config.go:350 msg="TLS is disabled." http2=false address=[::]:9216 6 验证mongoDB-exporter的WebUI http://81.71.98.206:9216/metrics [root@node-exporter41 ~]# curl -s http://81.71.98.206:9216/metrics | wc -l 8847 [root@node-exporter41 ~]# 7.配置Prometheus监控mongoDB容器 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: weixiang-mongodb-exporter static_configs: - targets: - 81.71.98.206:9216 [root@prometheus-server31 ~]# [root@prometheus-server31 ~]# curl -X POST http://106.55.44.37:9090/-/reload [root@prometheus-server31 ~]# 8 验证Prometheus配置是否生效 http://10.0.0.31:9090/targets?search= 可以进行数据的查询,推荐使用: mongodb_dbstats_dataSize 9 grafana导入模板ID 16504 由于我们的MongoDB版本较为新,grafana的社区模板更新的并不及时,因此可能需要我们自己定制化一些Dashboard。 参考链接: https://grafana.com/grafana/dashboards
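
与前面的redis/mysql类似,mongodb_exporter也会暴露一个mongodb_up指标表示与MongoDB的连接状态(示例命令,IP按实际环境调整):

bash
curl -s http://81.71.98.206:9216/metrics | grep '^mongodb_up'
# 返回 "mongodb_up 1" 表示exporter已成功连接MongoDB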

image

image

13、prometheus监控nginx
bash
1 编译安装nginx 1.1 安装编译工具 [root@node-exporter41 ~]# cat /etc/apt/sources.list # 默认注释了源码镜像以提高 apt update 速度,如有需要可自行取消注释 deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy main restricted universe multiverse deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse # 以下安全更新软件源包含了官方源与镜像站配置,如有需要可自行修改注释切换 deb http://security.ubuntu.com/ubuntu/ jammy-security main restricted universe multiverse # deb-src http://security.ubuntu.com/ubuntu/ jammy-security main restricted universe multiverse # 预发布软件源,不建议启用 # deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-proposed main restricted universe multiverse # # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-proposed main restricted universe multiverse [root@node-exporter41 ~]# [root@node-exporter41 ~]# apt update [root@node-exporter41 ~]# [root@node-exporter41 ~]# apt -y install git wget gcc make zlib1g-dev build-essential libtool openssl libssl-dev 参考链接: https://mirrors.tuna.tsinghua.edu.cn/help/ubuntu/ 1.2 克隆nginx-module-vts模块 git clone https://gitee.com/jasonyin2020/nginx-module-vts.git 1.3 下载nginx软件包 wget https://nginx.org/download/nginx-1.28.0.tar.gz 1.4 解压nginx tar xf nginx-1.28.0.tar.gz 1.5 配置nginx cd nginx-1.28.0/ ./configure --prefix=/weixiang/softwares/nginx --with-http_ssl_module --with-http_v2_module --with-http_realip_module --without-http_rewrite_module --with-http_stub_status_module --without-http_gzip_module --with-file-aio --with-stream --with-stream_ssl_module --with-stream_realip_module --add-module=/root/nginx-module-vts 1.6 编译并安装nginx make -j 2 && make install 1.7 修改nginx的配置文件 vim /weixiang/softwares/nginx/conf/nginx.conf ... http { vhost_traffic_status_zone; upstream weixiang-promethues { server 10.0.0.31:9090; } ... server { ... location / { root html; # index index.html index.htm; proxy_pass http://weixiang-promethues; } location /status { vhost_traffic_status_display; vhost_traffic_status_display_format html; } } } 1.8 检查配置文件语法 /weixiang/softwares/nginx/sbin/nginx -t 1.9 启动nginx /weixiang/softwares/nginx/sbin/nginx 1.10 访问nginx的状态页面 http://118.89.55.174/status/format/prometheus 2 安装nginx-vtx-exporter 2.1 下载nginx-vtx-exporter wget https://github.com/sysulq/nginx-vts-exporter/releases/download/v0.10.8/nginx-vtx-exporter_0.10.8_linux_amd64.tar.gz SVIP: wget http://192.168.21.253/Resources/Prometheus/softwares/nginx_exporter/nginx-vtx-exporter_0.10.8_linux_amd64.tar.gz 2.2 解压软件包到path路径 [root@node-exporter42 ~]# tar xf nginx-vtx-exporter_0.10.8_linux_amd64.tar.gz -C /usr/local/bin/ nginx-vtx-exporter [root@node-exporter42 ~]# [root@node-exporter42 ~]# ll /usr/local/bin/nginx-vtx-exporter -rwxr-xr-x 1 1001 avahi 7950336 Jul 11 2023 /usr/local/bin/nginx-vtx-exporter* [root@node-exporter42 ~]# 2.3 运行nginx-vtx-exporter [root@node-exporter42 ~]# nginx-vtx-exporter -nginx.scrape_uri=http://10.0.0.41/status/format/json 3 配置prometheus采集nginx数据 3.1 修改配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... 
- job_name: "weixiang-nginx-vts-modules" metrics_path: "/status/format/prometheus" static_configs: - targets: - "10.0.0.41:80" - job_name: "weixiang-nginx-vts-exporter" static_configs: - targets: - "10.0.0.42:9913" 3.2 重新加载配置并验证配置是否生效 curl -X POST http://10.0.0.31:9090/-/reload 3.3 导入grafana模板 9785【编译安装时添加vts模块即可】 2949【编译时添加vts模块且需要安装nginx-exporter】 - prometheus监控主流的中间件之tomcat 1 部署tomcat-exporter 1.1 导入镜像 [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Prometheus/images/weixiang-tomcat-v9.0.87.tar.gz [root@node-exporter43 ~]# [root@node-exporter43 ~]# docker load -i weixiang-tomcat-v9.0.87.tar.gz 1.2 基于Dockerfile构建tomcat-exporter [root@node-exporter43 ~]# git clone https://gitee.com/jasonyin2020/tomcat-exporter.git [root@node-exporter43 ~]# cd tomcat-exporter/ [root@node-exporter43 tomcat-exporter]# [root@node-exporter43 tomcat-exporter]# ll total 44 drwxr-xr-x 5 root root 4096 May 13 11:55 ./ drwx------ 10 root root 4096 May 13 11:55 ../ -rw-r--r-- 1 root root 96 May 13 11:55 build.sh -rw-r--r-- 1 root root 503 May 13 11:55 Dockerfile drwxr-xr-x 8 root root 4096 May 13 11:55 .git/ drwxr-xr-x 2 root root 4096 May 13 11:55 libs/ -rw-r--r-- 1 root root 3407 May 13 11:55 metrics.war drwxr-xr-x 2 root root 4096 May 13 11:55 myapp/ -rw-r--r-- 1 root root 191 May 13 11:55 README.md -rw-r--r-- 1 root root 7604 May 13 11:55 server.xml [root@node-exporter43 tomcat-exporter]# [root@node-exporter43 tomcat-exporter]# bash build.sh 1.2 运行tomcat镜像 [root@node-exporter43 tomcat-exporter]# docker run -dp 18080:8080 --name tomcat-server registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/tomcat9-app:v1 5643c618db790e12b5ec658c362b3963a3db39914c826d6eef2fe55355f1d5d9 [root@node-exporter43 tomcat-exporter]# [root@node-exporter43 tomcat-exporter]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 5643c618db79 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/tomcat9-app:v1 "/usr/local/tomcat/b…" 4 seconds ago Up 4 seconds 8009/tcp, 8443/tcp, 0.0.0.0:18080->8080/tcp, :::18080->8080/tcp tomcat-server [root@node-exporter43 tomcat-exporter]# 1.3 访问tomcat应用 http://10.0.0.43:18080/metrics/ http://10.0.0.43:18080/myapp/ 2 配置prometheus监控tomcat应用 2.1 修改配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-tomcat-exporter" static_configs: - targets: - "10.0.0.43:18080" 2.2 重新加载配置并验证配置是否生效 curl -X POST http://10.0.0.31:9090/-/reload 2.3 导入grafana模板 由于官方的支持并不友好,可以在GitHub自行搜索相应的tomcat监控模板。 参考链接: https://github.com/nlighten/tomcat_exporter/blob/master/dashboard/example.json

image

14、prometheus监控tomcat
bash
1 部署tomcat-exporter 1.1 导入镜像 [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Prometheus/images/weixiang-tomcat-v9.0.87.tar.gz [root@node-exporter43 ~]# [root@node-exporter43 ~]# docker load -i weixiang-tomcat-v9.0.87.tar.gz 1.2 基于Dockerfile构建tomcat-exporter [root@node-exporter43 ~]# git clone https://gitee.com/jasonyin2020/tomcat-exporter.git [root@node-exporter43 ~]# cd tomcat-exporter/ [root@node-exporter43 tomcat-exporter]# [root@node-exporter43 tomcat-exporter]# ll total 44 drwxr-xr-x 5 root root 4096 May 13 11:55 ./ drwx------ 10 root root 4096 May 13 11:55 ../ -rw-r--r-- 1 root root 96 May 13 11:55 build.sh -rw-r--r-- 1 root root 503 May 13 11:55 Dockerfile drwxr-xr-x 8 root root 4096 May 13 11:55 .git/ drwxr-xr-x 2 root root 4096 May 13 11:55 libs/ -rw-r--r-- 1 root root 3407 May 13 11:55 metrics.war drwxr-xr-x 2 root root 4096 May 13 11:55 myapp/ -rw-r--r-- 1 root root 191 May 13 11:55 README.md -rw-r--r-- 1 root root 7604 May 13 11:55 server.xml [root@node-exporter43 tomcat-exporter]# [root@node-exporter43 tomcat-exporter]# bash build.sh 1.2 运行tomcat镜像 [root@node-exporter43 tomcat-exporter]# docker run -dp 18080:8080 --name tomcat-server registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/tomcat9-app:v1 5643c618db790e12b5ec658c362b3963a3db39914c826d6eef2fe55355f1d5d9 [root@node-exporter43 tomcat-exporter]# [root@node-exporter43 tomcat-exporter]# docker ps -l CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 5643c618db79 registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/tomcat9-app:v1 "/usr/local/tomcat/b…" 4 seconds ago Up 4 seconds 8009/tcp, 8443/tcp, 0.0.0.0:18080->8080/tcp, :::18080->8080/tcp tomcat-server [root@node-exporter43 tomcat-exporter]# 1.3 访问tomcat应用 http://134.175.108.235:18080/metrics/ http://134.175.108.235:18080/myapp/

image

bash
2 配置prometheus监控tomcat应用 2.1 修改配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-tomcat-exporter" static_configs: - targets: - "10.0.0.43:18080" 2.2 重新加载配置并验证配置是否生效 curl -X POST http://10.0.0.31:9090/-/reload 2.3 导入grafana模板 由于官方的支持并不友好,可以在GitHub自行搜索相应的tomcat监控模板。 参考链接: https://github.com/nlighten/tomcat_exporter/blob/master/dashboard/example.json

image

image

15、Prometheus监控主流的中间件之etcd
bash
- Prometheus监控主流的中间件之etcd 参考链接: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config 1.prometheus端创建etcd证书目录 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# pwd /weixiang/softwares/prometheus-2.53.4.linux-amd64 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# mkdir -p certs/etcd [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.将etcd的自建证书拷贝prometheus服务器 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# scp 10.0.0.241:/weixiang/certs/etcd/etcd-{ca.pem,server-key.pem,server.pem} certs/etcd 3.Prometheus查看证书文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# apt install tree [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# tree certs/etcd/ certs/etcd/ ├── etcd-ca.pem ├── etcd-server-key.pem └── etcd-server.pem 0 directories, 3 files [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl -s --cacert certs/etcd/etcd-ca.pem --key certs/etcd/etcd-server-key.pem --cert certs/etcd/etcd-server.pem https://10.0.0.241:2379/metrics -k | wc -l 1717 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl -s --cacert certs/etcd/etcd-ca.pem --key certs/etcd/etcd-server-key.pem --cert certs/etcd/etcd-server.pem https://10.0.0.242:2379/metrics -k | wc -l 1718 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl -s --cacert certs/etcd/etcd-ca.pem --key certs/etcd/etcd-server-key.pem --cert certs/etcd/etcd-server.pem https://10.0.0.243:2379/metrics -k | wc -l 1714 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 4.修改Prometheus的配置文件【修改配置时,可以将中文注释删除,此处的中文注释是方便你理解的。】 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-etcd-cluster" # 使用https协议 scheme: https # 配置https证书相关信息 tls_config: # 指定CA的证书文件 ca_file: certs/etcd/etcd-ca.pem # 指定etcd服务的公钥文件 cert_file: certs/etcd/etcd-server.pem # 指定etcd服务的私钥文件 key_file: certs/etcd/etcd-server-key.pem static_configs: - targets: - 10.0.0.241:2379 - 10.0.0.242:2379 - 10.0.0.243:2379 5.检查配置文件是否正确 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: 1 rule files found SUCCESS: prometheus.yml is valid prometheus config file syntax Checking weixiang-linux96-rules.yml SUCCESS: 3 rules found [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 6.热加载配置文件 [root@prometheus-server31 ~]# curl -X POST http://10.0.0.31:9090/-/reload 7.检查配置是否生效 http://10.0.0.31:9090/targets 8.grafana导入模板ID 21473 3070 10323
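
接入成功后,可以用下面几条PromQL快速判断etcd集群的健康状态(指标名来自etcd自带的metrics,仅为示例):

bash
# 每个成员是否有leader,1表示正常
etcd_server_has_leader
# leader切换频率,持续增长说明集群不稳定
rate(etcd_server_leader_changes_seen_total[5m])
# 后端数据库占用空间(字节)
etcd_mvcc_db_total_size_in_bytes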

7、服务发现

1、基于文件的服务发现案例
bash
- 基于文件的服务发现案例 静态配置:(static_configs) 修改Prometheus的配置文件时需要热加载配置文件或者重启服务生效。 动态配置:() 无需重启服务,可以监听本地的文件,或者通过注册中心,服务发现中心发现要监控的目标。 参考链接: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config 1.修改配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-file-sd" file_sd_configs: - files: - /tmp/xixi.json - /tmp/haha.yaml 2.热记载配置文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl -X POST 10.0.0.31:9090/-/reload [root@prometheus-server31 Prometheus]# cd /weixiang/softwares/prometheus-2.53.4.linux-amd64 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml WARNING: file "/tmp/xixi.json" for file_sd in scrape job "weixiang-file-sd" does not exist WARNING: file "/tmp/haha.yaml" for file_sd in scrape job "weixiang-file-sd" does not exist SUCCESS: prometheus.yml is valid prometheus config file syntax [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 3.修改json格式文件 [root@prometheus-server31 ~]# cat > /tmp/xixi.json <<EOF [ { "targets": [ "10.1.12.15:9100" ], "labels": { "school": "weixiang", "class": "weixiang98" } } ] EOF 4.验证是否自动监控目标 http://106.55.44.37:9090/targets?search= 5.再次编写yaml文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# cat > /tmp/haha.yaml <<EOF - targets: - '10.1.12.3:9100' - '10.1.12.4:9100' labels: address: ShaHe ClassRoom: JiaoShi05 EOF [root@promethues-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 6.验证是否自动监控目标 http://10.0.0.31:9090/targets?search= 7.Grafana导入模板ID 1860
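
补充: file_sd_configs还支持refresh_interval参数,作为兜底的重新读取周期(默认5m),文件内容变化本身会通过文件系统事件近实时感知,示例片段如下:

bash
- job_name: "weixiang-file-sd"
  file_sd_configs:
    - files:
        - /tmp/xixi.json
        - /tmp/haha.yaml
      # 可选: 兜底的重新读取周期,默认为5m
      refresh_interval: 30s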

image

image

2、基于consul的服务发现案例
bash
- 基于consul的服务发现案例 官方文档: https://www.consul.io/ https://developer.hashicorp.com/consul/install#linux 1 部署consul集群 1.1 下载consul【41-43节点】 wget https://releases.hashicorp.com/consul/1.21.3/consul_1.21.3_linux_amd64.zip svip: wget http://192.168.21.253/Resources/Prometheus/softwares/Consul/consul_1.21.3_linux_amd64.zip 1.2 解压consul unzip consul_1.21.3_linux_amd64.zip -d /usr/local/bin/ 1.3 运行consul 集群 服务端43: consul agent -server -bootstrap -bind=10.1.12.4 -data-dir=/weixiang/softwares/consul -client=10.1.12.4 -ui 客户端42: consul agent -bind=10.1.12.3 -data-dir=/weixiang/softwares/consul -client=10.1.12.3 -ui -retry-join=10.1.12.4 客户端41: consul agent -server -bind=10.1.12.15 -data-dir=/weixiang/softwares/consul -client=10.1.12.15 -ui -retry-join=10.1.12.4 1.4 查看各节点的监听端口 ss -ntl | egrep "8300|8500" 1.5 访问console服务的WebUI http://134.175.108.235:8500/ui/dc1/nodes 2.使用consul实现自动发现 2.1 修改prometheus的配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-consul-seriver-discovery" # 配置基于consul的服务发现 consul_sd_configs: # 指定consul的服务器地址,若不指定,则默认值为"localhost:8500". - server: 10.0.0.43:8500 - server: 10.0.0.42:8500 - server: 10.0.0.41:8500 relabel_configs: # 匹配consul的源标签字段,表示服务名称 - source_labels: [__meta_consul_service] # 指定源标签的正则表达式,若不定义,默认值为"(.*)" regex: consul # 执行动作为删除,默认值为"replace",有效值有多种 # https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_action action: drop 2.2 检查配置文件是否正确 [root@prometheus-server31 ~]# cd /weixiang/softwares/prometheus-2.53.4.linux-amd64/ [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config ./prometheus.yml Checking ./prometheus.yml SUCCESS: ./prometheus.yml is valid prometheus config file syntax [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.3 重新加载配置 [root@prometheus-server31 ~]# curl -X POST http:/106.55.44.37:9090/-/reload [root@prometheus-server31 ~]# 2.4.被监控节点注册到console集群 2.4.1 注册节点 [root@grafana71 ~]# curl -X PUT -d '{"id":"prometheus-node42","name":"weixiang-prometheus-node42","address":"10.1.12.3","port":9100,"tags":["node-exporter"],"checks": [{"http":"http://10.1.12.3:9100","interval":"5m"}]}' http://10.1.12.4:8500/v1/agent/service/register [root@grafana71 ~]#
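
注册完成后,可以通过consul的HTTP API确认服务是否已经注册成功(示例命令,地址按实际consul节点调整):

bash
# 查看注册到该agent上的服务
curl -s http://10.1.12.4:8500/v1/agent/services
# 查看整个数据中心的服务目录
curl -s http://10.1.12.4:8500/v1/catalog/services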

image

bash
2.4.2 注销节点 [root@grafana71 ~]# curl -X PUT http://10.1.12.4:8500/v1/agent/service/deregister/prometheus-node42 TODO---> 目前有个坑: 注册时请求的是哪个consul节点,注销时也要向同一个节点发起注销请求,待解决...

image

8、node-exporter的黑白名单

bash
- node-exporter的黑白名单 参考链接: https://github.com/prometheus/node_exporter 1.停止服务 [root@node-exporter41 ~]# systemctl stop node-exporter.service 2.配置黑名单 [root@node-exporter41 ~]# cd /weixiang/softwares/node_exporter-1.9.1.linux-amd64/ [root@node-exporter41 node_exporter-1.9.1.linux-amd64]# [root@node-exporter41 node_exporter-1.9.1.linux-amd64]# ./node_exporter --no-collector.cpu 3.配置白名单 [root@node-exporter41 node_exporter-1.9.1.linux-amd64]# ./node_exporter --collector.disable-defaults --collector.cpu --collector.uname 温馨提示: 相关指标测试 node_cpu_seconds_total ----》 cpu node_uname_info ----》 uname - Prometheus server实现黑白名单 1.黑名单 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie_k8s_exporter" params: exclude[]: - cpu static_configs: - targets: ["10.1.12.3:9100"] [root@promethues-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config ./prometheus.yml Checking ./prometheus.yml SUCCESS: ./prometheus.yml is valid prometheus config file syntax [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 过滤方式: node_cpu_seconds_total{job="yinzhengjie_k8s_exporter"}

image

image

bash
2.白名单 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie_dba_exporter" params: collect[]: - uname static_configs: - targets: ["10.0.0.41:9100"] [root@promethues-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config ./prometheus.yml Checking ./prometheus.yml SUCCESS: ./prometheus.yml is valid prometheus config file syntax [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 过滤方式: node_uname_info{job="yinzhengjie_dba_exporter"}
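
也可以直接在node_exporter侧用curl验证collect[]/exclude[]参数的过滤效果(exclude[]需要较新版本的node_exporter支持,示例命令仅供参考):

bash
# 白名单: 只采集uname等指定collector的指标
curl -s 'http://10.0.0.41:9100/metrics?collect[]=uname' | grep -c '^node_'
# 黑名单: 排除cpu这个collector,应查不到node_cpu开头的指标
curl -s 'http://10.1.12.3:9100/metrics?exclude[]=cpu' | grep -c '^node_cpu'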

image

image

9、自定义Dashboard

bash
- 自定义Dashboard 1.相关语句 1.1 CPU一分钟内的使用率 (1 - sum(increase(node_cpu_seconds_total{mode="idle"}[1m])) by (instance) / sum(increase(node_cpu_seconds_total[1m])) by (instance)) * 100 1.2 服务器启动时间 (time() - node_boot_time_seconds{job="yinzhengjie_k8s_exporter"}) 1.3 CPU核心数 count(node_cpu_seconds_total{job="weixiang-file-sd",instance="10.1.12.3:9100",mode="idle"}) 1.4 内存总量 node_memory_MemTotal_bytes{instance="10.0.0.42:9100",job="weixiang-file-sd"} 2.自定义变量
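
内存和磁盘使用率也可以参照CPU的写法来计算,下面是两条基于node_exporter指标的常用示例语句:

bash
# 内存使用率(%)
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100
# 根分区使用率(%)
(1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100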

可以参照之前别人做好的Dashboards


上传Dashboard到Grafana官网

bash
Dashboard推送到Grafana官网:https://grafana.com/orgs/eb1360821977/dashboards/new 23842

10、Grafana的表格制作

bash
- Grafana的表格制作 参考链接: https://www.cnblogs.com/yinzhengjie/p/18538430 参考语句: avg(node_uname_info) by (instance,nodename,release) 压力测试: CPU压测: stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 10m 内存压测: stress --cpu 8 --io 4 --vm 4 --vm-bytes 512M --timeout 10m --vm-keep

image

11、Grafana授权

1、用户授权

image

image

2、团队授权

image

3、对指定的Dashboard授权

image

12、Prometheus的存储

1、Prometheus的本地存储
bash
- Prometheus的本地存储 相关参数说明: --web.enable-lifecycle 支持热加载模块。 --storage.tsdb.path=/weixiang/data/prometheus 指定数据存储的路径。 --storage.tsdb.retention.time=60d 指定数据的存储周期。 --web.listen-address=0.0.0.0:9090 配置监听地址。 --web.max-connections=65535 配置连接数。 --config.file 指定Prometheus的配置文件。
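
把上面这些参数组合起来,启动命令大致如下(路径、保留周期等按实际环境调整,仅为示意):

bash
./prometheus \
  --config.file=prometheus.yml \
  --web.enable-lifecycle \
  --storage.tsdb.path=/weixiang/data/prometheus \
  --storage.tsdb.retention.time=60d \
  --web.listen-address=0.0.0.0:9090 \
  --web.max-connections=65535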
2、VictoriaMetrics远端存储
bash
- VictoriaMetrics单机版快速部署 1 VicoriaMetrics概述 VictoriaMetrics是一个快速、经济高效且可扩展的监控解决方案和时间序列数据库。 官网地址: https://victoriametrics.com/ 官方文档: https://docs.victoriametrics.com/ GitHub地址: https://github.com/VictoriaMetrics/VictoriaMetrics 部署文档: https://docs.victoriametrics.com/quick-start/ 2 部署victoriametrics 2.1 下载victoriametrics 版本选择建议使用93 LTS,因为使用97 LTS貌似需要企业授权,启动报错,发现如下信息: [root@prometheus-server33 ~]# journalctl -u victoria-metrics.service -f ... Nov 14 12:03:28 prometheus-server33 victoria-metrics-prod[16999]: 2024-11-14T04:03:28.576Z error VictoriaMetrics/lib/license/copyrights.go:33 VictoriaMetrics Enterprise license is required. Please obtain it at https://victoriametrics.com/products/enterprise/trial/ and pass it via either -license or -licenseFile command-line flags. See https://docs.victoriametrics.com/enterprise/ wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.93.16/victoria-metrics-linux-amd64-v1.93.16.tar.gz SVIP: [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/VictoriaMetrics/victoria-metrics-linux-amd64-v1.93.16.tar.gz 2.2 解压软件包 [root@node-exporter43 ~]# tar xf victoria-metrics-linux-amd64-v1.93.16.tar.gz -C /usr/local/bin/ [root@node-exporter43 ~]# [root@node-exporter43 ~]# ll /usr/local/bin/victoria-metrics-prod -rwxr-xr-x 1 yinzhengjie yinzhengjie 22216200 Jul 18 2024 /usr/local/bin/victoria-metrics-prod* [root@node-exporter43 ~]# 2.3 编写启动脚本 cat > /etc/systemd/system/victoria-metrics.service <<EOF [Unit] Description=weixiang Linux VictoriaMetrics Server Documentation=https://docs.victoriametrics.com/ After=network.target [Service] ExecStart=/usr/local/bin/victoria-metrics-prod \ -httpListenAddr=0.0.0.0:8428 \ -storageDataPath=/weixiang/data/victoria-metrics \ -retentionPeriod=3 [Install] WantedBy=multi-user.target EOF systemctl daemon-reload systemctl enable --now victoria-metrics.service systemctl status victoria-metrics 2.4 检查端口是否存活 [root@node-exporter43 ~]# ss -ntl | grep 8428 LISTEN 0 4096 0.0.0.0:8428 0.0.0.0:* [root@node-exporter43 ~]# 2.5 查看webUI http://10.0.0.43:8428/

image

image

bash
- prometheus配置VictoriaMetrics远端存储 1 修改prometheus的配置文件 [root@promethues-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-linux-VictoriaMetrics-node-exporter" static_configs: - targets: - "10.1.12.15:9100" - "10.1.12.3:9100" - "10.1.12.4:9100" # 在顶级字段中配置VictoriaMetrics地址 remote_write: - url: http://10.1.12.4:8428/api/v1/write 2 重新加载prometheus的配置 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# systemctl stop prometheus-server.service [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ll total 261356 drwxr-xr-x 4 1001 fwupd-refresh 4096 Mar 28 17:18 ./ drwxr-xr-x 3 root root 4096 Mar 26 09:45 ../ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 console_libraries/ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 consoles/ -rw-r--r-- 1 1001 fwupd-refresh 11357 Mar 18 23:05 LICENSE -rw-r--r-- 1 1001 fwupd-refresh 3773 Mar 18 23:05 NOTICE -rw-r--r-- 1 root root 135 Mar 28 15:09 weixiang-file-sd.json -rw-r--r-- 1 root root 148 Mar 28 15:10 weixiang-file-sd.yaml -rwxr-xr-x 1 1001 fwupd-refresh 137836884 Mar 18 22:52 prometheus* -rw-r--r-- 1 root root 5321 Mar 28 15:02 prometheus2025-03-28-AM -rw-r--r-- 1 1001 fwupd-refresh 3296 Mar 28 17:16 prometheus.yml -rw-r--r-- 1 root root 1205 Mar 27 10:05 prometheus.yml2025-03-26 -rw-r--r-- 1 root root 2386 Mar 28 10:06 prometheus.yml2025-03-27 -rwxr-xr-x 1 1001 fwupd-refresh 129719117 Mar 18 22:52 promtool* [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./prometheus 温馨提示: 为了避免实验干扰,我建议大家手动启动prometheus。 3 在VictoriaMetrics的WebUI查看数据 node_cpu_seconds_total{instance="10.1.12.15:9100"} 温馨提示: 如果此步骤没有数据,则不要做下面的步骤了,请先把数据搞出来。 4 配置grafana的数据源及URL 数据源是prometheus,但是URL得写VictoriaMetric的URL哟。 参考URL: http://134.175.108.235:8428
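
除了在VictoriaMetrics自带的vmui中查询,也可以直接调用它兼容Prometheus的查询API确认数据已经写入(示例命令,IP按实际环境调整):

bash
# 查询up指标,能返回结果说明remote_write数据已写入
curl -s 'http://10.1.12.4:8428/api/v1/query?query=up'
# 列出已写入的指标名称
curl -s 'http://10.1.12.4:8428/api/v1/label/__name__/values'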

image

image

bash
5 导入grafana的模板ID并选择数据源 1860

image

3、VictoriaMetrics集群架构远端存储
bash
- VictoriaMetrics集群架构远端存储 1 VictoriaMetrics集群架构概述 - 单点部署参考链接: https://docs.victoriametrics.com/quick-start/#starting-vm-single-from-a-binary - 集群部署参考链接: https://docs.victoriametrics.com/quick-start/#starting-vm-cluster-from-binaries https://docs.victoriametrics.com/cluster-victoriametrics/#architecture-overview 部署集群时软件包要下载对应的集群cluster版本: wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.93.16/victoria-metrics-linux-amd64-v1.93.16-cluster.tar.gz 软件包会提供3个程序,分别对应集群的3个组件 vmstorage: 存储原始数据,并返回给定标签过滤器在给定时间范围内的查询数据 vminsert: 接收写入的数据,并根据度量名称及其所有标签的一致性哈希把数据分发到各vmstorage节点 vmselect: 通过从所有配置的vmstorage节点获取所需数据来执行传入查询 2 VictoriaMetrics集群架构图解 见官网
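
下面给出一个按官方文档默认端口整理的最小启动示意(三个组件起在同一台机器上仅用于演示,生产环境应分机部署;端口与参数以所用版本的官方文档为准):

bash
# 1.存储节点: vminsert默认连它的8400端口,vmselect默认连8401端口
./vmstorage-prod -storageDataPath=/weixiang/data/vmstorage -retentionPeriod=3

# 2.写入节点: 监听8480,Prometheus的remote_write指向 http://<vminsert>:8480/insert/0/prometheus/api/v1/write
./vminsert-prod -storageNode=10.1.12.4

# 3.查询节点: 监听8481,Grafana数据源URL形如 http://<vmselect>:8481/select/0/prometheus
./vmselect-prod -storageNode=10.1.12.4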

13、Prometheus的标签管理

1、概念
bash
- Prometheus的标签管理 1.什么是标签 标签用于对数据分组和分类,利用标签可以将数据进行过滤筛选。 标签管理的常见场景: - 1.删除不必要的指标; - 2.从指标中删除敏感或不需要的标签; - 3.添加,编辑或修改指标的标签值或标签格式; 标签的分类: - 默认标签: Prometheus自身内置的标签,格式为"__LABEL__"。 典型的如下所示: - "__metrics_path__" - "__address__" - "__scheme__" - "__scrape_interval__" - "__scrape_timeout__" - "instance" - "job" - 应用标签: 应用本身内置,尤其是监控特定的服务,会有对应的应用标签,格式一般为"__meta_*"。 以consul服务为例,典型的如下所示: - "__meta_consul_address" - "__meta_consul_dc" - ...

2、自定义标签:
bash
- 自定义标签: 指的是用户自定义的标签,我们在定义targets时可以自定义。 2.标签主要有两种表现形式 - 私有标签: 以"__*"样式存在,用于获取监控目标的默认元数据属性,比如"__scheme__","__address__","__metrics_path__"等。 - 普通标签: 对监控指标进行各种灵活管理操作,常见的操作有删除不必要的敏感数据,添加,编辑或修改指标标签值或者标签格式等。 3.Prometheus对数据处理的流程 - 1.服务发现: 支持静态发现和动态发现,主要是找到对应的target。 - 2.配置: 加载"__scheme__","__address__","__metrics_path__"等信息。 - 3.重新标记: relabel_configs,主要针对要监控的target的标签。 - 4.抓取: 开始抓取数据。 - 5.重新标记: metric_relabel_configs,主要针对已经抓取回来的metrics的标签的操作。 4.为targets自定义打标签案例 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie-node-exporter-lable" static_configs: - targets: ["10.1.12.15:9100","10.1.12.3:9100","10.1.12.4:9100"] labels: auther: yinzhengjie school: weixiang class: weixiang98 热加载 curl -X POST 106.55.44.37:9090/-/reload 5.查看webUI

image

3、relabel_configs替换标签replace案例
bash
1.修改prometheus的配置文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie-node-exporter-relabel_configs" static_configs: - targets: ["10.1.12.15:9100","10.1.12.3:9100","10.1.12.4:9100"] labels: auther: yinzhengjie blog: https://www.cnblogs.com/yinzhengjie relabel_configs: # Prometheus 采集数据之前,对目标的标签(Label)进行重写或转换 # 指定正则表达式匹配成功的label进行标签管理的列表 - source_labels: - __scheme__ - __address__ - __metrics_path__ # 表示source_labels对应Label的名称或值进行匹配此处指定的正则表达式。 # 此处我们对数据进行了分组,后面replacement会使用"${1}"和"$2"进行引用。 regex: "(http|https)(.*)" # 第一个它匹配字符串开头的 "http" 或 "https"。在这里,它匹配到了 "http" 。第二个它匹配到了 "10.1.12.15:9100/metrics"。 # 指定用于连接多个source_labels为一个字符串的分隔符,若不指定,默认为分号";"。 # 假设源数据如下: # __address__="10.1.12.15:9100" # __metrics_path__="/metrics" # __scheme__="http" # 拼接后操作的结果为: "http10.1.24.13:9100/metrics" separator: "" # 在进行Label替换的时候,可以将原来的source_labels替换为指定修改后的label。 # 将来会新加一个标签,标签的名称为"yinzhengjie_prometheus_ep",值为replacement的数据。 target_label: "yinzhengjie_prometheus_ep" # 替换标签时,将target_label对应的值进行修改成此处的值,最终要生成的“替换值” replacement: "${1}://${2}" # 对Label或指标进行管理,场景的动作有replace|keep|drop|lablemap|labeldrop等,默认为replace。 action: replace 总结:整个流程串起来 对于目标 10.1.12.15:9100: 收集源:source_labels 拿到 "http", "10.1.12.15:9100", "/metrics"。 拼接字符串:separator: "" 将它们变成 "http10.1.12.15:9100/metrics"。 正则匹配:regex 匹配成功,捕获到 ${1}="http"${2}="10.1.12.15:9100/metrics"。 生成新值:replacement 使用捕获组生成了新字符串 "http://10.1.12.15:9100/metrics"。 应用结果:action: replace 创建了一个新标签 target_label,最终效果是给这个 target 增加了一个标签: yinzhengjie_prometheus_ep="http://10.1.12.15:9100/metrics"。 2.热加载配置 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl 10.0.0.31:9090/-/reload -X POST 3.webUI验证 略,见视频。 总结: 相对来说,relabel_configs和labels的作用类似,也是为实例打标签,只不过relabel_configs的功能性更强。 我们可以基于标签来对监控指标进行过滤。

image

bash
[root@promethues-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... separator: "%%" # 把分隔符改成%% ...

image

4、relabel_configs新增标签映射labelmap案例
bash
1.修改prometheus的配置文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie-node-exporter-relabel_configs-labeldrop" static_configs: - targets: ["10.1.12.15:9100","10.1.12.3:9100","10.1.12.4:9100"] relabel_configs: # 目标重写 - regex: "(job|app)" # 2.它会检查每一个标签的名称 (label name),看是否能匹配这个正则表达式 replacement: "${1}_yinzhengjie_labelmap_kubernetes" # 3. 标签名匹配成功,新的标签名是job_yinzhengjie_labelmap_kubernetes action: labelmap # 1.这个动作会遍历目标上所有的标签(labels)。 - regex: "(job|app)" action: labeldrop # 4.这个动作会再次遍历目标上所有的标签,删除regex匹配到的标签,job_yinzhengjie_labelmap_kubernetes 不匹配 (job|app)(因为正则没有 .*),所以这个标签被保留。 总结:整个流程串起来 初始状态:目标有一个标签 job="yinzhengjie..."。 labelmap 操作: 发现 job 标签名匹配 (job|app)。 复制 job 标签,并将其重命名为 job_yinzhengjie_labelmap_kubernetes。 此时目标同时拥有 job 和 job_yinzhengjie_labelmap_kubernetes 两个标签。 labeldrop 操作: 发现 job 标签名匹配 (job|app)。 将 job 标签删除。 最终结果:原始的 job 标签被成功地重命名为了 job_yinzhengjie_labelmap_kubernetes。 2.热加载配置 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl 10.0.0.31:9090/-/reload -X POST 3.webUI验证 略,见视频。

image

5、metric_relabel_configs修改metric标签案例
bash
1.修改prometheus的配置文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie-node-exporter-metric_relabel_configs" static_configs: - targets: ["10.1.12.15:9100","10.1.12.3:9100","10.1.12.4:9100"] metric_relabel_configs: # 指标重写 - source_labels: # 指定要操作的源头。在这里,__name__ 是一个非常特殊的内部标签,它代表指标的名称。 - __name__ regex: "node_cpu_.*" # 匹配所有以 node_cpu_ 开头的字符串 action: drop 总结:整个流程 Prometheus 向 10.1.12.15:9100 等目标发起 HTTP 请求,成功抓取到一大堆指标数据。 在将这些数据写入磁盘之前,metric_relabel_configs 规则开始生效。 它逐一检查每条指标的名称 (__name__)。 任何名称以 node_cpu_ 开头的指标,比如 node_cpu_seconds_total{cpu="0",mode="idle"},都会被直接丢弃。 其他不匹配的指标,如 node_memory_MemFree_bytes,则会被保留并正常存储。 2.热加载配置 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl 10.0.0.31:9090/-/reload -X POST 3.webUI验证

14、部署blackbox-exporter黑盒监控

1.blackbox-exporter概述
bash
一般用于监控网站是否监控,端口是否存活,证书有效期等。 blackbox exporter支持基于HTTP, HTTPS, DNS, TCP, ICMP, gRPC协议来对目标节点进行监控。 比如基于http协议我们可以探测一个网站的返回状态码为200判读服务是否正常。 比如基于TCP协议我们可以探测一个主机端口是否监听。 比如基于ICMP协议来ping一个主机的连通性。 比如基于gRPC协议来调用接口并验证服务是否正常工作。 比如基于DNS协议可以来检测域名解析。 2.下载blackbox-exporter wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.27.0/blackbox_exporter-0.27.0.linux-amd64.tar.gz SVIP: [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/blackbox_exporter/blackbox_exporter-0.27.0.linux-amd64.tar.gz 3.解压软件包 [root@node-exporter43 ~]# tar xf blackbox_exporter-0.27.0.linux-amd64.tar.gz -C /usr/local/ [root@node-exporter43 ~]# [root@node-exporter43 ~]# cd /usr/local/blackbox_exporter-0.27.0.linux-amd64/ [root@node-exporter43 blackbox_exporter-0.27.0.linux-amd64]# [root@node-exporter43 blackbox_exporter-0.27.0.linux-amd64]# ll total 30800 drwxr-xr-x 2 1001 1002 4096 Jun 30 20:48 ./ drwxr-xr-x 11 root root 4096 Aug 6 14:35 ../ -rwxr-xr-x 1 1001 1002 31509376 Jun 30 20:46 blackbox_exporter* -rw-r--r-- 1 1001 1002 1209 Jun 30 20:47 blackbox.yml -rw-r--r-- 1 1001 1002 11357 Jun 30 20:47 LICENSE -rw-r--r-- 1 1001 1002 94 Jun 30 20:47 NOTICE [root@node-exporter43 blackbox_exporter-0.27.0.linux-amd64]# [root@node-exporter43 blackbox_exporter-0.27.0.linux-amd64]# 4.启动blackbox服务 [root@node-exporter43 blackbox_exporter-0.27.0.linux-amd64]# ./blackbox_exporter 5.访问blackbox的WebUI http://134.175.108.235:9115/ 6.访问测试 http://134.175.108.235:9115/probe?target=www.weixiang.com&module=http_2xx http://134.175.108.235:9115/probe?target=prometheus.io&module=http_2xx
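
probe接口返回的就是一次探测的结果指标,其中probe_success为1表示探测成功,probe_http_status_code为HTTP返回码(示例命令,地址按实际环境调整):

bash
curl -s 'http://134.175.108.235:9115/probe?target=prometheus.io&module=http_2xx' \
  | grep -E '^probe_success|^probe_http_status_code'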

image

image

image

2、Prometheus server整合blackbox实现网站监控
bash
1.修改Prometheus的配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... # 指定作业的名称,生成环境中,通常是指一类业务的分组配置。 - job_name: 'weixiang-blackbox-exporter-http' # 修改访问路径,若不修改,默认值为"/metrics" metrics_path: /probe # 配置URL的相关参数 params: # 此处表示使用的是blackbox的http模块,从而判断相应的返回状态码是否为200 module: [http_2xx] school: [weixiang] # 静态配置,需要手动指定监控目标 static_configs: # 需要监控的目标 - targets: # 支持https协议 - https://www.weixiang.com/ # 支持http协议 - http://10.1.12.15 # 支持http协议和自定义端口 - http://10.1.24.13:9090 # 对目标节点进行重新打标签配置 relabel_configs: # 指定源标签,此处的"__address__"表示内置的标签,存储的是被监控目标的IP地址 - source_labels: [__address__] # 指定目标标签,其实就是在"Endpoint"中加了一个target字段(用于指定监控目标), target_label: __param_target # 指定需要执行的动作,默认值为"replace",常用的动作有: replace, keep, and drop。 # 但官方支持十几种动作: https://prometheus.io/docs/prometheus/2.45/configuration/configuration/ # 将"__address__"传递给target字段。 action: replace - source_labels: [__param_target] target_label: instance - target_label: __address__ # 指定要替换的值,此处我指定为blackbox exporter的主机地址 replacement: 10.1.12.4:9115 [root@prometheus-server31 ~]# 2.热加载配置 [root@prometheus-server31 ~]# curl -X POST http://10.0.0.31:9090/-/reload 3.验证webUI http://10.0.0.31:9090/targets?search= 4.导入grafana的模板ID 7587 13659
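
对https站点,blackbox还会暴露证书到期时间相关指标,常用下面的语句计算剩余天数并据此配置告警(示例PromQL):

bash
# 证书剩余有效天数
(probe_ssl_earliest_cert_expiry - time()) / 86400
# 探测失败的目标
probe_success == 0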

image

image

3、prometheus基于blackbox的ICMP监控目标主机是否存活
bash
1 修改Prometheus配置文件 [root@prometheus-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: 'weixiang-blackbox-exporter-icmp' metrics_path: /probe params: # 如果不指定模块,则默认类型为"http_2xx",不能乱写!乱写监控不到服务啦! module: [icmp] static_configs: - targets: - 10.1.12.15 - 10.1.12.3 relabel_configs: - source_labels: [__address__] target_label: __param_target - source_labels: [__param_target] # 指定注意的是,如果instance不修改,则instance和"__address__"的值相同 # target_label: ip target_label: instance - target_label: __address__ replacement: 10.1.12.4:9115 2 检查配置文件是否正确 [root@prometheus-server31 ~]# cd /weixiang/softwares/prometheus-2.53.4.linux-amd64/ [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ll total 261348 drwxr-xr-x 5 1001 fwupd-refresh 4096 Mar 28 14:35 ./ drwxr-xr-x 3 root root 4096 Mar 26 09:45 ../ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 console_libraries/ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 consoles/ drwxr-xr-x 4 root root 4096 Mar 26 14:49 data/ -rw-r--r-- 1 1001 fwupd-refresh 11357 Mar 18 23:05 LICENSE -rw-r--r-- 1 1001 fwupd-refresh 3773 Mar 18 23:05 NOTICE -rwxr-xr-x 1 1001 fwupd-refresh 137836884 Mar 18 22:52 prometheus* -rw-r--r-- 1 1001 fwupd-refresh 4858 Mar 28 14:35 prometheus.yml -rw-r--r-- 1 root root 1205 Mar 27 10:05 prometheus.yml2025-03-26 -rw-r--r-- 1 root root 2386 Mar 28 10:06 prometheus.yml2025-03-27 -rwxr-xr-x 1 1001 fwupd-refresh 129719117 Mar 18 22:52 promtool* [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 3 重新加载配置 [root@prometheus-server31 ~]# curl -X POST http://10.0.0.31:9090/-/reload [root@prometheus-server31 ~]# 4 访问prometheus的WebUI http://10.0.0.31:9090/targets 5 访问blackbox的WebUI http://118.89.55.174:9115/ 6 grafana过滤jobs数据 基于"weixiang-blackbox-exporter-icmp"标签进行过滤。

image

image

4、prometheus基于blackbox的TCP案例监控端口是否存活
bash
1 修改Prometheus配置文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: 'weixiang-blackox-exporter-tcp' metrics_path: /probe params: module: [tcp_connect] static_configs: - targets: - 10.1.12.15:80 - 10.1.12.3:22 - 10.1.24.13:9090 relabel_configs: - source_labels: [__address__] target_label: __param_target - source_labels: [__param_target] target_label: instance - target_label: __address__ replacement: 10.1.12.4:9115 2 检查配置文件是否正确 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 3 重新加载配置文件 [root@prometheus-server31 ~]# curl -X POST http://10.0.0.31:9090/-/reload [root@prometheus-server31 ~]# 4 访问prometheus的WebUI http://10.0.0.31:9090/targets 5 访问blackbox exporter的WebUI http://10.0.0.41:9115/ 6 使用grafana查看数据 基于"weixiang-blackbox-exporter-tcp"标签进行过滤。

15、prometheus的联邦模式

1、为什么需要联邦模式
bash
1.为什么需要联邦模式 联邦模式的主要作用是减轻单台Prometheus server的采集和I/O压力: 由多台下级Prometheus分片采集各自的目标,实现数据的分布式存储,上级Prometheus再通过/federate接口按需汇聚关键指标。
2、联邦模式实战
bash
2.prometheus 31节点监控41节点 2.1 拷贝配置文件 [root@promethues-server31 ~]# scp weixiang-install-prometheus-server-v2.53.4.tar.gz 10.1.20.5:~ 2.2 安装Prometheus-server [root@prometheus-server32 ~]# tar xf weixiang-install-prometheus-server-v2.53.4.tar.gz [root@prometheus-server32 ~]# ./install-prometheus-server.sh i 2.3 修改prometheus的配置文件 [root@prometheus-server32 ~]# cd /weixiang/softwares/prometheus-2.53.4.linux-amd64/ [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# ll total 261324 drwxr-xr-x 4 1001 fwupd-refresh 4096 Mar 18 23:08 ./ drwxr-xr-x 3 root root 4096 Aug 6 15:51 ../ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 console_libraries/ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 consoles/ -rw-r--r-- 1 1001 fwupd-refresh 11357 Mar 18 23:05 LICENSE -rw-r--r-- 1 1001 fwupd-refresh 3773 Mar 18 23:05 NOTICE -rwxr-xr-x 1 1001 fwupd-refresh 137836884 Mar 18 22:52 prometheus* -rw-r--r-- 1 1001 fwupd-refresh 934 Mar 18 23:05 prometheus.yml -rwxr-xr-x 1 1001 fwupd-refresh 129719117 Mar 18 22:52 promtool* [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# tail -5 prometheus.yml - job_name: "weixiang-file-sd-yaml-node_exporter" file_sd_configs: - files: - /weixiang/softwares/prometheus-2.53.4.linux-amd64/sd/*.yaml [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# 2.4 创建服务发现的文件 [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# mkdir sd [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# vim sd/*.yaml [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# cat sd/*.yaml - targets: - '10.1.12.15:9100' labels: school: weixiang class: weixiang98 [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# 2.5 检查配置文件 [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# 2.6 热加载配置 [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# curl -X POST 10.1.20.5:9090/-/reload [root@prometheus-server32 prometheus-2.53.4.linux-amd64]#

image

bash
3. Prometheus 33节点监控42,43节点 3.1 启动consul集群 服务端43: consul agent -server -bootstrap -bind=10.1.12.4 -data-dir=/weixiang/softwares/consul -client=10.1.12.4 -ui 客户端42: consul agent -bind=10.1.12.3 -data-dir=/weixiang/softwares/consul -client=10.1.12.3 -ui -retry-join=10.1.12.4 客户端41: consul agent -server -bind=10.1.12.15 -data-dir=/weixiang/softwares/consul -client=10.1.12.15 -ui -retry-join=10.1.12.4 3.2 拷贝配置文件 [root@promethues-server31 ~]# scp weixiang-install-prometheus-server-v2.53.4.tar.gz 10.1.24.4:~ 3.3 安装Prometheus-server [root@prometheus-server33 ~]# tar xf weixiang-install-prometheus-server-v2.53.4.tar.gz [root@prometheus-server33 ~]# ./install-prometheus-server.sh i 3.4 修改prometheus的配置文件 [root@prometheus-server33 ~]# cd /weixiang/softwares/prometheus-2.53.4.linux-amd64/ [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# ll total 261328 drwxr-xr-x 4 1001 fwupd-refresh 4096 Mar 18 23:08 ./ drwxr-xr-x 3 root root 4096 Aug 6 15:55 ../ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 console_libraries/ drwxr-xr-x 2 1001 fwupd-refresh 4096 Mar 18 23:05 consoles/ -rw-r--r-- 1 1001 fwupd-refresh 11357 Mar 18 23:05 LICENSE -rw-r--r-- 1 1001 fwupd-refresh 3773 Mar 18 23:05 NOTICE -rwxr-xr-x 1 1001 fwupd-refresh 137836884 Mar 18 22:52 prometheus* -rw-r--r-- 1 1001 fwupd-refresh 934 Mar 18 23:05 prometheus.yml -rwxr-xr-x 1 1001 fwupd-refresh 129719117 Mar 18 22:52 promtool* [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# vim prometheus.yml [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# tail prometheus.yml - job_name: "weixiang-consul-sd-node_exporter" consul_sd_configs: - server: 10.1.12.4:8500 - server: 10.1.12.3:8500 - server: 10.1.12.15:8500 relabel_configs: - source_labels: [__meta_consul_service] regex: consul action: drop [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# 3.5 检查配置文件 [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# 2.6 热加载配置 [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# curl -X POST 10.1.24.4:9090/-/reload [root@prometheus-server33 prometheus-2.53.4.linux-amd64]# 2.7 42和43注册节点 [root@node-exporter41 ~]# curl -X PUT -d '{"id":"prometheus-node42","name":"weixiang-prometheus-node42","address":"10.1.12.3","port":9100,"tags":["node-exporter"],"checks": [{"http":"http://10.1.12.3:9100","interval":"5m"}]}' http://10.1.12.4:8500/v1/agent/service/register curl -X PUT -d '{"id":"prometheus-node43","name":"weixiang-prometheus-node43","address":"10.1.12.4","port":9100,"tags":["node-exporter"],"checks": [{"http":"http://10.1.12.4:9100","interval":"5m"}]}' http://10.1.12.4:8500/v1/agent/service/register 彩蛋: 注销服务 curl -X PUT http://10.1.12.4:8500/v1/agent/service/deregister/prometheus-node42 2.8 验证服务是否生效 http://43.139.77.96:9090/targets

image

bash
3.prometheus 31节点配置联邦模式实战 3.1 修改配置文件 [root@promethues-server31 ~]# vim /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml ... - job_name: "weixiang-prometheus-federate-32" metrics_path: "/federate" # 用于解决标签的冲突问题,有效值为: true和false,默认值为false # 当设置为true时,将保留抓取的标签以忽略服务器自身的标签。说白了会覆盖原有标签。 # 当设置为false时,则不会覆盖原有标签,而是在标点前加了一个"exported_"前缀。 honor_labels: true params: "match[]": - '{job="promethues"}' - '{__name__=~"job:.*"}' - '{__name__=~"node.*"}' static_configs: - targets: - "10.1.20.5:9090" - job_name: "weixiang-prometheus-federate-33" metrics_path: "/federate" honor_labels: true params: "match[]": - '{job="promethues"}' - '{__name__=~"job:.*"}' - '{__name__=~"node.*"}' static_configs: - targets: - "10.1.24.4:9090" [root@promethues-server31 ~]# 3.2 检查配置文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 3.3 热加载配置文件 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl -X POST 10.1.24.13:9090/-/reload [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 3.4 验证服务是否生效 http://106.55.44.37:9090/

image

bash
基于如下的PromQL查询:
node_cpu_seconds_total{job=~"weixiang.*sd.*"}[20s]

3.5 grafana导入模板ID
1860
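node_cpu_seconds_total 是计数器类型,直接查看原始样本意义有限,实际观察 CPU 使用率时通常配合 irate/rate 计算,下面是一个常见写法示例(5m 为采样窗口,可按需调整):

bash
# 各实例的 CPU 使用率(%)
100 - avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100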

image

image

3、prometheus监控consul应用
bash
1.下载consul exporter wget https://github.com/prometheus/consul_exporter/releases/download/v0.13.0/consul_exporter-0.13.0.linux-amd64.tar.gz svip: [root@prometheus-server33 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/Consul/consul_exporter/consul_exporter-0.13.0.linux-amd64.tar.gz 2.解压软件包 [root@prometheus-server33 ~]# tar xf consul_exporter-0.13.0.linux-amd64.tar.gz -C /usr/local/bin/ consul_exporter-0.13.0.linux-amd64/consul_exporter --strip-components=1 [root@prometheus-server33 ~]# [root@prometheus-server33 ~]# ll /usr/local/bin/consul_exporter -rwxr-xr-x 1 1001 1002 19294344 Nov 6 21:38 /usr/local/bin/consul_exporter* [root@prometheus-server33 ~]# 3.运行consul_exporter [root@prometheus-server33 ~]# consul_exporter --consul.server="http://10.0.0.41:8500" --web.telemetry-path="/metrics" --web.listen-address=:9107 4.访问console_exporter的webUI http://10.0.0.33:9107/metrics 5.prometheus server监控consul_exporter [root@promethues-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "weixiang-consul-exporter" static_configs: - targets: - 10.0.0.33:9107 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# [root@promethues-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 6.热加载配置 [root@promethues-server31 prometheus-2.53.4.linux-amd64]# curl -X POST 10.0.0.31:9090/-/reload [root@promethues-server31 prometheus-2.53.4.linux-amd64]# 7.验证配置是否生效 http://10.0.0.31:9090/targets 8.Grafana导入模板ID 8919
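上面是前台方式运行 consul_exporter,若希望开机自启并交给 systemd 托管,可以参考下面的示意 unit 文件(unit 文件名为假设,启动参数沿用上文):

bash
cat > /etc/systemd/system/consul-exporter.service <<'EOF'
[Unit]
Description=Prometheus Consul Exporter
After=network.target

[Service]
# 参数与上文手动启动时保持一致
ExecStart=/usr/local/bin/consul_exporter \
  --consul.server="http://10.0.0.41:8500" \
  --web.telemetry-path="/metrics" \
  --web.listen-address=:9107
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload && systemctl enable --now consul-exporter.service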


4、联邦模式采集nginx
bash
1 编译安装nginx 1.1 安装编译工具 [root@node-exporter41 ~]# cat /etc/apt/sources.list # 默认注释了源码镜像以提高 apt update 速度,如有需要可自行取消注释 deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy main restricted universe multiverse deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse # 以下安全更新软件源包含了官方源与镜像站配置,如有需要可自行修改注释切换 deb http://security.ubuntu.com/ubuntu/ jammy-security main restricted universe multiverse # deb-src http://security.ubuntu.com/ubuntu/ jammy-security main restricted universe multiverse # 预发布软件源,不建议启用 # deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-proposed main restricted universe multiverse # # deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-proposed main restricted universe multiverse [root@node-exporter41 ~]# [root@node-exporter41 ~]# apt update [root@node-exporter41 ~]# [root@node-exporter41 ~]# apt -y install git wget gcc make zlib1g-dev build-essential libtool openssl libssl-dev 参考链接: https://mirrors.tuna.tsinghua.edu.cn/help/ubuntu/ 1.2 克隆nginx-module-vts模块 git clone https://gitee.com/jasonyin2020/nginx-module-vts.git 1.3 下载nginx软件包 wget https://nginx.org/download/nginx-1.28.0.tar.gz 1.4 解压nginx tar xf nginx-1.28.0.tar.gz 1.5 配置nginx cd nginx-1.28.0/ ./configure --prefix=/weixiang/softwares/nginx --with-http_ssl_module --with-http_v2_module --with-http_realip_module --without-http_rewrite_module --with-http_stub_status_module --without-http_gzip_module --with-file-aio --with-stream --with-stream_ssl_module --with-stream_realip_module --add-module=/root/nginx-module-vts 1.6 编译并安装nginx make -j 2 && make install 1.7 修改nginx的配置文件 vim /weixiang/softwares/nginx/conf/nginx.conf ... http { vhost_traffic_status_zone; upstream weixiang-promethues { server 10.0.0.31:9090; } ... server { ... 
location / { root html; # index index.html index.htm; proxy_pass http://weixiang-promethues; } location /status { vhost_traffic_status_display; vhost_traffic_status_display_format html; } } } 1.8 检查配置文件语法 /weixiang/softwares/nginx/sbin/nginx -t 1.9 启动nginx /weixiang/softwares/nginx/sbin/nginx 1.10 访问nginx的状态页面 http://118.89.55.174/status/format/prometheus 2 安装nginx-vtx-exporter 2.1 下载nginx-vtx-exporter wget https://github.com/sysulq/nginx-vts-exporter/releases/download/v0.10.8/nginx-vtx-exporter_0.10.8_linux_amd64.tar.gz SVIP: wget http://192.168.21.253/Resources/Prometheus/softwares/nginx_exporter/nginx-vtx-exporter_0.10.8_linux_amd64.tar.gz 2.2 解压软件包到path路径 [root@node-exporter42 ~]# tar xf nginx-vtx-exporter_0.10.8_linux_amd64.tar.gz -C /usr/local/bin/ nginx-vtx-exporter [root@node-exporter42 ~]# [root@node-exporter42 ~]# ll /usr/local/bin/nginx-vtx-exporter -rwxr-xr-x 1 1001 avahi 7950336 Jul 11 2023 /usr/local/bin/nginx-vtx-exporter* [root@node-exporter42 ~]# 2.3 运行nginx-vtx-exporter [root@node-exporter42 ~]# nginx-vtx-exporter -nginx.scrape_uri=http://10.0.0.41/status/format/json 这是 10.1.20.5 上 prometheus.yml 的配置: # 全局配置 global: scrape_interval: 15s # 设置默认的抓取间隔为15秒 evaluation_interval: 15s # 设置默认的告警规则评估间隔为15秒 # scrape_timeout 默认是 10s # 告警管理器(Alertmanager)的配置,如果暂时不用可以忽略 alerting: alertmanagers: - static_configs: - targets: # - alertmanager:9093 # 抓取配置 (这是核心部分) scrape_configs: # Prometheus自身的监控 - job_name: 'prometheus' static_configs: - targets: ['localhost:9090'] # ========================================================== # 【新增】Nginx 监控的配置 # ========================================================== - job_name: 'nginx' # 使用 static_configs,因为你的 exporter 地址是固定的 static_configs: - targets: ['10.1.12.3:9913'] # <-- 你的 nginx-vtx-exporter 的地址和端口 labels: # 添加一些标签,方便查询和区分。非常重要! instance: 'nginx-server-10.1.12.15' # 标记这个数据是来自哪个Nginx实例 group: 'production' # 比如标记为生产环境 app: 'nginx' 配置联邦模式 # ... global 和 alerting 配置 ... scrape_configs: - job_name: 'prometheus' static_configs: - targets: ['localhost:9090'] - job_name: 'nginx' static_configs: - targets: ['10.1.12.3:9913'] labels: instance: 'nginx-server-10.1.12.15' # 重要:添加一个标签来区分这个数据源于哪个局部Prometheus cluster: 'cluster-A' # 假设你还有一个Tomcat Exporter - job_name: 'tomcat' static_configs: - targets: ['10.1.12.4:9404'] # 假设Tomcat Exporter在10.1.12.4上 labels: instance: 'tomcat-app-1' cluster: 'cluster-A' 10.1.24.13 上的 prometheus.yml # ... global 和 alerting 配置 ... scrape_configs: # 这个job用于从其他Prometheus实例拉取数据 - job_name: 'federate' scrape_interval: 60s # 联邦模式的采集间隔通常更长 honor_labels: true # 非常重要!保留从局部Prometheus拉取过来的标签(如job, instance) metrics_path: /federate # 指定联邦模式的端点 # params 用于过滤需要拉取的时间序列 # match[] 是一个选择器,这里我们选择拉取所有非Prometheus自身监控的指标 params: 'match[]': - '{job!="prometheus"}' # 不拉取局部Prometheus自身的指标 - '{__name__=~"nginx_.*|tomcat_.*"}' # 或者更精确地,只拉取你关心的指标 static_configs: - targets: - '10.1.20.5:9090' # 局部Prometheus A - '10.1.24.4:9090' # 局部Prometheus B

16、Grafana配置MySQL作为数据源

bash
[root@node-exporter43 ~]# docker run --name mysql-server \ > -p 3306:3306 \ > -e MYSQL_ROOT_PASSWORD=Sdms2018 \ > -e MYSQL_DATABASE=prometheus \ > -e MYSQL_USER=weixiang98 \ > -e MYSQL_PASSWORD=yinzhengjie \ > -v mysql-server-data:/var/lib/mysql \ > -d mysql:8.0.36-oracle \ > --default-authentication-plugin=mysql_native_password 4ed972eb108a6f143fe097ded438f635faa8c996969d5bac6da82934ed8515a2 1.查看MySQL数据库 [root@node-exporter43 ~]# ss -ntl | grep 3306 LISTEN 0 151 *:3306 *:* LISTEN 0 70 *:33060 *:* [root@node-exporter43 ~]# [root@node-exporter43 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 0204416718e1 mysql:8.0.36-oracle "docker-entrypoint.s…" 32 hours ago Up 8 hours mysql-server [root@node-exporter43 ~]# [root@node-exporter43 ~]# docker inspect mysql-server [ { "Id": "0204416718e13d2d68bc72f1d95371e3a9d14c1ce4f1d6f366fcaf9f3eea7ceb", ... "Env": [ "MYSQL_DATABASE=prometheus", "MYSQL_USER=weixiang98", "MYSQL_PASSWORD=yinzhengjie", "MYSQL_ALLOW_EMPTY_PASSWORD=yes", ... ], ... [root@node-exporter43 ~]# docker exec -it mysql-server mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 8 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> SHOW DATABASES; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | prometheus | | sys | +--------------------+ 5 rows in set (0.71 sec) mysql> mysql> USE prometheus Database changed mysql> SHOW TABLES; Empty set (0.01 sec) mysql> 2.修改Grafana的配置文件 [root@promethues-server31 ~]# vim /etc/grafana/grafana.ini ... [database] ... type = mysql host = 10.0.0.43:3306 name = prometheus user = weixiang98 password = yinzhengjie 扩展知识: [security] ... # 在首次启动 Grafana 时禁用管理员用户创建,说白了,就是不创建管理员用户(admin)。 ;disable_initial_admin_creation = false # 默认管理员用户,启动时创建,可以修改,若不指定,则默认为admin。 ;admin_user = admin # 指定默认的密码。 ;admin_password = admin # 默认的邮箱地址。 ;admin_email = admin@localhost 3.重启Grafana使得配置生效 [root@promethues-server31 ~]# systemctl restart grafana-server.service [root@promethues-server31 ~]# [root@promethues-server31 ~]# ss -ntl | grep 3000 LISTEN 0 4096 *:3000 *:* [root@promethues-server31 ~]# 4.验证MySQL [root@node-exporter43 ~]# docker exec -it mysql-server mysql prometheus Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 19 Server version: 8.0.36 MySQL Community Server - GPL Copyright (c) 2000, 2024, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. 
mysql> SELECT DATABASE(); +------------+ | DATABASE() | +------------+ | prometheus | +------------+ 1 row in set (0.00 sec) mysql> SHOW TABLES; +-------------------------------+ | Tables_in_prometheus | +-------------------------------+ | alert | | alert_configuration | | alert_configuration_history | | alert_image | | alert_instance | | alert_notification | | alert_notification_state | | alert_rule | | alert_rule_tag | | alert_rule_version | | annotation | | annotation_tag | | api_key | | builtin_role | | cache_data | | correlation | | dashboard | | dashboard_acl | | dashboard_provisioning | | dashboard_public | | dashboard_public_email_share | | dashboard_public_magic_link | | dashboard_public_session | | dashboard_public_usage_by_day | | dashboard_snapshot | | dashboard_tag | | dashboard_usage_by_day | | dashboard_usage_sums | | dashboard_version | | data_keys | | data_source | | data_source_acl | | data_source_cache | | data_source_usage_by_day | | entity_event | | file | | file_meta | | folder | | kv_store | | library_element | | library_element_connection | | license_token | | login_attempt | | migration_log | | ngalert_configuration | | org | | org_user | | permission | | playlist | | playlist_item | | plugin_setting | | preferences | | provenance_type | | query_history | | query_history_star | | quota | | recording_rules | | remote_write_targets | | report | | report_dashboards | | report_settings | | role | | secrets | | seed_assignment | | server_lock | | session | | setting | | short_url | | star | | tag | | team | | team_group | | team_member | | team_role | | temp_user | | test_data | | user | | user_auth | | user_auth_token | | user_dashboard_views | | user_role | | user_stats | +-------------------------------+ 82 rows in set (0.00 sec) mysql>

image

1、配置邮件告警
bash
cat > blackbox.yml <<'EOF' modules: http_ssl_check: prober: http timeout: 10s http: # 我们不需要特别的http配置,但tls_config是最佳实践 # 它会验证证书链是否有效 tls_config: insecure_skip_verify: false EOF docker run -d \ --name blackbox-exporter \ -p 9115:9115 \ -v "$(pwd)/blackbox.yml:/config/blackbox.yml" \ prom/blackbox-exporter:latest \ --config.file=/config/blackbox.yml [root@node-exporter42 Blackbox]# curl 'http://10.1.12.3:9115/probe?module=http_ssl_check&target=https://kubernetes.io'

如果你看到一大堆以 probe_ 开头的 Prometheus 指标,特别是 probe_ssl_earliest_cert_expiry,并且 probe_success 的值为 1,那就说明 Blackbox Exporter 已经成功运行了。
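拿到 probe_ssl_earliest_cert_expiry 之后,通常会把它换算成"证书剩余天数"来做看板或告警,示例 PromQL 如下:

bash
# 证书剩余有效天数,小于 30 天时可考虑触发告警
(probe_ssl_earliest_cert_expiry - time()) / 86400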

bash
[root@prometheus-server32 prometheus-2.53.4.linux-amd64]# ./promtool check config /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml Checking /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml SUCCESS: /weixiang/softwares/prometheus-2.53.4.linux-amd64/prometheus.yml is valid prometheus config file syntax [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# curl -X POST 10.1.20.5:9090/-/reload [root@prometheus-server32 prometheus-2.53.4.linux-amd64]# curl cip.cc

17、pushgateway

1.什么是pushgateway
bash
说白了,就是自定义监控。 2.部署pushgateway wget https://github.com/prometheus/pushgateway/releases/download/v1.11.1/pushgateway-1.11.1.linux-amd64.tar.gz SVIP: [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/pushgateway/pushgateway-1.11.1.linux-amd64.tar.gz 3.解压软件包 [root@node-exporter43 ~]# tar xf pushgateway-1.11.1.linux-amd64.tar.gz -C /usr/local/bin/ pushgateway-1.11.1.linux-amd64/pushgateway --strip-components=1 [root@node-exporter43 ~]# [root@node-exporter43 ~]# ll /usr/local/bin/pushgateway -rwxr-xr-x 1 1001 1002 21394840 Apr 9 21:24 /usr/local/bin/pushgateway* [root@node-exporter43 ~]# 4.运行pushgateway [root@node-exporter43 ~]# pushgateway --web.telemetry-path="/metrics" --web.listen-address=:9091 --persistence.file=/weixiang/data/pushgateway.data 5.访问pushgateway的WebUI http://134.175.108.235:9091/# 6.模拟直播在线人数统计 6.1 使用curl工具推送测试数据pushgateway [root@node-exporter42 ~]# echo "student_online 35" | curl --data-binary @- http://10.1.12.4:9091/metrics/job/weixiang_student/instance/10.1.12.3

image

image

bash
6.2 pushgateway查询数据是否上传成功 [root@node-exporter43 ~]# curl -s http://10.1.12.4:9091/metrics | grep student_online # TYPE student_online untyped student_online{instance="10.1.12.3",job="weixiang_student"} 35 [root@node-exporter43 ~]# 7.Prometheus server监控pushgateway 7.1 修改prometheus的配置文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "weixiang-pushgateway" honor_labels: true params: static_configs: - targets: - 10.1.12.4:9091 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 7.2 热加载配置 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST 10.1.24.13:9090/-/reload [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 7.3 验证配置是否生效 http://10.0.0.31:9090/targets?search=

image

bash
7.4 查询特定指标 student_online 8.Grafana出图展示

image

bash
9.模拟直播间人数的变化 [root@node-exporter42 ~]# echo "student_online $RANDOM" | curl --data-binary @- http://10.1.12.4:9091/metrics/job/weixiang_student/instance/10.1.12.3 [root@node-exporter42 ~]# echo "student_online $RANDOM" | curl --data-binary @- http://10.1.12.4:9091/metrics/job/weixiang_student/instance/10.1.12.3
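测试完成后,如果不再需要该分组的指标,可以对同一路径发起 DELETE 请求清理,避免 pushgateway 一直缓存旧值:

bash
# 删除 job=weixiang_student、instance=10.1.12.3 分组下的全部指标
curl -X DELETE http://10.1.12.4:9091/metrics/job/weixiang_student/instance/10.1.12.3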

image

2.使用pushgateway监控TCP的十二种状态。
bash
Prometheus监控TCP的12种状态 1.监控TCP的12种状态 [root@node-exporter42 ~]# cat /usr/local/bin/tcp_status.sh #!/bin/bash pushgateway_url="http://10.1.12.4:9091/metrics/job/tcp_status" state="SYN-SENT SYN-RECV FIN-WAIT-1 FIN-WAIT-2 TIME-WAIT CLOSE CLOSE-WAIT LAST-ACK LISTEN CLOSING ESTAB UNKNOWN" for i in $state do count=`ss -tan |grep $i |wc -l` echo tcp_connections{state=\""$i"\"} $count >> /tmp/tcp.txt done; cat /tmp/tcp.txt | curl --data-binary @- $pushgateway_url rm -rf /tmp/tcp.txt [root@node-exporter42 ~]# 2.调用脚本 [root@node-exporter42 ~]# bash /usr/local/bin/tcp_status.sh [root@node-exporter42 ~]# 3.Prometheus查询数据 tcp_connections
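该脚本只推送一次数据,生产上一般配合定时任务周期性执行,下面是一个 crontab 示例(每分钟执行一次,频率按需调整):

bash
# crontab -e
* * * * * /bin/bash /usr/local/bin/tcp_status.sh >/dev/null 2>&1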

image

bash
变量:state ---> Custom SYN-SENT,SYN-RECV,FIN-WAIT-1,FIN-WAIT-2,TIME-WAIT,CLOSE,CLOSE-WAIT,LAST-ACK,LISTEN,CLOSING,ESTAB,UNKNOWN 配置变量:


bash
4.Grafana出图展示 参考PromQL: tcp_connections{state="${state}"}

image

image

image

image

3.SRE运维开发实现自定义的exporter
bash
1.使用python程序自定义exporter案例 1.1 安装pip3工具包 [root@prometheus-server33 ~]# apt update [root@prometheus-server33 ~]# apt install -y python3-pip 1.2 安装实际环境中相关模块库 [root@prometheus-server33 ~]# pip3 install flask prometheus_client -i https://mirrors.aliyun.com/pypi/simple 1.3 编写代码 [root@prometheus-server33 ~]# cat > flask_metric.py <<'EOF' #!/usr/bin/python3 # auther: Jason Yin # blog: https://www.cnblogs.com/yinzhengjie/ from prometheus_client import start_http_server,Counter, Summary from flask import Flask, jsonify from wsgiref.simple_server import make_server import time app = Flask(__name__) # Create a metric to track time spent and requests made REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request') COUNTER_TIME = Counter("request_count", "Total request count of the host") @app.route("/apps") @REQUEST_TIME.time() def requests_count(): COUNTER_TIME.inc() return jsonify({"office": "https://www.weixiang.com"},{"auther":"Jason Yin"}) if __name__ == "__main__": print("启动老男孩教育程序: weixiang-linux-python-exporter, 访问路径: http://0.0.0.0:8001/apps,监控服务: http://0.0.0.0:8000") start_http_server(8000) httpd = make_server( '0.0.0.0', 8001, app ) httpd.serve_forever() EOF 1.4 启动python程序 [root@node-exporter42 ~]# python3 flask_metric.py 启动老男孩教育程序: weixiang-linux-python-exporter, 访问路径: http://0.0.0.0:8001/apps,监控服务: http://0.0.0.0:8000 1.5 客户端测试 [root@node-exporter43 ~]# cat > weixiang_curl_metrics.sh <<'EOF' #!/bin/bash URL=http://10.1.24.13:8001/apps while true;do curl_num=$(( $RANDOM%50+1 )) sleep_num=$(( $RANDOM%5+1 )) for c_num in `seq $curl_num`;do curl -s $URL &> /dev/null done sleep $sleep_num done EOF [root@node-exporter43 ~]# bash weixiang_curl_metrics.sh # 可以看到数据
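压测脚本运行后,可以直接访问 8000 端口的 /metrics 验证计数器是否在增长(IP 以上文环境为准):

bash
[root@node-exporter43 ~]# curl -s http://10.1.24.13:8000/metrics | grep -E "request_count|request_processing_seconds"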

image

bash
2.prometheus监控python自定义的exporter实战 2.1 编辑配置文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... - job_name: "yinzhengjie_python_custom_metrics" static_configs: - targets: - 10.1.24.13:8000 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.2 检查配置文件语法 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: prometheus.yml is valid prometheus config file syntax [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.3 重新加载配置文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST http://10.0.0.31:9090/-/reload [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.4 验证prometheus是否采集到数据 http://10.0.0.31:9090/targets

image

bash
2.5 grafana作图展示 request_count_total 老男孩教育apps请求总数。 increase(request_count_total{job="yinzhengjie_python_custom_metrics"}[1m]) 老男孩教育每分钟请求数量曲线QPS。 irate(request_count_total{job="yinzhengjie_python_custom_metrics"}[1m]) 老男孩教育每分钟请求量变化率曲线 request_processing_seconds_sum{job="yinzhengjie_python_custom_metrics"} / request_processing_seconds_count{job="yinzhengjie_python_custom_metrics"} 老男孩教育每分钟请求处理平均耗时

image

image

18、Alertmanager

1、Alertmanager环境部署及子路由配置
bash
- Alertmanager环境部署及子路由配置 1.什么是altermanager Alertmanager是一款开源的告警工具包,可以和Prometheus集成。 2.下载Alertmanager wget https://github.com/prometheus/alertmanager/releases/download/v0.28.1/alertmanager-0.28.1.linux-amd64.tar.gz SVIP: [root@node-exporter43 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/Alertmanager/alertmanager-0.28.1.linux-amd64.tar.gz 3.解压安装包 [root@node-exporter43 ~]# tar xf alertmanager-0.28.1.linux-amd64.tar.gz -C /usr/local/ [root@node-exporter43 ~]# 4.修改Alertmanager的配置文件 [root@node-exporter43 ~]# cd /usr/local/alertmanager-0.28.1.linux-amd64/ [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ll total 67932 drwxr-xr-x 2 1001 1002 4096 Mar 7 23:08 ./ drwxr-xr-x 12 root root 4096 Aug 7 11:17 ../ -rwxr-xr-x 1 1001 1002 38948743 Mar 7 23:06 alertmanager* -rw-r--r-- 1 1001 1002 356 Mar 7 23:07 alertmanager.yml -rwxr-xr-x 1 1001 1002 30582387 Mar 7 23:06 amtool* -rw-r--r-- 1 1001 1002 11357 Mar 7 23:07 LICENSE -rw-r--r-- 1 1001 1002 311 Mar 7 23:07 NOTICE [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml # 通用配置 global: resolve_timeout: 5m smtp_from: '1360821977@qq.com' smtp_smarthost: 'smtp.qq.com:465' smtp_auth_username: '1360821977@qq.com' smtp_auth_password: 'vsqtbsnvfgnobabd' smtp_require_tls: false smtp_hello: 'qq.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' send_resolved: true - to: '1304871040@qq.com' send_resolved: true - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true - to: '2011014877@qq.com' send_resolved: true - name: 'sre_system' email_configs: - to: '3220434114@qq.com' send_resolved: true - to: '2825483220@qq.com' send_resolved: true [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# 5.检查配置文件语法 [root@prometheus-server33 alertmanager-0.28.1.linux-amd64]# ./amtool check-config alertmanager.yml Checking 'alertmanager.yml' SUCCESS Found: - global config - route - 0 inhibit rules - 3 receivers - 0 templates [root@prometheus-server33 alertmanager-0.28.1.linux-amd64]# 6.启动Alertmanager服务 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./alertmanager 7.访问Alertmanager的WebUI http://134.175.108.235:9093/#/status

image

bash
- Prometheus server集成Alertmanager实现告警功能 1. 修改配置文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# egrep -v "^#|^$" prometheus.yml global: scrape_interval: 3s evaluation_interval: 3s ... alerting: alertmanagers: - static_configs: - targets: - 10.0.0.43:9093 rule_files: - "weixiang-linux-rules.yml" ... scrape_configs: ... - job_name: "yinzhengjie_dba_exporter" static_configs: - targets: ["10.0.0.41:9100"] - job_name: "yinzhengjie_k8s_exporter" static_configs: - targets: ["10.0.0.42:9100"] - job_name: "yinzhengjie_bigdata_exporter" static_configs: - targets: ["10.0.0.43:9100"] [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2 修改告警规则 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# cat > weixiang-linux-rules.yml <<'EOF' groups: - name: weixiang-linux-rules-alert rules: - alert: weixiang-dba_exporter-alert expr: up{job="yinzhengjie_dba_exporter"} == 0 for: 1s labels: school: weixiang class: weixiang98 apps: dba annotations: summary: "{{ $labels.instance }} 数据库实例已停止运行超过 3s!" - alert: weixiang-k8s_exporter-alert expr: up{job="yinzhengjie_k8s_exporter"} == 0 for: 1s labels: school: weixiang class: weixiang98 apps: k8s annotations: summary: "{{ $labels.instance }} K8S服务器已停止运行超过 3s!" - alert: weixiang-bigdata_exporter-alert expr: up{job="yinzhengjie_bigdata_exporter"} == 0 for: 1s labels: school: weixiang class: weixiang98 apps: bigdata annotations: summary: "{{ $labels.instance }} 大数据服务器已停止运行超过 5s!" EOF 3.检查配置文件语法 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml Checking prometheus.yml SUCCESS: 1 rule files found SUCCESS: prometheus.yml is valid prometheus config file syntax Checking weixiang-linux-rules.yml SUCCESS: 3 rules found [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 4.重新加载prometheus的配置 curl -X POST http://10.0.0.31:9090/-/reload 5.查看prometheus server的WebUI验证是否生效 http://106.55.44.37:9090/config

image

bash
http://106.55.44.37:9090/targets?search=

image

bash
http://106.55.44.37:9090/alerts?search=

image

bash
6.触发告警功能 [root@node-exporter41 ~]# systemctl stop node-exporter.service [root@node-exporter41 ~]# ss -ntl | grep 9100 [root@node-exporter41 ~]# [root@node-exporter42 ~]# systemctl stop node-exporter.service [root@node-exporter42 ~]# [root@node-exporter42 ~]# ss -ntl | grep 9100 [root@node-exporter42 ~]# [root@node-exporter43 ~]# systemctl stop node-exporter.service [root@node-exporter43 ~]# [root@node-exporter43 ~]# ss -ntl | grep 9100 [root@node-exporter43 ~]# 7.查看alermanager的WebUI及邮箱接受者 http://106.55.44.37:9090/alerts?search= # 已经变红了,邮箱能收到告警邮件
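停止 node-exporter 触发告警后,除了查看 WebUI,也可以在 Alertmanager 解压目录下用 amtool 从命令行确认当前收到的告警(URL 以上文 43 节点为例):

bash
[root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./amtool alert query --alertmanager.url=http://10.0.0.43:9093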

image

image

image

bash
# 恢复业务 [root@node-exporter41 ~]# systemctl start node-exporter.service [root@node-exporter42 ~]# ss -ntl | grep 9100 LISTEN 0 4096 *:9100 *:* [root@node-exporter42 ~]# systemctl start node-exporter.service [root@node-exporter42 ~]# ss -ntl | grep 9100 LISTEN 0 4096 *:9100 *:*

image

image

2、alertmanager自定义告警模板
bash
1 告警模板介绍 默认的告警信息界面有些简单,可以借助告警的模板信息,对告警信息进行丰富,需要借助于Alertmanager的模板功能来实现。 告警模板的使用流程如下: - 分析关键信息 - 定制模板内容 - Alertmanager加载模板文件 - 告警信息使用模板内容属性 模板文件使用标准Go模板语法,并暴露一些包含时间标签和值的变量。 - 标签引用: {{ $label.<label_name> }} - 指标样本值引用: {{ $value }} 为了显式效果,需要了解一些html相关技术,参考链接: https://www.w3school.com.cn/html/index.asp 2 altertmanger节点自定义告警模板参考案例 2.1 创建邮件模板文件工作目录 [root@prometheus-server43 alertmanager-0.28.1.linux-amd64]# mkdir -pv /weixiang/softwares/alertmanager/tmpl 2.2 创建模板实例,工作中可以考虑嵌入公司的logo [root@prometheus-server43 alertmanager-0.28.1.linux-amd64]# cat > /weixiang/softwares/alertmanager/tmpl/email.tmpl <<'EOF' {{ define "weixiang.html" }} <h1>老男孩IT教育欢迎您: https://www.weixiang.com/</h1> <table border="1"> <tr> <th>报警项</th> <th>实例</th> <th>报警阀值</th> <th>开始时间</th> </tr> {{ range $i, $alert := .Alerts }} <tr> <td>{{ index $alert.Labels "alertname" }}</td> <td>{{ index $alert.Labels "instance" }}</td> <td>{{ index $alert.Annotations "value" }}</td> <td>{{ $alert.StartsAt }}</td> </tr> {{ end }} </table> <img src="https://www.weixiang.com/static/images/header/logo.png"> {{ end }} EOF 2.3 alertmanager引用自定义模板文件 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml # 通用配置 global: resolve_timeout: 5m smtp_from: '13949913771@163.com' smtp_smarthost: 'smtp.163.com:465' smtp_auth_username: '13949913771@163.com' smtp_auth_password: 'UGTMVNtb2Xup2St4' smtp_require_tls: false smtp_hello: '163.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang.html" . }}' send_resolved: true - to: '1304871040@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang.html" . }}' - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang.html" . }}' - to: '2011014877@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang.html" . }}' - name: 'sre_system' email_configs: - to: '3220434114@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang.html" . }}' - to: '2825483220@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang.html" . 
}}' # 加载模板 templates: - '/weixiang/softwares/alertmanager/tmpl/*.tmpl' 2.4 alertmanager语法检查 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./amtool check-config ./alertmanager.yml Checking './alertmanager.yml' SUCCESS Found: - global config - route - 0 inhibit rules - 3 receivers - 1 templates SUCCESS [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# 2.5 重启Alertmanager程序 [root@prometheus-server33 alertmanager-0.28.1.linux-amd64]# ./alertmanager [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./alertmanager time=2025-08-07T06:50:21.818Z level=INFO source=main.go:191 msg="Starting Alertmanager" version="(version=0.28.1, branch=HEAD, revision=b2099eaa2c9ebc25edb26517cb9c732738e93910)" time=2025-08-07T06:50:21.818Z level=INFO source=main.go:192 msg="Build context" build_context="(go=go1.23.7, platform=linux/amd64, user=root@fa3ca569dfe4, date=20250307-15:05:18, tags=netgo)" time=2025-08-07T06:50:21.821Z level=INFO source=cluster.go:185 msg="setting advertise address explicitly" component=cluster addr=10.1.12.4 port=9094 time=2025-08-07T06:50:21.825Z level=INFO source=cluster.go:674 msg="Waiting for gossip to settle..." component=cluster interval=2s time=2025-08-07T06:50:21.854Z level=INFO source=coordinator.go:112 msg="Loading configuration file" component=configuration file=alertmanager.yml time=2025-08-07T06:50:21.855Z level=INFO source=coordinator.go:125 msg=Completed loading of configuration f 2.6 查看WebUi观察配置是否生效 http://134.175.108.235:9093/#/status 2.7 再次出发告警配置 停下服务再启动 2.8 如果value取不到值,可以考虑修改告警规则添加value字段即可(并重启服务) [root@promethues-server31 prometheus-2.53.4.linux-amd64]# cat weixiang-linux-rules.yml groups: - name: weixiang-linux-rules-alert rules: - alert: weixiang-dba_exporter-alert expr: up{job="yinzhengjie_dba_exporter"} == 0 for: 1s labels: school: weixiang class: weixiang98 apps: dba annotations: summary: "{{ $labels.instance }} 数据库实例已停止运行超过 3s!" # 这里注解部分增加了一个value的属性信息,会从Prometheus的默认信息中获取阈值 value: "{{ $value }}" - alert: weixiang-k8s_exporter-alert expr: up{job="yinzhengjie_k8s_exporter"} == 0 for: 1s labels: school: weixiang class: weixiang98 apps: k8s annotations: summary: "{{ $labels.instance }} K8S服务器已停止运行超过 3s!" value: "{{ $value }}" - alert: weixiang-bigdata_exporter-alert expr: up{job="yinzhengjie_bigdata_exporter"} == 0 for: 1s labels: school: weixiang class: weixiang98 apps: bigdata annotations: summary: "{{ $labels.instance }} 大数据服务器已停止运行超过 5s!" value: "{{ $value }}" [root@promethues-server31 prometheus-2.53.4.linux-amd64]# [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST http://10.1.24.13:9090/-/reload

image

image

bash
# 恢复 [root@node-exporter41 ~]# systemctl start node-exporter.service

image

image

image

3、自定义告警模板案例2
bash
- 自定义告警模板案例2 1.定义模板 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat /weixiang/softwares/alertmanager/tmpl/xixi.tmp {{ define "xixi" }} <!DOCTYPE html> <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <style type="text/css"> body { font-family: 'Helvetica Neue', Arial, sans-serif; line-height: 1.6; color: #333; max-width: 700px; margin: 0 auto; padding: 20px; background-color: #f9f9f9; } .alert-card { border-radius: 8px; padding: 20px; margin-bottom: 20px; box-shadow: 0 2px 10px rgba(0,0,0,0.1); } .alert-critical { background: linear-gradient(135deg, #FFF6F6 0%, #FFEBEB 100%); border-left: 5px solid #FF5252; } .alert-resolved { background: linear-gradient(135deg, #F6FFF6 0%, #EBFFEB 100%); border-left: 5px solid #4CAF50; } .alert-title { font-size: 18px; font-weight: bold; margin-bottom: 15px; display: flex; align-items: center; } .alert-icon { width: 24px; height: 24px; margin-right: 10px; } .alert-field { margin-bottom: 8px; display: flex; } .field-label { font-weight: bold; min-width: 80px; color: #555; } .field-value { flex: 1; } .timestamp { color: #666; font-size: 13px; margin-top: 15px; text-align: right; } .divider { height: 1px; background: #eee; margin: 15px 0; } </style> </head> <body> {{- if gt (len .Alerts.Firing) 0 -}} <div class="alert-header alert-critical"> 告警触发 - 请立即处理! </div> <div> <img src="https://img95.699pic.com/element/40114/9548.png_860.png" width="200px" height="200px"> </div> {{- range $index, $alert := .Alerts -}} <div class="alert-card alert-critical"> <div class="alert-field"> <span class="field-label">告警名称:</span> <span class="field-value">{{ .Labels.alertname }}</span> </div> <div class="alert-field"> <span class="field-label">告警级别:</span> <span class="field-value">{{ .Labels.severity }}</span> </div> <div class="alert-field"> <span class="field-label">目标机器:</span> <span class="field-value">{{ .Labels.instance }}</span> </div> <div class="alert-field"> <span class="field-label">告警摘要:</span> <span class="field-value">{{ .Annotations.summary }}</span> </div> <div class="alert-field"> <span class="field-label">触发时间:</span> <span class="field-value">{{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}</span> </div> {{- if .Annotations.description }} <div class="divider"></div> <div class="alert-field"> <span class="field-label">详细描述:</span> <span class="field-value">{{ .Annotations.description }}</span> </div> {{- end }} </div> {{- end }} {{- end }} {{- if gt (len .Alerts.Resolved) 0 -}} {{- range $index, $alert := .Alerts -}} <div class="alert-card alert-resolved"> <div class="alert-title"> 告警恢复通知 </div> <div> <img src="https://tse2-mm.cn.bing.net/th/id/OIP-C.n7AyZv_wWXqFCc1mtlGhFgHaHa?rs=1&pid=ImgDetMain" width="300" height="300"> </div> <div class="alert-field"> <span class="field-label">告警名称:</span> <span class="field-value">{{ .Labels.alertname }}</span> </div> <div class="alert-field"> <span class="field-label">目标机器:</span> <span class="field-value">{{ .Labels.instance }}</span> </div> <div class="alert-field"> <span class="field-label">告警摘要:</span> <span class="field-value">[ {{ .Annotations.summary }}] 此告警已经恢复~</span> </div> <div class="alert-field"> <span class="field-label">恢复时间:</span> <span class="field-value">{{ (.EndsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}</span> </div> {{- if .Annotations.description }} <div class="alert-field"> <span class="field-label">详细描述:</span> <span class="field-value">{{ .Annotations.description }}</span> 
</div> {{- end }} </div> {{- end }} {{- end }} </body> </html> {{ end }} [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# 2.引用模板 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml # 通用配置 global: resolve_timeout: 5m smtp_from: '13949913771@163.com' smtp_smarthost: 'smtp.163.com:465' smtp_auth_username: '13949913771@163.com' smtp_auth_password: 'UGTMVNtb2Xup2St4' smtp_require_tls: false smtp_hello: '163.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' html: '{{ template "xixi" . }}' send_resolved: true - to: '1304871040@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' html: '{{ template "xixi" . }}' - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' html: '{{ template "xixi" . }}' - to: '2011014877@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' html: '{{ template "xixi" . }}' - name: 'sre_system' email_configs: - to: '3220434114@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' html: '{{ template "xixi" . }}' - to: '2825483220@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' html: '{{ template "xixi" . }}' # 加载模板 templates: - '/weixiang/softwares/alertmanager/tmpl/*.tmpl' [root@node-exporter43 alertmanager-0.28.1.linux-amd64]#
4、自定义告警模板案例3
bash
- 自定义告警模板案例3 1.定义模板 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat /weixiang/softwares/alertmanager/tmpl/weixiang.tmpl {{ define "weixiang" }} <!DOCTYPE html> <html> <head> <title>{{ if eq .Status "firing" }}&#128680; 告警触发{{ else }}&#9989; 告警恢复{{ end }}</title> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <style> @font-face { font-family: "EmojiFont"; src: local("Apple Color Emoji"), local("Segoe UI Emoji"), local("Noto Color Emoji"); } :root { --color-critical: #ff4444; --color-warning: #ffbb33; --color-resolved: #00c851; --color-info: #33b5e5; } body { font-family: 'Segoe UI', system-ui, sans-serif, "EmojiFont"; line-height: 1.6; color: #333; max-width: 800px; margin: 20px auto; padding: 0 20px; } .header { text-align: center; padding: 30px; border-radius: 15px; margin-bottom: 30px; background: {{ if eq .Status "firing" }}#fff0f0{{ else }}#f0fff4{{ end }}; border: 2px solid {{ if eq .Status "firing" }}var(--color-critical){{ else }}var(--color-resolved){{ end }}; } .status-badge { padding: 8px 16px; border-radius: 20px; font-weight: bold; display: inline-block; } .alert-table { width: 100%; border-collapse: separate; border-spacing: 0; background: white; border-radius: 10px; overflow: hidden; box-shadow: 0 2px 6px rgba(0,0,0,0.1); margin: 20px 0; } .alert-table th { background: #f8f9fa; padding: 16px; text-align: left; width: 130px; border-right: 2px solid #e9ecef; } .alert-table td { padding: 16px; border-bottom: 1px solid #e9ecef; } .timeline { display: flex; justify-content: space-between; margin: 15px 0; } .timeline-item { flex: 1; text-align: center; padding: 10px; background: #f8f9fa; border-radius: 8px; margin: 0 5px; } .alert-image { text-align: center; margin: 30px 0; } .alert-image img { width: {{ if eq .Status "firing" }}140px{{ else }}100px{{ end }}; opacity: 0.9; transition: all 0.3s ease; } .emoji { font-family: "EmojiFont", sans-serif; font-size: 1.3em; } .severity-critical { color: var(--color-critical); } .severity-warning { color: var(--color-warning); } </style> </head> <body> <div class="header"> <h1> {{ if eq .Status "firing" }} <span class="emoji">&#128680;</span> 告警触发通知 {{ else }} <span class="emoji">&#9989;</span> 告警恢复通知 {{ end }} </h1> </div> {{ if eq .Status "firing" }} <!-- 告警触发内容 --> <table class="alert-table"> <tr> <th><span class="emoji">&#128683;</span> 告警名称</th> <td>{{ range .Alerts }}<span class="emoji">&#128227;</span> {{ .Labels.alertname }}{{ end }}</td> </tr> <tr> <th><span class="emoji">&#9888;&#65039;</span> 严重等级</th> <td class="severity-{{ range .Alerts }}{{ .Labels.severity }}{{ end }}"> {{ range .Alerts }}<span class="emoji">&#9210;</span> {{ .Labels.severity | toUpper }}{{ end }} </td> </tr> <tr> <th><span class="emoji">&#128346;</span> 触发时间</th> <td>{{ range .Alerts }}<span class="emoji">&#128337;</span> {{ .StartsAt.Format "2006-01-02 15:04:05" }}{{ end }}</td> </tr> </table> {{ else }} <!-- 告警恢复内容 --> <table class="alert-table"> <tr> <th><span class="emoji">&#128227;</span> 恢复告警</th> <td>{{ range .Alerts }}<span class="emoji">&#128272;</span> {{ .Labels.alertname }}{{ end }}</td> </tr> <tr> <th><span class="emoji">&#9203;</span> 持续时间</th> <td> {{ range .Alerts }} {{ .StartsAt.Format "15:04:05" }} - {{ .EndsAt.Format "15:04:05" }} ({{ .EndsAt.Sub .StartsAt | printf "%.0f" }} 分钟) {{ end }} </td> </tr> <tr> <th><span class="emoji">&#9989;</span> 恢复时间</th> <td>{{ range .Alerts }}<span class="emoji">&#128338;</span> {{ .EndsAt.Format "2006-01-02 15:04:05" }}{{ end }}</td> </tr> </table> {{ end }} <!-- 公共信息部分 --> 
<table class="alert-table"> <tr> <th><span class="emoji">&#128187;&#65039;</span> 实例信息</th> <td>{{ range .Alerts }}<span class="emoji">&#127991;</span> {{ .Labels.instance }}{{ end }}</td> </tr> <tr> <th><span class="emoji">&#128221;</span> 告警详情</th> <td>{{ range .Alerts }}<span class="emoji">&#128204;</span> {{ .Annotations.summary }}{{ end }}</td> </tr> <tr> <th><span class="emoji">&#128196;</span> 详细描述</th> <td>{{ range .Alerts }}<span class="emoji">&#128209;</span> {{ .Annotations.description }}{{ end }}</td> </tr> </table> <div class="alert-image"> {{ if eq .Status "firing" }} <img src="https://img95.699pic.com/element/40114/9548.png_860.png" alt="告警图标"> {{ else }} <img src="https://tse2-mm.cn.bing.net/th/id/OIP-C.n7AyZv_wWXqFCc1mtlGhFgHaHa?rs=1&pid=ImgDetMain" alt="恢复图标"> {{ end }} </div> <div class="timeline"> <div class="timeline-item"> <div class="emoji">&#128678; 当前状态</div> {{ range .Alerts }} <strong>{{ if eq .Status "firing" }}<span class="emoji">&#128293;</span> FIRING{{ else }}<span class="emoji">&#9989;</span> RESOLVED{{ end }}</strong> {{ end }} </div> <div class="timeline-item"> <div class="emoji">&#128204; 触发次数</div> <strong>{{ len .Alerts }} 次</strong> </div> </div> </body> </html> {{ end }} [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# 2.引用模板 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml # 通用配置 global: resolve_timeout: 5m smtp_from: '13949913771@163.com' smtp_smarthost: 'smtp.163.com:465' smtp_auth_username: '13949913771@163.com' smtp_auth_password: 'UGTMVNtb2Xup2St4' smtp_require_tls: false smtp_hello: '163.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' send_resolved: true - to: '1304871040@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - to: '2011014877@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - name: 'sre_system' email_configs: - to: '3220434114@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - to: '2825483220@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' # 加载模板 templates: - '/weixiang/softwares/alertmanager/tmpl/*.tmpl' [root@node-exporter43 alertmanager-0.28.1.linux-amd64]#
5、Alertmanager集成钉钉插件实现告警
bash
参考链接: https://github.com/timonwong/prometheus-webhook-dingtalk/ 0.注册钉钉账号并添加钉钉机器人 略,见视频。 1.部署钉钉插件 1.1 下载钉钉插件 wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v2.1.0/prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz svip: [root@node-exporter42 ~]# wget http://192.168.21.253/Resources/Prometheus/softwares/Alertmanager/prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz 1.2 解压文件 [root@node-exporter42 ~]# tar xf prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz -C /usr/local/ [root@node-exporter42 ~]# [root@node-exporter42 ~]# ll /usr/local/prometheus-webhook-dingtalk-2.1.0.linux-amd64/ total 18752 drwxr-xr-x 3 3434 3434 4096 Apr 21 2022 ./ drwxr-xr-x 11 root root 4096 Aug 7 15:54 ../ -rw-r--r-- 1 3434 3434 1299 Apr 21 2022 config.example.yml drwxr-xr-x 4 3434 3434 4096 Apr 21 2022 contrib/ -rw-r--r-- 1 3434 3434 11358 Apr 21 2022 LICENSE -rwxr-xr-x 1 3434 3434 19172733 Apr 21 2022 prometheus-webhook-dingtalk* [root@node-exporter42 ~]# 1.3 修改配置文件 [root@node-exporter42 ~]# cd /usr/local/prometheus-webhook-dingtalk-2.1.0.linux-amd64/ [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# cp config{.example,}.yml [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# ll total 18756 drwxr-xr-x 3 3434 3434 4096 Aug 7 15:54 ./ drwxr-xr-x 11 root root 4096 Aug 7 15:54 ../ -rw-r--r-- 1 3434 3434 1299 Apr 21 2022 config.example.yml -rw-r--r-- 1 root root 1299 Aug 7 15:54 config.yml drwxr-xr-x 4 3434 3434 4096 Apr 21 2022 contrib/ -rw-r--r-- 1 3434 3434 11358 Apr 21 2022 LICENSE -rwxr-xr-x 1 3434 3434 19172733 Apr 21 2022 prometheus-webhook-dingtalk* [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# cat config.yml # 也可以直接即可 targets: linux97: # 对应的是dingding的webhook url: https://oapi.dingtalk.com/robot/send?access_token=08462ff18a9c5e739b98a5d7a716408b4ccd8255d19a3b26ae6b8dcb90c73384 # 对应的是"加签"的值,复制过来即可 secret: "SECf5414b69dd0f8a3a72b0bb929cf9271ef061aaea4c60e270cb15deb127339e4b" [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]#

image

bash
1.4 启动钉钉插件 [root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# ./prometheus-webhook-dingtalk --web.listen-address="10.1.12.3:8060" ... ts=2025-08-07T07:58:59.946Z caller=main.go:113 component=configuration msg="Webhook urls for prometheus alertmanager" urls=http://10.0.0.42:8060/dingtalk/weixiang98/send 2.Alertmanager集成钉钉插件 2.1 修改Alertmanager的配置文件 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml # 通用配置 global: resolve_timeout: 5m smtp_from: '13949913771@163.com' smtp_smarthost: 'smtp.163.com:465' smtp_auth_username: '13949913771@163.com' smtp_auth_password: 'UGTMVNtb2Xup2St4' smtp_require_tls: false smtp_hello: '163.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' send_resolved: true - to: '1304871040@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - to: '2011014877@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - name: 'sre_system' webhook_configs: # 指向的是Prometheus的插件地址 - url: 'http://10.0.0.42:8060/dingtalk/weixiang98/send' http_config: {} max_alerts: 0 send_resolved: true #email_configs: #- to: '3220434114@qq.com' # send_resolved: true # headers: { Subject: "[WARN] weixiang98报警邮件" } # # html: '{{ template "weixiang.html" . }}' # # html: '{{ template "xixi" . }}' # html: '{{ template "weixiang" . }}' #- to: '2825483220@qq.com' # send_resolved: true # headers: { Subject: "[WARN] weixiang98报警邮件" } # # html: '{{ template "weixiang.html" . }}' # # html: '{{ template "xixi" . }}' # html: '{{ template "weixiang" . }}' # 加载模板 templates: - '/weixiang/softwares/alertmanager/tmpl/*.tmpl' [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./amtool check-config alertmanager.yml Checking 'alertmanager.yml' SUCCESS Found: - global config - route - 0 inhibit rules - 3 receivers - 1 templates SUCCESS [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# 2.2 启动Alertmanager [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./alertmanager 3.测试告警验证

image

6、Alertmanager的告警静默(Silence)
bash
- Alertmanager的告警静默(Silence) 1.告警静默(Silence) 一般用于系统维护,预期要做的操作,这意味着就没有必要告警。 比如系统升级,需要8h,在这8h过程中,就可以考虑先不用告警。 2.实战案例 # 停止三个业务 [root@node-exporter41 ~]# systemctl stop node-exporter.service # 查看界面

image

bash
根据key值进行静默,只要带有这个标签的告警都会被静默

image

image

image

image

bash
# 开启服务 [root@node-exporter43 ~]# systemctl start node-exporter.service # 钉钉收不到通知消息,静默测试成功
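除了在 WebUI 上点选,静默也可以在 Alertmanager 解压目录下用 amtool 从命令行创建、查询和提前过期,下面是一个示意(匹配标签、时长与说明均为示例值):

bash
# 创建静默:对 job=yinzhengjie_dba_exporter 的告警静默 8 小时
./amtool silence add job=yinzhengjie_dba_exporter \
    --duration=8h --comment="系统升级维护窗口" --author=sre \
    --alertmanager.url=http://10.0.0.43:9093

# 查看当前静默列表,并可按 ID 提前让静默过期
./amtool silence query --alertmanager.url=http://10.0.0.43:9093
./amtool silence expire <silence_id> --alertmanager.url=http://10.0.0.43:9093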

设置静默过期

image

bash
通知已收到

image

7、Alertmanager的告警抑制(inhibit)
bash
- Alertmanager的告警抑制(inhibit) 1.什么是告警抑制 说白了,就是抑制告警,和静默不同的是,抑制的应用场景一般用于抑制符合条件的告警。 举个例子: 一个数据中心有800台服务器,每台服务器有50个监控项,假设一个意味着有4w个监控告警。 如果数据中心断电,理论上来说就会有4w条告警发送到你的手机,你是处理不过来的,所以我们只需要将数据中心断电的告警发出来即可。 2.Prometheus Server编写规则 2.1 修改Prometheus server的配置文件 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml ... rule_files: # - "weixiang-linux-rules.yml" - "weixiang-linux-rules-inhibit.yml" ... 2.2 编写告警规则 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# cat weixiang-linux-rules-inhibit.yml groups: - name: weixiang-linux-rules-alert-inhibit rules: - alert: weixiang-dba_exporter-alert expr: up{job="yinzhengjie_dba_exporter"} == 0 for: 3s labels: apps: dba severity: critical dc: beijing # 下面根据这里的dc值告警,这个告警了,k8s就不会告警了 annotations: summary: "{{ $labels.instance }} 数据库实例已停止运行超过 3s!" # 这里注释部分增加了一个value的属性信息,会从Prometheus的默认信息中获取阈值 value: "{{ $value }}" - alert: weixiang-k8s_exporter-alert expr: up{job="yinzhengjie_k8s_exporter"} == 0 for: 3s labels: apps: k8s severity: warning dc: beijing # 下面根据这里的dc值告警 annotations: summary: "{{ $labels.instance }} K8S服务器已停止运行超过 3s!" value: "{{ $value }}" - alert: weixiang-bigdata_exporter-alert expr: up{job="yinzhengjie_bigdata_exporter"} == 0 for: 5s labels: apps: bigdata severity: warning dc: shenzhen annotations: summary: "{{ $labels.instance }} 大数据服务器已停止运行超过 5s!" value: "{{ $value }}" [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.3 热加载配置文件使得生效 [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST 106.55.44.37:9090/-/reload [root@prometheus-server31 prometheus-2.53.4.linux-amd64]# 2.4 验证prometheus的webUI配置是否剩下 http://106.55.44.37:9090/alerts?search= # 抑制规则已经生效

image

bash
3.Alertmanager配置告警抑制规则 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml ... ## 配置告警抑制规则 inhibit_rules: # 如果"dc"的值相同的前提条件下。 # 则当触发了"severity: critical"告警,就会抑制"severity: warning"的告警信息。 - source_match: severity: critical # 如果这个告警了 target_match: severity: warning # 这个就不会告了 equal: - dc # 根据这个值告警 [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# 4.启动Alertmanager [root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./alertmanager 5.检查Alertmanager的webuI http://134.175.108.235:9093/#/status 6.验证测试 # 三个服务都已停止

image

bash
k8s 的告警会被抑制。

image

bash
没有收到 k8s 的告警,因为它被抑制了

image

8、监控Linux系统的根目录使用大小
1.Prometheus 配置,全部在prometheus.yml文件中操作

配置抓取任务 (Scrape Configs): 确保 Prometheus 配置文件中包含了抓取所有 Node Exporter 的任务。

bash
# prometheus.yml
rule_files:
  - "disk_alert.rules.yml"     # 引用规则文件(顶级配置项,见下文第2步)

scrape_configs:
  ...
  - job_name: "weixiang-node-exporter"
    static_configs:
      - targets:               # 抓取这三台服务器
        - 10.1.12.15:9100
        - 10.1.12.3:9100
        - 10.1.12.4:9100
  ...

【排错点】 : 确保每个 target 没有在多个 job 中重复出现,否则会导致数据重复。

2.配置告警规则文件: 创建一个专门存放磁盘告警规则的文件,并在 Prometheus.yml 中引用它。

bash
# prometheus.yml
rule_files:
  - "disk_alert.rules.yml"   # 引用规则文件

3.配置 Alertmanager 地址: 告诉 Prometheus 将触发的告警发送到哪里。

bash
# prometheus.yml
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['<Alertmanager_IP>:9093']
2. 告警规则配置 (disk_alert.rules.yml)

这是定义“什么情况下算作告警”的核心。

bash
# disk_alert.rules.yml
groups:
- name: LinuxNodeRules
  rules:
  - alert: LinuxRootFilesystemUsageHigh
    # PromQL 表达式,计算根目录磁盘使用率
    expr: |
      (
        node_filesystem_size_bytes{mountpoint="/"}
        -
        node_filesystem_avail_bytes{mountpoint="/"}
      ) / node_filesystem_size_bytes{mountpoint="/"} * 100 > 90
    # 持续5分钟才触发,防止抖动
    for: 5m
    labels:
      severity: critical
    # 告警的详细信息,用于钉钉通知
    annotations:
      summary: "服务器根目录空间使用率过高"
      description: |
        - **告警级别**: {{ $labels.severity }}
        - **主机IP**: {{ $labels.instance }}
        - **挂载点**: {{ $labels.mountpoint }}
        - **当前使用率**: {{ $value | printf "%.2f" }}%
        - **详细信息**: 服务器 {{ $labels.instance }} 的根目录 (/) 空间使用率已超过90%,请立即处理!

字段说明:
- node_filesystem_size_bytes: 由 node_exporter 提供的文件系统总大小的指标。
- node_filesystem_avail_bytes: 由 node_exporter 提供的文件系统可用空间大小的指标。
- {mountpoint="/", fstype=~"ext4|xfs"}: 这是筛选条件,确保我们只监控根目录 (/),并且是常见的 ext4 或 xfs 文件系统,避免监控到虚拟文件系统。
- > 90: 告警阈值,超过 90% 就满足条件。
- for: 5m: 为了防止因为某个瞬间的读写导致使用率飙高而产生误报,设置了一个持续时间。只有当使用率持续 5 分钟都高于 90% 时,告警才会从 Pending 状态变为 Firing 状态,并发送通知。
- annotations: 这部分内容会作为变量传递给钉钉模板,最终显示在你的钉钉消息里。{{ $labels.instance }} 会被替换为目标机器的 IP 和端口,{{ $value }} 会被替换为表达式计算出的当前值(即磁盘使用率)。

【排错点】 : expr 表达式中的标签选择器(如 mountpoint)必须与 Node Exporter 实际暴露的指标标签完全匹配。如果告警不触发,首先应检查这个表达式。
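
If the alert never leaves the Inactive state, it is usually faster to check the rule file and the raw expression than to stare at the UI. A small sketch, again assuming the promtool binary shipped with the Prometheus tarball and the server address used above:

bash
# 1) Syntax-check the rule file
./promtool check rules disk_alert.rules.yml

# 2) Run the same expression as an instant query and compare the returned
#    mountpoint/instance labels with what the alert rule selects
curl -s -G 'http://106.55.44.37:9090/api/v1/query' \
  --data-urlencode 'query=(node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_avail_bytes{mountpoint="/"}) / node_filesystem_size_bytes{mountpoint="/"} * 100'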

3. Alertmanager 配置 (alertmanager.yml)
bash
[root@node-exporter43 alertmanager-0.28.1.linux-amd64]# cat alertmanager.yml # 通用配置 global: resolve_timeout: 5m smtp_from: '1360821977@qq.com' smtp_smarthost: 'smtp.qq.com:465' smtp_auth_username: '1360821977@qq.com' smtp_auth_password: 'pnmwamrclxfpijfi' # 注意:密码等敏感信息请妥善保管 smtp_require_tls: false smtp_hello: 'qq.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang" . }}' send_resolved: true - to: '1304871040@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang" . }}' - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang" . }}' - to: '2011014877@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } html: '{{ template "weixiang" . }}' - name: 'sre_system' webhook_configs: # 指向的是Prometheus的插件地址 - url: 'http://10.1.12.3:8060/dingtalk/weixiang98/send' http_config: {} max_alerts: 0 send_resolved: true # 加载模板 templates: - '/weixiang/softwares/alertmanager/tmpl/*.tmpl'
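
Routing trees like the one above are easy to get subtly wrong. amtool, which ships in the same tarball as alertmanager, can validate the file and show where an alert with a given label set would be delivered; a minimal sketch (the label values are only examples):

bash
# Validate alertmanager.yml, including the templates it references
./amtool check-config alertmanager.yml

# Show which receivers an alert with these labels would be routed to
./amtool config routes test --config.file=alertmanager.yml job=yinzhengjie_dba_exporter severity=critical
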
4. 重启或重载
bash
# 执行alertmanager
[root@node-exporter43 alertmanager-0.28.1.linux-amd64]# ./alertmanager

# 热加载
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST http://10.1.24.13:9090/-/reload

# 开始监听和接收来自 Alertmanager 的 HTTP 请求
[root@node-exporter42 prometheus-webhook-dingtalk-2.1.0.linux-amd64]# ./prometheus-webhook-dingtalk --web.listen-address="10.1.12.3:8060"
5. 验证
bash
# Log in to one of the monitored servers and check the current disk usage:
df -h /

# Then open the Prometheus alerts page:
http://106.55.44.37:9090/alerts?search=

image


bash
# Create a large file so that root filesystem usage exceeds 90%:
dd if=/dev/zero of=/test_large_file.img bs=1G count=...

# Watch the Prometheus UI (Alerts page). Expected flow: the alert first shows as
# Pending (yellow) and, after the "for" duration (5 minutes), turns Firing (red).
http://106.55.44.37:9090/alerts?search=

image

bash
Wait a little while (the Prometheus scrape interval plus the rule's for duration) and you should receive the alert message in the DingTalk group.

image

bash
# Grafana visualization:
# 1. Create a new Dashboard and Panel.
# 2. Choose Prometheus as the data source.
# 3. Configure the query with a PromQL expression similar to the alert rule:
(node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_avail_bytes{mountpoint="/"}) / node_filesystem_size_bytes{mountpoint="/"} * 100

image


bash
# Clean up the test file and watch the alert resolve
rm -f /test_large_file.img

image

image

19、prometheus-operator部署监控K8S集群

bash
- 昨日内容回顾: - pushgateway - 直播在线人数 - tcp状态 - 丢包率 - Alertmanager - 路由分组告警 - 自定义告警模板 - 告警静默 - 告警抑制 - 今日内容预告 - Prometheus监控K8S集群 - kubeadm的K8S证书续期问题 - prometheus-operator部署监控K8S集群 1.下载源代码 wget https://github.com/prometheus-operator/kube-prometheus/archive/refs/tags/v0.11.0.tar.gz svip: [root@master231 03-prometheus]# wget http://192.168.21.253/Resources/Kubernetes/Project/Prometheus/manifests/kube-prometheus-0.11.0.tar.gz 2.解压目录 [root@master231 03-prometheus]# tar xf kube-prometheus-0.11.0.tar.gz [root@master231 03-prometheus]# [root@master231 03-prometheus]# ll total 480 drwxr-xr-x 3 root root 4096 Aug 8 08:52 ./ drwxr-xr-x 5 root root 4096 Aug 8 08:52 ../ drwxrwxr-x 11 root root 4096 Jun 17 2022 kube-prometheus-0.11.0/ -rw-r--r-- 1 root root 475590 Jun 28 2024 kube-prometheus-0.11.0.tar.gz [root@master231 03-prometheus]# [root@master231 03-prometheus]# [root@master231 03-prometheus]# cd kube-prometheus-0.11.0/ [root@master231 kube-prometheus-0.11.0]# [root@master231 kube-prometheus-0.11.0]# ll total 220 drwxrwxr-x 11 root root 4096 Jun 17 2022 ./ drwxr-xr-x 3 root root 4096 Aug 8 08:52 ../ -rwxrwxr-x 1 root root 679 Jun 17 2022 build.sh* -rw-rw-r-- 1 root root 11421 Jun 17 2022 CHANGELOG.md -rw-rw-r-- 1 root root 2020 Jun 17 2022 code-of-conduct.md -rw-rw-r-- 1 root root 3782 Jun 17 2022 CONTRIBUTING.md drwxrwxr-x 5 root root 4096 Jun 17 2022 developer-workspace/ drwxrwxr-x 4 root root 4096 Jun 17 2022 docs/ -rw-rw-r-- 1 root root 2273 Jun 17 2022 example.jsonnet drwxrwxr-x 7 root root 4096 Jun 17 2022 examples/ drwxrwxr-x 3 root root 4096 Jun 17 2022 experimental/ drwxrwxr-x 4 root root 4096 Jun 17 2022 .github/ -rw-rw-r-- 1 root root 129 Jun 17 2022 .gitignore -rw-rw-r-- 1 root root 1474 Jun 17 2022 .gitpod.yml -rw-rw-r-- 1 root root 2302 Jun 17 2022 go.mod -rw-rw-r-- 1 root root 71172 Jun 17 2022 go.sum drwxrwxr-x 3 root root 4096 Jun 17 2022 jsonnet/ -rw-rw-r-- 1 root root 206 Jun 17 2022 jsonnetfile.json -rw-rw-r-- 1 root root 5130 Jun 17 2022 jsonnetfile.lock.json -rw-rw-r-- 1 root root 1644 Jun 17 2022 kubescape-exceptions.json -rw-rw-r-- 1 root root 4773 Jun 17 2022 kustomization.yaml -rw-rw-r-- 1 root root 11325 Jun 17 2022 LICENSE -rw-rw-r-- 1 root root 3379 Jun 17 2022 Makefile drwxrwxr-x 3 root root 4096 Jun 17 2022 manifests/ -rw-rw-r-- 1 root root 221 Jun 17 2022 .mdox.validate.yaml -rw-rw-r-- 1 root root 8253 Jun 17 2022 README.md -rw-rw-r-- 1 root root 4463 Jun 17 2022 RELEASE.md drwxrwxr-x 2 root root 4096 Jun 17 2022 scripts/ drwxrwxr-x 3 root root 4096 Jun 17 2022 tests/ [root@master231 kube-prometheus-0.11.0]# [root@master231 kube-prometheus-0.11.0]# 3.导入镜像【线下班学员操作,线上班忽略,到百度云盘找物料包即可!】 3.1 下载镜像 [root@master231 ~]# mkdir prometheus && cd prometheus [root@master231 ~]# wget http://192.168.21.253/Resources/Kubernetes/Project/Prometheus/batch-load-prometheus-v0.11.0-images.sh [root@master231 ~]# bash batch-load-prometheus-v0.11.0-images.sh 21 3.2 拷贝镜像到其他节点 [root@master231 ~]# scp -r prometheus/ 10.0.0.232:~ [root@master231 ~]# scp -r prometheus/ 10.0.0.233:~ 3.3 其他节点导入镜像 [root@worker232 ~]# cd prometheus/ && for i in `ls *.tar.gz` ; do docker load -i $i; done [root@worker233 ~]# cd prometheus/ && for i in `ls *.tar.gz` ; do docker load -i $i; done 4.安装Prometheus-Operator [root@master231 kube-prometheus-0.11.0]# kubectl apply --server-side -f manifests/setup [root@master231 kube-prometheus-0.11.0]# kubectl wait \ [root@master231 kube-prometheus-0.11.0]# --for condition=Established \ [root@master231 kube-prometheus-0.11.0]# --all CustomResourceDefinition \ [root@master231 
kube-prometheus-0.11.0]# --namespace=monitoring [root@master231 kube-prometheus-0.11.0]# kubectl apply -f manifests/ 5.检查Prometheus是否部署成功 [root@master231 kube-prometheus-0.11.0]# kubectl get pods -n monitoring -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES alertmanager-main-0 2/2 Running 0 35s 10.100.203.159 worker232 <none> <none> alertmanager-main-1 2/2 Running 0 35s 10.100.140.77 worker233 <none> <none> alertmanager-main-2 2/2 Running 0 35s 10.100.160.140 master231 <none> <none> blackbox-exporter-746c64fd88-66ph5 3/3 Running 0 42s 10.100.203.158 worker232 <none> <none> grafana-5fc7f9f55d-qnfwj 1/1 Running 0 41s 10.100.140.86 worker233 <none> <none> kube-state-metrics-6c8846558c-pp5hf 3/3 Running 0 41s 10.100.203.173 worker232 <none> <none> node-exporter-6z9kb 2/2 Running 0 40s 10.0.0.231 master231 <none> <none> node-exporter-gx5dr 2/2 Running 0 40s 10.0.0.233 worker233 <none> <none> node-exporter-rq8mn 2/2 Running 0 40s 10.0.0.232 worker232 <none> <none> prometheus-adapter-6455646bdc-4fqcq 1/1 Running 0 39s 10.100.203.162 worker232 <none> <none> prometheus-adapter-6455646bdc-n8flt 1/1 Running 0 39s 10.100.140.91 worker233 <none> <none> prometheus-k8s-0 2/2 Running 0 35s 10.100.203.189 worker232 <none> <none> prometheus-k8s-1 2/2 Running 0 35s 10.100.140.68 worker233 <none> <none> prometheus-operator-f59c8b954-gm5ww 2/2 Running 0 38s 10.100.203.152 worker232 <none> <none> [root@master231 kube-prometheus-0.11.0]# [root@master231 kube-prometheus-0.11.0]# 5.修改Grafana的svc [root@master231 kube-prometheus-0.11.0]# cat manifests/grafana-service.yaml apiVersion: v1 kind: Service metadata: ... name: grafana namespace: monitoring spec: type: LoadBalancer ... [root@master231 kube-prometheus-0.11.0]# [root@master231 kube-prometheus-0.11.0]# kubectl apply -f manifests/grafana-service.yaml service/grafana configured [root@master231 kube-prometheus-0.11.0]# [root@master231 kube-prometheus-0.11.0]# kubectl get svc -n monitoring grafana NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE grafana LoadBalancer 10.200.48.10 10.0.0.155 3000:49955/TCP 19m [root@master231 kube-prometheus-0.11.0]# 6.访问Grafana的WebUI http://10.0.0.155:3000/ 默认的用户名和密码: admin

image

image
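
If any of the pods listed above stay in Pending or CrashLoopBackOff, it is worth confirming that the CRDs from manifests/setup were established and reading the operator's own log before digging deeper. A minimal sketch using standard kubectl commands (the container name is the kube-prometheus default and may differ in other versions):

bash
# All monitoring.coreos.com CRDs should exist and be Established
kubectl get crd | grep monitoring.coreos.com

# The operator log shows reconcile errors for Prometheus/Alertmanager objects
kubectl -n monitoring logs deploy/prometheus-operator -c prometheus-operator --tail=50

# Watch the whole stack come up
kubectl -n monitoring get pods -w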

2、使用traefik暴露Prometheus的WebUI到K8S集群外部
bash
1.检查traefik组件是否部署 [root@master231 26-IngressRoute]# helm list NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION traefik-server default 2 2025-07-30 10:13:32.840421004 +0800 CST deployed traefik-36.3.0 v3.4.3 [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE ... traefik-server LoadBalancer 10.200.215.152 10.0.0.152 80:30883/TCP,443:24792/TCP 8d [root@master231 26-IngressRoute]# 2.编写资源清单 [root@master231 26-IngressRoute]# cat 19-ingressRoute-prometheus.yaml apiVersion: traefik.io/v1alpha1 kind: IngressRoute metadata: name: ingressroute-prometheus namespace: monitoring spec: entryPoints: - web routes: - match: Host(`prom.weixiang.com`) && PathPrefix(`/`) kind: Rule services: - name: prometheus-k8s port: 9090 [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl apply -f 19-ingressRoute-prometheus.yaml ingressroute.traefik.io/ingressroute-prometheus created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get ingressroute -n monitoring NAME AGE ingressroute-prometheus 7s [root@master231 26-IngressRoute]# 3.windows添加解析 10.0.0.152 prom.weixiang.com 4.浏览器访问测试 http://prom.weixiang.com/targets?search= 5.其他方案 NodePort port-forward LoadBalancer Ingress

image

image
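
If the IngressRoute above is not an option (for example there is no DNS/hosts entry on the client machine), the simplest fallback from the alternatives listed is kubectl port-forward; a minimal sketch:

bash
# Temporarily expose the prometheus-k8s service on all interfaces of the current node
kubectl -n monitoring port-forward svc/prometheus-k8s 9090:9090 --address=0.0.0.0
# Then browse to http://<node-IP>:9090/targets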

3、Prometheus监控云原生应用etcd案例
bash
1.测试ectd metrics接口 1.1 查看etcd证书存储路径 [root@master231 yinzhengjie]# egrep "\--key-file|--cert-file" /etc/kubernetes/manifests/etcd.yaml - --cert-file=/etc/kubernetes/pki/etcd/server.crt - --key-file=/etc/kubernetes/pki/etcd/server.key [root@master231 yinzhengjie]# 1.2 测试etcd证书访问的metrics接口 [root@master231 yinzhengjie]# curl -s --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key https://10.1.12.15:2379/metrics -k | tail # TYPE process_virtual_memory_max_bytes gauge process_virtual_memory_max_bytes 1.8446744073709552e+19 # HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served. # TYPE promhttp_metric_handler_requests_in_flight gauge promhttp_metric_handler_requests_in_flight 1 # HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code. # TYPE promhttp_metric_handler_requests_total counter promhttp_metric_handler_requests_total{code="200"} 4 promhttp_metric_handler_requests_total{code="500"} 0 promhttp_metric_handler_requests_total{code="503"} 0 [root@master231 yinzhengjie]# 2.创建etcd证书的secrets并挂载到Prometheus server 2.1 查找需要挂载etcd的证书文件路径 [root@master231 yinzhengjie]# egrep "\--key-file|--cert-file|--trusted-ca-file" /etc/kubernetes/manifests/etcd.yaml - --cert-file=/etc/kubernetes/pki/etcd/server.crt - --key-file=/etc/kubernetes/pki/etcd/server.key - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt [root@master231 yinzhengjie]# 2.2 根据etcd的实际存储路径创建secrets [root@master231 yinzhengjie]# kubectl create secret generic etcd-tls --from-file=/etc/kubernetes/pki/etcd/server.crt --from-file=/etc/kubernetes/pki/etcd/server.key --from-file=/etc/kubernetes/pki/etcd/ca.crt -n monitoring secret/etcd-tls created [root@master231 yinzhengjie]# [root@master231 yinzhengjie]# kubectl -n monitoring get secrets etcd-tls NAME TYPE DATA AGE etcd-tls Opaque 3 12s [root@master231 yinzhengjie]# 2.3 修改Prometheus的资源,修改后会自动重启 [root@master231 yinzhengjie]# kubectl -n monitoring edit prometheus k8s ... spec: secrets: - etcd-tls ... 
[root@master231 yinzhengjie]# kubectl -n monitoring get pods -l app.kubernetes.io/component=prometheus -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES prometheus-k8s-0 2/2 Running 0 74s 10.100.1.57 worker232 <none> <none> prometheus-k8s-1 2/2 Running 0 92s 10.100.2.28 worker233 <none> <none> [root@master231 yinzhengjie]# 2.4.查看证书是否挂载成功 [root@master231 yinzhengjie]# kubectl -n monitoring exec prometheus-k8s-0 -c prometheus -- ls -l /etc/prometheus/secrets/etcd-tls total 0 lrwxrwxrwx 1 root 2000 13 Jan 24 14:07 ca.crt -> ..data/ca.crt lrwxrwxrwx 1 root 2000 17 Jan 24 14:07 server.crt -> ..data/server.crt lrwxrwxrwx 1 root 2000 17 Jan 24 14:07 server.key -> ..data/server.key [root@master231 yinzhengjie]# [root@master231 yinzhengjie]# kubectl -n monitoring exec prometheus-k8s-1 -c prometheus -- ls -l /etc/prometheus/secrets/etcd-tls total 0 lrwxrwxrwx 1 root 2000 13 Jan 24 14:07 ca.crt -> ..data/ca.crt lrwxrwxrwx 1 root 2000 17 Jan 24 14:07 server.crt -> ..data/server.crt lrwxrwxrwx 1 root 2000 17 Jan 24 14:07 server.key -> ..data/server.key [root@master231 yinzhengjie]# 3.编写资源清单 [root@master231 28-servicemonitors]# cat 01-smon-etcd.yaml apiVersion: v1 kind: Endpoints metadata: name: etcd-k8s namespace: kube-system subsets: - addresses: - ip: 10.0.0.231 ports: - name: https-metrics port: 2379 protocol: TCP --- apiVersion: v1 kind: Service metadata: name: etcd-k8s namespace: kube-system labels: apps: etcd spec: ports: - name: https-metrics port: 2379 targetPort: 2379 type: ClusterIP --- apiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: name: weixiang-etcd-smon namespace: monitoring spec: # 指定job的标签,可以不设置。 jobLabel: kubeadm-etcd-k8s-yinzhengjie # 指定监控后端目标的策略 endpoints: # 监控数据抓取的时间间隔 - interval: 30s # 指定metrics端口,这个port对应Services.spec.ports.name port: https-metrics # Metrics接口路径 path: /metrics # Metrics接口的协议 scheme: https # 指定用于连接etcd的证书文件 tlsConfig: # 指定etcd的CA的证书文件 caFile: /etc/prometheus/secrets/etcd-tls/ca.crt # 指定etcd的证书文件 certFile: /etc/prometheus/secrets/etcd-tls/server.crt # 指定etcd的私钥文件 keyFile: /etc/prometheus/secrets/etcd-tls/server.key # 关闭证书校验,毕竟咱们是自建的证书,而非官方授权的证书文件。 insecureSkipVerify: true # 监控目标Service所在的命名空间 namespaceSelector: matchNames: - kube-system # 监控目标Service目标的标签。 selector: # 注意,这个标签要和etcd的service的标签保持一致哟 matchLabels: apps: etcd [root@master231 28-servicemonitors]# [root@master231 28-servicemonitors]# [root@master231 28-servicemonitors]# kubectl apply -f 01-smon-etcd.yaml endpoints/etcd-k8s created service/etcd-k8s created servicemonitor.monitoring.coreos.com/weixiang-etcd-smon created [root@master231 28-servicemonitors]# 4.Prometheus查看数据 etcd_cluster_version 5.Grafana导入模板 3070

image

image
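
Besides etcd_cluster_version, a quick sanity check is to confirm that etcd reports a leader and that the scrape target is healthy, through the Prometheus API exposed earlier via prom.weixiang.com (hostname assumed from the IngressRoute section):

bash
# 1 means this etcd member currently sees a leader
curl -s 'http://prom.weixiang.com/api/v1/query?query=etcd_server_has_leader'

# The ServiceMonitor's target should appear here with "health":"up"
curl -s 'http://prom.weixiang.com/api/v1/targets?state=active' | grep -i etcd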

4、Prometheus监控非云原生应用MySQL案例
bash
1.编写资源清单 [root@master231 28-servicemonitors]# cat 02-smon-mysqld.yaml apiVersion: apps/v1 kind: Deployment metadata: name: mysql80-deployment spec: replicas: 1 selector: matchLabels: apps: mysql80 template: metadata: labels: apps: mysql80 spec: containers: - name: mysql image: harbor250.weixiang.com/weixiang-db/mysql:8.0.36-oracle ports: - containerPort: 3306 env: - name: MYSQL_ROOT_PASSWORD value: yinzhengjie - name: MYSQL_USER value: weixiang98 - name: MYSQL_PASSWORD value: "weixiang" --- apiVersion: v1 kind: Service metadata: name: mysql80-service spec: selector: apps: mysql80 ports: - protocol: TCP port: 3306 targetPort: 3306 --- apiVersion: v1 kind: ConfigMap metadata: name: my.cnf data: .my.cnf: |- [client] user = weixiang98 password = weixiang [client.servers] user = weixiang98 password = weixiang --- apiVersion: apps/v1 kind: Deployment metadata: name: mysql-exporter-deployment spec: replicas: 1 selector: matchLabels: apps: mysql-exporter template: metadata: labels: apps: mysql-exporter spec: volumes: - name: data configMap: name: my.cnf items: - key: .my.cnf path: .my.cnf containers: - name: mysql-exporter image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/mysqld-exporter:v0.15.1 command: - mysqld_exporter - --config.my-cnf=/root/my.cnf - --mysqld.address=mysql80-service.default.svc.weixiang.com:3306 securityContext: runAsUser: 0 ports: - containerPort: 9104 #env: #- name: DATA_SOURCE_NAME # value: mysql_exporter:yinzhengjie@(mysql80-service.default.svc.yinzhengjie.com:3306) volumeMounts: - name: data mountPath: /root/my.cnf subPath: .my.cnf --- apiVersion: v1 kind: Service metadata: name: mysql-exporter-service labels: apps: mysqld spec: selector: apps: mysql-exporter ports: - protocol: TCP port: 9104 targetPort: 9104 name: mysql80 --- apiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: name: weixiang-mysql-smon spec: jobLabel: kubeadm-mysql-k8s-yinzhengjie endpoints: - interval: 3s # 这里的端口可以写svc的端口号,也可以写svc的名称。 # 但我推荐写svc端口名称,这样svc就算修改了端口号,只要不修改svc端口的名称,那么我们此处就不用再次修改哟。 # port: 9104 port: mysql80 path: /metrics scheme: http namespaceSelector: matchNames: - default selector: matchLabels: apps: mysqld [root@master231 28-servicemonitors]# [root@master231 28-servicemonitors]# kubectl apply -f 02-smon-mysqld.yaml deployment.apps/mysql80-deployment created service/mysql80-service created configmap/my.cnf created deployment.apps/mysql-exporter-deployment created service/mysql-exporter-service created servicemonitor.monitoring.coreos.com/weixiang-mysql-smon created [root@master231 28-servicemonitors]# [root@master231 28-servicemonitors]# kubectl get pods -o wide -l "apps in (mysql80,mysql-exporter)" NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES mysql-exporter-deployment-557cbcb6df-p68rz 1/1 Running 0 2m18s 10.100.1.98 worker232 <none> <none> mysql80-deployment-58944b97cb-52d45 1/1 Running 0 2m19s 10.100.2.5 worker233 <none> <none> [root@master231 28-servicemonitors]# 2.Prometheus访问测试 mysql_up 3.Grafana导入模板 7362

image

image
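
If mysql_up is missing or 0, it is usually quicker to query the exporter Service directly than to debug the ServiceMonitor first. A minimal sketch that forwards the exporter Service to the local machine (service name and port follow the manifest above):

bash
# Forward the exporter service locally, then scrape it once
kubectl port-forward svc/mysql-exporter-service 9104:9104 --address=127.0.0.1 &
PF_PID=$!
sleep 2
curl -s http://127.0.0.1:9104/metrics | grep '^mysql_up'
kill $PF_PID   # stop the temporary port-forward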

5、smon监控redis实战案例
bash
- 在k8s集群部署redis服务 - 使用smon资源监控Redis服务 - 使用Grafana出图展示 1.部署redis [root@master231 26-IngressRoute]# cat 07-deploy-redis.yaml apiVersion: apps/v1 kind: Deployment metadata: name: deploy-redis spec: replicas: 1 selector: matchLabels: apps: redis template: metadata: labels: apps: redis spec: containers: - image: harbor250.weixiang.com/weixiang-db/redis:6.0.5 name: db ports: - containerPort: 6379 --- apiVersion: v1 kind: Service metadata: name: svc-redis spec: ports: - port: 6379 selector: apps: redis [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl apply -f 07-deploy-redis.yaml deployment.apps/deploy-redis created service/svc-redis created [root@master231 26-IngressRoute]# [root@master231 26-IngressRoute]# kubectl get pods -o wide -l apps=redis NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES deploy-redis-5dd745fbb9-v5hrg 1/1 Running 0 23s 10.100.1.99 worker232 <none> <none> [root@master231 26-IngressRoute]# 2.推送redis-exporter镜像到harbor仓库 2.1 拉取镜像 [root@harbor250.weixiang.com harbor]# docker pull oliver006/redis_exporter:v1.74.0-alpine v1.74.0-alpine: Pulling from oliver006/redis_exporter f18232174bc9: Pull complete 4f3b0056e2b7: Pull complete 5b461a57e1b6: Pull complete Digest: sha256:7eb52534fc92c5bebbf3ad89abf18d205ade8905c116394c130989ef9491a06c Status: Downloaded newer image for oliver006/redis_exporter:v1.74.0-alpine docker.io/oliver006/redis_exporter:v1.74.0-alpine [root@harbor250.weixiang.com harbor]# SVIP: wget http://192.168.21.253/Resources/Prometheus/images/Redis-exporter/weixiang-redis_exporter-v1.74.0-alpine.tar.gz 2.2 将镜像推送到harbor仓库 [root@harbor250.weixiang.com ~]# docker tag oliver006/redis_exporter:v1.74.0-alpine harbor250.weixiang.com/weixiang-db/redis_exporter:v1.74.0-alpine [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# docker login -u admin -p 1 harbor250.weixiang.com WARNING! Using --password via the CLI is insecure. Use --password-stdin. WARNING! Your password will be stored unencrypted in /root/.docker/config.json. Configure a credential helper to remove this warning. 
See https://docs.docker.com/engine/reference/commandline/login/#credentials-store Login Succeeded [root@harbor250.weixiang.com ~]# [root@harbor250.weixiang.com ~]# docker push harbor250.weixiang.com/weixiang-db/redis_exporter:v1.74.0-alpine The push refers to repository [harbor250.weixiang.com/weixiang-db/redis_exporter] db0a8e4bb03a: Pushed 81c1f5332ad2: Pushed 08000c18d16d: Pushed v1.74.0-alpine: digest: sha256:c40820a283db66961099a2179566e2a70006a2d416017773015fd1c70be10949 size: 949 [root@harbor250.weixiang.com ~]# 3.使用Smon监控redis服务 3.1 准备资源清单 [root@master231 28-servicemonitors]# cat 03-smon-redis.yaml apiVersion: apps/v1 kind: Deployment metadata: name: redis-exporter-deployment spec: replicas: 1 selector: matchLabels: apps: redis-exporter template: metadata: labels: apps: redis-exporter spec: containers: - name: redis-exporter image: harbor250.weixiang.com/weixiang-db/redis_exporter:v1.74.0-alpine env: - name: REDIS_ADDR value: redis://svc-redis.default.svc:6379 - name: REDIS_EXPORTER_WEB_TELEMETRY_PATH value: /metrics - name: REDIS_EXPORTER_WEB_LISTEN_ADDRESS value: :9121 #command: #- redis_exporter #args: #- -redis.addr redis://svc-redis.default.svc:6379 #- -web.telemetry-path /metrics #- -web.listen-address :9121 ports: - containerPort: 9121 --- apiVersion: v1 kind: Service metadata: name: redis-exporter-service labels: apps: redis spec: selector: apps: redis-exporter ports: - protocol: TCP port: 9121 targetPort: 9121 name: redis-exporter --- apiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: name: weixiang-redis-smon spec: endpoints: - interval: 3s port: redis-exporter path: /metrics scheme: http namespaceSelector: matchNames: - default selector: matchLabels: apps: redis [root@master231 28-servicemonitors]# [root@master231 28-servicemonitors]# kubectl apply -f 03-smon-redis.yaml deployment.apps/redis-exporter-deployment created service/redis-exporter-service created servicemonitor.monitoring.coreos.com/weixiang-redis-smon created [root@master231 28-servicemonitors]# 3.2 访问Prometheus的WebUI http://prom.weixiang.com/targets?search= 3.3 导入Grafana的ID 11835 14091

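The same direct check works for the redis exporter before relying on the Grafana dashboards: query redis_up through Prometheus, or hit the exporter Service itself (names follow the manifests above):

bash
# Via Prometheus (1 means the exporter can reach svc-redis)
curl -s 'http://prom.weixiang.com/api/v1/query?query=redis_up'

# Or directly against the exporter service
kubectl port-forward svc/redis-exporter-service 9121:9121 --address=127.0.0.1 &
sleep 2
curl -s http://127.0.0.1:9121/metrics | grep '^redis_up'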

6、Alertmanager的配置文件使用自定义模板和配置文件定义
bash
1.Alertmanager的配置文件 [root@master231 manifests]# pwd /weixiang/manifests/projects/03-prometheus/kube-prometheus-0.11.0/manifests [root@master231 manifests]# [root@master231 manifests]# ll alertmanager-secret.yaml -rw-rw-r-- 1 root root 1443 Jun 17 2022 alertmanager-secret.yaml [root@master231 manifests]# [root@master231 manifests]# cat alertmanager-secret.yaml apiVersion: v1 kind: Secret metadata: labels: app.kubernetes.io/component: alert-router app.kubernetes.io/instance: main app.kubernetes.io/name: alertmanager app.kubernetes.io/part-of: kube-prometheus app.kubernetes.io/version: 0.24.0 name: alertmanager-main namespace: monitoring stringData: alertmanager.yaml: |- # 通用配置 global: resolve_timeout: 5m smtp_from: '13949913771@163.com' smtp_smarthost: 'smtp.163.com:465' smtp_auth_username: '13949913771@163.com' smtp_auth_password: 'UGTMVNtb2Xup2St4' smtp_require_tls: false smtp_hello: '163.com' # 定义路由信息 route: group_by: ['alertname'] group_wait: 5s group_interval: 5s repeat_interval: 5m receiver: 'sre_system' # 配置子路由 routes: - receiver: 'sre_dba' match_re: job: yinzhengjie_dba_exporter # 建议将continue的值设置为true,表示当前的条件是否匹配,都将继续向下匹配规则 # 这样做的目的是将消息发给最后的系统组(sre_system) continue: true - receiver: 'sre_k8s' match_re: job: yinzhengjie_k8s_exporter continue: true - receiver: 'sre_system' match_re: job: .* continue: true # 定义接受者 receivers: - name: 'sre_dba' email_configs: - to: '18295829783@163.com' headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' send_resolved: true - to: '1304871040@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - name: 'sre_k8s' email_configs: - to: '2996358563@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } #html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - to: '2011014877@qq.com' send_resolved: true headers: { Subject: "[WARN] weixiang98报警邮件" } # html: '{{ template "weixiang.html" . }}' # html: '{{ template "xixi" . }}' html: '{{ template "weixiang" . }}' - name: 'sre_system' webhook_configs: # 指向的是Prometheus的插件地址 - url: 'http://10.0.0.42:8060/dingtalk/weixiang98/send' http_config: {} max_alerts: 0 send_resolved: true #email_configs: #- to: '3220434114@qq.com' # send_resolved: true # headers: { Subject: "[WARN] weixiang98报警邮件" } # # html: '{{ template "weixiang.html" . }}' # # html: '{{ template "xixi" . }}' # html: '{{ template "weixiang" . }}' #- to: '2825483220@qq.com' # send_resolved: true # headers: { Subject: "[WARN] weixiang98报警邮件" } # # html: '{{ template "weixiang.html" . }}' # # html: '{{ template "xixi" . }}' # html: '{{ template "weixiang" . 
}}' # 加载模板 templates: - '/weixiang/softwares/alertmanager/tmpl/*.tmpl' [root@master231 manifests]# [root@master231 manifests]# [root@master231 manifests]# kubectl apply -f alertmanager-secret.yaml secret/alertmanager-main configured [root@master231 manifests]# 2.验证配置 [root@master231 manifests]# kubectl get svc -n monitoring alertmanager-main NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE alertmanager-main ClusterIP 10.200.203.24 <none> 9093/TCP,8080/TCP 5h44m [root@master231 manifests]# [root@master231 manifests]# [root@master231 manifests]# kubectl -n monitoring port-forward svc/alertmanager-main 9093:9093 --address=0.0.0.0 Forwarding from 0.0.0.0:9093 -> 9093 3.访问验证 http://10.0.0.231:9093/#/status 4.使用cm定义模板文件 [root@master231 12-configmaps]# cat 04-cm-alertmanager.yaml apiVersion: v1 kind: ConfigMap metadata: name: cm-alertmanager namespace: monitoring data: weixiang.tmpl: | {{ define "weixiang" }} <!DOCTYPE html> <html> <head> <title>{{ if eq .Status "firing" }}&#128680; 告警触发{{ else }}&#9989; 告警恢复{{ end }}</title> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <style> @font-face { font-family: "EmojiFont"; src: local("Apple Color Emoji"), local("Segoe UI Emoji"), local("Noto Color Emoji"); } :root { --color-critical: #ff4444; --color-warning: #ffbb33; --color-resolved: #00c851; --color-info: #33b5e5; } body { font-family: 'Segoe UI', system-ui, sans-serif, "EmojiFont"; line-height: 1.6; color: #333; max-width: 800px; margin: 20px auto; padding: 0 20px; } .header { text-align: center; padding: 30px; border-radius: 15px; margin-bottom: 30px; background: {{ if eq .Status "firing" }}#fff0f0{{ else }}#f0fff4{{ end }}; border: 2px solid {{ if eq .Status "firing" }}var(--color-critical){{ else }}var(--color-resolved){{ end }}; } .status-badge { padding: 8px 16px; border-radius: 20px; font-weight: bold; display: inline-block; } .alert-table { width: 100%; border-collapse: separate; border-spacing: 0; background: white; border-radius: 10px; overflow: hidden; box-shadow: 0 2px 6px rgba(0,0,0,0.1); margin: 20px 0; } .alert-table th { background: #f8f9fa; padding: 16px; text-align: left; width: 130px; border-right: 2px solid #e9ecef; } .alert-table td { padding: 16px; border-bottom: 1px solid #e9ecef; } .timeline { display: flex; justify-content: space-between; margin: 15px 0; } .timeline-item { flex: 1; text-align: center; padding: 10px; background: #f8f9fa; border-radius: 8px; margin: 0 5px; } .alert-image { text-align: center; margin: 30px 0; } .alert-image img { width: {{ if eq .Status "firing" }}140px{{ else }}100px{{ end }}; opacity: 0.9; transition: all 0.3s ease; } .emoji { font-family: "EmojiFont", sans-serif; font-size: 1.3em; } .severity-critical { color: var(--color-critical); } .severity-warning { color: var(--color-warning); } </style> </head> <body> <div class="header"> <h1> {{ if eq .Status "firing" }} <span class="emoji">&#128680;</span> 告警触发通知 {{ else }} <span class="emoji">&#9989;</span> 告警恢复通知 {{ end }} </h1> </div> {{ if eq .Status "firing" }} <!-- 告警触发内容 --> <table class="alert-table"> <tr> <th><span class="emoji">&#128683;</span> 告警名称</th> <td>{{ range .Alerts }}<span class="emoji">&#128227;</span> {{ .Labels.alertname }}{{ end }}</td> </tr> <tr> <th><span class="emoji">&#9888;&#65039;</span> 严重等级</th> <td class="severity-{{ range .Alerts }}{{ .Labels.severity }}{{ end }}"> {{ range .Alerts }}<span class="emoji">&#9210;</span> {{ .Labels.severity | toUpper }}{{ end }} </td> </tr> <tr> <th><span class="emoji">&#128346;</span> 触发时间</th> <td>{{ range .Alerts 
}}<span class="emoji">&#128337;</span> {{ .StartsAt.Format "2006-01-02 15:04:05" }}{{ end }}</td> </tr> </table> {{ else }} <!-- 告警恢复内容 --> <table class="alert-table"> <tr> <th><span class="emoji">&#128227;</span> 恢复告警</th> <td>{{ range .Alerts }}<span class="emoji">&#128272;</span> {{ .Labels.alertname }}{{ end }}</td> </tr> <tr> <th><span class="emoji">&#9203;</span> 持续时间</th> <td> {{ range .Alerts }} {{ .StartsAt.Format "15:04:05" }} - {{ .EndsAt.Format "15:04:05" }} ({{ .EndsAt.Sub .StartsAt | printf "%.0f" }} 分钟) {{ end }} </td> </tr> <tr> <th><span class="emoji">&#9989;</span> 恢复时间</th> <td>{{ range .Alerts }}<span class="emoji">&#128338;</span> {{ .EndsAt.Format "2006-01-02 15:04:05" }}{{ end }}</td> </tr> </table> {{ end }} <!-- 公共信息部分 --> <table class="alert-table"> <tr> <th><span class="emoji">&#128187;&#65039;</span> 实例信息</th> <td>{{ range .Alerts }}<span class="emoji">&#127991;</span> {{ .Labels.instance }}{{ end }}</td> </tr> <tr> <th><span class="emoji">&#128221;</span> 告警详情</th> <td>{{ range .Alerts }}<span class="emoji">&#128204;</span> {{ .Annotations.summary }}{{ end }}</td> </tr> <tr> <th><span class="emoji">&#128196;</span> 详细描述</th> <td>{{ range .Alerts }}<span class="emoji">&#128209;</span> {{ .Annotations.description }}{{ end }}</td> </tr> </table> <div class="alert-image"> {{ if eq .Status "firing" }} <img src="https://img95.699pic.com/element/40114/9548.png_860.png" alt="告警图标"> {{ else }} <img src="https://tse2-mm.cn.bing.net/th/id/OIP-C.n7AyZv_wWXqFCc1mtlGhFgHaHa?rs=1&pid=ImgDetMain" alt="恢复图标"> {{ end }} </div> <div class="timeline"> <div class="timeline-item"> <div class="emoji">&#128678; 当前状态</div> {{ range .Alerts }} <strong>{{ if eq .Status "firing" }}<span class="emoji">&#128293;</span> FIRING{{ else }}<span class="emoji">&#9989;</span> RESOLVED{{ end }}</strong> {{ end }} </div> <div class="timeline-item"> <div class="emoji">&#128204; 触发次数</div> <strong>{{ len .Alerts }} 次</strong> </div> </div> </body> </html> {{ end }} [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl apply -f 04-cm-alertmanager.yaml configmap/cm-alertmanager created [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl -n monitoring get cm cm-alertmanager NAME DATA AGE cm-alertmanager 1 24s [root@master231 12-configmaps]# 5.Alertmanager引用cm资源 [root@master231 manifests]# pwd /weixiang/manifests/projects/03-prometheus/kube-prometheus-0.11.0/manifests [root@master231 manifests]# [root@master231 manifests]# cat alertmanager-alertmanager.yaml apiVersion: monitoring.coreos.com/v1 kind: Alertmanager metadata: ... name: main namespace: monitoring spec: volumes: - name: data configMap: name: cm-alertmanager items: - key: weixiang.tmpl path: weixiang.tmpl volumeMounts: - name: data mountPath: /weixiang/softwares/alertmanager/tmpl image: quay.io/prometheus/alertmanager:v0.24.0 ... 
[root@master231 manifests]# [root@master231 manifests]# [root@master231 manifests]# kubectl apply -f alertmanager-alertmanager.yaml alertmanager.monitoring.coreos.com/main configured [root@master231 manifests]# 6.测试验证 [root@master231 12-configmaps]# kubectl get pods -o wide -n monitoring -l alertmanager=main NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES alertmanager-main-0 2/2 Running 0 16s 10.100.1.101 worker232 <none> <none> alertmanager-main-1 2/2 Running 0 32s 10.100.2.9 worker233 <none> <none> alertmanager-main-2 2/2 Running 0 38s 10.100.1.100 worker232 <none> <none> [root@master231 12-configmaps]# [root@master231 12-configmaps]# kubectl -n monitoring exec -it alertmanager-main-0 -- sh /alertmanager $ ls /weixiang/softwares/alertmanager/tmpl/ weixiang.tmpl /alertmanager $ /alertmanager $ head /weixiang/softwares/alertmanager/tmpl/weixiang.tmpl {{ define "weixiang" }} <!DOCTYPE html> <html> <head> <title>{{ if eq .Status "firing" }}&#128680; 告警触发{{ else }}&#9989; 告警恢复{{ end }}</title> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <style> @font-face { font-family: "EmojiFont"; src: local("Apple Color Emoji"), /alertmanager $ - Prometheus其他配置文件解读 1.黑盒监控的相关配置文件 [root@master231 manifests]# pwd /weixiang/manifests/projects/03-prometheus/kube-prometheus-0.11.0/manifests [root@master231 manifests]# [root@master231 manifests]# ls blackboxExporter-* blackboxExporter-clusterRoleBinding.yaml blackboxExporter-deployment.yaml blackboxExporter-serviceMonitor.yaml blackboxExporter-clusterRole.yaml blackboxExporter-networkPolicy.yaml blackboxExporter-service.yaml blackboxExporter-configuration.yaml blackboxExporter-serviceAccount.yaml [root@master231 manifests]# 2.Grafana的相关配置文件 [root@master231 manifests]# ls grafana-* grafana-config.yaml grafana-dashboardSources.yaml grafana-prometheusRule.yaml grafana-service.yaml grafana-dashboardDatasources.yaml grafana-deployment.yaml grafana-serviceAccount.yaml grafana-dashboardDefinitions.yaml grafana-networkPolicy.yaml grafana-serviceMonitor.yaml [root@master231 manifests]# 3.node-exporter的相关配置文件 [root@master231 manifests]# ls nodeExporter-* nodeExporter-clusterRoleBinding.yaml nodeExporter-networkPolicy.yaml nodeExporter-serviceMonitor.yaml nodeExporter-clusterRole.yaml nodeExporter-prometheusRule.yaml nodeExporter-service.yaml nodeExporter-daemonset.yaml nodeExporter-serviceAccount.yaml [root@master231 manifests]# 4.Prometheus服务端的相关配置 [root@master231 manifests]# ll prometheus-prometheus* -rw-rw-r-- 1 root root 15726 Jun 17 2022 prometheus-prometheusRule.yaml -rw-rw-r-- 1 root root 1238 Jun 17 2022 prometheus-prometheus.yaml [root@master231 manifests]#
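
To double-check that the operator really picked up the template and secret applied earlier in this section, the rendered configuration can be decoded from the Secret and the reloader sidecar's log inspected; a small sketch (the config-reloader container name is the kube-prometheus default and may differ between operator versions):

bash
# Decode the alertmanager.yaml stored in the Secret applied above
kubectl -n monitoring get secret alertmanager-main \
  -o jsonpath='{.data.alertmanager\.yaml}' | base64 -d | head -n 20

# The sidecar logs a reload after the Secret/ConfigMap changes
kubectl -n monitoring logs alertmanager-main-0 -c config-reloader --tail=20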

20、kubeadm证书续期master节点证书

bash
- kubeadm证书续期master节点证书 1.检查证书的有效期 [root@master231 ~]# kubeadm certs check-expiration [check-expiration] Reading configuration from the cluster... [check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml' W0808 15:50:36.976738 357710 utils.go:69] The recommended value for "resolvConf" in "KubeletConfiguration" is: /run/systemd/resolve/resolv.conf; the provided value is: /run/systemd/resolve/resolv.conf CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED admin.conf Jul 09, 2026 02:40 UTC 334d ca no apiserver Jul 09, 2026 02:40 UTC 334d ca no apiserver-etcd-client Jul 09, 2026 02:40 UTC 334d etcd-ca no apiserver-kubelet-client Jul 09, 2026 02:40 UTC 334d ca no controller-manager.conf Jul 09, 2026 02:40 UTC 334d ca no etcd-healthcheck-client Jul 09, 2026 02:40 UTC 334d etcd-ca no etcd-peer Jul 09, 2026 02:40 UTC 334d etcd-ca no etcd-server Jul 09, 2026 02:40 UTC 334d etcd-ca no front-proxy-client Jul 09, 2026 02:40 UTC 334d front-proxy-ca no scheduler.conf Jul 09, 2026 02:40 UTC 334d ca no CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED ca Jul 07, 2035 02:40 UTC 9y no etcd-ca Jul 07, 2035 02:40 UTC 9y no front-proxy-ca Jul 07, 2035 02:40 UTC 9y no [root@master231 ~]# 2.查看证书的时间 [root@master231 ~]# ll /etc/kubernetes/pki/ total 72 drwxr-xr-x 3 root root 4096 Jul 18 10:13 ./ drwxr-xr-x 4 root root 4096 Jul 9 10:40 ../ -rw-r--r-- 1 root root 1285 Jul 9 10:40 apiserver.crt -rw-r--r-- 1 root root 1155 Jul 9 10:40 apiserver-etcd-client.crt -rw------- 1 root root 1679 Jul 9 10:40 apiserver-etcd-client.key -rw------- 1 root root 1675 Jul 9 10:40 apiserver.key -rw-r--r-- 1 root root 1164 Jul 9 10:40 apiserver-kubelet-client.crt -rw------- 1 root root 1675 Jul 9 10:40 apiserver-kubelet-client.key -rw-r--r-- 1 root root 1099 Jul 9 10:40 ca.crt -rw------- 1 root root 1675 Jul 9 10:40 ca.key drwxr-xr-x 2 root root 4096 Jul 9 10:40 etcd/ -rw-r--r-- 1 root root 1115 Jul 9 10:40 front-proxy-ca.crt -rw------- 1 root root 1675 Jul 9 10:40 front-proxy-ca.key -rw-r--r-- 1 root root 1119 Jul 9 10:40 front-proxy-client.crt -rw------- 1 root root 1679 Jul 9 10:40 front-proxy-client.key -rw------- 1 root root 1679 Jul 9 10:40 sa.key -rw------- 1 root root 451 Jul 9 10:40 sa.pub -rw-r--r-- 1 root root 258 Jul 18 10:13 token.csv [root@master231 ~]# [root@master231 ~]# tree /etc/kubernetes/pki/ /etc/kubernetes/pki/ ├── apiserver.crt ├── apiserver-etcd-client.crt ├── apiserver-etcd-client.key ├── apiserver.key ├── apiserver-kubelet-client.crt ├── apiserver-kubelet-client.key ├── ca.crt ├── ca.key ├── etcd │   ├── ca.crt │   ├── ca.key │   ├── healthcheck-client.crt │   ├── healthcheck-client.key │   ├── peer.crt │   ├── peer.key │   ├── server.crt │   └── server.key ├── front-proxy-ca.crt ├── front-proxy-ca.key ├── front-proxy-client.crt ├── front-proxy-client.key ├── sa.key ├── sa.pub └── token.csv 1 directory, 23 files [root@master231 ~]# [root@master231 ~]# openssl x509 -in /etc/kubernetes/pki/apiserver-kubelet-client.crt -text -noout Certificate: Data: Version: 3 (0x2) Serial Number: 3535312182605372125 (0x310ff3b2fe4ddadd) Signature Algorithm: sha256WithRSAEncryption Issuer: CN = kubernetes Validity Not Before: Jul 9 02:40:49 2025 GMT Not After : Jul 9 02:40:49 2026 GMT Subject: O = system:masters, CN = kube-apiserver-kubelet-client Subject Public Key Info: ... 3.续期master节点证书 [root@master231 ~]# kubeadm certs renew all [renew] Reading configuration from the cluster... 
[renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml' W0808 15:55:20.829433 361409 utils.go:69] The recommended value for "resolvConf" in "KubeletConfiguration" is: /run/systemd/resolve/resolv.conf; the provided value is: /run/systemd/resolve/resolv.conf certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed certificate for serving the Kubernetes API renewed certificate the apiserver uses to access etcd renewed certificate for the API server to connect to kubelet renewed certificate embedded in the kubeconfig file for the controller manager to use renewed certificate for liveness probes to healthcheck etcd renewed certificate for etcd nodes to communicate with each other renewed certificate for serving etcd renewed certificate for the front proxy client renewed certificate embedded in the kubeconfig file for the scheduler manager to use renewed Done renewing certificates. You must restart the kube-apiserver, kube-controller-manager, kube-scheduler and etcd, so that they can use the new certificates. [root@master231 ~]# 4.查看证书的时间 [root@master231 ~]# kubeadm certs check-expiration [check-expiration] Reading configuration from the cluster... [check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml' W0808 15:55:40.002219 361701 utils.go:69] The recommended value for "resolvConf" in "KubeletConfiguration" is: /run/systemd/resolve/resolv.conf; the provided value is: /run/systemd/resolve/resolv.conf CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED admin.conf Aug 08, 2026 07:55 UTC 364d ca no apiserver Aug 08, 2026 07:55 UTC 364d ca no apiserver-etcd-client Aug 08, 2026 07:55 UTC 364d etcd-ca no apiserver-kubelet-client Aug 08, 2026 07:55 UTC 364d ca no controller-manager.conf Aug 08, 2026 07:55 UTC 364d ca no etcd-healthcheck-client Aug 08, 2026 07:55 UTC 364d etcd-ca no etcd-peer Aug 08, 2026 07:55 UTC 364d etcd-ca no etcd-server Aug 08, 2026 07:55 UTC 364d etcd-ca no front-proxy-client Aug 08, 2026 07:55 UTC 364d front-proxy-ca no scheduler.conf Aug 08, 2026 07:55 UTC 364d ca no CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED ca Jul 07, 2035 02:40 UTC 9y no etcd-ca Jul 07, 2035 02:40 UTC 9y no front-proxy-ca Jul 07, 2035 02:40 UTC 9y no [root@master231 ~]# 5.查看证书的时间 [root@master231 ~]# ll /etc/kubernetes/pki/ total 72 drwxr-xr-x 3 root root 4096 Jul 18 10:13 ./ drwxr-xr-x 4 root root 4096 Jul 9 10:40 ../ -rw-r--r-- 1 root root 1285 Aug 8 15:55 apiserver.crt -rw-r--r-- 1 root root 1155 Aug 8 15:55 apiserver-etcd-client.crt -rw------- 1 root root 1679 Aug 8 15:55 apiserver-etcd-client.key -rw------- 1 root root 1675 Aug 8 15:55 apiserver.key -rw-r--r-- 1 root root 1164 Aug 8 15:55 apiserver-kubelet-client.crt -rw------- 1 root root 1675 Aug 8 15:55 apiserver-kubelet-client.key -rw-r--r-- 1 root root 1099 Jul 9 10:40 ca.crt -rw------- 1 root root 1675 Jul 9 10:40 ca.key drwxr-xr-x 2 root root 4096 Jul 9 10:40 etcd/ -rw-r--r-- 1 root root 1115 Jul 9 10:40 front-proxy-ca.crt -rw------- 1 root root 1675 Jul 9 10:40 front-proxy-ca.key -rw-r--r-- 1 root root 1119 Aug 8 15:55 front-proxy-client.crt -rw------- 1 root root 1679 Aug 8 15:55 front-proxy-client.key -rw------- 1 root root 1679 Jul 9 10:40 sa.key -rw------- 1 root root 451 Jul 9 10:40 sa.pub -rw-r--r-- 1 root root 258 Jul 18 10:13 token.csv [root@master231 ~]# [root@master231 ~]# openssl x509 -in /etc/kubernetes/pki/apiserver-kubelet-client.crt -text 
-noout Certificate: Data: Version: 3 (0x2) Serial Number: 5672730602872110576 (0x4eb9966d8aadfdf0) Signature Algorithm: sha256WithRSAEncryption Issuer: CN = kubernetes Validity Not Before: Jul 9 02:40:49 2025 GMT Not After : Aug 8 07:55:21 2026 GMT ... - kubeadm证书续期worker节点证书 1.升级前查看客户端证书文件 [root@master231 ~]# ll /var/lib/kubelet/pki/ total 20 drwxr-xr-x 2 root root 4096 Jul 9 10:40 ./ drwx------ 8 root root 4096 Jul 9 10:40 ../ -rw------- 1 root root 2830 Jul 9 10:40 kubelet-client-2025-07-09-10-40-51.pem lrwxrwxrwx 1 root root 59 Jul 9 10:40 kubelet-client-current.pem -> /var/lib/kubelet/pki/kubelet-client-2025-07-09-10-40-51.pem -rw-r--r-- 1 root root 2258 Jul 9 10:40 kubelet.crt -rw------- 1 root root 1679 Jul 9 10:40 kubelet.key [root@master231 ~]# [root@master231 ~]# openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -text -noout Certificate: Data: Version: 3 (0x2) Serial Number: 7657321722586276322 (0x6a44458955a321e2) Signature Algorithm: sha256WithRSAEncryption Issuer: CN = kubernetes Validity Not Before: Jul 9 02:40:49 2025 GMT Not After : Jul 9 02:40:50 2026 GMT ... 2.使用kube-controller-manager进行续签证书: 参考链接: https://kubernetes.io/zh-cn/docs/reference/command-line-tools-reference/kube-controller-manager/ [root@master231 pki]# vim /etc/kubernetes/manifests/kube-controller-manager.yaml ... spec: containers: - command: - kube-controller-manager ... # 所签名证书的有效期限。每个 CSR 可以通过设置 spec.expirationSeconds 来请求更短的证书。 - --cluster-signing-duration=87600h0m0s # 启用controner manager自动签发CSR证书,可以不配置,默认就是启用的,但是建议配置上!害怕未来版本发生变化! - --feature-gates=RotateKubeletServerCertificate=true 3.验证kube-controller-manager是否启动成功。 [root@master231 ~]# kubectl get pods -n kube-system -l component=kube-controller-manager -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES kube-controller-manager-master231 1/1 Running 0 2m7s 10.0.0.231 master231 <none> <none> [root@master231 ~]# [root@master231 ~]# kubectl get cs Warning: v1 ComponentStatus is deprecated in v1.19+ NAME STATUS MESSAGE ERROR controller-manager Healthy ok scheduler Healthy ok etcd-0 Healthy {"health":"true","reason":""} [root@master231 ~]# 4.要求kubelet的配置文件中支持证书滚动,默认是启用的,无需配置。 [root@worker232 ~]# vim /var/lib/kubelet/config.yaml ... 
rotateCertificates: true 5.客户端节点修改节点的时间 centos操作如下: [root@worker232 ~]# date -s "2025-6-4" [root@worker232 ~]# [root@worker232 ~]# systemctl restart kubelet ubuntu系统操作如下: [root@worker232 ~]# timedatectl set-ntp off # 先关闭时间同步服务。 [root@worker232 ~]# [root@worker232 ~]# timedatectl set-time '2026-07-08 15:30:00' # 修改即将过期的时间的前一天 [root@worker232 ~]# [root@worker232 ~]# date -R Wed, 08 Jul 2026 15:30:01 +0800 [root@worker232 ~]# 6.重启kubelet [root@worker232 ~]# ll /var/lib/kubelet/pki/ total 20 drwxr-xr-x 2 root root 4096 Jul 9 2025 ./ drwxr-xr-x 8 root root 4096 Jul 9 2025 ../ -rw------- 1 root root 1114 Jul 9 2025 kubelet-client-2025-07-09-10-48-14.pem lrwxrwxrwx 1 root root 59 Jul 9 2025 kubelet-client-current.pem -> /var/lib/kubelet/pki/kubelet-client-2025-07-09-10-48-14.pem -rw-r--r-- 1 root root 2258 Jul 9 2025 kubelet.crt -rw------- 1 root root 1675 Jul 9 2025 kubelet.key [root@worker232 ~]# [root@worker232 ~]# systemctl restart kubelet [root@worker232 ~]# [root@worker232 ~]# ll /var/lib/kubelet/pki/ total 24 drwxr-xr-x 2 root root 4096 Jul 8 15:30 ./ drwxr-xr-x 8 root root 4096 Jul 9 2025 ../ -rw------- 1 root root 1114 Jul 9 2025 kubelet-client-2025-07-09-10-48-14.pem -rw------- 1 root root 1114 Jul 8 15:30 kubelet-client-2026-07-08-15-30-31.pem lrwxrwxrwx 1 root root 59 Jul 8 15:30 kubelet-client-current.pem -> /var/lib/kubelet/pki/kubelet-client-2026-07-08-15-30-31.pem -rw-r--r-- 1 root root 2258 Jul 9 2025 kubelet.crt -rw------- 1 root root 1675 Jul 9 2025 kubelet.key [root@worker232 ~]# 7.查看客户端的证书有效期 [root@worker232 prometheus]# openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -text -noout Certificate: Data: Version: 3 (0x2) Serial Number: 04:df:41:55:1b:d0:64:0a:c4:3a:20:ac:37:1a:f8:87 Signature Algorithm: sha256WithRSAEncryption Issuer: CN = kubernetes Validity Not Before: Aug 8 08:01:38 2025 GMT Not After : Jul 7 02:40:49 2035 GMT # Duang~证书续期了10年! ... 8.验证能够正常工作(如果无法创建Pod,则需要删除一下calico的名称空间的Pod) [root@master231 ~]# cat > test-cni.yaml <<EOF apiVersion: v1 kind: Pod metadata: name: xixi spec: nodeName: worker232 containers: - image: harbor250.weixiang.com/weixiang-xiuxian/apps:v1 name: c1 --- apiVersion: v1 kind: Pod metadata: name: haha spec: nodeName: worker233 containers: - image: harbor250.weixiang.com/weixiang-xiuxian/apps:v2 name: c1 EOF [root@master231 ~]# [root@master231 ~]# kubectl apply -f test-cni.yaml pod/xixi created pod/haha created [root@master231 ~]# [root@master231 ~]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES haha 1/1 Running 0 58s 10.100.2.10 worker233 <none> <none> xixi 1/1 Running 0 58s 10.100.1.102 worker232 <none> <none> [root@master231 ~]# [root@master231 ~]# curl 10.100.2.10 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v2</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: red">凡人修仙传 v2 </h1> <div> <img src="2.jpg"> <div> </body> </html> [root@master231 ~]# [root@master231 ~]# curl 10.100.1.102 <!DOCTYPE html> <html> <head> <meta charset="utf-8"/> <title>yinzhengjie apps v1</title> <style> div img { width: 900px; height: 600px; margin: 0; } </style> </head> <body> <h1 style="color: green">凡人修仙传 v1 </h1> <div> <img src="1.jpg"> <div> </body> </html> [root@master231 ~]# 温馨提示: 生产环境中对于worker证书升级应该注意的事项: - 对生产环境一定要有敬畏之心,不可随意; - 对证书有效期有效期进行监控,很多开源组件都支持,比如zabbix,prometheus等。 - 在重启kubelet节点时,应该注意滚动更新,不要批量重启,避免Pod大面积无法访问的情况,从而造成业务的损失,甚至生产故障; - 尽量在业务的低谷期做升级操作,影响最小; - 在生产环境操作前,最好是先线下复刻的环境中重复执行3次以上;
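
As the closing notes say, certificate expiry should be monitored rather than discovered the hard way. A minimal bash sketch that prints the notAfter date of every kubeadm certificate plus the kubelet client certificate (wiring this into zabbix/prometheus alerting is left out):

bash
#!/bin/bash
# Print the expiry date of every certificate under /etc/kubernetes/pki (run on the master)
find /etc/kubernetes/pki -name '*.crt' | while read -r crt; do
    printf '%-60s %s\n' "$crt" "$(openssl x509 -enddate -noout -in "$crt" | cut -d= -f2)"
done

# kubelet client certificate (run on every node)
openssl x509 -enddate -noout -in /var/lib/kubelet/pki/kubelet-client-current.pem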

本文作者:张龙龙


版权声明:本博客所有文章除特别声明外,均采用 BY-NC-SA 许可协议。转载请注明出处!