Canary (Gray) Release

📝 A more elegant approach: control the version ratio with replica counts

1. Build both versions of the image

   ```bash
   # Build v1 and v2
   docker build -t my-app:v1 .
   docker build -t my-app:v2 .

   # Import both images into the k3d cluster
   k3d image import my-app:v1 -c centos-cluster
   k3d image import my-app:v2 -c centos-cluster
   ```
2. Create two Deployments (v1 stable, v2 canary)

   ```yaml
   # stable-deployment.yaml
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: my-app-stable
     labels:
       app: my-app
       track: stable
   spec:
     replicas: 3
     selector:
       matchLabels:
         app: my-app
         track: stable
     template:
       metadata:
         labels:
           app: my-app
           track: stable
       spec:
         containers:
         - name: my-app
           image: my-app:v1
           ports:
           - containerPort: 8080
   ```

   ```yaml
   # canary-deployment.yaml
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: my-app-canary
     labels:
       app: my-app
       track: canary
   spec:
     replicas: 1  # the canary instance
     selector:
       matchLabels:
         app: my-app
         track: canary
     template:
       metadata:
         labels:
           app: my-app
           track: canary
       spec:
         containers:
         - name: my-app
           image: my-app:v2  # the new version
           ports:
           - containerPort: 8080
   ```
3. Create a Service that covers both versions

   ```yaml
   # service.yaml
   apiVersion: v1
   kind: Service
   metadata:
     name: my-app
   spec:
     selector:
       app: my-app  # matches the Pods of both Deployments
     ports:
     - port: 80
       targetPort: 8080
   ```
4. Gradually adjust the canary replica count

   ```bash
   # Initial state: 3 x v1 + 1 x v2 (25% of traffic goes to v2)

   # After monitoring for a while, if things look stable, scale up the canary
   kubectl scale deployment my-app-canary --replicas=2   # 40% of traffic to v2
   kubectl scale deployment my-app-canary --replicas=3   # 50% of traffic to v2
   kubectl scale deployment my-app-canary --replicas=6   # 67% of traffic to v2

   # Scale down the stable version
   kubectl scale deployment my-app-stable --replicas=2   # 6 x v2 + 2 x v1 = 75% v2
   kubectl scale deployment my-app-stable --replicas=1   # 6 x v2 + 1 x v1 = 86% v2
   kubectl scale deployment my-app-stable --replicas=0   # 100% v2

   # Finally, promote the canary to stable
   kubectl delete deployment my-app-stable
   # Note: this relabels the Deployment object only; the Pod template keeps track=canary
   kubectl label deployment my-app-canary track=stable --overwrite
   ```
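Because the Service balances roughly evenly across all ready Pods, the share of traffic reaching v2 is simply canary / (stable + canary). A minimal helper to sanity-check the percentages above (the `canary_share` function name is illustrative, not part of any tool):

```shell
#!/bin/bash
# Approximate % of traffic reaching the canary, assuming the Service
# spreads requests evenly across all ready Pods.
canary_share() {
  local stable=$1 canary=$2
  echo $(( 100 * canary / (stable + canary) ))
}

canary_share 3 1   # 25
canary_share 3 2   # 40
canary_share 3 6   # 66 (integer division; ~67%)
canary_share 0 6   # 100
```

The percentages are approximations: real traffic split also depends on kube-proxy mode and connection reuse, so expect the observed ratio to converge only over many requests.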
🔧 Automation script: staged rolling update

Create a script `canary-rollout.sh`:

```bash
#!/bin/bash

DEPLOYMENT="my-app"
NEW_VERSION="v2"
TOTAL_REPLICAS=4
SLEEP_TIME=60  # how long to wait at each stage (seconds)

echo "=== Starting canary rollout ==="

# Stage 1: configure the rolling-update strategy
echo "Stage 1: configuring the rolling-update strategy"
kubectl patch deployment "$DEPLOYMENT" -p '{
  "spec": {
    "strategy": {
      "type": "RollingUpdate",
      "rollingUpdate": {
        "maxSurge": 1,
        "maxUnavailable": 0
      }
    }
  }
}'

# Stage 2: update to the new version (only 1 new Pod is surged at a time)
echo "Stage 2: deploying the first canary instance"
kubectl set image deployment/"$DEPLOYMENT" "*=my-app:$NEW_VERSION"
sleep 5

# Wait for the first canary to become ready
echo "Waiting for the canary instance to become ready..."
kubectl wait --for=condition=available --timeout=120s deployment/"$DEPLOYMENT"

# Stage 3: monitor the canary
echo "Stage 3: monitoring the canary for $SLEEP_TIME seconds"
for i in $(seq 1 "$SLEEP_TIME"); do
  echo -n "."
  sleep 1
done
echo ""

# Check the recent logs for errors
ERROR_COUNT=$(kubectl logs -l app="$DEPLOYMENT" --tail=100 | grep -i error | wc -l)
if [ "$ERROR_COUNT" -gt 10 ]; then
  echo "Too many errors, rolling back!"
  kubectl rollout undo deployment/"$DEPLOYMENT"
  exit 1
fi

# Stage 4: trigger the second rollout step (2 new Pods)
echo "Stage 4: expanding the canary to 2 instances"
kubectl patch deployment "$DEPLOYMENT" -p '{"spec":{"template":{"metadata":{"labels":{"update":"step2"}}}}}'
sleep "$SLEEP_TIME"

# Stage 5: trigger the third rollout step (3 new Pods)
echo "Stage 5: expanding the canary to 3 instances"
kubectl patch deployment "$DEPLOYMENT" -p '{"spec":{"template":{"metadata":{"labels":{"update":"step3"}}}}}'
sleep "$SLEEP_TIME"

# Stage 6: finish replacing all remaining Pods
echo "Stage 6: completing the rollout"
kubectl patch deployment "$DEPLOYMENT" -p '{"spec":{"template":{"metadata":{"labels":{"update":"final"}}}}}'

echo "=== Canary rollout complete ==="
```
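The rollback gate in stage 3 is just a line count over recent logs. It can be exercised locally with canned log lines standing in for `kubectl logs` (the `logs` function below is a stand-in, not a real command):

```shell
#!/bin/bash
# Reproduce the script's error gate against sample log output.
logs() {
  printf 'INFO request served\nERROR db timeout\nerror: retry exhausted\nINFO request served\n'
}

# Same check as in canary-rollout.sh, with `logs` replacing `kubectl logs`
ERROR_COUNT=$(logs | grep -i error | wc -l)
if [ "$ERROR_COUNT" -gt 10 ]; then
  echo "too many errors, would roll back"
else
  echo "within threshold"
fi
```

The threshold (more than 10 error lines in the last 100) is arbitrary; tune it to your app's baseline noise, since a chatty logger can trip it even when the canary is healthy.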
📊 Monitoring the traffic distribution

```bash
# Watch the Pod counts in real time
watch -n 2 'kubectl get pods -l app=my-app'

# Show which image each Pod is running
kubectl get pods -l app=my-app -o custom-columns=NAME:.metadata.name,IMAGE:.spec.containers[0].image,STATUS:.status.phase

# Simulate traffic: in one terminal, forward the Service port
kubectl port-forward svc/my-app 8080:80

# ...and in another terminal, send a steady stream of requests
while true; do
  curl -s http://localhost:8080/version 2>/dev/null || echo "error"
  sleep 0.5
done
```
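To turn the raw curl loop into an actual distribution, pipe a fixed number of responses through `sort | uniq -c`. A sketch with a canned `fetch` function standing in for the curl call (it assumes the app's `/version` endpoint returns a bare version string such as `v1`):

```shell
#!/bin/bash
# Stand-in for `curl -s http://localhost:8080/version`, emitting the
# mix you would expect from 3 stable Pods and 1 canary Pod.
fetch() {
  for v in v1 v1 v1 v2; do echo "$v"; done
}

# Tally responses by version, most frequent first
fetch | sort | uniq -c | sort -rn
```

With a live cluster, replace the `fetch` body with the real curl call and sample a few hundred responses; the counts should roughly track the Pod ratio.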
⚠️ Caveats

- A rolling update is not a true canary release: it always replaces the old version step by step until it is gone.
- A true canary release keeps the old and new versions running side by side and splits traffic between them by weight.
- If you need precise control over the traffic ratio, use the canary features of Istio or the NGINX Ingress controller.
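As an illustration of that last point, a weight-based canary with the NGINX Ingress controller looks roughly like this. This is a sketch: it assumes a separate Service `my-app-canary` selecting only the `track: canary` Pods, and a hypothetical host `my-app.example.com`; the two `canary` annotations are the controller's actual knobs.

```yaml
# canary-ingress.yaml (sketch; requires the NGINX Ingress controller)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "20"  # send ~20% of traffic here
spec:
  rules:
  - host: my-app.example.com   # hypothetical host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app-canary  # assumed Service in front of the canary Pods
            port:
              number: 80
```

Unlike replica-count scaling, the weight here is exact and independent of how many Pods back each Service.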