## EKS Upgrade Progress
| Environment | Region | Cluster Versions |
| ----------- | ------- | ------------------------------------------------------------ |
| Dev | Beijing | Service: v1.33, ArgoCD: v1.33, Testkube: v1.32 |
| Dev | Ningxia | Service: v1.33 |
| QA | Beijing | Service: v1.31 (upgrade paused for now; can be reused later to create a multi-cluster test cluster if needed) |
| QA | Ningxia | Service: v1.33 |
| Prod | Beijing | Service: v1.30, ArgoCD: v1.33, Testkube: v1.32 |
| Prod | Ningxia | Service: v1.30 |
## EKS Multi-Cluster
### Requirements
1. Ensure no interruption to the DKMS service during EKS upgrades and new version releases.
2. Support blue/green deployments and canary (gradual) releases.

| Solution | Description | Pros | Cons | AWS Recommendation |
| -------- | ----------- | ---- | ---- | ------------------ |
| Route53 Switch | Two NLBs (NLB → ALB → EKS); Route53 Weighted Routing shifts traffic between them approximately by weight | 1. Easy to implement; 2. Clear separation | 1. DNS TTL (60s) delay; 2. SDK client-side DNS caching may cause traffic to fail over to Ningxia | ✅ Recommended |
| Shared NLB | Two ALBs attached as target groups of the same NLB; switch by binding/unbinding target groups | 1. No Route53 operations; 2. Faster traffic shift | 1. Clients must reconnect | ✅ Recommended |
| Shared ALB | One ALB forwards to 2 EKS clusters by modifying TargetGroupBinding | 1. Supports blue/green deployments and canary (gradual) releases | 1. High operational complexity; 2. Risk of ALB/Ingress re-creation | ❌ Not recommended |
### Option 1: Route53 Switch
#### AS-IS (Before Active-Active)
```mermaid
graph TD
    A[Clients] --> B[Route53 - Simple]
    B --> C[NLB - Beijing]
    C --> D[ALB - Beijing]
    D --> E[EKS Cluster - v1]
```

#### TO-BE (Blue/Green with Route53 Weighted Routing)
```mermaid
graph TD
    A[Clients] --> B[Route53 - Weighted TTL 60s]
    B --> C1[NLB - Beijing] --> D1[ALB - Blue] --> E1[EKS - v1]
    B --> C2[NLB - Beijing] --> D2[ALB - Green] --> E2[EKS - v2]
```

------
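For Option 1, the weight shift can be expressed as a single Route53 change. The following is a minimal sketch, assuming both NLBs are published as weighted CNAME records under one name. The record name, the NLB DNS names, and the helper `weighted_change_batch` are hypothetical; the returned dict is shaped for the `ChangeBatch` parameter of boto3's `change_resource_record_sets`.

```python
from typing import Any


def weighted_change_batch(
    record_name: str,
    blue_nlb_dns: str,
    green_nlb_dns: str,
    green_weight: int,
    ttl: int = 60,
) -> dict[str, Any]:
    """Build a Route53 ChangeBatch that splits traffic between two NLBs."""
    if not 0 <= green_weight <= 100:
        raise ValueError("green_weight must be between 0 and 100")

    def record(set_id: str, target: str, weight: int) -> dict[str, Any]:
        # One weighted record per NLB; SetIdentifier distinguishes them.
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": set_id,
                "Weight": weight,
                "TTL": ttl,
                "ResourceRecords": [{"Value": target}],
            },
        }

    return {
        "Comment": f"blue/green shift: green={green_weight}",
        "Changes": [
            record("blue", blue_nlb_dns, 100 - green_weight),
            record("green", green_nlb_dns, green_weight),
        ],
    }


# Hypothetical names, for illustration only.
batch = weighted_change_batch(
    "dkms.example.internal.",
    "nlb-blue.example.internal",
    "nlb-green.example.internal",
    green_weight=10,
)
# With AWS credentials configured, this could be applied as:
#   boto3.client("route53").change_resource_record_sets(
#       HostedZoneId="<hosted-zone-id>", ChangeBatch=batch)
```

Stepping `green_weight` through values such as 10, 50, and 100 would give a gradual release. Because of the 60s TTL and client-side DNS caching noted in the comparison table, each step takes effect only approximately and with some delay.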
### Option 2: Shared NLB with Two ALBs
#### AS-IS
```mermaid
graph TD
    A[Clients] --> B[Route53 - Simple]
    B --> C[NLB]
    C --> D[ALB - v1] --> E[EKS Cluster - v1]
```

#### TO-BE (Switch Target Groups)
```mermaid
graph TD
    A[Clients] --> B[Route53 - Simple]
    B --> C[NLB]
    C --> D1[ALB - Blue] --> E1[EKS - v1]
    C --> D2[ALB - Green] --> E2[EKS - v2]
```

> 🔁 During upgrade: Unbind D1, bind D2
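One way to read "unbind D1, bind D2" is to repoint the NLB listener's default forward action from the blue target group to the green one. The sketch below assumes that interpretation, which the source does not confirm. The ARNs and the helper `plan_switch` are hypothetical; each returned dict is shaped as keyword arguments for boto3's elbv2 `modify_listener`.

```python
from typing import Any


def forward_to(listener_arn: str, target_group_arn: str) -> dict[str, Any]:
    """Keyword arguments that forward all listener traffic to one target group."""
    return {
        "ListenerArn": listener_arn,
        "DefaultActions": [
            {"Type": "forward", "TargetGroupArn": target_group_arn},
        ],
    }


def plan_switch(listener_arn: str, blue_tg_arn: str, green_tg_arn: str) -> dict[str, Any]:
    """Return the request that switches to green and the one that rolls back to blue."""
    return {
        "apply": forward_to(listener_arn, green_tg_arn),
        "rollback": forward_to(listener_arn, blue_tg_arn),
    }


# Placeholder ARNs, for illustration only.
plan = plan_switch("arn:listener/nlb", "arn:targetgroup/alb-blue", "arn:targetgroup/alb-green")
# With AWS credentials configured, the switch could be applied as:
#   boto3.client("elbv2").modify_listener(**plan["apply"])
```

Keeping the rollback request next to the apply request makes reverting a single call. As the comparison table notes, clients may need to reconnect after the switch.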
------
### Option 3: Shared ALB Across Clusters
#### AS-IS
```mermaid
graph TD
    A[Clients] --> B[Route53 - Simple]
    B --> C[NLB] --> D[ALB] --> E[EKS Cluster - v1]
```

#### TO-BE (Switch TargetGroupBinding)
```mermaid
graph TD
    A[Clients] --> B[Route53 - Simple]
    B --> C[NLB] --> D[ALB]
    D --> E1[EKS - v1]
    D --> E2[EKS - v2]
```

> ⚠️ Risk: ALB/Ingress may be recreated when changing bindings
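To make Option 3 concrete, the sketch below builds the two pieces such a setup would likely involve: a `TargetGroupBinding` manifest (the AWS Load Balancer Controller custom resource) for each cluster, and a weighted forward action for the shared ALB listener. It assumes one pre-created target group per cluster and `ip` target type. All names, ports, ARNs, and both helper functions are hypothetical.

```python
import json
from typing import Any


def target_group_binding(
    name: str, namespace: str, service_name: str, service_port: int, target_group_arn: str
) -> dict[str, Any]:
    """Manifest binding a Kubernetes Service to an existing ALB target group."""
    return {
        "apiVersion": "elbv2.k8s.aws/v1beta1",
        "kind": "TargetGroupBinding",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "serviceRef": {"name": service_name, "port": service_port},
            "targetGroupARN": target_group_arn,
            "targetType": "ip",  # assumption: pods are registered by IP
        },
    }


def weighted_forward_action(blue_tg_arn: str, green_tg_arn: str, green_weight: int) -> dict[str, Any]:
    """ALB listener action splitting traffic between the two clusters' target groups."""
    if not 0 <= green_weight <= 100:
        raise ValueError("green_weight must be between 0 and 100")
    return {
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                {"TargetGroupArn": blue_tg_arn, "Weight": 100 - green_weight},
                {"TargetGroupArn": green_tg_arn, "Weight": green_weight},
            ]
        },
    }


# Placeholder values, for illustration only; apply the manifest in the v2 cluster.
green_binding = target_group_binding("dkms-green", "default", "dkms", 8080, "arn:targetgroup/green")
canary_action = weighted_forward_action("arn:targetgroup/blue", "arn:targetgroup/green", 20)
print(json.dumps(green_binding, indent=2))
print(json.dumps(canary_action, indent=2))
```

Because the ALB is shared, each cluster's controller has to bind to an existing target group rather than own the load balancer. This is one source of the operational complexity and re-creation risk listed for this option.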