## EKS Upgrade Progress

| Environment | Region | Cluster Versions |
| ----------- | ------- | ------------------------------------------------------------ |
| Dev | Beijing | Service: v1.33, ArgoCD: v1.33, Testkube: v1.32 |
| Dev | Ningxia | Service: v1.33 |
| QA | Beijing | Service: v1.31 (upgrade paused for now; if needed, this cluster can later be used to create a multi-cluster test cluster) |
| QA | Ningxia | Service: v1.33 |
| Prod | Beijing | Service: v1.30, ArgoCD: v1.33, Testkube: v1.32 |
| Prod | Ningxia | Service: v1.30 |



## EKS Multi-Cluster

### Requirements

1. Ensure no interruption to the DKMS service during EKS upgrades and new version releases.
2. Support blue/green deployments and canary (gradual) releases.



| Solution | Description | Pros | Cons | AWS Recommendation |
| -------------- | ----------- | ---- | ---- | ------------------ |
| Route53 Switch | Two NLBs (each NLB → ALB → EKS); Route53 weighted routing shifts traffic approximately between them | 1. Easy to implement; 2. Clear separation | 1. DNS TTL (60s) delays the switch; 2. SDK client DNS caching may cause failover traffic to Ningxia | Recommended |
| Shared NLB | Two ALBs registered as target groups of the same NLB; switch by binding/unbinding target groups | 1. No Route53 operations; 2. Faster traffic shift | 1. Clients must reconnect | Recommended |
| Shared ALB | One ALB forwards to two EKS clusters by modifying TargetGroupBinding | 1. Supports blue/green deployments and canary (gradual) releases | 1. High operational complexity; 2. Risk of ALB/Ingress re-creation | Not recommended |



### Option 1: Route53 Switch

#### AS-IS (Before Active-Active)

graph TD
  A[Clients] --> B[Route53 - Simple]
  B --> C[NLB - Beijing]
  C --> D[ALB - Beijing]
  D --> E[EKS Cluster - v1]


#### TO-BE (Blue/Green with Route53 Weighted Routing)

graph TD
  A[Clients] --> B[Route53 - Weighted TTL 60s]
  B --> C1[NLB - Beijing] --> D1[ALB - Blue] --> E1[EKS - v1]
  B --> C2[NLB - Beijing] --> D2[ALB - Green] --> E2[EKS - v2]
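
The weighted switch in the TO-BE diagram could be expressed as a pair of weighted records that share one name. The following is a minimal sketch, not the team's actual tooling: the function name `weighted_change_batch`, the record name, and the NLB DNS names are hypothetical, and it assumes CNAME records so that the 60s TTL from the diagram can be set explicitly. It only builds the request payload, which would then be passed as `ChangeBatch` to boto3's `route53.change_resource_record_sets`.

```python
from typing import Any


def weighted_change_batch(
    record_name: str,
    blue_dns: str,
    green_dns: str,
    green_percent: int,
    ttl: int = 60,
) -> dict[str, Any]:
    """Build a Route53 ChangeBatch splitting traffic between two NLBs.

    All names passed in are placeholders chosen by the caller; nothing
    here talks to AWS.
    """
    if not 0 <= green_percent <= 100:
        raise ValueError("green_percent must be between 0 and 100")

    def record(set_id: str, target: str, weight: int) -> dict[str, Any]:
        # One weighted record per NLB; both share the same Name and Type
        # and are told apart by SetIdentifier.
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": set_id,
                "Weight": weight,
                "TTL": ttl,
                "ResourceRecords": [{"Value": target}],
            },
        }

    return {
        "Comment": f"Shift {green_percent}% of traffic to green",
        "Changes": [
            record("blue", blue_dns, 100 - green_percent),
            record("green", green_dns, green_percent),
        ],
    }
```

Calling it with `green_percent=10` yields weights 90/10. Because resolvers and SDK clients may cache answers, the observed split only converges on these weights after the TTL expires.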


------

### Option 2: Shared NLB with Two ALBs

#### AS-IS

graph TD
  A[Clients] --> B[Route53 - Simple]
  B --> C[NLB]
  C --> D[ALB - v1] --> E[EKS Cluster - v1]


#### TO-BE (Switch Target Groups)

graph TD
  A[Clients] --> B[Route53 - Simple]
  B --> C[NLB]
  C --> D1[ALB - Blue] --> E1[EKS - v1]
  C --> D2[ALB - Green] --> E2[EKS - v2]


> 🔁 During upgrade: unbind D1 (ALB - Blue), then bind D2 (ALB - Green)
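
One way to read the unbind/bind step is as repointing the NLB listener's default action from the blue ALB-type target group to the green one. The sketch below takes that interpretation as an assumption; the helper names and ARNs are hypothetical. It only prepares keyword arguments that could be passed to boto3's `elbv2.modify_listener`, and it keeps a rollback request alongside the cutover request.

```python
from typing import Any


def nlb_switch_request(listener_arn: str, target_group_arn: str) -> dict[str, Any]:
    """Keyword arguments pointing an NLB listener at a single target group."""
    return {
        "ListenerArn": listener_arn,
        "DefaultActions": [
            {"Type": "forward", "TargetGroupArn": target_group_arn}
        ],
    }


def plan_cutover(
    listener_arn: str, blue_tg_arn: str, green_tg_arn: str
) -> dict[str, dict[str, Any]]:
    """Prepare the cutover to green and the matching rollback to blue."""
    return {
        "cutover": nlb_switch_request(listener_arn, green_tg_arn),
        "rollback": nlb_switch_request(listener_arn, blue_tg_arn),
    }
```

Preparing both requests up front keeps the rollback a single call. As the comparison table notes, clients may still need to reconnect when the listener target changes.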

------

### Option 3: Shared ALB Across Clusters

#### AS-IS

graph TD
  A[Clients] --> B[Route53 - Simple]
  B --> C[NLB] --> D[ALB] --> E[EKS Cluster - v1]


#### TO-BE (Switch TargetGroupBinding)

graph TD
  A[Clients] --> B[Route53 - Simple]
  B --> C[NLB] --> D[ALB]
  D --> E1[EKS - v1]
  D --> E2[EKS - v2]


> ⚠️ Risk: ALB/Ingress may be recreated when changing bindings
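
A rough sketch of how this option might be wired, under stated assumptions: each cluster's AWS Load Balancer Controller gets a `TargetGroupBinding` that registers its Service into a target group created outside the controller, and the shared ALB listener uses a weighted forward action for canary steps. The function names, resource names, and ARNs are hypothetical. The code only builds a manifest and forward actions as plain dictionaries and does not call Kubernetes or AWS.

```python
from typing import Any


def target_group_binding(
    name: str,
    namespace: str,
    service_name: str,
    service_port: int,
    target_group_arn: str,
) -> dict[str, Any]:
    """Manifest binding a Service in one cluster to an existing target group."""
    return {
        "apiVersion": "elbv2.k8s.aws/v1beta1",
        "kind": "TargetGroupBinding",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "serviceRef": {"name": service_name, "port": service_port},
            "targetGroupARN": target_group_arn,
            "targetType": "ip",  # assumption: pods are registered by IP
        },
    }


def weighted_forward(
    blue_tg_arn: str, green_tg_arn: str, green_weight: int
) -> dict[str, Any]:
    """ALB forward action splitting traffic between two target groups."""
    if not 0 <= green_weight <= 100:
        raise ValueError("green_weight must be between 0 and 100")
    return {
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                {"TargetGroupArn": blue_tg_arn, "Weight": 100 - green_weight},
                {"TargetGroupArn": green_tg_arn, "Weight": green_weight},
            ]
        },
    }


def canary_actions(
    blue_tg_arn: str, green_tg_arn: str, steps: list[int]
) -> list[dict[str, Any]]:
    """One forward action per canary step, e.g. steps=[10, 50, 100]."""
    return [weighted_forward(blue_tg_arn, green_tg_arn, w) for w in steps]
```

Each action from `canary_actions` could be applied in turn as the listener's default action, with a pause for health checks between steps. This covers the canary requirement, but it also shows why the option carries more operational overhead than the other two.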
 
 