

    ## EKS Upgrade Progress

    | Environment | Region | Cluster Versions |
    | ----------- | ------- | ------------------------------------------------------------ |
    | Dev | Beijing | Service: v1.33, ArgoCD: v1.33, Testkube: v1.32 |
    | Dev | Ningxia | Service: v1.33 |
    | QA | Beijing | Service: v1.31 (not upgraded further for now; can be used later to create a multi-cluster test cluster if needed) |
    | QA | Ningxia | Service: v1.33 |
    | Prod | Beijing | Service: v1.30, ArgoCD: v1.33, Testkube: v1.32 |
    | Prod | Ningxia | Service: v1.30 |



    ## EKS Multi-Cluster

    ### Requirements

    1. Ensure no interruption to the DKMS service during EKS upgrades and new version releases.
    2. Support blue/green deployments and canary (gradual) releases.



    | Solution | Description | Pros | Cons | AWS Recommendation |
    | -------- | ----------- | ---- | ---- | ------------------ |
    | Route53 Switch | Two NLBs (NLB → ALB → EKS); use Route53 weighted routing to shift traffic approximately by weight | 1. Easy to implement; 2. Clear separation | 1. DNS TTL (60s) delay; 2. SDK client DNS caching may cause failover traffic to Ningxia | AWS Recommended |
    | Shared NLB | Two ALBs attached as target groups of the same NLB; switch by binding/unbinding target groups | 1. No Route53 ops; 2. Faster traffic shift | 1. Client reconnect | Recommended |
    | Shared ALB | One ALB forwards to two EKS clusters by modifying TargetGroupBinding | 1. Supports blue/green deployments and canary (gradual) releases | 1. High ops complexity; 2. Risk of ALB/Ingress re-creation | Not recommended |



    ### Option 1: Route53 Switch

    #### AS-IS (Before Active-Active)

    graph TD
      A[Clients] --> B[Route53 - Simple]
      B --> C[NLB - Beijing]
      C --> D[ALB - Beijing]
      D --> E[EKS Cluster - v1]


    #### TO-BE (Blue/Green with Route53 Weighted Routing)

    graph TD
      A[Clients] --> B[Route53 - Weighted TTL 60s]
      B --> C1[NLB - Beijing] --> D1[ALB - Blue] --> E1[EKS - v1]
      B --> C2[NLB - Beijing] --> D2[ALB - Green] --> E2[EKS - v2]
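
The weighted switch in the TO-BE diagram can be sketched as a Route53 change batch. This is a minimal sketch, not the team's actual tooling: the record name, the NLB hostnames, and the helper `weighted_change_batch` are hypothetical. The dictionary layout follows the Route53 `ChangeResourceRecordSets` request shape, where `Weight` is an integer from 0 to 255, and the 60-second TTL is the value quoted in the comparison table.

```python
from typing import Any


def weighted_change_batch(
    record_name: str,
    blue_nlb_dns: str,
    green_nlb_dns: str,
    green_weight: int,
    ttl: int = 60,
) -> dict[str, Any]:
    """Build a Route53 ChangeBatch that splits DNS answers between two NLBs.

    green_weight (0-255) goes to the green NLB; blue receives the remainder.
    """
    if not 0 <= green_weight <= 255:
        raise ValueError("Route53 weights must be between 0 and 255")

    def record(set_id: str, target: str, weight: int) -> dict[str, Any]:
        # One weighted CNAME record per NLB, distinguished by SetIdentifier.
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": set_id,
                "Weight": weight,
                "TTL": ttl,
                "ResourceRecords": [{"Value": target}],
            },
        }

    return {
        "Comment": f"blue/green shift: green weight {green_weight}/255",
        "Changes": [
            record("blue", blue_nlb_dns, 255 - green_weight),
            record("green", green_nlb_dns, green_weight),
        ],
    }


# Hypothetical canary step: roughly 10% of DNS answers point at green.
batch = weighted_change_batch(
    "dkms.example.internal.",
    "blue-nlb.example.internal.",
    "green-nlb.example.internal.",
    green_weight=26,
)
```

With boto3, the result would be passed as `ChangeBatch` to `change_resource_record_sets` together with the hosted zone ID. Because the split happens at DNS resolution time, clients that cache answers beyond the TTL keep using the old NLB until they re-resolve, which is the delay listed under Cons.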


    ------

    ### Option 2: Shared NLB with Two ALBs

    #### AS-IS

    graph TD
      A[Clients] --> B[Route53 - Simple]
      B --> C[NLB]
      C --> D[ALB - v1] --> E[EKS Cluster - v1]


    #### TO-BE (Switch Target Groups)

    graph TD
      A[Clients] --> B[Route53 - Simple]
      B --> C[NLB]
      C --> D1[ALB - Blue] --> E1[EKS - v1]
      C --> D2[ALB - Green] --> E2[EKS - v2]


    > 🔁 During upgrade: Unbind D1, bind D2
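
The bind/unbind step can be sketched as a single listener update. This sketch rests on an assumption: it treats "unbind D1, bind D2" as pointing the NLB listener's default forward action at the green ALB-type target group. All ARNs and the helper name `switch_listener_params` are hypothetical placeholders. The returned dictionary matches the keyword arguments of the ELBv2 `ModifyListener` call.

```python
from typing import Any


def switch_listener_params(listener_arn: str, target_group_arn: str) -> dict[str, Any]:
    """Return ModifyListener arguments that forward all traffic to one target group."""
    return {
        "ListenerArn": listener_arn,
        "DefaultActions": [
            {"Type": "forward", "TargetGroupArn": target_group_arn},
        ],
    }


# Hypothetical placeholder ARNs; real values come from the AWS account.
LISTENER = "arn:aws-cn:elasticloadbalancing:cn-north-1:111122223333:listener/net/shared-nlb/abc/def"
BLUE_TG = "arn:aws-cn:elasticloadbalancing:cn-north-1:111122223333:targetgroup/alb-blue/123"
GREEN_TG = "arn:aws-cn:elasticloadbalancing:cn-north-1:111122223333:targetgroup/alb-green/456"

cutover = switch_listener_params(LISTENER, GREEN_TG)   # upgrade: D1 -> D2
rollback = switch_listener_params(LISTENER, BLUE_TG)   # revert: D2 -> D1
```

With boto3 this would be applied as `modify_listener(**cutover)` on an `elbv2` client. Keeping the rollback arguments next to the cutover makes reverting a one-call operation. Connections already established through the blue ALB are expected to need a reconnect, as the Cons column notes.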

    ------

    ### Option 3: Shared ALB Across Clusters

    #### AS-IS

    graph TD
      A[Clients] --> B[Route53 - Simple]
      B --> C[NLB] --> D[ALB] --> E[EKS Cluster - v1]


    #### TO-BE (Switch TargetGroupBinding)

    graph TD
      A[Clients] --> B[Route53 - Simple]
      B --> C[NLB] --> D[ALB]
      D --> E1[EKS - v1]
      D --> E2[EKS - v2]


    > ⚠️ Risk: ALB/Ingress may be recreated when changing bindings
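
The binding switch for this option can be sketched as two pieces of configuration. This is a minimal sketch assuming the AWS Load Balancer Controller's `TargetGroupBinding` custom resource (`elbv2.k8s.aws/v1beta1`) and an ALB forward action with weighted target groups. The service name `dkms`, the port, the namespace, the ARNs, and both helper names are hypothetical.

```python
from typing import Any


def target_group_binding(
    name: str, namespace: str, service: str, port: int, target_group_arn: str
) -> dict[str, Any]:
    """Manifest that registers a cluster's Service pods into an existing target group."""
    return {
        "apiVersion": "elbv2.k8s.aws/v1beta1",
        "kind": "TargetGroupBinding",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "serviceRef": {"name": service, "port": port},
            "targetGroupARN": target_group_arn,
            "targetType": "ip",
        },
    }


def weighted_forward(blue_tg_arn: str, green_tg_arn: str, green_percent: int) -> dict[str, Any]:
    """ALB forward action that sends green_percent of requests to the green cluster."""
    if not 0 <= green_percent <= 100:
        raise ValueError("green_percent must be between 0 and 100")
    return {
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                {"TargetGroupArn": blue_tg_arn, "Weight": 100 - green_percent},
                {"TargetGroupArn": green_tg_arn, "Weight": green_percent},
            ]
        },
    }


# Hypothetical placeholder ARNs, one target group per EKS cluster.
BLUE_TG = "arn:aws-cn:elasticloadbalancing:cn-north-1:111122223333:targetgroup/eks-v1/123"
GREEN_TG = "arn:aws-cn:elasticloadbalancing:cn-north-1:111122223333:targetgroup/eks-v2/456"

green_binding = target_group_binding("dkms-green", "dkms", "dkms", 443, GREEN_TG)
canary_action = weighted_forward(BLUE_TG, GREEN_TG, green_percent=10)
```

In this arrangement each cluster would apply its own binding manifest, while the weights live on the shared ALB's listener. If the ALB is owned by an Ingress in one of the clusters, changes there may re-create it, which is the risk noted above.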