GCP Beginner Level

10 Easy Steps to Migrate Workload to a New GKE Node Pool

Akash Kumar • Published on August 2, 2026

Upgrading underlying host machines or moving workloads to specialized machine types is a routine administrative task in enterprise Kubernetes environments. On Google Kubernetes Engine (GKE), carrying out these migrations without application downtime requires careful coordination of node pools, traffic management, and pod evictions. This step-by-step tutorial shows how to gracefully migrate containerized workloads from the default node pool to a newly provisioned, higher-memory node pool while keeping user-facing downtime to a minimum.

⚡ Key Takeaways

Seamless Node Upgrades: Learn to shift running containers to new VM instances without causing packet loss or service disruptions.
Cordon vs. Drain: Master the two essential Kubernetes node commands that control pod scheduling and eviction safety.
Optimize Resources: Understand how to add high-performance node pools to GKE clusters and retire legacy pools safely.
Automated Rescheduling: See how Kubernetes controllers recreate evicted pods and how the scheduler places them on the schedulable nodes of the new pool.

Why Migrate Workloads to a New GKE Node Pool?

In a production Google Kubernetes Engine (GKE) environment, application workloads rarely remain static. Over time, your compute demands may evolve and call for more CPU cores, GPU accelerators, or high-memory virtual machines. Creating a separate node pool lets you introduce virtual machines with a different hardware profile, different service accounts or access scopes, or a newer node image into your existing cluster. To keep active user requests from failing during this infrastructure shift, administrators use Kubernetes' native scheduling controls to drain the old nodes gracefully, so that pods are recreated on the new node pool while the application stays available.

Prerequisites

To successfully follow along with this hands-on guide, you will need:

An active Google Cloud Platform (GCP) account.
A project configured with billing enabled.
An IAM identity with Owner, Editor, or Kubernetes Engine Admin permissions.
Google Cloud SDK installed locally, or access to the GCP Cloud Shell.

Step 1: Enable the Kubernetes Engine API

Before provisioning any managed resources in Google Cloud, you must ensure the corresponding APIs are enabled. Search for the 'Kubernetes Engine API' in your GCP Console search bar and click 'Enable'. Alternatively, you can enable it instantly via the Cloud Shell using the gcloud command line tool:

gcloud services enable container.googleapis.com
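Before moving on, it can help to confirm that the API really is enabled. The following is a minimal sketch, not part of the original walkthrough: api_enabled is a hypothetical helper name, and the config.name format key is an assumption about the fields gcloud reports for each service.

```shell
# Hypothetical helper: succeeds (exit status 0) only if the given service
# appears in the list of APIs enabled for the active project.
api_enabled() {
  service="$1"
  # --enabled restricts the listing to services that are already enabled;
  # grep -Fxq matches the whole line literally and prints nothing.
  gcloud services list --enabled --format="value(config.name)" \
    | grep -Fxq "$service"
}
```

For example, api_enabled container.googleapis.com || gcloud services enable container.googleapis.com enables the API only when it is missing.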

Step 2: Launch and Authenticate Cloud Shell

Google Cloud Shell provides a pre-configured terminal environment equipped with the necessary CLI utilities, including gcloud and kubectl. Click the Cloud Shell icon in the top-right corner of the GCP console. Cloud Shell sessions are normally already authenticated as the signed-in user; if you are working from a local SDK installation, or Cloud Shell prompts you for credentials, run the authentication command:

gcloud auth login

Step 3: Set Your Target Project, Region, and Zone

To prevent resource creation errors, explicitly declare your active project ID and set the default compute region and zone. Replace [PROJECT_ID] with your actual GCP project identifier:

gcloud config set project [PROJECT_ID]
gcloud config set compute/region us-central1
gcloud config set compute/zone us-central1-a

Verify that your settings are correctly applied by checking the configuration list: gcloud config list.
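The check above can also be scripted so that later steps fail early when a property is missing. This is a sketch under stated assumptions: require_gcloud_property is a hypothetical helper, and it assumes gcloud config get-value prints nothing (or the literal text "(unset)") for a property that has not been set.

```shell
# Hypothetical helper: print "property=value" for a gcloud property,
# or return a non-zero status if the property is not set.
require_gcloud_property() {
  prop="$1"
  # Discard stderr, where gcloud reports unset properties.
  value=$(gcloud config get-value "$prop" 2>/dev/null)
  if [ -z "$value" ] || [ "$value" = "(unset)" ]; then
    echo "gcloud property '$prop' is not set" >&2
    return 1
  fi
  echo "$prop=$value"
}
```

Calling require_gcloud_property project, require_gcloud_property compute/region, and require_gcloud_property compute/zone in turn covers the three values set in this step.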

Step 4: Provision Your Initial GKE Cluster

Start by creating a demo cluster consisting of three standard virtual machines. This cluster will act as the source hosting environment for our initial application workload. The following command creates a cluster named demo-cluster with three worker nodes running on standard e2-medium instances:

gcloud container clusters create demo-cluster --num-nodes=3 --machine-type=e2-medium

This process generally takes between 5 and 10 minutes to bootstrap the control plane and provision the compute nodes.

Step 5: Clone the Application Demo Repository

Retrieve the sample Kubernetes deployment manifest files from our source code repository. Run the following git commands in your terminal to fetch the resources and switch to the project directory:

git clone https://github.com/ShivaniG04/KubernetesWorkloadMigration.git
cd KubernetesWorkloadMigration

Step 6: Deploy the Replicated Application to the Cluster

Deploy the replicated sample web application to your active GKE cluster. Apply the deployment manifest utilizing kubectl:

kubectl apply -f node-pools-deployment.yaml

Confirm that the pods are running and note their host node mappings by appending the wide output flag:

kubectl get pods -o wide

You will observe that all active application pods are currently distributed across the three nodes of your default node pool.
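To make the pod-to-node mapping easier to compare before and after the migration, the wide listing can be condensed into a per-node count. This is an illustrative sketch; pods_per_node is a hypothetical helper name, and it only inspects the current namespace.

```shell
# Hypothetical helper: print "<node>: <pod count>" for every node that
# hosts at least one pod in the current namespace.
pods_per_node() {
  # jsonpath emits one node name per pod; unscheduled pods produce an
  # empty line, which awk 'NF' drops before counting.
  kubectl get pods \
    -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' \
    | awk 'NF' | sort | uniq -c | awk '{print $2 ": " $1}'
}
```

Run pods_per_node now and again after Step 8; the node names in the output should change from default-pool nodes to high-mem-pool nodes.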

Step 7: Provision the High-Performance Target Node Pool

Now, we will introduce a new, higher-memory node pool named high-mem-pool into our cluster. This pool uses e2-highmem-2 instances (2 vCPUs and 16 GB of memory each), which suit memory-intensive workloads better than the e2-medium nodes of the default pool. Execute the pool creation command:

gcloud container node-pools create high-mem-pool --cluster=demo-cluster --machine-type=e2-highmem-2 --num-nodes=3

Once created, list the node pools to verify both pools are attached to your cluster: gcloud container node-pools list --cluster=demo-cluster. Running kubectl get nodes will now display six active nodes in your cluster.
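Before evicting anything from the old pool, it is worth waiting until every node in the new pool reports Ready. A minimal sketch of that check follows; wait_for_pool_ready is a hypothetical helper, and the 300-second default timeout is an arbitrary assumption.

```shell
# Hypothetical helper: block until all nodes in the given node pool
# report the Ready condition, or until the timeout expires.
wait_for_pool_ready() {
  pool="$1"
  timeout="${2:-300s}"
  # GKE labels each node with the name of the pool it belongs to.
  kubectl wait --for=condition=Ready node \
    -l "cloud.google.com/gke-nodepool=${pool}" \
    --timeout="$timeout"
}
```

For this tutorial the call would be wait_for_pool_ready high-mem-pool; a non-zero exit status means at least one node was not Ready in time.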

Step 8: Migrate Workloads to the New Pool (Cordon & Drain)

To safely migrate the workload, we must systematically evacuate the default nodes. This is a two-step process: cordoning and draining.

First, cordon all nodes belonging to the default-pool to mark them as unschedulable. This prevents Kubernetes from placing any new pods on these nodes:

for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool -o=name); do
  kubectl cordon "$node"
done

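Next, drain the cordoned nodes. Draining evicts the running pods so that their controllers recreate them on the newly available nodes in the high-mem-pool. In the command below, --force also deletes pods that are not managed by a controller (these are not recreated), --delete-emptydir-data allows eviction of pods that use emptyDir volumes and discards that data, and --grace-period=10 overrides each pod's own termination grace period with 10 seconds: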

for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool -o=name); do
  kubectl drain --force --ignore-daemonsets --delete-emptydir-data --grace-period=10 "$node"
done
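For workloads that need more care, a more conservative variant of the drain loop may be preferable. The sketch below is an alternative, not the command used above: drain_pool is a hypothetical helper, it omits --force and --grace-period so pods keep their own termination settings, and the 300-second timeout is an assumed value.

```shell
# Hypothetical helper: drain the nodes of a pool one at a time and stop
# at the first failure instead of continuing with the remaining nodes.
drain_pool() {
  pool="$1"
  for node in $(kubectl get nodes -l "cloud.google.com/gke-nodepool=${pool}" -o name); do
    echo "Draining ${node}"
    # Without --force, pods that no controller manages block the drain
    # rather than being deleted permanently.
    if ! kubectl drain --ignore-daemonsets --delete-emptydir-data \
        --timeout=300s "$node"; then
      echo "Drain failed on ${node}; stopping" >&2
      return 1
    fi
  done
}
```

Calling drain_pool default-pool drains the old pool node by node. Stopping on the first error leaves the remaining nodes untouched, which gives you a chance to investigate before more pods are evicted.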

Verify that your pods have rescheduled to the high-memory nodes by running: kubectl get pods -o wide. The NODE column should now list only high-mem-pool nodes. Provided the Deployment runs multiple replicas, the application should have remained available throughout the drain.

Step 9: Retire and Delete the Legacy Node Pool

Once all application pods have successfully rescheduled and are verified healthy on the new node pool, it is safe to decommission the legacy hardware. Deleting the old pool releases the compute resources and helps optimize your GCP spending:

gcloud container node-pools delete default-pool --cluster=demo-cluster

Confirm the deletion by listing the remaining node pools. Your cluster should now run exclusively on the high-memory pool.
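The confirmation can be scripted as well. The following sketch assumes the name field is available to the --format flag of the node pool listing; node_pool_exists is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if the named node pool is still
# attached to the given cluster.
node_pool_exists() {
  cluster="$1"
  pool="$2"
  # value(name) prints one pool name per line; grep matches it exactly.
  gcloud container node-pools list --cluster="$cluster" \
    --format="value(name)" \
    | grep -Fxq "$pool"
}
```

After the deletion, node_pool_exists demo-cluster default-pool should fail, while node_pool_exists demo-cluster high-mem-pool should succeed.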

Step 10: Clean Up Your Cloud Infrastructure

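To avoid incurring ongoing charges on your GCP account, delete the resources once your testing is complete. Deleting the GKE cluster removes the control plane, the node virtual machines, and the pods running on them. Resources created on behalf of workloads, such as load balancers provisioned for Services and persistent disks backing PersistentVolumes, may not be removed automatically, so check for them after the cluster is gone: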

gcloud container clusters delete demo-cluster
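Depending on what the workloads created, some billable resources may outlive the cluster, so a quick look at what remains in the project is a reasonable precaution. The sketch below is an assumption-laden starting point: list_possible_leftovers is a hypothetical helper, the -users:* filter is assumed to select disks that are attached to no instance, and the listing covers the whole project rather than only resources that belonged to demo-cluster.

```shell
# Hypothetical helper: list resources that commonly remain after a
# cluster is deleted. Review the output manually before deleting anything.
list_possible_leftovers() {
  echo "Unattached persistent disks:"
  gcloud compute disks list --filter="-users:*" \
    --format="value(name,zone.basename())"
  echo "Forwarding rules (load balancer front ends):"
  gcloud compute forwarding-rules list \
    --format="value(name,region.basename())"
}
```

Run list_possible_leftovers after the cluster deletion finishes; empty sections mean nothing of these two kinds is left in the project.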

Quick Comparison: Cordon vs. Drain Operations

Operation	Primary Action	Impact on Existing Pods	Impact on New Pods
Cordon	Marks node as unschedulable	Existing pods continue running undisturbed	New pods are blocked from scheduling
Drain	Evicts pods and triggers rescheduling	Gracefully terminates and recreates pods elsewhere	Blocks new scheduling and clears active workloads

❓ Frequently Asked Questions

Does cordoning a node immediately stop the running containers?

No. Cordoning only sets the node's spec.unschedulable field to true, which tells the scheduler not to place new pods there. Containers already running on the node continue to execute undisturbed until the node is explicitly drained or the pods are deleted.
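The cordoned state can be read back from the node object itself. A small sketch, with is_cordoned as a hypothetical helper name:

```shell
# Hypothetical helper: succeeds only if the given node is cordoned.
is_cordoned() {
  # .spec.unschedulable is "true" on a cordoned node and absent
  # (printed as an empty string) on a schedulable one.
  [ "$(kubectl get node "$1" -o jsonpath='{.spec.unschedulable}')" = "true" ]
}
```

For example, is_cordoned NODE_NAME && echo "cordoned" reports the state of a single node, where NODE_NAME is one of the names shown by kubectl get nodes.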

Why must we include the --ignore-daemonsets flag when draining nodes?

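A DaemonSet runs one pod on every eligible node in the cluster (logging agents and monitoring tools are typical examples). These pods are tied to their node, and the DaemonSet controller would recreate them immediately even on a cordoned node, so they cannot be moved elsewhere. Without the --ignore-daemonsets flag, kubectl drain refuses to proceed on any node that runs DaemonSet-managed pods; with it, the drain skips those pods and evicts everything else.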

How do we ensure pods do not experience downtime during migration?

To prevent downtime, ensure your application deployment has a replicas count of two or more, and configure a PodDisruptionBudget. This ensures that when a node is drained, Kubernetes maintains a minimum number of healthy, active pods to handle user traffic.
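A PodDisruptionBudget can be created directly from the command line. The sketch below wraps the kubectl command in a hypothetical helper named create_pdb; the budget name web-pdb and the label app=web used in the example call are placeholders, since the labels in the sample manifest are not shown in this guide.

```shell
# Hypothetical helper: create a PodDisruptionBudget that keeps at least
# <min-available> matching pods running during voluntary disruptions.
create_pdb() {
  name="$1"
  selector="$2"
  min_available="$3"
  # kubectl drain evicts pods through the Eviction API, which honors
  # this budget and delays an eviction that would violate it.
  kubectl create poddisruptionbudget "$name" \
    --selector="$selector" \
    --min-available="$min_available"
}
```

With a Deployment of two or more replicas, create_pdb web-pdb app=web 1 asks Kubernetes to keep at least one matching pod available while nodes are drained. Create the budget before Step 8 so it is in place when the evictions start.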

Can we automate node pool migration dynamically?

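Partly. GKE's node auto-upgrade feature performs the same cordon and drain sequence on each node automatically, and maintenance windows let you control when those upgrades run. Node auto-provisioning addresses a different problem: it creates and deletes node pools based on the resource requests of pending pods. Moving a workload to a pool with a different machine type, as in this guide, still requires the steps above or equivalent automation of your own.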

🎯 Conclusion

Migrating GKE workloads to a new node pool is a core operational skill that lets your application move to different hardware as business needs evolve. By understanding the distinction between cordoning and draining, and by running multiple replicas behind a PodDisruptionBudget, you can move running pods between machine profiles with little or no client impact. Incorporate these practices into your regular infrastructure updates to keep your clusters resilient and highly available.

Written By Akash Kumar

Senior Software Developer

Akash Kumar is a Senior Software Developer with 6+ years of experience as a full stack developer. He specializes in designing and building scalable web applications, optimizing cloud infrastructure, and implementing modern DevOps workflows.
