I have a development Kubernetes cluster deployed with kops on AWS EC2 instances, which I initially set up as an HA architecture with 3 masters and 3 nodes.
Now, to save costs, I would like to turn off 2 of the 3 masters and leave just 1 running.
I tried kubectl drain, but it was ineffective, and simply terminating the node made the cluster connection unstable.
Is there a safe way to remove a Master?
This issue has already been discussed on GitHub in HA to single master migration, and a solution has already been worked out there.
Since etcd-manager was introduced in kops 1.12, the main and events etcd clusters are backed up to S3 (the same bucket used for KOPS_STATE_STORE) automatically and regularly.
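You can confirm that these backups actually exist before changing anything by listing them in S3; a minimal sketch, assuming the AWS CLI is configured and you substitute your own bucket and cluster names:
$ aws s3 ls s3://<kops s3 bucket name>/<cluster name>/backups/etcd/main/
$ aws s3 ls s3://<kops s3 bucket name>/<cluster name>/backups/etcd/events/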
So if your cluster was built with kops 1.12 or newer, the following steps should work:
$ kops edit cluster
In the etcdClusters section, remove the etcdMembers items so that only one instanceGroup remains for both main and events, e.g.:
etcdClusters:
- etcdMembers:
  - instanceGroup: master-ap-southeast-1a
    name: a
  name: main
- etcdMembers:
  - instanceGroup: master-ap-southeast-1a
    name: a
  name: events

$ kops update cluster --yes
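(Side note: kops update cluster and kops rolling-update cluster only print a preview of the pending changes when run without --yes, which is a handy sanity check before the real runs:
$ kops update cluster
$ kops rolling-update cluster
)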
$ kops rolling-update cluster --yes

$ kops delete ig master-xxxxxx-1b
$ kops delete ig master-xxxxxx-1c
This action cannot be undone, and it will delete the 2 master nodes immediately.
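Note that master-xxxxxx-1b and master-xxxxxx-1c above are placeholders; you can list the exact master instance-group names in your cluster before deleting anything:
$ kops get instancegroups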
Now that 2 out of 3 of your master nodes are deleted, the etcd cluster will likely have failed and the kube-api service will be unreachable. It is normal that your kops and kubectl commands no longer work after this step.
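The systemctl commands below (and the rest of this procedure, if you follow it as written) are run on the surviving master node, so SSH into it first. A minimal sketch, assuming the SSH key you provided to kops and a Debian-based kops image (the login user differs per AMI, e.g. ubuntu for Ubuntu images); substitute the master's actual public IP:
$ ssh -i ~/.ssh/id_rsa admin@<master-public-ip>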
$ sudo systemctl stop protokube
$ sudo systemctl stop kubelet

Download the etcd-manager-ctl tool. If you are using a different etcd-manager version, adjust the download link accordingly.
$ wget https://github.com/kopeio/etcd-manager/releases/download/3.0.20190930/etcd-manager-ctl-linux-amd64
$ mv etcd-manager-ctl-linux-amd64 etcd-manager-ctl
$ chmod +x etcd-manager-ctl
$ mv etcd-manager-ctl /usr/local/bin/

Restore the backups from S3. See the official docs.
$ etcd-manager-ctl -backup-store=s3://<kops s3 bucket name>/<cluster name>/backups/etcd/main list-backups
$ etcd-manager-ctl -backup-store=s3://<kops s3 bucket name>/<cluster name>/backups/etcd/main restore-backup 2019-10-16T09:42:37Z-000001
# do the same for events
$ etcd-manager-ctl -backup-store=s3://<kops s3 bucket name>/<cluster name>/backups/etcd/events list-backups
$ etcd-manager-ctl -backup-store=s3://<kops s3 bucket name>/<cluster name>/backups/etcd/events restore-backup 2019-10-16T09:42:37Z-000001
This does not start the restore immediately; you need to restart etcd: kill the related containers and start kubelet.
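A rough sketch of the "kill the related containers" part, assuming Docker is the container runtime on the master (the default for kops clusters of this era); the grep pattern is only illustrative, so check the docker ps output yourself before stopping anything:
$ sudo docker ps | grep etcd
$ sudo docker stop <etcd container IDs from the previous command>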
$ sudo systemctl start kubelet
$ sudo systemctl start protokube

Wait for the restore to finish; then kubectl get nodes and kops validate cluster should work again. If not, you can simply terminate the EC2 instance of the remaining master node in the AWS console; a new master node will be created by the Auto Scaling group, and the etcd cluster will be restored on it.
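For completeness, these are the verification commands mentioned above, run from your workstation once the restore has finished:
$ kubectl get nodes
$ kops validate cluster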