I am trying write a bash script to troubleshoot a kubernetes cluster
I have a kubernetes cluster with one master node and few minions.
I am trying to write a script to troubleshoot the cluster which finally outputs a detailed report with errors(if existed) and successes
Q.) What I really need to know is, what steps should I follow and what components should I be testing/checking during the troubleshoot. I need a list of procedures, steps (maybe in bullet form) on how to troubleshoot a Kubernetes Cluster manually
PS: I don't want to use kubernetes in-built testing mechanism, I need it to be manually tested/troubleshooted.
Any one here can give me a good descriptive mechanism/steps?
You need to check below services on master to confirm that kubernetes is functioning fine
docker should be running
kubelet should be running ( if you run control-plane components in containers )
etcd
kubernetes scheduler
kubernetes controller manager
kubernetes components health ( kubectl get cs )
all services in kube-system should be running ( kubectl get pods -n kube-system )