
Friday, December 30, 2022

Kubernetes Troubleshooting

As DevOps and DevSecOps engineers we work on many microservice-based application architectures, where we need to troubleshoot the Kubernetes cluster at various levels.

You cannot rely on a single place to look for failures. While working on Kubernetes troubleshooting, the problem becomes much easier to understand if we can classify it into one of the following categories.
  1. Application Failure
  2. Master node/ControlPlane Failures
  3. Worker node Failures

Application Failure - Troubleshooting

Here I'm listing these out from my understanding and experience with the practice tests provided by Munshad Mohammad on KodeKloud.
  1. You should know the architecture: how it is deployed, what all its dependencies are, where they are deployed, with what endpoints, and what names are used.
  2. Check that the service 'name' defined matches the name the referring application uses, and also check that the service 'Endpoints' are correctly populated and correctly referenced.
    k -n dev-ns get all
    
  3. Better to check whether the selectors are properly aligned with the architecture design definitions; if they are not, you need to change them (see the selector-check sketch after this list).
    k -n test-ns edit svc mysql-service
    
  4. Identify whether there is any mismatch in the environment values defined in the deployment; cross-check them with the Kubernetes objects they integrate with.
    k -n test-ns describe deploy webapp-mysql
    
    If a value doesn't match, for example the mysql-user value is mismatched, you can change it and the pods will be redeployed automatically.
    k -n test-ns edit deploy webapp-mysql
  5. Also check whether the service NodePort is correctly specified. If it mismatches, replace it with the correct one as per the design.
    k -n test-ns describe service/web-service
    k -n test-ns edit service/web-service # correct the nodePort value
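
A minimal sketch of that selector check (hypothetical object names, reusing the web-service Service and webapp-mysql Deployment from the steps above; adjust to your cluster): compare the Service selector with the Deployment's pod template labels and confirm the Service actually has Endpoints.
    # Print the Service selector
    kubectl -n test-ns get svc web-service -o jsonpath='{.spec.selector}'
    # Print the labels on the Deployment's pod template
    kubectl -n test-ns get deploy webapp-mysql -o jsonpath='{.spec.template.metadata.labels}'
    # If the selector matches the labels, the Service lists pod IPs here
    kubectl -n test-ns get endpoints web-service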
    

Controlplane/Kubernetes Master node Failure - Troubleshooting

  1. Initial analysis starts from the nodes and pods.
    To troubleshoot a controlplane failure, the first thing is to check the status of the nodes in the cluster.
    k get nodes 
    
    They should all be healthy; then go to the next step, which is checking the status of the pods, deployments, services, and replicasets (all) within the namespace where we have the trouble.
    k get po 
    k get all 
    
    Then ensure that the pods belonging to kube-system are in 'Running' status.
  2. Check the Controlplane services
    # Check kube-apiserver
    service kube-apiserver status 
    or 
    systemctl status kube-apiserver 
    
    # Check kube-controller-manager
    service kube-controller-manager status 
    or 
    systemctl status kube-controller-manager
    
    # Check kube-scheduler
    service kube-scheduler status 
    or 
    systemctl status kube-scheduler
    
    # Check kubelet service on the worker nodes 
    service kubelet status 
    or 
    systemctl status kubelet 
    
    # Check kube-proxy service on the worker nodes 
    service kube-proxy status 
    or 
    systemctl status kube-proxy 
    
    # Check the logs of Controlplane components 
    kubectl logs kube-apiserver-master -n kube-system 
    # system level logs 
    journalctl -u kube-apiserver 
    
  3. If there is an issue with the kube-scheduler, then to correct it we need to edit the YAML file present in the default location: `vi /etc/kubernetes/manifests/kube-scheduler.yaml`
    You may also need to check the parameters given for 'command' in the file `/etc/kubernetes/manifests/kube-controller-manager.yaml`. Sometimes the volumeMounts path values are missing or entered incorrectly; if you correct them, the kube-system pods start automatically, as shown in the sketch below.
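    A minimal sketch of verifying that fix, assuming a kubeadm-style setup where the control plane components run as static pods from /etc/kubernetes/manifests:
    # Inspect the hostPath/mountPath values declared in the manifest
    grep -B1 -A2 "mountPath\|hostPath" /etc/kubernetes/manifests/kube-controller-manager.yaml
    # After saving a corrected manifest the kubelet recreates the static pod;
    # watch until the controller-manager pod is back in 'Running' status
    watch kubectl -n kube-system get pods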

Worker Node failure - Troubleshooting

This is mostly around the kubelet service being unable to come up. A broken Kubernetes cluster can be identified by listing your nodes, where a node reports the 'NotReady' state. There can be several reasons, each one a case that needs to be understood, where the kubelet cannot communicate with the master node. Identifying the reason is the main thing here.
  1. Kubelet service not started: There can be many reasons why a worker node fails. One such case is when the CA certificates are rotated on the node; then you need to manually start the kubelet service and validate that it is running on the worker node.
    # To investigate whats going on worker node 
    ssh node01 "service kubelet status"
    ssh node01 "journalctl -u kubelet"
    # To start the kubelet 
    ssh node01 "service kubelet start"
    
    Once started, double-check the kubelet status again; if it shows 'active', then it's fine.
  2. Kubelet config mismatch: The kubelet service fails to come up even when you start it. There could be some config-related issue. In one of the example practice tests, the ca.crt file path was wrongly mentioned. You may need to correct the ca.crt file path on the worker node; in that case you must know where the kubelet config resides: the path is '/var/lib/kubelet/config.yaml'. After correcting the ca.crt path you need to start the kubelet
    service kubelet start 
    and check the kubelet logs using journalctl.
    journalctl -u kubelet -f 
    And ensure that on the controlplane the node list shows node01 in 'Ready' status.
  3. Cluster config mismatch: The config.yaml file could be corrupted, with the master IP or port configured wrongly, or the cluster name, user, or context entered wrongly; that could be the reason the kubelet is unable to communicate with the master node. Compare the configuration available on the master node and the worker node; if you find mismatches, correct them and restart the kubelet (see the sketch after this list).
  4. Finally, check the kubelet status on the worker node, and on the master node check the list of nodes.
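    A minimal sketch of that comparison, assuming a kubeadm-provisioned worker where the kubelet's kubeconfig lives at /etc/kubernetes/kubelet.conf (adjust paths to your setup):
    # Confirm the API server address/port the worker's kubelet points at
    ssh node01 "grep server /etc/kubernetes/kubelet.conf"
    # Compare it with the endpoint the control plane actually serves on
    kubectl cluster-info
    # After correcting any mismatch, restart the kubelet and watch the node become 'Ready'
    ssh node01 "systemctl restart kubelet"
    kubectl get nodes -w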
Enjoy the Kubernetes Administration !!! Have more fun!

Thursday, October 13, 2022

Kubernetes Security - Multiple Cluster with Multiple User Config

Hello Guys! In this post we are going to explore the kubeconfig. This is a special configuration that is part of Kubernetes security. We can configure multiple clusters, and different users can access these Kubernetes clusters. We can also configure users to have access to multiple clusters.

When we start working on a Kubernetes cluster, there is a config file automatically generated for us.

To access a Kubernetes cluster using the certificate files generated for the admin user, the command can be given as follows:
kubectl get pods \
 --server controlplane:6443 \
 --client-key admin.key \
 --client-certificate admin.crt \
 --certificate-authority ca.crt
 
Passing all these TLS details (server, client-key, client-certificate, certificate-authority) in every kubectl command is a tedious process. Instead, we can move this set of TLS certificate files into a config file, which is called a kubeconfig file. The usage is as follows:
kubectl get pods \
  --kubeconfig config
Usually this config file is stored under .kube inside the home directory. If the config file is present in the $HOME/.kube/ location with the file name 'config', it is automatically picked up by the kubectl command while executing.

 

What does a kubeconfig contain?


The kubeconfig file has three sections: clusters, users, and contexts.

The clusters section is used to define multiple Kubernetes clusters, such as environment-wise clusters for development, testing, preprod, and prod, or separate clusters for different organization integrations, or clusters from different cloud providers, for example google-cluster or azure-cluster.

In the users section we can have an admin user, a developer user, and so on. These users may have different privileges on different cluster resources.

Finally, the contexts section ties the above two sections together into a mapping called a context. Here we get to know which user account will be used to access which cluster.

Remember, we are not going to create any new users or configure any kind of user authorization in this kubeconfig. We will be using only the existing users with their existing privileges and defining which user accesses which cluster. This way we don't have to specify the user certificates and server URL in each and every kubectl command we run.
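For example (a hypothetical run against the vybhava-config file defined below), a single flag picks the user/cluster pair instead of repeating all the TLS flags:
kubectl get pods --kubeconfig=vybhava-config --context=research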

The kubeconfig is in YAML format, which basically has the three sections mentioned above.
Kubernetes Configuration with different clusters map to Users


Filename: vybhava-config
apiVersion: v1
kind: Config

clusters:
- name: vybhava-prod-cluster
  cluster:
    certificate-authority: /etc/kubernetes/pki/ca.crt
    server: https://controlplane:6443

- name: vybhava-dev-cluster
  cluster:
    certificate-authority: /etc/kubernetes/pki/ca.crt
    server: https://controlplane:6443

- name: vybhava-gcp-cluster
  cluster:
    certificate-authority: /etc/kubernetes/pki/ca.crt
    server: https://controlplane:6443

- name: vybhava-qa-cluster
  cluster:
    certificate-authority: /etc/kubernetes/pki/ca.crt
    server: https://controlplane:6443

contexts:
- name: operations
  context:
    cluster: vybhava-prod-cluster
    user: kube-admin
    
- name: test-user@vybhava-dev-cluster
  context:
    cluster: vybhava-dev-cluster
    user: test-user

- name: gcp-user@vybhava-gcp-cluster
  context:
    cluster: vybhava-gcp-cluster
    user: gcp-user

- name: test-user@vybhava-prod-cluster
  context:
    cluster: vybhava-prod-cluster
    user: test-user

- name: research
  context:
    cluster: vybhava-qa-cluster
    user: dev-user

users:
- name: kube-admin
  user:
    client-certificate: /etc/kubernetes/pki/users/kube-admin/kube-admin.crt
    client-key: /etc/kubernetes/pki/users/kube-admin/kube-admin.key
- name: test-user
  user:
    client-certificate: /etc/kubernetes/pki/users/test-user/test-user.crt
    client-key: /etc/kubernetes/pki/users/test-user/test-user.key
- name: dev-user
  user:
    client-certificate: /etc/kubernetes/pki/users/dev-user/dev-user.crt
    client-key: /etc/kubernetes/pki/users/dev-user/dev-user.key
- name: gcp-user
  user:
    client-certificate: /etc/kubernetes/pki/users/gcp-user/gcp-user.crt
    client-key: /etc/kubernetes/pki/users/gcp-user/gcp-user.key

current-context: operations

To view the configuration of the current cluster, you must have the config in $HOME/.kube/config.

The content of the Kubernetes configuration can be viewed with the following command:
kubectl config view 
Kubernetes Cluster view when default config used


To view the newly created customized configuration, we need to specify the path of the "vybhava-config" file. Note that here "vybhava-config" is available in the current directory.
kubectl config view --kubeconfig=vybhava-config


Know your Kubernetes cluster 

To check the list of cluster(s) that exist in the default Kubernetes config:
kubectl config get-clusters
To work with your customized config file vybhava-config and list its clusters:
kubectl config get-clusters --kubeconfig=vybhava-config

Knowing about Kubernetes cluster from Kube Config


KubeConfig user details

To check the list of user(s) that exist in the default Kubernetes config:
kubectl config get-users
To work with your customized config file vybhava-config and list its users:
kubectl config get-users --kubeconfig=vybhava-config


KubeConfig getting the users list

KubeConfig Context

Here each context uses a user and a cluster, and each context is identified with a defined name; we can also see the current-context at the end of the configuration.

To find how many contexts exist, first for the default cluster config:
kubectl config get-contexts
To identify which user is configured in the 'operations' context of vybhava-config, we use the same 'get-contexts' option; the mapping is displayed as a table where the CURRENT context is marked with '*' in its column.
kubectl config --kubeconfig=vybhava-config get-contexts
Kubernetes Config getting Contexts using kubectl

Here in the context section we could add a namespace field specific to a project module; for example, the production cluster can be mapped to an HR application that runs within the hr-finance or hr-hirings namespaces.
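A minimal sketch of that, assuming a hypothetical hr-finance namespace already exists on the prod cluster; the context entry would simply gain a namespace field:
- name: operations
  context:
    cluster: vybhava-prod-cluster
    user: kube-admin
    namespace: hr-finance   # kubectl commands run under this context default to hr-finance
The same change can also be made with 'kubectl config set-context operations --namespace=hr-finance --kubeconfig=vybhava-config'.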

Here we have executed all possible choices for fetching the users, clusters, and contexts from the kubeconfig. Now let's try to set the context.
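A quick sketch of switching contexts against the same vybhava-config file (the context names are the ones defined above):
# Show which context is currently active
kubectl config --kubeconfig=vybhava-config current-context
# Switch to the 'research' context (dev-user on vybhava-qa-cluster)
kubectl config --kubeconfig=vybhava-config use-context research
# Confirm the switch
kubectl config --kubeconfig=vybhava-config current-context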


delete user

kubectl config --kubeconfig=vybhava-config get-users
kubectl config --kubeconfig=vybhava-config delete-user test-user
kubectl config --kubeconfig=vybhava-config get-users
Deletion of Users from Config


delete cluster

kubectl config --kubeconfig=vybhava-config get-clusters 
kubectl config --kubeconfig=vybhava-config delete-cluster vybhava-gcp-cluster
kubectl config --kubeconfig=vybhava-config get-clusters 
Kubernetes Cluster deletion from KubeConfig


delete context

kubectl config --kubeconfig=vybhava-config get-contexts 
kubectl config --kubeconfig=vybhava-config delete-context gcp-user@vybhava-gcp-cluster
kubectl config --kubeconfig=vybhava-config get-contexts 
Deletion of Context from Kube Config

The certificate files appear in the cluster's certificate-authority and the user's client certificates. Best practice says that instead of using a relative name like admin.crt we should use the absolute path to the certificate files, such as /etc/kubernetes/pki/ca.crt here. Another way is to use the certificate-authority-data field, whose value is the certificate file content base64-encoded. Since the content is sensitive, we encode it with the base64 command, for example "base64 ca.crt", and Kubernetes understands that automatically.
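A minimal sketch of that alternative, assuming ca.crt is in the current directory:
# Encode the CA certificate as a single base64 line
base64 ca.crt | tr -d '\n'
The resulting string then goes under the cluster's certificate-authority-data field in place of the certificate-authority path.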
