Thursday, July 18, 2019

[Kubernetes] Example commands for checking Kubernetes status and troubleshooting

This post collects example commands for checking Kubernetes status and for troubleshooting. I keep it here for my own reference.

Here are the commonly used commands:

# For minikube

$ minikube start
$ minikube status
$ minikube service hello-minikube --url
$ kubectl run hello-minikube --image=gcr.io/google_containers/echoserver:1.8 --port=8080
$ kubectl expose deployment hello-minikube --type=NodePort

# For K8S

# autocompletion for K8S commands
$ source <(kubectl completion bash)
$ source <(kubectl completion bash | sed s/kubectl/k/g)
$ kubectl cluster-info
$ kubectl create -f xxx.yaml
# show node name
$ kubectl get pods -o wide
# or list all pods and their nodes
$ kubectl get pod -o=custom-columns=NAME:.metadata.name,STATUS:.status.phase,NODE:.spec.nodeName --all-namespaces
$ kubectl get pods --show-labels
# show selected labels as columns
$ kubectl get pods -L creation_method,env
# represent a pod in yaml/json format
$ kubectl get pods <name> -o yaml/json
# "-l" can go with label, '!label', label in (...,...), label notin (...,...)
# or a comma-separated list: label1=xxx,label2=yyy
$ kubectl get po -l <label>=<value>
$ kubectl get endpoints <service_name>
$ kubectl get endpoints -w # watch this command
$ kubectl label node <node_name> <label>=<value>
$ kubectl label po <pod_name> <label>=<value>
$ kubectl annotate pod <pod_name> <desc>
$ kubectl describe pod <pod_name>
$ kubectl explain pods.spec
# replicationcontroller ==> rc
$ kubectl expose rc <rc_name> --type=LoadBalancer --name <service_name>
$ kubectl get replicationcontrollers
$ kubectl scale rc <rc_name> --replicas=3
$ kubectl scale deployment/<your_depl> --replicas=2
$ kubectl logs <pod_name>
$ kubectl logs <pod_name> -c <container_name>
# port-forward(8888) --> pod(8080)
$ kubectl port-forward <pod_name> 8888:8080
$ kubectl get ns
$ kubectl get po --namespace kube-system
$ kubectl get all --all-namespaces
$ kubectl create namespace <name>
$ kubectl delete namespace <name>
# change the namespace via an alias
# (note: single quotes, not backticks — backticks would run the command when the alias is defined)
$ alias kcd='kubectl config set-context $(kubectl config current-context) --namespace'
$ kcd <some-namespace> # use kcd to change namespace
$ kubectl delete po <pod_name>
$ kubectl delete po -l <label>=<value>
$ kubectl delete rc <rc_name> --cascade=false
$ kubectl delete all --all
$ kubectl label pod <pod_name> <label>=<value> --overwrite
$ kubectl edit rc <rc_name> # modify the ReplicationController
$ kubectl run dnsutils --image=tutum/dnsutils --generator=run-pod/v1 --command -- sleep infinity
$ kubectl exec dnsutils nslookup dnsutils
# Check and change the current namespace
$ kubectl config view --minify --output 'jsonpath={..namespace}'
$ kubectl config set-context --current --namespace=default

Check control plane's health status

$ kubectl get componentstatuses
Check logs

$ journalctl -f -u kubelet
$ journalctl -u kubelet.service

Test the cluster's services

$ kubectl exec <pod_name> -- curl -s http://......
$ kubectl exec <pod_name> env

List etcd's /registry content

$ etcdctl get /registry --prefix=true

An example to create a deployment and expose the service

$ kubectl apply -f https://k8s.io/examples/service/access/hello-application.yaml
==>
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  selector:
    matchLabels:
      run: load-balancer-example
  replicas: 2
  template:
    metadata:
      labels:
        run: load-balancer-example
    spec:
      containers:
        - name: hello-world
          image: gcr.io/google-samples/node-hello:1.0
          ports:
            - containerPort: 8080
              protocol: TCP


$ kubectl expose deployment hello-world --type=NodePort --name=example-service
$ kubectl get svc # check the services
NAME              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
example-service   NodePort    10.107.137.152   <none>        8080:30686/TCP   13s
kubernetes        ClusterIP   10.96.0.1        <none>        443/TCP          5d21h

$ curl http://140.96.29.159:30686
Hello Kubernetes!

$ kubectl describe services example-service
Name:                     example-service
Namespace:                default
Labels:                   <none>
Annotations:              <none>
Selector:                 run=load-balancer-example
Type:                     NodePort
IP:                       10.107.137.152
Port:                     <unset>  8080/TCP
TargetPort:               8080/TCP
NodePort:                 <unset>  30686/TCP
Endpoints:                192.168.0.55:8080,192.168.0.56:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

$ kubectl get pods --selector="run=load-balancer-example" --output=wide
NAME                           READY   STATUS    RESTARTS   AGE     IP             NODE            NOMINATED NODE   READINESS GATES
hello-world-6db874c846-fdq7x   1/1     Running   0          2m11s   192.168.0.55   51-0a50338-01   <none>           <none>
hello-world-6db874c846-mp5cl   1/1     Running   0          2m11s   192.168.0.56   51-0a50338-01   <none>           <none>

# Access the user's cluster service via a pod to execute the curl command
$ kubectl exec hello-world-6db874c846-fdq7x -- curl -s http://10.107.137.152:8080

# Access the service information in env
$ kubectl exec hello-world-6db874c846-fdq7x env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=hello-world-6db874c846-fdq7x
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
EXAMPLE_SERVICE_PORT_8080_TCP=tcp://10.107.137.152:8080
EXAMPLE_SERVICE_PORT_8080_TCP_PROTO=tcp
KUBERNETES_SERVICE_PORT=443
EXAMPLE_SERVICE_SERVICE_PORT=8080
EXAMPLE_SERVICE_PORT_8080_TCP_PORT=8080
EXAMPLE_SERVICE_PORT_8080_TCP_ADDR=10.107.137.152
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP_PORT=443
EXAMPLE_SERVICE_SERVICE_HOST=10.107.137.152
KUBERNETES_SERVICE_HOST=10.96.0.1
KUBERNETES_PORT_443_TCP_PROTO=tcp
EXAMPLE_SERVICE_PORT=tcp://10.107.137.152:8080
NPM_CONFIG_LOGLEVEL=info
NODE_VERSION=4.4.2
HOME=/root
The result of the examples:

# 
$ kubectl get all --all-namespaces
NAMESPACE     NAME                                            READY   STATUS    RESTARTS   AGE
kube-system   pod/calico-node-xjdcx                           2/2     Running   0          4d19h
kube-system   pod/coredns-54f68bc9bf-j599f                    1/1     Running   0          4d19h
kube-system   pod/coredns-54f68bc9bf-k9h9x                    1/1     Running   0          4d19h
kube-system   pod/etcd-iss01                                  1/1     Running   0          4d19h
kube-system   pod/kube-apiserver-iss01                        1/1     Running   0          4d19h
kube-system   pod/kube-controller-manager-iss01               1/1     Running   0          4d19h
kube-system   pod/kube-proxy-9wx7z                            1/1     Running   0          4d19h
kube-system   pod/kube-scheduler-iss01                        1/1     Running   0          4d19h
kube-system   pod/nvidia-device-plugin-daemonset-1.12-4w2w4   1/1     Running   0          12m


NAMESPACE     NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
default       service/kubernetes     ClusterIP   10.96.0.1        <none>        443/TCP         4d19h
kube-system   service/calico-typha   ClusterIP   10.101.103.252   <none>        5473/TCP        4d19h
kube-system   service/kube-dns       ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP   4d19h

NAMESPACE     NAME                                                 DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                 AGE
kube-system   daemonset.apps/calico-node                           1         1         1       1            1           beta.kubernetes.io/os=linux   4d19h
kube-system   daemonset.apps/kube-proxy                            1         1         1       1            1           <none>                        4d19h
kube-system   daemonset.apps/nvidia-device-plugin-daemonset-1.12   1         1         1       1            1           <none>                        4d19h

NAMESPACE     NAME                           DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/calico-typha   0         0         0            0           4d19h
kube-system   deployment.apps/coredns        2         2         2            2           4d19h

NAMESPACE     NAME                                     DESIRED   CURRENT   READY   AGE
kube-system   replicaset.apps/calico-typha-db64dbf86   0         0         0       4d19h
kube-system   replicaset.apps/coredns-54f68bc9bf       2         2         2       4d19h

# 
$ kubectl get nodes --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
iss01   Ready    master   4d19h   v1.12.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=iss01,node-role.kubernetes.io/master=

# 
$ kubectl describe node iss01
Name:               iss01
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=iss01
                    node-role.kubernetes.io/master=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 10.129.252.66/24
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Fri, 12 Jul 2019 12:29:14 +0000
Taints:             <none>
Unschedulable:      false
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  OutOfDisk        False   Wed, 17 Jul 2019 08:08:07 +0000   Fri, 12 Jul 2019 12:29:08 +0000   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure   False   Wed, 17 Jul 2019 08:08:07 +0000   Fri, 12 Jul 2019 12:29:08 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Wed, 17 Jul 2019 08:08:07 +0000   Fri, 12 Jul 2019 12:29:08 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Wed, 17 Jul 2019 08:08:07 +0000   Fri, 12 Jul 2019 12:29:08 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Wed, 17 Jul 2019 08:08:07 +0000   Wed, 17 Jul 2019 07:39:44 +0000   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  10.129.252.66
  Hostname:    iss01
Capacity:
 cpu:                40
 ephemeral-storage:  959862832Ki
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             131672424Ki
 nvidia.com/gpu:     2
 pods:               110
Allocatable:
 cpu:                40
 ephemeral-storage:  884609584507
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             131570024Ki
 nvidia.com/gpu:     2
 pods:               110
System Info:
 Machine ID:                 47c0c32e61bb482790b8511c833ccf01
 System UUID:                DB938000-FD74-11E7-8000-E0D55E1A428D
 Boot ID:                    cee7be2b-29b1-4390-95a3-c3c648415ddd
 Kernel Version:             4.15.0-54-generic
 OS Image:                   Ubuntu 18.04.2 LTS
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://18.9.7
 Kubelet Version:            v1.12.3
 Kube-Proxy Version:         v1.12.3
PodCIDR:                     192.168.0.0/24
Non-terminated Pods:         (9 in total)
  Namespace                  Name                                         CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                                         ------------  ----------  ---------------  -------------  ---
  kube-system                calico-node-xjdcx                            250m (0%)     0 (0%)      0 (0%)           0 (0%)         4d19h
  kube-system                coredns-54f68bc9bf-j599f                     100m (0%)     0 (0%)      70Mi (0%)        170Mi (0%)     4d19h
  kube-system                coredns-54f68bc9bf-k9h9x                     100m (0%)     0 (0%)      70Mi (0%)        170Mi (0%)     4d19h
  kube-system                etcd-iss01                                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         4d19h
  kube-system                kube-apiserver-iss01                         250m (0%)     0 (0%)      0 (0%)           0 (0%)         4d19h
  kube-system                kube-controller-manager-iss01                200m (0%)     0 (0%)      0 (0%)           0 (0%)         4d19h
  kube-system                kube-proxy-9wx7z                             0 (0%)        0 (0%)      0 (0%)           0 (0%)         4d19h
  kube-system                kube-scheduler-iss01                         100m (0%)     0 (0%)      0 (0%)           0 (0%)         4d19h
  kube-system                nvidia-device-plugin-daemonset-1.12-4w2w4    0 (0%)        0 (0%)      0 (0%)           0 (0%)         19m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                1 (2%)      0 (0%)
  memory             140Mi (0%)  340Mi (0%)
  ephemeral-storage  0 (0%)      0 (0%)
  nvidia.com/gpu     0           0
Events:
  Type    Reason                   Age   From            Message
  ----    ------                   ----  ----            -------
  Normal  Starting                 28m   kubelet, iss01  Starting kubelet.
  Normal  NodeHasSufficientDisk    28m   kubelet, iss01  Node iss01 status is now: NodeHasSufficientDisk
  Normal  NodeHasSufficientMemory  28m   kubelet, iss01  Node iss01 status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    28m   kubelet, iss01  Node iss01 status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     28m   kubelet, iss01  Node iss01 status is now: NodeHasSufficientPID
  Normal  NodeNotReady             28m   kubelet, iss01  Node iss01 status is now: NodeNotReady
  Normal  NodeAllocatableEnforced  28m   kubelet, iss01  Updated Node Allocatable limit across pods
  Normal  NodeReady                28m   kubelet, iss01  Node iss01 status is now: NodeReady

# 
$ kubectl describe pod/nvidia-device-plugin-daemonset-1.12-4w2w4 -n kube-system
Name:           nvidia-device-plugin-daemonset-1.12-4w2w4
Namespace:      kube-system
Priority:       0
Node:           iss01/10.129.252.66
Start Time:     Wed, 17 Jul 2019 07:48:19 +0000
Labels:         controller-revision-hash=68f76d744
                name=nvidia-device-plugin-ds
                pod-template-generation=1
Annotations:    cni.projectcalico.org/podIP: 192.168.0.4/32
                scheduler.alpha.kubernetes.io/critical-pod:
Status:         Running
IP:             192.168.0.4
Controlled By:  DaemonSet/nvidia-device-plugin-daemonset-1.12
Containers:
  nvidia-device-plugin-ctr:
    Container ID:   docker://a980649056f0812f9c6b1f29217cce0e26a733839e304af71ed59810b465b886
    Image:          nvidia/k8s-device-plugin:1.11
    Image ID:       docker-pullable://nvidia/k8s-device-plugin@sha256:41b3531d338477d26eb1151c15d0bea130d31e690752315a5205d8094439b0a6
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Wed, 17 Jul 2019 07:50:41 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/lib/kubelet/device-plugins from device-plugin (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-nlvjw (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  device-plugin:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/device-plugins
    HostPathType:
  default-token-nlvjw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-nlvjw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     CriticalAddonsOnly
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
                 nvidia.com/gpu:NoSchedule
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  22m                default-scheduler  Successfully assigned kube-system/nvidia-device-plugin-daemonset-1.12-4w2w4 to iss01




No comments: