kubelet snippet

Path                                                  Purpose
/var/lib/kubelet/pods/                                Pod + volume desired state
/var/lib/kubelet/plugins/kubernetes.io/csi/           CSI mount state
/var/lib/kubelet/volumeDevices/ or /volumePlugins/    Volume tracking
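A quick way to look at these directories on a node is sketched below. The function name kubelet_state_report is made up for illustration, and it only covers the first two paths from the table, since the exact location of the third differs between setups.

```shell
# Print each kubelet state directory and how many entries it holds.
# The root defaults to /var/lib/kubelet; pass another root to inspect a copy.
# kubelet_state_report is a hypothetical helper, not a kubelet tool.
kubelet_state_report() {
    root="${1:-/var/lib/kubelet}"
    for sub in pods plugins/kubernetes.io/csi; do
        dir="$root/$sub"
        if [ -d "$dir" ]; then
            # Count only the direct children of the directory.
            count="$(find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')"
            echo "$dir: $count entries"
        else
            echo "$dir: missing"
        fi
    done
}
```

On a node, run it as root with no argument (kubelet_state_report), since /var/lib/kubelet is normally not readable by other users.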

log

journalctl -u kubelet -g "<key-words>"

troubleshooting

Mounting a PVC that is absent

Kubelet runs multiple controllers in parallel:

Loop           Purpose              Trigger
Mount loop     NodePublishVolume    Pod volume
Expand loop    NodeExpandVolume     PVC resize logic
Attach loop    VolumeAttachment     Controller side
  • Even though the PVC object is absent, kubelet thinks: “Volume still required but temporarily unavailable”. Kubelet still holds stale DesiredStateOfWorld (DSW) entries created earlier, so the retry loop continues forever.

  • When mounting the PVC succeeds, kubelet thinks: “Mount succeeded, now check expansion” and calls NodeExpandVolume.

    That code path is mountVolume.NodeExpandVolume → GET PVC, but the PVC is absent, so the GET fails.

  • Kubelet schedules a retry (2m backoff). ⚠️ This retry is NOT CSI-controlled.
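To confirm that a Pod really references a PVC that no longer exists, something like the following can be run from a machine with kubectl access. This is a rough sketch: find_missing_pvcs is a hypothetical helper name, and it assumes claim names contain no whitespace.

```shell
# Print "pod<TAB>claim" for every Pod volume in a namespace whose PVC
# object cannot be fetched. find_missing_pvcs is a hypothetical helper.
find_missing_pvcs() {
    ns="$1"
    # One line per Pod: name, a tab, then its PVC claim names (may be empty).
    kubectl get pods -n "$ns" \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .spec.volumes[*]}{.persistentVolumeClaim.claimName}{" "}{end}{"\n"}{end}' |
    while IFS="$(printf '\t')" read -r pod claims; do
        for claim in $claims; do
            # A failed GET is treated as "PVC absent".
            if ! kubectl get pvc "$claim" -n "$ns" >/dev/null 2>&1 </dev/null; then
                printf '%s\t%s\n' "$pod" "$claim"
            fi
        done
    done
}
```

For example, find_missing_pvcs default lists the affected Pods in the default namespace. A failed GET can also mean missing permissions or an unreachable API server, so treat the output as a hint rather than proof.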

Once kubelet decides the following, the decision stands until the mount succeeds or the DSW is updated:

Pod X needs Volume Y

CSI cannot cancel kubelet's desire.

CSI Return            Result
OK                    Expand loop triggers
NotFound              Mount loop retries
FailedPrecondition    Retries
Aborted               Retries
Canceled              Retries
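The table can be read as a small state machine. The toy model below is not kubelet code: simulate_mount_loop is a made-up function that takes the codes a hypothetical CSI driver returns on successive attempts and prints the reaction the table describes for each one.

```shell
# Toy model of the table above (NOT kubelet code).
# Arguments: the CSI return codes of successive mount attempts.
simulate_mount_loop() {
    attempt=0
    for code in "$@"; do
        attempt=$((attempt + 1))
        case "$code" in
            OK)
                # Only OK ends the retrying.
                echo "attempt $attempt: OK -> mount done, expand loop triggers"
                return 0
                ;;
            NotFound|FailedPrecondition|Aborted|Canceled)
                echo "attempt $attempt: $code -> volume still desired, retry"
                ;;
            *)
                echo "attempt $attempt: $code -> not covered by the table" >&2
                return 2
                ;;
        esac
    done
    echo "still retrying after $attempt attempts"
    return 1
}
```

Calling simulate_mount_loop NotFound Aborted OK prints two retry lines and then the OK line. Without an OK in the sequence the function never reports completion, which mirrors the point that an error code does not make kubelet give up.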

Why kubelet behaves like this (design reality)

Kubernetes assumes:

  • API server may be temporarily unavailable
  • PVC deletion may be delayed
  • Node permissions may be limited

So kubelet never treats “PVC missing” as terminal. That decision is intentional.

If kubelet decides a Pod needs a volume, only kubelet can forget it. CSI can only convince it by saying “OK”.

Version from runtime service failed

E0325 12:15:10.889620 3185179 remote_runtime.go:189] "Version from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
E0325 12:15:10.889663 3185179 kuberuntime_manager.go:226] "Get runtime version failed" err="get remote runtime typed version failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
E0325 12:15:10.889683 3185179 run.go:74] "command failed" err="failed to run Kubelet: failed to create kubelet: get remote runtime typed version failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
  • Resolution

    # Copy the /etc/containerd/config.toml configuration file from a healthy node, then restart containerd
    sudo systemctl restart containerd

    # The kubelet service is brought back up automatically
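Before copying the file, it may help to check why the local config is broken. One common cause of an "unknown service runtime...RuntimeService" error is that the CRI plugin is listed in disabled_plugins. These notes do not confirm that this was the cause here, so the check below is only a guess at a likely culprit. cri_plugin_disabled is a hypothetical helper, and it only handles a single-line disabled_plugins entry.

```shell
# Return 0 if the given containerd config disables the CRI plugin.
# Matches the short name "cri" and the full name "io.containerd.grpc.v1.cri".
# Commented-out lines are ignored; multi-line arrays are not handled.
cri_plugin_disabled() {
    config="${1:-/etc/containerd/config.toml}"
    grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*[".]cri"' "$config"
}
```

If cri_plugin_disabled succeeds on the failing node but not on a healthy one, that difference is a likely explanation for why copying the healthy config fixes the node.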