| Path | Purpose |
|---|---|
| /var/lib/kubelet/pods/ | Pod + volume desired state |
| /var/lib/kubelet/plugins/kubernetes.io/csi/ | CSI mount state |
| /var/lib/kubelet/volumeDevices/ or /volumePlugins/ | Volume tracking |
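
To see which CSI volumes a node still holds under the first path, a small helper along these lines may help. It is a sketch only: the function name is hypothetical, and it assumes the usual layout `<root>/pods/<pod-uid>/volumes/kubernetes.io~csi/<volume-dir>`. These directories are on-disk state, not the in-memory DSW, so treat the output as a hint.

```shell
# Hypothetical helper: print "<pod-uid> <volume-dir>" for each CSI volume
# directory found under a kubelet root (default: /var/lib/kubelet).
# Assumed layout: <root>/pods/<pod-uid>/volumes/kubernetes.io~csi/<volume-dir>
list_csi_volume_dirs() {
  local root="${1:-/var/lib/kubelet}"
  local dir pod_uid
  for dir in "$root"/pods/*/volumes/kubernetes.io~csi/*/; do
    [ -d "$dir" ] || continue          # the glob matched nothing
    dir="${dir%/}"                     # drop the trailing slash
    pod_uid="${dir#"$root"/pods/}"     # strip the leading "<root>/pods/"
    pod_uid="${pod_uid%%/*}"           # keep only the first path component
    echo "$pod_uid ${dir##*/}"
  done
}
```

Running `list_csi_volume_dirs` as root on the node and comparing the volume names with `kubectl get pv` can show directories whose PV or PVC no longer exists.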
log
journalctl -u kubelet -g "<key-words>"
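For volume problems specifically, the search can be narrowed with a filter like the sketch below. The function name and the keyword list are illustrative assumptions, not a complete set of kubelet messages. It reads from stdin so it works on live `journalctl` output or on a saved log file.

```shell
# Hypothetical helper: keep only log lines that mention volume operations.
# The keyword list is an assumption; extend it to match your driver's logs.
filter_volume_lines() {
  grep -E 'MountVolume|UnmountVolume|NodeExpandVolume|NodePublishVolume'
}

# Example (needs a systemd host; not executed here):
#   journalctl -u kubelet --since "1 hour ago" --no-pager | filter_volume_lines
```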
troubleshooting
mounting a PVC that is absent
Kubelet runs multiple controllers in parallel:
| Loop | Purpose | Trigger |
|---|---|---|
| Mount loop | NodePublishVolume | Pod volume |
| Expand loop | NodeExpandVolume | PVC resize logic |
| Attach loop | VolumeAttachment | Controller side |
Even though the PVC object is absent, kubelet thinks: “Volume still required but temporarily unavailable”. Kubelet still has stale DesiredStateOfWorld (DSW) entries created earlier, so the retry loop continues indefinitely.
When the PVC mount succeeds, kubelet thinks: “Mount succeeded, now check expansion”, and calls NodeExpandVolume.
That code path:
mountVolume.NodeExpandVolume → GET PVC. The PVC is absent, but kubelet still schedules a retry (2m backoff). ⚠️ This retry is NOT CSI-controlled.
Once kubelet has decided the following, it keeps retrying until the operation succeeds or the DSW is updated:
Pod X needs Volume Y
CSI cannot cancel kubelet's desired state.
| CSI Return | Result |
|---|---|
| OK | Expand loop triggers |
| NotFound | Mount loop retries |
| FailedPrecondition | Retries |
| Aborted | Retries |
| Canceled | Retries |
Why kubelet behaves like this (design reality)
Kubernetes assumes:
- API server may be temporarily unavailable
- PVC deletion may be delayed
- Node permissions may be limited
So kubelet never treats “PVC missing” as terminal. That decision is intentional.
If kubelet decides a Pod needs a volume, only kubelet can forget it. CSI can only convince it by saying “OK”.
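A toy model of that rule, written only to illustrate these notes. It is not kubelet code: the function name is hypothetical, and passing the CSI return codes as arguments is an assumption made to keep the sketch runnable.

```shell
# Toy model: each argument is the CSI return code for one attempt.
# Only "OK" ends the retrying; every other code leads to another attempt.
reconcile() {
  local attempts=0 code
  for code in "$@"; do
    attempts=$((attempts + 1))
    if [ "$code" = "OK" ]; then
      echo "mounted after $attempts attempt(s)"
      return 0
    fi
  done
  # Ran out of simulated attempts without an OK: the volume is still desired.
  echo "still desired after $attempts attempt(s)"
  return 1
}
```

For example, `reconcile NotFound Aborted OK` stops at the third attempt, while `reconcile NotFound NotFound` ends with the volume still desired. That is the situation the real retry loop stays in.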
Version from runtime service failed
E0325 12:15:10.889620 3185179 remote_runtime.go:189] "Version from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
resolve
Copy the /etc/containerd/config.toml configuration file from a healthy node, then restart containerd:
sudo systemctl restart containerd
The kubelet service will then be brought back up automatically.
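
Backing up the existing file before overwriting it makes the copy step a little safer. The sketch below does that. The function name and the `.bak` suffix are hypothetical, and it assumes the healthy node's config has already been transferred to the broken node.

```shell
# Hypothetical helper: back up the current containerd config, then install
# the copy taken from a healthy node.
#   $1 = config file copied from the healthy node
#   $2 = target path (defaults to /etc/containerd/config.toml)
replace_containerd_config() {
  local src="$1" dst="${2:-/etc/containerd/config.toml}"
  [ -f "$src" ] || { echo "missing source config: $src" >&2; return 1; }
  if [ -f "$dst" ]; then
    cp -p "$dst" "$dst.bak" || return 1   # keep the old file for rollback
  fi
  cp "$src" "$dst"
}

# Example (run as root on the broken node; not executed here):
#   replace_containerd_config /tmp/config.toml.good && systemctl restart containerd
```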