Symptom

This morning I deployed several business applications via ArgoCD. The first two deployed successfully, but subsequent deployments from a third-party source failed consistently, even though the configurations were identical and only the target cluster differed. Why would this happen?

I checked the logs and found the following:

  Warning  Failed     1m                kubelet, 172.16.25.13  Error: Error response from daemon: error creating overlay mount to /var/lib/docker/overlay2/ba37165607862efb350093e5e287207e2547759fd81dc4e5e356a86ac5e28324-init/merged: no space left on device
  Warning  Failed     1m                kubelet, 172.16.25.13  Error: Error response from daemon: error creating overlay mount to /var/lib/docker/overlay2/f69b62f360fc2a94487aca041b08d0929810beab0602e0ec8b90c94b2e893337-init/merged: no space left on device
  Warning  Failed     48s               kubelet, 172.16.25.13  Error: Error response from daemon: error creating overlay mount to /var/lib/docker/overlay2/a8d20a44183b39ae989eee8a442960124ff23844482f726ea7ab39a292aecbb3-init/merged: no space left on device

Solution

  1. Check disk space. The root filesystem is only 22% full, so the disk is not actually out of space:
root@gpu613:~# df -Th /
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/sda2      ext4  1.8T  359G  1.3T  22% /
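  2. After some searching, I found that this error can also be caused by exhausted inotify watches: the kernel reports the same "no space left on device" error (ENOSPC) when the per-user watch limit is reached.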

Check current limit:

# cat /proc/sys/fs/inotify/max_user_watches
8192
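
To judge whether the limit is really being hit, it can help to count the watches each process currently holds. The following is a minimal sketch, assuming a Linux /proc layout in which every watch appears as a line starting with "inotify wd:" in the fdinfo entry of an inotify file descriptor. The function names count_watches and list_watches are made up for illustration, and other users' processes are only visible when the script runs as root.

```shell
#!/bin/sh
# Count the inotify watches held by one process.
# $1: a process directory such as /proc/1234
count_watches() {
    # Each watch is one "inotify wd:..." line in an fdinfo file.
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    cat "$1"/fdinfo/* 2>/dev/null | grep -c '^inotify wd:' || true
}

# Print "watches<TAB>pid<TAB>command" for every process that holds at
# least one watch, largest consumer first.
# $1: proc root (defaults to /proc; a parameter so it can be pointed elsewhere)
list_watches() {
    proc_root=${1:-/proc}
    for dir in "$proc_root"/[0-9]*; do
        [ -d "$dir/fdinfo" ] || continue
        n=$(count_watches "$dir")
        if [ "$n" -gt 0 ]; then
            printf '%s\t%s\t%s\n' "$n" "${dir##*/}" "$(cat "$dir/comm" 2>/dev/null)"
        fi
    done | sort -rn
}
```

Running `list_watches | head -n 10` as root lists the ten largest consumers. If the totals for one user approach the value of max_user_watches, watch exhaustion is a plausible cause.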

Raise the inotify watch limit:

echo "fs.inotify.max_user_watches=100000" >> /etc/sysctl.conf
sysctl -p
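
Appending with `>>` adds a duplicate line every time the command is rerun. One possible alternative, sketched here on the assumption that the system reads drop-in files from /etc/sysctl.d, is to overwrite a dedicated file instead. The file name 99-inotify.conf and the helper write_limit are hypothetical.

```shell
#!/bin/sh
# Write the inotify watch limit to a sysctl drop-in file.
# $1: target file, e.g. /etc/sysctl.d/99-inotify.conf (hypothetical name)
# $2: new limit, e.g. 100000
write_limit() {
    # Overwrite rather than append, so repeated runs leave exactly one line.
    printf 'fs.inotify.max_user_watches = %s\n' "$2" > "$1"
}

# Example usage as root (left commented out so that loading this file
# does not change the system):
#   write_limit /etc/sysctl.d/99-inotify.conf 100000
#   sysctl -p /etc/sysctl.d/99-inotify.conf
#   cat /proc/sys/fs/inotify/max_user_watches
```

Reading the value back from /proc afterwards confirms that the kernel picked up the new limit.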

After applying the change, I re-triggered the ArgoCD sync. This time the deployment was created successfully, which confirms that inotify watch exhaustion was the root cause.