Symptom

In the morning, I deployed several business applications through argocd. The first two deployed fine, but the third deployment failed. The configuration was the same and only the target cluster differed, so how could such a problem occur?

So I checked the logs, which showed the following:

Warning Failed 1m kubelet, 172.16.25.13 Error: Error response from daemon: error creating overlay mount to /var/lib/docker/overlay2/ba37165607862efb350093e5e287207e2547759fd81dc4e5e356a86ac5e28324-init/merged: no space left on device
Warning Failed 1m kubelet, 172.16.25.13 Error: Error response from daemon: error creating overlay mount to /var/lib/docker/overlay2/f69b62f360fc2a94487aca041b08d0929810beab0602e0ec8b90c94b2e893337-init/merged: no space left on device
Warning Failed 48s kubelet, 172.16.25.13 Error: Error response from daemon: error creating overlay mount to /var/lib/docker/overlay2/a8d20a44183b39ae989eee8a442960124ff23844482f726ea7ab39a292aecbb3-init/merged: no space left on device

Solution

  1. Check the disk space: it is not full
root@gpu613:~# df -Th /
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda2 ext4 1.8T 359G 1.3T 22% /
  2. After searching on Google, I found that the error might be caused by inotify watch exhaustion, so check the current limit
# cat /proc/sys/fs/inotify/max_user_watches
8192
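
To judge whether the limit is really being approached, it can help to count the watches currently in use. The following is a minimal sketch, assuming the kernel exposes inotify entries in /proc/<pid>/fdinfo (one line starting with "inotify" per watch). The function name count_inotify_watches and its optional proc-root argument are hypothetical and exist only for illustration. Note that max_user_watches is enforced per user, so the per-process counts are only an approximation of how close a user is to the limit.

```shell
#!/usr/bin/env bash
# Print "<watch count> <pid>" for every process holding inotify watches,
# highest count first. The optional first argument (hypothetical, for
# illustration and testing) overrides the proc root; it defaults to /proc.
count_inotify_watches() {
    local proc_root="${1:-/proc}"
    local pid_dir pid count
    for pid_dir in "$proc_root"/[0-9]*; do
        [ -d "$pid_dir/fdinfo" ] || continue
        pid="${pid_dir##*/}"
        # Each watch appears as one line starting with "inotify" in the
        # fdinfo file of an inotify file descriptor.
        count=$(cat "$pid_dir"/fdinfo/* 2>/dev/null | grep -c '^inotify' || true)
        if [ "$count" -gt 0 ]; then
            printf '%s %s\n' "$count" "$pid"
        fi
    done | sort -rn
}
```

Run as root, `count_inotify_watches | head` lists the heaviest consumers, which can then be compared against the value of max_user_watches shown above.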

Try raising the limit on the number of inotify watches:

echo "fs.inotify.max_user_watches=100000" >> /etc/sysctl.conf
sysctl -p
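
One caveat with appending via `>>` is that running the command again leaves duplicate entries in /etc/sysctl.conf. The sketch below shows one possible way to avoid that. The helper set_sysctl_conf is hypothetical (it is not part of sysctl) and simply rewrites an existing key=value line or appends one if the key is missing.

```shell
#!/usr/bin/env bash
# Set "key=value" in a sysctl-style config file without creating duplicates.
# set_sysctl_conf is a hypothetical helper written for this example.
set_sysctl_conf() {
    local key="$1" value="$2" file="$3"
    # Escape dots so the key is matched literally in the regular expressions.
    local key_re="${key//./\\.}"
    touch "$file"
    if grep -q "^[[:space:]]*${key_re}[[:space:]]*=" "$file"; then
        # Key already present: rewrite that line via a temporary file.
        sed "s|^[[:space:]]*${key_re}[[:space:]]*=.*|${key}=${value}|" "$file" > "${file}.tmp" \
            && mv "${file}.tmp" "$file"
    else
        # Key not present: append a new entry.
        echo "${key}=${value}" >> "$file"
    fi
}
```

For example, `set_sysctl_conf fs.inotify.max_user_watches 100000 /etc/sysctl.conf` followed by `sysctl -p` persists and loads the setting. To change the value only until the next reboot, `sysctl -w fs.inotify.max_user_watches=100000` is enough.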

I then re-ran the argocd sync for the application, and this time the deployment was created successfully. The inotify watch limit was indeed the cause of the problem.