@peter appears to be on to something - it may also be related to running AFP…read on…
While seeing this error in the webui, I ran the following over ssh to find the processes with the most open files (credit: https://stackoverflow.com/questions/21752067/counting-open-files-per-process):
cd /proc; find -maxdepth 1 -type d -name '[0-9]*' -exec bash -c "ls {}/fd/ | wc -l | tr '\n' ' '" \; -printf "fds (PID = %P), command: " -exec bash -c "tr '\0' ' ' < {}/cmdline" \; -exec echo \; | sort -rn | head
…and the output shows gunicorn with lots of open files:
1015 fds (PID = 10247), command: /usr/bin/python /opt/rockstor/bin/gunicorn --bind=127.0.0.1:8000 --pid=/run/gunicorn.pid --workers=2 --log-file=/opt/rockstor/var/log/gunicorn.log --pythonpath=/opt/rockstor/src/rockstor --settings=settings --timeout=120 --graceful-timeout=120 wsgi:application
1015 fds (PID = 10246), command: /usr/bin/python /opt/rockstor/bin/gunicorn --bind=127.0.0.1:8000 --pid=/run/gunicorn.pid --workers=2 --log-file=/opt/rockstor/var/log/gunicorn.log --pythonpath=/opt/rockstor/src/rockstor --settings=settings --timeout=120 --graceful-timeout=120 wsgi:application
91 fds (PID = 3414), command: /usr/libexec/postfix/master -w
37 fds (PID = 1), command: /usr/lib/systemd/systemd --switched-root --system --deserialize 21
37 fds (PID = 10233), command: /usr/bin/python2.7 /opt/rockstor/bin/django ztaskd --noreload --replayfailed -f /opt/rockstor/var/log/ztask.log
29 fds (PID = 32082), command: /usr/sbin/afpd -d -F /etc/netatalk//afp.conf
27 fds (PID = 12817), command: docker-containerd-shim ef43564f0074807acd20eb7b8939a860c029793d0d15ed2ad13b5da6f7545a4b /var/run/docker/libcontainerd/ef43564f0074807acd20eb7b8939a860c029793d0d15ed2ad13b5da6f7545a4b docker-runc
27 fds (PID = 1117), command: /usr/lib/systemd/systemd-journald
26 fds (PID = 12571), command: dockerd --log-driver=journald -s btrfs -g /mnt2/rock-on-service
26 fds (PID = 10218), command: /usr/bin/python /opt/rockstor/bin/supervisord -c /opt/rockstor/etc/supervisord.conf
(gunicorn is a Python HTTP server.)
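If you already know which PID you care about, a quicker spot check (a minimal sketch; substitute your own gunicorn worker PID for 10246) is to count the fd entries under /proc directly and compare them with that process's open-files limit:

# Count open file descriptors for one process (here a gunicorn worker)
ls /proc/10246/fd | wc -l
# Show just the open-files limit line for that process
grep 'Max open files' /proc/10246/limits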
Listing the limits from proc for the gunicorn processes (credit: http://tuxgen.blogspot.com/2014/01/centosrhel-ulimit-and-maximum-number-of.html):
[root@ymtrockstor proc]# cat /proc/10247/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             13412                13412                processes
Max open files            1024                 4096                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       13412                13412                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
[root@ymtrockstor proc]# cat /proc/10246/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             13412                13412                processes
Max open files            1024                 4096                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       13412                13412                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
[root@ymtrockstor proc]#
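Since the soft limit of 1024 open files is what the workers are hitting, one temporary mitigation (a sketch only; it does not fix whatever is leaking the descriptors, it just buys time before the webui breaks again) is to raise the soft limit of the running workers to the existing hard limit with prlimit from util-linux:

# Raise the soft open-files limit to the existing hard limit (4096) for both gunicorn workers
prlimit --pid 10246 --nofile=4096:4096
prlimit --pid 10247 --nofile=4096:4096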
I listed the open files using lsof | grep gunicorn, and I'm seeing thousands of entries like these:
...
gunicorn 10246 root 52u REG 0,40 578 102228 /tmp/afp.conf (deleted)
gunicorn 10246 root 53u REG 0,40 578 102236 /tmp/afp.conf (deleted)
gunicorn 10246 root 54u REG 0,40 578 102238 /tmp/afp.conf (deleted)
gunicorn 10246 root 55u REG 0,40 578 102239 /tmp/afp.conf (deleted)
gunicorn 10246 root 56u REG 0,40 578 102240 /tmp/afp.conf (deleted)
gunicorn 10246 root 57u REG 0,40 578 102250 /tmp/afp.conf (deleted)
gunicorn 10246 root 58u REG 0,40 578 102251 /tmp/afp.conf (deleted)
gunicorn 10246 root 59u REG 0,40 578 102252 /tmp/afp.conf (deleted)
gunicorn 10246 root 60u REG 0,40 578 102254 /tmp/afp.conf (deleted)
gunicorn 10246 root 61u REG 0,40 578 102294 /tmp/afp.conf (deleted)
gunicorn 10246 root 62u REG 0,40 578 102296 /tmp/afp.conf (deleted)
gunicorn 10246 root 63u REG 0,40 578 102297 /tmp/afp.conf (deleted)
gunicorn 10246 root 64u REG 0,40 578 102298 /tmp/afp.conf (deleted)
gunicorn 10246 root 65u REG 0,40 578 102307 /tmp/afp.conf (deleted)
gunicorn 10246 root 66u REG 0,40 578 102308 /tmp/afp.conf (deleted)
gunicorn 10246 root 67u REG 0,40 578 102309 /tmp/afp.conf (deleted)
gunicorn 10246 root 68u REG 0,40 578 102311 /tmp/afp.conf (deleted)
...
I run AFP for Time Machine backups. Indeed, /tmp/afp.conf is not present on the filesystem.
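To see how quickly those leaked descriptors pile up, the deleted /tmp/afp.conf entries can be counted per worker (again assuming PID 10246 from above):

# Count fd symlinks that still point at the deleted temp file
ls -l /proc/10246/fd | grep -c 'afp.conf (deleted)'
# The same count via lsof
lsof -p 10246 | grep -c '/tmp/afp.conf (deleted)'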
Finally, I'm seeing all kinds of “Too many open files” errors in the rockstor log:
[root@ymtrockstor proc]# cat /opt/rockstor/var/log/rockstor.log
[06/Jun/2018 11:37:04] ERROR [storageadmin.middleware:33] Exception while running command(['/sbin/btrfs', 'subvolume', 'list', '-u', '-p', '-q', '/mnt2/pool1']): [Errno 24] Too many open files
Traceback (most recent call last):
File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/core/handlers/base.py", line 132, in get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/views/decorators/csrf.py", line 58, in wrapped_view
return view_func(*args, **kwargs)
File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/views/generic/base.py", line 71, in view
return self.dispatch(request, *args, **kwargs)
File "/opt/rockstor/eggs/djangorestframework-3.1.1-py2.7.egg/rest_framework/views.py", line 452, in dispatch
response = self.handle_exception(exc)
File "/opt/rockstor/eggs/djangorestframework-3.1.1-py2.7.egg/rest_framework/views.py", line 449, in dispatch
response = handler(request, *args, **kwargs)
File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/utils/decorators.py", line 145, in inner
return func(*args, **kwargs)
File "/opt/rockstor/src/rockstor/storageadmin/views/command.py", line 323, in post
import_snapshots(share)
File "/opt/rockstor/src/rockstor/storageadmin/views/share_helpers.py", line 139, in import_snapshots
share.name)
File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 485, in snaps_info
mnt_pt])
File "/opt/rockstor/src/rockstor/system/osi.py", line 107, in run_command
raise Exception(msg)
Exception: Exception while running command(['/sbin/btrfs', 'subvolume', 'list', '-u', '-p', '-q', '/mnt2/pool1']): [Errno 24] Too many open files
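To gauge how long this has been going on, the occurrences can simply be counted in the same log:

grep -c 'Too many open files' /opt/rockstor/var/log/rockstor.log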
Workaround
Restarting the rockstor service (systemctl restart rockstor) restores the webui and closes all those open files.
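To confirm the restart really did release the descriptors, the earlier checks can be repeated against the new worker PIDs (a sketch; the pgrep pattern is just an assumption that matches the gunicorn command line shown above):

systemctl restart rockstor
# fd counts for the new gunicorn processes should be back down to a few dozen
for pid in $(pgrep -f 'bin/gunicorn'); do echo -n "$pid: "; ls /proc/$pid/fd | wc -l; done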