Support server down / GUI crashing

Hi;

I’ve been trying to diagnose some user permission errors that seem to coincide with the GUI crashing.

I tried to submit this via support, but the server was not responding.

The traceback is here:

        Traceback (most recent call last):
          File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
            yield
          File "/opt/rockstor/src/rockstor/storageadmin/views/appliances.py", line 41, in get_queryset
            self._update_hostname()
          File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/utils/decorators.py", line 145, in inner
            return func(*args, **kwargs)
          File "/opt/rockstor/src/rockstor/storageadmin/views/appliances.py", line 48, in _update_hostname
            cur_hostname = gethostname()
          File "/opt/rockstor/src/rockstor/system/osi.py", line 766, in gethostname
            o, e, rc = run_command([HOSTNAMECTL, '--static'])
          File "/opt/rockstor/src/rockstor/system/osi.py", line 113, in run_command
            'Exception while running command({}): {}'.format(cmd, e))
        Exception: Exception while running command(['/usr/bin/hostnamectl', '--static']): [Errno 24] Too many open files

Suggestions?

Thanks,
Steve

@smanley Thanks for the report. On a quick look, based on:

This looks very much like the following unresolved open issue:

There are several links there back to the forum where others have had the same problem. I’ve also added this thread.

Thanks for the quick reply.

I’m having this problem too.

gunicorn is the problem (most open files) according to output from:
lsof | awk 'BEGIN{print "command files"} NR!=1 {a[$1]++}END{for (i in a) if(a[i]>200){printf("%-10s %6.0f\n", i, a[i])}}'
I’ve just updated to the latest version, 3.9.2-11. These errors occur some time after boot, so I’ll see shortly whether it has made a difference.
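The lsof pipeline above counts every lsof row per command, and lsof also lists entries that are not descriptors (cwd, txt, mem), so its totals run high. For comparison, a minimal Python 3 sketch that counts only real descriptors by reading /proc/<pid>/fd might look like this. It assumes a Linux /proc filesystem and root privileges to see other users' processes; the function names are illustrative and are not part of Rockstor.

```python
import os
from collections import Counter


def count_fds_by_command(proc_root="/proc"):
    """Return a Counter mapping command name -> open descriptor count."""
    totals = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        pid_dir = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(pid_dir, "comm")) as f:
                name = f.read().strip()
            # Each entry under fd/ is one open descriptor of that process.
            totals[name] += len(os.listdir(os.path.join(pid_dir, "fd")))
        except OSError:
            continue  # process exited meanwhile, or permission denied
    return totals


def format_report(totals, threshold=200):
    """Format commands above `threshold` in the same layout as the awk output."""
    lines = ["command files"]
    for name, count in totals.most_common():
        if count > threshold:
            lines.append("%-10s %6d" % (name, count))
    return lines
```

Running `print("\n".join(format_report(count_fds_by_command())))` as root gives a per-command table. The names come from /proc/<pid>/comm, so they may not match lsof's COMMAND column exactly.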

@grizzly Thanks for your input and testing on this one, and please do update that issue with your findings. Gunicorn was the last thing updated (by @suman), as detailed in that issue, to try and sort this out, but from the reports it doesn’t look like it’s fixed just yet.

Cheers.

It’s still happening … I updated to 3.9.2-11 and it’s back overnight.

I’m worried that some other issues I’m noticing are related and will be hard to fix.

I’m not sure if this is known / helps, but the problem does not seem to occur as frequently (or at all) if the Rockstor GUI is closed after use.

Still happening in 3.9.2-15, and I’ve only occasionally had the web UI open.

[root@nas ~]# lsof | awk 'BEGIN{print "command files"} NR!=1 {a[$1]++}END{for (i in a) if(a[i]>200){printf("%-10s %6.0f\n", i, a[i])}}'
command files
data-coll 726
smbd 599
gssproxy 264
gunicorn 6717
postgres 825
django 363

Has been up for over 30 days.

Same here. It seems to happen when I am running scrubs and/or balances (scheduled or manually initiated). I can watch the number of open file descriptors gradually increase, and as far as I can see it’s all gunicorn.

For me the impact is also sometimes a total halt of the server, not just the web GUI failing. I usually have to hard reboot the server before the scrubs and/or balances have completed.

I did find, however, that if I close all instances of the web GUI, the increase stops and the count remains at the total it had reached by that point. Scrubs and balances then seem to complete as planned. This matches what smanley mentioned above. In my experience, leaving the GUI open for long periods while other processes are running seems to be the scenario that leads to this issue.
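To put numbers on the "increase stops when the GUI is closed" observation, one option is to sample a single gunicorn process at intervals and compare the growth rate with the GUI open and closed. The following is a rough Python 3 sketch assuming a Linux /proc filesystem; the helper names, the 60-second interval and the number of rounds are arbitrary choices for illustration, not anything Rockstor provides.

```python
import os
import time


def fd_count(pid, proc_root="/proc"):
    """Number of descriptors currently open in one process."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def growth_per_hour(samples):
    """Average descriptor growth per hour from (unix_time, count) pairs."""
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples taken at different times")
    return (c1 - c0) * 3600.0 / (t1 - t0)


def watch(pid, interval=60, rounds=10):
    """Collect `rounds` samples of (time, fd count), `interval` seconds apart."""
    samples = []
    for i in range(rounds):
        if i:
            time.sleep(interval)  # wait between samples, not after the last
        samples.append((time.time(), fd_count(pid)))
    return samples
```

For example, `growth_per_hour(watch(pid))` run once with a GUI tab open and once with all tabs closed should show whether the growth really drops to about zero in the second case. Reading another user's /proc/<pid>/fd generally needs root.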

Yes, I’m having to reboot when this occurs too, which is disruptive. Is the solution not just a matter of increasing the maximum number of open file descriptors?
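On the question of raising the limit: if the descriptors are leaking at a steady rate, a higher limit only postpones the "[Errno 24] Too many open files" error. A small Python 3 sketch of that arithmetic follows, using the standard `resource` module to read the limits. The helper names and the numbers in the usage note are hypothetical, and note that `getrlimit` reports the limit of the process running the script, not gunicorn's; gunicorn's own limit can be read from /proc/<pid>/limits.

```python
import resource


def nofile_limits():
    """Return (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def hours_until_exhausted(current, soft_limit, leak_per_hour):
    """Hours left before `current` reaches `soft_limit` at a steady leak."""
    if leak_per_hour <= 0:
        return float("inf")  # no growth, so the limit is never reached
    return max(soft_limit - current, 0) / float(leak_per_hour)
```

With made-up figures of 500 descriptors open and a leak of 100 per hour, a limit of 1024 lasts `hours_until_exhausted(500, 1024, 100)` hours and a limit of 4096 lasts `hours_until_exhausted(500, 4096, 100)` hours. The failure arrives later but still arrives unless the leak itself is fixed.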