File Errors on OpenSuse Release 4.0.4

magicalyak · October 20, 2020, 7:57pm

This started with 4.0.4, I think: I’m getting a lot of share errors in the UI, and I haven’t debugged much yet. I thought it was quotas, so I turned them all off and rebooted. Things worked for a few hours, then the errors came back.

##### Houston, we’ve had a problem.

Unknown internal error doing a GET to /api/shares?page=1&format=json&page_size=32000&count=&sortby=name

What logs can I share on this?

phillxnet · October 21, 2020, 11:39am

@magicalyak Hello again.
Re:

4.0.3 to 4.0.4 only changed how we authenticate to the Stable channel, see:

So I think this may be unrelated. Have you tried a reboot since the upgrade? It may be that you have part old and part new code running, although an update by the suggested means (last fix workaround for Stable) or via the Web-UI should avoid this happening. Let us know how you get on. And journalctl logs from the time of the failed request may be informative. Do you possibly have a lot of shares, i.e. > 9,000? We had a recent change to getting 32,000 of ‘things’ rather than the prior 9,000 in 4.0.2-0:

And does it take, say, a minute and a half before this message appears, by any chance? More info on that front would be good.

Thanks for the report though. We haven’t had any similar reports, at least not just yet. Fingers crossed.

And incidentally, as you don’t say which version you upgraded from, our even earlier release candidate was 4.0.1-0, which only changed the syntactic formatting of the Python code:

Also, I’ve assumed here that you are running an install from the DIY installer, or a hand-configured Leap 15.2 as per the dev notes. We are no longer CentOS compatible with regard to sub-volume parsing.

Hope that helps.

magicalyak · October 21, 2020, 12:32pm

DIY installer
4.0.4

55 shares

definitely rebooted a few times
I think something is timing out
I have killed off docker processes and it seems to alleviate the condition. It’s odd because I don’t recall this happening before. I may need to play with nice/renice to figure out what is contending (SUSE is a bit new to me, while CentOS wasn’t). I think this error may just be me, but it does seem docker can break the Rockstor side through contention.
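Before reaching for renice, it can help to see which commands are actually using the CPU. The following is only a rough sketch, not something from the Rockstor docs: `cpu_by_command` is a made-up helper name, and it assumes the procps `ps` that ships with Leap. It sums %CPU per command name, so many small docker-related processes show up as a single line.

```shell
# Hypothetical helper (name is an assumption, not a Rockstor tool).
# stdin: lines of "<%cpu> <command>", e.g. from `ps -eo pcpu=,comm=`.
# stdout: total %CPU per command name, busiest first.
# Note: only the first word of a command name is used as the key.
cpu_by_command() {
    awk '{ cpu[$2] += $1 }
         END { for (c in cpu) printf "%.1f %s\n", cpu[c], c }' | sort -rn
}

# Possible usage on the affected machine:
#   ps -eo pcpu=,comm= | cpu_by_command | head
```

`ps` reports %CPU averaged over each process's lifetime, so a short burst of contention may not show up here; running it while the Web-UI error is actually happening is likely to be more telling.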

Flox · October 21, 2020, 1:03pm

I was going to ask you about that… The few times we’ve seen this “Unknown internal error”, it was related to something else bogging the system down, leading to timeouts on some Rockstor operations.

Your mention of docker might be on point here and I wonder if you are in a similar situation as previously reported below:

As a result, you could try having a look at how many subvolumes you have:

btrfs subvolume list /mnt2/<your-pool-name-here>/ | wc -l

The Btrfs usage report is also always a good thing to check, even if it’s unlikely to reveal something wrong in your case:

btrfs fi usage /mnt2/<your-pool-name-here>
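If the subvolume count turns out to be large, a rough way to see how much of it comes from docker is to split the list by path. This is just a sketch under assumptions: `summarize_subvols` is a made-up helper name, and it assumes docker is using its btrfs storage driver, which keeps image layers as subvolumes under a `btrfs/subvolumes/` directory.

```shell
# Hypothetical helper (name is an assumption, not a btrfs or Rockstor tool).
# stdin: output of `btrfs subvolume list <pool-mount-point>`.
# stdout: total subvolume count and how many sit under btrfs/subvolumes/
# (assumed to be docker image layers).
summarize_subvols() {
    awk '{ total++ }
         /btrfs\/subvolumes\// { docker++ }
         END { printf "total=%d docker_layers=%d\n", total, docker }'
}

# Possible usage, with the pool mount point filled in:
#   btrfs subvolume list /mnt2/your-pool-name-here | summarize_subvols
```

A docker_layers figure that is most of the total would point at image layers as the source of the subvolumes.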

Hope this helps,

magicalyak · October 21, 2020, 5:23pm

Hijacking the thread a bit, but is there a way to add a btrfs volume manually, or a workaround?

I have FusionIO cards, and Rockstor says they can’t be used because it thinks they don’t have real serials (I see UUIDs though). I’m not sure if there is a way to enter this manually so Rockstor’s storageadmin can use them, but I could also do a one-off and set up the btrfs side manually.

Is there a way to do this so I don’t break what’s there?