Unresponsive system

@shocker Hello again.

Re:

97% is way beyond safe I would say, and yes, 80% plus is where you need to start addressing space. Btrfs is similarly affected by lack of 'breathing room' as it's easier and quicker to make a new chunk and populate it than to find every free block within existing chunks. That's one of the benefits of a balance: it can free up whole new chunks rather than having to use all the little bits left over from partly used chunks. And each chunk is normally 1 GB. But then, as I understand it, a balance will start by trying to create new free chunks and shuffling blocks into that fresh chunk. I'm afraid I'm a little weak on knowledge at this level though.
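
If it helps, a rough way to see how much of that 97% is allocated-but-mostly-empty chunk space, and to claw some of it back, is something like the following (the /mnt2/pool-name path is just an example, substitute your own pool's mount point):

btrfs filesystem usage /mnt2/pool-name                       # allocated vs actually used, per chunk type
btrfs balance start -dusage=50 -musage=50 /mnt2/pool-name    # re-pack only chunks that are <=50% full

The usage filters stop the balance from rewriting everything; start lower (e.g. -dusage=10) and work upwards if space is really tight.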

Interesting. But bear in mind that the btrfs stuff in that kernel is less 'curated', so you will truly be on the cutting edge. But if that is what you need. Whereas the openSUSE/SUSE team cherry-pick more for the backports in the Leap releases. But as long as you understand that then great.

To your question re repos: it's actually a little better on that front as it goes, given that 15.2 is in beta and so has no shells repo yet. Whereas Tumbleweed is 'ever present' so you can use the 'native' repos. So really it's just a change to the testing repo for the rockstor rpm, and using the Tumbleweed repo from OBS for the shells. But do make sure to honour the priority on that shells repo, as there is more in there than just the shellinabox that we need as a dependency.

Shells repo:

has details for Leap 15.1 (15.2 to be added when available) and already has the commands for adding the Tumbleweed shells repo.

Rockstor Testing repo (early-adopters/developers only)

has for some time had the Tumbleweed rpm repo info, as well as how to import the Rockstor key.

Note that I've now just updated the latter to include Leap 15.2 beta repo instructions (a bit late for you :slight_smile:) and our intention to use this for our next Stable release. I've also, hopefully, made it clearer and quicker to follow. Let me know if I've missed anything as you try it out.
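
For reference, adding a repo with an explicit priority is just a zypper one-liner; the URL below is a placeholder and the priority value is only an example, so take the exact values from the docs linked above:

zypper addrepo --refresh --priority 105 <shells-repo-URL> shells
zypper refresh

In zypper a larger number means a lower priority (default is 99), so packages from the shells repo only get pulled in when nothing else provides them, i.e. just the shellinabox dependency we need.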

So good luck, but do remember that the Tumbleweed kernel is, as far as I'm aware, pretty much mainline with a few relatively minor openSUSE/SUSE patches to enable things like boot-to-snapshot, so it is far less tested/curated than Leap's. But its 'newer tech' may be of use to you. Just remember that it is more common for corruption bugs to be released there, as at that age far fewer folks have actually run those kernels. However, if you end up having to resort to the btrfs mailing list for your issues, you will be in a far more advantageous position than if you are not running a cutting-edge kernel. In that regard it's a fantastic resource for us and enables those in need to run these very new kernels.

Thanks
I'll use Tumbleweed just to try to recover. Afterwards I'll do a fresh reinstall with Leap 15.2.
As of now I do have a full backup, but unfortunately the recovery will take a week due to the huge size of the backup.

I'll try Tumbleweed with or without Rockstor as, to be honest, the only thing I'm using is an nfs server and that's it :slight_smile:

@shocker

OK, keep us posted on any progress / performance reports.

Tumbleweed is now installed. I changed the repo for rockstor during installation but it seems that it is not working. The Web GUI is giving me an Internal Server Error.
I also reinstalled rockstor just to ensure that everything is ok.

rockstor.log is blank.

supervisord.log:
2020-04-21 09:33:32,784 INFO spawned: ‘data-collector’ with pid 8036
2020-04-21 09:33:33,338 INFO success: ztask-daemon entered RUNNING state, process has stayed up for > than 2 seconds (startsecs)
2020-04-21 09:33:33,496 INFO exited: data-collector (exit status 1; not expected)
2020-04-21 09:33:35,500 INFO spawned: ‘data-collector’ with pid 8045
2020-04-21 09:33:36,186 INFO success: nginx entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2020-04-21 09:33:36,186 INFO success: gunicorn entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2020-04-21 09:33:36,213 INFO exited: data-collector (exit status 1; not expected)
2020-04-21 09:33:39,832 INFO spawned: ‘data-collector’ with pid 8054
2020-04-21 09:33:40,583 INFO exited: data-collector (exit status 1; not expected)
2020-04-21 09:33:41,585 INFO gave up: data-collector entered FATAL state, too many start retries too quickly

@shocker

This suggests you are doing an in-place distro transition. Is that the case? All rockstor rpms are built on, and subsequently installed into, their respective distros prior to release, and for the testing channel are at least tested through the initial setup screen and on to the dashboard and update screen. And given each rpm is built 'natively' there can be subtle differences. So if you did a transition, the rpm re-install may not have enforced all file changes. As a result of the above build procedure I have the following:

tw-20200417

However that was updated post the rpm build/install, but I'm pretty sure it was built and freshly installed on a 20200413 TW version I think.

Try a fresh install via:

systemctl stop rockstor rockstor-pre rockstor-bootstrap   # stop all the Rockstor services first
zypper remove rockstor                                    # remove the installed rpm
rm -rf /opt/rockstor/                                     # wipe any left-over install files
zypper install rockstor                                   # fresh install from the configured repo
systemctl start rockstor                                  # then start Rockstor again

and if you want to use the same admin user again, as the above will initiate a fresh initial setup screen, you will also need to:

userdel old-admin-user

That way the setup screen won’t complain about that user already existing on the system.

Then visit the Web-UI and do a fresh setup sequence.

All of the above assumes no prior source builds, which should also be wiped if they exist. Just in case.

That is obviously a full config reset, but if you have been doing live distro migrations then it's probably for the best, as you then guarantee a complete set of natively built support files. The above procedure has just worked on my freshly (yesterday) 'zypper dup'ed 20200417 install, using the indicated latest published testing rpm for Tumbleweed.

See how that goes. Otherwise some pointers/investigation at your end is a possible next step as I don’t see the same behaviour here.

Hope that helps.

Tried it, no improvement. It seems that something went wrong with the self-generated SSL certificate and that was the root cause.
I have installed Tumbleweed from scratch and it's working now :slight_smile:


@shocker Thanks for the update and glad you're now up and running.

Re:

So that’s a new one. We will have to keep an eye on that.

So were you doing in-place distro transitions, ie just changing repos and moving from Leap15.1 to Leap15.2beta and then to Tumbleweed?

Yes, Leap 15.1 -> Leap 15.2 Beta (no issues) -> Tumbleweed latest snapshot.
During the upgrade I have changed the Rockstor repo from /leap/15.2/ to /tumbleweed/ and during the upgrade there was 1 package from Rockstor, so it worked.

Also, in nginx.conf you need to remove "ssl on;" and instead add ssl to the "listen 443 default_server" line, i.e. something like "listen 443 default_server ssl;" ( /opt/rockstor/etc/nginx/nginx.conf ).
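
In other words, the change in /opt/rockstor/etc/nginx/nginx.conf looks roughly like this (just the affected lines; the rest of the server block stays as it is):

# before (deprecated/removed in newer nginx):
ssl on;
listen 443 default_server;

# after:
listen 443 default_server ssl;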

tail -f supervisord_data-collector_stderr.log

from .exceptions import (

File "/opt/rockstor/eggs/gevent-1.1.2-py2.7-linux-x86_64.egg/gevent/builtins.py", line 93, in __import__
result = _import(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/urllib3/exceptions.py", line 2, in <module>
from .packages.six.moves.http_client import IncompleteRead as httplib_IncompleteRead
File "/opt/rockstor/eggs/gevent-1.1.2-py2.7-linux-x86_64.egg/gevent/builtins.py", line 93, in __import__
result = _import(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/urllib3/packages/__init__.py", line 3, in <module>
from . import ssl_match_hostname
ImportError: cannot import name ssl_match_hostname

tail -f gunicorn.log

import urllib3

File "/usr/lib/python2.7/site-packages/urllib3/__init__.py", line 7, in <module>
from .connectionpool import HTTPConnectionPool, HTTPSConnectionPool, connection_from_url
File "/usr/lib/python2.7/site-packages/urllib3/connectionpool.py", line 11, in <module>
from .exceptions import (
File "/usr/lib/python2.7/site-packages/urllib3/exceptions.py", line 2, in <module>
from .packages.six.moves.http_client import IncompleteRead as httplib_IncompleteRead
File "/usr/lib/python2.7/site-packages/urllib3/packages/__init__.py", line 3, in <module>
from . import ssl_match_hostname
ImportError: cannot import name ssl_match_hostname

Hope that helps.

P.S. Why isn't certbot integrated by default to generate a cert for the SSL in case a valid domain is used? There could be something like:
if the domain resolves to this host's IP, then use certbot to generate the SSL cert for nginx
else use the Rockstor default self-signed cert
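
A rough sketch of that idea in shell; DOMAIN is a placeholder and looking up the public IP via ifconfig.me is just one of several ways, so treat this as an illustration rather than how Rockstor would actually do it:

DOMAIN=nas.example.com
MYIP=$(curl -s https://ifconfig.me)                        # one way to find this host's public IP
if [ "$(dig +short "$DOMAIN" | tail -n1)" = "$MYIP" ]; then
    certbot certonly --standalone -d "$DOMAIN"             # needs port 80 free for the HTTP-01 challenge
else
    echo "domain does not resolve to this host, keeping the self-signed cert"
fi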

I'm playing with balance to check the performance. One interesting thing is that it is using 100% of only one CPU core.
Does btrfs know about multi-threading? Or is it something that I need to activate?

@shocker Re:

Thanks for the feedback, much appreciated.

That's a tricky one. I'd first like to establish an opt-in utility for Let's Encrypt because, as always, there are 'details'. Take a look at the following issue we have open for this

and that's just the Let's Encrypt part:

And of course not all Let's Encrypt mechanisms are available to all folks, hence my gravitating towards a dedicated port 80 solution in that thread.

Pull requests are always welcome if there are any takers on this. And of course it would be great to have, but for many folks it's just not going to work unless they also have port forwarding enabled (if on a private network), or they can only use DNS auth, or whatever. So maybe this is something folks could discuss in a dedicated thread, using that issue as a base for where we are up to. And yes, once we have an established robust system we can maybe effectively integrate it, but I don't quite see it being that easy just yet. Too many variables and the like, and folks need to be able to get into their fresh install as soon and as easily as possible. If you fancy starting a thread with some ideas and a link to that GitHub issue, we can see if there is interest/takers on getting this sorted.

In parts very much so, in others very much not so. And remember you are also using the youngest variant via the parity raids. It's actually quite a mix, from what little I've heard of the architecture at this level. Again, these architectural questions are better asked of the linux-btrfs mailing list, or of other btrfs-specialised mailing lists / forums. The Rockstor dev team is very much a user of this software; we don't as yet have the capability to contribute back to it. Although I've made a trivial doc contribution (not yet merged), I believe a forum member has had a successful doc PR merged, and Suman made some contributions to the wiki. But nothing on the programming front that I know of. It's super highly specialised, and (usually) entirely non-trivial.

Out of curiosity, why are you doing a balance prior to returning the pool to a healthy state, i.e. a non-degraded mount? Or is the 'balance' you are referencing here an 'internal' one initiated by a missing device delete command? When the pool is mounted degraded it is not representative of its usual state: that of not being mounted degraded. So best to get it mounting regularly before you do anything else. How many missing devices does this pool now have?
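
If it helps, a quick way to answer that last question and confirm the pool's current state (the pool mount point is just an example):

btrfs filesystem show                    # any 'missing' devices are listed per pool
findmnt /mnt2/pool-name                  # check whether 'degraded' appears in the mount options
btrfs device usage /mnt2/pool-name       # per-device allocation; missing devices show up here too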

Shame about all those unused cpu cores; alas, the future is as-yet not here apparently. Do keep us posted as this is turning into quite a journal. And does your use case, post restoring this pool, allow for say a raid10 migration? That, along with some of the usual tweaks, i.e. quotas off, noatime, and space_cache=v2, looks to be the best all-round performer. But it only has single-drive redundancy unfortunately. Otherwise it's something like the emerging mix of parity raid for data and raid1c3/c4 for metadata. Not within Rockstor's capability yet, and still also too young for my liking, but still. Another note on the performance front is some more recent tweaks where you can have metadata favour non-rotational drives. But again not yet released, though looking promising. Can't find the linux-btrfs mailing list entry for it currently. However, as we are now moving to an upstream supported kernel these goodies should trickle down to us as and when.
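
For reference, those tweaks boil down to something like the following if done by hand (pool path is an example; on a Rockstor-managed pool quotas and mount options are better changed via the Web-UI where it allows):

btrfs quota disable /mnt2/pool-name                       # quotas off
# add noatime,space_cache=v2 to the pool's mount options (fstab or the Web-UI field)
# and, once the pool is healthy again, to convert it to raid10:
btrfs balance start -dconvert=raid10 -mconvert=raid10 /mnt2/pool-name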

Also, have you taken a look at the default performance comparisons of the btrfs raid levels done earlier this year as part of an article by Phoronix: Linux 5.5 SSD RAID 0/1/5/6/10 Benchmarks Of Btrfs / EXT4 / F2FS / XFS - Phoronix. Might be interesting given your trials. It's all apples and oranges with the other file systems listed, as they are using mdraid, but interesting to see the difference across the btrfs raid levels for each type of load. Plus they are all ssd drives, so not representative of spinning rust. So there's that.

“Btrfs was tested with its native RAID capabilities while the other file-systems were using MD RAID. Each file-system was run with its default mount options.”

Popping in here given this thread has quite a few performance related elements now.

Hope that helps.


I think to start this can be a checkbox that can only be activated if your hostname resolves to your IP. I know that it will not be for everyone, but at least for those with public IPs.
Then it can evolve to have an mDNS mechanism behind it, with port forwarding etc. :slight_smile:
It's just an idea to make it more fancy :slight_smile:

I have subscribed to btrfs-linux and asked the question there :slight_smile:

I remember reading something similar a few years ago, but it seems that this one is updated. I'll give it a go, thanks :slight_smile:

Currently, I'm trying to understand if going to Ceph would be a solution going forward for my raid6 system. As of now I only have two storage devices; not sure if that makes sense, but I'm seeing that it's very flexible (that's why I started with BTRFS over ZFS): you can add/remove any HDD, add another one with another size, etc., whereas ZFS is quite hard to manage. If you get stuck, for example, with 60 old HDDs in one vdev you cannot upgrade them to newer HDDs with a bigger capacity etc. BTRFS was exactly what I was looking for given my needs, and performance is great except when you have an HDD failure and troubleshooting becomes a hobby :slight_smile:

@shocker Re:

Yes, the cluster file systems are very interesting, and particularly nice for large file systems requiring varying/flexible levels of redundancy. I have ambitions to integrate an easy Web-UI initiated Gluster/GlusterFS system (https://www.gluster.org/) within Rockstor, as I like the idea of each 'node' still being, independently, a file system where you can still read files 'regular like'. This, I had planned, could be run either intra-Rockstor across multiple pools within a single Rockstor instance, or, for the more discerning redundancy lovers, across Rockstor instances, i.e. inter-Rockstor. Our existing replication system already has some inter-instance communication, so we may be able to build on that to aid auto-config of such a thing. And Samba apparently has the capability to work with Gluster to pick from available nodes, to effectively do failover or allow for 'downing' an entire Rockstor Gluster node for maintenance/upgrade etc. But alas I have my hands quite full with our distro migration and then our technical debt. But it's definitely something I've been looking into and fancy seeing to fruition during my maintainership of Rockstor. And I really like what I've seen of Gluster so far. Plus it seems a natural extension to the redundant storage problem.

Just a thought.

Forgot about Gluster; I remember I played with this one in 2010 :slight_smile: but this is just a brick over an existing FS if I'm remembering correctly? You cannot have raw r/w on the HDD with it to create a cluster raid as with Ceph? Or have they evolved? I'll check it to see what's new. Last time it was something extremely easy to deploy, 2-3 command lines :slight_smile:

New update.
I waited for more updates on Tumbleweed and tried the device delete missing again yesterday. I have the same situation: it is taking ages and r/w performance is ~95% degraded.

Going forward I think I have 3 options (all of them imply migrating the data over and destroying the pool, and I already have 7 shiny new 16TB HDDs to start with):

  1. Create small BTRFS Raid5 pools (4-5 HDDs per pool), i.e. Raid5: 7 x 8TB, Raid5: 7 x 8TB, Raid5: 5 x 14TB, Raid5: 7 x 16TB (just to have small pools that recover faster and to not have my entire storage degraded). Then make a union pool with BeeGFS (Gluster is way too slow with parallel tasks; MergerFS - never tested). I don't know if I'll take more risks with BTRFS, to be honest :slight_smile:
  2. Same scenario as above but with ZFS. ZFS already has a union pool feature, or I'll add BeeGFS on top just to be able to expand in the future with another enclosure.
  3. LizardFS v3.12 with EC 8:2 (MooseFS Pro would be a better choice but it's way too expensive for my needs) or Ceph v15.2.1 with EC 8:2.

Pros:

  1. BTRFS: flexibility to grow the pool with higher-capacity disks; I don't need to replace all of them in a pool. In theory, faster pool recovery after a device is replaced (in theory :slight_smile:). Last but not least: the awesome Rockstor team members :slight_smile:
  2. ZFS: stability. Everything works as intended.
  3. LizardFS (free, high performance, easy to maintain); MooseFS (active dev, better performance, supported over time); Ceph (active development, lots of documentation and use cases, good performance).

Cons:

  1. BTRFS: the current issue that I have :slight_smile:
  2. ZFS: in order to migrate to bigger drives I need to upgrade all of them in one of my pools before the disk space grows. That means a higher investment and a longer time, rebuilding the pool 5-7 times, plus the risk of rebuilding over and over with a raid5 :slight_smile:. If I go with a larger raid6 then the HDD upgrade will take even longer and get more and more expensive.
  3. LizardFS: dead development; MooseFS: crazy expensive; Ceph: high requirements for top performance (RAM proportional to storage; a desktop-class CPU to keep CephFS performance up as it runs single-threaded).

I know I'm going a little off topic, but I'd appreciate it if you guys could add some wisdom based on the sysadmin/storage experience that you have :slight_smile:

Cheers

I'll answer my own question; maybe others will find this info interesting :slight_smile:

The way forward:

  • OS: OpenSUSE Leap 15.1 or 15.2 with Rockstor on top.
  • OS Raid1 on 2 x SSDs with BTRFS.
  • Erasure coding 8:2 with LizardFS.
  • Metadata on BTRFS Raid1 (2 x NVMes).
  • Every HDD will be created as an individual BTRFS filesystem, and LizardFS will handle the chunks on top.

TL;DR you are not getting rid of me yet, I’ll stick around for a while :slight_smile:

After performing some tests on VMs it seems that LizardFS offers the same read speed as BTRFS Raid6. I also tested BeeGFS, but it's slower than BTRFS Raid6/LizardFS (~20-25%). My first concern was using FUSE for the mount point, but it seems that they have adopted fuse3 and nfs-ganesha, so it's good to go :slight_smile:


@shocker Thanks for the update. This all seems rather laden with levels.

I’m afraid I have nothing to contribute on the LizardFS front knowledge wise.

But re:

What type of Raid1 is this? The openSUSE btrfs root doesn't yet support multiple devices on the btrfs front. And neither does Rockstor. But in time they both should.

You may have already done all your research, but have you considered a lower layer to gain performance, given that was one of your main problems? I.e. bcache under btrfs. Rockstor should recognise this arrangement, given the udev caveat below, but there is no Web-UI to set it up. See the following for some details on this:

Not sure if the udev rules there are necessary anymore, as I haven't tried bcache on openSUSE, but Rockstor depends on the structure of the names created in those udev rules, so you may have to enforce/replace them if openSUSE has its own and they work differently.

It does introduce a single point of failure, but you still have a single CPU / PSU / memory bank presumably. And bcache can be turned to passthrough (essentially off) mode, or can be tweaked. So I had meant to mention this as an additional performance tweak possibility, i.e. a top quality nvme as the cache device.
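
Roughly, and only as a sketch with example device names (this wipes the devices involved, so only for building a new pool):

make-bcache -C /dev/nvme0n1 -B /dev/sda /dev/sdb        # nvme as cache, HDDs as backing devices
mkfs.btrfs -d raid1 -m raid1 /dev/bcache0 /dev/bcache1  # btrfs then sits on the bcacheN devices
echo writeback > /sys/block/bcache0/bcache/cache_mode   # or 'none' for the passthrough mode mentioned above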

On the btrfs dev list there look to be tweaks coming where one can specify metadata to be stored on non-rotational devices, if available, within the pool. So we should keep an eye on that feature if it ends up being standard. Most likely it would then be a mount option.

Good to hear :slight_smile:

Do you fancy giving a quick description of LizardFS? It sounds like a GlusterFS-type thing, in which case it might be worth doing your same tests with a modern GlusterFS, as that appealed to me more as a cluster filesystem that still has a real filesystem underneath, unlike Ceph. Plus it's got a pretty long history now. Are you after the further redundancy that btrfs raid6 can give, hence btrfs raid1/raid10's single-disk failure tolerance not matching your requirements?

Oh, and have you checked your existing disks and the new ones as not being 'secret' DM-SMR drives? If your existing ones are, then that would explain some of your performance issues. I believe no Seagate NAS drives were, but some WD and Toshiba were selling DM-SMR as NAS, which is just no good at all. Just another thought as it was recent news.

Anyway, I just thought I'd mention the bcache option really, as it's been in the kernel for years and is very well respected. And your main issues seemed to be performance related.

Do keep us updated on your adventures, I’m sure many folks here will be interested.

Cheers.


During the OpenSUSE installation, you can choose to have a Raid1 on BTRFS and set / as the mount point. I do have another tiny 120GB SSD for boot and swap.
Never tried this in a VM, but I'll give it a go.

Most of my concerns, to be honest, are about scalability and the ability to throttle the IO during a rebuild so it does not affect performance.

I was referring to LizardFS metadata. Basically it's one binary file that sits in a folder and this needs to be accessed quite fast. Also, most of the ongoing operations sit in RAM.

Tested Gluster, but with multiple file operations it is quite slow. The best alternative to GlusterFS is BeeGFS.
I had two options:
Go with ZFS Raid and add a "unifying fs" on top like Gluster (in this case it was BeeGFS), or find a solution with erasure coding and get rid of the raid.
Big players here with erasure coding are CephFS, LizardFS, MooseFS, and RozoFS.
Erasure coding ( https://en.wikipedia.org/wiki/Erasure_code ):
Everything that you write is divided into chunks and parity chunks are kept for them on the HDDs. Let's say you choose to have 3 data chunks with 1 parity chunk: this will be something similar to Raid5 but distributed across the servers. If you choose to have 3 data chunks with 2 parity chunks then this is similar to Raid6. For example, with LizardFS you can go up to 32 data chunks with 32 parity chunks :slight_smile:
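
To put rough numbers on it: with an 8:2 goal each file is cut into 8 data chunks plus 2 parity chunks, so the on-disk footprint is 10/8 = 1.25x the data (80% usable space), and any 2 chunks (i.e. any 2 disks or chunk servers) can be lost, the same tolerance as Raid6 but with the rebuild work spread across everything that remains.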

LizardFS (Raid6) equivalent:
Let's say you have 20 HDDs. You create a filesystem on every HDD (can be xfs, btrfs, zfs, etc.) and you mount every HDD at its own mount point:
/mnt/hdd1, /mnt/hdd2, etc.
Create 5 chunk servers and to every chunk server add 4 HDDs.
Then LizardFS will distribute the chunks evenly over all the HDDs in order to balance them, and if you set parity 2 you will have the ability to lose two drives, similar to raid6.
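
As a rough sketch on one chunk server (the mfshdd.cfg path and the service name are from my memory of the MooseFS-style layout, so double-check against the LizardFS docs for your packaging):

mkfs.btrfs /dev/sdb                         # one independent filesystem per HDD
mkdir -p /mnt/hdd1 && mount /dev/sdb /mnt/hdd1
echo "/mnt/hdd1" >> /etc/mfs/mfshdd.cfg     # register the mount point with the local chunkserver
systemctl restart lizardfs-chunkserver      # service name may differ per distro
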
What is nice about LizardFS

  1. Goals: for example, in your pool you might have a folder with an ISO mirror. You can set that "iso" folder to be without parity and by this you gain space. Also, you can set a parity of 6 for a pictures folder as that is critical for you, etc. (rough command sketch after this list)
  2. You can add and remove HDDs as you like; the speed, the size, etc. don't matter, and you can add one at a time.
  3. Scale as much as you want
  4. Crazy fast to implement/configure and also they have a nice GUI to monitor.
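
The per-folder goals end up being roughly this; the goal name, the mfsgoals.cfg syntax and the paths are from memory, so treat them as assumptions and check the LizardFS admin docs:

# on the master, define a goal (id, name, definition) in /etc/mfs/mfsgoals.cfg, e.g.:
#   15 ec_8_2 : $ec(8,2)
# then, on a client mount, apply it per directory:
lizardfs setgoal -r ec_8_2 /mnt/lizardfs/pictures
lizardfs getgoal /mnt/lizardfs/pictures          # confirm what got applied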

Cons

  • Slow/dead development. It seems that they have now created a strategic partnership and the development is getting back on track.

Why LizardFS:

  1. CephFS seems to be the best solution, but you need lots of resources to implement it properly (1 GB of RAM for every 1 TB of storage, a desktop PC with a fast CPU for the FS, etc.).
  2. MooseFS is the best solution as of now (LizardFS is a fork of MooseFS), but for erasure coding they are asking lots of money; they charge by the TB. For my needs it would be way cheaper to double the storage that I have and create a high-availability cluster :slight_smile:
  3. RozoFS: way too complicated to implement and maintain.
  4. LizardFS: open source, free, fast, easy to maintain, and the stable version is production-ready.
  5. There are also others but they don't fit the purpose.

After spending months reading about every solution invented till now :slight_smile: and testing on VMs, playing with HDD-loss scenarios etc., for what I need this will be the best scenario going forward.

Something is not clear and maybe someone can clarify. On Raid1 with BTRFS only data is replicated, right? Meaning that TBW will not be the same on both NVMes, so no risk of losing both at the same time :slight_smile:. I know that for sure with ZFS only data is replicated, but with BTRFS I just assume; I haven't tested yet.

Interesting, not yet. I have 14 x 8TB HGST and 5 x 14TB Toshiba. The newer ones are 7 x 16TB Seagate. I'll check today, thanks for the tip.


@shocker Re:

If that is now a supported config, that's great. I had a quick look and there is mention of multi-device '/' support arriving more recently. I'll have more of a look when I've got more time. I have in mind some changes that should help Rockstor understand this arrangement too, but they are going to take time unfortunately. Likely it will just look 'confused' for now.

Thanks for the nice run-down on LizardFS. If it is indeed returning to an active state and has some dedicated backing on the development side, then maybe I should consider that instead of GlusterFS. Do let us know how this all goes.

It does sound like you are after a cluster filesystem. For which I think btrfs, if an underlying filesystem is required, is a nice choice given its check-summing/raid and whatnot.

Erasure code theory looks interesting, on just a quick look it’s like adjustable error correction/redundancy.

Nice.

Do keep us informed.

The opensource bit is a must for us to incorporate.

It depends; btrfs can have different raid levels for data and metadata. For instance, some folks are using raid5/6 data and raid1/10 metadata to get around the current write hole and to boost metadata performance. So you could have raid1 data and single metadata, and then metadata would not be redundantly stored. But I believe the default is to have dup metadata, although it may be raid1. Rockstor plays with these a little to ease flexibility re adding/removing drives, but we definitely need to re-visit that. The plan is to first expose the actual metadata level on the Web-UI, then we can take tweaking it from there. As always, lots to do. But at the lowest level we already read all raid levels on every pool: data, metadata, system, reserve.

That's not likely really. And they should both be used pretty evenly actually if on Raid1, as it currently uses a round-robin approach I believe. Could even be odd/even on pid. Not that up on the lower levels though.
Something to be aware of is that on non-rotational devices btrfs will default to using single metadata. Don't do this; force otherwise. It's a questionable default that is occasionally questioned on the btrfs mailing list. Something to do with ssds doing de-duplication internally, so might as well not bother. I say do bother in this case. Again, Rockstor enforces dup I think. Again, it would be nice to surface this; I think we will do that soon actually. Just got to get the 'Built on openSUSE' variant in Stable and out the door first.
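
For anyone wanting to check and enforce this by hand (pool path is an example, and note a metadata convert rewrites all the metadata so best done at a quiet time):

btrfs filesystem df /mnt2/pool-name                   # shows the current Data/Metadata/System profiles
btrfs balance start -mconvert=dup /mnt2/pool-name     # single-device pool: force dup metadata
btrfs balance start -mconvert=raid1 /mnt2/pool-name   # two-device pool (e.g. the 2 x NVMe case): raid1 metadata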

I’m not so sure on that front, but only had a few years playing with ZFS as a user before my btrfs/Rockstor obsession :slight_smile:

Thanks for taking the time again to share your reading-up on all of this. I'll try and take a look at LizardFS vs GlusterFS, as I would really like to offer an easy Web-UI integrated setup within Rockstor to turn it into a cluster-capable storage system, as the next logical step up in storage redundancy and serving larger file sets. My long-term aim is to have a trivial cluster setup where one can down an entire Rockstor instance, replace it afresh, and have it rejoin the cluster trivially and be re-silvered auto-style. That way folks can have multiple Rockstor instances serving a single cluster file system and gain major maintenance advantages.

Cheers.

Yes, they are using Reed Solomon.

Some info for nerds:

Another very popular one is Jerasure (this is the most common with Ceph). I like Reed-Solomon as they are keeping it simple and I like the math behind it :slight_smile:

If you want an alternative to GlusterFS but way faster → BeeGFS (you cannot use it with BTRFS, but it works great with ZFS).
If you need a very fast FS for small files (not my case) have a look at SeaweedFS. But they don't have erasure coding.

LizardFS can be a replacement for Raid5/6 from my point of view. You can achieve Raid5 with XOR and Raid6,7,8… with EC.

If you want to try it out, I've already built the RPMs for Leap 15.2 for the stable version v3.12 :wink: What is nice in v3.13 (RC2 for now) is that they are using fuse3 and you gain 5-10% more speed, at least on paper. In my tests the performance was the same; maybe with huge parallel file transfers it might make a difference.

For my needs, I’ll just keep only 1 binary file inside the pool and that’s it.

Here I'm 100% sure, as I have already been using this scenario on other servers for years. I can see that one HDD is at 4% health and another at 7%, for example. :slight_smile: ZFS is not replicating 1:1 like mdadm or other software; it replicates the data internally and it might differ.

I got quick access to one of my dumb servers with ZFS raid 1 for the OS:

I’ll go with this route:

  • Create an HA GUI (docker, kubernetes, proxmox, etc.)
  • Create a probe/monitoring script that calls the master server to report the system status, deploy stuff, etc.

And the two scenarios:

  1. If BTRFS Raid56 becomes production-ready, you can go with BeeGFS (hopefully they will update for BTRFS support, or you can add a BTRFS patch to show the inode allocation and not show 0 on BTRFS; it's already available somewhere). Easy to implement, kernel-based (no fuse), awesome speed. I have already created a script to automatically deploy it on OpenSUSE for my testing, let me know if you need it. Then you just add pools to BeeGFS (similar to bricks on Gluster) to increase the size.

  2. a) If you don't want to use Raid5/6 with BTRFS then XOR/EC is the way forward (exactly what I'm doing right now). If you go with LizardFS for XOR/EC then it will be crazy simple, as the master/metadata server can sit with the GUI, and on each bare-metal node you can install the LizardFS chunk server and the Rockstor monitor script. Or you can leave the flexibility to the users to scale it or add it alongside other nodes. But a docker/kubernetes deployment will be easier for scalability and simplicity.
    b) Dummy mode with SnapRAID and MergerFS.
    SnapRAID is not a proper RAID solution; basically, you can have whatever filesystem created on every HDD, add content to it (you can even start with HDDs already filled with data) and every file will be stored on a single drive (not across multiple like Raid). Then it will create some parity drives to recover data if an HDD is lost. With MergerFS on top you can do the clustering thing and have everything in one big pool across your servers.
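
For reference, a SnapRAID setup is basically one small config file plus a couple of commands; the paths below are just examples (the config usually lives at /etc/snapraid.conf):

# /etc/snapraid.conf (example)
parity /mnt/parity1/snapraid.parity
content /var/snapraid/snapraid.content
content /mnt/hdd1/snapraid.content
data d1 /mnt/hdd1/
data d2 /mnt/hdd2/

# then, periodically:
snapraid sync     # update parity after files change
snapraid scrub    # verify data integrity against the parity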

SnapRAID Pro

  • Can start with already-filled HDDs.
  • If you change your mind, you can uninstall and read the data directly off the HDDs.
  • Can use whatever FS you want: XFS, NTFS, BTRFS, ZFS, etc.; it doesn't matter as long as the data is accessible.
  • Parity of 1, 2, 3…6 drives.
  • Data integrity, scrubs, etc.
  • Electricity costs: if you are a small user and you need to read data from only one HDD, the others can idle.

SnapRAID Cons

  • Speed :slight_smile: as the data is not striped or replicated, you get at most the speed of the single drive where the data is stored.
    Have a look here: SnapRAID

Let me know if you need extra help regarding a solution as I’m currently up to date with almost everything that is running on the market :wink:

Cheers.

UPDATE: Oh, and for most home users I think the key element will be scalability (not only cluster size). Everyone wants to keep the costs low; for example, if I have 10 x 4TB HDDs on my server I would like to replace them with 10 x 16TB and not buy another enclosure. And if I cannot afford to buy 10 x 16TB in one order, I can start by replacing only 1 x 4TB with 1 x 16TB and that's it.
Something similar to unRAID but without the limitations :slight_smile: (max 30 HDDs, no data integrity check, OS on USB and if you lose the USB you need to replace the license and you can do this only once a year (I know, right? :slight_smile: ), you cannot use it for free, etc.), and I think this is where Rockstor can take the lead.
P.S. I'm not doing this migration for clustering, but with clustering and scalability in mind. I will currently deploy this on one server. Start as small as possible and grow as much as needed :wink:


In the end I destroyed my faulty raid6 and have now switched to smaller raid5 pools added to a GlusterFS.
Everything is restored and up and running. So far so good with GlusterFS v7 serving only large files, and I'll share more about this after a month of tweaking and testing :slight_smile:

Here is another thing: btrfs-progs 5.7 has been released: https://btrfs.wiki.kernel.org/index.php/Changelog#btrfs-progs_v5.7_.28Jul_2020.29 and it seems that btrfs fi usage is now fixed for raid56 :wink: Yay, another one fixed!
