Disk Errors, Pool Mismatches, and More.. Much More

Hello, all!

So here’s a weird one - old server (an HP MicroServer Gen8) started flaking out, so I retired it and built it anew last year with an 8-bay NAS case, purchasing two new IronWolf 10TB SATA drives for MORE STORAGE. I had four 4TB drives from the old server, and a spare 3TB, so I chucked that into the case because why not. Built two pools: “Storage” with the two 10TB drives (intending to replace the others as I could afford it), and “Old_Stuff” with the four 4TB drives (for which pool recovery “just worked”: all four drives re-assembled as a pool when asked).

Moved shares off the “Old_Stuff” pool into the new “Storage” pool to prep for eventual replacement, and then things got weird and I had to shelve the “buy more drives” project for a bit.

(You may notice something strange there, based on the description - foreshadowing, if you will)

Nevertheless, everything was running smoothly… until now. See, bond0 crashed out (I’m working that issue separately), and when I finally managed to un-bond it to see what was going on, the GUI showed me that things are… confused:

So, the top two drives (THE NEW ONES?!?!) are currently flaking out (multiple read errors, OMG). Can’t worry about that right now. The 3TB “why not” spare drive now thinks it’s the only member of the “Old_Stuff” pool, even though it was never a member:

And EVERYONE ELSE is in the “Storage” pool. (Not sure if it’s significant, but I now have multiple devid 1 and devid 2 devices in the pool, devices that are supposed to be in the other pool):

Trying not to panic because while I do back up via CrashPlan-Pro… there’s a ton of home movies, photos, a massive CD collection, and a ton of DVD/BRDs I laboriously ripped onto the server… and I would prefer not to have to rebuild everything from scratch.

So finally to the questions:

  1. How bad is this? Everything seems to be working well enough with the shares, still accessible via CIFS/NFS etc. Should I just adapt to this new normal and not worry?
  2. Is there a way to massage this so that everyone shows up in the correct pool so I can stop worrying about massive amounts of data loss in my near future?

Advice welcome, as always. :slight_smile:

Cheers,

KeithF

@KeithF hello again. Sorry, I did not check anything over this long weekend.

No solution for you, but let’s see whether this is more of a UI issue, or indeed something going on under the covers.

Firstly, which version of Rockstor are you running, and which Leap or Tumbleweed is it based upon?

Secondly, if you go to the command line and run btrfs filesystem show (with sudo, if you’re using the Web-based shell), what does it show you? The same situation you’re observing in the Web-UI (devices assigned to the wrong pools, or not at all, etc.), or something different?
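
For anyone following along, a minimal sketch of that check (assuming a standard sudo setup; nothing here is specific to Rockstor):

sudo btrfs filesystem show            # full listing of every btrfs filesystem the kernel can see
sudo btrfs filesystem show Storage    # optionally narrow to one label (a UUID or device path also works)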


@KeithF Hello again.

Could you clarify a little more here, re:

Rockstor’s internals assume unique Pool and Share names within an install. This is a current limitation, enforced within the Web-UI on the creation and import (for Shares) side. However, if pre-configured (btrfs-wise) devices are attached that have duplicate Pool or Share names, confusion will ensue: primarily in how this is all represented within the Web-UI. It all comes down to this unique naming constraint being enforced only during Web-UI creation of, say, a new Pool. We also enforce it during pool import for Shares, but of course there is no import if the devices belong to a known (by name) Pool!

I think this is what has happened here, based on your “Built two pools: “Storage” …” comment. As Hooverdan indicated, what we need here is a system-level confirmation of this assumption via:

btrfs filesystem show

If you could copy the output of that command (run as the root user) into this thread, we should have more to go on, and then a path to teasing apart these Web-UI-only merged “Storage” Pools that actually have distinct UUIDs.
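
As an aside, and as plain btrfs/util-linux tooling rather than a Rockstor-prescribed step, blkid shows the same label/UUID pairing per device, which can help cross-reference the Web-UI disk list against the real pools:

sudo blkid -t LABEL="Storage"    # lists every device carrying that label, along with its filesystem UUID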

To your questions:

If you avoid Web-UI manipulations, it is currently cosmetic! But not to belittle the apparent horror here :). The Web-UI sees all “Storage” pool members as belonging to the assumed-unique “Storage”-labelled pool; only in your case it is actually two distinct Pools, created on two different system installs, if I have read the history correctly here. My apologies otherwise, of course; we are nearing a new release so I’m a bit pushed time-wise currently.

Definitely, if my assumption is correct re two independently created “Storage”-labelled Pools that were then attached to the same host (which previously had a Pool of that same name/label). However, it could be a delicate operation. We can tell more once we have your command output, as we can then mimic this situation in a VM to confirm a neat way around it. We have Web-UI tools planned to help with such things, but we are currently in the throes of milestones that pre-date this effort. All in good time, hopefully.
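
For the curious, the clash itself is easy to mimic outside of Rockstor with a couple of throwaway loop devices; a rough sketch (file paths made up for the example) of two independently created filesystems that share a label but have distinct UUIDs:

truncate -s 1G /tmp/poolA.img /tmp/poolB.img       # two small backing files
LOOP_A=$(sudo losetup -f --show /tmp/poolA.img)    # attach each as a loop device
LOOP_B=$(sudo losetup -f --show /tmp/poolB.img)
sudo mkfs.btrfs -L Storage "$LOOP_A"               # same label on both, created independently
sudo mkfs.btrfs -L Storage "$LOOP_B"
sudo btrfs filesystem show Storage                 # one label, two uuids: the clash described above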

Thanks for the report, and my apologies for this shortfall. If it is what I think it is, we guard against it during creation on the existing system, but we currently can’t cope when two systems’ same-named pools are attached to a single system. We have begun this work, however, and have to complete a move to UUIDs as the canonical identification for Pools and Shares.

Oh dear - that’s not good! Outside the realm of Rockstor, but still cause for concern there. Let’s get these Pools teased apart first (if that is the cause of the confusion), as you can then see more of what is what.

Hope that helps, and let’s see that command output so folks all around can see the situation and we can duplicate it locally to find a way to resolve at least that confusion.


Bonjour, Phillip and Dan! :slight_smile:

A long weekend plus a brief hospitalization caused delays on this side, so no worries, and sorry it took so long to get back to you. :frowning:

I’m currently running Rockstor 5.1.0-0 on openSUSE Tumbleweed 20260402.

And here’s the output from the “btrfs filesystem show” command:

raven:~ # btrfs filesystem show
Label: 'ROOT'  uuid: cbfce5f5-49d8-49b1-83ac-854176beab2b
        Total devices 1 FS bytes used 17.70GiB
        devid    1 size 463.70GiB used 21.05GiB path /dev/nvme0n1p4

Label: 'Old_Stuff'  uuid: bdb06fd7-619a-47ae-88f0-4636f5937b69
        Total devices 1 FS bytes used 176.00KiB
        devid    1 size 2.73TiB used 20.00MiB path /dev/sdg

Label: 'Storage'  uuid: 2edd8b2a-2599-4b31-aceb-b7618ceb749e
        Total devices 2 FS bytes used 6.81TiB
        devid    1 size 10.91TiB used 3.42TiB path /dev/sda
        devid    2 size 10.91TiB used 3.43TiB path /dev/sdb

Label: 'Storage'  uuid: 7ab39728-769b-4ca2-ae16-cce62a7b3a82
        Total devices 4 FS bytes used 6.53TiB
        devid    1 size 3.64TiB used 1.64TiB path /dev/sdd
        devid    2 size 3.64TiB used 1.64TiB path /dev/sdc
        devid    3 size 3.64TiB used 1.64TiB path /dev/sdf
        devid    4 size 3.64TiB used 1.64TiB path /dev/sde
raven:~ #

So while the UI shows “Everything Everywhere All At Once”, the underlying BTRFS system shows things in mostly the proper place. Volumes in the… erm… first “Storage” pool are the newest drives. The second “Storage” pool was actually the “Old_Stuff” collection from the previous Rockstor - the plan being to migrate all the data off (rsync is another word for winning!), then remove those drives and shelve them against future need or a new project. The one drive that is in the “Old_Stuff” pool was a 3TB spare I tossed in to park some stuff I was moving from my wife’s old laptop to her new one.
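
A generic check along these lines (plain util-linux findmnt, nothing Rockstor-specific) should also confirm which of the two “Storage” UUIDs is actually backing the mounted shares, before anything gets teased apart:

findmnt -t btrfs -o TARGET,SOURCE,LABEL,UUID    # maps each btrfs mount to its device, label and filesystem UUID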

So the good news is that the underlying OS seems to mostly know where everything is supposed to be.

Thanks in advance,

KeithF
