Physical vs. Logical Disk Assignments in Raid-10 (Solved! There is none!)

Okay, I know this is a weird request, but I love symmetry in hardware.

For instance, in the old days, Drive A (C:) would be assigned as master and Drive B (D:) as slave. This was hardware- and jumper-dependent.

Now I have 8 drives attached to specific ports on an HBA in a fixed physical order. The drives are stacked vertically: Port 0 of the HBA serves the top 4 drives (1 thru 4); Port 1 serves the lower 4 drives (5 thru 8). Each port cable's lanes are ordered 1 thru 4, and there are 2 cables for the 8 drives.

Now, the drives are assigned sda thru sdh in the Raid-10 pool.

The BTRFS DevID seems to have no correlation to the HBA port IDs, and therefore no correlation to the physical positions of the hard drives.

To further complicate the situation, the “sdX” names have no correlation to the DevIDs!

The end effect is that, without physically looking at the hard disk serial numbers in the setup, there is no way to tell which drive is part of what!

We thus have a triple whammy of confusion and, I think, needless extra work determining which disk may be a problem child.

Is there any way to amend this behavior and tie physical ports to specific “sdX” assignments?

Thanks!

:sunglasses:

Hi @Tex1954,

This is unfortunately not my expertise, so please keep this in mind when reading what I’ve written below.

I do know that relying on the sdX names is inherently problematic, as these are NOT guaranteed to be consistent at each reboot. I would need to refresh myself on how the sdX names are attributed at boot time, but if I’m correct, these can vary between reboots, which would make using them to identify and track disks very risky. That is part of the reason why Rockstor uses the disks’ serial numbers to identify and track disks. There might have been a time in the early days when Rockstor used the sdX labels, but that was most likely before my time.
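If it helps, here is a quick sketch of what I mean (plain Python, assuming a Linux system where udev populates /dev/disk/by-id; this is just an illustration, not Rockstor’s own code). It prints each stable by-id name, which embeds the drive’s model and serial number, next to the sdX name it currently resolves to:

```python
#!/usr/bin/env python3
# Sketch only: assumes a Linux system where udev maintains /dev/disk/by-id.
# Each symlink there is named from the drive model + serial number, so the
# mapping to the (unstable) sdX kernel names can be listed directly.
from pathlib import Path

BY_ID = Path("/dev/disk/by-id")

for link in sorted(BY_ID.iterdir()):
    if "part" in link.name:  # skip partition entries, keep whole disks
        continue
    print(f"{link.name:60} -> {link.resolve().name}")  # e.g. ... -> sda
```

Running that before and after a reboot should show the by-id names staying put even if the sdX targets shuffle.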

Sorry I can’t answer or provide elements of feedback on your other points; I’m sure someone actually knowledgeable on the matter will chime in.

Hope this helps, nonetheless.

3 Likes

Howdy!

It appears the DevIDs are in order of the LONG serial numbers of the disks, as you stated, once I took a second look… I was initially looking only at the last part of the serial numbers, after the dash.

I can understand that, from the Rockstor point of view, this may be entirely necessary. So my question then becomes: “How can I tell which disks are part of the Raid-0 A array vs. the Raid-0 B array in the Raid-10 setup?”

Am I to presume the first 4 HDs are in serial-number order and comprise the Raid-0 A array? If so, would rearranging my HDs into serial-number order help identify their hardware positions?

I hope you see the problem… having to remove possibly ALL the drives and inspect the serial numbers just to identify a physical problem child seems way too awkward in 2024…

:sunglasses:

1 Like

@Tex1954 I can chip in a little here hopefully.

Firstly, I second all that @Flox stated. In the early days, i.e. when MFM/RLL/IDE was the hardware port, hda etc. were predictable per hardware port. That disappeared as SATA came in, and now the sda name (from the earlier SCSI layer naming) is essentially random on each boot, depending on drive appearance/readiness I think. This created an adaptation requirement at the time, where some systems had made assumptions about the tally of sdX-type names to their wired ‘position’, in part based on earlier kernel & BIOS behaviour left over from the earlier hardware interfaces. Rockstor itself, very early on, made some false assumptions in this area: now resolved by using the by-id naming as canonical, as that was, in part, why the by-id names were created. And we track serials (also used in by-id name creation) to handle drives migrating from one interface to another.

The following is a recent kernel reference we used in fixing our TW drive activity widget. It lists the device names associated with each drive interface type:

https://www.kernel.org/doc/Documentation/admin-guide/devices.txt

You are correct: when you add a drive it gets the next number available; devid is purely at the btrfs level. The first Pool member to be formatted as such gets id 1. If that member is later removed from the pool, the remaining member ids start at 2. So not arbitrary, but dependent on the order of addition to the Pool.
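If you want to eyeball that ordering on your own pool, something like the following should do it (a sketch assuming btrfs-progs is installed and it is run as root; not Rockstor code):

```python
#!/usr/bin/env python3
# Sketch: run `btrfs filesystem show` (btrfs-progs, typically needs root)
# and print only the devid lines, so the devid-to-device mapping and the
# order of addition to the Pool can be inspected directly.
import subprocess

out = subprocess.run(
    ["btrfs", "filesystem", "show"],
    capture_output=True, text=True, check=True,
).stdout

for line in out.splitlines():
    line = line.strip()
    if line.startswith("devid"):
        # e.g. "devid    3 size 3.64TiB used 1.02TiB path /dev/sdc"
        print(line)
```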

A conceptual element missing here, re raid10 vs btrfs-raid10, is that we are strictly, and currently solely, aware of btrfs-raid10. The key difference is that btrfs profiles are akin to, but not the same as, traditional hardware & md/dm raid setups. Btrfs raid works at the chunk level (1 GB sections) but is drive-aware. I.e. btrfs-raid1 ensures there are two copies of each chunk on two different devices. With 3 drives, say 1, 2, 3, chunk 1 may be on drives 1 & 2, but chunk 2 may be on drives 2 & 3. Ergo chunk-based, not drive-based: but drive-aware for device redundancy.
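To make the chunk-vs-drive distinction concrete, here is a toy model of that placement (plain Python; only an approximation of btrfs’s real allocator, which, as I understand it, favours the devices with the most unallocated space):

```python
#!/usr/bin/env python3
# Toy model, NOT btrfs code: for each new chunk, btrfs-raid1 (as I
# understand it) picks the two devices with the most unallocated space,
# so copies are placed per chunk yet always land on two different drives.
free = {"drive1": 100, "drive2": 100, "drive3": 100}  # GB unallocated

for chunk in range(1, 7):
    # the two devices with the most free space get this chunk's two copies
    a, b = sorted(free, key=free.get, reverse=True)[:2]
    free[a] -= 1  # each chunk copy consumes 1 GB on its device
    free[b] -= 1
    print(f"chunk {chunk}: copies on {a} & {b}")
```

With three equal drives the copies rotate around the devices, matching the “chunk 1 on drives 1 & 2, a later chunk on 2 & 3” pattern described above.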

The by-id names may well contain info about which hardware port they are attached to, though. That may help regarding which drive is attached to which controller. Take a look at the drive overview; it also shows the associated Pool that each drive is a member of.

In part this is addressed by our use of by-id names. Does this pan out in your current setup? If so, we may have a Web-UI improvement here: we could add to the disk table on the Pool details page to surface the port-redundancy elements at play, if any.
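As an aside, and as an assumption on my part about what is available on your install: on most Linux systems udev also populates /dev/disk/by-path, whose names encode the controller/port path rather than the serial. A quick sketch to list those alongside the current sdX names (illustration only; not something Rockstor itself surfaces):

```python
#!/usr/bin/env python3
# Sketch: /dev/disk/by-path (populated by udev on most distros) names each
# disk after its PCI controller + port path, which may help tie the random
# sdX names back to physical HBA ports.
from pathlib import Path

BY_PATH = Path("/dev/disk/by-path")

if BY_PATH.is_dir():
    for link in sorted(BY_PATH.iterdir()):
        if "part" in link.name:  # whole disks only
            continue
        print(f"{link.name:50} -> {link.resolve().name}")
else:
    print("no /dev/disk/by-path on this system")
```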

Some NAS cases purposefully expose the end of each drive to inspection, which generally reveals a serial number sticker. Another option, if the drive can still be read from, is to use our flashing-light facility:

https://rockstor.com/docs/interface/storage/disks.html

See the Table links from left to right where we have:

  • Bulb Icon - flash the drive’s activity light to identify its physical location.

Which of course assumes there is an interface disk-activity light paired with each associated drive, or that a drive-resident light is visible.

I.e. btrfs-raid does not account for interface redundancy, only drive redundancy (and at the chunk level). Interface redundancy can be accommodated, kernel-wise, via multi-path: but that is more associated with enterprise drives that have two interfaces each. One attaches each drive to two independent controllers, providing for controller fail-over. Rockstor is unaware of, and currently incompatible with, this arrangement. But were I to have local access to such a system (cost constraints currently), I’d like to try to add this capability :slight_smile: time allowing. And of course our underlying openSUSE entirely supports such a thing. But again, there would need to be a significant re-work of our lowest-level disk management code to accommodate multi-path. However, with the recent major clean-ups/modernisation enacted as part of 5.0.9-0, this is now far more approachable: again, assuming I had the necessary enterprise hardware.

Hope that helps, if only for some context.

4 Likes

Ummmm (I had to get a magnifying glass out to look), there is indeed a tiny, itsy-bitsy thing one could call a label on the end of my drives…

My previous method was to swap the SSD for a Windows boot SSD and use HD Tune Pro or CrystalDiskInfo to check things out…

Well, I suppose all my questions are addressed, and thanks to all for your help! Now all I need to do is win the lottery and buy an enterprise-class NAS hardware setup with more “lights-a-blinkering”, and a new house to hold it… and maybe some new eyes… and body…

LOL!

:sunglasses:

2 Likes

FYI I found a rather good explanation of BTRFS Raid stuff here…

Now I am really confused! LOL!

BTRFS has nothing to do with physical disk size or placement; yet the speed and redundancy benefits of a Raid-10 setup behave virtually the same as, say, a Winderz striped Raid-10, so far as speed is concerned.

Amazing!

:sunglasses:

Main NAS SSD, 8 & 6 disk Raid-10 setup:

2 Likes

FYI, I am surprised! I discovered Western Digital’s RMA system is rather fast and spot-on!
This is a drive I found faulty and returned… my first one!

:sunglasses:

2 Likes