Can't I MIRROR a 4× 1TB-disk RAID-0 with a single 4TB drive?

In fact, why can't I add a mirror drive to ANY pool? For instance, why can't I take a 2× 2TB RAID-0 and later add a 4TB mirror to it? Or why can't I take a single 4TB disk and mirror it with half of an 8TB disk?

It seems to me the restrictions are nonsense, in that it's all just software at this point, since I don't use a dedicated (expensive) RAID controller…

OR, am I simply so aged that I can’t understand something simple?



@Tex1954 Hello there.

I don't think I quite get your question/difficulty here, sorry. But any pool in Rockstor can, as per btrfs, have a drive of any size added to it. Can you explain more about the restriction you have run into?

Btrfs, and Rockstor in this case (we don't expose all of btrfs, but drive adding we do), has no real restriction on adding a drive to a pool. What I think may be happening here is a difference in interpretation between RAID in old-style hardware and RAID in software, specifically btrfs.

Btrfs raid1, for example, can have not only 2 but any number of disks. But the 'software', read btrfs drivers, ensures that there are exactly 2 copies of each data and metadata chunk on 2 different devices, independent of the number of devices. There are now 3- and 4-copy variants (raid1c3/raid1c4), but we don't yet 'understand' them in Rockstor.

So, put more simply, raid in btrfs is at the chunk level, not the drive level. This enables things like different raid levels for data than for metadata, e.g. using dup for metadata while using raid1 for data.

That may all be irrelevant here, but I thought it might be a potential sticking point: data is mirrored at the chunk level, not the drive level, at least in btrfs raid1/10. And raid levels are set at the pool level; new files will conform to that level, while old data stays at its prior level until a balance is performed that, depending on options, re-writes old data as new and thus moves it to the new raid level. Rockstor does an automatic 'whole' balance after a disk add, so that is already taken care of via the Web-UI.
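To make the chunk-level point concrete, here is a rough simulation (my own sketch, not Rockstor or btrfs code; the 1 GiB chunk size and greedy most-free-space allocator are simplifying assumptions) of why btrfs raid1 works with any number and mix of disk sizes:

```python
# Rough sketch of btrfs raid1 chunk allocation: each chunk is placed
# on the TWO devices with the most free space remaining, so mirroring
# happens per chunk, not per whole drive. Sizes below are illustrative.

def raid1_usable_gib(disk_sizes_gib, chunk_gib=1):
    """Simulate chunk allocation; return usable (single-copy) capacity."""
    free = list(disk_sizes_gib)
    if len(free) < 2:
        return 0  # raid1 needs 2 devices for the 2 copies
    usable = 0
    while True:
        # Pick the two devices with the most free space.
        a, b = sorted(range(len(free)), key=lambda i: free[i], reverse=True)[:2]
        if free[b] < chunk_gib:
            break  # no second device left to hold the mirror copy
        free[a] -= chunk_gib
        free[b] -= chunk_gib
        usable += chunk_gib  # one chunk of user data, stored twice
    return usable

# 4x 1 TiB disks in raid1 -> half the 4 TiB total is usable.
print(raid1_usable_gib([1024, 1024, 1024, 1024]))  # 2048
# 4 TiB mirrored with 8 TiB -> only 4 TiB usable; the big disk's
# extra space has no second device to pair with.
print(raid1_usable_gib([4096, 8192]))  # 4096
```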

Hope that helps. And to change/add a disk you go to the pool details page, where there's a button at the bottom which in turn presents a set of Pool modification options: add, remove, or change raid level.


As @phillxnet comments above, btrfs raid works differently to the traditional notion of disk RAID.
I use this online calculator which lets you play with various disk size combinations to see the effect.
Note that (I believe) in Rockstor the number of ‘copies’ for raid1 is 2. Someone please correct me if I am wrong.
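For a back-of-the-envelope version of what such a calculator computes, btrfs raid1 with 2 copies comes out to roughly the following (my own approximation, ignoring metadata overhead):

```python
def btrfs_raid1_usable(sizes):
    """Approximate usable space of a btrfs raid1 (2-copy) pool.

    Either half the total (when no single disk dominates), or the sum
    of all the other disks (when the largest disk has leftover space
    nothing can pair with)."""
    total = sum(sizes)
    return min(total // 2, total - max(sizes))

print(btrfs_raid1_usable([1000, 1000, 1000, 1000]))  # 2000
print(btrfs_raid1_usable([4000, 8000]))              # 4000
print(btrfs_raid1_usable([2000, 2000, 4000]))        # 4000
```

The third case shows the mixed-size win: the 4TB disk's chunks pair off against the two 2TB disks, so nothing is wasted.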
Hope this helps.


Okay… I think I am beginning to understand. I was thinking RAID at the DISK level, not the chunk level…

I have up to now only messed with DISK level RAID…

Soo, the next thing I would like to know is WHY I can’t have my cake the way I want it…

Example: I "WANT" a 4-disk RAID-0 for SPEED across a fiber network… I want the drives striped. Rockstor says this is what happens when I take 4× 1TB drives and run them RAID-0: it gives me 4× the storage space and 4× the speed.

Likewise, when I run those same 4 drives in RAID-10, I get only half the total size (the mirrored part) and half the total possible speed (the RAID-0 part). In both cases the disks are set up to use their full size in a share.

Soo, let’s take the 4X 1TB drive Raid-0 example now.

I am happy with the speed and size but I want to backup the Raid-0 array somehow. My preference would be to copy all the data (-d & -m) to another array of “N” disk(s) where “N” is a value of 1 to 8 depending on the backup array disk sizes. I would like to do this automatically, but manually would be fine as well.

The question is, how can I do this? Would a snapshot do the trick? I would rather have it done automatically as in a RAID 1 application…


@Tex1954 Re:

You would have 2 pools. One for speed, i.e. as in your raid 0 or 10 examples; and one to back-up this fast-but-fragile front-line pool to. This back-up pool would likely be raid1 and you can then add disks to this back-up pool as and when needed.

Copy methods.

  • Good old rsync (solid, but works at the file level)
  • btrfs send/receive, such as is used in Rockstor's replication. But we don't currently support replication within a single Rockstor instance, only between Rockstor instances.
  • Other wrapper scripts that in turn use btrfs magic for this kind of thing.

So for a single machine, the wrapper-script route may come in, at least for the initial config. You could set up send/receive scripts between the required subvols on the fast-but-fragile pool and the redundant backing-storage pool, then execute those scripts via cron jobs or systemd timers.

I can't help with the details on these, but the info is out there on btrfs send/receive. Plus it's block level, so after the first send it can very quickly work out the difference between source and destination. But note that you must send from a read-only snapshot; these can be taken instantaneously. Rockstor already has scheduled tasks that you may be able to use to create these snapshots for your back-up scripts.
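As a hedged sketch of such a send/receive wrapper (the subvol and pool paths are made up for illustration; the btrfs sub-commands themselves are the standard snapshot/send/receive ones):

```python
# Sketch of one backup step: snapshot the source subvol read-only,
# then pipe `btrfs send` into `btrfs receive` on the back-up pool.
# All paths below are illustrative, not Rockstor defaults.
import subprocess
from datetime import datetime

def build_backup_cmds(src_subvol, snap_dir, dest_dir, stamp=None):
    """Return (snapshot_cmd, send_cmd, receive_cmd) as argv lists."""
    stamp = stamp or datetime.now().strftime("%Y%m%d-%H%M%S")
    snap = f"{snap_dir}/backup-{stamp}"
    return (
        ["btrfs", "subvolume", "snapshot", "-r", src_subvol, snap],
        ["btrfs", "send", snap],
        ["btrfs", "receive", dest_dir],
    )

def run_backup(src_subvol, snap_dir, dest_dir):
    snap_cmd, send_cmd, recv_cmd = build_backup_cmds(src_subvol, snap_dir, dest_dir)
    subprocess.run(snap_cmd, check=True)          # read-only snapshot first
    send = subprocess.Popen(send_cmd, stdout=subprocess.PIPE)
    subprocess.run(recv_cmd, stdin=send.stdout, check=True)
    send.wait()

# e.g. run_backup("/mnt2/fast_pool/share",
#                 "/mnt2/fast_pool/.snapshots", "/mnt2/backup_pool")
```

A fuller script would keep the previous snapshot around and pass it to `btrfs send -p` so later runs transfer only the differences.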

Hope that helps, at least to start a discussion for folks that know more than me on this subject here on the forum.


Thanks tons! You gave me a big idea! Instead of 3 separate Raid-1’s to use as backup, I could make one HUGE raid-1 to backup everything. That gives me 2 backups on the RockNas plus originals on main system… I’m going to try that rsync thing and see what I can do… Requires careful planning on my part to get what I want…

BTW, has there been any progress on getting bcache integrated so I can put a fast SSD in there like freeNAS does?

Thanks again! Off to the races again!!!






I've not personally tried our prior bcache 'arrangement' on the current Rockstor 4, which is, ideally, what you should be running now. But our older CentOS variant was proven to work with bcache from the drive-roles addition onwards (read: LUKS support, as it also uses roles). We ensured that our then-fledgling disk-roles subsystem was flexible by developing both the LUKS and bcache capability simultaneously. Both need to recognise certain 'categories' of 'special' devices, and our roles are used to label said devices so that they can be presented accordingly within the Web-UI.

The caveat is that Rockstor expects a very specific configuration regarding the naming of the bcache-associated devices. If this is in place then the Web-UI should sanely represent the nature of the various devices and present proper configuration options for them in turn. These naming requirements are set up by udev rules (on boot), and it is these rules that have not been established as existing or compatible within our "Built on openSUSE" variant. And given bcache compatibility within the Web-UI is not a current 'core feature', it hasn't received the attention needed to ensure it even works in the Rockstor 4 version. However, if our base OS of openSUSE Leap 15.2/15.3 does have incompatible udev rules, they could be replaced with the compatible ones we have in our technical doc/wiki here:

So that is basically the document we have for how to set up bcache device naming so that it is compatible with Rockstor 3, and hopefully still Rockstor 4, though with the aforementioned replacement of existing rules (untested). Note that if you do try this on Rockstor 4 you are likely to be able to just install the bcache userland tools from the default repositories and not have to build them from source, due to the years-newer underlying OS.

Remember that inevitably more complexity brings with it more fragility (more moving parts), so take care with 'experimenting', especially with important data. If you end up scrambling a pool and then btrfs-send broken subvols, you may end up scrambling the received subvols as well. Bcache has a very good reputation, but it does introduce another layer and can in turn weaken an otherwise complex-enough setup. At least we have dropped a number of layers in adopting btrfs in the first place, i.e. no partitions, LVM (physical/logical layers), or mdraid (physical/logical), with all of these wrapped up in the same project/layer. But bcache still introduces another failure point, and a single point at that if you have only a single SSD. And if you do, as some have, use mdraid to beef up that single SSD, you then add mdraid's inability to return what it was given (no checksumming), so you still have a single point of failure. I'd go for a write-through config at first if you try this, unless of course you really need the apparently more instant writes.

Hope that helps.


Installing bcache is the only method I would consider, for most of the reasons you mentioned. I am a retired HARDWARE engineer in a narrow field of motion controls, operating machines to tight tolerances. I've programmed a gazillion lines of MCU and DSP assembly language, only tidbits of "C", and a lot of BASIC for test-machine apparatus.

It is clear to me that we have data incoming at speed "Rs" and data being written at speed "Ws", and if "Ws" is slower than "Rs" then some means of caching is required to maintain throughput for a given data length. If the data length is greater than the sum of all the hardware cache available, then I see the system apparently using RAM as a cache… and as the data continues to expand and fill up the RAM (32G ECC in this case), the system begins to slow down significantly.
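The Rs/Ws reasoning above can be turned into a quick worked calculation (the function and all the figures are illustrative, not measurements): while a burst of B MB streams in at rate Rs and drains to disk at Ws, the cache must absorb B × (Rs − Ws) / Rs.

```python
# If data arrives at rs (MB/s) and drains to disk at ws (MB/s), the
# cache fills at (rs - ws) for the duration of the burst (burst_mb / rs
# seconds). All figures below are illustrative.

def cache_needed_mb(rs, ws, burst_mb):
    """Cache required to absorb a burst of burst_mb without stalling."""
    if ws >= rs:
        return 0  # disks keep up; no cache needed
    return burst_mb * (rs - ws) / rs

# ~1000 MB/s in over a 10G link, disks sustaining 400 MB/s,
# copying a 100 GB burst:
print(cache_needed_mb(1000, 400, 100_000))  # 60000.0, i.e. ~60 GB
```

Which matches the observed behaviour: once the hardware caches and RAM (well short of 60 GB) fill up, throughput falls back to the raw disk write speed.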

If I use a 10G fiber or copper link, then 1GB/second is my max calculated LAN speed. I get about 4/10ths of that speed now with 4× 1TB WD Black HDDs. When I upgrade to at least 3 FAST SSDs I can easily obtain write speeds well in excess of the LAN speed and thus be a "Happy Camper"!!!

Sooo, because of all the research and checking and help from y’all with my (ignorance showing) noob questions, I’ve come to a completely new understanding of what I want and what is available.

In fact, after analyzing what I need, the truth is a dual FAST SSD setup of say 2TB would be fine, since the only time I would really bottleneck the NAS would be during full-directory or full-disk copies from other backups to a share. Since that situation only arises during initial setup, I will likely not have to worry in the future.

NEW PLAN is to have a “Less Fragile” RAID-10 FAST SSD setup in front and RAID-1(s) in the back.

My focus now is on the fact that, in reality, data failures happen at the hardware "DISK" level, and I would prefer an easy way to correct a failure. If I have 4 disks in RAID-1 and one fails, figuring out which disk failed and how to replace it could be more problematic than having 2× 2-disk RAID-1 setups…
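On spotting which disk is failing: `btrfs device stats <mountpoint>` keeps per-device error counters, and a small parser (my own sketch; `failing_devices` is not a real tool, though the line format shown is the standard one) can flag any device with non-zero counters:

```python
# Parse `btrfs device stats` output and report devices showing any
# non-zero error counter. The sample text mimics the real format.

def failing_devices(stats_text):
    """Return the set of device paths with non-zero error counters."""
    bad = set()
    for line in stats_text.splitlines():
        # Lines look like: "[/dev/sda].write_io_errs   0"
        if not line.startswith("["):
            continue
        dev, rest = line.split("].", 1)
        if int(rest.split()[-1]) != 0:
            bad.add(dev[1:])  # strip the leading "["
    return bad

sample = """\
[/dev/sda].write_io_errs    0
[/dev/sda].read_io_errs     0
[/dev/sdb].write_io_errs    12
[/dev/sdb].corruption_errs  3
"""
print(failing_devices(sample))  # {'/dev/sdb'}
```

On a live pool you would feed it the real `btrfs device stats` output, and `btrfs replace start` handles the actual disk swap once the culprit is known.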


