I wanted to know your opinion on using Rockstor with RAID 0 on SSDs. The SSDs in question are WD Red 4TBs.
I’m curious about this since I read that RAID (Redundant Array of Inexpensive Disks) is mostly meant for HDDs, as those tend to fail unpredictably and are prone to mechanical wear.
I also read that SSDs nowadays are sturdy little fellas that can withstand a lot of writes before becoming read-only, and that even after they go read-only the data can still be recovered from them. And WD Reds are fairly expensive compared to other drives, so they don’t exactly fall under the “Inexpensive” part of “Inexpensive Disks”. With that in mind, I was wondering what you thought of setting up 4x WD Red 4TB SSDs in RAID 0, scrubbing the drives weekly, or even every 3 or 4 days since reads are ‘free’ on SSDs, to correct any corrupted data that might come along, together with using compression on the drives to minimize write amplification.
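Roughly what I have in mind on the btrfs side, just as a sketch (the mount point /mnt2/main_pool is a placeholder, and I know Rockstor exposes compression and scheduled scrubs through its web UI anyway):

```
# Remount the pool with transparent zstd compression
# (only affects data written from this point on).
mount -o remount,compress=zstd /mnt2/main_pool

# Kick off a scrub by hand and check its progress.
btrfs scrub start /mnt2/main_pool
btrfs scrub status /mnt2/main_pool
```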
@jpff, I’m sure @Tex1954 can help you out with some viewpoints on SSDs, their durability, and their speed benefits, as he’s been burning through a few of them with various experiments, for example this one:
Fundamentally, as long as you accept the risk of still losing something (which can happen even with old style HDDs obviously) and have a backup strategy in place, irrespective of hardware approach, then this can work. The questions I would have:
Can your server motherboard and network components handle it (e.g. 10Gbps network adapters, etc.)? Otherwise you will not be taking full advantage of what RAID 0 might bring you in terms of speed, since e.g. your network will always be saturated before the read/write capacity of your drives is reached (rough numbers below).
What is the use case that you have (if you’re inclined to share) that would require a massive increase in storage performance (e.g. video editing storage, game server, etc.)?
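On rough numbers: gigabit Ethernet tops out around 110–120 MB/s in practice, a single SATA SSD reads at roughly 500–550 MB/s, and four of them striped could in theory approach 2 GB/s sequentially, which is more than even a 10Gbps link (about 1.25 GB/s theoretical) can carry.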
I was thinking more of taking advantage of the full capacity of the drives rather than performance. And since SSDs go read-only when they hit their maximum write cycles, I could just copy everything to an external drive and swap the drives in the array for new ones, though I’d do that before any drive actually hit that stage. That’s going to take a long time, I’m assuming. I’ll also keep 20% of the RAID capacity unallocated so the controller in each SSD has more room to do its job.
Since RAID 0 splits the data across the multiple drives, less data is also written to each one, reducing wear from write amplification, I’m assuming.
I don’t think my server has a 10Gbps NIC, but it has a PCIe slot that I can use to add one!
So more than speed I’m looking for capacity, and for something that can last longer than an average HDD while not being hammered the way an HDD is by scrubs looking for errors.
Got it. Still, if it’s data you can’t afford to lose, you want a backup strategy, since if the controller fails you could face catastrophic data loss.
Scrubbing on RAID0 will only detect errors but in most cases won’t be able to correct them, since you don’t really have redundancy. With that said, something like a weekly scrub will help you identify issues earlier.
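(For illustration only: Rockstor can schedule scrubs from its web UI, which is the nicer route, but the equivalent plain cron entry would look something like this; the mount point is a placeholder and the btrfs binary path may differ on your distro.)

```
# /etc/cron.d/btrfs-scrub: every Sunday at 03:00, scrub the pool
# and wait for it to finish (-B = run in the foreground).
0 3 * * 0  root  /usr/sbin/btrfs scrub start -B /mnt2/main_pool
```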
I have my data on my desktop, as well as duplicated on an external drive and archived on M-Discs.
And you’re absolutely correct, I missed that… I need redundancy to actively correct detected errors. For some reason I had the idea that Reed-Solomon error recovery codes were embedded with the data in btrfs.
Which RAID level would you advocate to minimize write amplification on SSDs?
I read RAID10 was a really good option for that, but I could be wrong. I’m also curious about RAID1 and RAID6 for 4 drives.
I think RAID6 gives you the most redundancy of the classic profiles, but it comes with other drawbacks on btrfs, and with only 4 drives it doesn’t actually yield more usable space than RAID1 or RAID10. RAID1 or RAID1c3 could work for you, especially since you have your data backed up twice or more. RAID10, like RAID1, keeps two copies of everything, so it also only tolerates a single drive failure, but it could be the most performant (though, again, with SSDs it might not matter, especially if you value capacity over raw performance). Personally, I am currently using the raid10-1c3 profile available on Rockstor: for my purposes it seems to be the best balance between capacity and redundancy. But it’s also on spinning HDDs, so I am not expecting lightning-fast performance either. As usual, I am sure there are different opinions on this forum around that.
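To put rough numbers on the capacity side for 4x 4TB drives (assuming equal-size drives and ignoring metadata overhead): RAID1 and RAID10 each give about 8TB usable, RAID1c3 about 5.3TB, RAID1c4 about 4TB, and btrfs RAID6 on four drives also about 8TB, since two drives’ worth of space goes to parity.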
I just read about RAID levels in Rockstor’s docs; the consensus amongst the btrfs community is that btrfs RAID6 isn’t production ready, so I’ll avoid RAID6 with btrfs (it would be a different story with ZFS). So I’d have to go for either RAID1 or RAID10. From what I understood, RAID10 seems to be the most balanced for reads and writes, with striping and mirroring across the drives.
I’ll be reading more about these profiles, like raid1c3 & raid1c4, in the btrfs docs.
The NAS will not be used for any databases or data that will be constantly written; it will mostly hold forever-incremental backups of the home’s computers. It’s more a place that I know will hold my data for a long time, proactively fixing data corruption as it happens, unlike the external drive and my desktop drives.
My server’s embedded controller can do RAID10; are there any benefits to doing hardware RAID over software RAID?
The risk of RAID5 or 6 is greatly reduced if you have a UPS, but yes, the development team does have a warning there not to use it in production.
I assume you have already taken a gander through @phillxnet’s updated documentation in the Rockstor docs?
Typically, the recommendation is not to put btrfs on top of a hardware RAID. The reason is that the hardware RAID can mask issues from btrfs, so the recovery options are greatly reduced. When you look around, the combination of hardware and software RAID is usually considered counterproductive; I’m sure there are use cases where it could make sense, but not in general. Also see the comment here in the docs:
And likely I won’t be able to trigger a scrub whenever I want, as that is implemented in the hardware RAID itself, if it’s implemented at all.
I read the docs regarding the pools and RAID levels; it seems that raid1c3 & raid1c4 have more redundancy, and therefore less available space in the pool, I assume.
I think I’ll stick to RAID10. From what I read, it seems to be the most appropriate for SSDs, since it has less write amplification.
Here are a few thoughts for you on BTRFS RAID:
BTRFS RAID definitely has an advantage regarding data integrity, as it uses checksums for everything and has access to all the drives.
The main reason for hardware RAID controllers used to be performance, which, given today’s abundance of CPU computing capacity, is no longer a limitation for software RAID.
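Purely to illustrate that “access to all drives” point, here is a sketch of creating one btrfs pool directly on the raw disks (device names, label and mount point are placeholders; in Rockstor you would do this from the web UI rather than by hand):

```
# Data striped and mirrored (raid10), metadata kept in three copies (raid1c3),
# so btrfs can detect and repair errors on each individual device.
mkfs.btrfs -L nas_pool -d raid10 -m raid1c3 /dev/sda /dev/sdb /dev/sdc /dev/sdd

# Mounting any member device brings up the whole pool; then check
# how space is allocated per profile and per device.
mkdir -p /mnt2/nas_pool
mount /dev/sda /mnt2/nas_pool
btrfs filesystem usage /mnt2/nas_pool
```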
BTRFS RAID 1 vs RAID 10
RAID 1 and RAID 10 offer the same level of redundancy: every piece of data is stored 2 times in total. RAID 10 additionally uses striping, which offers higher read and write speeds in cases where the disk is the bottleneck.
Note that on BTRFS, RAID 10 can also be created with only 2 drives.
As you said you are looking for capacity, I would actually recommend the “simpler” option of RAID 1, as it has no disadvantage regarding redundancy or storage capacity.
I don’t know exactly what you mean, but the write amplification of RAID 1 and RAID 10 should be exactly the same (x2). RAID 6 on 4 disks would also have the same write amplification; RAID 5 on 4 disks less (roughly x1.33, since each full stripe is 3 data chunks plus 1 parity chunk); and RAID 1c3 and 1c4 of course more, with x3 and x4 respectively.
BTRFS RAID 1c3 / 1c4
As you correctly said, they reduce the available space but increase the redundancy.
Depending on your use case, you could also start with e.g. RAID 1c3, and when you approach the storage limit of your pool, you can convert to RAID 1 at any later point.
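That later conversion is a single (long-running) balance with convert filters; a sketch, again with a placeholder mount point (I believe Rockstor can also drive this from the pool resize screen):

```
# Convert both data and metadata from raid1c3 down to raid1.
# Every chunk gets rewritten, so this can take a long time on a big pool;
# --bg detaches it so it keeps running in the background.
btrfs balance start --bg -dconvert=raid1 -mconvert=raid1 /mnt2/main_pool

# Check on progress (or pause/cancel) while it runs.
btrfs balance status /mnt2/main_pool
```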