Rebalancing from raid5 to raid1

Not sure if it’s due to snapshots but it seems re-balancing a raid5 to raid1 might have been a bad idea.

2x 3TB drives + 1x 2TB drive. So far it’s been going on for about a week and is at 83% remaining.
It’s also why the FS wouldn’t mount on a reboot and why I thought the FS was buggered; it turns out it did actually mount, it just took a very long time.
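
For what it’s worth, a quick back-of-the-envelope extrapolation from those numbers (a week elapsed, 83% still to go, and assuming the pace stays constant, which it may well not) comes out at roughly a month left:

```python
# Rough ETA for the balance, assuming a constant pace (a big assumption).
elapsed_days = 7.0          # about a week so far
remaining_fraction = 0.83   # 83% still remaining

rate_per_day = (1.0 - remaining_fraction) / elapsed_days   # ~2.4% per day
eta_days = remaining_fraction / rate_per_day               # ~34 days left
print(f"Estimated time remaining: {eta_days:.0f} days")
```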

@Dragon2611 Oh dear. Thanks for the update, so only 34 days to go then. There have been reports of very long processing times for raid5; I think the snapshots will definitely slow things up, as they’re just more info to process, but I’m not entirely sure whether it’s worth the risk of deleting what you can of them given the known instability of raid5. Also, pausing the balance, deleting what you can snapshot-wise, then resuming the balance is another option. Or just wait it out, of course. Rather a poor show, but it is a well-known issue among the btrfs developers, going by the linux-btrfs mailing list; I expect in the future it will seem a little more comical than it does currently.
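
If you do try the pause / prune / resume route, the sequence is roughly the following; this is only a minimal sketch, assuming a hypothetical pool mounted at /mnt2/mypool with snapshots under /mnt2/mypool/.snapshots, not Rockstor’s actual layout:

```python
import subprocess

POOL = "/mnt2/mypool"              # hypothetical pool mount point
SNAP_DIR = f"{POOL}/.snapshots"    # hypothetical snapshot location

def run(*cmd):
    """Echo then run a command, failing loudly on error."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Pause the running balance so the snapshot deletions aren't competing with it.
run("btrfs", "balance", "pause", POOL)

# 2. Delete whichever snapshots can be spared (names here are illustrative only).
for snap in ["daily-001", "daily-002"]:
    run("btrfs", "subvolume", "delete", f"{SNAP_DIR}/{snap}")

# 3. Resume the balance once the pruning is done.
run("btrfs", "balance", "resume", POOL)
```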

This is a tricky call, as if you have this data nowhere else then your best bet is to simply wait it out. Is the file system otherwise also in use, i.e. are you currently changing its contents beyond that of the ongoing slow balance? Also note that there is a known memory leak with raid5 balance (though that is from memory :slight_smile:), but this probably relates to the number of operations / fs size, so you may well be OK.

How many snapshots do you have?

There are 72 snapshots, as it’s set to take them automatically and then delete the oldest one. Whilst I do also back up most of the data, the snapshots were in place to allow a roll-back if one of the crypto viruses managed to get RW access to the shares (the snapshots, of course, are set not to be mounted/available).
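
For reference, the rotate-and-prune scheme amounts to roughly the following; this is only a sketch with made-up paths and naming, not what Rockstor’s scheduler actually runs:

```python
import glob
import subprocess
from datetime import datetime

SHARE = "/mnt2/mypool/media"           # hypothetical share to protect
SNAP_DIR = "/mnt2/mypool/.snapshots"   # hypothetical snapshot directory (not exported)
KEEP = 72                              # retention count, as above

def run(*cmd):
    subprocess.run(cmd, check=True)

# Take a new read-only snapshot; read-only means something with RW access to the
# share (e.g. a crypto virus on a client) still can't touch the snapshot contents.
stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
run("btrfs", "subvolume", "snapshot", "-r", SHARE, f"{SNAP_DIR}/media-{stamp}")

# Prune: the timestamped names sort chronologically, so keep the newest KEEP
# snapshots and delete the surplus oldest ones.
snapshots = sorted(glob.glob(f"{SNAP_DIR}/media-*"))
for old in snapshots[:-KEEP]:
    run("btrfs", "subvolume", "delete", old)
```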

Important data (the backups of my PCs) is 3-way replicated with Syncthing AND then backed up to Hubic as well; slightly less critical data is backed up to Hubic; and then there’s other data (mostly media) that isn’t backed up at all, because it’s simply not worth worrying about.

Edit:

Thinking of replacing Hubic with Backblaze B2, as Hubic seems rather unreliable in terms of OAuth randomly deciding not to authenticate, and there’s also their 10Mbit/s throttle.

Edit2:

It seems to be btrfs-transaction that sits there hitting 100% of a core most of the time, which I suspect is it trying to process its way through the snapshots. The CPU is a C2750 with currently 2 cores allocated to Rockstor, so the individual cores are pretty slow.
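
A quick way to confirm it really is one btrfs worker pegging a single core (rather than general load) is to sample the kernel threads, e.g. with psutil; note the kernel thread name shows up truncated as btrfs-transacti:

```python
import time
import psutil  # third-party: pip install psutil

# Sample CPU use of the btrfs kernel worker threads over a few seconds.
workers = [p for p in psutil.process_iter(['name'])
           if p.info['name'] and p.info['name'].startswith('btrfs')]

for p in workers:
    p.cpu_percent(None)   # prime the per-process counter
time.sleep(5)
for p in workers:
    print(f"{p.info['name']:20s} {p.cpu_percent(None):5.1f}% of one core")
```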

@Dragon2611 Nice arrangement. So it looks like you’ve identified the bottleneck. I must say 2 of 8 cores is a little mean there, but then it’s not very parallelised anyway just yet. There are also the additional calculations required to manage both the regular CoW checksums and the parity components of raid5. Still, you are heading in the right direction raid1-wise. Yes, 72 snapshots isn’t that many, but then those cores are only little.

Ultimately, the more data and snapshots there are, the more time this one will take; but additional activity also brings additional risk.

Usually the Rockstor VM doesn’t need to be able to read/write quickly, as it’s remote (a colo’d box), so the bottleneck is the WAN.

After all, who cares if you only get 50MB/s writes if the data is only coming in at about 2MB/s :wink:

I do want to see what the pricing of the new ASRock ITX board with the embedded E3 is going to be, as I might shove that alongside the existing box to take some of the load off. The C2750-D4I board was picked specifically as it had a lot of SATA I/O for an ITX board.