I’m looking to virtualize an instance of RockStor. I have a server running CentOS 7 with KVM; the OS is on 2 mirrored drives via the hardware RAID controller, and the rest of the drives are configured as individual drives in a RAID-Z1 ZFS pool for the VMs and datastore (rough layout sketch at the bottom of this post). There will be multiple VMs on this host, a mix of Linux and Windows Server, and I want to use ZFS for RAID redundancy to protect all of the VMs. My questions are:
Are there any issues running RockStor and BTRFS on top of ZFS in a virtual environment?
If a drive fails, it will be rebuilt on the KVM host (by ZFS). Should I use just a single large virtual drive for RockStor (and depend on the underlying ZFS RAID for redundancy), or should I create multiple virtual drives and RAID them together with BTRFS as well?
Are there any issues I’m missing with regard to recovery from a failed drive? My thinking is that with RAID-Z1, a failed drive shouldn’t affect the operation of the RockStor VM while the pool rebuilds (resilvers), correct?
Are there any operational issues, or configurations I should disable (e.g., with FreeNAS and ZFS, scrubbing should be disabled in a virtual environment without PCI passthrough, etc.)?
I’m mostly looking for best practices in this situation, and want to avoid any performance issues that may be introduced in this type of environment.
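For reference, here’s roughly how I’m laying things out on the KVM host. The pool, device, and VM names below are just examples, not my actual config:

```bash
# Build the RAID-Z1 pool from the non-OS drives (example device names)
zpool create vmpool raidz1 /dev/sdc /dev/sdd /dev/sde /dev/sdf

# Carve out a zvol to hand to the RockStor guest as one big virtual disk
zfs create -V 2T vmpool/rockstor-data

# Attach it to the RockStor VM as a virtio disk
virsh attach-disk rockstor /dev/zvol/vmpool/rockstor-data vdb --targetbus virtio --persistent
```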
You could do that, but it would be very similar to running BTRFS on hardware RAID. The self-healing provided by ZFS is not aware of BTRFS-level constructs, so you can’t really take full advantage of ZFS’s self-healing benefits. And not using BTRFS’s own redundancy obviously means you are choosing to give up some of those same benefits at the BTRFS layer.
A BTRFS scrub tries to automatically repair blocks that fail checksum verification by fetching a good copy. But if you don’t use BTRFS redundancy, there is no copy to fetch for the repair, so a scrub can detect corruption but not fix it, which makes it largely useless.
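To make that concrete: on a single-virtual-disk pool, the data will be in the `single` profile, i.e. only one copy of each data block at the BTRFS level, so a scrub can only report checksum failures as uncorrectable. A quick sketch (the mount point is just an example):

```bash
# Confirm the data profile; "Data, single" means one copy only at the BTRFS level
btrfs filesystem df /mnt2/mypool

# Run a scrub in the foreground and wait for it to finish
btrfs scrub start -B /mnt2/mypool

# Review the results; with no BTRFS-level data redundancy, checksum failures
# are reported as uncorrectable rather than repaired
btrfs scrub status /mnt2/mypool
```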
Suman, thanks for your replies! I think I will try this setup with just one large virtual drive for RockStor and give it some test runs. I’ll throw some data on there, then pull drives and replace them to simulate a failure, and probably do some large writes and pull the plug in the middle, etc. While the self-healing benefits of ZFS won’t be available to the BTRFS filesystem itself, ZFS should still be able to take care of any corruption that happens to the actual virtual disk file. My one concern is that if ZFS is trying to fix that corruption at the same time BTRFS is trying to fix its own filesystem errors, it might just hose the BTRFS volume; but (I think) that’s another reason it would be better to use one large virtual disk instead of a few in a BTRFS RAID. Maybe I’m wrong.
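Roughly what I’m planning to run on the KVM host for the failed-drive test; the pool and device names are just placeholders for whatever I actually pull:

```bash
# Simulate a failure by taking one pool member offline
zpool offline vmpool /dev/sde

# Pool should keep running degraded, and the RockStor VM should stay up
zpool status vmpool

# Swap in a replacement drive and let the resilver rebuild it
zpool replace vmpool /dev/sde /dev/sdg
zpool status vmpool    # watch resilver progress
```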
I’ll post back with what I find and how badly my tests mangle the data I put on there.