Single SMR Drive Going Offline

Hello everyone. First of all, let me say I’m really enjoying the OS. I have been wanting to move away from a hardware-based array to a software-based one for a while now, and it seems like Rockstor is just what I need.

However, I’m having some problems with an 8TB Seagate Archive SMR HDD. It’s the only HDD in the system, since I want it to be my test HDD: all the data from my current RAID will be delivered onto it before those drives are wiped and placed in the Rockstor appliance. Rockstor itself is running on a 16GB SanDisk USB 3 flash drive.

The problem I’m having is that the HDD constantly goes offline, requiring a cold start to bring it back: a quick power reset and a rescan under Disks and it reappears. This is, of course, quite inconvenient, and since it happens even while copying data it makes the task itself a nightmare. The drive is not even visible in lsblk when this happens, although it is physically still spinning. The disk name under the Disks tab also becomes a random ID instead of sdb.

On a Windows machine, with the drive formatted as NTFS, it works just fine. I’m reattempting a test to write 1TB of data to it under Windows/NTFS (under Rockstor it goes offline way before that), and so far it has been fine; it has a little over 7 hours left since the load is a multitude of different files. An earlier 80GB test copy on Rockstor was also fine.

This leads me to think that this is some shortcoming in btrfs’s SMR support, but I stand to be corrected since I’m still quite green in this world, so any help would be appreciated. If it’s entirely a btrfs issue, I’ll have to rethink my backup strategy.

Complete system specs should they be required:

  • ASRock H61M-VG4
  • Intel Celeron G1620
  • 4GB Crucial Ballistix Sport 1866MHz
  • 16GB SanDisk Ultra Fit USB3
  • 8TB Seagate Archive SMR HDD
  • 500W EZCool PSU

Anything showing up in the kernel log/dmesg to suggest that the drive is dropping? It almost sounds like the SATA controller is losing the drive.
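If it helps, a harmless way to pull any SATA/disk-related lines out of the kernel log (the grep patterns, and any device name like sdb, are just examples; adjust for your system):

```shell
# Filter the kernel ring buffer for SATA link events and disk errors.
# Patterns are examples, not exhaustive; "|| true" keeps the pipeline
# from erroring out when nothing matches or dmesg is restricted.
dmesg 2>/dev/null | grep -iE 'ata[0-9]|sd[a-z]|link (down|up)|hard resetting' || true
```

A controller losing the drive typically shows up as repeated "hard resetting link" and "link down" lines for the drive’s ata port.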

Hi Dragon2611, this is the result I get in dmesg from the start of the copy until the point of failure.

Under Windows it did complete the whole 1TB copy I gave it, and it was still active for at least 12 hours after that.

@N-Gen Welcome to the Rockstor community. My main concern is the SMR drive. Please see this post, which details the concerns around btrfs on these drives. The short of it is that they are known to the btrfs developers and there is work yet to be done to support their unique way of working.

The random ID as a disk / dev name is what Rockstor gives to “missing” drives, which is consistent with your finding that it drops out of lsblk visibility, i.e. it no longer appears to be connected.
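A quick way to confirm that from the command line, should it happen again (standard lsblk columns; the serial helps map names to physical disks):

```shell
# List block devices with model and serial; a dropped drive simply
# won't appear here. "|| true" keeps this safe on restricted systems.
lsblk -o NAME,SIZE,MODEL,SERIAL 2>/dev/null || true
```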

Hope that helps, but it looks like you are going to have to consider using another drive, or better still several other drives; you will at least then gain redundancy, which is a key feature of btrfs. These SMR drives are very different from all that have come before them and need special treatment from the underlying OS. I would say the shortcoming is more in their offloading of responsibility onto the connected OS. Not a fan myself, as they seem like too much of a hack / push of the technology.

Hi phillxnet, I have been through that post a couple of times yesterday, trying to determine at which end of the spectrum the problem might be and whether the drive might be faulty. This drive is a sort of transitory drive where my current array data (under hardware RAID) will reside while I migrate those 4 disks into the Rockstor appliance, so it isn’t a big deal for it not to work under btrfs.

The purpose for it after the RAID migration would have been to periodically back up the RAID data itself to this disk as an extra layer of local backup, but I might as well put it in an enclosure and attach it to my main router so I can periodically rsync out of the Rockstor appliance into that storage. It seems that even under Windows only NTFS is supported for the SMR disk. I’m definitely not going to bin the disk, but I am a bit disappointed by the whole SMR idea at the moment if only NTFS works fine. It did work fine on ZFS, but one drive does not justify me sticking with ZFS, especially given the resource cost.
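As a sketch of that periodic rsync, assuming a hypothetical share path and backup host (neither is from my actual setup):

```shell
# Mirror a Rockstor share to external storage over SSH.
# /mnt2/myshare and backup-host are hypothetical placeholders;
# "|| true" only keeps this sketch exiting cleanly if the host is down.
rsync -aH --delete /mnt2/myshare/ backup-host:/mnt/archive/myshare/ 2>/dev/null || true
```

Dropped into a cron job, that would give the extra local backup layer without the drive ever needing to sit inside the btrfs pool.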

I have a few other small drives I can test just to make sure there’s nothing wrong with the board itself before proceeding with the migration completely.