Pool Device Errors Alert

I started having trouble when I uploaded some files to my Rockstor box: the upload went super slow, then dropped out, then reconnected, and so on. So I logged into the Rockstor box and there was an error in red:
"Pool Device Errors Alert"
I went back into FTP and I couldn't delete:
Error: rmdir /mnt2/*******: received failure with description ‘Failure’

There is about 100GB free on a 5TB pool.

Can anyone shed some light on what's going on? Could my drives be failing? I can't see any info about this on the drives page.

@jakeyg Hello there.

Take a look at the Pool details page; that should give more info on what’s going on.

Hope that helps.

Does that mean one of my disks is dying?

Hi,

I know @phillxnet will most likely be able to give you more help, but in the meantime, you should be able to see info on the type of errors by looking at the details page for the pool in question. Just click on its name and you should see a table with the number of errors if any.

Hope this helps,

Thanks flox,
appreciate your help. I see that there are some I/O errors on the one drive. Does this mean the drive is dying?

@jakeyg Hello again,

Pretty much, yes, although a faulty or loose SATA cable can also cause this. Btrfs keeps a tally of the times it has had trouble writing and reading. Depending on use you may be reading more often, hence more read errors. That’s pretty simplified, and there are a few layers below btrfs, but this does look like it could be a failing drive.
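If you have shell access, the same tally can be read with `btrfs device stats` against the pool's mount point. Below is a minimal Python sketch that parses that command's usual `[device].counter value` output and keeps only the devices with non-zero counters. The sample text and the function names are made up for illustration and are not taken from the pool in this thread.

```python
import re
import subprocess
from collections import defaultdict

# Matches lines such as "[/dev/sdb].read_io_errs    12"
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def parse_device_stats(text):
    """Return {device: {counter: value}} from `btrfs device stats` output."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[match["dev"]][match["counter"]] = int(match["value"])
    return dict(stats)


def devices_with_errors(stats):
    """Keep only devices that have at least one non-zero counter."""
    return {
        dev: {name: value for name, value in counters.items() if value}
        for dev, counters in stats.items()
        if any(counters.values())
    }


def read_device_stats(mount_point):
    """Run `btrfs device stats` on a mounted pool (usually needs root)."""
    result = subprocess.run(
        ["btrfs", "device", "stats", mount_point],
        capture_output=True, text=True, check=True,
    )
    return parse_device_stats(result.stdout)


if __name__ == "__main__":
    # Hypothetical sample output, for illustration only.
    sample = (
        "[/dev/sda].write_io_errs    0\n"
        "[/dev/sda].read_io_errs     0\n"
        "[/dev/sdb].write_io_errs    3\n"
        "[/dev/sdb].read_io_errs     12\n"
    )
    print(devices_with_errors(parse_device_stats(sample)))
```

On a real system you would call `read_device_stats()` with your own pool's mount point instead of using the sample text.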

You could visit this drive’s S.M.A.R.T page within the Rockstor Web-UI and refresh it to get more info from another angle: from the drive itself in this case. See the following doc entry for where these pages are:

S.M.A.R.T

Do be sure to click the “Refresh” button on that page to get the latest info. If you have never visited these pages then they will initially be blank until you click that refresh button. You could compare the two drives’ S.M.A.R.T info, but given that btrfs lists issues with only one drive, it does look to be drive/cable related.
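For anyone who prefers the command line, the same comparison can be roughed out from `smartctl -A` output. The sketch below is only an illustration: it assumes the usual ATA attribute table layout (ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value), and the watched attribute names are common ones that not every drive reports. A rising `UDMA_CRC_Error_Count` tends to point at the cable or connection rather than the disk surface. The function names are hypothetical.

```python
import subprocess

# Attributes commonly checked for a failing drive or a bad cable (an assumption:
# names vary by vendor and some drives do not report all of them).
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` table rows."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def compare_drives(attrs_a, attrs_b, names=WATCHED):
    """Return {name: (value_a, value_b)} for watched attributes that differ."""
    return {
        name: (attrs_a.get(name), attrs_b.get(name))
        for name in names
        if attrs_a.get(name) != attrs_b.get(name)
    }


def read_smart_attributes(device):
    """Run `smartctl -A` on a device such as /dev/sda (usually needs root)."""
    # smartctl's exit status is a bitmask that can be non-zero even when it
    # printed valid output, so check=True is deliberately not used here.
    result = subprocess.run(["smartctl", "-A", device], capture_output=True, text=True)
    return parse_smart_attributes(result.stdout)
```

Calling `compare_drives(read_smart_attributes("/dev/sda"), read_smart_attributes("/dev/sdb"))` would then list only the watched counters where the two drives disagree; the device names here are placeholders.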

I see that in this case you have configured your pool to be btrfs ‘single’. That is ‘all space and no redundancy’: there is only a single copy of the data, so btrfs has no means of repair. When these read / write errors happened it had no ‘fall back’ to a second copy of any data. It also means that if the drives couldn’t ‘get around’ these errors your pool may be toast. But then the pool is also very nearly full, so presumably space was more important than data safety / self healing capability. I would look to the integrity of your backups as soon as possible in this scenario, if relevant.
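To put rough numbers on that space versus safety trade-off, here is a small Python sketch comparing idealized usable capacity for btrfs ‘single’ and raid1 (two copies of everything, each on a different device). The disk sizes are hypothetical, the function names are made up, and metadata overhead is ignored, so treat the results as ballpark figures only.

```python
def usable_single(sizes_gb):
    """'single' keeps one copy of the data, so all the raw space is usable."""
    return sum(sizes_gb)


def usable_raid1(sizes_gb):
    """raid1 keeps two copies, each on a different device.

    Usable space is at most half the total, and the largest disk can only be
    filled as far as the other disks can hold the matching second copies.
    """
    if len(sizes_gb) < 2:
        return 0
    total = sum(sizes_gb)
    return min(total // 2, total - max(sizes_gb))


if __name__ == "__main__":
    disks = [2000, 3000]  # hypothetical sizes in GB, not the poster's drives
    print("single:", usable_single(disks), "GB")
    print("raid1: ", usable_raid1(disks), "GB")
```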

Also note that on this page the pool is indicated as being ro (read-only)! Btrfs, when in distress, will turn a pool read-only to prevent further ‘damage’. It is likely that this pool, using the ‘single’ raid level, had no ability to self heal, and these disk/cable errors have caused it to go ro to protect what’s left. Hence the backup suggestion above. When the data is more important than the available space you are best advised to use btrfs raid levels of 1 or 10. See our Pools doc section for a rough breakdown of the btrfs raid levels.
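The read-only state can also be confirmed from a shell by looking at the mount options in `/proc/mounts`. This is a minimal Python sketch of that check; the mount point `/mnt2/mypool` and the function names are hypothetical, so substitute your own pool's mount point.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point, or None if it is not mounted.

    Each /proc/mounts line is: device mount_point fstype options dump pass.
    If a path appears more than once, the last entry is the one in effect.
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text, mount_point):
    """True if mount_point is mounted with the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


if __name__ == "__main__":
    with open("/proc/mounts") as handle:
        text = handle.read()
    # Hypothetical pool mount point; replace with your own.
    print(is_read_only(text, "/mnt2/mypool"))
```

A pool that was mounted rw and later shows ro here without anyone remounting it is a sign that btrfs hit errors it could not work around.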

Keep us posted on how this goes.
