I am running Rockstor 3.8.16-16 as a guest inside VMware ESXi 6.5.0. I am seeing the following errors in dmesg:
[608715.636355] BTRFS warning (device sdb): csum failed ino 21521 off 176128 csum 2924415985 expected csum 2920221681
[608715.636495] BTRFS warning (device sdb): csum failed ino 21521 off 176128 csum 2924415985 expected csum 2920221681
[612830.374636] BTRFS warning (device sdb): checksum error at logical 150460334080 on dev /dev/sdb, sector 291770688, root 257, inode 17671, offset 170033152, length 4096, links 1 (path: <filename> )
[612830.374644] BTRFS error (device sdb): bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[612830.375537] BTRFS error (device sdb): unable to fixup (regular) error at logical 150460334080 on dev /dev/sdb
[619025.297794] BTRFS warning (device sdb): checksum error at logical 790130016256 on dev /dev/sdb, sector 1541125536, root 741, inode 21521, offset 176128, length 4096, links 1 (path: doop.sparsebundle/bands/15cd)
[619025.297805] BTRFS error (device sdb): bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
[619025.297851] BTRFS error (device sdb): unable to fixup (regular) error at logical 790130016256 on dev /dev/sdb
I’m not sure how to proceed in debugging this. What other info do you need from me to investigate? I looked in the VMware logs, but didn’t see anything for the drive this resides on.
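One way to get an overview of which files are affected is to group the checksum messages by inode. The snippet below is only a rough sketch: the regular expressions are written against the two message shapes shown in the log above, the function name `summarize` is made up for illustration, and grouping by inode number alone assumes the hits come from a single subvolume (inode numbers are per subvolume, and the "csum failed" lines do not print the root).

```python
import re
from collections import defaultdict

# Scrub-time message: "checksum error at logical ... (path: ...)"
SCRUB_ERR = re.compile(
    r"checksum error at logical (?P<logical>\d+) on dev (?P<dev>\S+), "
    r"sector (?P<sector>\d+), root (?P<root>\d+), inode (?P<inode>\d+), "
    r"offset (?P<offset>\d+), length (?P<length>\d+), links \d+ "
    r"\(path: (?P<path>.*?)\s*\)\s*$"
)
# Read-time message: "csum failed ino ... off ... csum ... expected csum ..."
READ_ERR = re.compile(
    r"csum failed ino (?P<inode>\d+) off (?P<offset>\d+) "
    r"csum (?P<found>\d+) expected csum (?P<expected>\d+)"
)


def summarize(lines):
    """Group checksum failures by inode (assumes a single subvolume)."""
    hits = defaultdict(lambda: {"paths": set(), "offsets": set()})
    for line in lines:
        m = SCRUB_ERR.search(line)
        if m:
            entry = hits[int(m["inode"])]
            entry["paths"].add(m["path"])
            entry["offsets"].add(int(m["offset"]))
            continue
        m = READ_ERR.search(line)
        if m:
            # Read-time lines carry no path, only inode and offset.
            hits[int(m["inode"])]["offsets"].add(int(m["offset"]))
    return dict(hits)


if __name__ == "__main__":
    import sys
    # Example: dmesg | python3 this_script.py
    for inode, info in sorted(summarize(sys.stdin).items()):
        print(inode, sorted(info["offsets"]), sorted(info["paths"]))
```

Fed the log above, this would tie the "csum failed ino 21521 off 176128" lines to the same file and offset that the later scrub message reports, which suggests one bad 4096-byte block in that file rather than several.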
The first thing I would do is kick off a scrub on your array. Definitely do that.
The second thing I would do is check (and post) all the smartctl values for all drives in the btrfs array.
We need more information about your array to diagnose this. The output of “btrfs fi show” would show us the basics, and the scrub result plus the smartctl values for all your drives would tell us the rest.
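For reference, the information requested above could be collected with something like the sketch below. It only assembles and runs standard `btrfs` and `smartctl` invocations. The helper names `diagnostic_commands` and `run_all` are made up for illustration, and the mount point and device list are placeholders you would have to fill in yourself (on Rockstor, pools are normally mounted under /mnt2/, but treat that as an assumption and check your own system).

```python
import shlex
import subprocess


def diagnostic_commands(mount_point, devices):
    """Build the list of diagnostic commands for one pool (hypothetical helper)."""
    cmds = [
        ["btrfs", "filesystem", "show"],
        # -B keeps the scrub in the foreground until it finishes.
        ["btrfs", "scrub", "start", "-B", mount_point],
        ["btrfs", "scrub", "status", mount_point],
        # Per-device error counters (write/read/flush/corruption/generation).
        ["btrfs", "device", "stats", mount_point],
    ]
    # Full SMART report for every member drive.
    cmds += [["smartctl", "-a", dev] for dev in devices]
    return cmds


def run_all(cmds):
    """Echo and run each command in order; needs root, and the scrub can take hours."""
    for cmd in cmds:
        print("$ " + shlex.join(cmd))
        subprocess.run(cmd, check=False)


if __name__ == "__main__":
    # Placeholder values, not taken from this thread.
    run_all(diagnostic_commands("/mnt2/your_pool", ["/dev/sdX"]))
```

Note that `smartctl` will only return useful data if the guest can actually see the physical drive, so inside a VM it may have to be run on the host instead.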
Label: 'rockstor_rockstor' uuid: a831e0c9-6245-47a8-8856-08844e9999da
Total devices 1 FS bytes used 3.49GiB
devid 1 size 17.51GiB used 6.04GiB path /dev/sda3
Label: 'UG1' uuid: add82f87-5f2f-4b02-bb27-75e47b0b9009
Total devices 1 FS bytes used 842.97GiB
devid 1 size 3.64TiB used 851.05GiB path /dev/sdb
This system is on a VMware ESXi 6.5.0 host, and sdb is a dedicated 4 TB SATA drive. SMART isn’t supported in the guest. On the host, here is the SMART output from VMware:
esxcli storage core device smart get -d vml.0100000000202020202057442d574343374b32415a37335a52574443205744
Parameter                     Value  Threshold  Worst
----------------------------  -----  ---------  -----
Health Status                 OK     N/A        N/A
Media Wearout Indicator       N/A    N/A        N/A
Write Error Count             0      0          N/A
Read Error Count              0      51         N/A
Power-on Hours                96     0          96
Power Cycle Count             11     0          N/A
Reallocated Sector Count      0      140        N/A
Raw Read Error Rate           0      51         N/A
Drive Temperature             35     0          N/A
Driver Rated Max Temperature  N/A    N/A        N/A
Write Sectors TOT Count       N/A    N/A        N/A
Read Sectors TOT Count        N/A    N/A        N/A
Initial Bad Block Count       N/A    N/A        N/A
# btrfs scrub status <dev>
scrub status for add82f87-5f2f-4b02-bb27-75e47b0b9009
scrub resumed at Mon Oct 2 10:10:43 2017 and finished after 02:27:32
total bytes scrubbed: 845.65GiB with 0 errors
Pool name: RAID6_VOLUME
Scrub Status: cancelled
Scrub Statistics
Attribute Value
ID 9
Start Time February 18th 2018, 2:52:52 pm
End Time February 18th 2018, 4:27:41 pm
Data Scrubbed 3.35 TB
Data Extents Scrubbed 61707837
Tree Extents Scrubbed 759296
Tree Bytes Scrubbed 12440305664
Read Errors 2336
Csum Errors 0
Verify Errors 0
No Csum 158016
Csum Discards 0
Super Errors 0
Malloc Errors 0
Uncorrectable Errors 2336
Unverified Errors 0
Corrected Errors 0
Last Physical 3700956856320
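A few things can be read straight off those numbers with simple arithmetic. The sketch below is illustrative only: the function names `parse_stats` and `parse_time` are made up, and the parsing assumes the "Attribute Value" layout and the date format exactly as pasted above.

```python
import re
from datetime import datetime


def parse_stats(text):
    """Collect the purely numeric attributes from a pasted stats listing."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"(.+?)\s+(\d+)$", line.strip())
        if m:
            stats[m.group(1)] = int(m.group(2))
    return stats


def parse_time(value):
    """Parse a timestamp such as 'February 18th 2018, 2:52:52 pm'."""
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)", r"\1", value)
    return datetime.strptime(cleaned, "%B %d %Y, %I:%M:%S %p")


def report(stats, start, end):
    """Print a few derived figures from the parsed statistics."""
    elapsed = parse_time(end) - parse_time(start)
    print("elapsed:", elapsed)
    print("read errors left uncorrected:",
          stats["Uncorrectable Errors"], "of", stats["Read Errors"])
    print("bytes per tree extent:",
          stats["Tree Bytes Scrubbed"] // stats["Tree Extents Scrubbed"])
```

With the values above, the run lasted 1:34:49 before it was cancelled, every one of the 2336 read errors was counted as uncorrectable (Corrected Errors is 0), and Tree Bytes Scrubbed divided by Tree Extents Scrubbed comes to exactly 16384, which matches the default btrfs node size of 16 KiB.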