Btrfs won't mount, says open_ctree failed

Here we go again :slight_smile:

Coming home from work today, I found that my Rockstor machine had crashed / frozen. The web page wouldn't open, and telnetting into it didn't work either.

So I did a hard reset, as it was completely unresponsive.

The system rebooted and came up with the prompt saying you could go to the web page.

But that still didn’t work, and dmesg gave some info. My pool “/mnt2/RSPool” hadn’t mounted:

[ 22.282894] BTRFS: device label RSPool devid 10 transid 481084 /dev/sdb
[ 22.283536] BTRFS info (device sdb): enabling auto defrag
[ 22.283536] BTRFS info (device sdb): use no compression
[ 22.283536] BTRFS info (device sdb): disk space caching is enabled
[ 22.283536] BTRFS info (device sdb): has skinny extents
[ 22.288284] BTRFS error (device sdb): failed to read the system array: -5
[ 22.305078] BTRFS error (device sdb): open_ctree failed

btrfs fi sh gives me this:

Label: ‘rockstor_rockstor’ uuid: af5bd5b1-c1d9-4773-b288-f3aaf2046dc2
Total devices 1 FS bytes used 3.03GiB
devid 1 size 12.93GiB used 4.53GiB path /dev/sdg3

Label: ‘RSPool’ uuid: 12bf3137-8df1-4d6b-bb42-f412e69e94a8
Total devices 6 FS bytes used 5.24TiB
devid 9 size 2.73TiB used 2.57TiB path /dev/sdf
devid 10 size 1.82TiB used 1.67TiB path /dev/sdb
devid 11 size 1.82TiB used 1.67TiB path /dev/sda
devid 12 size 1.82TiB used 1.67TiB path /dev/sdd
devid 13 size 2.73TiB used 2.57TiB path /dev/sde
devid 14 size 931.51GiB used 774.00GiB path /dev/sdc

Everything looks fine.

btrfs fi usage /mnt2/RSPool gives me this:

Overall:
Device size: 12.93GiB
Device allocated: 4.53GiB
Device unallocated: 8.40GiB
Device missing: 0.00B
Used: 3.03GiB
Free (estimated): 9.48GiB (min: 9.48GiB)
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 16.00MiB (used: 0.00B)

Data,single: Size:4.00GiB, Used:2.92GiB
/dev/sdg3 4.00GiB

Metadata,single: Size:512.00MiB, Used:110.06MiB
/dev/sdg3 512.00MiB

System,single: Size:32.00MiB, Used:16.00KiB
/dev/sdg3 32.00MiB

Unallocated:
/dev/sdg3 8.40GiB

Which is definitely not my RSPool, but rather my root drive (which makes sense in hindsight: with RSPool not mounted, /mnt2/RSPool is just an empty directory on the root filesystem, so usage reports the root pool).

Searching the net is not very helpful in these situations; much of the info is old and outdated.

Found this page somewhat useful:
http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html
But none of the commands allowed me to mount or fix anything (I have not tried to mount in degraded mode for now).
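(For reference, and only as a sketch since I have not actually run these yet: a non-destructive attempt would be a read-only mount with the degraded or backup-root options, e.g.

mount -o ro,degraded /dev/sdb /mnt2/RSPool
mount -o ro,usebackuproot /dev/sdb /mnt2/RSPool   # older kernels use -o recovery instead

The device name is just one member of the pool; any of them should do.)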

Going to this page:
https://btrfs.wiki.kernel.org/index.php/Btrfsck
I decided to run btrfs check /dev/sd(x) (no repair option).

It gives the following result (this is from /dev/sdb, but the results are the same for the other drives):
Checking filesystem on /dev/sdb
UUID: 12bf3137-8df1-4d6b-bb42-f412e69e94a8
checking extents
checking free space cache
cache and super generation don’t match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
checking quota groups
Counts for qgroup id: 0/5 are different
our: referenced 16384 referenced compressed 16384
disk: referenced 32768 referenced compressed 32768
diff: referenced -16384 referenced compressed -16384
our: exclusive 16384 exclusive compressed 16384
disk: exclusive 32768 exclusive compressed 32768
diff: exclusive -16384 exclusive compressed -16384
Counts for qgroup id: 0/8682 are different
our: referenced 16384 referenced compressed 16384
disk: referenced 32768 referenced compressed 32768
diff: referenced -16384 referenced compressed -16384
our: exclusive 16384 exclusive compressed 16384
disk: exclusive 32768 exclusive compressed 32768
diff: exclusive -16384 exclusive compressed -16384
found 5763908636672 bytes used, no error found
total csum bytes: 5621155700
total tree bytes: 6391808000
total fs tree bytes: 221315072
total extent tree bytes: 170819584
btree space waste bytes: 398949541
file data blocks allocated: 5775958544384
referenced 5756768088064

There do seem to be some qgroup counts that are different…
But nothing else is reported.

I'm tempted to just run a repair, but have decided to wait and ask around (starting here) about what to do next.
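(If it really is only the qgroup counters that are off, my understanding is that a quota rescan on the mounted pool is a non-destructive way to have them recomputed, roughly:

btrfs quota rescan /mnt2/RSPool
btrfs quota rescan -s /mnt2/RSPool   # check progress of the rescan

That obviously assumes I can get the pool mounted in the first place.)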

I don’t find btrfs particularly easy to work with in these regards :slight_smile:

I do still wonder why the system comes up without me being able to open the web interface, as the boot drive seems to be fine.

Hi!
Same behaviour for me, but after a failed disk replacement on a raid10 pool and perhaps a Rockstor update.
I tried scrubbing and balancing and always got "failed to read the system array: -5", and the pool would not mount on startup.
btrfs check ended with 0 errors on all 4 disks. With "btrfs device scan" I could manually mount the pool.
After a fresh Rockstor install (wiping only the previous boot disk) and re-importing my pool disks, all is OK.
Btrfs is still unstable!
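For anyone else hitting this, the manual sequence that worked for me was roughly the following (any member device of the pool should do, names will differ on your system):

btrfs device scan
mount /dev/sdX /mnt2/<poolname>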

Reading your post made me curious.

btrfs device scan revealed this (in dmesg):
[ 57.749369] BTRFS: device label RSPool devid 9 transid 481087 /dev/sdf
[ 57.749543] BTRFS: device label RSPool devid 14 transid 481087 /dev/sdc
[ 57.749661] BTRFS: device label RSPool devid 12 transid 481087 /dev/sdd
[ 57.749754] BTRFS: device label RSPool devid 10 transid 481087 /dev/sdb
[ 57.749876] BTRFS: device label RSPool devid 11 transid 481087 /dev/sda

mount /dev/sdf /mnt2/RSPool gives this:

[ 87.655623] BTRFS info (device sda): disk space caching is enabled
[ 87.655626] BTRFS info (device sda): has skinny extents
[ 87.812254] BTRFS info (device sda): bdev /dev/sde errs: wr 0, rd 1, flush 0, corrupt 0, gen 0

The pool is now mounted…
With an error about sde…
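(If I understand it right, that read error on sde should also show up in the per-device counters, which can be checked non-destructively with something like:

btrfs device stats /mnt2/RSPool

and reset later with the -z option once things look healthy again.)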

I’m running a scrub to see if this fixes things.
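For anyone following along from the command line, starting and checking it would be roughly:

btrfs scrub start /mnt2/RSPool
btrfs scrub status /mnt2/RSPool   # progress and error counts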

Perhaps I’ll have to reinstall, but I really hope not…

Try smartctl -a on all disks and check the output for error logs.
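Something like this one-liner should cover them all in one go (device letters assumed to match the btrfs fi show output earlier in the thread):

for d in /dev/sd{a..f}; do echo "=== $d ==="; smartctl -a "$d"; done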

Thanks, will do that when scrubbing is finished. It’ll take some time…

Hi,

I got a similar error today. I powered down the server cleanly yesterday, and when I powered it up today I got this error:

Apr 19 14:59:27 server1 kernel: BTRFS info (device sdk): use no compression
Apr 19 14:59:27 server1 kernel: BTRFS info (device sdk): disk space caching is enabled
Apr 19 14:59:27 server1 kernel: BTRFS info (device sdk): has skinny extents
Apr 19 14:59:27 server1 kernel: BTRFS warning (device sdk): sdk checksum verify failed on 54882974711808 wanted 37994CB9 found ACE8821F level 0
Apr 19 14:59:27 server1 kernel: BTRFS error (device sdk): bad tree block start 5363791737531697536 54882974711808
Apr 19 14:59:27 server1 kernel: BTRFS: open_ctree failed

I think I’m going to need some help…

The scrub finished without errors, which means all the data on all drives seems to be OK.

But Rockstor still refuses to mount the pool; it always ends with open_ctree failed. I can mount it manually as described earlier, every time.

Anybody got any ideas?

Well, I went ahead and did a reinstall.

Seems to have fixed the problems. Rockstor now boots, and mounts the pools.

I decided to install to a traditional mechanical HDD instead of my mSSD, to see if that helps with some of the weirdness I seem to get sometimes.

I’m really curious why I got open_ctree errors on a filesystem that seems to be OK.

I recently did a new install using two disks, resulting in a btrfs raid (multi-device) root, and got this error.
The solution is described near the end of a Gentoo page.

Oracle also does btrfs root, so I wonder how they handle it.
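If I remember the gist of the Gentoo advice correctly, the workaround for a multi-device btrfs root is to make sure the kernel knows about every member device before the root mount, either by listing them on the kernel command line or by having the initramfs run a device scan. Something like (device names are just examples):

rootflags=device=/dev/sda2,device=/dev/sdb2   # on the kernel command line
btrfs device scan                             # or in the initramfs, before mounting root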

@luke Welcome to the Rockstor community and nice find.

Given this thread does not relate to multi-dev btrfs root, you might like to start a fresh thread with this info. You could also link there to your recently opened GitHub issue, which I have now responded to. That way we might spark a new discussion of this capability going forward, in light of the info now in that GitHub issue.

Thanks again for sharing your findings, and do please consider building a dev environment for Rockstor, as all pull requests are welcome. A quick peek at the prior closed PRs should help with the 'form', and of course our:

Community Contributions
and more specifically the
Developers
subsection should prove useful in that regard.


Well, drat. Looks like I get to join the party too.

I had to shut down the Rockstor server for a few minutes while I moved around some furniture. I turned it back on and it doesn't see the shares. It sees my disks and has my RAID1 pool listed, but no shares.

When I click on my Rockstor's RAID1 pool I get this error:

Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/storageadmin/views/pool_balance.py", line 47, in get_queryset
    self._balance_status(pool)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/utils/decorators.py", line 145, in inner
    return func(*args, **kwargs)
  File "/opt/rockstor/src/rockstor/storageadmin/views/pool_balance.py", line 72, in _balance_status
    cur_status = balance_status(pool)
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 985, in balance_status
    mnt_pt = mount_root(pool)
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 252, in mount_root
    run_command(mnt_cmd)
  File "/opt/rockstor/src/rockstor/system/osi.py", line 115, in run_command
    raise CommandException(cmd, out, err, rc)
CommandException: Error running a command. cmd = /bin/mount /dev/disk/by-label/VaultRAID /mnt2/VaultRAID -o ,compress=no. rc = 32. stdout = ['']. stderr = ['mount: wrong fs type, bad option, bad superblock on /dev/sdc,', ' missing codepage or helper program, or other error', '', ' In some cases useful info is found in syslog - try', ' dmesg | tail or so.', '']

When I run dmesg | tail I get:

[ 1222.999156] BTRFS info (device sda): use no compression
[ 1222.999159] BTRFS info (device sda): disk space caching is enabled
[ 1222.999160] BTRFS info (device sda): has skinny extents
[ 1223.000925] BTRFS error (device sda): failed to read chunk tree: -5
[ 1223.013741] BTRFS error (device sda): open_ctree failed
[ 1223.217766] BTRFS info (device sda): use no compression
[ 1223.217768] BTRFS info (device sda): disk space caching is enabled
[ 1223.217769] BTRFS info (device sda): has skinny extents
[ 1223.218583] BTRFS error (device sda): failed to read chunk tree: -5
[ 1223.234755] BTRFS error (device sda): open_ctree failed

I've tried some restarts and some tinkering to remount things, but I get the same sort of "wrong fs type, bad option, bad superblock" errors. Maybe this should be posted as a new thread? I'm better at dealing with Rock-ons and Docker than with btrfs, so any suggestions on how to mount my drives and recover data without breaking anything would be great. I'm paranoid after all I've read about ruining drives by attempting to repair them.
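In case someone else lands here with the same error: from what I pieced together earlier in this thread, "failed to read chunk tree" can apparently happen when not all member devices of the pool have been seen yet, so a non-destructive first check would be something along these lines (it didn't get me a working mount, as you'll see below, but it can't hurt):

btrfs fi show                          # look for "Some devices missing" under the pool
btrfs device scan
mount -o ro /dev/sda /mnt2/VaultRAID   # read-only, so nothing gets changed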

After some serious heartbreak, and feeling things weren't going to end well for me, I found Haioken's post.

I couldn't for the life of me mount the partitions in any usable way, so I went the recovery route. His suggestion was the command:
btrfs restore /dev/<btrfs-device> <mounted-location-larger-than-btrfs-fs>
Luckily my NAS is quite small, only <2TB, so I had a spare drive that could be used for recovery.

I mounted another 2TB drive, and ran the command but with a slight adjustment:

btrfs restore -iv /dev/<btrfs-device> <mounted-location-larger-than-btrfs-fs>

Setting -i (ignore errors) allowed it to get past some errors it was hitting while restoring content from Docker container paths in the Rock-ons share; I'm not sure exactly what, but it was forcing the restore to stop. And -v (verbose) let me watch its progress in real time. I'll recover my data, and likely rebuild the drives and shares again.
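For anyone else who ends up doing this, the overall sequence was along these lines (the spare-drive device name, filesystem and mount point are just examples, not what I literally typed):

mkfs.ext4 /dev/sdX                                    # blank spare drive to hold the recovered data
mkdir -p /mnt/recovery
mount /dev/sdX /mnt/recovery
btrfs restore -iv /dev/<btrfs-device> /mnt/recovery   # -i ignore errors, -v verbose

A dry run with -D first, to see what would be restored without writing anything, is probably a good idea too.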

Thanks Haioken-from-the-past for helping me out in a pinch!
