BTRFS errors: bad tree block start

Hi everyone, I've only been using Rockstor for a week… and already issues :frowning:
I keep getting errors:

btrfs error bad tree block start

Does that mean the hard disk is faulty?
I am 99% sure the hard disk is OK.

parent transid verify failed on 897630208 wanted 533 found 316
parent transid verify failed on 897630208 wanted 533 found 316
parent transid verify failed on 899612672 wanted 533 found 316
parent transid verify failed on 899612672 wanted 533 found 316
parent transid verify failed on 901971968 wanted 533 found 317
parent transid verify failed on 901971968 wanted 533 found 317
checksum verify failed on 1042235392 found 92872EEA wanted DEA9A066
checksum verify failed on 1042235392 found 92872EEA wanted DEA9A066
checksum verify failed on 1042235392 found 9E66465F wanted 227F49AA
checksum verify failed on 1042235392 found 9E66465F wanted 227F49AA
bytenr mismatch, want=1042235392, have=391753727383542616
ERROR: failed to repair root items: Input/output error

Can someone assist? Is my data gone after 1 week of usage?

Hi @Bar1

Please have patience; this is a small project, and much of the support provided (this included) is community-driven.

Ultimately, the issue you're experiencing is that after (presumably) a dirty reboot, the BTRFS transaction log (a log of write operations, including deletes) does not match what is on the disk.
In some cases this is fairly innocuous (for example, "wanted 533 found 532" would mean you were only one write operation off).
Unfortunately, yours are way off, with over 200 transactions not written to the disk.
I'm guessing you were in the middle of a large operation when the system restarted unexpectedly.

This is a particularly bad state for BTRFS to be in.

There are a few ways to address this.

The first, and easiest, but least recommended for your case, is to zero out the log and discard any changes that weren't written:

btrfs rescue zero-log /dev/<btrfs-device>

Note that you need to replace <btrfs-device> with your actual BTRFS device, after identifying it.
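If you're not sure which device that is, either of these should show it (the exact output depends on your setup):

btrfs filesystem show

lsblk -f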

Instead, I would suggest attempting to mount the partition in recovery mode and copying the contents elsewhere.
To do this, again you’ll need to know your BTRFS device location.
At that point, you can hopefully mount it with:

mount -t btrfs -o recovery,nospace_cache /dev/<btrfs-device> /mnt2

Failing that:

mount -t btrfs -o recovery,nospace_cache,clear_cache /dev/<btrfs-device> /mnt2
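If neither mount succeeds, the kernel log will usually tell you why; for example:

dmesg | tail -n 30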

If you have mounted successfully, copy or rsync the contents elsewhere, then try the zero-log command above.
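For example, something like this (the destination path here is just a placeholder for wherever you have enough free space):

rsync -aHAX --progress /mnt2/ /path/to/backup/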

If that fails, dump the filesystem as follows:

btrfs restore /dev/<btrfs-device> <mounted-location-larger-than-btrfs-fs>
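For example, assuming /mnt/backup is a placeholder for a mounted location with enough free space:

btrfs restore -v /dev/<btrfs-device> /mnt/backup

The -v flag is just verbose output; see btrfs-restore(8) for options such as -m and -x if you also want file metadata and extended attributes restored where possible.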

Then zero the log.

Failing that, nuke it from orbit and consider this a lesson in backing up critical data, regardless of the filesystem in use. (I say this with gritted teeth, having >13TB of data and no suitable place to back it up to.)


It’s precisely because of the current shortcomings that, when I lost my pool and had to spend 2 weeks recovering, I made the decision to move to another operating system for my array. I loved Rockstor and it was solid for 2 years, but simple tasks are just not fully baked yet - mostly because of the infancy of BTRFS. Maybe the move to a newer kernel and to openSUSE will improve things. It’s not that I can’t afford downtime - I just don’t want any.

Ultimately, telling someone "sucks to be you, you should have backed up" - while ultimately true - doesn’t really help with the frustration and stress that go along with the situation they’re in, and it simply goes to prove the young state of BTRFS and exposes the limitations of a small development project (one that I love, support and want to see succeed).

Yes, that’s true, and Rockstor is a NAS operating system built around BTRFS, which is in its infancy.
Ultimately, at this stage, this project caters to power users who want to experience what BTRFS has to offer and are willing to accept the tradeoffs associated with it.

Some, but not others. There’s a long wait on BTRFS support for many things, stability in particular, and the BTRFS tooling still requires a lot of work (hence the lack of recovery options, particularly for transid mismatches).

No, but the other 95% of the post should provide some insight into how to get his filesystem mounted and provide some capability for recovery.
If those things don’t work and nobody else can shed more light, then unfortunately it sucks to be him.

As I stated in my post, that statement was made realizing my own predicament: I have 13TB of data with nowhere else to put it.
Fortunately, none of it is critical data (movies, TV episodes, music - legitimate rips, of course), so if I lost it, I’d chalk it up to lessons learned.

Please bear in mind that I wasn’t meaning any disrespect to anyone here - I’m a little sore from a frustrating experience - so if it came off that way, my apologies. This all basically highlights the issue with BTRFS: the more data that gets amassed, the larger the elephant it becomes, and regardless of what data is lost, it sucks when it’s gone.

Another possibility here may be to mount read-only with the usebackuproot mount option:

mount -t btrfs -o ro,usebackuproot /dev/<btrfs-device> /mnt2

If that works, try again without the ro option:

mount -t btrfs -o usebackuproot /dev/<btrfs-device> /mnt2

If this works, your filesystem is likely still broken - mounting this way doesn’t fix anything - but it should hopefully give you a clean mount from a point before the transactions that errored.

In that event, if possible, back up what you’ve currently got by dumping the filesystem again:

btrfs restore /dev/<btrfs-device> <mounted-location-larger-than-btrfs-fs>

Then try running a check without repair. Note that btrfs check operates on the device rather than the mount point, and the filesystem should be unmounted first:

btrfs check /dev/<btrfs-device>

If the only reported errors are something along the lines of "errors 400, nbytes wrong", you may be able to get a clean filesystem by using:

btrfs check --repair /dev/<btrfs-device>

Do not attempt a check in repair mode before verifying this, as you’ll likely cause more harm than good.