Disk Pool mounted, shares missing. Many errors

Oh wonderful internet peoples. I seek wisdom from the mindhive.

Rockstor 4.1.0-0 installed on an ESXi VM. Tried to get vmware-tools installed. Followed a guide blindly. VM rebooted, all hell broke loose.

“Parent transid verify failed… wanted 32616 found 32441”
Pool remounts automatically as read-only.

I’ve tried numerous options without any success.

I’m willing to put on the dunce hat and be tarred and feathered and publicly mocked, so long as some kind souls help me to recover the data (not mission critical, but lots of sentimental photos and video).

Please advise what screenshots, printouts etc. you need to understand better where to start.

@GambitZA Welcome to the Rockstor community forum.

Oh dear, this is usually an indication that some writes were lost in transit or the like. I.e. it wants (expects) a newer copy of something but was pointed at what looks like an older copy.
So in short the pool is poorly.

If you have a read-only mount you can refresh your backups (or create them), as it may be that little to nothing of consequence is missing. The automatic read-only ‘move’ on btrfs’s part is to protect the filesystem (Pool) from further damage. So copy off what you can before you do anything else, as any attempt to repair this pool may leave it worse off than it is now. For example, if the RAM in the system has caused this corruption, things will only get worse, but they are safer now it’s read-only. The same goes if there was a kernel hang that prevented a write, or a write was lost somehow (drive buffers not flushed etc., say via a drive firmware issue), or a drive is failing. Whatever the case, you at least now have read-only access, so do all you can with this before proceeding to anything else.
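A minimal sketch of such a copy-off, assuming the pool is mounted read-only somewhere under /mnt2 (where Rockstor normally mounts pools) and that a separate, healthy disk with enough free space is mounted elsewhere. The function name and both paths are placeholders for illustration, not taken from your system. It only reads from the source.

```shell
#!/bin/sh
# copy_off SRC DEST: copy everything under SRC into DEST, reading SRC only.
# Both paths are placeholders; DEST must live on a different, healthy disk.
copy_off() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    if command -v rsync >/dev/null 2>&1; then
        # -a preserves permissions and times; the trailing slash on SRC
        # copies its contents rather than the directory itself.
        rsync -a "$src"/ "$dest"/
    else
        # Fallback when rsync is not installed: cp -a also keeps attributes.
        cp -a "$src"/. "$dest"/
    fi
}
```

Usage would be something like `copy_off /mnt2/your-pool /mnt/backup-disk/your-pool`, run as root. A non-zero exit status means at least some files could not be read or written, so check the error messages to see which ones were skipped.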

You may find all is well with the important data. Then your options are far more open, i.e. you can wipe the pool, recreate a fresh one, and re-load the data, knowing that it is now not the only copy (if that was the case initially). So if the same thing occurs, or the same root cause leads to some other failure, you are still good, or at least better off than now.

So the above is mainly advice on what to do to minimise the loss. But we are not btrfs experts here, we are more users / sysadmins. The actual experts are the developers, and for this I can point you to the following doc entry we have on contacting them:

“Asking for help”: https://rockstor.com/docs/data_loss.html#asking-for-help

If you are already knowledgeable in btrfs and system administration, see the upstream community Libera Chat - #btrfs channel. Finally, if your needs are extreme, consider seeking help on the btrfs mailing list.

With the following proviso there also:

> The btrfs mailing list is primarily for btrfs developer use. Time taken up on trivial interactions there may not be fair to the world of btrfs development. Also take careful note of what you are expected to include: i.e. the “What information to provide when asking a support question” section on the above linked mailing list page.

A further note: once you have backed up all the data you can, consider using a newer kernel. You are likely still using Leap 15.3. However our downloads page:
https://rockstor.com/dls.html
now has Release Candidate 4.5.5-0 Rockstor installers based on Leap 15.4, which in turn has a newer btrfs software stack (kernel and userspace). But note we have an ongoing/outstanding issue regarding import of config saves. But an install of this, to another system disk (disconnect the original 4.1.0-0 Leap 15.3 based one) to preserve your existing system, gives easy access to a newer system under which you can do any repairs if you end up taking that route (after refreshing backups).

We also have the following how-to that enables kernel backports to the base Leap 15.3 OS:

Installing the Stable Kernel Backport: https://rockstor.com/docs/howtos/stable_kernel_backport.html
which will also update both the kernel and filesystem subsystems to get any newer code that may help with the repair that you may end up undertaking.

But the first step must be to save what you have. Given you at least have read-only access, you should be able to pull off the vast majority of the data that concerns you via a backup copy-off.

Let us know how it goes, but don’t change anything before establishing a backup. This is super important, as there is no guarantee that any repair, irrespective of the OS or kernel version, will not make things worse; especially if the underlying hardware is failing/faulty.

Re info to help others help you, the output of the following (run as the root user) could help; hopefully.

uname -a
btrfs fi show

And note these will change if you end up updating, re-installing with RC2, or doing the stable kernel backports. In some cases though, an older kernel will mount a pool that a newer kernel will not, due to the greater number of sanity checks added as btrfs is developed. But a newer kernel is favoured in repair scenarios. But every mount attempt, especially by differing kernel versions, runs the risk of modifying the filesystem so that it doesn’t even mount read-only. Hence advising that you first get off what you can while you still have read-only access.
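If it is easier than screenshots, the requested output can be captured as text in one go. This is only a sketch: the function name and output file are made up for illustration, and `btrfs --version` is an extra I have added to record the userspace tools version. None of these commands write to the pool.

```shell
#!/bin/sh
# collect_info OUTFILE: save the output of a few read-only diagnostic
# commands to OUTFILE. Run as root so 'btrfs fi show' can see the devices.
collect_info() {
    out=$1
    for cmd in "uname -a" "btrfs fi show" "btrfs --version"; do
        # Label each section so it is clear which command produced what.
        printf '### %s\n' "$cmd"
        # Record failures too (e.g. a missing command) rather than stopping.
        sh -c "$cmd" 2>&1 || printf '(command failed)\n'
    done > "$out"
}
```

For example, `collect_info /root/btrfs-info.txt` and then paste the contents of that file into a reply. Re-run it after any kernel or OS change so the output matches the system you are actually using.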

Hope that helps. And let us know how you get on.


Hello and thanks for the reply!

I’m only allowed to post one image per post, so please excuse the upcoming spam.

As requested, here are some printouts:

uname -a
[screenshot: uname -a]

btrfs fi show
[screenshot: btrfs fi show]

cd /mnt2/DATA followed by ls
[screenshot: ls]

cd /mnt2/DATA/DATASTORE (datastore folder is where all the pictures are)
[screenshot: browse to folder]

As well as some screenshots of the following screens in Rockstor:

  • Storage / Disks
    [screenshot: Storage Disks]
  • Storage / Pools
    [screenshot: Storage Pools]
  • Storage / Shares
    [screenshot: Storage Shares]

And finally a copy of fstab (which, in my uneducated opinion, is where the problem started; I think the process of trying to create and install vmware-tools changed the fstab):
[screenshot: fstab]

I don’t think this issue warrants asking devs. This isn’t a bug, this is a user issue (i.e. I broke it).