Soooo, is it dead? LOL, whatever

Perhaps none of what this post was about matters. I figured I had enough info to know that the RAID6 I am running should be on the testing update path, so I tried switching to it to see if I could get Rockstor to tell me there is a problem with the unmounted pool that was no longer giving errors. Started the update and got:

Unable to check update due to a system error: (Error running a command. cmd = /usr/bin/rpm --import /opt/rockstor/conf/ROCKSTOR-GPG-KEY. rc = 1. stdout = ['']. stderr =
["error: can't create transaction lock on /usr/lib/sysimage/rpm/.rpm.lock (Resource temporarily unavailable)", 'error: /opt/rockstor/conf/ROCKSTOR-GPG-KEY: key 1 import failed.', '']).

    Traceback (most recent call last):
      File "/opt/rockstor/src/rockstor/storageadmin/views/command.py", line 262, in post
        return Response(rockstor_pkg_update_check(subscription=subo))
      File "/opt/rockstor/src/rockstor/system/pkg_mgmt.py", line 323, in rockstor_pkg_update_check
        switch_repo(subscription)
      File "/opt/rockstor/src/rockstor/system/pkg_mgmt.py", line 241, in switch_repo
        run_command([RPM, "--import", rock_pub_key_file], log=True)
      File "/opt/rockstor/src/rockstor/system/osi.py", line 227, in run_command
        raise CommandException(cmd, out, err, rc)
    CommandException: Error running a command. cmd = /usr/bin/rpm --import /opt/rockstor/conf/ROCKSTOR-GPG-KEY. rc = 1. stdout = ['']. stderr =
    ["error: can't create transaction lock on /usr/lib/sysimage/rpm/.rpm.lock (Resource temporarily unavailable)", 'error: /opt/rockstor/conf/ROCKSTOR-GPG-KEY: key 1 import failed.', '']

So fun. Now what? No GUI option returns anything resembling the option chosen. LOL fml

Old heading:
How to mount the pool, locate the failing drive, replace it, and upgrade another drive

Hello, and thanks for being willing to teach a young dog new tricks. The replacement disk is slow-boating to Alaska along with two others to upgrade working drives. I have been searching and reading like a madman, but most of what I find is old, or written by folks who know more of the basics, so nothing is elaborated on. First off, is there a recent guide I can follow to locate the bad disk and remove/replace it, and I just haven't found it? My pool has enough space to operate without this disk, and from the comments in quite a few posts it sounds like I can just pull the disk and resilver or re-balance without a replacement, no? I would love to remove it now in order to bring the pool back online, and add in the new ones when they arrive. Thanks again, and looking forward to figuring this OS out. I am missing the degraded pool warnings, so all the neat features I am reading about are not revealing themselves; I guess I am stuck searching for now.

System:
Current Mobo: GIGABYTE GA-78LMT-USB3 R2 AMD 760G USB 3.1 HDMI Micro ATX Motherboard
Current CPU: AMD CPU FX Series FX-4100 Quad Core CPU 3.6GHz Socket AM3+
Current Memory: Team Group TED38G1600C11BK 32GB (8GB x 4) DDR3 1600MHz Kit
Power Supply: Unable to locate purchase online. EVGA 500W
System drive: Kingston DataTraveler DT50 16GB
Storage drives: Seagate Barracuda Compute Model# ST8000DM004 8TB 5400rpm x 8
Case: GIGABYTE 3D AURORA GZ-FSCA1-ANB Black Aluminum ATX Full Tower
Rockstor Setup: Linux 6.5.1-lp154.2.gc181e32-default. Opted for RAID6. I have chosen the testing update path as I am running RAID6, and I have started a stable updates subscription to support the cause.

@ScottPSilver Hello again.
Re:

Could just be caused by a timing issue. When one logs in and visits the Dashboard page we auto-check for updates so we might parse and display them. But we don't yet check to see if our underlying OS happens to be doing the same thing. If it is, often the case on a freshly booted system, we clash, as a lock is already held by the rpm/package-management system. This clash is harmless; we just don't get the info needed to display the updates. But subsequent visits/refreshes of the Dashboard, once the base OS automated rpm check is done, should work.
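
As a rough check only, assuming the lock path from the error above, the following run as root shows which process (if any) currently holds that rpm lock; fuser is part of the psmisc package on openSUSE:

    fuser -v /usr/lib/sysimage/rpm/.rpm.lock

If that lists something like packagekitd or zypper, the base OS updater is simply busy, and a later Dashboard refresh should succeed.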

Could you indicate the version you are currently running on this system (top right of the Web-UI)?

Once we have you updated we can look to the repair. That way, if a bug is found, we can attend to it by way of a testing update.

Hope that helps, at least by way of some context.


@ScottPSilver A little on the repair side.

The Web-UI can usually tell us some stuff on poorly drives etc., so screenshots of, for example, the pool detail page, where it lists the drive members in a table at the bottom of the page, would help. That table may have some stats gathered by btrfs on errors encountered in the past on a per-drive basis. In a btrfs pool each disk/drive member is identified by an id. These are listed in that table also.

On the command line the output of the following command, run as the root user, may be of use to folks here as well:

btrfs fi show
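
Once any mount is achieved (even read-only), the per-device IO error counters that the Web-UI table summarises can also be read directly; a minimal sketch, with /mnt2/<pool-label> standing in for Rockstor's usual pool mount point:

    btrfs device stats /mnt2/<pool-label>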

Plus, as this new forum thread is focused on your repair - now you have presumably received the shiny new drive/drives - some history of the nature of this raid6 pool failure could also help folks here. Plus anything you have tried. But don't try anything more just yet until the history and current state of the failure is laid out here.

Our current canonical doc entry for btrfs repair is here:
https://rockstor.com/docs/data_loss.html
That would make for a good first read - but don't be put off, as that doc covers quite a lot. And there are pure Web-UI options available - likely already in your existing version. But again, it would be good to see exactly what version you are running - including (if it's a little older) the mouse-over popup info on the "Uses openSUSE" text. Newer versions don't need a mouse-over as they state, for example, 15.4 or whatever.
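
For the base OS version from the command line, the following is a rough equivalent (the VERSION_ID line shows, for example, 15.4):

    cat /etc/os-release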

Cheers.


ROCKSTOR 4.5.8-0

You are correct, the GUI is back to normal.

System:
Current Mobo: GIGABYTE GA-78LMT-USB3 R2 AMD 760G USB 3.1 HDMI Micro ATX Motherboard
Current CPU: AMD CPU FX Series FX-4100 Quad Core CPU 3.6GHz Socket AM3+
Current Memory: Team Group TED38G1600C11BK 32GB (8GB x 4) DDR3 1600MHz Kit
Power Supply: Unable to locate purchase online. EVGA 500W
System drive: Kingston DataTraveler DT50 16GB
Storage drives: Seagate Barracuda Compute Model# ST8000DM004 8TB 5400rpm x 8
Case: GIGABYTE 3D AURORA GZ-FSCA1-ANB Black Aluminum ATX Full Tower
Rockstor Setup: Uses openSUSE Leap, Linux: 6.5.1-lp154.2.gc181e32-default
ROCKSTOR 4.5.8-0

I ran btrfs fi show a couple of times last night but saw nothing but the 8 drives I have and the root pool.

btrfs fi show

Post from my hardware thread where I mention attempts and other details. I have learned quite a bit since I posted this, so don't be too harsh! LOL

To be thorough here, I am including my details regarding the data on this system.
Quote modified here

New drives don't arrive till after April 10th. One lame thing about Alaska…
a bunch of things are way better, like the air, to mention just one.


Dev Stats


Ruh roh! Looks like I got myself into the no-one-knows zone… Still not sure what to do here either.

Wow! I go through this thread and can't understand half of it. The only thing I see sticking out is the Raid-6 stuff, which really hasn't been working well for most folks. Well, I tried it a while ago, it didn't work, and I don't trust all that stuff at all. I've seen far too many software "fancy" setups like Raid 5/6 that simply don't work. Those 5/6 setups simply crash and the data is mostly unrecoverable.

What does seem to work very well is a simple Raid 10 “Mirroring” setup.

For me, I have a 10G fiber-optic network and want the speed for large file transfers. Well, no hard disk or SSD can saturate the network by itself. Soooo, Raid 0 for speed and Raid 1 for backup (a Raid 10 setup). Also, I have another setup with 5 Raid 0 disks to back up the entirety of the first Raid 10 setup, and someday it will be upgraded to a Raid 10 as well.

As far as ZFS and other Winderz solutions go, they all have flaws. The ONLY setup I have seen that provably works is a Raid setup that incorporates simple mirroring. It ain't "Fancy", but it works.

Having said all that, I am open to something better that can survive a power supply failure while writing to disks. ZFS failed that test every time I tried it. Other simpler setups would usually survive that test with some corrupted files, but nothing major.

Yes, it requires tons of drives for a Raid 10 setup, but they have been running fine for me for years so far.

You might try to “simplify” your setup and forget about “Fancy” Raid…

(Just my ignorant 2 cents…)

:sunglasses:

NASLS3


Yeah, I just have no use for mirroring; I would rather run them as single disks, as I don't need the data to be all that safe. If I lose it I just have to rip it again, done. It's movies, music, and the front end of my data. Pointless to run Raid 0: I have no use for the speed, and a single disk failure = rip it all again anyway. So anything other than 5 or 6 is pointless. Just running them as single shared disks connected to a Windows PC provides ALL the storage capacity, and if a disk fails I just re-rip that disk's data. The reason I am here is that the soft rule of Intel-only hardware in FreeNAS became a hard and fast rule, so my hardware became incompatible.

Ahhhh haaa! Okay! Now I understand… I have big big video files and need fast editing and transfer speeds.

Each to his own! Rockstor is so feature rich and I use but a small tiny portion of it…

Good luck!

:sunglasses:

The long and short of it is: failing disk, system not reporting the pool as having an issue, pool not mounting. Lots of details above, even if it is a bit broken up.

@ScottPSilver Re:

The pool not mounting is the issue here, it seems. Poorly pools don't mount without custom mount options.

And without a mount btrfs can tell us little about what is wrong. Take a look at our following doc:

https://rockstor.com/docs/data_loss.html

It has a number of sections relevant to your request. Also, if no mount is found to be possible - which can only be concluded once you have also followed, for example:

https://rockstor.com/docs/howtos/stable_kernel_backport.html

and tried the newest recovery mount options - there is always:

https://btrfs.readthedocs.io/en/latest/btrfs-restore.html
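
As a hedged sketch only (the member device and destination directory below are placeholders, and the destination must be on a different, healthy filesystem with enough free space), btrfs restore can copy files off a pool that will not mount, without writing to it:

    mkdir -p /mnt/recovered
    btrfs restore -v /dev/sda /mnt/recovered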

One problem with you receiving assistance on this issue is the bouncing back and forth with your other thread:

And the numerous drive sales links :slight_smile: . I think they are best removed, as they tend to cloud things and age threads quickly.

So in short: take a look at our above linked HowTo and research your mount options. As you do now seem interested in refreshing/updating a backup, and want to jump to whole-pool repair first it seems, then you need to aim for a read-write capable mount option. Read-only is far easier to achieve for data retrieval purposes, as poorly btrfs pools will go read-only (ro) to protect data integrity. But one needs a read-write mount to be able to run a scrub: the go-to repair procedure that can seek out failing checksums and replace the associated data/metadata with the still-good copies (if they can also be found).
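
As a rough sketch of that sequence (pool label and device below are placeholders, and this should only be attempted after reading the above doc, as repeated degraded read-write mounts carry their own risks on the parity raid levels):

    mount -o degraded /dev/sda /mnt2/<pool-label>
    btrfs scrub start /mnt2/<pool-label>
    btrfs scrub status /mnt2/<pool-label>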

Hope that helps.


@ScottPSilver A follow up on:

There are few repair improvements since then in either the now much older stable channel or the RC-status testing channel:

But do make sure you at least apply all OS updates since then.
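
On our openSUSE Leap base that is normally just the following, run as root (if the backports repos from our HowTo are enabled, the newer kernel/filesystems packages come along as well):

    zypper refresh
    zypper up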

A response to this will definitely help folks here help you.

Also note: if you have lost a drive (as your other forum thread indicates; btrfs fi show should have indicated this), you would just need to add the degraded mount option, such as the Web-UI would normally walk you through:

https://rockstor.com/docs/data_loss.html#web-ui-and-data-integrity-threat-monitoring

And reboot: if a rw mount is then achieved, you can remove the missing drive via the Web-UI or command line to return your pool to normal function, if space allows for a disk removal: otherwise it's a replace at the command line.
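
For reference, the command-line forms are roughly as follows (pool label, devid, and the new device are placeholders; the pool must already be mounted read-write, degraded, and the devid of the missing drive comes from btrfs fi show):

    btrfs device remove missing /mnt2/<pool-label>
    btrfs replace start <devid> /dev/sdNEW /mnt2/<pool-label>
    btrfs replace status /mnt2/<pool-label>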

Btrfs-raid5 can normally tolerate one drive failure, and btrfs-raid6 two drives if you are lucky. This all assumes all data/metadata on the pool is consistently at that raid level - we don't have a Web-UI warning about that just yet. But it can happen in pools that have been incompletely converted from one raid level to another. If one just switches raid level at the command line without a full balance, only new data is written with the new btrfs profile. Rockstor's Web-UI always does full balances: slower, but it guards against this caveat.
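
Once any mount is achieved, a quick way to spot such an incompletely converted pool is to check whether more than one profile is listed for Data or Metadata in the following output (mount point again a placeholder):

    btrfs filesystem usage /mnt2/<pool-label>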

So let's have the full response to that command request!

You are saying there were no missing disks indicated then! Your post looks incomplete. Sometimes there is info before that first line you posted. We normally catch that in our Web-UI command parsing though.

Your other forum thread reference!

Your pool is likely poorly then, as no longer getting even a ro mount with the degraded option means you need to use some mount options that are likely outside of what Rockstor's Web-UI will currently allow (newer options have been introduced of late). Take a look at this section again:

https://rockstor.com/docs/interface/storage/disks.html#import-unwell-pool

as a guide to how to mount a pool that is unwell; in that case it's for import, but it still holds. Plus, a mount attempt at the command line can help folks know the nature of the mount failure. Be sure to use the indicated mount point though, as then the Web-UI can help thereafter. This way you can mount a pool with any recovery-oriented mount options, and work around our current restrictions on these.
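
A minimal sketch of such a manual recovery-oriented mount, assuming a reasonably new (backported) kernel for the rescue= options, with the device and pool label below as placeholders:

    mkdir -p /mnt2/<pool-label>
    mount -o ro,degraded,rescue=usebackuproot /dev/sda /mnt2/<pool-label>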

The following is a good article.

As your system did once respond to degraded,ro for a while and then failed thereafter, it may just be that you have a compound failure. The btrfs-parity raid levels of 5 & 6 we defaulted to ro for quite some time in our openSUSE period, and we inherited this from our upstream: they deemed them too flaky. But you are likely now using a backported kernel and filesystems repo presumably: can you detail if that is the case?

You are a little off-track here. Backported kernel (and filesystems repo, hopefully, as per our HowTo), and you have run a degraded btrfs-parity pool until it encountered yet more problems.

If space is more important than robustness, but it's still a drag to repair/recover from backups, consider our now Web-UI supported mixed raid approach of, say:

  • data btrfs-raid6
  • metadata raid1c4

It is far less susceptible to the known issues that plague the parity btrfs-raid levels of 5 & 6: which should simply never have been merged, as they seem not to fully conform to the rest of btrfs. See the last entry in our kernel backport HowTo:

https://rockstor.com/docs/howtos/stable_kernel_backport.html#btrfs-mixed-raid-levels
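
For context only, the command-line equivalent of such a conversion is a full balance with convert filters (pool label is a placeholder, raid1c4 needs a relatively recent kernel, this should only be run on a healthy rw-mounted pool, and the Web-UI can do it for you):

    btrfs balance start -dconvert=raid6 -mconvert=raid1c4 /mnt2/<pool-label>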

There is work ongoing upstream to update the nature of the btrfs parity raid levels: and this will be most welcome.

Hope that helps, at least with some context.


Is that not what I did here?