Soooo, is it dead? LOL, whatever

Wow! I've gone through this thread and can't understand half of it. The only thing I see sticking out is the Raid-6 stuff that really hasn't been working well for most folks. Well, I tried it a while ago, it didn't work, and I don't trust all that stuff at all. I've seen far too many "fancy" software setups like Raid 5/6 that simply don't work. Those 5/6 setups simply crash and the data is mostly unrecoverable.

What does seem to work very well is a simple Raid 10 “Mirroring” setup.

For me, I have a 10G fiber-optic network and want the speed for large file transfers. Well, no hard disk or SSD can saturate that network by itself. Soooo, Raid 0 for speed and Raid 1 for backup (a Raid 10 setup). Also, I have another setup with 5 Raid 0 disks to back up the entirety of the first Raid 10 setup, and someday it will be upgraded to Raid 10 as well.

As far as ZFS and other Winderz solutions, they all have flaws. The ONLY setup I have seen that provably works is a Raid setup that incorporates simple mirroring. It ain’t “Fancy” but it works.

Having said all that, I am open to something better that can survive a power supply failure while writing to disks. ZFS failed that test every time I tried it. Other simpler setups would usually survive that test with some corrupted files, but nothing major.

Yes, a Raid 10 setup requires tons of drives, but mine have been running fine for years so far.

You might try to “simplify” your setup and forget about “Fancy” Raid…

(Just my ignorant 2 cents…)

:sunglasses:

NASLS3


Yeah, I just have no use for mirroring; I would rather just run them as single disks as I don't need the data to be all that safe. If I lose it, I just have to rip it again, done. It's movies, music, and the front end of my data, so it's pointless to run Raid 0: I have no use for the speed, and a single disk failure = rip it all again anyway. So anything other than 5 or 6 is pointless. Just running them as single shared disks connected to a Windows PC provides ALL the storage capacity, and if a disk fails I just re-rip that disk's data. The reason I am here is that the soft rule of Intel-only hardware in FreeNAS became a hard and fast rule, so my hardware became incompatible.

Ahhhh haaa! Okay! Now I understand… I have big big video files and need fast editing and transfer speeds.

Each to his own! Rockstor is so feature rich and I use but a tiny portion of it…

Good luck!

:sunglasses:

The long and short is: failing disk, system not reporting the pool as having an issue, pool not mounting. Lots of details above, even if it is a bit broken up.

@ScottPSilver Re:

Pool not mounting is the issue here, it seems. Poorly pools don't mount without custom mount options.

And without a mount, btrfs can tell us little about what is wrong. Take a look at our following doc:

https://rockstor.com/docs/data_loss.html

It has a number of sections relevant to your situation. Also, if no mount is found to be possible, which can only be concluded once you have also followed, for example:

https://rockstor.com/docs/howtos/stable_kernel_backport.html

and tried the newest recovery mount options: there is always:

https://btrfs.readthedocs.io/en/latest/btrfs-restore.html
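As a rough sketch of what that last resort looks like (the device name and target directory below are placeholders, and the filesystem must be unmounted):

# Copies whatever it can still read from the damaged device onto other storage:
btrfs restore -v /dev/sdb /mnt/recovery_target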

One problem with you receiving assistance on this issue is the bouncing back and forth with your other thread:

And the numerous drive sales links :slight_smile: . I think they are best removed, as they tend to cloud things and age threads quickly.

So, in short: take a look at our above-linked HowTo and research your mount options. As you do now seem interested in refreshing/updating a backup, yet want to jump to whole-pool repair first it seems, you need to aim for a read-write capable mount. Read-only is far easier to achieve for data retrieval purposes, as poorly btrfs pools will go read-only (ro) to protect data integrity. But one needs a read-write mount to be able to run a scrub: the go-to repair procedure that can seek out failing checksums and replace the associated data/metadata with the still-good copies (if they can also be found).
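Very roughly, at the command line, that looks something like the following (device, pool label, and mount point are placeholders only, and rescue=all needs a reasonably recent kernel):

# Read-only rescue mount: often enough for retrieving data.
mount -o ro,rescue=all /dev/sdb /mnt2/mypool
# A read-write (possibly degraded) mount is what a scrub needs:
mount -o rw,degraded /dev/sdb /mnt2/mypool
btrfs scrub start -B /mnt2/mypool
btrfs scrub status /mnt2/mypool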

Hope that helps.

1 Like

@ScottPSilver A follow up on:

There are few repair improvements since then in either the now much older stable channel or the RC-status testing channel:

But do make sure you at least apply all OS updates since then.

A response to this will definitely help folks here help you.

Also note, if you have lost a drive (your other forum thread indicates this, and btrfs fi show should have indicated it), you would just need to add the degraded mount option, such as the Web-UI would normally walk you through:

https://rockstor.com/docs/data_loss.html#web-ui-and-data-integrity-threat-monitoring

And reboot: if a rw mount is then achieved, you can remove the missing drive via the Web-UI or the command line to return your Pool to normal function, if space allows for a disk removal; otherwise it's a replace at the command line.
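At the command line that looks roughly like this (device names, mount point, and the missing devid are illustrative, not read from your system):

mount -o degraded /dev/sdb /mnt2/mypool
# If the degraded mount is read-write and space allows, drop the missing member:
btrfs device remove missing /mnt2/mypool
# Otherwise replace the missing device (devid 3 assumed here) with a new drive:
btrfs replace start 3 /dev/sdf /mnt2/mypool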

Btrfs-raid5 can normally tolerate one drive failure, and btrfs-raid6 two drives if you are lucky. This all assumes all data/metadata on the pool is consistently at that raid level - we don't have a Web-UI warning about that just yet. But inconsistency can happen in pools that have been incompletely converted from one raid level to another. If one just switches raid level at the command line without a full balance, only new data is written with the new btrfs profile. Rockstor's Web-UI always does full balances: slower, but it guards against this caveat.
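A quick way to spot an incompletely converted pool (the mount point is a placeholder):

btrfs filesystem df /mnt2/mypool
# More than one profile listed against Data or Metadata means a conversion never finished;
# btrfs fi usage reports the same in its "Multiple profiles:" line.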

So let's have the full response to that command request!

You are saying there were no missing disks indicated then! Your post looks incomplete. Sometimes there is info before that first line you posted. We normally catch that in our Web-UI command parsing though.

Your other forum thread reference!

Your pool is likely poorly then, as no longer getting even a ro mount with the degraded option means you need to use some mount options that are likely outside of what Rockstor's Web-UI will currently allow (newer options have been introduced of late). Take a look at this section again:

https://rockstor.com/docs/interface/storage/disks.html#import-unwell-pool

as a guide to how to mount a pool that is unwell; in that case it's for import, but it still holds. Plus a mount attempt at the command line can help folks know the nature of the mount failure. Be sure to use the indicated mount point though, as then the Web-UI can help thereafter. This way you can mount a pool with any recovery-orientated mount options, and work around our current restrictions on these.
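Something like the following, purely as a sketch (the pool label "mypool" and the by-label device path stand in for your own):

mkdir -p /mnt2/mypool
mount -o ro,degraded,rescue=usebackuproot /dev/disk/by-label/mypool /mnt2/mypool
# If the mount fails, the kernel log usually says why - that output is what helps here:
dmesg | tail -n 30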

The following is a good article.

As your system did once respond to degraded,ro for a while and then failed thereafter, it may just be that you have a compound failure. The btrfs-parity raid levels of 5 & 6 were ro by default for quite some time in our openSUSE period. And we inherited this from our upstream: they deemed it too flaky. But you are likely now using a backported kernel and, presumably, the filesystems repo: can you detail if that is the case?

You are a little off-track here: backported kernel (and, hopefully, the filesystems repo as per our HowTo), and you have run a degraded btrfs-parity pool until it encountered yet more problems.

If space is more important than robustness, but it's still a drag to repair/recover from backups, consider our now Web-UI-supported mixed raid approach of, say:

  • data btrfs-raid6
  • metadata raid1c4

It is far less susceptible to the known issues that plague the parity btrfs-raid levels of 5 & 6: which should simply never have been merged, as they seem not to fully conform to the rest of btrfs. See the last entry in our kernel backport HowTo (a rough command-line sketch of that conversion follows the link):

https://rockstor.com/docs/howtos/stable_kernel_backport.html#btrfs-mixed-raid-levels
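Purely as an illustration of that conversion (the mount point is a placeholder; raid1c4 needs kernel 5.5+, which the backports provide):

# Full-balance conversion to raid6 data with raid1c4 metadata - expect it to take a long while:
btrfs balance start -dconvert=raid6 -mconvert=raid1c4 /mnt2/mypool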

There is work on-going upstream to update the nature of the btrfs parity raid levels: and this will be most welcome.

Hope that helps, at least with some context.

2 Likes

Is that not what I did here?

You've said lots more stuff now; I am still on your statement of,

I like this idea. I have done nothing, as you also stated; I am trying the update function today.
The ROCKSTOR 4.5.8-0 at top right has an arrow next to it, so the system recognizes it has an available update track.

About an hour ago I clicked that arrow and was sent to the update page with the testing track activated. Scrolled to the bottom, clicked start update, then walked away. When I returned, the GUI was back and responding, still showing ROCKSTOR 4.5.8-0. So I rebooted, and when it reloaded I still have ROCKSTOR 4.5.8-0 with the arrow showing. So I'm trying a second time now while I read a bit of the plethora of information you have provided. I'll post once more tonight with details of the final attempts, or a message of success in getting it to update.

OK, It updated and now is giving an error on the dashboard.

Unknown internal error doing a GET to /api/pools?page=1&format=json&page_size=32000&count=

@ScottPSilver OK, nice. Glad you're now on the latest testing.

Our installers were built some time ago now, so there are many OS updates that also get installed when doing that method of update. That means it can take ages. But future updates should be far faster now that your base OS has caught up on all the available updates.

From your Web-UI pic you are currently running a 6.5.1 kernel (top right) on Leap 15.4. That's good, as it's required to enable read-write here on the Leap 15.4 base OS. The default kernel on 15.4 is around 5.14.

Did you also install the filesystem backports as indicated in our:

https://rockstor.com/docs/howtos/stable_kernel_backport.html
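A quick way to double-check from the command line (none of this is specific to your box):

uname -r                                   # running kernel: should be the backport (6.x), not 5.14
zypper repos --uri | grep -i filesystems   # is the filesystems repo present?
rpm -q btrfsprogs                          # btrfsprogs version pulled in from that repo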

Regarding the red error message in your last post: 4.5.8-0 was an RC5 in our last testing channel phase, heading towards our current stable of 4.6.1-0. But 5.0.8-0 is the latest in the current testing channel. This means that almost every back-end dependency is years newer, and we had to make a lot of changes. Try your Web browser's More tools - Developer tools - Network (tab) and ensure "Disable cache" is ticked. Then, with that panel still open, do a normal browser refresh. This will force the newer browser side of things into place, just in case what we see here is some old stuff lingering in the browser cache. You can also try logging out and back in.

Let us know if that top-of-page error then disappears. But otherwise you do now look to be on 5.0.8-0, which is nice. The error may be related to an inability to interpret the pool, but let's take this a step at a time.

Also familiarise yourself with our Web-UI log reader: System - Logs Manager. We don’t yet have a populated doc entry for this unfortunately, but pull requests are always welcome; and we have the following rockstor-doc repo issue for this:

It may help with seeing relevant doc entries to assist with the back and forth here on the forum.

I have extremely limited time for the forum as I'm working towards our next stable release, with consequent installer / installer repo / website / Doc / back-end / server / Appman updates etc., so I may not be of much service here. But I just wanted to nudge this along a little.

Glad you are at least now running what is our 3rd Stable Release Candidate (RC3). I hope to be releasing RC4 quite soon.

Can you also test the Pool overview and detail page for the problem pool? Pics if possible. One of our aims here is to address any malfunction on our side: your basic issue is a poorly pool, but our Web-UI behaviour in such circumstances is more our remit here.

Hope that helps others here help @ScottPSilver further, as I'm afraid I will find it difficult to 'own' this community support request currently.

2 Likes

OK, ran the first line from

zypper --non-interactive addrepo --refresh https://download.opensuse.org/repositories/Kernel:/stable:/Backport/standard Kernel_stable_Backport

and got,
Adding repository ‘Kernel_stable_Backport’ …[error]
Repository named ‘Kernel_stable_Backport’ already exists. Please use another alias.

SOOOO, from that I gather I have it already, so on to step two.

I run,
zypper --non-interactive addrepo --refresh https://download.opensuse.org/repositories/filesystems/15.X/ filesystems
And get,
Adding repository ‘filesystems’ …[error]
Repository named ‘filesystems’ already exists. Please use another alias.

SOOOO, must have that too!

Last step, three.

I run,
zypper --non-interactive --gpg-auto-import-keys refresh
and get,
Repository ‘Kernel_stable_Backport’ is up to date.
Repository ‘Leap_15_4’ is up to date.
Repository ‘Leap_15_4_Updates’ is up to date.
Repository ‘Rockstor-Testing’ is up to date.
Repository ‘filesystems’ is up to date.
Repository ‘home_rockstor’ is up to date.
Repository ‘home_rockstor_branches_Base_System’ is up to date.
Repository ‘Update repository of openSUSE Backports’ is up to date.
Repository ‘Update repository with updates from SUSE Linux Enterprise 15’ is up to date.
All repositories have been refreshed.

So, good to go on to the final step: installing the updates.
I run,
zypper up --no-recommends --allow-vendor-change
and get,
Loading repository data…
Warning: Repository ‘Update repository of openSUSE Backports’ metadata expired since 2024-02-05 04:25:26 AKST.

Warning: Repository metadata expired: Check if 'autorefresh' is turned on (zypper lr), otherwise
manualy refresh the repository (zypper ref). If this does not solve the issue, it could be that                                                                       
you are using a broken mirror or the server has actually discontinued to support the repository.                                                                      

Reading installed packages…

The following 197 packages are going to be upgraded:
aaa_base apparmor-parser avahi bind-utils binutils btrfsmaintenance btrfsprogs btrfsprogs-udev-rules containerd cpp7 crypto-policies cups-config curl device-mapper
docker dracut dracut-kiwi-lib dracut-kiwi-oem-dump dracut-kiwi-oem-repart dracut-mkinitrd-deprecated e2fsprogs gcc7 gcc7-c++ glibc glibc-devel glibc-locale
glibc-locale-base gpg2 gptfdisk grub2 grub2-branding-upstream grub2-i386-pc grub2-i386-pc-extras grub2-snapper-plugin grub2-x86_64-efi grub2-x86_64-efi-extras
kernel-firmware-bnx2 kernel-firmware-chelsio kernel-firmware-intel kernel-firmware-marvell kernel-firmware-network kernel-firmware-platform kernel-firmware-qlogic
libapparmor1 libasan4 libatomic1 libavahi-client3 libavahi-common3 libavahi-core7 libbtrfs0 libcilkrts5 libcom_err2 libcom_err-devel libcrypt1 libctf0 libctf-nobfd0
libcups2 libcurl4 libdevmapper1_03 libdevmapper-event1_03 libdns_sd libeconf0 libecpg6 libext2fs2 libfreebl3 libfuse2 libgcc_s1 libgnutls30 libgomp1 libhidapi-hidraw0
libicu65_1-ledata libicu-suse65_1 libitm1 libjansson4 libjbig2 liblsan0 liblvm2cmd2_03 libncurses6 libnghttp2-14 libntfs-3g89 libopenssl1_1 libopenssl-1_1-devel
libpolkit0 libpq5 libpython2_7-1_0 libpython3_6m1_0 libsnmp40 libsoftokn3 libsqlite3-0 libsss_certmap0 libsss_idmap0 libsss_nss_idmap0 libstdc++6 libstdc++6-devel-gcc7
libsystemd0 libteamdctl0 libtiff5 libtirpc3 libtirpc-netconfig libubsan0 libudev1 libwebp7 libX11-6 libX11-data libxcrypt-devel libxml2-2 libxml2-devel libxml2-tools
libXpm4 libz1 libzck1 libzypp login_defs lvm2 mdadm mozilla-nss mozilla-nss-certs ncurses-devel ncurses-utils net-snmp ntfs-3g openslp openssh openssh-clients
openssh-common openssh-server openssl-1_1 perl-Bootloader perl-SNMP pesign polkit postfix postgresql postgresql13 postgresql13-contrib postgresql13-devel
postgresql13-server postgresql13-server-devel postgresql-contrib postgresql-devel postgresql-llvmjit postgresql-server postgresql-server-devel ppp procps python python3
python3-base python3-bind python3-curses python3-ply python3-rpm python3-sssd-config python-base python-devel python-xml rpm runc samba samba-client samba-client-libs
samba-libs samba-winbind samba-winbind-libs shadow snmp-mibs sqlite3-tcl sssd sssd-ad sssd-common sssd-dbus sssd-krb5-common sssd-ldap sssd-tools suse-module-tools
systemd systemd-rpm-macros systemd-sysvinit system-group-hardware system-group-kvm system-group-wheel system-user-daemon system-user-lp system-user-mail
system-user-nobody system-user-upsd sysuser-shadow tack tar terminfo-base udev vim vim-data-common xen-libs xfsprogs zlib-devel zypper

The following 3 NEW packages are going to be installed:
kernel-default-6.8.6-lp155.3.1.g114e4b9 libprocps8 pesign-systemd

The following package requires a system reboot:
kernel-default-6.8.6-lp155.3.1.g114e4b9

197 packages to upgrade, 3 new.
Overall download size: 424.7 MiB. Already cached: 0 B. After the operation, additional 253.8 MiB will be used.

Note: System reboot required.

Continue? [y/n/v/…? shows all options] (y):
So I hit Y and Enter, and it goes on retrieving and applying deltas for quite a while. Once completed, it checked for file conflicts then started to install 200 files.
On file 60 I got,
Installation of glibc-locale-2.31-150300.63.1.x86_64 failed:
Error: Subprocess failed. Error: RPM failed: Command exited with status 2.
Then it offered me choices, but then I got the UPS @localhost broadcast (due to changing my UPS and not changing it in the OS because I forgot how, LOL), and it deleted my options when it happened. The only option I saw and remembered was i for ignore, so I did that,
and it went on installing 61.
I'm guessing that one was important as it showed its size as 100MB, so 1/3 of the whole update.
Should I run the last command again to get the errored file installed?

Second fail.
installing package libstdc++6-devel-gcc7-7.5.0+r278197-150000.4.35.1.x86_64 needs 23MB on the / filesystem

( 98/200) Installing: libstdc++6-devel-gcc7-7.5.0+r278197-150000.4.35.1.x86_64 …[error]

Installation of libstdc++6-devel-gcc7-7.5.0+r278197-150000.4.35.1.x86_64 failed:

Error: Subprocess failed. Error: RPM failed: Command exited with status 2.
Retry just yields the same error.
Ignored and continued.

@ScottPSilver Re:

You may have an insufficiently large system Pool/drive here. The following command

btrfs fi usage /

should help to identify this and guide folks on the forum in further assisting. There are journalctl and snapper clean-up commands that can help in a tight spot such as you may be in here. Take a look at the following:

https://en.opensuse.org/SDB:Disk_space
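The usual space-recovery steps from that page look roughly like this (the snapshot number is purely illustrative; check snapper list first):

journalctl --vacuum-size=50M   # shrink the systemd journal
snapper list                   # see which snapshots exist and how old they are
snapper delete 42              # remove a snapshot you no longer need
zypper clean --all             # drop cached packages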

Hope that helps. And as always, tread carefully in these corners. A partial update due to running out of disk space is not something you want to be rebooting into, as it can often be a no-win situation. Plus btrfs can get grumpy when it's entirely out of space.

1 Like

Yeah, not too sure. It continues installing the next update; I am on 118 of 200 now.
It's a 32GB drive. Perhaps it got filled with snapshots or something; I never turned anything called snapshots off. Had seen the term before but no clue what they do. For now I'm just keeping the process going. Looks like out of 200 I have had an issue with 5, and ignored them to continue installing the next update.

Interesting, it thinks it is only 5GB??? wtf? LOL

Device size:                   5.14GiB
Device allocated:              5.14GiB
Device unallocated:            1.05MiB
Device missing:                  0.00B
Device slack:                  1.50KiB
Used:                          4.97GiB
Free (estimated):              2.96MiB      (min: 2.96MiB)
Free (statfs, df):             2.96MiB
Data ratio:                       1.00
Metadata ratio:                   1.00
Global reserve:               13.72MiB      (used: 0.00B)
Multiple profiles:                  no

Data,single: Size:4.85GiB, Used:4.85GiB (99.94%)
/dev/sdi4 4.85GiB

Metadata,single: Size:264.00MiB, Used:128.48MiB (48.67%)
/dev/sdi4 264.00MiB

System,single: Size:32.00MiB, Used:16.00KiB (0.05%)
/dev/sdi4 32.00MiB

Unallocated:
/dev/sdi4 1.05MiB

Thinking about it, I do recall a prompt to create a pool for something. I thought it was for add-ons, the Rock-ons things I guess. I remember it suggesting no less than 5GB… I thought it was talking about creating it on the storage drives, not the 32GB flash drive. Looks like it can't be resized, so it's more and more starting to look like I'm painted into a corner on this.
Any ideas, feel free to chime in, but if nothing pops up soon I may cut my losses, chalk it all up to a crash course in the use of Rockstor, and start fresh with a few months of swapping disks at my entertainment center PC.

@ScottPSilver The full output of:

Should show what drive/partition is actually hosting the ROOT Pool, and the expected size. Our installer expands this Pool to take up the entire size of the chosen OS drive. So it's surprising that you only have 5GB, which is entirely not enough, given our Minimum System Requirements state 15 GB for this very reason. On every significant zypper update, new snapshots are created to allow for boot-to-snapshot, which we enable by default (hence 15 GB as the minimum spec). Upstream installers do not enable this facility unless the chosen drive is at least 15GB, also for this reason. Many updates means many snapshots.

Has many ways to clear the working space taken up by, for example, logs and zypper/snapper. Also note that skipping installs may well break your OS: the skipped packages need to be installed as they are dependencies of other packages. And a full OS drive is a broken one - mostly.

Hope that helps. And let's see the full btrfs fi show, as we can then advise on potential in-place options. There is no installer selection on the size of the OS drive: only, as you suspect, shares created after the install. So I'm curious how you ended up with an entirely insufficient 5 GB system Pool.
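For reference, the following should show both the ROOT pool membership and the real size of its host device (sdi is assumed from your earlier usage output):

btrfs fi show /
lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT /dev/sdi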

3 Likes

Unit is still running, so when I get home from work I can run the fi show.

1 Like

LOL, must have rebooted. All I get when trying to log in from multiple PCs is

Page not found

Sorry, an unexpected internal error has occured.

If you need more help on this issue, email us at support@rockstor.com with the following information:

  • A step by step description of the actions leading up to the error.
  • Download the log.tgz file (containing the tarred and zipped server logs) available at error.tgz, and attach it to the email.

So I download the .tgz file and it’s 0kb! LOL, and farther down the rabbit hole I tumble!!!

I’m dying here!!! LOL, every step I make adds another layer of issue! LOL I think we are now 5 layers of problems here. It looks like to continue I have to extract the desktop from it’s cove and set it up on my dinner table with a Cat7 cable across my place, get a monitor and peripherals, and, and, and,
LOL, I am quickly losing interest. I was not trying to learn about every issue one could possibly have with an operating system. Not trying to bug Phill, so, if anyone else has a path forward please feel free to join in. All I know for sure is it is not going to sit on my table for more then a week or two. If nothing happens in the correct direction soon, I’m wiping then installing on a 128GB drive and remembering to chose 64GB for data storage. I’ll be watching and I’ll post again once I have the system pulled out and set up.