rockstor-pre.service crashed after updating to rockstor-5.0.5-0.x86_64.

@Hooverdan & @Flox

Keep in mind here that we cannot re-install Poetry itself on TW:

So we have to tread carefully on a TW system when it comes to wiping and re-installing Poetry itself. The installer was created before this install incompatibility existed.

Just in case this was a diagnostic route taken here!


OK, the command # /root/.local/bin/poetry install has just finished after more than 25 minutes! PyPI is certainly having a bad day. I’ve just run some speed tests: my download is 2GB and my upload is 1GB, so my connection is not where the problem lies.

Anyway, here’s the result:


Installing dependencies from lock file

Package operations: 16 installs, 0 updates, 0 removals

• Installing dbus-python (1.2.18)
• Installing distro (1.8.0)
• Installing django-oauth-toolkit (1.1.2)
• Installing django-pipeline (2.1.0)
• Installing djangorestframework (3.13.1)
• Installing gevent (23.9.1)
• Installing gunicorn (21.2.0)
• Installing huey (2.3.0)
• Installing psutil (5.9.4)
• Installing psycogreen (1.0)
• Installing psycopg2 (2.8.6)
• Installing python-socketio (5.9.0)
• Installing pyzmq (19.0.2)
• Installing six (1.16.0)
• Installing supervisor (4.2.4)
• Installing urlobject (2.1.1)

Installing the current project: rockstor (5.0.5)

A little better, but only a little.

After restarting, I seem to have a website. Here’s the Web-UI page image:

Indeed only a “little” better. Not that it should really matter, but if you clear your browser cache, is the display any more complete?

When you query the status of the rockstor-pre, nginx and rockstor services (e.g. systemctl status rockstor-pre), do they show active (or exited with success) or something else?
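
For anyone following along, here is a minimal sketch of checking the three services in one go. The function names `unit_ok` and `check_rockstor_units` are made up for illustration, and the "inactive but last result was success" rule is only a heuristic for one-shot style units: a unit that has never run can report the same values, so treat the verdict as a hint and fall back to `systemctl status` for detail.

```shell
#!/bin/sh
# Heuristic: a unit looks healthy if it is active, or if it is
# inactive but its last recorded result was "success".
unit_ok() {
    state=$1
    result=$2
    case "$state" in
        active) return 0 ;;
        inactive) [ "$result" = "success" ] ;;
        *) return 1 ;;
    esac
}

# Ask systemd about each unit named above and print a one-line verdict.
check_rockstor_units() {
    for unit in rockstor-pre nginx rockstor; do
        state=$(systemctl show -p ActiveState --value "$unit")
        result=$(systemctl show -p Result --value "$unit")
        if unit_ok "$state" "$result"; then
            echo "$unit: ok ($state/$result)"
        else
            echo "$unit: needs attention ($state/$result)"
        fi
    done
}
```

Calling `check_rockstor_units` as root on the affected machine prints one line per service.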


@Hooverdan’s suggestions will definitely be helpful here :slight_smile:

What I suspect is what one would expect given the earlier failure of your zypper update of the rockstor package. Indeed, the “text only” page you see would indicate that you do not have any of the static files, which are compiled during the RPM install. If that install never completed, it would explain why you do not see images or CSS formatting.

Could you verify what the current install state is here?

zypper info rockstor

Also, I’m a bit unclear as to what happened when you ran the zypper up --no-recommends in your first post. Did it ever complete, or did it look like it hung? Your most recent posts would indicate that it actually was hanging (although I’m not sure if zypper ends up stopping with a timeout); this is especially possible given your report of a zypper process still running in the background in one of your early messages.

If that is the case, I would try the zypper approach again and give it enough time to complete.

  • if zypper info rockstor shows that the latest is installed (5.0.5), you could try to force the reinstall (zypper install -f rockstor)
  • if zypper info rockstor returns an out-of-date version, then try zypper update --no-recommends rockstor again. I would be remiss if I didn’t remind you not to use zypper up (without a specific package) on Tumbleweed.
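
The two cases above can be sketched as a small helper. This is only an illustration: `installed_version` and `next_zypper_step` are hypothetical names, the parsing assumes zypper is run with `LC_ALL=C` so the field is labelled `Version`, and the script only prints the suggested command rather than running it.

```shell
#!/bin/sh
# Extract the Version field from `zypper info` style output on stdin.
installed_version() {
    awk -F' *: *' '$1 == "Version" { print $2; exit }'
}

# Print the zypper command suggested in the list above,
# given the installed version and the version you expect.
next_zypper_step() {
    installed=$1
    wanted=$2
    if [ "$installed" = "$wanted" ]; then
        echo "zypper install -f rockstor"
    else
        echo "zypper update --no-recommends rockstor"
    fi
}
```

For example: `next_zypper_step "$(LC_ALL=C zypper info rockstor | installed_version)" 5.0.5-0`.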

In any case, there is a big slowdown somewhere that is causing a lot of issues, so either it’ll resolve itself (a slow server somewhere, such as PyPI), or there is something about the hardware. You did mention that other processes were running fine, so that seems less likely, but still… don’t hesitate to detail anything you can think of about your hardware (the disk where the OS is installed, in particular).

Hope this helps!

  • OK, new day: sudo new install of TW from scratch

    • This time I filmed the TW installation procedure; here are the error highlights:
    • FAILED ! Failed to activate DM RAID disks.
      See systemctl status dmraid-activation.service for details.
    • FAILED ! failed to start SUSE JeOS …zard - Create systemsnapshot.
      See systemctl status jeos-firstboot-snapshot.service for details.
    • FAILED ! failed to start SESSION c1 for user postgres.
      See systemctl status session-c1.scope for details.
    • FAILED ! failed to start SESSION c2 for user postgres.
      See systemctl status session-c2.scope for details.
    • FAILED ! failed to start SESSION c3 for user postgres.
      See systemctl status session-c3.scope for details.
  • view details :

    • systemctl status dmraid-activation.service
× dmraid-activation.service - Activation of DM RAID sets
Loaded: loaded (/usr/lib/systemd/system/dmraid-activation.service; enabled; preset: enabled)
Active: failed (Result: exit-code) since Tue 2023-10-24 08:59:00 CEST; 27min ago
Main PID: 1210 (code=exited, status=1/FAILURE)
CPU: 4ms

oct. 24 08:59:00 localhost systemd[1]: Starting Activation of DM RAID sets...
oct. 24 08:59:00 localhost dmraid[1210]: no raid disks
oct. 24 08:59:00 localhost systemd[1]: dmraid-activation.service: Main process exited, code=exited, status=1/FAILURE
oct. 24 08:59:00 localhost systemd[1]: dmraid-activation.service: Failed with result 'exit-code'.
oct. 24 08:59:00 localhost systemd[1]: Failed to start Activation of DM RAID sets.
  • systemctl status jeos-firstboot-snapshot.service
    × jeos-firstboot-snapshot.service - SUSE JeOS First Boot Wizard - create system snapshot
    Loaded: loaded (/usr/lib/systemd/system/jeos-firstboot-snapshot.service; disabled; preset: disabled)
    Active: failed (Result: exit-code) since Tue 2023-10-24 09:08:54 CEST; 19min ago
    Main PID: 2093 (code=exited, status=1/FAILURE)
    CPU: 23ms
    
    oct. 24 09:08:54 PhilRock.home systemd[1]: Starting SUSE JeOS First Boot Wizard - create system snapshot...
    oct. 24 09:08:54 PhilRock.home jeos-firstboot-snapshot[2097]: The config 'root' does not exist. Snapper is probably not configu>
    oct. 24 09:08:54 PhilRock.home jeos-firstboot-snapshot[2097]: For further instructions, see 'man snapper'.
    oct. 24 09:08:54 PhilRock.home systemd[1]: jeos-firstboot-snapshot.service: Main process exited, code=exited, status=1/FAILURE
    oct. 24 09:08:54 PhilRock.home systemd[1]: jeos-firstboot-snapshot.service: Failed with result 'exit-code'.
    oct. 24 09:08:54 PhilRock.home systemd[1]: Failed to start SUSE JeOS First Boot Wizard - create system snapshot.
    
  • systemctl status session-c1.scope
    × session-c1.scope - Session c1 of User postgres
    Loaded: loaded (/run/systemd/transient/session-c1.scope; transient)
    Transient: yes
    Active: failed
    CPU: 0
    
    oct. 24 09:08:54 PhilRock.home systemd[1]: session-c1.scope: Couldn't move process 1780 to requested cgroup '/user.slice/user-475.slice/se>
    oct. 24 09:08:54 PhilRock.home systemd[1]: session-c1.scope: Failed to add PIDs to scope's control group: No such process
    oct. 24 09:08:54 PhilRock.home systemd[1]: session-c1.scope: Failed with result 'resources'.
    oct. 24 09:08:54 PhilRock.home systemd[1]: Failed to start Session c1 of User postgres.
    
  • systemctl status session-c2.scope
    × session-c2.scope - Session c2 of User postgres
    Loaded: loaded (/run/systemd/transient/session-c2.scope; transient)
    Transient: yes
    Active: failed
    CPU: 0
    
    oct. 24 09:08:54 PhilRock.home systemd[1]: session-c2.scope: Couldn't move process 1820 to requested cgroup '/user.slice/user-475.slice/se>
    oct. 24 09:08:54 PhilRock.home systemd[1]: session-c2.scope: Failed to add PIDs to scope's control group: No such process
    oct. 24 09:08:54 PhilRock.home systemd[1]: session-c2.scope: Failed with result 'resources'.
    oct. 24 09:08:54 PhilRock.home systemd[1]: Failed to start Session c2 of User postgres.
    
  • systemctl status session-c3.scope
    × session-c3.scope - Session c3 of User postgres
    Loaded: loaded (/run/systemd/transient/session-c3.scope; transient)
    Transient: yes
    Active: failed
    CPU: 0
    
    oct. 24 09:08:54 PhilRock.home systemd[1]: session-c3.scope: Couldn't move process 1833 to requested cgroup '/user.slice/user-475.slice/se>
    oct. 24 09:08:54 PhilRock.home systemd[1]: session-c3.scope: Failed to add PIDs to scope's control group: No such process
    oct. 24 09:08:54 PhilRock.home systemd[1]: session-c3.scope: Failed with result 'resources'.
    oct. 24 09:08:54 PhilRock.home systemd[1]: Failed to start Session c3 of User postgres.
    
  • Version at end of installation :
    • zypper info rockstor
      Loading repository data...
      Reading installed packages...


      Information for package rockstor:
      ---------------------------------
      Repository     : @System
      Name           : rockstor
      Version        : 4.5.8-0
      Arch           : x86_64
      Vendor         : YewTreeApps
      Installed Size : 9.5 MiB
      Installed      : Yes
      Status         : up-to-date
      Source package : rockstor-4.5.8-0.src
      Upstream URL   : https://rockstor.com/
      Summary        : Btrfs Network Attached Storage (NAS) Appliance.
      Description    :
      Software raid, snapshot capable NAS solution with built-in file integrity protection.
      Allows for file sharing between network attached devices.
      
      
  • I log on to Web-UI to launch the update procedure
    • Activate Testing Updates and click Start Update.
      The update process starts and counts down for 5 minutes,
      then a second 3-minute countdown begins: “Rockstor is being updated to the latest version in the selected update channel. Please wait.”
    • Then I lost the GUI with a 502 Bad Gateway, but I kept waiting (which I hadn’t done on my first attempts).
    • Console status check :
      • zypper info rockstor
        System management is locked by the application with pid 2897 (/usr/bin/zypper).
        Close this application before trying again.
        
        Update still in progress! I’m waiting …
      • 10min later: ps aux|grep zypper
        root 2897 0.6 2.9 352256 113740 ?       Sl 09:47 0:04 /usr/bin/zypper --non-interactive install rockstor
        root 3341 0.0 0.0 6588 2304 pts/0 S+ 10:00 0:00 grep --color=auto zypper
        
        I’m waiting, I’m waiting
      • 2h after the start of the update: no change
        # zypper info rockstor
        System management is locked by the application with pid 2897 (/usr/bin/zypper).
        Close this application before trying again.
        
      • I decide to kill the process (kill -9 2897)
        and rerun the info check:
        # zypper info rockstor
        Loading repository data...
        Reading installed packages...
        
        
        Info about package rockstor :
        ----------------------------------
        Repository: Rockstor-Testing
        Name: rockstor
        Version: 5.0.5-0
        Architecture: x86_64
        Manufacturer: YewTreeApps
        Installed size: 9.6 MiB
        Installed: Yes
        Status: Up-to-date
        Source Package: rockstor-5.0.5-0.src
        Upstream URL: https://rockstor.com/
        Summary: Btrfs Network Attached Storage (NAS) Appliance.
        Description:
        Software raid, snapshot capable NAS solution with built-in file integrity protection.
        Allows for file sharing between network attached devices.
        
        
      • I then launch a forced installation of rockstor.
        # zypper install -f rockstor
        Loading repository data...
        Read installed packages...
        Forced installation of rockstor-5.0.5-0.x86_64 from the Rockstor-Testing repository.
        Resolve package dependencies...
        
        The following package will be reinstalled:
        rockstor
        
        1 package to reinstall.
        Total download size: 2.3 MiB. Already cached: 0 B. No additional space will be used or freed after the operation.
        Continue? [o/n/v/...? displays all options] (o):
        Recovery: rockstor-5.0.5-0.x86_64 (Rockstor-Testing) (1/1), 2.3 MiB
        Recovery: rockstor-5.0.5-0.x86_64.rpm ..............................................................................[done (2,2 MiB/s)]
        
        Search for file conflicts: .................................................................................................[done]
        No static/config-backups directory found.
        (1/1) Installation of: rockstor-5.0.5-0.x86_64 .....................................................................................[done]
        Execute %posttrans script "rockstor-5.0.5-0.x86_64.rpm" -------------------------------------------------------------------------[-]
        Retrieving Poetry metadata
        
        The latest version (1.1.15) is already installed.
        ...
        ...
        Post-processed 'admin/css/rtl.css' as 'admin/css/rtl.30f903442dc5.css'
        Post-processed 'admin/css/widgets.css' as 'admin/css/widgets.5e372b41c483.css'
        Post-processed 'storageadmin/css/bootstrap-editable.css' as 'storageadmin/css/bootstrap-editable.79416658ad31.css'
        Post-processed 'storageadmin/css/bootstrap-switch.min.css' as 'storageadmin/css/bootstrap-switch.min.45abe3ae6425.css'
        Post-processed 'storageadmin/css/bootstrap-timepicker.css' as 'storageadmin/css/bootstrap-timepicker.b88ccda5a95a.css'
        Post-processed 'storageadmin/css/bootstrap.min.css' as 'storageadmin/css/bootstrap.min.9d54c9642ad5.css'
        Post-processed 'storageadmin/css/dataTables.bootstrap.min.css' as 'storageadmin/css/dataTables.bootstrap.min.4486edc8769e.css'
        Post-processed 'storageadmin/css/datepicker.css' as 'storageadmin/css/datepicker.97529c0f0f2c.css'
        Post-processed 'storageadmin/css/style.css' as 'storageadmin/css/style.0d9b0ddc03d5.css'
        Post-processed 'rest_framework/css/bootstrap-theme.min.css' as 'rest_framework/css/bootstrap-theme.min.66b84a04375e.css'
        Post-processed 'rest_framework/css/bootstrap-tweaks.css' as 'rest_framework/css/bootstrap-tweaks.46ed116b0edd.css'
        Post-processed 'rest_framework/css/bootstrap.min.css' as 'rest_framework/css/bootstrap.min.77017a69879a.css'
        Post-processed 'rest_framework/css/default.css' as 'rest_framework/css/default.789dfb5732d7.css'
        Post-processed 'rest_framework/css/font-awesome-4.0.3.css' as 'rest_framework/css/font-awesome-4.0.3.c1e1ea213abf.css'
        Post-processed 'rest_framework/css/prettify.css' as 'rest_framework/css/prettify.a987f72342ee.css'
        
        506 static files copied to '/opt/rockstor/static', 576 post-processed.
        
        ROCKSTOR BUILD SCRIPT COMPLETED
        
        If installing from source, from scratch, for development:
        1. Run 'cd /opt/rockstor'.
        2. Run 'systemctl start postgresql'.
        3. Run 'export DJANGO_SETTINGS_MODULE=settings'.
        4. Run 'poetry run initrock' as root (equivalent to rockstor-pre.service).
        5. Run 'systemctl enable --now rockstor-bootstrap'.
        PhilRock:/home/philadm #                                                                                                                       
        
        It still took over 1h30 to complete the process. You really have to be ultra-patient with this update, and it really needs a progress indicator so that users don’t quit midstream, as I did previously.
      • I then just followed the recommendations:

        ```
        PhilRock:/home/philadm # cd /opt/rockstor
        PhilRock:/opt/rockstor # systemctl start postgresql
        PhilRock:/opt/rockstor # export DJANGO_SETTINGS_MODULE=settings
        PhilRock:/opt/rockstor # poetry run initrock
        bash: poetry: command not found
        PhilRock:/opt/rockstor # systemctl start rockstor-pre.service
        PhilRock:/opt/rockstor # systemctl enable --now rockstor-bootstrap
        ```
        
      • Then check the Web-UI status:
    • Conclusion


@philuser Thanks for your report here. All the failed services you detailed are from upstream, re:

And we don’t depend on any of these. TW often breaks stuff, which is why we currently use it to ‘see the future’. In time we intend to adopt it, or Slowroll, or the like as our main platform, but we are not there just yet.

All the service juggling and Poetry re-installing should not actually be necessary here, as they are done by the rpm install. But your reported times (in much detail, thanks for that) are massive. Could you give us some indication of the hardware you are experiencing these times on? It should help us adjust our expected times, as it were.

And yes, that progress bar is just a set time, left over from ages ago. In fact, many years ago I upped it to account for some slower development machines I was using at the time. We need, as you say, something a bit more sophisticated :), like a mechanism that monitors the poetry-install.txt file, for example. Also note that a zypper force re-install re-does the entire Poetry install anyway, hence the time here. We took to wiping and rebuilding more recently as it was the only way we could transition from Py2.7 (current stable and in downloadable installers) to our current Py3.9 based testing updates offerings. But that has, in turn, slowed stuff up. Inevitable really, but it gives us greater freedom to bump our Python versions, which we have now done three times in testing. And feedback such as yours here helps with finding out what we need to do to make that work.
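
As a rough illustration of what "monitoring poetry-install.txt" could look like from a console, here is a minimal sketch. It is not how Rockstor does or will do this: the default path `/opt/rockstor/poetry-install.txt` is an assumption, the function names are invented, and the pattern simply counts lines shaped like the `Installing <name> (<version>)` entries Poetry printed earlier in this thread.

```shell
#!/bin/sh
# Count per-package "Installing <name> (<version>)" lines in a Poetry log.
# Header lines such as "Installing dependencies from lock file" do not match.
count_installed() {
    grep -c -E 'Installing [^ ]+ \([^)]+\)' "$1"
}

# Print a progress line every 10 seconds until interrupted with Ctrl-C.
# The default log path is an assumption; pass the real one as $1.
watch_poetry_log() {
    log=${1:-/opt/rockstor/poetry-install.txt}
    while :; do
        if [ -r "$log" ]; then
            echo "$(date +%T) packages installed so far: $(count_installed "$log")"
        else
            echo "$(date +%T) waiting for $log"
        fi
        sleep 10
    done
}
```

Running `watch_poetry_log` in a second terminal during an update would at least show whether the count is still moving.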

So, as you have surmised, what you have here is likely a timed-out update (with some understandable interruptions). Also note that on TW (development testing only for us currently) we also do an entire zypper dup, but only via the Web-UI. I think the command line still states zypper up, which is not correct for TW, but the Web-UI does do the right thing with upstream updates on TW. TW is great, but there are sometimes GBs of updates if left for just a little while, so we really need to improve our user feedback during this time, as is evident from your extremely persistent endeavours here.
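
In the meantime, rather than killing a long-running zypper outright, one option is to poll until it exits or a generous time limit passes. The sketch below is only that, a sketch: `wait_for_pid` is a hypothetical helper, it relies on `kill -0` to test whether the process still exists, and it should be run as root, since a non-root user cannot signal root's zypper and would wrongly see it as gone.

```shell
#!/bin/sh
# Wait until process PID exits, polling every INTERVAL seconds.
# Returns 0 if it exited within roughly TIMEOUT seconds, 1 otherwise.
wait_for_pid() {
    pid=$1
    timeout=$2
    interval=${3:-5}
    waited=0
    # kill -0 sends no signal; it only reports whether the PID can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

With the PID from the lock message in this thread, that would be something like `wait_for_pid 2897 7200 30 && zypper info rockstor`.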

You might be interested in exactly what we do during an install / update, and why it’s mostly better to rely on the package to do its thing. Take a look at our rockstor-rpmbuild repository here:

The spec file there (master branch for stable rpm builds / testing for testing rpms) has all the steps we do. Hence relying on a zypper force re-install to help ensure all is done in order, which is basically where you ended up.

The recommendations you followed are intended for source builds, i.e. outside of the rpm framework, and allow folks to build from source during development only. All good, but quite confusing at first.
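
For source-build experiments, the "poetry: command not found" seen above is most likely just a PATH matter, since earlier in this thread Poetry was invoked as /root/.local/bin/poetry. A minimal sketch of locating it follows; `find_poetry` is a hypothetical helper, and the `$HOME/.local/bin` fallback is an assumption based on that earlier command.

```shell
#!/bin/sh
# Print the path of a usable poetry executable, or return 1 if none is found.
find_poetry() {
    if command -v poetry >/dev/null 2>&1; then
        command -v poetry
    elif [ -x "${HOME}/.local/bin/poetry" ]; then
        printf '%s\n' "${HOME}/.local/bin/poetry"
    else
        return 1
    fi
}
```

As root in /opt/rockstor, `"$(find_poetry)" run initrock` would then stand in for the bare `poetry run initrock` step.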

Let’s hear about the hardware you had these slow-downs on, as I’m curious about these delays, which, as you say, have led to some false fails and to you taking a near-developer route to instantiation.

All our rpms are first test-installed on their target platform before they are uploaded, so we are pretty sure that each has passed an install. But updates (especially from much older versions) are necessarily more complex, and we generally only approach updates from the last stable rpm to the next. However, I did have a successful 4.5.8-0 (Py2.7 based) to 5.0.4 (Py3.8 at the time) update more recently. I also encountered the 503 for a while, but all was well with a bit of patience.

I think, all in, what we need is to surface install progress. Tricky if you are changing the legs and feet you are running on (Python/Django/JS). But with some thought we could maybe pull this off. Not entirely sure how just yet, but maybe a temporary change to our nginx to expose the poetry-install.txt file, or something with a refresh-me button of sorts, thereby kicking folks back into the Django Python once it is put in place by the underlying OS again (via rpm/zypper).

Thanks for all the details and involvement on this one:
I particularly liked:

Although again, all of these are independent of us, and are likely fixed or behaving differently, since we haven’t rebuilt our installers for some time. Incidentally, this can be done by anyone via our following repository:

However, as is so common with our TW target, stuff is broken until it isn’t, and then it is again!! Too many moving parts, but good to have access to all the same.

Hopefully once we rebuild our installers again (post the indicated Poetry install blocker issue resolution) there should be more options that work for your hardware again.

But @Hooverdan runs an ongoing TW instance, so there’s that :slight_smile: .

Thanks again for the exposition of your experience here. TW is not our preferred target for a reason, and this is pretty much it. So I hold out hope for Slowroll, or whatever it ends up being called, or maybe ALP will work for us. But all in good time, and Leap has a couple of years left in it yet, with Leap 15.6 underway already. My own developments are now on Leap 15.5, and we provide testing rpms for that OS version but, as yet, no installers!! Bit by bit.

Hope that helps, at least for some context.


@philuser, great to hear that your install was finally successful. As @phillxnet already mentioned, would you mind sharing your hardware specs (CPU, network card, SSD vs. HDD)? I am still surprised that an installation would take such a long time, as I have never experienced it that way, and I am certainly not using very new hardware, plus the VMs I set up usually only run with one or two CPUs and a limited amount of memory allocated.


Thanks @phillxnet for this explanation.
Concerning the machine’s characteristics, here is my inxi output:

PhilRock:~ # inxi
CPU: dual core Intel Core i3 530 (-MT MCP-) speed/min/max: 1666/1200/2933 MHz
Kernel: 6.2.4-1-default x86_64 Up: 1h 55m Mem: 687.6/3772.3 MiB (18.2%)
Storage: 111.79 GiB (2.1% used) Procs: 172 Shell: Bash inxi: 3.3.27

Let me know if more detail would be useful.

  • On the other hand, since updating to Rockstor 5.0.5-0 it has been impossible for me to connect via SSH.
    To be able to do so again, I’ve had to temporarily (until you find a solution) disable the /etc/ssh/sshd_config.d directory (renamed to /etc/ssh/sshd_config.d.back). Otherwise I systematically get a PAM error.
    But perhaps that’s a subject for a new thread, for better reference for others :wink: ; it’s up to you.

Ah, OK, so that CPU is about 13 years old then. For your OS drive, do you have an SSD or an older HDD connected?

Assuming the ethernet adapter for the network is on the motherboard, it will likely have a similar age. Depending on the hard drive type this might explain the long installation times…
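
If it helps to answer the SSD vs. HDD question from the console, the kernel exposes a per-device "rotational" flag. The sketch below reads it; the helper names are invented for illustration and the device name passed in (e.g. `sda`) is whatever your OS drive happens to be.

```shell
#!/bin/sh
# Map the kernel's "rotational" flag to a human-readable drive type.
disk_kind() {
    case "$1" in
        0) echo "SSD or other non-rotational device" ;;
        1) echo "rotational HDD" ;;
        *) echo "unknown" ;;
    esac
}

# Report the type of one block device, e.g. `report_disk sda`.
report_disk() {
    dev=$1
    # Fall back to an empty flag if the sysfs entry cannot be read.
    flag=$(cat "/sys/block/$dev/queue/rotational" 2>/dev/null) || flag=""
    echo "$dev: $(disk_kind "$flag")"
}
```

`lsblk -d -o NAME,ROTA` shows the same flag for all disks at once.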

Yes and no, it’s just like good French cooking made with good, old-fashioned leftovers: a ZX-55M motherboard with an Intel i3 CPU and 4GB DDR3 RAM (all picked up for 24€), a 128GB Kingston SSD (system), 1 x 1TB NVMe (fresh data) and 2 x 2TB disks for mirroring backup (all disks in stock in my drawers).

This in no way explains the incredible latency of the installation, but I’ve just had an apology message from my network provider for the significant downgrading of its services since Sunday (it kindly gave me 1 month free for this inconvenience :trumpet:).


Love the French cooking analogy, and that your network provider took the blame, at least for the download issues you’ve experienced.


I could have gone further:

a ZX-55M motherboard - the cooker, the oven

a Kingston SSD (system) - My pots and pans

1 x 1TB NVMe - the fridge

2 x 2TB disks - my good wine cellar
