3.9 to 4.1 update report

Just wanted to post about how I went with the update from 3.9 to the new stable 4.1, and to flag a few possible issues in case the information is of any use.

Process:

  1. Created backup from web-UI of 3.9
  2. Disconnected data drives
  3. Fresh install of 4.1 (text installer, all automated/defaults) onto a new drive, as I was moving from a USB drive to a SATA SSD anyway (at last)
  4. Reconnected data drives
  5. Import pools
  6. Restore config from backup and wait until system settled (as per watching “top” from a shell)
  7. Reboot

What worked:

  • Data import/pools/shares all intact (important parts are duplicated to a backup service)
  • NFS share configuration kept - just had to re-enable the service
  • Extra users I had added were kept as expected. Needed to re-add my ssh public key and enable that user via sshd_config
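For reference, the SSH re-enable step amounted to re-adding the public key to the user's `~/.ssh/authorized_keys` and, if your sshd_config restricts logins with an `AllowUsers` line, adding the user there. A sketch of the relevant fragment (the username is a placeholder, and the `AllowUsers` line is only relevant if your config already uses one):

```
# /etc/ssh/sshd_config (fragment)
# Only needed if logins are restricted via AllowUsers;
# "myuser" is a placeholder for the actual account name.
AllowUsers root myuser
```

followed by a `systemctl restart sshd` to pick up the change.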

Mixed:

  • Rock-ons worked once I enabled the service; however, some did not show up as installed in the web-UI. A docker ps showed them running, e.g. nzbget, and I could access its web interface, but the web-UI still let me re-install it, after which it showed up as installed. The same happened for Emby, Unifi and Transmission, if I recall correctly
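One way to cross-check which Rock-ons are actually running versus what the web-UI shows is to diff an expected list against `docker ps` output. A sketch (not a Rockstor feature; the container names are examples from this thread, and real Rock-on container names may differ):

```python
import subprocess

# Rock-ons you expect, by container name (example values from this thread).
EXPECTED = {"nzbget", "emby", "unifi", "transmission"}

def running_containers(sample=None):
    """Return the set of running container names.

    Pass `sample` (newline-separated names) to work offline; otherwise
    this shells out to `docker ps --format '{{.Names}}'`.
    """
    if sample is None:
        sample = subprocess.run(
            ["docker", "ps", "--format", "{{.Names}}"],
            capture_output=True, text=True, check=True,
        ).stdout
    return {line.strip() for line in sample.splitlines() if line.strip()}

# Offline demo with example output; drop `sample=` to query Docker for real.
demo = running_containers(sample="nzbget\nemby\n")
print("running:", sorted(demo))
print("expected but not running:", sorted(EXPECTED - demo))
```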

Not so much:

  • Samba share configurations weren’t restored and needed re-entering. Could this be because I had manually edited some parts of the config file previously? I changed the min protocol to get around older Sonos issues. Works after setting up the shares again.
  • Scheduled tasks - They were restored, but point to the wrong shares and I don’t seem to be able to edit this. One example attached as a screenshot.
    I believe this is because shares are referenced by a number in the backup file, e.g. share 18, but these numbers must have changed on import?

One more suggestion. It would be great if we could export a backup that is not gzipped. I tried to reference the json file when manually restoring some settings but found it rather difficult to read after decompressing. A more human readable version might help.
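In the meantime, a decompressed backup can be made readable after the fact by re-indenting it. A minimal sketch, assuming the backup is valid JSON once decompressed (the filename and field names below are illustrative placeholders, not the real backup layout):

```python
import gzip
import json
import os
import tempfile

def pretty_backup(path):
    """Decompress a gzipped JSON config backup and return it indented."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.dumps(json.load(f), indent=2)

# Demo with a throwaway file standing in for a real backup
# (a real one would be the .json.gz downloaded from the Web-UI).
sample = [{"model": "share", "fields": {"name": "media"}}]
path = os.path.join(tempfile.mkdtemp(), "backup.json.gz")
with gzip.open(path, "wt", encoding="utf-8") as f:
    json.dump(sample, f)
print(pretty_backup(path))
```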

Overall, glad to be on a much more up-to-date platform now, after previously only having had time to do a tiny bit of testing on another system back when we had to install openSUSE and Rockstor manually on top.

4 Likes

@mattyvau Nice to hear from you again. And thanks for the detailed report. Much appreciated.

I can chip in on:

and yes I think likewise that:

We most likely have a bug here. At one time our API used numbers to reference shares, but we now use names, so we likely have some work to do on the save/restore to re-connect them somewhere along the way. Nice find and report. Would you mind making an issue for this save/restore bug here:


as that would be a nice one to get sorted in time, and we would then have the attribution of the report. If you do, link back to this forum thread for context.
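The fix the restore would need can be sketched as a remap keyed on the share *name*, which survives the re-install even when the numeric id changes. This is only an illustration of the idea, not Rockstor's actual code; all field and function names here are hypothetical:

```python
def remap_task_shares(tasks, old_shares, new_shares):
    """Re-point tasks from old numeric share ids to the matching id on
    the new install, using the share name as the stable key.

    tasks:      [{"name": ..., "share": old_id}, ...]  (hypothetical shape)
    old_shares: {old_id: share_name} as recorded in the backup
    new_shares: {share_name: new_id} on the restored system
    """
    remapped = []
    for task in tasks:
        name = old_shares[task["share"]]
        remapped.append({**task, "share": new_shares[name]})
    return remapped

# Example: share "media" was id 18 in the backup, id 3 after import.
tasks = [{"name": "nightly-snap", "share": 18}]
print(remap_task_shares(tasks, {18: "media"}, {"media": 3}))
# → [{'name': 'nightly-snap', 'share': 3}]
```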

Yes, just delete and re-create.

Curious. Did you refresh the Web-UI? It can take a while for them to show up. But at least they were restored and could be re-installed. This one is likely to be tricky to generate a reproducer for; it may well be we have some db contention or the like. It would be good to get to the root cause of this, and a reproducer is the next step I’m thinking. If anyone finds a way to reliably reproduce this issue then dandy. Thanks for reporting this; we’ve not seen that or had it reported before. How many Rock-ons did you have, and how many of them failed to show up in the Web-UI?

We use some basic tools to do this export and, from memory, I think it’s basically JSON in a set of square brackets. If the overall structure remains intact you can pop the majority of it into a JSON formatter and simply re-add the outer structure afterwards. Things have gotten massively larger since @Flox did the excellent work on the Rock-ons restore, but the base structure is still simple. I’ve created the following doc issue for this:

So in short it’s not the compression that makes it unreadable; it’s the tools we use to create the json internals which, again from memory, are wrapped in a dictionary dump. I’m not keen on adding further complexity re dumping a pretty-format version of the file, as this in turn may introduce bugs we could do without, but let’s see if that doc issue has any takers; if so we get a fresh look at the easiest way to introduce readability after the fact. It’s likely it could still be re-imported in that state once the required formatting must-haves are established. Again, this is from memory as I’ve not looked at that area for a bit, but it’s basically mostly json I believe.
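If the inner values really are stringified dict dumps rather than nested JSON (an assumption based on the description above, not verified against a real backup), `ast.literal_eval` on the offending strings is one way to recover structure before pretty-printing:

```python
import ast
import json

def normalise(value):
    """Recursively turn stringified dict/list dumps into real objects
    so the whole document can be pretty-printed as one JSON tree."""
    if isinstance(value, str):
        try:
            return normalise(ast.literal_eval(value))
        except (ValueError, SyntaxError):
            return value  # plain string, leave as-is
    if isinstance(value, dict):
        return {k: normalise(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalise(v) for v in value]
    return value

# Example record whose "config" field is a dumped dict, not nested JSON
# (the field names here are made up for illustration).
record = [{"model": "rockon", "config": "{'port': 8080, 'host': 'nzbget'}"}]
print(json.dumps(normalise(record), indent=2))
```

One caveat of this sketch: a plain string that happens to parse as a Python literal (e.g. "123") would be converted too, so it is a viewing aid rather than a safe round-trip.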

Thanks again for the report, and I’m glad you’re now on the new v4 “Built on openSUSE” shiny, and that the installer and at least the pool/share/snapshot import went relatively smoothly.

This is still a recognised option and we have the instructions to go this route on the downloads page. Far quicker to use the kiwi-ng created installer though.

Cheers.

3 Likes

Just to add, I usually dump it into Notepad++ with the JSON Formatter plug-in. While it gives an error when reading it, it gives me the tree to peruse.


4 Likes

@Flox, @phillxnet I took the liberty to open an issue on github for the error during parsing of the backup file in a json viewer (like I mentioned above). I am not sure that this is related to the actual issue encountered during backup restoration, but if it ends up being more than a cosmetic nuisance, then you can find it here:

2 Likes

Thanks for the detailed responses.

No problem, will get onto it.

Easy enough, thanks. Avoided doing that initially in case there were any risks.

I did refresh a couple of times. It’s still possible I didn’t allow the system enough time, though as I say things seemed to have settled down on the system monitor. Could well be just my system so that’s fine.

I have 6 Rock-ons; 3 were visible. I can’t think of any commonality between them, e.g. one had extra storage added later, the other two did not.

No worries, I’m not overly familiar with JSON so wasn’t too sure how to go about viewing it properly. Gedit took ages to open it and didn’t do any highlighting despite detecting JSON, and I didn’t know about the Notepad++ plug-in, which would have helped.

2 Likes

Hi @mattyvau,

It could be a matter of time, indeed. It’s a bit disheartening to see you had issues with your Rock-ons restore, as we just fixed a somewhat related issue, but it may be that 6 Rock-ons should be considered a lot here. I don’t recall the size of each of the Rock-ons you listed, but it can take quite some time to download the images and extract all the layers even before Rockstor can start their respective containers. It unfortunately does reflect our need to improve our transparency and the surfacing of what is happening in the background when a config restore is triggered. This is one of the main areas of interest for me once we have made good progress on addressing our technical debt.
Do you see anything of interest in the Rockstor log (rockstor.log) around the time of the restore, by any chance? I can’t remember whether we surface background task errors in the logs when not in debug mode, but it could be worth a look. I believe we now (as of one of our most recent releases) log the Rock-ons to be restored as they come, so it could prove useful to see what happened here.
If the root cause here is related to the number of Rock-ons that were scheduled to be restored, we would need to further test this out and improve the robustness of the process. Incidentally, we’re currently working on substantially improving our functional testing so that we can hopefully catch these systematically.
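A quick way to sift the log for restore-related entries is a simple case-insensitive filter. A sketch (the log path, timestamps, and message wording below are assumptions for illustration, not verified v4 output):

```python
import re

def restore_lines(log_text, pattern=r"rock-?on|restore"):
    """Return log lines that mention Rock-ons or the config restore."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [line for line in log_text.splitlines() if rx.search(line)]

# Example against a stand-in for the contents of rockstor.log
# (real entries will look different; this only shows the filtering).
sample = """\
[01/May/2022 10:00:01] INFO Started config restore
[01/May/2022 10:00:05] INFO Restoring Rock-on: nzbget
[01/May/2022 10:00:09] ERROR Failed to restore Rock-on: emby
[01/May/2022 10:01:00] INFO NFS exports refreshed
"""
for line in restore_lines(sample):
    print(line)
```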

Thank you so much for all the feedback and taking the time to report these, that is already extremely valuable to everybody.

1 Like

I wish I could delete this - re-read and caught the bit I missed just minutes after posting this. Please ignore!

I’ve just gone through this but I’ve missed something. I installed 4.1 on a new disk with all the data disks disconnected. Went through initial config and updates (had to open a new subscription 'cause I’d forgotten to renew the old one - silly).
Restored the config backup and left it for a while.
Shutdown and re-connected the disks.
No pools/shares - re-restored the config backup and left it a while but no change.

Here I see a reference to ‘import the pools’. Could you elucidate please - and include any other steps I’ve missed out?

Thanks.

I restarted my build on 4.1, updated, shutdown, connected the disks, started up, imported the disks/partitions/shares and left it for a good hour.

Restored the backed up config and left it again.

It looks pretty good except that the Samba shares don’t seem to have been carried over. I’ve got notes for them though and I have the original 3.9 boot disk, if necessary.
Should I have started the Samba service before restoring the config?
If so, would it be good/bad to restore the config again now that Samba is running, or should I start from scratch again?

One other oddity is that in the list of shares the first column, ‘size’, is actually the size of the pool, not the share, for all of the entries. Has anyone seen this before, or have I managed something unique?

Thanks.

@jmangan Glad things went sort-of OK. And thanks for the update.
Re:

This should all be automatic. We, as always, have some improvements to make so it’s likely a bug. They definitely are restored most of the time but as always things can go wrong. But mostly right in this case.

You could try restoring/re-restoring the config again, but if there are only a few exports (of shares) to enable then you may want to just re-do them by hand.

Glad you worked out the order re import pool (for shares, snapshots, clones) before config restore. We have the following issue open to help folks with that in the future re a Web-UI prompt:

This is kind of a feature (read: bug :slight_smile: ) where we don’t currently enforce quotas due to their relative immaturity. The next major version of Rockstor likely will enable this; in the distant past we did enforce it, but that didn’t work out due to quota issues within btrfs itself. In the interim there has been a lot of quota work upstream, but we haven’t gotten around to re-enabling / observing this. So for now the size of shares is cosmetic, and during the restore they inherit the pool size.

I think this may have been reported before and has been seen, but I at least have failed to create an issue for the observation. I’ve now just done this here:

Thanks again for reporting your experience and glad you’ve pretty much made it over the inevitably large transition to v4 “Built on openSUSE”. Welcome to the future :slight_smile: .

Hope that helps.

2 Likes

Thanks for the rapid response and the affirmation, I was a bit nervous.

I have to say the process was really pretty smooth and quite straightforward.

I had three ‘errors’:

  • the first was the reboot after installation freezing on ‘switch to radeondrmfb from EFI VGA’. A ‘nomodeset’ fixed that and had nothing to do with Rockstor. It seems to be a ‘common’ openSUSE problem.
  • the second was the import / restore config order which was purely down to my inability to read simple instructions in a linear manner.
  • the third is the ‘loss’ of Samba shares which, for me, is no big deal but I suppose some people may find quite onerous.

I knew the quotas weren’t enforced in 3.9, but since the size showed up on the display it was a little unsettling to see it missing completely in the V4 rebuild. I’m reassured I haven’t broken anything or left a gaping hole to fall into ‘later’.

Top job all!

And many thanks again.

2 Likes