Duplicate Appliance IDs Generated

[Please complete the template below with details of the problem reported on your Web-UI. Be as detailed as possible. Community members, including developers, will try to help. Thanks for your time in reporting this issue! We recommend purchasing commercial support for expedited help directly from the developers.]

Brief description of the problem

I’m busy migrating some Thecus NAS devices to RockStor. However, I’m unable to link the 2 appliances I have built because they both have the same unique Appliance ID. How could this have happened? The model numbers of each hardware device are different, with different host names and different IPs. They were both installed from the same USB key. They both run from an internal SATA SSD device.

Detailed step by step instructions to reproduce the problem

Install latest stable build … 3.9.1-0


Error Traceback provided on the Web-UI

Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/storageadmin/views/appliances.py", line 110, in post
    appliance.save()
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 734, in save
    force_update=force_update, update_fields=update_fields)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 762, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 846, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 885, in _do_insert
    using=using, raw=raw)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/manager.py", line 127, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/query.py", line 920, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/sql/compiler.py", line 974, in execute_sql
    cursor.execute(sql, params)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/utils.py", line 98, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
IntegrityError: duplicate key value violates unique constraint "storageadmin_appliance_uuid_key"
DETAIL:  Key (uuid)=(REMOVED) already exists.

Wow…from 10 days ago. Having read this, it is not the same issue, but I will provide the same details requested.

OK I can confirm that “cat /sys/class/dmi/id/product_uuid” produces the same UUID on each server, and it matches the appliance ID.
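For anyone else wanting to verify this on their own hardware, a few lines of Python can read and compare the DMI product UUIDs from two machines (a sketch only; `read_product_uuid` and `same_appliance_id` are hypothetical helper names, not Rockstor code):

```python
# Hypothetical helpers (not Rockstor's own code) for checking whether two
# machines report the same DMI product UUID, which Rockstor uses as the
# basis of its Appliance ID.
import uuid


def read_product_uuid(path="/sys/class/dmi/id/product_uuid"):
    """Read the DMI product UUID and return it in canonical lower-case form."""
    with open(path) as f:
        raw = f.read().strip()
    # uuid.UUID() validates the string and normalises case and formatting.
    return str(uuid.UUID(raw))


def same_appliance_id(uuid_a, uuid_b):
    """True if two product UUID strings represent the same UUID."""
    return uuid.UUID(uuid_a) == uuid.UUID(uuid_b)
```

Run `read_product_uuid()` on each server; if `same_appliance_id()` returns True for the pair, the boards share a UUID and linking the appliances will collide as above.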


One is an N8900 and the other is an N16000PRO, however, I can see that the N16000PRO also uses the N8900 system board.


I can see in here …

that the duplicate appliance ID is already included as a fake…

5C4606FA-192F-453A-B299-7B088C63BB9B

So I’m not sure what to do now.


Hi @nigel, and welcome to the community!

I’m rather limited on time, so please excuse my rather brief and dry answers. This area is @phillxnet’s expertise, but I wonder whether you are running a Rockstor version predating the addition of your product ID as a fake appliance ID, hence you are not benefitting from that fix. If that is the case, and given you’ve just installed, it might be a good opportunity to try out our new Rockstor 4 version. While we only have a DIY installer for now, it might be a good choice for you here; see @phillxnet’s post in the thread you linked to, for instance:

If you are indeed running Rockstor-3.9.1-0, it would represent a very substantial update for you, as that version is now very old. There might be the option of manually adding your ID (as in the code you linked to) to your local file and restarting the rockstor service, but I’ve never tried that, so I’m unsure of the consequences.

Sorry again for being very brief and probably not as helpful as I would have liked, but I do hope this still helps.


You are correct: that fake UID is not in here…

/opt/rockstor/src/rockstor/system/osi.py

However, I’ve downloaded the latest version from the Rockstor website, so I’m not sure how I could get anything newer. I also don’t want to run something that isn’t production ready (e.g. version 4). I’d love to use it when it’s ready, but it isn’t yet.

I attempted to add the UID to that file and restarted rockstor, but it hasn’t changed the appliance ID. Hopefully @phillxnet can help! I’ve got 3 of these appliances to configure, and would gladly purchase activation, but from what I can see, I’m actually unable to activate, because this is a fake UUID, and the activation is based on that.

I’m therefore stuck! I can download the latest version (3.9.2-60) of the source. Is there a way I can manually update the system using this?

@nigel Well done on getting this far already. The fake Appliance ID problem has been quite the annoyance / challenge for us. And as you have found, this is already a known one to us, and we have a chicken-and-egg situation.

The Appliance ID of a machine is established when you first install it, so changing this after the initial setup is not sufficient. You could try extending that list in the code directly after a fresh install and before doing the initial admin sign-up process. You will likely still need to restart the rockstor process. I’ve not actually tried this myself.
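As a rough illustration of the idea (names and list contents here are my own, not the actual osi.py implementation): at first install, a product UUID found on the known-fake list gets replaced with a randomly generated one:

```python
# Illustrative sketch only: names are hypothetical, and this is not the
# actual osi.py code. The idea: boards that ship the same placeholder
# product_uuid on every unit get a random UUID instead at first install.
import uuid

KNOWN_FAKE_UUIDS = {
    # Example entry, normalised to lower case.
    "5c4606fa-192f-453a-b299-7b088c63bb9b",
}


def appliance_uuid(product_uuid):
    """Use the board's product_uuid unless it is a known fake."""
    if product_uuid.lower() in KNOWN_FAKE_UUIDS:
        return str(uuid.uuid4())
    return product_uuid
```

Extending the equivalent list before the initial admin sign-up is what the workaround amounts to.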

But really, given that Rockstor 4 is the future and we are now at release candidate 7, in version 4.0.6:

I would strongly consider doing your fresh installs using a DIY installer build via the following method:

It’s a fair bit of messing about for now, but what you end up with will be what we are maintaining going forward. We no longer release updates for the CentOS variant anyway, and by most accounts our beta ‘Built on openSUSE’ is now ahead in many cases of our prior CentOS variant, now stuck at 3.9.2-57. Plus, if you do find any issues, we can hopefully fix them, whereas our CentOS variant is now legacy.

Just a thought.

Let us know how you get on with adding that appliance ID to the fake list, plus the service restart, before the initial admin user creation step. But really, given that this problem, as you found yourself, has already been addressed and released in our new installer, if you can get over the hurdle of building that installer you will be in a far better position.

Let us know which route you fancy taking, but 4 is actually now better behaved and has a far newer (by years) btrfs foundation than our CentOS variant. We are itching to release the final Stable with an accompanying installer, but all in good time, as we need our latest changes to be tested in the field first.

So, given you are doing new installs, I’d definitely give the 4 variant a fresh look. And if you have difficulties with the DIY build method, let us know in a specific post, as there are many forum members who have now successfully accomplished this task, and it is one we wish to spread general knowledge about anyway: the end result is a fully updated ISO, which includes all upstream updates as of the time of the build. This is a super useful facility to know how to use and may well help you in the future, especially on newer hardware that needs these fixes (such as the newer Appliance ID fixes in our code’s case). So give that another consideration, as it may be the better path in the long run for where you spend your time in getting up and running.

Let us know how you get on and which direction you want to go in on this one.

Cheers.


Thanks @phillxnet. If I do attempt the Rockstor 4 option, is it seamless to transfer the contents? I have 10TB of data on each NAS and am manually synchronising at the moment. I’d rather import the BTRFS pool after the initial install than have to copy everything over again. Will this import operation work between version 3 and 4?


@nigel I have done this before: importing a data pool from v3 to v4, and it worked fine for me. @phillxnet may wish to add more detail (e.g. caveats, if there are any).
Just make sure you have a suitable backup, just in case… :slight_smile:


Thanks @GeoffA. The other NAS is the backup!


Cool. It seems my middle name is ‘Backup’. I’m working on it… :slight_smile:
Let us know how things transpire.


@phillxnet OK, I have managed to install Rockstor 4 on a test NAS, a partially empty Thecus N8900 (same model as above). I am able to add the previous NAS without the duplicate key error; however, this is because the new version 4 uses lower case instead of upper case.

3.9 : 5C4606FA-192F-453A-B299-7B088C63BB9B
4.0 : 5c4606fa-192f-453a-b299-7b088c63bb9b

I therefore think you need to update your file /opt/rockstor/src/rockstor/system/osi.py to include lower-case versions, as it’s obviously not triggering whatever process is required to generate a properly unique ID for the system. I also flashed the latest BIOS for this device, so maybe that provides a lower-case return from /sys/class/dmi/id/product_uuid, or maybe it’s openSUSE vs CentOS.
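The mismatch is easy to demonstrate in a couple of lines of Python (a sketch of the suspected failure mode, not the actual osi.py logic):

```python
# Sketch of the suspected 3.x-vs-4 mismatch: a plain string membership
# test is case sensitive, so an upper-case fake-list entry never matches
# a lower-case /sys/class/dmi/id/product_uuid reading.
import uuid

fake_list = ["5C4606FA-192F-453A-B299-7B088C63BB9B"]  # upper case, as in 3.9
reported = "5c4606fa-192f-453a-b299-7b088c63bb9b"     # lower case, as on 4.0

naive_match = reported in fake_list  # False: the case differs

# Comparing as UUIDs (or lower-casing both sides) matches correctly.
robust_match = any(uuid.UUID(f) == uuid.UUID(reported) for f in fake_list)
```

So a case-insensitive comparison (or normalising both sides to lower case) would make the fake-list check robust to this difference.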


I did the following:

  1. Re-installed RS4
  2. Accessed via ssh before doing initial config
  3. Added the lower case UID to the osi.py file
  4. Rebooted
  5. Went through initial configuration

Now I finally have a unique appliance ID. So I guess this process would probably work with 3.9 too. I will now see if RS4 is stable enough for my uses.


@nigel Well done on building the 4 installer and getting the install/Appliance ID thing sorted. And thanks for narrowing down a new suspected bug re:

and

Nice find. We have suspected a case issue for some time now but unfortunately not gotten around to narrowing it down. But:

Looks to have proven it. Well done.

I’m voting for this myself, although it could be a change in the kernel that is incidental but either way it’s a 3 vs 4 thing we need to sort.

I’d ask that you open a new GitHub issue in our rockstor-core repo:

with your findings, i.e. “fake product_uuid matching should be case insensitive” or the like,

and link back to this forum thread for context.

Thanks for an excellent, and patient, work through of this bug. Much appreciated. This very much looks like a candidate for inclusion prior to our stable 4 release so once you have created that GitHub issue I’ll add it to the ISO milestone so hopefully the next time you install it will be a far simpler affair.

And thanks for confirming that code change prior to the initial setup. Nice to know we can fall back on that method. And feel free to chip in on similar forum threads as they pop up given your experimentation in this area.

I ask that you create the GitHub issue as it then simplifies our attribution side of things.

Thanks again for sharing your findings and proving the work-around.


@GeoffA The import did “work”… kind of. Failed on attempt 1, succeeded on attempt 2.

However, it then spun on a core permanently doing a btrfs qgroup operation, and continually disabled and enabled the quota. In the end I removed the pool and recreated, and it is synchronising across using rsync over nfs. I’m not too keen on the replication setup in Rockstor - seems overly complicated using snapshots and prone to error (which I got lots of).

@nigel That’s a strange one for sure, and not something I experienced. Sorry I am not able to throw any light on the issues you encountered.

Re replication, I’m a die-hard rsync user for any syncing - that’s how my backups are done.

@phillxnet Issue raised on GitHub…


@nigel, Given:

and

This sounds like you were on the edge of the timeout here, and the quota re-calculation can take quite some time. There is a guide to roughly how long in the tooltip of the quota enable/disable. It may well be that you could have just waited this out.

Yes, we definitely have some robustness work to do there. We are also working within the current constraint that btrfs replication cannot survive a network outage: i.e. there is no ability to resume a replication. Snapshots are required for the replication process, and once they are all established it can be a very quick method for replicating a subvol from machine to machine, as it only copies the changes and can ‘know’ what these are without having to scan all files, which is what rsync has to do. But yes, it’s complicated and currently in need of more attention, both in our realm and upstream.
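For the curious, the incremental mechanism boils down to a btrfs send/receive pair; the sketch below just constructs the command lines (paths and snapshot names are hypothetical, and nothing is executed here):

```python
# Illustrative only: build (but do not run) the command lines for an
# incremental btrfs replication. With a shared parent snapshot on both
# ends, btrfs streams only the delta, with no per-file scanning.
def build_incremental_send(parent_snap, new_snap):
    """'btrfs send -p parent new' streams only changes since parent."""
    return ["btrfs", "send", "-p", parent_snap, new_snap]


def build_receive(dest_path):
    """'btrfs receive dest' replays that stream into dest."""
    return ["btrfs", "receive", dest_path]
```

In practice the two ends are piped together, e.g. `btrfs send -p <parent> <new> | ssh dest btrfs receive <pool-mount>`; rsync, by contrast, must walk and compare every file on each run.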

I believe there are a number of upstream improvements to btrfs replication in the works, but as some will require a protocol/format change, their addition is being held back until it’s more certain that the next protocol will be sufficient for some time.

Hope that helps.
