Can not start Replication service

Hi,

I am on Rockstor 3.9.2-44
I am not able to start the Replication service. The network interface is configured. There are three entries to select from (eth0: IP null, eth0: IP 192.168.1.119, docker0: IP 192.17.0.1). I selected IP 192.168.1.119, which is my normal network interface. The port is left at 10002. When I switch the replication service on, the switch goes to on for a short time and then back to off.

I searched around in the logs and found /opt/rockstor/var/log/rockstor.log with the following:
[22/Nov/2018 21:58:51] ERROR [storageadmin.views.network:157] get() returned more than one NetworkConnection -- it returned 2!
Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/storageadmin/views/network.py", line 154, in update_connection
    name=dconfig['connection'])
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/manager.py", line 127, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/query.py", line 338, in get
    (self.model._meta.object_name, num)
MultipleObjectsReturned: get() returned more than one NetworkConnection -- it returned 2!
[22/Nov/2018 21:59:52] ERROR [smart_manager.replication.listener_broker:182] Failed to fetch network interface for Listner/Broker. Exception: get() returned more than one NetworkConnection -- it returned 2!
[22/Nov/2018 21:59:53] ERROR [smart_manager.replication.listener_broker:182] Failed to fetch network interface for Listner/Broker. Exception: get() returned more than one NetworkConnection -- it returned 2!
[22/Nov/2018 21:59:56] ERROR [smart_manager.replication.listener_broker:182] Failed to fetch network interface for Listner/Broker. Exception: get() returned more than one NetworkConnection -- it returned 2!
[22/Nov/2018 21:59:59] ERROR [smart_manager.replication.listener_broker:182] Failed to fetch network interface for Listner/Broker. Exception: get() returned more than one NetworkConnection -- it returned 2!
[22/Nov/2018 22:02:56] ERROR [storageadmin.views.network:85] Unknown ctype: bridge config: {}
[22/Nov/2018 22:05:22] ERROR [storageadmin.views.network:85] Unknown ctype: bridge config: {}

What is going wrong here?

@Schrauber Hello again, and thanks for helping to support development via a stable subscription and this report.

This looks very much like a bug to me, i.e. you selected one network yet two are passed through to the replication code:

and in turn:

I will try to replicate this issue locally and return to this thread with my findings. Hopefully we can get this sorted thereafter.

Thanks for a tidy report, this is a nice find as it goes. Assuming I manage to reproduce this, I’ll open an issue crediting you and this thread, and hopefully a fix should follow shortly thereafter.

Could you possibly post a picture of your network config page, as the following:

lists eth0 twice, once without an IP and once with.

Also it looks from that like you have only 2 physical ethernet connections; is this correct?

And the output of the following command run as root is also likely to help:

nmcli

Thanks again for reporting. I can’t promise a quick fix but if I can reproduce then I’m hoping it shouldn’t take too long to figure out what’s going wrong.

Thank you phillxnet for directing me to the network configuration page. This was the solution.
For whatever reason I had two eth0 interfaces there. One of them was my original network interface, the second was disabled. I deleted the disabled one and now I can enable the replication service.

Not sure why this happened. This Rockstor was freshly installed in a VM from the 3.9.1 ISO and then upgraded directly to 3.9.2-44. I never changed the network config; it is as the install set it up automatically.
I will try installing the ISO in another VM to check whether this happened during installation or during the upgrade.

@Schrauber Thanks for the update and glad you are now sorted.

I had a quick look at the code after my post but could only see a request for a device by name, so your:

makes sense now. I had assumed it was just a forum typo.

No, me neither, though we probably have a less urgent validation bug here somewhere.

Yes that is strange.

Anything you can do to provide a reproducer would help, as we can then pin down exactly why this happened and guard against it. It’s the first report of its kind so is much appreciated, and thanks for sharing your workaround.
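
For anyone following along, the failure mode can be sketched in plain Python: a lookup by name alone cannot tell two connections called eth0 apart. This is only an illustration and not Rockstor's actual code. The record layout, the field names, and both helper functions (duplicate_names, get_by_name) are assumptions made up for the example.

```python
from collections import Counter

# Hypothetical stand-in for the stored connection records; the field
# names here are assumptions for illustration, not Rockstor's schema.
connections = [
    {"name": "eth0", "ipaddr": None, "enabled": False},
    {"name": "eth0", "ipaddr": "192.168.1.119", "enabled": True},
    {"name": "docker0", "ipaddr": "192.17.0.1", "enabled": True},
]


def duplicate_names(rows):
    """Return the sorted connection names that occur more than once."""
    counts = Counter(row["name"] for row in rows)
    return sorted(name for name, count in counts.items() if count > 1)


def get_by_name(rows, name):
    """Return the single record called `name`.

    Raises LookupError when there are zero or several matches, which is
    roughly the situation the log above reports for a lookup by name.
    """
    matches = [row for row in rows if row["name"] == name]
    if len(matches) != 1:
        raise LookupError(
            "expected one connection named %r, found %d" % (name, len(matches))
        )
    return matches[0]


if __name__ == "__main__":
    # Report any names that a by-name lookup could not resolve uniquely.
    print(duplicate_names(connections))
    try:
        get_by_name(connections, "eth0")
    except LookupError as exc:
        print(exc)
```

A check along the lines of duplicate_names, run before the service starts, is one possible way to surface the problem with a clearer message. Whether that is the right place for a guard is an open question until the root cause is known.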

OK, it took a while. But I can reproduce this issue.

I installed Rockstor in a KVM VM, and this always happens if the network device is set to virtio at installation. With an e1000 network device I get only one interface. Also, if I change the network device to virtio after the installation is done, I end up with one disabled interface and one new enabled one. Strange.

@Schrauber Thanks for the feedback and nice find.

So now we have a reproducer we should be able to track down what’s going wrong. I’m not likely to get to this one myself in the near future but someone may step up to have a look.

If you fancy opening a GitHub issue describing your findings, now we have the reproducer, that would be most helpful. That way anyone can jump in with a pull request against that issue, or more related findings, and whatever progress is made can then be centred around that issue. I also usually try to link back to the originating forum thread for context if there is one. No worries if you don’t fancy opening the issue, in which case I’ll pop one in myself. But given your efforts in this area it would be nice to have you as the author.

Thanks again for reporting and working out a reproducer. Much appreciated.

Issue opened: https://github.com/rockstor/rockstor-core/issues/1997

@Schrauber Super thanks.