Replaced motherboard, unable to remove network bonding

Hi. I’m on Rockstor 3.9.2-22
My server’s motherboard (ASRock C2550D4I) died recently, and I had it RMA’d. After installing the new motherboard, I wanted to turn off the network bonding feature. When I used the UI to turn on one of the dual NICs, I got the following error message:

Houston, we've had a problem.
Error running a command. cmd = /usr/bin/nmcli c up 803043f3-e4e2-4fc0-b409-45c13359b66a. rc = 4. stdout = ['']. stderr = ['Error: Connection activation failed: No suitable device found for this connection.', '']

Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/storageadmin/views/network.py", line 380, in post
    network.toggle_connection(nco.uuid, switch)
  File "/opt/rockstor/src/rockstor/system/network.py", line 198, in toggle_connection
    return run_command([NMCLI, 'c', switch, uuid])
  File "/opt/rockstor/src/rockstor/system/osi.py", line 121, in run_command
    raise CommandException(cmd, out, err, rc)
CommandException: Error running a command. cmd = /usr/bin/nmcli c up 803043f3-e4e2-4fc0-b409-45c13359b66a. rc = 4. stdout = ['']. stderr = ['Error: Connection activation failed: No suitable device found for this connection.', '']

I know my way around Linux. Can you please tell me how to clear the Rockstor configuration database and remove the network bonding?

Thanks

Hi @stray_tachyon,

WARNING: I am not a Rockstor developer, and may inadvertently destroy your system, while still wearing an annoying grin and stating ‘whoops’.

So Rockstor doesn’t appear to be able to remove non-existent interfaces from its config.
It looks like you’ll need to go into the Rockstor postgres database to manage this.

From what I can determine, this data is stored in storageadmin.storageadmin_networkconnection.

Based upon the UUID provided, you should be able to remove the listing with:

psql -U rocky -d storageadmin
<ENTER PASSWORD 'rocky' AT PROMPT>
SELECT id FROM storageadmin_networkconnection WHERE uuid = '803043f3-e4e2-4fc0-b409-45c13359b66a';
DELETE FROM storageadmin_ethernetconnection WHERE connection_id = <ID FROM SELECT>;
DELETE FROM storageadmin_networkdevice WHERE connection_id = <ID FROM SELECT>;
DELETE FROM storageadmin_networkconnection WHERE id = <ID FROM SELECT>;
\q

However be careful, I have no idea what else might rely on these tables!
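If you’d like a safety net, the same deletes can be wrapped in a transaction so they can be rolled back before they stick. This is only a sketch using the table names above; the id 13 is a placeholder example, so substitute the id actually returned by the SELECT:

```shell
# Sketch only: run the deletes inside a transaction.
# Replace 13 with the id returned by the SELECT above.
psql -U rocky -d storageadmin <<'SQL'
BEGIN;
DELETE FROM storageadmin_ethernetconnection WHERE connection_id = 13;
DELETE FROM storageadmin_networkdevice WHERE connection_id = 13;
DELETE FROM storageadmin_networkconnection WHERE id = 13;
COMMIT;  -- swap for ROLLBACK; to abandon the changes instead
SQL
```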

Thanks for your reply Haioken.
Here is what is in each table:
storageadmin=# select id, name, uuid from storageadmin_networkconnection;
id | name | uuid
----+----------------+--------------------------------------
17 | bond01-slave-0 | 421e016a-a2cb-462f-93fd-9b158892370d
13 | enp7s0 | 803043f3-e4e2-4fc0-b409-45c13359b66a
14 | enp8s0 | 4720bf1f-8e32-423d-b385-7a2928379ac7
16 | bond01 | e463bf3e-9d54-4a15-ac48-68a06ae9eb61
15 | bond01-slave-1 | 1cc10236-4b71-4683-863e-fe1967fce93a
(5 rows)

storageadmin=# select id, connection_id, mac from storageadmin_ethernetconnection;
id | connection_id | mac
----+---------------+-------------------
26 | 17 |
23 | 13 | D0:50:99:C0:C6:27
24 | 14 | D0:50:99:C0:C6:28
25 | 15 |
(4 rows)

storageadmin=# select id, name, dtype, connection_id, mac from storageadmin_networkdevice;
id | name | dtype | connection_id | mac
----+--------+----------+---------------+-------------------
1 | lo | loopback | | 00:00:00:00:00:00
2 | enp7s0 | ethernet | 17 | D0:50:99:C3:9F:62
6 | bond01 | bond | 16 | D0:50:99:C3:9F:62
3 | enp8s0 | ethernet | 15 | D0:50:99:C3:9F:62
(4 rows)

I can clear all the entries in the three tables (except the loopback in the storageadmin_networkdevice). Now, how do I remove the bonding in the Linux config files?
I tried to follow the instructions on this page:
http://digiturf.net/blog/remove-network-bonding-in-linux/

However, I got stuck at “rmmod bonding” because “rmmod” is not installed, and I don’t want to muck around with installing packages. Any instructions on how to remove the bonding from the Linux config files?

Thanks a lot

  • You shouldn’t need to remove the bonding capability, only the interface.
  • If you’re using docker containers (Rock-Ons) you probably still need bonding capability to be enabled
  • I believe that bonding is built into the Kernel on CentOS, thus you cannot remove the module.
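On that last point, a quick way to check whether bonding is a loadable module or built into the kernel (a sketch; the config path is the usual CentOS/RHEL location and may not exist on every install) is:

```shell
# If modinfo finds the driver, bonding is a loadable module;
# otherwise it is likely built into the kernel (or absent).
if modinfo bonding >/dev/null 2>&1; then
    echo "bonding is a loadable module"
else
    echo "bonding is built in (or absent)"
fi

# Where present, the kernel build config gives a definitive answer:
# CONFIG_BONDING=y means built in, CONFIG_BONDING=m means module.
grep -s 'CONFIG_BONDING' "/boot/config-$(uname -r)" || true
```

Either way, the bonding capability itself can stay; it’s only the stale connection profiles that need removing.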

Hi Haioken, your solution doesn’t work. I cleared all the tables, including an extra one (storageadmin_bondconnection). After a reboot, the UI is still showing the bonding as enabled, and when I tried to enable either one of the network interfaces, I still got an error:

Houston, we've had a problem.
Error running a command. cmd = /usr/bin/nmcli c down 4720bf1f-8e32-423d-b385-7a2928379ac7. rc = 10. stdout = ['']. stderr = ["Error: '4720bf1f-8e32-423d-b385-7a2928379ac7' is not an active connection.", 'Error: no active connection provided.', '']

Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/storageadmin/views/network.py", line 380, in post
    network.toggle_connection(nco.uuid, switch)
  File "/opt/rockstor/src/rockstor/system/network.py", line 198, in toggle_connection
    return run_command([NMCLI, 'c', switch, uuid])
  File "/opt/rockstor/src/rockstor/system/osi.py", line 121, in run_command
    raise CommandException(cmd, out, err, rc)
CommandException: Error running a command. cmd = /usr/bin/nmcli c down 4720bf1f-8e32-423d-b385-7a2928379ac7. rc = 10. stdout = ['']. stderr = ["Error: '4720bf1f-8e32-423d-b385-7a2928379ac7' is not an active connection.", 'Error: no active connection provided.', '']

Hi Mate,

Looks like I may have missed a table.
Have a look at

 SELECT * FROM storageadmin_bondconnection;

Might give you some clues.

Yeah, I cleaned that table too. It seems the entries came back after a reboot.

Whelp, I’m all out of guesses then. Perhaps somebody with a bit more details on Rockstor’s internals will weigh in. (Tagging @phillxnet, as he’s pretty good with these queries!)

@Haioken and @stray_tachyon, I’ve only glanced at the network config code, but my understanding is that it attempts to take its cues from the current system state. That is, if one has a bond configuration and does a complete rebuild (with a consequent whole-db wipe), when you first log in it will show the prior (current to the underlying system) bond arrangement: the db tries to reflect system state (within its limits). All interactions with the system are via the nmcli program. So I’d initially suggest confirming nmcli’s understanding of the current network state by way of more information, i.e.:

nmcli

This should show NetworkManager’s report of the current config, plus some pointers to getting more info.

It might be worth attempting to ‘work around’ your current chicken and egg scenario by configuring your network via nmcli and see what Rockstor indicates, possibly after a reboot.
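As a sketch of that workaround (untested here; the connection and device names are taken from earlier in this thread, so adjust to whatever nmcli actually reports), tearing the bond down from the CLI might look like:

```shell
# First inspect what NetworkManager currently knows about.
nmcli connection show
nmcli device status

# Delete the bond profile and its slave profiles
# (names as listed earlier in this thread).
nmcli connection delete bond01
nmcli connection delete bond01-slave-0
nmcli connection delete bond01-slave-1

# Then bring the plain ethernet profiles back up.
nmcli connection up enp7s0
nmcli connection up enp8s0
```

After that, a reboot (or a Rockstor service restart) should let the Web-UI re-read the simpler state.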

We do have a now outdated technical wiki entry on the network code but since then it’s had a fair bit of change, such as the addition of bonding for example (big change that one):

I’ve got a couple of test systems here that are both dual ethernet bonded (via LACP):

one of which is also an ASRock C2550D4I.

And I have previously changed their config from within Rockstor (to test the then-new bonding code); as far as I remember the only buggy behaviour was a temporary disconnect requiring a browser page refresh to get the new config displayed. Kind of inevitable with live network changes on the link used to make the changes.

Let us know anything else you can about how this occurred (i.e. dhcp/static etc.) so I or someone else on the forum may be able to reproduce your issue. Then we can open a bug issue with the reproducer steps and get this one sorted. Changing the db directly is rather a last resort really, but one can always re-install and ‘needs must’. But your initial description seems pretty complete.

My initial guess is that we are failing to cope when the underlying network system is switched out like that. I’ve tested this multiple times but not with a bonding config.

Let us know what nmcli thinks of what you have and if it can get you out of this situation. If the Rockstor Web-UI still can’t that is.

Sorry I can’t be of more help but I just haven’t yet done much with the network code. I’m currently working through my current queue so I’ll not be able to get to this for a bit I’m afraid but it’s definitely worth putting what extra details you can so we can get this behaving better.

Thanks for your report and for helping to support Rockstor development.

Hi Phil and Haioken, Can you please tell me the location of Rockstor Python scripts? Thanks

Here’s what’s in the network config files and the psql database:

[/etc/sysconfig/network-scripts]

more ifcfg-bond01-slave-0
TYPE=Ethernet
NAME=bond01-slave-0
UUID=421e016a-a2cb-462f-93fd-9b158892370d
DEVICE=enp7s0
ONBOOT=yes
MASTER=bond01
SLAVE=yes
[/etc/sysconfig/network-scripts]
more ifcfg-bond01-slave-1
TYPE=Ethernet
NAME=bond01-slave-1
UUID=1cc10236-4b71-4683-863e-fe1967fce93a
DEVICE=enp8s0
ONBOOT=yes
MASTER=bond01
SLAVE=yes
[/etc/sysconfig/network-scripts]
more ifcfg-enp7s0
HWADDR=D0:50:99:C0:C6:27
TYPE=Ethernet
BOOTPROTO=dhcp
DEFROUTE=yes
PEERDNS=yes
PEERROUTES=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=enp7s0
UUID=803043f3-e4e2-4fc0-b409-45c13359b66a
DEVICE=enp7s0
ONBOOT=yes
[/etc/sysconfig/network-scripts]
more ifcfg-enp8s0
HWADDR=D0:50:99:C0:C6:28
TYPE=Ethernet
BOOTPROTO=dhcp
DEFROUTE=yes
PEERDNS=yes
PEERROUTES=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=enp8s0
UUID=4720bf1f-8e32-423d-b385-7a2928379ac7
DEVICE=enp8s0
ONBOOT=yes

[/etc/sysconfig/network-scripts]

psql -U rocky -d storageadmin
Password for user rocky:
psql (9.2.23)
Type "help" for help.

storageadmin=# select id, name, uuid from storageadmin_networkconnection;
id | name | uuid
----+----------------+--------------------------------------
18 | bond01 | e463bf3e-9d54-4a15-ac48-68a06ae9eb61
21 | enp8s0 | 4720bf1f-8e32-423d-b385-7a2928379ac7
22 | enp7s0 | 803043f3-e4e2-4fc0-b409-45c13359b66a
19 | bond01-slave-0 | 421e016a-a2cb-462f-93fd-9b158892370d
20 | bond01-slave-1 | 1cc10236-4b71-4683-863e-fe1967fce93a
(5 rows)

storageadmin=#

[/etc/sysconfig/network-scripts]

nmcli
bond01: connected to bond01
“bond01”
bond, D0:50:99:C3:9F:62, sw, mtu 1500
ip4 default
inet4 192.168.0.125/24
route4 0.0.0.0/0
route4 192.168.0.0/24
inet6 fe80::d250:99ff:fec3:9f62/64
route6 ff00::/8
route6 fe80::/64
route6 fe80::/64

enp7s0: connected to bond01-slave-0
“Intel I210 Gigabit”
ethernet (igb), D0:50:99:C3:9F:62, hw, mtu 1500
master bond01
route6 ff00::/8

enp8s0: connected to bond01-slave-1
“Intel I210 Gigabit”
ethernet (igb), D0:50:99:C3:9F:62, hw, mtu 1500
master bond01
route6 ff00::/8

lo: unmanaged
“lo”
loopback (unknown), 00:00:00:00:00:00, sw, mtu 65536

DNS configuration:
servers: 192.168.0.120 192.168.0.1
interface: bond01

Actually, maybe I’m approaching this problem from the wrong angle. Can the network bonding interface be disabled through the Web-UI? If yes, what is the correct sequence? I tried to enable the individual NICs first and then disable the bond interface, but that obviously failed.

Should I just delete the following files?
/etc/sysconfig/network-scripts
ifcfg-bond01
ifcfg-bond01-slave-0
ifcfg-bond01-slave-1

Problem solved. I just deleted the bond01 interface. Unfortunately two new NICs got added by the system, with UUIDs different from those in the network config scripts:

@stray_tachyon Thanks for the update and glad you got it sorted.

I’ll try and duplicate your experience when I next get the time and maybe we can smooth over some rough edges here and there.