Failed to start Replication due to an error: Error running a command. cmd = /opt/rockstor/.venv/bin/supervisorctl start replication. rc = 7. stdout = ['replication: ERROR (spawn error)', '']. stderr = ['']

Howdy!

Brief description of the problem

The replication service does not start on a system upgraded to 5.0.5. Replication was working in 4.6.1.

Detailed step by step instructions to reproduce the problem

From the Web-UI: Storage → Replication → Receive, switch the Replication service to ON. On reload of the page, the Replication service is set back to OFF.

A similar error happens under Services → Replication.

Web-UI screenshot

Error Traceback provided on the Web-UI

Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/smart_manager/views/replication_service.py", line 82, in post
    superctl(service.name, command)
  File "/opt/rockstor/src/rockstor/system/services.py", line 147, in superctl
    out, err, rc = run_command([SUPERCTL_BIN, switch, service], throw=throw)
  File "/opt/rockstor/src/rockstor/system/osi.py", line 263, in run_command
    raise CommandException(cmd, out, err, rc)
system.exceptions.CommandException: Error running a command. cmd = /opt/rockstor/.venv/bin/supervisorctl start replication. rc = 7. stdout = ['replication: ERROR (spawn error)', '']. stderr = ['']
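For readers hitting the same error: the call chain in this traceback boils down to a thin wrapper around supervisorctl that raises when the command exits non-zero. Below is a minimal, simplified sketch of that pattern. The names SUPERCTL_BIN, run_command, superctl and CommandException are taken from the traceback above, but this is not Rockstor's exact implementation, just an illustration of why rc = 7 surfaces as the exception shown.

```python
import subprocess

# Binary path reported in the error message above.
SUPERCTL_BIN = "/opt/rockstor/.venv/bin/supervisorctl"


class CommandException(Exception):
    """Raised when an external command exits with a non-zero return code."""

    def __init__(self, cmd, out, err, rc):
        super().__init__(f"Error running a command. cmd = {' '.join(cmd)}. rc = {rc}.")
        self.cmd, self.out, self.err, self.rc = cmd, out, err, rc


def run_command(cmd, throw=True):
    """Run cmd and return (stdout lines, stderr lines, return code)."""
    p = subprocess.run(cmd, capture_output=True, text=True)
    out, err = p.stdout.split("\n"), p.stderr.split("\n")
    if throw and p.returncode != 0:
        # With rc = 7 and stdout of 'replication: ERROR (spawn error)' this
        # reproduces the exception shown in the traceback above.
        raise CommandException(cmd, out, err, p.returncode)
    return out, err, p.returncode


def superctl(service, switch, throw=True):
    """Ask supervisord to e.g. 'start' or 'stop' a managed service."""
    return run_command([SUPERCTL_BIN, switch, service], throw=throw)


# Example: roughly what the Web-UI toggle ultimately triggers.
# superctl("replication", "start")
```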

@niko Hello there.
Re:

Yes, our 5.0.5 testing channel release is not yet at the Release Candidate phase and there are known broken bits and bobs - and we have changed (upgraded, by years in some cases) a very great deal of our libraries. But we are getting there, and a replication-related issue is about to be fixed, though it is partnered with another that I am about to move onto:

See the following GitHub issue:

However, those concern our current testing channel, and I did not think we had broken replication at 5.0.5-0, though it would not be surprising. The state of our current testing channel rpm release is covered in the following forum thread:

The 5.0.5-0 status/release post there (V5.0+ Testing Channel Changelog - #10 by phillxnet) was our first tester release at:

So you are rather on the bleeding edge here. However, replication is my current focus again, and I'm working through the related issues before we hopefully release 5.0.6-0.

Also:

But the error relates to supervisorctl, which we are planning to cut out entirely in the current testing channel phase:

as part of our current Milestone here:
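For anyone who wants to dig into the spawn error itself while supervisorctl is still in place, the details are usually available from supervisorctl directly. A small diagnostic sketch, assuming the supervisorctl path from the error above; `status` and `tail <program> stderr` are standard supervisorctl subcommands, and the exact output (and whether a stderr log exists for a failed spawn) will depend on your install:

```python
import subprocess

# Path taken from the error message above; adjust if your install differs.
SUPERCTL = "/opt/rockstor/.venv/bin/supervisorctl"


def show(args):
    """Run a supervisorctl subcommand and print its output."""
    result = subprocess.run([SUPERCTL, *args], capture_output=True, text=True)
    print(f"$ supervisorctl {' '.join(args)}")
    print(result.stdout + result.stderr)


# Overall state of all supervisord-managed programs, including 'replication'.
show(["status"])

# Last chunk of the replication program's stderr log, which often explains
# why the spawn failed (missing interpreter, import error, bad path, ...).
show(["tail", "replication", "stderr"])
```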

I have never seen that particular error, but again we are at our first open testers release here, having just moved on from a developers-only phase. Hang in there: it would be great to have your feedback now that you are already on 5.0.5-0, so keep an eye out for the next and following testing versions to see if this rather involved subsystem starts to behave better. We have had a lot of ‘churn’ of late in order to modernise our setup, but bit by bit.

Thanks for your feedback.

I hope that helps at least for some context.


Thanks for the response; it clarified the situation quite a bit. I have installed a few VMs where I test the system, so I will probably keep at least one of them updated to the latest available release.


@niko Just a quick update regarding your report.

I’ve finally gotten around to reproducing your findings re the testing channel, and we now have the following issue, which is likely next on my list, assuming few other distractions :slight_smile:

I’ll update this forum thread once I’ve made some progress.

Thanks again for the report.


@niko Hello again.
Re:

We have just released 5.0.6-0 into the current testing channel:

and I’m pretty sure we have restored all prior replication capability, though we likely still have some performance/robustness improvements etc. to address. I just wanted to note this here, given your interest in this capability and your feedback in this area on the forum.

Hope that helps.
