Can't set up replication


Brief description of the problem

Having experienced random but recurrent problems with my old installation (crash, failure to reboot, kernel panic on start-up), I decided to start afresh with 5.0.9-0 on two appliances. I tested the download for integrity, and I tested the RAM and disks on both machines. All went well (shutdown, reboot and cold reboot all working properly) until I tried to set up replication. The base appliance can’t retrieve the pool on the target appliance. The target pool has one active share, which is working normally.

Detailed step by step instructions to reproduce the problem

I have tried this with every combination of service running/not running on both base and target appliances.

I set this up with the cliapp key.

I have also tried creating a new, separate access key, but I wouldn’t know how to use it: the ‘Secret access key’ column reads ‘Secret encrypted: only available during creation’, so there is nothing to cut and paste into the appliance pairing dialog.

This may be relevant: when copied into a text editor, the access key of the target machine appears to have a space or a tab to the left of the initial character. If I simply copy and paste from the target machine to the base machine, the appliances won’t pair: I have to go via a text editor to remove the blank space.

In the same way, the secret access key of the base machine appears to have a blank space to the right of the final character.
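For anyone hitting the same stray-whitespace problem, here is a minimal Python sketch of the clean-up I did by hand in the text editor. The function name `clean_pasted_key` is hypothetical and not part of Rockstor; it only assumes that a valid key contains no whitespace at all.

```python
def clean_pasted_key(raw: str) -> str:
    """Remove whitespace that copy-and-paste may have added around a key."""
    # str.strip() drops leading and trailing spaces, tabs and newlines.
    cleaned = raw.strip()
    # Assumption: a valid key never contains whitespace, so anything left
    # inside the string points to a damaged paste rather than a real key.
    if any(ch.isspace() for ch in cleaned):
        raise ValueError("key still contains whitespace after stripping")
    return cleaned


def show_hidden(raw: str) -> str:
    """Return a repr so that tabs and spaces become visible, e.g. '\\tabc '."""
    return repr(raw)
```

Printing `show_hidden(key)` before and after cleaning makes it easy to see whether the blank was a space or a tab.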

Against that background I:

(1) paired the 2 appliances

(2) opened storage > replication > send on the base appliance

(3) clicked ‘Add Replication Task’

(4) entered the details

As an experiment I tried to set up a send replication from the original target machine. The result was the same.

Web-UI screenshot

[Drag and drop the image here]

Error Traceback provided on the Web-UI

Traceback (most recent call last):
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 537, in _make_request
    response = conn.getresponse()
               ^^^^^^^^^^^^^^^^^^
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 466, in getresponse
    httplib_response = super().getresponse()
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/http/client.py", line 1395, in getresponse
    response.begin()
  File "/usr/lib64/python3.11/http/client.py", line 325, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/http/client.py", line 286, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/socket.py", line 706, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/ssl.py", line 1314, in recv_into
    return self.read(nbytes, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/ssl.py", line 1166, in read
    return self._sslobj.read(len, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: The read operation timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 847, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 470, in increment
    raise reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/urllib3/util/util.py", line 39, in reraise
    raise value
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 793, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 539, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 370, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='192.168.2.40', port=443): Read timed out. (read timeout=2)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/cli/api_wrapper.py", line 66, in set_token
    response = requests.post(
               ^^^^^^^^^^^^^^
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/requests/api.py", line 115, in post
    return request("post", url, data=data, json=json, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rockstor/.venv/lib/python3.11/site-packages/requests/adapters.py", line 532, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='192.168.2.40', port=443): Read timed out. (read timeout=2)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/smart_manager/views/receiver_pools.py", line 35, in get
    response = aw.api_call("pools")
               ^^^^^^^^^^^^^^^^^^^^
  File "/opt/rockstor/src/rockstor/cli/api_wrapper.py", line 85, in api_call
    self.set_token()
  File "/opt/rockstor/src/rockstor/cli/api_wrapper.py", line 81, in set_token
    raise Exception(msg)
Exception: Exception while setting access_token for url(https://192.168.2.40:443): HTTPSConnectionPool(host='192.168.2.40', port=443): Read timed out. (read timeout=2). content: None
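
The last exception shows the request to the target giving up after 2 seconds (read timeout=2). One way to tell a slow target Web-UI from an unreachable one might be to time a plain HTTPS request from the base appliance. The sketch below uses only the Python standard library. The names `time_request` and `within_limit` are hypothetical, it assumes the target serves a self-signed certificate (so verification is switched off), and it fetches only the base URL from the traceback, not the token endpoint itself.

```python
import ssl
import time
import urllib.error
import urllib.request


def time_request(url: str, timeout: float = 10.0) -> tuple[int, float]:
    """Return (HTTP status, seconds elapsed) for one GET of `url`."""
    # Assumption: self-signed certificate on the target, so skip verification.
    # Only sensible on a trusted local network.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    # An empty ProxyHandler bypasses any proxy set in the environment,
    # so the request goes straight to the appliance on the LAN.
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),
        urllib.request.HTTPSHandler(context=context),
    )

    start = time.monotonic()
    try:
        with opener.open(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        # A 4xx/5xx reply still proves the server answered in time.
        status = err.code
    return status, time.monotonic() - start


def within_limit(elapsed: float, limit: float = 2.0) -> bool:
    """True if the reply arrived inside the 2 second limit seen in the traceback."""
    return elapsed <= limit


# Example use on the base appliance (address taken from the traceback):
#   status, elapsed = time_request("https://192.168.2.40:443/")
#   print(status, round(elapsed, 2), within_limit(elapsed))
```

If the elapsed time is regularly close to or above 2 seconds, the timeout in the traceback would be explained by a slow response rather than by the pairing keys; a timeout or connection error raised here would point at the network or the target's web service instead.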

Hello @john_H, any progress on this front?

During this ongoing (and nearly finished) testing branch phase, we have proven replication to be working as from:

Maybe the comments in that PR will point to something that is not right in your setup. Also ensure you have configured the replication service: last I remember, you had to select a network port there or things would fail. We always have improvements to make, in replication as elsewhere, and that last update returned it to its prior function, only now under our new Python version and with some other odds and ends that needed updating.

In short, it is expected to work. What we have been missing for some time now is a proper, detailed doc section on its setup, so that folks are aware of what must be done to get this service up and running.

Hope that helps, at least with some context. And do be sure to update to at least 5.0.14-0, as that is way closer to our pending next Stable release. But I’m not aware of any fixes that impact replication on from your stated 5.0.9-0. Also note that our replication subsystem, and the btrfs send/receive that it uses, cannot handle network interruption. There is, as yet, no capability to recover from this, bar a by-hand tidy-up of the sending machine’s snapshot that failed to be sent due to network issues. So it’s a local/reliable network type thing only, currently.
