Replication starts and then fails

Hello,

I have an existing Rockstor which has a Share called data which I would like to replicate to a backup Rockstor; both systems are up to date.

Primary: 192.168.1.14

[root@rockstor3 log]# cat supervisord_replication_stderr.log
Process ReplicaScheduler-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/opt/rockstor/src/rockstor/smart_manager/replication/listener_broker.py", line 195, in run
    % (self.listener_interface, self.listener_port))
  File "zmq/backend/cython/socket.pyx", line 487, in zmq.backend.cython.socket.Socket.bind (zmq/backend/cython/socket.c:5156)
  File "zmq/backend/cython/checkrc.pxd", line 25, in zmq.backend.cython.checkrc._check_rc (zmq/backend/cython/socket.c:7535)
    raise ZMQError(errno)
ZMQError: Address already in use
[the same traceback repeats twice more]
Process ReplicaScheduler-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/opt/rockstor/src/rockstor/smart_manager/replication/listener_broker.py", line 213, in run
    address, command, msg = frontend.recv_multipart()
ValueError: need more than 2 values to unpack
[the same traceback repeats three more times]
[root@rockstor3 log]#
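
For context, “Address already in use” (EADDRINUSE) is the standard bind failure when a second socket tries to claim a TCP port that another process, typically a lingering listener from a previous replication service start, still holds. A minimal pyzmq sketch reproducing the error (the port and socket type here are illustrative, not necessarily what listener_broker.py uses):

import zmq

ctx = zmq.Context()
first = ctx.socket(zmq.ROUTER)
first.bind("tcp://*:5555")  # hypothetical port; the first bind succeeds

second = ctx.socket(zmq.ROUTER)
try:
    second.bind("tcp://*:5555")  # port still held: raises ZMQError
except zmq.ZMQError as e:
    print("bind failed: %s" % e)  # -> "Address already in use"
finally:
    second.close()
    first.close()
    ctx.term()

In practice this usually means an earlier ReplicaScheduler process never released the listener port, so restarting the replication service (or the machine) clears that particular state.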

Secondary:
[root@backup log]# cat supervisord_replication_stderr.log
Package upgrade
/opt/rockstor/eggs/setuptools-46.1.3-py2.7.egg/pkg_resources/py2_warn.py:21: UserWarning: Setuptools will stop working on Python 2

You are running Setuptools on Python 2, which is no longer
supported and
>>> SETUPTOOLS WILL STOP WORKING <<<
in a subsequent release (no sooner than 2020-04-20).
Please ensure you are installing
Setuptools using pip 9.x or later or pin to setuptools<45
in your environment.
If you have done those things and are still encountering
this message, please follow up at
https://bit.ly/setuptools-py2-warning.

  sys.version_info < (3,) and warnings.warn(pre + "*" * 60 + msg + "*" * 60)
[this deprecation warning repeats before each traceback below]
Process ReplicaScheduler-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/opt/rockstor/src/rockstor/smart_manager/replication/listener_broker.py", line 213, in run
    address, command, msg = frontend.recv_multipart()
ValueError: need more than 2 values to unpack
[the same traceback repeats twice more]
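
The “need more than 2 values to unpack” traceback, seen on both machines, is a plain Python unpacking failure: line 213 assumes every message arriving on the frontend socket carries exactly three frames (address, command, payload), so any two-frame message kills the whole scheduler process. A hedged sketch of a more defensive receive loop (names are illustrative, not the actual listener_broker.py code):

def receive_loop(frontend, logger):
    # Drain multipart messages without dying on a malformed frame count.
    while True:
        frames = frontend.recv_multipart()
        if len(frames) != 3:
            # A 2-frame message here would otherwise raise
            # "ValueError: need more than 2 values to unpack".
            logger.error("dropping malformed message (%d frames)" % len(frames))
            continue
        address, command, msg = frames
        # ... dispatch on command, as the real broker presumably does ...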

@Stefan Is this unrelated to your fairly recent other post here:

Replication failing

which I have replied to with some suggestions as to that failure.

Hi All,

both sides are on the latest version.

The replication starts and after a while it falls over:

Primary:

[09/May/2020 17:44:38] ERROR [smart_manager.replication.sender:74] Id: 1E484D56-64A3-9411-8BCA-716EC6808932-6. Got null or error command(receiver-error) message(No response received from the broker. remaining tries: 0. Terminating the receiver… Exception: No response received from the broker. remaining tries: 0. Terminating the receiver.) from the Receiver while transmitting fsdata. Aborting… Exception: No response received from the broker. remaining tries: 0. Terminating the receiver… Exception: No response received from the broker. remaining tries: 0. Terminating the receiver.
[09/May/2020 17:45:35] ERROR [storageadmin.util:44] Exception: Cannot add/remove volume definitions of the container (nextcloud-official) as it belongs to an installed Rock-on (Nextcloud-Official). Uninstall it first and try again.
Traceback (most recent call last):
  File "/opt/rockstor/eggs/gunicorn-19.7.1-py2.7.egg/gunicorn/workers/sync.py", line 68, in run_for_one
    self.accept(listener)
  File "/opt/rockstor/eggs/gunicorn-19.7.1-py2.7.egg/gunicorn/workers/sync.py", line 27, in accept
    client, addr = listener.accept()
  File "/usr/lib64/python2.7/socket.py", line 202, in accept
    sock, addr = self._sock.accept()
error: [Errno 11] Resource temporarily unavailable
[09/May/2020 17:45:35] ERROR [storageadmin.util:44] Exception: Errors occurred while processing updates for the following Rock-ons (Nextcloud-Official: ['Cannot add/remove volume definitions of the container (nextcloud-official) as it belongs to an installed Rock-on (Nextcloud-Official). Uninstall it first and try again.', 'Traceback (most recent call last):\n File "/opt/rockstor/eggs/gunicorn-19.7.1-py2.7.egg/gunicorn/workers/sync.py", line 68, in run_for_one\n self.accept(listener)\n File "/opt/rockstor/eggs/gunicorn-19.7.1-py2.7.egg/gunicorn/workers/sync.py", line 27, in accept\n client, addr = listener.accept()\n File "/usr/lib64/python2.7/socket.py", line 202, in accept\n sock, addr = self._sock.accept()\nerror: [Errno 11] Resource temporarily unavailable\n']).
Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/storageadmin/views/rockon.py", line 111, in post
    self._create_update_meta(r, rockons[r])
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/utils/decorators.py", line 145, in inner
    return func(*args, **kwargs)
  File "/opt/rockstor/src/rockstor/storageadmin/views/rockon.py", line 272, in _create_update_meta
    handle_exception(Exception(e_msg), self.request)
  File "/opt/rockstor/src/rockstor/storageadmin/util.py", line 47, in handle_exception
    trace=traceback.format_exc())
RockStorAPIException: ['Cannot add/remove volume definitions of the container (nextcloud-official) as it belongs to an installed Rock-on (Nextcloud-Official). Uninstall it first and try again.', 'Traceback (most recent call last):\n File "/opt/rockstor/eggs/gunicorn-19.7.1-py2.7.egg/gunicorn/workers/sync.py", line 68, in run_for_one\n self.accept(listener)\n File "/opt/rockstor/eggs/gunicorn-19.7.1-py2.7.egg/gunicorn/workers/sync.py", line 27, in accept\n client, addr = listener.accept()\n File "/usr/lib64/python2.7/socket.py", line 202, in accept\n sock, addr = self._sock.accept()\nerror: [Errno 11] Resource temporarily unavailable\n']

Secondary:

[09/May/2020 17:43:21] ERROR [smart_manager.replication.receiver:354] Id: 1E484D56-64A3-9411-8BCA-716EC6808932-6. No response received from the broker. remaining tries: 9
[09/May/2020 17:43:27] ERROR [smart_manager.replication.receiver:354] Id: 1E484D56-64A3-9411-8BCA-716EC6808932-6. No response received from the broker. remaining tries: 8
[09/May/2020 17:43:33] ERROR [smart_manager.replication.receiver:354] Id: 1E484D56-64A3-9411-8BCA-716EC6808932-6. No response received from the broker. remaining tries: 7
[09/May/2020 17:43:39] ERROR [smart_manager.replication.receiver:354] Id: 1E484D56-64A3-9411-8BCA-716EC6808932-6. No response received from the broker. remaining tries: 6
[09/May/2020 17:43:45] ERROR [smart_manager.replication.receiver:354] Id: 1E484D56-64A3-9411-8BCA-716EC6808932-6. No response received from the broker. remaining tries: 5
[09/May/2020 17:43:51] ERROR [smart_manager.replication.receiver:354] Id: 1E484D56-64A3-9411-8BCA-716EC6808932-6. No response received from the broker. remaining tries: 4
[09/May/2020 17:43:57] ERROR [smart_manager.replication.receiver:354] Id: 1E484D56-64A3-9411-8BCA-716EC6808932-6. No response received from the broker. remaining tries: 3
[09/May/2020 17:44:03] ERROR [smart_manager.replication.receiver:354] Id: 1E484D56-64A3-9411-8BCA-716EC6808932-6. No response received from the broker. remaining tries: 2
[09/May/2020 17:44:09] ERROR [smart_manager.replication.receiver:354] Id: 1E484D56-64A3-9411-8BCA-716EC6808932-6. No response received from the broker. remaining tries: 1
[09/May/2020 17:44:15] ERROR [smart_manager.replication.receiver:354] Id: 1E484D56-64A3-9411-8BCA-716EC6808932-6. No response received from the broker. remaining tries: 0
[09/May/2020 17:44:15] ERROR [smart_manager.replication.receiver:91] No response received from the broker. remaining tries: 0. Terminating the receiver… Exception: No response received from the broker. remaining tries: 0. Terminating the receiver.
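
Those receiver lines show the retry pattern at work: ten polls roughly six seconds apart, then the receiver gives up and terminates. A minimal sketch of that kind of bounded wait with a zmq poller (illustrative only, not the actual receiver.py implementation):

import zmq

def await_broker_response(sock, tries=10, timeout_ms=6000):
    # Poll for a broker reply, counting down as in the log above.
    poller = zmq.Poller()
    poller.register(sock, zmq.POLLIN)
    for remaining in reversed(range(tries)):
        if poller.poll(timeout_ms):
            return sock.recv_multipart()
        print("No response received from the broker. remaining tries: %d"
              % remaining)
    raise RuntimeError("No response received from the broker. "
                       "Terminating the receiver.")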

Best regards
Stefan

@Stefan Hello again.

and:

Doesn’t tell us much other than something held up Replication until all retries were used up.

To try and diagnose further, could you disable both the sender and the receiver for this replication task and set it up again from fresh. And make sure to check the spanner on the replication service in System - Services, as this establishes the network device to use. A replication task takes 5 scheduled runs to settle into its final stable state; if the current task has never completed successfully, you can just remove the associated auto-created snapshots on both the sender and the receiver and start over with your desired share. But if that share is large, each replication event will take a long time and so slow our diagnosis; a small test share with little data keeps each replication event short and lets you test the system through to the final state of 5+ replication events.

On a possibly unrelated note, you also have a Rock-on issue there that is throwing exceptions:

I’m assuming this share is not related to your replication event. Either way it would be good to get that sorted, as it’s filling your logs and making diagnosis of the replication issue more tricky. Do you have a local Rock-on repo by chance, with a Rock-on definition in there that is clashing with the central Rock-on repo?

@Flox may be able to help more with clearing up that potentially legacy lingering rock-on definition or database entry.

Also, could you clarify how you set up Replication? Our current docs are a little outdated and it can be tricky. Maybe post pics of the sender and the receiver config. But do remember to check that spanner in:

System - Services

[screenshot: replication network interface selection]

And in my case there is just the one interface available to select. Without this selection Replication fails; I don’t think that is your issue but it’s worth checking just in case.

So in short, try setting up from scratch, possibly with a small test share, to make sure you are familiar with how it’s meant to work; I’m assuming here that it hasn’t ever worked for you in this case. That way you can schedule the replication task very often and reproduce any failures fairly quickly. Then, once you are clear on its quirks, you can move back to your presumably more substantial share replication, which will take much longer to go through the first 5 events before achieving a final state of 3 snapshots at each end.

Apologies I can’t be more specific, but I must first get a large pull request in so we can move forward with the openSUSE move. I will soon take another look at the replication setup and test that we are still working as expected. Nothing has changed in that code since it last worked, with Stable as both sender and receiver, so I’m assuming it still works as before, apart from the known blocking snapshot failure, mentioned earlier, when the sender can’t see the receiver.

Hope that helps.


Hi Phil,

sometimes the replication starts and sometimes it does not.

here is an overview of the steps

https://docs.google.com/document/d/1ajESvMCc8p4KZA63wXS6OLIRbQd9wNZPA7ANwygDOlY/edit?usp=sharing

Do people successfully use replication?

I am keen to help make replication better and more stable. Please let me know how I can help.

Best regards
Stefan


@Stefan I’m afraid I can’t look into this directly as I’m currently deep in preparing the next rpm release, but with regard to:

I have a working setup here, but I will double check it as part of this discussion, although nothing springs to mind that might have changed its function since I last tried it.

Thanks for the offer to help; your input has already brought attention to this area. Hopefully others can chip in with their recent experience of the replication service on 3.9.2-57. It’s a super handy major feature that will definitely get priority, but we first have to concentrate on getting folks to a more modern distro base. Given that effort is definitely getting there, I should soon be able to tend to this in more detail; unless someone else beats me to it, of course.

Thanks for the doc reference but I unfortunately don’t have permission to view it.

Do make sure to setup a clean replication configuration from scratch to double check things. And to help narrow down the nature of the failure.

Could you also confirm if you have ever had replication working as of yet?

Hope to get back to this shortly. If anyone else can chip in here on setting up replication, and on whether we have some new bug in this important feature, that would be good. At minimum we should improve our docs on this. N.B. replication is known to have been broken in our last CentOS testing release, but that was over 2 years ago; it has since been mended and proven working, though we may have a new blocking bug which would be good to discover.


Hi Phil,

no worries.

I got replication started but then it fails; the logs suggest that one of the boxes stops responding.

this link should work now.

Best regards
Stefan


@Stefan Thanks for your patience/understanding here.

I’ve just had a quick look at your excellent doc on this and the following looks suspicious:

Secondary

[10/May/2020 09:04:57] ERROR [storageadmin.middleware:32] Exception occurred while processing a request. Path: /api/commands/refresh-share-state method: POST
[10/May/2020 09:04:57] ERROR [storageadmin.middleware:33] Error running a command. cmd = /usr/bin/mount -t btrfs -o subvolid=370 /dev/disk/by-id/wwn-0x50014ee6aae1d4b5 /mnt2/.snapshots/1E484D56-64A3-9411-8BCA-716EC6808932_video/video_8_replication_1. rc = 32. stdout = ['']. stderr = ['mount: mount /dev/sdd on /mnt2/.snapshots/1E484D56-64A3-9411-8BCA-716EC6808932_video/video_8_replication_1 failed: Cannot allocate memory', '']

Skipping some of the Django internals in the traceback, we are back to our code:

  File "/opt/rockstor/src/rockstor/storageadmin/views/command.py", line 348, in post
    import_shares(p, request)
  File "/opt/rockstor/src/rockstor/storageadmin/views/share_helpers.py", line 204, in import_shares
    mount_share(nso, '%s%s' % (settings.MNT_PT, s_in_pool))
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 607, in mount_share
    return run_command(mnt_cmd)
  File "/opt/rockstor/src/rockstor/system/osi.py", line 176, in run_command
    raise CommandException(cmd, out, err, rc)
CommandException: Error running a command. cmd = /usr/bin/mount -t btrfs -o subvolid=370 /dev/disk/by-id/wwn-0x50014ee6aae1d4b5 /mnt2/.snapshots/1E484D56-64A3-9411-8BCA-716EC6808932_video/video_8_replication_1. rc = 32. stdout = ['']. stderr = ['mount: mount /dev/sdd on /mnt2/.snapshots/1E484D56-64A3-9411-8BCA-716EC6808932_video/video_8_replication_1 failed: Cannot allocate memory', '']

So it looks like the mount command on the Secondary (receiver) machine failed because it could not allocate enough memory. This in turn breaks the receiver’s ability to continue receiving, so the sender sees it as having given up once all retries are exhausted while waiting for it to recover. But it can’t recover, because the mount command has failed.

What are the specifications of both the Primary (sender) and Secondary (receiver) machines, memory-wise? It may be we are hitting some limit here where our minimum specifications are just not adequate. Also, could you give an indication of the size of the share you are trying to replicate? And the size of the pools on both the sender and receiver that are involved in this share’s replication task. It may also be useful to know the usage of the receiver pool in particular. Larger pools, and those with more data on them, take more memory to process. And mount does a fair bit of processing; although in this case it’s only the subvol requiring a mount, a btrfs subvolume still shares many filesystem structures with its parent pool, even though it looks mostly like a filesystem in its own right.
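
For anyone following along, the run_command in the traceback above is essentially a thin subprocess wrapper that raises when the command exits non-zero. A simplified sketch (not the exact system/osi.py code) that re-runs the failing mount from the log by hand, isolating the “Cannot allocate memory” failure from the replication machinery:

import subprocess

def run_command(cmd):
    # Simplified stand-in for Rockstor's system.osi.run_command:
    # capture stdout/stderr and raise on a non-zero return code.
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = p.communicate()
    if p.returncode != 0:
        raise Exception("cmd = %s. rc = %d. stderr = %r"
                        % (" ".join(cmd), p.returncode, err))
    return out, err, p.returncode

# Device, subvolid and target path taken from the error log above:
run_command(["/usr/bin/mount", "-t", "btrfs", "-o", "subvolid=370",
             "/dev/disk/by-id/wwn-0x50014ee6aae1d4b5",
             "/mnt2/.snapshots/1E484D56-64A3-9411-8BCA-716EC6808932_video/video_8_replication_1"])

If that manual mount fails the same way, the kernel log (dmesg) around the failure should show what exactly the kernel ran out of.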

Hope that helps.

I tried the volumes data and video; both behave the same way.

Primary has 8GB and runs virtualized.

The Secondary is a physical box with 4GB RAM. The box does not swap and the CPU stays below 30% during replication; most of the memory is used for cache.

Are there tuning parameters for btrfs to allocate more memory?

btrfs fi show
Label: 'rockstor_rockstor' uuid: d82167d8-ad44-452b-be50-bdb4cfb8c4b5
Total devices 1 FS bytes used 7.78GiB
devid 1 size 13.91GiB used 11.04GiB path /dev/sdk3

Label: 'data_pool' uuid: 2c0ee95b-d4ae-4bcb-8368-7f75e1f5e75e
Total devices 2 FS bytes used 1.72TiB
devid 1 size 3.64TiB used 2.94TiB path /dev/sdb
devid 2 size 3.64TiB used 2.94TiB path /dev/sda

Label: 'video_pool' uuid: 790c620a-54ea-4296-aacf-eaf107daf33e
Total devices 8 FS bytes used 2.04TiB
devid 1 size 931.51GiB used 299.50GiB path /dev/sde
devid 2 size 931.51GiB used 299.50GiB path /dev/sdf
devid 3 size 931.51GiB used 299.50GiB path /dev/sdc
devid 4 size 931.51GiB used 299.50GiB path /dev/sdg
devid 5 size 931.51GiB used 299.50GiB path /dev/sdh
devid 6 size 931.51GiB used 299.50GiB path /dev/sdi
devid 7 size 931.51GiB used 299.50GiB path /dev/sdj
devid 8 size 931.51GiB used 299.50GiB path /dev/sdd

df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 3.9G 0 3.9G 0% /dev
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 3.9G 370M 3.6G 10% /run
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/sdk3 14G 8.0G 4.2G 66% /
/dev/sdk3 14G 8.0G 4.2G 66% /home
/dev/sdk1 477M 140M 308M 32% /boot
/dev/sdk3 14G 8.0G 4.2G 66% /mnt2/rockstor_rockstor
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/data_pool
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/rockons
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/resilio
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/plex_config
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/pihole
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/data
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/Radarr
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/rockons3
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/NextCloud
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/data/.data_spapshot_202005060200
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/NextCloud/.STE_202005060045
/dev/sde 7.3T 2.1T 5.0T 30% /mnt2/video_pool
/dev/sde 7.3T 2.1T 5.0T 30% /mnt2/video
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/NextCloud/.STE_202005070045
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/NextCloud/.STE_202005080045
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/NextCloud/.STE_202005090045
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/NextCloud/.STE_202005100045
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/NextCloud/.STE_202005110045
/dev/sde 7.3T 2.1T 5.0T 30% /mnt2/Sharon_Mac_Backup
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/NextCloud/.STE_202005120045
/dev/sdb 3.7T 1.8T 2.0T 48% /mnt2/data/.data_spapshot_202005120200
tmpfs 799M 0 799M 0% /run/user/0

Secondary

btrfs fi show
Label: 'rockstor_rockstor' uuid: d91d9693-1fb3-4fa5-8e9f-b047a0145adf
Total devices 1 FS bytes used 6.10GiB
devid 1 size 234.23GiB used 13.02GiB path /dev/sdi3

Label: 'backup' uuid: 7da112ea-534d-4b4b-9551-8df76dbe00b9
Total devices 8 FS bytes used 129.48GiB
devid 1 size 931.51GiB used 19.50GiB path /dev/sde
devid 2 size 931.51GiB used 19.50GiB path /dev/sdg
devid 3 size 931.51GiB used 19.50GiB path /dev/sdh
devid 4 size 931.51GiB used 19.50GiB path /dev/sdc
devid 5 size 931.51GiB used 19.50GiB path /dev/sdf
devid 6 size 931.51GiB used 19.50GiB path /dev/sda
devid 7 size 931.51GiB used 19.50GiB path /dev/sdb
devid 8 size 931.51GiB used 19.50GiB path /dev/sdd

btrfs subvolume list /mnt2/backup/
ID 372 gen 1020 top level 5 path data_backup
ID 373 gen 1021 top level 5 path 1E484D56-64A3-9411-8BCA-716EC6808932_data
ID 374 gen 1154 top level 5 path .snapshots/1E484D56-64A3-9411-8BCA-716EC6808932_data/data_9_replication_1

df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 1.9G 0 1.9G 0% /dev
tmpfs 1.9G 0 1.9G 0% /dev/shm
tmpfs 1.9G 8.8M 1.9G 1% /run
tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
/dev/sdi3 235G 6.2G 228G 3% /
tmpfs 1.9G 20K 1.9G 1% /tmp
/dev/sdi3 235G 6.2G 228G 3% /home
/dev/sdi1 477M 142M 306M 32% /boot
/dev/sde 7.3T 130G 7.2T 2% /mnt2/backup
/dev/sdi3 235G 6.2G 228G 3% /mnt2/rockstor_rockstor
/dev/sde 7.3T 130G 7.2T 2% /mnt2/data_backup
/dev/sdi3 235G 6.2G 228G 3% /mnt2/ssd_test
tmpfs 384M 0 384M 0% /run/user/0
/dev/sde 7.3T 130G 7.2T 2% /mnt2/1E484D56-64A3-9411-8BCA-716EC6808932_data
/dev/sde 7.3T 130G 7.2T 2% /mnt2/.snapshots/1E484D56-64A3-9411-8BCA-716EC6808932_data/data_9_replication_1

@Stefan Thanks for the additional info:

But in this case the Secondary (receiver) isn’t actually completing its part of the replication, from what I can tell. And swap may only represent what can be swapped out; much of the memory needed is likely to be RAM only, i.e. there may simply not be enough actual RAM, so it just fails with that “Cannot allocate memory” message.

I’ll try and do some more testing here soon. But currently it’s still looking like RAM on the receiver, although I would have thought 4 GB to be enough, even for an 8TB pool with no real data on it! I’ll do some tests here once the next testing rpm release is out and report my findings.

Incidentally the df command is inadequate for measuring disk use on btrfs. The preferred command is:

btrfs fi usage -h /

Where in that example "/" is the mount point of the pool, i.e. /mnt2/video_pool in Rockstor.

Again apologies I can’t spend more time on this currently.


btrfs fi usage -h /
Overall:
Device size: 13.91GiB
Device allocated: 11.04GiB
Device unallocated: 2.88GiB
Device missing: 0.00B
Used: 8.11GiB
Free (estimated): 3.96GiB (min: 2.52GiB)
Data ratio: 1.00
Metadata ratio: 1.99
Global reserve: 32.11MiB (used: 0.00B)

Data,single: Size:9.01GiB, Used:7.92GiB
/dev/sdk3 9.01GiB

Metadata,single: Size:8.00MiB, Used:0.00B
/dev/sdk3 8.00MiB

Metadata,DUP: Size:1.00GiB, Used:98.16MiB
/dev/sdk3 2.00GiB

System,single: Size:4.00MiB, Used:0.00B
/dev/sdk3 4.00MiB

System,DUP: Size:8.00MiB, Used:16.00KiB
/dev/sdk3 16.00MiB

Unallocated:
/dev/sdk3 2.88GiB
[root@rockstor3 bin]#

I am also waiting for more RAM; hopefully that will fix it.

@Stefan Your following command:

Only gives the usage for the mount point that is the system pool.

But I didn’t think your replication involved the system pool on either the sender or the receiver. Are you not using separate data only pools for your replication task?

Yes me too.

Let us know if you are using your system pool at either end. Not a good idea really, as it’s best to keep the system pool for only the system, or for stuff you don’t mind losing, since there is no redundancy there and it is more limited in other capabilities than a dedicated data pool.

Hope that helps.

Hi Phil,

no, I don’t use the system pool. Both sides are RAID5 pools on 1TB disks.

The Secondary now has 8GB RAM.

Replication is failing with a new error:

cat supervisord_replication_stderr.log
/opt/rockstor/eggs/setuptools-46.1.3-py2.7.egg/pkg_resources/py2_warn.py:21: UserWarning: Setuptools will stop working on Python 2

You are running Setuptools on Python 2, which is no longer
supported and
>>> SETUPTOOLS WILL STOP WORKING <<<
in a subsequent release (no sooner than 2020-04-20).
Please ensure you are installing
Setuptools using pip 9.x or later or pin to setuptools<45
in your environment.
If you have done those things and are still encountering
this message, please follow up at
https://bit.ly/setuptools-py2-warning.

  sys.version_info < (3,) and warnings.warn(pre + "*" * 60 + msg + "*" * 60)
[this deprecation warning repeats between each of the tracebacks below]
Process ReplicaScheduler-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/opt/rockstor/src/rockstor/smart_manager/replication/listener_broker.py", line 213, in run
    address, command, msg = frontend.recv_multipart()
ValueError: need more than 2 values to unpack
[the same traceback repeats twice more; in between, the new error appears once:]
Process ReplicaScheduler-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/opt/rockstor/src/rockstor/smart_manager/replication/listener_broker.py", line 310, in run
    map(self.prune_replica_trail, Replica.objects.filter())
TypeError: argument 2 to map() must support iteration
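
As an aside, the new TypeError is the Python 2 map() builtin complaining that its second argument is not iterable, which suggests Replica.objects.filter() returned something unexpected rather than a queryset. A two-line illustration of the same failure mode (purely illustrative, Python 2 semantics):

# map() requires its second argument to be iterable, so a non-iterable
# value (e.g. None from a failed lookup) fails immediately:
map(lambda x: x, None)
# TypeError: argument 2 to map() must support iteration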

Not to worry, I am using rsync now; it works fine.
