Pre-seeding replication partner

I wish to set up two Rockstor appliances to replicate shares. The appliances will be in different cities and will be connected over a VPN tunnel.

I have over 1TB of data that I wish to replicate/synchronize between the two. My question is this:

Is it possible to pre-seed the remote partner and avoid sending all that data over the slow and costly VPN tunnel?

Hi @clawes, reading your post I assume:

  1. You don’t have Rockstor installed
  2. You have “raw” (not btrfs) data

Question: what do you mean by “I wish to replicate/sync”? Let me explain:

sync/replicate == Btrfs replication -> Set up your first Rockstor machine, add your data (configure pools, shares, etc.), install the second Rockstor machine, and run Rockstor replication (send/receive) locally. Finally, move Rockstor machine 2 to the other city.

sync/replicate == rsync -> Same as the Btrfs replication approach, but with rsync. Or, if your second Rockstor machine is already at the remote site, rsync to a new disk, move it to the Rockstor 2 location, and rsync again there (not kidding: with 1TB of data this seems the least expensive way).

My last note, on btrfs “cloning”: dd/Clonezilla-style replication is not possible, or at least not easy.

M.

Back story:

I am presently using rsync to mirror the data between two servers. Rsync, as you may know, takes a long time to determine file changes given the amount of data it needs to inspect each cycle (~1TB). I wish to implement Rockstor and its replication features to replace the current rsync process.
Given that the data is already in both places, I was hoping to pre-seed the remote server with the data already there and then have Rockstor send only the changes instead of the complete 1TB.

Another option might be to create a snapshot on an external (BTRFS) USB drive and then ship the drive to the remote site. But, again, I’m not sure whether the initial replication task would recognize the existing copy and send only the difference, or whether it would send the entire data set.

I actually don’t know how Rockstor replication works (better to ask @suman, who coded it), but if I’m not wrong it runs over ZeroMQ, so syncing should be fast.

Apart from that, my 2 cents on improving rsync: always “hack” it with --files-from (rsync + find -mtime). For example, run a cron job every 24h and rsync only the files modified in the last day. I had a similar task: pure rsync on 500k files took 5-6 minutes; with find -mtime it took 20 seconds.
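
A minimal sketch of that trick in Python, in case it helps: walk the tree, keep only files modified within the last 24h, and hand the list to rsync via --files-from. The function names, the paths, and the destination host are made up for illustration; only `rsync -a --files-from=FILE SRC DEST` is real rsync syntax.

```python
import os
import time
from pathlib import Path


def recently_modified(root, max_age_seconds=24 * 3600, now=None):
    """Return paths (relative to root) of regular files modified within the window.

    Roughly what `find . -type f -mtime -1` gives you, as a sorted list.
    """
    root = Path(root)
    cutoff = (time.time() if now is None else now) - max_age_seconds
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if path.is_file() and path.stat().st_mtime >= cutoff:
                    # --files-from expects paths relative to the source dir
                    changed.append(path.relative_to(root).as_posix())
            except OSError:
                # File vanished or is unreadable; skip it
                continue
    return sorted(changed)


def rsync_command(root, file_list, destination):
    """Build the rsync argument list; running it is left to the caller."""
    return ["rsync", "-a", f"--files-from={file_list}", str(root), destination]


if __name__ == "__main__":
    source = "/srv/data"               # hypothetical source directory
    list_file = "/tmp/changed.txt"     # hypothetical list file
    with open(list_file, "w") as fh:
        fh.write("\n".join(recently_modified(source)) + "\n")
    print(" ".join(rsync_command(source, list_file, "backup-host:/srv/data")))
```

Keep in mind that a list built from modification times never tells rsync about deleted files, and it misses files that were moved in with an old mtime, so an occasional full rsync run is still a good idea.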

M

See the script I use for BTRFS send/receive replication between two Rockstors here:

I did an initial replication locally, removed the disks, shipped them to the destination, built a new NAS there, and resumed replication using the same script.
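
For anyone wondering why seeding has to go through send/receive rather than an rsync copy: an incremental `btrfs send -p PARENT SNAPSHOT` only works when the receiving side already holds a received copy of that same parent snapshot, so identical files copied by other means are not recognized. Below is a rough sketch of the two steps using plain btrfs commands. It is not the script mentioned above and not Rockstor's own replication service; the snapshot paths and helper names are hypothetical, and snapshots must be read-only to be sent.

```python
import subprocess


def full_send(snapshot, dest_mount):
    """Commands for the initial (seed) transfer of a read-only snapshot."""
    return ["btrfs", "send", snapshot], ["btrfs", "receive", dest_mount]


def incremental_send(parent, snapshot, dest_mount):
    """Commands for a later transfer that sends only changes since `parent`.

    `parent` must already exist on the receiving side from an earlier receive.
    """
    return ["btrfs", "send", "-p", parent, snapshot], ["btrfs", "receive", dest_mount]


def run_pipeline(send_cmd, receive_cmd):
    """Run `send_cmd | receive_cmd` and raise if either side fails."""
    sender = subprocess.Popen(send_cmd, stdout=subprocess.PIPE)
    receiver = subprocess.Popen(receive_cmd, stdin=sender.stdout)
    sender.stdout.close()  # let the sender see a broken pipe if the receiver exits
    receiver_rc = receiver.wait()
    sender_rc = sender.wait()
    if sender_rc != 0 or receiver_rc != 0:
        raise RuntimeError(f"send exited {sender_rc}, receive exited {receiver_rc}")


if __name__ == "__main__":
    # Hypothetical paths: seed locally, ship the disks, then send increments.
    seed = full_send("/mnt/pool/.snapshots/share-1", "/mnt/backup")
    delta = incremental_send(
        "/mnt/pool/.snapshots/share-1",
        "/mnt/pool/.snapshots/share-2",
        "/mnt/backup",
    )
    print(seed)
    print(delta)
```

Once the disks are at the remote site, the receive side would typically be wrapped in ssh (for example `["ssh", "remote-host", "btrfs", "receive", "/mnt/backup"]`, host name made up), so only the incremental stream crosses the VPN.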