Syncing Collections

    The syncCollection function fetches all documents of the specified collection from the master database and stores them in the local instance. After the synchronization, the collection data on the slave will be identical to the data on the master, provided no further data changes happen on the master. Any data changes performed on the master after the synchronization was started will not be captured by syncCollection, but need to be replicated using the regular replication applier mechanism.

    For the following example setup, we’ll use the instance tcp://master.domain.org:8529 as the master, and the instance tcp://slave.domain.org:8530 as a slave.

    The goal is to have all data from the collection test in database _system on master tcp://master.domain.org:8529 replicated to the collection test in database _system on the slave tcp://slave.domain.org:8530.

    To transfer this collection to the slave, issue the following commands there:
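As a minimal sketch of such a call in ArangoShell JavaScript: the helper name syncTestCollection and its parameters are hypothetical and exist only for illustration. The password is left as a placeholder for the caller to supply, since no credentials are given in this text.

```javascript
// Sketch: one-off sync of collection "test" from the master.
// `replication` is the module returned by require("@arangodb/replication"),
// `db` is the ArangoShell database object, and `password` is a placeholder
// that the caller must supply (it is not given in this text).
function syncTestCollection(replication, db, password) {
  // Make sure we operate on the intended database on the slave.
  db._useDatabase("_system");
  // Blocks until the sync has finished or an error occurs.
  return replication.syncCollection("test", {
    endpoint: "tcp://master.domain.org:8529",
    username: "myuser",
    password: password
  });
}

// In arangosh on the slave (tcp://slave.domain.org:8530):
//   syncTestCollection(require("@arangodb/replication"), db, "...");
```

Passing the module and the database object in as arguments keeps the helper free of top-level side effects, so nothing is replaced on the slave until the function is called deliberately.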

    Warning: The syncCollection command will replace the collection’s data in the slave database with data from the master database! Only execute these commands if you have verified you are on the correct server, in the correct database!

    Setting the optional incremental attribute in the call to syncCollection will start an incremental transfer of data. This may be useful when the slave already has parts or almost all of the data in the collection and only the differences need to be synchronized. Note that to compute the differences, the incremental transfer will build a sorted list of all document keys in the collection on both the slave and the master, which may still be expensive for huge collections in terms of memory usage and runtime. While the list of keys is being built, the collection will be read-locked on the master.
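An incremental transfer could be requested roughly as follows. This is a sketch only: the helper name syncTestIncremental is hypothetical, and the password is again a placeholder for the caller to supply.

```javascript
// Sketch: incremental sync of collection "test".
// `replication` is the module returned by require("@arangodb/replication");
// `password` is a caller-supplied placeholder.
function syncTestIncremental(replication, password) {
  return replication.syncCollection("test", {
    endpoint: "tcp://master.domain.org:8529",
    username: "myuser",
    password: password,
    // Only transfer the differences between slave and master.
    incremental: true
  });
}

// In arangosh on the slave:
//   syncTestIncremental(require("@arangodb/replication"), "...");
```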

    The syncCollection command may take a long time to complete if the collection is big. The shell will block until the slave has synchronized the entire collection from the master or until an error occurs. By default, the syncCollection command in the ArangoShell will poll for a status update every 10 seconds.

    When syncCollection is called from the ArangoShell, the optional async attribute can be used to start the synchronization as a background process on the slave. If async is set to true, the call to syncCollection will return almost instantly with an id string. Using this id string, the status of the sync job on the slave can be queried using the getSyncResult function as follows:

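    var replication = require("@arangodb/replication");
    // Start the sync as a background job on the slave; returns an id string.
    var id = replication.syncCollection("test", {
      endpoint: "tcp://master.domain.org:8529",
      username: "myuser",  // supply credentials as required by the master
      async: true
    });
    print(replication.getSyncResult(id));  // false until the sync is complete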

    getSyncResult will return false as long as the synchronization is not complete, and return the synchronization result otherwise.
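Because getSyncResult returns false until the job is done, a caller that wants to wait for completion has to poll. The following is one possible sketch, not part of the replication module: waitForSyncResult, sleep and maxPolls are hypothetical names. The 10-second pause mirrors the shell's default polling interval mentioned above, and the suggestion to use require("internal").wait as the sleep function in arangosh is an assumption.

```javascript
// Sketch: poll an asynchronous sync job until it reports a result.
// `replication` is the module returned by require("@arangodb/replication"),
// `id` is the id string returned by syncCollection with async: true,
// `sleep` is any function that pauses for the given number of seconds,
// and `maxPolls` bounds the loop so it cannot run forever.
function waitForSyncResult(replication, id, sleep, maxPolls) {
  for (var i = 0; i < maxPolls; i++) {
    var result = replication.getSyncResult(id);
    if (result !== false) {
      // Synchronization finished: hand back the result.
      return result;
    }
    // Not finished yet: pause before asking again.
    sleep(10);
  }
  // Gave up after maxPolls attempts; the job may still be running.
  return false;
}

// In arangosh (assuming require("internal").wait pauses for N seconds):
//   waitForSyncResult(require("@arangodb/replication"), id,
//                     require("internal").wait, 60);
```

Bounding the number of polls is a design choice of this sketch: a return value of false then means "not finished within the allotted attempts", not "failed".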