Pulsar geo-replication

    The diagram below illustrates the process of geo-replication across Pulsar clusters:

    In this diagram, whenever producers P1, P2, and P3 publish messages to the T1 topic on Cluster-A, Cluster-B, and Cluster-C respectively, those messages are replicated to the other clusters. Once replicated, consumers C1 and C2 can consume those messages from their respective clusters.

    Without geo-replication, consumers C1 and C2 cannot consume messages published by the P3 producer.

    Geo-replication must be enabled on a per-tenant basis in Pulsar. Geo-replication can be enabled between clusters only when a tenant has been created that allows access to both clusters.

    Although geo-replication must be enabled between two clusters, it's actually managed at the namespace level. You must complete the following tasks to enable geo-replication for a namespace:

    • Enable a geo-replication namespace
    • Configure that namespace to replicate across two or more provisioned clusters

    Any message published on any topic in that namespace will be replicated to all clusters in the specified set.

    When messages are produced on a Pulsar topic, they are first persisted in the local cluster, and then forwarded asynchronously to the remote clusters.

    In normal cases, when there are no connectivity issues, messages are replicated immediately, at the same time as they are dispatched to local consumers. Typically, end-to-end delivery latency is determined by the network round-trip time (RTT) between the remote regions.

    In the aforementioned example, the T1 topic is being replicated among three clusters, Cluster-A, Cluster-B, and Cluster-C.

    All messages produced in any of the three clusters are delivered to all subscriptions in other clusters. In this case, C1 and C2 consumers will receive all messages published by P1, P2, and P3 producers. Ordering is still guaranteed on a per-producer basis.
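    The persist-then-forward flow described above can be pictured with a small toy model. This is only a sketch for intuition, not how Pulsar is implemented: the `Cluster` class, `connect` helper, and in-memory lists are hypothetical stand-ins for brokers, storage, and replicators.

```python
from collections import deque


class Cluster:
    """Toy stand-in for one cluster holding a single replicated topic."""

    def __init__(self, name):
        self.name = name
        self.log = []          # messages persisted locally, in arrival order
        self.outbox = deque()  # (remote_cluster, message) pairs awaiting forwarding
        self.remotes = []      # the other clusters in the replication set

    def publish(self, producer, payload):
        message = (producer, payload)
        # Step 1: persist in the local cluster first.
        self.log.append(message)
        # Step 2: queue the message for every remote cluster; nothing is sent yet.
        for remote in self.remotes:
            self.outbox.append((remote, message))

    def replicate_pending(self):
        # Forward queued messages in FIFO order, so each producer's messages
        # arrive at a remote cluster in the order they were published.
        while self.outbox:
            remote, message = self.outbox.popleft()
            # Replicated messages are stored remotely but not forwarded again.
            remote.log.append(message)


def connect(clusters):
    """Make every cluster replicate to all the others."""
    for cluster in clusters:
        cluster.remotes = [other for other in clusters if other is not cluster]
```

    In this model a message is visible in its local cluster as soon as `publish` returns, and in the remote clusters only after `replicate_pending` runs. Messages from different producers may interleave differently in each cluster, while each producer's own messages keep their order.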

    As stated in the Geo-replication and Pulsar properties section, geo-replication in Pulsar is enabled on a per-tenant basis.

    To replicate to a cluster, the tenant needs permission to use that cluster. You can grant this permission when you create the tenant or at any later time.

    Specify all the intended clusters when creating a tenant:
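    As a rough illustration, the request below is what a tenant-creation call might look like against the admin REST API. It is a sketch under assumptions: the `/admin/v2/tenants/{tenant}` path, the `adminRoles` and `allowedClusters` field names, and the PUT-to-create and POST-to-update convention should be checked against the REST reference for your Pulsar version. The tenant, role, and cluster names are hypothetical.

```python
import json
import urllib.request


def tenant_request(admin_url, tenant, allowed_clusters, admin_roles=(), update=False):
    """Build (but do not send) a request that creates or updates a tenant.

    Assumed convention: PUT creates the tenant, POST updates an existing one.
    """
    body = json.dumps({
        "adminRoles": list(admin_roles),
        # Every cluster the tenant should be able to replicate to.
        "allowedClusters": list(allowed_clusters),
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{admin_url}/admin/v2/tenants/{tenant}",
        data=body,
        method="POST" if update else "PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical names; sending would be urllib.request.urlopen(request).
request = tenant_request(
    "http://localhost:8080",
    "my-tenant",
    allowed_clusters=["us-west", "us-east", "us-cent"],
    admin_roles=["my-admin-role"],
)
```

    Passing `update=True` builds the request for granting additional clusters to a tenant that already exists.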

    To update the permissions of an existing tenant, use update instead of create.

    You can create a namespace with the following command sample.

    The replication clusters for a namespace can be changed at any time, without disruption to ongoing traffic. Replication channels are immediately set up or stopped in all clusters as soon as the configuration changes.
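    Creating a namespace and then assigning its replication clusters could be sketched as two admin REST requests. The paths used here (`/admin/v2/namespaces/{tenant}/{namespace}` and its `/replication` sub-resource) and the JSON-array body are assumptions to verify against the REST reference for your Pulsar version; the names are hypothetical.

```python
import json
import urllib.request


def create_namespace_request(admin_url, tenant, namespace):
    """Build (but do not send) a request that creates an empty namespace."""
    return urllib.request.Request(
        url=f"{admin_url}/admin/v2/namespaces/{tenant}/{namespace}",
        method="PUT",
    )


def set_replication_clusters_request(admin_url, tenant, namespace, clusters):
    """Build (but do not send) a request that replaces the namespace's cluster set."""
    return urllib.request.Request(
        url=f"{admin_url}/admin/v2/namespaces/{tenant}/{namespace}/replication",
        data=json.dumps(list(clusters)).encode("utf-8"),  # body is a JSON array of cluster names
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical names; each request would be sent with urllib.request.urlopen(...).
create = create_namespace_request("http://localhost:8080", "my-tenant", "my-namespace")
assign = set_replication_clusters_request(
    "http://localhost:8080", "my-tenant", "my-namespace", ["us-west", "us-east", "us-cent"]
)
```

    Because the second request carries the complete cluster list, changing the replication set later is a matter of sending it again with a different list.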

    Once you've created a geo-replication namespace, any topics that producers or consumers create within that namespace will be replicated across clusters. Typically, each application will use the serviceUrl for the local cluster.

    Selective replication

    By default, messages are replicated to all clusters configured for the namespace. You can restrict replication by specifying a replication list for an individual message; that message is then replicated only to the clusters in the list.

    In the client API, the replication list is set on the message when the Message object is constructed.
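    A minimal sketch of per-message selective replication, assuming the Pulsar Python client, whose `Producer.send` is understood to accept a `replication_clusters` keyword argument. In that client the list is passed at send time rather than on a separate message object. The `send_restricted` helper, service URL, topic, and cluster names are hypothetical.

```python
def send_restricted(producer, payload, clusters):
    """Send one message that should be replicated only to `clusters`.

    `producer` is expected to behave like a Pulsar Python client producer,
    that is, to accept send(content, replication_clusters=[...]).
    """
    return producer.send(payload, replication_clusters=list(clusters))


# With a running broker this might be used as follows (not executed here):
#
#   import pulsar
#   client = pulsar.Client("pulsar://localhost:6650")
#   producer = client.create_producer("persistent://my-tenant/my-namespace/my-topic")
#   send_restricted(producer, b"my-payload", ["us-west", "us-east"])
#   client.close()
```

    Messages sent through the same producer without a replication list still follow the namespace-level cluster set.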

    Topic stats

    Topic-specific statistics for geo-replication topics are available via the pulsar-admin tool and API:

    Each cluster reports its own local stats, including the incoming and outgoing replication rates and backlogs.
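    The sketch below shows one way to fetch a topic's stats over the admin REST API and pull out the per-remote-cluster replication figures. The stats path and the field names (`replication`, `msgRateIn`, `msgRateOut`, `replicationBacklog`) are assumptions about the stats JSON and should be checked against the output of your Pulsar version. The sample values are made up for illustration.

```python
import urllib.request


def topic_stats_request(admin_url, tenant, namespace, topic):
    """Build (but do not send) a request for a persistent topic's stats."""
    return urllib.request.Request(
        url=f"{admin_url}/admin/v2/persistent/{tenant}/{namespace}/{topic}/stats",
        method="GET",
    )


def replication_summary(stats):
    """Reduce a stats document to rates and backlog per remote cluster."""
    summary = {}
    for remote, entry in stats.get("replication", {}).items():
        summary[remote] = {
            "rate_in": entry.get("msgRateIn", 0.0),      # msgs/s arriving from the remote cluster
            "rate_out": entry.get("msgRateOut", 0.0),    # msgs/s forwarded to the remote cluster
            "backlog": entry.get("replicationBacklog", 0),  # messages not yet forwarded
        }
    return summary


# Made-up sample shaped like the assumed stats JSON.
sample = {
    "msgRateIn": 12.5,
    "replication": {
        "us-east": {"msgRateIn": 3.0, "msgRateOut": 12.5, "replicationBacklog": 40},
    },
}
summary = replication_summary(sample)
```

    Since each cluster reports only its own local view, the same query has to be run against every cluster's admin endpoint to see the full picture.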

    Deleting a geo-replication topic

    Given that geo-replication topics exist in multiple regions, it's not possible to directly delete a geo-replication topic. Instead, you should rely on automatic topic garbage collection.

    In Pulsar, a topic is automatically deleted when it meets the following three conditions:

    • no producers or consumers are connected to it;
    • there are no subscriptions to it;