Geo-Replication Overview

    Kafka administrators can define data flows that cross the boundaries of individual Kafka clusters, data centers, or geographical regions. Common scenarios for such inter-cluster data flows include:

    • Geo-replication
    • Disaster recovery
    • Feeding edge clusters into a central, aggregate cluster
    • Physical isolation of clusters (such as production vs. testing)
    • Cloud migration or hybrid cloud deployments
    • Legal and compliance requirements

    Administrators can set up such inter-cluster data flows with Kafka’s MirrorMaker (version 2), a tool to replicate data between different Kafka environments in a streaming manner. MirrorMaker is built on top of the Kafka Connect framework and supports features such as:

    • Replicates topics (data plus configurations)
    • Replicates consumer groups including offsets to migrate applications between clusters
    • Replicates ACLs
    • Preserves partitioning
    • Automatically detects new topics and partitions
    • Provides a wide range of metrics, such as end-to-end replication latency across multiple data centers/clusters
    • Fault-tolerant and horizontally scalable operations

    Note: Geo-replication with MirrorMaker replicates data across Kafka clusters. This inter-cluster replication is different from Kafka’s intra-cluster replication, which replicates data within the same Kafka cluster.

    With MirrorMaker, Kafka administrators can replicate topics, topic configurations, consumer groups and their offsets, and ACLs from one or more source Kafka clusters to one or more target Kafka clusters, i.e., across cluster environments. In a nutshell, MirrorMaker uses Connectors to consume from source clusters and produce to target clusters.

    These directional flows from source to target clusters are called replication flows. They are defined with the format {source_cluster}->{target_cluster} in the MirrorMaker configuration file as described later. Administrators can create complex replication topologies based on these flows.

    Here are some example patterns:

    • Active/Active high availability deployments: A->B, B->A
    • Active/Passive or Active/Standby high availability deployments: A->B
    • Aggregation (e.g., from many clusters to one): A->K, B->K, C->K
    • Fan-out (e.g., from one to many clusters): K->A, K->B, K->C
    • Forwarding: A->B, B->C, C->D

    By default, a flow replicates all topics and consumer groups. However, each replication flow can be configured independently. For instance, you can define that only specific topics or consumer groups are replicated from the source cluster to the target cluster.

    Here is a first example of how to configure data replication from a primary cluster to a secondary cluster (an active/passive setup):
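    A minimal sketch of such an active/passive configuration (the cluster aliases, broker hostnames, and topic patterns below are illustrative placeholders):

    # Basic settings
    clusters = primary, secondary
    primary.bootstrap.servers = broker3-primary:9092
    secondary.bootstrap.servers = broker5-secondary:9092

    # Define replication flows
    primary->secondary.enabled = true
    primary->secondary.topics = foobar-topic, quux-.*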

    The following sections describe how to configure and run a dedicated MirrorMaker cluster. If you want to run MirrorMaker within an existing Kafka Connect cluster or other supported deployment setups, please refer to KIP-382: MirrorMaker 2.0 and be aware that the names of configuration settings may vary between deployment modes.

    Beyond what’s covered in the following sections, further examples and information on configuration settings are available in MirrorMakerConfig, MirrorConnectorConfig, DefaultTopicFilter (for topics), DefaultGroupFilter (for consumer groups), and in KIP-382: MirrorMaker 2.0.

    Configuration File Syntax

    The MirrorMaker configuration file is typically named connect-mirror-maker.properties. You can configure a variety of components in this file:

    • MirrorMaker settings: global settings including cluster definitions (aliases), plus custom settings per replication flow
    • Kafka Connect and connector settings
    • Kafka producer, consumer, and admin client settings

    Example: Define MirrorMaker settings (explained in more detail later).

    # Global settings
    clusters = us-west, us-east   # defines cluster aliases
    us-west.bootstrap.servers = broker3-west:9092
    us-east.bootstrap.servers = broker5-east:9092
    topics = .*   # all topics to be replicated by default

    # Specific replication flow settings (here: flow from us-west to us-east)
    us-west->us-east.enabled = true
    us-west->us-east.topics = foo.*, bar.*   # override the default above

    MirrorMaker is based on the Kafka Connect framework. Any Kafka Connect, source connector, and sink connector settings as described in the documentation chapter on Kafka Connect can be used directly in the MirrorMaker configuration, without having to change or prefix the name of the configuration setting.

    Example: Define custom Kafka Connect settings to be used by MirrorMaker.

    # Setting Kafka Connect defaults for MirrorMaker
    tasks.max = 5

    Most of the default Kafka Connect settings work well for MirrorMaker out-of-the-box, with the exception of tasks.max. In order to evenly distribute the workload across more than one MirrorMaker process, it is recommended to set tasks.max to at least 2 (preferably higher) depending on the available hardware resources and the total number of topic-partitions to be replicated.

    You can further customize MirrorMaker’s Kafka Connect settings per source or target cluster (more precisely, you can specify Kafka Connect worker-level configuration settings “per connector”). Use the format of {cluster}.{config_name} in the MirrorMaker configuration file.

    Example: Define custom connector settings for the us-west cluster.

    # us-west custom settings
    us-west.offset.storage.topic = my-mirrormaker-offsets

    MirrorMaker internally uses the Kafka producer, consumer, and admin clients. Custom settings for these clients are often needed. To override the defaults, use the following format in the MirrorMaker configuration file:

    • {source}.consumer.{consumer_config_name}
    • {target}.producer.{producer_config_name}
    • {source_or_target}.admin.{admin_config_name}

    Example: Define custom producer, consumer, admin client settings.

    # us-west cluster (from which to consume)
    us-west.consumer.isolation.level = read_committed
    us-west.admin.bootstrap.servers = broker57-primary:9092

    # us-east cluster (to which to produce)
    us-east.producer.compression.type = gzip
    us-east.producer.buffer.memory = 32768
    us-east.admin.bootstrap.servers = broker8-secondary:9092
    First, you must define the respective source and target Kafka clusters in the MirrorMaker configuration file:

    • clusters (required): comma-separated list of Kafka cluster “aliases”
    • {clusterAlias}.bootstrap.servers (required): connection information for the specific cluster; comma-separated list of “bootstrap” Kafka brokers

    Example: Define two cluster aliases primary and secondary, including their connection information.

    clusters = primary, secondary
    primary.bootstrap.servers = broker10-primary:9092,broker11-primary:9092
    secondary.bootstrap.servers = broker5-secondary:9092,broker6-secondary:9092

    Secondly, you must explicitly enable individual replication flows with {source}->{target}.enabled = true as needed. Remember that flows are directional: if you need two-way (bidirectional) replication, you must enable flows in both directions.

    # Enable replication from primary to secondary
    primary->secondary.enabled = true
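    For a two-way (bidirectional) setup, you would additionally enable the reverse flow, for example:

    # Enable replication from secondary back to primary
    secondary->primary.enabled = true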

    By default, a replication flow will replicate all but a few special topics and consumer groups from the source cluster to the target cluster, and automatically detect any newly created topics and groups. The names of replicated topics in the target cluster will be prefixed with the name of the source cluster (see section further below). For example, the topic foo in the source cluster us-west would be replicated to a topic named us-west.foo in the target cluster us-east.

    The subsequent sections explain how to customize this basic setup according to your needs.

    The configuration of a replication flow is a combination of top-level default settings (e.g., topics), on top of which flow-specific settings, if any, are applied (e.g., us-west->us-east.topics). To change the top-level defaults, add the respective top-level setting to the MirrorMaker configuration file. To override the defaults for a specific replication flow only, use the syntax format {source}->{target}.{config.name}.

    The most important settings are:

    • topics: list of topics or a regular expression that defines which topics in the source cluster to replicate (default: topics = .*)
    • topics.exclude: list of topics or a regular expression to subsequently exclude topics that were matched by the topics setting (default: topics.exclude = .*[\-\.]internal, .*\.replica, __.*)
    • groups: list of consumer groups or a regular expression that defines which consumer groups in the source cluster to replicate (default: groups = .*)
    • groups.exclude: list of consumer groups or a regular expression to subsequently exclude consumer groups that were matched by the groups setting (default: groups.exclude = console-consumer-.*, connect-.*, __.*)
    • {source}->{target}.enabled: set to true to enable the replication flow (default: false)

    Example:
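    A sketch of combining top-level defaults with flow-specific overrides (the cluster aliases and the topic/group patterns below are illustrative):

    # Custom top-level defaults that apply to all replication flows
    topics = .*
    groups = consumer-group1, consumer-group2

    # Don't forget to enable a flow!
    us-west->us-east.enabled = true

    # Custom settings for a specific replication flow
    us-west->us-east.topics = foo.*
    us-west->us-east.groups = bar.*
    us-west->us-east.emit.heartbeats.enabled = false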

    Additional configuration settings are supported, some of which are listed below. In most cases, you can leave these settings at their default values. See MirrorMakerConfig and MirrorConnectorConfig for further details.

    • refresh.topics.enabled: whether to check for new topics in the source cluster periodically (default: true)
    • refresh.topics.interval.seconds: frequency of checking for new topics in the source cluster; lower values than the default may lead to performance degradation (default: 600, every ten minutes)
    • refresh.groups.enabled: whether to check for new consumer groups in the source cluster periodically (default: true)
    • refresh.groups.interval.seconds: frequency of checking for new consumer groups in the source cluster; lower values than the default may lead to performance degradation (default: 600, every ten minutes)
    • sync.topic.configs.enabled: whether to replicate topic configurations from the source cluster (default: true)
    • sync.topic.acls.enabled: whether to sync ACLs from the source cluster (default: true)
    • emit.heartbeats.enabled: whether to emit heartbeats periodically (default: true)
    • emit.heartbeats.interval.seconds: frequency at which heartbeats are emitted (default: 1, every second)
    • heartbeats.topic.replication.factor: replication factor of MirrorMaker’s internal heartbeat topics (default: 3)
    • emit.checkpoints.enabled: whether to emit MirrorMaker’s consumer offsets periodically (default: true)
    • emit.checkpoints.interval.seconds: frequency at which checkpoints are emitted (default: 60, every minute)
    • checkpoints.topic.replication.factor: replication factor of MirrorMaker’s internal checkpoints topics (default: 3)
    • sync.group.offsets.enabled: whether to periodically write the translated offsets of replicated consumer groups (in the source cluster) to the __consumer_offsets topic in the target cluster, as long as no active consumers in that group are connected to the target cluster (default: false; see the sketch after this list)
    • sync.group.offsets.interval.seconds: frequency at which consumer group offsets are synced (default: 60, every minute)
    • offset-syncs.topic.replication.factor: replication factor of MirrorMaker’s internal offset-sync topics (default: 3)
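    For example, to have translated consumer group offsets written to the target cluster for a specific flow (useful for active/passive failover), you might set the following (a sketch using the us-west/us-east aliases from earlier examples):

    # Periodically sync translated consumer group offsets to the target cluster
    us-west->us-east.sync.group.offsets.enabled = true
    us-west->us-east.sync.group.offsets.interval.seconds = 60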

    MirrorMaker supports the same security settings as Kafka Connect, so please refer to the security section of the Kafka Connect documentation for further information.

    Example: Encrypt communication between MirrorMaker and the us-east cluster.

    us-east.security.protocol=SSL
    us-east.ssl.truststore.location=/path/to/truststore.jks
    us-east.ssl.truststore.password=my-secret-password
    us-east.ssl.keystore.location=/path/to/keystore.jks
    us-east.ssl.keystore.password=my-secret-password
    us-east.ssl.key.password=my-secret-password

    Custom Naming of Replicated Topics in Target Clusters

    Replicated topics in a target cluster—sometimes called remote topics—are renamed according to a replication policy. MirrorMaker uses this policy to ensure that events (aka records, messages) from different clusters are not written to the same topic-partition. By default, as per DefaultReplicationPolicy, the names of replicated topics in the target clusters have the format {source}.{source_topic_name}:

    us-west          us-east
    =========        =================
                     bar-topic
    foo-topic  -->   us-west.foo-topic

    You can customize the separator (default: .) with the replication.policy.separator setting:

    # Defining a custom separator
    us-west->us-east.replication.policy.separator = _

    If you need further control over how replicated topics are named, you can implement a custom ReplicationPolicy and override replication.policy.class (default is DefaultReplicationPolicy) in the MirrorMaker configuration.
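    For instance, assuming a custom policy class is available on the MirrorMaker classpath (the class name below is a hypothetical placeholder):

    # Use a custom replication policy instead of DefaultReplicationPolicy
    # (com.example.MyReplicationPolicy is a hypothetical class implementing ReplicationPolicy)
    replication.policy.class = com.example.MyReplicationPolicy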

    MirrorMaker processes share configuration via their target Kafka clusters. This behavior may cause conflicts when configurations differ among MirrorMaker processes that operate against the same target cluster.

    For example, the following two MirrorMaker processes would be racy:

    # Configuration of process 1
    A->B.enabled = true
    A->B.topics = foo

    # Configuration of process 2
    A->B.enabled = true
    A->B.topics = bar

    In this case, the two processes will share configuration via cluster B, which causes a conflict. Depending on which of the two processes is the elected “leader”, the result will be that either the topic foo or the topic bar is replicated, but not both.

    It is therefore important to keep the MirrorMaker configuration consistent across replication flows to the same target cluster. This can be achieved, for example, through automation tooling or by using a single, shared MirrorMaker configuration file for your entire organization, as sketched below.
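    For the example above, a single shared configuration that covers both topics avoids the race (a sketch):

    # One shared configuration used by all MirrorMaker processes
    A->B.enabled = true
    A->B.topics = foo, bar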

    To minimize latency (“producer lag”), it is recommended to locate MirrorMaker processes as close as possible to their target clusters, i.e., the clusters that they produce data to. That’s because Kafka producers typically struggle more with unreliable or high-latency network connections than Kafka consumers.

    First DC          Second DC
    ==========        =========================
    primary --------- MirrorMaker --> secondary

    To run such a “consume from remote, produce to local” setup, run the MirrorMaker processes close to, and preferably in the same location as, the target clusters, and specify these “nearby” clusters via the --clusters command line parameter (a blank-separated list of cluster aliases):

    # Run in secondary's data center, reading from the remote `primary` cluster
    $ ./bin/connect-mirror-maker.sh connect-mirror-maker.properties --clusters secondary

    The --clusters secondary tells the MirrorMaker process that the given cluster(s) are nearby, and prevents it from replicating data or sending configuration to clusters at other, remote locations.

    The following example shows the basic settings to replicate topics from a primary to a secondary Kafka environment, but not from the secondary back to the primary. Please be aware that most production setups will need further configuration, such as security settings.
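    A minimal sketch of such a one-way (active/passive) configuration, assuming the cluster aliases primary and secondary and illustrative broker hostnames:

    # Unidirectional flow (one-way) from primary to secondary cluster
    clusters = primary, secondary
    primary.bootstrap.servers = broker1-primary:9092
    secondary.bootstrap.servers = broker2-secondary:9092

    primary->secondary.enabled = true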

    The following example shows the basic settings to replicate topics between two clusters in both ways. Please be aware that most production setups will need further configuration, such as security settings.

    # Bidirectional flow (two-way) between us-west and us-east clusters
    clusters = us-west, us-east
    us-west.bootstrap.servers = broker1-west:9092,broker2-west:9092
    us-east.bootstrap.servers = broker3-east:9092,broker4-east:9092

    us-west->us-east.enabled = true
    us-east->us-west.enabled = true

    Note on preventing replication “loops” (where a topic is first replicated from A to B, and the replicated topic is then replicated yet again from B to A, and so forth): As long as you define the above flows in the same MirrorMaker configuration file, you do not need to explicitly add topics.exclude settings to prevent replication loops between the two clusters.

    Let’s put all the information from the previous sections together in a larger example. Imagine there are three data centers (west, east, north), with two Kafka clusters in each data center (e.g., west-1, west-2). The example in this section shows how to configure MirrorMaker (1) for Active/Active replication within each data center, as well as (2) for Cross Data Center Replication (XDCR).

    First, define the source and target clusters along with their replication flows in the configuration:

    # Basic settings
    clusters = west-1, west-2, east-1, east-2, north-1, north-2
    west-1.bootstrap.servers = ...
    west-2.bootstrap.servers = ...
    east-1.bootstrap.servers = ...
    east-2.bootstrap.servers = ...
    north-1.bootstrap.servers = ...
    north-2.bootstrap.servers = ...

    # Replication flows for Active/Active in West DC
    west-1->west-2.enabled = true
    west-2->west-1.enabled = true

    # Replication flows for Active/Active in East DC
    east-1->east-2.enabled = true
    east-2->east-1.enabled = true

    # Replication flows for Active/Active in North DC
    north-1->north-2.enabled = true
    north-2->north-1.enabled = true

    # Replication flows for XDCR via west-1, east-1, north-1
    west-1->east-1.enabled = true
    west-1->north-1.enabled = true
    east-1->west-1.enabled = true
    east-1->north-1.enabled = true
    north-1->west-1.enabled = true
    north-1->east-1.enabled = true

    Then, in each data center, launch one or more MirrorMaker processes as follows:

    # In West DC:
    $ ./bin/connect-mirror-maker.sh connect-mirror-maker.properties --clusters west-1 west-2

    # In East DC:
    $ ./bin/connect-mirror-maker.sh connect-mirror-maker.properties --clusters east-1 east-2

    # In North DC:
    $ ./bin/connect-mirror-maker.sh connect-mirror-maker.properties --clusters north-1 north-2

    With this configuration, records produced to any cluster will be replicated within the data center, as well as across to other data centers. By providing the --clusters parameter, we ensure that each MirrorMaker process produces data to nearby clusters only.

    Note: The --clusters parameter is, technically, not required here. MirrorMaker will work fine without it. However, throughput may suffer from “producer lag” between data centers, and you may incur unnecessary data transfer costs.

    You can run as few or as many MirrorMaker processes (think: nodes, servers) as needed. Because MirrorMaker is based on Kafka Connect, MirrorMaker processes that are configured to replicate the same Kafka clusters run in a distributed setup: They will find each other, share configuration (see the section above on preventing configuration conflicts), load balance their work, and so on. If, for example, you want to increase the throughput of replication flows, one option is to run additional MirrorMaker processes in parallel.

    To start a MirrorMaker process, run the command:

    $ ./bin/connect-mirror-maker.sh connect-mirror-maker.properties

    After startup, it may take a few minutes until a MirrorMaker process first begins to replicate data.

    Optionally, as described previously, you can set the parameter --clusters to ensure that the MirrorMaker process produces data to nearby clusters only.

    # Note: The cluster alias us-west must be defined in the configuration file
    $ ./bin/connect-mirror-maker.sh connect-mirror-maker.properties \
        --clusters us-west

    Note when testing replication of consumer groups: By default, MirrorMaker does not replicate consumer groups created by the kafka-console-consumer.sh tool, which you might use to test your MirrorMaker setup on the command line. If you do want to replicate these consumer groups as well, set the groups.exclude configuration accordingly (default: groups.exclude = console-consumer-.*, connect-.*, __.*), as sketched below. Remember to update the configuration again once you have completed your testing.
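    For instance, to let console consumer groups be replicated during a test, you might narrow the exclusion list (a sketch; this removes only the console-consumer pattern from the default, and should be reverted after testing):

    # Testing only: allow console-consumer-* groups to be replicated
    groups.exclude = connect-.*, __.*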

    You can stop a running MirrorMaker process by sending a SIGTERM signal with the command:

    $ kill <MirrorMaker pid>

    To make configuration changes take effect, the MirrorMaker process(es) must be restarted.
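    A minimal restart sequence, combining the commands shown above:

    # Stop the running process, then start it again with the updated configuration
    $ kill <MirrorMaker pid>
    $ ./bin/connect-mirror-maker.sh connect-mirror-maker.properties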

    It is recommended to monitor MirrorMaker processes to ensure all defined replication flows are up and running correctly. MirrorMaker is built on the Connect framework and inherits all of Connect’s metrics, such as source-record-poll-rate. In addition, MirrorMaker produces its own metrics under the kafka.connect.mirror metric group. Metrics are tagged with the following properties:

    • source: alias of source cluster (e.g., primary)
    • target: alias of target cluster (e.g., secondary)
    • topic: replicated topic on target cluster
    • partition: partition being replicated

    Metrics are tracked for each replicated topic. The source cluster can be inferred from the topic name. For example, replicating topic1 from primary->secondary will yield metrics like:

    • target=secondary
    • topic=primary.topic1
    • partition=1
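    The MirrorMaker-specific metrics include the following (as described in KIP-382; exact metric names may vary slightly between Kafka versions):

    # MBean: kafka.connect.mirror:type=MirrorSourceConnector,target=...,topic=...,partition=...
    record-count              # number of records replicated from source to target
    record-age-ms             # age of records when they are replicated
    replication-latency-ms    # time it takes records to propagate from source to target
    byte-rate                 # average number of bytes per second in replicated records

    # MBean: kafka.connect.mirror:type=MirrorCheckpointConnector,source=...,target=...
    checkpoint-latency-ms     # time it takes to replicate consumer offsets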

    These metrics do not differentiate between created-at and log-append timestamps.