Use Aliyun OSS offloader with Pulsar

Follow the steps below to install the Aliyun OSS offloader.

  • Prerequisite: Pulsar 2.8.0 or later.

This example uses Pulsar 2.8.0.

  1. Download the Pulsar tarball from the Apache Pulsar download page.

  2. Download and untar the Pulsar offloaders package, then copy the Pulsar offloaders into the Pulsar directory as `offloaders`.

    Output

    As shown from the output, Pulsar uses Apache jclouds to support several cloud storage services, including GCS and Aliyun OSS, for long-term storage.

note

  • If you are running Pulsar in a bare-metal cluster, make sure that the `offloaders` tarball is unzipped in every broker's Pulsar directory.
  • If you are running Pulsar in Docker or deploying Pulsar using a Docker image (such as K8s and DCOS), you can use the `apachepulsar/pulsar-all` image, which already bundles the tiered storage offloaders.
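
The installation steps above can be sketched as the following shell session. This is a minimal sketch, not the official procedure: it assumes both tarballs for Pulsar 2.8.0 have already been downloaded into the current directory, and the file and directory names follow the usual Apache Pulsar release naming, which is an assumption here.

```shell
# Assumption: both tarballs for Pulsar 2.8.0 are already in the current directory.
tar xvfz apache-pulsar-2.8.0-bin.tar.gz
tar xvfz apache-pulsar-offloaders-2.8.0-bin.tar.gz

# Place the offloaders under the Pulsar directory as "offloaders".
mv apache-pulsar-offloaders-2.8.0/offloaders apache-pulsar-2.8.0/offloaders

# List the bundled offloader archives to confirm the copy worked.
ls apache-pulsar-2.8.0/offloaders
```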

Configuration

note

Before offloading data from BookKeeper to Aliyun OSS, you need to configure some properties of the Aliyun OSS offload driver.

You can also configure the Aliyun OSS offloader to run automatically or trigger it manually.

You can configure the Aliyun OSS offloader driver in the configuration file broker.conf or standalone.conf.

  • Required configurations are as below.
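
Besides the bucket and endpoint described in the following sections, the broker also needs to know which offload driver to use and where the offloaders are installed. A minimal sketch of these two settings, assuming the offloaders were copied into a directory named `offloaders` under the Pulsar directory:

```properties
# Sketch for conf/broker.conf or conf/standalone.conf.
# Select the Aliyun OSS offload driver.
managedLedgerOffloadDriver=aliyun-oss
# Directory that holds the offloader archives (assumed to be "offloaders").
offloadersDirectory=offloaders
```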

Bucket (required)

A bucket is a basic container that holds your data. Everything you store in Aliyun OSS must be contained in a bucket. You can use buckets to organize your data and control access to it, but unlike directories and folders, buckets cannot be nested.

Example

This example names the bucket as pulsar-topic-offload.

    managedLedgerOffloadBucket=pulsar-topic-offload

Endpoint (required)

The endpoint is the region where a bucket is located.

tip

For more information about Aliyun OSS regions and endpoints, see the Aliyun OSS documentation on the international or Chinese website.

Example

This example sets the endpoint as oss-us-west-1-internal.
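
The setting for this example is not shown here. As a sketch, the endpoint is normally given as a full URL through the `managedLedgerOffloadServiceEndpoint` property. The `http://` scheme and the `.aliyuncs.com` host suffix below are assumptions based on Aliyun's usual endpoint naming, so check them against your region's endpoint.

```properties
# Sketch for conf/broker.conf or conf/standalone.conf.
# Assumption: the full endpoint URL for the oss-us-west-1-internal region.
managedLedgerOffloadServiceEndpoint=http://oss-us-west-1-internal.aliyuncs.com
```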

Authentication (required)

To access Aliyun OSS, you must authenticate with it.

Set the environment variables ALIYUN_OSS_ACCESS_KEY_ID and ALIYUN_OSS_ACCESS_KEY_SECRET in conf/pulsar_env.sh.

“export” is important so that the variables are made available in the environment of spawned processes.

    export ALIYUN_OSS_ACCESS_KEY_ID=ABC123456789
    export ALIYUN_OSS_ACCESS_KEY_SECRET=ded7db27a4558e2ea8bbf0bf37ae0e8521618f366c
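
To see why `export` matters, the following sketch compares what a child process sees before and after exporting a variable. `DEMO_KEY` is a hypothetical variable used only for illustration; it is not a Pulsar or Aliyun OSS setting.

```shell
# Illustrative only: DEMO_KEY is a hypothetical variable, not a Pulsar setting.
unset DEMO_KEY

# Plain assignment: visible in this shell, but not in child processes.
DEMO_KEY=abc
without_export=$(sh -c 'echo "${DEMO_KEY:-unset}"')

# After export, child processes inherit the variable.
export DEMO_KEY
with_export=$(sh -c 'echo "${DEMO_KEY:-unset}"')

echo "without export: $without_export"
echo "with export:    $with_export"
```

The first line reports `unset` and the second reports `abc`. A broker process started from a shell behaves like the child process here, so credentials that are assigned without `export` would not reach it.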

Size of block read/write

You can configure the size of a request sent to or read from Aliyun OSS in the configuration file broker.conf or standalone.conf.
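
As a sketch, the two block-size settings might look like the following. The property names are the ones Pulsar's tiered storage configuration generally uses for read buffer size and maximum block size; the values are illustrative, not recommendations.

```properties
# Sketch for conf/broker.conf or conf/standalone.conf.
# Size of each read from Aliyun OSS, in bytes (1 MB here).
managedLedgerOffloadReadBufferSizeInBytes=1048576
# Maximum size of a block sent to Aliyun OSS in a multi-part upload, in bytes (64 MB here).
managedLedgerOffloadMaxBlockSizeInBytes=67108864
```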

Run Aliyun OSS offloader automatically

A namespace policy can be configured to offload data automatically once a threshold is reached. The threshold is based on the size of the data that a topic has stored on a Pulsar cluster. Once the topic reaches the threshold, an offload operation is triggered automatically.

You can configure the threshold size using CLI tools, such as pulsar-admin.

The offload configurations in broker.conf and standalone.conf apply to namespaces that do not have namespace-level offload policies. Each namespace can have its own offload policy. To set an offload policy for a specific namespace, use the pulsar-admin namespaces set-offload-policies options command.

Example

This example sets the Aliyun OSS offloader threshold size to 10 MB using pulsar-admin.
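
The command for this example is not shown here. A minimal sketch, assuming a namespace named `my-tenant/my-namespace` (a placeholder that matches the topic names used elsewhere on this page):

```shell
# Set the automatic offload threshold for the namespace to 10 MB.
# Assumption: my-tenant/my-namespace is a placeholder namespace.
bin/pulsar-admin namespaces set-offload-threshold --size 10M my-tenant/my-namespace
```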

tip

For more information about the pulsar-admin namespaces set-offload-threshold options command, including flags, descriptions, and default values, see the pulsar-admin reference documentation.

Trigger Aliyun OSS offloader manually

For individual topics, you can trigger the Aliyun OSS offloader manually using one of the following methods:

  • Use the REST endpoint.

  • Use CLI tools (such as pulsar-admin).

Example

  • This example triggers the Aliyun OSS offloader to run manually using pulsar-admin.
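
A sketch of such a command follows. The 10M size threshold (the maximum amount of data to keep in BookKeeper for the topic) is an illustrative value, and the topic name is a placeholder.

```shell
# Trigger an offload for one topic, keeping at most 10 MB of its data in BookKeeper.
# Assumption: the threshold value and topic name are placeholders.
bin/pulsar-admin topics offload --size-threshold 10M persistent://my-tenant/my-namespace/topic1
```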

Output

```
Offload triggered for persistent://my-tenant/my-namespace/topic1 for messages before 2:0:-1
```

tip

For more information about the `pulsar-admin topics offload options` command, including flags, descriptions, and default values, see [here](https://pulsar.apache.org/tools/pulsar-admin/2.6.0-SNAPSHOT/#-em-offload-em-).

  • This example checks the Aliyun OSS offloader status using pulsar-admin.

    bin/pulsar-admin topics offload-status persistent://my-tenant/my-namespace/topic1