Searchable snapshots

    A searchable snapshot is an index whose data is read from a snapshot repository on demand at search time, rather than being downloaded in full to cluster storage at restore time. Because the index data remains in the snapshot format in the repository, searchable snapshot indexes are inherently read-only. Any attempt to write to a searchable snapshot index results in an error.

    To enable the searchable snapshots feature, follow these steps.

    There are several methods for enabling searchable snapshots, depending on the installation type.

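    The feature flag is toggled using a JVM parameter that is set either in the OPENSEARCH_JAVA_OPTS environment variable or in config/jvm.options:

    • Option 1: Update config/jvm.options by adding the following line:

      -Dopensearch.experimental.feature.searchable_snapshot.enabled=true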

    • Option 2: Use the OPENSEARCH_JAVA_OPTS environment variable:

      export OPENSEARCH_JAVA_OPTS="-Dopensearch.experimental.feature.searchable_snapshot.enabled=true"
    • Option 3: For developers using Gradle, update run.gradle by adding the following lines:

    • Finally, configure a node in your opensearch.yml file and set its node role to search:

      node.roles: [ search ]
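
    For the config/jvm.options route mentioned above, the following is a minimal sketch that appends the flag only if it is not already present. The function name enable_searchable_snapshot_flag and the default path are illustrative assumptions, not part of OpenSearch; adjust the path to your installation.

```shell
#!/bin/sh
# Hypothetical helper: append the experimental feature flag to a jvm.options
# file unless the exact line is already there.
FLAG="-Dopensearch.experimental.feature.searchable_snapshot.enabled=true"

enable_searchable_snapshot_flag() {
  jvm_options_file="${1:-config/jvm.options}"   # assumed default location
  # -x: match the whole line, -F: treat the flag as a fixed string,
  # --: stop option parsing, because the flag itself starts with a dash
  if ! grep -qxF -- "$FLAG" "$jvm_options_file" 2>/dev/null; then
    printf '%s\n' "$FLAG" >> "$jvm_options_file"
  fi
}

# Usage: enable_searchable_snapshot_flag /path/to/opensearch/config/jvm.options
```

    JVM options are read at startup, so the node needs to be restarted before the change takes effect.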

    Enable with Docker containers

    If you’re running Docker, add the following line to docker-compose.yml underneath the opensearch-node and environment sections:

    version: '3'
    services:
      opensearch-node1:
        container_name: opensearch-node1
        environment:
          - cluster.name=opensearch-cluster
          - node.name=opensearch-node1
          # Entries in an environment list use KEY=VALUE form
          - node.roles=search

    A searchable snapshot index is created by specifying the remote_snapshot storage type using the restore snapshot API.
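
    As a sketch, the remote_snapshot storage type can be passed in the body of a restore snapshot request. The snippet below builds that body and sends it with curl. The repository name my-repository, the snapshot name my-snapshot, and the http://localhost:9200 endpoint are placeholders, and the helper function names are illustrative, not part of OpenSearch.

```shell
#!/bin/sh
# Assumed endpoint; override with OPENSEARCH_URL for your cluster.
OPENSEARCH_URL="${OPENSEARCH_URL:-http://localhost:9200}"

# Print the restore request body for one index (hypothetical helper).
restore_body() {
  printf '{"storage_type": "remote_snapshot", "indices": "%s"}' "$1"
}

# Restore an index from a snapshot as a searchable snapshot index.
# Arguments: repository, snapshot, index.
restore_searchable_snapshot() {
  curl -sS -X POST "$OPENSEARCH_URL/_snapshot/$1/$2/_restore" \
    -H 'Content-Type: application/json' \
    -d "$(restore_body "$3")"
}

# Usage: restore_searchable_snapshot my-repository my-snapshot my-index
```

    Keeping the body in its own function makes it easy to inspect the JSON before sending anything to a cluster.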

    To determine whether an index is a searchable snapshot index, look for a store type with the value of remote_snapshot in the index settings:

    {
      "my-index": {
        "settings": {
          "index": {
            "store": {
              "type": "remote_snapshot"
            }
          }
        }
      }
    }
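
    The check above can be scripted. The following sketch fetches the settings for an index and looks for the remote_snapshot store type. The endpoint http://localhost:9200 is a placeholder, the function names are illustrative, and the match is a plain text search, not a full JSON parse.

```shell
#!/bin/sh
# Assumed endpoint; override with OPENSEARCH_URL for your cluster.
OPENSEARCH_URL="${OPENSEARCH_URL:-http://localhost:9200}"

# Succeeds if the settings JSON on stdin reports the remote_snapshot store type.
# Hypothetical helper: a text match that tolerates whitespace around the colon.
has_remote_snapshot_store() {
  grep -Eq '"type"[[:space:]]*:[[:space:]]*"remote_snapshot"'
}

# Fetch the settings for an index and report what kind of index it is.
is_searchable_snapshot() {
  if curl -sS "$OPENSEARCH_URL/$1/_settings" | has_remote_snapshot_store; then
    echo "$1 is a searchable snapshot index"
  else
    echo "$1 is not a searchable snapshot index"
  fi
}

# Usage: is_searchable_snapshot my-index
```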

    The following are potential use cases for the searchable snapshots feature:

    • The ability to offload indexes from cluster-based storage but retain the ability to search them.
    • The ability to keep a large number of searchable indexes on lower-cost storage media.

    The following are known limitations of the searchable snapshots feature:

    • Accessing data from a remote repository is slower than local disk reads, so higher latencies on search queries are expected.
    • Data is discarded immediately after being read, so subsequent searches for the same data have to download it again. This will be addressed in the future by implementing a disk-based cache for storing frequently accessed data.
    • Searching remote data can impact the performance of other queries running on the same node. We recommend that users provision dedicated nodes with the search role for performance-critical applications.