k-NN Index

    Method definitions are used when the underlying Approximate k-NN algorithm does not require training. For example, the following knn_vector field specifies that nmslib’s implementation of hnsw should be used for Approximate k-NN search. During indexing, nmslib will build the corresponding hnsw segment files.
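
    A sketch of such a mapping (the field name, dimension, and parameter values here are illustrative, not prescriptive):

        "my_vector": {
          "type": "knn_vector",
          "dimension": 4,
          "method": {
            "name": "hnsw",
            "space_type": "l2",
            "engine": "nmslib",
            "parameters": {
              "ef_construction": 128,
              "m": 24
            }
          }
        }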

    Model IDs are used when the underlying Approximate k-NN algorithm requires a training step. As a prerequisite, the model has to be created with the Train API. The model contains the information needed to initialize the native library segment files.

        {
          "type": "knn_vector",
          "model_id": "my-model"
        }

    However, if you intend to use only Painless scripting or a k-NN score script, you only need to pass the dimension.
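
    A minimal mapping in that case might look like the following (the field name and dimension are illustrative):

        "my_vector": {
          "type": "knn_vector",
          "dimension": 4
        }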

    A method definition refers to the underlying configuration of the Approximate k-NN algorithm you want to use. Method definitions are used either to create a knn_vector field directly (when the method does not require training) or to train a model that can then be used to create a knn_vector field.

    A method definition will always contain the name of the method, the space_type the method is built for, the engine (the native library) to use, and a map of parameters.

    | Method Name | Requires Training? | Supported Spaces | Description |
    | --- | --- | --- | --- |
    | hnsw | false | “l2”, “innerproduct”, “cosinesimil”, “l1”, “linf” | Hierarchical proximity graph approach to Approximate k-NN search. |

    HNSW Parameters (nmslib)

    | Parameter Name | Required | Default | Updatable | Description |
    | --- | --- | --- | --- | --- |
    | ef_construction | false | 512 | false | The size of the dynamic list used during k-NN graph creation. Higher values lead to a more accurate graph but slower indexing speed. |
    | m | false | 16 | false | The number of bidirectional links that the plugin creates for each new element. Increasing and decreasing this value can have a large impact on memory consumption. Keep this value between 2 and 100. |

    Note: For nmslib, ef_search is set in the index settings rather than in the method definition (see index.knn.algo_param.ef_search in the settings table below).
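
    Since the settings table below marks ef_search as updatable, it can be changed on a live index. A sketch of such an update (the index name and value are illustrative):

        PUT /my-knn-index/_settings
        {
          "index": {
            "knn.algo_param.ef_search": 100
          }
        }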

    Note: For hnsw, “innerproduct” is not available when PQ is used.

    HNSW Parameters (faiss)

    | Parameter Name | Required | Default | Updatable | Description |
    | --- | --- | --- | --- | --- |
    | ef_search | false | 512 | false | The size of the dynamic list used during k-NN searches. Higher values lead to more accurate but slower searches. |
    | ef_construction | false | 512 | false | The size of the dynamic list used during k-NN graph creation. Higher values lead to a more accurate graph but slower indexing speed. |
    | m | false | 16 | false | The number of bidirectional links that the plugin creates for each new element. Increasing and decreasing this value can have a large impact on memory consumption. Keep this value between 2 and 100. |
    | encoder | false | flat | false | Encoder definition for encoding vectors. Encoders can reduce the memory footprint of your index, at the expense of search accuracy. |

    IVF Parameters

    | Parameter Name | Required | Default | Updatable | Description |
    | --- | --- | --- | --- | --- |
    | nlist | false | 4 | false | Number of buckets to partition vectors into. Higher values may lead to more accurate searches, at the expense of memory and training latency. For more information about choosing the right value, refer to faiss’s documentation. |
    | nprobes | false | 1 | false | Number of buckets to search over during query. Higher values lead to more accurate but slower searches. |
    | encoder | false | flat | false | Encoder definition for encoding vectors. Encoders can reduce the memory footprint of your index, at the expense of search accuracy. |
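
    Putting these together, an IVF method definition might be sketched as follows (the space type and parameter values are illustrative):

        "method": {
          "name": "ivf",
          "space_type": "l2",
          "engine": "faiss",
          "parameters": {
            "nlist": 4,
            "nprobes": 2
          }
        }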

    IVF training requirements

    The IVF algorithm requires a training step. To create an index that uses IVF, you need to train a model with the Train API, passing the IVF method definition. IVF requires that, at a minimum, there are nlist training data points, but it is recommended that you use more. Training data can be either the same data that is going to be ingested or a separate set of data.
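
    As a sketch, a training request might look like the following (the model ID, training index and field, dimension, and parameter values are all illustrative):

        POST /_plugins/_knn/models/my-model/_train
        {
          "training_index": "train-index",
          "training_field": "train-field",
          "dimension": 4,
          "method": {
            "name": "ivf",
            "engine": "faiss",
            "space_type": "l2",
            "parameters": {
              "nlist": 4,
              "nprobes": 2
            }
          }
        }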

    Encoders can be used to reduce the memory footprint of a k-NN index at the expense of search accuracy. faiss has several encoder types, but the plugin currently supports only flat and pq encoding.

    An example method definition that specifies an encoder may look something like this:

        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "encoder": {
            "name": "pq",
            "parameters": {
              "code_size": 8,
              "m": 8
            }
          }
        }

    PQ Parameters

    | Parameter Name | Required | Default | Updatable | Description |
    | --- | --- | --- | --- | --- |
    | m | false | 1 | false | Determines how many sub-vectors to break the vector into. Sub-vectors are encoded independently of each other. The dimension of the vector must be divisible by m. Maximum value is 1024. |
    | code_size | false | 8 | false | Determines the number of bits to encode a sub-vector into. For IVF, this value must be less than or equal to 8. For HNSW, this value can only be 8. |

    There are many options to choose from when building your knn_vector field. To determine the right methods and parameters, you should first understand the requirements of your workload and the trade-offs you are willing to make. Factors to consider are (1) query latency, (2) query quality, (3) memory limits, and (4) indexing latency.

    If memory is not a concern, HNSW offers a very strong query latency/query quality tradeoff.

    If you want to use less memory and index faster than HNSW, while maintaining similar query quality, you should evaluate IVF.

    If memory is a concern, consider adding a PQ encoder to your HNSW or IVF index. Because PQ is a lossy encoding, query quality will drop.

    When estimating memory usage, keep in mind that having a replica doubles the total number of vectors.

    HNSW memory estimation

    The memory required for HNSW is estimated to be 1.1 * (4 * dimension + 8 * M) bytes/vector.

    As an example, assume you have a million vectors with a dimension of 256 and M of 16. The memory requirement can be estimated as follows:
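
        1.1 * (4 * 256 + 8 * 16) * 1,000,000 ~= 1.267 GB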

    IVF memory estimation

    The memory required for IVF is estimated to be 1.1 * ((4 * dimension * num_vectors) + (4 * nlist * dimension)) bytes.

    As an example, assume you have a million vectors with a dimension of 256 and nlist of 128. The memory requirement can be estimated as follows:

        1.1 * ((4 * 256 * 1,000,000) + (4 * 128 * 256)) ~= 1.126 GB

    Additionally, the k-NN plugin introduces several index settings that can be used to configure the k-NN structure.

    At the moment, several of the parameters defined in the index settings are in the process of being deprecated. Those parameters should be set in the mapping instead of the index settings. Parameters set in the mapping will override the parameters set in the index settings. Setting the parameters in the mapping allows an index to have multiple knn_vector fields with different parameters.

    | Setting | Default | Updatable | Description |
    | --- | --- | --- | --- |
    | index.knn | false | false | Whether the index should build native library indices for the knn_vector fields. If set to false, the knn_vector fields will be stored in doc values, but Approximate k-NN search functionality will be disabled. |
    | index.knn.algo_param.ef_search | 512 | true | The size of the dynamic list used during k-NN searches. Higher values lead to more accurate but slower searches. Only available for nmslib. |
    | index.knn.algo_param.ef_construction | 512 | false | (Deprecated in 1.0.0. Use the mapping parameters to set this value instead.) Only available for nmslib. Refer to the mapping definition. |
    | index.knn.algo_param.m | 16 | false | (Deprecated in 1.0.0. Use the mapping parameters to set this value instead.) Only available for nmslib. Refer to the mapping definition. |
    | index.knn.space_type | “l2” | false | (Deprecated in 1.0.0. Use the mapping parameters to set this value instead.) Only available for nmslib. Refer to the mapping definition. |
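
    Putting the settings and mappings together, creating a k-NN index might look like the following sketch (the index name, field name, and all parameter values are illustrative):

        PUT /my-knn-index
        {
          "settings": {
            "index.knn": true
          },
          "mappings": {
            "properties": {
              "my_vector": {
                "type": "knn_vector",
                "dimension": 4,
                "method": {
                  "name": "hnsw",
                  "space_type": "l2",
                  "engine": "nmslib",
                  "parameters": {
                    "ef_construction": 128,
                    "m": 24
                  }
                }
              }
            }
          }
        }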