[experimental] Natural Language Processing functions

    This is an experimental feature that is currently in development and is not ready for general use. It will change in unpredictable, backwards-incompatible ways in future releases. It must be enabled explicitly through a setting before use.

    stem

    Performs stemming on a given word.

    Syntax

    stem('language', word)

    Arguments

    • language — Language whose rules will be applied. Must be in lowercase. String.
    • word — Word that needs to be stemmed. Must be in lowercase. String.

    Examples

    Query:

    SELECT arrayMap(x -> stem('en', x), ['I', 'think', 'it', 'is', 'a', 'blessing', 'in', 'disguise']) AS res;

    Result:

    ┌─res────────────────────────────────────────────────┐
    │ ['I','think','it','is','a','bless','in','disguis'] │
    └────────────────────────────────────────────────────┘
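
    Because both arguments must be lowercase, mixed-case text may need to be normalized before stemming. The query below is a minimal sketch of that idea: it splits a sentence on spaces, lowercases each token with lower, and then applies stem. The setting name allow_experimental_nlp_functions and the sample sentence are assumptions made for illustration; check the settings reference of your server version for the exact switch.

    ```sql
    -- Assumption: the experimental switch is named allow_experimental_nlp_functions.
    SET allow_experimental_nlp_functions = 1;

    -- Split on spaces, lowercase every token, then stem it with the English rules.
    SELECT arrayMap(
        x -> stem('en', lower(x)),
        splitByChar(' ', 'I think it is a Blessing in Disguise')
    ) AS res;
    ```

    Lowercasing inside the lambda keeps the original column untouched, which is convenient when the unmodified text is still needed elsewhere in the query.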

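    lemmatize

    Performs lemmatization on a given word.

    Syntax

    lemmatize('language', word)

    Arguments

    • language — Language whose rules will be applied. String.
    • word — Word that needs to be lemmatized. Must be in lowercase. String.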

    Examples

    Query:

    SELECT lemmatize('en', 'wolves');

    Result:

    ┌─lemmatize("wolves")─┐
    └─────────────────────┘
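
    To see how stemming and lemmatization differ on the same input, both functions can be applied side by side. This is only a sketch: it assumes the experimental switch is named allow_experimental_nlp_functions and that an English lemmatizer has already been configured on the server. The word list is illustrative.

    ```sql
    -- Assumption: the experimental switch is named allow_experimental_nlp_functions,
    -- and an English ('en') lemmatizer is configured on the server.
    SET allow_experimental_nlp_functions = 1;

    -- One row per word, with its stem and its lemma next to each other.
    SELECT
        word,
        stem('en', word) AS stemmed,
        lemmatize('en', word) AS lemma
    FROM (SELECT arrayJoin(['wolves', 'blessing']) AS word);
    ```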

    Configuration:

    synonyms

    Finds synonyms of a given word. There are two types of synonym extensions: plain and wordnet.

    With the wordnet extension type, you must provide a path to a directory containing a WordNet thesaurus. The thesaurus must contain a WordNet sense index.

    Syntax

    synonyms('extension_name', word)

    Arguments

    • extension_name — Name of the extension in which the search will be performed. String.
    • word — Word that will be searched for in the extension. String.

    Examples

    Query:

    SELECT synonyms('list', 'important');

    Result:

    Configuration:

    <synonyms_extensions>
        <extension>
            <name>en</name>
            <type>plain</type>
            <path>en.txt</path>
        </extension>
        <extension>
            <name>en</name>
            <type>wordnet</type>
            <path>en/</path>
        </extension>
    </synonyms_extensions>