[experimental] Natural Language Processing functions
This is an experimental feature that is currently in development and is not ready for general use. It will change in unpredictable, backwards-incompatible ways in future releases. It is disabled by default and must be enabled explicitly through its experimental setting.
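As a minimal sketch, the feature can be switched on for the current session with a `SET` statement. The setting name used here, `allow_experimental_nlp_functions`, is an assumption, since the text above does not name it; check the settings reference for your release before relying on it.

```sql
-- Assumption: the experimental flag is named allow_experimental_nlp_functions.
-- Applies to the current session only.
SET allow_experimental_nlp_functions = 1;
```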
stem
Performs stemming on a given word.
Syntax
stem('language', word)
Arguments
language: Language whose rules will be applied. Must be in lowercase. String.
word: Word that needs to be stemmed. Must be in lowercase. String.
Examples
Query:
SELECT arrayMap(x -> stem('en', x), ['I', 'think', 'it', 'is', 'a', 'blessing', 'in', 'disguise']) AS res;
Result:
┌─res────────────────────────────────────────────────┐
│ ['I','think','it','is','a','bless','in','disguis'] │
└────────────────────────────────────────────────────┘
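Because the word argument must be lowercase, free text usually needs to be normalized before stemming. The following is a sketch rather than a recommended pipeline: it splits a sentence on spaces with `splitByChar`, lowercases each token with `lower`, and stems it. The sentence literal and the space-only tokenization are illustrative assumptions, and the NLP functions are assumed to be enabled already.

```sql
-- Tokenize on single spaces, lowercase each token, then stem it.
-- Assumes the experimental NLP functions are already enabled for the session.
SELECT arrayMap(
    x -> stem('en', lower(x)),
    splitByChar(' ', 'I think it is a blessing in disguise')
) AS res;
```

Splitting on a single space does not strip punctuation, so real text would need a more careful tokenizer before this step.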
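lemmatize
Performs lemmatization on a given word.
Syntax
lemmatize('language', word)
Arguments
language: Language whose rules will be applied. String.
word: Word that needs to be lemmatized. Must be in lowercase. String.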
Examples
Query:
SELECT lemmatize('en', 'wolves');
Result:
┌─lemmatize("wolves")─┐
└─────────────────────┘
Configuration:
synonyms
Finds synonyms of a given word. There are two types of synonym extensions: plain and wordnet.
With the wordnet extension type, you must provide a path to a directory containing a WordNet thesaurus. The thesaurus must contain a WordNet sense index.
Syntax
synonyms('extension_name', word)
Arguments
extension_name: Name of the extension in which the search will be performed. String.
word: Word that will be searched for in the extension. String.
Examples
Query:
SELECT synonyms('list', 'important');
Result:
Configuration:
<synonyms_extensions>
    <extension>
        <name>en</name>
        <type>plain</type>
        <path>en.txt</path>
    </extension>
    <extension>
        <name>en</name>
        <type>wordnet</type>
        <path>en/</path>
    </extension>
</synonyms_extensions>