[TIP]

In the same way that the query is the go-to query for standard
full-text search, the match_phrase query(((“proximity matching”, “phrase matching”)))(((“phrase matching”)))(((“match_phrase query”))) is the one you should reach for
when you want to find words that are near each other:

GET /my_index/my_type/_search
{
“query”: {
“match_phrase”: {
“title”: “quick brown fox”
}
}

}

// SENSE: 120_Proximity_Matching/05_Match_phrase_query.json

Like the match query, the match_phrase query first analyzes the query
string to produce a list of terms. It then searches for all the terms, but
keeps only documents that contain all of the search terms, in the same
positions relative to each other. A query for the phrase quick fox
would not match any of our documents, because no document contains the word
quick immediately followed by fox.

[TIP]

The match_phrase query can also be written as a query with type
phrase:

“match”: {
“title”: {
“query”: “quick brown fox”,
“type”: “phrase”
}

}

// SENSE: 120_Proximity_Matching/05_Match_phrase_query.json

==================================================

When a string is analyzed, the analyzer returns not(((“phrase matching”, “term positions”)))(((“matchphrase query”, “position of terms”)))(((“position-aware matching”))) only a list of terms, but
also the _position, or order, of each term in the original string:

GET /_analyze?analyzer=standard

Quick brown fox

// SENSE: 120_Proximity_Matching/05_Term_positions.json

This returns the following:

[role=”pagebreak-before”]

{
“tokens”: [
{
“token”: “quick”,
“start_offset”: 0,
“end_offset”: 5,
“type”: ““,
“position”: 1 <1>
},
{
“token”: “brown”,
“start_offset”: 6,
“end_offset”: 11,
“type”: ““,
“position”: 2 <1>
},
{
“token”: “fox”,
“start_offset”: 12,
“end_offset”: 15,
“type”: ““,
“position”: 3 <1>
}
]

}

<1> The position of each term in the original string.

Positions can be stored in the inverted index, and position-aware queries like
the match_phrase query can use them to match only documents that contain
all the words in exactly the order specified, with no words in-between.

For a document to be considered a(((“match_phrase query”, “documents matching a phrase”)))(((“phrase matching”, “criteria for matching documents”))) match for the phrase ``quick brown fox,’’ the following must be true:

quick, brown, and fox must all appear in the field.
The position of brown must be greater than the position of quick.

If any of these conditions is not met, the document is not considered a match.

[TIP]

Internally, the match_phrase query uses the low-level span query family to
do position-aware matching. (((“match_phrase query”, “use of span queries for position-aware matching”)))(((“span queries”)))Span queries are term-level queries, so they have
no analysis phase; they search for the exact term specified.

Thankfully, most people never need to use the span queries directly, as the
query is usually good enough. However, certain specialized
fields, like patent searches, use these low-level queries to perform very
specific, carefully constructed positional searches.

==================================================

Phrase matching

}

[TIP]

}

Quick brown fox

}

[TIP]