By default, the full indexed document is returned as part of all searches. This is referred to as the source (the _source field in the search hits). If we don’t want the entire source document returned, we can request only a few fields from within the source, or we can set _source to false to omit the field entirely.
This example shows how to return two fields, account_number and balance (inside of _source), from the search:
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match_all": {} },
  "_source": ["account_number", "balance"]
}'
Note that the above example simply reduces the information returned in the _source field. Each hit still contains a single field named _source, but that field now includes only account_number and balance.
If you come from a SQL background, the above is somewhat similar in concept to the SQL query:
SELECT account_number, balance FROM bank;
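The other option mentioned earlier, setting _source to false, can be sketched in the same way. This is a minimal illustration rather than part of the original walkthrough: the variable no_source_body and the helper search_bank are hypothetical names introduced here, and the Content-Type header is an addition that Elasticsearch 6.0 and later require (it should be harmless on older releases).

```shell
# Request body that switches _source off, so each hit carries metadata only.
no_source_body='{
  "query": { "match_all": {} },
  "_source": false
}'

# Hypothetical helper: POSTs a request body to the bank index used in this tutorial.
# The Content-Type header is an assumption for newer Elasticsearch releases.
search_bank() {
  curl -XPOST 'localhost:9200/bank/_search?pretty' \
    -H 'Content-Type: application/json' \
    -d "$1"
}
```

With a node running and the bank index loaded, calling search_bank "$no_source_body" should return hits that still carry _index, _id, and _score but no _source field.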
Now let’s move on to the query part. Previously, we’ve seen how the match_all query is used to match all documents. Let’s now introduce a new query called the match query, which can be thought of as a basic fielded search query (i.e. a search done against a specific field or set of fields).
This example returns the account with the account_number set to 20:
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match": { "account_number": 20 } }
}'
This example returns all accounts containing the term "mill" in the address:
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match": { "address": "mill" } }
}'
This example returns all accounts containing the term "mill" or "lane" in the address:
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match": { "address": "mill lane" } }
}'
This example uses match_phrase, a variant of match that splits the query into terms and returns only documents whose address contains all of the terms in the same positions relative to each other[1]:
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": { "match_phrase": { "address": "mill lane" } }
}'
Let’s now introduce the bool(ean) query. The bool query allows us to compose smaller queries into bigger queries using boolean logic.
This example composes two match queries and returns all accounts containing "mill" and "lane" in the address:
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}'
In the above example, the bool must clause specifies all the queries that must be true for a document to be considered a match.
In contrast, this example composes two match queries and returns all accounts containing "mill" or "lane" in the address:
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": {
    "bool": {
      "should": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}'
In the above example, the bool should clause specifies a list of queries, at least one of which must be true for a document to be considered a match.
This example composes two match queries and returns all accounts that contain neither "mill" nor "lane" in the address:
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must_not": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}'
In the above example, the bool must_not clause specifies a list of queries, none of which may be true for a document to be considered a match.
We can combine must, should, and must_not clauses simultaneously inside a bool query. Furthermore, we can compose bool queries inside any of these bool clauses to mimic any complex multi-level boolean logic.
This example returns all accounts that belong to people who are exactly 40 years old and don’t live in Washington (WA for short):
curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "WA" } }
      ]
    }
  }
}'
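The remark that bool queries can be composed inside bool clauses can be sketched as follows. The request is an illustration rather than part of the original walkthrough: it looks for accounts whose holder is 40 years old, whose address contains "mill" or "lane", and whose state is not WA. The names nested_body and search_bank are hypothetical, and the Content-Type header is an assumption for Elasticsearch 6.0 and later.

```shell
# Outer bool: age must be 40, the nested bool must match, and state must not be WA.
# Nested bool: only should clauses, so at least one of them has to match.
nested_body='{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": 40 } },
        {
          "bool": {
            "should": [
              { "match": { "address": "mill" } },
              { "match": { "address": "lane" } }
            ]
          }
        }
      ],
      "must_not": [
        { "match": { "state": "WA" } }
      ]
    }
  }
}'

# Hypothetical helper: POSTs a request body to the bank index used in this tutorial.
# The Content-Type header is an assumption for newer Elasticsearch releases.
search_bank() {
  curl -XPOST 'localhost:9200/bank/_search?pretty' \
    -H 'Content-Type: application/json' \
    -d "$1"
}
```

With a node running and the bank index loaded, the query is sent with search_bank "$nested_body". Placing the inner bool inside must makes the "mill or lane" condition mandatory, which is one way to express (A AND (B OR C)) AND NOT D.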