Elasticsearch Boolean Queries

In this article, I will demonstrate the different types of boolean queries in Elasticsearch with examples.

Introduction

In Elasticsearch, a bool query allows you to combine multiple search queries with boolean conditions. It is also used to create an advanced query by chaining one or more boolean clauses together.

Using boolean queries, we can get more precise results by applying more specific filter parameters.

We can add any type of query inside each bool clause, such as terms, match, and query_string.

There are two other important parameters that we can include with the bool query:

minimum_should_match

The minimum_should_match parameter sets the minimum number of should clauses that a document must match in order to be returned. If the bool query includes at least one should clause and no must or filter clauses, the default value is 1. Otherwise, the default value is 0.
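
As a sketch of how this threshold behaves, the request below lists three should clauses and asks for at least two of them to match. It assumes the employees index and sample documents introduced later in this article; the particular choice of clauses is only illustrative.

```console
# Illustrative only: the three clauses were picked for this sketch.
GET /employees/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "company_name": "Apple" } },
        { "match": { "gender": "Male" } },
        { "term": { "is_part_time_employee": false } }
      ],
      "minimum_should_match": 2
    }
  }
}
```

Against the four sample documents, this should return the documents with id 1 (three clauses match) and id 4 (two clauses match). The documents with id 2 and id 3 match only one clause each.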

boost

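The boost parameter lets you give more weight to one query than to another. A value above 1.0 increases a query's contribution to the relevance score, and a value between 0 and 1.0 decreases it.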
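
To illustrate, a boost can also be set on an individual query inside a clause. The sketch below assumes the employees index used later in this article, and the value 2.0 is an arbitrary choice. It increases the relevance score of documents matching “Apple” relative to what they would get without the boost.

```console
# Illustrative only: the boost value 2.0 is an arbitrary choice.
GET /employees/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "company_name": { "query": "Apple", "boost": 2.0 } } },
        { "match": { "company_name": "Microsoft" } }
      ]
    }
  }
}
```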

Types of boolean clauses in Elasticsearch

There are four boolean clauses used in bool queries:

  1. Must
  2. Must_not
  3. Should
  4. Filter

A single bool query can be created from a combination of all of the above clauses. For example,

GET /index_name/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "FIELD": "VALUE"
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "FIELD": "VALUE"
          }
        }
      ],
      "should": [
        {
          "match": {
            "FIELD": "TEXT"
          }
        }
      ],
      "filter": [
        {
          "term": {
            "FIELD": "VALUE"
          }
        }
      ],
      "minimum_should_match": 1,
      "boost": 2.0
    }
  }
}

Let's go through the clauses one by one with some examples.

The mapping for our index is defined as follows,

PUT /employees
{
  "mappings": {
    "properties": {
      "id": {
        "type": "integer"
      },
      "name": {
        "type": "text"
      },
      "company_name": {
        "type": "text"
      },
      "phone_number": {
        "type": "text"
      },
      "gender": {
        "type": "text"
      },
      "salary": {
        "type": "float"
      },
      "is_part_time_employee": {
        "type": "boolean"
      }
    }
  }
}

Our index contains the following sample documents.

[
  {"id":1,"name":"Anders","company_name":"Apple","phone_number":"4196541230","gender":"Male","salary":12000,"is_part_time_employee":false}, 
  {"id":2,"name":"Asher","company_name":"Microsoft","phone_number":"4256231478","gender":"Male","salary":10000,"is_part_time_employee":true}, 
  {"id":3,"name":"Anjali","company_name":"Apple","phone_number":"4369852147","gender":"Female","salary":15000,"is_part_time_employee":true}, 
  {"id":4,"name":"Bacchus","company_name":"Infosys","phone_number":"4489562314","gender":"Male","salary":12000,"is_part_time_employee":false}
]
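
One way to load these documents is the bulk API. The request below is a sketch: using each document's id field as its _id is an assumption made here for convenience, not something the mapping requires, and refresh=true is added only so that the documents are searchable right away.

```console
# Assumption: each document's _id mirrors its "id" field.
POST /employees/_bulk?refresh=true
{ "index": { "_id": "1" } }
{ "id": 1, "name": "Anders", "company_name": "Apple", "phone_number": "4196541230", "gender": "Male", "salary": 12000, "is_part_time_employee": false }
{ "index": { "_id": "2" } }
{ "id": 2, "name": "Asher", "company_name": "Microsoft", "phone_number": "4256231478", "gender": "Male", "salary": 10000, "is_part_time_employee": true }
{ "index": { "_id": "3" } }
{ "id": 3, "name": "Anjali", "company_name": "Apple", "phone_number": "4369852147", "gender": "Female", "salary": 15000, "is_part_time_employee": true }
{ "index": { "_id": "4" } }
{ "id": 4, "name": "Bacchus", "company_name": "Infosys", "phone_number": "4489562314", "gender": "Male", "salary": 12000, "is_part_time_employee": false }
```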

1. Must

Use the “must” clause when a document has to match every query in the clause. Matching queries also contribute to the relevance score.

The “must” clause is similar to the “and” operator. If it contains more than one query, all of those queries must match.

For example, suppose we want an employee who works at “Apple” and whose gender is “Female”.

The boolean expression will be: company_name = Apple AND gender = Female

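The bool DSL query is shown below. It uses match rather than term because company_name and gender are mapped as text fields: their values are lowercased at index time, so a term query for the exact value “Apple” or “Female” would find nothing.

GET /employees/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "company_name": "Apple"
          }
        },
        {
          "match": {
            "gender": "Female"
          }
        }
      ]
    }
  }
}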

The above query will return only the 3rd document [“id”:3] out of 4, because it is the only one whose company_name is “Apple” and whose gender is “Female”.

2. Must_Not

The “must_not” clause is the opposite of the must clause: documents that match its queries are not returned. It is like the logical operator “not”.

Its clauses are executed in filter context. In filter context, a query clause answers a yes/no question, “Does this document match this query clause?”, and no scores are calculated. It is mostly used for filtering structured data, for example: is the is_part_time_employee field set to true? This means that must_not clauses do not contribute to the final score, and their results can be cached.

For example, suppose we don't want part-time employees.

The boolean expression will be: is_part_time_employee != true

GET /employees/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "term": {
            "is_part_time_employee": true
          }
        }
      ]
    }
  }
}

The above query will return 2 documents [“id”:1, “id”:4] out of 4, because only these two employees work full time.

3. Should

The should clause differs from the other types: it specifies queries that a document may match, and a document that matches any one of them can be returned.

When a bool query contains only should clauses, at least one of them has to match the document, so it corresponds to the boolean “or” operator. When must or filter clauses are also present, should clauses become optional by default and only raise the score of the documents that match them.

For example, suppose we want employees who work at “Microsoft” or “Infosys”.

The boolean expression will be: company_name = Microsoft OR company_name = Infosys

The bool DSL Query,

GET /employees/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "company_name": "Microsoft"
          }
        },
        {
          "match": {
            "company_name": "Infosys"
          }
        }
      ]
    }
  }
}

The above query will return 2 documents in result [“id”:2, “id”:4] out of 4. 
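
When a should clause sits next to a must clause, it no longer restricts the results by default. The following sketch, which reuses the sample index, requires the gender to be “Male” and treats the company only as a preference.

```console
# Sketch: the should clause here is optional and only influences scoring.
GET /employees/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "gender": "Male" } }
      ],
      "should": [
        { "match": { "company_name": "Apple" } }
      ]
    }
  }
}
```

This should still return the three male employees [“id”:1, “id”:2, “id”:4]. The document with id 1 also matches the should clause, so it is expected to score higher than the other two.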

4. Filter

The filter clause tells Elasticsearch that the query must appear in matching documents. It is quite similar to the must clause but does not contribute to the score. Filter clauses are executed in filter context, meaning that no score is computed and the results can be cached.

Elasticsearch automatically caches frequently used filters. So if the same filter is used again in another request, the result can come from the cache.

For example, suppose we want employees who have a salary of more than 10000.

GET /employees/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "salary": {
              "gt": 10000
            }
          }
        }
      ]
    }
  }
}

The above query will return 3 documents in result [“id”:1, “id”:3, “id”:4] out of 4. 
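
Filter and must can also be combined so that only the must part is scored. The sketch below reuses the sample index; the pairing of these two conditions is chosen here purely for illustration.

```console
# Sketch: the match query is scored, the range filter only narrows the results.
GET /employees/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "company_name": "Apple" } }
      ],
      "filter": [
        { "range": { "salary": { "gt": 10000 } } }
      ]
    }
  }
}
```

Both Apple employees [“id”:1, “id”:3] have a salary above 10000, so both should be returned, with scores coming only from the match query.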

We have gone through all the clauses one by one with examples. Sometimes, however, matching one or two fields is not enough, and we want to search across multiple fields with multiple conditions.

Let's say we want employees who work at “Apple” or “Microsoft”, whose gender is “Male”, and who are full-time employees.

The boolean expression will be: (company_name = Apple OR company_name = Microsoft) AND (gender = Male) AND (is_part_time_employee != true)

The bool DSL Query,

GET /employees/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "match": {
                  "company_name": "Apple"
                }
              },
              {
                "match": {
                  "company_name": "Microsoft"
                }
              }
            ]
          }
        },
        {
          "bool": {
            "must": [
              {
                "match": {
                  "gender": "Male"
                }
              }
            ]
          }
        },
        {
          "bool": {
            "must_not": [
              {
                "term": {
                  "is_part_time_employee": true
                }
              }
            ]
          }
        }
      ]
    }
  }
}

The above query will return 1 document [“id”:1] out of 4, because only the first document matches all the conditions in the query.
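
The same conditions can usually be written without wrapping each one in its own bool query. The variant below is a sketch of one such flattening: it keeps the pair of should clauses but makes one of them mandatory with minimum_should_match, and moves the other conditions to the top level. It is expected to match the same document, although the scores may differ from the nested form.

```console
# Sketch: a flatter form of the nested query above.
GET /employees/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "gender": "Male" } }
      ],
      "should": [
        { "match": { "company_name": "Apple" } },
        { "match": { "company_name": "Microsoft" } }
      ],
      "minimum_should_match": 1,
      "must_not": [
        { "term": { "is_part_time_employee": true } }
      ]
    }
  }
}
```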

Summary

In this article, we learned about the different types of boolean clauses in Elasticsearch and their uses with different types of examples. 

I hope you find this article helpful. If you have any suggestions, please feel free to share them in the comment section.

Thank you.

