Lowercase Tokenizer | Elasticsearch Reference [5.5]

WARNING: Version 5.5 of Elasticsearch has passed its EOL date.

This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.

Lowercase Tokenizer

The lowercase tokenizer, like the letter tokenizer, breaks text into terms whenever it encounters a character that is not a letter, but it also lowercases all terms. It is functionally equivalent to the letter tokenizer combined with the lowercase token filter, but is more efficient because it performs both steps in a single pass.

Example output

POST _analyze
{
  "tokenizer": "lowercase",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}

The above sentence would produce the following terms:

[ the, quick, brown, foxes, jumped, over, the, lazy, dog, s, bone ]
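The equivalence with the letter tokenizer plus the lowercase token filter can be checked with a second request. The following is a minimal sketch, assuming the `filter` parameter of the `_analyze` API accepts the built-in `lowercase` token filter by name. It reuses the sample sentence from above and should yield the same terms as the list above, at the cost of a separate filtering step.

```
POST _analyze
{
  "tokenizer": "letter",
  "filter": [ "lowercase" ],
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
```

Comparing the two responses is a quick way to confirm that switching between the two forms does not change the indexed terms.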

Configuration

The lowercase tokenizer is not configurable.
