IMPORTANT: No additional bug fixes or documentation updates
will be released for this version. For the latest information, see the
current release documentation.
Simple analyzer
The simple analyzer breaks text into tokens at any non-letter character, such as numbers, spaces, hyphens, and apostrophes. It discards the non-letter characters and changes uppercase to lowercase.
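The rule described above can be approximated locally without a cluster. The following is a minimal sketch, not the analyzer's actual implementation: the helper name simple_analyze is hypothetical, and it assumes that "letter" corresponds to the Unicode letter class \p{L}. It also ignores details such as any maximum token length.

```ruby
# Hypothetical helper that approximates the simple analyzer:
# keep only runs of letters, then lowercase each run.
# Assumption: "letter" is treated as Unicode category \p{L}.
def simple_analyze(text)
  text.scan(/\p{L}+/).map(&:downcase)
end

tokens = simple_analyze("The 2 QUICK Brown-Foxes jumped over the lazy dog's bone.")
puts tokens.inspect
# prints ["the", "quick", "brown", "foxes", "jumped", "over", "the", "lazy", "dog", "s", "bone"]
```

Note how the digit 2 disappears entirely, the hyphen splits Brown-Foxes into two tokens, and the apostrophe splits dog's into dog and s.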
Example
response = client.indices.analyze(
  body: {
    analyzer: 'simple',
    text: "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
  }
)
puts response

POST _analyze
{
  "analyzer": "simple",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
The simple analyzer parses the sentence and produces the following tokens:
[ the, quick, brown, foxes, jumped, over, the, lazy, dog, s, bone ]
Customize
To customize the simple analyzer, duplicate it to create the basis for a custom analyzer. This custom analyzer can be modified as required, usually by adding token filters.
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    settings: {
      analysis: {
        analyzer: {
          my_custom_simple_analyzer: {
            tokenizer: 'lowercase',
            filter: []
          }
        }
      }
    }
  }
)
puts response
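
As an illustration of adding token filters, the sketch below builds the same settings body with a non-empty filter list. The helper name custom_simple_analyzer_settings is hypothetical, and the choice of the built-in stop token filter is only an assumption for the example; any token filter available in your cluster could be listed instead. The final call to the client is left as a comment because it needs a configured client and a running cluster.

```ruby
require 'json'

# Hypothetical helper: returns index settings for a custom analyzer that
# starts from the lowercase tokenizer and applies the given token filters.
def custom_simple_analyzer_settings(filters = [])
  {
    settings: {
      analysis: {
        analyzer: {
          my_custom_simple_analyzer: {
            tokenizer: 'lowercase',
            filter: filters
          }
        }
      }
    }
  }
end

# Assumption for illustration: add the built-in 'stop' token filter.
body = custom_simple_analyzer_settings(['stop'])
puts JSON.pretty_generate(body)

# With a configured client, the body could then be passed along:
# client.indices.create(index: 'my-index-000001', body: body)
```

Keeping the settings in a plain hash makes it easy to inspect or adjust the filter list before the index is created.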