
Elasticsearch Curator -- Version 1.1.0 Released

When Elasticsearch version 1.0.0 was released, it came with a new feature: Snapshot & Restore. The Snapshot portion of this feature allows you to create backups by taking a “picture” of your indices at a particular point in time. Soon after this announcement, the feature requests began to accumulate, things like “Add snapshots to Curator!” or “When will Curator be able to do snapshots?” If this has been your desire, your wish has finally been granted, along with much, much more!

New Features

These are the newest additions to Curator:

  • Brand new CLI structure
  • Snapshots
  • Aliases
  • Exclude indices by pattern
  • Allocation Routing
  • Show indices and snapshots
  • Repository management (in a separate script)
  • Documentation wiki

A Brand-New Command-Line Structure

Please note: The change to the command-line structure means that your older cron entries will not work with Curator 1.1.0. Please remember to update your commands when upgrading to Curator 1.1.0.

The concept of commands has been added to make things simpler and to make the help output easier to navigate. Curator performs the same tasks as in previous versions; it just uses a slightly different format:

Old format command:

curator -d 30 

New format command:

curator delete --older-than 30

Note that commands are not prefixed with hyphens the way that flags are. Care was taken to ensure that similar flags now have similar or identical names. For example, the --older-than flag can be found in many of the commands. The implied value is identical in each case: indices older than the supplied number of days.

The new list of commands is:

  • alias
  • allocation
  • bloom
  • close
  • delete
  • optimize
  • show
  • snapshot

You can get the help output for any command by running:

curator [COMMAND] --help

All of the associated flags will then be displayed for that command.
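To make the command/flag split concrete, here is a minimal sketch of how such a structure could be laid out with Python's argparse. This is not Curator's actual source; the build_parser function and the three commands wired up here are assumptions chosen purely for illustration.

```python
import argparse

def build_parser():
    """Hypothetical parser layout; the real Curator code may differ."""
    parser = argparse.ArgumentParser(prog="curator")
    commands = parser.add_subparsers(dest="command")
    for name in ("delete", "close", "snapshot"):
        sub = commands.add_parser(name)
        # The same flag name carries the same meaning in every command.
        sub.add_argument("--older-than", type=int, dest="older_than")
    return parser

# Commands have no leading hyphens; flags do.
args = build_parser().parse_args(["delete", "--older-than", "30"])
print(args.command, args.older_than)
```

Because every command is its own sub-parser, each one gets its own --help output listing only the flags that apply to it.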

Snapshots

The snapshot command allows you to capture snapshots of indices into a pre-existing repository.

Curator will create one snapshot per index, and it will take its name from the index. For example, an index named logstash-2014.06.10 will yield a snapshot named logstash-2014.06.10. It will loop through indices creating snapshots for each one in sequence based on the criteria you provide.

curator snapshot --older-than 20 --repository REPOSITORY_NAME

This command will take snapshots of all indices older than 20 days and send them to the repository identified by REPOSITORY_NAME.
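The selection logic described above can be sketched roughly as follows. This is an illustration, not Curator's implementation: the function names are made up, the date format is inferred from the logstash-2014.06.10 example, and treating "older than" as a strict comparison on whole days is an assumption.

```python
from datetime import date, datetime

def index_date(name, prefix="logstash-", fmt="%Y.%m.%d"):
    """Parse the date suffix of an index name, or return None if it has none."""
    if not name.startswith(prefix):
        return None
    try:
        return datetime.strptime(name[len(prefix):], fmt).date()
    except ValueError:
        return None

def snapshot_names(indices, older_than, today):
    """Return one snapshot name per index older than `older_than` days."""
    selected = []
    for name in sorted(indices):  # process indices in sequence
        day = index_date(name)
        # Assumption: "older than N" means strictly more than N days old.
        if day is not None and (today - day).days > older_than:
            selected.append(name)  # the snapshot takes its name from the index
    return selected

indices = ["logstash-2014.06.20", "logstash-2014.06.10", "logstash-2014.05.01"]
print(snapshot_names(indices, older_than=20, today=date(2014, 7, 1)))
```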

A script called es_repo_mgr has been included with Curator to assist in repository creation. It can help create both filesystem and S3 type repositories.

In addition to being able to snapshot older indices, Curator provides a way for you to upload the most recent indices. This is useful when uploading Elasticsearch Marvel indices so others can view your performance data for troubleshooting purposes.

curator snapshot --most-recent 3 --prefix .marvel- --repository REPOSITORY_NAME

With this command you can capture the three most recent Marvel indices to the named repository.
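Picking the "most recent" indices is straightforward when names end in a zero-padded year.month.day stamp, because sorting the names also sorts them by date. A small sketch under that assumption (the most_recent function and the sample index names are hypothetical, not taken from Curator):

```python
def most_recent(indices, count, prefix="logstash-"):
    """Return the `count` newest index names that start with `prefix`."""
    # Zero-padded date suffixes sort chronologically as plain strings.
    matching = sorted(name for name in indices if name.startswith(prefix))
    return matching[-count:] if count > 0 else []

indices = [
    ".marvel-2014.06.08",
    ".marvel-2014.06.11",
    "logstash-2014.06.11",
    ".marvel-2014.06.09",
    ".marvel-2014.06.10",
]
print(most_recent(indices, 3, prefix=".marvel-"))
```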

Aliases

Curator now allows you to add indices to a pre-existing alias, and also remove indices from an alias. The alias must exist. Curator will not create it for you.

Suppose I wanted to keep a rolling alias of the previous week’s indices, called last_week. I could keep that updated with the following two commands:

curator alias --alias-older-than 7 --alias last_week
curator alias --unalias-older-than 14 --alias last_week

It is useful to point out here that Elasticsearch can automatically add newly created indices to an alias by way of index templates. You could have new indices automatically join an alias called this_week and use a command like:

curator alias --unalias-older-than 7 --alias this_week

to keep a this_week and last_week alias updated.
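To see which indices end up in last_week after the two alias commands have run, here is a small simulation. It is only a sketch: the function name, the example date, and the strict "more than N days old" comparison are assumptions, not Curator's documented behavior.

```python
from datetime import date, timedelta

def rolling_alias_members(index_days, today, alias_older_than=7, unalias_older_than=14):
    """Return the index names left in the alias after both commands have run."""
    members = set()
    for name, day in index_days.items():
        age = (today - day).days
        if age > alias_older_than:      # curator alias --alias-older-than 7
            members.add(name)
        if age > unalias_older_than:    # curator alias --unalias-older-than 14
            members.discard(name)
    return sorted(members)

today = date(2014, 6, 20)  # example date, chosen for illustration
days = [today - timedelta(days=n) for n in range(20)]
index_days = {"logstash-" + d.strftime("%Y.%m.%d"): d for d in days}
print(rolling_alias_members(index_days, today))
```

With one index per day, the alias always holds a seven-day window that slides forward each time the two commands run.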

Exclude Pattern

Sometimes you want to exclude a given index from operations. Previously you could only limit your selection by prefix and date. Now there’s an --exclude-pattern option that will allow you to filter out indices in addition to these other methods.

Supposing I never want the index logstash-2014.06.11 to be deleted, I could exclude this from deletes in this manner:

curator delete --older-than 15 --exclude-pattern 2014.06.11

Curator would match the default prefix of logstash- and would prevent an index with 2014.06.11 in it from being deleted.
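The combined prefix-plus-exclusion filtering might look something like the sketch below. The filter_indices function is hypothetical, and it treats the exclude pattern as a plain substring, which is an assumption; Curator itself may interpret the pattern differently.

```python
def filter_indices(indices, prefix="logstash-", exclude_pattern=None):
    """Keep indices that match the prefix and do not contain the exclude pattern."""
    kept = []
    for name in indices:
        if not name.startswith(prefix):
            continue  # not selected by prefix at all
        if exclude_pattern and exclude_pattern in name:
            continue  # explicitly protected from the operation
        kept.append(name)
    return kept

indices = ["logstash-2014.06.10", "logstash-2014.06.11", ".marvel-2014.06.11"]
print(filter_indices(indices, exclude_pattern="2014.06.11"))
```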

Allocation Routing

Elasticsearch allows you to tag your nodes (not in the graffiti sense). With these tags you have the power to control where your indices and shards go within your cluster. A common use case for this is having high-powered nodes with SSD drives for indexing, but lower-powered boxes with spinning hard disk drives for older, less frequently searched indices. For this to work, each node must have a corresponding setting in its elasticsearch.yml file, e.g. node.tag: ssd on the SSD nodes and node.tag: hdd on the spinning-disk nodes. Curator now provides a way to automatically update the tag on an index so it can be re-routed during off-peak hours.

The command:

curator allocation --older-than 2 --rule tag=hdd

…will apply the setting index.routing.allocation.require.tag=hdd to indices older than 2 days. The require portion of this will tell Elasticsearch that the shards of that index are required to reside on a node with node.tag: hdd.
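As a rough illustration of how a --rule value maps onto that index setting, the sketch below turns tag=hdd into the settings body you could send to an index's _settings endpoint. The allocation_setting function is a made-up helper, not part of Curator.

```python
import json

def allocation_setting(rule, allocation_type="require"):
    """Turn a key=value rule into an index-level allocation setting."""
    key, sep, value = rule.partition("=")
    if not sep or not key or not value:
        raise ValueError("rule must look like key=value")
    return {"index.routing.allocation.%s.%s" % (allocation_type, key): value}

# Settings body for the rule used in the command above.
print(json.dumps(allocation_setting("tag=hdd")))
```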

Show indices and snapshots

This is a simple way to get a quick look at what indices or snapshots you have:

curator show --show-indices

…will show all indices matching the default prefix of logstash-.

curator show --show-snapshots --repository REPOSITORY_NAME

…will show all snapshots matching the default prefix of logstash- within the named repository.

Repository management

As mentioned previously, a helper script called es_repo_mgr was included with Curator to assist in creating snapshot repositories. At this time, only fs and s3 types are supported. Please be sure to read the documentation for the indicated type before creating a repository. For example, each node using an fs type repository must be able to access the same shared filesystem, in the same path, identified by --location.

Create an fs type repository:

es_repo_mgr create_fs --location '/tmp/REPOSITORY_LOCATION' --repository REPOSITORY_NAME

Delete a repository:

es_repo_mgr delete --repository REPOSITORY_NAME
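For reference, these two operations correspond to requests against the Elasticsearch snapshot repository API. The sketch below only builds the method, path, and body for each; the helper names are hypothetical, and how es_repo_mgr actually issues the requests is not shown here.

```python
import json

def fs_repository_request(repository, location):
    """Build the request that registers a shared-filesystem repository."""
    path = "/_snapshot/%s" % repository
    body = {"type": "fs", "settings": {"location": location}}
    return "PUT", path, json.dumps(body, sort_keys=True)

def delete_repository_request(repository):
    """Build the request that unregisters a repository."""
    return "DELETE", "/_snapshot/%s" % repository, None

print(fs_repository_request("REPOSITORY_NAME", "/tmp/REPOSITORY_LOCATION"))
print(delete_repository_request("REPOSITORY_NAME"))
```

Every node must be able to reach the location path, since each node writes its own shard data there.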

Documentation wiki

The documentation for Curator has been updated and put online in a wiki that anyone can edit. You can find more in-depth information about flags and commands there, and even add to the documentation if you feel so inclined.

Installation and Upgrading

Curator 1.1.0 is in the PyPI repository. To install:

pip install elasticsearch-curator

To upgrade from version 1.0.0:

pip uninstall elasticsearch-curator
pip install elasticsearch-curator

To upgrade from a version older than 1.0.0:

pip uninstall elasticsearch-curator
pip uninstall elasticsearch
pip install elasticsearch-curator

Running pip uninstall elasticsearch removes the older Python elasticsearch module so the proper version can be re-installed as a dependency.

Conclusion

The new features in Curator are awesome! This release marks a huge improvement in user experience as well. If you run into trouble or find something we missed, please log an issue on our GitHub Issues page. If you love Curator, please tell us about it! We love tweets with #elasticsearch in them!

Curator is just getting started! We’ll be working on a roadmap for Curator 2.0 soon. Thanks for reading, and Happy Curating!