Object Classes

IndexList

class curator.indexlist.IndexList(client)
all_indices

Instance variable. All indices in the cluster at instance creation time. Type: list()

client

An Elasticsearch Client object. Also accessible as an instance variable.

empty_list_check()

Raise exception if indices is empty

filter_allocated(key=None, value=None, allocation_type='require', exclude=True)

Match indices that have the routing allocation rule of key=value from indices

Parameters:
  • key – The allocation attribute to check for

  • value – The value to check for

  • allocation_type – Type of allocation to apply

  • exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is True
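
The exclude flag behaves the same way across the filters in this class, so a small standalone sketch may help. It does not call Curator; apply_exclude and the sample index names are hypothetical and only illustrate the semantics described above.

```python
def apply_exclude(indices, matching, exclude=True):
    """Return the indices that survive a filter, given the names that matched."""
    matched = set(matching)
    if exclude:
        # exclude=True: matching indices are removed from the list
        return [name for name in indices if name not in matched]
    # exclude=False: only matching indices are kept
    return [name for name in indices if name in matched]


indices = ["logs-1", "logs-2", "metrics-1"]
matching = ["logs-1", "logs-2"]  # e.g. indices carrying the allocation rule

removed = apply_exclude(indices, matching, exclude=True)
kept = apply_exclude(indices, matching, exclude=False)
print(removed)  # ['metrics-1']
print(kept)     # ['logs-1', 'logs-2']
```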

filter_by_age(source='name', direction=None, timestring=None, unit=None, unit_count=None, field=None, stats_result='min_value', epoch=None, exclude=False, unit_count_pattern=False)

Match indices by relative age calculations.

Parameters:
  • source – Source of index age. Can be one of ‘name’, ‘creation_date’, or ‘field_stats’

  • direction – Time to filter, either older or younger

  • timestring – An strftime string to match the datestamp in an index name. Only used for index filtering by name.

  • unit – One of seconds, minutes, hours, days, weeks, months, or years.

  • unit_count – The number of units. unit_count * unit will be converted to a relative number of seconds.

  • unit_count_pattern – A regular expression whose capture group identifies the value for unit_count.

  • field – A timestamp field name. Only used for field_stats based calculations.

  • stats_result – Either min_value or max_value. Only used in conjunction with source='field_stats' to choose whether to reference the minimum or maximum result value.

  • epoch – An epoch timestamp used in conjunction with unit and unit_count to establish a point of reference for calculations. If not provided, the current time will be used.

  • exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is False
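
As a rough illustration of how unit, unit_count, and epoch combine into a point of reference, the sketch below works on index names only. The helper names, the prefix argument, and the fixed 30-day month and 365-day year are assumptions made for this example, not a statement of how Curator computes them.

```python
from datetime import datetime, timezone

# Seconds per unit. Fixed 30-day months and 365-day years are an assumption of this sketch.
UNIT_SECONDS = {
    "seconds": 1,
    "minutes": 60,
    "hours": 3600,
    "days": 86400,
    "weeks": 7 * 86400,
    "months": 30 * 86400,
    "years": 365 * 86400,
}


def point_of_reference(unit, unit_count, epoch):
    """Return the epoch timestamp that lies unit_count * unit before epoch."""
    return epoch - unit_count * UNIT_SECONDS[unit]


def epoch_from_name(index_name, prefix, timestring):
    """Parse the datestamp that follows a known prefix (hypothetical helper)."""
    stamp = index_name[len(prefix):]
    parsed = datetime.strptime(stamp, timestring).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def is_older(index_epoch, unit, unit_count, epoch):
    """True if the index timestamp falls before the point of reference."""
    return index_epoch < point_of_reference(unit, unit_count, epoch)


now = epoch_from_name("logstash-2017.10.12", "logstash-", "%Y.%m.%d")
old = epoch_from_name("logstash-2017.10.01", "logstash-", "%Y.%m.%d")
recent = epoch_from_name("logstash-2017.10.10", "logstash-", "%Y.%m.%d")

print(is_older(old, "days", 7, now))     # True
print(is_older(recent, "days", 7, now))  # False
```

With direction set to older, the first index would match and the second would not; with younger, the comparison would be reversed.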

filter_by_alias(aliases=None, exclude=False)

Match indices which are associated with the alias or list of aliases identified by aliases.

An update to Elasticsearch 5.5.0 changes the behavior of this from previous 5.x versions: https://www.elastic.co/guide/en/elasticsearch/reference/5.5/breaking-changes-5.5.html#breaking_55_rest_changes

What this means is that indices must appear in all aliases in list aliases or a 404 error will result, leading to no indices being matched. In older versions, if the index was associated with even one of the aliases in aliases, it would result in a match.

It is unknown whether this behavior affects anyone. At the time this was written, no users had been bitten by it. The code could be adapted to loop manually if the previous behavior is desired, but if no users complain, this will become the accepted and expected behavior.

Parameters:
  • aliases (list) – A list of alias names.

  • exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is False
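
The difference between the two behaviors described above (an index must be in all listed aliases versus in at least one) can be shown with plain sets. This sketch does not query Elasticsearch; alias_map and both helper functions are hypothetical.

```python
def matches_all(index_aliases, aliases):
    """Newer behavior: the index must be associated with every alias."""
    return set(aliases) <= set(index_aliases)


def matches_any(index_aliases, aliases):
    """Older behavior: association with a single alias is enough."""
    return bool(set(aliases) & set(index_aliases))


# Hypothetical mapping of index name -> aliases it belongs to
alias_map = {
    "index-a": ["current", "archive"],
    "index-b": ["current"],
    "index-c": [],
}
wanted = ["current", "archive"]

strict = [name for name, found in alias_map.items() if matches_all(found, wanted)]
loose = [name for name, found in alias_map.items() if matches_any(found, wanted)]
print(strict)  # ['index-a']
print(loose)   # ['index-a', 'index-b']
```

A manual loop over each alias, as the note suggests, would amount to the matches_any variant.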

filter_by_count(count=None, reverse=True, use_age=False, pattern=None, source='creation_date', timestring=None, field=None, stats_result='min_value', exclude=True)

Remove indices from the actionable list beyond the number count, sorted reverse-alphabetically by default. If you set reverse to False, it will be sorted alphabetically.

The default is usually what you will want. If only one kind of index is provided–for example, indices matching logstash-%Y.%m.%d–then reverse alphabetical sorting will mean the oldest will remain in the list, because lower numbers in the dates mean older indices.

If reverse is set to False, index3 will be deleted before index2, which will be deleted before index1.

use_age allows ordering indices by age. Age is determined by the index creation date by default, but you can specify a source of name, max_value, or min_value. The name source requires the timestring argument.

Parameters:
  • count – Filter indices beyond count.

  • reverse – The filtering direction. (default: True).

  • use_age – Sort indices by age. source is required in this case.

  • pattern – Select indices to count from a regular expression pattern. This pattern must have one and only one capture group. This can allow a single count filter instance to operate against any number of matching patterns, and keep count of each index in that group. For example, given a pattern of '^(.*)-\d{6}$', it will match both rollover-000001 and index-999990, but not logstash-2017.10.12. Following the same example, if my cluster also had rollover-000002 through rollover-000010 and index-888888 through index-999999, it will process both groups of indices, and include or exclude the count of each.

  • source – Source of index age. Can be one of name, creation_date, or field_stats. Default: creation_date

  • timestring – An strftime string to match the datestamp in an index name. Only used if source name is selected.

  • field – A timestamp field name. Only used if source field_stats is selected.

  • stats_result – Either min_value or max_value. Only used if source field_stats is selected. It determines whether to reference the minimum or maximum value of field in each index.

  • exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is True
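
The pattern parameter is easier to follow with a worked example. The sketch below groups index names by the single capture group and returns the names beyond count in each group. The function name is hypothetical, and this is only an approximation of the behavior described above, not Curator's code.

```python
import re
from collections import defaultdict


def beyond_count_by_group(indices, pattern, count, reverse=True):
    """Return index names beyond `count` within each capture-group bucket."""
    regex = re.compile(pattern)
    if regex.groups != 1:
        raise ValueError("pattern must have exactly one capture group")

    groups = defaultdict(list)
    for name in indices:
        match = regex.match(name)
        if match:
            # Names that share the captured text are counted together
            groups[match.group(1)].append(name)

    beyond = []
    for members in groups.values():
        ordered = sorted(members, reverse=reverse)
        beyond.extend(ordered[count:])
    return sorted(beyond)


indices = [
    "rollover-000001",
    "rollover-000002",
    "rollover-000003",
    "index-999990",
    "index-999991",
    "logstash-2017.10.12",  # does not match the pattern, so it is ignored
]
result = beyond_count_by_group(indices, r"^(.*)-\d{6}$", count=2)
print(result)  # ['rollover-000001']
```

Here the rollover group has three members, so one falls beyond the count of 2, while the index group has only two and contributes nothing.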

filter_by_regex(kind=None, value=None, exclude=False)

Match indices by regular expression (pattern).

Parameters:
  • kind – Can be one of: suffix, prefix, regex, or timestring. This option defines what kind of filter you will be building.

  • value – Depends on kind. It is the strftime string if kind is timestring. It’s used to build the regular expression for other kinds.

  • exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is False
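
To make the kind parameter concrete, here is a minimal sketch of how a value might be turned into a regular expression for the prefix, suffix, and regex kinds. The templates and the use of search are assumptions for illustration; the timestring kind is left out because it requires translating strftime directives into a pattern.

```python
import re

# Assumed templates for this sketch; the value is inserted as-is, not escaped.
REGEX_TEMPLATES = {
    "prefix": r"^{0}.*$",
    "suffix": r"^.*{0}$",
    "regex": r"{0}",
}


def build_pattern(kind, value):
    """Compile a pattern for the given kind of filter."""
    if kind not in REGEX_TEMPLATES:
        raise ValueError("unsupported kind in this sketch: " + kind)
    return re.compile(REGEX_TEMPLATES[kind].format(value))


def select(indices, kind, value):
    """Return the names that the built pattern finds a match in."""
    pattern = build_pattern(kind, value)
    return [name for name in indices if pattern.search(name)]


indices = ["logstash-2017.10.12", "my-logstash-1", "index-old", "index-old-2"]
print(select(indices, "prefix", "logstash-"))     # ['logstash-2017.10.12']
print(select(indices, "suffix", "-old"))          # ['index-old']
print(select(indices, "regex", r"\d{4}\.\d{2}"))  # ['logstash-2017.10.12']
```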

filter_by_shards(number_of_shards=None, shard_filter_behavior='greater_than', exclude=False)

Match indices with a given shard count.

Selects all indices with a shard count ‘greater_than’ number_of_shards by default. Use shard_filter_behavior to select indices with shard count ‘greater_than’, ‘greater_than_or_equal’, ‘less_than’, ‘less_than_or_equal’, or ‘equal’ to number_of_shards.

Parameters:
  • number_of_shards – shard threshold

  • shard_filter_behavior – Do you want to filter on greater_than, greater_than_or_equal, less_than, less_than_or_equal, or equal?

  • exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is False
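
The five shard_filter_behavior values map naturally onto comparison operators. The following sketch shows that mapping with the standard operator module; shard_matches and the BEHAVIORS table are hypothetical names used only for this example.

```python
import operator

# Each behavior name from the description, paired with a comparison function
BEHAVIORS = {
    "greater_than": operator.gt,
    "greater_than_or_equal": operator.ge,
    "less_than": operator.lt,
    "less_than_or_equal": operator.le,
    "equal": operator.eq,
}


def shard_matches(shard_count, number_of_shards, shard_filter_behavior="greater_than"):
    """Compare an index's shard count against the threshold."""
    compare = BEHAVIORS[shard_filter_behavior]
    return compare(shard_count, number_of_shards)


print(shard_matches(5, 3))                           # True
print(shard_matches(3, 3))                           # False
print(shard_matches(3, 3, "greater_than_or_equal"))  # True
print(shard_matches(3, 3, "equal"))                  # True
```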

filter_by_space(disk_space=None, reverse=True, use_age=False, source='creation_date', timestring=None, field=None, stats_result='min_value', exclude=False, threshold_behavior='greater_than')

Remove indices from the actionable list based on space consumed, sorted reverse-alphabetically by default. If you set reverse to False, it will be sorted alphabetically.

The default is usually what you will want. If only one kind of index is provided–for example, indices matching logstash-%Y.%m.%d–then reverse alphabetical sorting will mean the oldest will remain in the list, because lower numbers in the dates mean older indices.

If reverse is set to False, index3 will be deleted before index2, which will be deleted before index1.

use_age allows ordering indices by age. Age is determined by the index creation date by default, but you can specify a source of name, max_value, or min_value. The name source requires the timestring argument.

When threshold_behavior is set to greater_than (the default), an index is included if it tests as larger than disk_space. When set to less_than, an index is included if it tests as smaller than disk_space.

Parameters:
  • disk_space – Filter indices over n gigabytes

  • threshold_behavior – Size to filter, either greater_than or less_than. Defaults to greater_than to preserve backwards compatibility.

  • reverse – The filtering direction. (default: True). Ignored if use_age is True

  • use_age – Sort indices by age. source is required in this case.

  • source – Source of index age. Can be one of name, creation_date, or field_stats. Default: creation_date

  • timestring – An strftime string to match the datestamp in an index name. Only used if source name is selected.

  • field – A timestamp field name. Only used if source field_stats is selected.

  • stats_result – Either min_value or max_value. Only used if source field_stats is selected. It determines whether to reference the minimum or maximum value of field in each index.

  • exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is False
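
One way to picture a space-based filter is as a running total over the sorted indices: once the total passes disk_space, every further index is selected. The sketch below assumes that cumulative interpretation, which the description above does not spell out, and uses made-up sizes in gigabytes. The function name is hypothetical.

```python
def over_space_limit(sizes_gb, disk_space, reverse=True):
    """Return index names whose running total exceeds disk_space (assumed logic)."""
    selected = []
    running_total = 0.0
    for name in sorted(sizes_gb, reverse=reverse):
        running_total += sizes_gb[name]
        if running_total > disk_space:
            # This index pushed (or kept) the total over the limit
            selected.append(name)
    return selected


# Hypothetical index sizes in gigabytes
sizes_gb = {
    "logstash-2017.10.10": 4.0,
    "logstash-2017.10.11": 5.0,
    "logstash-2017.10.12": 3.0,
}
selected = over_space_limit(sizes_gb, disk_space=7)
print(selected)  # ['logstash-2017.10.11', 'logstash-2017.10.10']
```

With reverse sorting, the newest index is counted first, so under this reading the oldest indices are the ones left in the actionable list.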

filter_closed(exclude=True)

Filter out closed indices from indices

Parameters:

exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is True

filter_empty(exclude=True)

Filter indices with a document count of zero

Indices that are closed are automatically excluded from consideration due to closed indices reporting a document count of zero.

Parameters:

exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is True

filter_forceMerged(max_num_segments=None, exclude=True)

Match any index which has max_num_segments per shard or fewer in the actionable list.

Parameters:
  • max_num_segments – Cutoff number of segments per shard.

  • exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is True

filter_ilm(exclude=True)

Match indices that have the setting index.lifecycle.name

Parameters:

exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is True

filter_kibana(exclude=True)

Match any index named .kibana* in indices. Older releases addressed index names that no longer exist.

Parameters:

exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is True

filter_opened(exclude=True)

Filter out opened indices from indices

Parameters:

exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is True

filter_period(period_type='relative', source='name', range_from=None, range_to=None, date_from=None, date_to=None, date_from_format=None, date_to_format=None, timestring=None, unit=None, field=None, stats_result='min_value', intersect=False, week_starts_on='sunday', epoch=None, exclude=False)

Match indices with ages within a given period.

Parameters:
  • period_type – Can be either absolute or relative. Default is relative. date_from and date_to are required when using period_type='absolute'. range_from and range_to are required with period_type='relative'.

  • source – Source of index age. Can be one of ‘name’, ‘creation_date’, or ‘field_stats’

  • range_from – How many unit(s) in the past/future is the origin?

  • range_to – How many unit(s) in the past/future is the end point?

  • date_from – The simplified date for the start of the range

  • date_to – The simplified date for the end of the range. If this value is the same as date_from, the full value of unit will be extrapolated for the range. For example, if unit is months, and date_from and date_to are both 2017.01, then the entire month of January 2017 will be the absolute date range.

  • date_from_format – The strftime string used to parse date_from

  • date_to_format – The strftime string used to parse date_to

  • timestring – An strftime string to match the datestamp in an index name. Only used for index filtering by name.

  • unit – One of hours, days, weeks, months, or years.

  • field – A timestamp field name. Only used for field_stats based calculations.

  • stats_result – Either min_value or max_value. Only used in conjunction with source='field_stats' to choose whether to reference the minimum or maximum result value.

  • intersect – Only used when source='field_stats'. If True, only indices where both min_value and max_value are within the period will be selected. If False, it will use whichever you specified. Default is False to preserve expected behavior.

  • week_starts_on – Either sunday or monday. Default is sunday

  • epoch – An epoch timestamp used to establish a point of reference for calculations. If not provided, the current time will be used.

  • exclude – If exclude is True, this filter will remove matching indices from indices. If exclude is False, then only matching indices will be kept in indices. Default is False
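
The date_to example above (both dates set to 2017.01 with unit months) can be worked through with the standard library. This sketch assumes UTC and an inclusive end at the last second of the month; absolute_month_range is a hypothetical helper, not part of Curator.

```python
import calendar
from datetime import datetime, timezone


def absolute_month_range(date_from, date_to, date_format="%Y.%m"):
    """Return (start, end) epoch seconds covering whole months, in UTC."""
    start = datetime.strptime(date_from, date_format).replace(tzinfo=timezone.utc)
    end_month = datetime.strptime(date_to, date_format).replace(tzinfo=timezone.utc)
    # Extend the end date to the last second of its month
    last_day = calendar.monthrange(end_month.year, end_month.month)[1]
    end = end_month.replace(day=last_day, hour=23, minute=59, second=59)
    return int(start.timestamp()), int(end.timestamp())


start, end = absolute_month_range("2017.01", "2017.01")
print(start)  # 1483228800 (2017-01-01T00:00:00Z)
print(end)    # 1485907199 (2017-01-31T23:59:59Z)
```

An index would then fall within the period if its timestamp lies between start and end.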

index_info

Instance variable. Information extracted from indices, such as segment count, age, etc. Populated at instance creation time, and by other private helper methods, as needed. Type: dict()

indices

Instance variable. The running list of indices which will be used by an Action class. Populated at instance creation time. Type: list()

iterate_filters(filter_dict)

Iterate over the filters defined in config and execute them.

Parameters:

filter_dict – The configuration dictionary

Note

filter_dict should be a dictionary with the following form:

{ 'filters' : [
        {
            'filtertype': 'the_filter_type',
            'key1' : 'value1',
            ...
            'keyN' : 'valueN'
        }
    ]
}
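
To show how a dictionary of this form might drive a sequence of filter calls, the sketch below uses a toy class with a single made-up filter type. ToyIndexList, the prefix filter type, and the dispatch table are all hypothetical; Curator's own filter types and dispatch logic are not reproduced here.

```python
class ToyIndexList:
    """Hypothetical stand-in for IndexList; not Curator's implementation."""

    def __init__(self, indices):
        self.indices = list(indices)

    def filter_by_prefix(self, value=None, exclude=False):
        # Keep names whose prefix match differs from the exclude flag
        self.indices = [
            name for name in self.indices
            if name.startswith(value) != exclude
        ]

    def iterate_filters(self, filter_dict):
        methods = {"prefix": self.filter_by_prefix}
        for entry in filter_dict["filters"]:
            kwargs = dict(entry)  # copy so the caller's dictionary is not modified
            method = methods[kwargs.pop("filtertype")]
            method(**kwargs)


filter_dict = {
    "filters": [
        {"filtertype": "prefix", "value": "logstash-", "exclude": False},
        {"filtertype": "prefix", "value": "logstash-2017.10", "exclude": True},
    ]
}
toy = ToyIndexList(["logstash-2017.10.12", "logstash-2017.11.01", "metrics-1"])
toy.iterate_filters(filter_dict)
print(toy.indices)  # ['logstash-2017.11.01']
```

Each entry is applied in order, so later filters only see what earlier ones left behind.
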

working_list()

Return the current value of indices as copy-by-value to prevent list stomping during iterations
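
The reason for returning a copy is easiest to see by comparing it with iterating over the list directly while removing items. The toy class below is hypothetical and only mimics the copy-on-read idea.

```python
class ToyList:
    """Hypothetical stand-in that exposes a copy of its running list."""

    def __init__(self, indices):
        self.indices = list(indices)

    def working_list(self):
        # A shallow copy, so callers can iterate while self.indices changes
        return self.indices[:]


names = ["tmp-1", "tmp-2", "logs-1", "tmp-3"]

# Unsafe: removing from the list being iterated skips elements
unsafe = list(names)
for name in unsafe:
    if name.startswith("tmp-"):
        unsafe.remove(name)
print(unsafe)  # ['tmp-2', 'logs-1']

# Safe: iterate over the copy, modify the original
toy = ToyList(names)
for name in toy.working_list():
    if name.startswith("tmp-"):
        toy.indices.remove(name)
print(toy.indices)  # ['logs-1']
```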

SnapshotList

class curator.snapshotlist.SnapshotList(client, repository=None)
client

An Elasticsearch Client object. Also accessible as an instance variable.

empty_list_check()

Raise exception if snapshots is empty

filter_by_age(source='creation_date', direction=None, timestring=None, unit=None, unit_count=None, epoch=None, exclude=False)

Remove snapshots from snapshots by relative age calculations.

Parameters:
  • source – Source of snapshot age. Can be ‘name’, or ‘creation_date’.

  • direction – Time to filter, either older or younger

  • timestring – An strftime string to match the datestamp in a snapshot name. Only used for snapshot filtering by name.

  • unit – One of seconds, minutes, hours, days, weeks, months, or years.

  • unit_count – The number of units. unit_count * unit will be converted to a relative number of seconds.

  • epoch – An epoch timestamp used in conjunction with unit and unit_count to establish a point of reference for calculations. If not provided, the current time will be used.

  • exclude – If exclude is True, this filter will remove matching snapshots from snapshots. If exclude is False, then only matching snapshots will be kept in snapshots. Default is False

filter_by_count(count=None, reverse=True, use_age=False, source='creation_date', timestring=None, exclude=True)

Remove snapshots from the actionable list beyond the number count, sorted reverse-alphabetically by default. If you set reverse to False, it will be sorted alphabetically.

The default is usually what you will want. If only one kind of snapshot is provided–for example, snapshots matching curator-%Y%m%d%H%M%S– then reverse alphabetical sorting will mean the oldest will remain in the list, because lower numbers in the dates mean older snapshots.

If reverse is set to False, snapshot3 will be acted on before snapshot2, which will be acted on before snapshot1.

use_age allows ordering snapshots by age. Age is determined by the snapshot creation date (as identified by start_time_in_millis) by default, but you can also specify a source of name. The name source requires the timestring argument.

Parameters:
  • count – Filter snapshots beyond count.

  • reverse – The filtering direction. (default: True).

  • use_age – Sort snapshots by age. source is required in this case.

  • source – Source of snapshot age. Can be one of name, or creation_date. Default: creation_date

  • timestring – An strftime string to match the datestamp in a snapshot name. Only used if source name is selected.

  • exclude – If exclude is True, this filter will remove matching snapshots from snapshots. If exclude is False, then only matching snapshots will be kept in snapshots. Default is True
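
A short sketch of the sorting behavior described above, using snapshot names in the curator-%Y%m%d%H%M%S form. beyond_count is a hypothetical helper that returns the names beyond count after sorting. It ignores use_age and exclude.

```python
def beyond_count(snapshots, count, reverse=True):
    """Sort names and return those past the first `count` entries."""
    ordered = sorted(snapshots, reverse=reverse)
    return ordered[count:]


snapshots = [
    "curator-20170201000000",
    "curator-20170101000000",
    "curator-20170301000000",
]
print(beyond_count(snapshots, count=2))                 # ['curator-20170101000000']
print(beyond_count(snapshots, count=2, reverse=False))  # ['curator-20170301000000']
```

With the default reverse sort, the oldest snapshot is the one left beyond the count; with reverse set to False, it is the newest.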

filter_by_regex(kind=None, value=None, exclude=False)

Filter out snapshots not matching the pattern, or in the case of exclude, filter those matching the pattern.

Parameters:
  • kind – Can be one of: suffix, prefix, regex, or timestring. This option defines what kind of filter you will be building.

  • value – Depends on kind. It is the strftime string if kind is timestring. It’s used to build the regular expression for other kinds.

  • exclude – If exclude is True, this filter will remove matching snapshots from snapshots. If exclude is False, then only matching snapshots will be kept in snapshots. Default is False

filter_by_state(state=None, exclude=False)

Filter out snapshots not matching state, or in the case of exclude, filter those matching state.

Parameters:
  • state – The snapshot state to filter for. Must be one of SUCCESS, PARTIAL, FAILED, or IN_PROGRESS.

  • exclude – If exclude is True, this filter will remove matching snapshots from snapshots. If exclude is False, then only matching snapshots will be kept in snapshots. Default is False

filter_period(period_type='relative', source='name', range_from=None, range_to=None, date_from=None, date_to=None, date_from_format=None, date_to_format=None, timestring=None, unit=None, week_starts_on='sunday', epoch=None, exclude=False)

Match snapshots with ages within a given period.

Parameters:
  • period_type – Can be either absolute or relative. Default is relative. date_from and date_to are required when using period_type='absolute'. range_from and range_to are required with period_type='relative'.

  • source – Source of snapshot age. Can be ‘name’, or ‘creation_date’.

  • range_from – How many unit(s) in the past/future is the origin?

  • range_to – How many unit(s) in the past/future is the end point?

  • date_from – The simplified date for the start of the range

  • date_to – The simplified date for the end of the range. If this value is the same as date_from, the full value of unit will be extrapolated for the range. For example, if unit is months, and date_from and date_to are both 2017.01, then the entire month of January 2017 will be the absolute date range.

  • date_from_format – The strftime string used to parse date_from

  • date_to_format – The strftime string used to parse date_to

  • timestring – An strftime string to match the datestamp in a snapshot name. Only used for snapshot filtering by name.

  • unit – One of hours, days, weeks, months, or years.

  • week_starts_on – Either sunday or monday. Default is sunday

  • epoch – An epoch timestamp used to establish a point of reference for calculations. If not provided, the current time will be used.

  • exclude – If exclude is True, this filter will remove matching snapshots from snapshots. If exclude is False, then only matching snapshots will be kept in snapshots. Default is False

iterate_filters(config)

Iterate over the filters defined in config and execute them.

Parameters:

config – A dictionary of filters, as extracted from the YAML configuration file.

Note

config should be a dictionary with the following form:

{ 'filters' : [
        {
            'filtertype': 'the_filter_type',
            'key1' : 'value1',
            ...
            'keyN' : 'valueN'
        }
    ]
}

most_recent()

Return the most recent snapshot based on start_time_in_millis.
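
Selecting by start_time_in_millis can be sketched with max over a dictionary. The shape of snapshot_info used here (snapshot name mapped to a dictionary containing start_time_in_millis) is an assumption for illustration, as are the sample names and timestamps.

```python
def most_recent(snapshot_info):
    """Return the snapshot name with the largest start_time_in_millis."""
    if not snapshot_info:
        raise ValueError("no snapshots to choose from")
    return max(
        snapshot_info,
        key=lambda name: snapshot_info[name]["start_time_in_millis"],
    )


# Hypothetical snapshot metadata
snapshot_info = {
    "snap-a": {"start_time_in_millis": 1507766400000},
    "snap-b": {"start_time_in_millis": 1507852800000},
    "snap-c": {"start_time_in_millis": 1507680000000},
}
latest = most_recent(snapshot_info)
print(latest)  # snap-b
```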

repository

An Elasticsearch repository. Also accessible as an instance variable.

snapshot_info

Instance variable. Information extracted from snapshots, such as age, etc. Populated by internal method __get_snapshots at instance creation time. Type: dict()

snapshots

Instance variable. The running list of snapshots which will be used by an Action class. Populated by internal method __get_snapshots at instance creation time. Type: list()

working_list()

Return the current value of snapshots as copy-by-value to prevent list stomping during iterations