Utility & Helper Methods¶
- class curator.utils.TimestringSearch(timestring)¶
An object to allow repetitive search against a string, searchme, without having to repeatedly recreate the regex.
- Parameters:
timestring – An strftime pattern
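The benefit described above, building the regex once and reusing it for every search, can be sketched as follows. This is a minimal illustration rather than Curator's implementation: `DIRECTIVE_PATTERNS`, `strftime_to_regex`, and `TimestampFinder` are hypothetical names, and the directive-to-regex mapping is an assumption.

```python
import re

# Assumed mapping from strftime directives to regex fragments (illustrative only).
DIRECTIVE_PATTERNS = {
    "Y": r"\d{4}", "y": r"\d{2}", "m": r"\d{2}", "W": r"\d{2}",
    "d": r"\d{2}", "H": r"\d{2}", "M": r"\d{2}", "S": r"\d{2}",
    "j": r"\d{3}",
}


def strftime_to_regex(timestring):
    """Translate a strftime pattern into a regex string."""
    parts = []
    i = 0
    while i < len(timestring):
        char = timestring[i]
        nxt = timestring[i + 1] if i + 1 < len(timestring) else ""
        if char == "%" and nxt in DIRECTIVE_PATTERNS:
            parts.append(DIRECTIVE_PATTERNS[nxt])
            i += 2
        else:
            # Literal characters are escaped so that "." matches only a dot.
            parts.append(re.escape(char))
            i += 1
    return "".join(parts)


class TimestampFinder:
    """Hypothetical stand-in: compile the regex once, search many strings."""

    def __init__(self, timestring):
        self.pattern = re.compile("(?P<date>" + strftime_to_regex(timestring) + ")")

    def find(self, searchme):
        match = self.pattern.search(searchme)
        return match.group("date") if match else None


finder = TimestampFinder("%Y.%m.%d")
print(finder.find("logstash-2017.01.31"))
```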
- curator.utils.absolute_date_range(unit, date_from, date_to, date_from_format=None, date_to_format=None)¶
Get the epoch start time and end time of a range of units, reckoning the start of the week (if that's the selected unit) based on week_starts_on, which can be either sunday or monday.
- Parameters:
unit – One of hours, days, weeks, months, or years.
date_from – The simplified date for the start of the range
date_to – The simplified date for the end of the range. If this value is the same as date_from, the full value of unit will be extrapolated for the range. For example, if unit is months, and date_from and date_to are both 2017.01, then the entire month of January 2017 will be the absolute date range.
date_from_format – The strftime string used to parse date_from
date_to_format – The strftime string used to parse date_to
- Return type:
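The month example above (date_from and date_to both 2017.01) can be worked through with the standard library. This sketch covers only the months unit, assumes UTC, and uses a hypothetical helper name, `month_bounds`; it is not Curator's code.

```python
import calendar
from datetime import datetime, timezone


def month_bounds(date_from, date_to, fmt="%Y.%m"):
    """Return (start_epoch, end_epoch) covering whole months, in UTC."""
    start = datetime.strptime(date_from, fmt).replace(tzinfo=timezone.utc)
    end = datetime.strptime(date_to, fmt).replace(tzinfo=timezone.utc)
    # Extend the end point to the last second of its month.
    last_day = calendar.monthrange(end.year, end.month)[1]
    end = end.replace(day=last_day, hour=23, minute=59, second=59)
    return int(start.timestamp()), int(end.timestamp())


start, end = month_bounds("2017.01", "2017.01")
print(start, end)  # 1483228800 1485907199
```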
- curator.utils.byte_size(num, suffix='B')¶
Return a formatted string indicating the size in bytes, with the proper unit, e.g. KB, MB, GB, TB, etc.
- Parameters:
num – The number of bytes
suffix – An arbitrary suffix, like Bytes
- Return type:
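A formatter of this kind might look like the sketch below. The function name `format_bytes`, the 1024 divisor, and the one-decimal output format are assumptions for illustration, not details confirmed by this reference.

```python
def format_bytes(num, suffix="B"):
    """Format a byte count with a binary-scaled unit prefix (illustrative)."""
    for unit in ["", "K", "M", "G", "T", "P", "E", "Z"]:
        if abs(num) < 1024.0:
            return "%3.1f%s%s" % (num, unit, suffix)
        num /= 1024.0
    return "%.1f%s%s" % (num, "Y", suffix)


print(format_bytes(1536))       # 1.5KB
print(format_bytes(1024 ** 3))  # 1.0GB
```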
- curator.utils.check_csv(value)¶
Some of the curator methods should not operate against multiple indices at once. This method can be used to check if a list or csv has been sent.
- Parameters:
value – The value to test, if list or csv string
- Return type:
- curator.utils.check_master(client, master_only=False)¶
Check if connected client is the elected master node of the cluster. If not, cleanly exit with a log message.
- Parameters:
client – An elasticsearch.Elasticsearch client object
- Return type:
None
- curator.utils.check_version(client)¶
Verify version is within acceptable range. Raise an exception if it is not.
- Parameters:
client – An elasticsearch.Elasticsearch client object
- Return type:
None
- curator.utils.chunk_index_list(indices)¶
This utility chunks very large index lists into 3KB chunks. It measures the size as a csv string, then converts back into a list for the return value.
- Parameters:
indices – A list of indices to act on.
- Return type:
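The chunking approach described above can be sketched roughly as follows. The helper name `chunk_names`, the 3072-character limit, and measuring size in characters (which equals bytes only for ASCII names) are assumptions.

```python
def chunk_names(indices, max_chars=3072):
    """Split a list of names into sub-lists whose csv form fits max_chars."""
    chunks = []
    current = ""
    for name in indices:
        if not current:
            current = name
        elif len(current) + 1 + len(name) <= max_chars:
            # The extra 1 accounts for the comma separator.
            current += "," + name
        else:
            chunks.append(current.split(","))
            current = name
    if current:
        chunks.append(current.split(","))
    return chunks


names = ["index-%06d" % i for i in range(1000)]
print([len(chunk) for chunk in chunk_names(names)])
```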
- curator.utils.create_repo_body(repo_type=None, compress=True, chunk_size=None, max_restore_bytes_per_sec=None, max_snapshot_bytes_per_sec=None, location=None, bucket=None, region=None, base_path=None, access_key=None, secret_key=None, **kwargs)¶
Build the ‘body’ portion for use in creating a repository.
- Parameters:
repo_type – The type of repository (presently only fs and s3)
compress – Turn on compression of the snapshot files. Compression is applied only to metadata files (index mapping and settings). Data files are not compressed. (Default: True)
chunk_size – The chunk size can be specified in bytes or by using size value notation, e.g. 1g, 10m, 5k. Defaults to null (unlimited chunk size).
max_restore_bytes_per_sec – Throttles per node restore rate. Defaults to 20mb per second.
max_snapshot_bytes_per_sec – Throttles per node snapshot rate. Defaults to 20mb per second.
location – Location of the snapshots. Required.
bucket – S3 only. The name of the bucket to be used for snapshots. Required.
region – S3 only. The region where bucket is located. Defaults to US Standard
base_path – S3 only. Specifies the path within bucket to repository data. Defaults to value of repositories.s3.base_path or to root directory if not set.
access_key – S3 only. The access key to use for authentication. Defaults to value of cloud.aws.access_key.
secret_key – S3 only. The secret key to use for authentication. Defaults to value of cloud.aws.secret_key.
- Returns:
A dictionary suitable for creating a repository from the provided arguments.
- Return type:
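As a rough picture of the returned dictionary, the sketch below assembles a repository body in the shape Elasticsearch's repository API expects (a type plus a settings object). The helper name `build_repo_body` and the decision to drop unset (None) settings are assumptions, not Curator's documented behavior.

```python
def build_repo_body(repo_type=None, **settings):
    """Assemble a repository body, leaving out settings that were not supplied."""
    if not repo_type:
        raise ValueError("repo_type is required")
    return {
        "type": repo_type,
        "settings": {k: v for k, v in settings.items() if v is not None},
    }


body = build_repo_body(
    repo_type="fs", location="/mnt/backups", compress=True, chunk_size=None
)
print(body)
```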
- curator.utils.create_repository(client, **kwargs)¶
Create repository with repository and body settings
- Parameters:
client – An elasticsearch.Elasticsearch client object
repository – The Elasticsearch snapshot repository to use
repo_type – The type of repository (presently only fs and s3)
compress – Turn on compression of the snapshot files. Compression is applied only to metadata files (index mapping and settings). Data files are not compressed. (Default: True)
chunk_size – The chunk size can be specified in bytes or by using size value notation, e.g. 1g, 10m, 5k. Defaults to null (unlimited chunk size).
max_restore_bytes_per_sec – Throttles per node restore rate. Defaults to 20mb per second.
max_snapshot_bytes_per_sec – Throttles per node snapshot rate. Defaults to 20mb per second.
location – Location of the snapshots. Required.
bucket – S3 only. The name of the bucket to be used for snapshots. Required.
region – S3 only. The region where bucket is located. Defaults to US Standard
base_path – S3 only. Specifies the path within bucket to repository data. Defaults to value of repositories.s3.base_path or to root directory if not set.
access_key – S3 only. The access key to use for authentication. Defaults to value of cloud.aws.access_key.
secret_key – S3 only. The secret key to use for authentication. Defaults to value of cloud.aws.secret_key.
skip_repo_fs_check – Skip verifying the repo after creation.
- Returns:
A boolean value indicating success or failure.
- Return type:
- curator.utils.create_snapshot_body(indices, ignore_unavailable=False, include_global_state=True, partial=False)¶
Create the request body for creating a snapshot from the provided arguments.
- Parameters:
indices – A single index, or list of indices to snapshot.
ignore_unavailable (bool) – Ignore unavailable shards/indices. (default: False)
include_global_state (bool) – Store cluster global state with snapshot. (default: True)
partial (bool) – Do not fail if primary shard is unavailable. (default: False)
- Return type:
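A body of this kind could be assembled as in the sketch below. The key names follow the Elasticsearch create-snapshot API; the helper name `build_snapshot_body` and the choice to send indices as a comma-separated string are assumptions.

```python
def build_snapshot_body(indices, ignore_unavailable=False,
                        include_global_state=True, partial=False):
    """Build a snapshot request body from a single index or a list of indices."""
    names = [indices] if isinstance(indices, str) else list(indices)
    return {
        "indices": ",".join(names),
        "ignore_unavailable": ignore_unavailable,
        "include_global_state": include_global_state,
        "partial": partial,
    }


print(build_snapshot_body(["logs-2017.01.01", "logs-2017.01.02"]))
```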
- curator.utils.date_range(unit, range_from, range_to, epoch=None, week_starts_on='sunday')¶
Get the epoch start time and end time of a range of units, reckoning the start of the week (if that's the selected unit) based on week_starts_on, which can be either sunday or monday.
- Parameters:
unit – One of hours, days, weeks, months, or years.
range_from – How many unit(s) in the past/future is the origin?
range_to – How many unit(s) in the past/future is the end point?
epoch – An epoch timestamp used to establish a point of reference for calculations.
week_starts_on – Either sunday or monday. Default is sunday
- Return type:
- curator.utils.ensure_list(indices)¶
Return a list, even if indices is a single value
- Parameters:
indices – A list of indices to act upon
- Return type:
- curator.utils.find_snapshot_tasks(client)¶
Check if there is snapshot activity in the Tasks API. Return True if activity is found, or False otherwise.
- Parameters:
client – An elasticsearch.Elasticsearch client object
- Return type:
- curator.utils.fix_epoch(epoch)¶
Fix value of epoch to be epoch, which should be 10 or fewer digits long.
- Parameters:
epoch – An epoch timestamp, in epoch + milliseconds, or microsecond, or even nanoseconds.
- Return type:
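One way to reduce a millisecond, microsecond, or nanosecond timestamp to a value of 10 or fewer digits is sketched below. The helper name `to_seconds` and the digit-count thresholds are assumptions for illustration only.

```python
def to_seconds(epoch):
    """Reduce an epoch value to whole seconds based on its digit count."""
    epoch = int(epoch)
    digits = len(str(abs(epoch)))
    if digits <= 10:
        return epoch                   # already in seconds
    if digits <= 13:
        return epoch // 1000           # milliseconds
    if digits <= 16:
        return epoch // 1000000        # microseconds
    if digits <= 19:
        return epoch // 1000000000     # nanoseconds
    raise ValueError("Unrecognized epoch value: %s" % epoch)


print(to_seconds(1483228800123))  # 1483228800
```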
- curator.utils.get_client(**kwargs)¶
NOTE: AWS IAM parameters aws_sign_request and aws_region are provided to facilitate request signing. The credentials will be fetched from the local environment as per the AWS documentation: http://amzn.to/2fRCGCt
AWS IAM parameters aws_key, aws_secret_key, and aws_region are provided for users that still have their keys included in the Curator config file.
Return an elasticsearch.Elasticsearch client object using the provided parameters. Any of the keyword arguments the elasticsearch.Elasticsearch client object can receive are valid, such as:
- Parameters:
hosts (list) – A list of one or more Elasticsearch client hostnames or IP addresses to connect to. Can send a single host.
port (int) – The Elasticsearch client port to connect to.
url_prefix (str) – Optional url prefix, if needed to reach the Elasticsearch API (i.e., it’s not at the root level)
use_ssl (bool) – Whether to connect to the client via SSL/TLS
certificate – Path to SSL/TLS certificate
client_cert – Path to SSL/TLS client certificate (public key)
client_key – Path to SSL/TLS private key
aws_key – AWS IAM Access Key (Only used if the requests-aws4auth python module is installed)
aws_secret_key – AWS IAM Secret Access Key (Only used if the requests-aws4auth python module is installed)
aws_region – AWS Region (Only used if the requests-aws4auth python module is installed)
aws_sign_request – Sign request to AWS (Only used if the requests-aws4auth and boto3 python modules are installed)
aws_region – AWS Region where the cluster exists (Only used if the requests-aws4auth and boto3 python modules are installed)
ssl_no_validate (bool) – If True, do not validate the certificate chain. This is an insecure option and you will see warnings in the log output.
http_auth (str) – Authentication credentials in user:pass format.
timeout (int) – Number of seconds before the client will timeout.
master_only (bool) – If True, the client will only connect if the endpoint is the elected master node of the cluster. This option does not work if `hosts` has more than one value. It will raise an Exception in that case.
skip_version_test – If True, skip the version check as part of the client connection.
api_key (str) – value to be used in optional X-Api-key header when accessing Elasticsearch
- Return type:
- curator.utils.get_date_regex(timestring)¶
Return a regex string based on a provided strftime timestring.
- Parameters:
timestring – An strftime pattern
- Return type:
- curator.utils.get_datemath(client, datemath, random_element=None)¶
Return the parsed index name from datemath
- curator.utils.get_datetime(index_timestamp, timestring)¶
Return the datetime extracted from the index name, which is the index creation time.
- Parameters:
index_timestamp – The timestamp extracted from an index name
timestring – An strftime pattern
- Return type:
- curator.utils.get_indices(client)¶
Get the current list of indices from the cluster.
- Parameters:
client – An elasticsearch.Elasticsearch client object
- Return type:
- curator.utils.get_point_of_reference(unit, count, epoch=None)¶
Get a point-of-reference timestamp in epoch + milliseconds by deriving from a unit and a count, and an optional reference timestamp, epoch
- Parameters:
unit – One of seconds, minutes, hours, days, weeks, months, or years.
unit_count – The number of units. unit_count * unit will be calculated out to the relative number of seconds.
epoch – An epoch timestamp used in conjunction with unit and unit_count to establish a point of reference for calculations.
- Return type:
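The unit-times-count arithmetic can be illustrated with the sketch below. The helper name `point_of_reference`, the fixed lengths used for months (30 days) and years (365 days), and the choice to subtract the offset from the reference time are all assumptions.

```python
import time

# Assumed lengths in seconds; months and years are approximations.
SECONDS_PER_UNIT = {
    "seconds": 1,
    "minutes": 60,
    "hours": 3600,
    "days": 86400,
    "weeks": 7 * 86400,
    "months": 30 * 86400,
    "years": 365 * 86400,
}


def point_of_reference(unit, count, epoch=None):
    """Return the reference time minus count * unit, in seconds."""
    if unit not in SECONDS_PER_UNIT:
        raise ValueError("Invalid unit: %s" % unit)
    reference = time.time() if epoch is None else epoch
    return reference - SECONDS_PER_UNIT[unit] * count


print(point_of_reference("days", 2, epoch=1483228800))  # 1483056000
```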
- curator.utils.get_repository(client, repository='')¶
Return configuration information for the indicated repository.
- Parameters:
client – An elasticsearch.Elasticsearch client object
repository – The Elasticsearch snapshot repository to use
- Return type:
- curator.utils.get_snapshot(client, repository=None, snapshot='')¶
Return information about a snapshot (or a comma-separated list of snapshots). If no snapshot is specified, it will return all snapshots. If none exist, an empty dictionary will be returned.
- Parameters:
client – An elasticsearch.Elasticsearch client object
repository – The Elasticsearch snapshot repository to use
snapshot – The snapshot name, or a comma-separated list of snapshots
- Return type:
- curator.utils.get_snapshot_data(client, repository=None)¶
Get _all snapshots from repository and return a list.
- Parameters:
client – An elasticsearch.Elasticsearch client object
repository – The Elasticsearch snapshot repository to use
- Return type:
- curator.utils.get_version(client)¶
Return the ES version number as a tuple. Omits trailing tags like -dev, or Beta
- Parameters:
client – An elasticsearch.Elasticsearch client object
- Return type:
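Turning a version string into a tuple while dropping trailing tags might look like the sketch below. It works on a plain string rather than a live client, and `version_tuple` is a hypothetical helper name.

```python
import re


def version_tuple(version_string):
    """Parse 'major.minor.patch' from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if not match:
        raise ValueError("Unparseable version: %r" % version_string)
    # Anything after the numeric triple (e.g. -dev, -beta1) is ignored.
    return tuple(int(part) for part in match.groups())


print(version_tuple("5.0.0-beta1"))  # (5, 0, 0)
```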
- curator.utils.get_yaml(path)¶
Read the file identified by path and import its YAML contents.
- Parameters:
path – The path to a YAML configuration file.
- Return type:
- curator.utils.health_check(client, **kwargs)¶
This function calls client.cluster.health and, based on the args provided, will return True or False depending on whether that particular keyword appears in the output, and has the expected value. If multiple keys are provided, all must match for a True response.
- Parameters:
client – An elasticsearch.Elasticsearch client object
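The all-keys-must-match rule can be shown against a plain dictionary standing in for the cluster health response. `health_matches` is a hypothetical helper, and the sample dictionary is made-up illustrative data, not output from a real cluster.

```python
def health_matches(health, **expected):
    """Return True only if every expected key is present with the expected value."""
    if not expected:
        raise ValueError("At least one keyword argument is required")
    return all(
        key in health and health[key] == value
        for key, value in expected.items()
    )


# Illustrative stand-in for a cluster health response.
sample_health = {"status": "green", "relocating_shards": 0}
print(health_matches(sample_health, status="green", relocating_shards=0))  # True
print(health_matches(sample_health, status="yellow"))                      # False
```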
- curator.utils.index_size(client, idx, value='total')¶
Return the sum of either primaries or total shards for index idx
- Parameters:
client – An elasticsearch.Elasticsearch client object
idx – An Elasticsearch index
value – One of either primaries or total
- Return type:
integer
- curator.utils.is_master_node(client)¶
Return True if the connected client node is the elected master node in the Elasticsearch cluster, otherwise return False.
- Parameters:
client – An elasticsearch.Elasticsearch client object
- Return type:
- curator.utils.name_to_node_id(client, name)¶
Return the node_id of the node identified by name
- Parameters:
client – An elasticsearch.Elasticsearch client object
- Return type:
- curator.utils.node_id_to_name(client, node_id)¶
Return the name of the node identified by node_id
- Parameters:
client – An elasticsearch.Elasticsearch client object
- Return type:
- curator.utils.node_roles(client, node_id)¶
Return the list of roles assigned to the node identified by node_id
- Parameters:
client – An elasticsearch.Elasticsearch client object
- Return type:
- curator.utils.parse_date_pattern(name)¶
Scan and parse name for time.strftime() strings, replacing them with the associated value when found, but otherwise returning lowercase values, as uppercase snapshot names are not allowed. It will detect if the first character is a <, which would indicate name is going to be using Elasticsearch date math syntax, and skip accordingly.
The time.strftime() identifiers that Curator currently recognizes as acceptable include:
Y: A 4 digit year
y: A 2 digit year
m: The 2 digit month
W: The 2 digit week of the year
d: The 2 digit day of the month
H: The 2 digit hour of the day, in 24 hour notation
M: The 2 digit minute of the hour
S: The 2 digit second of the minute
j: The 3 digit day of the year
- Parameters:
name – A name, which can contain time.strftime() strings
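The substitution rules above can be sketched as follows. `render_name` is a hypothetical helper; passing the reference time explicitly and defaulting to UTC are assumptions made so the example is reproducible.

```python
from datetime import datetime, timezone

ALLOWED_IDENTIFIERS = "YymWdHMSj"


def render_name(name, now=None):
    """Expand recognized strftime identifiers and lowercase everything else."""
    if name.startswith("<"):
        # Looks like Elasticsearch date math; leave it untouched.
        return name
    now = now or datetime.now(timezone.utc)
    parts = []
    i = 0
    while i < len(name):
        char = name[i]
        nxt = name[i + 1] if i + 1 < len(name) else ""
        if char == "%" and nxt and nxt in ALLOWED_IDENTIFIERS:
            parts.append(now.strftime("%" + nxt))
            i += 2
        else:
            parts.append(char.lower())
            i += 1
    return "".join(parts)


moment = datetime(2017, 1, 31, 13, 5, 9, tzinfo=timezone.utc)
print(render_name("Snapshot-%Y.%m.%d", moment))  # snapshot-2017.01.31
```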
- curator.utils.parse_datemath(client, value)¶
Check if value is datemath. Parse it if it is. Return the bare value otherwise.
- curator.utils.prune_nones(mydict)¶
Remove keys from mydict whose values are None
- Parameters:
mydict – The dictionary to act on
- Return type:
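A dictionary comprehension is enough to express this. The sketch below uses a hypothetical name, `drop_none_values`, and returns a new dictionary rather than modifying the input, which is an assumption about the desired behavior.

```python
def drop_none_values(mydict):
    """Return a copy of mydict without the keys whose value is None."""
    return {key: value for key, value in mydict.items() if value is not None}


# Falsy values such as False or 0 are kept; only None is removed.
print(drop_none_values({"compress": False, "chunk_size": None, "retries": 0}))
```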
- curator.utils.read_file(myfile)¶
Read a file and return the resulting data.
- Parameters:
myfile – A file to read.
- Return type:
- curator.utils.relocate_check(client, index)¶
This function calls client.cluster.state with a given index to check if all of the shards for that index are in the STARTED state. It will return True if all shards, both primary and replica, are in the STARTED state, and it will return False if any primary or replica shard is in a different state.
- Parameters:
client – An elasticsearch.Elasticsearch client object
index – The index to check the index shards state.
- curator.utils.report_failure(exception)¶
Raise an exceptions.FailedExecution exception and include the original error message.
- Parameters:
exception – The upstream exception.
- Return type:
None
- curator.utils.repository_exists(client, repository=None)¶
Verify the existence of a repository
- Parameters:
client – An elasticsearch.Elasticsearch client object
repository – The Elasticsearch snapshot repository to use
- Return type:
- curator.utils.restore_check(client, index_list)¶
This function calls client.indices.recovery with the list of indices to check for complete recovery. It will return True if recovery of those indices is complete, and False otherwise. It is designed to fail fast: if a single shard is encountered that is still recovering (not in DONE stage), it will immediately return False, rather than complete iterating over the rest of the response.
- Parameters:
client – An elasticsearch.Elasticsearch client object
index_list – The list of indices to verify having been restored.
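The fail-fast behavior can be illustrated on a dictionary shaped like an indices recovery response (index name, then a list of shards, each with a stage). `recovery_complete` is a hypothetical helper, the sample data is invented for illustration, and treating an empty response as incomplete is an assumption.

```python
def recovery_complete(response):
    """Return False at the first shard that is not in the DONE stage."""
    if not response:
        # Assumption: an empty response means nothing has been reported yet.
        return False
    for index_data in response.values():
        for shard in index_data["shards"]:
            if shard["stage"] != "DONE":
                return False  # stop early; no need to inspect the rest
    return True


# Invented sample data in the assumed response shape.
sample = {
    "restored-a": {"shards": [{"stage": "DONE"}, {"stage": "DONE"}]},
    "restored-b": {"shards": [{"stage": "INDEX"}]},
}
print(recovery_complete(sample))  # False
```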
- curator.utils.rollable_alias(client, alias)¶
Ensure that alias is an alias, and points to an index that can use the _rollover API.
- Parameters:
client – An elasticsearch.Elasticsearch client object
alias – An Elasticsearch alias
- curator.utils.safe_to_snap(client, repository=None, retry_interval=120, retry_count=3)¶
Ensure there are no snapshots in progress. Pause and retry accordingly
- Parameters:
client – An elasticsearch.Elasticsearch client object
repository – The Elasticsearch snapshot repository to use
retry_interval – Number of seconds to delay between retries. Default: 120 (seconds)
retry_count – Number of attempts to make. Default: 3
- Return type:
- curator.utils.show_dry_run(ilo, action, **kwargs)¶
Log dry run output with the action which would have been executed.
- Parameters:
ilo – A curator.indexlist.IndexList
action – The action to be performed.
kwargs – Any other args to show in the log output
- curator.utils.single_data_path(client, node_id)¶
In order for a shrink to work, it should be on a single filesystem, as shards cannot span filesystems. Return True if the node has a single filesystem, and False otherwise.
- Parameters:
client – An elasticsearch.Elasticsearch client object
- Return type:
- curator.utils.snapshot_check(client, snapshot=None, repository=None)¶
This function calls client.snapshot.get and tests to see whether the snapshot is complete, and if so, with what status. It will log errors according to the result. If the snapshot is still IN_PROGRESS, it will return False. SUCCESS will be an INFO level message, PARTIAL nets a WARNING message, FAILED is an ERROR message, and all others will be a WARNING level message.
- Parameters:
client – An elasticsearch.Elasticsearch client object
snapshot – The name of the snapshot.
repository – The Elasticsearch snapshot repository to use
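The state-to-log-level mapping described above can be sketched with the standard logging module. `LOG_LEVELS` and `report_snapshot_state` are hypothetical names, the function takes the state string directly instead of calling the cluster, and returning True for every finished state is an assumption.

```python
import logging

# Log level for each finished state; anything unlisted falls back to WARNING.
LOG_LEVELS = {
    "SUCCESS": logging.INFO,
    "PARTIAL": logging.WARNING,
    "FAILED": logging.ERROR,
}


def report_snapshot_state(state, logger=None):
    """Log a finished snapshot state; return False while still in progress."""
    logger = logger or logging.getLogger(__name__)
    if state == "IN_PROGRESS":
        return False
    level = LOG_LEVELS.get(state, logging.WARNING)
    logger.log(level, "Snapshot finished with state %s", state)
    return True


print(report_snapshot_state("IN_PROGRESS"))  # False
print(report_snapshot_state("SUCCESS"))      # True
```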
- curator.utils.snapshot_in_progress(client, repository=None, snapshot=None)¶
Determine whether the provided snapshot in repository is IN_PROGRESS. If no value is provided for snapshot, then check all of them. Return snapshot if it is found to be in progress, or False
- Parameters:
client – An elasticsearch.Elasticsearch client object
repository – The Elasticsearch snapshot repository to use
snapshot – The snapshot name
- curator.utils.snapshot_running(client)¶
Return True if a snapshot is in progress, and False if not
- Parameters:
client – An elasticsearch.Elasticsearch client object
- Return type:
- curator.utils.task_check(client, task_id=None)¶
This function calls client.tasks.get with the provided task_id. If the task data contains 'completed': True, then it will return True. If the task is not completed, it will log some information about the task and return False.
- Parameters:
client – An elasticsearch.Elasticsearch client object
task_id – A task_id which ostensibly matches a task searchable in the tasks API.
- curator.utils.test_client_options(config)¶
Test whether the SSL/TLS files exist. Will raise an exception if the files cannot be read.
- Parameters:
config – A client configuration file data dictionary
- Return type:
None
- curator.utils.test_repo_fs(client, repository=None)¶
Test whether all nodes have write access to the repository
- Parameters:
client – An elasticsearch.Elasticsearch client object
repository – The Elasticsearch snapshot repository to use
- curator.utils.to_csv(indices)¶
Return a csv string from a list of indices, or a single value if only one value is present
- Parameters:
indices – A list of indices to act on, or a single value, which could be in the format of a csv string already.
- Return type:
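The list-to-csv conversion can be sketched in a few lines. `join_names` is a hypothetical helper; returning None for an empty list and preserving the input order are assumptions.

```python
def join_names(indices):
    """Return a csv string, or the lone value when only one is present."""
    names = indices if isinstance(indices, list) else [indices]
    if not names:
        return None
    if len(names) == 1:
        return names[0]
    return ",".join(names)


print(join_names(["index-a", "index-b"]))  # index-a,index-b
print(join_names("index-a,index-b"))       # already csv, returned unchanged
```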
- curator.utils.validate_actions(data)¶
Validate an Action configuration dictionary, as imported from actions.yml, for example.
The method returns a validated and sanitized configuration dictionary.
- Parameters:
data – The configuration dictionary
- Return type:
- curator.utils.validate_filters(action, filters)¶
Validate that the filters are appropriate for the action type, e.g. no index filters applied to a snapshot list.
- Parameters:
action – An action name
filters – A list of filters to test.
- curator.utils.verify_client_object(test)¶
Test if test is a proper elasticsearch.Elasticsearch client object and raise an exception if it is not.
- Parameters:
test – The variable or object to test
- Return type:
None
- curator.utils.verify_index_list(test)¶
Test if test is a proper curator.indexlist.IndexList object and raise an exception if it is not.
- Parameters:
test – The variable or object to test
- Return type:
None
- curator.utils.verify_snapshot_list(test)¶
Test if test is a proper curator.snapshotlist.SnapshotList object and raise an exception if it is not.
- Parameters:
test – The variable or object to test
- Return type:
None
- curator.utils.wait_for_it(client, action, task_id=None, snapshot=None, repository=None, index=None, index_list=None, wait_interval=9, max_wait=-1)¶
This function becomes one place to do all wait_for_completion type behaviors
- Parameters:
client – An elasticsearch.Elasticsearch client object
action – The action name that will identify how to wait
task_id – If the action provided a task_id, this is where it must be declared.
snapshot – The name of the snapshot.
repository – The Elasticsearch snapshot repository to use
wait_interval – How frequently the specified “wait” behavior will be polled to check for completion.
max_wait – Number of seconds the “wait” behavior will persist before giving up and raising an Exception. The default is -1, meaning it will try forever.
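The interaction between wait_interval and max_wait can be pictured as a generic polling loop. This is a sketch under assumptions, not Curator's implementation: `wait_until` is a hypothetical helper, the completion check is passed in as a callable, and the `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.

```python
import time


def wait_until(check, wait_interval=9, max_wait=-1,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() every wait_interval seconds until it returns True.

    A max_wait of -1 means poll forever; otherwise raise TimeoutError
    once max_wait seconds have elapsed.
    """
    start = clock()
    while True:
        if check():
            return True
        if max_wait != -1 and clock() - start >= max_wait:
            raise TimeoutError("Gave up after %s seconds" % max_wait)
        sleep(wait_interval)


# Usage with a check that succeeds on the third poll and a no-op sleep.
attempts = {"count": 0}


def ready():
    attempts["count"] += 1
    return attempts["count"] >= 3


print(wait_until(ready, wait_interval=1, sleep=lambda seconds: None))  # True
```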
- class curator.SchemaCheck(config, schema, test_what, location)¶
Validate config with the provided voluptuous schema. test_what and location are for reporting the results, in case of failure. If validation is successful, the method returns config as valid.