Spikekill is designed to overwrite specific data points in one or more RRDfiles. You should use great caution when running Spikekill, and you should always ensure you have proper backups from which to restore your RRDfiles if Spikekill performs a modification that you did not expect.
Spikekill is a tool used to remove spikes in a graph. Spikes can appear in a graph after a device reboots, or when you switch from 32-bit to 64-bit interface counters on a device. Spikekill works by statistically analyzing the data contained inside an RRDfile, and overwriting specific data points. It offers 4 methods by which to analyze the data and selectively overwrite data points:
Standard Deviation
Calculates the standard deviation of the data, and overwrites data points that are N times higher than the standard deviation.
Variance Average
Calculates the average value of the data, and overwrites data points that are N percent higher than the average.
GapFill
Finds gaps in the data (missing data points) as well as data points that are N percent higher than the average, and overwrites both, but only inside the specified time range.
Float
Overwrites all data points that are within a specified time range.
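The two statistical methods can be illustrated with a minimal Python sketch. This is not Spikekill's own code: the function names are hypothetical, the sample values are invented, and the exact cutoff formulas (average plus N standard deviations, and average increased by N percent) are assumptions about how the descriptions above translate into arithmetic.

```python
from statistics import mean, pstdev

def stddev_spikes(values, n_stddevs):
    """Return indexes of points above mean + n_stddevs * stddev (assumed cutoff)."""
    cutoff = mean(values) + n_stddevs * pstdev(values)
    return [i for i, v in enumerate(values) if v > cutoff]

def variance_spikes(values, percent):
    """Return indexes of points more than `percent` percent above the average."""
    cutoff = mean(values) * (1 + percent / 100)
    return [i for i, v in enumerate(values) if v > cutoff]

# Invented traffic samples with one obvious spike at index 5.
samples = [10, 12, 11, 9, 10, 500, 11, 10]
print(stddev_spikes(samples, 2))     # [5]
print(variance_spikes(samples, 500)) # [5]
```

Note that the spike itself raises both the average and the standard deviation, which is one reason the choice of N matters.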
Spikekill is easily run on a graph with a couple of mouse clicks in the Cacti GUI:
On Cacti's main Graph tab, click the Spikekill icon next to a graph.
In the drop-down menu that appears, review your current settings by hovering on Settings and reviewing each item, making changes if desired.
Run Spikekill by choosing one of the four methods available in the drop-down menu. Spikekill runs immediately, and the graph may be modified, depending on your settings. The graph is refreshed when Spikekill is finished.
Spikekill also has a flexible command line interface. The following example will help you get started.
shell>php cli/removespikes.php --help
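Because Spikekill overwrites data in place, it can help to copy an RRDfile before running Spikekill against it. The following Python sketch shows one possible way to do that; backup_rrd is a hypothetical helper and is not part of Cacti, and the file used in the demonstration is a placeholder rather than a real RRDfile.

```python
import shutil
import tempfile
import time
from pathlib import Path

def backup_rrd(rrd_path, backup_dir):
    """Copy one file into backup_dir with a timestamp suffix; return the copy's path."""
    src = Path(rrd_path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.name}.{stamp}.bak"
    shutil.copy2(src, dest)  # copy2 also preserves the modification time
    return dest

# Demonstration on a throwaway placeholder file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "example.rrd"
    original.write_bytes(b"placeholder, not a real RRD")
    copy = backup_rrd(original, Path(tmp) / "backups")
    print(copy.name)
```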
Spikekill requires certain values to successfully calculate which data points to overwrite. The following settings are customizable:
Replacement Method
When Spikekill identifies data points to overwrite, the value of Replacement Method represents the data that will be written in place of each data point.
When Replacement Method is set to NaN, running Spikekill using GapFill or Float methods will perform no modifications. Please be aware of Issue #3673, where 'Last Known Good' is always used, regardless of setting.
Standard Deviations
When using the Standard Deviation method, the value of Standard Deviations is the coefficient which determines how many times above the standard deviation the data must be to be considered a spike.
Variance Percentage
When using the Variance Average method, the value of Variance Percentage is the coefficient which determines how much higher than the average the data must be to be considered a spike.
Variance Outliers
When using the Variance Average method, the highest N values and the lowest N values are considered outliers. These outliers are ignored when calculating the average of the data.
Kills Per RRA
Spikekill will limit the number of data points that it overwrites to the value of Kills Per RRA. For reference, a single RRDfile can contain multiple data sources and multiple archives. Therefore, the total number of data points that could be overwritten in a single RRDfile will be determined by the formula:
(number_of_RRAs) x (number_of_data_sources) x (Kills_Per_RRA)
This only applies when using the Standard Deviation or Variance Average methods, and does not apply when using the GapFill or Float methods.
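As a worked instance of the Kills Per RRA formula, the short Python sketch below computes the upper bound for an RRDfile with 9 RRAs and 2 data sources. The function name is hypothetical and the Kills Per RRA value of 5 is chosen only for illustration.

```python
def max_overwrites(num_rras, num_data_sources, kills_per_rra):
    """Upper bound on overwritten data points in one RRDfile, per the formula above."""
    return num_rras * num_data_sources * kills_per_rra

print(max_overwrites(9, 2, 5))  # 90
```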
The above settings are maintained individually for each Cacti user. The default values can be changed globally in the Cacti settings, found at Configuration > Settings > Spikes. Users can change their individual settings by selecting different choices in the Spikekill menu which appears next to a graph.
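Of the settings above, Variance Outliers may be the least intuitive. The following Python sketch shows one way an average that ignores the N highest and N lowest values could be computed. trimmed_average is a hypothetical helper, not a function from Spikekill, and the sample values are invented.

```python
def trimmed_average(values, outliers):
    """Average of `values` after dropping the `outliers` highest and lowest values."""
    ordered = sorted(values)
    kept = ordered[outliers:len(ordered) - outliers]
    if not kept:
        raise ValueError("too few values for the requested number of outliers")
    return sum(kept) / len(kept)

samples = [10, 12, 11, 9, 10, 500, 11, 10]
print(trimmed_average(samples, 0))            # 71.625
print(round(trimmed_average(samples, 1), 2))  # 10.67
```

With no outliers removed, the single spike pulls the average far above the typical values; removing one value from each end gives an average much closer to normal traffic.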
When Spikekill runs, it analyzes each data source (DS) of each round-robin archive (RRA) independently. During analysis, it calculates values for average and standard deviation for each RRA:DS pair.
To understand this better, consider the following example.
You have a round-robin database (RRD) file that stores traffic levels on a router interface.
You have two DSes, which are traffic_in and traffic_out.
You also have selected three profiles, which are 1 minute for 7 days, 15 minutes for 5 weeks, and 1 hour for 3 years.
Lastly, you have three consolidation functions (CF), which are average, min, and max.
In this example, you have 9 RRAs (3 profiles multiplied by 3 CFs). Therefore, Spikekill will calculate 18 averages and 18 standard deviation values (9 RRAs multiplied by 2 DSes). Data points for each RRA:DS pair will be compared against their respective average and standard deviation values.
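The counts in this example can be checked by enumerating the combinations. The Python sketch below does only that; the profile labels are informal names made up for this illustration, not identifiers used by Cacti or RRDtool.

```python
from itertools import product

profiles = ["1min/7days", "15min/5weeks", "1hour/3years"]  # informal labels
cfs = ["average", "min", "max"]
data_sources = ["traffic_in", "traffic_out"]

rras = list(product(profiles, cfs))        # one RRA per profile and CF
pairs = list(product(rras, data_sources))  # one RRA:DS pair per RRA and DS

print(len(rras), len(pairs))  # 9 18
```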
Spikekill searches for spikes in the same order as rrddump exports data, which is sequentially by RRA, from oldest to newest data points within each RRA. Data points are overwritten if they exceed the maximum allowed threshold or if they do not reach the minimum required threshold.
Spikekill will cease overwriting data points for any individual RRA when it has reached the value of Kills Per RRA for each DS of that RRA, in line with the Kills Per RRA formula given earlier. It will then begin the process again on the next RRA.
This means that for any given RRA, older spikes will always be overwritten before newer spikes. However, newer spikes in one RRA can still be overwritten before older spikes in a different RRA, since the thresholds are different per RRA:DS pair.
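The effect of the scan order and the kill limit can be sketched in Python for a single RRA. This is an illustration under stated assumptions, not Spikekill's implementation: kill_spikes is a hypothetical function, the data layout (one row per timestamp, one value per DS) is assumed, the limit is assumed to be tracked per RRA:DS pair as the Kills Per RRA formula implies, and the replacement value is passed in as a plain number.

```python
def kill_spikes(rows, cutoffs, kills_per_rra, replacement):
    """Scan one RRA oldest-first and overwrite values outside their cutoffs.

    rows: list of lists, one inner list per timestamp, one value per DS.
    cutoffs: list of (minimum, maximum) tuples, one per DS.
    Returns the number of overwritten points per DS; rows is modified in place.
    """
    kills = [0] * len(cutoffs)
    for row in rows:  # rows are assumed to be ordered oldest to newest
        for ds, value in enumerate(row):
            if kills[ds] >= kills_per_rra:
                continue  # limit already reached for this RRA:DS pair
            low, high = cutoffs[ds]
            if value < low or value > high:
                row[ds] = replacement
                kills[ds] += 1
    return kills

# Invented data: two DSes, three spikes in the first DS, one in the second.
rows = [[10, 5], [900, 5], [11, 6], [800, 700], [950, 5]]
kills = kill_spikes(rows, [(0, 100), (0, 100)], kills_per_rra=2, replacement=10.0)
print(kills)  # [2, 1]
print(rows)   # [[10, 5], [10.0, 5], [11, 6], [10.0, 10.0], [950, 5]]
```

In this run the two oldest spikes in the first DS are overwritten, while the newest one (950) survives because the limit of 2 was already reached.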