Class DescriptiveStatistics

  • All Implemented Interfaces:
    java.io.Serializable, StatisticalSummary
    Direct Known Subclasses:
    SynchronizedDescriptiveStatistics

    public class DescriptiveStatistics
    extends java.lang.Object
    implements StatisticalSummary, java.io.Serializable
    Maintains a dataset of values of a single variable and computes descriptive statistics based on stored data. The windowSize property sets a limit on the number of values that can be stored in the dataset. The default value, INFINITE_WINDOW, puts no limit on the size of the dataset. This value should be used with caution, as the backing store will grow without bound in this case. For very large datasets, SummaryStatistics, which does not store the dataset, should be used instead of this class. If windowSize is not INFINITE_WINDOW and more values are added than can be stored in the dataset, new values are added in a "rolling" manner, with new values replacing the "oldest" values in the dataset.

    Note: this class is not threadsafe. Use SynchronizedDescriptiveStatistics if concurrent access from multiple threads is required.
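The storage trade-off described above can be illustrated with a minimal sketch of a summary that keeps constant-size state instead of the dataset. The class `RunningSummary` and its methods are hypothetical names chosen for illustration; they are not the SummaryStatistics API, and the real class may compute its results differently.

```java
/** Hypothetical storage-free summary: O(1) state, no backing array. */
final class RunningSummary {
    private long n;                   // number of values seen so far
    private double sum;               // running sum
    private double min = Double.NaN;  // NaN until the first value arrives
    private double max = Double.NaN;

    /** Folds one value into the summary without storing it. */
    void add(double v) {
        if (n == 0) {
            min = v;
            max = v;
        } else {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        sum += v;
        n++;
    }

    long getN() { return n; }

    double getMin() { return min; }

    double getMax() { return max; }

    /** Mean of the values seen so far, or NaN if none were added. */
    double getMean() { return n == 0 ? Double.NaN : sum / n; }

    public static void main(String[] args) {
        RunningSummary s = new RunningSummary();
        for (double v : new double[] {4, 1, 9}) {
            s.add(v);
        }
        System.out.println(s.getN() + " values, min=" + s.getMin() + ", max=" + s.getMax());
    }
}
```

Because nothing is stored, order-dependent or rank-based results such as percentiles cannot be produced by a summary of this kind; that is the cost of the bounded memory.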

    Version:
    $Revision: 1054186 $ $Date: 2011-01-01 03:28:46 +0100 (Sat, 01 Jan 2011) $
    See Also:
    Serialized Form
    • Field Detail

      • INFINITE_WINDOW

        public static final int INFINITE_WINDOW
        Represents an infinite window size. When getWindowSize() returns this value, there is no limit on the number of data values that can be stored in the dataset.
        See Also:
        Constant Field Values
      • windowSize

        protected int windowSize
        Holds the window size.
    • Constructor Detail

      • DescriptiveStatistics

        public DescriptiveStatistics()
        Construct a DescriptiveStatistics instance with an infinite window.
      • DescriptiveStatistics

        public DescriptiveStatistics(int window)
        Construct a DescriptiveStatistics instance with the specified window.
        Parameters:
        window - the window size.
      • DescriptiveStatistics

        public DescriptiveStatistics(double[] initialDoubleArray)
        Construct a DescriptiveStatistics instance with an infinite window and the initial data values in double[] initialDoubleArray. If initialDoubleArray is null, this constructor is equivalent to DescriptiveStatistics().
        Parameters:
        initialDoubleArray - the initial double[].
      • DescriptiveStatistics

        public DescriptiveStatistics(DescriptiveStatistics original)
        Copy constructor. Construct a new DescriptiveStatistics instance that is a copy of original.
        Parameters:
        original - DescriptiveStatistics instance to copy
    • Method Detail

      • addValue

        public void addValue(double v)
        Adds the value to the dataset. If the dataset is at the maximum size (i.e., the number of stored elements equals the currently configured windowSize), the first (oldest) element in the dataset is discarded to make room for the new value.
        Parameters:
        v - the value to be added
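The eviction rule described for addValue can be sketched with a double-ended queue. This is an illustration only: `RollingWindow` and its `INFINITE` constant are hypothetical stand-ins, and the backing store used by the real class is not described on this page.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical fixed-capacity window that evicts the oldest value when full. */
final class RollingWindow {
    static final int INFINITE = -1;   // stand-in for INFINITE_WINDOW
    private final int windowSize;
    private final Deque<Double> values = new ArrayDeque<>();

    RollingWindow(int windowSize) {
        if (windowSize < 1 && windowSize != INFINITE) {
            throw new IllegalArgumentException("window size must be positive or INFINITE");
        }
        this.windowSize = windowSize;
    }

    /** Appends v, discarding the oldest element first if the window is full. */
    void addValue(double v) {
        if (windowSize != INFINITE && values.size() == windowSize) {
            values.removeFirst();
        }
        values.addLast(v);
    }

    /** Returns the stored values, oldest first, as a fresh array. */
    double[] getValues() {
        return values.stream().mapToDouble(Double::doubleValue).toArray();
    }

    public static void main(String[] args) {
        RollingWindow w = new RollingWindow(3);
        for (double v : new double[] {1, 2, 3, 4, 5}) {
            w.addValue(v);
        }
        System.out.println(Arrays.toString(w.getValues())); // prints [3.0, 4.0, 5.0]
    }
}
```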
      • removeMostRecentValue

        public void removeMostRecentValue()
        Removes the most recent value from the dataset.
      • replaceMostRecentValue

        public double replaceMostRecentValue(double v)
        Replaces the most recently stored value with the given value. There must be at least one element stored to call this method.
        Parameters:
        v - the value to replace the most recent stored value
        Returns:
        replaced value
      • getGeometricMean

        public double getGeometricMean()
        Returns the geometric mean of the available values.
        Returns:
        The geometric mean, or Double.NaN if no values have been added or if the product of the available values is less than or equal to 0.
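As a rough illustration of the return contract, a geometric mean can be computed as the exponential of the mean of the logarithms. The helper below is a hypothetical sketch, not the library's implementation; note that it rejects any single non-positive value, which is an assumption and is stricter than the product-based wording above.

```java
/** Hypothetical helper illustrating a geometric mean with NaN for invalid input. */
final class GeometricMeanSketch {

    /** Returns exp(mean(log x)), or NaN for empty input or any value <= 0. */
    static double geometricMean(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sumOfLogs = 0.0;
        for (double v : values) {
            if (!(v > 0)) {           // rejects zero, negatives, and NaN
                return Double.NaN;
            }
            sumOfLogs += Math.log(v);
        }
        return Math.exp(sumOfLogs / values.length);
    }

    public static void main(String[] args) {
        System.out.println(geometricMean(new double[] {2, 8}));
        System.out.println(geometricMean(new double[] {}));
    }
}
```

Summing logarithms rather than multiplying the values avoids overflow and underflow of the intermediate product for long datasets.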
      • getVariance

        public double getVariance()
        Returns the variance of the available values.
        Specified by:
        getVariance in interface StatisticalSummary
        Returns:
        The variance; Double.NaN if no values have been added, or 0.0 for a single-value set.
      • getStandardDeviation

        public double getStandardDeviation()
        Returns the standard deviation of the available values.
        Specified by:
        getStandardDeviation in interface StatisticalSummary
        Returns:
        The standard deviation; Double.NaN if no values have been added, or 0.0 for a single-value set.
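The special cases for variance and standard deviation can be sketched with a two-pass computation over the stored values. This assumes the bias-corrected (n - 1) definition, which this page does not state; `VarianceSketch` is a hypothetical name and not part of the library.

```java
/** Hypothetical two-pass variance and standard deviation over stored values. */
final class VarianceSketch {

    /** Bias-corrected variance; NaN for no values, 0.0 for a single value. */
    static double variance(double[] values) {
        int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        if (n == 1) {
            return 0.0;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;
        double squaredDeviations = 0.0;   // second pass: deviations from the mean
        for (double v : values) {
            double d = v - mean;
            squaredDeviations += d * d;
        }
        return squaredDeviations / (n - 1);
    }

    /** Square root of the variance, so the same special cases carry over. */
    static double standardDeviation(double[] values) {
        return Math.sqrt(variance(values));
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5};
        System.out.println(variance(data) + " " + standardDeviation(data));
    }
}
```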
      • getSkewness

        public double getSkewness()
        Returns the skewness of the available values. Skewness is a measure of the asymmetry of a given distribution.
        Returns:
        The skewness; Double.NaN if no values have been added, or 0.0 for a set of 2 or fewer values.
      • getKurtosis

        public double getKurtosis()
        Returns the kurtosis of the available values. Kurtosis is a measure of the "peakedness" of a distribution.
        Returns:
        The kurtosis; Double.NaN if no values have been added, or 0.0 for a set of 3 or fewer values.
      • getMax

        public double getMax()
        Returns the maximum of the available values.
        Specified by:
        getMax in interface StatisticalSummary
        Returns:
        The max or Double.NaN if no values have been added.
      • getMin

        public double getMin()
        Returns the minimum of the available values.
        Specified by:
        getMin in interface StatisticalSummary
        Returns:
        The min or Double.NaN if no values have been added.
      • getN

        public long getN()
        Returns the number of available values.
        Specified by:
        getN in interface StatisticalSummary
        Returns:
        The number of available values
      • getSum

        public double getSum()
        Returns the sum of the available values.
        Specified by:
        getSum in interface StatisticalSummary
        Returns:
        The sum or Double.NaN if no values have been added.
      • getSumsq

        public double getSumsq()
        Returns the sum of the squares of the available values.
        Returns:
        The sum of the squares or Double.NaN if no values have been added.
      • clear

        public void clear()
        Resets all statistics and storage.
      • getWindowSize

        public int getWindowSize()
        Returns the maximum number of values that can be stored in the dataset, or INFINITE_WINDOW (-1) if there is no limit.
        Returns:
        The current window size, or -1 if it is infinite.
      • setWindowSize

        public void setWindowSize(int windowSize)
        Sets the window size, which controls the number of values that contribute to the reported statistics. For example, if windowSize is set to 3 and the values {1,2,3,4,5} have been added in that order, then the available values are {3,4,5} and all reported statistics will be based on these values.
        Parameters:
        windowSize - the size of the window.
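The {1,2,3,4,5} example can be reproduced with a small sketch that trims an array to its most recent entries. It assumes that shrinking the window drops the oldest values, which is how the example reads; `WindowResizeSketch.applyWindow` is a hypothetical helper, not a library method.

```java
import java.util.Arrays;

/** Hypothetical helper that trims stored values to the most recent windowSize. */
final class WindowResizeSketch {
    static final int INFINITE = -1;   // stand-in for INFINITE_WINDOW

    /** Returns a fresh array holding at most windowSize of the newest values. */
    static double[] applyWindow(double[] valuesInInsertionOrder, int windowSize) {
        if (windowSize < 1 && windowSize != INFINITE) {
            throw new IllegalArgumentException("window size must be positive or INFINITE");
        }
        int n = valuesInInsertionOrder.length;
        if (windowSize == INFINITE || n <= windowSize) {
            return valuesInInsertionOrder.clone();   // nothing to discard
        }
        // Keep the tail: the oldest n - windowSize values are dropped.
        return Arrays.copyOfRange(valuesInInsertionOrder, n - windowSize, n);
    }

    public static void main(String[] args) {
        double[] kept = applyWindow(new double[] {1, 2, 3, 4, 5}, 3);
        System.out.println(Arrays.toString(kept)); // prints [3.0, 4.0, 5.0]
    }
}
```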
      • getValues

        public double[] getValues()
        Returns the current set of values in an array of double primitives. The order of addition is preserved. The returned array is a fresh copy of the underlying data -- i.e., it is not a reference to the stored data.
        Returns:
        the current set of numbers in the order in which they were added to this set
      • getSortedValues

        public double[] getSortedValues()
        Returns the current set of values in an array of double primitives, sorted in ascending order. The returned array is a fresh copy of the underlying data -- i.e., it is not a reference to the stored data.
        Returns:
        the current set of numbers sorted in ascending order
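The "fresh copy" guarantee of getValues() and getSortedValues() means callers can modify the returned arrays without affecting the stored data. A minimal sketch of that pattern, using a hypothetical `ValueStore` class rather than the library itself:

```java
import java.util.Arrays;

/** Hypothetical store whose accessors hand out defensive copies. */
final class ValueStore {
    private final double[] data;   // kept in insertion order

    ValueStore(double... data) {
        this.data = data.clone();  // copy on the way in as well
    }

    /** Values in insertion order; the caller owns the returned array. */
    double[] getValues() {
        return data.clone();
    }

    /** Values in ascending order; sorting the copy leaves the store untouched. */
    double[] getSortedValues() {
        double[] sorted = data.clone();
        Arrays.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        ValueStore store = new ValueStore(3, 1, 2);
        double[] copy = store.getValues();
        copy[0] = 99;   // modifies only the caller's copy
        System.out.println(Arrays.toString(store.getValues()));       // prints [3.0, 1.0, 2.0]
        System.out.println(Arrays.toString(store.getSortedValues())); // prints [1.0, 2.0, 3.0]
    }
}
```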
      • getElement

        public double getElement(int index)
        Returns the element at the specified index.
        Parameters:
        index - the index of the element
        Returns:
        the element at the specified index
      • getPercentile

        public double getPercentile(double p)
        Returns an estimate for the pth percentile of the stored values.

        The implementation provided here follows the first estimation procedure presented here.

        Preconditions:

        • 0 < p ≤ 100 (otherwise an IllegalArgumentException is thrown)
        • at least one value must be stored (returns Double.NaN otherwise)

        Parameters:
        p - the requested percentile (scaled from 0 - 100)
        Returns:
        An estimate for the pth percentile of the stored data
        Throws:
        java.lang.IllegalStateException - if percentile implementation has been overridden and the supplied implementation does not support setQuantile values
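The preconditions above can be made concrete with a sketch of one common estimation procedure: compute the 1-based position p(n + 1)/100 in the sorted data and interpolate linearly between the neighbouring values. Whether this matches the referenced procedure exactly is an assumption, since the reference is not reproduced on this page; `PercentileSketch` is a hypothetical name.

```java
import java.util.Arrays;

/** Hypothetical percentile estimate using position p(n + 1)/100 with interpolation. */
final class PercentileSketch {

    static double percentile(double[] values, double p) {
        if (!(p > 0 && p <= 100)) {
            throw new IllegalArgumentException("p must satisfy 0 < p <= 100: " + p);
        }
        int n = values.length;
        if (n == 0) {
            return Double.NaN;          // no data stored
        }
        if (n == 1) {
            return values[0];
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double pos = p * (n + 1) / 100.0;   // 1-based position in the sorted data
        if (pos < 1) {
            return sorted[0];               // below the first value: use the minimum
        }
        if (pos >= n) {
            return sorted[n - 1];           // at or past the last value: use the maximum
        }
        int k = (int) Math.floor(pos);
        double fraction = pos - k;
        double lower = sorted[k - 1];
        double upper = sorted[k];
        return lower + fraction * (upper - lower);
    }

    public static void main(String[] args) {
        double[] data = {5, 1, 4, 2, 3};
        System.out.println(percentile(data, 50)); // prints 3.0
        System.out.println(percentile(data, 25)); // prints 1.5
    }
}
```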
      • toString

        public java.lang.String toString()
        Generates a text report displaying univariate statistics from values that have been added. Each statistic is displayed on a separate line.
        Overrides:
        toString in class java.lang.Object
        Returns:
        String with line feeds displaying statistics
      • apply

        public double apply(UnivariateStatistic stat)
        Apply the given statistic to the data associated with this set of statistics.
        Parameters:
        stat - the statistic to apply
        Returns:
        the computed value of the statistic.
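The idea behind apply is that the caller supplies the computation and the object supplies the data. A minimal sketch of that arrangement follows; the `Statistic` interface and `ApplySketch` class are hypothetical stand-ins for UnivariateStatistic and DescriptiveStatistics, and the real interface may declare additional methods.

```java
import java.util.Arrays;

/** Hypothetical dataset that evaluates caller-supplied statistics on its values. */
final class ApplySketch {

    /** Stand-in for UnivariateStatistic: maps an array of values to one number. */
    interface Statistic {
        double evaluate(double[] values);
    }

    private final double[] data;

    ApplySketch(double... data) {
        this.data = data.clone();
    }

    /** Evaluates stat on a copy of the stored values, so stat cannot alter them. */
    double apply(Statistic stat) {
        return stat.evaluate(data.clone());
    }

    public static void main(String[] args) {
        ApplySketch stats = new ApplySketch(4, 1, 9);
        // A statistic the class does not provide itself: the range (max - min).
        Statistic range = v -> Arrays.stream(v).max().getAsDouble()
                - Arrays.stream(v).min().getAsDouble();
        System.out.println(stats.apply(range)); // prints 8.0
    }
}
```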
      • getMeanImpl

        public UnivariateStatistic getMeanImpl()
        Returns the currently configured mean implementation.
        Returns:
        the UnivariateStatistic implementing the mean
        Since:
        1.2
      • setMeanImpl

        public void setMeanImpl(UnivariateStatistic meanImpl)

        Sets the implementation for the mean.

        Parameters:
        meanImpl - the UnivariateStatistic instance to use for computing the mean
        Since:
        1.2
      • getGeometricMeanImpl

        public UnivariateStatistic getGeometricMeanImpl()
        Returns the currently configured geometric mean implementation.
        Returns:
        the UnivariateStatistic implementing the geometric mean
        Since:
        1.2
      • setGeometricMeanImpl

        public void setGeometricMeanImpl(UnivariateStatistic geometricMeanImpl)

        Sets the implementation for the geometric mean.

        Parameters:
        geometricMeanImpl - the UnivariateStatistic instance to use for computing the geometric mean
        Since:
        1.2
      • getKurtosisImpl

        public UnivariateStatistic getKurtosisImpl()
        Returns the currently configured kurtosis implementation.
        Returns:
        the UnivariateStatistic implementing the kurtosis
        Since:
        1.2
      • setKurtosisImpl

        public void setKurtosisImpl(UnivariateStatistic kurtosisImpl)

        Sets the implementation for the kurtosis.

        Parameters:
        kurtosisImpl - the UnivariateStatistic instance to use for computing the kurtosis
        Since:
        1.2
      • getMaxImpl

        public UnivariateStatistic getMaxImpl()
        Returns the currently configured maximum implementation.
        Returns:
        the UnivariateStatistic implementing the maximum
        Since:
        1.2
      • setMaxImpl

        public void setMaxImpl(UnivariateStatistic maxImpl)

        Sets the implementation for the maximum.

        Parameters:
        maxImpl - the UnivariateStatistic instance to use for computing the maximum
        Since:
        1.2
      • getMinImpl

        public UnivariateStatistic getMinImpl()
        Returns the currently configured minimum implementation.
        Returns:
        the UnivariateStatistic implementing the minimum
        Since:
        1.2
      • setMinImpl

        public void setMinImpl(UnivariateStatistic minImpl)

        Sets the implementation for the minimum.

        Parameters:
        minImpl - the UnivariateStatistic instance to use for computing the minimum
        Since:
        1.2
      • getPercentileImpl

        public UnivariateStatistic getPercentileImpl()
        Returns the currently configured percentile implementation.
        Returns:
        the UnivariateStatistic implementing the percentile
        Since:
        1.2
      • setPercentileImpl

        public void setPercentileImpl(UnivariateStatistic percentileImpl)
        Sets the implementation to be used by getPercentile(double). The supplied UnivariateStatistic must provide a setQuantile(double) method; otherwise IllegalArgumentException is thrown.
        Parameters:
        percentileImpl - the percentileImpl to set
        Throws:
        java.lang.IllegalArgumentException - if the supplied implementation does not provide a setQuantile method
        Since:
        1.2
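Because setQuantile(double) is not part of the UnivariateStatistic type in the signature, the requirement has to be checked at run time. One way such a check could be written is with reflection, as sketched below; this page does not say how the library performs it, and `QuantileCheckSketch`, `requireSetQuantile`, and `WithQuantile` are hypothetical names.

```java
/** Hypothetical run-time check that an object exposes setQuantile(double). */
final class QuantileCheckSketch {

    /** Throws IllegalArgumentException unless impl has a public setQuantile(double). */
    static void requireSetQuantile(Object impl) {
        try {
            impl.getClass().getMethod("setQuantile", double.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(
                    "percentile implementation must provide setQuantile(double)", e);
        }
    }

    /** Example implementation that satisfies the check. */
    public static final class WithQuantile {
        private double quantile = 50.0;

        public void setQuantile(double p) {
            this.quantile = p;
        }

        public double getQuantile() {
            return quantile;
        }
    }

    public static void main(String[] args) {
        requireSetQuantile(new WithQuantile());   // passes silently
        try {
            requireSetQuantile(new Object());     // java.lang.Object has no setQuantile
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```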
      • getSkewnessImpl

        public UnivariateStatistic getSkewnessImpl()
        Returns the currently configured skewness implementation.
        Returns:
        the UnivariateStatistic implementing the skewness
        Since:
        1.2
      • setSkewnessImpl

        public void setSkewnessImpl(UnivariateStatistic skewnessImpl)

        Sets the implementation for the skewness.

        Parameters:
        skewnessImpl - the UnivariateStatistic instance to use for computing the skewness
        Since:
        1.2
      • getVarianceImpl

        public UnivariateStatistic getVarianceImpl()
        Returns the currently configured variance implementation.
        Returns:
        the UnivariateStatistic implementing the variance
        Since:
        1.2
      • setVarianceImpl

        public void setVarianceImpl(UnivariateStatistic varianceImpl)

        Sets the implementation for the variance.

        Parameters:
        varianceImpl - the UnivariateStatistic instance to use for computing the variance
        Since:
        1.2
      • getSumsqImpl

        public UnivariateStatistic getSumsqImpl()
        Returns the currently configured sum of squares implementation.
        Returns:
        the UnivariateStatistic implementing the sum of squares
        Since:
        1.2
      • setSumsqImpl

        public void setSumsqImpl(UnivariateStatistic sumsqImpl)

        Sets the implementation for the sum of squares.

        Parameters:
        sumsqImpl - the UnivariateStatistic instance to use for computing the sum of squares
        Since:
        1.2
      • getSumImpl

        public UnivariateStatistic getSumImpl()
        Returns the currently configured sum implementation.
        Returns:
        the UnivariateStatistic implementing the sum
        Since:
        1.2
      • setSumImpl

        public void setSumImpl(UnivariateStatistic sumImpl)

        Sets the implementation for the sum.

        Parameters:
        sumImpl - the UnivariateStatistic instance to use for computing the sum
        Since:
        1.2
      • copy

        public DescriptiveStatistics copy()
        Returns a copy of this DescriptiveStatistics instance with the same internal state.
        Returns:
        a copy of this instance
      • copy

        public static void copy(DescriptiveStatistics source,
                                DescriptiveStatistics dest)
        Copies source to dest.

        Neither source nor dest can be null.

        Parameters:
        source - DescriptiveStatistics to copy
        dest - DescriptiveStatistics to copy to
        Throws:
        java.lang.NullPointerException - if either source or dest is null