org.apache.commons.math3.stat.inference.TTest

public class TTest extends Object

An implementation for Student's t-tests.

Tests can be:

One-sample or two-sample
One-sided or two-sided
Paired or unpaired (for two-sample tests)
Homoscedastic (equal variance assumption) or heteroscedastic (for two-sample tests)
Fixed significance level (boolean-valued) or returning p-values

Test statistics are available for all tests. Methods whose names include "Test" perform tests; all other methods return t-statistics. Among the "Test" methods, double-valued methods return p-values and boolean-valued methods perform fixed significance level tests. Significance levels are always specified as numbers in the interval (0, 0.5]; for example, a test at the 95% confidence level uses alpha = 0.05.

Input to tests can be either double[] arrays or StatisticalSummary instances.

Uses the commons-math TDistribution implementation to compute p-values.

Constructor Summary

Constructors

Constructor

Description

TTest()
Method Summary

Modifier and Type

Method

Description

protected double

df(double v1, double v2, double n1, double n2)

Computes approximate degrees of freedom for 2-sample t-test.

double

homoscedasticT(double[] sample1, double[] sample2)

Computes a 2-sample t statistic, under the hypothesis of equal subpopulation variances.

protected double

homoscedasticT(double m1, double m2, double v1, double v2, double n1, double n2)

Computes t test statistic for 2-sample t-test under the hypothesis of equal subpopulation variances.

double

homoscedasticT(StatisticalSummary sampleStats1, StatisticalSummary sampleStats2)

Computes a 2-sample t statistic, comparing the means of the datasets described by two StatisticalSummary instances, under the assumption of equal subpopulation variances.

double

homoscedasticTTest(double[] sample1, double[] sample2)

Returns the observed significance level, or p-value, associated with a two-sample, two-tailed t-test comparing the means of the input arrays, under the assumption that the two samples are drawn from subpopulations with equal variances.

boolean

homoscedasticTTest(double[] sample1, double[] sample2, double alpha)

Performs a two-sided t-test evaluating the null hypothesis that sample1 and sample2 are drawn from populations with the same mean, with significance level alpha, assuming that the subpopulation variances are equal.

protected double

homoscedasticTTest(double m1, double m2, double v1, double v2, double n1, double n2)

Computes p-value for 2-sided, 2-sample t-test, under the assumption of equal subpopulation variances.

double

homoscedasticTTest(StatisticalSummary sampleStats1, StatisticalSummary sampleStats2)

Returns the observed significance level, or p-value, associated with a two-sample, two-tailed t-test comparing the means of the datasets described by two StatisticalSummary instances, under the hypothesis of equal subpopulation variances.

double

pairedT(double[] sample1, double[] sample2)

Computes a paired, 2-sample t-statistic based on the data in the input arrays.

double

pairedTTest(double[] sample1, double[] sample2)

Returns the observed significance level, or p-value, associated with a paired, two-sample, two-tailed t-test based on the data in the input arrays.

boolean

pairedTTest(double[] sample1, double[] sample2, double alpha)

Performs a paired t-test evaluating the null hypothesis that the mean of the paired differences between sample1 and sample2 is 0 in favor of the two-sided alternative that the mean paired difference is not equal to 0, with significance level alpha.

double

t(double[] sample1, double[] sample2)

Computes a 2-sample t statistic, without the hypothesis of equal subpopulation variances.

double

t(double mu, double[] observed)

Computes a t statistic given observed values and a comparison constant.

protected double

t(double m, double mu, double v, double n)

Computes t test statistic for 1-sample t-test.

protected double

t(double m1, double m2, double v1, double v2, double n1, double n2)

Computes t test statistic for 2-sample t-test.

double

t(double mu, StatisticalSummary sampleStats)

Computes a t statistic to use in comparing the mean of the dataset described by sampleStats to mu.

double

t(StatisticalSummary sampleStats1, StatisticalSummary sampleStats2)

Computes a 2-sample t statistic, comparing the means of the datasets described by two StatisticalSummary instances, without the assumption of equal subpopulation variances.

double

tTest(double[] sample1, double[] sample2)

Returns the observed significance level, or p-value, associated with a two-sample, two-tailed t-test comparing the means of the input arrays.

boolean

tTest(double[] sample1, double[] sample2, double alpha)

Performs a two-sided t-test evaluating the null hypothesis that sample1 and sample2 are drawn from populations with the same mean, with significance level alpha.

double

tTest(double mu, double[] sample)

Returns the observed significance level, or p-value, associated with a one-sample, two-tailed t-test comparing the mean of the input array with the constant mu.

boolean

tTest(double mu, double[] sample, double alpha)

Performs a two-sided t-test evaluating the null hypothesis that the mean of the population from which sample is drawn equals mu, with significance level alpha.

protected double

tTest(double m, double mu, double v, double n)

Computes p-value for 2-sided, 1-sample t-test.

protected double

tTest(double m1, double m2, double v1, double v2, double n1, double n2)

Computes p-value for 2-sided, 2-sample t-test.

double

tTest(double mu, StatisticalSummary sampleStats)

Returns the observed significance level, or p-value, associated with a one-sample, two-tailed t-test comparing the mean of the dataset described by sampleStats with the constant mu.

boolean

tTest(double mu, StatisticalSummary sampleStats, double alpha)

Performs a two-sided t-test evaluating the null hypothesis that the mean of the population from which the dataset described by sampleStats is drawn equals mu, with significance level alpha.

double

tTest(StatisticalSummary sampleStats1, StatisticalSummary sampleStats2)

Returns the observed significance level, or p-value, associated with a two-sample, two-tailed t-test comparing the means of the datasets described by two StatisticalSummary instances.

boolean

tTest(StatisticalSummary sampleStats1, StatisticalSummary sampleStats2, double alpha)

Performs a two-sided t-test evaluating the null hypothesis that sampleStats1 and sampleStats2 describe datasets drawn from populations with the same mean, with significance level alpha.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- TTest
  
  public TTest()
Method Details
- pairedT
  
  public double pairedT(double[] sample1, double[] sample2) throws NullArgumentException, NoDataException, DimensionMismatchException, NumberIsTooSmallException
  Computes a paired, 2-sample t-statistic based on the data in the input arrays. The t-statistic returned is equivalent to what would be returned by computing the one-sample t-statistic t(double, double[]), with mu = 0 and the sample array consisting of the (signed) differences between corresponding entries in sample1 and sample2.
  Preconditions:
  
  The input arrays must have the same length and their common length must be at least 2.
  Parameters:
  
  sample1 - array of sample data values
  
  sample2 - array of sample data values
  
  Returns:
  
  t statistic
  
  Throws:
  
  NullArgumentException - if the arrays are null
  
  NoDataException - if the arrays are empty
  
  DimensionMismatchException - if the length of the arrays is not equal
  
  NumberIsTooSmallException - if the length of the arrays is < 2
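
The equivalence described above (a one-sample t-statistic of the signed differences against mu = 0) can be sketched in plain Java. This is an illustrative sketch, not the library's source; the class name `PairedTSketch` is hypothetical, and `IllegalArgumentException` stands in for the library's own exception types.

```java
/** Illustrative sketch only: plain-Java paired t statistic, not the library's source. */
final class PairedTSketch {

    /** One-sample t statistic of the differences sample1[i] - sample2[i], tested against mu = 0. */
    static double pairedT(double[] sample1, double[] sample2) {
        if (sample1.length != sample2.length || sample1.length < 2) {
            // Stand-in for DimensionMismatchException / NumberIsTooSmallException.
            throw new IllegalArgumentException("arrays must have equal length of at least 2");
        }
        int n = sample1.length;

        // Mean of the signed differences.
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += sample1[i] - sample2[i];
        }
        double meanDiff = sum / n;

        // Unbiased sample variance of the differences (divisor n - 1).
        double sumSq = 0.0;
        for (int i = 0; i < n; i++) {
            double dev = (sample1[i] - sample2[i]) - meanDiff;
            sumSq += dev * dev;
        }
        double variance = sumSq / (n - 1);

        // t = (mean - 0) / sqrt(variance / n)
        return meanDiff / Math.sqrt(variance / n);
    }
}
```

For sample1 = {2, 4, 6} and sample2 = {1, 2, 3} the differences are {1, 2, 3}, with mean 2 and variance 1, so the statistic is 2 / sqrt(1/3) = 2 sqrt(3).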
- pairedTTest
  
  public double pairedTTest(double[] sample1, double[] sample2) throws NullArgumentException, NoDataException, DimensionMismatchException, NumberIsTooSmallException, MaxCountExceededException
  Returns the observed significance level, or p-value, associated with a paired, two-sample, two-tailed t-test based on the data in the input arrays.
  The number returned is the smallest significance level at which one can reject the null hypothesis that the mean of the paired differences is 0 in favor of the two-sided alternative that the mean paired difference is not equal to 0. For a one-sided test, divide the returned value by 2.
  
  This test is equivalent to a one-sample t-test computed using tTest(double, double[]) with mu = 0 and the sample array consisting of the signed differences between corresponding elements of sample1 and sample2.
  
  Usage Note:
  The validity of the p-value depends on the assumptions of the parametric t-test procedure.
  
  Preconditions:
  
  The input array lengths must be the same and their common length must be at least 2.
  Parameters:
  
  sample1 - array of sample data values
  
  sample2 - array of sample data values
  
  Returns:
  
  p-value for t-test
  
  Throws:
  
  NullArgumentException - if the arrays are null
  
  NoDataException - if the arrays are empty
  
  DimensionMismatchException - if the length of the arrays is not equal
  
  NumberIsTooSmallException - if the length of the arrays is < 2
  
  MaxCountExceededException - if an error occurs computing the p-value
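
The halving rule for one-sided tests can be sketched as a small helper. The class and method names are hypothetical. The sign check is an added assumption: dividing by 2 gives the one-sided p-value only when the observed t-statistic lies in the direction of the alternative; otherwise the one-sided p-value is 1 - p/2.

```java
/** Illustrative sketch only: converts a two-sided p-value into a one-sided one. */
final class OneSidedSketch {

    /**
     * @param twoSidedP          two-sided p-value, such as the value returned by pairedTTest
     * @param t                  observed t statistic, such as the value returned by pairedT
     * @param alternativeGreater true for the alternative "mean difference > 0",
     *                           false for "mean difference < 0"
     */
    static double oneSidedP(double twoSidedP, double t, boolean alternativeGreater) {
        // The statistic supports the alternative only if its sign matches the alternative.
        boolean inDirection = alternativeGreater ? t > 0 : t < 0;
        return inDirection ? twoSidedP / 2 : 1 - twoSidedP / 2;
    }
}
```

For example, a two-sided p-value of 0.04 with t = 2.5 gives 0.02 against the alternative "mean difference > 0" and 0.98 against "mean difference < 0".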
- pairedTTest
  
  public boolean pairedTTest(double[] sample1, double[] sample2, double alpha) throws NullArgumentException, NoDataException, DimensionMismatchException, NumberIsTooSmallException, OutOfRangeException, MaxCountExceededException
  Performs a paired t-test evaluating the null hypothesis that the mean of the paired differences between sample1 and sample2 is 0 in favor of the two-sided alternative that the mean paired difference is not equal to 0, with significance level alpha.
  Returns true iff the null hypothesis can be rejected with confidence 1 - alpha. To perform a 1-sided test, use alpha * 2.
  
  Usage Note:
  The validity of the test depends on the assumptions of the parametric t-test procedure.
  
  Preconditions:
  
  The input array lengths must be the same and their common length must be at least 2.
  
  0 < alpha ≤ 0.5
  Parameters:
  
  sample1 - array of sample data values
  
  sample2 - array of sample data values
  
  alpha - significance level of the test
  
  Returns:
  
  true if the null hypothesis can be rejected with confidence 1 - alpha
  
  Throws:
  
  NullArgumentException - if the arrays are null
  
  NoDataException - if the arrays are empty
  
  DimensionMismatchException - if the length of the arrays is not equal
  
  NumberIsTooSmallException - if the length of the arrays is < 2
  
  OutOfRangeException - if alpha is not in the range (0, 0.5]
  
  MaxCountExceededException - if an error occurs computing the p-value
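
The fixed significance level decision can be sketched as a comparison of the two-sided p-value with alpha. This sketch assumes rejection occurs when the p-value is strictly less than alpha; the class name `FixedLevelSketch` is hypothetical, and `IllegalArgumentException` stands in for the library's OutOfRangeException.

```java
/** Illustrative sketch only: fixed significance level decision from a two-sided p-value. */
final class FixedLevelSketch {

    /** Returns true if the null hypothesis is rejected at significance level alpha. */
    static boolean reject(double pValue, double alpha) {
        // alpha must lie in (0, 0.5]; the negated form also rejects NaN.
        if (!(alpha > 0 && alpha <= 0.5)) {
            throw new IllegalArgumentException("alpha must be in (0, 0.5]: " + alpha);
        }
        return pValue < alpha;
    }
}
```

With a two-sided p-value of 0.08, `reject(0.08, 0.05)` is false, while a one-sided test at the 0.05 level, expressed as `reject(0.08, 0.05 * 2)`, is true.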
- t
  
  public double t(double mu, double[] observed) throws NullArgumentException, NumberIsTooSmallException
  Computes a t statistic given observed values and a comparison constant.
  This statistic can be used to perform a one-sample t-test for the mean.
  Preconditions:
  
  The observed array length must be at least 2.
  Parameters:
  
  mu - comparison constant
  
  observed - array of values
  
  Returns:
  
  t statistic
  
  Throws:
  
  NullArgumentException - if observed is null
  
  NumberIsTooSmallException - if the length of observed is < 2
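
As a rough illustration of the one-sample statistic, the following plain-Java sketch computes t = (mean - mu) / sqrt(variance / n), using the unbiased sample variance. It is not the library's source; the class name `OneSampleTSketch` is hypothetical, and `IllegalArgumentException` stands in for NumberIsTooSmallException.

```java
/** Illustrative sketch only: one-sample t statistic computed from raw observations. */
final class OneSampleTSketch {

    static double t(double mu, double[] observed) {
        if (observed.length < 2) {
            throw new IllegalArgumentException("at least 2 observations are required");
        }
        int n = observed.length;

        // Sample mean.
        double sum = 0.0;
        for (double x : observed) {
            sum += x;
        }
        double mean = sum / n;

        // Unbiased sample variance (divisor n - 1).
        double sumSq = 0.0;
        for (double x : observed) {
            sumSq += (x - mean) * (x - mean);
        }
        double variance = sumSq / (n - 1);

        return (mean - mu) / Math.sqrt(variance / n);
    }
}
```

For observed = {1, 2, 3, 4, 5} and mu = 2, the mean is 3 and the variance is 2.5, so t = 1 / sqrt(0.5) = sqrt(2).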
- t
  
  public double t(double mu, StatisticalSummary sampleStats) throws NullArgumentException, NumberIsTooSmallException
  Computes a t statistic to use in comparing the mean of the dataset described by sampleStats to mu.
  This statistic can be used to perform a one-sample t-test for the mean.
  Preconditions:
  
  sampleStats.getN() ≥ 2.
  Parameters:
  
  mu - comparison constant
  
  sampleStats - StatisticalSummary holding sample summary statistics
  
  Returns:
  
  t statistic
  
  Throws:
  
  NullArgumentException - if sampleStats is null
  
  NumberIsTooSmallException - if the number of samples is < 2
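
Only three summary values are needed for this statistic: the sample mean, the sample variance and the sample size. The sketch below takes them as plain arguments; in practice they would presumably come from `getMean()`, `getVariance()` and `getN()` on the StatisticalSummary instance. The class name `SummaryTSketch` is hypothetical.

```java
/** Illustrative sketch only: one-sample t statistic from summary values. */
final class SummaryTSketch {

    /**
     * @param m  sample mean
     * @param mu comparison constant
     * @param v  sample variance
     * @param n  number of observations (at least 2)
     */
    static double t(double m, double mu, double v, double n) {
        return (m - mu) / Math.sqrt(v / n);
    }
}
```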
- homoscedasticT
  
  public double homoscedasticT(double[] sample1, double[] sample2) throws NullArgumentException, NumberIsTooSmallException
  
  Computes a 2-sample t statistic, under the hypothesis of equal subpopulation variances. To compute a t-statistic without the equal variances hypothesis, use t(double[], double[]).
  This statistic can be used to perform a (homoscedastic) two-sample t-test to compare sample means.
  
  The t-statistic is
  
  t = (m1 - m2) / (sqrt(1/n1 + 1/n2) sqrt(var))
  where n1 is the size of the first sample; n2 is the size of the second sample; m1 is the mean of the first sample; m2 is the mean of the second sample;
  and var is the pooled variance estimate:
  var = ((n1 - 1)var1 + (n2 - 1)var2) / ((n1 - 1) + (n2 - 1))
  with var1 the variance of the first sample and var2 the variance of the second sample.
  Preconditions:
  
  The observed array lengths must both be at least 2.
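
The pooled-variance statistic can be sketched in plain Java from summary values. Here the pooled variance is taken to be the degrees-of-freedom-weighted average of var1 and var2 itself, so the square root is applied only once, in the denominator of t. This is an illustrative sketch rather than the library's source, and the class name `PooledTSketch` is hypothetical.

```java
/** Illustrative sketch only: homoscedastic (pooled-variance) two-sample t statistic. */
final class PooledTSketch {

    /**
     * @param m1 mean of the first sample
     * @param m2 mean of the second sample
     * @param v1 variance of the first sample
     * @param v2 variance of the second sample
     * @param n1 size of the first sample (at least 2)
     * @param n2 size of the second sample (at least 2)
     */
    static double homoscedasticT(double m1, double m2, double v1, double v2,
                                 double n1, double n2) {
        // Pooled variance: weighted average of the two sample variances.
        double pooledVariance = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2);
        // t = (m1 - m2) / sqrt(pooledVariance * (1/n1 + 1/n2))
        return (m1 - m2) / Math.sqrt(pooledVariance * (1.0 / n1 + 1.0 / n2));
    }
}
```

With m1 = 5, m2 = 3, both variances equal to 4 and both sample sizes equal to 8, the pooled variance is 4, the denominator is sqrt(4 * 0.25) = 1, and t = 2.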
