An introduction to t-tests
A t-test is a statistical test that is used to compare the means of two groups. It is often used in hypothesis testing to determine whether a process or treatment actually has an effect on the population of interest, or whether two groups are different from one another.
You want to know whether the mean petal length of iris flowers differs according to their species. You find two different species of irises growing in a garden and measure 25 petals of each species. You can test the difference between these two groups using a t-test.
 The null hypothesis (H_{0}) is that the true difference between these group means is zero.
 The alternate hypothesis (H_{a}) is that the true difference is different from zero.
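The two hypotheses can also be written compactly in symbols. The notation \mu_{1} and \mu_{2} for the true (population) means of the two groups is introduced here for illustration and is not used elsewhere in this article:

```latex
H_{0}: \mu_{1} - \mu_{2} = 0
\qquad
H_{a}: \mu_{1} - \mu_{2} \neq 0
```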
When to use a t-test
A t-test can only be used when comparing the means of two groups (a.k.a. pairwise comparison). If you want to compare more than two groups, or if you want to do multiple pairwise comparisons, use an ANOVA test or a post-hoc test.
The t-test is a parametric test of difference, meaning that it makes the same assumptions about your data as other parametric tests. The t-test assumes your data:
 are independent
 are (approximately) normally distributed
 have a similar amount of variance within each group being compared (a.k.a. homogeneity of variance)
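If your data do not fit these assumptions, you can try a nonparametric alternative to the t-test, such as the Wilcoxon signed-rank test (for paired data) or the Wilcoxon rank-sum test (for two independent groups). If only the assumption of similar variances is violated, Welch’s t-test is a common alternative.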
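A rough way to check two of these assumptions in R is sketched below. It is only an illustration: the data are two species taken from R's built-in iris data set as a stand-in (an assumption, not this article's data file), and the Shapiro-Wilk test and F test are just one common choice of checks. Independence depends on how the data were collected and cannot be checked this way.

```r
# Stand-in data (an assumption): two species from R's built-in iris data set.
two_species <- droplevels(subset(iris, Species %in% c("setosa", "virginica")))

setosa    <- two_species$Petal.Length[two_species$Species == "setosa"]
virginica <- two_species$Petal.Length[two_species$Species == "virginica"]

# Normality within each group: Shapiro-Wilk test.
# A small p-value suggests a departure from normality.
normality_setosa    <- shapiro.test(setosa)
normality_virginica <- shapiro.test(virginica)

# Homogeneity of variance: F test comparing the two group variances.
# A small p-value suggests the variances differ.
variance_check <- var.test(setosa, virginica)

# A rank-based alternative for two independent groups:
# the Wilcoxon rank-sum (Mann-Whitney) test.
# exact = FALSE avoids a warning about tied values.
rank_test <- wilcox.test(Petal.Length ~ Species, data = two_species,
                         exact = FALSE)

print(normality_setosa$p.value)
print(normality_virginica$p.value)
print(variance_check$p.value)
print(rank_test$p.value)
```

These checks are guides rather than strict gatekeepers; with large samples they can flag departures that matter little in practice.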
What type of t-test should I use?
When choosing a t-test, you will need to consider two things: whether the groups being compared come from a single population or two different populations, and whether you want to test the difference in a specific direction.
One-sample, two-sample, or paired t-test?
 If the groups come from a single population (e.g. measuring before and after an experimental treatment), perform a paired t-test.
 If the groups come from two different populations (e.g. two different species, or people from two separate cities), perform a two-sample t-test (a.k.a. independent t-test).
 If there is one group being compared against a standard value (e.g. comparing the acidity of a liquid to a neutral pH of 7), perform a one-sample t-test.
One-tailed or two-tailed t-test?
 If you only care whether the two populations are different from one another, perform a two-tailed t-test.
 If you want to know whether one population mean is greater than or less than the other, perform a one-tailed t-test.
In your test of whether petal length differs by species:
 Your observations come from two separate populations (separate species), so you perform a two-sample t-test.
 You don’t care about the direction of the difference, only whether there is a difference, so you choose to use a two-tailed t-test.
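In R, these choices map onto arguments of the built-in t.test() function. The sketch below uses simulated numbers purely for illustration; the variable names (before, after, group_a, group_b, ph) and all the simulated values are hypothetical and do not come from this article's data.

```r
set.seed(42)  # make the simulated numbers reproducible

# Hypothetical measurements, simulated purely for illustration.
before  <- rnorm(20, mean = 10, sd = 2)           # same subjects, first measurement
after   <- before + rnorm(20, mean = 1, sd = 1)   # same subjects, second measurement
group_a <- rnorm(25, mean = 5.0, sd = 1)          # independent group A
group_b <- rnorm(25, mean = 5.5, sd = 1)          # independent group B
ph      <- rnorm(15, mean = 6.8, sd = 0.3)        # one group vs. a standard value

# Paired t-test: two measurements on the same subjects.
paired_result <- t.test(before, after, paired = TRUE)

# Two-sample (independent) t-test: two separate groups.
two_sample_result <- t.test(group_a, group_b)

# One-sample t-test: one group against a fixed value (neutral pH of 7).
one_sample_result <- t.test(ph, mu = 7)

# Direction: "two.sided" is the default (two-tailed);
# "less" and "greater" request a one-tailed test.
one_tailed_result <- t.test(group_a, group_b, alternative = "less")

print(paired_result$method)
print(two_sample_result$method)
print(one_sample_result$method)
print(one_tailed_result$alternative)
```

For the petal length example, the matching call would be the two-sample form with the default two-sided alternative.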
Performing a t-test
The t-test estimates the true difference between two group means using the ratio of the difference in group means over the pooled standard error of both groups. You can calculate it manually using a formula, or use statistical analysis software.
T-test formula
The formula for the two-sample t-test (a.k.a. the Student’s t-test) is shown below.
t = \frac{x_{1} - x_{2}}{\sqrt{s^{2}\left(\frac{1}{n_{1}} + \frac{1}{n_{2}}\right)}}
In this formula, t is the t-value, x_{1} and x_{2} are the means of the two groups being compared, s^{2} is the pooled variance of the two groups (the full denominator is the pooled standard error), and n_{1} and n_{2} are the number of observations in each of the groups.
A larger absolute t-value shows that the difference between group means is large relative to the pooled standard error, indicating a more significant difference between the groups.
You can compare your calculated t-value against the values in a critical value chart to determine whether your t-value is greater than what would be expected by chance. If so, you can reject the null hypothesis and conclude that the two groups are in fact different.
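The manual calculation can be sketched in R as follows. This is an illustration under stated assumptions: the data are two species from R's built-in iris data set (a stand-in, not this article's data file), the pooled variance is computed with the standard weighted-average formula, and the 5% significance level is chosen only as an example. The critical value comes from qt() rather than a printed chart.

```r
# Stand-in data (an assumption): two species from R's built-in iris data set.
x1 <- iris$Petal.Length[iris$Species == "setosa"]
x2 <- iris$Petal.Length[iris$Species == "virginica"]
n1 <- length(x1)
n2 <- length(x2)

# Pooled variance: a weighted average of the two group variances.
s2 <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)

# t-value: difference in means divided by the pooled standard error.
t_value <- (mean(x1) - mean(x2)) / sqrt(s2 * (1 / n1 + 1 / n2))

# Two-tailed critical value at the 5% significance level.
deg_free <- n1 + n2 - 2
critical_value <- qt(1 - 0.05 / 2, deg_free)

# Reject the null hypothesis if |t| exceeds the critical value.
reject_null <- abs(t_value) > critical_value

# Cross-check against the built-in function with equal variances assumed.
builtin <- t.test(x1, x2, var.equal = TRUE)
print(c(manual = t_value, builtin = unname(builtin$statistic)))
print(c(critical_value = critical_value))
print(reject_null)
```

The cross-check passes var.equal = TRUE because t.test() otherwise applies Welch's correction, which does not pool the variances.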
T-test function in statistical software
Most statistical software (R, SPSS, etc.) includes a t-test function. This built-in function will take your raw data and calculate the t-value. It will then compare it to the critical value, and calculate a p-value. This way you can quickly see whether your groups are statistically different.
In your comparison of flower petal lengths, you decide to perform your t-test using R. The code looks like this:
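A minimal sketch of such a call, assuming a data frame named flower.data with the columns Petal.Length and Species (the names used in the summary-statistics code later in this article). To keep the example runnable, flower.data is built here from two species of R's built-in iris data set. That stand-in is an assumption, so the numbers it prints will not match the output discussed in this article.

```r
# Stand-in for flower.data (an assumption): two species from R's built-in
# iris data set, which already uses the same column names.
flower.data <- droplevels(subset(iris, Species %in% c("setosa", "virginica")))

# Two-sample, two-tailed t-test of petal length by species.
# t.test() applies Welch's correction for unequal variances by default.
result <- t.test(Petal.Length ~ Species, data = flower.data)
print(result)
```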
Interpreting test results
If you perform the t-test for your flower hypothesis in R, you will receive the following output:
The output provides:
 An explanation of what is being compared, called data in the output table.
 The t-value: -33.719. Note that it’s negative; this is fine! In most cases, we only care about the absolute value of the difference, or the distance from 0. It doesn’t matter which direction.
 The degrees of freedom: 30.196. Degrees of freedom is related to your sample size, and shows how many ‘free’ data points are available in your test for making comparisons. The greater the degrees of freedom, the better your statistical test will work. (The value is not a whole number here because R’s t.test() applies Welch’s correction for unequal variances by default.)
 The p-value: 2.2e-16 (i.e. 2.2 with 15 zeros in front). This describes the probability that you would see a t-value as large as this one by chance if the null hypothesis were true.
 A statement of the alternate hypothesis (H_{a}). In this test, the H_{a} is that the difference is not 0.
 The 95% confidence interval. This is a range of values calculated so that, if the study were repeated many times, about 95% of such intervals would contain the true difference in means. This can be changed from 95% if you want a larger or smaller interval, but 95% is very commonly used.
 The mean petal length for each group.
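Each of these pieces can also be read directly from the object that t.test() returns, which is convenient when you want to reuse the numbers. The sketch below again builds flower.data from R's built-in iris data set as a stand-in (an assumption, so the values differ from those quoted above), and sets conf.level = 0.99 only to show how the interval width is changed.

```r
# Stand-in for flower.data (an assumption): two species from iris.
flower.data <- droplevels(subset(iris, Species %in% c("setosa", "virginica")))

# conf.level controls the width of the confidence interval (default 0.95).
result <- t.test(Petal.Length ~ Species, data = flower.data,
                 conf.level = 0.99)

print(result$data.name)    # what is being compared
print(result$statistic)    # the t-value
print(result$parameter)    # the degrees of freedom
print(result$p.value)      # the p-value
print(result$alternative)  # the alternate hypothesis ("two.sided")
print(result$conf.int)     # the confidence interval
print(result$estimate)     # the mean of each group
```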
Presenting the results of a t-test
When reporting your t-test results, the most important values to include are the t-value, the p-value, and the degrees of freedom for the test. These will communicate to your audience whether the difference between the two groups is statistically significant (a.k.a. that it is unlikely to have happened by chance).
You can also include the summary statistics for the groups being compared, namely the mean and standard deviation. In R, the code for calculating the mean and the standard deviation from the data looks like this:
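library(dplyr)  # provides %>%, group_by() and summarize()

# Mean and standard deviation of petal length for each species
flower.data %>%
  group_by(Species) %>%
  summarize(mean_length = mean(Petal.Length),
            sd_length = sd(Petal.Length))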
In our example, you would report the results like this:
Frequently asked questions about t-tests
 What is a t-test?

A t-test is a statistical test that compares the means of two samples. It is used in hypothesis testing, with a null hypothesis that the difference in group means is zero and an alternate hypothesis that the difference in group means is different from zero.
 What does a t-test measure?

A t-test measures the difference in group means divided by the pooled standard error of the two group means.
In this way, it calculates a number (the t-value) illustrating the magnitude of the difference between the two group means being compared, and estimates the likelihood that this difference exists purely by chance (p-value).
 Which t-test should I use?

Your choice of t-test depends on whether you are studying one group or two groups, and whether you care about the direction of the difference in group means.
If you are studying one group, use a paired t-test to compare the group mean over time or after an intervention, or use a one-sample t-test to compare the group mean to a standard value. If you are studying two groups, use a two-sample t-test.
If you want to know only whether a difference exists, use a two-tailed test. If you want to know if one group mean is greater or less than the other, use a left-tailed or right-tailed one-tailed test.
 What is the difference between a one-sample t-test and a paired t-test?

A one-sample t-test is used to compare a single population to a standard value (for example, to determine whether the average lifespan of a specific town is different from the country average).
A paired t-test is used to compare a single population before and after some experimental intervention or at two different points in time (for example, measuring student performance on a test before and after being taught the material).
 Can I use a t-test to measure the difference among several groups?

A t-test should not be used to measure differences among more than two groups, because the error structure for a t-test will underestimate the actual error when many groups are being compared.
If you want to compare the means of several groups at once, it’s best to use another statistical test such as ANOVA or a post-hoc test.
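As a rough illustration of both points, the sketch below first estimates how the chance of at least one false positive grows across several tests, then runs a one-way ANOVA followed by a post-hoc test. It rests on several assumptions: the error-rate formula treats the tests as independent (pairwise comparisons on shared data are not, so this is only an approximation), the data are the three species in R's built-in iris data set as a stand-in, and Tukey's HSD is just one of several possible post-hoc tests.

```r
# Approximate chance of at least one false positive across k tests,
# each run at significance level alpha (assumes independent tests).
alpha <- 0.05
k <- 3  # number of pairwise comparisons among three groups
familywise_error <- 1 - (1 - alpha)^k
print(familywise_error)  # about 0.14, well above the nominal 0.05

# One-way ANOVA: do the mean petal lengths differ among the three species?
# Stand-in data (an assumption): R's built-in iris data set.
anova_fit <- aov(Petal.Length ~ Species, data = iris)
print(summary(anova_fit))

# Post-hoc test: Tukey's HSD compares every pair of species while
# controlling the overall error rate.
posthoc <- TukeyHSD(anova_fit)
print(posthoc)
```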