A test statistic is a standardized score used in hypothesis testing. It tells you how likely the results obtained from your sample data are under the assumption that the null hypothesis is true. The more unlikely your results are under this assumption, the easier it becomes to reject the null hypothesis in favor of an alternative hypothesis. The more likely your results are, the harder it becomes to reject the null hypothesis.

There are different kinds of test statistics, but they all work the same way. A test statistic maps the value of a particular sample statistic (such as a sample mean or a sample proportion) to a value on a standardized distribution, such as the Standard Normal Distribution or the t-distribution. This allows you to determine how likely or unlikely it is to observe the particular value of the statistic you obtained.


As a quick example, say you have a null hypothesis that the average wait time to get seated at your favorite restaurant—at a table for two without a reservation on a Friday night—is 45 minutes. You select a random sample of 100 parties that got seated under these conditions and ask them what their wait times were. You find that the average wait time for your sample is 55 minutes ($\bar{x}$ = 55 minutes). A test statistic will convert this sample statistic $\bar{x}$ into a standardized number that helps you answer this question:

“Assuming that my null hypothesis is true—assuming that the average wait time at the restaurant actually is 45 minutes—what is the likelihood that I found an average wait time of 55 minutes for my randomly drawn sample?”

Remember, the lower the likelihood of observing your sample statistic, the more confident you can be rejecting the null hypothesis.

The type of test statistic you use in a hypothesis test depends on several factors including:

The type of statistic you are using in the test

The size of your sample

Assumptions you can make about the distribution of your data

Assumptions you can make about the distribution of the statistic used in the test

The General Formula for Calculating Test Statistics

The formula for calculating test statistics takes the following general form:

$\text{Test Statistic} = \frac{\text{Statistic} - \text{Parameter}}{\text{Standard Deviation of the Statistic}}$

Remember, a statistic is a measure calculated from a single sample or many samples. Examples include the sample mean $\bar{x}$, the difference between two sample means $\bar{x_{1}} - \bar{x_{2}}$, or a sample proportion $\hat{p}$.

A parameter is a measure calculated from a single population or many populations. Examples include the population mean $\mu$, the difference between two population means $\mu_{1}-\mu_{2}$, or a population proportion $p$.

In the denominator of the equation, you have the standard deviation (or an approximation of the standard deviation) of the statistic used in the numerator. If you use the sample mean $\bar{x}$ in the numerator, you should use the standard deviation of $\bar{x}$, or an approximation of it, in the denominator.

Types of Test Statistics with Formulas

The test statistics you are most likely to encounter in an introductory statistics class are:

The Z-test statistic for a single sample mean

The Z-test statistic for population proportions

The t-test statistic for a single sample mean

The t-test statistic for two sample means

Z-test for a Sample Mean

We use the Z-test statistic (or Z-statistic) for a sample mean in hypothesis tests involving a sample mean $\bar{x}$, calculated for a single sample.

You use this test statistic when:

Your sample size is greater than or equal to 30 (n$\geq$30)

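The formula for the Z-test statistic for a sample mean is:

$Z =\frac{\bar{x}-\mu_0}{\frac{\sigma}{\sqrt{n}}}$

$Z$ is the symbol for the Z-test statistic

$\bar{x}$ is the sample mean

$\mu_{0}$ is the hypothesized value of the population mean according to the null hypothesis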

$\sigma$ is the population standard deviation

$n$ is the sample size

$\frac{\sigma}{\sqrt{n}}$ is the standard error of $\bar{x}$. The standard error is just the standard deviation of the sampling distribution of the sample mean.
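To make this concrete, here is a minimal sketch that applies the Z-test statistic to the restaurant example. The sample mean (55), hypothesized mean (45), and sample size (100) come from the example above; the population standard deviation of 40 minutes is an assumption made up for illustration, since the example does not give one. The function name `z_statistic_mean` is also hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic_mean(x_bar, mu_0, sigma, n):
    """Z = (x_bar - mu_0) / (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)
    return (x_bar - mu_0) / standard_error

x_bar = 55   # sample mean wait time, from the example
mu_0 = 45    # hypothesized population mean under the null hypothesis
n = 100      # sample size, from the example
sigma = 40   # ASSUMED population standard deviation (not given in the example)

z = z_statistic_mean(x_bar, mu_0, sigma, n)
print(z)  # 2.5

# Right-tail probability of a value at least this large under the standard normal
p_right = 1 - NormalDist().cdf(z)
print(round(p_right, 4))  # 0.0062
```

Under the assumed $\sigma$, a sample mean of 55 sits 2.5 standard errors above the hypothesized mean of 45.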

You may notice that a Z-test statistic is just a z-score for a particular value of a normally distributed statistic. There are many variations of the Z-test statistic. We can use these in hypothesis tests where the sample statistic used in the test is approximately normally distributed. One such variation is the Z-test for proportions.

Z-test for Proportions

We use the Z-test statistic for proportions in hypothesis tests where a sample proportion $\hat{p}$ is being tested against the hypothesized value of the population proportion, $p_{0}$.

You use the Z-test for proportions when your sample size is greater than or equal to 30 (n$\geq$30), and the distribution of the sample statistic is assumed to be normal.

The formula for the Z-test statistic for population proportions is:

$Z =\frac{\hat{p}-p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$

$Z$ is the symbol for the Z-test statistic for population proportions

$\hat{p}$ is the sample proportion

$p_{0}$ is the hypothesized value of the population proportion according to the null hypothesis

$n$ is the sample size
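The following sketch computes a Z-test statistic for a proportion, assuming the usual standard error under the null hypothesis, $\sqrt{p_0(1-p_0)/n}$. The function name `z_statistic_proportion` and the input values (a sample proportion of 0.60, a hypothesized proportion of 0.50, and a sample of 100) are hypothetical.

```python
from math import sqrt

def z_statistic_proportion(p_hat, p_0, n):
    """Z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)."""
    standard_error = sqrt(p_0 * (1 - p_0) / n)
    return (p_hat - p_0) / standard_error

# Hypothetical inputs, not taken from any real survey
z = z_statistic_proportion(p_hat=0.60, p_0=0.50, n=100)
print(round(z, 3))  # 2.0
```

Note that the standard error is built from the hypothesized proportion $p_0$, not from the sample proportion $\hat{p}$, because the test statistic is calculated under the assumption that the null hypothesis is true.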

When your sample size is smaller than 30 (n<30), or when you cannot assume that your sample statistic is normally distributed, you’ll often use a t-test statistic rather than a Z-test.

T-test for a Single Sample Mean

We use the t-test statistic (or t-statistic) for a sample mean in hypothesis tests involving a sample mean calculated for a single sample drawn from a population. Unlike the Z-test for a single sample mean, you use the t-test when:

Your sample size is less than 30 (n<30)

The distribution of the sample statistic is not approximated by a normal distribution

The population standard deviation $\sigma$ is unknown

A t-test statistic maps your sample statistic to a t-distribution, whereas a Z-test statistic maps it to the standard normal distribution. A t-distribution is like a standard normal distribution, but it has thicker tails and its shape changes depending on your sample size $n$. As $n$ gets larger, the t-distribution converges to the standard normal distribution. As $n$ gets smaller, the t-distribution gets flatter with thicker tails.

The formula for the t-test statistic for a sample mean is:

$t =\frac{\bar{x}-\mu_0}{\frac{s}{\sqrt{n}}}$

$t$ is the symbol for the t-test statistic

$\bar{x}$ is the sample mean

$\mu_0$ is the value of the population mean according to the null hypothesis

$s$ is the sample standard deviation

$n$ is the sample size

$\frac{s}{\sqrt{n}}$ is an approximation of the standard error of $\bar{x}$. In a t-test, because you do not know the value of the population standard deviation, you need to approximate the standard error of $\bar{x}$ using the sample standard deviation $s$.
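Here is a minimal sketch of the one-sample t-test statistic using only Python's standard library. The ten wait times are invented for illustration, and the function name `t_statistic_mean` is hypothetical. The sketch stops at the test statistic; turning it into a p-value requires a t-distribution with $n - 1$ degrees of freedom, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic_mean(sample, mu_0):
    """t = (x_bar - mu_0) / (s / sqrt(n))."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    return (x_bar - mu_0) / (s / sqrt(n))

# Hypothetical sample of 10 wait times in minutes (made-up data)
wait_times = [52, 48, 60, 55, 47, 58, 50, 62, 53, 45]

t = t_statistic_mean(wait_times, mu_0=45)
degrees_of_freedom = len(wait_times) - 1
print(round(t, 2), degrees_of_freedom)
```

The only change from the Z-test statistic is the denominator: the sample standard deviation $s$ stands in for the unknown $\sigma$.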

T-test for Two Sample Means

We can also use t-test statistics in hypothesis tests where the values of two sample means ($\bar{x}_{1}$ and $\bar{x}_{2}$) are being compared. You do this to test the null hypothesis that the two samples are drawn from the same underlying population. If the null hypothesis is true, then any difference between the sample means is due to random variation in the data. Rejecting the null hypothesis suggests that the samples were drawn from two distinct populations and that the difference in the sample means reflects actual differences in the characteristics of subjects in one population compared to the other.

Like the t-test for a single sample mean, you use the t-test for two sample means when:

Your sample sizes are less than 30 (n<30)

The distributions of the sample statistics are not approximated by a normal distribution

The population standard deviation $\sigma$ is unknown

The formula for the t-test statistic for two sample means is:

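$t =\frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}$

$t$ is the symbol for the t-test statistic

$\bar{x}_1$ is the mean of sample 1

$\bar{x}_2$ is the mean of sample 2

$\mu_1$ is the mean of the population from which sample 1 was drawn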

$\mu_2$ is the mean of the population from which sample 2 was drawn

$s_1^2$ is the variance of sample 1

$s_2^2$ is the variance of sample 2

$n_{1}$ is the sample size for sample 1

$n_{2}$ is the sample size for sample 2
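The sketch below computes a two-sample t-test statistic from the quantities listed above. It assumes the unpooled standard error $\sqrt{s_1^2/n_1 + s_2^2/n_2}$, which uses each sample's own variance and size; other variants (such as a pooled-variance version) exist. The function name `t_statistic_two_means` and both data sets are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def t_statistic_two_means(sample_1, sample_2, hypothesized_difference=0.0):
    """t = ((x_bar_1 - x_bar_2) - (mu_1 - mu_2)) / sqrt(s_1^2/n_1 + s_2^2/n_2).

    Assumes the unpooled standard error. The default hypothesized difference
    of 0 corresponds to the null hypothesis that both samples come from the
    same underlying population.
    """
    n_1, n_2 = len(sample_1), len(sample_2)
    standard_error = sqrt(variance(sample_1) / n_1 + variance(sample_2) / n_2)
    observed_difference = mean(sample_1) - mean(sample_2)
    return (observed_difference - hypothesized_difference) / standard_error

# Hypothetical wait times (minutes) at two restaurants; made-up data
restaurant_a = [52, 48, 60, 55, 47, 58, 50, 62, 53, 45]
restaurant_b = [44, 50, 41, 47, 52, 39, 46, 49]

t = t_statistic_two_means(restaurant_a, restaurant_b)
print(round(t, 2))
```

Because `variance` from the `statistics` module divides by $n - 1$, it matches the sample variances $s_1^2$ and $s_2^2$ in the definitions above.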

Difference Between T-Tests and Z-Tests and When to Use Each

T-tests are generally used in place of Z-tests when one or more of the following conditions hold:

The sample size is less than 30 (n<30)

The statistic you use in the hypothesis test is not approximated by a normal distribution

The population standard deviation $\sigma$ is unknown

If you know the population standard deviation $\sigma$ and you are confident that the statistic used in your hypothesis test is normally distributed, then you can use a Z-test.

As with all test statistics, you should only use a Z-test or a t-test when your data is from a randomly and independently drawn sample.
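The rule of thumb in this section can be written as a small helper. This is only a sketch of the conditions listed above, not a substitute for checking your data; the function name `choose_test` and its parameters are hypothetical.

```python
def choose_test(n, sigma_known, approximately_normal):
    """Encode this section's rule of thumb for picking a test statistic.

    A t-test is suggested when at least one condition holds: the sample is
    small (n < 30), the population standard deviation is unknown, or the
    statistic is not approximated by a normal distribution.
    """
    if n < 30 or not sigma_known or not approximately_normal:
        return "t-test"
    return "Z-test"

print(choose_test(n=100, sigma_known=True, approximately_normal=True))   # Z-test
print(choose_test(n=100, sigma_known=False, approximately_normal=True))  # t-test
print(choose_test(n=12, sigma_known=True, approximately_normal=True))    # t-test
```

The helper says nothing about whether the sample was drawn randomly and independently, which still has to be checked separately.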

How to Interpret a Test Statistic

We use test statistics together with critical values, p-values, and significance levels to determine whether to reject or fail to reject a null hypothesis.

A critical value is a value of a test statistic that marks a cutoff point. If a test statistic is more extreme than the critical value—greater than the critical value in the right tail of a distribution or less than the critical value in the left tail of a distribution—the null hypothesis is rejected.

Critical values are determined by the significance level (or alpha level) of a hypothesis test. The significance level you use is up to you, but the most commonly used significance level is 0.05 ($\alpha$=0.05).

A significance level of 0.05 means that you reject your null hypothesis if the probability of observing a sample statistic at least as extreme as the one you observed, assuming the null hypothesis is true, is less than 0.05 (or 5%). In a one-sided hypothesis test that uses a Z-test statistic, a significance level of 0.05 is associated with a critical value of 1.645 when you conduct the test in the right tail and a value of -1.645 when you conduct the test in the left tail.
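The one-sided critical values quoted here can be recovered from the standard normal distribution with Python's standard library. The variable names are arbitrary.

```python
from statistics import NormalDist

alpha = 0.05
standard_normal = NormalDist()  # mean 0, standard deviation 1

# Right-tailed test: the cutoff that leaves an area of alpha in the right tail
right_tail_critical = standard_normal.inv_cdf(1 - alpha)

# Left-tailed test: the cutoff that leaves an area of alpha in the left tail
left_tail_critical = standard_normal.inv_cdf(alpha)

print(round(right_tail_critical, 3), round(left_tail_critical, 3))  # 1.645 -1.645
```

Changing `alpha` moves the cutoffs: a smaller significance level pushes the critical values further into the tails.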

A p-value is the probability, assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one you calculated. Let’s say you calculate a Z-test statistic that maps to the standard normal distribution, and you find that it is equal to 1.75. In a right-tailed test, the p-value associated with this value is about 0.04, or 4%. You can find p-values using tables or statistical software.

A p-value of 0.04 means that, if the null hypothesis is true, the probability of observing a sample statistic at least as extreme as the one you found from your sample data is 4%. If you choose a significance level of 0.05 for your test, you would reject the null hypothesis, since the p-value of 0.04 is less than the significance level of 0.05.
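The p-value for a Z-test statistic of 1.75 can be checked with Python's standard library, treating it as a right-tailed test. The two-sided figure is included only for comparison and is not part of the example above.

```python
from statistics import NormalDist

z = 1.75
standard_normal = NormalDist()

# Right-tailed p-value: area under the standard normal curve to the right of z
p_right_tailed = 1 - standard_normal.cdf(z)
print(round(p_right_tailed, 4))  # 0.0401

# For comparison: a two-sided test counts both tails
p_two_sided = 2 * p_right_tailed
print(round(p_two_sided, 4))  # 0.0801
```

The one-sided value falls below a significance level of 0.05, while the two-sided value does not, so it matters which alternative hypothesis you are testing.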

It can be easy to confuse test statistics, critical values, significance levels, and p-values. Remember, these are all different measures involved in determining whether to reject or fail to reject a null hypothesis.

Critical values and significance levels provide cut-offs for your test. The difference between a critical value and a significance level is that the critical value is a point on the distribution, and the significance level is a probability represented by an area under the distribution.

You compare the test statistic against the critical value, and the p-value against the significance level.

If the test statistic is more extreme than the critical value, you reject the null hypothesis.

If the p-value is less than the significance level, you reject the null hypothesis.

If the test statistic is less extreme than the critical value, you fail to reject the null hypothesis.

If the p-value is greater than the significance level, you fail to reject the null hypothesis.
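As a sketch of how these rules fit together, the function below applies both comparisons to a right-tailed Z-test and reports the decision. The function name `right_tailed_z_decision` and the example inputs are hypothetical; a left-tailed or two-sided test would need different comparisons.

```python
from statistics import NormalDist

def right_tailed_z_decision(z, alpha=0.05):
    """Return (decision, critical_value, p_value) for a right-tailed Z-test."""
    standard_normal = NormalDist()
    critical_value = standard_normal.inv_cdf(1 - alpha)
    p_value = 1 - standard_normal.cdf(z)
    # For a right-tailed test, "z > critical value" and "p-value < alpha"
    # describe the same event, so either comparison gives the same decision.
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return decision, critical_value, p_value

print(right_tailed_z_decision(1.75)[0])  # reject H0
print(right_tailed_z_decision(1.20)[0])  # fail to reject H0
```

The critical-value rule and the p-value rule are two views of one comparison: one is expressed as a point on the distribution, the other as an area under it.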
