Statistics

Degrees of Freedom In Statistics

04.26.2023 • 5 min read

Sarah Thomas

Subject Matter Expert

Explore degrees of freedom. Learn about their importance, calculation methods, and two test types. Plus dive into solved examples for better understanding.

In This Article

  1. What Are Degrees of Freedom?

  2. How To Calculate Degrees of Freedom

  3. Why Are Degrees of Freedom Important?

  4. Degrees of Freedom Test Types

  5. Degrees of Freedom in Linear Regression

In statistics, you’ll often come across the term “degrees of freedom.” You might be reading the results of a statistical analysis and see the abbreviation d.f., or you may be trying to calculate a statistic like the standard deviation and see degrees of freedom in the denominator of the formula.

The term “degrees of freedom” pops up in many different contexts, and it can be challenging to grasp what degrees of freedom are.

In this article, we’ll dive deeper into the meaning and importance of this much-used statistical term.

What Are Degrees of Freedom?

Degrees of freedom are the number of independent pieces of information used in calculating a statistical estimate. We say these independent pieces of information are “free to vary” given the constraints of your calculation.

To understand the intuition behind degrees of freedom, think about a game show where a prize is hidden behind 1 of 3 doors. How many doors would you need to open to be sure of where the prize is located?

You might get lucky and find the prize behind the first door you open, but if the prize is not behind door number 1, you’ll need to open another door. Suppose you open door number 2, and again, you find no prize. What can you be sure of? The prize must be behind door number 3.

Figure: three game show doors, illustrating the intuition behind degrees of freedom

If a prize is behind 1 of the 3 doors, and you’ve opened 2 doors, you know for sure what must be behind the third door. The outcome of what’s behind door number 3 is not free to vary given what lies behind doors 1 and 2. In this case, we might say there are 2 degrees of freedom. Once you know what’s behind 2 of the 3 doors, the outcome of what’s behind the third door is fixed. It’s not free to vary.

Here’s another example. Say we have sample data on 3 individuals: Timmy, Tommy, and Joey. Suppose I tell you that the average age of the sample is 20. I also tell you that Timmy is 20, and so is Tommy. What must be true about Joey’s age? That’s right! He must also be 20.

Figure: sample data for 3 people

If Timmy and Tommy are both 20, the only way the average age of your sample can be 20 is if Joey is the same age as the others.

This pattern holds true whenever you calculate an average. If you calculate an average (or mean) for a sample size of n, once you know the mean, only n-1 of the values in your sample are free to vary. Once you’ve calculated a sample average for your data, you are left with n-1 degrees of freedom!
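The same logic can be sketched in a few lines of Python. This is only an illustration: the helper name `last_value_given_mean` is made up for this example, and it simply rearranges the definition of the mean to recover the one value that is no longer free to vary.

```python
def last_value_given_mean(known_values, mean, n):
    """Return the single remaining value that makes n values average to `mean`."""
    if len(known_values) != n - 1:
        raise ValueError("exactly n - 1 values must already be known")
    # mean = (sum of all n values) / n, so the missing value is n * mean - (known sum)
    return mean * n - sum(known_values)


# Timmy and Tommy are both 20 and the sample mean is 20, so Joey's age is pinned down.
joey_age = last_value_given_mean([20, 20], mean=20, n=3)
print(joey_age)  # 20
```

Change the two known ages and the third age changes with them, but it is never free: it is always whatever value keeps the mean at 20.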

How To Calculate Degrees of Freedom

How you calculate degrees of freedom varies depending on what you are estimating. Typically, degrees of freedom are equal to your sample size minus the number of parameters that have already been estimated as intermediary steps in your calculation.

Example: Calculating Sample Standard Deviations

Let’s look at the example of calculating a sample standard deviation. A sample standard deviation, $s$, is a sample statistic used to estimate the true standard deviation, $\sigma$, of a population.

The formula for calculating a sample standard deviation is shown below. Notice that to calculate a sample standard deviation, we need to first calculate the sample mean $\bar{x}$. Once we have the sample mean, we add up the squared deviations from the sample mean, divide that sum by n-1, and take the square root to get the standard deviation.

$$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}}$$

The n-1 in the denominator represents the degrees of freedom we need to use. As we already saw, by calculating the sample mean, $\bar{x}$, we are placing a constraint on our data.

We are anchoring 1 point in the data, which is not free to vary given the values of all the other data points used in the calculation. When we go to calculate the sample standard deviation, we account for this constraint by using n-1 degrees of freedom in our calculation.
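Here is a minimal Python sketch of that calculation. The function name `sample_std` and the ages in `ages` are made up for illustration. The result is compared against `statistics.stdev` from the Python standard library, which also divides by n-1.

```python
import math
import statistics


def sample_std(values):
    """Sample standard deviation with n - 1 degrees of freedom in the denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n  # estimating the mean uses up 1 degree of freedom
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


ages = [18, 20, 22, 24]  # made-up sample data
print(sample_std(ages))
print(statistics.stdev(ages))  # the standard library also divides by n - 1
```

Both lines should print the same number. Dividing by n instead of n-1 would give a smaller value, which tends to underestimate the population standard deviation.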

Why Are Degrees of Freedom Important?

Degrees of freedom are important in statistics because they affect the accuracy of our statistical estimates. The lower the degrees of freedom, the less reliable your results will be. Just like a detective who has limited evidence to solve a crime, a statistician working with low degrees of freedom has limited information to estimate a parameter and will come up with a less reliable estimate.
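A small simulation can make this concrete. The sketch below is an illustration under stated assumptions, not a formal result: it repeatedly draws samples from a standard normal population (true standard deviation of 1), computes the sample standard deviation each time, and measures how much those estimates bounce around. The function name `spread_of_sample_std`, the sample sizes, and the seed are arbitrary choices made for this example.

```python
import random
import statistics


def spread_of_sample_std(n, n_repeats=2000, seed=0):
    """Standard deviation of the sample-std estimates across many samples of size n."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_repeats):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        estimates.append(statistics.stdev(sample))  # uses n - 1 degrees of freedom
    return statistics.stdev(estimates)


for n in (3, 10, 30):
    spread = spread_of_sample_std(n)
    print(f"n = {n:2d}, d.f. = {n - 1:2d}, spread of s = {spread:.3f}")
```

The spread should shrink as the degrees of freedom grow: with only 2 degrees of freedom the estimates vary widely from sample to sample, while with 29 they cluster much more tightly around the true value.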

Degrees of Freedom Test Types

Degrees of freedom also play an important role in statistical tests. For example, in a t-test using a Student’s t-distribution, degrees of freedom affect both the shape of the t-distribution and the critical values you use to reject the null hypothesis.

Student’s t-Test

The t-distribution is a distribution quite similar to a normal distribution. It’s unimodal, bell-shaped, and symmetric. The main distinction between a t-distribution and a normal distribution is that a t-distribution has fatter tails, and its shape depends on degrees of freedom.

The higher the degrees of freedom on a t-distribution, the closer the shape of the distribution will be to a normal distribution.
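To see the fatter tails numerically, the sketch below evaluates the density of a t-distribution 3 units from the center for several degrees of freedom and compares it with the standard normal density at the same point. It uses the standard formula for the t density, written with `math.lgamma` so that large degrees of freedom do not overflow. The function names `t_pdf` and `normal_pdf` are chosen for this example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom at x."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


for df in (1, 5, 30, 1000):
    print(f"d.f. = {df:4d}: t density at 3 = {t_pdf(3.0, df):.5f}")
print(f"normal density at 3 = {normal_pdf(3.0):.5f}")
```

The tail density should fall as the degrees of freedom rise, approaching the normal value from above.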

Figure: t-distributions with varying degrees of freedom

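When you’re using a t-distribution in a 1-sample t-test, you can determine the degrees of freedom by subtracting 1 from your sample size n. If you had a sample size of 10, for example, your degrees of freedom would be 10-1 = 9. If you are conducting a 2-sample t-test that pools the variances of the two samples, your degrees of freedom will equal n-2, where n is the combined size of both samples (n1 + n2 - 2). You subtract 2 because you estimate 2 sample means, 1 for each group.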
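As a quick sketch, the two rules can be written as small Python helpers. The function names are invented for this example, and the 2-sample version assumes a pooled-variance test in which the combined sample size is n1 + n2.

```python
def df_one_sample_t(n):
    """Degrees of freedom for a 1-sample t-test with sample size n."""
    if n < 2:
        raise ValueError("need at least two observations")
    return n - 1


def df_two_sample_pooled_t(n1, n2):
    """Degrees of freedom for a pooled 2-sample t-test (assumption: pooled variances)."""
    if n1 < 1 or n2 < 1 or n1 + n2 < 3:
        raise ValueError("need at least three observations across both samples")
    # one mean is estimated per group, so 2 degrees of freedom are used up
    return n1 + n2 - 2


print(df_one_sample_t(10))             # 9
print(df_two_sample_pooled_t(10, 12))  # 20
```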

Figure: degrees of freedom for sample tests

Chi-square Tests (χ²)

We use chi-square distributions in several statistical tests including tests for goodness of fit and tests of independence between categorical variables.

Just like the Student’s t-distribution, the shape of a chi-square distribution depends on degrees of freedom. When dealing with chi-square distributions, we represent degrees of freedom with the lowercase letter k.

Figure: chi-square distributions for different degrees of freedom

A chi-square distribution is the distribution of the sum of squares of k independent standard normal random variables (normally distributed with a mean of 0 and a standard deviation of 1). To find the degrees of freedom for a chi-square distribution, you count the number of independent standard normal random variables used to construct the sum of squares.

For example, say you have a standard normal random variable $\chi$. Then $\chi^2$ is a random variable with a chi-square distribution with 1 degree of freedom. If you have 2 independent standard normal random variables $\chi_1$ and $\chi_2$, then $\chi_1^2 + \chi_2^2$ will have a chi-square distribution with k=2 degrees of freedom.
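You can check this construction with a short simulation. The sketch below builds chi-square draws by summing k squared standard normal draws and then averages them. It relies on one fact not stated above: the mean of a chi-square distribution equals its degrees of freedom k. The function names, the number of draws, and the seed are arbitrary choices for illustration.

```python
import random


def chi_square_draw(k, rng):
    """One draw built as the sum of k squared independent standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def average_chi_square(k, n_draws=20000, seed=0):
    """Average of many simulated chi-square draws with k degrees of freedom."""
    rng = random.Random(seed)
    return sum(chi_square_draw(k, rng) for _ in range(n_draws)) / n_draws


for k in (1, 2, 5):
    print(f"k = {k}: simulated mean = {average_chi_square(k):.2f}")
```

Each simulated mean should land close to k, which is one way to see that the degrees of freedom count the squared terms in the sum.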

In a chi-square test of independence, you can find the degrees of freedom by looking at a contingency table (also called a cross-tabulation) of your data. The degrees of freedom equal the number of columns listed under 1 of the categorical variables minus 1, multiplied by the number of rows listed under your other categorical variable minus 1.

Degrees of Freedom for a Chi-Square Test of Independence

d.f. = k = (number of columns - 1) × (number of rows - 1)

The contingency table below has 3 columns listed under Variable A and 2 rows listed for Variable B, and therefore, the degrees of freedom equal:

$$k = (3-1) \times (2-1) = 2$$

Figure: contingency table with 3 columns for Variable A and 2 rows for Variable B
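The sketch below computes the degrees of freedom for a contingency table and shows why the answer is 2 for a table with 3 columns and 2 rows: once the row and column totals are known, filling in just 2 cells determines all the rest. The counts and function names are made up for illustration.

```python
def chi_square_independence_df(table):
    """(number of rows - 1) x (number of columns - 1) for a rectangular table."""
    n_rows = len(table)
    n_cols = len(table[0])
    if any(len(row) != n_cols for row in table):
        raise ValueError("every row must have the same number of columns")
    return (n_rows - 1) * (n_cols - 1)


def complete_table(free_cells, row_totals, col_totals):
    """Fill in a table from its (rows - 1) x (columns - 1) free cells and its totals."""
    # the last cell in each of the first rows is fixed by that row's total
    rows = [list(row) + [row_totals[i] - sum(row)] for i, row in enumerate(free_cells)]
    # the entire last row is fixed by the column totals
    last_row = [col_totals[j] - sum(row[j] for row in rows) for j in range(len(col_totals))]
    return rows + [last_row]


counts = [[10, 20, 30], [15, 25, 35]]  # made-up counts: 2 rows, 3 columns
print(chi_square_independence_df(counts))  # 2
print(complete_table([[10, 20]], row_totals=[60, 75], col_totals=[25, 45, 65]))
# [[10, 20, 30], [15, 25, 35]]
```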

Degrees of Freedom in Linear Regression

Linear regression is a method statisticians use to study the relationship between variables. Once again, degrees of freedom play a crucial role.

In linear regression, the residual degrees of freedom equal the number of observations n minus the number of independent variables in your regression k, minus 1. The extra 1 accounts for the intercept, which is also estimated from the data.

Degrees of Freedom in Linear Regression

d.f. = n-k-1

Where:

  • n is your sample size

  • k is the number of independent variables in the regression
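As a minimal sketch, the formula translates directly into Python. The function name `regression_residual_df` is made up for this example, and it assumes the regression includes an intercept.

```python
def regression_residual_df(n_observations, n_predictors):
    """Residual degrees of freedom, n - k - 1, assuming the model has an intercept."""
    df = n_observations - n_predictors - 1
    if df <= 0:
        raise ValueError("not enough observations for this many independent variables")
    return df


# 50 observations and 3 independent variables leave 46 degrees of freedom
print(regression_residual_df(50, 3))  # 46
```

Each independent variable you add costs 1 degree of freedom, so a model with many variables and few observations leaves little information for estimating the error.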

Degrees of freedom is a measure of the number of independent pieces of information used in calculating a statistical estimate. In inferential statistics, you’ll come across degrees of freedom as you calculate sample statistics, as you construct confidence intervals or conduct hypothesis tests, and as you run regressions.

Remember, degrees of freedom typically depend on both your sample size and the number of parameters you’re trying to estimate. All else being equal, a statistical analysis where the degrees of freedom are high is typically an analysis with more power and more reliable estimates.
