An introduction to simple linear regression

Regression models describe the relationship between variables by fitting a line to the observed data. Linear regression models use a straight line, while logistic and nonlinear regression models use a curved line. Regression allows you to estimate how a dependent variable changes as the independent variable(s) change.

Simple linear regression is used to estimate the relationship between two quantitative variables. You can use simple linear regression when you want to know:

  1. How strong the relationship is between two variables (e.g. the relationship between rainfall and soil erosion).
  2. The value of the dependent variable at a certain value of the independent variable (e.g. the amount of soil erosion at a certain level of rainfall).

You are a social researcher interested in the relationship between income and happiness. You survey 500 people whose incomes range from $15k to $75k and ask them to rate their happiness on a scale from 1 to 10.

Your independent variable (income) and dependent variable (happiness) are both quantitative, so you can do a regression analysis to see if there is a linear relationship between them.

If you have more than one independent variable, use multiple linear regression instead.
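
As a rough illustration of the income and happiness example, the sketch below fits a least-squares line using only the Python standard library. The survey data are simulated: the "true" slope, intercept and noise level are made-up assumptions, and the function name `fit_simple_linear_regression` is hypothetical rather than part of any library.

```python
import random
from math import sqrt


def fit_simple_linear_regression(x, y):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation: strength of the linear relationship
    return slope, intercept, r


# Simulated survey (assumption): 500 people, income in $1000s between 15 and 75.
rng = random.Random(0)
income = [rng.uniform(15, 75) for _ in range(500)]
# Made-up relationship plus random noise; not real survey results.
happiness = [0.5 + 0.07 * inc + rng.gauss(0, 0.7) for inc in income]

slope, intercept, r = fit_simple_linear_regression(income, happiness)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.3f}")

# Estimated happiness at an income of $40k, using the fitted line.
print(f"predicted happiness at $40k: {intercept + slope * 40:.2f}")
```

The slope estimates how much the happiness score changes for each additional $1k of income, and `r` summarizes how strong the linear relationship is, which correspond to the two questions listed above.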

An introduction to t-tests

A t-test is a statistical test that is used to compare the means of two groups. It is often used in hypothesis testing to determine whether a process or treatment actually has an effect on the population of interest, or whether two groups are different from one another.

You want to know whether the mean petal length of iris flowers differs according to their species. You find two different species of irises growing in a garden and measure 25 petals of each species. You can test the difference between these two groups using a t-test.

  • The null hypothesis (H0) is that the true difference between these group means is zero.
  • The alternate hypothesis (Ha) is that the true difference is different from zero.
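
A minimal sketch of how the test statistic for this comparison could be computed, using only the Python standard library. It uses Welch's version of the two-sample t-test, which does not assume equal variances; that choice, the function name `welch_t`, and the simulated petal lengths are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Sample variance divided by sample size, one term per group.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Simulated petal lengths in cm (made-up values), 25 petals per species.
rng = random.Random(1)
species_a = [rng.gauss(4.3, 0.5) for _ in range(25)]
species_b = [rng.gauss(5.6, 0.5) for _ in range(25)]

t, df = welch_t(species_a, species_b)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The standard library has no t distribution, so this sketch stops at the t statistic and degrees of freedom. To finish the test you would compare |t| against a critical value from a t table (roughly 2.0 at the 5% level for about 48 degrees of freedom) or use a statistics package to obtain the p-value.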

Statistical tests: which one should you use?

Statistical tests are used in hypothesis testing. They can be used to:

  • determine whether a predictor variable has a statistically significant relationship with an outcome variable.
  • estimate the difference between two or more groups.

Statistical tests assume a null hypothesis of no relationship or no difference between groups. Then they determine whether the observed data fall outside of the range of values predicted by the null hypothesis.
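
One way to make this logic concrete is a permutation test, sketched below with the Python standard library. This is a swapped-in illustration rather than the procedure any particular test uses: the group labels are shuffled many times to see what differences the null hypothesis of "no difference" would produce, and the observed difference is compared against that range. The function name `permutation_p_value` and the example measurements are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the labels are interchangeable, so shuffle them.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up measurements for two groups.
control = [12, 15, 11, 14, 13, 16, 12, 15]
treated = [17, 19, 16, 20, 18, 21, 17, 19]
print(permutation_p_value(control, treated))
```

A small p-value means the observed difference rarely arises when the labels are shuffled, which is the sense in which the data "fall outside" what the null hypothesis predicts.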

If you already know what types of variables you’re dealing with, you can use the flowchart to choose the right statistical test for your data.

[Figure: Statistical tests flowchart]

A guide to experimental design

An experiment is a type of research method in which you manipulate one or more independent variables and measure their effect on one or more dependent variables. Experimental design means creating a set of procedures to test a hypothesis.

A good experimental design requires a strong understanding of the system you are studying. By first considering the variables and how they are related (Step 1), you can make predictions that are specific and testable (Step 2).

How widely and finely you vary your independent variable (Step 3) will determine the level of detail and the external validity of your results. Your decisions about randomization, experimental controls, and independent vs repeated-measures designs (Step 4) will determine the internal validity of your experiment.
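
The randomization part of Step 4 can be sketched in a few lines of Python. This assumes a simple completely randomized design with equal group sizes; the subject IDs, group names and the function name `assign_groups` are hypothetical.

```python
import random


def assign_groups(subjects, groups, seed=None):
    """Randomly assign each subject to exactly one group, keeping group sizes balanced."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # random order removes any systematic pattern in who goes where
    assignment = {group: [] for group in groups}
    for index, subject in enumerate(shuffled):
        # Deal subjects out to the groups in turn, like dealing cards.
        assignment[groups[index % len(groups)]].append(subject)
    return assignment


# Hypothetical subject IDs and treatment levels.
subjects = [f"subject_{i:02d}" for i in range(1, 13)]
assignment = assign_groups(subjects, ["control", "treatment"], seed=42)
for group, members in assignment.items():
    print(group, members)
```

Fixing the seed makes the assignment reproducible for record keeping. A repeated-measures design would instead randomize the order of treatments within each subject.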

Understanding types of variables

In statistical research, a variable is defined as an attribute of an object of study. Choosing which variables to measure is central to good experimental design.

If you want to test whether some plant species are more salt-tolerant than others, some key variables you might measure include the amount of salt you add to the water, the species of plants being studied, and variables related to plant health like growth and wilting.

You need to know which types of variables you are working with in order to choose appropriate statistical tests and interpret the results of your study.

You can usually identify the type of variable by asking two questions:

  1. What type of data does the variable contain?
  2. What part of the experiment does the variable represent?

A step-by-step guide to hypothesis testing

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is most often used by scientists to test specific predictions, called hypotheses, that arise from theories.

There are 5 main steps in hypothesis testing:

  1. State your research hypothesis as a null (H0) and alternate (Ha) hypothesis.
  2. Collect data in a way designed to test the hypothesis.
  3. Perform an appropriate statistical test.
  4. Decide whether to reject or fail to reject the null hypothesis.
  5. Present the findings in your results and discussion section.

Though the specific details might vary, the procedure you will use when testing a hypothesis will always follow some version of these steps.
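
To show how the steps fit together, here is a small sketch using an exact binomial test, which needs only the Python standard library. The scenario (20 participants, 16 of whom improve after a treatment), the 0.05 significance level and the function name `binomial_two_sided_p` are all assumptions chosen for illustration.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p0=0.5):
    """Exact two-sided p-value: total probability of outcomes no more likely than the observed one."""
    probs = [comb(trials, k) * p0 ** k * (1 - p0) ** (trials - k) for k in range(trials + 1)]
    observed = probs[successes]
    # Small tolerance so outcomes that tie with the observed probability are included.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))


# Step 1: H0: the chance of improvement is 0.5; Ha: it differs from 0.5.
# Step 2: collect data (made-up numbers): 16 of 20 participants improved.
improved, participants = 16, 20

# Step 3: perform the test.
p_value = binomial_two_sided_p(improved, participants)

# Step 4: decide, using a significance level chosen before looking at the data.
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"

# Step 5: report the result.
print(f"p = {p_value:.4f}, decision: {decision}")
```

Other tests change only Step 3 and the way the p-value is computed; the surrounding steps stay the same.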
