What is data dredging?

Data dredging (also called p-hacking) is the practice of analyzing data in many different ways until a pattern emerges that can be presented as statistically significant, when in reality there is no underlying effect.

This can be achieved in a number of ways, such as:

  • Selectively excluding participants whose data weaken the result
  • Checking the results repeatedly and stopping data collection as soon as the p-value drops below 0.05
  • Analyzing many outcomes, but only reporting those with p < 0.05
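
To make the last two items concrete, here is a minimal simulation sketch in Python. It is an illustration under stated assumptions, not a standard procedure: every dataset is pure noise (so any "significant" result is a false positive), the test is a two-sided z-test with known variance, outcomes are independent, and all function names and parameters (fpr_many_outcomes, fpr_optional_stopping, the sample sizes, the number of simulated studies) are hypothetical choices made for this example.

```python
import random
from statistics import NormalDist

ALPHA = 0.05
STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def two_sided_p(z):
    """Two-sided p-value for a z statistic."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def null_p_value(rng, n):
    """p-value of a z-test (H0: mean = 0) on n draws from N(0, 1).

    The null hypothesis is true by construction, so p < ALPHA is a false positive.
    """
    total = sum(rng.gauss(0.0, 1.0) for _ in range(n))
    return two_sided_p(total / n ** 0.5)  # sum / sqrt(n) is N(0, 1) under H0


def fpr_many_outcomes(n_outcomes, n_studies=2000, n=30, seed=0):
    """Share of simulated studies where at least one of n_outcomes has p < ALPHA."""
    rng = random.Random(seed)
    hits = sum(
        any(null_p_value(rng, n) < ALPHA for _ in range(n_outcomes))
        for _ in range(n_studies)
    )
    return hits / n_studies


def fpr_optional_stopping(batch=10, max_n=100, n_studies=2000, seed=0):
    """Share of simulated studies that reach p < ALPHA at any interim look.

    Data arrive in batches; the test is rerun after each batch and
    collection stops at the first "significant" result.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        total, n = 0.0, 0
        while n < max_n:
            total += sum(rng.gauss(0.0, 1.0) for _ in range(batch))
            n += batch
            if two_sided_p(total / n ** 0.5) < ALPHA:
                hits += 1
                break
    return hits / n_studies


if __name__ == "__main__":
    print("1 outcome:         ", fpr_many_outcomes(1))
    print("20 outcomes:       ", fpr_many_outcomes(20))
    print("optional stopping: ", fpr_optional_stopping())
```

With a single pre-planned test, the false positive rate should stay close to the nominal 5%. With 20 independent outcomes, the chance that at least one falls below 0.05 is 1 - 0.95^20, or about 64%, and the simulation should land near that value. Peeking after every batch also pushes the rate well above 5%, even though each individual test uses the same threshold.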

A major driver of this practice is the widespread notion in the academic community that only statistically significant findings are noteworthy. The same notion leads to publication bias, in which significant results are more likely to be published than null results.