Statistical Significance
Are research findings the result of a particular intervention or due to chance?

An important concept in education research is the “p value.” It is the outcome of a significance test that starts from the pessimistic assumption that a researcher’s results are merely due to chance. Large values of p mean the results are consistent with chance, although they do not prove that chance was responsible. Small values of p are evidence against the idea that chance alone explains the outcome.

As you might imagine, researchers usually hope to obtain small p values. If p is less than 5 percent, the result is often called “statistically significant” because it means there is less than a 5 percent chance of seeing a result at least as extreme as the one the researcher obtained if chance alone were at work. If p is less than 1 percent, the results may be called “highly statistically significant.”
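
To make the definition concrete, here is a rough sketch in Python of one way a p value can be estimated, known as a permutation test. It is an illustration, not the method any particular study uses. The test scores are invented, and the function name permutation_p_value is hypothetical. The idea: if chance alone were at work, it would not matter which students were labeled "tutored," so the code shuffles the labels many times and counts how often a gap at least as large as the observed one turns up.

```python
import random
from statistics import mean

def permutation_p_value(treated, control, n_shuffles=10_000, seed=0):
    """Estimate a two-sided p value for the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_shuffles):
        # Reassign students to the two groups purely by chance.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            extreme += 1
    # Share of chance-only shuffles at least as extreme as the real result.
    return extreme / n_shuffles

# Hypothetical test scores, invented for illustration only.
tutored = [78, 85, 90, 72, 88, 95, 81, 84]
not_tutored = [70, 75, 80, 68, 77, 82, 74, 79]

p = permutation_p_value(tutored, not_tutored)
print(f"p = {p:.3f}")
```

If the printed value is below 0.05, the gap between the two invented groups would conventionally be called statistically significant.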

The p value is not only among the most widely used statistical terms; it is also among the most misused. One common misuse involves using p values in studies in which no chance process occurred, as when a test is given to every single student in a school district instead of to a random selection of students. The p value cannot tell you whether your results were affected by factors unrelated to chance – such as the fact that some of the students in the district were absent the day of the test. After all, the students probably weren’t randomly selected to miss school. 

Another common misuse is to assume that statistical significance is equivalent to obtaining big, meaningful or important results. Unfortunately, a p value tells you nothing about the magnitude or real-world importance of the results. It is important to keep in mind, for instance, that p values are very sensitive to sample size. For an effect of a given size, the bigger the sample, the more likely the findings are to be statistically significant. So although studies with many thousands of participants may yield highly statistically significant results, the actual effect of the program or intervention may be marginal.
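
The sample-size point can be illustrated with a short Python sketch. The numbers are assumptions chosen for illustration: a program that raises scores by 1 point on a test whose scores have a standard deviation of 15 points, which is a very small effect. The function name two_sample_p_value is hypothetical, and it uses a simple normal approximation rather than the exact test a researcher might choose.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_p_value(mean_diff, sd, n_per_group):
    """Two-sided p value from a normal (z) approximation, equal group sizes."""
    # The standard error shrinks as the groups get larger.
    standard_error = sd * sqrt(2 / n_per_group)
    z = abs(mean_diff) / standard_error
    return 2 * (1 - NormalDist().cdf(z))

# Hypothetical numbers: the same 1-point gain, with more and more students.
for n in (100, 1_000, 10_000, 100_000):
    p = two_sample_p_value(mean_diff=1.0, sd=15.0, n_per_group=n)
    print(f"n per group = {n:>7,}: p = {p:.4f}")
```

The effect is the same 1-point gain in every row. Only the number of students changes, yet the p value moves from well above 5 percent in the smaller samples to far below 1 percent in the larger ones.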

– Holly Yettick