Scientific Editing Services, Manuscript Editing Service

A discussion of p values

The p value is perhaps the most familiar statistical quantity in modern scientific discourse. New graduate students and lay readers often mistake it for a summary of the entirety of the empirical data in a given study. What began as a useful tool for decision making in hypothesis testing has become a one-trick litmus test for whether results are significant and, by unfortunate extension, publishable.

Therefore, we would like to provide a brief, accurate portrayal of the p value and the manner in which it should be used and interpreted. This article will serve authors as a helpful update on the current status of the p value as a tool in the scientific community. Be aware that if a misunderstanding of p values is evident in your manuscript, then you can reasonably expect an outright rejection by the reviewer.

The p value was originally calculated as a quantity that would describe a given set of data under an assumed null hypothesis. Pierre-Simon Laplace, who also provided the mathematical description of surface tension, was among the first to calculate p values, in an attempt to determine whether observed differences in the sex ratio at birth were “real.” Thus, the notion originated that the p value could detect whether differences were real or, alternatively, due to chance. The utility of the p value was that it would establish a common, standardized decision-making process for rejecting or failing to reject hypotheses based on empirical data. As proposed by Ronald Fisher, the threshold for rejecting the null hypothesis would be set at p < 0.05. Importantly, this was an arbitrary value adopted as a convention by working scientists; it is not derived from statistical theory.

So, given the utility of the p value, what exactly is being calculated?

The p value is a description of the data; it is not a description of the hypothesis. The value indicates the probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed. This is a valuable tool for deciding whether to reject the null hypothesis. As a community, scientists have agreed on a threshold for rejecting the null. This threshold, the significance level, fixes the probability of falsely rejecting a true null (type I error); the probability of failing to reject a false null (type II error) depends additionally on the sample size and the true effect size. The p value therefore appears to provide an intuitive indication of the likelihood that apparent differences are “real.”
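To make the definition concrete, here is a minimal sketch of an exact one-sided binomial test, written in Python with only the standard library. The data (60 boys among 100 births), the null probability of 0.5, and all function names are hypothetical choices for illustration; they do not come from any study discussed here.

```python
from math import comb


def binomial_pmf(k: int, trials: int, prob: float) -> float:
    """P(X = k) for X ~ Binomial(trials, prob)."""
    return comb(trials, k) * prob**k * (1 - prob) ** (trials - k)


def upper_tail_p_value(observed: int, trials: int, null_prob: float = 0.5) -> float:
    """One-sided p value: P(X >= observed), assuming the null hypothesis is true."""
    return sum(binomial_pmf(k, trials, null_prob) for k in range(observed, trials + 1))


def type_i_error_rate(trials: int, alpha: float = 0.05, null_prob: float = 0.5) -> float:
    """Probability, under the null, of a count whose p value falls below alpha."""
    return sum(
        binomial_pmf(k, trials, null_prob)
        for k in range(trials + 1)
        if upper_tail_p_value(k, trials, null_prob) < alpha
    )


# Hypothetical data (not from the article): 60 boys among 100 births.
p_value = upper_tail_p_value(60, 100)
false_positive_rate = type_i_error_rate(100)
print(f"p value: {p_value:.4f}")
print(f"type I error rate at alpha = 0.05: {false_positive_rate:.4f}")
```

The first number is the probability of seeing 60 or more boys if boys and girls were equally likely; it says nothing about the probability that the null hypothesis itself is true. The second number shows that the threshold caps the type I error rate: because the data are discrete, the realized rate sits just below 0.05 rather than exactly at it.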

Unfortunately, this intuitiveness has steadily led to widespread misuse of the p value, and recent developments, such as the reproducibility crisis, have shifted attitudes concerning the use and reporting of p values. Understanding these changes is now critical to achieving success in the publication process.

In response to the uproar over p values, the American Statistical Association (ASA) released a formal statement on their use in 2016.

From The ASA's Statement on p-Values: Context, Process, and Purpose (2016) by Wasserstein and Lazar:

“P values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.”


“Scientific conclusions and business or policy decisions should not be based only on whether a p value passes a specific threshold.”


“A p value, or statistical significance, does not measure the size of an effect or the importance of a result.”

These three principles are critical to the modern use of p values. We will now provide some guidance for using the p value in your manuscript based on the preceding information.
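The third principle, that a p value does not measure the size of an effect, can be illustrated with a short sketch. It assumes a one-sample z test with a known standard deviation, and the effect size (0.1 standard deviations) and sample sizes are hypothetical numbers chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_p_value(mean_difference: float, sigma: float, n: int) -> float:
    """Two-sided p value for a one-sample z test with known standard deviation."""
    z = mean_difference / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical numbers: the same small effect (0.1 standard deviations)
# observed at three different sample sizes.
for n in (25, 400, 1600):
    print(n, two_sided_z_p_value(0.1, 1.0, n))
```

The effect is identical in all three cases, yet the p value falls from well above 0.05 to far below it as the sample grows. For this reason, a manuscript should report the effect size and its uncertainty alongside the p value, not the p value alone.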

Ronald L. Wasserstein & Nicole A. Lazar (2016) The ASA's Statement on p-Values: Context, Process, and Purpose, The American Statistician, 70:2, 129-133, DOI: 10.1080/00031305.2016.1154108
