Pearson's r
Computing Pearson's r
There are several formulas that can be used to compute Pearson's correlation. Some formulas make more conceptual sense whereas others are easier to actually compute. We are going to begin with a formula that makes more conceptual sense.
We are going to compute the correlation between the variables X and Y shown in Table 1. We begin by computing the mean for X and subtracting this mean from all values of X. The new variable is called "x". The variable "y" is computed similarly. The variables x and y are said to be deviation scores because each score is a deviation from the mean. Notice that the means of x and y are both 0. Next we create a new column by multiplying x and y.
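The deviation-score step can be sketched in a few lines of Python. This is only an illustration using the Table 1 data; the helper name `deviation_scores` is an assumption made for this example and does not come from the text.

```python
# Data from Table 1.
X = [1, 3, 5, 5, 6]
Y = [4, 6, 10, 12, 13]

def deviation_scores(values):
    """Subtract the mean from every value (hypothetical helper for illustration)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

x = deviation_scores(X)  # deviations of X from its mean of 4
y = deviation_scores(Y)  # deviations of Y from its mean of 9
xy = [a * b for a, b in zip(x, y)]  # products of the paired deviation scores

print(x)        # [-3.0, -1.0, 1.0, 1.0, 2.0]
print(y)        # [-5.0, -3.0, 1.0, 3.0, 4.0]
print(xy)       # [15.0, 3.0, 1.0, 3.0, 8.0]
print(sum(xy))  # 30.0
```

The deviation scores in each list sum to zero, which is why the means of x and y are both 0.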
Before proceeding with the calculations, let's consider why the sum of the xy column reveals the relationship between X and Y. If there were no relationship between X and Y, then positive values of x would be just as likely to be paired with negative values of y as with positive values. This would make negative values of xy as likely as positive values and the sum would be small. On the other hand, consider Table 1 in which high values of X are associated with high values of Y and low values of X are associated with low values of Y. You can see that positive values of x are associated with positive values of y and negative values of x are associated with negative values of y. In all cases, the product of x and y is positive, resulting in a high total for the xy column. Finally, if there were a negative relationship then positive values of x would be associated with negative values of y and negative values of x would be associated with positive values of y. This would lead to negative values for xy.
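To make the sign argument concrete, the following Python sketch computes the sum of products for three pairings of the same numbers. Only the first pairing is the actual Table 1 data; the reversed and shuffled Y orderings are hypothetical rearrangements invented for this illustration, and `sum_of_products` is an assumed helper name.

```python
def sum_of_products(X, Y):
    """Sum of the products of paired deviation scores (hypothetical helper)."""
    mean_x = sum(X) / len(X)
    mean_y = sum(Y) / len(Y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(X, Y))

X = [1, 3, 5, 5, 6]

# Table 1: high X values are paired with high Y values.
positive = sum_of_products(X, [4, 6, 10, 12, 13])

# Hypothetical: the same Y values in reverse order, so high X goes with low Y.
negative = sum_of_products(X, [13, 12, 10, 6, 4])

# Hypothetical: the same Y values in an order with no consistent pattern.
scattered = sum_of_products(X, [10, 4, 13, 12, 6])

print(positive)   # 30.0
print(negative)   # -27.0
print(scattered)  # 3.0
```

In the scattered ordering the positive and negative products nearly cancel, so the sum is close to zero.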
| | X | Y | x | y | xy | x² | y² |
|---|---|---|---|---|---|---|---|
| | 1 | 4 | -3 | -5 | 15 | 9 | 25 |
| | 3 | 6 | -1 | -3 | 3 | 1 | 9 |
| | 5 | 10 | 1 | 1 | 1 | 1 | 1 |
| | 5 | 12 | 1 | 3 | 3 | 1 | 9 |
| | 6 | 13 | 2 | 4 | 8 | 4 | 16 |
| Total | 20 | 45 | 0 | 0 | 30 | 16 | 60 |
| Mean | 4 | 9 | 0 | 0 | 6 | | |
Pearson's r is designed so that the correlation between height and weight is the same whether height is measured in inches or in feet. To achieve this property, Pearson's correlation is computed by dividing the sum of the xy column (Σxy) by the square root of the product of the sum of the x² column (Σx²) and the sum of the y² column (Σy²). The resulting formula is:

$$
r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}
$$

and therefore

$$
r = \frac{30}{\sqrt{(16)(60)}} = \frac{30}{\sqrt{960}} = \frac{30}{30.984} = 0.968
$$
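The deviation-score calculation described above can be written as a short Python function. This is a minimal sketch using the Table 1 data; the function name `pearson_r` is an assumption, and the division by 12 merely stands in for a change of units such as inches to feet.

```python
import math

def pearson_r(X, Y):
    """Pearson's r from deviation scores: sum(xy) / sqrt(sum(x^2) * sum(y^2))."""
    mean_x = sum(X) / len(X)
    mean_y = sum(Y) / len(Y)
    x = [a - mean_x for a in X]
    y = [b - mean_y for b in Y]
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)

# Data from Table 1.
X = [1, 3, 5, 5, 6]
Y = [4, 6, 10, 12, 13]

r = pearson_r(X, Y)
print(round(r, 3))  # 0.968

# Rescaling X by a constant factor (illustrative unit change) leaves r unchanged.
X_rescaled = [v / 12 for v in X]
print(math.isclose(pearson_r(X_rescaled, Y), r))  # True
```

Rescaling X multiplies the numerator and the denominator by the same factor, which is why the correlation does not depend on the unit of measurement.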
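An alternative computational formula that avoids the step of computing deviation scores is:

$$
r = \frac{\sum XY - \dfrac{\sum X \sum Y}{N}}{\sqrt{\left(\sum X^2 - \dfrac{\left(\sum X\right)^2}{N}\right)\left(\sum Y^2 - \dfrac{\left(\sum Y\right)^2}{N}\right)}}
$$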
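One standard raw-score form of this computation works directly from the sums of X, Y, XY, X², and Y², where N is the number of pairs. The Python sketch below assumes that form; the function name `pearson_r_computational` is hypothetical, and the data are again taken from Table 1.

```python
import math

def pearson_r_computational(X, Y):
    """Pearson's r from raw scores, without computing deviation scores."""
    n = len(X)
    sum_x = sum(X)
    sum_y = sum(Y)
    sum_xy = sum(a * b for a, b in zip(X, Y))
    sum_x2 = sum(a * a for a in X)
    sum_y2 = sum(b * b for b in Y)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt(
        (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
    )
    return numerator / denominator

# Data from Table 1.
X = [1, 3, 5, 5, 6]
Y = [4, 6, 10, 12, 13]

print(round(pearson_r_computational(X, Y), 3))  # 0.968
```

For the Table 1 data the numerator is 210 - 180 = 30 and the two terms under the square root are 96 - 80 = 16 and 465 - 405 = 60, matching the totals of the xy, x², and y² columns.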


