1. Meaning of Pearson’s Correlation
Karl Pearson’s coefficient of correlation measures the strength and direction of a linear relationship between two variables. It shows how closely the points lie around a straight line when plotted on a graph.
The value of the coefficient lies between −1 and +1:
- +1 → perfect positive linear correlation
- −1 → perfect negative linear correlation
- 0 → no linear correlation
This helps understand how strongly two variables move together and in which direction.
2. Formula for Pearson’s Correlation
The formula for Pearson’s coefficient (r), written in terms of deviations from the means, is:
\( r = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2} \sqrt{\sum (y - \bar{y})^2}} \)
Here:
- \( x, y \) = data values
- \( \bar{x}, \bar{y} \) = means of x and y
- The numerator shows how x and y vary together
- The denominator shows how much they vary individually
The value of r does not depend on the units of measurement.
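The deviation-form formula and the unit-independence claim can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `pearson_r` and the height/weight figures are hypothetical, chosen only to show that rescaling one variable (metres to centimetres) leaves r unchanged.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (formula in section 2)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # deviations of x from its mean
    dy = [y - mean_y for y in ys]  # deviations of y from its mean
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator

# Hypothetical data (assumption, for illustration only).
heights_m = [1.50, 1.60, 1.70, 1.80]
weights_kg = [52.0, 58.0, 67.0, 75.0]

r_metres = pearson_r(heights_m, weights_kg)

# The same heights expressed in centimetres: only the unit changes.
heights_cm = [h * 100 for h in heights_m]
r_centimetres = pearson_r(heights_cm, weights_kg)

print(round(r_metres, 4), round(r_centimetres, 4))
```

The two printed values should agree, because the scale factor appears in both the numerator and the denominator and cancels.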
3. Shortcut Formula
A commonly used shortcut formula for faster calculation is:
\( r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} \)
This formula works directly with the raw sums, so no deviations from the means need to be calculated. That makes it convenient for hand calculation on small datasets.
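The shortcut formula translates almost line for line into code. The sketch below assumes a hypothetical helper name `pearson_r_shortcut` and made-up sample data; it accumulates the five sums and plugs them into the formula.

```python
import math

def pearson_r_shortcut(xs, ys):
    """Pearson's r from raw sums (shortcut formula in section 3)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator

# Hypothetical data (assumption, for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r_shortcut(xs, ys)
print(round(r, 4))  # 0.7746, i.e. 30 / sqrt(50 * 30)
```

One caveat on this design: with floating-point data whose values are large relative to their spread, subtracting the raw sums can lose precision, so the deviation form is usually the safer choice in software even though the shortcut is quicker by hand.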
4. Interpretation of r
The numerical value of r helps understand the direction and strength of the relationship:
- r = +1 → perfect positive linear relation
- r close to +1 → strong positive relation
- r = 0 → no linear relation
- r close to −1 → strong negative relation
- r = −1 → perfect negative linear relation
The closer r is to ±1, the stronger the relationship.
5. Example
Consider the paired data:
| x | y |
|---|---|
| 2 | 5 |
| 4 | 9 |
| 6 | 12 |
Step 1: Compute sums:
- \( \sum x = 12 \)
- \( \sum y = 26 \)
- \( \sum xy = 2(5) + 4(9) + 6(12) = 118 \)
- \( \sum x^2 = 2^2 + 4^2 + 6^2 = 56 \)
- \( \sum y^2 = 5^2 + 9^2 + 12^2 = 250 \)
Step 2: Apply shortcut formula with \( n = 3 \):
\( r = \dfrac{3(118) - (12)(26)}{\sqrt{[3(56) - 144][3(250) - 676]}} \)
\( r = \dfrac{354 - 312}{\sqrt{(168 - 144)(750 - 676)}} \)
\( r = \dfrac{42}{\sqrt{24 \cdot 74}} = \dfrac{42}{\sqrt{1776}} \approx 0.997 \)
This indicates a very strong positive linear relationship between x and y.
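A hand calculation like this is easy to cross-check by running the same table through the deviation form from section 2. The snippet below is a small sketch for that purpose; the variable names are illustrative only.

```python
import math

# The paired data from the table in section 5.
xs = [2, 4, 6]
ys = [5, 9, 12]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of products and squares of deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
s_yy = sum((y - mean_y) ** 2 for y in ys)

r = s_xy / math.sqrt(s_xx * s_yy)
print(round(r, 3))
```

Because the deviation form and the shortcut formula are algebraically equivalent, the printed value should match the result obtained from the sums. A value outside the range −1 to +1 at any stage signals an arithmetic slip.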