Karl Pearson’s Coefficient of Correlation

Understand Karl Pearson’s coefficient of correlation, its formula, interpretation, and how it measures the strength and direction of a linear relationship using simple examples.

1. Meaning of Pearson’s Correlation

Karl Pearson’s coefficient of correlation measures the strength and direction of a linear relationship between two variables. It shows how closely the points lie around a straight line when plotted on a graph.

The value of the coefficient lies between −1 and +1:

  • +1 → perfect positive linear correlation
  • −1 → perfect negative linear correlation
  • 0 → no linear correlation

This helps understand how strongly two variables move together and in which direction.

2. Formula for Pearson’s Correlation

The formula for Pearson’s coefficient (r), written in terms of deviations from the means, is:

\( r = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2} \sqrt{\sum (y - \bar{y})^2}} \)

Here:

  • \( x, y \) = data values
  • \( \bar{x}, \bar{y} \) = means of x and y
  • The numerator shows how x and y vary together
  • The denominator shows how much they vary individually

The value of r does not depend on the units of measurement.
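The deviation formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `pearson_r` and the height/weight figures are made up for the example. It also checks the claim about units by computing r once with heights in centimetres and once in metres.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (hypothetical helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: how x and y vary together.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: how much each variable varies on its own.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))

# Illustrative (invented) data: heights and weights of four people.
heights_cm = [150.0, 160.0, 170.0, 180.0]
weights_kg = [52.0, 58.0, 69.0, 77.0]
heights_m = [h / 100 for h in heights_cm]  # same heights, different unit

r_cm = pearson_r(heights_cm, weights_kg)
r_m = pearson_r(heights_m, weights_kg)
print(round(r_cm, 4), round(r_m, 4))  # the two values agree
```

Rescaling x multiplies the numerator and the denominator by the same factor, so the factor cancels and r is unchanged.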

3. Shortcut Formula

A commonly used shortcut formula for faster calculation is:

\( r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} \)

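Here n is the number of paired observations. This formula is algebraically equivalent to the deviation formula but avoids calculating deviations from the means, which makes it convenient for hand calculation on small datasets.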
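As a rough check that the shortcut formula gives the same result as the deviation formula, the sketch below implements both and compares them. The function names and the sample data are assumptions chosen for illustration.

```python
import math

def pearson_r_shortcut(xs, ys):
    """Shortcut form: uses only raw sums, no deviations (hypothetical helper)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

def pearson_r_deviation(xs, ys):
    """Deviation form, for comparison (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented sample data for the comparison.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_short = pearson_r_shortcut(xs, ys)
r_dev = pearson_r_deviation(xs, ys)
print(round(r_short, 4), round(r_dev, 4))  # both forms give the same r
```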

4. Interpretation of r

The numerical value of r helps understand the direction and strength of the relationship:

  • r = +1 → perfect positive linear relation
  • r close to +1 → strong positive relation
  • r = 0 → no linear relation
  • r close to −1 → strong negative relation
  • r = −1 → perfect negative linear relation

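The closer r is to ±1, the stronger the linear relationship; the sign gives its direction. A value near 0 rules out only a linear pattern, not a nonlinear one.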
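The three reference points in this list (r = +1, r = −1, r = 0) can be illustrated with small constructed datasets. The sketch below is only a demonstration under assumed data: a rising straight line, a falling straight line, and a symmetric curve y = x², where y depends entirely on x yet the linear correlation is zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r via the deviation formula (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]

rising = [2 * x + 1 for x in xs]     # points on a line with positive slope
falling = [-3 * x + 10 for x in xs]  # points on a line with negative slope
curved = [x ** 2 for x in xs]        # symmetric parabola, not a line

r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)
r_curved = pearson_r(xs, curved)

print(round(r_rising, 6), round(r_falling, 6), round(r_curved, 6))
```

The parabola case shows why r = 0 is read as "no linear relation" rather than "no relation".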

5. Example

Consider the paired data:

x     y
2     5
4     9
6     12

Step 1: Compute sums:

  • \( \sum x = 12 \)
  • \( \sum y = 26 \)
  • \( \sum xy = 2(5) + 4(9) + 6(12) = 118 \)
  • \( \sum x^2 = 2^2 + 4^2 + 6^2 = 56 \)
  • \( \sum y^2 = 5^2 + 9^2 + 12^2 = 250 \)

Step 2: Apply the shortcut formula with n = 3:

\( r = \dfrac{3(118) - (12)(26)}{\sqrt{[3(56) - 144][3(250) - 676]}} \)

\( r = \dfrac{354 - 312}{\sqrt{(168 - 144)(750 - 676)}} \)

\( r = \dfrac{42}{\sqrt{24 \cdot 74}} \approx 0.997 \)

This indicates a strong positive linear relationship between x and y.
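A hand calculation like this is easy to get wrong in the arithmetic, so it can help to recompute it. The sketch below recomputes the sums and r for the paired data x = 2, 4, 6 and y = 5, 9, 12 with the shortcut formula; the variable names are chosen only for this illustration.

```python
import math

# Paired data from the example.
xs = [2, 4, 6]
ys = [5, 9, 12]
n = len(xs)

# Step 1: the five sums used by the shortcut formula.
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

# Step 2: substitute the sums into the shortcut formula.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r, 3))
```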