Spearman’s Rank Correlation

Understand Spearman’s rank correlation, how ranking works, the formula, and how to measure the strength of monotonic relationships using simple examples.

1. Meaning of Spearman’s Rank Correlation

Spearman’s rank correlation measures how closely the rank orderings of two variables agree. It assesses the monotonic relationship between two sets of data: whether one variable tends to increase (or decrease) as the other increases, even if the relationship is not linear.

It is useful when the data is based on rankings or when the values do not follow a straight-line pattern.

2. Assigning Ranks to Data

To use Spearman’s method, each set of values must first be converted into ranks.

Ranking steps:

  • Arrange the values in increasing or decreasing order.
  • Give rank 1 to the smallest or largest value (depending on the chosen order).
  • If two values are equal (ties), assign them the average of their rank positions.

2.1. Example of Ranking

Suppose the marks of five students are:

A = 50, B = 72, C = 60, D = 72, E = 40

Ranks (largest mark gets rank 1):

  • 72 (B and D) → positions 1 and 2 → tied → each gets rank \(\dfrac{1+2}{2} = 1.5\)
  • 60 (C) → rank 3
  • 50 (A) → rank 4
  • 40 (E) → rank 5
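The ranking steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `average_ranks` and its `descending` flag are assumptions introduced here.

```python
def average_ranks(values, descending=True):
    """Return the rank of each value; tied values share the average of their positions."""
    # Indices of the values, sorted in the chosen order.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` across the run of equal values that starts at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions are 1-based, so the run occupies positions pos+1 .. end+1.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


marks = {"A": 50, "B": 72, "C": 60, "D": 72, "E": 40}
ranks = dict(zip(marks, average_ranks(list(marks.values()))))
print(ranks)  # {'A': 4.0, 'B': 1.5, 'C': 3.0, 'D': 1.5, 'E': 5.0}
```

Passing `descending=False` gives rank 1 to the smallest value instead; either order works as long as both variables are ranked the same way.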

3. Formula for Spearman’s Rank Correlation

The formula for Spearman’s coefficient is:

\( r_s = 1 - \dfrac{6\sum d_i^2}{n(n^2 - 1)} \)

Where:

  • \( n \) = number of pairs
  • \( d_i \) = difference between the two ranks of the \( i \)-th pair
  • \( r_s \) lies between −1 and +1

This formula is exact only when there are no tied ranks. When ties exist, \( r_s \) is computed as the Pearson correlation coefficient of the two sets of ranks, but the idea remains the same.
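As a rough sketch, the formula can be coded directly and compared with the Pearson correlation of the ranks, which is the usual way to handle ties. The function names and the rank lists below are hypothetical, chosen only for illustration.

```python
from math import sqrt


def spearman_no_ties(rank_x, rank_y):
    """r_s = 1 - 6*sum(d_i^2) / (n*(n^2 - 1)); exact only when no ranks are tied."""
    n = len(rank_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


def pearson(u, v):
    """Pearson correlation; applied to ranks it gives r_s even when ties are present."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


# No ties: both approaches agree (sum of d^2 is 4, so r_s is 0.8).
rank_x = [1, 2, 3, 4, 5]
rank_y = [2, 1, 4, 3, 5]
print(spearman_no_ties(rank_x, rank_y), pearson(rank_x, rank_y))

# With a tie in X (two values sharing rank 1.5) the shortcut is only approximate.
tied_x = [1.5, 1.5, 3, 4]
tied_y = [1, 2, 3, 4]
print(spearman_no_ties(tied_x, tied_y))  # shortcut formula: about 0.95
print(pearson(tied_x, tied_y))           # Pearson on ranks: about 0.949
```

The gap between the last two values is small here, but it grows as the number of tied ranks increases.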

4. Interpretation of \( r_s \)

The value of Spearman’s rank correlation shows the direction and strength of a monotonic relation:

  • +1 → perfect increasing ranking pattern
  • −1 → perfect decreasing ranking pattern
  • 0 → no monotonic relationship

The closer the value is to +1 or −1, the stronger the monotonic relationship; values near 0 indicate a weak one.
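To make these cases concrete, the sketch below evaluates \( r_s \) on three small, made-up data sets: a curved but always-increasing relation, an always-decreasing one, and a scrambled one. The helper names are assumptions, and the data contain no ties, so the simple formula applies.

```python
def simple_ranks(values):
    """Rank distinct values in increasing order (1 = smallest); assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(x, y):
    """Spearman's r_s via the no-ties formula."""
    rank_x, rank_y = simple_ranks(x), simple_ranks(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


x = [1, 2, 3, 4, 5]
print(spearman(x, [v ** 3 for v in x]))  # 1.0: not linear, but perfectly increasing
print(spearman(x, [-v for v in x]))      # -1.0: perfectly decreasing
print(spearman(x, [3, 1, 5, 2, 4]))      # about 0.3: far from +1 and -1
```

The first case shows why the coefficient describes monotonic rather than linear relationships: cubing the values changes their spacing but not their order.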

5. Example

Consider the following data of two variables X and Y:

X     Y
50    60
70    80
60    65
90    85

Step 1: Assign ranks.

X ranks (largest gets rank 1):

  • 90 → 1
  • 70 → 2
  • 60 → 3
  • 50 → 4

Y ranks:

  • 85 → 1
  • 80 → 2
  • 65 → 3
  • 60 → 4

Step 2: Compute rank differences:

Rank X    Rank Y    d    d²
1         1         0    0
2         2         0    0
3         3         0    0
4         4         0    0

Step 3: Apply the formula:

\( r_s = 1 - \dfrac{6(0)}{4(4^2 - 1)} = 1 \)

This means the rankings of X and Y match perfectly.
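The three steps of this example can be checked with a short script. It is a sketch that assumes distinct values (true for this data); `descending_ranks` is a hypothetical helper name.

```python
x = [50, 70, 60, 90]
y = [60, 80, 65, 85]


def descending_ranks(values):
    """Rank 1 goes to the largest value; assumes all values are distinct."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


# Step 1: assign ranks (listed in the original data order).
rank_x = descending_ranks(x)  # [4, 2, 3, 1]
rank_y = descending_ranks(y)  # [4, 2, 3, 1]

# Step 2: rank differences for each pair.
d = [rx - ry for rx, ry in zip(rank_x, rank_y)]

# Step 3: apply the formula.
n = len(x)
r_s = 1 - 6 * sum(di ** 2 for di in d) / (n * (n ** 2 - 1))
print(d, r_s)  # [0, 0, 0, 0] 1.0
```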