Spearman correlation
Spearman rank correlation
The Spearman correlation coefficient (or Spearman’s $\rho$) measures rank correlation between two variables.
Spearman’s $\rho$ always lies between $-1$ and $1$. The extremes are reached when the two variables are perfectly monotonically related, and represent completely opposite or identical ranks, respectively [1].
Because it depends only on ranks, Spearman’s $\rho$ is typically used for ordinal data, although it can also be applied to discrete and continuous values.
If we consider a dataset of size $n$ with scores $X_i, Y_i$, we can calculate the ranks $\operatorname{R}({X_i}), \operatorname{R}({Y_i})$ and then Spearman’s $\rho$, written $r_s$ for a sample, as
$$ r_s = \rho_{\operatorname{R}(X),\operatorname{R}(Y)} = \frac{\operatorname{cov}(\operatorname{R}(X), \operatorname{R}(Y))} {\sigma_{\operatorname{R}(X)} \sigma_{\operatorname{R}(Y)}}, $$
Here $\rho$ denotes the Pearson correlation coefficient, applied to the rank variables; $\operatorname{cov}(\operatorname{R}(X), \operatorname{R}(Y))$ is the covariance of the rank variables, and $\sigma_{\operatorname{R}(X)}$ and $\sigma_{\operatorname{R}(Y)}$ are the standard deviations of the rank variables.
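The definition above can be sketched in plain Python as "rank both variables, then take the Pearson correlation of the ranks". This is only an illustrative sketch: the function names `average_ranks`, `pearson`, and `spearman_rho` are hypothetical, and giving tied values the mean of their positions is an assumption (a common convention, not something the text above specifies).

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions.

    The tie-handling rule is an assumption made for this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; the 1/n factors of cov and sigma cancel out."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# A monotonically increasing but non-linear relationship has identical ranks.
scores_x = [10, 20, 30, 40, 50]
scores_y = [1, 4, 9, 16, 25]
print(spearman_rho(scores_x, scores_y))  # 1.0
```

Note that the sketch divides by zero if either variable is constant, since the standard deviation of its ranks is then zero.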
If all $n$ ranks are distinct integers, the simplified form can be applied:
$$ r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}, $$
where $d_i = \operatorname{R}(X_i) - \operatorname{R}(Y_i)$ is the difference between the two ranks of each observation, and $n$ is the number of observations.
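As a rough illustration of the simplified form, the sketch below computes $r_s$ from the squared rank differences. The helper names `distinct_ranks` and `spearman_rho_simplified` are hypothetical, and the function deliberately rejects tied values, since the shortcut is only stated for distinct ranks.

```python
def distinct_ranks(values):
    """Return 1-based integer ranks; raise if any value is repeated."""
    if len(set(values)) != len(values):
        raise ValueError("the simplified formula requires distinct values")
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_rho_simplified(x, y):
    """Compute r_s = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    rank_x = distinct_ranks(x)
    rank_y = distinct_ranks(y)
    # d_i is the difference between the two ranks of observation i.
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Two adjacent pairs are swapped: d = (-1, 1, -1, 1, 0), so sum(d_i^2) = 4
# and r_s = 1 - 24 / 120.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(spearman_rho_simplified(x, y))
```

With five observations and $\sum d_i^2 = 4$, the formula gives $1 - 24/120 = 0.8$; fully reversed ranks give $\sum d_i^2 = 40$ and $r_s = -1$.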
[1] Assuming no repeated ranks.