Distance metrics
L-p metrics
Manhattan distance (L1)
Given two vectors $p$ and $q$, such that
$$ \begin{aligned} p &= \left(p_1, p_2, \dots,p_n\right) \ q &= \left(q_1, q_2, \dots,q_n\right) \end{aligned} $$
we define the Manhattan distance as:
$$ d_1(p, q) = |p - q|1 = \sum{i=1}^n |p_i-q_i| $$
Euclidean distance (L2)
In general, for points given by Cartesian coordinates in $n$-dimensional Euclidean space, the distance is
$$ d(p,q)=\sqrt {(p_{1}-q_{1})^{2}+(p_{2}-q_{2})^{2}+\dots +(p_{i}-q_{i})^{2}+\dots +(p_{n}-q_{n})^{2}} $$
Cluster distances
Within-cluster sum of squares (WCSS)
Given a set of observations ($x_1, x_2,\dots,x_n$), where each observation is a $d$-dimensional real vector, $k$-means clustering aims to partition the $n$ observations into $k$ ($\leq n$) sets $S=\lbrace S_1, S_2, \dots, S_k\rbrace$ so as to minimize the within-cluster sum of squares (WCSS) (i.e. variance). Formally, the objective is to find:
$$ {\underset {\mathbf {S} }{\operatorname {arg,min} }}\sum {i=1}^{k}\sum {\mathbf {x} \in S{i}}\left|\mathbf {x} -{\boldsymbol {\mu }}{i}\right|^{2}={\underset {\mathbf {S} }{\operatorname {arg,min} }}\sum {i=1}^{k}|S{i}|\operatorname {Var} S_{i} $$
where $\mu_i$ is the mean of points in $S_i$. This is equivalent to minimizing the pairwise squared deviations of points in the same cluster:
$$ {\displaystyle {\underset {\mathbf {S} }{\operatorname {arg,min} }}\sum {i=1}^{k},{\frac {1}{2|S{i}|}},\sum {\mathbf {x} ,\mathbf {y} \in S{i}}\left|\mathbf {x} -\mathbf {y} \right|^{2}} $$
The equivalence can be deduced from identity
$${\displaystyle \sum {\mathbf {x} \in S{i}}\left|\mathbf {x} -{\boldsymbol {\mu }}{i}\right|^{2}=\sum {\mathbf {x} \neq \mathbf {y} \in S{i}}(\mathbf {x} -{\boldsymbol {\mu }}{i})({\boldsymbol {\mu }}_{i}-\mathbf {y} )}. $$
Because the total variance is constant, this is equivalent to maximizing the sum of squared deviations between points in different clusters (between-cluster sum of squares, BCSS) which follows from the law of total variance.
Dunn index
A full explanation is available at Dunn index.
Gower distance
A full explanation with examples is available at site/Machine learning/Gower distance.