Performance Metrics

Performance Metrics #

1. Internal Metrics (No Ground Truth Needed)#

These metrics evaluate how well the clustering captures cohesion and separation.

a) Silhouette Score#

Measures how similar a point is to its own cluster vs other clusters.
Formula for point \(i\):

\[ s(i) = \frac{b(i) - a(i)}{\max(a(i), b(i))} \]

where:

\(a(i)\) = average distance from \(i\) to points in its own cluster
\(b(i)\) = minimum average distance from \(i\) to points in other clusters

Interpretation:

\(s(i) \approx 1\) → well-clustered
\(s(i) \approx 0\) → on boundary
\(s(i) < 0\) → likely misclustered

b) Davies-Bouldin Index (DBI)#

Measures average similarity between each cluster and its most similar cluster.

\[ DB = \frac{1}{K} \sum_{i=1}^{K} \max_{j \neq i} \frac{\sigma_i + \sigma_j}{d_{ij}} \]

\(\sigma_i\) = average distance of points in cluster \(i\) to its centroid
\(d_{ij}\) = distance between cluster centroids \(i\) and \(j\)

Interpretation:

Lower DBI → better clustering (more separation, less overlap)

c) Calinski-Harabasz Index (CH)#

Ratio of between-cluster dispersion to within-cluster dispersion:

\[ CH = \frac{\text{trace}(B_k) / (K-1)}{\text{trace}(W_k) / (N-K)} \]

Higher CH → better defined clusters

d) Cophenetic Correlation Coefficient (CCC)#

Measures how well the dendrogram preserves original pairwise distances:

\[ CCC = \text{correlation}(d_{ij}, d_{ij}^{\text{dendrogram}}) \]

\(d_{ij}\) = original distance between points \(i\) and \(j\)
\(d_{ij}^{\text{dendrogram}}\) = height at which \(i\) and \(j\) merge in dendrogram

Interpretation:

CCC close to 1 → dendrogram preserves pairwise distances well
Useful for validating hierarchical structure

2. External Metrics (Ground Truth Available)#

If true labels are known (rare in real-world unsupervised tasks), we can use:

a) Adjusted Rand Index (ARI)#

Measures agreement between predicted clusters and true labels, adjusted for chance
ARI = 1 → perfect match, ARI ≈ 0 → random

b) Normalized Mutual Information (NMI)#

Measures shared information between predicted clusters and true labels
NMI = 1 → perfect match, NMI = 0 → no mutual information

c) Fowlkes-Mallows Index (FMI)#

Geometric mean of precision and recall between predicted clusters and true labels
Range: 0–1, higher is better

3. Practical Notes#

Choose internal metrics for real unsupervised tasks, since true labels are usually unknown.
Cophenetic correlation coefficient is especially useful for hierarchical clustering, because it measures how well the dendrogram represents distances.
Silhouette and Davies-Bouldin can help decide optimal number of clusters by cutting the dendrogram at different heights.

Summary Table

Metric	Type	Goal	Higher/Lower Better
Silhouette Score	Internal	Cohesion vs separation	Higher
Davies-Bouldin Index	Internal	Cluster similarity	Lower
Calinski-Harabasz Index	Internal	Between/within dispersion	Higher
Cophenetic Correlation	Internal	Dendrogram quality	Higher
Adjusted Rand Index	External	Match with true labels	Higher
Normalized Mutual Info	External	Shared info with labels	Higher
Fowlkes-Mallows	External	Precision/recall	Higher

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score, davies_bouldin_score, calinski_harabasz_score
from scipy.cluster.hierarchy import linkage, fcluster, cophenet
from scipy.spatial.distance import pdist

# Step 1: Generate synthetic dataset
X, y_true = make_blobs(n_samples=100, centers=3, n_features=2, random_state=42)
X = StandardScaler().fit_transform(X)

# Step 2: Compute linkage matrix using Ward's method
Z = linkage(X, method='ward')

# Step 3: Evaluate dendrogram with different numbers of clusters
num_clusters = [2, 3, 4, 5]
metrics_results = []

for k in num_clusters:
    labels = fcluster(Z, t=k, criterion='maxclust')
    
    silhouette = silhouette_score(X, labels)
    dbi = davies_bouldin_score(X, labels)
    ch = calinski_harabasz_score(X, labels)
    ccc, _ = cophenet(Z, pdist(X))
    
    metrics_results.append((k, silhouette, dbi, ch, ccc))

# Step 4: Display results
print("Clusters | Silhouette | Davies-Bouldin | Calinski-Harabasz | Cophenetic Corr")
for row in metrics_results:
    print(f"{row[0]:>7} | {row[1]:>10.3f} | {row[2]:>14.3f} | {row[3]:>16.3f} | {row[4]:>15.3f}")

# Step 5: Plot dendrogram
plt.figure(figsize=(10, 6))
from scipy.cluster.hierarchy import dendrogram
dendrogram(Z, color_threshold=0, leaf_rotation=45, leaf_font_size=10)
plt.title("Hierarchical Clustering Dendrogram")
plt.xlabel("Sample Index")
plt.ylabel("Distance")
plt.grid(True)
plt.show()

Clusters | Silhouette | Davies-Bouldin | Calinski-Harabasz | Cophenetic Corr
|      0.687 |          0.459 |          214.067 |           0.977
|      0.849 |          0.210 |         1686.708 |           0.977
|      0.679 |          0.574 |         1308.390 |           0.977
|      0.518 |          0.854 |         1146.523 |           0.977

../../../_images/2aef4cf0d4d79bf6a8e995df2fabe63a3803722c697c8a9638c314d20092f7dd.png

Performance Metrics

Contents