Continuous X

Continuous X → Continuous Y

# X = BMI
# Y = Disease progression (continuous)
from sklearn.datasets import load_diabetes
from scipy.stats import pearsonr, spearmanr, kendalltau
from sklearn.feature_selection import mutual_info_regression
import numpy as np

data = load_diabetes()
X = data.data[:, 2]        # BMI feature
y = data.target            # continuous target

pearson, _ = pearsonr(X, y)
spearman, _ = spearmanr(X, y)
kendall, _ = kendalltau(X, y)
mi = mutual_info_regression(X.reshape(-1,1), y)[0]

print("Pearson:", pearson)
print("Spearman:", spearman)
print("Kendall:", kendall)
print("Mutual Information:", mi)

Output:
Pearson: 0.5864501344746886
Spearman: 0.5613820101065616
Kendall: 0.39119525733058874
Mutual Information: 0.17815067428832032

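| Metric | Type | Strength | Interpretation |
| --- | --- | --- | --- |
| Pearson = 0.586 | Linear | Moderate–Strong | Clear upward linear trend |
| Spearman = 0.561 | Monotonic | Moderate–Strong | Disease progression tends to rise as BMI rises |
| Kendall = 0.391 | Rank concordance | Moderate | Concordant pairs clearly outnumber discordant ones; Kendall's τ usually comes out smaller than Spearman's ρ on the same data |
| MI = 0.178 | General dependency (linear + nonlinear) | Modest | BMI carries predictive signal; the value is in nats and has no fixed upper bound here, so it is not directly comparable to the correlation coefficients |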
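The Pearson and Spearman values above differ because they measure different kinds of agreement. The following is a minimal sketch on synthetic data (the variables are illustrative and not drawn from the diabetes dataset). It shows that Spearman's ρ is Pearson's r computed on ranks, so it reaches 1 for any strictly increasing relationship while Pearson stays below 1 unless the relationship is a straight line.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr, rankdata

# Synthetic, noise-free example: y increases with x, but along a curve
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = np.exp(x)

pearson_r = pearsonr(x, y)[0]      # sensitive to the curvature
spearman_r = spearmanr(x, y)[0]    # only looks at the ordering

# Spearman reproduced by hand: rank both variables, then apply Pearson
pearson_on_ranks = pearsonr(rankdata(x), rankdata(y))[0]

print("Pearson:", pearson_r)
print("Spearman:", spearman_r)
print("Pearson on ranks:", pearson_on_ranks)
```

When the two coefficients are close, as they are for BMI, the relationship is probably close to linear over the observed range.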

Continuous X → Binary Y

X = mean radius, Y = diagnosis (0 = malignant, 1 = benign)

from sklearn.datasets import load_breast_cancer
from scipy.stats import pointbiserialr
from sklearn.metrics import roc_auc_score
from sklearn.feature_selection import mutual_info_classif
from sklearn.linear_model import LogisticRegression

data = load_breast_cancer()
X = data.data[:, 0]      # mean radius
y = data.target          # binary

pb, _ = pointbiserialr(y, X)

model = LogisticRegression()
model.fit(X.reshape(-1,1), y)
beta = model.coef_[0][0]

auc = roc_auc_score(y, X)
mi = mutual_info_classif(X.reshape(-1,1), y)[0]

print("Point-Biserial:", pb)
print("Logistic β:", beta)
print("AUC:", auc)
print("Mutual Information:", mi)

Output:
Point-Biserial: -0.7300285113754563
Logistic β: -1.0251962293185464
AUC: 0.0624834839596216
Mutual Information: 0.3690285464383032

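| Metric | What it Means |
| --- | --- |
| Point-biserial: -0.73 | Strong negative association: larger mean radius goes with Y = 0 (malignant) |
| Logistic β: -1.025 | Each 1-unit increase in mean radius multiplies the odds of Y = 1 by about exp(-1.025) ≈ 0.36; scikit-learn's default L2 penalty shrinks β slightly |
| AUC: 0.062 | The raw feature, used as a score for Y = 1, ranks the classes in reverse. Scoring with -X gives 1 - 0.062 ≈ 0.938, so mean radius separates the classes well, though not perfectly |
| MI: 0.369 | Substantial information about the class; for a binary label the ceiling is at most ln 2 ≈ 0.693 nats |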
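An AUC far below 0.5 signals a reversed ranking rather than a useless feature. The sketch below uses synthetic data and a hypothetical helper, `pairwise_auc`, which is not part of any library. It computes AUC as the share of (positive, negative) pairs in which the positive case receives the higher score, and shows that negating the score turns an AUC of a into 1 - a.

```python
import numpy as np

def pairwise_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a success.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]          # every positive vs every negative
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

# Synthetic stand-in: class 0 tends to have LARGER feature values than class 1
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(17.0, 3.0, size=200),   # class 0
                    rng.normal(12.0, 2.0, size=300)])  # class 1
y = np.concatenate([np.zeros(200, dtype=int), np.ones(300, dtype=int)])

auc_raw = pairwise_auc(y, x)        # feature used directly as a score for y = 1
auc_flipped = pairwise_auc(y, -x)   # same feature with the direction reversed

print("AUC with x:", auc_raw)
print("AUC with -x:", auc_flipped)
```

For reporting, it is usually clearer to quote max(AUC, 1 - AUC) together with the direction of the effect.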

Continuous X → Ordinal Y

from sklearn.datasets import fetch_california_housing
from scipy.stats import spearmanr, kendalltau
from sklearn.feature_selection import mutual_info_classif
import pandas as pd
import numpy as np

data = fetch_california_housing()
df = pd.DataFrame(data.data, columns=data.feature_names)
df["target"] = data.target

X = df["MedInc"]  # continuous
# Create ordinal bins of the target
y = pd.qcut(df["target"], q=3, labels=[1,2,3])  # ordinal 1<2<3
y = y.astype(int)

spearman, _ = spearmanr(X, y)
kendall, _ = kendalltau(X, y)
mi = mutual_info_classif(X.values.reshape(-1,1), y)[0]

print("Spearman:", spearman)
print("Kendall:", kendall)
print("Mutual Information:", mi)

Output:
Spearman: 0.6303437101989041
Kendall: 0.5068776790681007
Mutual Information: 0.2659892673280484

| Metric | Value | What It Measures | Strength | Interpretation |
| --- | --- | --- | --- | --- |
| Spearman | 0.6303 | Monotonic (rank-based) correlation | Strong positive | As X increases, Y increases consistently in rank; strong ordered trend |
| Kendall | 0.5069 | Concordance of ranked pairs | Strong positive | Most observation pairs are concordant; X and Y move together directionally |
| Mutual Information | 0.2660 | General dependency (linear + nonlinear) | Moderate–Strong | X contains meaningful predictive information about Y; noticeable shared dependency |
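With only three ordinal levels, many pairs of observations are tied on Y, which is why the tie-corrected form of Kendall's τ matters here. As a rough sketch on a small made-up sample (the values below are illustrative only), τ-b can be computed by counting concordant and discordant pairs and then checked against `scipy.stats.kendalltau`, whose default variant is τ-b.

```python
from itertools import combinations
from math import sqrt
from scipy.stats import kendalltau

# Made-up sample: continuous feature and an ordinal target with levels 1 < 2 < 3
x = [1.2, 2.5, 3.1, 3.8, 4.4, 5.0, 6.3, 7.7]
y = [1, 1, 2, 1, 2, 3, 2, 3]

concordant = discordant = tied_x = tied_y = 0
for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
    dx, dy = x2 - x1, y2 - y1
    if dx == 0:
        tied_x += 1                 # pair tied on x
    if dy == 0:
        tied_y += 1                 # pair tied on y
    if dx * dy > 0:
        concordant += 1             # both move in the same direction
    elif dx * dy < 0:
        discordant += 1             # they move in opposite directions

n_pairs = len(x) * (len(x) - 1) // 2
# tau-b: pairs tied on x or on y are removed from the normalisation
tau_b = (concordant - discordant) / sqrt((n_pairs - tied_x) * (n_pairs - tied_y))
tau_scipy = kendalltau(x, y)[0]

print("tau-b by hand:", tau_b)
print("tau-b (scipy):", tau_scipy)
```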

Continuous X → Nominal (Categorical) Y

# X = petal length
# Y = species (Setosa / Versicolor / Virginica)
from sklearn.datasets import load_iris
from scipy.stats import f_oneway, kruskal
from sklearn.feature_selection import mutual_info_classif
import pandas as pd

data = load_iris()
df = pd.DataFrame(data.data, columns=data.feature_names)
df["species"] = data.target

X = df["petal length (cm)"]
y = df["species"]

groups = [X[y == c] for c in y.unique()]

anova_f, _ = f_oneway(*groups)
kw, _ = kruskal(*groups)
mi = mutual_info_classif(X.values.reshape(-1,1), y)[0]

print("ANOVA F:", anova_f)
print("Kruskal–Wallis:", kw)
print("Mutual Information:", mi)

Output:
ANOVA F: 1180.161182252981
Kruskal–Wallis: 130.41104857977163
Mutual Information: 1.0063262962592772

| Metric | Value | What It Measures | Strength | Interpretation |
| --- | --- | --- | --- | --- |
| ANOVA F | 1180.16 | Ratio of between-group to within-group variance in X | Very strong | Mean petal length differs sharply across species; with 3 groups and 150 samples, this F corresponds to η² ≈ 0.94 |
| Kruskal–Wallis H | 130.41 | Rank-based test that all groups share one distribution | Very strong | Petal-length ranks are almost fully separated by species; no normality assumption is needed |
| Mutual Information | 1.0063 | Overall dependency (linear + nonlinear) | Very strong | Close to the ceiling of ln 3 ≈ 1.099 nats for three balanced classes, so petal length nearly determines species |
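The ANOVA F statistic printed above is unbounded and grows with sample size, so it is hard to read as a strength of association. One common companion is η² (eta squared), the share of total variance in X explained by group membership. The sketch below uses synthetic groups (the means and spreads are invented, not the iris measurements). It rebuilds F from sums of squares, checks it against `scipy.stats.f_oneway`, and derives η² from the same quantities.

```python
import numpy as np
from scipy.stats import f_oneway

# Three synthetic groups standing in for a continuous X split by a nominal Y
rng = np.random.default_rng(0)
groups = [
    rng.normal(loc=1.5, scale=0.2, size=50),
    rng.normal(loc=4.3, scale=0.5, size=50),
    rng.normal(loc=5.6, scale=0.6, size=50),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k, n_total = len(groups), all_values.size

# Between-group and within-group sums of squares
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

f_manual = (ss_between / (k - 1)) / (ss_within / (n_total - k))
eta_squared = ss_between / (ss_between + ss_within)   # bounded in [0, 1]
f_scipy = f_oneway(*groups)[0]

print("F by hand:", f_manual)
print("F (scipy):", f_scipy)
print("eta squared:", eta_squared)
```

Given only F, the number of groups k and the sample size N, the same effect size follows from η² = F(k - 1) / (F(k - 1) + N - k).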

Continuous X → Discrete Numeric Y

X = AveRooms (continuous), Y = Population (integer counts)

from sklearn.datasets import fetch_california_housing
from scipy.stats import pearsonr, spearmanr, kendalltau
from sklearn.feature_selection import mutual_info_regression
import pandas as pd

data = fetch_california_housing()
df = pd.DataFrame(data.data, columns=data.feature_names)

X = df["AveRooms"]
y = df["Population"]

pearson, _ = pearsonr(X, y)
spearman, _ = spearmanr(X, y)
kendall, _ = kendalltau(X, y)
mi = mutual_info_regression(X.values.reshape(-1,1), y)[0]

print("Pearson:", pearson)
print("Spearman:", spearman)
print("Kendall:", kendall)
print("Mutual Information:", mi)

Output:
Pearson: -0.0722128486589335
Spearman: -0.10538515380075536
Kendall: -0.07251597080592617
Mutual Information: 0.034846658720868895

| Metric | Value | What It Measures | Strength | Interpretation |
| --- | --- | --- | --- | --- |
| Pearson | -0.0722 | Linear correlation | Very weak | Negligible linear relationship; r² ≈ 0.005, so a straight line explains about 0.5% of the variance |
| Spearman | -0.1054 | Monotonic (rank-based) association | Very weak | Only a faint decreasing trend in the ranks |
| Kendall | -0.0725 | Pairwise concordance (rank agreement) | Very weak | Discordant pairs barely outnumber concordant ones |
| Mutual Information | 0.0348 | Overall dependency (linear + nonlinear) | Very weak | The variables share very little information; close to independent for practical purposes |
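A very weak correlation is not the same as a correlation that is indistinguishable from zero. The sketch below applies the standard t-test for Pearson's r to the reported value, assuming the full California housing table of 20,640 rows was used, as in the code above. With a sample this large, even a practically negligible r can be statistically detectable.

```python
from math import sqrt
from scipy.stats import t as student_t

r = -0.0722128486589335   # Pearson r reported in the output above
n = 20640                 # assumed sample size: rows in the California housing data

# t statistic for H0: true correlation is zero, with n - 2 degrees of freedom
t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)
p_value = 2 * student_t.sf(abs(t_stat), df=n - 2)   # two-sided p-value

print("t statistic:", t_stat)
print("p-value:", p_value)
print("variance explained (r^2):", r ** 2)
```

The strength labels in the table describe the effect size, which stays small regardless of how small the p-value is.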