Topics Covered
Pearson's r
The "gold standard" for linear relationships. Measures the strength and direction of the straight-line bond between variables.
Spearman's ρ (Rho)
When linearity fails but the trend is still monotonic (always going up or down), rank-based correlation saves the day.
The Beta-r Bridge
The deep mathematical link: β₁ = r × (sᵧ / sₓ). Learn how regression is just correlation with units.
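To make the bridge concrete, here is a minimal R sketch on simulated data. The variables `x` and `y` and their parameters are made up for illustration and are not taken from the course script. It checks that Pearson's r is the averaged product of z-scores, and that the slope from `lm()` matches r × (sᵧ / sₓ).

```r
# Simulated data: x and y are illustrative assumptions, not course data.
set.seed(42)
x <- rnorm(100, mean = 50, sd = 10)
y <- 3 + 0.8 * x + rnorm(100, sd = 5)

# Pearson's r from cor(), and by hand from z-scores.
# sd() divides by n - 1, so the product of z-scores is averaged over n - 1 too.
r  <- cor(x, y)
zx <- (x - mean(x)) / sd(x)
zy <- (y - mean(y)) / sd(y)
r_from_z <- sum(zx * zy) / (length(x) - 1)

# Slope fitted by lm(), and slope rebuilt from r and the two standard deviations.
beta1        <- unname(coef(lm(y ~ x))["x"])
beta1_from_r <- r * sd(y) / sd(x)

all.equal(r, r_from_z)          # TRUE
all.equal(beta1, beta1_from_r)  # TRUE
```

Standardizing both variables first would make sᵧ / sₓ equal to 1, so the slope and r would coincide.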
The Invariance Principle
One of the most powerful properties of correlation is its invariance to linear transformations. Whether you measure height in inches or centimeters, or body mass in grams or kilograms, the correlation coefficient r remains identical.
Why Scaling and Shifting "Don't Count"
Multiplying a variable by a positive constant (scaling) or adding a constant (shifting) doesn't change the relative positioning of data points. Since correlation is based on z-scores (standardized distances), these changes cancel out; a negative multiplier flips the sign of r but leaves its magnitude unchanged. Regression slopes (β₁) do scale with the units, but the slope's significance test and R² are unchanged.
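The invariance can be checked directly. The sketch below uses simulated heights and masses. The variable names, the conversion factors applied, and the arbitrary shift of 5 are assumptions chosen for illustration. It compares the correlation, slope, t statistic, and R² before and after a change of units.

```r
# Simulated measurements: all values here are hypothetical.
set.seed(1)
height_in <- rnorm(50, mean = 67, sd = 3)         # inches
mass_kg   <- 0.9 * height_in + rnorm(50, sd = 4)  # kilograms

# Change of units: positive rescaling, plus an arbitrary shift on mass.
height_cm <- height_in * 2.54
mass_g    <- mass_kg * 1000 + 5

# Correlation is unchanged by scaling and shifting.
same_r <- all.equal(cor(height_in, mass_kg), cor(height_cm, mass_g))

fit_a <- lm(mass_kg ~ height_in)
fit_b <- lm(mass_g ~ height_cm)

# The slope picks up the ratio of the two scale factors (1000 / 2.54).
slope_a <- unname(coef(fit_a)[2])
slope_b <- unname(coef(fit_b)[2])
slope_scales <- all.equal(slope_b, slope_a * 1000 / 2.54)

# R-squared and the slope's t statistic do not change.
same_r2 <- all.equal(summary(fit_a)$r.squared, summary(fit_b)$r.squared)
same_t  <- all.equal(summary(fit_a)$coefficients[2, "t value"],
                     summary(fit_b)$coefficients[2, "t value"])

# A negative multiplier flips the sign of r but keeps its magnitude.
sign_flips <- all.equal(cor(-height_in, mass_kg), -cor(height_in, mass_kg))

c(same_r, slope_scales, same_r2, same_t, sign_flips)  # all TRUE
```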
Key Concepts
- Correlation ≠ Causation: But it's the first step in building a predictive model.
- Monotonicity: Spearman's superpower—it handles curves as long as they don't reverse direction.
- Standardization: How z-scores transform your raw units into a dimensionless "correlation space."
- Outlier Sensitivity: Why Pearson can be "fooled" by a single distant point, while Spearman, which uses only ranks, is far less affected.
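The monotonicity and outlier points above can be illustrated with a small deterministic sketch. The toy vectors below are invented for this example and are not from the Palmer Penguins data. The first part shows Spearman reaching 1 on a curved but monotonic trend. The second shows one distant point reversing Pearson's sign while Spearman stays negative.

```r
# Part 1: a curved but strictly increasing relationship (toy data).
x <- 1:10
y <- exp(x)

pearson_curve  <- cor(x, y)                       # below 1: the trend is not a line
spearman_curve <- cor(x, y, method = "spearman")  # 1: the ranks agree perfectly

# Spearman's rho is Pearson's r computed on ranks.
all.equal(spearman_curve, cor(rank(x), rank(y)))  # TRUE

# Part 2: a perfectly decreasing trend plus one distant point at (100, 100).
x_out <- c(1:10, 100)
y_out <- c(10:1, 100)

pearson_out  <- cor(x_out, y_out)                       # about 0.98: sign reversed
spearman_out <- cor(x_out, y_out, method = "spearman")  # -0.5: still negative
```

Without the extra point, both coefficients equal -1. Spearman is not immune to the outlier, but since the point only contributes its rank, the shift is bounded.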
Homework
- Run CorrelationRevealed.R using the Palmer Penguins dataset.
- Calculate Pearson and Spearman for Bill Length vs. Body Mass. Notice the difference?
- Multiply the body mass by 1,000 (g to mg) and verify that r doesn't budge while β₁ changes by a factor of exactly 1,000 (multiplied if body mass is the response, divided if it is the predictor).
- Create a scatter plot with geom_smooth(method="lm") and add the correlation as a text label.