ANCOVA (Analysis of Covariance) combines regression and ANOVA to: - Compare group means while adjusting for a continuous covariate - Increase statistical power by reducing residual error - Control for confounding variables
When to Use ANCOVA
Use ANCOVA when you have: - Response variable: Continuous - Predictor variable: Categorical (factor/groups) - Covariate: Continuous variable that affects the response
Key Assumptions of ANCOVA
Independence of observations
Normality of residuals
Homogeneity of variances across groups
Linearity between response and covariate within each group
Homogeneity of slopes (most critical!) - regression slopes must be equal across all groups
Critical First Step
Always test for homogeneity of slopes before proceeding with ANCOVA. If slopes differ significantly between groups, standard ANCOVA is inappropriate.
Part 1: Cricket Chirping Analysis
Data Overview
We want to compare chirping rate of two cricket species: - Oecanthus exclamationis - Oecanthus niveus
But we measured rates at different temperatures, and there’s a relationship between pulse rate and temperature. ANCOVA lets us adjust for temperature effect to get a more powerful test!
# Create simulated cricket data based on lecture exampleset.seed(456)n <-40species <-rep(c("O. exclamationis", "O. niveus"), each = n/2)temp <-c(rnorm(n/2, mean =22, sd =2), rnorm(n/2, mean =24, sd =2))chirp_rate <-40+2.5* (temp -23) +ifelse(species =="O. exclamationis", 10, 0) +rnorm(n, sd =3)cricket_data <-data.frame(species = species, temp = temp, chirp_rate = chirp_rate)# View data structurehead(cricket_data)
species temp chirp_rate
1 O. exclamationis 19.31296 40.69557
2 O. exclamationis 23.24355 51.78799
3 O. exclamationis 23.60175 50.75553
4 O. exclamationis 19.22222 40.80589
5 O. exclamationis 20.57129 50.16484
6 O. exclamationis 21.35188 46.24225
# Plot with regression lines by speciesggplot(cricket_data, aes(x = temp, y = chirp_rate, color = species)) +geom_point(alpha =0.7) +geom_smooth(method ="lm", se =FALSE)
`geom_smooth()` using formula = 'y ~ x'
Step 1: Test Homogeneity of Slopes
This is the most critical assumption! We test if the regression slopes are equal across all groups.
# Test for homogeneity of slopes by including interaction termcricket_slopes_model <-lm(chirp_rate ~ temp * species, data = cricket_data)Anova(cricket_slopes_model, type =3)
Interpretation: If p > 0.05, slopes are homogeneous and we can proceed with ANCOVA. If p < 0.05, slopes differ and standard ANCOVA is inappropriate.
Step 2: Fit ANCOVA Model
Since slopes are homogeneous (p > 0.05), we can fit the ANCOVA model without the interaction term.
# Fit ANCOVA model (without interaction)cricket_ancova <-lm(chirp_rate ~ temp + species, data = cricket_data)# Get model summarysummary(cricket_ancova)
Call:
lm(formula = chirp_rate ~ temp + species, data = cricket_data)
Residuals:
Min 1Q Median 3Q Max
-6.0065 -1.9653 0.1923 0.7886 5.9192
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -13.2012 4.7423 -2.784 0.00842 **
temp 2.7926 0.2048 13.634 0.000000000000000530 ***
speciesO. niveus -11.8005 0.8593 -13.733 0.000000000000000424 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.694 on 37 degrees of freedom
Multiple R-squared: 0.8994, Adjusted R-squared: 0.894
F-statistic: 165.4 on 2 and 37 DF, p-value: < 0.00000000000000022
# View ANOVA tableAnova(cricket_ancova)
Anova Table (Type II tests)
Response: chirp_rate
Sum Sq Df F value Pr(>F)
temp 1348.81 1 185.90 0.0000000000000005296 ***
species 1368.34 1 188.59 0.0000000000000004236 ***
Residuals 268.46 37
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Step 3: Check Model Assumptions
# Create diagnostic plotspar(mfrow =c(2, 2))plot(cricket_ancova, main ="ANCOVA Diagnostic Plots")
par(mfrow =c(1, 1))
Step 4: Calculate Adjusted Means
ANCOVA compares adjusted means - what each group’s mean would be at the overall mean of the covariate.
# Calculate adjusted means using emmeanscricket_adjusted_means <-emmeans(cricket_ancova, "species")# Convert to dataframe for plottingcricket_adj_means_df <-as.data.frame(cricket_adjusted_means)cricket_adj_means_df
species emmean SE df lower.CL upper.CL
O. exclamationis 51.70513 0.6049702 37 50.47934 52.93091
O. niveus 39.90462 0.6049702 37 38.67883 41.13040
Confidence level used: 0.95
Step 5: Pairwise Comparisons
# Pairwise comparisons of adjusted meanspairs(cricket_adjusted_means, adjust ="sidak")
contrast estimate SE df t.ratio p.value
O. exclamationis - O. niveus 11.8 0.859 37 13.733 <.0001
Step 6: Visualize Results
# Plot adjusted means with confidence intervalsplot(cricket_adjusted_means, comparisons =TRUE)
# Bar chart of adjusted meansggplot(cricket_adj_means_df, aes(x = species, y = emmean, fill = species)) +geom_bar(stat ="identity", width =0.7) +geom_errorbar(aes(ymin = lower.CL, ymax = upper.CL), width =0.2) +labs(title ="Adjusted Mean Chirping Rate by Species",subtitle ="Means adjusted for temperature",x ="Species",y ="Adjusted Chirping Rate") +theme_minimal() +theme(legend.position ="none",axis.text.x =element_text(angle =45, hjust =1))
Part 2: Partridge Longevity Analysis
Data Overview
We’ll analyze the effect of mating strategy on male fruitfly longevity, using thorax length as a covariate.
# Visualize the relationship between thorax length and longevity by treatmentggplot(partridge, aes(x = THORAX, y = LONGEV, color = treatment)) +geom_point() +geom_smooth(method ="lm", se =FALSE) +labs(title ="Relationship between Thorax Length and Longevity",x ="Thorax Length (mm)",y ="Longevity (days)",color ="Treatment") +theme_minimal() +theme(legend.position ="bottom")
`geom_smooth()` using formula = 'y ~ x'
Step 1: Test Homogeneity of Slopes
# Test for homogeneity of slopeshomo_slopes_model <-lm(LONGEV ~ THORAX * treatment, data = partridge)Anova(homo_slopes_model, type =3)
# Fit the ANCOVA model (without interaction)ancova_model <-lm(LONGEV ~ THORAX + treatment, data = partridge)# Get more detailed summarysummary(ancova_model)