Instead of trying to preserve exact distances or maximize variance explained, NMDS focuses on preserving the rank order of dissimilarities between samples.
This makes it remarkably robust for ecological data.
Data: the Doubs river dataset (fish abundances and environmental variables), available in R and elsewhere

| Code | Description of the variable |
|---|---|
| das | Distance from the source [km] |
| alt | Altitude [m a.s.l.] |
| pen | Slope [per thousand] |
| deb | Mean minimum discharge [m³ s⁻¹] |
| pH | pH of water |
| dur | Calcium concentration (hardness) [mg L⁻¹] |
| pho | Phosphate concentration [mg L⁻¹] |
| nit | Nitrate concentration [mg L⁻¹] |
| amm | Ammonium concentration [mg L⁻¹] |
| oxy | Dissolved oxygen [mg L⁻¹] |
| dbo | Biological oxygen demand [mg L⁻¹] |

| Code | Common Name (English) | Family |
|---|---|---|
| Chb | Bullhead | Cottidae |
| Tru | Brown trout | Salmonidae |
| Vai | Minnow | Cyprinidae |
| Loc | Stone loach | Balitoridae |
| Omb | Grayling | Salmonidae |
| Bla | Souffia | Cyprinidae |
| Hot | Common nase | Cyprinidae |
| Tox | Sofie | Cyprinidae |
| Van | Dace | Cyprinidae |
| Che | Chub | Cyprinidae |
| Bar | Barbel | Cyprinidae |
| Lot | Burbot | Lotidae |
| Spi | Spirlin | Cyprinidae |
| Gou | Gudgeon | Cyprinidae |
| Bro | Pike | Esocidae |
| Per | Perch | Percidae |
| Tan | Tench | Cyprinidae |
| Gar | Roach | Cyprinidae |
| Lam | Brook lamprey | Petromyzontidae |

Uses dissimilarity matrices (not covariance)
Non-parametric (rank-based)
Better for non-linear ecological relationships
No assumption about data distribution
A dissimilarity matrix contains one value for every pair of samples: “how different” they are
Values typically range from 0 (identical) to 1 (completely different)
Bray-Curtis dissimilarity:
Bounded between 0 and 1
Weights by abundance (not just presence/absence)
Ignores joint absences entirely: similarity is based only on what’s actually present and shared between sites
Performs well with sparse data
The formula is:
\[BC_{jk} = \frac{\sum_i |y_{ij} - y_{ik}|}{\sum_i (y_{ij} + y_{ik})}\]
where \(y_{ij}\) and \(y_{ik}\) are the abundances of species \(i\) at sites \(j\) and \(k\).
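The formula is short enough to compute by hand. The sketch below is a base-R illustration only: `bray_curtis()` is a hypothetical helper and the abundances are made up, not taken from the Doubs data.

```r
# Bray-Curtis dissimilarity between two abundance vectors:
# sum of absolute differences divided by the sum of all abundances
bray_curtis <- function(a, b) {
  sum(abs(a - b)) / sum(a + b)
}

# Hypothetical abundances of four species at three sites
site_a <- c(10, 0, 5, 0)
site_b <- c(5, 5, 0, 0)
site_c <- c(0, 0, 0, 8)

bc_ab <- bray_curtis(site_a, site_b)  # 15 / 25 = 0.6
bc_ac <- bray_curtis(site_a, site_c)  # no shared species: 23 / 23 = 1
bc_aa <- bray_curtis(site_a, site_a)  # identical sites: 0

# Species 4 is absent from both A and B; dropping it changes nothing
bc_ab_dropped <- bray_curtis(site_a[-4], site_b[-4])
```

The last line shows the "ignores joint absences" property: a species missing from both sites adds zero to the numerator and zero to the denominator.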
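library(vegan)  # metaMDS(), adonis2(), anosim()
library(dplyr)  # select() and the %>% pipe
# Prepare species data (drop the site and reach columns, keep abundances only)
spe_matrix <- doubs_spe %>% select(-site, -reach) %>% as.matrix()
# Add a small constant to every abundance to avoid issues with zeros
# (this also shifts Bray-Curtis values: joint absences now add to the denominator)
spe_matrix <- spe_matrix + 0.1
# Run NMDS with multiple random starts
set.seed(123)
fish_nmds <- metaMDS(spe_matrix,
                     distance = "bray", # Bray-Curtis dissimilarity
                     k = 2,             # number of ordination dimensions
                     trymax = 100)      # maximum number of random starts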
## Run 0 stress 0.07449349
## Run 1 stress 0.1201175
## Run 2 stress 0.1201749
## Run 3 stress 0.1195174
## Run 4 stress 0.1201175
## Run 5 stress 0.1468615
## Run 6 stress 0.1395204
## Run 7 stress 0.09450578
## Run 8 stress 0.07460885
## ... Procrustes: rmse 0.02069815 max resid 0.09861179
## Run 9 stress 0.1273318
## Run 10 stress 0.1200159
## Run 11 stress 0.1273318
## Run 12 stress 0.07449349
## ... Procrustes: rmse 4.373442e-06 max resid 1.595028e-05
## ... Similar to previous best
## Run 13 stress 0.09450578
## Run 14 stress 0.140665
## Run 15 stress 0.07459993
## ... Procrustes: rmse 0.02014078 max resid 0.09796964
## Run 16 stress 0.1262615
## Run 17 stress 0.07460885
## ... Procrustes: rmse 0.02069823 max resid 0.09861196
## Run 18 stress 0.09134568
## Run 19 stress 0.07460885
## ... Procrustes: rmse 0.02069807 max resid 0.09861147
## Run 20 stress 0.07460885
## ... Procrustes: rmse 0.02069863 max resid 0.09861444
## *** Best solution repeated 1 times

metaMDS(): Main NMDS function from the vegan package
distance = "bray": Bray-Curtis dissimilarity (best for abundances)
k = 2: Two dimensions for plotting
trymax = 100: Try up to 100 random starting configurations
Understanding the Output:
Clear separation between river reaches
Gradient pattern from upper to lower reaches
Sites within each reach are more similar to each other than to other reaches
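To check patterns like these, it helps to pull the stress value and the site coordinates out of the fitted object. The sketch below is not the analysis above: it uses vegan's built-in `dune` data as a stand-in so that it runs on its own, with `Management` playing the role of the reach factor. By a common rule of thumb, stress below about 0.1 is regarded as a good fit, which the best run above (0.074) meets.

```r
library(vegan)

data(dune)      # example community matrix shipped with vegan (20 sites x 30 species)
data(dune.env)  # matching site descriptors, including the factor Management

set.seed(123)
ord <- metaMDS(dune, distance = "bray", k = 2, trymax = 100, trace = FALSE)

# Final stress of the best solution (lower is better)
ord$stress

# Site coordinates on the two NMDS axes, joined with the grouping variable
site_scores <- as.data.frame(scores(ord, display = "sites"))
site_scores$group <- dune.env$Management
head(site_scores)

# Shepard diagram: ordination distances against observed dissimilarities
stressplot(ord)
```

`site_scores` can then be passed to any plotting tool to colour sites by group.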
Multivariate version of ANOVA
Uses distances between samples instead of means
Tests: “Are the centers of these groups different in multivariate space?”
No assumptions about data distribution
Creates empirical null distribution
Accounts for complex dependency structures
H₀: River position doesn’t affect community composition
H₁: River position significantly affects community composition
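The permutation logic behind this test can be sketched in base R. This is an illustration under stated assumptions, not vegan's implementation: `pseudo_f()` is a hypothetical helper for a single grouping factor, and the data are simulated rather than the Doubs sites.

```r
# Pseudo-F computed from a distance matrix (one grouping factor)
pseudo_f <- function(d, groups) {
  d2 <- as.matrix(d)^2
  groups <- as.factor(groups)
  n <- nrow(d2)
  a <- nlevels(groups)
  # Total sum of squares: squared distances over all pairs, divided by N
  ss_total <- sum(d2[upper.tri(d2)]) / n
  # Within-group sum of squares: the same quantity inside each group
  ss_within <- sum(sapply(levels(groups), function(g) {
    idx <- which(groups == g)
    sub <- d2[idx, idx, drop = FALSE]
    sum(sub[upper.tri(sub)]) / length(idx)
  }))
  ss_among <- ss_total - ss_within
  (ss_among / (a - 1)) / (ss_within / (n - a))
}

set.seed(42)
# Simulated data: 3 groups of 6 sites, 2 variables, shifted group centres
groups <- rep(c("up", "mid", "down"), each = 6)
x <- matrix(rnorm(36), ncol = 2) + rep(c(0, 1.5, 3), each = 6)
d <- dist(x)

f_obs <- pseudo_f(d, groups)

# Empirical null distribution: shuffle group labels, recompute F each time
n_perm <- 999
f_null <- replicate(n_perm, pseudo_f(d, sample(groups)))

# Permutation p-value; the observed arrangement counts as one permutation
p_value <- (sum(f_null >= f_obs) + 1) / (n_perm + 1)
```

Because the observed arrangement is counted, the smallest p-value that 999 permutations can produce is 1/1000 = 0.001.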
betadisper() to test homogeneity of dispersion
## Homogeneity of dispersion test p-value: 0.8037
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = spe_matrix ~ reach, data = doubs_env, permutations = 999, distance = "bray")
## Df SumOfSqs R2 F Pr(>F)
## Model 2 2.3631 0.42634 9.6614 0.001 ***
## Residual 26 3.1798 0.57366
## Total 28 5.5429 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
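The calls that produce output of this kind look roughly as follows. This sketch again uses vegan's built-in `dune` data as a stand-in so that it is self-contained, with `Management` in place of `reach`. Note that the argument `adonis2()` documents for choosing the dissimilarity index is `method`; Bray-Curtis is its default.

```r
library(vegan)

data(dune)
data(dune.env)

set.seed(123)
# PERMANOVA: does community composition differ among management types?
perm <- adonis2(dune ~ Management, data = dune.env,
                permutations = 999, method = "bray")
perm

# Homogeneity of multivariate dispersion among the same groups
bc <- vegdist(dune, method = "bray")
disp <- betadisper(bc, dune.env$Management)
disp_test <- permutest(disp, permutations = 999)
disp_test
```

A non-significant dispersion test, like the p = 0.8037 reported above, supports reading a significant PERMANOVA as a difference in group centers rather than in group spread.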
Significant result (p = 0.001): We reject H₀
River reach explains about 43% of the variation in fish communities (R² = 0.43)
Fish communities differ significantly between river reaches
None of the 999 permutations gave F ≥ observed F, so p = 1/1000, the smallest value 999 permutations can return
Two of the three pairwise comparisons remain significant after Bonferroni correction; Upstream vs Midstream does not (adjusted p = 0.063)
Upstream vs Downstream shows the strongest difference (highest F-statistic, 15.49)
Both comparisons involving Downstream explain a substantial portion of variance (R² > 0.3)
Biological interpretation: Fish communities change progressively down the river
##                     pairs Df SumsOfSqs   F.Model        R2 p.value p.adjusted sig
## 1   Upstream vs Midstream  1 0.4068962  3.434528 0.1680747   0.021      0.063
## 2  Upstream vs Downstream  1 1.8495110 15.490360 0.4767679   0.001      0.003   *
## 3 Midstream vs Downstream  1 1.2829784  9.972485 0.3565105   0.001      0.003   *
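The `p.adjusted` column can be reproduced from the raw p-values with base R's `p.adjust()`. A small sketch, with the p-values copied from the table above:

```r
# Raw p-values from the three pairwise comparisons
p_raw <- c("Upstream vs Midstream"   = 0.021,
           "Upstream vs Downstream"  = 0.001,
           "Midstream vs Downstream" = 0.001)

# Bonferroni: multiply each p-value by the number of tests, capped at 1
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_bonf

# Which comparisons stay below 0.05 after correction?
p_bonf < 0.05
```

With three tests, 0.021 becomes 0.063 and no longer passes the 0.05 threshold, while the two comparisons involving Downstream (0.003) still do.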
Results:
R ≈ 1: Groups are completely separated
R ≈ 0: Groups are indistinguishable
R < 0: More dissimilarity within groups than between
ANOSIM uses ranks of distances
PERMANOVA uses actual distances
ANOSIM is more robust but less powerful
R = 1: Perfect separation
R = 0: No separation
R = -1: More similar between groups than within
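The R statistic compares the mean rank of between-group dissimilarities with the mean rank of within-group dissimilarities, scaled by N(N - 1)/4. The base-R sketch below illustrates that definition; `anosim_r()` is a hypothetical helper and the six points are made up.

```r
# ANOSIM R from a distance matrix and a grouping vector
anosim_r <- function(d, groups) {
  m <- as.matrix(d)
  groups <- as.character(groups)
  lt <- lower.tri(m)                        # each pair of sites once
  r <- rank(m[lt])                          # ranks of all pairwise dissimilarities
  same <- outer(groups, groups, "==")[lt]   # TRUE where a pair shares a group
  n <- nrow(m)
  (mean(r[!same]) - mean(r[same])) / (n * (n - 1) / 4)
}

# Six hypothetical sites along one axis, forming two tight clusters
x <- c(1, 2, 3, 101, 102, 103)

# Labels that match the clusters: every within-group distance is smaller
# than every between-group distance, so R = 1
r_separated <- anosim_r(dist(x), rep(c("a", "b"), each = 3))

# Labels that cut across the clusters: within-group distances tend to be
# larger than between-group ones, so R falls below 0
r_mixed <- anosim_r(dist(x), rep(c("a", "b"), times = 3))
```

Only the ranks enter the calculation, which is why ANOSIM is insensitive to the exact size of the dissimilarities.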
# Run ANOSIM
anosim_result <- anosim(spe_matrix, doubs_env$reach,
distance = "bray", permutations = 999)
# Display results
anosim_result
##
## Call:
## anosim(x = spe_matrix, grouping = doubs_env$reach, permutations = 999, distance = "bray")
## Dissimilarity: bray
##
## ANOSIM statistic R: 0.5082
## Significance: 0.001
##
## Permutation: free
## Number of permutations: 999
Sites within the same river reach tend to be more similar to each other (lower dissimilarity ranks) than sites in different reaches (higher dissimilarity ranks)
The R statistic of 0.51 quantifies this separation
Interpreting R:
## # A tibble: 2 × 6
## Method `Test Statistic` `P-value` Interpretation `What it tests` Approach
## <chr> <chr> <dbl> <chr> <chr> <chr>
## 1 PERMANOVA F = 9.66 0.001 Groups have dif… Differences in… Uses ac…
## 2 ANOSIM R = 0.508 0.001 Excellent group… Overlap betwee… Uses ra…
More powerful for detecting differences
Better for complex experimental designs
Can handle interactions and covariates
Preferred for most applications
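Check betadisper() before interpreting PERMANOVA: a significant result can come from unequal group dispersion rather than different group centers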
Comparison
##
## ***VECTORS
##
## NMDS1 NMDS2 r2 Pr(>r)
## das 0.98098 0.19411 0.7284 0.001 ***
## alt -1.00000 -0.00057 0.5650 0.001 ***
## pen -0.61902 0.78538 0.2546 0.026 *
## deb 0.99744 -0.07146 0.5701 0.001 ***
## pH -0.09760 -0.99523 0.0746 0.374
## dur 0.99967 -0.02575 0.2960 0.008 **
## pho 0.22860 0.97352 0.5228 0.001 ***
## nit 0.66665 0.74537 0.5200 0.001 ***
## amm 0.19902 0.98000 0.5471 0.002 **
## oxy -0.46402 -0.88583 0.7826 0.001 ***
## dbo 0.20211 0.97936 0.6883 0.001 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation: free
## Number of permutations: 999
envfit() to correlate environment with ordination
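In the output above, the NMDS1 and NMDS2 columns give the direction of each fitted vector, `r2` its strength, and `Pr(>r)` a permutation p-value. A minimal sketch of the workflow follows, using vegan's built-in `varespec` and `varechem` data as a stand-in because the Doubs objects are not defined here; the variable names therefore differ from the table above.

```r
library(vegan)

data(varespec)  # lichen pasture vegetation: 24 sites x 44 species
data(varechem)  # soil chemistry measured at the same 24 sites

set.seed(123)
ord <- metaMDS(varespec, distance = "bray", k = 2, trymax = 100, trace = FALSE)

# Fit each environmental variable as a vector onto the ordination
fit <- envfit(ord, varechem, permutations = 999)
fit

# Squared correlations and permutation p-values, one row per variable
fit_table <- data.frame(r2 = fit$vectors$r, p = fit$vectors$pvals)
fit_table[order(-fit_table$r2), ]

# Overlay only the vectors with p <= 0.05 on the site ordination
plot(ord, display = "sites")
plot(fit, p.max = 0.05)
```

Sorting by `r2` gives a quick ranking of which variables align most strongly with the ordination.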