MEPS Revisions Part 3
Final unconstrained and constrained ordinations
I finally figured out all of my N/A problems and completed my unconstrained and constrained ordinations!
Protein abundance
I didn’t have any problems with my protein abundance NMDS and previously obtained pairwise ANOSIM R-statistic and p-values. In this R script, I visualized my ordinations. You can find one version with sample numbers here, and another with site and habitat distinctions here. Because I exported my files as pdfs and don’t feel like changing the code and re-exporting files, I’m also including screenshots of the ordinations.
Figures 1 and 2. Protein abundance ordinations with confidence ellipses.
Environmental data
My original problem with the environmental data was that I had N/As in my dataframe, but was unable to use method = "gower"
with the metaMDS
function in the package vegan
. Turns out vegan
uses vegdist
, which doesn’t handle N/As. Instead, I used the daisy
function in the cluster
library to calculate a Gower’s dissimilarity (distance) matrix. I then inputted that matrix directly into metaMDS
with method = "euclidean"
. It worked! My code can be found here. I then conducted an Analysis of Similarity (ANOSIM) to assess the significance of my results.
Table 1. One-way ANOSIM results for environmental data NMDS based on site (Case Inlet vs. Fidalgo Bay vs. Port Gamble Bay vs. Skokomish River Delta vs. Willapa Bay), habitat (bare vs. eelgrass), parameter (mean vs. variance), and environmental variable (pH vs. dissolved oxygen vs. salinity vs. temperature). Significant p-values are bolded.
One-way ANOSIM | R | p-value |
---|---|---|
Site | -0.03428841 | 0.943 |
Habitat | -0.02037461 | 0.982 |
Parameter | 0.7378174 | 0.001 |
Environmental Variable | 0.1796322 | 0.001 |
Only the parameter and environmental variable ANOSIMs were significant. My guess is that because I ordinated means and variances in the same space, this is skewing my results. I decided to conduct two-way ANOSIMs to go further.
Table 2. Two-way ANOSIM results for environmental data NMDS. Significant p-values are bolded.
Two-way ANOSIM | R | p-value |
---|---|---|
Site and Habitat | -0.09991229 | 1 |
Parameter and Site | 0.4778027 | 0.001 |
Parameter and Habitat | 0.4778027 | 0.001 |
Environmental Variable and Site | 0.02098092 | 0.315 |
Environmental Variable and Habitat | 0.1089352 | 0.007 |
Environmental Variable and Parameter | 0.7802082 | 0.001 |
When accounting for parameter, site and habitat were significant. When accounting for environmental variable, only habitat was significant. I included environmental variable and parameter, but I already figured that would be significant. I need to double check with Julian, but I belive conducting pairwise ANOSIMs based on the significant two-way ANOSIM results will help me understand what is driving the significant results.
I also visualized my ordination here.
Figure 3. Environmental data NMDS. Means are outlined in solid lines, variances are outlined in dashed lines.
Constrained ordination
In this R Markdown file, I conducted a constrained ordination to look at relationships in protein abundance based on my environmental data. To do this, I first created an environmental data matrix with mean and variance values from the entire outplant, and matched them with my objects, oyster sample IDs. This meant I had the same objects in both my protein abundance and environmental data matrices. I calculated the gradient length and found that the underlying model was linear, so I used an RDA. I specified na.action = na.exclude
in my rda
function so it could handle my missing values.
Table 3. Variance explained by constrained ordination. The RDA explains 29.21% of all variance.
Variance Partition | Inertia | Proportion |
---|---|---|
Total | 0.015616 | 1 |
Constrained | 0.004561 | 0.2921 |
Unconstrained | 0.011055 | 0.7079 |
Table 4. ANOVA results for overall RDA analysis. With F(6,19) = 1.3064 and p-value = 0.195, the RDA is not significant.
Partition | Df | Variance | F | Pr(>F) |
---|---|---|---|---|
Model | 6 | 0.0045607 | 1.3064 | 0.195 |
Residual | 19 | 0.0110550 |
Table 5. Significance of each RDA axis. No axis is significant.
Axis | Df | Variance | F | Pr(>F) |
---|---|---|---|---|
RDA1 | 1 | 0.0023293 | 4.0033 | 0.316 |
RDA2 | 1 | 0.0013380 | 2.2995 | 0.603 |
RDA3 | 1 | 0.0003888 | 0.6682 | 0.998 |
RDA4 | 1 | 0.0003065 | 0.5268 | 0.994 |
RDA5 | 1 | 0.0001395 | 0.2397 | 0.999 |
RDA6 | 1 | 0.0000586 | 0.1008 | 0.999 |
Table 6. Significance of each environmental descriptor in the RDA. Temperature mean and variance are marginally significant predictors.
Environmental Variable | Df | Variance | F | Pr(>F) |
---|---|---|---|---|
Temperature Mean | 1 | 0.0013542 | 2.3275 | 0.065 |
Temperature Variance | 1 | 0.0012584 | 2.1627 | 0.067 |
pH Mean | 1 | 0.0003063 | 0.5264 | 0.781 |
pH Variance | 1 | 0.0007082 | 1.2171 | 0.301 |
Dissolved Oxygen Mean | 1 | 0.0006024 | 1.0354 | 0.349 |
Dissolved Oxygen Variance | 1 | 0.0003312 | 0.5692 | 0.725 |
Residual | 19 |
One weird thing I noticed is that my salinity variables do not show up in Table 6 or my ordinations below. That’s something I’ll have to figure out with Julian. Other than that, it’s interesting to note that my RDA really isn’t significant, which fits the loose interpretation we had earlier of environment affecting protein abundance in the manuscript.
To visualized my contrained ordination, I included all proteins and environmental descriptors here, and only significant proteins and environmental descriptors here.
Figures 4-5. RDA visualization including all proteins and descriptors (top) and only those that are significant (bottom).
Going forward
- Discuss all results with Julian
- Create a better ordination visual
- Update manuscript with new methods and results
- Tweak discussion