SRM Analysis Part 3

All aboard the NMDS Struggle Bus

The first step in my analysis is to see if my technical replicates cluster together without any normalization. If they do cluster, that means both mass spectrometer injections were similar and I can keep them for further analysis. If they do not cluster together, I must remove them from my dataset.

All of my work can be found in this R script. Here’s what I did:

  • Imported my non-pivoted dataset
  • Merged Skyline SRM data with the sequence file and biological replicate information
    • When I merged my data with the biological replicate information, my OBLNK data was somehow dropped in the merge. This is okay, since I wasn’t planning on analyzing my procedural blanks in this step anyway, but it’s still weird.
    • Wrote out the master dataframe as a .csv
  • Subsetted Area, Protein, Peptide, Transition and sample identifiers for NMDS analysis
    • Merged Protein, Peptide and Transition information into one column to use as row names
    • Cast my dataframe using reshape2 so it could be used with metaMDS
    • Saved the cast dataframe as a .csv
  • Used metaMDS to make an NMDS plot
    • Was unsuccessful using the Bray-Curtis distance, even after log(x+1) transforming the data

Figures 1-2. Unsuccessful NMDS plot and corresponding Shepard plot using Bray-Curtis distance.

Figures 3-4. Unsuccessful NMDS plot and corresponding Shepard plot using log(x+1) transformed data and Bray-Curtis distance.
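
For reference, here is a minimal sketch of the two Bray-Curtis attempts described above. It is not the code from my script: the `areas` matrix is made-up toy data, and it assumes injections are rows and transitions are columns, since `metaMDS` ordinates rows. One thing that may be worth checking is `autotransform`, because `metaMDS` can apply its own square-root and Wisconsin standardization by default; setting it to `FALSE` means only the transformation I chose is applied.

```r
library(vegan)

set.seed(42)
# Hypothetical toy data: 12 injections (rows) x 10 transitions (columns) of peak areas
areas <- matrix(rlnorm(120, meanlog = 10, sdlog = 2), nrow = 12,
                dimnames = list(paste0("inj", 1:12), paste0("transition", 1:10)))

# Bray-Curtis on the raw areas; autotransform = FALSE keeps metaMDS from
# applying its own standardization on top of the data
nmds_bray <- metaMDS(areas, distance = "bray", k = 2, trymax = 50,
                     autotransform = FALSE, trace = FALSE)

# Bray-Curtis again, after a log(x + 1) transformation
nmds_bray_log <- metaMDS(log1p(areas), distance = "bray", k = 2, trymax = 50,
                         autotransform = FALSE, trace = FALSE)

# Compare stress values and draw a Shepard plot for one of the fits
nmds_bray$stress
nmds_bray_log$stress
stressplot(nmds_bray_log)
```

The stress values on random toy data say nothing about my real dataset; the sketch only shows the calls involved.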

    • Was successful when using Euclidean distances

Figures 5-6. Preliminary NMDS plot using Euclidean distances and corresponding Shepard plot.
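
The whole workflow listed above (merge, build row names, cast, ordinate) can be sketched roughly as follows. All of the data and column names here (`srm`, `key`, `Replicate.Name`, `Sample`, `RowID`, `blank1`) are hypothetical stand-ins, not the ones in my script. The sketch also shows one plausible way rows can vanish in a merge: base R's `merge` keeps only rows with a match in both tables unless `all.x = TRUE` is set. I'm assuming, not asserting, that something like this is what happened to my blanks.

```r
library(reshape2)
library(vegan)

set.seed(1)
# Hypothetical long-format SRM data: one row per transition per injection
srm <- expand.grid(Replicate.Name = c(paste0("inj", 1:10), "blank1"),
                   Transition = c("y4", "y5", "y6"),
                   Peptide = c("PEPA", "PEPB"),
                   stringsAsFactors = FALSE)
srm$Protein <- ifelse(srm$Peptide == "PEPA", "protein1", "protein2")
srm$Area <- rlnorm(nrow(srm), meanlog = 10, sdlog = 1)

# Hypothetical sample key; note that it has no entry for the blank
key <- data.frame(Replicate.Name = paste0("inj", 1:10),
                  Sample = rep(paste0("S", 1:5), each = 2),
                  stringsAsFactors = FALSE)

# Injections with no match in the key are dropped by the default merge
setdiff(srm$Replicate.Name, key$Replicate.Name)
merged_inner <- merge(srm, key, by = "Replicate.Name")               # blank1 dropped
merged_left  <- merge(srm, key, by = "Replicate.Name", all.x = TRUE) # blank1 kept, Sample = NA

# Combine Protein, Peptide and Transition into one identifier
merged_inner$RowID <- paste(merged_inner$Protein, merged_inner$Peptide,
                            merged_inner$Transition, sep = "_")

# Cast to wide format: one row per transition, one column per injection
wide <- dcast(merged_inner, RowID ~ Replicate.Name, value.var = "Area")
rownames(wide) <- wide$RowID
wide$RowID <- NULL

# metaMDS ordinates rows, so transpose to put injections in rows (an assumption
# about the intended orientation)
nmds <- metaMDS(t(wide), distance = "euclidean", k = 2,
                autotransform = FALSE, trace = FALSE)
stressplot(nmds)  # Shepard plot
```

Running `setdiff` on the merge keys before merging is a cheap way to see which injections would be lost.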

Now I’ll work on color-coding my plot so I can see if the technical replicates are clustering together. I’m meeting with Julian tomorrow, so I’ll ask him for his advice on using raw Euclidean distances instead of Bray-Curtis distances. After I see how my replicates cluster, I can average them and look primarily at site and eelgrass differences!
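
A rough sketch of what the color-coding could look like, using base R graphics and vegan's `scores` function. The toy data, the `S1_rep1`-style names and the `sample_id` factor are all made up for illustration; the second replicate of each sample is just the first with a little noise added, so the pairs should land near each other.

```r
library(vegan)

set.seed(7)
# Hypothetical toy data: 5 samples x 6 transitions, each injected twice
base_areas <- matrix(rlnorm(30, meanlog = 10, sdlog = 1), nrow = 5)
noise <- matrix(runif(30, min = 0.95, max = 1.05), nrow = 5)
areas <- rbind(base_areas, base_areas * noise)
rownames(areas) <- paste0("S", rep(1:5, times = 2), "_rep", rep(1:2, each = 5))
colnames(areas) <- paste0("transition", 1:6)

# One factor level per sample, shared by both technical replicates
sample_id <- factor(rep(paste0("S", 1:5), times = 2))

nmds <- metaMDS(areas, distance = "euclidean", k = 2,
                autotransform = FALSE, trace = FALSE)

# Extract injection coordinates and assign one color per sample
pts <- scores(nmds, display = "sites")
palette_cols <- rainbow(nlevels(sample_id))
cols <- palette_cols[as.integer(sample_id)]

plot(pts, col = cols, pch = 19, xlab = "NMDS1", ylab = "NMDS2")
text(pts, labels = rownames(pts), pos = 3, cex = 0.7)
legend("topright", legend = levels(sample_id), col = palette_cols, pch = 19)
```

If two points of the same color sit far apart, that pair of injections is a candidate for a closer look.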

Written on September 7, 2017