West Coast Green Crab Experiment Part 88
Preliminary gene expression analysis
Ashray is working on a preliminary gene expression analysis with DESeq2. We will determine the best methods for understanding how six weeks of exposure to either 25ºC or 5ºC impacted gene expression. Once we have those methods streamlined, we can focus on exploring how genotype also impacts gene expression. In addition to working on getting the transcriptome files ready for him (see this saga), I’m troubleshooting part of the R code. This is because DESeq2 doesn’t have the best user manual, and my limited familiarity with gene expression analysis means that I have to actually code to remember how to do things! I’ve also sent him a script I previously worked on to use as an additional reference.
2026-03-13
Quick recap
Ashray is leading the analysis in this R Markdown script. What’s been done so far:
- Took isoform-level count matrix from raw transcriptome and ran it through DESeq2
- Tried making volcano plots
- Made preliminary PCA
I outlined the next steps for the analysis for him:
- Import gene-level count matrix
- Remove contaminated sequences
- Count number of genes with higher expression in 25ºC, and vice versa
- Annotate differentially expressed genes
- Create a PCA and heatmap
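Two of the steps above, removing contaminated sequences and counting genes with higher expression at each temperature, can be sketched in base R. This is a minimal illustration on toy data, not Ashray’s actual script: the gene IDs, sample names, contaminant list, and the 0.05 cutoff are all made up. The only things borrowed from DESeq2 are the column names `log2FoldChange` and `padj`, and the sketch assumes the contrast was set up with 25ºC as the numerator, so positive fold changes mean higher expression at 25ºC.

```r
# Toy gene-level count matrix: rows are genes, columns are samples.
# All IDs and values are invented for illustration.
counts <- matrix(
  c(10, 12, 40, 45,
     0,  1,  0,  2,
    30, 28,  5,  6,
    15, 14, 16, 15),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("gene1", "gene2", "gene3", "gene4"),
                  c("5C_a", "5C_b", "25C_a", "25C_b"))
)

# Hypothetical list of gene IDs flagged as contamination
contaminants <- c("gene2")

# Drop contaminated sequences before any differential expression testing
clean_counts <- counts[!rownames(counts) %in% contaminants, , drop = FALSE]

# Toy stand-in for a DESeq2 results table converted to a data frame
# (assumed contrast: 25C vs 5C, so positive log2FoldChange = higher at 25C)
res <- data.frame(
  log2FoldChange = c(2.1, -2.4, 0.1),
  padj = c(0.001, 0.01, 0.9),
  row.names = rownames(clean_counts)
)

alpha <- 0.05  # assumed adjusted p-value cutoff

# padj can be NA for some genes in real output, so guard against NA
sig <- !is.na(res$padj) & res$padj < alpha

n_up_25C <- sum(sig & res$log2FoldChange > 0)  # higher expression at 25C
n_up_5C  <- sum(sig & res$log2FoldChange < 0)  # higher expression at 5C
```

With a real results table, the same two `sum()` lines would work as long as the direction of the contrast is confirmed first.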
The next steps for me:
- Determine the best plan forward for functional level analysis
  - Option 1: Manually looking at certain gene groups (ex. HSPs, ubiquitinases, oxidative stress genes)
  - Option 2: WGCNA and GO-MWU
  - Option 3: ASCA and GoSeq enrichment for clusters
- Provide scaffolding for poster creation
I think Option 1 would be a good way to break up a heatmap by function. While Option 2 seems to be the “gold standard” for RNA-Seq analysis right now, I think Option 3 is more powerful and would dovetail nicely with the metabolomics analysis. I’ll have to see how long it takes to complete the tasks already at hand, and which path is more feasible in the time remaining before WCBSURC.
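For Option 1, one way to split genes into functional groups before drawing a heatmap is to match keywords against the annotation descriptions. The sketch below is only an illustration in base R: the annotation table, its column names, and the keyword patterns are hypothetical, and real annotations would need more careful patterns.

```r
# Hypothetical annotation table for differentially expressed genes
annot <- data.frame(
  gene_id = c("gene1", "gene2", "gene3", "gene4"),
  description = c("heat shock protein 70",
                  "E3 ubiquitin-protein ligase",
                  "superoxide dismutase",
                  "actin"),
  stringsAsFactors = FALSE
)

# Assumed keyword patterns for each functional group of interest
groups <- c(
  HSP = "heat shock",
  ubiquitin = "ubiquitin",
  oxidative_stress = "superoxide dismutase|catalase|peroxidase"
)

# Assign each gene to the first group whose pattern matches its description
annot$group <- NA_character_
for (g in names(groups)) {
  hit <- grepl(groups[[g]], annot$description, ignore.case = TRUE)
  annot$group[hit & is.na(annot$group)] <- g
}

# Gene IDs per group; genes with no match (group is NA) are dropped by split()
genes_by_group <- split(annot$gene_id, annot$group)
```

Each element of `genes_by_group` could then be used to subset the expression matrix for its own heatmap panel.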
Reviewing Ashray’s code
The last thing I wanted to do was review Ashray’s script. Because of the back-and-forth nature of the analysis, I’ve noticed that his code is not well-annotated, and there are often superfluous code chunks that either generate incorrect datasets or make it hard to figure out what is going on. I wanted to review his script to point out all the places where the code could be made more reproducible. That issue can be found here. This will also make it easier for me to review his code and see what he’s working on.
Going forward
- Use DESeq2 for preliminary temperature-specific gene expression analysis
- Determine methods for functional analysis
- Compare DESeq2 with edgeR and pick a method to move forward with
- Repeat analysis with clean transcriptome and fuller annotations
- Identify temperature- and genotype-specific differentially expressed genes at the end of the experiment
- Identify genes influenced by both temperature and time
- Additional strand-specific analysis in the supergene region
- Examine HOBO data from 2023 experiment
- Demographic data analysis for 2023 paper