Hawaii Gigas Methylation Analysis

Trimming Hawaii data

I’m starting data analysis for a new(-ish) project! Dr. Maria Haws at the University of Hawai’i Hilo exposed diploid and triploid adult C. gigas to low or ambient pH in a factorial design for six weeks and dissected ctenidia tissue. Sam extracted DNA from these tissues for WGBS at Zymo. Once he got the sequencing data, he ran FastQC. My goal is to identify differentially methylated loci associated with ploidy and/or ocean acidification in these samples.

To start, I wanted to trim the data with trimgalore on mox. Even though fastp is faster, I chose trimgalore because that’s what we used in the coral methylation comparison pipeline. I started by downloading the sequence data from nightingales into the gscratch/scrubbed/yaaminiv directory on mox:

wget -r -l1 --no-parent -A "zr3644*gz" http://owl.fish.washington.edu/nightingales/C_gigas/ #Obtain sequencing data
wget http://owl.fish.washington.edu/nightingales/C_gigas/zr3644_MD5.txt #Obtain md5 information
md5sum -c zr3644_MD5.txt #Check md5sum

I copied the mox script we used to trim coral samples and pasted it into a new shell script. For the paired WGBS data, I need to hard trim 10 bp from each end, so I made sure the script did that. I added all the paths for the paired data into the script, then transferred it to mox and ran it:

rsync --archive --progress --verbose yaamini@172.25.149.226:/Users/yaamini/Documents/project-oyster-oa/code/Haws/01-trimgalore.sh . #Transfer script to USER directory, not scrubbed directory
sbatch 01-trimgalore.sh
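
For reference, the hard-trimming step can be sketched roughly as below. This is not the contents of 01-trimgalore.sh: the file names, the output directory, and the cutadapt location are hypothetical placeholders, and it assumes "each end" means the 5' and 3' ends of both reads. By default it only prints the command (a dry run) so the flags can be checked before anything is submitted.

```shell
# Sketch only: OUT_DIR, CUTADAPT, and the file names are hypothetical placeholders.
# DRY_RUN=1 (the default) prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
OUT_DIR="${OUT_DIR:-trimmed}"                      # hypothetical output directory
CUTADAPT="${CUTADAPT:-$HOME/.local/bin/cutadapt}"  # assumed per-user install location

trim_pair() {
  # $1 = R1 FASTQ, $2 = R2 FASTQ
  # Hard trim 10 bp from the 5' end (--clip_*) and 3' end (--three_prime_clip_*) of both reads.
  set -- trim_galore --paired \
    --clip_R1 10 --clip_R2 10 \
    --three_prime_clip_R1 10 --three_prime_clip_R2 10 \
    --path_to_cutadapt "$CUTADAPT" \
    --output_dir "$OUT_DIR" \
    "$1" "$2"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"   # show what would run
  else
    "$@"        # actually run trim_galore
  fi
}

# Hypothetical pair; the real files start with zr3644
trim_pair sample_R1.fastq.gz sample_R2.fastq.gz
```

Setting DRY_RUN=0 runs the command for real. Passing the cutadapt location explicitly is optional when cutadapt is already on the PATH.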

It ran…without actually trimming anything! Since there were no trimmed files, fastqc and multiqc didn’t run either! When I checked the program paths, I noticed that the path to cutadapt didn’t lead to the program. I checked the shared srlab/programs folders but couldn’t find cutadapt there, so I decided to install it myself. When I brought this up at lab meeting, Laura mentioned that she had gone through the same process earlier. According to Sam, the way cutadapt works means that it can’t be installed at a shared location, so every user has to install it for themselves. I followed the clear steps Laura laid out in this issue to install cutadapt and add fastqc to my path, then transferred the revised script to mox and ran it. The files are trimming, so now I just need to wait.
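
A quick check before submitting would have caught the missing cutadapt earlier. The sketch below is one way to do it, not something from the original script: the function name check_tools is hypothetical, and it assumes the tools are meant to be found on the PATH rather than through hard-coded paths.

```shell
# Return 0 if every named program is on PATH; report each missing one on stderr.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool -> $(command -v "$tool")"
    else
      echo "MISSING: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Programs the trimming job depends on; check them before running sbatch.
if check_tools trim_galore cutadapt fastqc multiqc; then
  echo "all tools found"
else
  echo "install the missing tools or fix PATH before submitting" >&2
fi
```

One common per-user route for cutadapt is `python3 -m pip install --user cutadapt`, though the steps in Laura's issue may differ.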

Going forward

  1. Look at FastQC and MultiQC output
  2. Edit coral methylation scripts to fit data
  3. Run bismark pipeline
Written on January 6, 2021