Hawaii Gigas Methylation Analysis Part 6

Aligning to the new C. gigas genome

There’s a new C. gigas genome! Steven mentioned Mac may have been using the new genome (the Roslin assembly), so I posted this issue to get a link to it. Although it turned out Mac never used the Roslin genome, Steven thought it would be a good idea to align to it. It has linkage groups, not just scaffolds, which is better for mapping and for interpreting my output. It has 10 linkage groups, the same number as the 10 C. virginica chromosomes! My goal is to align the samples to this version of the genome and move forward with these alignments in the main workflow. I’m contemplating carrying the previously-generated alignments through the workflow as well, just to compare them with the new ones.

Adding the mitochondrial genome

The first thing I needed to do was add the mitochondrial genome sequence to the Roslin genome. I couldn’t figure out how to download the FASTA (NCBI why is your website so confusing). I opened up nano and copied and pasted the sequence into a new file, then saved it as a .fa and hoped for the best. Steven had the Roslin genome in one of his directories, so I copied it over to my /gscratch/scrubbed/yaamini/Haws/data/Cg-roslin folder. I appended the mitochondrial sequence to the Roslin genome using cat, and saved it as a new .fa file. I looked at the file with less and the format looked like a FASTA file, so I decided to proceed with bismark.
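
For anyone repeating this, here is a minimal sketch of the append step with a check that goes a bit further than eyeballing the file in less. The file names (`roslin.fa`, `mito.fa`, `roslin_mito.fa`) and the toy sequences are made-up placeholders, not the real files. The idea is that the combined FASTA should have exactly one more header line than the original genome.

```shell
# Toy stand-ins for the real files (hypothetical names and sequences)
workdir=$(mktemp -d)
printf '>LG1\nACGTACGT\n>LG2\nTTGGCCAA\n' > "$workdir/roslin.fa"
printf '>mito\nATATCGCG\n' > "$workdir/mito.fa"

# Append the mitochondrial sequence to the nuclear genome
cat "$workdir/roslin.fa" "$workdir/mito.fa" > "$workdir/roslin_mito.fa"

# Count FASTA headers: the combined file should have one more than the original
n_before=$(grep -c '^>' "$workdir/roslin.fa")
n_after=$(grep -c '^>' "$workdir/roslin_mito.fa")
echo "headers before: $n_before, after: $n_after"

# List the header names to confirm the mitochondrial record comes last
grep '^>' "$workdir/roslin_mito.fa"
```

The header count is worth checking because cat joins files byte for byte: if the first file does not end with a newline, the mitochondrial header gets glued onto the last sequence line and no longer starts a line of its own, so the count would come up one short.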

bismark script

I created this script to prepare a bisulfite genome and align samples. I referred to this old Jupyter notebook for the genome preparation arguments. I also used rsync to move the FASTQ files from gannet to mox so I could run the script. I submitted the script to mox and it’s in the queue to run!
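
The overall shape of those steps looks roughly like the sketch below. This is not the actual script: every path, the remote host string, and the sample name are hypothetical placeholders, paired-end reads are an assumption, and the real genome preparation and alignment arguments are the ones in the linked script and notebook. The sketch only prints the commands by default, so it can be read and checked without bismark installed.

```shell
# All paths, the remote host, and the sample name are hypothetical placeholders
genome_dir="path/to/genome_folder"    # folder holding the combined .fa
reads_dir="path/to/fastq"
out_dir="path/to/bismark_output"
sample="sample1"

DRY_RUN=1   # 1 = only print each command; set to 0 to actually execute

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 0: copy the FASTQ files over (remote location is an assumption)
run rsync -avz "user@remote-host:path/to/fastq/" "$reads_dir/"

# Step 1: bisulfite-convert and index the genome (needed once per genome)
run bismark_genome_preparation --verbose "$genome_dir"

# Step 2: align one sample, assuming paired-end reads named *_R1/*_R2
run bismark --genome "$genome_dir" \
  -1 "$reads_dir/${sample}_R1.fastq.gz" \
  -2 "$reads_dir/${sample}_R2.fastq.gz" \
  -o "$out_dir"
```

Printing the commands first is a cheap way to catch a wrong path before the job sits in the queue.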

Going forward

  1. Try BS-SNPer and EpiDiverse for SNP extraction from WGBS data
  2. Obtain preliminary methylation assessment from methylKit
  3. Test-run DSS and RaMWAS
  4. Investigate comparison mechanisms for samples with different ploidy in oysters and other taxa
  5. Transfer scripts used to a Nextflow workflow
  6. Update methods
  7. Update results

Written on March 3, 2021