Killifish Hypoxia RRBS Part 20

Re-align to the new genome

It’s been a while, so it’s time to dive back into this project! Before revisiting my analyses, I want to map the data to the newest version of the genome. Like the old genome, the new version has 24 chromosomes. When I annotated the genes closest to DMR, I found that some of those genes are not in the most current version of the genome. Additionally, it’s hard to find the RefSeq annotation or RepeatMasker information for the old genome on NCBI. The old version of the genome was assembled in 2015, while the new version is from 2020.

There are lots of reasons to use the new genome! I’m going to go through the entire BAT workflow with the new genome, then compare the output to what I got with the old genome. Finding a lack of large-scale methylation differences with both genome versions would lend further support to our results.

I downloaded the new genome onto Poseidon from this link. In the envfile, I updated the path so it points to the new genome. In the alignment script, I removed the code that bound Neel’s home directory to the singularity container, since the genome is now in my home directory and not his. The BAT-mapping and BAT-mapping-stat scripts remain unchanged. Since I had a problem running scripts over 24 hours last time, I didn’t bother to change the time limit on the script.
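
One way to sanity-check the download before building indices is to count the sequences in the FASTA by accession prefix. The sketch below is not part of the BAT workflow; the function name `count_fasta_records` and the file path in the usage comment are hypothetical. It assumes the genome uses RefSeq-style accessions, where assembled chromosomes start with `NC_` and unplaced scaffolds start with `NW_`. Note that a mitochondrial sequence, if included, also carries an `NC_` accession, so the `NC_` count could be one higher than the number of chromosomes.

```python
import gzip
from collections import Counter


def count_fasta_records(path):
    """Count FASTA records grouped by accession prefix (e.g. 'NC_', 'NW_').

    Headers with no underscore in the first field, or with no name at all,
    are counted under 'other'. Plain and gzip-compressed files both work.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith(">"):
                continue  # sequence line, not a header
            fields = line[1:].split()
            accession = fields[0] if fields else ""
            if "_" in accession:
                # Keep everything up to and including the first underscore
                prefix = accession.split("_")[0] + "_"
            else:
                prefix = "other"
            counts[prefix] += 1
    return counts


# Usage (hypothetical path):
# counts = count_fasta_records("new_genome.fna.gz")
# print(counts["NC_"], "NC_ records;", counts["NW_"], "NW_ records")
```

If the `NC_` count is not what I expect for 24 chromosomes (plus a possible mitochondrial record), that would be a sign to re-check the download before spending cluster time on indexing.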

The genome indices are being built now, so as long as some samples start mapping in the next 24 hours, I should be set to get mapping results in the next week!

Going forward

  1. Revise methylation landscape information
  2. Update methods and results
  3. Match DMR with RNA-Seq information
  4. Continue BAT workflow with new genome
  5. Try DMR identification with Bismark and methylKit
  6. Create OSF repository for all intermediate files
Written on September 7, 2022