Killifish Hypoxia RRBS Part 15

Revised DMR identification

I’m moving forward with 10 reads minimum and 500 reads maximum for BAT_filter_vcf and default settings for BAT_DMRcalling. Time to apply these settings to all my data!

BAT_filter_vcf

I updated my code to include the revised filtering settings. I still want a maximum read count because I want to avoid regions with excessive PCR duplication. Once I obtained the filtered bedGraphs, I sorted them and removed the extra chromosome columns.
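To make the filtering criterion concrete, here is a minimal Python sketch of a 10-to-500 read window followed by a coordinate sort. This is not how BAT_filter_vcf is implemented. The `CpG` record, the `filter_by_coverage` helper, the chromosome names, and the inclusive bounds are all assumptions made for illustration.

```python
from typing import NamedTuple


class CpG(NamedTuple):
    """Hypothetical per-site record; not the BAT toolkit's file format."""
    chrom: str
    pos: int
    coverage: int   # number of reads covering the site
    meth: float     # methylation rate between 0 and 1


def filter_by_coverage(sites, min_reads=10, max_reads=500):
    """Keep sites with min_reads <= coverage <= max_reads (bounds assumed
    inclusive), then sort by chromosome and position."""
    kept = [s for s in sites if min_reads <= s.coverage <= max_reads]
    return sorted(kept, key=lambda s: (s.chrom, s.pos))


# Made-up example sites
sites = [
    CpG("chr2", 150, 12, 0.80),
    CpG("chr1", 900, 9, 0.50),    # dropped: below the 10-read minimum
    CpG("chr1", 300, 500, 0.10),  # kept: exactly at the 500-read maximum
    CpG("chr1", 100, 742, 0.95),  # dropped: above the 500-read maximum
]

filtered = filter_by_coverage(sites)
for site in filtered:
    print(site.chrom, site.pos, site.coverage, site.meth)
```

The upper bound is what removes very deeply covered loci, which is where excessive PCR duplication would be expected to show up.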

The revised filtering output can be found in this folder. I’m still seeing the weird coverage graphs in L4 samples when compared to others:

Figures 1-2. Coverage for filtered CpGs in L3 and L4 samples


I’m a bit concerned about using a 500-read maximum at an individual locus given the weird L4 graphs, but I’ll go through the rest of the workflow and decide later. If the L4 pattern remains a concern, it may be worth lowering the maximum (e.g., 200, 300, or 400 reads) to see whether I can still call DMR with more conservative settings.
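One way to gauge the cost of a stricter cap before rerunning the whole workflow is to count how many CpGs survive each candidate maximum. The sketch below assumes the per-site coverage values have already been read into a list. The `retained_per_cap` helper and the coverage numbers are hypothetical, not real sample data.

```python
def retained_per_cap(coverages, caps, min_reads=10):
    """For each candidate maximum, count the sites whose coverage falls in
    [min_reads, cap] (bounds assumed inclusive)."""
    return {
        cap: sum(1 for c in coverages if min_reads <= c <= cap)
        for cap in caps
    }


# Made-up coverage values for a handful of CpGs
coverages = [8, 15, 60, 210, 350, 450, 620]

counts = retained_per_cap(coverages, caps=[200, 300, 400, 500])
for cap, n in counts.items():
    print(f"max {cap} reads: {n} CpGs retained")
```

Comparing these counts across samples would show whether a lower cap removes disproportionately more sites from the L4 samples than from the others.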

BAT_summarize and BAT_overview

The output for BAT_summarize and BAT_overview with the re-filtered samples can be found here. Looking at the PDFs, I didn’t see anything notable beyond average methylation differences between treatments.

BAT_DMRcalling

Now the good stuff…DMR! The output for every contrast can be found in this folder.

Table 1. Number of DMR per contrast

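| Contrast    | Group 1 | Group 2 | DMR |
|-------------|---------|---------|-----|
| All samples | N       | S       | 0   |
| OC          | N       | S       | 1   |
| 20          | N       | S       | 1   |
| 5           | N       | S       | 2   |
| N           | 20      | 5       | 16  |
| N           | 20      | OC      | 2   |
| S           | 20      | 5       | 1   |
| S           | 20      | OC      | 1   |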

I’m not surprised there were no DMR identified when all samples were compared between populations: that introduced a lot of variability. Now that I have a list of DMR, the next step is to view them in IGV and determine if the DMR are adjacent to any genes or other genome features.
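Alongside viewing DMR in IGV, the "adjacent to any genes" check can be roughed out programmatically. The sketch below assumes DMR and genes are available as simple coordinate tuples on the same coordinate system. The `nearest_gene` helper, the gene names, and the coordinates are all hypothetical, and a real annotation would more likely use a dedicated interval tool.

```python
def nearest_gene(dmr, genes):
    """Return (gene_name, distance_bp) for the closest gene on the same
    chromosome, or None if there is none. Distance is 0 when the DMR
    overlaps the gene.

    dmr   : (chrom, start, end)
    genes : iterable of (chrom, start, end, name)
    """
    dmr_chrom, dmr_start, dmr_end = dmr
    best = None
    for chrom, g_start, g_end, name in genes:
        if chrom != dmr_chrom:
            continue
        # Size of the gap between the two intervals; 0 if they overlap
        dist = max(0, g_start - dmr_end, dmr_start - g_end)
        if best is None or dist < best[1]:
            best = (name, dist)
    return best


# Made-up gene models and DMR coordinates
genes = [
    ("chr1", 100, 800, "geneA"),
    ("chr1", 1500, 3000, "geneB"),
    ("chr2", 1000, 1200, "geneC"),
]

upstream_hit = nearest_gene(("chr1", 1000, 1200), genes)
overlap_hit = nearest_gene(("chr1", 1600, 1700), genes)
print(upstream_hit, overlap_hit)
```

A distance of 0 flags a DMR that sits inside a gene, while small nonzero distances point to candidates that may lie in flanking regions.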

Going forward

  1. Update methods and results
  2. Obtain other important methylation landscape information
  3. Generate genome feature tracks
  4. Annotate DMR locations
  5. Try DMR identification with bismark and methylKit
  6. Start RNA-Seq analysis
  7. Create OSF repository for all intermediate files
Written on May 24, 2022