Killifish Hypoxia RRBS Part 13

Messing around with DMR identification

When I first tried identifying DMR, I got 0 DMR for every contrast. I’m going to experiment with a few settings to figure out why.

Investigating 0 DMR result

Neel’s first suggestion was to check how many methylated regions metilene identified between treatment conditions before filtering for DMR. When I opened the output files, they were empty! This suggests that the initial region identification itself failed. I’m not sure whether it failed because of the input files (potential outliers could warp this) or because of other user-defined settings. One candidate setting is the requirement that regions contain at least 10 Cs. The BAT_DMRcalling manual suggests that this filter is applied in post-processing. To confirm this, I reran BAT_DMRcalling with -c 1 to set a minimum of 1 C per region. Once again, metilene was unable to call any methylated regions, so the minimum C setting isn’t the cause, and the problem lies with the input files or the metilene call itself.
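For future reruns, a quick way to catch empty output without opening each file might look like the sketch below. `report_empty` is a hypothetical helper name, and the paths passed to it would be whatever files BAT_DMRcalling and metilene actually wrote, which I’m not assuming here.

```shell
# Print one line per file: the line count if it has content, or EMPTY if it is
# zero bytes or missing. report_empty is a hypothetical helper, not part of BAT.
report_empty() {
    for f in "$@"; do
        if [ -s "$f" ]; then
            # wc -l pads its output on some systems, so strip the spaces
            printf '%s\t%s lines\n' "$f" "$(wc -l < "$f" | tr -d ' ')"
        else
            printf '%s\tEMPTY\n' "$f"
        fi
    done
}
```

Calling it with a glob over the output directory after each rerun would show at a glance whether any regions were written.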

Looking at metilene input

The next thing I did was look at the differences in mean methylation rates between groups for all my contrasts. It stands to reason that if there aren’t large differences in mean methylation, I’m not going to get any differentially methylated regions. Based on code I found on Stack Overflow, I used the following awk commands to find the maximum and minimum methylation rate differences between groups (shown here for the all-samples N vs. S contrast):

(base) [yaamini.venkataraman@poseidon-l1 all_pop]$ awk -v idx=4 'NR==2 || $idx>max{max=$idx} END{print max}' all_pop_diff_N_S.bedgraph
0.0755
(base) [yaamini.venkataraman@poseidon-l1 all_pop]$ awk -v idx=4 'NR==2 || $idx<min{min=$idx} END{print min}' all_pop_diff_N_S.bedgraph
-0.132166666666667
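The two commands above could probably be folded into a single pass over the file. This is only a sketch: `diff_range` is a hypothetical helper name, and it assumes (as the NR==2 in the commands above implies) that line 1 is a header and that the difference sits in column 4.

```shell
# Print the largest and smallest value of one column in a single pass.
# Assumptions: line 1 is a header; the column defaults to 4.
# diff_range is a hypothetical helper name.
diff_range() {
    awk -v idx="${2:-4}" '
        NR == 1 { next }                        # skip the assumed header line
        NR == 2 { max = $idx; min = $idx; next }   # seed both from the first data row
        ($idx + 0) > (max + 0) { max = $idx }   # compare numerically, keep original text
        ($idx + 0) < (min + 0) { min = $idx }
        END { if (NR >= 2) printf "max=%s min=%s\n", max, min }
    ' "$1"
}
```

Usage would be `diff_range all_pop_diff_N_S.bedgraph`, with an optional second argument for a different column.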

Table 1. Maximum and minimum methylation rate differences between groups for all contrasts

| Contrast | Group 1 | Group 2 | Maximum Difference | Minimum Difference |
|---|---|---|---|---|
| All Samples | N | S | 0.0755 | -0.132166666666667 |
| OC | N | S | 0.435 | -0.4175 |
| 20 | N | S | 0.415 | -0.6 |
| 5 | N | S | 0.498333333333333 | -0.66 |
| N | 20 | 5 | 0.646666666666667 | -0.663333333333333 |
| N | 20 | OC | 0.5425 | -0.5975 |
| S | 20 | 5 | 0.3525 | -0.41 |
| S | 20 | OC | 0.42 | -0.38 |

The all-samples comparison has a narrow range (0.0755 to -0.132), but every other contrast has loci with mean methylation rate differences beyond 0.35 in both directions. If several of those loci sit near each other, they could plausibly form a DMR. Perhaps there’s an outlier sample I need to remove? I don’t know the best way to identify outliers, so I’ll talk to Neel about that. I also need to consider whether the problem is in the metilene code or the input file.
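To check whether large-difference loci actually cluster, something like the sketch below could count them. Everything about it is an assumption on my part: `count_large_diffs` is a hypothetical helper, the columns are taken to be chromosome, start, end, difference with a header on line 1, the file is taken to be sorted by chromosome and start, and the 300 bp window is an arbitrary default, not a value from my metilene run.

```shell
# Count loci whose absolute difference exceeds a cutoff, and how many of those
# lie within `gap` bp of the previous such locus on the same chromosome.
# Assumptions: header on line 1; columns chrom/start/end/diff; sorted input.
# count_large_diffs is a hypothetical helper name.
count_large_diffs() {
    awk -v cutoff="${2:-0.1}" -v gap="${3:-300}" '
        NR == 1 { next }                     # skip the assumed header line
        {
            d = $4 + 0
            if (d < 0) d = -d                # absolute difference
            if (d <= cutoff) next            # ignore small differences
            n++
            if ($1 == prev_chr && $2 - prev_pos <= gap) near++
            prev_chr = $1; prev_pos = $2     # remember the last large-difference locus
        }
        END { printf "large=%d near_previous=%d\n", n, near }
    ' "$1"
}
```

A contrast with many large loci but a near_previous count close to zero would suggest the differences are scattered and not regional, which would be one explanation for getting no DMR.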

Going forward

  1. Determine issue with DMR calling
  2. Try DMR identification with bismark and methylKit
  3. Write methods and results
  4. Obtain other important methylation landscape information
  5. Generate genome feature tracks
  6. Annotate DMR locations
  7. Start RNA-Seq analysis
  8. Create OSF repository for all intermediate files
Written on May 19, 2022