Killifish Hypoxia RRBS Part 14
Revisiting DMR identification
After talking to Neel, we decided to go back to the BAT_filter_vcf step and see if changing some of the filtering settings would lead to DMR identification. By imposing more lenient restrictions at the filtering step, I may have more loci available for metilene to construct methylated regions.
Checking code
Before I proceeded with re-filtering my called samples, I checked the BAT_summarize code to make sure I didn’t mix any samples up! I reviewed the code and fortunately (or unfortunately?) there were no obvious mistakes. I also reviewed the pdf output from BAT_overview to see if the samples sequenced on L4 clustered together. Since I noticed they had more truncated patterns than the other samples, I wanted to make sure the truncation wasn’t impacting my ability to call DMR. I didn’t see any L4 samples cluster together in the dendrograms.
Refiltering samples
Alright, now that I checked everything else, it was time to re-filter samples. I tried the following settings:
- MDP_min = 10, MDP_max = 100
- MDP_min = 10
- MDP_min = 10, MDP_max = 500
- MDP_min = 10, MDP_max = 1000
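To make the effect of these depth bounds concrete, here is a minimal sketch of a min/max depth filter applied to a handful of toy loci. This is not BAT_filter_vcf's actual implementation: the `filter_by_depth` helper, the toy depths, the inclusive bounds, and the reading of a missing MDP_max as "no upper bound" are all assumptions made for illustration.

```python
from typing import Optional

# Toy (position, depth) pairs; these values are made up for illustration only.
toy_loci = [(101, 8), (150, 12), (203, 95), (340, 240), (512, 760), (618, 1500)]


def filter_by_depth(loci, mdp_min: int, mdp_max: Optional[int] = None):
    """Keep loci with mdp_min <= depth <= mdp_max (no upper bound if mdp_max is None).

    Hypothetical helper: inclusive bounds are an assumption, not BAT's documented behavior.
    """
    kept = []
    for position, depth in loci:
        if depth < mdp_min:
            continue  # too shallow to trust the methylation call
        if mdp_max is not None and depth > mdp_max:
            continue  # deeper than the allowed maximum
        kept.append((position, depth))
    return kept


# The four settings tried above (None stands for "no MDP_max given").
settings = [(10, 100), (10, None), (10, 500), (10, 1000)]
retained = {s: len(filter_by_depth(toy_loci, *s)) for s in settings}
for (mdp_min, mdp_max), n_kept in retained.items():
    print(f"MDP_min={mdp_min}, MDP_max={mdp_max}: {n_kept} loci kept")
```

With a tight upper bound, every high-coverage locus is dropped, which is one way a stricter MDP_max could leave metilene with fewer loci to build regions from.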
Once samples were re-filtered, I ran them through BAT_summarize and BAT_DMRcalling. I used four contrasts for these tests: 20 vs. 5 and 20 vs. OC within each population. For BAT_DMRcalling, I started with a minimum of 1 CpG per DMR just to see whether any methylated regions would be detected at all, then raised that minimum once I identified DMR.
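The four contrasts are just the two comparisons crossed with the two populations. A small sketch of that enumeration, assuming the population labels N and S used in Table 1 (the variable names are mine, not part of any BAT tool):

```python
from itertools import product

populations = ["N", "S"]                   # population labels as they appear in Table 1
comparisons = [("20", "5"), ("20", "OC")]  # (Group 1, Group 2) for each comparison

# Cross every population with every comparison: 2 x 2 = 4 contrasts.
contrasts = [(pop, g1, g2) for pop, (g1, g2) in product(populations, comparisons)]
for pop, g1, g2 in contrasts:
    print(f"{pop}: {g1} vs. {g2}")
```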
Table 1. Number of DMR identified with different filtering and DMR identification settings
Contrast | Group 1 | Group 2 | MDP_min = 10, MDP_max = 100, MR_min = .10, c = 10 | MDP_min = 10, MDP_max = 100, MR_min = .10, c = 1 | MDP_min = 10, c = 1 | MDP_min = 10, c = 5 | MDP_min = 10, c = 10 | MDP_min = 10, MDP_max = 100, c = 1 | MDP_min = 10, MDP_max = 500, c = 1 | MDP_min = 10, MDP_max = 500, c = 5 | MDP_min = 10, MDP_max = 500, c = 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
N | 20 | 5 | 0 | 0 | 16 | 16 | 16 | 0 | 16 | 16 | 16 |
N | 20 | OC | 0 | 0 | 2 | 2 | 2 | 0 | 2 | 2 | 2 |
S | 20 | 5 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
S | 20 | OC | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
Reviewing my DMR output, it’s clear that the same DMR were called regardless of the settings, with one exception: every run with MDP_max = 100 returned zero DMR (Table 1). Now that I know there are DMR in these datasets, I want to finalize my BAT_filter_vcf and BAT_DMRcalling settings so I can apply them to the other four contrasts.
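As a quick sanity check on that reading of Table 1, the counts can be grouped by their profile across the four contrasts. This sketch only compares the numbers copied from the table; it says nothing about whether the regions themselves are identical, and the `(MDP_max, MR_min, c)` key layout is just my own encoding of the column headers.

```python
# DMR counts copied from Table 1. Keys are (MDP_max, MR_min, c); None = not set.
# MDP_min is 10 in every column, so it is left out of the key.
rows = ["N 20 vs 5", "N 20 vs OC", "S 20 vs 5", "S 20 vs OC"]
table = {
    (100, 0.10, 10): [0, 0, 0, 0],
    (100, 0.10, 1): [0, 0, 0, 0],
    (None, None, 1): [16, 2, 1, 1],
    (None, None, 5): [16, 2, 1, 1],
    (None, None, 10): [16, 2, 1, 1],
    (100, None, 1): [0, 0, 0, 0],
    (500, None, 1): [16, 2, 1, 1],
    (500, None, 5): [16, 2, 1, 1],
    (500, None, 10): [16, 2, 1, 1],
}

# Group settings that produce the same counts in all four contrasts.
profiles = {}
for setting, counts in table.items():
    profiles.setdefault(tuple(counts), []).append(setting)

for counts, matching_settings in profiles.items():
    print(dict(zip(rows, counts)), "<-", matching_settings)
```

Only two profiles come out: all zeros for every setting with MDP_max = 100, and 16/2/1/1 for everything else, independent of c.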
Going forward
- Finalize BAT_filter_vcf and BAT_DMRcalling settings
- Run all data through BAT_filter_vcf, BAT_summarize, BAT_overview, and BAT_DMRcalling
- Update methods and results
- Obtain other important methylation landscape information
- Generate genome feature tracks
- Annotate DMR locations
- Try DMR identification with bismark and methylKit
- Start RNA-Seq analysis
- Create OSF repository for all intermediate files