Killifish Hypoxia RRBS Part 15
Revised DMR identification
I’m moving forward with a minimum of 10 reads and a maximum of 500 reads for `BAT_filter_vcf`, and default settings for `BAT_DMRcalling`. Time to apply these settings to all my data!
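To make the thresholds concrete, here is a minimal Python sketch of a read-depth window filter. It is not how `BAT_filter_vcf` is implemented. The `CpG` record, the `filter_by_coverage` function, the toy sites, and the choice to treat both bounds as inclusive are all assumptions for illustration.

```python
from typing import Iterable, Iterator, NamedTuple


class CpG(NamedTuple):
    """Hypothetical record for one CpG site (not a BAT data structure)."""
    chrom: str
    pos: int
    coverage: int       # total reads covering the site
    methylation: float  # fraction of methylated reads, 0 to 1


def filter_by_coverage(
    sites: Iterable[CpG], min_reads: int = 10, max_reads: int = 500
) -> Iterator[CpG]:
    """Yield sites whose depth lies in [min_reads, max_reads].

    Inclusive bounds are an assumption made for this sketch.
    """
    for site in sites:
        if min_reads <= site.coverage <= max_reads:
            yield site


# Toy data, invented purely for illustration.
sites = [
    CpG("chr1", 100, 9, 0.50),    # below the minimum, dropped
    CpG("chr1", 250, 10, 0.80),   # at the minimum, kept
    CpG("chr1", 400, 500, 0.10),  # at the maximum, kept
    CpG("chr2", 75, 501, 0.95),   # above the maximum, dropped
]
kept = list(filter_by_coverage(sites))
```

The upper bound is what guards against loci with suspiciously deep coverage, while the lower bound drops sites with too few reads to estimate methylation reliably.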
I updated my code to include the revised filtering settings. I still want a maximum read count because I want to avoid regions with excessive PCR duplication. Once I obtained filtered bedGraphs, I sorted them and removed the extra chromosome columns.
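The sort-and-trim step could look roughly like the Python sketch below. It assumes the "extra chromosome columns" are trailing fields beyond the four standard bedGraph columns (chrom, start, end, value), which may not match the actual files. The function name `sort_and_trim_bedgraph` and the sample lines are hypothetical.

```python
from typing import Iterable, List


def sort_and_trim_bedgraph(lines: Iterable[str], keep_columns: int = 4) -> List[str]:
    """Keep the first `keep_columns` tab-separated fields of each record,
    then sort by chromosome name (as text) and start position (as a number)."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and track/comment header lines.
        if not line or line.startswith(("track", "#")):
            continue
        records.append(line.split("\t")[:keep_columns])
    records.sort(key=lambda fields: (fields[0], int(fields[1])))
    return ["\t".join(fields) for fields in records]


# Invented example lines with a duplicated chromosome field at the end.
raw = [
    "chr2\t10\t11\t0.5\tchr2",
    "chr1\t30\t31\t0.9\tchr1",
    "chr1\t5\t6\t0.1\tchr1",
]
cleaned = sort_and_trim_bedgraph(raw)
```

Sorting the start column numerically matters: a plain text sort would place position 30 before position 5.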
The revised filtering output can be found in this folder. I’m still seeing the weird coverage graphs in L4 samples when compared to others:
Figures 1-2. Coverage for filtered CpGs in L3 and L4 samples
I’m a bit concerned about using a maximum of 500 reads at an individual locus given the weird L4 graphs, but I’ll go through the rest of the workflow and decide later. If the pattern in the L4 samples still worries me, it may be good to scale down the maximum number of reads (e.g., 200, 300, or 400) and see if I can still call these DMR with the more conservative filter.
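Before rerunning the whole workflow, one quick check is how many loci each candidate cap would retain. The Python sketch below counts sites per cap from a list of per-locus read depths. The `retained_counts` function and the depth values are made up for illustration, and the inclusive bounds are an assumption rather than documented BAT behavior.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, Iterable, Sequence


def retained_counts(
    coverages: Iterable[int],
    min_reads: int = 10,
    caps: Sequence[int] = (200, 300, 400, 500),
) -> Dict[int, int]:
    """For each cap, count loci with min_reads <= depth <= cap."""
    ordered = sorted(coverages)
    # Index of the first locus that meets the minimum depth.
    low = bisect_left(ordered, min_reads)
    return {cap: max(bisect_right(ordered, cap) - low, 0) for cap in caps}


# Invented per-locus read depths for one sample.
coverages = [8, 12, 150, 220, 350, 480, 900]
counts = retained_counts(coverages)
```

If the counts barely change between 200 and 500, the stricter cap costs little. A large drop would suggest that many CpGs sit in the high-coverage tail.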
The output for `BAT_overview` with the re-filtered samples can be found here. Looking at the PDFs, I didn’t see anything notable beyond average methylation differences between treatments.
Now the good stuff…DMR! The output for every contrast can be found in this folder.
Table 1. Number of DMR per contrast
| Contrast | Group 1 | Group 2 | DMR |
|---|---|---|---|
I’m not surprised there were no DMR identified when all samples were compared between populations: that introduced a lot of variability. Now that I have a list of DMR, the next step is to view them in IGV and determine if the DMR are adjacent to any genes or other genome features.
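Checking whether a DMR sits next to a gene comes down to an interval-overlap test with some flanking distance. The Python sketch below shows that test. The `Interval` record, the `genes_near_dmr` function, the 2,000 bp flank, and all coordinates are hypothetical, and in practice a genome-arithmetic tool would do this against real feature tracks.

```python
from typing import List, NamedTuple, Sequence


class Interval(NamedTuple):
    """Half-open genomic interval [start, end) with a label."""
    chrom: str
    start: int
    end: int
    name: str


def genes_near_dmr(dmr: Interval, genes: Sequence[Interval], flank: int = 2000) -> List[str]:
    """Return names of genes overlapping the DMR extended by `flank` bp on each side."""
    lo = dmr.start - flank
    hi = dmr.end + flank
    return [
        gene.name
        for gene in genes
        if gene.chrom == dmr.chrom and gene.start < hi and gene.end > lo
    ]


# Invented coordinates for illustration only.
dmr = Interval("chr1", 5000, 5200, "DMR_1")
genes = [
    Interval("chr1", 1000, 2500, "geneA"),    # ends before the flanked window
    Interval("chr1", 6500, 9000, "geneB"),    # starts inside the downstream flank
    Interval("chr1", 20000, 21000, "geneC"),  # too far downstream
    Interval("chr2", 5000, 6000, "geneD"),    # different chromosome
]
nearby = genes_near_dmr(dmr, genes)
```

Setting `flank=0` reduces this to a strict overlap test, which separates DMR inside gene bodies from those that are only nearby.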
- Update methods and results
- Obtain other important methylation landscape information
- Generate genome feature tracks
- Annotate DMR locations
- Try DMR identification with
- Start RNA-Seq analysis
- Create OSF repository for all intermediate files