DML Analysis Part 19

Overlaps with methylKit gene background

The next big step in the C. virginica project is to conduct a gene enrichment analysis, if one turns out to be necessary (we’ll get to this later). I started this notebook entry with that intention, but before I could get into gene enrichment or describing gene functions, I needed to characterize overlaps between the gene background used in methylKit and various genome feature files. I started this small analysis in December but didn’t follow through because I was focusing on my MEPS resubmission.

The gene background from methylKit is formed using unite. I took this background and saved it as a BED file in this R Markdown file. I then used intersectBed to find the overlaps between the gene background and each of the following feature sets: exons, introns, mRNA coding regions, and transposable elements. Finally, I calculated the overlap proportions. My next step is to use these overlap proportions, along with the overlap proportions from DML and DMR, to conduct a proportion test.
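To make the overlap proportion concrete, here is a minimal Python sketch of the idea. It is not the code from the R Markdown file or the intersectBed call itself. It assumes that an "overlap proportion" means the fraction of background loci that overlap at least one interval in a feature file, and that coordinates follow the BED convention (0-based, half-open). The function name, chromosome names, and coordinates are all hypothetical.

```python
def count_overlapping(loci, features):
    """Count loci that overlap at least one feature.

    Both arguments are lists of (chrom, start, end) tuples using
    BED-style 0-based, half-open coordinates.
    """
    # Group feature intervals by chromosome so each locus is only
    # compared against features on its own chromosome.
    by_chrom = {}
    for chrom, start, end in features:
        by_chrom.setdefault(chrom, []).append((start, end))

    hits = 0
    for chrom, start, end in loci:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < f_end and f_start < end
               for f_start, f_end in by_chrom.get(chrom, [])):
            hits += 1
    return hits


# Hypothetical toy data, not real C. virginica coordinates.
background = [
    ("chr1", 100, 102),
    ("chr1", 500, 502),
    ("chr1", 900, 902),
    ("chr2", 50, 52),
]
exons = [
    ("chr1", 90, 200),
    ("chr1", 850, 901),
]

n_overlap = count_overlapping(background, exons)
proportion = n_overlap / len(background)
print(f"{n_overlap} of {len(background)} background loci overlap exons ({proportion:.2f})")
```

This brute-force comparison is fine for a toy example, but for a full gene background an interval-aware tool such as intersectBed is the better choice.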

Going forward

  1. Conduct a proportion test with gene background and DML/DMR overlaps with genome feature files
  2. See how min_cov, alignment stringency, or SNPs affect clustering
  3. Determine if a formal gene enrichment is necessary
  4. If necessary, select the most appropriate gene enrichment method
  5. Describe functions of the most interesting genes with DML and DMR
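
The proportion test in step 1 could be sketched as follows. This is a plain two-proportion z-test written with the Python standard library, offered only as an illustration of the comparison and not as the test that will be used. All counts are made up, and the function name is hypothetical. It also treats the two groups as independent samples, which is an assumption worth checking, since DML are drawn from the background itself. R's prop.test applies a continuity correction by default, so its results would differ slightly from this uncorrected version.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Two-sided two-proportion z-test (no continuity correction).

    x1 of n1 loci in group 1 overlap a feature; x2 of n2 in group 2.
    Returns (z, p_value).
    """
    p1 = x1 / n1
    p2 = x2 / n2
    # Pooled proportion under the null hypothesis that both groups share one rate.
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0.0, 1.0):
        raise ValueError("test is undefined when all or no loci overlap")
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: DML overlapping exons vs. background loci overlapping exons.
z, p = two_proportion_z(x1=30, n1=100, x2=5000, n2=10000)
print(f"z = {z:.2f}, p = {p:.3g}")
```

A separate test would be run for each feature type (exons, introns, mRNA coding regions, transposable elements), so some form of multiple-testing correction may be worth considering.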
Written on January 4, 2019