DML Analysis Part 22

Gene enrichment for mRNA overlaps

Back in analysis mode! I decided to tackle a gene enrichment for any feature files that overlapped with mRNA coding regions. I only chose files with mRNA coding region overlaps for two reasons: learning which coding regions are enriched is most interesting to me, and I needed Genbank IDs to match overlap results with my blastx output. I did a gene enrichment before, but that analysis used the wrong background. For this analysis, I used a previously generated blastx output and the gene background from methylKit. I merged documents and isolated Uniprot codes in this R Markdown file.
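To illustrate the merging step, here is a minimal Python sketch of joining overlap results to blastx hits by Genbank ID and pulling out Uniprot accessions. It is not the code from my R Markdown file. The example rows, the function names, and the assumption that blastx subject IDs look like `sp|ACCESSION|NAME` are all hypothetical.

```python
import re

# Hypothetical overlap rows: (Genbank ID, feature start, feature end)
overlaps = [
    ("XM_000001.1", 100, 250),
    ("XM_000002.1", 900, 1200),
    ("XM_000003.1", 50, 75),   # no blastx hit, so it drops out of the merge
]

# Hypothetical blastx rows: (query Genbank ID, subject ID, e-value)
# Assumes Swiss-Prot style subject IDs of the form "sp|ACCESSION|NAME".
blastx_hits = [
    ("XM_000001.1", "sp|Q12345|ABC_HUMAN", 1e-50),
    ("XM_000001.1", "sp|O00001|DEF_HUMAN", 1e-5),   # weaker hit for the same query
    ("XM_000002.1", "sp|P67890|XYZ_MOUSE", 3e-12),
]


def uniprot_accession(subject_id):
    """Return the accession from an 'sp|ACCESSION|NAME' string, or None."""
    match = re.match(r"^(?:sp|tr)\|([^|]+)\|", subject_id)
    return match.group(1) if match else None


def merge_overlaps_with_blast(overlaps, blastx_hits):
    """Attach the best (lowest e-value) Uniprot accession to each overlap."""
    best = {}
    for query, subject, evalue in blastx_hits:
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)

    merged = []
    for genbank_id, start, end in overlaps:
        if genbank_id in best:
            accession = uniprot_accession(best[genbank_id][0])
            if accession is not None:
                merged.append((genbank_id, start, end, accession))
    return merged


merged = merge_overlaps_with_blast(overlaps, blastx_hits)
# Unique accessions, i.e. the kind of gene list that gets pasted into DAVID
gene_list = sorted({row[3] for row in merged})
print(gene_list)
```

The same join and accession extraction, applied to the full methylKit gene set, would produce the background list.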

I followed the instructions from my previous gene enrichment using DAVID. I downloaded the functional annotation table, functional annotation clustering, and GOterm information (biological processes, cellular components, and molecular functions) from DAVID for each analysis. The output from all my analyses can be found in this master folder. I’ve gone into detail about some of the GOterm results below, but there were barely any significantly enriched terms after correcting for multiple comparisons. A handful of GOterms are significant only without correction, and these could still be interesting to describe.
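To show why so few terms survive correction, here is a small sketch of a Benjamini-Hochberg adjustment in Python. DAVID computes its own corrected values, and I am not claiming this reproduces its numbers exactly. The p-values below are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four GOterms
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
for p, q in zip(raw, adjusted):
    print(f"raw p = {p:.3f}  adjusted = {q:.3f}")
```

With only four terms the adjustment is mild. With hundreds of GOterms tested at once, the multiplier `m / rank` grows, which is how a term can look significant uncorrected and still fail after correction.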

Overlaps

I performed a gene enrichment from the DMR-mRNA and DML-mRNA overlaps to see which genes were overrepresented in differentially methylated loci and regions between ambient and treatment samples.
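To make "overrepresented" concrete, here is a sketch of a one-sided hypergeometric test in Python. DAVID uses its own modified Fisher exact statistic (the EASE score), so this is a simplified stand-in and not DAVID's calculation. All of the counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Probability of seeing at least k annotated genes in a list of n.

    N: genes in the background
    K: background genes annotated with the GOterm
    n: genes in the list of interest (e.g. genes with DMR overlaps)
    k: genes in that list annotated with the GOterm
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 40 of 5000 background genes carry the term,
# and 5 of the 100 genes in the list carry it.
p = hypergeom_enrichment_p(k=5, n=100, K=40, N=5000)
print(f"p = {p:.4g}")
```

The background size `N` and annotated count `K` come directly from the background list, which is why using the wrong background changes every p-value.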

Based on corrected p-values, there was only one significantly enriched GOterm for DMR-mRNA overlaps: cilium morphogenesis! That could be interesting, since impacts on cilia could affect cellular structure. The only other GOterm with an FDR below 10% was cellular projection organization, which may also be involved with cilia, flagella, and sperm motility.

There were no significantly enriched GOterms when I looked at DMLs instead of DMRs. For cellular components, cytoplasm had an FDR less than 10%. The molecular function ubiquitin-protein transferase activity also had an FDR less than 10%.

Upstream Flanks

I previously conducted a flanking analysis to identify 100 bp flanks upstream and downstream of mRNA coding regions. I then intersected these flanks with DMRs and DMLs. Understanding what processes are enriched in the flanking regions can provide insight into regulatory mechanisms.
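As a rough illustration of what a 100 bp flank and an intersection mean, here is a Python sketch. It assumes 0-based, half-open coordinates and strand-aware flanks. Both are my assumptions for the example and not a description of the tool or settings used in the earlier flanking analysis. The coordinates are hypothetical, and clipping at the end of a chromosome is left out.

```python
def flanks(start, end, strand, size=100):
    """Return upstream and downstream flanks of a feature.

    Coordinates are 0-based, half-open: the feature covers [start, end).
    On the minus strand, "upstream" lies at higher coordinates.
    """
    left = (max(0, start - size), start)   # clipped at the sequence start
    right = (end, end + size)              # not clipped at the sequence end
    if strand == "+":
        return {"upstream": left, "downstream": right}
    if strand == "-":
        return {"upstream": right, "downstream": left}
    raise ValueError(f"unknown strand: {strand!r}")


def overlaps(a, b):
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


# Hypothetical mRNA on the plus strand and a hypothetical DMR
mrna_flanks = flanks(1000, 2000, "+")
dmr = (950, 1010)
print(mrna_flanks)
print(overlaps(mrna_flanks["upstream"], dmr))    # DMR reaches into the upstream flank
print(overlaps(mrna_flanks["downstream"], dmr))  # but not the downstream flank
```

A DML can be handled the same way by treating it as a 1 bp interval, for example `(1005, 1006)`.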

For the upstream flanks, I used this DMR overlap file and this DML overlap file.

There were no significantly enriched GOterms for the intersection of upstream flanks with DMRs or DMLs after correcting for multiple comparisons.

Downstream Flanks

For the downstream flanks, I used this DMR overlap file and this DML overlap file.

There were no significantly enriched GOterms for the intersection of downstream flanks with DMRs or DMLs after correcting for multiple comparisons.

Closest Non-Overlapping

When I conducted my flanking analysis, I also identified the closest non-overlapping DMR and DML to each mRNA coding region. Again, understanding what processes are enriched in these non-overlapping elements may provide information about regulatory mechanisms or related gene functions. I used this file for DMRs and this file for DMLs.
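Here is a small Python sketch of what "closest non-overlapping" means for one mRNA region on a single chromosome. It is an illustration under my own assumptions (0-based, half-open coordinates, ties resolved by keeping the first feature seen) and not the command used in the flanking analysis. The coordinates are hypothetical.

```python
def closest_non_overlapping(gene, features):
    """Return (feature, distance) for the nearest feature that does not overlap gene.

    gene and each feature are (start, end) tuples in 0-based, half-open
    coordinates on the same chromosome. Returns (None, None) if every
    feature overlaps the gene or the list is empty.
    """
    g_start, g_end = gene
    best, best_dist = None, None
    for f_start, f_end in features:
        if f_start < g_end and g_start < f_end:
            continue  # skip features that overlap the gene
        # Gap between the feature and the nearer edge of the gene
        dist = f_start - g_end if f_start >= g_end else g_start - f_end
        if best_dist is None or dist < best_dist:
            best, best_dist = (f_start, f_end), dist
    return best, best_dist


# Hypothetical mRNA region and DMRs
mrna = (1000, 2000)
dmrs = [(500, 600), (1500, 1600), (2050, 2100), (3000, 3100)]
nearest, distance = closest_non_overlapping(mrna, dmrs)
print(nearest, distance)
```

The DMR at `(1500, 1600)` sits inside the mRNA, so it is skipped even though it is the nearest feature in the usual sense.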

There were also no significantly enriched GOterms for the closest non-overlapping DMRs or DMLs.

Going forward

  1. Determine if this is the best gene enrichment approach
  2. Add Genbank IDs to exon, intron, and transposable element overlaps, then perform gene enrichment
  3. Describe functions of the most interesting genes with DMLs and DMRs


Written on February 23, 2019