DML Analysis Part 4
flankBed 101
It’s not enough to understand where DMLs or CG motifs intersect with exons, introns and mRNAs! Regions upstream or downstream of mRNA regions may contain promoters or other regulatory elements, such as transcription factor binding sites. These areas could also be regulated by methylation. To explore this, I identified flank in bedtools in this issue as a method to get the 1000 bp regions flanking the beginning and end of each mRNA region.
Before I could do this successfully, I had to create a “genome file”. For flankBed, the genome file just needed to be a tab-delimited file listing each chromosome name and its length. I used NCBI information to create the “genome file” in TextWrangler.
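For anyone who would rather script this step than edit the file by hand, here is a minimal sketch of writing the name-and-length layout from Python. The helper name `format_genome_file` is hypothetical, and the chromosome names and lengths below are made-up placeholders, not the real C. virginica values.

```python
def format_genome_file(chrom_lengths):
    """Return genome-file text: one 'name<TAB>length' line per chromosome."""
    lines = []
    for name, length in chrom_lengths.items():
        if length <= 0:
            raise ValueError(f"{name}: length must be positive, got {length}")
        lines.append(f"{name}\t{length}")
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only; substitute the lengths from NCBI.
example_lengths = {"chrA": 65000, "chrB": 61000}
genome_text = format_genome_file(example_lengths)
print(genome_text, end="")
```

Writing `genome_text` to a `.txt` file would give something that can be passed to the `-g` option.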
I used the following code in my Jupyter notebook to generate the flanks:
! /Users/Shared/bioinformatics/bedtools2/bin/flankBed \
-i C_virginica-3.0_Gnomon_mRNA.gff3 \
-g 2018-06-15-bedtools-Chromosome-Lengths.txt \
-b 1000 \
> 2018-06-15-mRNA-1000bp-Flanks.bed
I had an issue running the flankBed command over the weekend: it didn’t produce any output. Once I restarted Jupyter and reran it, it worked!
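To make the flank step concrete, here is a rough Python sketch of the idea behind `-b 1000`: each feature yields one interval on either side of it, clipped so it never runs past the start or end of the chromosome. This is an illustration under stated assumptions, not bedtools' actual implementation. It uses 0-based, half-open coordinates (GFF3 files are 1-based and inclusive), it drops empty flanks, which may differ from what bedtools reports, and the function name `flanks` and the example coordinates are hypothetical.

```python
def flanks(chrom, start, end, chrom_sizes, width=1000):
    """Return the flanks on both sides of a 0-based, half-open interval.

    Flanks are clipped to [0, chromosome length]; empty flanks are dropped
    (an assumption of this sketch).
    """
    size = chrom_sizes[chrom]
    left = (chrom, max(0, start - width), start)
    right = (chrom, end, min(size, end + width))
    return [f for f in (left, right) if f[2] > f[1]]


# Hypothetical chromosome and feature, for illustration only.
sizes = {"chrA": 6500}
print(flanks("chrA", 5000, 6000, sizes))
```

In this example the right flank is only 500 bp long because the placeholder chromosome ends at position 6500, which is why the genome file is needed in the first place.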
I took my flankBed output and ran it through intersectBed to find overlaps between the flanks, DMLs, and CG motifs.
! /Users/Shared/bioinformatics/bedtools2/bin/intersectBed \
-wb \
-a 2018-06-15-mRNA-1000bp-Flanks.bed \
-b ../2018-05-29-MethylKit-Full-Samples/2018-05-30-DML-Locations.bed \
> 2018-06-19-mRNA-100bp-Flanks-DML.txt
! /Users/Shared/bioinformatics/bedtools2/bin/intersectBed \
-wb \
-a 2018-06-15-mRNA-1000bp-Flanks.bed \
-b C_virginica-3.0_CG-motif.bed \
> 2018-06-19-mRNA-100bp-Flanks-CGmotif.txt
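As a rough picture of what these calls report: without `-wa`, intersectBed writes the overlapping portion of each `-a` feature, and `-wb` appends the original `-b` entry that produced the overlap. The Python sketch below mimics that pairing on plain tuples. It is a simplified illustration (0-based, half-open coordinates, a brute-force comparison of every pair, only three columns), and the function name `intersect_wb` and all coordinates are hypothetical.

```python
def intersect_wb(a_intervals, b_intervals):
    """Pair the overlapping portion of each A interval with the full B entry."""
    hits = []
    for a_chrom, a_start, a_end in a_intervals:
        for b_chrom, b_start, b_end in b_intervals:
            if a_chrom != b_chrom:
                continue
            lo = max(a_start, b_start)
            hi = min(a_end, b_end)
            if lo < hi:  # intervals share at least one base
                hits.append(((a_chrom, lo, hi), (b_chrom, b_start, b_end)))
    return hits


# Hypothetical flank and DML positions, for illustration only.
flank_regions = [("chrA", 4000, 5000)]
dml_positions = [("chrA", 4500, 4502), ("chrA", 5000, 5002), ("chrB", 4500, 4502)]
for overlap, dml in intersect_wb(flank_regions, dml_positions):
    print(overlap, dml)
```

Only the first placeholder DML is reported: the second starts exactly where the flank ends, and the third sits on a different chromosome.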
You can find the flank-DML output on GitHub, but the CG motif output was too big to upload there. I’ll upload that to OWL shortly.
That’s a wrap on the gonad methylation stuff until I’m back in town!