MethCompare Part 26

DMC locations and cleaning up scripts

At our last meeting, we decided we wanted genomic location information for DMC. I previously ran that analysis in this script, but since then I’ve updated the genome feature tracks and Mac has updated the DMC lists! Mac posted her revised DMC BEDfiles in this issue so I could characterize their locations. Although she was mainly interested in two of the DMC lists, I decided it was easy enough to run the analysis for all of them in case we want that information later.

The first thing I did was remove the old analysis BEDfiles and output from a different directory. Then, I updated the BEDfile paths to point to the correct directory and modified my wildcards and loops to ensure only the revised files were being used. Once I confirmed my old code still worked with the P. acuta DMC lists, I used the same code for M. capitata. I added all generated BEDfiles to the .gitignore and created lists of overlap counts for each genome feature.
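For anyone who wants the gist of the overlap counting without opening the script, here is a minimal Python sketch of the idea. It is not the code I actually ran. The function names, the DMC list name, the feature track names, and the scaffold coordinates are all hypothetical, and it assumes standard BED conventions (0-based, half-open intervals).

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED lines into (chrom, start, end) tuples, skipping blanks and header lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        records.append((chrom, int(start), int(end)))
    return records


def count_overlaps(dmcs, features):
    """Count DMCs that overlap at least one feature interval on the same chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in features:
        by_chrom[chrom].append((start, end))
    count = 0
    for chrom, start, end in dmcs:
        # Half-open intervals overlap when each one starts before the other ends.
        if any(f_start < end and start < f_end for f_start, f_end in by_chrom.get(chrom, [])):
            count += 1
    return count


# Hypothetical inputs: one DMC list and two genome feature tracks.
dmc_lists = {
    "hypothetical_DMC_list": [
        "scaffold_1\t100\t102",
        "scaffold_1\t500\t502",
        "scaffold_2\t40\t42",
    ],
}
feature_tracks = {
    "genes": ["scaffold_1\t50\t200", "scaffold_2\t0\t30"],
    "intergenic": ["scaffold_1\t400\t600", "scaffold_2\t30\t100"],
}

# Loop over every DMC list and every feature track, as the wildcards and loops do.
counts = {
    dmc_name: {
        feature_name: count_overlaps(parse_bed(dmc_lines), parse_bed(feature_lines))
        for feature_name, feature_lines in feature_tracks.items()
    }
    for dmc_name, dmc_lines in dmc_lists.items()
}
print(counts)
```

Looping over every list and every track is what makes it cheap to characterize all of the DMC lists rather than just the two of interest.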

I took those count files and generated tables in this R Markdown script:

M. capitata:

P. acuta:
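As a rough illustration of how per-feature overlap counts can be turned into a table, here is a small Python sketch. The real tables come from the R Markdown script. The list name, feature names, and counts below are made up, and the percent column is my assumption about a useful summary, not necessarily what the tables report.

```python
def overlap_table(counts, totals):
    """Build rows of (DMC list, feature, overlap count, percent of DMCs in that list)."""
    rows = []
    for dmc_list, per_feature in sorted(counts.items()):
        total = totals[dmc_list]
        for feature, n in sorted(per_feature.items()):
            percent = 100.0 * n / total if total else 0.0
            rows.append((dmc_list, feature, n, round(percent, 1)))
    return rows


def format_table(rows):
    """Render rows as tab-delimited text with a header line."""
    header = "DMC list\tFeature\tOverlaps\tPercent"
    body = ["\t".join(str(value) for value in row) for row in rows]
    return "\n".join([header] + body)


# Hypothetical counts for a single DMC list containing three DMCs.
example_counts = {"hypothetical_DMC_list": {"genes": 1, "intergenic": 2}}
example_totals = {"hypothetical_DMC_list": 3}

table_rows = overlap_table(example_counts, example_totals)
print(format_table(table_rows))
```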

I also included the links to these tables in the issue for Mac to look at. While in the script, I also removed the stacked barplots I created for the upset plot data, along with the associated code. At this point, it’s all supplemental material, and having that information in a table format is probably the only way it will be useful. To round out my script cleaning, I followed the instructions in this issue to update the scripts README.md. I moved my scripts to the bottom since it’s more important to run QC information before looking at CpG methylation status and genomic location.

Going forward

  1. Help write manuscript!
  2. Look into program for mCpG identification
Written on July 29, 2020