WGBS Analysis Part 3

Rerunning bismark and starting with methylKit

My bismark headache

After Mox maintenance ended, I ran this script and hoped for the best. Somehow, only my ambient file was processed! I tried running just the alignment for my low pH sample with this script and got this error:

Use of uninitialized value $path_to_bowtie in concatenation (.) or string at /gscratch/srlab/programs/Bismark-0.21.0/bismark line 6893.
Failed to execute Bowtie 2 porperly (return code of 'bowtie2 --version' was 32512). Please install Bowtie 2 or HISAT2 first and make sure it is in the PATH, or specify the path to the Bowtie 2 with --path_to_bowtie2 /path/to/bowtie2, or --path_to_hisat2 /path/to/hisat2

Changing path_to_bowtie to path_to_bowtie2 did not get the script to run, so I reopened this issue. Steven had me add the following line of code to the top of my script:

source /gscratch/srlab/programs/scripts/paths.sh
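A note on the error above: the return code 32512 is 127 × 256, and an exit status of 127 is what a shell reports when it cannot find a command, so the most likely culprit is that bowtie2 was not on the PATH. Below is a minimal sketch of how sourcing a file can fix that. I have not looked inside the real paths.sh, so the file contents here are an assumption, and demo_aligner is a made-up stand-in program.

```shell
#!/usr/bin/env bash
# Sketch with made-up names: "demo_aligner" and the temporary paths.sh below
# are stand-ins, not the real files on Mox.
set -euo pipefail

workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/programs"

# A stand-in executable living in a directory that is not on PATH
printf '#!/bin/sh\necho "demo_aligner 0.0"\n' > "$workdir/programs/demo_aligner"
chmod +x "$workdir/programs/demo_aligner"

# Before sourcing, the shell cannot find the program; capture its exit status
status=0
demo_aligner >/dev/null 2>&1 || status=$?
echo "exit status before sourcing: $status"

# A hypothetical paths file that prepends the program directory to PATH
printf 'export PATH="%s/programs:$PATH"\n' "$workdir" > "$workdir/paths.sh"

# Sourcing runs the file in the current shell, so the PATH change persists
source "$workdir/paths.sh"
demo_aligner
```

Running the paths file as a separate script (bash paths.sh) would not help, because the PATH change would disappear when that child shell exits. That is why the line uses source.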

Apparently it sets up the bash paths on Mox so that the programs my script calls are recognized. I was able to align, deduplicate, sort, and index the low pH file! But I couldn’t get bismark2summary to work:

It looks like there is a mix of samples, e.g. RRBS as well as WGBS, in this folder. Please consider running bismark2summary only on samples of the same data type,
or specify the input files manually (--help for more information). Extiting...

For some reason, bismark thinks that my files were made with different sequencing methods. I created this script and manually set the files I wanted the command to use. Even then, I still got the error! I posted this issue because I was concerned about the file formats. Steven suggested I reformat my scripts and process the files together.

I created a 2019-09-11-Gigas-Bismark directory and started running this script to hopefully process both files together. In the meantime, I transferred all files I’ve generated to start testing out methylKit. I used this code to transfer the files:

rsync --archive --progress --verbose /gscratch/scrubbed/yaamini/analyses/Gigas-WGBS/2019-09-03-Gigas-Bismark/* yaamini@172.25.149.226:/Volumes/web/spartina/2019-09-03-project-gigas-oa-meth/analyses/2019-09-03-Bismark

Trying methylKit

Even though my final bam files may not be identical to the ones I transferred from Mox, I wanted to use them to test out the methylKit pipeline I generated for C. virginica. I created a new R project and this R Markdown script, changed all the file names, and started processing files. I decided to keep the same settings I used for C. virginica:

  • 5x coverage: I ran processBismarkAln with mincov = 2 to quickly process my files, then used filterByCoverage to set minimum and maximum coverage thresholds (5 and 100)
  • destrand = TRUE to merge reads from the forward and reverse strands of each CpG
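To make those two settings concrete, here is a toy shell sketch of what a 5 to 100 coverage filter and strand merging do to a table of cytosine calls. The column layout and values are made up for illustration, and the awk logic is my own approximation of the idea, not how methylKit implements filterByCoverage or destrand internally.

```shell
#!/usr/bin/env bash
# Toy illustration only: the columns and values below are made up,
# and this is not how methylKit implements these steps internally.
set -euo pipefail

# Hypothetical per-cytosine table: chr, position, strand, coverage, methylated reads
toy_calls() {
  printf '%s\n' \
    'chr1 100 + 3 1' \
    'chr1 101 - 4 2' \
    'chr1 200 + 12 9' \
    'chr1 300 + 150 75'
}

# Keep rows whose coverage (column 4) lies within [lo, hi]
filter_by_coverage() {
  awk -v lo="$1" -v hi="$2" '$4 >= lo && $4 <= hi'
}

# Shift a reverse-strand C onto the forward-strand position of the same CpG
# (position - 1), then sum coverage and methylated counts per position
destrand() {
  awk '{
    pos = ($3 == "-") ? $2 - 1 : $2
    key = $1 " " pos
    cov[key] += $4
    meth[key] += $5
  }
  END { for (k in cov) print k, cov[k], meth[k] }' | sort -k1,1 -k2,2n
}

echo "Rows with 5-100x coverage:"
toy_calls | filter_by_coverage 5 100

echo "Strand-merged counts (chr, position, coverage, methylated reads):"
toy_calls | destrand
```

In the toy data, the two calls at positions 100 and 101 belong to the same CpG. Neither passes a 5x threshold alone, but merged they have 7 reads, which shows how combining strands can increase per-site coverage.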

I started processBismarkAln and it’s chugging along. Here’s hoping I can solidify the pipeline and get data so I can make a presentation!

Update 2019-09-12 10:45 p.m.: I have over 15,000 differentially methylated loci (DML)! Relevant files below:

  • All DML: BEDfile; csv
  • Hypermethylated DML: BEDfile; csv
  • Hypomethylated DML: BEDfile; csv

I will check these in IGV before proceeding with analyses.

Going forward

  1. Get revised bismark output from Mox and run through analysis pipeline
  2. Verify DML in IGV
  3. Obtain C. gigas feature tracks
  4. Characterize locations of DML
  5. Conduct a gene enrichment analysis and/or identify genes with DML and work out what they mean
Written on September 12, 2019