WGBS Analysis Part 3
bismark and starting with
After Mox maintenance ended, I ran this script and hoped for the best. Somehow, only my ambient file was processed! I tried running just the alignment for my low pH sample with this script and got this error:
Use of uninitialized value $path_to_bowtie in concatenation (.) or string at /gscratch/srlab/programs/Bismark-0.21.0/bismark line 6893. Failed to execute Bowtie 2 porperly (return code of 'bowtie2 --version' was 32512). Please install Bowtie 2 or HISAT2 first and make sure it is in the PATH, or specify the path to the Bowtie 2 with --path_to_bowtie2 /path/to/bowtie2, or --path_to_hisat2 /path/to/hisat2
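The return code in that message is worth decoding. Assuming Bismark reports the raw wait status the way Perl's system call does (an assumption on my part, I have not checked the Bismark source), the real exit code sits in the high byte: 32512 / 256 = 127, which is the shell's "command not found" code. That would mean bowtie2 simply was not visible on the PATH of the job. A minimal sketch of that check, with function names that are mine and purely illustrative:

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a raw wait status (as Perl's system() reports it)
# into the ordinary exit code by dropping the low byte.
exit_code_from_status() {
  echo $(( $1 >> 8 ))
}

# Hypothetical helper: report whether a program is visible on the current PATH.
# Returns 127, the shell's "command not found" code, when it is not.
check_on_path() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found at $(command -v "$1")"
  else
    echo "$1 not found on PATH"
    return 127
  fi
}

exit_code_from_status 32512    # 32512 >> 8 is 127
check_on_path bowtie2 || true  # result depends on the machine this runs on
```

Running check_on_path inside the SLURM script itself, rather than in a login shell, is the useful version of this test, since the two environments can have different PATH settings.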
Stuck after changing path_to_bowtie2 and not getting the script to run, I reopened this issue. Steven had me add the following line of code to the top of my script:
Apparently it does something with the bash paths set on Mox such that the paths in my script are recognized. I was able to align, deduplicate, sort, and index the low pH file! But I couldn’t get bismark2summary to work:
It looks like there is a mix of samples, e.g. RRBS as well as WGBS, in this folder. Please consider running bismark2summary only on samples of the same data type, or specify the input files manually (--help for more information). Extiting...
For some reason, bismark thinks that my files were made with different sequencing methods. I created this script and manually set the files I wanted the command to use. Even then, I still got the error! I posted this issue because I was concerned about the file formats. Steven suggested I reformat my scripts and process the files together.
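For reference, setting the files manually means passing the BAM files to bismark2summary as positional arguments, which is what the error message itself suggests. The sketch below only prints the command instead of running it, so the file list can be checked first. The function name is hypothetical, it assumes every .bam in the directory belongs in the summary, and it is not the script linked above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: build (and only print) a bismark2summary command that
# names every BAM file in a directory explicitly.
# Assumption: all *.bam files in the directory should be summarized together.
summary_command() {
  local dir="$1"
  local bams=()
  local f
  for f in "$dir"/*.bam; do
    # Skip the unexpanded pattern when the directory holds no BAM files
    if [ -e "$f" ]; then
      bams+=("$f")
    fi
  done
  if [ "${#bams[@]}" -eq 0 ]; then
    echo "no BAM files found in $dir" >&2
    return 1
  fi
  echo "bismark2summary ${bams[*]}"
}

# Usage (the path is a placeholder, not a real location):
# summary_command /path/to/bismark/output
```

Printing first makes it easy to spot a stray file from a different processing step before bismark2summary sees it.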
I created a 2019-09-11-Gigas-Bismark directory and started running this script to hopefully process both files together. In the meantime, I transferred all files I’ve generated to start testing out methylKit. I used this code to transfer the files:
rsync --archive --progress --verbose /gscratch/scrubbed/yaamini/analyses/Gigas-WGBS/2019-09-03-Gigas-Bismark/* email@example.com:/Volumes/web/spartina/2019-09-03-project-gigas-oa-meth/analyses/2019-09-03-Bismark
Even though my final bam files may not be identical to the ones I transferred from Mox, I wanted to use them to test out the methylKit pipeline I generated for C. virginica. I created a new R project and this R Markdown script, changed all the file names, and started processing files. I decided to keep the same settings I used for C. virginica:
- 5x coverage; generated using mincov = 2 to quickly process my files, then using filterByCoverage to set minimum and maximum coverage thresholds (5 and 100)
- destrand = TRUE to merge reads from the forward and reverse strands of each CpG
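To make the coverage thresholds concrete, here is a small stand-in written with awk rather than methylKit. It is not the R pipeline. It assumes a tab-separated table laid out like a Bismark coverage file (chromosome, start, end, percent methylation, methylated count, unmethylated count) and keeps sites whose total coverage falls between 5 and 100 inclusive. The function name and the three example rows are made up for illustration:

```shell
#!/usr/bin/env bash
# Illustrative stand-in for a minimum/maximum coverage filter (NOT methylKit).
# Input columns (tab-separated): chr, start, end, pct_meth, count_meth, count_unmeth
# Keeps rows where count_meth + count_unmeth is within [min, max].
filter_by_coverage() {
  local min="$1" max="$2"
  awk -F'\t' -v min="$min" -v max="$max" \
    '{ cov = $5 + $6; if (cov >= min && cov <= max) print }'
}

# Made-up sites with total coverage 3, 20, and 150; only the middle one
# passes a 5 to 100 filter.
printf 'chr1\t10\t10\t66.7\t2\t1\nchr1\t25\t25\t50\t10\t10\nchr1\t40\t40\t80\t120\t30\n' \
  | filter_by_coverage 5 100
```

The lower bound drops sites with too few reads to estimate methylation, and the upper bound drops sites with suspiciously deep coverage.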
I’m running processBismarkAln and it’s chugging along. Here’s hoping I can solidify the pipeline and get data so I can make a presentation!
Update 2019-09-12 10:45 p.m.: I have over 15,000 DML! Relevant files below:
I will check these in IGV before proceeding with analyses.
- Get revised bismark output from Mox and run through analysis pipeline
- Verify DML in IGV
- Obtain C. gigas feature tracks
- Characterize locations of DML
- Conduct a gene enrichment and/or identify genes with DML and discern what it means