DML Analysis Part 10

Testing bismark parameters

After meeting with Steven this week, we decided the best course of action was to revise the analysis pipeline and document why we chose each setting. The first step in this revision is to go back to bismark. When I ran my full samples through bismark, I used the default setting for --score_min. This led to a mapping efficiency of only ~20%. That’s not great.

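The --score_min option sets the minimum alignment score a read needs to count as a valid alignment, so it dictates how strict the alignment is. The default option is L,0,-0.2, which is pretty stringent: the threshold is a linear function of read length, and a more negative slope tolerates more mismatches and gaps. In this Jupyter notebook, I tested three less stringent --score_min options: L,0,-0.6; L,0,-0.9; and L,0,-1.2. Samples were run in the following order: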

[Screenshot: order in which the samples were run, 2018-10-04]

Therefore, the first mapping efficiency listed belongs to sample 10, the second to sample 1, and so on. I kept this order in mind when matching each efficiency to its sample and generated the following table:

Table 1. Mapping efficiency (%) for each sample under different --score_min settings.

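| Sample | L,0,-0.6 | L,0,-0.9 | L,0,-1.2 |
|--------|----------|----------|----------|
| 1      | 15.5     | 20.2     | 28.8     |
| 2      | 32.4     | 40.2     | 49.8     |
| 3      | 37.2     | 45.3     | 53.6     |
| 4      | 36.0     | 44.7     | 52.9     |
| 5      | 34.6     | 42.9     | 51.7     |
| 6      | 36.7     | 45.0     | 53.8     |
| 7      | 34.6     | 42.9     | 51.4     |
| 8      | 31.7     | 39.0     | 47.6     |
| 9      | 33.0     | 41.2     | 49.9     |
| 10     | 36.6     | 44.9     | 53.0     |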
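To compare the three settings at a glance, the values in Table 1 can be summarized per setting. This is just a quick sketch of that arithmetic, not part of the original notebook; the `summarize` helper and the variable names are my own.

```python
from statistics import mean

# Mapping efficiencies (%) copied from Table 1, keyed by sample number.
table = {
    1: (15.5, 20.2, 28.8),
    2: (32.4, 40.2, 49.8),
    3: (37.2, 45.3, 53.6),
    4: (36.0, 44.7, 52.9),
    5: (34.6, 42.9, 51.7),
    6: (36.7, 45.0, 53.8),
    7: (34.6, 42.9, 51.4),
    8: (31.7, 39.0, 47.6),
    9: (33.0, 41.2, 49.9),
    10: (36.6, 44.9, 53.0),
}
settings = ("L,0,-0.6", "L,0,-0.9", "L,0,-1.2")


def summarize(table, settings):
    """Return {setting: (mean, min, max)} of mapping efficiency across samples."""
    summary = {}
    for column, setting in enumerate(settings):
        values = [row[column] for row in table.values()]
        summary[setting] = (mean(values), min(values), max(values))
    return summary


summary = summarize(table, settings)
for setting, (avg, low, high) in summary.items():
    print(f"{setting}: mean {avg:.1f}%, range {low} to {high}%")
```

The means come out to roughly 33%, 41%, and 49%, so each step toward a more negative slope adds about 8 percentage points on average. Sample 1 has the lowest mapping efficiency under every setting.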

To optimize mapping efficiency, we decided to run the full samples with --score_min L,0,-1.2, which gave the highest mapping efficiency for every sample in Table 1. I started that analysis in a new Jupyter notebook.
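To make the settings concrete, here is a minimal sketch of how a linear score function of the form L,intercept,slope turns into a minimum alignment score. It assumes the convention described in the Bowtie 2 manual, where the threshold is intercept + slope × read length. The 100 bp read length and the `min_alignment_score` helper are assumptions for illustration only, not values from my data.

```python
def min_alignment_score(read_length, intercept=0.0, slope=-0.2):
    """Minimum score for a valid alignment under a linear L,<intercept>,<slope> function."""
    return intercept + slope * read_length


READ_LENGTH = 100  # assumed read length, for illustration only

# Default setting first, then the three settings tested in this post.
for slope in (-0.2, -0.6, -0.9, -1.2):
    threshold = min_alignment_score(READ_LENGTH, slope=slope)
    print(f"L,0,{slope}: minimum score {threshold:.0f}")
```

Because penalties make alignment scores negative, a lower (more negative) threshold lets reads with more mismatches or gaps still count as aligned. Under these assumptions, L,0,-1.2 allows six times the total penalty that the default L,0,-0.2 does.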

Written on October 4, 2018