DML Analysis Part 10
Testing `bismark` parameters
After meeting with Steven this week, we decided the best course of action was to revise the analysis pipeline and document why we chose the settings we did. The first part of this revision is to go back to bismark.
When I ran my full samples through `bismark`, I used the default setting for `-score_min`. This led to a mapping efficiency of ~20%. That’s not great.
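The `-score_min` option dictates how strict the alignment is: it sets the minimum alignment score a read must reach to count as a valid alignment, as a function of read length. The default option is L,0,-0.2, which is pretty stringent. In this Jupyter notebook, I tested three more permissive `-score_min` options: L,0,-0.6; L,0,-0.9; L,0,-1.2. Samples were run in the following order: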
Therefore, the first mapping efficiency listed belongs to sample 10, second to sample 1, etc. I kept this in mind and generated the following table:
Table 1. Mapping efficiency (%) based on different `-score_min` settings.
Sample | L,0,-0.6 | L,0,-0.9 | L,0,-1.2 |
---|---|---|---|
1 | 15.5 | 20.2 | 28.8 |
2 | 32.4 | 40.2 | 49.8 |
3 | 37.2 | 45.3 | 53.6 |
4 | 36.0 | 44.7 | 52.9 |
5 | 34.6 | 42.9 | 51.7 |
6 | 36.7 | 45.0 | 53.8 |
7 | 34.6 | 42.9 | 51.4 |
8 | 31.7 | 39.0 | 47.6 |
9 | 33.0 | 41.2 | 49.9 |
10 | 36.6 | 44.9 | 53.0 |
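To compare the three settings at a glance, the values in Table 1 can be summarized per setting. The snippet below is only a minimal sketch: the numbers are copied by hand from Table 1, and the names (`efficiency`, `summary`) are my own for illustration, not part of the original notebook.

```python
from statistics import mean

# Mapping efficiency (%) for samples 1-10, copied from Table 1.
# Keys are the -score_min settings tested.
efficiency = {
    "L,0,-0.6": [15.5, 32.4, 37.2, 36.0, 34.6, 36.7, 34.6, 31.7, 33.0, 36.6],
    "L,0,-0.9": [20.2, 40.2, 45.3, 44.7, 42.9, 45.0, 42.9, 39.0, 41.2, 44.9],
    "L,0,-1.2": [28.8, 49.8, 53.6, 52.9, 51.7, 53.8, 51.4, 47.6, 49.9, 53.0],
}

# (minimum, mean, maximum) mapping efficiency for each setting.
summary = {
    setting: (min(values), mean(values), max(values))
    for setting, values in efficiency.items()
}

for setting, (lowest, average, highest) in summary.items():
    print(f"{setting}: min {lowest:.1f}%, mean {average:.2f}%, max {highest:.1f}%")

# Setting with the highest mean mapping efficiency across samples.
best_setting = max(summary, key=lambda setting: summary[setting][1])
print("Highest mean mapping efficiency:", best_setting)
```

Sample 1 is the lowest value in every column, so the minimum is worth reporting alongside the mean.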
To optimize mapping efficiency, we decided to run full samples with `-score_min L,0,-1.2`, the most permissive setting tested, which gave the highest mapping efficiency for every sample. I started the analysis in a new Jupyter notebook.
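For a sense of what these settings mean in practice: a value like L,0,-1.2 describes a linear function of read length, f(x) = 0 + (-1.2 · x), and an alignment is kept only if its score is at least f(x). The sketch below computes that threshold for each setting discussed in this post. The helper `min_alignment_score` and the 100 bp read length are assumptions made purely for illustration; they are not values from my samples.

```python
def min_alignment_score(score_min: str, read_length: int) -> float:
    """Return the minimum valid alignment score for a linear setting.

    `score_min` is a string such as "L,0,-0.2" (type, intercept, slope).
    Only the linear ("L") form used in this post is handled here.
    """
    kind, intercept, slope = score_min.split(",")
    if kind != "L":
        raise ValueError(f"only linear (L) functions are handled, got {kind!r}")
    return float(intercept) + float(slope) * read_length


# Hypothetical read length, chosen only to make the comparison concrete.
READ_LENGTH = 100

settings = ["L,0,-0.2", "L,0,-0.6", "L,0,-0.9", "L,0,-1.2"]
thresholds = {s: min_alignment_score(s, READ_LENGTH) for s in settings}

for setting, threshold in thresholds.items():
    print(f"{setting}: alignments scoring below {threshold:.1f} are rejected")
```

The more negative the slope, the lower the threshold, so reads with more mismatches or gaps are still accepted. That is why mapping efficiency rises from left to right in Table 1, and also why a permissive setting should be a documented choice.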