DML Analysis Part 2
Understanding gene enrichment
TL;DR: I got confused a lot, but I think I’m on the right track. This will take a while.
I set up some code and reformatted my DML-mRNA overlap file in the script.
topGO requires GO terms as inputs as well, so I tried using org.Hs.eg.db to match gene IDs to GO terms. However, I ran into my main problem: I don’t have Entrez Gene IDs (official NCBI IDs) for my C. virginica genes. Without these, I cannot use any sort of gene enrichment R package, let alone convert gene IDs to GO terms! Steven suggested I blastx the C. virginica genome against the UniProt database. This would give me UniProt accession codes and GO terms. I can then find a way to convert UniProt accession codes to Entrez Gene IDs to use with topGO. He also suggested I use DAVID for gene enrichment and compare the results.
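Once blastx finishes, the UniProt accession codes still need to be pulled out of the results. Here is a rough sketch of that step, written in Python rather than R just to keep it self-contained. It assumes the output is in BLAST's standard 12-column tabular format and that subject IDs look like `sp|ACCESSION|ENTRY_NAME`. Neither assumption is confirmed by my actual run yet. The function name, the e-value cutoff, and the example rows are all made up for illustration.

```python
import csv
import io


def best_uniprot_hits(blast_tsv, max_evalue=1e-10):
    """Return {query ID: UniProt accession} for the best hit per query.

    Assumes the standard 12-column BLAST tabular layout:
    qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    The default max_evalue is an arbitrary placeholder, not a recommendation.
    """
    best = {}
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue  # hit is too weak to trust
        # Subject IDs are assumed to look like "sp|P12345|NAME_SPECIES";
        # fall back to the raw ID if they do not.
        parts = subject.split("|")
        accession = parts[1] if len(parts) >= 3 else subject
        # Keep only the highest-scoring hit for each query
        if query not in best or bitscore > best[query][1]:
            best[query] = (accession, bitscore)
    return {query: hit[0] for query, hit in best.items()}


# Made-up rows for illustration only (not real C. virginica results)
example = (
    "geneA\tsp|P11111|FAKE1_HUMAN\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-50\t300\n"
    "geneA\tsp|P22222|FAKE2_MOUSE\t60.0\t200\t80\t0\t1\t600\t1\t200\t1e-20\t150\n"
    "geneB\tsp|P33333|FAKE3_HUMAN\t30.0\t50\t35\t0\t1\t150\t1\t50\t0.5\t30\n"
)
hits = best_uniprot_hits(io.StringIO(example))
print(hits)  # {'geneA': 'P11111'}
```

Keeping one hit per gene means each gene ends up with a single accession code, which should make the later conversion to GO terms or Entrez Gene IDs a simple lookup. Genes with only weak hits (like geneB here) drop out entirely.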
I rearranged my R Markdown file and started blastx. We’ll see how this goes…
Want to know more about my thought process for this analysis? I started using Wordpress to document intermediate thoughts! Click on the “Feed” link in the top right corner of this webpage, or navigate to these links: