West Coast Green Crab Experiment Part 91
Modifying the transcriptome assembly
I ran my action plan by Carolyn. She didn’t have any comments on which samples to input, and said my plan of running the assembly with the correct library type was a good first step! I started a script, requesting 15 days of run time, to run the transcriptome assembly and collapse the isoforms into supertranscripts.
# DE NOVO TRANSCRIPTOME ASSEMBLY
echo "Start de novo transcriptome assembly"
# Run Trinity to assemble de novo transcriptome. Using primarily default parameters.
${TRINITY}/Trinity \
--seqType fq \
--max_memory 100G \
--samples_file ${trinity_file_list} \
--SS_lib_type FR \
--min_contig_length 200 \
--full_cleanup \
--CPU 28
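# Move transcriptome to the correct location. With --full_cleanup, Trinity keeps only trinity_out_dir.Trinity.fasta, so make sure the destination directory exists first.
mkdir -p ${OUTPUT_DIR}/trinity_out_dir
mv trinity_out_dir.Trinity.fasta ${OUTPUT_DIR}/trinity_out_dir/Trinity.fasta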
# Collapse isoforms into supertranscripts.
# Output files are trinity_genes.fasta (supertranscripts in fasta format), trinity_genes.gtf (transcript structure annotation in gtf format), and trinity_genes.malign (multiple alignment view that contrasts the different candidate splicing isoforms)
${TRINITY}/Analysis/SuperTranscripts/Trinity_gene_splice_modeler.py \
--trinity_fasta ${OUTPUT_DIR}/trinity_out_dir/Trinity.fasta \
--incl_malign
# Copy output files to a new folder (rsync leaves the originals in place)
mkdir -p supertranscript_output
rsync --archive --progress --verbose trinity_genes.* supertranscript_output/.
echo "Completed de novo transcriptome assembly"
The job is pending while it waits for resources, but hopefully it starts sooner rather than later!
Going forward
- Tweak transcriptome assembly parameters to reduce the number of assembly artifacts and total supertranscripts
- Annotate transcriptome with EnTAP
- Remove contaminant sequences identified by EnTAP
- Create count matrix for clean transcriptome
- Calculate Ex50 and N50 statistics for clean transcriptome
- Repeat analysis with clean transcriptome and fuller annotations in edgeR
- Identify temperature- and genotype-specific differentially expressed genes at the end of the experiment
- Identify genes influenced by both temperature and time
- Determine methods for functional analysis
- Additional strand-specific analysis in the supergene region
- Examine HOBO data from 2023 experiment
- Demographic data analysis for 2023 paper
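
For the N50 item in the list above, here is a minimal sketch of how the statistic could be computed straight from a FASTA file. The function name `fasta_n50` is hypothetical and not part of Trinity, and this only covers N50 (the expression-aware statistic needs a count matrix). Trinity's own `util/TrinityStats.pl` reports N50 as well, so this is mainly a quick sanity check on the cleaned transcriptome.

```shell
# Hypothetical helper (not part of Trinity): print the N50 of a FASTA file.
# N50 is the length L such that contigs of length >= L contain at least
# half of all assembled bases.
fasta_n50() {
  # First awk: emit one sequence length per record, joining wrapped lines.
  awk '
    /^>/ { if (len > 0) print len; len = 0; next }
    { gsub(/[[:space:]]/, ""); len += length($0) }
    END { if (len > 0) print len }
  ' "$1" |
  # Sort lengths from longest to shortest.
  sort -rn |
  # Second awk: walk down the sorted lengths until half the bases are covered.
  awk '
    { lens[NR] = $1; total += $1 }
    END {
      running = 0
      for (i = 1; i <= NR; i++) {
        running += lens[i]
        if (running >= total / 2) { print lens[i]; exit }
      }
    }
  '
}
```

Usage would look something like `fasta_n50 ${OUTPUT_DIR}/trinity_out_dir/Trinity.fasta`, assuming the assembly has been moved there as in the script above.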
Written on June 30, 2026