West Coast Green Crab Experiment Part 81
Troubleshooting script issues
2025-12-05
My script ran for 6 hours before it eventually got cancelled. Weird! I finally got around to checking the error log:
-------------------------------------------
----------- Jellyfish --------------------
-- (building a k-mer catalog from reads) --
-------------------------------------------
CMD: jellyfish count -t 28 -m 25 -s 100000000 both.fa
CMD finished (8320 seconds)
CMD: jellyfish histo -t 28 -o jellyfish.K25.min2.kmers.fa.histo mer_counts.jf
CMD finished (317 seconds)
CMD: jellyfish dump -L 2 mer_counts.jf > jellyfish.K25.min2.kmers.fa
CMD finished (671 seconds)
CMD: touch jellyfish.K25.min2.kmers.fa.success
CMD finished (0 seconds)
-generating stats files
CMD: /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin/util/..//Inchworm/bin/fastaToKmerCoverageStats --reads left.fa --kmers jellyfish.K25.min2.kmers.fa --kmer_size 25 --num_threads 14 > left.fa.K25.stats
CMD: /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin/util/..//Inchworm/bin/fastaToKmerCoverageStats --reads right.fa --kmers jellyfish.K25.min2.kmers.fa --kmer_size 25 --num_threads 14 > right.fa.K25.stats
-reading Kmer occurrences...-reading Kmer occurrences...
slurmstepd: error: Job 1785248 exceeded memory limit (105782124 > 104857600), being killed
slurmstepd: error: Exceeded job memory limit
slurmstepd: error: *** JOB 1785248 ON pn052 CANCELLED AT 2025-12-01T22:13:32 ***
I upped the memory request to 180 GB in the SLURM header:
#SBATCH --mem=180gb
Based on the memory request, my job is pending until Dec. 6, so I guess I'll check in then.
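For future reference, a quick way to sanity-check how much memory the killed job actually peaked at before settling on a new request (a sketch using standard sacct accounting fields; 1785248 is the job ID from the log above):
# MaxRSS = peak resident memory per job step, ReqMem = requested memory
sacct -j 1785248 --format=JobID,JobName,State,Elapsed,ReqMem,MaxRSS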
2025-12-08
The job ran for 24 hours but was killed due to the time limit! I'm not sure why this happened when I had "unlimited" time specified (--qos=unlim) in my SLURM header.
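One way to confirm what limit Slurm actually applied (a sketch; Timelimit, Partition, and QOS are standard sacct fields, and 1792444 is the job ID from the log below):
sacct -j 1792444 --format=JobID,Partition,QOS,Timelimit,Elapsed,State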
(base) [yaamini.venkataraman@poseidon-l1 ~]$ tail /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/yrv_trinity1792444.log
CMD: touch left.fa.K25.stats.sort.ok
CMD finished (0 seconds)
CMD: touch right.fa.K25.stats.sort.ok
CMD finished (0 seconds)
-defining normalized reads
CMD: /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin/util/..//util/support_scripts//nbkc_merge_left_right_stats.pl --left left.fa.K25.stats.sort --right right.fa.K25.stats.sort --sorted > pairs.K25.stats
-opening left.fa.K25.stats.sort
-opening right.fa.K25.stats.sort
-done opening files.
slurmstepd: error: *** JOB 1792444 ON pn076 CANCELLED AT 2025-12-06T15:33:00 DUE TO TIME LIMIT ***
The response from WHOI IT:
I just edited your submit script - there were several hidden characters throughout that may have been getting in the way of Slurm parsing the header. Sometimes when scripts are copied/pasted to/from Windows hidden characters come over… you can clean these up using vi and the :set list command. Please try to resubmit when you can, and we can keep an eye on how it progresses.
I resubmitted the script; let's see what happens.
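For future copy/paste mishaps, a couple of standard ways to spot and strip hidden Windows carriage returns myself (a sketch; the script filename is a placeholder):
# Show non-printing characters; a trailing ^M means a Windows carriage return
cat -A yrv_trinity.sh | head
# Strip any carriage returns in place (script name is a placeholder)
sed -i 's/\r$//' yrv_trinity.sh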
2025-12-09
Realized there was a follow-up message from WHOI IT!
Hi Yaamini, looking over your submit script again, I realized you do not have a time statement - so even though you are using the unlim QOS, the default time limit of 24 hours is going to be applied. If you add the line to your slurm header like so, it should be able to exceed 24 hours: #SBATCH --time=2-24:25:00 The time format is d-hh:mm:ss. Apologies that I didn't notice this earlier!
Out of curiosity, I first checked the log to see the progress (the script had been running for 23 hours):
(base) [yaamini.venkataraman@poseidon-l1 ~]$ tail /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/yrv_trinity1795649.log
-capturing normalized reads from: /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06b-trimgalore/trim-illumina-polyA/25-055_R2_001_val_2_val_2.fq.gz
-capturing normalized reads from: /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06b-trimgalore/trim-illumina-polyA/25-055_R1_001_val_1_val_1.fq.gz
-capturing normalized reads from: /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06b-trimgalore/trim-illumina-polyA/25-056_R2_001_val_2_val_2.fq.gz
-capturing normalized reads from: /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06b-trimgalore/trim-illumina-polyA/25-056_R1_001_val_1_val_1.fq.gz
-capturing normalized reads from: /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06b-trimgalore/trim-illumina-polyA/25-058_R2_001_val_2_val_2.fq.gz
-capturing normalized reads from: /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06b-trimgalore/trim-illumina-polyA/25-058_R1_001_val_1_val_1.fq.gz
-capturing normalized reads from: /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06b-trimgalore/trim-illumina-polyA/25-059_R2_001_val_2_val_2.fq.gz
-capturing normalized reads from: /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06b-trimgalore/trim-illumina-polyA/25-059_R1_001_val_1_val_1.fq.gz
-capturing normalized reads from: /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06b-trimgalore/trim-illumina-polyA/25-060_R2_001_val_2_val_2.fq.gz
-capturing normalized reads from: /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06b-trimgalore/trim-illumina-polyA/25-060_R1_001_val_1_val_1.fq.gz
I then killed the script and modified the SLURM header:
#!/bin/bash
#SBATCH --partition=compute # Queue selection
#SBATCH --job-name=yrv_trinity # Job name
#SBATCH --mail-type=ALL # Mail events (BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=yaamini.venkataraman@whoi.edu # Where to send mail
#SBATCH --nodes=1 # One node
#SBATCH --exclusive # All 36 procs on the one node
#SBATCH --mem=180gb # Job memory request
#SBATCH --qos=unlim # Unlimited time allowed
#SBATCH --time=10-00:00:00 # Time limit (d-hh:mm:ss)
#SBATCH --output=yrv_trinity%j.log # Standard output/error
#SBATCH --chdir=/vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity # Working directory for this script
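For the record, killing the running job and resubmitting boils down to (a sketch; 1795649 is the job ID from the log above, and the script filename is a placeholder):
scancel 1795649
sbatch yrv_trinity.sh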
The revised script is now running! I’m going to keep an eye on it to ensure that it actually continues running the way it should.
2025-12-10
The script died after ~15 hours, lolsob. Looks like Inchworm was starting when the error occurred:
--------------- Inchworm (K=25, asm) ---------------------
-- (Linear contig construction from k-mers) --
----------------------------------------------
* [Wed Dec 10 01:12:30 2025] Running CMD: /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin/Inchworm/bin//inchworm --kmers jellyfish.kmers.25.asm.fa --run_inchworm -K 25 --monitor 1 --num_threads 6 --PARALLEL_IWORM --min_any_entropy 1.0 -L 25 --no_prune_error_kmers > /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/inchworm.fa.tmp
Kmer length set to: 25
Min assembly length set to: 25
Monitor turned on, set to: 1
min entropy set to: 1
setting number of threads to: 6
-setting parallel iworm mode.
-reading Kmer occurrences...
[2710M] Kmers parsed.
done parsing 2710642398 Kmers, 2710642398 added, taking 8912 seconds.
TIMING KMER_DB_BUILDING 8912 s.
Pruning kmers (min_kmer_count=1 min_any_entropy=1 min_ratio_non_error=0.005)
Pruned 15962305 kmers from catalog.
Pruning time: 1573 seconds = 26.2167 minutes.
TIMING PRUNING 1573 s.
-populating the kmer seed candidate list.
Kcounter hash size: 2710642398
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
sh: line 1: 185644 Aborted /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin/Inchworm/bin//inchworm --kmers jellyfish.kmers.25.asm.fa --run_inchworm -K 25 --monitor 1 --num_threads 6 --PARALLEL_IWORM --min_any_entropy 1.0 -L 25 --no_prune_error_kmers > /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/inchworm.fa.tmp
Error, cmd: /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin/Inchworm/bin//inchworm --kmers jellyfish.kmers.25.asm.fa --run_inchworm -K 25 --monitor 1 --num_threads 6 --PARALLEL_IWORM --min_any_entropy 1.0 -L 25 --no_prune_error_kmers > /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/inchworm.fa.tmp died with ret 34304 No such file or directory at /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin/PerlLib/Pipeliner.pm line 187.
Pipeliner::run(Pipeliner=HASH(0x55555558a8d8)) called at /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin//Trinity line 2729
eval {...} called at /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin//Trinity line 2719
main::run_inchworm("/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinit"..., "/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinit"..., "RF", "", 25, 0) called at /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin//Trinity line 1836
main::run_Trinity() called at /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin//Trinity line 1500
eval {...} called at /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin//Trinity line 1499
If it indicates bad_alloc(), then Inchworm ran out of memory. You'll need to either reduce the size of your data set or run Trinity on a server with more memory available.
I need more memory! I will follow advice from Rich at WHOI IT and use a node with a higher memory allocation:
If you still are running out of memory at 180GB, there is a medium memory partition with nodes that have 512GB - if you need to use those, the partition name is medmem - simply replace compute with medmem in the --partition= statement…
I modified my SLURM header as follows:
#!/bin/bash
#SBATCH --partition=medmem # Queue selection
#SBATCH --job-name=yrv_trinity # Job name
#SBATCH --mail-type=ALL # Mail events (BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=yaamini.venkataraman@whoi.edu # Where to send mail
#SBATCH --nodes=1 # One node
#SBATCH --exclusive # All 36 procs on the one node
#SBATCH --mem=500gb # Job memory request
#SBATCH --qos=unlim # Unlimited time allowed
#SBATCH --time=10-00:00:00 # Time limit (d-hh:mm:ss)
#SBATCH --output=yrv_trinity%j.log # Standard output/error
#SBATCH --chdir=/vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity # Working directory for this script
The job is now in the queue. Given how much memory is needed for the run, I am not sure when it will start, so I'll need to keep an eye on my email.
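In the meantime, a couple of quick checks (a sketch; these are standard sinfo/squeue options): sinfo shows what the medmem nodes offer, and squeue --start gives Slurm's rough estimate of when a pending job might begin.
sinfo -p medmem -o "%n %c %m %t"         # node name, CPUs, memory (MB), state
squeue -u yaamini.venkataraman --start   # estimated start times for pending jobs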
2025-12-31
Alright, so my job stopped running on Dec. 12 after starting (maybe finishing?) Chrysalis. I tried restarting the job on Dec. 24, but instead of picking up where the previous run left off, it started from the beginning! I quit that job, and now I am revisiting it.
Based on this part of the Trinity manual, it seemed like I could get Trinity to run through Chrysalis only, then run Trinity again afterwards to finish the job:
# Run Trinity to assemble de novo transcriptome. Using primarily default parameters.
# Note: --no_distributed_trinity_exec stops Trinity after the initial phase (through Chrysalis),
# before the distributed per-cluster assembly that produces Trinity.fasta.
${TRINITY}/Trinity \
--seqType fq \
--max_memory 100G \
--samples_file ${trinity_file_list} \
--SS_lib_type RF \
--min_contig_length 200 \
--full_cleanup \
--no_distributed_trinity_exec \
--CPU 28
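My understanding (a sketch based on the manual, not something I have verified) is that the follow-up run would be the same command without --no_distributed_trinity_exec, pointed at the same output directory so Trinity can resume from its checkpoints:
# Second pass (sketch): identical parameters, minus --no_distributed_trinity_exec,
# so the distributed per-cluster assembly runs and Trinity.fasta gets written.
${TRINITY}/Trinity \
--seqType fq \
--max_memory 100G \
--samples_file ${trinity_file_list} \
--SS_lib_type RF \
--min_contig_length 200 \
--full_cleanup \
--CPU 28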
(base) [yaamini.venkataraman@poseidon-l1 ~]$ less /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/yrv_trinity1816186.log
nity_out_dir/chrysalis/iworm_cluster_welds_graph.txt.sorted > /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/iworm_cluster_welds_graph.txt.sorted.wIwormNames, checkpoint [/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/iworm_cluster_welds_graph.txt.sorted.wIwormNames.ok] exists.
-- Skipping CMD: /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin/Chrysalis/bin/BubbleUpClustering -i /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/inchworm.fa.min100 -weld_graph /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/iworm_cluster_welds_graph.txt.sorted -min_contig_length 200 -max_cluster_size 25 > /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/GraphFromIwormFasta.out, checkpoint [/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/GraphFromIwormFasta.out.ok] exists.
-- Skipping CMD: /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin/Chrysalis/bin/CreateIwormFastaBundle -i /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/GraphFromIwormFasta.out -o /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/bundled_iworm_contigs.fasta -min 200, checkpoint [/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/bundled_iworm_contigs.fasta.ok] exists.
-- Skipping CMD: /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin/Chrysalis/bin/ReadsToTranscripts -i /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/both.fa -f /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/bundled_iworm_contigs.fasta -o /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/readsToComponents.out -t 28 -max_mem_reads 50000000 -strand -p 10 , checkpoint [/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/readsToComponents.out.ok] exists.
-- Skipping CMD: /vortexfs1/home/yaamini.venkataraman/.conda/envs/trinity_env/bin/sort --parallel=28 -T . -S 100G -k 1,1n -k3,3nr -k2,2 /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/readsToComponents.out > /scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/readsToComponents.out.sort, checkpoint [/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir/chrysalis/readsToComponents.out.sort.ok] exists.
###################################################################
Stopping here due to --no_distributed_trinity_exec in effect
###################################################################
Error: Couldn't open /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trimgalore/trinity_out_dir/Trinity.fasta
Huh, it couldn't open the FASTA file! I decided to go find this FASTA file myself. I navigated to /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity/trinity_out_dir and could not find Trinity.fasta. I also realized that above, it was looking for a compiled transcriptome in a folder that doesn't exist (06c-trimgalore)... that in itself is very odd. Then I realized that my script listed the following path for my output directory: OUTPUT_DIR=/vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trimgalore
OF COURSE it's looking for a transcriptome in that odd folder: that's where I said it would be! I changed the output directory to /vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity and reran the script. Obviously, I still got the same error because the file does not exist. Because of this, I decided to restart the Trinity process from the beginning, but with the wall time increased to 25 days. The script is running, so we'll see what happens next.
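Note to self for next time: before resubmitting, a quick sanity check (sketch) that the script's OUTPUT_DIR is the directory that actually holds the previous run's checkpoints, since those .ok files are what Trinity uses to skip finished steps:
OUTPUT_DIR=/vortexfs1/scratch/yaamini.venkataraman/wc-green-crab/output/06c-trinity
ls ${OUTPUT_DIR}/trinity_out_dir/chrysalis/*.ok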
Going forward
- Index, get advanced transcriptome statistics, and pseudoalign with salmon
- Clean transcriptome with EnTap and blastn
- Identify differentially expressed genes
- Additional strand-specific analysis in the supergene region
- Examine HOBO data from 2023 experiment
- Demographic data analysis for 2023 paper
- Start methods and results of 2023 paper