Uploading Data to NCBI SRA

Documenting for posterity (because it’s an annoying process)

I’m uploading the C. gigas Manchester WGBS data to the NCBI SRA, and I wanted to document this process so I remember how to do it next time!

Obtaining accession numbers

First things first, I needed to get BioProject and BioSample accession numbers. I went to the NCBI Submission Portal and registered a BioProject. Once it was registered, I used the accession number in my BioSample registration.

Uploading data

Then came the saga of trying to upload data to the NCBI SRA. I first tried using the Aspera plug-in to upload my 16 files. I posted this discussion to ask how Sam does the same thing. He suggested using FTP, but newer macOS versions don’t ship with ftp. Enter Homebrew. I installed inetutils, which provides ftp:

brew install inetutils

I followed the instructions from the NCBI SRA to upload the files using ftp. Here’s where I ran into an issue: ftp takes FOREVER. Sam suggested I ssh into a Roberts Lab computer, mount owl, run screen so I could close the Terminal session without interrupting the upload, then run ftp from there. He suggested I use the following code:

sudo mount -t cifs //owl.fish.washington.edu/web owl/ -o username=yaaminiv,vers=1.0

This command mounts owl on the remote machine, since I needed to navigate to the nightingales directory and upload my files from there. I used ssh to log into ostrich, but wasn’t able to use sudo on this machine. Sam mentioned he had provided the instructions thinking I was using Linux, so I logged into roadrunner instead and it worked! I navigated to the owl directory with my raw data, started screen, ran ftp -i, and then used mput to upload all the files. To check on the upload, I used screen -ls to find the session number, then screen -r to reattach to the session.
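
For next time, here’s a minimal sketch of how the ftp step could be scripted instead of typed interactively. This is not what I actually ran. The username, password, upload folder, and the *.fq.gz file pattern are all hypothetical placeholders; the real values come from the NCBI SRA upload instructions and the raw data directory. The script only prints the ftp commands (a dry run), so nothing is uploaded until the last commented line is used.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with the values shown in the
# NCBI SRA FTP upload instructions for your submission.
FTP_USER="your-ftp-username"
FTP_PASS="your-ftp-password"
REMOTE_DIR="your-upload-folder"

# Print the ftp commands needed to upload every file matching a pattern.
# $1 is the file pattern, e.g. '*.fq.gz' (an assumed extension).
make_ftp_batch() {
  printf 'user %s %s\n' "$FTP_USER" "$FTP_PASS"
  printf 'cd %s\n' "$REMOTE_DIR"
  printf 'binary\n'
  printf 'mput %s\n' "$1"
  printf 'bye\n'
}

# Dry run: show the commands without connecting to anything.
make_ftp_batch '*.fq.gz'

# To actually upload, start screen first, then pipe the commands into ftp
# from the directory holding the raw data (FTP_HOST is also a placeholder):
#   make_ftp_batch '*.fq.gz' | ftp -inv "$FTP_HOST"
# -i turns off per-file prompts for mput, -n skips auto-login so the
# "user" line can log in, and -v prints the server responses.
```

Running this inside screen should behave like the interactive version: detach with Ctrl-a d, then use screen -ls and screen -r to check back in.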

Thankfully my connection didn’t drop during the ftp, and I was able to load the files and submit to the NCBI SRA!

Written on February 15, 2022