Uploading Data to NCBI SRA
Documenting for posterity (because it’s an annoying process)
I’m uploading the C. gigas Manchester WGBS data to the NCBI SRA, and I wanted to document this process so I remember how to do it next time!
Obtaining accession numbers
First things first, I needed to get BioProject and BioSample accession numbers. I went to the NCBI Submission Portal and registered a BioProject. Once it was registered, I used its accession number in my BioSample registration.
Uploading data
Then came the saga of trying to upload data to the NCBI SRA. I first tried using the Aspera plug-in to upload my 16 files. I posted this discussion to ask how Sam does the same thing. He suggested using FTP, but newer macOS versions don’t ship with ftp. Enter Homebrew. I first installed inetutils, which provides ftp:
brew install inetutils
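A quick way to confirm the install worked is to check that ftp is now on the PATH. This is only a minimal sketch: have_cmd is a helper name made up for illustration, not something from Homebrew or inetutils.

```shell
# Hypothetical helper: succeed if the named command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd ftp; then
  echo "ftp found at: $(command -v ftp)"
else
  echo "ftp not found; check the Homebrew install output"
fi
```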
I followed the instructions from the NCBI SRA to upload the files using ftp. Here’s where I ran into an issue: ftp takes FOREVER. Sam suggested I ssh into a Roberts Lab computer, mount owl, run screen so I could close the Terminal session without interrupting the upload, then run ftp from there. He suggested I use the following code:
sudo mount -t cifs //owl.fish.washington.edu/web owl/ -o username=yaaminiv,vers=1.0
This code mounts owl remotely, which I needed so I could navigate to the nightingales directory and upload my files from there. I used ssh to log into ostrich, but wasn’t able to use sudo on this machine. Sam mentioned he had provided the instructions thinking I was using Linux, so I logged into roadrunner instead and it worked! I navigated to the owl directory with my raw data, started screen, ran ftp -i, then used mput to upload all the files. To check on the upload later, I used screen -ls to find the session number, then screen -r to reattach to the session.
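For next time, the screen and ftp steps above could be sketched roughly as below. This is not exactly what I ran: I typed the commands at the ftp prompt interactively, while this sketch writes them out as a batch and feeds them to ftp with auto-login turned off (-n). The function name make_ftp_batch and the values USERNAME, PASSWORD, REMOTE_DIR, HOST, and the *.fq.gz pattern are all placeholders I’m assuming for illustration; the real values come from the NCBI submission instructions.

```shell
# Hypothetical helper: print the ftp commands for one batch upload.
# Arguments: username, password, remote directory, local file pattern.
make_ftp_batch() {
  printf 'user %s %s\n' "$1" "$2"   # log in explicitly (pairs with ftp -n)
  printf 'binary\n'                 # transfer compressed files byte-for-byte
  printf 'cd %s\n' "$3"             # move into the upload directory
  printf 'mput %s\n' "$4"           # with ftp -i, mput does not prompt per file
  printf 'bye\n'                    # close the connection when done
}

# Example usage (commented out; every value here is a placeholder):
#   screen -S sra_upload      # start a named screen session
#   cd PATH_TO_RAW_DATA       # directory on the mounted owl share
#   make_ftp_batch USERNAME PASSWORD REMOTE_DIR '*.fq.gz' | ftp -n -i HOST
#   # detach with Ctrl-a d, then later:
#   screen -ls                # list sessions
#   screen -r sra_upload      # reattach to check progress
```

Naming the session with screen -S means it can be reattached by name instead of looking up the session number first. Keep in mind that a password passed on the command line may end up in shell history.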
Thankfully my connection didn’t drop during the ftp transfer, and I was able to upload the files and submit them to the NCBI SRA!