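All you need: downloading GEO data and converting it to FASTQ
Getting the data is the first and most important step of any bioinformatics analysis. This article shows how to install and use sra-tools to download SRA data. Its main use here is converting raw NGS sequence data from SRA format to FASTQ format for downstream analysis.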
# Download and start the container
docker run --rm -it registry.cn-hangzhou.aliyuncs.com/weinfo/fortansfer:ascpsratoolkit1.0.0 bash
aspera@bb5b0b437942:~$ prefetch SRR3589948
This sra toolkit installation has not been configured.
Before continuing, please run: vdb-config --interactive
For more information, see https://www.ncbi.nlm.nih.gov/sra/docs/sra-cloud/
aspera@bb5b0b437942:~$ vdb-config --interactive
# An interactive configuration screen appears; press the letter "x" to exit, then carry on
aspera@bb5b0b437942:~$ prefetch SRR3589948
2020-05-08T13:08:25 prefetch.2.10.2 int: connection busy while validating within network system module - cannot open remote file: https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos2/sra-pub-run-7/SRR3589948/SRR3589948.1
2020-05-08T13:08:28 prefetch.2.10.2: 1) Downloading 'SRR3589948'...
2020-05-08T13:08:28 prefetch.2.10.2: Downloading via https...
2020-05-08T13:11:16 prefetch.2.10.2: https download succeed
2020-05-08T13:11:16 prefetch.2.10.2: 1) 'SRR3589948' was downloaded successfully
aspera@bb5b0b437942:~$ ls
SRR3589948 cli.run
aspera@bb5b0b437942:~$ fasterq-dump SRR3589948
spots read : 40,008,592
reads read : 80,017,184
reads written : 80,017,184
aspera@bb5b0b437942:~$ ls
SRR3589948 SRR3589948_1.fastq SRR3589948_2.fastq cli.run
# Conversion succeeded
aspera@bb5b0b437942:~$
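
The session above handles a single run. When several runs are needed, the same two commands can be wrapped in a small helper. The following is only a sketch: `run`, `fetch_fastq`, `fetch_all` and the `DRY_RUN` variable are illustrative names and not part of sra-tools, and it assumes `prefetch` and `fasterq-dump` are on the PATH and already configured, as in the container above.

```shell
# run CMD [ARGS...]: print the command when DRY_RUN=1, otherwise execute it.
# (DRY_RUN is a hypothetical switch for previewing, not an sra-tools option.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# fetch_fastq ACCESSION: download one run, then convert it to FASTQ
# in the current directory, mirroring the two steps shown above.
fetch_fastq() {
    run prefetch "$1" &&
    run fasterq-dump "$1"
}

# fetch_all ACCESSION...: process each accession in turn, stopping at the first failure.
fetch_all() {
    for acc in "$@"; do
        fetch_fastq "$acc" || return 1
    done
}

# Usage: preview the commands first, then run them for real.
#   DRY_RUN=1; fetch_all SRR3589948
#   DRY_RUN=0; fetch_all SRR3589948
```

Previewing with `DRY_RUN=1` is a cheap way to check the accession list before starting downloads that take several minutes each.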
With the raw data in FASTQ format, you can now move on to analyses that start from the raw files.
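
Before starting an analysis, it can be worth confirming that the two mate files hold the same number of reads. The sketch below assumes plain, unwrapped FASTQ with exactly four lines per record; `count_reads` and `check_pair` are hypothetical helper names introduced here for illustration.

```shell
# count_reads FILE: number of FASTQ records, assuming 4 lines per record.
count_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# check_pair FILE1 FILE2: succeed only if both mate files hold the same number of reads.
check_pair() {
    n1=$(count_reads "$1")
    n2=$(count_reads "$2")
    if [ "$n1" -eq "$n2" ]; then
        echo "OK: $n1 read pairs"
    else
        echo "MISMATCH: $n1 vs $n2" >&2
        return 1
    fi
}

# Usage with the files produced above:
#   check_pair SRR3589948_1.fastq SRR3589948_2.fastq
```

For the run shown above, fasterq-dump reported 80,017,184 reads written from 40,008,592 spots, so each of the two files would be expected to contain 40,008,592 reads.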

Please follow the WeChat official account "恒诺新知". Thanks to "R语言", "数据那些事儿", "老俊俊的生信笔记", "冷🈚️思", "珞珈R", and "生信星球" for their support!