Hi there,
Doc: Download Example FASTQ Files
I followed the instructions and observed behavior that differs from what the doc describes.
Q1: Convert SRA to FASTQ files
## Convert SRA to FASTQ files
fastq-dump -I --split-files SRR7890827 --gzip
fastq-dump -I --split-files SRR7890824 --gzip
(Fetching might take 35 to 39 hours.)
I tested the `fastq-dump` command and found that it did not convert the downloaded SRA files to FASTQ files.
Instead, it fetched the FASTQ data directly from the NCBI SRA source.
To convert the local files, the command should be:
## Convert "local" SRA to FASTQ files
fastq-dump -I --split-files ./SRR7890827 --gzip
fastq-dump -I --split-files ./SRR7890824 --gzip
(Prefix the accessions with `./` so that `fastq-dump` reads the local copies.)
(Conversion might take 13 to 15 hours.)
Q2: paired reads have different names: “SRR7890824.1.2”, “SRR7890824.1.1”
Doc: somatic pipeline
I followed the instructions and got the following error.
[PB Error 2022-Apr-13 11:18:37][ParaBricks/src/CReadWrite.cpp:379] paired reads have different names: "SRR7890824.1.2", "SRR7890824.1.1" in file /workspace/datasets/somatic/SRR7890824_1.fastq.gz and /workspace/datasets/somatic/SRR7890824_2.fastq.gz, exiting.
Here, `SRR7890824.1.2` has the form `accession.spot.readid`.
The `readid` should not be included in the read names of the FASTQ files; otherwise the somatic pipeline throws the error above.
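For reference, the mismatch can be confirmed by printing the name of the first read in each file. This is only a minimal sketch: `first_read_name` is a hypothetical helper of mine (not part of sra-tools or Parabricks), and it assumes the stream starts with a standard FASTQ header line.

```shell
# Hypothetical helper: print the name of the first read of a FASTQ stream.
# Assumes the first line is a header of the form "@name optional-description".
first_read_name() {
  # $1 is the first whitespace-separated token; substr drops the leading "@".
  awk 'NR == 1 { print substr($1, 2); exit }'
}

# Usage with the two files named in the error message:
# gzip -dc /workspace/datasets/somatic/SRR7890824_1.fastq.gz | first_read_name
# gzip -dc /workspace/datasets/somatic/SRR7890824_2.fastq.gz | first_read_name
```

Going by the error message, I would expect the two calls to print names that differ only in the trailing `readid`.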
Therefore, the conversion command should be corrected to:
## Convert "local" SRA to FASTQ files
fastq-dump --split-files ./SRR7890827 --gzip
fastq-dump --split-files ./SRR7890824 --gzip
(Omit the `-I` option.)
Alternatively, could `pbrun somatic` be fixed to accept FASTQ read names in the `accession.spot.readid` format?
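As a possible workaround for files that were already converted with `-I`, the `readid` suffix could be stripped from the header lines instead of re-running the conversion. The following is only a sketch under assumptions: `strip_readid` is a hypothetical helper, it assumes four-line FASTQ records whose names have the form `accession.spot.readid`, the output file name is made up, and I have not verified that `pbrun somatic` accepts the result.

```shell
# Hypothetical helper: drop the trailing ".readid" from FASTQ read names.
# Reads FASTQ on stdin and writes FASTQ on stdout.
# Assumes four-line records; only the "@" header line (line 1 of each record)
# is rewritten, the "+" line is left as it is.
strip_readid() {
  awk '
    NR % 4 == 1 {
      # $1 is the read name, e.g. "@SRR7890824.1.2"
      n = split($1, parts, ".")
      if (n == 3) {
        # Keep "accession.spot" and replace the name in place
        name = parts[1] "." parts[2]
        sub(/^[^ ]+/, name)
      }
    }
    { print }
  '
}

# Usage (the output file name is an arbitrary example):
# gzip -dc SRR7890824_2.fastq.gz | strip_readid | gzip > SRR7890824_2.noreadid.fastq.gz
```

Names that do not have exactly three dot-separated parts are passed through unchanged, so running it on files converted without `-I` should be harmless.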
Could you help clarify these two questions?