Hi there,
According to the figure in the germline pipeline documentation (https://docs.nvidia.com/clara/parabricks/v3.5/text/germline_pipeline.html), the pipeline calls the ApplyBQSR process.
However, when I ran the germline pipeline, the log contained no mention of ApplyBQSR.
Here is the log from the sample run.
$ pbrun germline \
> --ref parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
> --in-fq parabricks_sample/Data/sample_1.fq.gz parabricks_sample/Data/sample_2.fq.gz \
> --knownSites parabricks_sample/Ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
> --out-bam output.bam \
> --out-variants output.vcf \
> --out-recal-file report.txt \
> --x3
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation
[Parabricks Options Mesg]: Automatically generating ID prefix
[Parabricks Options Mesg]: Read group created for /uploads/workspace/parabricks_sample/Data/sample_1.fq.gz and
/uploads/workspace/parabricks_sample/Data/sample_2.fq.gz
[Parabricks Options Mesg]: @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1
docker run --gpus all -u=1000:1000 --rm -w=/uploads/workspace --net=host -v /opt/parabricks:/INSTALL/ -v /uploads/workspace/WODDX80V:/uploads/workspace/WODDX80V -v /uploads/workspace:/uploads/workspace -v /uploads/workspace/parabricks_sample/Ref:/uploads/workspace/parabricks_sample/Ref -v /uploads/workspace/parabricks_sample/Data:/uploads/workspace/parabricks_sample/Data parabricks/release:v3.5.0 fq2bam --ref /uploads/workspace/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta --in-fq /uploads/workspace/parabricks_sample/Data/sample_1.fq.gz /uploads/workspace/parabricks_sample/Data/sample_2.fq.gz @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1 --knownSites /uploads/workspace/parabricks_sample/Ref/Homo_sapiens_assembly38.known_indels.vcf.gz --out-bam /uploads/workspace/output.bam --out-recal-file /uploads/workspace/report.txt --memory-limit 110 --num-cpu-threads 0 --tmp-dir /uploads/workspace/WODDX80V --num-gpus 2 --x3
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation
[Parabricks Options Mesg]: Checking argument compatibility
[Parabricks Options Mesg]: Read group created for /uploads/workspace/parabricks_sample/Data/sample_1.fq.gz and
/uploads/workspace/parabricks_sample/Data/sample_2.fq.gz
[Parabricks Options Mesg]: @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1
g 2 b 0 B 2 P 4 s 1 r 0 o 2 m 1 z 4 f 2 v 0 M 2 name /uploads/workspace/output.bam report /uploads/workspace/report.txt K /uploads/workspace/parabricks_sample/Ref/Homo_sapiens_assembly38.known_indels.vcf.gz
/usr/local/cuda/.pb/binaries//bin/bwa mem /uploads/workspace/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta /uploads/workspace/parabricks_sample/Data/sample_1.fq.gz /uploads/workspace/parabricks_sample/Data/sample_2.fq.gz @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1 -Z ./pbOpts.txt
------------------------------------------------------------------------------
|| Parabricks accelerated Genomics Pipeline ||
|| Version v3.5.0 ||
|| GPU-BWA mem, Sorting Phase-I ||
|| Contact: Parabricks-Support@nvidia.com ||
------------------------------------------------------------------------------
[M::bwa_idx_load_from_disk] read 0 ALT contigs
GPU-BWA mem
ProgressMeter Reads Base Pairs Aligned
WARNING
The system has 12 threads, however recommended number of threads with 2 GPU is 24.
The run might not finish or might have less than expected performance.
[09:00:49] 5043564 590000000
[09:01:15] 10087128 1160000000
[09:01:41] 15130692 1730000000
[09:02:07] 20174256 2310000000
[09:02:33] 25217820 2900000000
[09:02:59] 30261384 3490000000
[09:03:25] 35304948 4060000000
[09:03:51] 40348512 4640000000
[09:04:18] 45392076 5220000000
[09:04:44] 50435640 5800000000
GPU-BWA Mem time: 287.420745 seconds
GPU-BWA Mem is finished.
GPU Sorting, Marking Dups, BQSR
ProgressMeter SAM Entries Completed
Total GPU-BWA Mem + Sorting + MarkingDups + BQSR Generation + BAM writing
Processing time: 287.421802 seconds
[main] CMD: PARABRICKS mem -Z ./pbOpts.txt /uploads/workspace/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta /uploads/workspace/parabricks_sample/Data/sample_1.fq.gz /uploads/workspace/parabricks_sample/Data/sample_2.fq.gz @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1
[main] Real time: 291.557 sec; CPU: 3389.752 sec
------------------------------------------------------------------------------
|| Program: GPU-BWA mem, Sorting Phase-I ||
|| Version: v3.5.0 ||
|| Start Time: Thu Jun 10 09:00:09 2021 ||
|| End Time: Thu Jun 10 09:05:05 2021 ||
|| Total Time: 4 minutes 56 seconds ||
------------------------------------------------------------------------------
/usr/local/cuda/.pb/binaries//bin/sort -sort_unmapped -ft 10 -gb 110
------------------------------------------------------------------------------
|| Parabricks accelerated Genomics Pipeline ||
|| Version v3.5.0 ||
|| Sorting Phase-II ||
|| Contact: Parabricks-Support@nvidia.com ||
------------------------------------------------------------------------------
progressMeter - Percentage
[09:05:06] 0.0 0.00 GB
Sorting and Marking: 10.000 seconds
------------------------------------------------------------------------------
|| Program: Sorting Phase-II ||
|| Version: v3.5.0 ||
|| Start Time: Thu Jun 10 09:05:06 2021 ||
|| End Time: Thu Jun 10 09:05:16 2021 ||
|| Total Time: 10 seconds ||
------------------------------------------------------------------------------
/usr/local/cuda/.pb/binaries//bin/postsort /uploads/workspace/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta -o /uploads/workspace/output.bam -sort_unmapped -ft 4 -wt 2 -zt 3 -bq 2 -gb 110 -a /uploads/workspace/report.txt /uploads/workspace/parabricks_sample/Ref/Homo_sapiens_assembly38.known_indels.vcf.gz
------------------------------------------------------------------------------
|| Parabricks accelerated Genomics Pipeline ||
|| Version v3.5.0 ||
|| Marking Duplicates, BQSR ||
|| Contact: Parabricks-Support@nvidia.com ||
------------------------------------------------------------------------------
progressMeter - Percentage
[09:05:27] 0.0 19.33 GB
[09:05:37] 0.3 19.23 GB
[09:05:47] 43.7 10.67 GB
[09:05:57] 79.4 3.01 GB
[09:06:07] 100.0 0.00 GB
BQSR and writing final BAM: 55.401 seconds
------------------------------------------------------------------------------
|| Program: Marking Duplicates, BQSR ||
|| Version: v3.5.0 ||
|| Start Time: Thu Jun 10 09:05:16 2021 ||
|| End Time: Thu Jun 10 09:06:13 2021 ||
|| Total Time: 57 seconds ||
------------------------------------------------------------------------------
docker run --gpus all -u=1000:1000 --rm -w=/uploads/workspace --net=host -v /opt/parabricks:/INSTALL/ -v /uploads/workspace/WODDX80V:/uploads/workspace/WODDX80V -v /uploads/workspace:/uploads/workspace -v /uploads/workspace/parabricks_sample/Ref:/uploads/workspace/parabricks_sample/Ref -v /uploads/workspace/parabricks_sample/Data:/uploads/workspace/parabricks_sample/Data parabricks/release:v3.5.0 haplotypecaller --ref /uploads/workspace/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta --in-bam /uploads/workspace/output.bam --out-variants /uploads/workspace/output.vcf --ploidy 2 --num-htvc-threads 5 --in-recal-file /uploads/workspace/report.txt --tmp-dir /uploads/workspace/WODDX80V --num-gpus 2 --x3
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation
/usr/local/cuda/.pb/binaries//bin/htvc /uploads/workspace/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta /uploads/workspace/output.bam 2 -o /uploads/workspace/output.vcf -nt 5 -a /uploads/workspace/report.txt
------------------------------------------------------------------------------
|| Parabricks accelerated Genomics Pipeline ||
|| Version v3.5.0 ||
|| GPU-GATK4 HaplotypeCaller ||
|| Contact: Parabricks-Support@nvidia.com ||
------------------------------------------------------------------------------
ProgressMeter - Current-Locus Elapsed-Minutes Regions-Processed Regions/Minute
0 /uploads/workspace/output.bam /uploads/workspace/output.vcf
[09:06:45] chr1:69736213 0.2 295788 1774728
[09:06:55] chr1:172127728 0.3 638210 1914630
[09:07:05] chr2:24575905 0.5 1059308 2118616
[09:07:15] chr2:118607996 0.7 1438509 2157763
[09:07:25] chr2:210110165 0.8 1822251 2186701
[09:07:35] chr3:53063751 1.0 2183068 2183068
[09:07:45] chr3:143860641 1.2 2555289 2190247
[09:07:55] chr4:62855995 1.3 3034038 2275528
[09:08:05] chr4:171853801 1.5 3487445 2324963
[09:08:15] chr5:84331180 1.7 3892371 2335422
[09:08:25] chr5:173260116 1.8 4264213 2325934
[09:08:35] chr6:74481799 2.0 4586215 2293107
[09:08:45] chr7:11246134 2.2 5032603 2322739
[09:08:55] chr7:130997412 2.3 5521721 2366451
[09:09:05] chr8:61243172 2.5 5878954 2351581
[09:09:15] chr9:20644571 2.7 6315026 2368134
[09:09:25] chr10:3110359 2.8 6733447 2376510
[09:09:35] chr10:117102296 3.0 7207833 2402611
[09:09:45] chr11:73843030 3.2 7574909 2392076
[09:09:55] chr12:26831401 3.3 7945179 2383553
[09:10:05] chr13:28406382 3.5 8431610 2409031
[09:10:15] chr14:34871914 3.7 8861037 2416646
[09:10:25] chr15:57763138 3.8 9295695 2424963
[09:10:35] chr16:64607746 4.0 9700290 2425072
[09:10:45] chr17:68337341 4.2 10069197 2416607
[09:10:55] chr18:68500686 4.3 10399080 2399787
[09:11:05] chr20:38097406 4.5 10836101 2408022
[09:11:15] chr22:44141670 4.7 11223280 2404988
[09:11:25] chr17_GL000258v2_alt:1521348 4.8 11959301 2474338
Total time taken: 304.593
------------------------------------------------------------------------------
|| Program: GPU-GATK4 HaplotypeCaller ||
|| Version: v3.5.0 ||
|| Start Time: Thu Jun 10 09:06:17 2021 ||
|| End Time: Thu Jun 10 09:11:36 2021 ||
|| Total Time: 5 minutes 19 seconds ||
------------------------------------------------------------------------------
Below is the binary list from the container.
usr/local/cuda-10.1/.pb/binaries/bin$ ls -ls
total 467644
360 -rwxrwxrwx 367256 Feb 23 applyBQSR
1192 -rwxrwxrwx 1218248 Feb 23 bamreadcount
296 -rwxrwxrwx 301024 Feb 23 bcftoolscall
432 -rwxrwxrwx 440368 Feb 23 bcftoolsmpileup
96 -rwxrwxrwx 96584 Feb 23 bedcov
500 -rwxrwxrwx 510696 Feb 23 bqsr
2424 -rwxrwxrwx 2481896 Feb 23 bwa
708 -rwxrwxrwx 723584 Feb 23 cnnscorevariants
220 -rwxrwxrwx 223192 Feb 23 cnvkit
844 -rwxrwxrwx 863712 Feb 23 collectmultiplemetrics
336 -rwxrwxrwx 342408 Feb 23 coverage
272 -rwxrwxrwx 276456 Feb 23 dbsnp
2252 -rwxrwxrwx 2305216 Feb 23 deepvariant
652 -rwxrwxrwx 664832 Feb 23 deviceQuery
296520 -rwxrwxrwx 303633183 Feb 23 gatk-package-4.1.0.0-local.jar
20 -rwxrwxrwx 19197 Feb 23 gatk_cpu
572 -rwxrwxrwx 584152 Feb 23 genotypegvcf
146552 -rwxrwxrwx 150068464 Feb 23 glnexus
2132 -rwxrwxrwx 2182792 Feb 23 htvc
252 -rwxrwxrwx 255960 Feb 23 indexgvcf
220 -rwxrwxrwx 223168 Feb 23 licenseManagerTool
324 -rwxrwxrwx 329688 Feb 23 licenseinfo
220 -rwxrwxrwx 223104 Feb 23 licensereturn
44 -rwxrwxrwx 42968 Feb 23 markQueryName
32 -rwxrwxrwx 31848 Feb 23 mergegvcf_humanpar
1824 -rwxrwxrwx 1867488 Feb 23 mutect
12 -rwxrwxrwx 10120 Feb 23 pb_driver
1000 -rwxrwxrwx 1022960 Feb 23 postsort
316 -rwxrwxrwx 321512 Feb 23 samtoolsmpileup
340 -rwxrwxrwx 346616 Feb 23 somaticsniper
432 -rwxrwxrwx 440296 Feb 23 sort
364 -rwxrwxrwx 370976 Feb 23 splitncigar
2920 -rwxrwxrwx 2989784 Feb 23 star
624 -rwxrwxrwx 636896 Feb 23 starfusion
468 -rwxrwxrwx 477168 Feb 23 trioCombineGVCF
768 -rwxrwxrwx 784640 Feb 23 variantfiltration
312 -rwxrwxrwx 317400 Feb 23 varscan
792 -rwxrwxrwx 809120 Feb 23 vqsr
Judging from the log, the germline pipeline seems to invoke only these binaries:
/usr/local/cuda/.pb/binaries//bin/bwa mem ...
/usr/local/cuda/.pb/binaries//bin/sort ...
/usr/local/cuda/.pb/binaries//bin/postsort ...
/usr/local/cuda/.pb/binaries//bin/htvc ...
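For anyone who wants to repeat this check on their own run, here is a minimal sketch of how the list above could be extracted from a saved log. It assumes the pbrun output was saved to a file (germline.log is a hypothetical name) and that every launch line starts with the binaries directory, as it does in the log above. The function names list_pb_binaries and uses_apply_bqsr are my own and are not part of Parabricks.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the log file name are hypothetical.

# Print the distinct binaries that the saved log shows being launched.
# Assumption: launch lines begin with the binaries directory seen in the log,
# e.g. /usr/local/cuda/.pb/binaries//bin/bwa
list_pb_binaries() {
  local log_file="$1"
  grep -oE '^/usr/local/cuda/\.pb/binaries/+bin/[A-Za-z0-9_-]+' "$log_file" \
    | sed -E 's#^.*/##' \
    | sort -u
}

# Succeed (exit status 0) only if applyBQSR is among the launched binaries.
uses_apply_bqsr() {
  list_pb_binaries "$1" | grep -qix 'applyBQSR'
}

# Usage (germline.log is a hypothetical file name):
#   list_pb_binaries germline.log
#   uses_apply_bqsr germline.log && echo "applyBQSR launched" || echo "applyBQSR not launched"
```

Note that this only tells you whether a separate applyBQSR binary shows up in the log. It cannot tell you whether recalibration is applied inside one of the other binaries.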
Does the germline pipeline actually go through the ApplyBQSR process?