I have encountered an issue that prevents me from using Parabricks: it simply does not work when used as presented in the fq2bam tutorial.
My computational environment consists of 16 vCPUs, 256 GB of RAM, and one Tesla A100 (80 GB).
Here is the printout from nvidia-smi:
Thu Nov 28 10:00:04 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100-SXM4-80GB Off | 00000000:00:10.0 Off | 0 |
| N/A 23C P0 49W / 500W | 0MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
My command for running fq2bam does not differ greatly from the original one in the tutorial (which I also tried, and which does not work properly either), yet it returns the least informative log I have ever seen. Literally no information can be inferred from that printout.
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation
/usr/local/parabricks/run_pb.py fq2bam --num-gpus 1 --x3 --verbose --memory-limit 60 --bwa-options=-K 19 --ref /workdir/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta --in-fq /workdir/parabricks_sample/Data/sample_1.fq.gz /workdir/parabricks_sample/Data/sample_2.fq.gz --out-bam /outputdir/fq2bam_output.bam --tmp-dir //1U548PMS
[Parabricks Options Mesg]: Checking argument compatibility
[Parabricks Options Mesg]: Automatically generating ID prefix
[Parabricks Options Mesg]: Read group created for /workdir/parabricks_sample/Data/sample_1.fq.gz and
/workdir/parabricks_sample/Data/sample_2.fq.gz
[Parabricks Options Mesg]: @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1
/usr/local/parabricks/binaries/bin/pbbwa /workdir/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta --mode pair-ended-gpu /workdir/parabricks_sample/Data/sample_1.fq.gz /workdir/parabricks_sample/Data/sample_2.fq.gz -R @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1 --nGPUs 1 --nstreams 4 --cpu-thread-pool 16 -K 19 -F 0 --min-read-size 1 --max-read-size 480 --markdups --write-bin --verbose
For technical support visit https://docs.nvidia.com/clara/index.html#parabricks
Exiting...
Could not run fq2bam
Exiting pbrun ...
Hi there, NVIDIA.
Are you even working on support for developers?
There are a lot of problems just like mine that have gone unanswered for far longer than 24 hours.
It’s not even funny, since you market your solutions as being “for Healthcare & Life Sciences”. Most life sciences professionals would agree that this level of support is unacceptable, especially since your software and tutorials seem to be quite error-prone and counterintuitive.
It would be greatly appreciated if you would respond with something useful.
Hello, apologies for the late reply. We are based in the US, and these posts came in during the Thanksgiving holiday.
We double-checked, and the command from the tutorial works as shown in the FQ2BAM Tutorial - NVIDIA Docs. The exact directories and directory structure may need to be adjusted to match your own setup.
Looking at the particular command you ran, you added --bwa-options="-K 19". The -K option for bwa-mem is a hidden parameter that controls the chunk size (in number of bases) used as the window for determining the best alignment in paired-end output. The value 19 is not invalid, just very small, and it will severely hamper performance. A larger value such as --bwa-options="-K 10000000" would yield better performance and still maintain reproducibility, as long as you pass the same value to subsequent runs. However, that is not what is causing the crash you’re seeing.
The strange thing about your run is that the actual binary is never executed. It seems to be receiving a signal from Docker or the host OS before reaching that stage, which is why fq2bam prints “Exiting…”. The signal that usually causes this printout is an out-of-memory kill, but 256 GB should be more than enough.
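To rule out the OOM killer, you could check the kernel log on the host right after a failed run. A rough sketch (the exact message wording varies by distro and kernel version):
dmesg -T | grep -iE 'out of memory|oom-kill|killed process'   # kernel OOM messages
journalctl -k --since "15 min ago" | grep -i oom              # same check via the systemd journal
If either command prints a line mentioning the container’s process, memory pressure is the culprit.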
Could you confirm that the sample data downloaded correctly? The md5sum I see for parabricks_sample.tar.gz is 05b51303a7b9939c9232f88e7ecd1444.
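For reference, the download can be verified on your side like this:
md5sum parabricks_sample.tar.gz
# expected output: 05b51303a7b9939c9232f88e7ecd1444  parabricks_sample.tar.gz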
One other idea would be to explicitly set the temporary working directory with --tmp-dir /workdir, although we have tested the out-of-space case and do print an informative error when the disk is full.
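It cannot hurt to confirm the free space yourself; for example:
df -h /workdir      # free space on the volume --tmp-dir will write to
docker system df    # space consumed by Docker images, containers, and volumes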
Hi,
I do confirm that the md5sum is 05b51303a7b9939c9232f88e7ecd1444.
The command docker run --gpus all --rm --volume $(pwd):/workdir --volume $(pwd):/outputdir nvcr.io/nvidia/clara/clara-parabricks:4.4.0-1 pbrun fq2bam --num-gpus 1 --x3 --verbose --memory-limit 60 --bwa-options="-K 10000000" --tmp-dir /workdir --ref /workdir/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta --in-fq /workdir/parabricks_sample/Data/sample_1.fq.gz /workdir/parabricks_sample/Data/sample_2.fq.gz --out-bam /outputdir/fq2bam_output.bam
has returned the same output as before:
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation
/usr/local/parabricks/run_pb.py fq2bam --num-gpus 1 --x3 --verbose --memory-limit 60 --bwa-options=-K 10000000 --tmp-dir /workdir/OHF980NE --ref /workdir/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta --in-fq /workdir/parabricks_sample/Data/sample_1.fq.gz /workdir/parabricks_sample/Data/sample_2.fq.gz --out-bam /outputdir/fq2bam_output.bam
[Parabricks Options Mesg]: Checking argument compatibility
[Parabricks Options Mesg]: Automatically generating ID prefix
[Parabricks Options Mesg]: Read group created for /workdir/parabricks_sample/Data/sample_1.fq.gz and
/workdir/parabricks_sample/Data/sample_2.fq.gz
[Parabricks Options Mesg]: @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1
/usr/local/parabricks/binaries/bin/pbbwa /workdir/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta --mode pair-ended-gpu /workdir/parabricks_sample/Data/sample_1.fq.gz /workdir/parabricks_sample/Data/sample_2.fq.gz -R @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1 --nGPUs 1 --nstreams 4 --cpu-thread-pool 16 -K 10000000 -F 0 --min-read-size 1 --max-read-size 480 --markdups --write-bin --verbose
For technical support visit https://docs.nvidia.com/clara/index.html#parabricks
Exiting...
Could not run fq2bam
Exiting pbrun ...
Moreover, when I check the NVIDIA Container Toolkit configuration, everything seems to be fine, and the CUDA sample tests pass.
$ docker run --rm --gpus all nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
$ docker run --rm --gpus all nvcr.io/nvidia/k8s/cuda-sample:devicequery
/cuda-samples/sample Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "NVIDIA A100-SXM4-80GB"
CUDA Driver Version / Runtime Version 12.5 / 12.5
CUDA Capability Major/Minor version number: 8.0
Total amount of global memory: 81038 MBytes (84974239744 bytes)
(108) Multiprocessors, (064) CUDA Cores/MP: 6912 CUDA Cores
GPU Max Clock rate: 1410 MHz (1.41 GHz)
Memory Clock rate: 1593 Mhz
Memory Bus Width: 5120-bit
L2 Cache Size: 41943040 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total shared memory per multiprocessor: 167936 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 3 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 0 / 16
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.5, CUDA Runtime Version = 12.5, NumDevs = 1
Result = PASS
Hi,
Unfortunately, journalctl does not return anything, as no process has been killed.
On the other hand, there is a new printout when trying to run the process directly within the container.
After entering the container with $ docker run --gpus all -it --rm --volume $(pwd):/workdir --volume $(pwd):/outputdir nvcr.io/nvidia/clara/clara-parabricks:4.4.0-1 /bin/bash
and running this command inside of it /usr/local/parabricks/binaries/bin/pbbwa /workdir/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta --mode pair-ended-gpu /workdir/parabricks_sample/Data/sample_1.fq.gz /workdir/parabricks_sample/Data/sample_2.fq.gz -R @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1 --nGPUs 1 --nstreams 4 --cpu-thread-pool 16 -K 10000000 -F 0 --min-read-size 1 --max-read-size 480 --markdups --write-bin --verbose
The output I’m getting points directly to the culprit: /usr/local/parabricks/binaries/bin/pbbwa: error while loading shared libraries: libfilehandle.so: cannot open shared object file: No such file or directory
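To see whether libfilehandle.so is the only unresolved dependency, running ldd on the binary from the same shell should list any other missing libraries:
ldd /usr/local/parabricks/binaries/bin/pbbwa | grep 'not found'   # show only unresolved shared libraries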
This suggests that the Parabricks 4.4.0-1 container was not built properly and that there are problems with its shared libraries.
We set LD_LIBRARY_PATH when running through pbrun, which is the intended entry point for users. The shared libraries in the container are correct, and all paths will be set properly if you run through pbrun.
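If you do want to invoke pbbwa directly for debugging, a minimal sketch along these lines should work; note that the library directory below is an assumption, so replace it with wherever libfilehandle.so actually lives in your container:
# locate the bundled libraries first
find /usr/local/parabricks -name 'libfilehandle.so' 2>/dev/null
# point the loader at that directory (assumed path) before running the binary
export LD_LIBRARY_PATH=/usr/local/parabricks/binaries/lib:$LD_LIBRARY_PATH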
We have not had other reports of users having issues with the container.
Can you give a full description of your environment? Are you running in the cloud? If so, which image, instance type, etc.? On-prem? Which OS? Which CUDA driver version? Can you confirm that the container was downloaded correctly by checking the sha256sum?
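To check the digest of the image you pulled, something like this should do:
docker images --digests nvcr.io/nvidia/clara/clara-parabricks   # shows the sha256 digest for each tag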
If you have NVIDIA Enterprise Support, the best way to get the issue resolved is by contacting our dedicated Enterprise Support Team. You can contact our support team via web portal, phone, or web form. You can find this information listed on our Enterprise Customer Support page.