Is it possible to precompile the JIT cache for the NVIDIA DeepStream libraries on the Orin?

• Platform: NVIDIA Orin AGX 64 GB
• DeepStream Version: 6.3
• JetPack Version: 5.1.2 (L4T 35.4.1)
• Issue Type: Question about performance

I have noticed that the first time I run my DeepStream pipeline on an Orin, it takes much longer to initialize due to JIT compilation: https://developer.nvidia.com/blog/cuda-pro-tip-understand-fat-binaries-jit-caching/

Is there any way to precompile the NVIDIA DeepStream libs so that we do not need this JIT cache?
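From the blog post above, my understanding is that JIT only kicks in when a binary ships PTX without SASS for the target GPU, so building with the Orin's native architecture (sm_87) should avoid it. For reference, a sketch of the nvcc flags I mean (demo.cu is a placeholder, not my actual plugin):

```
# Embed native Orin SASS (sm_87) plus PTX for forward compatibility.
# The sm_87 SASS is loaded directly on Orin, so no JIT step is needed.
$ nvcc -gencode arch=compute_87,code=sm_87 \
       -gencode arch=compute_87,code=compute_87 \
       -o demo demo.cu
```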

Hi,

Could you share more details about the JIT compilation?
In general, the DeepStream libraries are compiled for the Orin GPU architecture, so JIT should not be needed.
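You can also verify which GPU binaries a library actually carries with cuobjdump: sm_87 ELF entries are native Orin code, while PTX-only entries would need JIT. For example (the library path below is just an illustration):

```
# List the embedded SASS (ELF) cubins and the embedded PTX for a shared library.
$ cuobjdump --list-elf /opt/nvidia/deepstream/deepstream/lib/libnvds_infer.so
$ cuobjdump --list-ptx /opt/nvidia/deepstream/deepstream/lib/libnvds_infer.so
```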

Thanks.

Hi,

Thanks for the response. I am cross-compiling a custom GStreamer plugin for an Orin. When running the code on the Orin, I see a folder being created, ~/.nv/ComputeCache, and populating it usually takes a few minutes.

The folder contains several subfolders and an index file:

```
ComputeCache
  5
  6
  c
  index
```

How can I work out which code is generating this cache? I have already compiled my custom plugin for the Orin-specific architecture, which is why I believe it might be the NVIDIA libs themselves that need to be compiled for the Orin.
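As a stopgap, I am also considering warming the cache once and shipping it with our image; a sketch of that idea using the documented CUDA cache environment variables (run_pipeline.sh and the path are placeholders):

```
# Populate the JIT cache once at a fixed, relocatable location.
$ CUDA_CACHE_PATH=/opt/myapp/ComputeCache CUDA_CACHE_MAXSIZE=1073741824 ./run_pipeline.sh
# On every device, point CUDA at the pre-warmed cache shipped with the image.
$ CUDA_CACHE_PATH=/opt/myapp/ComputeCache ./run_pipeline.sh
```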

Hi,

Yes, the cache indicates that JIT is being triggered.
Could you share your source with us so we can check which library requires JIT compilation?
Alternatively, could you turn off JIT to locate the failing component?

For example:

```
$ CUDA_DISABLE_PTX_JIT=1 ./demo
```
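The complementary variables can also help while debugging; these are the documented CUDA JIT cache controls (./demo is a placeholder for your pipeline):

```
# Force PTX JIT for every kernel, to measure the worst-case compile time.
$ CUDA_FORCE_PTX_JIT=1 ./demo
# Disable the on-disk cache so each run re-triggers JIT while you test.
$ CUDA_CACHE_DISABLE=1 ./demo
```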

Thanks.

Hi, thank you for the response!

I am unable to share my source, but I will try your second suggestion. When I turn off JIT, I get a segfault and the following errors:

```
/dvs/git/dirty/git-master_linux/nvutils/nvbufsurftransform/nvbufsurftransform_copy.cpp:335: => Failed in mem copy

Cuda failure: status=223 in cuResData at line 712
NVMEDIA: Need to set EMC bandwidth : 846000
NvVideo: bBlitMode is set to TRUE
Cuda failure: status=223 in cuResData at line 731
/dvs/git/dirty/git-master_linux/nvutils/nvbufsurftransform/nvbufsurftransform_copy.cpp:429: => Failed in mem copy

Cuda failure: status=223 in cuResData at line 731
/dvs/git/dirty/git-master_linux/nvutils/nvbufsurftransform/nvbufsurftransform_copy.cpp:335: => Failed in mem copy

/dvs/git/dirty/git-master_linux/nvutils/nvbufsurftransform/nvbufsurftransform_copy.cpp:335: => Failed in mem copy

/dvs/git/dirty/git-master_linux/nvutils/nvbufsurftransform/nvbufsurftransform_copy.cpp:335: => Failed in mem copy

Cuda failure: status=223 in cuResData at line 625
/dvs/git/dirty/git-master_linux/nvutils/nvbufsurftransform/nvbufsurftransform_copy.cpp:335: => Failed in mem copy

Segmentation fault
```

I also see that a ComputeCache directory is still created, with only a single index file inside.

Hi,

Thanks for testing; that helps.
Status 223 is CUDA_ERROR_JIT_COMPILATION_DISABLED, which confirms that a kernel is attempting PTX JIT. We will check whether we can reproduce this with a pipeline that also uses nvbufsurftransform.

Thanks.

Hi,

Could you check whether the same issue occurs without the custom plugin?
We tried to reproduce the issue with a DeepStream sample, but it runs normally with JIT disabled:

```
$ CUDA_DISABLE_PTX_JIT=1 deepstream-app -c source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt
Setting min object dimensions as 16x16 instead of 1x1 to support VIC compute mode.
0:00:04.545290925 82625 0xaaaaf4bee380 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<secondary_gie_1> NvDsInferContext[UID 5]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2095> [UID = 5]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.0/samples/configs/deepstream-app/../../models/Secondary_VehicleMake/resnet18_vehiclemakenet.etlt_b16_gpu0_int8.engine
INFO: [Implicit Engine Info]: layers num: 2
0   INPUT  kFLOAT input_1         3x224x224
1   OUTPUT kFLOAT predictions/Softmax 20x1x1

...
**PERF:  29.99 (30.82)	29.99 (30.82)	29.99 (30.82)	29.99 (30.82)
**PERF:  30.01 (30.73)	30.01 (30.73)	30.01 (30.73)	30.01 (30.73)
nvstreammux: Successfully handled EOS for source_id=2
nvstreammux: Successfully handled EOS for source_id=3
nvstreammux: Successfully handled EOS for source_id=0
nvstreammux: Successfully handled EOS for source_id=1
** INFO: <bus_callback:314>: Received EOS. Exiting ...

Quitting
[NvMultiObjectTracker] De-initialized
App run successful
```
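If the sample also runs cleanly on your side, you could narrow things down with a minimal pipeline that exercises NvBufSurfTransform through nvvideoconvert (a sketch; adjust the caps to match your stream):

```
$ CUDA_DISABLE_PTX_JIT=1 gst-launch-1.0 videotestsrc num-buffers=100 \
    ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=NV12' ! fakesink
```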

Thanks.
