Hello,
I am experimenting with the Python API for TensorRT that was included in the latest version of JetPack.
One thing that wasn’t immediately clear to me is how to allocate the memory used by the inference engine. More specifically, I want to use mapped pinned memory (i.e., pass the cudaHostAllocMapped flag to cudaHostAlloc()), since this allocation method has been the fastest on the TX2 in benchmarks.
Is there any way to allocate memory using the TensorRT Python API or is PyCUDA effectively required to do so? If PyCUDA is required to allocate such buffers, are there any plans to include it in JetPack so that users don’t have to install it manually? I think it would be helpful to include PyCUDA so at least users can run the Python samples (which use PyCUDA) without needing to manually install any libraries.
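To make the question concrete, here is roughly the allocation I have in mind, sketched with PyCUDA. This is an illustration rather than tested TX2 code: it assumes PyCUDA and NumPy are installed, the helper names `volume` and `alloc_mapped` are my own, and the input shape is only an example. I am also assuming that the returned device pointer could then be handed to the engine as a binding.

```python
import math


def volume(shape):
    """Number of elements in a tensor of the given shape."""
    return math.prod(shape)


def alloc_mapped(shape, dtype="float32"):
    """Allocate one mapped pinned buffer; return (host_array, device_ptr).

    Hypothetical helper. Requires an active CUDA context, e.g. one created
    with cuda.Device(0).make_context(flags=cuda.ctx_flags.MAP_HOST).
    """
    # Imported lazily because PyCUDA has to be installed separately.
    import numpy as np
    import pycuda.driver as cuda

    # DEVICEMAP is PyCUDA's counterpart of cudaHostAllocMapped.
    host = cuda.pagelocked_empty(
        volume(shape),
        np.dtype(dtype),
        mem_flags=cuda.host_alloc_flags.DEVICEMAP,
    )
    # The array's base object is the underlying pinned allocation, which
    # exposes the device-side address of the same memory.
    device_ptr = int(host.base.get_device_pointer())
    return host, device_ptr
```

With a context active, `host, dptr = alloc_mapped((1, 3, 224, 224))` would give a NumPy array I can fill on the CPU and an integer address for the GPU side, with no explicit copy in between.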
Thanks in advance for the help.