• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 6.1.1
• TensorRT Version: 8.5.2
• NVIDIA GPU Driver Version (valid for GPU only): Driver Version 520.61.05, CUDA Version 11.8
• Issue Type (questions, new requirements, bugs): question
Hello
I want to write a custom Triton Python backend for an RCNN network my company developed.
I am fairly familiar with the DeepStream SDK; my question is how to define the Python backend within DeepStream, which has very little documentation. As I understand from here:
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvinferserver.html
I should specify the following in my nvinferserver config file:
```
infer_config {
  unique_id: 1
  gpu_ids: [0]
  max_batch_size: 1
  backend {
    triton {
      # model_name: "smoke_32"
      version: -1
      model_repo {
        root: "./triton_model_repo"
        log_level: 1
        strict_model_config: true
        # Triton runtime would reserve 64MB pinned memory
        pinned_memory_pool_byte_size: 67108864
        # Triton runtime would reserve 64MB CUDA device memory on GPU 0
        cuda_device_memory { device: 0, memory_pool_byte_size: 67108864 }
        backend_dir: "path/to/my/python/backend/files"
      }
    }
    output_mem_type: MEMORY_TYPE_CPU
  }
}
```
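To show what I have pieced together so far: from the Triton Python backend documentation, each Python model seems to live under the model_repo root, with a config.pbtxt that selects backend: "python" and a numbered version directory holding model.py, i.e.:

```
triton_model_repo/
└── smoke_32/
    ├── config.pbtxt
    └── 1/
        └── model.py
```

with a config.pbtxt along these lines (the tensor names, data types, and dims here are placeholders for our RCNN, not values from the docs):

```
name: "smoke_32"
backend: "python"
max_batch_size: 1
input [
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ 3, 544, 960 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_FP32
    dims: [ -1, 6 ]
  }
]
```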
But I just can’t understand what file structure backend_dir expects…
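For completeness, this is the kind of model.py skeleton I am planning, based on the TritonPythonModel interface from the Triton docs; the tensor names match the placeholder config.pbtxt above, and the actual RCNN inference is stubbed out:

```python
import numpy as np
import triton_python_backend_utils as pb_utils  # provided by the Triton Python backend at runtime


class TritonPythonModel:
    """Skeleton Python backend model; the RCNN-specific parts are stubbed out."""

    def initialize(self, args):
        # args carries model_config (a JSON string), model_name, model_version, etc.
        # Real code would load our RCNN weights here.
        self.model_config = args["model_config"]

    def execute(self, requests):
        responses = []
        for request in requests:
            # Tensor names must match the config.pbtxt above (placeholders).
            in_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            batch = in_tensor.as_numpy()

            # Placeholder for running our RCNN on `batch`.
            detections = np.zeros((1, 6), dtype=np.float32)

            out_tensor = pb_utils.Tensor("OUTPUT0", detections)
            responses.append(pb_utils.InferenceResponse(output_tensors=[out_tensor]))
        return responses

    def finalize(self):
        # Called once when the model is unloaded; release resources here.
        pass
```

What I can’t tell is whether backend_dir should point at this model code, or at the directory containing Triton’s own backend shared libraries (e.g. /opt/tritonserver/backends).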
Please help!
Guy