Description
I want to use multi-threaded inference and have assigned a separate execution context to each thread, but I encountered an error. It only works with a single thread and a single context. The relevant C++ code is below:
nvinfer1::Dims4 inputDims = {32, 3, img_size_, img_size_};
for (int i = 0; i < thread_nums_; ++i) {
void* deviceInput;
void* deviceOutput;
cudaStream_t stream = nullptr;
cudaStreamCreate(&stream);
cudaMalloc(&deviceInput, 32 * 3 * 224 * 224 * sizeof(float));
cudaMalloc(&deviceOutput, 32 * 1000 * sizeof(float));
void** bindings = new void*[2];
bindings[0] = deviceInput;
bindings[1] = deviceOutput;
auto m_context = m_engine->createExecutionContext();
m_context->setInputShape("input", inputDims);
m_context->setTensorAddress("input", deviceInput);
m_context->setTensorAddress("output", deviceOutput);
m_contexts.emplace_back(m_context);
streams_.emplace_back(stream);
deviceInputs_.emplace_back(deviceInput);
deviceOutputs_.emplace_back(deviceOutput);
}
cudaMemcpyAsync(sptr->deviceInputs_[i], inputHost, 32 * 3 * 224 * 224 * sizeof(float), cudaMemcpyHostToDevice, sptr->streams_[i]);
if (sptr->m_contexts[i]->enqueueV3(sptr->streams_[i])) {
cudaMemcpyAsync(outputHost, sptr->deviceOutputs_[i], 32 * 1000 * sizeof(float), cudaMemcpyDeviceToHost, sptr->streams_[i]);
std::cout << outputHost[0] << std::endl;
}
Environment
TensorRT Version: 8.6.1
GPU Type: RTX 3070
Nvidia Driver Version:
CUDA Version: cuda11.5
CUDNN Version:
Operating System + Version: ubuntu20.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
Relevant Files
Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)
Steps To Reproduce
Please include:
- Exact steps/commands to build your repro
- Exact steps/commands to run your repro
- Full traceback of errors encountered