Are there any examples of using TensorRT without the streams or double-buffer concept?

Most examples have some variation of the following:

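    void* buffers[2];  // one device pointer per engine binding
    cudaMalloc(&buffers[inputIndex], inputVolume);  // staging buffer for the input
    // Copy the frame, which is already on the GPU, into the staging buffer.
    cudaMemcpy(buffers[inputIndex], cudaArray, inputVolume, cudaMemcpyDeviceToDevice);

    context->execute(batchSize, buffers);  // synchronous inference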

I’m pulling DX11 frames into CUDA, so I’m already on the GPU. Do I really need an output buffer?

Are there any examples of one-and-done, inline execution?

It might also just take a single buffer. I need to go look at the docs, but I can’t play with it until after work.

Hi @ryan111,
The CUDA Forum should be able to help you here.
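
For the one-and-done case asked about above, here is a minimal sketch, not an official sample. As far as I know, TensorRT wants one device pointer per binding, outputs included, so an output buffer is still needed, but the input binding can point straight at memory that is already on the GPU, with no staging copy. The name `inferOnce` is hypothetical, and `Context` is a template parameter only so the sketch compiles without the TensorRT headers; the assumption is that you pass an `nvinfer1::IExecutionContext`, whose `executeV2(void* const* bindings)` runs synchronously.

```cpp
#include <vector>

// Hypothetical helper: one synchronous inference, no streams, no staging copy.
// Context is templated only to keep this sketch header-independent; the intent
// is an nvinfer1::IExecutionContext (assumption: executeV2 is available).
// dInput and dOutput must already be linear device pointers.
// inputIndex and outputIndex are the engine's binding indices.
template <typename Context>
bool inferOnce(Context& context, void* dInput, void* dOutput,
               int inputIndex, int outputIndex, int nbBindings = 2)
{
    if (dInput == nullptr || dOutput == nullptr) {
        return false;
    }
    if (inputIndex < 0 || outputIndex < 0 ||
        inputIndex >= nbBindings || outputIndex >= nbBindings ||
        inputIndex == outputIndex) {
        return false;
    }

    std::vector<void*> bindings(nbBindings, nullptr);
    bindings[inputIndex] = dInput;    // frame memory that is already on the GPU
    bindings[outputIndex] = dOutput;  // allocated once, reused for every frame

    // Blocks until inference has finished, so no stream handling is needed.
    return context.executeV2(bindings.data());
}
```

With a real context the call would look like `inferOnce(*context, dFrame, dOutput, inputIndex, outputIndex)`, where `dFrame` and `dOutput` are placeholder names for your own device pointers. One caveat: a DX11 texture mapped through CUDA interop is typically exposed as a `cudaArray`, which is not linear memory, so a device-to-device copy into a linear buffer may still be required before it can serve as a binding.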