Transferring data from GPU to CPU takes too much time on TX2

Hi guys,

I ran into a data-transfer problem on the Jetson TX2.

When I run inference (copy data from CPU to GPU, run inference, copy results from GPU to CPU) on the Jetson TX2 with my network (ONNX format), I found that transferring data from GPU to CPU takes a lot of time: about 80% of the total inference time.

The output tensor that needs to be transferred is 1x17x80x64. TensorRT version: 5.0.6.1, OS: Ubuntu 18.04. The copy function I am using is cudaMemcpyAsync().

Maybe I can optimize this in the following ways, but there are still some open questions:

1. I could use pinned memory to speed up the memory copy, but it does not seem to reduce my overall processing time (see the sketch after this list).
2. After transferring the data to the CPU, I actually post-process the 1x17x80x64 output down to 1x2x17 with a function implemented in C++. I could implement this function in CUDA so that it runs on the GPU, and then only the small result would need to be transferred. Can you provide some resources or links to help me implement my function in CUDA or TensorRT? (A rough kernel sketch follows below.)
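For reference on point 1, here is a minimal sketch of how pinned (page-locked) host memory is typically combined with cudaMemcpyAsync(). The buffer names and the inference placeholder are illustrative, not from my actual code. Note that cudaMemcpyAsync() only behaves asynchronously when the host buffer is pinned; with ordinary pageable memory the device-to-host copy is effectively synchronous. Also, on the TX2 the CPU and GPU share the same physical DRAM, so pinned memory avoids an extra staging copy.

```cpp
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    // Output tensor 1x17x80x64, as in the post.
    const size_t count = 1 * 17 * 80 * 64;
    const size_t bytes = count * sizeof(float);

    float* d_output = nullptr;   // device buffer (e.g. a TensorRT binding)
    float* h_output = nullptr;   // pinned host buffer

    cudaMalloc((void**)&d_output, bytes);
    // cudaHostAlloc returns page-locked memory, which enables truly
    // asynchronous transfers with cudaMemcpyAsync.
    cudaHostAlloc((void**)&h_output, bytes, cudaHostAllocDefault);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // ... enqueue inference on `stream` here (e.g. context->enqueue) ...

    // Async device-to-host copy, queued on the same stream as inference.
    cudaMemcpyAsync(h_output, d_output, bytes,
                    cudaMemcpyDeviceToHost, stream);

    // Block only when the result is actually needed.
    cudaStreamSynchronize(stream);

    printf("first value: %f\n", h_output[0]);

    cudaFreeHost(h_output);
    cudaFree(d_output);
    cudaStreamDestroy(stream);
    return 0;
}
```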
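And for point 2: I don't know exactly what the C++ post-processing computes, but since 1x17x80x64 reduced to 1x2x17 looks like extracting an (x, y) location per channel (a common argmax over keypoint heatmaps), here is a hedged sketch of what such a kernel could look like. The kernel name, the memory layout, and the argmax assumption are all illustrative; the body would need to be replaced with the real reduction. The point is that the kernel writes only 2x17 floats, so the device-to-host copy afterwards becomes tiny.

```cpp
#include <cuda_runtime.h>

// Hypothetical kernel: one block per channel; each block finds the
// argmax of its 80x64 heatmap and writes the (x, y) coordinate.
// This assumes the post-processing is a per-channel argmax; replace
// the body with your real computation if it is something else.
__global__ void heatmapArgmax(const float* heatmaps,  // [17][80][64]
                              float* coords,          // [2][17] -> x row, y row
                              int height, int width) {
    const int c = blockIdx.x;                       // channel index, 0..16
    const float* map = heatmaps + c * height * width;

    __shared__ float bestVal[256];
    __shared__ int   bestIdx[256];

    float v = -1e30f;
    int   idx = 0;
    // Strided scan: each thread covers part of the 80x64 map.
    for (int i = threadIdx.x; i < height * width; i += blockDim.x) {
        if (map[i] > v) { v = map[i]; idx = i; }
    }
    bestVal[threadIdx.x] = v;
    bestIdx[threadIdx.x] = idx;
    __syncthreads();

    // Tree reduction in shared memory to find the block-wide maximum.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s && bestVal[threadIdx.x + s] > bestVal[threadIdx.x]) {
            bestVal[threadIdx.x] = bestVal[threadIdx.x + s];
            bestIdx[threadIdx.x] = bestIdx[threadIdx.x + s];
        }
        __syncthreads();
    }

    if (threadIdx.x == 0) {
        coords[0 * gridDim.x + c] = (float)(bestIdx[0] % width);  // x
        coords[1 * gridDim.x + c] = (float)(bestIdx[0] / width);  // y
    }
}

// Launch example: 17 blocks (one per channel), 256 threads each,
// on the same stream as inference so no extra synchronization is needed:
// heatmapArgmax<<<17, 256, 0, stream>>>(d_output, d_coords, 80, 64);
```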

I would appreciate any advice and help!

Right,
the input needs to be transferred from CPU to GPU, and the output from GPU to CPU, and it's time-consuming.
Hoping for any advice to help sort it out.
Thanks