I run imagenet-console with TensorRT on a Jetson TX2. With a single thread, it takes about 30 s to classify 300 pictures.
With 5 threads, it takes about 180 s to classify 10000 pictures (2000 pictures in each thread), which is about 55 fps.
With 6 threads, I get errors (Cuda Error in execute: 4).
PS: the network is googlenet, batch size is 128, FP16 is enabled, mode 0.
My source is based on the following: