GPU support for tflite

Hi,

I am using JetPack 4.4 DP (I have yet to flash 4.4) and TensorFlow 2.2.0 (the NVIDIA-provided build).

I am able to run a TensorFlow model on the GPU.

But when I try to run a TFLite model, it does not use the GPU; it runs only on the CPU.

Please give your suggestions.

Thanks,
rps

Hi,

It seems that TFLite doesn’t support NVIDIA GPUs.
https://github.com/tensorflow/tensorflow/issues/34536#issuecomment-565632906

Thanks.

@AastaLLL Thanks for the clarification.

There are discussions about TFLite GPU support through OpenGL ES or OpenCL. Is this possible on Jetson?

https://github.com/tensorflow/tensorflow/issues/18199#issuecomment-431106804

Please give your opinion on this.

Additionally, the main reason I am checking this is that I couldn’t convert my model to TensorRT, but I am able to convert it to the TFLite format.

I am using a ResNet50 model. I tried to convert it to TRT format on a Jetson Nano with up to 16GB of swap memory, but the conversion still fails: it uses almost all of the memory and then crashes.

Thanks,
rps

Hi,

Based on your use case, it’s recommended to use TensorRT or TF-TRT.
Would you mind sharing the size of your .pb file with us first?

Thanks.

@AastaLLL

My pb file is around 4.5Mb. It is very similar to the standard ResNet50.

Thanks
rps

Hi,

It should work on Jetson Nano.

You can give it a try.
Please install the TensorFlow package by following the instructions shared here.
Then you can run TF-TRT with the API as follows:
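
The snippet that accompanied this post is not preserved in the thread. As a rough sketch only, and not necessarily the code that was shared, a TF-TRT conversion of a SavedModel with the TensorFlow 2.x API of that era might look like the following. The directory names, the FP16 precision mode, the 32 MB workspace limit, and the helper names are all illustrative assumptions.

```python
def mb_to_bytes(megabytes):
    """Convert a size in MiB to bytes (used for the TensorRT workspace limit)."""
    return int(megabytes * 1024 * 1024)


def convert_saved_model(input_dir, output_dir, workspace_mb=32):
    """Convert a TF 2.x SavedModel with TF-TRT and save the result.

    input_dir and output_dir are hypothetical paths; adjust them to your setup.
    """
    # Imported lazily so this module still loads where TensorFlow is absent.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    # Start from the default parameters and override only what we need.
    # A small workspace is an assumption aimed at memory-limited boards.
    params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
        precision_mode=trt.TrtPrecisionMode.FP16,
        max_workspace_size_bytes=mb_to_bytes(workspace_mb),
    )
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=input_dir,
        conversion_params=params,
    )
    converter.convert()         # replace supported subgraphs with TRT ops
    converter.save(output_dir)  # write the converted SavedModel
```

Calling `convert_saved_model("resnet50_saved_model", "resnet50_trt")` would run the conversion on the device; both directory names are placeholders.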

Thanks.

@AastaLLL

I referred to the same link before. I installed TF 2.0 and tried with both JP 4.4 and JP 4.3, but the result is the same.

I have tried increasing the swap size up to 20GB, but the result remains the same. I look forward to your suggestions.
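
One way to confirm that the enlarged swap is actually active while the conversion runs is to read `/proc/meminfo`. The sketch below is only an illustration: the function names are hypothetical, and it assumes the usual Linux `Field:   value kB` layout.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "Name:   12345 kB".
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_summary(text):
    """Return RAM and swap figures in whole MiB (missing fields become 0)."""
    info = parse_meminfo(text)
    fields = ("MemTotal", "MemAvailable", "SwapTotal", "SwapFree")
    return {field: info.get(field, 0) // 1024 for field in fields}
```

On the device, passing the contents of `/proc/meminfo` to `memory_summary` before and during the conversion shows whether `SwapTotal` reflects the configured size and how quickly `MemAvailable` falls.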

Here is the log for your reference:

2020-10-27 22:52:11.899612: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1165 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
2020-10-27 22:54:11.859197: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-10-27 22:54:11.879654: I tensorflow/core/grappler/devices.cc:55] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2020-10-27 22:54:11.881411: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2020-10-27 22:54:12.081102: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-10-27 22:54:12.081779: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
2020-10-27 22:54:12.257006: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-10-27 22:54:12.402032: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-10-27 22:54:12.454602: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-10-27 22:54:12.478200: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-10-27 22:54:12.500468: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-10-27 22:54:12.533394: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-10-27 22:54:12.564841: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-10-27 22:54:12.565056: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-10-27 22:54:12.565252: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-10-27 22:54:12.565371: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-10-27 22:54:12.565556: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-10-27 22:54:12.565652: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-10-27 22:54:12.565798: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-10-27 22:54:12.566705: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-10-27 22:54:12.567267: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-10-27 22:54:12.567607: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1165 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
2020-10-27 22:54:13.814603: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:769] Optimization results for grappler item: graph_to_optimize
2020-10-27 22:54:13.814749: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:771]   function_optimizer: Graph size after: 1265 nodes (940), 2572 edges (2247), time = 152.917ms.
2020-10-27 22:54:13.814792: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:771]   function_optimizer: function_optimizer did nothing. time = 3.156ms.
Killed

Thanks,
rps

Hi,

The log looks odd to me, although I am not sure whether this is what causes the error.

It seems that TensorFlow tries to open libcudart.so.10.0.
However, the CUDA version in JetPack 4.4 should be 10.2.
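
A check like this can be automated by scanning the TensorFlow log for the `dso_loader` line that names the CUDA runtime. The sketch below is a hypothetical helper, not part of TensorFlow or JetPack, and it assumes the log wording shown in the post above. It inspects only `libcudart`, since the other libraries do not necessarily carry the same version number as the toolkit.

```python
import re

# Matches e.g. "... Successfully opened dynamic library libcudart.so.10.0"
_CUDART_PATTERN = re.compile(
    r"Successfully opened dynamic library libcudart\.so\.(\d+(?:\.\d+)*)"
)


def cudart_version(log_text):
    """Return the libcudart version the log reports, or None if absent."""
    match = _CUDART_PATTERN.search(log_text)
    return match.group(1) if match else None


def cudart_matches(log_text, expected):
    """True when the log shows libcudart at exactly the expected version."""
    return cudart_version(log_text) == expected
```

For the log in this thread, `cudart_version` yields `"10.0"`, so `cudart_matches(log_text, "10.2")` is false, which points to a TensorFlow build made for a different JetPack release.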

Would you mind checking first whether you have installed the correct TensorFlow package?
It’s recommended to use our latest JetPack 4.4.1 with TensorFlow 2.3.0+nv20.9.

Thanks.