Hi, I am trying to build TensorFlow Lite to evaluate the performance of ResNet on a TX2, but I have run into a problem that I have not been able to solve.
I downloaded the master branch of TensorFlow from GitHub and used the following command to build the TensorFlow Lite converter:
bazel build //tensorflow/lite/python:tflite_convert --local_ram_resources="HOST_RAM*.9" --local_cpu_resources=4 --verbose_failures --config=nonccl
The local resource flags limit how much RAM and CPU Bazel uses, so the build does not run out of memory. The --config=nonccl flag is meant to select a build without NCCL, since there is no NCCL on the TX2. However, an NCCL-related build error still appeared:
// …many similar nvlink errors; listing the last few below along with the final error log.
nvlink error : entry function '_Z28ncclAllReduceLLKernel_sum_i88ncclColl' with max regcount of 80 calls function '_Z25ncclReduceScatter_max_u64P14CollectiveArgs' with regcount of 96
nvlink error : entry function '_Z29ncclAllReduceLLKernel_sum_i328ncclColl' with max regcount of 80 calls function '_Z25ncclReduceScatter_max_u64P14CollectiveArgs' with regcount of 96
nvlink error : entry function '_Z29ncclAllReduceLLKernel_sum_f168ncclColl' with max regcount of 80 calls function '_Z25ncclReduceScatter_max_u64P14CollectiveArgs' with regcount of 96
nvlink error : entry function '_Z29ncclAllReduceLLKernel_sum_u328ncclColl' with max regcount of 80 calls function '_Z25ncclReduceScatter_max_u64P14CollectiveArgs' with regcount of 96
nvlink error : entry function '_Z29ncclAllReduceLLKernel_sum_f328ncclColl' with max regcount of 80 calls function '_Z25ncclReduceScatter_max_u64P14CollectiveArgs' with regcount of 96
nvlink error : entry function '_Z29ncclAllReduceLLKernel_sum_u648ncclColl' with max regcount of 80 calls function '_Z25ncclReduceScatter_max_u64P14CollectiveArgs' with regcount of 96
nvlink error : entry function '_Z28ncclAllReduceLLKernel_sum_u88ncclColl' with max regcount of 80 calls function '_Z25ncclReduceScatter_max_u64P14CollectiveArgs' with regcount of 96
It seems the error is related to some NCCL kernels. I don't know why NCCL-related kernels were still compiled even though I passed --config=nonccl.
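One thing that might help narrow this down is to check what the nonccl config actually expands to in the checkout being built. The snippet below is only a rough sketch: show_config is a made-up helper name, and it assumes the config is defined in a .bazelrc at the repository root using lines of the form "build:NAME ...". If it prints nothing, the config name is probably not defined in that file.

```shell
# Hypothetical helper: print the lines a bazelrc file attaches to a --config name.
# Usage: show_config NAME [PATH_TO_BAZELRC]   (default path: ./.bazelrc)
show_config() {
  local name="$1"
  local rc="${2:-.bazelrc}"
  # Match lines such as "build:NAME --some_flag" for the common command prefixes.
  grep -E "^(build|common|test):${name}([[:space:]]|$)" "$rc"
}
```

Running "show_config nonccl" from the TensorFlow source directory should list whatever options the config adds, which can then be compared against the options the NCCL targets actually depend on.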