Yes, as far as I remember it worked out of the box. I just followed the instructions given here: Installing Bazel on Ubuntu - Bazel main, to build it from source.
It is important to check which Bazel version works best with your TensorFlow version. Sometimes the latest Bazel release is too new for TF and you may need to downgrade.
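For example, a quick way to see which Bazel range a given TensorFlow checkout expects is to look at configure.py in the source tree, which defines the minimum and maximum supported versions. This is a sketch, assuming a recent TF checkout; the variable names may differ in older releases:
# Run from the tensorflow source checkout; shows the Bazel versions TF expects
grep -n "_TF_MIN_BAZEL_VERSION\|_TF_MAX_BAZEL_VERSION" configure.py
# Compare against the Bazel you have installed
bazel version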
gcc --version
gcc (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
But I am seeing this error:
ERROR: /xavier_ssd/home/tensorflow/tensorflow/lite/kernels/BUILD:321:1: C++ compilation of rule '//tensorflow/lite/kernels:builtin_op_kernels' failed (Exit 1)
In file included from ./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8.h:23:0,
from ./tensorflow/lite/kernels/internal/optimized/depthwiseconv_multithread.h:22,
from tensorflow/lite/kernels/depthwise_conv.cc:28:
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h: In static member function 'static void tflite::optimized_ops::depthwise_conv::PackMacroBlock<(tflite::DepthwiseConvImplementation)3, (tflite::DepthwiseConvDepthMultiplication)0, 0>::PackMacroBlockNeon(const uint8*, int8*, const tflite::optimized_ops::depthwise_conv::DepthwiseConvDotProdParams*)':
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:5981:47: note: use -flax-vector-conversions to permit conversions between vectors with differing element types or numbers of subparts
input_data_a = vld1q_u8(input_data_0);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:5981:47: error: cannot convert 'uint8x16_t {aka __vector(16) unsigned char}' to 'int8x16_t {aka __vector(16) signed char}' in assignment
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:5982:65: error: cannot convert 'uint8x16_t {aka __vector(16) unsigned char}' to 'int8x16_t {aka __vector(16) signed char}' in assignment
input_data_b = vld1q_u8(input_data_0 + 1 * input_depth);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:5983:65: error: cannot convert 'uint8x16_t {aka __vector(16) unsigned char}' to 'int8x16_t {aka __vector(16) signed char}' in assignment
input_data_c = vld1q_u8(input_data_0 + 2 * input_depth);
^
./tensorflow/lite/kernels/internal/optimized/depthwiseconv_uint8_3x3_filter.h:5984:65: error: cannot convert 'uint8x16_t {aka __vector(16) unsigned char}' to 'int8x16_t {aka __vector(16) signed char}' in assignment
input_data_d = vld1q_u8(input_data_0 + 3 * input_depth);
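I haven't verified this on that exact checkout, but since the compiler note above suggests -flax-vector-conversions, one possible workaround is to pass that flag through Bazel as an extra copt (the other approach would be explicit vreinterpretq_s8_u8 casts at the offending lines). A sketch, assuming the usual pip-package target; your other configure/build flags may differ:
# Hypothetical invocation; --copt forwards the flag to every C++ compile action
bazel build --copt=-flax-vector-conversions //tensorflow/tools/pip_package:build_pip_package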
In case this is useful to anyone, I’ve found that you can build just about anything for the Jetsons on AWS arm64 instances (which are only available in the US East (Ohio) region). It’s much faster, and I wouldn’t dream of trying to build anything directly on a Jetson after getting this working.
The above examples are for a CPU build, which does not use the Jetson's GPU. Has anyone compiled tensorflow-gpu 2.0 for Jetson? Is tensorflow-gpu 2.0 still not available for Jetson? Again, I don't want to use TensorFlow Lite because it doesn't support all the functions.
— Srkm009 used the following command, "pip3 install -U /tmp/tensorflow_pkg/tensorflow-2.0.0rc0-cp36-cp36m-linux_aarch64.whl", and that wheel is clearly CPU-only.
— According to my understanding, TensorFlow is written for x86 integrated with a GPU or TPU, whereas the Jetson has an ARM core. So does it compile properly for a Jetson with an integrated GPU, given that it uses an ARM core?
— They have released TensorFlow Lite as the official version of TensorFlow for ARM-type computers, and they also released the TensorFlow Lite GPU delegate for ARM processors with a GPU (mobile GPUs such as Mali).
— So my question is: why is there a TensorFlow Lite version for arm64 if full TensorFlow compiles and works on arm64? If we compile tensorflow-gpu 2.0 for the Jetson, will it work? How do you test whether tf-gpu works on it and has optimum performance? (See the sketch below.)
I may be wrong here, but I'm just trying to understand.
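Not an authoritative answer, but one quick way to check whether a tf-gpu build actually sees the Tegra GPU is to list the physical GPU devices from Python. A minimal check, assuming a TF 2.0 wheel built with CUDA support; an empty list means the build is CPU-only or CUDA failed to initialize:
# Lists GPUs visible to TensorFlow (experimental API in TF 2.0)
python3 -c "import tensorflow as tf; print(tf.config.experimental.list_physical_devices('GPU'))"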
I am trying to build TensorFlow 2.0.0 on a TX2. Below are the details.
System information
OS Platform and Distribution: jetpack 4.3
TensorFlow installed from (source or binary): source
TensorFlow version (use command below): 2.0.0
Python version: 2.7
Bazel version (if compiling from source): 0.26.1
GCC/Compiler version (if compiling from source): 7.5
CUDA/cuDNN version: 10.0/7.6.3
GPU model and memory: Tegra TX2
Describe the current behavior
When I try running the sample application, it crashes with the error "CUDA runtime implicit initialization on GPU:0 failed. Status: unknown error".
Describe the expected behavior
It should start a session.
Standalone code to reproduce the issue
Sample Code
#include <tensorflow/core/platform/env.h>
#include <tensorflow/core/public/session.h>
#include <iostream>

using namespace std;
using namespace tensorflow;

int main()
{
    // Creating the session is where CUDA initialization happens, so this is
    // the first place a GPU problem will show up.
    Session* session;
    Status status = NewSession(SessionOptions(), &session);
    if (!status.ok()) {
        cout << status.ToString() << "\n";
        return 1;  // Don't touch the session if creation failed.
    }
    cout << "Session successfully created.\n";
    session->Close();
    return 0;
}
CMakeLists.txt
cmake_minimum_required(VERSION 3.10)
project(test_hello)

set(TENSORFLOW_LIBRARIES tensorflow_cc protobuf)

add_executable(example example.cpp)

set(CUDA_NVCC_FLAGS
    ${CUDA_NVCC_FLAGS};
    -O3 -gencode arch=compute_30,code=sm_30;
    --std=c++11
)

set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Link the TensorFlow library.
# TensorFlow headers
include_directories("/usr/local/include/")
include_directories("/usr/local/include/tensorflow/")
include_directories("/usr/local/include/third-party/")
target_link_libraries(example "/usr/local/lib/libtensorflow_cc.so")

# You may also link CUDA if it is available.
find_package(CUDA)
if(CUDA_FOUND)
    target_link_libraries(example ${CUDA_LIBRARIES})
endif()
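For completeness, this is how I configure, build and run it out-of-source, assuming the CMakeLists.txt above sits next to example.cpp:
mkdir build && cd build
cmake ..
make
./example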
Other info / logs
$ ./example
2020-07-08 15:01:34.122381: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-08 15:01:34.127413: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-07-08 15:01:34.127566: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3
pciBusID: 0000:00:00.0
2020-07-08 15:01:34.127608: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-07-08 15:01:34.127693: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-07-08 15:01:34.127815: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-07-08 15:01:34.127879: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: unknown error
Segmentation fault (core dumped)
I am very new to both TensorFlow and CUDA programming, so I'm not sure what is going on. I'd appreciate the help.
I have upgraded to JetPack 4.5 and now I can run this program. Thanks!
But my requirement is to run the application inside Docker on the TX2. How do I install CUDA, cuDNN and the related libraries inside the Docker container?
I tried the following.
I mounted /usr/local/cuda/ into the Docker container.
In fact, I used the following command:
docker container run --rm --net=host --privileged -e PATH=:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin --env LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib/cm:/usr/lib:/opt/libav/lib:/opt/pylon5/lib64:/usr/lib/aarch64-linux-gnu/tegra:/usr/local/cuda/lib64 --device=/dev/nvhost-ctrl --device=/dev/nvhost-ctrl-gpu --device=/dev/nvhost-prof-gpu --device=/dev/nvmap --device=/dev/nvhost-gpu --device=/dev/nvhost-as-gpu -v /usr/local/cuda-10.2:/usr/local/cuda -v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra
Then I faced issues related to libcudnn.so and libcublas.so.10. I got around the libcudnn issue by installing libcudnn8_8.0.0.180-1+cuda10.2_arm64.deb and libcudnn8-dev_8.0.0.180-1+cuda10.2_arm64.deb inside the Docker container.
But I am still facing the issue with libcublas.so.10. How can I overcome this?
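One thing worth checking (I'm not certain this is the cause): on JetPack 4.4/4.5 with CUDA 10.2, libcublas.so.10 is installed under /usr/lib/aarch64-linux-gnu rather than under /usr/local/cuda, so mounting only /usr/local/cuda-10.2 would not bring it into the container. A sketch; the path is an assumption, so locate the library on the host first:
# Find where libcublas actually lives on the host
ldconfig -p | grep libcublas
# Then add a bind mount for it (assumed path) to the docker run command above, e.g.:
#   -v /usr/lib/aarch64-linux-gnu/libcublas.so.10:/usr/lib/aarch64-linux-gnu/libcublas.so.10:ro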
Or do you have a prebuilt Docker image for CUDA 10.2 and cuDNN 8, so that I can simply derive my image from it?
I have found the image 10.2-cudnn8-devel-ubuntu18.04, but it is for x86_64; I need a similar image for arm64.
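As a possible alternative to an x86_64 CUDA image: NVIDIA publishes arm64 base images for Jetson (l4t-base) on NGC, and when the container is started with the NVIDIA runtime the CUDA/cuDNN libraries from the host JetPack are mounted into it, so you could derive your image from that instead. The tag below is an assumption; pick the one matching your L4T release (cat /etc/nv_tegra_release):
# Pull an L4T base image matching the host JetPack and run it with the NVIDIA runtime
sudo docker pull nvcr.io/nvidia/l4t-base:r32.5.0
sudo docker run -it --rm --runtime nvidia nvcr.io/nvidia/l4t-base:r32.5.0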