TX1 caffe FP16 make error

Hi,
I get a link error when building the following repository with the Makefile.config changes listed below.

Clone Caffe: “git clone https://github.com/NVIDIA/caffe.git -b experimental/fp16”
Modified Makefile.config as follows:

Line 5: enabled “USE_CUDNN := 1”
Line 17: enabled “NATIVE_FP16 := 1”
Line 41: inserted “-gencode arch=compute_53,code=sm_53 \”
Line 42: modified “-gencode arch=compute_50,code=compute_50” to “-gencode arch=compute_53,code=compute_53”

cd caffe/
make

The build then fails with the following error:

NVCC src/caffe/layers/im2col_layer.cu
NVCC src/caffe/layers/deconv_layer.cu
NVCC src/caffe/layers/tile_layer.cu
NVCC src/caffe/layers/cudnn_sigmoid_layer.cu
NVCC src/caffe/util/fp16_conversion.cu
NVCC src/caffe/util/math_functions.cu
src/caffe/util/math_functions.cu(284): warning: __shared__ memory variable with non-empty constructor or destructor (potential race between threads)
          detected during instantiation of "void caffe::gpu_dot_kernel(int, const Dtype *, const Dtype *, Mtype *) [with Dtype=caffe::float16, Mtype=caffe::float16]" 
(321): here

src/caffe/util/math_functions.cu(352): warning: __shared__ memory variable with non-empty constructor or destructor (potential race between threads)
          detected during instantiation of "void caffe::gpu_asum_kernel(int, const Dtype *, Mtype *) [with Dtype=caffe::float16, Mtype=caffe::float16]" 
(387): here

src/caffe/util/math_functions.cu(284): warning: __shared__ memory variable with non-empty constructor or destructor (potential race between threads)
          detected during instantiation of "void caffe::gpu_dot_kernel(int, const Dtype *, const Dtype *, Mtype *) [with Dtype=caffe::float16, Mtype=caffe::float16]" 
(321): here

src/caffe/util/math_functions.cu(352): warning: __shared__ memory variable with non-empty constructor or destructor (potential race between threads)
          detected during instantiation of "void caffe::gpu_asum_kernel(int, const Dtype *, Mtype *) [with Dtype=caffe::float16, Mtype=caffe::float16]" 
(387): here

src/caffe/util/math_functions.cu(284): warning: __shared__ memory variable with non-empty constructor or destructor (potential race between threads)
          detected during instantiation of "void caffe::gpu_dot_kernel(int, const Dtype *, const Dtype *, Mtype *) [with Dtype=caffe::float16, Mtype=caffe::float16]" 
(321): here

src/caffe/util/math_functions.cu(352): warning: __shared__ memory variable with non-empty constructor or destructor (potential race between threads)
          detected during instantiation of "void caffe::gpu_asum_kernel(int, const Dtype *, Mtype *) [with Dtype=caffe::float16, Mtype=caffe::float16]" 
(387): here

src/caffe/util/math_functions.cu(284): warning: __shared__ memory variable with non-empty constructor or destructor (potential race between threads)
          detected during instantiation of "void caffe::gpu_dot_kernel(int, const Dtype *, const Dtype *, Mtype *) [with Dtype=caffe::float16, Mtype=caffe::float16]" 
(321): here

src/caffe/util/math_functions.cu(352): warning: __shared__ memory variable with non-empty constructor or destructor (potential race between threads)
          detected during instantiation of "void caffe::gpu_asum_kernel(int, const Dtype *, Mtype *) [with Dtype=caffe::float16, Mtype=caffe::float16]" 
(387): here

src/caffe/util/math_functions.cu(284): warning: __shared__ memory variable with non-empty constructor or destructor (potential race between threads)
          detected during instantiation of "void caffe::gpu_dot_kernel(int, const Dtype *, const Dtype *, Mtype *) [with Dtype=caffe::float16, Mtype=caffe::float16]" 
(321): here

src/caffe/util/math_functions.cu(352): warning: __shared__ memory variable with non-empty constructor or destructor (potential race between threads)
          detected during instantiation of "void caffe::gpu_asum_kernel(int, const Dtype *, Mtype *) [with Dtype=caffe::float16, Mtype=caffe::float16]" 
(387): here

NVCC src/caffe/util/im2col.cu
AR -o /home/ubuntu/username/caffe_FP_16/caffe/.build_release/lib/libcaffe-nv.a
LD -o /home/ubuntu/username/caffe_FP_16/caffe/.build_release/lib/libcaffe-nv.so.0.12.0
CXX tools/upgrade_net_proto_text.cpp
CXX/LD -o /home/ubuntu/username/caffe_FP_16/caffe/.build_release/tools/upgrade_net_proto_text.bin
CXX tools/device_query.cpp
CXX/LD -o /home/ubuntu/username/caffe_FP_16/caffe/.build_release/tools/device_query.bin
CXX tools/caffe_fp16.cpp
CXX/LD -o /home/ubuntu/username/caffe_FP_16/caffe/.build_release/tools/caffe_fp16.bin
/home/ubuntu/username/caffe_FP_16/caffe/.build_release/tools/caffe_fp16.o: In function `CopyLayers(caffe::Solver<caffe::float16, float>*, std::string const&)':
caffe_fp16.cpp:(.text+0x19a): undefined reference to `caffe::Net<caffe::float16, float>::CopyTrainedLayersFrom(std::string)'
caffe_fp16.cpp:(.text+0x1f8): undefined reference to `caffe::Net<caffe::float16, float>::CopyTrainedLayersFrom(std::string)'
/home/ubuntu/username/caffe_FP_16/caffe/.build_release/tools/caffe_fp16.o: In function `test()':
caffe_fp16.cpp:(.text+0xa58): undefined reference to `caffe::Net<caffe::float16, float>::Net(std::string const&, caffe::Phase, caffe::Net<caffe::float16, float> const*)'
caffe_fp16.cpp:(.text+0xa6a): undefined reference to `caffe::Net<caffe::float16, float>::CopyTrainedLayersFrom(std::string)'
caffe_fp16.cpp:(.text+0xb04): undefined reference to `caffe::Net<caffe::float16, float>::Forward(std::vector<caffe::Blob<caffe::float16, float>*, std::allocator<caffe::Blob<caffe::float16, float>*> > const&, float*)'
caffe_fp16.cpp:(.text+0xb36): undefined reference to `caffe::Blob<caffe::float16, float>::cpu_data() const'
caffe_fp16.cpp:(.text+0xbf8): undefined reference to `cpu_half2float(__half)'
caffe_fp16.cpp:(.text+0xd6c): undefined reference to `cpu_half2float(__half)'
/home/ubuntu/username/caffe_FP_16/caffe/.build_release/tools/caffe_fp16.o: In function `time()':
caffe_fp16.cpp:(.text+0x12f2): undefined reference to `caffe::Net<caffe::float16, float>::Net(std::string const&, caffe::Phase, caffe::Net<caffe::float16, float> const*)'
caffe_fp16.cpp:(.text+0x132e): undefined reference to `caffe::Net<caffe::float16, float>::Forward(std::vector<caffe::Blob<caffe::float16, float>*, std::allocator<caffe::Blob<caffe::float16, float>*> > const&, float*)'
/home/ubuntu/username/caffe_FP_16/caffe/.build_release/tools/caffe_fp16.o: In function `caffe::Layer<caffe::float16, float>::Forward(std::vector<caffe::Blob<caffe::float16, float>*, std::allocator<caffe::Blob<caffe::float16, float>*> > const&, std::vector<caffe::Blob<caffe::float16, float>*, std::allocator<caffe::Blob<caffe::float16, float>*> > const&)':
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0x10): undefined reference to `caffe::LayerBase::Lock()'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0x56): undefined reference to `cpu_half2float(__half)'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0x88): undefined reference to `cpu_float2half_rn(float)'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0x8e): undefined reference to `cpu_half2float(__half)'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0xa8): undefined reference to `caffe::Blob<caffe::float16, float>::gpu_data() const'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0xb4): undefined reference to `caffe::Blob<caffe::float16, float>::gpu_diff() const'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0xc4): undefined reference to `void caffe::caffe_gpu_dot<caffe::float16, float>(int, caffe::float16 const*, caffe::float16 const*, float*)'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0xfc): undefined reference to `caffe::LayerBase::Unlock()'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0x130): undefined reference to `cpu_half2float(__half)'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0x162): undefined reference to `cpu_float2half_rn(float)'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0x168): undefined reference to `cpu_half2float(__half)'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0x182): undefined reference to `caffe::Blob<caffe::float16, float>::cpu_data() const'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0x18e): undefined reference to `caffe::Blob<caffe::float16, float>::cpu_diff() const'
caffe_fp16.cpp:(.text._ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_[_ZN5caffe5LayerINS_7float16EfE7ForwardERKSt6vectorIPNS_4BlobIS1_fEESaIS6_EESA_]+0x198): undefined reference to `float caffe::caffe_cpu_dot<caffe::float16, float>(int, caffe::float16 const*, caffe::float16 const*)'
/home/ubuntu/username/caffe_FP_16/caffe/.build_release/tools/caffe_fp16.o: In function `main':
caffe_fp16.cpp:(.text.startup+0x54): undefined reference to `caffe::gpu_memory::init(std::vector<int, std::allocator<int> > const&, caffe::gpu_memory::PoolMode, bool)'
caffe_fp16.cpp:(.text.startup+0x6a): undefined reference to `caffe::gpu_memory::destroy()'
caffe_fp16.cpp:(.text.startup+0xb8): undefined reference to `caffe::gpu_memory::destroy()'
caffe_fp16.cpp:(.text.startup+0x16e): undefined reference to `caffe::gpu_memory::destroy()'
collect2: error: ld returned 1 exit status
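
The linker output above repeats the same few missing symbols many times. As a reading aid, here is a minimal Python sketch that collapses such a log into a count per unique symbol. The function name `summarize_undefined` is hypothetical, and the sample lines are copied from the log above; nothing here is part of Caffe or its build system.

```python
import re
from collections import Counter

# Matches GNU ld lines of the form: ... undefined reference to `symbol'
UNDEF = re.compile(r"undefined reference to `(.+)'\s*$")


def summarize_undefined(log_text):
    """Return a Counter mapping each undefined symbol to its occurrence count."""
    counts = Counter()
    for line in log_text.splitlines():
        match = UNDEF.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three lines taken from the log above, plus the final collect2 line.
sample = """\
caffe_fp16.cpp:(.text+0xbf8): undefined reference to `cpu_half2float(__half)'
caffe_fp16.cpp:(.text+0xd6c): undefined reference to `cpu_half2float(__half)'
caffe_fp16.cpp:(.text.startup+0x6a): undefined reference to `caffe::gpu_memory::destroy()'
collect2: error: ld returned 1 exit status
"""

for symbol, n in summarize_undefined(sample).most_common():
    print(n, symbol)
```

Each symbol in the resulting list can then be looked up in the freshly built library (for example with `nm -DC` on the `libcaffe-nv.so` file named in the log) to see whether the library actually exports it.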

Hi,
NVCaffe builds without errors on my side.
My environment was set up by JetPack, including CUDA and cuDNN.

Since fp16 is a new feature in CUDA 8.0, could you check that you are on the right version?
Otherwise, just use JetPack to install CUDA 8.0.
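
One way to check the installed toolkit version is to read the "release X.Y" field printed by `nvcc --version`. The sketch below assumes `nvcc` is on the PATH; the helper names `parse_nvcc_release` and `installed_cuda_release` are hypothetical and only for illustration.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract (major, minor) from nvcc's 'release X.Y' line, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_cuda_release():
    """Run 'nvcc --version' if nvcc is on PATH; return (major, minor) or None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
    return parse_nvcc_release(out)


release = installed_cuda_release()
if release is None:
    print("nvcc not found on PATH, or its output was not recognized")
else:
    print("CUDA release:", release, "-> at least 8.0:", release >= (8, 0))
```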

The build went OK, but when I ran “sh examples/mnist/create_mnist.sh” I got the following errors:

root@tegra-ubuntu:~/caffe_fp16/caffe# sh examples/mnist/create_mnist.sh
Creating lmdb…
F0921 10:59:35.111879 10842 convert_mnist_data.cpp:100] Check failed: mdb_env_open(mdb_env, db_path, 0, 0664) == 0 (12 vs. 0) mdb_env_open failed
*** Check failure stack trace: ***
@ 0x7f9a272718 google::LogMessage::Fail()
@ 0x7f9a274614 google::LogMessage::SendToLog()
@ 0x7f9a272290 google::LogMessage::Flush()
@ 0x7f9a274eb4 google::LogMessageFatal::~LogMessageFatal()
@ 0x403ca8 convert_dataset()
@ 0x402a4c main
@ 0x7f99daf8a0 __libc_start_main
Aborted
F0921 10:59:35.231420 10843 convert_mnist_data.cpp:100] Check failed: mdb_env_open(mdb_env, db_path, 0, 0664) == 0 (12 vs. 0) mdb_env_open failed
*** Check failure stack trace: ***
@ 0x7f9366f718 google::LogMessage::Fail()
@ 0x7f93671614 google::LogMessage::SendToLog()
@ 0x7f9366f290 google::LogMessage::Flush()
@ 0x7f93671eb4 google::LogMessageFatal::~LogMessageFatal()
@ 0x403ca8 convert_dataset()
@ 0x402a4c main
@ 0x7f931ac8a0 __libc_start_main
Aborted
Done.
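
A note on reading the check failure above: LMDB reserves negative values for its own error codes and passes positive values through from the operating system, so the 12 in "(12 vs. 0)" can be looked up as an errno. A minimal Python sketch of that lookup (Python is used here only for illustration):

```python
import errno
import os

# Value reported by the failed check: mdb_env_open(...) == 0 (12 vs. 0)
code = 12

# errno.errorcode maps numeric errno values to their symbolic names.
name = errno.errorcode.get(code, "unknown")
print(code, name, "-", os.strerror(code))
```

On Linux this resolves to ENOMEM, meaning the requested memory map could not be created, which is consistent with a fix that changes the LMDB map size.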

Could you provide your environment?

I can successfully compile Caffe on the L4T 24-2 release with CUDA 8.0.

Did you successfully compile caffe_fp16?

These instructions could help for building on the TX1:

After building, when I run “sh examples/mnist/create_mnist.sh”, I get error messages about lmdb. The fix is to change LMDB_MAP_SIZE in caffe_fp16/caffe/include/caffe/util/db_lmdb.hpp. After that, rebuild Caffe and it works.
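
To illustrate why a smaller LMDB_MAP_SIZE can help, here is a rough arithmetic sketch. It rests on two assumptions that this thread does not confirm: that the branch keeps upstream Caffe's default map size of 1 TB (1099511627776 bytes), and that the TX1 kernel gives a process a 39-bit virtual address space. The helper `fits` is hypothetical.

```python
GIB = 1 << 30

# Assumption: default map size as in upstream Caffe (1 TB).
default_map_size = 1099511627776

# Assumption: 39-bit user virtual address space on the TX1 kernel.
va_bits = 39
addressable = 1 << va_bits


def fits(map_size, addressable_bytes):
    """A memory map must be strictly smaller than the address space it lives in."""
    return map_size < addressable_bytes


print("address space (GiB):", addressable // GIB)
for candidate in (default_map_size, 256 * GIB, 1 * GIB):
    print(candidate // GIB, "GiB fits:", fits(candidate, addressable))
```

Under these assumptions the default cannot be mapped at all, while any value comfortably below the address-space limit and larger than the dataset would work. The actual limits should be checked on the board itself.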

May I know the right value for LMDB_MAP_SIZE? Thank you.

Hi Jianbin,

I have answered your question on topic 999122, please check.
https://devtalk.nvidia.com/default/topic/999122/jetson-tx1/error-of-building-caffe-fp16-in-jetson-tx1-/?offset=4#5112149

Thanks.