Good day,
NVIDIA Jetson-AGX
L4T 32.5.0 [ JetPack 4.5 ]
Ubuntu 18.04.6 LTS
Kernel Version: 4.9.201-tegra
CUDA 10.2.89
CUDA Architecture: 7.2
OpenCV version: 4.1.1
OpenCV Cuda: NO
CUDNN: 8.0.0.180
TensorRT: 7.1.3.0
Vision Works: 1.6.0.501
VPI: 1.0.15
Vulkan: 1.2.70
I would like to use PyTorch on my Jetson AGX board. I am working with Python 2.7, so the latest PyTorch version I can use is 1.4. However, I couldn't find a wheel matching my Jetson AGX configuration (CUDA 10.2), so I decided to build it from source. I followed the PyTorch for Jetson topic to compile the wheel for my case.
I used the following setting for TORCH_CUDA_ARCH_LIST:
export TORCH_CUDA_ARCH_LIST="10.2"
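(Looking at the board information above, the CUDA Architecture is reported as 7.2, so I am wondering whether this variable is supposed to contain the GPU compute capability rather than the CUDA toolkit version — something like the line below, but I am not sure:)
export TORCH_CUDA_ARCH_LIST="7.2"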
At the step:
python setup.py bdist_wheel
I got the following error:
Building wheel torch-1.4.0
-- Building version 1.4.0
cmake -GNinja -DBUILD_PYTHON=True -DBUILD_TEST=True -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/home/nvidia/pytorch1.4/pytorch/torch -DCMAKE_PREFIX_PATH=/home/nvidia/catkin_ws3/devel:/opt/ros/melodic -DNUMPY_INCLUDE_DIR=/usr/lib/python2.7/dist-packages/numpy/core/include -DPYTHON_EXECUTABLE=/usr/bin/python -DPYTHON_INCLUDE_DIR=/usr/include/python2.7 -DPYTHON_LIBRARY=/usr/lib/libpython2.7.so.1.0 -DTORCH_BUILD_VERSION=1.4.0 -DUSE_DISTRIBUTED=0 -DUSE_NCCL=0 -DUSE_NUMPY=True -DUSE_PYTORCH_QNNPACK=0 -DUSE_QNNPACK=0 /home/nvidia/pytorch1.4/pytorch
-- std::exception_ptr is supported.
-- NUMA is available
-- Building using own protobuf under third_party per request.
-- Use custom protobuf build.
-- Caffe2 protobuf include directory: $<BUILD_INTERFACE:/home/nvidia/pytorch1.4/pytorch/third_party/protobuf/src>$<INSTALL_INTERFACE:include>
-- Trying to find preferred BLAS backend of choice: MKL
-- MKL_THREADING = OMP
-- MKL_THREADING = OMP
CMake Warning at cmake/Dependencies.cmake:141 (message):
MKL could not be found. Defaulting to Eigen
Call Stack (most recent call first):
CMakeLists.txt:380 (include)
CMake Warning at cmake/Dependencies.cmake:160 (message):
Preferred BLAS (MKL) cannot be found, now searching for a general BLAS
library
Call Stack (most recent call first):
CMakeLists.txt:380 (include)
-- MKL_THREADING = OMP
-- Checking for [mkl_intel_lp64 - mkl_gnu_thread - mkl_core - gomp - pthread - m - dl]
-- Library mkl_intel_lp64: not found
-- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - gomp - pthread - m - dl]
-- Library mkl_intel_lp64: not found
-- Checking for [mkl_intel - mkl_gnu_thread - mkl_core - gomp - pthread - m - dl]
-- Library mkl_intel: not found
-- Checking for [mkl_intel - mkl_intel_thread - mkl_core - gomp - pthread - m - dl]
-- Library mkl_intel: not found
-- Checking for [mkl_gf_lp64 - mkl_gnu_thread - mkl_core - gomp - pthread - m - dl]
-- Library mkl_gf_lp64: not found
-- Checking for [mkl_gf_lp64 - mkl_intel_thread - mkl_core - gomp - pthread - m - dl]
-- Library mkl_gf_lp64: not found
-- Checking for [mkl_gf - mkl_gnu_thread - mkl_core - gomp - pthread - m - dl]
-- Library mkl_gf: not found
-- Checking for [mkl_gf - mkl_intel_thread - mkl_core - gomp - pthread - m - dl]
-- Library mkl_gf: not found
-- Checking for [mkl_intel_lp64 - mkl_gnu_thread - mkl_core - iomp5 - pthread - m - dl]
-- Library mkl_intel_lp64: not found
-- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - iomp5 - pthread - m - dl]
-- Library mkl_intel_lp64: not found
-- Checking for [mkl_intel - mkl_gnu_thread - mkl_core - iomp5 - pthread - m - dl]
-- Library mkl_intel: not found
-- Checking for [mkl_intel - mkl_intel_thread - mkl_core - iomp5 - pthread - m - dl]
-- Library mkl_intel: not found
-- Checking for [mkl_gf_lp64 - mkl_gnu_thread - mkl_core - iomp5 - pthread - m - dl]
-- Library mkl_gf_lp64: not found
-- Checking for [mkl_gf_lp64 - mkl_intel_thread - mkl_core - iomp5 - pthread - m - dl]
-- Library mkl_gf_lp64: not found
-- Checking for [mkl_gf - mkl_gnu_thread - mkl_core - iomp5 - pthread - m - dl]
-- Library mkl_gf: not found
-- Checking for [mkl_gf - mkl_intel_thread - mkl_core - iomp5 - pthread - m - dl]
-- Library mkl_gf: not found
-- Checking for [mkl_intel_lp64 - mkl_gnu_thread - mkl_core - pthread - m - dl]
-- Library mkl_intel_lp64: not found
-- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - pthread - m - dl]
-- Library mkl_intel_lp64: not found
-- Checking for [mkl_intel - mkl_gnu_thread - mkl_core - pthread - m - dl]
-- Library mkl_intel: not found
-- Checking for [mkl_intel - mkl_intel_thread - mkl_core - pthread - m - dl]
-- Library mkl_intel: not found
-- Checking for [mkl_gf_lp64 - mkl_gnu_thread - mkl_core - pthread - m - dl]
-- Library mkl_gf_lp64: not found
-- Checking for [mkl_gf_lp64 - mkl_intel_thread - mkl_core - pthread - m - dl]
-- Library mkl_gf_lp64: not found
-- Checking for [mkl_gf - mkl_gnu_thread - mkl_core - pthread - m - dl]
-- Library mkl_gf: not found
-- Checking for [mkl_gf - mkl_intel_thread - mkl_core - pthread - m - dl]
-- Library mkl_gf: not found
-- Checking for [mkl_intel_lp64 - mkl_sequential - mkl_core - m - dl]
-- Library mkl_intel_lp64: not found
-- Checking for [mkl_intel - mkl_sequential - mkl_core - m - dl]
-- Library mkl_intel: not found
-- Checking for [mkl_gf_lp64 - mkl_sequential - mkl_core - m - dl]
-- Library mkl_gf_lp64: not found
-- Checking for [mkl_gf - mkl_sequential - mkl_core - m - dl]
-- Library mkl_gf: not found
-- Checking for [mkl_intel_lp64 - mkl_core - gomp - pthread - m - dl]
-- Library mkl_intel_lp64: not found
-- Checking for [mkl_intel - mkl_core - gomp - pthread - m - dl]
-- Library mkl_intel: not found
-- Checking for [mkl_gf_lp64 - mkl_core - gomp - pthread - m - dl]
-- Library mkl_gf_lp64: not found
-- Checking for [mkl_gf - mkl_core - gomp - pthread - m - dl]
-- Library mkl_gf: not found
-- Checking for [mkl_intel_lp64 - mkl_core - iomp5 - pthread - m - dl]
-- Library mkl_intel_lp64: not found
-- Checking for [mkl_intel - mkl_core - iomp5 - pthread - m - dl]
-- Library mkl_intel: not found
-- Checking for [mkl_gf_lp64 - mkl_core - iomp5 - pthread - m - dl]
-- Library mkl_gf_lp64: not found
-- Checking for [mkl_gf - mkl_core - iomp5 - pthread - m - dl]
-- Library mkl_gf: not found
-- Checking for [mkl_intel_lp64 - mkl_core - pthread - m - dl]
-- Library mkl_intel_lp64: not found
-- Checking for [mkl_intel - mkl_core - pthread - m - dl]
-- Library mkl_intel: not found
-- Checking for [mkl_gf_lp64 - mkl_core - pthread - m - dl]
-- Library mkl_gf_lp64: not found
-- Checking for [mkl_gf - mkl_core - pthread - m - dl]
-- Library mkl_gf: not found
-- Checking for [mkl - guide - pthread - m]
-- Library mkl: not found
-- MKL library not found
-- Checking for [Accelerate]
-- Library Accelerate: BLAS_Accelerate_LIBRARY-NOTFOUND
-- Checking for [vecLib]
-- Library vecLib: BLAS_vecLib_LIBRARY-NOTFOUND
-- Checking for [openblas]
-- Library openblas: /usr/lib/aarch64-linux-gnu/libopenblas.so
-- Found a library with BLAS API (open).
-- Brace yourself, we are building NNPACK
-- NNPACK backend is neon
-- Found PythonInterp: /usr/bin/python (found version "2.7.17")
-- git Version: v1.4.0-505be96a
-- Version: 1.4.0
-- Performing Test HAVE_STD_REGEX -- success
-- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
-- Performing Test HAVE_POSIX_REGEX -- success
-- Performing Test HAVE_STEADY_CLOCK -- success
-- Found Numa (include: /usr/include, library: /usr/lib/aarch64-linux-gnu/libnuma.so)
-- Using third party subdirectory Eigen.
Python 2.7.17
-- Found PythonInterp: /usr/bin/python (found suitable version "2.7.17", minimum required is "2.7")
-- Could NOT find pybind11 (missing: pybind11_DIR)
-- Could NOT find pybind11 (missing: pybind11_INCLUDE_DIR)
-- Using third_party/pybind11.
-- pybind11 include dirs: /home/nvidia/pytorch1.4/pytorch/cmake/../third_party/pybind11/include
-- Adding OpenMP CXX_FLAGS: -fopenmp
-- No OpenMP library needs to be linked against
-- Caffe2: CUDA detected: 10.2
-- Caffe2: CUDA nvcc is: /usr/local/cuda-10.2/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda-10.2
-- Caffe2: Header version is: 10.2
-- Found cuDNN: v7.6.5 (include: /usr/local/cuda-10.2/include, library: /usr/local/cuda-10.2/lib64/libcudnn.so)
CMake Warning at cmake/public/utils.cmake:196 (message):
In the future we will require one to explicitly pass TORCH_CUDA_ARCH_LIST
to cmake instead of implicitly setting it as an env variable. This will
become a FATAL_ERROR in future version of pytorch.
Call Stack (most recent call first):
cmake/public/cuda.cmake:395 (torch_cuda_get_nvcc_gencode_flag)
cmake/Dependencies.cmake:903 (include)
CMakeLists.txt:380 (include)
CMake Error at cmake/Modules_CUDA_fix/upstream/FindCUDA/select_compute_arch.cmake:215 (message):
Unknown CUDA Architecture Name 10.2 in CUDA_SELECT_NVCC_ARCH_FLAGS
Call Stack (most recent call first):
cmake/public/utils.cmake:212 (cuda_select_nvcc_arch_flags)
cmake/public/cuda.cmake:395 (torch_cuda_get_nvcc_gencode_flag)
cmake/Dependencies.cmake:903 (include)
CMakeLists.txt:380 (include)
CMake Error at cmake/Modules_CUDA_fix/upstream/FindCUDA/select_compute_arch.cmake:219 (message):
arch_bin wasn't set for some reason
Call Stack (most recent call first):
cmake/public/utils.cmake:212 (cuda_select_nvcc_arch_flags)
cmake/public/cuda.cmake:395 (torch_cuda_get_nvcc_gencode_flag)
cmake/Dependencies.cmake:903 (include)
CMakeLists.txt:380 (include)
-- Added CUDA NVCC flags for:
-- Could NOT find CUB (missing: CUB_INCLUDE_DIR)
Generated: /home/nvidia/pytorch1.4/pytorch/build/third_party/onnx/onnx/onnx_onnx_torch-ml.proto
Generated: /home/nvidia/pytorch1.4/pytorch/build/third_party/onnx/onnx/onnx-operators_onnx_torch-ml.proto
--
-- ******** Summary ********
-- CMake version : 3.10.2
-- CMake command : /usr/bin/cmake
-- System : Linux
-- C++ compiler : /usr/bin/c++
-- C++ compiler version : 7.5.0
-- CXX flags : -fvisibility-inlines-hidden -fopenmp -Wnon-virtual-dtor
-- Build type : Release
-- Compile definitions : ONNX_ML=1
-- CMAKE_PREFIX_PATH : /home/nvidia/catkin_ws3/devel:/opt/ros/melodic;/usr/local/cuda-10.2
-- CMAKE_INSTALL_PREFIX : /home/nvidia/pytorch1.4/pytorch/torch
-- CMAKE_MODULE_PATH : /home/nvidia/pytorch1.4/pytorch/cmake/Modules;/home/nvidia/pytorch1.4/pytorch/cmake/public/../Modules_CUDA_fix
--
-- ONNX version : 1.6.0
-- ONNX NAMESPACE : onnx_torch
-- ONNX_BUILD_TESTS : OFF
-- ONNX_BUILD_BENCHMARKS : OFF
-- ONNX_USE_LITE_PROTO : OFF
-- ONNXIFI_DUMMY_BACKEND : OFF
-- ONNXIFI_ENABLE_EXT : OFF
--
-- Protobuf compiler :
-- Protobuf includes :
-- Protobuf libraries :
-- BUILD_ONNX_PYTHON : OFF
--
-- ******** Summary ********
-- CMake version : 3.10.2
-- CMake command : /usr/bin/cmake
-- System : Linux
-- C++ compiler : /usr/bin/c++
-- C++ compiler version : 7.5.0
-- CXX flags : -fvisibility-inlines-hidden -fopenmp -Wnon-virtual-dtor
-- Build type : Release
-- Compile definitions : ONNX_ML=1
-- CMAKE_PREFIX_PATH : /home/nvidia/catkin_ws3/devel:/opt/ros/melodic;/usr/local/cuda-10.2
-- CMAKE_INSTALL_PREFIX : /home/nvidia/pytorch1.4/pytorch/torch
-- CMAKE_MODULE_PATH : /home/nvidia/pytorch1.4/pytorch/cmake/Modules;/home/nvidia/pytorch1.4/pytorch/cmake/public/../Modules_CUDA_fix
--
-- ONNX version : 1.4.1
-- ONNX NAMESPACE : onnx_torch
-- ONNX_BUILD_TESTS : OFF
-- ONNX_BUILD_BENCHMARKS : OFF
-- ONNX_USE_LITE_PROTO : OFF
-- ONNXIFI_DUMMY_BACKEND : OFF
--
-- Protobuf compiler :
-- Protobuf includes :
-- Protobuf libraries :
-- BUILD_ONNX_PYTHON : OFF
-- Found CUDA with FP16 support, compiling with torch.cuda.HalfTensor
-- Removing -DNDEBUG from compile flags
-- MAGMA not found. Compiling without MAGMA support
-- Could not find hardware support for NEON on this machine.
-- No OMAP3 processor on this machine.
-- No OMAP4 processor on this machine.
-- asimd/Neon found with compiler flag : -D__NEON__
-- Found a library with LAPACK API (open).
disabling ROCM because NOT USE_ROCM is set
-- MIOpen not found. Compiling without MIOpen support
disabling MKLDNN because USE_MKLDNN is not set
-- GCC 7.5.0: Adding gcc and gcc_s libs to link line
-- NUMA paths:
-- /usr/include
-- /usr/lib/aarch64-linux-gnu/libnuma.so
-- Found OpenMP_C: -fopenmp
-- Found OpenMP_CXX: -fopenmp
-- Found OpenMP: TRUE
-- Configuring build for SLEEF-v3.4.0
Target system: Linux-4.9.201-tegra
Target processor: aarch64
Host system: Linux-4.9.201-tegra
Host processor: aarch64
Detected C compiler: GNU @ /usr/bin/cc
-- Using option `-Wall -Wno-unused -Wno-attributes -Wno-unused-result -Wno-psabi -ffp-contract=off -fno-math-errno -fno-trapping-math` to compile libsleef
-- Building shared libs : OFF
-- MPFR : LIB_MPFR-NOTFOUND
-- GMP : LIBGMP-NOTFOUND
-- RT : /usr/lib/aarch64-linux-gnu/librt.so
-- FFTW3 : LIBFFTW3-NOTFOUND
-- OPENSSL : 1.1.1
-- SDE : SDE_COMMAND-NOTFOUND
-- RUNNING_ON_TRAVIS : 0
-- COMPILER_SUPPORTS_OPENMP : 1
AT_INSTALL_INCLUDE_DIR include/ATen/core
core header install: /home/nvidia/pytorch1.4/pytorch/build/aten/src/ATen/core/TensorBody.h
core header install: /home/nvidia/pytorch1.4/pytorch/build/aten/src/ATen/core/TensorMethods.h
-- NCCL operators skipped due to no CUDA support
-- Excluding ideep operators as we are not using ideep
-- Excluding image processing operators due to no opencv
-- Excluding video processing operators due to no opencv
-- MPI operators skipped due to no MPI support
-- Include Observer library
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cudnn/AffineGridGenerator.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cudnn/BatchNorm.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cudnn/Conv.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cudnn/GridSampler.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cudnn/LossCTC.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cudnn/RNN.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/miopen/BatchNorm_miopen.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/miopen/Conv_miopen.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/miopen/RNN_miopen.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/sparse/cuda/SparseCUDATensor.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/cuda/CUDABlas.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/cuda/CUDAContext.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/cuda/CUDAGenerator.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/cuda/CuSparseHandlePool.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/cuda/CublasHandlePool.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/cuda/PinnedMemoryAllocator.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/cuda/detail/CUDAHooks.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/CUDAUnaryOps.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/LegacyDefinitions.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/TensorShapeCUDA.cpp
-- /home/nvidia/pytorch1.4/pytorch/build/aten/src/ATen/CUDAType.cpp
-- /home/nvidia/pytorch1.4/pytorch/build/aten/src/ATen/CUDAType.h
-- /home/nvidia/pytorch1.4/pytorch/build/aten/src/ATen/LegacyTHFunctionsCUDA.cpp
-- /home/nvidia/pytorch1.4/pytorch/build/aten/src/ATen/LegacyTHFunctionsCUDA.h
-- /home/nvidia/pytorch1.4/pytorch/build/aten/src/ATen/SparseCUDAType.cpp
-- /home/nvidia/pytorch1.4/pytorch/build/aten/src/ATen/SparseCUDAType.h
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCCachingHostAllocator.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCGeneral.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCStorageCopy.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensor.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCReduceApplyUtils.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCBlas.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCSleep.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCStorage.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCStorageCopy.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensor.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorCopy.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorMath.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorMathBlas.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorMathMagma.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorMathPairwise.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorMathReduce.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorMathScan.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorIndex.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorRandom.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorScatterGather.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorTopK.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorSort.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCSortUtils.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/THCTensorMode.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorSortByte.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathPointwiseByte.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathReduceByte.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMaskedByte.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorSortChar.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathPointwiseChar.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathReduceChar.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMaskedChar.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorSortShort.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathPointwiseShort.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathReduceShort.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMaskedShort.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorSortInt.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathPointwiseInt.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathReduceInt.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMaskedInt.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorSortLong.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathPointwiseLong.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathReduceLong.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMaskedLong.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorSortHalf.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathPointwiseHalf.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathReduceHalf.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMaskedHalf.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorSortFloat.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathPointwiseFloat.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathReduceFloat.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMaskedFloat.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorSortDouble.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathPointwiseDouble.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathReduceDouble.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMaskedDouble.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathPointwiseBool.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathReduceBool.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMaskedBool.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMathReduceBFloat16.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THC/generated/THCTensorMaskedBFloat16.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/BCECriterion.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/ELU.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/GatedLinearUnit.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/HardTanh.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/LeakyReLU.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/LogSigmoid.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/MultiLabelMarginCriterion.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/MultiMarginCriterion.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/RReLU.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/SoftMarginCriterion.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/SoftPlus.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/SoftShrink.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/SpatialConvolutionMM.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/SpatialDepthwiseConvolution.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/THCUNN/Tanh.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/cuda/detail/IndexUtils.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Activation.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/AdaptiveAveragePooling.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/AdaptiveAveragePooling3d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/AdaptiveMaxPooling2d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/AdaptiveMaxPooling3d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/AveragePool2d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/AveragePool3d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/BatchLinearAlgebra.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/BinaryArithmeticKernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/BinaryCompareKernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/BinaryMiscOpsKernels.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/CUDAScalar.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Col2Im.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Copy.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/CrossKernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/DilatedMaxPool2d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/DilatedMaxPool3d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/DistanceKernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Distributions.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Dropout.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Embedding.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/EmbeddingBackwardKernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/EmbeddingBag.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/FillKernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/FractionalMaxPool2d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/FractionalMaxPool3d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/GridSampler.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Im2Col.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Indexing.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Lerp.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/LinearAlgebra.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Loss.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/LossCTC.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/MaxUnpooling.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/NaiveConvolutionTranspose2d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/NaiveConvolutionTranspose3d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/NaiveDilatedConvolution.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Normalization.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/PointwiseOpsKernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/PowKernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/RNN.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/RangeFactories.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Reduce.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/ReduceOpsKernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/ReflectionPad.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Repeat.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/ReplicationPadding.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Resize.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/SoftMax.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/SortingKthValue.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/SparseMM.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/SpectralOps.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/SummaryOps.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/TensorCompare.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/TensorFactories.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/TensorTransformations.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/TriangularOps.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/UnaryOpsKernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/Unique.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/UpSampleBicubic2d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/UpSampleBilinear2d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/UpSampleLinear1d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/UpSampleNearest1d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/UpSampleNearest2d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/UpSampleNearest3d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/UpSampleTrilinear3d.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/WeightNorm.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/cuda/layer_norm_kernel.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/sparse/cuda/SparseCUDABlas.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/sparse/cuda/SparseCUDATensor.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/sparse/cuda/SparseCUDATensorMath.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/quantized/cuda/fake_quantize_core.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/quantized/cuda/fake_quantize_per_channel_affine.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/native/quantized/cuda/fake_quantize_per_tensor_affine.cu
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/cudnn/Descriptors.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/cudnn/Handle.cpp
-- /home/nvidia/pytorch1.4/pytorch/aten/src/ATen/cudnn/Types.cpp
-- /home/nvidia/pytorch1.4/pytorch/caffe2/core/common_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/core/blob_serialization_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/core/common_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/core/event_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/core/context_gpu.cu
-- utils/math/broadcast.cu
-- utils/math/elementwise.cu
-- utils/math/reduce.cu
-- utils/math/transpose.cu
-- utils/math_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/contrib/aten/aten_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/db/create_db_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/distributed/file_store_handler_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/conv_op_cache_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/conv_op_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/conv_transpose_op_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/dropout_op_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/elu_op_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/local_response_normalization_op_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/order_switch_ops_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/pool_op_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/sigmoid_op_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/softmax_op_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/tanh_op_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/transpose_op_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/depthwise_3x3_conv_op_cudnn.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/spatial_batch_norm_op_cudnn.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/communicator_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/concat_split_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/conv_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/conv_op_shared_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/conv_transpose_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/counter_ops_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/do_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/elementwise_add_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/elementwise_sub_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/exp_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/expand_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/expand_squeeze_dims_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/free_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/fully_connected_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/if_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/im2col_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/load_save_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/locally_connected_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/log_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/matmul_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/negate_gradient_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/negative_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/order_switch_ops_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/prepend_dim_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/reshape_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/scale_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/shape_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/sqr_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/sqrt_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/stop_gradient_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/tensor_protos_db_input_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/while_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/zero_gradient_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/abs_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/accumulate_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/accuracy_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/acos_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/affine_channel_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/alias_with_name.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/arg_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/asin_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/assert_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/atan_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/batch_gather_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/batch_matmul_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/batch_moments_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/batch_permutation_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/batch_sparse_to_dense_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/boolean_mask_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/boolean_unmask_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/bucketize_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/cast_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/cbrt_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/ceil_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/channel_backprop_stats_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/channel_shuffle_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/channel_stats_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/clip_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/copy_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/cos_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/cosh_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/cosine_embedding_criterion_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/cross_entropy_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/cube_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/data_couple_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/deform_conv_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/distance_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/dropout_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/elementwise_div_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/elementwise_linear_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/elementwise_mul_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/elementwise_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/elu_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/enforce_finite_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/ensure_cpu_output_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/erf_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/filler_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/find_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/floor_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/gather_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/gelu_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/generate_proposals_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/generate_proposals_op_util_nms_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/given_tensor_byte_string_to_uint8_fill_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/given_tensor_fill_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/glu_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/group_norm_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/gru_unit_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/half_float_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/hard_sigmoid_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/instance_norm_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/integral_image_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/layer_norm_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/leaky_relu_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/lengths_pad_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/lengths_tile_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/local_response_normalization_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/logit_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/loss_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/lp_pool_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/lstm_unit_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/margin_ranking_criterion_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/max_pool_with_index.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/mean_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/mem_query_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/minmax_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/moments_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/multi_class_accuracy_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/normalize_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/one_hot_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/pack_segments.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/pad_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/perplexity_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/piecewise_linear_transform_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/pool_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/pow_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/prelu_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/reciprocal_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/reduce_front_back_max_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/reduce_front_back_sum_mean_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/reduce_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/reduction_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/relu_n_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/relu_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/replace_nan_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/resize_3d_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/resize_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/reverse_packed_segs_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/rmac_regions_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/roi_align_gradient_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/roi_align_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/roi_align_rotated_gradient_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/roi_align_rotated_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/roi_pool_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/rsqrt_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/scale_blobs_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/segment_reduction_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/selu_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/sequence_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/sigmoid_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/sin_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/sinh_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/slice_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/softmax_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/softplus_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/softsign_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/space_batch_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/sparse_normalize_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/sparse_to_dense_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/spatial_batch_norm_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/stump_func_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/summarize_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/swish_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/tan_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/tanh_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/thresholded_relu_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/tile_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/top_k.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/transpose_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/unique_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/upsample_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/utility_ops.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/weighted_sample_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/rnn/recurrent_op_cudnn.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/rnn/recurrent_network_blob_fetcher_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/rnn/recurrent_network_executor_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/operators/rnn/recurrent_network_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/queue/queue_ops_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/sgd/iter_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/sgd/learning_rate_op_gpu.cc
-- /home/nvidia/pytorch1.4/pytorch/caffe2/sgd/adadelta_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/sgd/adagrad_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/sgd/adam_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/sgd/fp16_momentum_sgd_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/sgd/fp32_momentum_sgd_op.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/sgd/lars_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/sgd/momentum_sgd_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/sgd/rmsprop_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/sgd/yellowfin_op_gpu.cu
-- /home/nvidia/pytorch1.4/pytorch/caffe2/../torch/csrc/jit/fuser/cuda/fused_kernel.cpp
-- /home/nvidia/pytorch1.4/pytorch/caffe2/../torch/csrc/autograd/profiler_cuda.cpp
-- /home/nvidia/pytorch1.4/pytorch/caffe2/../torch/csrc/autograd/functions/comm.cpp
-- /home/nvidia/pytorch1.4/pytorch/caffe2/../torch/csrc/cuda/comm.cpp
-- /usr/bin/c++ /home/nvidia/pytorch1.4/pytorch/caffe2/../torch/abi-check.cpp -o /home/nvidia/pytorch1.4/pytorch/build/abi-check
-- Determined _GLIBCXX_USE_CXX11_ABI=1
-- pytorch is compiling with OpenMP.
OpenMP CXX_FLAGS: -fopenmp.
OpenMP libraries: /usr/lib/gcc/aarch64-linux-gnu/7/libgomp.so;/usr/lib/aarch64-linux-gnu/libpthread.so.
-- Caffe2 is compiling with OpenMP.
OpenMP CXX_FLAGS: -fopenmp.
OpenMP libraries: /usr/lib/gcc/aarch64-linux-gnu/7/libgomp.so;/usr/lib/aarch64-linux-gnu/libpthread.so.
-- Using ATen parallel backend: OMP
-- Using lib/python2.7/dist-packages as python relative installation path
CMake Warning at CMakeLists.txt:583 (message):
Generated cmake files are only fully tested if one builds with system glog,
gflags, and protobuf. Other settings may generate files that are not well
tested.
--
-- ******** Summary ********
-- General:
-- CMake version : 3.10.2
-- CMake command : /usr/bin/cmake
-- System : Linux
-- C++ compiler : /usr/bin/c++
-- C++ compiler id : GNU
-- C++ compiler version : 7.5.0
-- BLAS : MKL
-- CXX flags : -fvisibility-inlines-hidden -fopenmp -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow
-- Build type : Release
-- Compile definitions : ONNX_ML=1;ONNX_NAMESPACE=onnx_torch;HAVE_MMAP=1;_FILE_OFFSET_BITS=64;HAVE_SHM_OPEN=1;HAVE_SHM_UNLINK=1;HAVE_MALLOC_USABLE_SIZE=1
-- CMAKE_PREFIX_PATH : /home/nvidia/catkin_ws3/devel:/opt/ros/melodic;/usr/local/cuda-10.2
-- CMAKE_INSTALL_PREFIX : /home/nvidia/pytorch1.4/pytorch/torch
--
-- TORCH_VERSION : 1.4.0
-- CAFFE2_VERSION : 1.4.0
-- BUILD_CAFFE2_MOBILE : ON
-- USE_STATIC_DISPATCH : OFF
-- BUILD_BINARY : OFF
-- BUILD_CUSTOM_PROTOBUF : ON
-- Link local protobuf : ON
-- BUILD_DOCS : OFF
-- BUILD_PYTHON : True
-- Python version : 2.7.17
-- Python executable : /usr/bin/python
-- Pythonlibs version : 2.7.17
-- Python library : /usr/lib/libpython2.7.so.1.0
-- Python includes : /usr/include/python2.7
-- Python site-packages: lib/python2.7/dist-packages
-- BUILD_CAFFE2_OPS : ON
-- BUILD_SHARED_LIBS : ON
-- BUILD_TEST : True
-- BUILD_JNI : OFF
-- INTERN_BUILD_MOBILE :
-- USE_ASAN : OFF
-- USE_CUDA : ON
-- CUDA static link : OFF
-- USE_CUDNN : ON
-- CUDA version : 10.2
-- cuDNN version : 7.6.5
-- CUDA root directory : /usr/local/cuda-10.2
-- CUDA library : /usr/local/cuda-10.2/lib64/stubs/libcuda.so
-- cudart library : /usr/local/cuda-10.2/lib64/libcudart.so
-- cublas library : /usr/lib/aarch64-linux-gnu/libcublas.so
-- cufft library : /usr/local/cuda-10.2/lib64/libcufft.so
-- curand library : /usr/local/cuda-10.2/lib64/libcurand.so
-- cuDNN library : /usr/local/cuda-10.2/lib64/libcudnn.so
-- nvrtc : /usr/local/cuda-10.2/lib64/libnvrtc.so
-- CUDA include path : /usr/local/cuda-10.2/include
-- NVCC executable : /usr/local/cuda-10.2/bin/nvcc
-- CUDA host compiler : /usr/bin/cc
-- USE_TENSORRT : OFF
-- USE_ROCM : OFF
-- USE_EIGEN_FOR_BLAS : ON
-- USE_FBGEMM : OFF
-- USE_FFMPEG : OFF
-- USE_GFLAGS : OFF
-- USE_GLOG : OFF
-- USE_LEVELDB : OFF
-- USE_LITE_PROTO : OFF
-- USE_LMDB : OFF
-- USE_METAL : OFF
-- USE_MKL : OFF
-- USE_MKLDNN : OFF
-- USE_NCCL : 0
-- USE_NNPACK : ON
-- USE_NUMPY : ON
-- USE_OBSERVERS : ON
-- USE_OPENCL : OFF
-- USE_OPENCV : OFF
-- USE_OPENMP : ON
-- USE_TBB : OFF
-- USE_PROF : OFF
-- USE_QNNPACK : 0
-- USE_REDIS : OFF
-- USE_ROCKSDB : OFF
-- USE_ZMQ : OFF
-- USE_DISTRIBUTED : 0
-- BUILD_NAMEDTENSOR : OFF
-- Public Dependencies : Threads::Threads
-- Private Dependencies : nnpack;cpuinfo;/usr/lib/aarch64-linux-gnu/libnuma.so;fp16;aten_op_header_gen;foxi_loader;rt;gcc_s;gcc;dl
-- Configuring incomplete, errors occurred!
See also "/home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeOutput.log".
See also "/home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeError.log".
Traceback (most recent call last):
File "setup.py", line 755, in <module>
build_deps()
File "setup.py", line 316, in build_deps
cmake=cmake)
File "/home/nvidia/pytorch1.4/pytorch/tools/build_pytorch_libs.py", line 59, in build_caffe2
rerun_cmake)
File "/home/nvidia/pytorch1.4/pytorch/tools/setup_helpers/cmake.py", line 319, in generate
self.run(args, env=my_env)
File "/home/nvidia/pytorch1.4/pytorch/tools/setup_helpers/cmake.py", line 141, in run
check_call(command, cwd=self.build_dir, env=env)
File "/usr/lib/python2.7/subprocess.py", line 190, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '-GNinja', '-DBUILD_PYTHON=True', '-DBUILD_TEST=True', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INSTALL_PREFIX=/home/nvidia/pytorch1.4/pytorch/torch', '-DCMAKE_PREFIX_PATH=/home/nvidia/catkin_ws3/devel:/opt/ros/melodic', '-DNUMPY_INCLUDE_DIR=/usr/lib/python2.7/dist-packages/numpy/core/include', '-DPYTHON_EXECUTABLE=/usr/bin/python', '-DPYTHON_INCLUDE_DIR=/usr/include/python2.7', '-DPYTHON_LIBRARY=/usr/lib/libpython2.7.so.1.0', '-DTORCH_BUILD_VERSION=1.4.0', '-DUSE_DISTRIBUTED=0', '-DUSE_NCCL=0', '-DUSE_NUMPY=True', '-DUSE_PYTORCH_QNNPACK=0', '-DUSE_QNNPACK=0', '/home/nvidia/pytorch1.4/pytorch']' returned non-zero exit status 1
The contents of CMakeError.log are:
Performing C++ SOURCE FILE Test CAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_851a0"
[1/2] Building CXX object CMakeFiles/cmTC_851a0.dir/src.cxx.o
FAILED: CMakeFiles/cmTC_851a0.dir/src.cxx.o
/usr/bin/c++ -DCAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS -mavx2 -o CMakeFiles/cmTC_851a0.dir/src.cxx.o -c src.cxx
c++: error: unrecognized command line option ‘-mavx2’
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main() {
__m256i a, b;
a = _mm256_set1_epi8 (1);
b = a;
_mm256_add_epi8 (a,a);
__m256i x;
_mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
return 0;
}
Performing C++ SOURCE FILE Test CAFFE2_COMPILER_SUPPORTS_AVX512_EXTENSIONS failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_b2871"
[1/2] Building CXX object CMakeFiles/cmTC_b2871.dir/src.cxx.o
FAILED: CMakeFiles/cmTC_b2871.dir/src.cxx.o
/usr/bin/c++ -DCAFFE2_COMPILER_SUPPORTS_AVX512_EXTENSIONS -mavx512f -mavx512dq -mavx512vl -o CMakeFiles/cmTC_b2871.dir/src.cxx.o -c src.cxx
c++: error: unrecognized command line option ‘-mavx512f’
c++: error: unrecognized command line option ‘-mavx512dq’
c++: error: unrecognized command line option ‘-mavx512vl’
ninja: build stopped: subcommand failed.
Source file was:
#if defined(_MSC_VER)
#include <intrin.h>
#else
#include <immintrin.h>
#endif
// check avx512f
__m512 addConstant(__m512 arg) {
return _mm512_add_ps(arg, _mm512_set1_ps(1.f));
}
// check avx512dq
__m512 andConstant(__m512 arg) {
return _mm512_and_ps(arg, _mm512_set1_ps(1.f));
}
int main() {
__m512i a = _mm512_set1_epi32(1);
__m256i ymm = _mm512_extracti64x4_epi64(a, 0);
ymm = _mm256_abs_epi64(ymm); // check avx512vl
__mmask16 m = _mm512_cmp_epi32_mask(a, a, _MM_CMPINT_EQ);
__m512i r = _mm512_andnot_si512(a, a);
}
Determining if the pthread_create exist failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_b8d21"
[1/2] Building C object CMakeFiles/cmTC_b8d21.dir/CheckSymbolExists.c.o
[2/2] Linking C executable cmTC_b8d21
FAILED: cmTC_b8d21
: && /usr/bin/cc -rdynamic CMakeFiles/cmTC_b8d21.dir/CheckSymbolExists.c.o -o cmTC_b8d21 && :
CMakeFiles/cmTC_b8d21.dir/CheckSymbolExists.c.o: In function `main':
CheckSymbolExists.c:(.text+0x14): undefined reference to `pthread_create'
CheckSymbolExists.c:(.text+0x18): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
File /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp/CheckSymbolExists.c:
/* */
#include <pthread.h>
int main(int argc, char** argv)
{
(void)argv;
#ifndef pthread_create
return ((int*)(&pthread_create))[argc];
#else
(void)argc;
return 0;
#endif
}
Determining if the function pthread_create exists in the pthreads failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_a89ae"
[1/2] Building C object CMakeFiles/cmTC_a89ae.dir/CheckFunctionExists.c.o
[2/2] Linking C executable cmTC_a89ae
FAILED: cmTC_a89ae
: && /usr/bin/cc -DCHECK_FUNCTION_EXISTS=pthread_create -rdynamic CMakeFiles/cmTC_a89ae.dir/CheckFunctionExists.c.o -o cmTC_a89ae -lpthreads && :
/usr/bin/ld: cannot find -lpthreads
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
Performing C SOURCE FILE Test BLAS_F2C_DOUBLE_WORKS failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_2b010"
[1/2] Building C object CMakeFiles/cmTC_2b010.dir/src.c.o
[2/2] Linking C executable cmTC_2b010
Return value: 1
Source file was:
#include <stdlib.h>
#include <stdio.h>
float x[4] = { 1, 2, 3, 4 };
float y[4] = { .1, .01, .001, .0001 };
int four = 4;
int one = 1;
extern double sdot_();
int main() {
int i;
double r = sdot_(&four, x, &one, y, &one);
exit((float)r != (float).1234);
}
Performing C++ SOURCE FILE Test HAVE_CXX_FLAG_WSHORTEN_64_TO_32 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_1859b"
[1/2] Building CXX object CMakeFiles/cmTC_1859b.dir/src.cxx.o
FAILED: CMakeFiles/cmTC_1859b.dir/src.cxx.o
/usr/bin/c++ -fvisibility-inlines-hidden -std=c++11 -Wall -Wextra -Wshadow -pedantic -pedantic-errors -DHAVE_CXX_FLAG_WSHORTEN_64_TO_32 -Wshorten-64-to-32 -Wshorten-64-to-32 -o CMakeFiles/cmTC_1859b.dir/src.cxx.o -c src.cxx
c++: error: unrecognized command line option '-Wshorten-64-to-32'
c++: error: unrecognized command line option '-Wshorten-64-to-32'
ninja: build stopped: subcommand failed.
Source file was:
int main() { return 0; }
Performing C++ SOURCE FILE Test HAVE_CXX_FLAG_WD654 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_f1abe"
[1/2] Building CXX object CMakeFiles/cmTC_f1abe.dir/src.cxx.o
FAILED: CMakeFiles/cmTC_f1abe.dir/src.cxx.o
/usr/bin/c++ -fvisibility-inlines-hidden -std=c++11 -Wall -Wextra -Wshadow -pedantic -pedantic-errors -Wfloat-equal -fstrict-aliasing -Wno-deprecated-declarations -Wstrict-aliasing -DHAVE_CXX_FLAG_WD654 -wd654 -wd654 -o CMakeFiles/cmTC_f1abe.dir/src.cxx.o -c src.cxx
c++: error: unrecognized command line option '-wd654'
c++: error: unrecognized command line option '-wd654'
ninja: build stopped: subcommand failed.
Source file was:
int main() { return 0; }
Performing C++ SOURCE FILE Test HAVE_CXX_FLAG_WTHREAD_SAFETY failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_d7def"
[1/2] Building CXX object CMakeFiles/cmTC_d7def.dir/src.cxx.o
FAILED: CMakeFiles/cmTC_d7def.dir/src.cxx.o
/usr/bin/c++ -fvisibility-inlines-hidden -std=c++11 -Wall -Wextra -Wshadow -pedantic -pedantic-errors -Wfloat-equal -fstrict-aliasing -Wno-deprecated-declarations -Wstrict-aliasing -DHAVE_CXX_FLAG_WTHREAD_SAFETY -Wthread-safety -Wthread-safety -o CMakeFiles/cmTC_d7def.dir/src.cxx.o -c src.cxx
c++: error: unrecognized command line option '-Wthread-safety'; did you mean '-fthread-jumps'?
c++: error: unrecognized command line option '-Wthread-safety'; did you mean '-fthread-jumps'?
ninja: build stopped: subcommand failed.
Source file was:
int main() { return 0; }
Determining if the include file cpuid.h exists failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_998af"
[1/2] Building C object CMakeFiles/cmTC_998af.dir/CheckIncludeFile.c.o
FAILED: CMakeFiles/cmTC_998af.dir/CheckIncludeFile.c.o
/usr/bin/cc -fopenmp -fPIE -o CMakeFiles/cmTC_998af.dir/CheckIncludeFile.c.o -c CheckIncludeFile.c
CheckIncludeFile.c:1:10: fatal error: cpuid.h: No such file or directory
#include <cpuid.h>
^~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Performing C SOURCE FILE Test NO_GCC_EBX_FPIC_BUG failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_065aa"
[1/2] Building C object CMakeFiles/cmTC_065aa.dir/src.c.o
FAILED: CMakeFiles/cmTC_065aa.dir/src.c.o
/usr/bin/cc -fopenmp -DNO_GCC_EBX_FPIC_BUG -fPIE -o CMakeFiles/cmTC_065aa.dir/src.c.o -c src.c
src.c: In function ‘cpuid’:
src.c:6:9: error: impossible constraint in ‘asm’
asm volatile ( "cpuid" : "+a"(a), "=b"(b), "+c"(c), "=d"(d) );
^~~
ninja: build stopped: subcommand failed.
Source file was:
#include <stdint.h>
static inline void cpuid(uint32_t *eax, uint32_t *ebx,
uint32_t *ecx, uint32_t *edx)
{
uint32_t a = *eax, b, c = *ecx, d;
asm volatile ( "cpuid" : "+a"(a), "=b"(b), "+c"(c), "=d"(d) );
*eax = a; *ebx = b; *ecx = c; *edx = d;
}
int main() {
uint32_t a,b,c,d;
cpuid(&a, &b, &c, &d);
return 0;
}
Performing C SOURCE FILE Test C_HAS_AVX_1 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_1db07"
[1/2] Building C object CMakeFiles/cmTC_1db07.dir/src.c.o
FAILED: CMakeFiles/cmTC_1db07.dir/src.c.o
/usr/bin/cc -fopenmp -DC_HAS_AVX_1 -fPIE -o CMakeFiles/cmTC_1db07.dir/src.c.o -c src.c
src.c:2:12: fatal error: immintrin.h: No such file or directory
#include <immintrin.h>
^~~~~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main()
{
__m256 a;
a = _mm256_set1_ps(0);
return 0;
}
Performing C SOURCE FILE Test C_HAS_AVX_2 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_5f07c"
[1/2] Building C object CMakeFiles/cmTC_5f07c.dir/src.c.o
FAILED: CMakeFiles/cmTC_5f07c.dir/src.c.o
/usr/bin/cc -fopenmp -DC_HAS_AVX_2 -mavx -fPIE -o CMakeFiles/cmTC_5f07c.dir/src.c.o -c src.c
cc: error: unrecognized command line option ‘-mavx’
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main()
{
__m256 a;
a = _mm256_set1_ps(0);
return 0;
}
Performing C SOURCE FILE Test C_HAS_AVX_3 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_98529"
[1/2] Building C object CMakeFiles/cmTC_98529.dir/src.c.o
FAILED: CMakeFiles/cmTC_98529.dir/src.c.o
/usr/bin/cc -fopenmp -DC_HAS_AVX_3 /arch:AVX -fPIE -o CMakeFiles/cmTC_98529.dir/src.c.o -c src.c
cc: error: /arch:AVX: No such file or directory
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main()
{
__m256 a;
a = _mm256_set1_ps(0);
return 0;
}
Performing C SOURCE FILE Test C_HAS_AVX2_1 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_a7cd1"
[1/2] Building C object CMakeFiles/cmTC_a7cd1.dir/src.c.o
FAILED: CMakeFiles/cmTC_a7cd1.dir/src.c.o
/usr/bin/cc -fopenmp -DC_HAS_AVX2_1 -fPIE -o CMakeFiles/cmTC_a7cd1.dir/src.c.o -c src.c
src.c:2:12: fatal error: immintrin.h: No such file or directory
#include <immintrin.h>
^~~~~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main()
{
__m256i a = {0};
a = _mm256_abs_epi16(a);
__m256i x;
_mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
return 0;
}
Performing C SOURCE FILE Test C_HAS_AVX2_2 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_24bd3"
[1/2] Building C object CMakeFiles/cmTC_24bd3.dir/src.c.o
FAILED: CMakeFiles/cmTC_24bd3.dir/src.c.o
/usr/bin/cc -fopenmp -DC_HAS_AVX2_2 -mavx2 -mfma -fPIE -o CMakeFiles/cmTC_24bd3.dir/src.c.o -c src.c
cc: error: unrecognized command line option ‘-mavx2’
cc: error: unrecognized command line option ‘-mfma’
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main()
{
__m256i a = {0};
a = _mm256_abs_epi16(a);
__m256i x;
_mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
return 0;
}
Performing C SOURCE FILE Test C_HAS_AVX2_3 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_0b263"
[1/2] Building C object CMakeFiles/cmTC_0b263.dir/src.c.o
FAILED: CMakeFiles/cmTC_0b263.dir/src.c.o
/usr/bin/cc -fopenmp -DC_HAS_AVX2_3 /arch:AVX2 -fPIE -o CMakeFiles/cmTC_0b263.dir/src.c.o -c src.c
cc: error: /arch:AVX2: No such file or directory
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main()
{
__m256i a = {0};
a = _mm256_abs_epi16(a);
__m256i x;
_mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
return 0;
}
Performing C SOURCE FILE Test CXX_HAS_AVX_1 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_962a8"
[1/2] Building C object CMakeFiles/cmTC_962a8.dir/src.c.o
FAILED: CMakeFiles/cmTC_962a8.dir/src.c.o
/usr/bin/cc -fopenmp -DCXX_HAS_AVX_1 -fPIE -o CMakeFiles/cmTC_962a8.dir/src.c.o -c src.c
src.c:2:12: fatal error: immintrin.h: No such file or directory
#include <immintrin.h>
^~~~~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main()
{
__m256 a;
a = _mm256_set1_ps(0);
return 0;
}
Performing C SOURCE FILE Test CXX_HAS_AVX_2 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_ea8be"
[1/2] Building C object CMakeFiles/cmTC_ea8be.dir/src.c.o
FAILED: CMakeFiles/cmTC_ea8be.dir/src.c.o
/usr/bin/cc -fopenmp -DCXX_HAS_AVX_2 -mavx -fPIE -o CMakeFiles/cmTC_ea8be.dir/src.c.o -c src.c
cc: error: unrecognized command line option ‘-mavx’
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main()
{
__m256 a;
a = _mm256_set1_ps(0);
return 0;
}
Performing C SOURCE FILE Test CXX_HAS_AVX_3 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_ca1bc"
[1/2] Building C object CMakeFiles/cmTC_ca1bc.dir/src.c.o
FAILED: CMakeFiles/cmTC_ca1bc.dir/src.c.o
/usr/bin/cc -fopenmp -DCXX_HAS_AVX_3 /arch:AVX -fPIE -o CMakeFiles/cmTC_ca1bc.dir/src.c.o -c src.c
cc: error: /arch:AVX: No such file or directory
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main()
{
__m256 a;
a = _mm256_set1_ps(0);
return 0;
}
Performing C SOURCE FILE Test CXX_HAS_AVX2_1 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_89f34"
[1/2] Building C object CMakeFiles/cmTC_89f34.dir/src.c.o
FAILED: CMakeFiles/cmTC_89f34.dir/src.c.o
/usr/bin/cc -fopenmp -DCXX_HAS_AVX2_1 -fPIE -o CMakeFiles/cmTC_89f34.dir/src.c.o -c src.c
src.c:2:12: fatal error: immintrin.h: No such file or directory
#include <immintrin.h>
^~~~~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main()
{
__m256i a = {0};
a = _mm256_abs_epi16(a);
__m256i x;
_mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
return 0;
}
Performing C SOURCE FILE Test CXX_HAS_AVX2_2 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_4b627"
[1/2] Building C object CMakeFiles/cmTC_4b627.dir/src.c.o
FAILED: CMakeFiles/cmTC_4b627.dir/src.c.o
/usr/bin/cc -fopenmp -DCXX_HAS_AVX2_2 -mavx2 -mfma -fPIE -o CMakeFiles/cmTC_4b627.dir/src.c.o -c src.c
cc: error: unrecognized command line option ‘-mavx2’
cc: error: unrecognized command line option ‘-mfma’
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main()
{
__m256i a = {0};
a = _mm256_abs_epi16(a);
__m256i x;
_mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
return 0;
}
Performing C SOURCE FILE Test CXX_HAS_AVX2_3 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_ea953"
[1/2] Building C object CMakeFiles/cmTC_ea953.dir/src.c.o
FAILED: CMakeFiles/cmTC_ea953.dir/src.c.o
/usr/bin/cc -fopenmp -DCXX_HAS_AVX2_3 /arch:AVX2 -fPIE -o CMakeFiles/cmTC_ea953.dir/src.c.o -c src.c
cc: error: /arch:AVX2: No such file or directory
ninja: build stopped: subcommand failed.
Source file was:
#include <immintrin.h>
int main()
{
__m256i a = {0};
a = _mm256_abs_epi16(a);
__m256i x;
_mm256_extract_epi64(x, 0); // we rely on this in our AVX2 code
return 0;
}
Performing C SOURCE FILE Test COMPILER_SUPPORTS_FLOAT128 failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_a3ad4"
[1/2] Building C object CMakeFiles/cmTC_a3ad4.dir/src.c.o
FAILED: CMakeFiles/cmTC_a3ad4.dir/src.c.o
/usr/bin/cc -fopenmp -Wno-ignored-qualifiers -Wno-absolute-value -DCOMPILER_SUPPORTS_FLOAT128 -fPIE -o CMakeFiles/cmTC_a3ad4.dir/src.c.o -c src.c
src.c: In function ‘main’:
src.c:2:16: error: unknown type name ‘__float128’; did you mean ‘_Float128’?
int main() { __float128 r = 1;
^~~~~~~~~~
_Float128
src.c: At top level:
cc1: warning: unrecognized command line option ‘-Wno-absolute-value’
ninja: build stopped: subcommand failed.
Source file was:
int main() { __float128 r = 1;
}
Performing C SOURCE FILE Test COMPILER_SUPPORTS_SVE failed with the following output:
Change Dir: /home/nvidia/pytorch1.4/pytorch/build/CMakeFiles/CMakeTmp
Run Build Command:"/home/nvidia/.local/bin/ninja" "cmTC_88c4f"
[1/2] Building C object CMakeFiles/cmTC_88c4f.dir/src.c.o
FAILED: CMakeFiles/cmTC_88c4f.dir/src.c.o
/usr/bin/cc -fopenmp -Wno-ignored-qualifiers -Wno-absolute-value -DCOMPILER_SUPPORTS_SVE -march=armv8-a+sve -fPIE -o CMakeFiles/cmTC_88c4f.dir/src.c.o -c src.c
cc1: error: invalid feature modifier in ‘-march=armv8-a+sve’
cc1: warning: unrecognized command line option ‘-Wno-absolute-value’
ninja: build stopped: subcommand failed.
Source file was:
#include <arm_sve.h>
int main() {
svint32_t r = svdup_n_s32(1); }
Could you please help me with this? What is the source of this problem, or is it possible to find a prebuilt wheel for PyTorch 1.4 (cp27) built against CUDA 10.2, or even a CPU-only one?
I have also tried switching gcc from 7.5 to 6.5 (roughly with the commands below), but it did not solve the problem.
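For reference, this is a sketch of how the compiler switch was done (it assumes gcc-6/g++-6 come from the Ubuntu repositories; the alternative priorities are arbitrary and the exact commands I ran may have differed slightly):

sudo apt-get install gcc-6 g++-6
# register both compiler versions as alternatives
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 70
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-6 60
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-7 70
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-6 60
# interactively select gcc-6 / g++-6
sudo update-alternatives --config gcc
sudo update-alternatives --config g++
gcc --version   # confirm gcc 6.5 is now the active compiler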