[GNU/Linux][Ubuntu 17.10 64bit] Unable to run/build FleXDemo/UnrealEngine+FleX

  1. I cloned NVIDIA FleX from the official GitHub repo: [code=bash]$ git clone https://github.com/NVIDIAGameWorks/FleX.git[/code] and tried to run the demo. It failed with this error: [code=bash]$ ./bin/linux64/NvFlexDemoDebugCUDA_x64 Reshaping Error creating CUDA context.[/code] CUDA appears to be installed correctly: [code=bash]$ apt list --installed | grep cuda[/code] Has anyone seen the same problem? Is there a solution or a workaround?
  2. After that, I cloned UnrealEngine+FleX:
    $ git clone -b FleX-4.17.1 https://github.com/NvPhysX/UnrealEngine.git

    and tried to build using these commands:

    $ ./Setup.sh
    $ ./GenerateProjectFiles.sh
    $ make

    … and there are many errors and warnings in the make output.
    Has anyone built UnrealEngine+FleX on a GNU/Linux machine? Please guide me.

  3. Off-topic: I searched for good documentation or a tutorial on FleX and found nothing. Please share some links if you know of any!
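
For the build problem in item 2, the first few compiler errors are usually more useful than the full make output, since later errors often follow from earlier ones. Below is a minimal sketch for pulling them out of a saved log. It assumes the build prints GCC/Clang-style diagnostics containing `error:`; the helper name `first_errors` and the file name `build.log` are made up for illustration.

```shell
# first_errors LOGFILE [N]: print the first N (default 5) lines that contain
# "error:", each prefixed with its line number in the log.
first_errors() {
    awk -v n="${2:-5}" '/error:/ { print NR ": " $0; if (++count >= n) exit }' "$1"
}

# Usage (from the UnrealEngine checkout; build.log is a hypothetical file name):
#   make 2>&1 | tee build.log
#   first_errors build.log 3
```

Posting those few lines together with the compiler version gives readers something concrete to work from.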

I’ve got the same problem:

$ ./bin/linux64/NvFlexDemoDebugCUDA_x64
Reshaping
Error creating CUDA context.

I’ve wasted too much time on this!

I built everything according to the requirements in the README.md file, on Ubuntu 14.04. I am using a GeForce GT 640 graphics card in a Dell XPS computer.

For example, nvidia-smi shows that driver version 396.54 is installed correctly:

| NVIDIA-SMI 396.54                 Driver Version: 396.54                    |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  GeForce GT 640      Off  | 00000000:01:00.0 N/A |                  N/A |
| 16%   25C    P8    N/A /  N/A |    215MiB /   973MiB |     N/A      Default |
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|    0                    Not Supported                                       |

When I check the cuda toolkit with nvcc -V, it shows correct installation of version 8.0.44 as specified:

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Sep__4_22:14:01_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44

I’ve got g++ version 4.8.4, so when I type g++ -v I get:

$ g++ -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.8.4-2ubuntu1~14.04.4' --with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.8 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libmudflap --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.4) 

When I perform your check, I get the following:

$ apt list --installed | grep cuda
WARNING: apt does not have a stable CLI interface yet. Use with caution in scripts.

libcuda1-396/trusty,now 396.54-0ubuntu0~gpu14.04.1 amd64 [installed,automatic]

I also don’t find much support on the web. I’m excited to get Flex working, so I hope some folks can help with this.

Two weeks later, I’m still stumped by this problem. I’ve now tried the following in an attempt to fix it:


First, I made sure that my paths were exported properly in both /etc/environment and my .bashrc.
I added the following to /etc/environment:


In my .bashrc, I’ve got this:

export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64
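
One detail in these exports may be worth checking. The PATH line guards against an empty variable with `${PATH:+:${PATH}}`, but the LD_LIBRARY_PATH line does not, so if LD_LIBRARY_PATH starts out empty the result begins with a colon, and the dynamic linker treats an empty entry as the current directory. This is probably not the cause of the FleX error, but it is easy to rule out. Below is a minimal sketch of a guarded append plus a membership check; the helper names `path_contains` and `append_path` are made up for illustration, and the CUDA directory is the one used in this post.

```shell
# path_contains LIST DIR: succeed if DIR is one entry of the colon-separated LIST.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# append_path LIST DIR: print LIST with DIR appended, without a leading colon
# when LIST is empty and without duplicating an entry that is already there.
append_path() {
    if [ -z "$1" ]; then
        printf '%s\n' "$2"
    elif path_contains "$1" "$2"; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$1:$2"
    fi
}

# Usage: extend LD_LIBRARY_PATH with the CUDA 8.0 library directory.
LD_LIBRARY_PATH=$(append_path "${LD_LIBRARY_PATH:-}" /usr/local/cuda-8.0/lib64)
export LD_LIBRARY_PATH
```

Afterwards, `path_contains "$LD_LIBRARY_PATH" /usr/local/cuda-8.0/lib64 && echo present` confirms the directory is on the list in the current shell.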

I used


to build the CUDA 8.0 samples and check whether things were working. Following the instructions at this link, I was able to run both deviceQuery and bandwidthTest, and both passed. Here is the output of deviceQuery:

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GT 640"
  CUDA Driver Version / Runtime Version          9.2 / 8.0
  CUDA Capability Major/Minor version number:    3.0
  Total amount of global memory:                 972 MBytes (1019543552 bytes)
  ( 2) Multiprocessors, (192) CUDA Cores/MP:     384 CUDA Cores
  GPU Max Clock rate:                            954 MHz (0.95 GHz)
  Memory Clock rate:                             2500 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 262144 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.2, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GT 640
Result = PASS
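
The "CUDA Driver Version / Runtime Version" line in this output is worth reading closely: the CUDA version supported by the installed driver (9.2 here) has to be at least as new as the runtime the program was built against (8.0 here). Below is a minimal sketch that checks this from deviceQuery output. The helper names `version_ge` and `check_device_query` are made up for illustration, and the comparison only handles major.minor versions. Since 9.2 >= 8.0, the check passes for this machine, so it rules out a driver/runtime mismatch rather than explaining the FleX error.

```shell
# version_ge A B: succeed if dotted version A is >= B (major.minor only).
version_ge() {
    a_major=${1%%.*}; a_minor=${1#*.}
    b_major=${2%%.*}; b_minor=${2#*.}
    if [ "$a_major" -ne "$b_major" ]; then
        [ "$a_major" -gt "$b_major" ]
    else
        [ "$a_minor" -ge "$b_minor" ]
    fi
}

# check_device_query: read deviceQuery output on stdin, find the
# "CUDA Driver Version / Runtime Version" line, and report whether the
# driver is new enough for the runtime.
check_device_query() {
    # The line ends with "<driver> / <runtime>", e.g. "9.2 / 8.0".
    versions=$(awk '/CUDA Driver Version \/ Runtime Version/ { print $(NF-2), $NF; exit }')
    if [ -z "$versions" ]; then
        echo "version line not found"
        return 2
    fi
    driver=${versions% *}
    runtime=${versions#* }
    if version_ge "$driver" "$runtime"; then
        echo "ok: driver $driver >= runtime $runtime"
    else
        echo "mismatch: driver $driver < runtime $runtime"
        return 1
    fi
}

# Usage (from the directory holding the built sample):
#   ./deviceQuery | check_device_query
```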

I also ran the bandwidthTest:

[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: GeForce GT 640
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			10516.1

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			10586.4

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			55071.5

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

