Invalid device function (CUDA - GPU mismatch)

Hello all,

I’m trying to run SSD (Single Shot MultiBox Detector) on my Jetson TX1 according to the link:
[url]https://myurasov.github.io/2016/11/27/ssd-tx1.html[/url]

I’m running JetPack 2.3 (not 2.3.1 as in the link).
When I test my network in GPU mode, I get the following error:
Check failed: error == cudaSuccess (8 vs. 0) invalid device function

From my Google searches, this error means a mismatch between the CUDA version and the GPU. However, I’m using the CUDA 8.0 that comes with JetPack, so this should not happen.

Can anyone help me please?

Thanks in advance,
Haggai

BTW,
CPU mode works

Did you use a remote login to run your program? If so, you may be seeing a side effect of X11 forwarding. Parts of a forwarded session run on the remote target, but rendering may be forwarded to your desktop host. CUDA does not seem to distinguish between rendering to a display device and rendering to a virtual buffer, so GPU operations may be forwarded as well. Your host may have a missing or wrong CUDA version; if the host does work, you may instead see a mysterious massive boost in performance.

If you are starting the program remotely, it is worth testing whether the behavior differs when the program is run locally on the Jetson from a graphical login session.

If this does not help, then you have a CUDA issue instead of an environment issue.
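A quick way to tell whether a shell looks like an X11-forwarded SSH session is to inspect its environment. The sketch below is only a heuristic and rests on assumptions: that the SSH server sets `SSH_CONNECTION` or `SSH_CLIENT`, and that a forwarded display shows up as a `localhost:` proxy display in `DISPLAY`. The function name is made up for illustration.

```python
import os


def looks_like_forwarded_x11(env):
    """Heuristic: True if the environment suggests SSH with X11 forwarding."""
    # OpenSSH normally exports one or both of these in a remote session.
    in_ssh = "SSH_CONNECTION" in env or "SSH_CLIENT" in env
    # With X11 forwarding, DISPLAY usually points at a proxy display
    # such as "localhost:10.0" rather than a local one such as ":0".
    display = env.get("DISPLAY", "")
    forwarded_display = display.startswith("localhost:")
    return in_ssh and forwarded_display


if __name__ == "__main__":
    if looks_like_forwarded_x11(os.environ):
        print("This looks like an X11-forwarded SSH session.")
    else:
        print("No sign of X11 forwarding in this environment.")
```

If it reports a forwarded session, rerunning the test from a local graphical login is the simplest comparison.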

Thanks for the answer, linuxdev; however, I found out what was wrong.
The CUDA 8.0 that comes with JetPack 2.3 (as opposed to 2.3.1) needs a compatibility level (the GPU architecture setting) of 53, rather than the 60 that was written in the supplied makefile.config file.
This sorted out the problem.
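For anyone hitting the same error, here is a rough sketch of the check I effectively did by hand. It assumes the makefile config uses a Caffe-style `CUDA_ARCH` value made of `-gencode arch=compute_XX,code=sm_XX` flags, and that the TX1's GPU is compute capability 5.3. The helper names are made up for illustration, and the check is simplified: it only looks at `sm_XX` binaries and ignores any PTX fallback.

```python
import re


def archs_in_flags(cuda_arch):
    """Return the set of sm_XX numbers named in a string of -gencode flags."""
    return {int(m) for m in re.findall(r"code=sm_(\d+)", cuda_arch)}


def gencode_flag(major, minor):
    """Build the -gencode flag for a compute capability, e.g. 5.3 -> 53."""
    cc = f"{major}{minor}"
    return f"-gencode arch=compute_{cc},code=sm_{cc}"


def covers_device(cuda_arch, major, minor):
    """True if the flags include a binary for exactly this compute capability."""
    return int(f"{major}{minor}") in archs_in_flags(cuda_arch)


# Hypothetical config value that targets only architecture 60.
supplied = "-gencode arch=compute_60,code=sm_60"

if not covers_device(supplied, 5, 3):
    print("Missing flag:", gencode_flag(5, 3))
```

With only architecture 60 listed, no kernel binary is built for a 5.3 device, which matches the "invalid device function" failure in GPU mode while CPU mode still works.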
Again, thank you for your rapid response.
Haggai