Cannot run example "even easier introduction to CUDA"

okaydokay · January 13, 2019, 8:52pm

I’m trying to run the example from the blog post named in the title. I’m on Ubuntu 16.04, with two 1080 Tis and a 1030, and CUDA 9.1.

Initially, running with cuda-memcheck resulted in this error:

“Program hit cudaErrorInvalidConfiguration (error 9) due to "invalid configuration argument" on CUDA API call to cudaLaunch”

After reading stuff online, I thought setting COMPUTE_PROFILE=0 might help. Now I’m getting a different error (and unsetting the variable does not help):

“Internal Memcheck Error: Memcheck failed initialization as some other tools is currently attached. Please make sure that nvprof and Nsight Visual Studio Edition are not being run simultaneously”

What is going on? Is the example out of date?

njuffa · January 13, 2019, 10:04pm

Works fine for me (see below): CUDA 8, Win 7, Quadro P2000. Note that this program does not use any status checking, presumably to prevent the code from becoming cluttered. I would suggest adding that in. There may be an error long before the code gets to the kernel launch.

Running this app under cuda-memcheck crashes the display driver on my system, presumably due to hitting TDR (Windows’s two-second GUI watchdog timer) although it is not clear to me why this would be the case.

c:\Users\Norbert\My Programs>nvcc -arch=sm_61 -o cudamanaged.exe cudamanaged.cu
nvcc warning : nvcc support for Microsoft Visual Studio 2010 and earlier has been deprecated and is no longer being maintained
cudamanaged.cu
support for Microsoft Visual Studio 2010 has been deprecated!
   Creating library cudamanaged.lib and object cudamanaged.exp

c:\Users\Norbert\My Programs>cudamanaged
Max error: 0

c:\Users\Norbert\My Programs>nvprof cudamanaged.exe
==6500== NVPROF is profiling process 6500, command: cudamanaged.exe
Max error: 0
==6500== Profiling application: cudamanaged.exe
==6500== Profiling result:
Time(%)      Time     Calls       Avg       Min       Max  Name
100.00%  251.00ms         1  251.00ms  251.00ms  251.00ms  add(int, float*, float*)

==6500== Unified Memory profiling result:
Device "Quadro P2000 (0)"
   Count  Avg Size  Min Size  Max Size  Total Size  Total Time  Name
    2048  4.0000KB  4.0000KB  4.0000KB  8.000000MB  21.04488ms  Host To Device
     384  32.000KB  32.000KB  32.000KB  12.00000MB  7.825223ms  Device To Host

==6500== API calls:
Time(%)      Time     Calls       Avg       Min       Max  Name
 41.68%  287.57ms         2  143.78ms  4.2846ms  283.28ms  cudaMallocManaged
 36.42%  251.23ms         1  251.23ms  251.23ms  251.23ms  cudaDeviceSynchronize
 13.47%  92.920ms         1  92.920ms  92.920ms  92.920ms  cuDevicePrimaryCtxRelease
  7.53%  51.971ms         1  51.971ms  51.971ms  51.971ms  cudaLaunch
  0.72%  4.9578ms         2  2.4789ms  1.7708ms  3.1870ms  cudaFree
  0.11%  773.23us        91  8.4970us       0ns  365.06us  cuDeviceGetAttribute
  0.06%  422.53us         1  422.53us  422.53us  422.53us  cuModuleUnload
  0.00%  15.540us         1  15.540us  15.540us  15.540us  cudaConfigureCall
  0.00%  10.849us         1  10.849us  10.849us  10.849us  cuDeviceTotalMem
  0.00%  4.9850us         3  1.6610us     294ns  4.1050us  cuDeviceGetCount
  0.00%  3.2260us         3  1.0750us     587ns  1.4660us  cudaSetupArgument
  0.00%  1.4660us         3     488ns     293ns     879ns  cuDeviceGet
  0.00%  1.1720us         1  1.1720us  1.1720us  1.1720us  cuDeviceGetName
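
The status checking suggested in this post could look roughly like the sketch below. It is not the blog's code: the `CUDA_CHECK` macro is a hypothetical helper, and the problem size and launch configuration are assumptions chosen to match the `add(int, float*, float*)` kernel seen in the profile. It wraps every runtime call and checks the kernel launch separately, so a failure is reported at the call that caused it rather than surfacing later.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the blog code): print a readable
// message and exit if a CUDA runtime call returns an error.
#define CUDA_CHECK(call)                                              \
  do {                                                                \
    cudaError_t err_ = (call);                                        \
    if (err_ != cudaSuccess) {                                        \
      fprintf(stderr, "CUDA error %d (%s) at %s:%d\n", (int)err_,     \
              cudaGetErrorString(err_), __FILE__, __LINE__);          \
      exit(EXIT_FAILURE);                                             \
    }                                                                 \
  } while (0)

// Grid-stride vector add with the same signature as the profiled kernel.
__global__ void add(int n, float *x, float *y) {
  int index = blockIdx.x * blockDim.x + threadIdx.x;
  int stride = blockDim.x * gridDim.x;
  for (int i = index; i < n; i += stride) y[i] = x[i] + y[i];
}

int main() {
  const int N = 1 << 20;  // assumed problem size
  float *x = nullptr, *y = nullptr;

  CUDA_CHECK(cudaMallocManaged(&x, N * sizeof(float)));
  CUDA_CHECK(cudaMallocManaged(&y, N * sizeof(float)));
  for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

  const int blockSize = 256;  // assumed launch configuration
  const int numBlocks = (N + blockSize - 1) / blockSize;
  add<<<numBlocks, blockSize>>>(N, x, y);
  CUDA_CHECK(cudaGetLastError());       // launch errors, e.g. invalid configuration
  CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel runs

  float maxError = 0.0f;
  for (int i = 0; i < N; i++) maxError = fmaxf(maxError, fabsf(y[i] - 3.0f));
  printf("Max error: %g\n", maxError);

  CUDA_CHECK(cudaFree(x));
  CUDA_CHECK(cudaFree(y));
  return 0;
}
```

Checking `cudaGetLastError()` right after the launch matters because a kernel launch does not return a status itself. An error such as "invalid configuration argument" would otherwise go unnoticed until a later call, or never be reported at all.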

okaydokay · January 13, 2019, 10:37pm

You’re using CUDA 8, I’m using CUDA 9. You’re using Windows 7 (why are you using Win7?!?), I’m on Ubuntu. I’m using GTX 1080 Tis, you’re using a Quadro P2000. It’s hard to imagine a more different configuration. But thanks for the tip on error checking. How does one do that?

njuffa · January 13, 2019, 10:45pm

I am using what I have in front of me. Yes, my setup is quite different from yours, but the quick check shows that there is nothing fundamentally wrong with the code from the blog. That’s about as much work as I am willing to do for free.

Your favorite internet search engine will return pages of relevant information at the click of a button.

okaydokay · January 13, 2019, 11:26pm

Touché

njuffa · January 14, 2019, 1:14am

It’s possible that there is a bug in CUDA 9.x, but given that the code from the blog is a trivial example, I would expect any such bug to be found during regression testing and never make it into a CUDA or driver release. Is Ubuntu 16.04 on the list of officially supported operating systems for CUDA 9.1? Are you running with the stock kernel for Ubuntu 16.04?

Robert_Crovella · January 14, 2019, 2:28am

Are you certain that you are running the code in the blog verbatim, and have not changed the n variable?

Is n set to 256 in your code?

If n is set to 256 in your code, have you verified your CUDA install? Instructions for verification are in the CUDA Linux installation guide and amount to building and running a few sample projects such as vectorAdd.
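
As a complement to running the samples, a small device query can help rule out a launch configuration that exceeds device limits, which is one common cause of "invalid configuration argument". The sketch below is an assumption-laden illustration, not code from the blog: the `blockSize` value of 256 is assumed, and the program only compares it against what each visible GPU reports.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
  int deviceCount = 0;
  cudaError_t err = cudaGetDeviceCount(&deviceCount);
  if (err != cudaSuccess) {
    // A failure here points at the driver or install, not at the kernel.
    fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
    return 1;
  }

  const int blockSize = 256;  // assumed threads-per-block value to validate

  for (int d = 0; d < deviceCount; ++d) {
    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, d);
    if (err != cudaSuccess) {
      fprintf(stderr, "cudaGetDeviceProperties(%d) failed: %s\n", d,
              cudaGetErrorString(err));
      return 1;
    }
    printf("Device %d: %s (compute capability %d.%d)\n", d, prop.name,
           prop.major, prop.minor);
    printf("  maxThreadsPerBlock = %d, maxGridSize[0] = %d\n",
           prop.maxThreadsPerBlock, prop.maxGridSize[0]);

    // A block size outside [1, maxThreadsPerBlock] cannot be launched.
    bool ok = blockSize >= 1 && blockSize <= prop.maxThreadsPerBlock;
    printf("  blockSize %d is %s on this device\n", blockSize,
           ok ? "within limits" : "NOT launchable");
  }
  return 0;
}
```

On a machine with several GPUs, the listing also shows which device is index 0, the one a program uses by default when it never calls `cudaSetDevice`.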

okaydokay · January 14, 2019, 11:24am

Don’t know what happened, but it works today. I’ve had CUDA installed for 6+ months and use it almost every day, but never needed to actually write any CUDA code until now. Thanks anyways!
