Hello,
https://paste.ofcode.org/7YEtshDZpx324P9nExcM8k
In this code, at line 313, I’m having trouble tracing the mentioned error back to its cause.
The operation works if h_img[j] is set to 0 a few lines above, but if it is set to anything else (say 0.1f or 1.0f), I get either an illegal memory access error from CUDA or an unspecified launch error.
With memcheck, the diagnostic prints the following 200 times:
========= Invalid global read of size 4
========= at 0x000001f8 in void csrMvT_hyb_kernel<float, float, float, int=7, int=2, int=8, int=5, int=0>(cusparseCsrMvParams<float, float, float>, int*)
========= by thread (116,0,0) in block (105,0,0)
========= Address 0x509d05d64 is out of bounds
========= Saved host backtrace up to driver entry point at kernel launch time
========= Host Frame:C:\WINDOWS\system32\nvcuda.dll (cuTexRefSetAddress + 0x1aa848) [0x1b7e95]
========= Host Frame:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\bin\cusparse64_91.dll (cusparseDpruneCsr2csrByPercentage + 0x28fe) [0x26d89e]
========= Host Frame:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\bin\cusparse64_91.dll (cusparseDpruneCsr2csrByPercentage + 0x1eb1) [0x26ce51]
========= Host Frame:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\bin\cusparse64_91.dll (cusparseCsrmvEx_bufferSize + 0x4061) [0x10d341]
========= Host Frame:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\bin\cusparse64_91.dll (cusparseCsrmvEx_bufferSize + 0x802c) [0x11130c]
========= Host Frame:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\bin\cusparse64_91.dll (cusparseScsrmv + 0x85) [0x112165]
========= Host Frame:C:\Users\stagiaire\Desktop\Workstation\GPGPU_Tests\x64\Debug\GPGPU_Tests.exe (GPU_Method_2 + 0x11dc) [0x1312c]
========= Host Frame:C:\Users\stagiaire\Desktop\Workstation\GPGPU_Tests\x64\Debug\GPGPU_Tests.exe (main + 0xda) [0x17aaa]
========= Host Frame:C:\Users\stagiaire\Desktop\Workstation\GPGPU_Tests\x64\Debug\GPGPU_Tests.exe (invoke_main + 0x34) [0x1a504]
========= Host Frame:C:\Users\stagiaire\Desktop\Workstation\GPGPU_Tests\x64\Debug\GPGPU_Tests.exe (__scrt_common_main_seh + 0x127) [0x1a3c7]
========= Host Frame:C:\Users\stagiaire\Desktop\Workstation\GPGPU_Tests\x64\Debug\GPGPU_Tests.exe (__scrt_common_main + 0xe) [0x1a28e]
========= Host Frame:C:\Users\stagiaire\Desktop\Workstation\GPGPU_Tests\x64\Debug\GPGPU_Tests.exe (mainCRTStartup + 0x9) [0x1a529]
========= Host Frame:C:\WINDOWS\System32\KERNEL32.DLL (BaseThreadInitThunk + 0x14) [0x11fe4]
========= Host Frame:C:\WINDOWS\SYSTEM32\ntdll.dll (RtlUserThreadStart + 0x21) [0x6efc1]
That is the output when the error returned is the illegal memory access.
When it is the unspecified launch error (about 20% of the time?), memcheck doesn’t give any useful information; it just mentions the unspecified launch error.
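Since the invalid read happens inside the cuSPARSE kernel launched by cusparseScsrmv, one thing worth ruling out is a malformed CSR structure on the host before it is copied to the device. Below is a minimal sketch of such a check, assuming zero-based indexing and plain int index arrays. The function validate_csr and its parameter names are hypothetical and not part of cuSPARSE; it only inspects host data and says nothing about what the pasted code actually does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a cuSPARSE function): checks a zero-based CSR
// matrix with m rows, n columns and nnz stored values on the host.
// Returns an empty string if the structure is consistent, otherwise a
// short description of the first problem found.
std::string validate_csr(int m, int n, int nnz,
                         const std::vector<int>& row_ptr,
                         const std::vector<int>& col_ind,
                         std::size_t num_values)
{
    if (m < 0 || n < 0 || nnz < 0)
        return "negative dimension";
    if (row_ptr.size() != static_cast<std::size_t>(m) + 1)
        return "row_ptr must have m + 1 entries";
    if (col_ind.size() != static_cast<std::size_t>(nnz))
        return "col_ind must have nnz entries";
    if (num_values != static_cast<std::size_t>(nnz))
        return "values must have nnz entries";
    if (row_ptr.front() != 0)
        return "row_ptr[0] must be 0 for zero-based indexing";
    if (row_ptr.back() != nnz)
        return "row_ptr[m] must equal nnz";

    // Row offsets must never decrease from one row to the next.
    for (int i = 0; i < m; ++i)
        if (row_ptr[i] > row_ptr[i + 1])
            return "row_ptr decreases at row " + std::to_string(i);

    // Every column index must address a valid column of the matrix.
    for (int k = 0; k < nnz; ++k)
        if (col_ind[k] < 0 || col_ind[k] >= n)
            return "column index out of range at entry " + std::to_string(k);

    return "";
}
```

Calling this right before the host-to-device copies and printing any non-empty result would show whether the out-of-bounds address could come from the index arrays themselves.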
What’s weird is that it works if h_img is filled with zeroes, and it works for the h_unitarian and h_v3 computations; it only fails for the h_img => h_v1 operation.
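Because only one of the three products fails, another thing that could be compared between the calls is the length of the input and output vectors. For y = alpha * op(A) * x + beta * y with A of size m x n, cuSPARSE expects x to hold n elements and y to hold m elements in the non-transposed case, and the other way around when op is a transpose. The kernel name csrMvT_hyb_kernel in the log might indicate the transposed path, but that is only a guess. The sketch below uses a hypothetical Op enum as a stand-in for cusparseOperation_t so that it compiles without the CUDA toolkit; required_lengths and buffers_large_enough are illustrative names as well.

```cpp
#include <cstddef>

// Hypothetical stand-in for cusparseOperation_t (assumption for this sketch).
enum class Op { NonTranspose, Transpose };

// Number of elements the x and y vectors must hold.
struct SpmvShapes {
    std::size_t x_len;
    std::size_t y_len;
};

// For y = alpha * op(A) * x + beta * y with A of size m x n:
//   op(A) = A    -> x has n elements, y has m elements
//   op(A) = A^T  -> x has m elements, y has n elements
inline SpmvShapes required_lengths(std::size_t m, std::size_t n, Op op)
{
    if (op == Op::NonTranspose)
        return SpmvShapes{n, m};
    return SpmvShapes{m, n};
}

// True if the allocated element counts are sufficient for the requested product.
inline bool buffers_large_enough(std::size_t m, std::size_t n, Op op,
                                 std::size_t x_alloc, std::size_t y_alloc)
{
    const SpmvShapes need = required_lengths(m, n, op);
    return x_alloc >= need.x_len && y_alloc >= need.y_len;
}
```

Running this check with the element counts actually allocated for h_img and h_v1, and again for the h_unitarian and h_v3 calls, would show whether the failing call is the only one whose buffers are too small for the requested operation.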
Here’s the complete code base, with example files:
Go to debug/x86/, shift + right-click, open a PowerShell console or a regular command prompt, and run “./GPGPU_Tests.exe 35” to reproduce the error. You can change the silence variable to enable tracing.
cuSPARSE requires compute capability 2.0 or higher.
PS: Please don’t hesitate to give any additional comments on general programming. I’m fairly new to all this, and I’ll take anything that helps me improve. :)
PPS: At some point, starting this program gave me a nice Windows 10 blue screen. I haven’t been able to reproduce it yet, but consider saving anything you have open if you’re taking a look.