GPU Teaching Kit - Accelerated Computing Labs: compile issues

I followed the instructions on the BitBucket site to install the labs on Windows 10 with Visual Studio 2013 Professional.

After running configure and generate in CMake, the build directory looks like the screenshot in the setup instructions.

When I right-click on template.cu in DeviceQuery and compile it, I get 5 errors: 2 in json11.hpp and 3 undefined identifiers in template.cu.

Any advice?

output of CMake Configure:
The C compiler identification is MSVC 18.0.21005.1
The CXX compiler identification is MSVC 18.0.21005.1
Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio 12.0/VC/bin/cl.exe
Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio 12.0/VC/bin/cl.exe – works
Detecting C compiler ABI info
Detecting C compiler ABI info - done
Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio 12.0/VC/bin/cl.exe
Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio 12.0/VC/bin/cl.exe – works
Detecting CXX compiler ABI info
Detecting CXX compiler ABI info - done
Detecting CXX compile features
Detecting CXX compile features - done
Found CUDA: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v7.5 (found version “7.5”)
Configuring done

output of DeviceQuery_Solution.log:
Build started 9/6/2016 3:03:26 PM.
1>Project “C:\Users\user\Documents\out2\DeviceQuery_Solution.vcxproj” on node 2 (CustomBuild target(s)).
1>CustomBuild:
Building NVCC (Device) object CMakeFiles/DeviceQuery_Solution.dir/Module2/DeviceQuery/Debug/DeviceQuery_Solution_generated_template.cu.obj
template.cu
1>c:\users\user\documents\labs-source\libwb\vendor/json11.hpp(109): error : expected a “;”
1>c:\users\user\documents\labs-source\libwb\vendor/json11.hpp(110): error : expected a “;”
2 errors detected in the compilation of “C:/Users/user/AppData/Local/Temp/tmpxft_00001640_00000000-8_template.cpp1.ii”.
template.cu
CMake Error at DeviceQuery_Solution_generated_template.cu.obj.Debug.cmake:267 (message):
Error generating file
C:/Users/user/Documents/out2/CMakeFiles/DeviceQuery_Solution.dir/Module2/DeviceQuery/Debug/DeviceQuery_Solution_generated_template.cu.obj
1>Done Building Project “C:\Users\user\Documents\out2\DeviceQuery_Solution.vcxproj” (CustomBuild target(s)) – FAILED.
Build FAILED.
Time Elapsed 00:00:03.95

We have committed some updates and workarounds to the Bitbucket repo. Can you give those a try and let me know how it goes? We now test the build with Visual Studio 2015 and 2013. You’ll need to pull the new changes (including those from the git submodules).

These issues are apparently related to C++11 support in VS 2013.


From: gjw
Sent: Friday, September 16, 2016 11:47 AM
To: jbungo
Subject: RE: GPU Teaching Kit - Accelerated Computing Labs: compile issues

Your pointed questions solved the problem (mostly). Since I did not have git installed on my Windows machine, I had downloaded the repository on a Linux machine and then transferred the directory. Apparently the git download must be done on the target machine, because it seems the specific files downloaded take into account what’s already on the machine. So the compilation problems with libwb went away once I installed git on the Windows machine and downloaded the files directly.

The only compilation error left is in Convolution_DatasetGenerator, line 32 (the clamp() function):
return std::min(std::max(x, 0.0f), 1.0f);

Sounds like a minor issue, but I’m not a C++ programmer.
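
A hedged note on the clamp line above: the thread does not include the error text, so the cause is an assumption. Two common reasons this expression fails under MSVC are a missing #include <algorithm> and the min/max macros that <windows.h> defines unless NOMINMAX is set. A minimal sketch that sidesteps both, assuming x is a float:

```cpp
#include <algorithm>  // std::min, std::max

// Clamp x to the range [0, 1].
// The extra parentheses around std::min and std::max prevent the
// preprocessor from expanding them as function-like macros if
// <windows.h> has defined min/max (an assumed cause, not confirmed
// by the thread).
static float clamp(float x) {
  return (std::min)((std::max)(x, 0.0f), 1.0f);
}
```

Defining NOMINMAX before including <windows.h> is the other common way to avoid the macro clash; which one applies here depends on the actual error message.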


From: jbungo
Sent: Friday, September 16, 2016 8:13 AM
To: gjw
Subject: RE: GPU Teaching Kit - Accelerated Computing Labs: compile issues

What is the output for the command ‘git log’ (https://git-scm.com/docs/git-log) while in C:\Users\user\Documents\gputeachingkit-labs\libwb ?


From: gjw
Sent: Thursday, September 15, 2016 2:44 PM
To: jbungo
Subject: RE: GPU Teaching Kit - Accelerated Computing Labs: compile issues

See inline answers


From: jbungo
Date: Thursday, September 15, 2016 at 1:16 PM
To: gjw
Subject: RE: GPU Teaching Kit - Accelerated Computing Labs: compile issues

what’s the output of msbuild /version

C:\Users\user\Documents\build>C:\Windows\Microsoft.NET\assembly\GAC_64\MSBuild\v4.0_12.0.0.0__b03f5f7f11d50a3a\msbuild /version
Microsoft ® Build Engine version 12.0.40629.0
[Microsoft .NET Framework, version 4.0.30319.42000]
Copyright © Microsoft Corporation. All rights reserved.

12.0.40629.0

Can you cd into C:\Users\user\Documents\gputeachingkit-labs\libwb and give us the output of git log?

I don’t see a log. Here are the files in that directory:

Directory of C:\Users\user\Documents\gputeachingkit-labs\libwb

09/07/2016 01:22 PM .
09/07/2016 01:22 PM ..
09/07/2016 11:05 AM 1,089 appveyor.yml
09/07/2016 11:05 AM 1,050 CMakeLists.txt
09/07/2016 11:05 AM 1,736 LICENSE.TXT
09/07/2016 11:05 AM 1,408 Makefile
09/07/2016 11:05 AM 427 README.md
09/07/2016 11:05 AM 643 sources.cmake
09/07/2016 01:22 PM vendor
09/07/2016 11:05 AM 4,791 wb.h
09/07/2016 11:05 AM 3,482 wbArg.cpp
09/07/2016 11:05 AM 1,292 wbArg.h
09/07/2016 11:05 AM 767 wbAssert.h
09/07/2016 11:05 AM 455 wbCast.h
09/07/2016 11:05 AM 4,044 wbComparator.h
09/07/2016 11:05 AM 454 wbCUDA.cpp
09/07/2016 11:05 AM 1,758 wbCUDA.h
09/07/2016 11:05 AM 5,170 wbDataset.cpp
09/07/2016 11:05 AM 985 wbDataset.h
09/07/2016 11:05 AM 645 wbDataset_test.cpp
09/07/2016 11:05 AM 1,908 wbDirectory.cpp
09/07/2016 11:05 AM 268 wbDirectory.h
09/07/2016 11:05 AM 3,044 wbExit.cpp
09/07/2016 11:05 AM 96 wbExit.h
09/07/2016 11:05 AM 12,333 wbExport.cpp
09/07/2016 11:05 AM 3,591 wbExport.h
09/07/2016 11:05 AM 7,127 wbFile.cpp
09/07/2016 11:05 AM 2,003 wbFile.h
09/07/2016 11:05 AM 3,566 wbImage.cpp
09/07/2016 11:05 AM 1,202 wbImage.h
09/07/2016 11:05 AM 17,064 wbImport.cpp
09/07/2016 11:05 AM 3,405 wbImport.h
09/07/2016 11:05 AM 1,707 wbInit.cpp
09/07/2016 11:05 AM 186 wbInit.h
09/07/2016 11:05 AM 8,390 wbLogger.cpp
09/07/2016 11:05 AM 3,250 wbLogger.h
09/07/2016 11:05 AM 3,272 wbMalloc.h
09/07/2016 11:05 AM 9,439 wbMD5.h
09/07/2016 11:05 AM 1,587 wbMPI.cpp
09/07/2016 11:05 AM 701 wbMPI.h
09/07/2016 11:05 AM 1,235 wbPath.cpp
09/07/2016 11:05 AM 1,192 wbPath.h
09/07/2016 11:05 AM 4,714 wbPPM.cpp
09/07/2016 11:05 AM 199 wbPPM.h
09/07/2016 11:05 AM 9,211 wbSolution.cpp
09/07/2016 11:05 AM 1,753 wbSolution.h
09/07/2016 11:05 AM 2,655 wbSparse.cpp
09/07/2016 11:05 AM 321 wbSparse.h
09/07/2016 11:05 AM 9,965 wbString.h
09/07/2016 11:05 AM 291 wbThrust.h
09/07/2016 11:05 AM 16,146 wbTimer.cpp
09/07/2016 11:05 AM 5,695 wbTimer.h
09/07/2016 11:05 AM 1,193 wbTypes.h
09/07/2016 11:05 AM 16 wbUtils.cpp
09/07/2016 11:05 AM 326 wbUtils.h
09/07/2016 11:05 AM 224 wb_test.cpp
53 File(s) 169,471 bytes

Also, are you on a 32-bit system? According to the documentation at http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-microsoft-windows/:

The computer has Windows 10 64-bit
The CUDA toolkit is version 7.5.18

x86_32 support is limited. See the x86 32-bit Support section for details.

x86 32-bit Support
Native development using the CUDA Toolkit on x86_32 is unsupported. Deployment and execution of CUDA applications on x86_32 is still supported, but is limited to use with GeForce GPUs. To create 32-bit CUDA applications, use the cross-development capabilities of the CUDA Toolkit on x86_64.
Support for developing and running x86 32-bit applications on x86_64 Windows is limited to use with:
• GeForce GPUs
• CUDA Driver
• CUDA Runtime (cudart)
• CUDA Math Library (math.h)
• CUDA C++ Compiler (nvcc)
• CUDA Development Tools


From: gjw
Sent: Wednesday, September 14, 2016 3:47 PM
To: jbungo
Subject: RE: GPU Teaching Kit - Accelerated Computing Labs: compile issues

At first I used the original MS VS Professional 2013, version 12.0.21005.1 REL. But now that you mentioned it, I got the latest update,
Version 12.0.40629.00 Update 5,
and tried again, with the same result. Below is a copy of the output:

1>------ Rebuild All started: Project: ZERO_CHECK, Configuration: Debug Win32 ------
1> Checking Build System
1> CMake does not need to re-run because C:/Users/user/Documents/build/CMakeFiles/generate.stamp is up-to-date.
2>------ Rebuild All started: Project: wb, Configuration: Debug Win32 ------
2> Building Custom Rule C:/Users/user/Documents/gputeachingkit-labs/CMakeLists.txt
2> CMake does not need to re-run because C:\Users\user\Documents\build\CMakeFiles\generate.stamp is up-to-date.
2> wbArg.cpp
2> wbCUDA.cpp
2> wbDataset.cpp
2> wbDirectory.cpp
2>C:\Users\user\Documents\gputeachingkit-labs\libwb\wbDirectory.cpp(34): error C3861: ‘snprintf’: identifier not found
2> wbExit.cpp
2> wbExport.cpp
2> wbFile.cpp
2> wbImage.cpp
2> wbImport.cpp
2> wbInit.cpp
2> wbLogger.cpp
2> wbMPI.cpp
2> wbPPM.cpp
2> wbPath.cpp
2> wbSolution.cpp
2> wbSparse.cpp
2> wbTimer.cpp
2> wbUtils.cpp
2> json11.cpp
2>C:\Users\user\Documents\gputeachingkit-labs\libwb\vendor\json11.cpp(61): error C3861: ‘snprintf’: identifier not found
2>C:\Users\user\Documents\gputeachingkit-labs\libwb\vendor\json11.cpp(70): error C3861: ‘snprintf’: identifier not found
2>C:\Users\user\Documents\gputeachingkit-labs\libwb\vendor\json11.cpp(76): error C3861: ‘snprintf’: identifier not found
2>C:\Users\user\Documents\gputeachingkit-labs\libwb\vendor\json11.cpp(82): error C3861: ‘snprintf’: identifier not found
2>C:\Users\user\Documents\gputeachingkit-labs\libwb\vendor\json11.cpp(110): error C3861: ‘snprintf’: identifier not found
2>C:\Users\user\Documents\gputeachingkit-labs\libwb\vendor\json11.cpp(517): error C3861: ‘snprintf’: identifier not found
2>C:\Users\user\Documents\gputeachingkit-labs\libwb\vendor\json11.cpp(519): error C3861: ‘snprintf’: identifier not found
2> Generating Code…
3>------ Rebuild All started: Project: DeviceQuery_Solution, Configuration: Debug Win32 ------
3> Building NVCC (Device) object CMakeFiles/DeviceQuery_Solution.dir/Module2/DeviceQuery/Debug/DeviceQuery_Solution_generated_template.cu.obj
3> template.cu
3>
3> template.cu
3>
3> Building Custom Rule C:/Users/user/Documents/gputeachingkit-labs/CMakeLists.txt
3> CMake does not need to re-run because C:\Users\user\Documents\build\CMakeFiles\generate.stamp is up-to-date.
3>LINK : fatal error LNK1104: cannot open file ‘Debug\wb.lib’
========== Rebuild All: 1 succeeded, 2 failed, 0 skipped ==========


From: jbungo
Date: Wednesday, September 14, 2016 at 1:25 PM
To: gjw
Subject: RE: GPU Teaching Kit - Accelerated Computing Labs: compile issues

What VS2013 Service Pack are you using?


From: gjw
Sent: Wednesday, September 14, 2016 11:39 AM
To: jbungo
Subject: RE: GPU Teaching Kit - Accelerated Computing Labs: compile issues

Unfortunately not. I tried the latest repository on BitBucket with both VStudio 2012 and 2013, and didn’t get the reload message, but I continue to get errors related to json11.hpp from the wb project (specifically “cannot open include file initializer_list – no such file/directory”).

The instructions on github warn that “the project depends on an external libwb [so] we must perform a recursive clone (to also checkout the libwb repository)”. I used the command
git clone --recursive https://gjwilms@bitbucket.org/hwuligans/gputeachingkit-labs.git
maybe I am doing something wrong there?


From: jbungo
Sent: Thursday, September 08, 2016 8:29 AM
To: gjw
Subject: RE: GPU Teaching Kit - Accelerated Computing Labs: compile issues

Sorry you’re having issues.

Did you delete all previous folders, check out and start cmake process over again?

http://stackoverflow.com/questions/3619725/stop-visual-studio-asking-for-each-project-has-been-modified-outside-the-enviro

Unfortunately I’m not an expert on cmake – by chance have you checked cmake forums as well?


From: gjw
To: jbungo
Sent: Wednesday, September 07, 2016 3:38 PM
Subject: RE: GPU Teaching Kit - Accelerated Computing Labs: compile issues

I now get 12 errors – the same 3 in the template.cu of DeviceQuery_Solution, the rest from json11.cpp, and 1 from wbDirectory.cpp.
I also get the following:

From: gjw
Sent: Friday, September 23, 2016 3:12 PM
To: jbungo
Subject: Re: GPU Teaching Kit - Accelerated Computing Labs: compile issue

I should have inspected the code more carefully, but this one is so obvious that I feel like I must be the first one to actually run it. In the documentation on the NVIDIA Developer website AND in the latest repository on Bitbucket, the wbCheck function states
if (err == cudaSuccess)….
which of course terminates the program when the cuda operation actually was successful. That is why bypassing the wbCheck function lets the program proceed…

However, I still get a result array of all 0s. The odd thing is that setting the cells to some random value in the sgemm kernel function, as in
C[row * numBColumns + col] = 4.5;
STILL results in all 0s!


From: jbungo
Date: Friday, September 23, 2016 at 1:55 PM
To: gjw
Subject: Re: GPU Teaching Kit - Accelerated Computing Labs: compile issue

Still thinking it’s an allocation issue. wbCheck is defined as

#define wbCheck(stmt)                                            \
  do {                                                           \
    cudaError_t err = stmt;                                      \
    if (err != cudaSuccess) {                                    \
      wbLog(ERROR, "Failed to run stmt ", #stmt);                \
      wbLog(ERROR, "Got CUDA error … ", cudaGetErrorString(err)); \
      return -1;                                                 \
    }                                                            \
  } while (0)

Could you be checking error incorrectly?


From: gjw
Sent: Friday, September 23, 2016 10:55 AM
To: jbungo
Subject: Re: GPU Teaching Kit - Accelerated Computing Labs: compile issue

No, I tested this. As I said, when I bypass wbCheck, call cudaMalloc directly, and check the error code returned, it reports “no error”. All the sample programs that come with the CUDA Toolkit also run correctly.


From: jbungo
Date: Friday, September 23, 2016 at 10:47 AM
To: gjw
Subject: Re: GPU Teaching Kit - Accelerated Computing Labs: compile issue

As this is a runtime error, what hardware are you using? You may not have enough memory to allocate the matrix.


From: gjw
Sent: Thursday, September 22, 2016 3:06 PM
To: jbungo
Subject: Re: GPU Teaching Kit - Accelerated Computing Labs: compile issue

C:\Users\user\Documents\cuda\build\Debug\MatrixMultiplication\Dataset\0>…\BasicMatrixMultiplication_Solution.exe -e output.raw -i input0.raw,input1.raw
{“data”: {“elapsed_time”: 3949848, “end_file”: “C:/Users/user/Documents/cuda/gputeachingkit-labs/Module4/BasicMatrixMultiplication/solution.cu”, “end_function”: “main”, “end_line”: 52, “end_time”: 4947245923438, “id”: “a18658cd-0ff0-477b-8083-1fd61ef4f768”, “idx”: 0, “kind”: “Generic”, “message”: “Importing data and creating memory on host”, “mpi_rank”: 0, “parent_id”: -1, “session_id”: “session_id_disabled”, “start_file”: “C:/Users/user/Documents/cuda/gputeachingkit-labs/Module4/BasicMatrixMultiplication/solution.cu”, “start_function”: “main”, “start_line”: 45, “start_time”: 4947241973590, “stopped”: true}, “id”: “a18658cd-0ff0-477b-8083-1fd61ef4f768”, “session_id”: “session_id_disabled”, “type”: “timer”}
{“data”: {“file”: “C:/Users/user/Documents/cuda/gputeachingkit-labs/Module4/BasicMatrixMultiplication/solution.cu”, “function”: “main”, “id”: “aef41e70-a007-402b-89b6-bf4093a5cbfe”, “level”: “Trace”, “line”: 57, “message”: “The dimensions of A are 16 x 16”, “mpi_rank”: 0, “session_id”: “session_id_disabled”, “time”: 4947352581734}, “id”: “aef41e70-a007-402b-89b6-bf4093a5cbfe”, “session_id”: “session_id_disabled”, “type”: “logger”}
{“data”: {“file”: “C:/Users/user/Documents/cuda/gputeachingkit-labs/Module4/BasicMatrixMultiplication/solution.cu”, “function”: “main”, “id”: “54a2c75a-1882-4525-ae7f-c8b097026116”, “level”: “Trace”, “line”: 58, “message”: “The dimensions of B are 16 x 16”, “mpi_rank”: 0, “session_id”: “session_id_disabled”, “time”: 4947391317118}, “id”: “54a2c75a-1882-4525-ae7f-c8b097026116”, “session_id”: “session_id_disabled”, “type”: “logger”}
{“data”: {“file”: “C:/Users/user/Documents/cuda/gputeachingkit-labs/Module4/BasicMatrixMultiplication/solution.cu”, “function”: “main”, “id”: “312d37d1-7a63-4b83-9c7c-af804ca1d44c”, “level”: “Trace”, “line”: 59, “message”: “The dimensions of C are 16 x 16”, “mpi_rank”: 0, “session_id”: “session_id_disabled”, “time”: 4947425279781}, “id”: “312d37d1-7a63-4b83-9c7c-af804ca1d44c”, “session_id”: “session_id_disabled”, “type”: “logger”}
{“data”: {“file”: “C:/Users/user/Documents/cuda/gputeachingkit-labs/Module4/BasicMatrixMultiplication/solution.cu”, “function”: “main”, “id”: “aefd05dd-f7c3-4647-984d-216d810935dc”, “level”: “Error”, “line”: 65, “message”: “Failed to run stmt cudaMalloc((void **)&deviceA, numAColumns * sizeof(float))”, “mpi_rank”: 0, “session_id”: “session_id_disabled”, “time”: 4947539451693}, “id”: “aefd05dd-f7c3-4647-984d-216d810935dc”, “session_id”: “session_id_disabled”, “type”: “logger”}

The cudaMalloc call in line 65 is passed to wbCheck(). When I called it directly,
cudaError_t err = cudaMalloc((void **)&deviceA, numARows * numAColumns*sizeof(float));
cudaGetErrorString(err) reports “no error”. When I remove all the wbCheck references, the program runs to completion but fails the wbSolution call:
{“data”: {“file”: “C:\Users\user\Documents\cuda\gputeachingkit-labs\libwb\wbSolution.cpp”, “function”: “wbSolution”, “id”: “97610309-d69f-4031-906c-3ad84fb334f0”, “level”: “Error”, “line”: 145, “message”: “Failed to grade solution”, “mpi_rank”: 0, “session_id”: “session_id_disabled”, “time”: 4872522873799}, “id”: “97610309-d69f-4031-906c-3ad84fb334f0”, “session_id”: “session_id_disabled”, “type”: “logger”}
{“data”: {“correctq”: false, “message”: “”}, “type”: “solution”}
and indeed the entire hostC array is nothing but zeroes.


From: jbungo
Date: Thursday, September 22, 2016 at 8:27 AM
To: gjw
Subject: Re: GPU Teaching Kit - Accelerated Computing Labs: compile issue

Can you provide the failing messages?


From: gjw
Sent: Wednesday, September 21, 2016 10:47 PM
To: jbungo
Subject: Re: GPU Teaching Kit - Accelerated Computing Labs: compile issues

I may have spoken too soon. The files compile now (with the exception of 1), but they fail at runtime anytime they use a kernel function or access device memory.

From: jbungo
Sent: Friday, September 23, 2016 6:29 PM
To: gjw
Subject: Re: GPU Teaching Kit - Accelerated Computing Labs: compile issues

if (err == cudaSuccess)….

Indeed, that’s a typo; it should be

if (err != cudaSuccess)

We have committed a fix.
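
To make the effect of the inverted comparison concrete, here is a minimal host-only sketch. It does not use the CUDA runtime: FakeError, fakeSuccess, fakeFailure, checkBuggy, and checkFixed are hypothetical stand-ins for cudaError_t, cudaSuccess, a failing status, and the two versions of the test inside wbCheck.

```cpp
// Hypothetical stand-ins for cudaError_t / cudaSuccess so the sketch
// compiles without the CUDA toolkit.
enum FakeError { fakeSuccess = 0, fakeFailure = 1 };

// Both helpers return -1 when the check fires (the macro would log an
// error and return from main) and 0 when execution would continue.

// Buggy form: fires when the call succeeded.
static int checkBuggy(FakeError err) {
  if (err == fakeSuccess) return -1;
  return 0;
}

// Fixed form: fires only when the call failed.
static int checkFixed(FakeError err) {
  if (err != fakeSuccess) return -1;
  return 0;
}
```

With the buggy form, the first successful call is reported as a failure, which matches the "Failed to run stmt cudaMalloc(...)" log entry earlier in the thread even though cudaGetErrorString reported "no error".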

From: gjw
Sent: Tuesday, October 11, 2016 8:17 PM
To: jbungo
Subject: Re: GPU Teaching Kit - Accelerated Computing Labs: compile issues

I have found the support libraries (libwb) to be very finicky, particularly the third-party hpp code in the vendor folder. On machines with an identical setup (VStudio 2013 Update 5, CUDA Toolkit 7.5), the wbTime_stop() function works fine on some and crashes on others. The following code is the culprit:
json11::Json json = json11::Json::object{
    {"type", "timer"},
    {"id", wbTimerNode_getId(node)},
    {"session_id", wbTimerNode_getSessionId(node)},
    {"data", wbTimerNode_toJSONObject(node)}};
std::cout << json.dump() << std::endl;

The assignment of “timer” to type in the json struct is invalid (on some machines) and causes the crash.

I have been able to work around it by commenting out the wbTime_stop function call (or the assignment to type), but it is still puzzling.

From: jbungo
Sent: Tuesday, October 18, 2016 1:41 PM
To: gjw
Subject: Re: GPU Teaching Kit - Accelerated Computing Labs: compile issues

Can you try upgrading to the latest version of CUDA? This has solved a lot of compatibility issues so far.

Hi,

The links to the BitBucket site and to the README provided in Module[2]-DeviceQuery.pdf lead nowhere. Could you provide these links? Also, when I tried to compile the code in it, I got an error message saying that wb.h cannot be found. I’m using Windows 7 and Visual Studio 12. Could you help? Many thanks.

Dave

Bitbucket site/README is here: https://bitbucket.org/hwuligans/gputeachingkit-labs. Can you post the exact error you’re getting?