Hello,
I think Visual Studio is a very bad/slow/buggy IDE; it drives me nuts.
Therefore I just want to write some simple kernels and compile them with textpad 4 or textpad 5 to get the hang of it.
I already added nvcc as a program to the tools section of textpad.
I also added the VC\BIN folder to the environment path of windows.
As well as the VC\BIN\AMD64 path.
Which is a bit surprising but ok.
NVCC runs from command prompt.
CL strangely does not and “crashes with application error”.
(I also changed the third color in textpad to purple, which is nicer for __global__.)
Anyway NVCC seems to compile up to this point:
File adder.cu:
__global__ void kernel( int a, int b )
{
    int c;
    c = a + b;
}
nvcc output:
adder.cu
O:/CUDA C/test add-er/version 0.01/adder.cu(3): warning: variable “c” was set but never used
tmpxft_00000344_00000000-3_adder.cudafe1.gpu
tmpxft_00000344_00000000-8_adder.cudafe2.gpu
adder.cu
O:/CUDA C/test add-er/version 0.01/adder.cu(3): warning: variable “c” was set but never used
tmpxft_00000344_00000000-3_adder.cudafe1.cpp
tmpxft_00000344_00000000-14_adder.ii
LINK : fatal error LNK1104: cannot open file ‘kernel32.lib’
Tool completed with exit code 2
kernel32.lib is nowhere to be found. Does this mean kernel32.lib is for host applications/executables only?
I just want some PTX or perhaps CUBINs…
So I probably need to add some command line parameters to NVCC?
Additional command line options like “no optimizing” might be nice too, to prevent nvcc from eliminating simple code like the add-er.
The first thing I want to get rid of is this error message about “kernel32.lib”. Is that possible?
Perhaps I should use a different compiler instead of nvcc?
Perhaps use opencc directly?
I tried that too by adding it to textpad; this was the result:
nvopencc.exe ERROR: signal 11 caught, stop processing
Tool completed with exit code 1
Doesn’t seem too good… so back to nvcc, methinks…
OK, the solution is real easy: just add the -ptx command line parameter.
Now nvcc works. opencc does not, but it is not needed; just nvcc is needed! =D Very nice!
Real sweet… it just generates one nice file, not the huge overhead that Visual Studio produces.
Output from nvcc: (it still optimized the code away, so I have got to try and get that gone… I have also got to try and get the tools into the main menu instead of a submenu in textpad…
maybe command line would be a better choice for textpad, but then again maybe not.)
File adder.ptx:
.version 1.4
.target sm_10, map_f64_to_f32
// compiled with C:\Tools\CUDA\Toolkit 4.0\v4.0\bin/../open64/lib//be.exe
// nvopencc 4.0 built on 2011-05-13
//-----------------------------------------------------------
// Compiling C:/Users/Skybuck/AppData/Local/Temp/tmpxft_000017b0_00000000-11_adder.cpp3.i (C:/Users/Skybuck/AppData/Local/Temp/ccBI#.a04264)
//-----------------------------------------------------------
//-----------------------------------------------------------
// Options:
//-----------------------------------------------------------
// Target:ptx, ISA:sm_10, Endian:little, Pointer Size:64
// -O3 (Optimization level)
// -g0 (Debug level)
// -m2 (Report advisories)
//-----------------------------------------------------------
.file 1 "C:/Users/Skybuck/AppData/Local/Temp/tmpxft_000017b0_00000000-10_adder.cudafe2.gpu"
.file 2 "c:\tools\microsoft visual studio 10.0\vc\include\codeanalysis\sourceannotations.h"
.file 3 "C:\Tools\CUDA\Toolkit 4.0\v4.0\bin/../include\crt/device_runtime.h"
.file 4 "C:\Tools\CUDA\Toolkit 4.0\v4.0\bin/../include\host_defines.h"
.file 5 "C:\Tools\CUDA\Toolkit 4.0\v4.0\bin/../include\builtin_types.h"
.file 6 "c:\tools\cuda\toolkit 4.0\v4.0\include\device_types.h"
.file 7 "c:\tools\cuda\toolkit 4.0\v4.0\include\driver_types.h"
.file 8 "c:\tools\cuda\toolkit 4.0\v4.0\include\surface_types.h"
.file 9 "c:\tools\cuda\toolkit 4.0\v4.0\include\texture_types.h"
.file 10 "c:\tools\cuda\toolkit 4.0\v4.0\include\vector_types.h"
.file 11 "c:\tools\cuda\toolkit 4.0\v4.0\include\builtin_types.h"
.file 12 "c:\tools\cuda\toolkit 4.0\v4.0\include\host_defines.h"
.file 13 "C:\Tools\CUDA\Toolkit 4.0\v4.0\bin/../include\device_launch_parameters.h"
.file 14 "c:\tools\cuda\toolkit 4.0\v4.0\include\crt\storage_class.h"
.file 15 "C:\Tools\Microsoft Visual Studio 10.0\VC\bin/../../VC/INCLUDE\time.h"
.file 16 "O:/CUDA C/test add-er/version 0.01/adder.cu"
.file 17 "C:\Tools\CUDA\Toolkit 4.0\v4.0\bin/../include\common_functions.h"
.file 18 "c:\tools\cuda\toolkit 4.0\v4.0\include\math_functions.h"
.file 19 "c:\tools\cuda\toolkit 4.0\v4.0\include\math_constants.h"
.file 20 "c:\tools\cuda\toolkit 4.0\v4.0\include\device_functions.h"
.file 21 "c:\tools\cuda\toolkit 4.0\v4.0\include\sm_11_atomic_functions.h"
.file 22 "c:\tools\cuda\toolkit 4.0\v4.0\include\sm_12_atomic_functions.h"
.file 23 "c:\tools\cuda\toolkit 4.0\v4.0\include\sm_13_double_functions.h"
.file 24 "c:\tools\cuda\toolkit 4.0\v4.0\include\sm_20_atomic_functions.h"
.file 25 "c:\tools\cuda\toolkit 4.0\v4.0\include\sm_20_intrinsics.h"
.file 26 "c:\tools\cuda\toolkit 4.0\v4.0\include\surface_functions.h"
.file 27 "c:\tools\cuda\toolkit 4.0\v4.0\include\texture_fetch_functions.h"
.file 28 "c:\tools\cuda\toolkit 4.0\v4.0\include\math_functions_dbl_ptx1.h"
.entry _Z6kernelii (
.param .s32 __cudaparm__Z6kernelii_a,
.param .s32 __cudaparm__Z6kernelii_b)
{
.loc 16 1 0
$LDWbegin__Z6kernelii:
.loc 16 6 0
exit;
$LDWend__Z6kernelii:
} // _Z6kernelii
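The bare exit in the listing above is what you would expect: c is never stored anywhere that is visible outside the kernel, so the compiler treats the add as dead code and removes it. A minimal sketch of a variant that should keep the add, assuming an extra output pointer parameter (the name out is hypothetical and not part of the original adder.cu):

```cuda
// adder.cu (variant): the sum is written to global memory.
// NOTE: 'out' is an assumed extra parameter for illustration only.
// Compile to PTX without linking a host executable:
//   nvcc -ptx adder.cu -o adder.ptx
__global__ void kernel( int a, int b, int *out )
{
    int c;
    c = a + b;
    *out = c;   // the store is a visible side effect, so the add is no longer dead code
}
```

Compiled with the same -ptx option, the kernel body should then contain an add and a store to global memory instead of only exit. This does not switch optimization off; it just gives the optimizer a result it has to keep.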
Bye,
Skybuck.