How to set up TextPad to compile simple kernels?

Hello,

I think Visual Studio is a very bad/slow/buggy IDE; it drives me nuts.

Therefore I just want to write some simple kernels and compile them with textpad 4 or textpad 5… to get the hang of it ;)

I already added nvcc as a program to the tools section of textpad.

I also added the VC\BIN folder to the environment path of windows.

As well as the VC\BIN\AMD64 path.

Which is a bit surprising but ok.

NVCC runs from command prompt.

CL strangely does not and “crashes with application error”.

(I also changed the third color in textpad to purple, which is nicer for __global__ ;) :))

Anyway NVCC seems to compile up to this point:

File adder.cu:

__global__ void kernel( int a, int b )
{
int c;

c = a + b;

}

nvcc output:

adder.cu
O:/CUDA C/test add-er/version 0.01/adder.cu(3): warning: variable "c" was set but never used

tmpxft_00000344_00000000-3_adder.cudafe1.gpu
tmpxft_00000344_00000000-8_adder.cudafe2.gpu
adder.cu
O:/CUDA C/test add-er/version 0.01/adder.cu(3): warning: variable "c" was set but never used

tmpxft_00000344_00000000-3_adder.cudafe1.cpp
tmpxft_00000344_00000000-14_adder.ii
LINK : fatal error LNK1104: cannot open file 'kernel32.lib'

Tool completed with exit code 2

kernel32.lib is nowhere to be found. Does this mean kernel32.lib is for host applications/executables only ?

I just want some PTX or perhaps CUBIN’s…

So I probably need to add some command line parameters to NVCC ?

Additional command line options like “no optimizing” might be nice too… to prevent nvcc from eliminating simple code like the add-er ;)

The first thing I want to get rid of is this error message about “kernel32.lib”. Is that possible ?

Perhaps I should use a different compiler instead of nvcc ?

Perhaps use opencc directly ?

I tried that too by adding it to textpad, this was the result:

nvopencc.exe ERROR: signal 11 caught, stop processing

Tool completed with exit code 1

Doesn’t seem too good… so back to nvcc me thinks…

Ok, the solution is real easy: just add the -ptx command line parameter.

Now nvcc works, opencc does not but is not needed, just nvcc is needed ! ;) =D Very nice ! ;) =D

Real sweet… it just generates one nice file… not the huge overhead that visual studio produces ;)

Output from nvcc: (it still optimized the code away so gotta try and get that gone… also gotta try and get tools in main menu instead of submenu in textpad…
maybe command line would be a better choice for textpad ;) but then again maybe not ;))

File adder.ptx:

.version 1.4
.target sm_10, map_f64_to_f32
// compiled with C:\Tools\CUDA\Toolkit 4.0\v4.0\bin/../open64/lib//be.exe
// nvopencc 4.0 built on 2011-05-13

//-----------------------------------------------------------
// Compiling C:/Users/Skybuck/AppData/Local/Temp/tmpxft_000017b0_00000000-11_adder.cpp3.i (C:/Users/Skybuck/AppData/Local/Temp/ccBI#.a04264)
//-----------------------------------------------------------

//-----------------------------------------------------------
// Options:
//-----------------------------------------------------------
//  Target:ptx, ISA:sm_10, Endian:little, Pointer Size:64
//  -O3	(Optimization level)
//  -g0	(Debug level)
//  -m2	(Report advisories)
//-----------------------------------------------------------

.file	1	"C:/Users/Skybuck/AppData/Local/Temp/tmpxft_000017b0_00000000-10_adder.cudafe2.gpu"
.file	2	"c:\tools\microsoft visual studio 10.0\vc\include\codeanalysis\sourceannotations.h"
.file	3	"C:\Tools\CUDA\Toolkit 4.0\v4.0\bin/../include\crt/device_runtime.h"
.file	4	"C:\Tools\CUDA\Toolkit 4.0\v4.0\bin/../include\host_defines.h"
.file	5	"C:\Tools\CUDA\Toolkit 4.0\v4.0\bin/../include\builtin_types.h"
.file	6	"c:\tools\cuda\toolkit 4.0\v4.0\include\device_types.h"
.file	7	"c:\tools\cuda\toolkit 4.0\v4.0\include\driver_types.h"
.file	8	"c:\tools\cuda\toolkit 4.0\v4.0\include\surface_types.h"
.file	9	"c:\tools\cuda\toolkit 4.0\v4.0\include\texture_types.h"
.file	10	"c:\tools\cuda\toolkit 4.0\v4.0\include\vector_types.h"
.file	11	"c:\tools\cuda\toolkit 4.0\v4.0\include\builtin_types.h"
.file	12	"c:\tools\cuda\toolkit 4.0\v4.0\include\host_defines.h"
.file	13	"C:\Tools\CUDA\Toolkit 4.0\v4.0\bin/../include\device_launch_parameters.h"
.file	14	"c:\tools\cuda\toolkit 4.0\v4.0\include\crt\storage_class.h"
.file	15	"C:\Tools\Microsoft Visual Studio 10.0\VC\bin/../../VC/INCLUDE\time.h"
.file	16	"O:/CUDA C/test add-er/version 0.01/adder.cu"
.file	17	"C:\Tools\CUDA\Toolkit 4.0\v4.0\bin/../include\common_functions.h"
.file	18	"c:\tools\cuda\toolkit 4.0\v4.0\include\math_functions.h"
.file	19	"c:\tools\cuda\toolkit 4.0\v4.0\include\math_constants.h"
.file	20	"c:\tools\cuda\toolkit 4.0\v4.0\include\device_functions.h"
.file	21	"c:\tools\cuda\toolkit 4.0\v4.0\include\sm_11_atomic_functions.h"
.file	22	"c:\tools\cuda\toolkit 4.0\v4.0\include\sm_12_atomic_functions.h"
.file	23	"c:\tools\cuda\toolkit 4.0\v4.0\include\sm_13_double_functions.h"
.file	24	"c:\tools\cuda\toolkit 4.0\v4.0\include\sm_20_atomic_functions.h"
.file	25	"c:\tools\cuda\toolkit 4.0\v4.0\include\sm_20_intrinsics.h"
.file	26	"c:\tools\cuda\toolkit 4.0\v4.0\include\surface_functions.h"
.file	27	"c:\tools\cuda\toolkit 4.0\v4.0\include\texture_fetch_functions.h"
.file	28	"c:\tools\cuda\toolkit 4.0\v4.0\include\math_functions_dbl_ptx1.h"


.entry _Z6kernelii (
	.param .s32 __cudaparm__Z6kernelii_a,
	.param .s32 __cudaparm__Z6kernelii_b)
{
.loc	16	1	0

$LDWbegin__Z6kernelii:
.loc 16 6 0
exit;
$LDWend__Z6kernelii:
} // _Z6kernelii
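
The PTX above contains only exit; because the sum stored in c is never used, so the compiler is free to drop the addition. A minimal sketch of one way around that, assuming it is acceptable to give the kernel a third parameter: store the sum through a pointer so the addition has a visible effect. The result parameter and the __CUDACC__ guard are assumptions added for illustration and are not part of the original adder.cu; the guard only lets a plain C++ compiler parse the file for a quick host-side check.

```cpp
// adder.cu, hypothetical variant that keeps the addition alive.
// Build as in the post: nvcc -ptx adder.cu

// Assumption: this guard is only here so a plain C++ compiler can also
// parse the file; nvcc defines __CUDACC__ and skips it.
#ifndef __CUDACC__
#define __global__
#endif

__global__ void kernel( int a, int b, int *result )
{
    // Writing through the pointer is an observable side effect,
    // so the addition can no longer be discarded as dead code.
    *result = a + b;
}
```

With this version the generated PTX should contain an add and a store instead of a bare exit; the exact instructions depend on the toolkit version.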

Bye,
Skybuck.

One way is to write the command in a batch file and have textpad call it.

Not needed…

Just add nvcc as a command to textpad ;) :)

Also, for 32 bit applications don’t forget to add --machine 32 (short form -m32) to the command line options, and perhaps -ptx for assembly output only, and perhaps -arch compute_20 for compute 2.0 and beyond; this allows dynamic allocation and such ;) :)
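
To illustrate the dynamic allocation remark: on compute 2.0 and later a kernel can call malloc and free itself. A minimal sketch, assuming the file is built with -ptx -arch compute_20; the kernel name scratch_add, the result parameter and the __CUDACC__ guard are hypothetical and only there for illustration (the guard lets a plain C++ compiler parse the file as well).

```cpp
// scratch.cu, hypothetical example of in-kernel allocation.
// Assumed build line: nvcc -ptx -arch compute_20 scratch.cu
#include <stdlib.h>   // malloc, free, NULL

// Assumption: guard so a plain C++ compiler can also parse the file;
// nvcc defines __CUDACC__ and skips it.
#ifndef __CUDACC__
#define __global__
#endif

__global__ void scratch_add( int a, int b, int *result )
{
    // Allocate one int of scratch space from inside the kernel.
    int *scratch = (int *)malloc( sizeof(int) );
    if ( scratch == NULL )
        return;               // allocation failed, leave *result untouched

    *scratch = a + b;
    *result  = *scratch;      // store makes the work observable

    free( scratch );
}
```

Without -arch compute_20 (or higher) nvcc is expected to reject the malloc call in the kernel, since the default sm_10 target shown in the PTX header above has no device heap.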