linking hand-coded PTX

alex_dubinsky · August 29, 2007, 6:40pm

I think the title sums it up.

I’ve written some PTX assembly and successfully compiled it with ptxas, but I’m struggling to find a way to integrate it into my program.

Is the only way to do this through the device code repository mechanism? I’m trying to figure it out now, but it’s far from straightforward (tips appreciated!). Surely there’s some more direct way, such as compiling the ptx into a .obj and linking, but I haven’t been able to figure it out.

I really wish I could just simply pass .ptx files along with .cu to nvcc. Or better yet, inline it into the .cu.

Given the poor quality of the current compiler, ptx is turning out much more valuable than hand assembly should normally be. P.S. I’d really appreciate a “no optimizations” option in ptxas, and even a full gamut of -O’s.

tachyon_john · August 29, 2007, 6:55pm

I haven’t spent any time on ptxas yet, but it seems that you should be able to write a PTX function, create a header file for it, and call it from your .cu files, yes? Which compiler stage does the function inlining? This came to mind when someone was asking about calling SAD from their CUDA code. How does nvcc pull in intrinsics like __expf() etc.? Do they have ptx inlining implemented or something? I’ve had a passing interest in this, but all of my CUDA coding so far has been in C, and I’ve only been reading the PTX to verify that the compiler is doing what I want. I haven’t felt the need to start writing in PTX itself.

John Stone

alex_dubinsky · August 29, 2007, 11:38pm

OK, I’ve gotten the hang of the code repository somewhat. Actually, it’s a fairly powerful and simple-to-use (though poorly documented!) feature.

Basically, it works like this:
You can change kernels without recompiling the executable by simply creating or replacing files in a special directory. If your executable is called “./L33tProg.exe,” the runtime will automatically check for kernel implementations in a folder (or tar file!) called “./L33tProg.devcode”. The kernels can be either in ptx or cubin form.

Creating the ptx/cubin files is a bit difficult, though, because there seem to be some restrictions on the contents of the files, and there’s no error reporting: the only sign of trouble is that your kernel appears to run fine but gives broken results. To avoid starting from scratch, you can add the “-dir=$(ProjectName).exe.devcode -ext=all -int=none -arch compute_10 -code compute_10,sm_10,sm_11” compile flags. These will generate the L33tProg.devcode folder that contains all the kernels in your project. The flags also make the executable require this folder, since it no longer has embedded kernels itself. That’s a debugging trick so you never have doubts about whether the new kernel gets loaded.
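
To make the flag list above easier to read, here is a rough Python sketch that only assembles the command line, one flag per line. The flags are the ones quoted above; the helper name, the `kernel.cu` source file, and the `-o` output name are assumptions for illustration, and the comments reflect my reading of the flags rather than the official docs.

```python
def devcode_nvcc_command(project, source):
    """Build the nvcc argument list described in this post (hypothetical helper)."""
    repo_dir = f"{project}.exe.devcode"  # repository directory, named as in the post
    return [
        "nvcc",
        f"-dir={repo_dir}",  # directory that receives the generated ptx/cubin files
        "-ext=all",          # write all kernels to the external repository
        "-int=none",         # embed no kernels in the executable itself
        "-arch", "compute_10",
        "-code", "compute_10,sm_10,sm_11",
        "-o", f"{project}.exe",  # assumed output name
        source,                  # assumed source file
    ]


cmd = devcode_nvcc_command("L33tProg", "kernel.cu")
print(" ".join(cmd))
```

This just prints the command so you can paste it into a build step; it does not run nvcc.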

However, programming in ptx is turning out to be especially difficult with no debugging. In fact, you’re not even really informed when the kernel totally fails, except that the output data looks a certain way. ptxas doesn’t emit informative syntax errors either, and sometimes just dies with “internal error” and no line number. It would be great if ptx could be compiled to host code and debugged just like .cu. Ah… I dream of inline ptx. And of a better assembly syntax… maybe something that looks like C but isn’t (so a mov is an assignment, a cvt is a cast, an ALU op is an expression… but you can only do one thing on a line). Hell, ptx ain’t real assembly anyway.
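
To show roughly what I mean by the C-like syntax, here is a toy Python sketch that turns one-operation-per-line statements into PTX text. Everything in it is made up for illustration: the `to_ptx` name, the accepted grammar, and the fixed `.s32` type. It only handles mov, add, sub and mul.lo; cvt-as-a-cast is left out because it would need type and rounding information.

```python
import re

# Hypothetical mapping from C-like operators to PTX integer opcodes.
ALU_OPS = {"+": "add", "-": "sub", "*": "mul.lo"}

# dst = a            -> mov
# dst = a <op> b     -> one ALU instruction
# Operands are %registers or integer literals; anything longer is rejected.
STMT = re.compile(
    r"^\s*(%\w+)\s*=\s*(%\w+|-?\d+)\s*(?:([-+*])\s*(%\w+|-?\d+))?\s*;?\s*$"
)


def to_ptx(line, ty="s32"):
    """Translate a single C-like statement into one PTX instruction string."""
    m = STMT.match(line)
    if m is None:
        raise ValueError(f"expected exactly one operation per line: {line!r}")
    dst, a, op, b = m.groups()
    if op is None:
        return f"mov.{ty} {dst}, {a};"            # plain assignment
    return f"{ALU_OPS[op]}.{ty} {dst}, {a}, {b};"  # single ALU expression


for stmt in ["%r1 = %r2", "%r1 = %r1 + 4", "%r3 = %r1 * %r2;"]:
    print(to_ptx(stmt))
```

A line with two operations, such as `%r1 = %r2 + %r3 + %r4`, raises a ValueError, which is the "only one thing on a line" rule.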

p.s. the proper docs are in chapter 6 of C:\CUDA\doc\NVCC_1.0.pdf

alex_dubinsky · August 30, 2007, 10:17pm

I’m giving up hand-coding ptx assembly because bugs in ptxas are making it impossible.

I’d like to submit a repro case that is able to reproduce two different ways that ptxas craps itself. Who can I send it to?

paulius · August 31, 2007, 12:12am

You can send me a message.

Paulius
