CUPTI, CUDA9.1 and sass_source_map example

I am running the CUPTI sass_source_map example on Win7/64 with CUDA 9.1, a Titan Volta, and dev driver 388.59.
If you enable the DUMP_CUBIN symbol, the example writes out the binary (cubin) for the kernel. You can then use
nvdisasm to dump the SASS.

I get the binary file. Then I type nvdisasm -b SM70 sass_source_map.cubin
and I get an nvdisasm error: Unrecognized operation for functional unit ‘uC’ at address 0x00000000.

Can anyone help me diagnose this error?

Hi Bob,

Can you try without the option “-b SM70”? The -b option should only be given for a raw instruction binary file. A cubin already records the architecture its SASS was built for, so there is no need to pass -b when dumping the SASS of a cubin.
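To make the distinction concrete, here is a minimal Python sketch that decides whether -b is needed by checking for the ELF magic bytes at the start of the file (a cubin is an ELF file; a raw instruction dump is not). The helper name nvdisasm_args and the default architecture SM70 are illustrative assumptions, not part of any NVIDIA tool.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file, cubins included


def nvdisasm_args(path, raw_arch="SM70"):
    """Return an nvdisasm argument list for `path` (hypothetical helper).

    ELF input (a cubin) already records its architecture, so -b is omitted.
    Anything else is treated as a raw instruction binary and gets -b <arch>.
    """
    with open(path, "rb") as f:
        is_elf = f.read(4) == ELF_MAGIC
    if is_elf:
        return ["nvdisasm", path]
    return ["nvdisasm", "-b", raw_arch, path]
```

The returned list can be passed to subprocess.run as-is; the check only inspects the first four bytes and does not validate the rest of the file.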

Hi SagarAgrawal

The comment in the code says to do this.

	// Try nvdisasm -b -fun <function_id> sass_to_source.cubin

I tried

nvdisasm -b -fun transpose sass_source_map.cubin
nvdisasm -b -fun 1 sass_source_map.cubin
nvdisasm -b SM70 -fun transpose sass_source_map.cub

The first two failed.
The third one said ‘transpose: expected a number’

I couldn’t figure out what the <function_id> is, so I went to this web page and searched for it. function_id doesn’t exist.

The -fun option says it wants a <symbol index>. I couldn’t figure out what that was (you can see I just guessed 1 above).

So I did this.

nvdisasm sass_source_map.cubin > aa.txt

I then searched for ‘transpose’ with the hope of finding a symbol index. I don’t know what I am looking for.

I looked at the documentation for -b and it says this …

“When this option is specified, the input file is assumed to contain a raw instruction binary, that is, a sequence of binary instruction encodings as they occur in instruction memory. The value of this option must be the asserted architecture of the raw binary. Allowed values for this option: ‘SM20’,‘SM21’,‘SM30’,‘SM32’, ‘SM35’,‘SM37’,‘SM50’,‘SM52’,‘SM53’,‘SM60’,‘SM61’,‘SM62’,‘SM70’.”


Hi Bob,

The nvdisasm utility is used to dump the SASS, and according to your comment #3 you are able to dump it using the command “nvdisasm sass_source_map.cubin”.

Now, coming to your question about the -fun option passed to nvdisasm.

Here is the relevant part of the nvdisasm -h output:
--cuda-function-index <symbol index>,... (-fun)
Restrict the output to the CUDA functions represented by symbols with the
given indices. The CUDA function for a given symbol is the enclosing section.
This only restricts executable sections; all other sections will still be

So the -fun option restricts the dump to the executable sections of the functions you are interested in. Without it, you get the executable sections of every kernel in the cubin. The option is purely a filter.

The question is then how to find the function index. For this, use the cuobjdump utility with the following option:

cuobjdump -elf cubin_name.cubin

In the symbol table you will see the mangled name of your kernel with its corresponding index.

Here is part of the output of cuobjdump -elf sass_source_map.cubin:

.section .symtab
index  value  size  info  other  shndx  name
0      0      0     0     0      0      (null)
1      0      0     3     0      c      .text._Z9transposePfPKf
2      0      0     3     0      d      .nv.shared._Z9transposePfPKf
3      0      0     3     0      b      .nv.constant0._Z9transposePfPKf
4      0      0     3     0      4      .debug_line
5      0      0     3     0      5      .nv_debug_line_sass
6      0      0     3     0      6      .nv_debug_ptx_txt
7      0      768   12    10     c      _Z9transposePfPKf

The entry to look for is .text.KernelNameOfYourInterest and its corresponding index.
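As a rough illustration of that lookup, the following Python sketch scans symbol-table text like the excerpt above and returns the index of the .text entry for a given mangled kernel name. It assumes the whitespace-separated seven-column layout shown in this thread; the layout may differ between cuobjdump versions, and the function name text_section_index is made up for this example.

```python
# Excerpt of the symbol table pasted in this thread (cuobjdump -elf output).
SYMTAB = """\
index value size info other shndx name
0 0 0 0 0 0 (null)
1 0 0 3 0 c .text._Z9transposePfPKf
2 0 0 3 0 d .nv.shared._Z9transposePfPKf
3 0 0 3 0 b .nv.constant0._Z9transposePfPKf
7 0 768 12 10 c _Z9transposePfPKf
"""


def text_section_index(symtab_text, mangled_name):
    """Return the symbol index of `.text.<mangled_name>`, or None if absent."""
    wanted = ".text." + mangled_name
    for line in symtab_text.splitlines():
        fields = line.split()
        # Data rows have seven columns and start with a numeric index;
        # the header row is skipped because "index" is not a digit string.
        if len(fields) == 7 and fields[0].isdigit() and fields[6] == wanted:
            return int(fields[0])
    return None


print(text_section_index(SYMTAB, "_Z9transposePfPKf"))  # prints 1
```

In practice the symtab text would come from running cuobjdump -elf on the cubin and capturing its standard output.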

As mentioned in comment #2, do not pass the -b option when dumping SASS from a cubin with nvdisasm. The -fun option is only useful when the cubin has hundreds of kernels and you want the SASS of one particular kernel; for a small number of kernels you can simply dump the executable sections of all of them. Even for a large cubin, you can redirect the nvdisasm output to a file and grep it for the function you are interested in. So the -fun option should not block you, even if you cannot find the index of the kernel.

I hope this answers all your questions.
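For anyone without grep at hand (for example on Windows), a small Python stand-in for the "redirect and grep" approach could look like the sketch below. The SAMPLE text is a hypothetical placeholder, not real nvdisasm output, and grep_lines is a made-up helper name.

```python
def grep_lines(text, needle):
    """Return (line_number, line) pairs for lines containing `needle`.

    Line numbers start at 1, as in `grep -n`.
    """
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line
    ]


# Hypothetical stand-in for the contents of a saved nvdisasm dump.
SAMPLE = "\n".join([
    "// placeholder header",
    ".section .text._Z9transposePfPKf",
    "// placeholder instruction lines",
])

for number, line in grep_lines(SAMPLE, "transpose"):
    print(number, line)
```

With a real dump, read the redirected file into a string first and pass that in place of SAMPLE.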

Thank you for your response. You’ve caused me to look even deeper into the problem. I think it would have been better if you started by saying “Bob … the comment in the code is wrong!”

I am running all of these tools under Win7 in debug mode. I looked at the output of the sass CUPTI program and finally realized that the functionId is actually printed out! So I know it is 1 for the transpose function.

I am trying to help you make your products better.

If you follow the comment in the code you get an error!

nvdisasm -b -fun 1 sass_source_map.cubin

Clearly, this command is also wrong (as you said) … and it generates an error.

nvdisasm -b sass_source_map.cubin

By the way, although this command works, the function is excluded from the output file. Try it. There is no .text section for the function.

nvdisasm -fun 1 sass_source_map.cubin > output.txt

So we think the comment should have said “try nvdisasm sass_source_map.cubin > output.txt”

That will work. You can then crack open output.txt and look for the transpose function.

So here’s the next problem: the purpose of the CUPTI sass example is to show you how to correlate opcodes back to source lines. You will quickly see that the command “nvdisasm sass_source_map.cubin > output.txt” is useless for that, because its output carries no source line information.

It should be “nvdisasm -g sass_source_map.cubin > output.txt”
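For anyone scripting this, here is a small Python sketch that assembles the command discussed in this thread: -g (print line info) is on by default, -b is never passed, and -fun is optional. The helper names build_cmd and dump_sass are illustrative only, and the sketch assumes nvdisasm is on the PATH and that the cubin was built with line information.

```python
import subprocess


def build_cmd(cubin, function_index=None, line_info=True):
    """Assemble an nvdisasm command line for a cubin (hypothetical helper)."""
    cmd = ["nvdisasm"]
    if line_info:
        cmd.append("-g")  # annotate the SASS with source line info
    if function_index is not None:
        cmd += ["-fun", str(function_index)]  # restrict to one symbol index
    cmd.append(cubin)
    return cmd


def dump_sass(cubin, out_path, **options):
    """Run nvdisasm and write its standard output to `out_path`."""
    with open(out_path, "w") as out:
        subprocess.run(build_cmd(cubin, **options), stdout=out, check=True)


# Example call (needs nvdisasm and the cubin to exist):
# dump_sass("sass_source_map.cubin", "output.txt")
```

Keeping the argument assembly separate from the subprocess call makes it easy to print the command first and check it before running anything.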

It would be really great if you fixed the sample code so that other developers don’t have to struggle with this.