cuobjdump supports disassembly of .cubin files built for sm_10 through sm_13 and is now available to registered developers. If you drop the binary in your CUDA bin directory on Linux, you can then disassemble running applications using the disas command.
Hey, an official decuda clone. That’s great news! I guess that’ll make me register as a developer then.
Thanks!
I believe the PDF file was not rendered correctly. It ends with a copy of the Supported Options section and then the message:
ERROR: undefined
OFFENDING COMMAND: f‘~
STACK:
Does the PDF contain all the information it was intended to contain?
The broken PDF is a known issue, and an updated package should be available shortly.
It is unclear to me how to use cuobjdump. I tried
cuobjdump.exe -sass test_kernel.compute_20.sm_20.cubin
The PDF indicates that this would output disassembled code, but I only get code, const and bincode blocks.
The PDF doesn’t describe what the difference between the positional argument and the --file option is. I tried calls such as the ones below without getting any output (not even error messages):
cuobjdump.exe -fnam test_kernel.compute_20.sm_20.cubin
cuobjdump.exe -fnam -f test_kernel.compute_20.sm_20.cubin
cuobjdump.exe -cubin -fun _Z13test_kernelPfyiif test_kernel.compute_20.sm_20.cubin
cuobjdump doesn’t support sm_20, only sm_10 through sm_13.
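As a rough illustration of that limit, here is a minimal Python sketch that guesses the target architecture from a cubin file name and checks it against the sm_10 through sm_13 range. The helper names are hypothetical, and it assumes the `name.compute_XX.sm_XX.cubin` naming seen in this thread; it looks only at the file name, not at the file contents.

```python
import re

# Range stated in this thread: sm_10 through sm_13 (inclusive).
SUPPORTED_SM = range(10, 14)

def sm_version(filename):
    """Return the number from an 'sm_XX' component of the file name, or None."""
    match = re.search(r"\bsm_(\d+)\b", filename)
    return int(match.group(1)) if match else None

def is_supported(filename):
    """True if the file name advertises an architecture in the supported range."""
    version = sm_version(filename)
    return version is not None and version in SUPPORTED_SM

for name in ("test_kernel.compute_20.sm_20.cubin",
             "hw.compute_10.sm_10.cubin",
             "convolutionTexture.sm_13.cubin"):
    print(name, "->", "supported" if is_supported(name) else "not supported")
```

A file renamed after compilation would defeat this check, so treat it as a quick sanity test before invoking the tool rather than a definitive answer.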
Nice! cuobjdump seems to also accept .exe files, not just .cubin files, and at least dumps the PTX. But while I can disassemble a .cubin file, -sass doesn’t seem to show the disassembly of that same code when it is embedded in the .exe. I’m not sure why, since the assembled code does appear to be in the .exe. See the output below.
d -ptx -sass -elf hw.compute_10.sm_10.cubin
.section .strtab STRTAB
.section .shstrtab STRTAB
.section .symtab SYMTAB
index value size info other shndx name
0 0 0 0 0 0 (null)
1 0 0 3 0 1 .shstrtab
2 0 0 3 0 2 .strtab
3 0 0 3 0 3 .symtab
4 0 0 3 0 0
5 0 0 3 0 0
6 0 24 3 0 4 .text._Z3funPi
7 0 0 3 0 6 .nv.shared._Z3funPi
8 0 0 3 0 5 .nv.info._Z3funPi
9 0 0 12 10 4 _Z3funPi
.nv.shared._Z3funPi NOBITS
No data to be dumped.
.text._Z3funPi PROGBITS
bar = 0 reg = 2 lmem=0 smem=20
0x1000c801 0x0423c780 0x10018005 0x00000003
0xd00e0005 0xa0c00781
.nv.info._Z3funPi PROGBITS
<0x1>
Attribute: EIATTR_SMEM_PARAM_OFFSETS
Format: EIFMT_SVAL
Value: 0x0
code for sm_10
--------------
Function : _Z3funPi
/*0000*/ MOV R0, g [0x4];
/*0008*/ MVI R1, 0x1;
/*0010*/ GST.U32 global14 [R0], R1;
...................
d -ptx -sass -elf ../Debug/hw.exe
c:/Personal/tem/cuda-memory-debug/hw/hw.cu:
==========================================
Version = 0x00000004
gpuInfoVersion = 0xa14f518d
key = 0d5c85b21bfe3b0c
usageMode = -maxrregcount=32
debuggable = no
ptx code for compute_20
-----------------------
.version 2.1
.target sm_20
// compiled with C:\CUDA\bin/../open64/lib//be.exe
// nvopencc 3.1 built on 2010-06-08
//-----------------------------------------------------------
// Compiling hw.compute_20.cpp3.i (C:/Users/Ken/AppData/Local/Temp/ccBI#.a07412)
//-----------------------------------------------------------
//-----------------------------------------------------------
// Options:
//-----------------------------------------------------------
// Target:ptx, ISA:sm_20, Endian:little, Pointer Size:32
// -O3 (Optimization level)
// -g0 (Debug level)
// -m2 (Report advisories)
//-----------------------------------------------------------
.file 1 "hw.compute_20.cudafe2.gpu"
.file 2 "C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\crtdefs.h"
.file 3 "C:\CUDA\include\crt/device_runtime.h"
.file 4 "C:\CUDA\include\host_defines.h"
.file 5 "C:\CUDA\include\builtin_types.h"
.file 6 "c:\cuda\include\device_types.h"
.file 7 "c:\cuda\include\driver_types.h"
.file 8 "c:\cuda\include\surface_types.h"
.file 9 "c:\cuda\include\texture_types.h"
.file 10 "c:\cuda\include\vector_types.h"
.file 11 "c:\cuda\include\builtin_types.h"
.file 12 "c:\cuda\include\host_defines.h"
.file 13 "C:\CUDA\include\device_launch_parameters.h"
.file 14 "c:\cuda\include\crt\storage_class.h"
.file 15 "C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\time.h"
.file 16 "c:\cuda\include\texture_fetch_functions.h"
.file 17 "C:\CUDA\include\common_functions.h"
.file 18 "c:\cuda\include\math_functions.h"
.file 19 "c:\cuda\include\math_constants.h"
.file 20 "c:\cuda\include\device_functions.h"
.file 21 "c:\cuda\include\sm_11_atomic_functions.h"
.file 22 "c:\cuda\include\sm_12_atomic_functions.h"
.file 23 "c:\cuda\include\sm_13_double_functions.h"
.file 24 "c:\cuda\include\sm_20_atomic_functions.h"
.file 25 "c:\cuda\include\sm_20_intrinsics.h"
.file 26 "c:\cuda\include\surface_functions.h"
.file 27 "c:\cuda\include\math_functions_dbl_ptx3.h"
.file 28 "c:/Personal/tem/cuda-memory-debug/hw/hw.cu"
.entry _Z3funPi (
.param .u32 __cudaparm__Z3funPi_mem)
{
.reg .u32 %r<4>;
.loc 28 4 0
$LDWbegin__Z3funPi:
.loc 28 6 0
mov.s32 %r1, 1;
ld.param.u32 %r2, [__cudaparm__Z3funPi_mem];
st.global.s32 [%r2+0], %r1;
.loc 28 7 0
exit;
$LDWend__Z3funPi:
} // _Z3funPi
ptx code for compute_10
-----------------------
.version 1.4
.target sm_10, map_f64_to_f32
// compiled with C:\CUDA\bin/../open64/lib//be.exe
// nvopencc 3.1 built on 2010-06-08
//-----------------------------------------------------------
// Compiling hw.compute_10.cpp3.i (C:/Users/Ken/AppData/Local/Temp/ccBI#.a04652)
//-----------------------------------------------------------
//-----------------------------------------------------------
// Options:
//-----------------------------------------------------------
// Target:ptx, ISA:sm_10, Endian:little, Pointer Size:32
// -O3 (Optimization level)
// -g0 (Debug level)
// -m2 (Report advisories)
//-----------------------------------------------------------
.file 1 "hw.compute_10.cudafe2.gpu"
.file 2 "C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\crtdefs.h"
.file 3 "C:\CUDA\include\crt/device_runtime.h"
.file 4 "C:\CUDA\include\host_defines.h"
.file 5 "C:\CUDA\include\builtin_types.h"
.file 6 "c:\cuda\include\device_types.h"
.file 7 "c:\cuda\include\driver_types.h"
.file 8 "c:\cuda\include\surface_types.h"
.file 9 "c:\cuda\include\texture_types.h"
.file 10 "c:\cuda\include\vector_types.h"
.file 11 "c:\cuda\include\builtin_types.h"
.file 12 "c:\cuda\include\host_defines.h"
.file 13 "C:\CUDA\include\device_launch_parameters.h"
.file 14 "c:\cuda\include\crt\storage_class.h"
.file 15 "C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\time.h"
.file 16 "c:\cuda\include\texture_fetch_functions.h"
.file 17 "C:\CUDA\include\common_functions.h"
.file 18 "c:\cuda\include\math_functions.h"
.file 19 "c:\cuda\include\math_constants.h"
.file 20 "c:\cuda\include\device_functions.h"
.file 21 "c:\cuda\include\sm_11_atomic_functions.h"
.file 22 "c:\cuda\include\sm_12_atomic_functions.h"
.file 23 "c:\cuda\include\sm_13_double_functions.h"
.file 24 "c:\cuda\include\sm_20_atomic_functions.h"
.file 25 "c:\cuda\include\sm_20_intrinsics.h"
.file 26 "c:\cuda\include\surface_functions.h"
.file 27 "c:\cuda\include\math_functions_dbl_ptx1.h"
.file 28 "c:/Personal/tem/cuda-memory-debug/hw/hw.cu"
.entry _Z3funPi (
.param .u32 __cudaparm__Z3funPi_mem)
{
.reg .u32 %r<4>;
.loc 28 4 0
$LDWbegin__Z3funPi:
.loc 28 6 0
mov.s32 %r1, 1;
ld.param.u32 %r2, [__cudaparm__Z3funPi_mem];
st.global.s32 [%r2+0], %r1;
.loc 28 7 0
exit;
$LDWend__Z3funPi:
} // _Z3funPi
code for sm_20
--------------
architecture {sm_20}
abiversion {1}
cubinversion {1}
modname {cubin}
texmode {texmode_unified}
code {
name = _Z3funPi
lmem = 0
smem = 0
reg = 3
bar = 0
ctaidZUsed = 0
const {
segname = const
segnum = 0
offset = 0
bytes = 36
}
bincode {
0x00005de4 0x28004404 0x80001de4 0x28004000
0x04009de2 0x18000000 0x00009c85 0x90000000
0x00001de7 0x80000000
}
}
.section .strtab STRTAB
.section .shstrtab STRTAB
.section .symtab SYMTAB
index value size info other shndx name
0 0 0 0 0 0 (null)
1 0 0 3 0 1 .shstrtab
2 0 0 3 0 2 .strtab
3 0 0 3 0 3 .symtab
4 0 0 3 0 0
5 0 0 3 0 0
6 0 40 3 0 4 .text._Z3funPi
7 0 0 3 0 7 .nv.info
8 0 0 3 0 5 .nv.constant0._Z3funPi
9 0 0 3 0 6 .nv.info._Z3funPi
10 0 40 12 10 4 _Z3funPi
.nv.constant0._Z3funPi PROGBITS
0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
0x00000000 0x00000000 0x00000000 0x00000000
.text._Z3funPi PROGBITS
bar = 0 reg = 3 lmem=0 smem=0
0x00005de4 0x28004404 0x80001de4 0x28004000
0x04009de2 0x18000000 0x00009c85 0x90000000
0x00001de7 0x80000000
.nv.info._Z3funPi PROGBITS
<0x1>
Attribute: EIATTR_CBANK_PARAM_OFFSETS
Format: EIFMT_SVAL
Value: 0x0
<0x2>
Attribute: EIATTR_PARAM_CBANK
Format: EIFMT_SVAL
Value: 0x8 0x40020
.nv.info PROGBITS
<0x1>
Attribute: EIATTR_FRAME_SIZE
Format: EIFMT_SVAL
Value: function: _Z3funPi(0xa) frame size: 0x0
code for sm_10
--------------
architecture {sm_10}
abiversion {1}
cubinversion {1}
modname {cubin}
texmode {texmode_unified}
code {
name = _Z3funPi
lmem = 0
smem = 20
reg = 2
bar = 0
ctaidZUsed = 0
bincode {
0x1000c801 0x0423c780 0x10018005 0x00000003
0xd00e0005 0xa0c00781
}
}
.section .strtab STRTAB
.section .shstrtab STRTAB
.section .symtab SYMTAB
index value size info other shndx name
0 0 0 0 0 0 (null)
1 0 0 3 0 1 .shstrtab
2 0 0 3 0 2 .strtab
3 0 0 3 0 3 .symtab
4 0 0 3 0 0
5 0 0 3 0 0
6 0 24 3 0 4 .text._Z3funPi
7 0 0 3 0 6 .nv.shared._Z3funPi
8 0 0 3 0 5 .nv.info._Z3funPi
9 0 0 12 10 4 _Z3funPi
.nv.shared._Z3funPi NOBITS
No data to be dumped.
.text._Z3funPi PROGBITS
bar = 0 reg = 2 lmem=0 smem=20
0x1000c801 0x0423c780 0x10018005 0x00000003
0xd00e0005 0xa0c00781
.nv.info._Z3funPi PROGBITS
<0x1>
Attribute: EIATTR_SMEM_PARAM_OFFSETS
Format: EIFMT_SVAL
Value: 0x0
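One way to confirm that the assembled code really is present in the .exe is to compare the bincode blocks with the sizes in the symbol tables. The Python sketch below is only an illustration: the hex words are copied from the dumps above, the function name is made up, and it assumes each dumped word is 32 bits wide.

```python
# bincode words copied from the dumps above
SM10_BINCODE = """
0x1000c801 0x0423c780 0x10018005 0x00000003
0xd00e0005 0xa0c00781
"""

SM20_BINCODE = """
0x00005de4 0x28004404 0x80001de4 0x28004000
0x04009de2 0x18000000 0x00009c85 0x90000000
0x00001de7 0x80000000
"""

def bincode_bytes(dump):
    """Count the bytes in a bincode dump, assuming 32-bit (4-byte) words."""
    words = [int(word, 16) for word in dump.split()]
    # Every word must fit in 32 bits for the 4-byte assumption to hold.
    assert all(0 <= word < 2**32 for word in words)
    return 4 * len(words)

print("sm_10:", bincode_bytes(SM10_BINCODE), "bytes")
print("sm_20:", bincode_bytes(SM20_BINCODE), "bytes")
```

The results, 24 and 40 bytes, match the sizes listed for .text._Z3funPi in the two symbol tables. The 24 bytes for sm_10 also agree with the disassembly of the standalone .cubin, which shows three instructions at offsets 0x0000, 0x0008 and 0x0010, that is, 8 bytes each.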
Packages have been updated with a fixed PDF. Sorry for the inconvenience.
Is there any problem with this command line? “cuobjdump.exe -ptx convolutionTexture.sm_13.cubin”. It doesn’t output anything.
OK… the PTX is in the .exe file, but it is the same as the PTX produced with the --keep option. So to see the final optimizations, we need to look at the assembly itself, right?
Is cuobjdump part of some dev release (3.2 RC?) – or where can I download it?