Don't see the SASS code via objdump

May I know why I don't see the SASS code of a binary that was built using the following nvcc command?

$ nvcc -arch=sm_75 -use_fast_math -Xptxas -O3,-v mm.cu -lcublas -lcurand -o mm.2080ti
ptxas info    : 0 bytes gmem
$ cuobjdump -sass mm.2080ti

Fatbin elf code:
================
arch = sm_75
code version = [1,7]
producer = <unknown>
host = linux
compile_size = 64bit

        code for sm_75

Fatbin elf code:
================
arch = sm_75
code version = [1,7]
producer = <unknown>
host = linux
compile_size = 64bit

        code for sm_75

Fatbin ptx code:
================
arch = sm_75
code version = [6,4]
producer = <unknown>
host = linux
compile_size = 64bit
compressed
ptxasOptions = -O3 -v

The only SASS code you'll see is the SASS for kernels you actually have in your own code. If you are only making calls to CUBLAS, then you won't see any SASS.

If you want to see the CUBLAS SASS, then use cuobjdump on the cublas library.
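For example, something along these lines should work; the library path is an assumption for a default CUDA 10.1 Linux install and may differ on your system:

```shell
# List the embedded ELF images first, to see which GPU
# architectures the library actually ships code for:
cuobjdump -lelf /usr/local/cuda-10.1/lib64/libcublas.so

# Then dump the SASS itself. The output is very large,
# so redirect it to a file:
cuobjdump -sass /usr/local/cuda-10.1/lib64/libcublas.so > cublas_sass.txt
```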

You are right. Thank you.
I wanted to check that because I specified sm_75 in

nvcc -arch=sm_75 -use_fast_math -Xptxas -O3,-v mm.cu -lcublas -lcurand -o mm.2080ti

but when I profile with Nsight, I see a kernel name that starts with volta_*.
I expected to see turing_*.

Please see

$ ~/sdk/deviceQuery/deviceQuery
deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce RTX 2080 Ti"
  CUDA Driver Version / Runtime Version          10.1 / 10.1
  CUDA Capability Major/Minor version number:    7.5
...
$ nvcc -arch=sm_75 -use_fast_math -Xptxas -O3,-v mm.cu -lcublas -lcurand -o mm.2080ti
ptxas info    : 0 bytes gmem
$ nv-nsight-cu-cli ./mm.2080ti 100
==PROF== Connected to process 44682
==PROF== Profiling "generate_seed_pseudo" - 1: 0%....50%....100% - 33 passes
==PROF== Profiling "gen_sequenced" - 2: 0%....50%....100% - 32 passes
==PROF== Profiling "generate_seed_pseudo" - 3: 0%....50%....100% - 33 passes
==PROF== Profiling "gen_sequenced" - 4: 0%....50%....100% - 32 passes
==PROF== Profiling "volta_sgemm_32x32_sliced1x4_nn" - 5: 0%....50%....100% - 32 passes
==PROF== Disconnected from process 44682

Is that OK?

cublas is a compiled library.

It does not matter what compilation settings you make when you call into that library.

The library decides for itself what GPU it is running on, and what kernels it will call. You have essentially no control over that, and sm_75 compilation for your code doesn’t change the library behavior at all.

Yes, it's OK. If the cublas team feels that an already-designed kernel called volta_sgemm_… is perfectly suited for use on Turing, they may very well reuse that kernel, even though you are running on a Turing GPU. There is not a separate set of kernels for every possible architecture.
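One way to see this for yourself is to search the library's symbol table for the kernel name the profiler reported. The path below is an assumption for a default CUDA 10.1 install, and the exact set of volta_sgemm_* variants varies by CUDA version:

```shell
# Dump the symbol table of the cuBLAS library and look for
# the volta_sgemm kernels that Nsight reported:
cuobjdump -symbols /usr/local/cuda-10.1/lib64/libcublas.so | grep volta_sgemm
```

If the kernel appears there but not in your own binary, that confirms it comes from the precompiled library, not from your sm_75 compilation.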