__lanemask_lt undefined?

dave64 · June 3, 2022, 3:55pm

So there is lots of NVIDIA sample code from over the years that uses __lanemask_lt(), but that API is “undefined” as of CUDA 11.7. I’ve tried compute_80 and sm_86, but neither makes the API available. Has that API been removed in favor of raw PTX?

Robert_Crovella · June 3, 2022, 4:47pm

I can’t find it either. Not in CUDA 9.2, not in CUDA 11.4.

It is mentioned in both the CUDA 11.7 programming guide and this blog.

If this is a concern to you, I suggest filing a bug.

There is a similar function in the namespace cooperative_groups::details, but you are not supposed to depend on anything in such a “details” namespace, so that is as far as I’ll go.

If it were me, and I wanted to construct my own, I would follow that example.

rs277 · June 14, 2022, 4:31am

Apologies Robert, but I have no useful C++ knowledge, having progressed to a pretty average C competency.

Is there a file you can refer me to for this “example”?

Thanks.

Robert_Crovella · June 14, 2022, 1:38pm

Generally speaking, in CUDA, when you want to use cooperative groups functionality, you should include the header file cooperative_groups.h. That header file is in the “usual place” for CUDA header files. On a typical Linux install that would be /usr/local/cuda/include.

If you study that include file, or just look around in that directory, you will notice a subdirectory called cooperative_groups (at least it is there on my CUDA 11.4 install). Inside that is a directory called details. In that directory you will find helpers.h. (This appears to be a “helper function”, implemented in the details namespace, for cooperative groups.) If you run:

grep lanemask /usr/local/cuda/include/cooperative_groups/details/helpers.h

You’ll find an example function that could be used as a model to create your own lanemask_lt.
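For readers who want something concrete, here is a minimal sketch of what a hand-rolled version might look like. It assumes the usual approach of reading the PTX special register %lanemask_lt through inline assembly. The names my_lanemask_lt and lanemask_lt_reference are hypothetical (neither is a CUDA API), and the host-side reference is only there so the expected bit pattern can be checked without a GPU.

```cpp
// Builds as CUDA C++ with nvcc; the device part is skipped by a plain C++ compiler.

#ifdef __CUDACC__
// Hypothetical helper, not a CUDA builtin.
// Reads the PTX special register %lanemask_lt: a 32-bit mask with a bit set
// for every lane position lower than the calling thread's lane in the warp.
__device__ __forceinline__ unsigned int my_lanemask_lt()
{
    unsigned int mask;
    asm volatile("mov.u32 %0, %%lanemask_lt;" : "=r"(mask));
    return mask;
}
#endif

// Hypothetical host-side reference: the value the register is documented to
// hold for a given lane index (0..31), computed with plain integer arithmetic.
inline unsigned int lanemask_lt_reference(unsigned int lane)
{
    return (1u << (lane & 31u)) - 1u;
}
```

In a kernel, my_lanemask_lt() would be called wherever the old samples call __lanemask_lt(). Lane 0 gets an empty mask and lane 31 gets every bit except the top one.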

rs277 · June 14, 2022, 6:22pm

Thanks for that.

Follow up: I had not realised “lanemask_lt” was a special register. As such, it’s described in the PTX ISA.

Robert_Crovella · June 14, 2022, 8:03pm

Yep. I interpreted the request here to be a function callable from CUDA C++.

If you’re doing PTX you can use it directly. But it isn’t formally exposed in CUDA C++ via any builtins or intrinsics that I am aware of.
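As an illustration of why older sample code reaches for this mask, the sketch below computes a thread's rank among the currently active lanes, a pattern commonly seen in warp-aggregated code. This is an assumption about typical usage, not something taken from a specific sample. The function names are hypothetical, and the host version exists only so the arithmetic can be checked without a GPU.

```cpp
#include <bitset>

// Hypothetical host-side reference: how many lanes set in `active` have a
// lower index than `lane` (lane is expected to be in 0..31).
inline unsigned int rank_among_active(unsigned int active, unsigned int lane)
{
    const unsigned int lt = (1u << (lane & 31u)) - 1u;  // same bits as %lanemask_lt
    return static_cast<unsigned int>(std::bitset<32>(active & lt).count());
}

#ifdef __CUDACC__
// Hypothetical device-side equivalent: reads %lanemask_lt via inline PTX and
// counts the active lanes below the calling thread.
__device__ __forceinline__ unsigned int rank_among_active_device()
{
    unsigned int lt;
    asm volatile("mov.u32 %0, %%lanemask_lt;" : "=r"(lt));
    return __popc(__activemask() & lt);
}
#endif
```

Each active thread gets a distinct, dense index this way, which is what makes the mask useful for compacting output within a warp.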
