__lanemask_lt undefined?

There is lots of NVIDIA sample code from over the years that uses __lanemask_lt(), but that API is “undefined” as of CUDA 11.7. I’ve tried compute_80 and sm_86, but neither makes the API available. Has that API been removed in favor of raw PTX?

I can’t find it either. Not in CUDA 9.2, not in CUDA 11.4.

It is mentioned in both the CUDA 11.7 programming guide and this blog.

If this is a concern to you, I suggest filing a bug.

There is a similar function in the namespace cooperative_groups::details, but you are not supposed to depend on anything in such a “details” namespace, so that is as far as I’ll go.

If it were me, and I wanted to construct my own, I would follow that example.

Apologies Robert, but I have no useful C++ knowledge, having progressed to a pretty average C competency.

Is there a file you can refer me to for this “example”?

Thanks.

Generally speaking, in CUDA, when you want to use cooperative groups functionality, you should include the header file cooperative_groups.h. That header file is going to be in the “usual place” for CUDA header files. On a typical Linux install that would be /usr/local/cuda/include.

If you study that include file, or just look around in that directory, you will notice a subdirectory called cooperative_groups (at least it is there on my CUDA 11.4 install). Inside that is a directory called details, and in that directory you will find helpers.h, which appears to hold “helper functions”, implemented in the details namespace, for cooperative groups. If you run:

grep lanemask /usr/local/cuda/include/cooperative_groups/details/helpers.h

You’ll find an example function that could be used as a model to create your own lanemask_lt.
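For anyone who wants something to start from, here is a minimal sketch of what a home-grown version might look like. It assumes the approach is a small inline PTX wrapper that reads the %lanemask_lt special register; the names my_lanemask_lt, lanemask_kernel, and check_lanemask_lt are my own and are not part of any CUDA header. The host-side check launches a single warp and compares each lane's result with (1u << lane) - 1, which is the mask with bits set for all lower-numbered lanes.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper (the name is made up for this sketch): reads the PTX
// special register %lanemask_lt through inline assembly.
__device__ __forceinline__ unsigned int my_lanemask_lt()
{
    unsigned int mask;
    asm volatile("mov.u32 %0, %%lanemask_lt;" : "=r"(mask));
    return mask;
}

// Each thread of a single 1D warp stores the mask it observes.
__global__ void lanemask_kernel(unsigned int *out)
{
    out[threadIdx.x] = my_lanemask_lt();
}

// Host-side check: launch one warp of 32 threads and compare every lane's
// value with (1u << lane) - 1. Returns false on any CUDA error or mismatch.
bool check_lanemask_lt()
{
    const int warp = 32;
    unsigned int host[warp];
    unsigned int *dev = nullptr;

    if (cudaMalloc(&dev, sizeof(host)) != cudaSuccess) return false;

    lanemask_kernel<<<1, warp>>>(dev);

    // cudaMemcpy waits for the kernel and also reports launch errors.
    cudaError_t err = cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return false;

    for (int lane = 0; lane < warp; ++lane) {
        if (host[lane] != (1u << lane) - 1u) return false;
    }
    return true;
}
```

Calling check_lanemask_lt() from main and checking the return value is enough to confirm the wrapper behaves as expected on your GPU. The device function itself is the only part you would carry into real code.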


Thanks for that.

Follow up: I had not realised “lanemask_lt” was a special register. As such, it’s described in the PTX ISA.

Yep. I interpreted the request here to be a function callable from CUDA C++.

If you’re doing PTX you can use it directly. But it isn’t formally exposed in CUDA C++ via any builtins or intrinsics that I am aware of.