Are there any native CUDA primitives to efficiently cast a uint8 mask (only the least significant bit is used to indicate true or false) to a bitmask, without doing a loop?

input: a[32] of type uint8
output: b[4] of type uint32

I checked CUDA Math API :: CUDA Toolkit Documentation but didn't find any.

Maybe I am particularly dense today, but it is not clear what the desired operation does. So the input comprises 32 bytes `a[]`, each of which contains a boolean flag in `a[i]<0>`, i=0, …, 31. And these 32 bits are to be deposited nibble-wise in the 128 bits of `b[]`, such that:

`b[0]<0> = a[0]<0>`
`b[0]<1> = 0`
`b[0]<2> = 0`
`b[0]<3> = 0`
`b[0]<4> = a[1]<0>`
`b[0]<5> = 0`
`b[0]<6> = 0`
`b[0]<7> = 0`
`b[0]<8> = a[2]<0>`
[…]
`b[0]<28> = a[7]<0>`
`b[0]<29> = 0`
`b[0]<30> = 0`
`b[0]<31> = 0`
`b[1]<0> = a[8]<0>`
[…]

Correct?
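If that reading is right, a plain reference loop (ordinary host C; the function name is my own) pins down the mapping exactly, so any loop-free version could be validated against it:

```c
#include <stdint.h>
#include <assert.h>

/* Reference implementation of the presumed mapping: the LSB of input
   byte a[i] lands in bit 4*(i%8) of output word b[i/8]; all other
   output bits are zero. */
void pack_lsb_nibblewise(const uint8_t a[32], uint32_t b[4])
{
    for (int w = 0; w < 4; w++) {
        uint32_t r = 0;
        for (int i = 0; i < 8; i++) {
            r |= (uint32_t)(a[8 * w + i] & 1) << (4 * i);
        }
        b[w] = r;
    }
}
```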

What about the upper bits of the `a[i]`? Do we have `a[i]<7:1> == 0b0000000`, `a[i]<7:1> == 0b1111111`, or `a[i]<7:1> == 0bxxxxxxx`?

Are there any alignment guarantees for `a`? Does the input data have to be delivered as `uint8_t a[32]`, or could it be delivered as `uchar4 a[8]`, for example? The difference is in what alignment CUDA guarantees for each type.
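The alignment question matters because, with 8-byte alignment, each group of eight flag bytes can be read as one 64-bit word and packed with three shift-and-mask steps instead of a per-bit loop. A host-side C sketch of that idea (function name, little-endian byte order, and the nibble-wise interpretation above are all my assumptions; the initial AND makes it safe even if `a[i]<7:1>` is arbitrary garbage):

```c
#include <stdint.h>
#include <string.h>
#include <assert.h>

/* Loop-free (per word) variant: load 8 bytes as a uint64_t, where the
   flags sit at bit positions 0, 8, ..., 56, then halve the bit spacing
   from 8 to 4 in three shift/mask steps. One 32-bit output word results
   per 8 input bytes. */
void pack_lsb_nibblewise_u64(const uint8_t a[32], uint32_t b[4])
{
    for (int w = 0; w < 4; w++) {
        uint64_t v;
        memcpy(&v, a + 8 * w, 8);            /* little-endian load assumed */
        v &= 0x0101010101010101ull;          /* keep only the LSB flags    */
        v = (v | (v >>  4)) & 0x00FF00FF00FF00FFull; /* flags -> 0,4,16,20,... */
        v = (v | (v >>  8)) & 0x0000FFFF0000FFFFull; /* flags -> 0,4,8,12,...  */
        v = (v | (v >> 16));                         /* flags -> 0,4,...,28    */
        b[w] = (uint32_t)v;
    }
}
```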