[CLOSED] CUarray pixel-wise multiplication by float array

Hi,

Jetson TX2, JetPack 4.2

I have the Y plane in CUeglFrame.frame.pArray[0]. The frame size is w*h.
I also have an Npp32f * weights array allocated on the GPU with the same size, w*h.

How can I multiply pArray[0] element-wise by the float weights array?
Is there an NPP function for this?

Best regards, Viktor.

Also, I can’t find the surf2Dread function. It is declared in surface_functions.h, but the code is not defined there.
Where is the surface functions library, or its sources?

surf2Dread is only available in CUDA device code.

It is a built-in function in CUDA device code, so the only requirement should be to compile with nvcc.

You won’t find sources for it anywhere, and there are no special compiling or linking requirements to use it.
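To illustrate the distinction, here is a minimal sketch. Device code is anything marked __global__ or __device__ and runs on the GPU; the rest is host code. The kernel name readSurface, the 4x2 test image, and the variable names are made up for this example, and error checking is omitted for brevity. It assumes a plain cudaArray allocated with cudaArraySurfaceLoadStore, not the EGL frame from the question.

```cuda
#include <cstdio>
#include <cstdint>
#include <cuda_runtime.h>

// Device code: compiled by nvcc for the GPU. surf2Dread can only be called here.
__global__ void readSurface(cudaSurfaceObject_t surf, uint8_t *out,
                            int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        uint8_t v;
        // The x coordinate is a byte offset; with 1-byte pixels it equals the pixel index.
        surf2Dread(&v, surf, x * (int)sizeof(uint8_t), y);
        out[y * width + x] = v;
    }
}

// Host code: runs on the CPU, sets up the surface, and launches the kernel.
int main()
{
    const int width = 4, height = 2;   // hypothetical tiny image
    uint8_t hostIn[width * height] = {1, 2, 3, 4, 5, 6, 7, 8};
    uint8_t hostOut[width * height] = {};

    // A CUDA array that allows surface reads and writes.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<unsigned char>();
    cudaArray_t arr = nullptr;
    cudaMallocArray(&arr, &desc, width, height, cudaArraySurfaceLoadStore);
    cudaMemcpy2DToArray(arr, 0, 0, hostIn, width, width, height,
                        cudaMemcpyHostToDevice);

    // Wrap the array in a surface object; the object is passed to the kernel by value.
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;
    cudaSurfaceObject_t surf = 0;
    cudaCreateSurfaceObject(&surf, &res);

    uint8_t *devOut = nullptr;
    cudaMalloc(&devOut, width * height);
    readSurface<<<dim3(1, 1), dim3(16, 16)>>>(surf, devOut, width, height);
    cudaMemcpy(hostOut, devOut, width * height, cudaMemcpyDeviceToHost);

    for (int i = 0; i < width * height; ++i)
        printf("%d ", hostOut[i]);
    printf("\n");

    cudaFree(devOut);
    cudaDestroySurfaceObject(surf);
    cudaFreeArray(arr);
    return 0;
}
```

Saved as a .cu file, this should build with a plain nvcc invocation and no extra libraries, since surf2Dread is resolved by the compiler itself.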

Please explain what CUDA device code means.
In the cudaHist sample code it is used in a __global__ function, and it works there.

Here is my error:

$ nvcc kernels.cu -gencode arch=compute_75,code=sm_75
kernels.cu(30): error: no instance of overloaded function "surf2Dread" matches the argument list
            argument types are: (uint8_t *, const CUsurfObject *, int32_t, int32_t)

kernels.cu(36): error: no instance of overloaded function "surf2Dread" matches the argument list
            argument types are: (uint8_t *, const CUsurfObject *, int32_t, int32_t)

kernels.cu(39): error: no instance of overloaded function "surf2Dwrite" matches the argument list
            argument types are: (uint8_t, CUsurfObject *, size_t, size_t)

And here is the code:

__global__ void remap(
        const size_t width,
        const size_t height,
        const float * mapX0,
        const float * mapY0,
        const float * mapX1,
        const float * mapY1,
        const float * mapW0,
        const float * mapW1,
        const size_t frameWidth,
        const CUsurfObject * frame0,
        const CUsurfObject * frame1,
        CUsurfObject * pano
        )
{
    size_t x{ blockIdx.x * blockDim.x + threadIdx.x };
    size_t y{ blockIdx.y * blockDim.y + threadIdx.y };
    if( x < width and y < height )
    {
        size_t index{ x + y * width };
        uint8_t c{}, c0{}, c1{};
        float w0{ mapW0[ index ] };
        float w1{ mapW1[ index ] };
        if( w0 > 0.f )
        {
            int32_t X0{ int32_t( mapX0[ index ] ) };
            int32_t Y0{ int32_t( mapY0[ index ] ) };
            surf2Dread( & c0, frame0, X0, Y0 );
        }
        if( w1 > 0.f )
        {
            int32_t X1{ int32_t( mapX1[ index ] ) };
            int32_t Y1{ int32_t( mapY1[ index ] ) };
            surf2Dread( & c1, frame1, X1, Y1 );
        }
        c = c0 * w0 + c1 * w1;
        surf2Dwrite( c, pano, x, y );

//        if( ( index + 1 ) % width == 0 )
    }
}

I replaced

CUsurfObject *

with

cudaSurfaceObject_t

and the errors went away.

According to the programming guide:

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#surf2dread-object

surf2Dread() expects:

template<class T>
void surf2Dread(T* data,
                 cudaSurfaceObject_t surfObj,
                 int x, int y,
                 boundaryMode = cudaBoundaryModeTrap);

(the above info is also available in /usr/local/cuda/include/surface_indirect_functions.h)

It seems evident that the arguments you are passing don’t match that. The obvious difference I can see is your use of const CUsurfObject * in place of cudaSurfaceObject_t.

Why are you doing that?

In /usr/local/cuda/include/surface_types.h we see that cudaSurfaceObject_t is a typedef for unsigned long long, so I’m not sure why you think using a const CUsurfObject * would be a possible replacement.

/usr/local/cuda/include/surface_types.h:typedef __device_builtin__ unsigned long long cudaSurfaceObject_t;
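Coming back to the original question, a kernel that takes the surface object by value can do the element-wise multiplication directly. The following is only a sketch under assumptions: the kernel name multiplyByWeights, the in-place update, the rounding and clamping to 0..255, and the test data are all choices made for this example. It uses a plain cudaArray for testing; on the Jetson the surface object would instead have to be created from pArray[0] (for example with cuSurfObjectCreate in the driver API), which is not shown here.

```cuda
#include <cstdio>
#include <cstdint>
#include <cuda_runtime.h>

// Multiplies each 8-bit pixel of the surface by the matching float weight, in place.
// The surface object is passed by value, not by pointer.
__global__ void multiplyByWeights(cudaSurfaceObject_t plane, const float *weights,
                                  int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        uint8_t v;
        surf2Dread(&v, plane, x, y);              // byte offset == pixel index for 8-bit data
        float r = v * weights[y * width + x];
        r = fminf(fmaxf(r, 0.0f), 255.0f);        // clamp so the result fits in 8 bits
        surf2Dwrite(static_cast<uint8_t>(r + 0.5f), plane, x, y);
    }
}

int main()
{
    const int width = 4, height = 2;              // hypothetical tiny Y plane
    uint8_t hostPlane[width * height] = {10, 20, 30, 40, 50, 60, 70, 80};
    float hostWeights[width * height];
    for (int i = 0; i < width * height; ++i)
        hostWeights[i] = 0.5f;                    // example weights

    // Stand-in for the Y plane: a CUDA array with surface load/store enabled.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<unsigned char>();
    cudaArray_t arr = nullptr;
    cudaMallocArray(&arr, &desc, width, height, cudaArraySurfaceLoadStore);
    cudaMemcpy2DToArray(arr, 0, 0, hostPlane, width, width, height,
                        cudaMemcpyHostToDevice);

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;
    cudaSurfaceObject_t plane = 0;
    cudaCreateSurfaceObject(&plane, &res);

    // Weights live in ordinary linear device memory, as in the question.
    float *devWeights = nullptr;
    cudaMalloc(&devWeights, sizeof(hostWeights));
    cudaMemcpy(devWeights, hostWeights, sizeof(hostWeights), cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    multiplyByWeights<<<grid, block>>>(plane, devWeights, width, height);

    // Copy the modified plane back to the host and print it.
    cudaMemcpy2DFromArray(hostPlane, width, arr, 0, 0, width, height,
                          cudaMemcpyDeviceToHost);
    for (int i = 0; i < width * height; ++i)
        printf("%d ", hostPlane[i]);
    printf("\n");

    cudaFree(devWeights);
    cudaDestroySurfaceObject(plane);
    cudaFreeArray(arr);
    return 0;
}
```

Writing in place is safe here because each thread touches only its own pixel; a separate output surface would be needed if a thread read neighbouring pixels.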

Thanks. Closed.