How to get the result locations of NPP LabelMarkers?

Hi, I am using the NPP library from CUDA Toolkit 11.7.
In my program, a 2D 8-bit image is stored in GPU VRAM, and I want to label this image.
I use the nppiLabelMarkersUF_8u32u_C1R function, and the 32-bit labeled image is saved in LabelDevPtr.
This function works well.
My source code is below.

bool D2D_CUDA::NppLabelMark()
{
	int buffer_size;
	NppStatus e;
	NppiSize source_roi = { Data.TotalWidth, Data.TotalHeight };
	
	// Perform labeling
	e = nppiLabelMarkersUF_8u32u_C1R(
		(Npp8u*)DiffImageDevPtr, Data.TotalWidth * sizeof(Npp8u), // existing input GPU memory (dSrc)
		LabelDevPtr, Data.TotalWidth * sizeof(Npp32u),            // output GPU memory (dDst)
		source_roi, nppiNormL1,                                   // ROI and connectivity (norm) setting
		LabelBufPtr
	);
	if (e != NPP_SUCCESS) {
		std::cerr << "Error labeling markers: " << e << std::endl;
		return false;
	}
	return true;
}
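
For reference, here is a minimal sketch of how the scratch buffer LabelBufPtr might be sized and allocated before the call above, using nppiLabelMarkersUFGetBufferSize_32u_C1R (simplified, error handling trimmed; the surrounding names are just my members):

	// Query the scratch buffer size required by nppiLabelMarkersUF_8u32u_C1R for this ROI.
	int buffer_size = 0;
	NppStatus e = nppiLabelMarkersUFGetBufferSize_32u_C1R(source_roi, &buffer_size);
	if (e != NPP_SUCCESS) {
		std::cerr << "Error querying label buffer size: " << e << std::endl;
		return false;
	}
	// Allocate the scratch buffer on the device (LabelBufPtr is an Npp8u*).
	if (cudaMalloc((void**)&LabelBufPtr, buffer_size) != cudaSuccess) {
		std::cerr << "cudaMalloc for label buffer failed" << std::endl;
		return false;
	}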

But I also want to get the location and size of each label.
For example, let’s assume a labeled image like this:

There are 7 labels in the picture.
In my source code, LabelDevPtr stores that data.
I want to know additional data in a structure like this: (labelNum, location, size).
Location is the top-left index of the pixels sharing the same label, and size is the pixel count.
For instance: (1, 8, 15), (2, 17, 9), (4, 113, 14) …
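
In other words, a per-label record roughly like this (a hypothetical struct, just to show what I mean):

	struct LabelInfo {
		int labelNum;   // label id
		int location;   // linear index of the label's top-left pixel
		int size;       // number of pixels carrying this label
	};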

How can I get that data?
Can I write my own custom kernel function, or is there a similar library already developed and available for use?
Please let me know.

You could use a method like this to get bounding boxes. Bounding boxes effectively give you location and size. Yes, I acknowledge my usage of “size” there does not line up exactly with your definition of “size”. Once you have the bounding box, you could refine the size with a parallel algorithm.

These functions may also be of interest.
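
As a rough illustration of the “refine the size with a parallel algorithm” idea, the exact pixel count of one label could be obtained with a simple reduction (a sketch, not tested; it assumes the label image has no row padding, as in your NPP call):

	#include <thrust/count.h>
	#include <thrust/device_ptr.h>
	#include <nppdefs.h>

	// Count how many pixels carry a given label value in the labeled image.
	size_t countLabelPixels(const Npp32u* labels, int width, int height, Npp32u label)
	{
		thrust::device_ptr<const Npp32u> first(labels);
		return thrust::count(first, first + (size_t)width * height, label);
	}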

Hi, I’m leaving a comment after a long time because I solved my problem.
First, I want to let you know that there is an NPP function called nppiCompressedMarkerLabelsUFInfo_32u_C1R_Ctx, which returns the rectangle information.
However, this function has a critical problem.

The issue is that the coordinate information it returns is not stable: it gives different positions and sizes even when the same arguments are passed and it is executed repeatedly.
(Unless NVIDIA fixes this issue, there seems to be no way to resolve it.)

So, I slightly modified the method recommended by Robert to achieve my goal.

// Per-label bounding box and pixel count.
// For every labeled pixel (label value > 0), atomically update that label's
// left/top/right/bottom extents and increment its pixel count.
// The arrays are indexed by (label - 1); before launch, left/top must be
// initialized to a large value (e.g. INT_MAX) and right/bottom/pixelCount to 0.
// pitch is the image pitch in elements (here equal to the width).
template <typename T>
__global__ void bb(const T* __restrict__ img, int* __restrict__ bottom, int* __restrict__ top, int* __restrict__ right, int* __restrict__ left, int* __restrict__ pixelCount, int height, int pitch) {

	int idx = threadIdx.x + blockDim.x * blockIdx.x;
	int idy = threadIdx.y + blockDim.y * blockIdx.y;
	if ((idx < pitch) && (idy < height)) {
		T myval = img[idy * pitch + idx];
		if (myval > 0) {                          // 0 means background / unlabeled
			atomicMax(right + myval - 1, idx);    // rightmost column of this label
			atomicMin(left + myval - 1, idx);     // leftmost column
			atomicMax(bottom + myval - 1, idy);   // bottom row
			atomicMin(top + myval - 1, idy);      // top row
			atomicAdd(pixelCount + myval - 1, 1); // number of pixels in this label
		}
	}
}

By doing this, I can store the L, T, R, B values and the pixel count of the objects classified by NPP into the desired array.
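
For reference, a rough host-side sketch of how the per-label arrays could be set up and the kernel launched. This is a sketch under my assumptions: the labels in LabelDevPtr are assumed to be compressed to the range 1..numLabels (e.g. with nppiCompressMarkerLabelsUF), and the label image is assumed to have no row padding:

	#include <vector>
	#include <climits>
	#include <cuda_runtime.h>

	// Sketch: compute per-label bounding boxes and pixel counts on the GPU.
	void computeLabelStats(const Npp32u* LabelDevPtr, int width, int height, int numLabels)
	{
		int *d_left, *d_top, *d_right, *d_bottom, *d_count;
		size_t bytes = numLabels * sizeof(int);
		cudaMalloc(&d_left, bytes);  cudaMalloc(&d_top, bytes);
		cudaMalloc(&d_right, bytes); cudaMalloc(&d_bottom, bytes);
		cudaMalloc(&d_count, bytes);

		// atomicMax targets start at 0; atomicMin targets start at INT_MAX.
		cudaMemset(d_right, 0, bytes);
		cudaMemset(d_bottom, 0, bytes);
		cudaMemset(d_count, 0, bytes);
		std::vector<int> big(numLabels, INT_MAX);
		cudaMemcpy(d_left, big.data(), bytes, cudaMemcpyHostToDevice);
		cudaMemcpy(d_top,  big.data(), bytes, cudaMemcpyHostToDevice);

		dim3 block(16, 16);
		dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
		bb<<<grid, block>>>(LabelDevPtr, d_bottom, d_top, d_right, d_left, d_count, height, width);
		cudaDeviceSynchronize();

		// ... copy the five arrays back with cudaMemcpy and free them with cudaFree ...
	}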

However, since it relies heavily on atomic operations, I couldn’t achieve the computation speed I wanted. Personally, I don’t think I’ll have a solution I’m satisfied with until the issues with the nppiCompressedMarkerLabelsUFInfo_32u_C1R_Ctx function are fixed.
