* URGENT * How to convert Deepstream tensor to Numpy?

• Hardware Platform : GPU
• DeepStream 5.0
• TensorRT 7
• NVIDIA GPU Driver Version :440.59

I’m migrating my TensorRT-based custom model to PyDS, and I need to post-process the whole output tensor. The tensor is big, so calling get_detections(layer.buffer, index) once per element won’t be efficient. (I also couldn’t find get_detections() in the documentation.)

I got my tensor output using
layer = pyds.get_nvds_LayerInfo(*params)

Inspecting layer.buffer showed that it is a PyCapsule object,
so I tried to decode it to a byte buffer using

byte_buffer = ctypes.pythonapi.PyCapsule_GetPointer(layer.buffer, None)

Now when I use
nd_arr = np.frombuffer(byte_buffer)

numpy throws the following error
ValueError: buffer size must be a multiple of element size

I suspect one of these:

  • It needs to be transferred to host memory first?
  • Wrong datatype when converting buffer to numpy
  • Don’t remember the third

There is also a CudaToNumpy binding in jetson-utils that I could try to port, but this is kind of urgent.
Is this the right approach? Please advise on how to take this forward.


Are you using this sample?

Based on the page, the numpy binding should already be available.
Please see the following document and deepstream_ssd_parser.py for more information:


Thanks for the fast response.

Yes, I’m using ssd_parser.py, but for reference only. Sorry if this is naive, but I was unable to find any tensor-to-numpy converter. Can you please point it out specifically?

AFAIK, the numpy bindings are only available for converting buf_surface (the input matrix) into an OpenCV frame. That is a very high-level abstraction and expects a certain dimension format.

The custom parser guide gives a good idea of the workflow, but not the exact function to use.

To be more precise, the output of my network can be thought of as a 2D mask for each image (but in float).
So the conventional functions used in these examples, which retrieve detections/classes by iterating one element at a time, would be really inefficient.

I inspected further: with dtype=uint8, np.frombuffer() does output values, albeit with the wrong dimensions and the wrong values.
The model is FP32 for testing. The numpy output has 285 elements when using uint8/int8.

The actual output has 4800 elements, and I verified with get_detections() that this is the size received right up to layer.buffer.

So it’s probably an issue of using the right datatype? Could you suggest which datatype to use for this conversion?
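
For reference, the error can be reproduced without DeepStream. The sketch below assumes that the object handed to np.frombuffer() was 285 bytes long (the length reported with uint8); the `raw` buffer is a stand-in, not real layer data. np.frombuffer() defaults to 8-byte floats, and 285 is not a multiple of 8 (or of 4), whereas a 4800-element FP32 tensor should occupy 19200 bytes.

```python
import numpy as np

raw = bytes(285)  # stand-in for the 285-byte buffer seen with dtype=uint8

# The default dtype is float64 (8 bytes per element); 285 is not a multiple of 8.
try:
    np.frombuffer(raw)
    default_error = None
except ValueError as exc:
    default_error = str(exc)

# One-byte elements always fit, which is why uint8 "works" but gives 285 values.
as_uint8 = np.frombuffer(raw, dtype=np.uint8)

# Size a complete 4800-element FP32 tensor would need.
expected_bytes = 4800 * np.dtype(np.float32).itemsize
```

If the byte count does not match the expected tensor size, the buffer itself is incomplete, and changing only the dtype will not recover the full tensor.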


Would you mind sharing the source that reproduces the issue?
Then we can check it and give a more direct suggestion.


@AastaLLL Sure, I can, but I’ll ask with reference to the ssd example itself instead, so it’ll be faster for you too.
In the ssd python example, open the ssd_parser.py file and navigate to
res.detectionConfidence = pyds.get_detections(score_layer.buffer, index)
inside the make_nodi function.

In this line of code, score_layer.buffer is basically a flattened 1-D version of the output tensor “score_layer”, and get_detections() returns the single element of this tensor at the given index.

So if you can show how to convert the whole score_layer.buffer into a numpy array at once, without looping through each index one by one, that will solve my issue too.

I need the output converted to numpy all at once because my output is too big (80x60).


Hi saransh661, you can try getting the layer datatype from NvDsInferLayerInfo. The dataType field is an enum with these values:

  • FLOAT: FP32 format.
  • HALF: FP16 format.
  • INT8: INT8 format.
  • INT32: INT32 format.

We’ll look into returning the output tensors as a numpy array but I understand that your need is urgent. Hope with the right dtype and shape, you can get your numpy array with existing bindings.
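
As a rough illustration of the dtype and shape point, here is a minimal sketch that maps the four enum names listed above to numpy dtypes and computes how many bytes a layer should occupy. `DTYPE_BY_NAME` and `expected_nbytes` are hypothetical helpers written for this example, not part of pyds, and the lookup is keyed by plain strings rather than the real enum objects.

```python
import numpy as np

# Hypothetical lookup keyed by the enum names (not the pyds enum objects).
DTYPE_BY_NAME = {
    "FLOAT": np.float32,  # FP32
    "HALF": np.float16,   # FP16
    "INT8": np.int8,
    "INT32": np.int32,
}


def expected_nbytes(dtype_name, shape):
    """Return the buffer size in bytes for a layer of this dtype and shape."""
    count = int(np.prod(shape))
    return count * np.dtype(DTYPE_BY_NAME[dtype_name]).itemsize
```

For example, expected_nbytes("FLOAT", (80, 60)) gives 19200 for the 4800-element FP32 output discussed in this thread.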

Hi all.

I just found a working solution:

ptr = ctypes.cast(pyds.get_ptr(layer.buffer), ctypes.POINTER(ctypes.c_float))
v = np.ctypeslib.as_array(ptr, shape=(4096,))

Of course, the data type of the pointer and the shape have to be adapted to your own layer.
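
To try the same cast without a DeepStream pipeline, here is a minimal sketch that swaps the layer buffer for a ctypes array owned by the script. It assumes pyds.get_ptr() returns an integer address, as the two lines above imply; `fake_output` and the (80, 60) shape are stand-ins chosen to match the 80x60 output mentioned earlier in the thread.

```python
import ctypes

import numpy as np

ROWS, COLS = 80, 60  # assumed layout of the 4800-element output

# Stand-in for the inference output: a C float array owned by this script.
fake_output = (ctypes.c_float * (ROWS * COLS))(*range(ROWS * COLS))

# Integer address; plays the role of pyds.get_ptr(layer.buffer).
address = ctypes.addressof(fake_output)

# Same two steps as the solution above: cast, then wrap without copying.
ptr = ctypes.cast(address, ctypes.POINTER(ctypes.c_float))
view = np.ctypeslib.as_array(ptr, shape=(ROWS * COLS,))

# Reshape to 2-D and copy, so the array no longer depends on the source memory.
mask = view.reshape(ROWS, COLS).copy()
```

The array returned by as_array() shares memory with the buffer, so taking a copy is the cautious choice if the data has to be used after the underlying buffer may have been released.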


Thanks for sharing.

Thanks a lot, that worked for me.

I cannot print the values of my tensor by printing the pointer out_buf_ptrs_host[i]. How can I print this tensor in C++?

How do we get this tensor in C++?

Is it possible to get this to work with FP16? As far as I can tell, ctypes doesn’t have such an option
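
One possible workaround, sketched here on a stand-in buffer rather than a real FP16 layer: ctypes has no half-precision type, but a 16-bit unsigned integer is the same width, so the pointer can be cast to c_uint16 and the resulting array reinterpreted as np.float16 with view(). `fake_output` is an assumption used only to make the example runnable; with pyds the address would come from pyds.get_ptr(layer.buffer).

```python
import ctypes

import numpy as np

COUNT = 8

# Stand-in for an FP16 output buffer: the raw 16-bit patterns of known values.
source = np.arange(COUNT, dtype=np.float16)
fake_output = (ctypes.c_uint16 * COUNT).from_buffer_copy(source.tobytes())
address = ctypes.addressof(fake_output)

# Read the memory as 16-bit integers, since ctypes has no half-float type.
ptr = ctypes.cast(address, ctypes.POINTER(ctypes.c_uint16))
bits = np.ctypeslib.as_array(ptr, shape=(COUNT,))

# Reinterpret the same bytes as FP16 (no conversion), then widen to FP32.
halves = bits.view(np.float16)
as_float32 = halves.astype(np.float32)
```

The view() call only changes how the bytes are interpreted, while astype() makes a converted copy that is easier to use in later post-processing.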