Testing CUDA PCL: the result seems wrong

Hi, I am testing CUDA PCL (ICP and NDT).
When I run the test, the CUDA NDT result is as follows:

The CPU seems to be faster than the GPU, so I want to know: is there a problem here?

This is my computer:

Hi,

Could you share which source and which testing command you used?
We want to give it a try to get more info.

Thanks.

The source code is cuPCL/cuNDT at main · NVIDIA-AI-IOT/cuPCL · GitHub. I am testing the code according to the instructions on GitHub.

Hi,

Have you maximized the device’s performance?
This can be done via the commands below:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

Thanks.

OK, I will try it and then report the result.

Hi, I ran
$ sudo nvpmodel -m 0
$ sudo jetson_clocks

but the result still does not seem right.
First is CUDA NDT:

Second is CUDA ICP:

Hi,

We can reproduce a similar behavior.
We need to check with our internal team first and will update you later.

Thanks.

Hello, is there any new progress here?

Hi,

Please try adjusting the parameters so that the fitness score of OSS-PCL is better (lower) than or equal to that of cuPCL.
We get better performance on the GPU when the fitness scores are close.

...
Loaded 7000 data points for Q with the following fields: x y z
Target rigid transformation : cloud_P -> cloud_Q
Rotation matrix :
    | 0.923880 -0.382683 0.000000 |
R = | 0.382683 0.923880 0.000000 |
    | 0.000000 0.000000 1.000000 |
Translation vector :
t = < 0.000000, 0.000000, 0.200000 >
------------checking PCL NDT(CPU)----------------
PCL align Time: 79.5729 ms.
Normal Distributions Transform has converged: 1 score: 0.541405
Rotation matrix :
    | 0.999255 0.009666 0.037361 |
R = | -0.008477 0.999457 -0.031832 |
    | -0.037648 0.031492 0.998795 |
Translation vector :
t = < 0.037921, 0.105031, 0.180639 >
------------checking CUDA NDT(GPU)----------------
CUDA NDT by Time: 73.6361 ms.
CUDA NDT fitness_score: 0.538531
Rotation matrix :
    | 0.999171 0.010360 0.039369 |
R = | -0.009026 0.999384 -0.033907 |
    | -0.039696 0.033523 0.998649 |
Translation vector :
t = < 0.056860, 0.143136, 0.188665 >
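
To make the comparison concrete, here is a minimal Python sketch that only reports a speedup when the two fitness scores are close. The `RunResult` class, the helper names, and the 5% tolerance are assumptions made for illustration and are not part of cuPCL or PCL; the numbers are copied from the log above.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """One registration run (hypothetical container, not a cuPCL type)."""
    name: str
    time_ms: float
    fitness_score: float  # lower is better


def scores_comparable(cpu: RunResult, gpu: RunResult, rel_tol: float = 0.05) -> bool:
    """True when the CPU score is better than, or within rel_tol of, the GPU score.

    rel_tol = 5% is an arbitrary illustrative threshold.
    """
    return cpu.fitness_score <= gpu.fitness_score * (1.0 + rel_tol)


def speedup(cpu: RunResult, gpu: RunResult) -> float:
    """CPU time divided by GPU time; values above 1 mean the GPU is faster."""
    return cpu.time_ms / gpu.time_ms


# Values taken from the NDT log above.
cpu = RunResult("PCL NDT (CPU)", time_ms=79.5729, fitness_score=0.541405)
gpu = RunResult("CUDA NDT (GPU)", time_ms=73.6361, fitness_score=0.538531)

if scores_comparable(cpu, gpu):
    print(f"GPU speedup: {speedup(cpu, gpu):.2f}x")
else:
    print("Fitness scores differ too much; the timings are not comparable.")
```

With the values from this log the scores are within the tolerance, and the GPU run is only about 8% faster than the CPU run.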

Thanks.

Why is this different from the results on GitHub, which show that the CUDA GPU versions are much faster and have lower scores?
