With the cuda-segmentation demo, I get a CUDA error in the cudaExtractCluster part, while the cudaSegmentation part runs fine.
Other demos such as cuda-filter run fine.
The failure occurs because the scan range is too large to fit in the cache.
We tested the default sample.pcd on JetPack 4.6 and it runs without error.
$ ./demo ./sample.pcd
GPU has cuda devices: 1
----device id: 0 info----
GPU : Xavier
Capbility: 7.2
Global memory: 31920MB
Const memory: 64KB
SM in a block: 48KB
warp size: 32
threads in a block: 1024
block dim: (1024,1024,64)
grid dim: (2147483647,65535,65535)
-------------------------
CUDA segment by Time: 20.6349 ms.
CUDA modelCoefficients: -0.00269913 0.0424975 0.999093 2.10639
CUDA find points: 7519
-------------------------
PCL(CPU) segment by Time: 69.4973 ms.
Model coefficients: -0.0026991 0.0424981 0.999093 2.10639
Model inliers: 7519
Your test_P.pcd file is an HTML file rather than a PCD file.
Could you generate a data file in valid PCD format and try again?
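If it helps, here is a rough way to check what a downloaded file really contains before passing it to the demo. This is only a sketch: the helper name sniff_file_kind and the 512-byte read size are made up for illustration, and the check only looks at the first bytes of the file, so it is a sanity test, not a full PCD validator.

```python
import sys

# Header keywords defined by the PCD v0.7 file format.
PCD_KEYWORDS = {
    b"VERSION", b"FIELDS", b"SIZE", b"TYPE", b"COUNT",
    b"WIDTH", b"HEIGHT", b"VIEWPOINT", b"POINTS", b"DATA",
}


def sniff_file_kind(head: bytes) -> str:
    """Guess whether the first bytes of a file look like HTML or a PCD header.

    Hypothetical helper for illustration; returns "html", "pcd", or "unknown".
    """
    lowered = head.lstrip().lower()
    if lowered.startswith(b"<!doctype html") or lowered.startswith(b"<html"):
        return "html"
    for line in head.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines such as "# .PCD v0.7 - ...".
        if not stripped or stripped.startswith(b"#"):
            continue
        # The first real line of a PCD file should start with a header keyword.
        first_token = stripped.split()[0]
        return "pcd" if first_token in PCD_KEYWORDS else "unknown"
    return "unknown"


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        print(sniff_file_kind(f.read(512)))
```

Saved under any script name, it takes the file path as its only argument. A file saved from a web page instead of the raw download link would typically be reported as html.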
sample.pcd
# .PCD v0.7 - Point Cloud Data file format
VERSION 0.7
FIELDS x y z
SIZE 4 4 4
TYPE F F F
COUNT 1 1 1
WIDTH 119978
HEIGHT 1
VIEWPOINT 0 0 0 1 0 0 0
POINTS 119978
DATA ascii
62.7612 8.7600002 2.3940001
60.686398 8.6639996 2.3243999
58.545597 8.7323999 2.2523999
58.503597 8.9136 2.2523999
57.5868 8.9591999 2.2212
...
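For comparison with the header above, a small consistency check on a PCD header could look like the sketch below. The function names parse_pcd_header and header_problems are hypothetical, and the rules checked (per-field entry counts match FIELDS, and WIDTH times HEIGHT equals POINTS) are my reading of the PCD v0.7 format, so please treat it as a starting point rather than a complete validator.

```python
def parse_pcd_header(text: str) -> dict:
    """Collect PCD header entries as {keyword: [values...]}, stopping at DATA."""
    header = {}
    for line in text.splitlines():
        line = line.strip()
        # Ignore blank lines and comments such as "# .PCD v0.7 - ...".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        header[key] = value.split()
        if key == "DATA":
            break  # point data follows; the header ends here
    return header


def header_problems(header: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    required = ("FIELDS", "SIZE", "TYPE", "WIDTH", "HEIGHT", "POINTS", "DATA")
    problems = [f"missing {key}" for key in required if key not in header]
    if problems:
        return problems
    n_fields = len(header["FIELDS"])
    # SIZE, TYPE and (if present) COUNT should have one entry per field.
    for key in ("SIZE", "TYPE", "COUNT"):
        if key in header and len(header[key]) != n_fields:
            problems.append(f"{key} has {len(header[key])} entries, expected {n_fields}")
    width = int(header["WIDTH"][0])
    height = int(header["HEIGHT"][0])
    points = int(header["POINTS"][0])
    if width * height != points:
        problems.append(f"WIDTH*HEIGHT = {width * height} but POINTS = {points}")
    return problems
```

Calling header_problems(parse_pcd_header(text)) on the text of a file's first lines returns an empty list when these checks pass.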
We tested it on Xavier with JetPack 4.6.
We confirmed that the cuda-segmentation sample works correctly.
$ ./demo test14frame350.pcd
GPU has cuda devices: 1
----device id: 0 info----
GPU : Xavier
Capbility: 7.2
Global memory: 31920MB
Const memory: 64KB
SM in a block: 48KB
warp size: 32
threads in a block: 1024
block dim: (1024,1024,64)
grid dim: (2147483647,65535,65535)
-------------------------
CUDA segment by Time: 11.1412 ms.
CUDA modelCoefficients: -0.00629804 -0.0541567 0.998513 2.0456
CUDA find points: 1392
-------------------------
PCL(CPU) segment by Time: 14.0393 ms.
Model coefficients: -0.00629818 -0.0541568 0.998513 2.0456
Model inliers: 1392