Custom key point detection, not face related

david9xqqb · February 9, 2023, 7:43pm

I’m looking to implement a key point detection model for a custom object that is not face related.

It could also be similar to applying transfer learning from body pose estimation to a non-human object.

What is the best way to do that in TAO?

Thanks!

Morganh · February 10, 2023, 5:46am

For generic key point detection, please download the fpenet notebook. See the TAO Toolkit Quick Start Guide (TAO Toolkit 4.0 documentation) for the steps.
The notebook path is notebooks/tao_launcher_starter_kit/fpenet/fpenet.ipynb

david9xqqb · February 14, 2023, 11:07am

Thanks!!

I’ll try that

Dave

david9xqqb · February 17, 2023, 1:33pm

@Morganh I ran all the steps in the notebook, but it fails with no clear error:

2023-02-17 13:26:07.302193: F ./tensorflow/core/kernels/random_op_gpu.h:225] Non-OK-status: GpuLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: the provided PTX was compiled with an unsupported toolchain.
[71ad73f96a9a:00198] *** Process received signal ***
[71ad73f96a9a:00198] Signal: Aborted (6)
[71ad73f96a9a:00198] Signal code:  (-6)
[71ad73f96a9a:00198] [ 0] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x7f66a9358090]
[71ad73f96a9a:00198] [ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f66a935800b]
[71ad73f96a9a:00198] [ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f66a9337859]
[71ad73f96a9a:00198] [ 3] /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/_pywrap_tensorflow_internal.so(+0x20baf4)[0x7f66a5840af4]
[71ad73f96a9a:00198] [ 4] /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/../libtensorflow_cc.so.1(_ZN10tensorflow7functor16FillPhiloxRandomIN5Eigen9GpuDeviceENS_6random19UniformDistributionINS4_12PhiloxRandomEfEEEclEPNS_15OpKernelContextERKS3_S6_PfxS7_+0x1d5)[0x7f65f890e7e5]
[71ad73f96a9a:00198] [ 5] /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/../libtensorflow_cc.so.1(+0x8ed75ea)[0x7f65f890b5ea]
[71ad73f96a9a:00198] [ 6] /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/../libtensorflow_framework.so.1(_ZN10tensorflow13BaseGPUDevice7ComputeEPNS_8OpKernelEPNS_15OpKernelContextE+0x3cb)[0x7f66a47333db]
[71ad73f96a9a:00198] [ 7] /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/../libtensorflow_framework.so.1(+0x113caa7)[0x7f66a4790aa7]
[71ad73f96a9a:00198] [ 8] /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/../libtensorflow_framework.so.1(+0x113d10f)[0x7f66a479110f]
[71ad73f96a9a:00198] [ 9] /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/../libtensorflow_framework.so.1(_ZN5Eigen15ThreadPoolTemplIN10tensorflow6thread16EigenEnvironmentEE10WorkerLoopEi+0x285)[0x7f66a4845725]
[71ad73f96a9a:00198] [10] /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/../libtensorflow_framework.so.1(_ZNSt17_Function_handlerIFvvEZN10tensorflow6thread16EigenEnvironment12CreateThreadESt8functionIS0_EEUlvE_E9_M_invokeERKSt9_Any_data+0x48)[0x7f66a4842268]
[71ad73f96a9a:00198] [11] /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/../libtensorflow_framework.so.1(+0x18d69a0)[0x7f66a4f2a9a0]
[71ad73f96a9a:00198] [12] /usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609)[0x7f66a92fa609]
[71ad73f96a9a:00198] [13] /usr/lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7f66a9434133]
[71ad73f96a9a:00198] *** End of error message ***
Telemetry data couldn't be sent, but the command ran successfully.
[WARNING]: <urlopen error [Errno -2] Name or service not known>
Execution status: FAIL
2023-02-17 15:26:08,361 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

The complete error log here:

fpenet error log 2023 02 17.txt (22.8 KB)

The spec file unmodified:

experiment_spec.yaml (2.3 KB)

Thanks!

David

Morganh · February 19, 2023, 5:12pm

Please update the NVIDIA driver. You can search this forum for this error and find similar error logs.

david9xqqb · February 19, 2023, 8:12pm

That’s a scary proposition: in the past, updating the NVIDIA GPU driver corrupted two workstations, which then needed a full reinstall…

My current driver version is:

nvidia-smi
Sun Feb 19 21:50:19 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.108.03   Driver Version: 510.108.03   CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+

And I’ll be updating to the latest, 525…

Thanks!

david9xqqb · February 19, 2023, 9:31pm

And sure enough, the computer now gets stuck on boot after upgrading to driver 525…

Morganh · February 20, 2023, 1:58am

Use the method below.
Check the driver version with nvidia-smi. If it is version 470, then:

Uninstall current driver:
sudo apt purge nvidia-driver-470
sudo apt autoremove
sudo apt autoclean

Install the new driver:
sudo apt install nvidia-driver-520
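
The check-then-purge step above can be sketched as a small script. This is only a sketch under assumptions: it assumes an Ubuntu system where the driver was installed from an apt package named nvidia-driver-<major>, as in the commands above. The function names driver_major, purge_command, and installed_driver are hypothetical helpers, not part of TAO or the NVIDIA tooling. The script only prints the purge command and does not run it.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of TAO or NVIDIA tooling.

# Extract the major version from a driver string such as "510.108.03".
driver_major() {
    printf '%s\n' "${1%%.*}"
}

# Build the apt purge command for a given driver version string.
# Assumption: the installed package is named nvidia-driver-<major>.
purge_command() {
    printf 'sudo apt purge nvidia-driver-%s\n' "$(driver_major "$1")"
}

# Query the installed driver version; prints nothing if nvidia-smi is absent.
installed_driver() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1
    fi
}

# Print (do not run) the purge command that matches the installed driver.
# The case guard skips anything that does not look like a version number,
# for example an error message printed when the driver is not loaded.
version="$(installed_driver)"
case "$version" in
    [0-9]*) purge_command "$version" ;;
esac
```

For the driver reported earlier in this thread, purge_command "510.108.03" prints sudo apt purge nvidia-driver-510. Review the printed command before running it, since a driver installed another way (for example from a .run installer) would need a different removal procedure.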

david9xqqb · February 20, 2023, 12:18pm

I was able to complete training after upgrading the driver to 525.78.01.

However, the results are not very good. In the following image, which is used in section 7 of the default notebook, one face is not detected at all and no key points are generated for it. For the second face, some of the key points are very far from their expected locations…

Thanks!

Morganh · February 25, 2023, 4:03pm

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

Please share the models/exp1/result.txt
