Hello,
I evaluated UniFace on a Jetson Orin Nano with the following setup:
Jetson Orin Nano, JetPack 6.1, L4T R36.4.7, Python 3.10, CUDA 12.6 (default with JP 6.1)
Functional status (CPU)
UniFace runs successfully with onnxruntime (CPU).
I verified:
- Face detection and recognition pipelines work correctly
- Valid 512-D face embeddings are generated
Performance (CPU):
- Total latency: ~500 ms per frame (~2 FPS)
- Face detection ≈ 93% of total time
This is not sufficient for real-time use.
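For reference, a minimal sketch of how per-stage numbers like the ones above can be collected. The `stages` callables are hypothetical placeholders for the detection and recognition steps, not UniFace APIs, and the helper names are my own.

```python
import time

def time_stages(stages, frame, repeats=10):
    """Run each (name, fn) stage in order and return mean seconds per stage.

    `stages` is a list of (name, callable) pairs; each callable receives the
    previous stage's output. These are placeholders, not UniFace functions.
    """
    totals = {name: 0.0 for name, _ in stages}
    for _ in range(repeats):
        data = frame
        for name, fn in stages:
            start = time.perf_counter()
            data = fn(data)
            totals[name] += time.perf_counter() - start
    return {name: total / repeats for name, total in totals.items()}

def summarize(mean_seconds):
    """Return (total seconds per frame, frames per second, share per stage)."""
    total = sum(mean_seconds.values())
    fps = 1.0 / total if total > 0 else float("inf")
    shares = {
        name: (sec / total if total > 0 else 0.0)
        for name, sec in mean_seconds.items()
    }
    return total, fps, shares
```

With roughly 465 ms in detection and 35 ms in recognition, `summarize` gives about 0.5 s per frame, about 2 FPS, and a detection share of about 0.93, which matches the figures reported above.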
GPU acceleration issue
UniFace requires onnxruntime-gpu for hardware acceleration.
However:
- A compatible onnxruntime-gpu wheel for JetPack 6.1 (L4T R36.4.7) is not available
- Attempted installation from Jetson AI Lab → missing wheel / 404
- Installed wheel: onnxruntime_gpu-1.23.x-cp310-linux_aarch64.whl, but ort.get_available_providers() returns:
  ['AzureExecutionProvider', 'CPUExecutionProvider']
- CUDAExecutionProvider is missing even though:
  - CUDA is working
  - torch.cuda.is_available() == True
  - GPU is visible in tegrastats
So the wheel appears to be a CPU-only build.
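For anyone reproducing this, a minimal sketch of the provider check. `select_providers`, `has_gpu_provider`, and `report_onnxruntime` are helper names I made up for illustration; only `ort.get_available_providers()`, `ort.get_device()`, and `ort.__version__` come from onnxruntime. Note that onnxruntime spells the TensorRT provider `TensorrtExecutionProvider`.

```python
# Preferred order: TensorRT first, then CUDA, then CPU as the fallback.
PREFERRED = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]

def select_providers(available, preferred=PREFERRED):
    """Return the preferred providers that the installed build exposes."""
    chosen = [p for p in preferred if p in available]
    return chosen or ["CPUExecutionProvider"]

def has_gpu_provider(available):
    """True if the build exposes a CUDA or TensorRT execution provider."""
    gpu = ("TensorrtExecutionProvider", "CUDAExecutionProvider")
    return any(p in available for p in gpu)

def report_onnxruntime():
    """Print what the installed onnxruntime build reports (run on the device)."""
    import onnxruntime as ort  # imported here so the helpers work without it

    available = ort.get_available_providers()
    print("onnxruntime", ort.__version__, "device:", ort.get_device())
    print("available providers:", available)
    print("GPU provider present:", has_gpu_provider(available))
    print("would select:", select_providers(available))
```

Calling `report_onnxruntime()` on the Jetson with the wheel described above should show only the Azure and CPU providers, so `select_providers` falls back to `['CPUExecutionProvider']`.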
Is there an official onnxruntime-gpu build for JetPack 6.1 (Python 3.10) that includes:
- CUDAExecutionProvider
- TensorrtExecutionProvider?
If not available:
- Is building ONNX Runtime from source the recommended approach for JetPack 6?
- Is there a recommended alternative inference path for ONNX models on JetPack 6 for real-time performance?