What is the GPU memory limit of the DRIVE Orin 32GB version?

Please provide the following info (tick the boxes after creating this topic):
Software Version
[x] DRIVE OS 6.0.5
[ ] DRIVE OS 6.0.4 (rev. 1)
[ ] DRIVE OS 6.0.4 SDK
[ ] other

Target Operating System
[x] Linux
[ ] QNX
[ ] other

Hardware Platform
[ ] DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
[ ] DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
[x] DRIVE AGX Orin Developer Kit (not sure of its number)
[ ] other

SDK Manager Version
[x] 1.9.1.10844
[ ] other

Host Machine Version
[x] native Ubuntu Linux 20.04 Host installed with SDK Manager
[ ] native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
[ ] native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
[ ] other

I have encountered a problem with TensorRT model deployment.
The error message is as follows:

[02/22/2023-02:24:45] [V] [TRT] --------------- Timing Runner: {ForeignNode[4107…ScatterND_8175]} (Myelin)
[02/22/2023-02:25:03] [W] [TRT] Skipping tactic 0 due to insufficient memory on requested size of 110933120 detected for tactic 0x0000000000000000.
[02/22/2023-02:25:03] [V] [TRT] Fastest Tactic: 0xd15ea5edd15ea5ed Time: inf
[02/22/2023-02:25:03] [V] [TRT] Deleting timing cache: 3792 entries, served 17385 hits since creation.
[02/22/2023-02:25:03] [E] Error[4]: [optimizer.cpp::computeCosts::3635] Error Code 4: Internal Error (Could not find any implementation for node {ForeignNode[4107…ScatterND_8175]} due to insufficient workspace. See verbose log for requested sizes.)
[02/22/2023-02:25:03] [E] Error[2]: [builder.cpp::buildSerializedNetwork::636] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
[02/22/2023-02:25:03] [E] Engine could not be created from network
[02/22/2023-02:25:03] [E] Building engine failed
[02/22/2023-02:25:03] [E] Failed to create engine from model.
[02/22/2023-02:25:03] [E] Engine set up failed

However, the model can be deployed with TensorRT on a 3060 Ti with 12 GB of GPU memory.
So I want to know: what is the GPU memory limit of the DRIVE Orin 32GB version, and how can I solve the above problem?

Dear @WanchaoYao,
Note that on Tegra, the DRAM is shared by both the CPU and the GPU.
Did you copy any data/files onto the target? Could you share the output of df -h on the target?
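
Since the DRAM is shared, it is also worth checking how much of it is actually free while the engine is being built. A generic way to do that on the Linux target (standard Ubuntu tools, nothing DRIVE-specific assumed):

# Total/used/available DRAM; on Tegra this one pool serves both CPU and GPU
free -h
# Watch it live while trtexec runs to catch the peak usage
watch -n 1 free -h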

nvidia@tegra-ubuntu:~$ df -h
Filesystem       Size  Used Avail Use% Mounted on
/dev/vblkdev0p1   27G   16G  9.1G  64% /
none              14G     0   14G   0% /dev
tmpfs             14G  8.4M   14G   1% /dev/shm
tmpfs            2.8G  1.8M  2.8G   1% /run
tmpfs            5.0M     0  5.0M   0% /run/lock
tmpfs             14G     0   14G   0% /sys/fs/cgroup
tmpfs            2.8G   12K  2.8G   1% /run/user/110
tmpfs            2.8G  4.0K  2.8G   1% /run/user/1000

@SivaRamaKrishnaNV Is this related to insufficient memory during TensorRT model conversion?

Dear @WanchaoYao,
I can see 16 GB of memory available on the target now. It could be an issue with the workspace size. Could you try increasing the workspace size with the --workspaceSize parameter in trtexec?
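
For example (a sketch, not a command taken from this thread; model.onnx and the 4096 MiB value are placeholders, and the exact flag name depends on the trtexec release):

# Older trtexec builds: workspace limit given in MiB
trtexec --onnx=model.onnx --saveEngine=model.engine --workspace=4096
# TensorRT 8.4 and later deprecate --workspace in favor of memory pool limits
trtexec --onnx=model.onnx --saveEngine=model.engine --memPoolSize=workspace:4096M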

@SivaRamaKrishnaNV After increasing the workspaceSize parameter, the conversion and test passed. Could you explain what workspaceSize actually means?

Dear @WanchaoYao ,

[02/22/2023-02:25:03] [E] Error[4]: [optimizer.cpp::computeCosts::3635] Error Code 4: Internal Error (Could not find any implementation for node {ForeignNode[4107…ScatterND_8175]} due to insufficient workspace. See verbose log for requested sizes.)

It limits the maximum amount of scratch memory that any layer in the network can use. If insufficient workspace is provided, TensorRT may not be able to find an implementation for a layer.
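
To make that concrete, here is a minimal sketch of setting the same limit through the TensorRT Python API (assuming TensorRT 8.x; model.onnx is a placeholder, and set_memory_pool_limit is the newer replacement for the older max_workspace_size attribute):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # placeholder model path
    if not parser.parse(f.read()):
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
# Raise the workspace (per-layer scratch memory) limit so tactics that
# need large temporary buffers are not skipped during the engine build.
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 4 << 30)  # 4 GiB

engine_bytes = builder.build_serialized_network(network, config)

trtexec's workspace flag sets this same builder-config limit from the command line.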
