Hi! I'm trying to run inference with a Citrinet model on a Jetson Nano. After exporting the model (NeMo → ONNX → TensorRT), the output of the TRT engine on my x86 PC doesn't match the output on the Jetson Nano.
Both TRT engines were built with the same precision and workspace size.
My engine bindings:
binding 0 - 'audio_signal'
input: True
shape: (1, 80, -1)
dtype: DataType.FLOAT
size: -320
dynamic: True
profiles: [{'min': (1, 80, 10), 'opt': (1, 80, 150), 'max': (1, 80, 300)}]
binding 1 - 'length'
input: True
shape: (1,)
dtype: DataType.INT32
size: 4
dynamic: False
profiles: [{'min': (1,), 'opt': (1,), 'max': (1,)}]
binding 2 - 'logprobs'
input: False
shape: (1, -1, 1025)
dtype: DataType.FLOAT
size: -4100
dynamic: True
profiles: []
When I feed these tensors to the input bindings:
torch.Size([1, 80, 151]) - audio_signal
tensor([[[-0.3492, -0.3492, -0.3492, ..., 4.5867, 4.7051, 0.0000],
[-0.3971, -0.3971, -0.3971, ..., 2.7422, 2.7433, 0.0000],
[-0.3963, -0.3963, -0.3963, ..., 2.1081, 2.1132, 0.0000],
...,
[-0.4859, -0.4859, -0.4859, ..., 2.3163, 1.9124, 0.0000],
[-0.4717, -0.4717, -0.4717, ..., 2.5303, 1.0611, 0.0000],
[-0.4861, -0.4861, -0.4861, ..., 1.7891, 1.3740, 0.0000]]])
torch.Size([1]) - length
tensor([150])
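Before copying the features to the device I run a quick sanity check, to rule out bad inputs on my side (a minimal NumPy sketch; the threshold is the fp16 max, since both engines are built in fp16, and the function name is just mine):

```python
import numpy as np

FP16_MAX = float(np.finfo(np.float16).max)  # 65504.0

def check_input(features: np.ndarray) -> None:
    """Verify the feature tensor is finite and representable in fp16."""
    assert np.isfinite(features).all(), "input contains NaN/inf"
    assert np.abs(features).max() <= FP16_MAX, "input overflows the fp16 range"

# values in the same range as my log-mel features above
feats = np.random.uniform(-0.5, 5.0, size=(1, 80, 151)).astype(np.float32)
check_input(feats)  # passes on both machines, so the input itself looks fine
```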
the output on the PC is:
[[2.43806308e-08 1.67615531e-06 1.31286612e-07 ... 2.07928537e-08
4.35268888e-08 9.99820173e-01]
[2.12243876e-08 1.40326529e-06 1.04342654e-07 ... 2.41917171e-08
4.14337720e-08 9.99853849e-01]
[2.73860437e-08 1.28268800e-06 1.74102283e-07 ... 4.25407798e-08
5.49981323e-08 9.99853611e-01]
...
[9.64823510e-09 2.46931700e-06 1.78903429e-06 ... 7.19314357e-07
9.83519470e-08 9.98588264e-01]
[8.09493415e-08 1.92733733e-05 5.31556134e-06 ... 4.09953236e-06
1.16596459e-06 9.96734917e-01]
[9.93186688e-08 1.32259365e-05 1.55816360e-06 ... 8.91289631e-07
1.00996351e-06 9.99091506e-01]]
but when I run the same code on the Jetson Nano, I always get NaN:
[[nan nan nan ... nan nan nan]
[nan nan nan ... nan nan nan]
[nan nan nan ... nan nan nan]
...
[nan nan nan ... nan nan nan]
[nan nan nan ... nan nan nan]
[nan nan nan ... nan nan nan]]
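To quantify the mismatch between the two machines I use a small diagnostic helper (my own sketch, assuming both `logprobs` outputs have been saved as NumPy arrays; the names here are placeholders):

```python
import numpy as np

def diff_report(pc: np.ndarray, jetson: np.ndarray) -> dict:
    """Count NaNs in each run and measure the largest disagreement."""
    clean = not (np.isnan(pc).any() or np.isnan(jetson).any())
    return {
        "nan_pc": int(np.isnan(pc).sum()),
        "nan_jetson": int(np.isnan(jetson).sum()),
        # only meaningful when neither output contains NaN
        "max_abs_diff": float(np.abs(pc - jetson).max()) if clean else float("nan"),
    }

# e.g. diff_report(np.load("logprobs_pc.npy"), np.load("logprobs_jetson.npy"))
```

On my dumps `nan_jetson` equals the full tensor size, i.e. every single element is NaN, not just a few.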
trt_builder.py is identical on the PC and the Jetson Nano. After building the TRT engines, the PC version is 333 MB (fp16) while the Jetson Nano version is 1104 MB (fp16), so the builder is clearly producing different engines on the two platforms.
I really have no idea why the Jetson Nano model doesn't work as expected.
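One theory I'm checking: if some intermediate activation overflows the fp16 range on the Jetson (different kernels/tactics may be selected there), it becomes inf, and any subsequent softmax-style normalization turns it into NaN. A tiny NumPy reproduction of that failure mode:

```python
import numpy as np

# fp16 can only represent magnitudes up to ~65504;
# a value that fits in fp32 overflows to inf on the cast down
x = np.array([70000.0], dtype=np.float32).astype(np.float16)
assert np.isinf(x).all()

# inf - inf (e.g. subtracting the max logit inside a softmax) yields nan,
# which then propagates through the whole output tensor
y = x - x
assert np.isnan(y).all()
```

If that is what's happening, rebuilding the Jetson engine in fp32 (or marking the offending layers to run in fp32) should make the NaNs disappear, which would at least confirm the cause.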