Cannot run model tested on GPU on TX2 - garbage returned.

The story:

  1. I have a Jetson TX2 to run a model on (to detect things in images)
  2. I'm given a model, which is converted to UFF.
  3. I use bin2c.py to produce C-like code to include in my sampleUffMNIST.cpp (a heavily patched sample file)
  4. I use sample build setup to produce the binary out of that sampleUffMNIST.cpp (right on that Jetson)
  5. I run that binary and it waits on AF_UNIX socket...
  6. I run feed.py, which grabs the images and feeds them to sampleUffMNIST.cpp over that local socket, and
  7. feed.py fetches the output from sampleUffMNIST.cpp (which is expected to be a matrix of "probabilities" of a pixel belonging to an object), then
  8. the matrix is applied to the source image, and we get the image with everything masked out except the objects found.
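The feed.py side of steps 6–7 can be sketched as follows. This is a minimal illustration, assuming a simple length-prefixed framing over the AF_UNIX socket; the actual protocol in the linked source may differ, and the helper names are mine:

```python
import socket
import struct

def send_frame(sock, payload: bytes) -> None:
    """Send a 4-byte big-endian length header followed by the payload."""
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_frame(sock) -> bytes:
    """Read the length header, then exactly that many payload bytes."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)

def _recv_exact(sock, n: int) -> bytes:
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        buf += chunk
    return buf

if __name__ == "__main__":
    # Demonstrate the framing with a connected AF_UNIX socket pair.
    # In the real setup, feed.py would connect() to the path that the
    # patched sampleUffMNIST.cpp binary is listening on.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    send_frame(a, b"fake image bytes")
    print(recv_frame(b))
```

A length prefix like this avoids one classic source of "garbage" output: treating a partial recv() as a complete probability matrix.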

I have a mockup in Python 3 (the same original sampleUffMNIST.cpp, just rewritten) on a PC with a GPU, and it works fine!
But I have no TensorRT Python binding on Jetson.
Hence, I have to code in C++ (which is not my favorite by any means) to run the engine.

This doesn’t work. I get garbage on the output.

Here we are.

The details:

  • Jetson TX2 running tegra-ubuntu 4.4.38-tegra
  • tensorrt 4.0.2.0-1+cuda9.0 package on TX2
  • tensorrt 5.0.0.10-1+cuda9.0 package on PC/GPU
  • the source code for mentioned files is available here

Any help would be appreciated!

A few words more: the GPU is a 1080 (not a “Ti”); both FP16 and FP32 modes were attempted; the 1080 prefers FP32, the TX2 prefers FP16…

Even more: if I build the same patched sampleUffMNIST.cpp on PC/GPU and run the routine described above… it works!

And fails on Jetson.

Why??

Let’s try the low-hanging fruit first. Can you try updating to JetPack 4.1, which contains TRT5 (matching your desktop configuration), and see if the results improve?

Well, I’ll try to.

But it’s said that 4.1 is for Xavier only…

Yes, it is.

I mean, no way (the “Next” button fails here): https://i.imgur.com/wWja5v4.png (I have no right to use the img tag)

This fruit hangs high enough.

An attempt to forcibly upgrade (apt-get install --only-upgrade tensorrt) fails too:

Reading state information... Done
tensorrt is already the newest version (4.0.2.0-1+cuda9.0).

A few more details: an older version of the model (for 256x256 source images) looks OK on both platforms.
The new one (for 512x512 source images) does not.
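One hypothesis worth checking (not confirmed from the code, and all names below are illustrative): a host or device buffer, or a reshape on the feed.py side, still sized for the old 256x256 model while the engine now emits 512x512 output. The arithmetic makes the mismatch obvious:

```python
# Output element counts per image for the two model versions.
OLD_DIM = 256
NEW_DIM = 512
CHANNELS = 1  # illustrative; the real model's channel count may differ

old_elems = OLD_DIM * OLD_DIM * CHANNELS
new_elems = NEW_DIM * NEW_DIM * CHANNELS

# The 512x512 model produces 4x as many output elements. A buffer
# allocated (or a matrix reshaped) with the old dimensions would cover
# only a quarter of the output and leave the rest as uninitialized
# garbage, which can easily look like the symptom described above.
print(new_elems // old_elems)  # 4
```

In the C++ port it is safer to size buffers from the engine's own binding dimensions at runtime rather than from hard-coded constants, so a model swap cannot silently desynchronize them.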

I don’t understand how to map this fact to a possible reason or fix…