Hi,
I’m working with the NVIDIA ML tools for the first time, and I’m up to my eyeballs in documentation, none of which seems to lead to something that works.
I’m running TAO Mask R-CNN segmentation, and I have a working model thanks to the very nice Jupyter notebook example.
However, I’ve hit several dead ends trying to deploy the model on a Jetson device. There seem to be many deployment methods, and none of them has worked for me.
I have tried exporting an FP32 .engine file from my desktop and running it on the Jetson with TensorRT in Python via runtime.deserialize_cuda_engine(f.read()).
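For reference, the loading code follows the standard TensorRT sample pattern (a minimal sketch; the filename is just my exported engine):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize the engine that was exported on the desktop.
with open("model.step-25000.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())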
Running that yields:
Reading engine from file model.step-25000.engine
[11/17/2022-15:07:26] [TRT] [E] 1: [stdArchiveReader.cpp::StdArchiveReader::40] Error Code 1: Serialization (Serialization assertion stdVersionRead == serializationVersion failed.Version tag does not match. Note: Current Version: 213, Serialized Engine Version: 205)
[11/17/2022-15:07:26] [TRT] [E] 4: [runtime.cpp::deserializeCudaEngine::49] Error Code 4: Internal Error (Engine deserialization failed.)
I have tried exporting an INT8 version from TAO, running it through tao-converter on the Orin, and loading the resulting engine. It seems to take FP32 input (which makes no sense to me for an INT8 engine) and returns zero detections. (The outputs are also undocumented, so I’m not entirely sure how to interpret them. Is there any source available for how TAO runs the model?)
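In case it helps, this is how I’ve been inspecting what the converted engine expects and returns (a sketch using the TensorRT 8.x binding API; the engine filename is a placeholder for mine):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

with open("model.int8.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# Print every binding with its direction, dtype, and shape, to see
# what the engine actually takes in and puts out.
for i in range(engine.num_bindings):
    print(engine.get_binding_name(i),
          "input" if engine.binding_is_input(i) else "output",
          engine.get_binding_dtype(i),
          engine.get_binding_shape(i))

This is where I saw the FP32 input binding mentioned above.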
I have tried reading the .etlt file directly into DeepStream. There is no example file for this anywhere, but following the examples in the TLT user guide, I tried running deepstream-app with a custom configuration file. That yielded:
:00:00.221617212 9295 0xaaaadbbdf130 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
WARNING: [TRT]: The implicit batch dimension mode has been deprecated. Please create the network with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag whenever possible.
ERROR: [TRT]: 3: conv1/Conv2D:kernel weights has count 3528 but 0 was expected
ERROR: [TRT]: 4: conv1/Conv2D: count of 3528 weights in kernel, but kernel dimensions (7,7) with 0 input channels, 24 output channels and 1 groups were specified. Expected Weights count is 0 * 7*7 * 24 / 1 = 0
along with many other errors.
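In case the mistake is in my config, the nvinfer section I pieced together from the TLT-guide examples looked roughly like this (a sketch: the key, input dims, class count, and paths stand in for my real values, and I’m not certain the output blob names are right for my model):

[property]
tlt-model-key=<my key>
tlt-encoded-model=model.step-25000.etlt
infer-dims=3;832;1344
network-type=3
num-detected-classes=2
output-blob-names=generate_detections;mask_fcn_logits/BiasAdd
output-instance-mask=1
cluster-mode=4
parse-bbox-instance-mask-func-name=NvDsInferParseCustomMrcnnTLTV2
custom-lib-path=<path to libnvds_infercustomparser_tao.so>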
I have no doubt I’m doing something wrong, but neither exporting .engine files nor importing .etlt files seems to work.
In many of my dead-end attempts, I got fatal warnings about implicit batch size not being implemented in the model. I suspect my attempts to export .engine files were foiled by mismatched TensorRT versions between the Jetson (which only supports up to 8.4.1) and TAO (which is on 8.5, I think); the serialization assertion in the first error above ("Current Version: 213, Serialized Engine Version: 205") looks like exactly that, since an .engine file can only be deserialized by the same TensorRT version that built it.
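For what it’s worth, this is how I’m checking the TensorRT version on each side (the same snippet run on the Jetson and inside the TAO container):

import tensorrt as trt
print(trt.__version__)  # 8.4.1.x on my Orin under JetPack 5.0.2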
I am running the most recent release of TAO (downloaded yesterday).
I am trying to run on a Jetson Orin devkit, which I just upgraded to JetPack 5.0.2.
Can anyone point me at a complete, working recipe for running Mask R-CNN on a Jetson? Eventually I’ll want it running in my own codebase, so the deepstream-app route is not ideal, but I’d like to start with anything that works.
Thanks,
Nathaniel Tagg