Hello,
I am trying to deploy a UNet model that I trained with the NVIDIA TAO Toolkit and exported as an ONNX file. The ONNX model checker reports that the model is valid. After converting the model to a TRT engine file, however, the inference operator throws the following error (note that the message itself is just an unformatted "%s"):
[error] [infer_utils.cpp:41] %s
Model Properties:
format: ONNX v6
producer: tf2onnx 1.9.2
version: 0
imports: ai.onnx v11
graph: tf2onnx
description: test

INPUTS:
name: input_1:0
tensor: float32[1,3,224,224]

OUTPUTS:
name: argmax_1
tensor: int64[1,224,224,1]
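
For completeness, this is essentially the check that passes (a minimal sketch using the onnx Python package; the path is adapted to where the model lives):

import onnx

# Load the exported model and run the ONNX checker; it raises if the model is invalid
model = onnx.load("model.fixed.onnx")
onnx.checker.check_model(model)
print("model is valid")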
Code (excerpt from my application's compose(); imports added for completeness):

import os

from holoscan.operators import FormatConverterOp, InferenceOp, VideoStreamReplayerOp
from holoscan.resources import BlockMemoryPool, CudaStreamPool, MemoryStorageType

# source parameters
source_width = 1920
source_height = 1080
bpp = 4  # bytes per channel (float32 after format conversion)
n_channels = 3  # RGB
in_dtype = "rgb888"
source_block_size = source_width * source_height * n_channels * bpp
source_num_blocks = 2
source_pool_kwargs = dict(
    storage_type=MemoryStorageType.DEVICE,
    block_size=source_block_size,
    num_blocks=source_num_blocks,
)
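# (For reference: 1920 * 1080 * 3 channels * 4 bytes = 24,883,200 bytes per block;
# the pool is sized for the float32 output of the format converter, not the
# 3-byte-per-pixel rgb888 input.)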
# inference parameters
inference_width = 224
inference_height = 224
inference_n_channels = 3
inference_block_size = inference_width * inference_height * inference_n_channels * bpp
inference_num_blocks = 4
inference_pool_kwargs = dict(
    storage_type=MemoryStorageType.DEVICE,
    block_size=inference_block_size,
    num_blocks=inference_num_blocks,
)
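# (For reference: 224 * 224 * 3 channels * 4 bytes = 602,112 bytes per block, which
# should also cover the int64 [1, 224, 224, 1] output of 401,408 bytes.)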
# Define the replayer and CUDA stream pool (the HolovizOp seen in the log below
# is omitted from this excerpt)
replayer = VideoStreamReplayerOp(
    self,
    name="Replayer",
    directory=VIDEO_DIR,
    basename=VIDEO_BASENAME,
    frame_rate=0,
    repeat=True,
    realtime=True,
)
cuda_stream_pool = CudaStreamPool(
    self,
    name="CudaStream",
    dev_id=0,
    stream_flags=0,
    stream_priority=0,
    reserved_size=1,
    max_size=5,
)
format_converter = FormatConverterOp(
    self,
    name="FormatConverter",
    resize_width=inference_width,
    resize_height=inference_height,
    scale_min=0,
    scale_max=1,
    out_dtype="float32",
    in_dtype=in_dtype,
    pool=BlockMemoryPool(self, name="pool", **source_pool_kwargs),
    cuda_stream_pool=cuda_stream_pool,
)
preprocessor = PreprocessorOp(
    self,
    name="Preprocessor",
    permute_axes=[2, 0, 1],  # HWC -> CHW
    reshape=[1, inference_n_channels, inference_height, inference_width],  # NCHW (both 224 here)
    ascontiguous=True,
)
# inference operator
self.model_path_map = {
    "unet_tool_segmentation": os.path.join(MODELS_PATH, "model.fixed.onnx"),
}
pre_processor_map = {"unet_tool_segmentation": ["input_1:0"]}
inference_map = {"unet_tool_segmentation": ["argmax_1"]}
inference = InferenceOp(
    self,
    name="Inference",
    model_path_map=self.model_path_map,
    allocator=BlockMemoryPool(self, name="pool", **inference_pool_kwargs),
    backend="trt",
    pre_processor_map=pre_processor_map,
    inference_map=inference_map,
    parallel_inference=True,
    infer_on_cpu=False,
    enable_fp16=True,
    input_on_cuda=True,
    output_on_cuda=True,
    transmit_on_cuda=True,
    is_engine_path=False,
)
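
For context, PreprocessorOp is a small custom operator of mine that permutes the converter output into the NCHW layout the model expects. Its compute() boils down to the following sketch (the tensor-map key handling is simplified; I use CuPy for the device-side transpose):

import cupy as cp
from holoscan.core import Operator, OperatorSpec


class PreprocessorOp(Operator):
    """Permute float32 HWC frames to a contiguous NCHW tensor."""

    def __init__(self, fragment, *args, permute_axes, reshape, ascontiguous=True, **kwargs):
        self.permute_axes = permute_axes
        self.reshape = reshape
        self.ascontiguous = ascontiguous
        super().__init__(fragment, *args, **kwargs)

    def setup(self, spec: OperatorSpec):
        spec.input("in")
        spec.output("out")

    def compute(self, op_input, op_output, context):
        # The FormatConverterOp output arrives as a tensor map; take the first tensor
        in_message = op_input.receive("in")
        tensor = cp.asarray(next(iter(in_message.values())))
        tensor = cp.transpose(tensor, self.permute_axes)  # HWC -> CHW
        tensor = cp.reshape(tensor, self.reshape)         # add batch dim -> NCHW
        if self.ascontiguous:
            tensor = cp.ascontiguousarray(tensor)
        # Emit under the model's input tensor name so the InferenceOp can find it
        op_output.emit({"input_1:0": tensor}, "out")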
Log:
root@cagx:/workspace/storage# cd /workspace/storage ; /usr/bin/env /bin/python3 /root/.vscode-server/extensions/ms-python.debugpy-2024.4.0-linux-arm64/bundled/libs/debugpy/adapter/../../debugpy/launcher 44447 -- /workspace/storage/repos/nvidia-holoscan/video-streaming-platform/src/nvidia_holoscan/applications/tao_unet/app.py
[info] [gxf_executor.cpp:210] Creating context
[info] [gxf_executor.cpp:1595] Loading extensions from configs...
[info] [gxf_executor.cpp:1741] Activating Graph...
[info] [resource_manager.cpp:79] ResourceManager cannot find Resource of type: nvidia::gxf::GPUDevice for entity [eid: 00002, name: __entity_2]
[info] [resource_manager.cpp:106] ResourceManager cannot find Resource of type: nvidia::gxf::GPUDevice for component [cid: 00003, name: CudaStream]
[info] [resource.hpp:44] Resource [type: nvidia::gxf::GPUDevice] from component [cid: 3] cannot find its value from ResourceManager
[info] [resource_manager.cpp:79] ResourceManager cannot find Resource of type: nvidia::gxf::GPUDevice for entity [eid: 00004, name: __entity_4]
[info] [resource_manager.cpp:106] ResourceManager cannot find Resource of type: nvidia::gxf::GPUDevice for component [cid: 00005, name: pool]
[info] [resource.hpp:44] Resource [type: nvidia::gxf::GPUDevice] from component [cid: 5] cannot find its value from ResourceManager
[info] [resource_manager.cpp:79] ResourceManager cannot find Resource of type: nvidia::gxf::GPUDevice for entity [eid: 00006, name: __entity_6]
[info] [resource_manager.cpp:106] ResourceManager cannot find Resource of type: nvidia::gxf::GPUDevice for component [cid: 00007, name: pool]
[info] [resource.hpp:44] Resource [type: nvidia::gxf::GPUDevice] from component [cid: 7] cannot find its value from ResourceManager
[info] [resource_manager.cpp:79] ResourceManager cannot find Resource of type: nvidia::gxf::GPUDevice for entity [eid: 00008, name: __entity_8]
[info] [resource_manager.cpp:106] ResourceManager cannot find Resource of type: nvidia::gxf::GPUDevice for component [cid: 00009, name: pool]
[info] [resource.hpp:44] Resource [type: nvidia::gxf::GPUDevice] from component [cid: 9] cannot find its value from ResourceManager
[info] [gxf_executor.cpp:1771] Running Graph...
[info] [gxf_executor.cpp:1773] Waiting for completion...
[info] [gxf_executor.cpp:1774] Graph execution waiting. Fragment:
[info] [greedy_scheduler.cpp:190] Scheduling 7 entities
[info] [context.cpp:50] _______________
[info] [context.cpp:50] Vulkan Version:
[info] [context.cpp:50] - available: 1.2.131
[info] [context.cpp:50] - requesting: 1.2.0
[info] [context.cpp:50] ______________________
[info] [context.cpp:50] Used Instance Layers :
[info] [context.cpp:50]
[info] [context.cpp:50] Used Instance Extensions :
[info] [context.cpp:50] VK_EXT_debug_utils
[info] [context.cpp:50] VK_KHR_external_memory_capabilities
[info] [context.cpp:50] ____________________
[info] [context.cpp:50] Compatible Devices :
[info] [context.cpp:50] 0: Quadro RTX 6000
[info] [context.cpp:50] Physical devices found :
[info] [context.cpp:50] 1
[info] [context.cpp:50] ________________________
[info] [context.cpp:50] Used Device Extensions :
[info] [context.cpp:50] VK_KHR_external_memory
[info] [context.cpp:50] VK_KHR_external_memory_fd
[info] [context.cpp:50] VK_KHR_external_semaphore
[info] [context.cpp:50] VK_KHR_external_semaphore_fd
[info] [context.cpp:50] VK_KHR_push_descriptor
[info] [context.cpp:50] VK_EXT_line_rasterization
[info] [context.cpp:50]
[info] [vulkan_app.cpp:777] Using device 0: Quadro RTX 6000
[info] [infer_utils.cpp:222] Input tensor names empty from Config. Creating from pre_processor map.
[info] [infer_utils.cpp:224] Input Tensor names: [input_1:0]
[info] [infer_utils.cpp:258] Output tensor names empty from Config. Creating from inference map.
[info] [infer_utils.cpp:260] Output Tensor names: [argmax_1]
[info] [inference.cpp:202] Inference Specifications created
[info] [core.cpp:46] TRT Inference: converting ONNX model at /workspace/storage/models/unet/model.fixed.onnx
[info] [utils.cpp:81] Cached engine found: /workspace/storage/models/unet/model.fixed.QuadroRTX6000.7.5.72.trt.8.2.3.0.engine.fp16
[info] [core.cpp:79] Loading Engine: /workspace/storage/models/unet/model.fixed.QuadroRTX6000.7.5.72.trt.8.2.3.0.engine.fp16
[info] [core.cpp:122] Engine loaded: /workspace/storage/models/unet/model.fixed.QuadroRTX6000.7.5.72.trt.8.2.3.0.engine.fp16
[info] [infer_manager.cpp:343] HoloInfer buffer created for argmax_1
[info] [inference.cpp:213] Inference context setup complete
[info] [holoviz.cpp:1425] Input spec:
- type: color
name: ""
opacity: 1.000000
priority: 0
[error] [infer_utils.cpp:41] %s
[error] [gxf_wrapper.cpp:68] Exception occurred for operator: 'Inference' - Error in Inference Operator, Sub-module->Tick, Inference execution, Message->Error in Inference Operator, Sub-module->Tick, Data extraction
[error] [entity_executor.cpp:529] Failed to tick codelet Inference in entity: Inference code: GXF_FAILURE
[warning] [greedy_scheduler.cpp:242] Error while executing entity 62 named 'Inference': GXF_FAILURE
[info] [greedy_scheduler.cpp:398] Scheduler finished.
[error] [program.cpp:556] wait failed. Deactivating...
[error] [runtime.cpp:1408] Graph wait failed with error: GXF_FAILURE
[warning] [gxf_executor.cpp:1775] GXF call GxfGraphWait(context) in line 1775 of file /workspace/holoscan-sdk/src/core/executors/gxf/gxf_executor.cpp failed with 'GXF_FAILURE' (1)
[error] [gxf_executor.cpp:1779] GxfGraphWait Error: GXF_FAILURE
Setting the log level to DEBUG/TRACE doesn't yield any more information.
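For reference, I raise the log level roughly like this (a sketch; setting the HOLOSCAN_LOG_LEVEL environment variable to TRACE behaves the same):

from holoscan.logger import LogLevel, set_log_level

# Raise the SDK log level before composing/running the application
set_log_level(LogLevel.TRACE)

Do you have an idea what the issue could be?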