How to Send FP16 Input Tensors Using gRPC in C# for NVIDIA Triton Inference Server?

Description

I am developing a gRPC client in C# using .NET Core to communicate with the NVIDIA Triton Inference Server for model inference. I am attempting to send FP16 image data to the densenet_onnx model hosted on Triton Server.

I encountered the following error during compilation:

error CS1061: 'InferTensorContents' does not contain a definition for 'RawContents' and no accessible extension method 'RawContents' accepting a first argument of type 'InferTensorContents' could be found (are you missing a using directive or an assembly reference?)

Environment

Triton Server:

  • Triton Server Version: 2.35.0
  • Docker Image Used: nvcr.io/nvidia/tritonserver:23.06-py3
  • Backend: ONNX Runtime
  • Model Name: densenet_onnx
  • Input Datatype: FP16
  • gRPC API Version: 2.35.0
  • Ports Open:
    • 8000 (HTTP)
    • 8001 (gRPC)
    • 8002 (Metrics)

GPU:

  • GPU Type: NVIDIA GeForce RTX 3060 Laptop GPU
  • Driver Version: 566.36
  • CUDA Version: 12.7
  • CUDNN Version: 9.7.0
  • Memory Usage: 1254 MiB / 6144 MiB
  • GPU Utilization: 4%

Operating System:

  • OS: Ubuntu 20.04 (Running WSL 2 on Windows 11)

.NET Environment:

  • .NET Core Version: 8.0.405
  • gRPC Client Library Version: 2.67.0

Steps To Reproduce

Docker Command Used to Run Triton Server:

    docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 \
      -v /home/ubuntu/triton-inference-server/model_repository:/models \
      nvcr.io/nvidia/tritonserver:23.06-py3 \
      tritonserver --model-repository=/models --grpc-infer-allocation-pool-size=0

Model Details:

  • Model Name: densenet_onnx
  • Input Tensor Name: data_0
  • Input Datatype: FP16
  • Input Shape: [3, 224, 224]

gRPC Client Code:

  • C# .NET Core gRPC Client
  • Attempting to send FP16 data using RawContents:
    input.Contents = new InferTensorContents();
    input.Contents.RawContents = Google.Protobuf.ByteString.CopyFrom(inputDataFp16.SelectMany(BitConverter.GetBytes).ToArray());
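For completeness, the inputDataFp16 buffer referenced above is a ushort[] holding the FP16 bit patterns of the preprocessed image. It is built roughly as follows; the float[] pixel buffer and the helper name are simplified for this sketch, and the Half conversion requires .NET 6 or later:

    using System;
    using System.Linq;

    static class Fp16Payload
    {
        // Converts preprocessed float pixel data (3 * 224 * 224 values) into
        // FP16 bit patterns stored as ushort[], then flattens them into the
        // byte[] used in the snippet above (machine byte order, which is
        // little-endian on typical x86/ARM hosts).
        public static byte[] ToFp16Bytes(float[] pixels)
        {
            ushort[] inputDataFp16 = pixels
                .Select(p => BitConverter.HalfToUInt16Bits((Half)p))
                .ToArray();

            return inputDataFp16.SelectMany(BitConverter.GetBytes).ToArray();
        }
    }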

Relevant Files:

  • Proto Files:
    • grpc_service.proto
    • model_config.proto
    • health.proto
  • C# Client Code:
    • Source Code of gRPC Client in .NET Core
  • Model Configuration:
    • Model Repository Path: /home/ubuntu/...
    • Model Name: densenet_onnx

Error Encountered: 'InferTensorContents' does not contain a definition for 'RawContents' and no accessible extension method 'RawContents' accepting a first argument of type 'InferTensorContents' could be found (are you missing a using directive or an assembly reference?)

I expected the gRPC client to send the FP16 tensor data using RawContents and receive the inference output from the model, but the RawContents property is not available on InferTensorContents. This prevents the client from sending the input tensor data in the format the model requires. Is RawContents supported in Triton Server Version 2.35.0 for gRPC FP16 input?

  1. What is the recommended way to send FP16 input tensors in ushort[] format?
  2. Should I be using .UintContents instead for FP16 data? If so, what is the correct format?

I have checked the documentation and tried using .UintContents, but I am unsure whether it is the correct approach for FP16 tensors; my attempt is sketched below. Any guidance or suggestions would be greatly appreciated!
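For reference, the .UintContents attempt looks roughly like this. The type and property names come from the C# stubs generated from grpc_service.proto (assumed namespace Inference), and widening each FP16 bit pattern to uint is exactly the part I am unsure about:

    using System.Linq;
    using Inference; // C# types generated from grpc_service.proto (assumed namespace)

    static class UintContentsAttempt
    {
        // Builds a ModelInferRequest whose FP16 input is expressed as the raw
        // 16-bit patterns (ushort[]) widened to uint and placed in
        // uint_contents. Whether Triton accepts this encoding for FP16 is the
        // open question.
        public static ModelInferRequest Build(ushort[] inputDataFp16)
        {
            var input = new ModelInferRequest.Types.InferInputTensor
            {
                Name = "data_0",
                Datatype = "FP16",
                Contents = new InferTensorContents(),
            };
            input.Shape.AddRange(new long[] { 3, 224, 224 });
            input.Contents.UintContents.AddRange(inputDataFp16.Select(v => (uint)v));

            var request = new ModelInferRequest { ModelName = "densenet_onnx" };
            request.Inputs.Add(input);
            return request;
        }
    }

The request is then sent over port 8001 with the client stub generated from the same proto.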

Hi @Madiha_Shaikh,
Please raise this concern on the Triton Inference Server GitHub issues page to get better assistance on the topic.

Thanks
