I converted my pretrained model from ONNX to Caffe2 following these instructions: https://devblogs.nvidia.com/nvidia-serves-deep-learning-inference/. In the section “ONNX Models” there is a link to this repo, https://github.com/pytorch/pytorch/tree/master/caffe2/python/onnx, as the standard tool for the conversion.
I ran the conversion as described in https://github.com/onnx/tutorials/blob/master/tutorials/Caffe2OnnxExport.ipynb:

convert-onnx-to-caffe2 assets/squeezenet.onnx --output predict_net.pb --init-net-output init_net.pb

and also:

convert-onnx-to-caffe2 assets/squeezenet.onnx --output predict_net.netdef --init-net-output init_net.netdef
I created the model-repository hierarchy as described here: https://devblogs.nvidia.com/nvidia-serves-deep-learning-inference/
/tmp/models/test_pb/
  config.pbtxt
  1/
    predict_net.pb
    init_net.pb
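Before starting the server I sanity-check this layout with a small stdlib-only Python sketch (the expected structure, a config.pbtxt next to numeric version directories containing the model files, is my reading of the blog post, not an official validator):

```python
import os

def check_layout(model_dir):
    """Sanity-check one model-repository entry: a config.pbtxt plus at
    least one numeric version directory that actually contains files."""
    problems = []
    if not os.path.isfile(os.path.join(model_dir, "config.pbtxt")):
        problems.append("missing config.pbtxt")
    versions = [d for d in os.listdir(model_dir)
                if d.isdigit() and os.path.isdir(os.path.join(model_dir, d))]
    if not versions:
        problems.append("no numeric version directory (e.g. 1/)")
    for v in versions:
        if not os.listdir(os.path.join(model_dir, v)):
            problems.append("version directory %s/ is empty" % v)
    return problems
```

For the layout above, check_layout("/tmp/models/test_pb") comes back empty, so the directory structure itself looks right to me.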
My config.pbtxt for the example above:
name: "test_pb"
platform: "tensorflow_graphdef"
max_batch_size: 128
input [
  {
    name: "input"
    data_type: TYPE_FP32
    format: FORMAT_NHWC
    dims: [ 224, 224, 3 ]
  }
]
output [
  {
    name: "InceptionV3/Predictions/Softmax"
    data_type: TYPE_FP32
    dims: [ 1001 ]
  }
]
instance_group [
  {
    kind: KIND_GPU,
    count: 4
  }
]
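As a side note, I also cross-check the platform field against the backends the blog post lists (tensorrt_plan, tensorflow_graphdef, tensorflow_savedmodel, caffe2_netdef). The extension-to-platform guessing below is entirely my own heuristic, not part of the server:

```python
# Platform strings taken from the blog post; the file-name heuristic
# mapping model files to a platform is my own assumption.
KNOWN_PLATFORMS = {
    "tensorrt_plan", "tensorflow_graphdef",
    "tensorflow_savedmodel", "caffe2_netdef",
}

def guess_platform(filenames):
    """Guess which platform string fits a version directory's files."""
    # Caffe2 NetDef exports come as a predict_net/init_net pair.
    if any(f.startswith(("predict_net", "init_net")) for f in filenames):
        return "caffe2_netdef"
    if any(f.endswith(".graphdef") for f in filenames):
        return "tensorflow_graphdef"
    if any(f.endswith(".plan") for f in filenames):
        return "tensorrt_plan"
    return None
```

For my predict_net.pb/init_net.pb pair this guesses "caffe2_netdef", which does not match the platform I put in config.pbtxt, so maybe that is related to my problem.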
and
/tmp/models/test_netdef/
  config.pbtxt
  1/
    predict_net.netdef
    init_net.netdef
My config.pbtxt for the example above:
name: "test_netdef"
platform: "tensorflow_graphdef"
max_batch_size: 128
input [
  {
    name: "input"
    data_type: TYPE_FP32
    format: FORMAT_NHWC
    dims: [ 224, 224, 3 ]
  }
]
output [
  {
    name: "InceptionV3/Predictions/Softmax"
    data_type: TYPE_FP32
    dims: [ 1001 ]
  }
]
instance_group [
  {
    kind: KIND_GPU,
    count: 4
  }
]
But when I run the container:
nvidia-docker run --rm -p8000:8000 -p8001:8001 -v/tmp/models:/models nvcr.io/nvidia/tensorrtserver:18.09-py3 trtserver --model-store=/models
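To check the server from Python I poll its status endpoint with this stdlib sketch (the /api/status path on port 8000 is what I understand the 18.09 server to expose; the error handling is my own):

```python
import urllib.error
import urllib.request

def server_status(url="http://localhost:8000/api/status", timeout=5):
    """Fetch the inference server's status page; return the response
    text, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except (urllib.error.URLError, OSError):
        return None
```

Grepping the returned text for ready_state is how I see the result below.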
and send a curl request, I get:
ready_state: MODEL_UNAVAILABLE
What am I doing incorrectly?
y.glushenkov@ml-test-env:/tmp/models_example$ nvidia-smi
Fri Jan 18 07:43:02 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.78 Driver Version: 410.78 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 108... Off | 00000000:00:05.0 Off | N/A |
| 35% 55C P2 76W / 250W | 7344MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 7293 C trtserver 841MiB |
| 0 18379 C /home/a.eryomin/anaconda3/bin/python 6493MiB |
+-----------------------------------------------------------------------------+