• Hardware Platform (Jetson / GPU)
GPU
• DeepStream Version
5.0 (image : nvcr.io/nvidia/deepstream:5.0-20.07-triton)
• JetPack Version (valid for Jetson only)
N/A
• TensorRT Version
7.0.0-1+cuda10.2
• NVIDIA GPU Driver Version (valid for GPU only)
455.32.00
• Issue Type( questions, new requirements, bugs)
I am trying to run a PyTorch model (built with torch v1.6 and torchvision v0.5) on Triton Inference Server, but I ran into the following issue:
E1112 14:33:37.104733 11675 model_repository_manager.cc:840] failed to load 'yolov5' version 1: Internal: load failed for libtorch model -> 'yolov5': version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED at ../caffe2/serialize/inline_container.cc:132, please report a bug to PyTorch. Attempted to read a PyTorch file with version 3, but the maximum supported version for reading is 2. Your PyTorch installation may be too old. (init at ../caffe2/serialize/inline_container.cc:132)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6c (0x7fa1f293236c in /opt/tensorrtserver/lib/pytorch/libc10.so)
frame #1: caffe2::serialize::PyTorchStreamReader::init() + 0x27b4 (0x7fa24945de74 in /opt/tensorrtserver/lib/pytorch/libtorch_cpu.so)
frame #2: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::unique_ptr<caffe2::serialize::ReadAdapterInterface, std::default_delete<caffe2::serialize::ReadAdapterInterface> >) + 0x6d (0x7fa24945f7cd in /opt/tensorrtserver/lib/pytorch/libtorch_cpu.so)
frame #3: <unknown function> + 0x2bdc4ff (0x7fa24a41b4ff in /opt/tensorrtserver/lib/pytorch/libtorch_cpu.so)
frame #4: torch::jit::load(std::unique_ptr<caffe2::serialize::ReadAdapterInterface, std::default_delete<caffe2::serialize::ReadAdapterInterface> >, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) + 0x6d (0x7fa24a41ae5d in /opt/tensorrtserver/lib/pytorch/libtorch_cpu.so)
frame #5: torch::jit::load(std::istream&, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) + 0x79 (0x7fa24a41b2f9 in /opt/tensorrtserver/lib/pytorch/libtorch_cpu.so)
frame #6: <unknown function> + 0x2964d3 (0x7fa2851474d3 in /opt/tensorrtserver/lib/libtrtserver.so)
frame #7: <unknown function> + 0x297410 (0x7fa285148410 in /opt/tensorrtserver/lib/libtrtserver.so)
frame #8: <unknown function> + 0x290f55 (0x7fa285141f55 in /opt/tensorrtserver/lib/libtrtserver.so)
frame #9: <unknown function> + 0x114b7d (0x7fa284fc5b7d in /opt/tensorrtserver/lib/libtrtserver.so)
frame #10: <unknown function> + 0x116195 (0x7fa284fc7195 in /opt/tensorrtserver/lib/libtrtserver.so)
frame #11: <unknown function> + 0xbd66f (0x7fa2ec8c666f in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
frame #12: <unknown function> + 0x76db (0x7fa2ecb996db in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #13: clone + 0x3f (0x7fa2feab288f in /lib/x86_64-linux-gnu/libc.so.6)
- Does Triton Server not support models saved with PyTorch v1.6?
Based on my investigation, this happens because, from PyTorch 1.6 onwards, torch.save and torch::jit::save write the weights as a zip-based serialized archive with a newer file-format version. Triton loads PyTorch models with torch::jit::load, which only accepts models saved by torch::jit::save, and (per the error above) the libtorch shipped in this image only reads file-format versions up to 2. As a workaround to get past this stage, I have to re-save my model with PyTorch v1.5 + torchvision v0.5.0 before Triton can read the weights; a sketch of that export is shown below, after these questions.
Having tried the above workaround: why do we have to downgrade to PyTorch v1.5 to save the model when the Docker image itself supports PyTorch v1.6?
- Does it have something to do with TensorRT not supporting PyTorch v1.6?
- Are there any working sample config files from NVIDIA for running a PyTorch model on Triton Inference Server? I would appreciate it if NVIDIA could provide at least one PyTorch sample to show that it actually works with Triton Server. Please share it with us if you have one. Thanks!
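For context, this is roughly how I export the checkpoint under the workaround. It is only a minimal sketch, assuming a PyTorch 1.5 + torchvision 0.5 environment, a 640x640 input, and the usual <model-repo>/yolov5/1/model.pt layout; the paths and input size are my own assumptions, so adjust as needed:

```python
# Workaround export sketch: run inside a PyTorch 1.5 + torchvision 0.5
# environment so the saved TorchScript archive uses a file-format version
# that the libtorch bundled with this Triton image can still read.
# Assumptions: the YOLOv5 repo code is importable (the checkpoint pickles
# its model classes), the input size is 640x640, and the model repository
# lives at ./model_repository.
import torch

# YOLOv5 checkpoints store the model under the 'model' key and may be FP16.
ckpt = torch.load('yolov5s.pt', map_location='cpu')
model = ckpt['model'].float().eval()

# Trace to TorchScript so Triton's torch::jit::load can deserialize it.
# (Tracing the detection head may need model-specific tweaks.)
dummy = torch.zeros(1, 3, 640, 640)
traced = torch.jit.trace(model, dummy)

# Save into the Triton model repository layout: <repo>/yolov5/1/model.pt
torch.jit.save(traced, 'model_repository/yolov5/1/model.pt')
```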
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
Target model: YOLOv5
Weights/Checkpoint: yolov5s.pt (run download_weights.sh from the repository credited below)
Config files: config_yolo.zip (3.7 KB); a minimal config.pbtxt sketch is also included at the end of this post
Credits to: YOLOv5
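For anyone reproducing without the attachment, a minimal sketch of a config.pbtxt for the libtorch backend might look like the following. The tensor names follow Triton's <name>__<index> convention for PyTorch models, and the dims (especially the output shape) are placeholders that have to match the traced model:

```
name: "yolov5"
platform: "pytorch_libtorch"
max_batch_size: 1
input [
  {
    name: "INPUT__0"
    data_type: TYPE_FP32
    dims: [ 3, 640, 640 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ -1, -1 ]  # placeholder; replace with the actual output shape of the traced model
  }
]
```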