I have downloaded the Mistral-7B-Instruct-v0.2 model from Hugging Face and want to convert it to a framework supported by Triton Inference Server, then run the model with Triton. I need a support doc that provides the deployment steps as well as infrastructure details.
sophwats
Hi @jutursundarkumar.reddy, please see our docs here: Deploying Hugging Face Transformer Models in Triton — NVIDIA Triton Inference Server.
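As a starting point alongside that tutorial, the sketch below sets up a Triton model-repository layout for serving the model through Triton's Python backend. The model name `mistral7b` and the tensor names `text_input`/`text_output` are illustrative assumptions, not anything fixed by Triton; the container tag is left as a placeholder.

```python
# Sketch: create a Triton model repository for Mistral-7B-Instruct-v0.2
# using the Python backend. Names here are assumptions for illustration.
from pathlib import Path

CONFIG_PBTXT = '''name: "mistral7b"
backend: "python"
max_batch_size: 0
input [
  { name: "text_input", data_type: TYPE_STRING, dims: [ 1 ] }
]
output [
  { name: "text_output", data_type: TYPE_STRING, dims: [ 1 ] }
]
instance_group [ { kind: KIND_GPU } ]
'''

repo = Path("model_repository") / "mistral7b"
(repo / "1").mkdir(parents=True, exist_ok=True)  # version dir; model.py goes here
(repo / "config.pbtxt").write_text(CONFIG_PBTXT)

# With a model.py (implementing TritonPythonModel, e.g. wrapping a
# transformers pipeline) placed in model_repository/mistral7b/1/, the server
# can be launched from the NGC container, for example:
#   docker run --gpus all --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 \
#     -v $PWD/model_repository:/models \
#     nvcr.io/nvidia/tritonserver:<xx.yy>-py3 \
#     tritonserver --model-repository=/models
```

On infrastructure: a 7B model in fp16 needs roughly 15 GB of weights alone, so plan for a GPU with comfortably more VRAM than that, plus headroom for the KV cache.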
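Once the server is up, requests go to Triton's HTTP endpoint using the KServe v2 inference protocol (`POST /v2/models/<model>/infer`). A minimal sketch of building that request body, assuming an input tensor named `text_input` that matches the model's config.pbtxt:

```python
import json

def build_infer_payload(prompt: str) -> dict:
    """Build a KServe-v2 infer request body for Triton's HTTP endpoint.
    The tensor name "text_input" is an assumption and must match the
    model's config.pbtxt."""
    return {
        "inputs": [
            {
                "name": "text_input",  # assumed input tensor name
                "shape": [1],
                "datatype": "BYTES",   # wire type for TYPE_STRING tensors
                "data": [prompt],
            }
        ]
    }

# Mistral-instruct models expect the [INST] ... [/INST] chat format.
payload = build_infer_payload("[INST] What is Triton Inference Server? [/INST]")
print(json.dumps(payload, indent=2))
```

In practice you would POST this JSON with any HTTP client, or use the official `tritonclient` package instead of hand-building the body.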