I have downloaded the Mistral-7B-Instruct-v0.2 model from Hugging Face and want to convert it to a framework supported by Triton Inference Server, then run the model with Triton. I need a support doc that provides the deployment steps as well as infrastructure details.
sophwats
Hi @jutursundarkumar.reddy, please see our docs here: Deploying Hugging Face Transformer Models in Triton — NVIDIA Triton Inference Server.
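As a starting point alongside that tutorial, the sketch below sets up a Triton model-repository layout for serving the model through Triton's Python backend. The model name `mistral7b` and the tensor names `text_input`/`text_output` are illustrative assumptions, not anything fixed by Triton; the container tag is left as a placeholder.

```python
# Sketch: create a Triton model repository for Mistral-7B-Instruct-v0.2
# using the Python backend. Names here are assumptions for illustration.
from pathlib import Path

CONFIG_PBTXT = '''name: "mistral7b"
backend: "python"
max_batch_size: 0
input [
  { name: "text_input", data_type: TYPE_STRING, dims: [ 1 ] }
]
output [
  { name: "text_output", data_type: TYPE_STRING, dims: [ 1 ] }
]
instance_group [ { kind: KIND_GPU } ]
'''

repo = Path("model_repository") / "mistral7b"
(repo / "1").mkdir(parents=True, exist_ok=True)  # version dir; model.py goes here
(repo / "config.pbtxt").write_text(CONFIG_PBTXT)

# With a model.py (implementing TritonPythonModel, e.g. wrapping a
# transformers pipeline) placed in model_repository/mistral7b/1/, the server
# can be launched from the NGC container, for example:
#   docker run --gpus all --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 \
#     -v $PWD/model_repository:/models \
#     nvcr.io/nvidia/tritonserver:<xx.yy>-py3 \
#     tritonserver --model-repository=/models
```

On infrastructure: a 7B model in fp16 needs roughly 15 GB of weights alone, so plan for a GPU with comfortably more VRAM than that, plus headroom for the KV cache.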
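Once the server is up, requests go to Triton's HTTP endpoint using the KServe v2 inference protocol (`POST /v2/models/<model>/infer`). A minimal sketch of building that request body, assuming an input tensor named `text_input` that matches the model's config.pbtxt:

```python
import json

def build_infer_payload(prompt: str) -> dict:
    """Build a KServe-v2 infer request body for Triton's HTTP endpoint.
    The tensor name "text_input" is an assumption and must match the
    model's config.pbtxt."""
    return {
        "inputs": [
            {
                "name": "text_input",  # assumed input tensor name
                "shape": [1],
                "datatype": "BYTES",   # wire type for TYPE_STRING tensors
                "data": [prompt],
            }
        ]
    }

# Mistral-instruct models expect the [INST] ... [/INST] chat format.
payload = build_infer_payload("[INST] What is Triton Inference Server? [/INST]")
print(json.dumps(payload, indent=2))
```

In practice you would POST this JSON with any HTTP client, or use the official `tritonclient` package instead of hand-building the body.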