Is it possible to deploy the Llama-70b model with TensorRT-LLM on an L40S GPU?

I have been struggling to run inference with large models such as Llama-70b or Qwen-70b. What are the hardware and software requirements for serving these two models with TensorRT-LLM and Triton?
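For context on why a single L40S is tight for a 70B model, here is a rough back-of-envelope VRAM estimate. This is a sketch with assumed numbers: it counts model weights only (KV cache, activations, and runtime overhead add more), uses the L40S's 48 GB of memory, and the helper name `weight_memory_gb` is made up for illustration.

```python
# Rough weight-memory estimate for a 70B-parameter model at common precisions.
# Weights only -- KV cache and activations need additional VRAM on top of this.

def weight_memory_gb(num_params_billion, bytes_per_param):
    # params (billions) * bytes per param == GB of weights
    return num_params_billion * bytes_per_param

L40S_VRAM_GB = 48  # memory per NVIDIA L40S

for name, bytes_per_param in [("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    gb = weight_memory_gb(70, bytes_per_param)
    min_gpus = -(-gb // L40S_VRAM_GB)  # ceiling division
    print(f"{name}: ~{gb:.0f} GB of weights -> at least {min_gpus:.0f}x L40S")
```

In other words, at FP16 the weights alone (~140 GB) exceed a single L40S by a wide margin, so serving a 70B model typically requires tensor parallelism across several GPUs, aggressive quantization (e.g. INT4), or both.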


Please refer to the link below.