How can I use NVIDIA NIMs in a self-hosted environment to run inference with the Llama 2 70B model on an L40S GPU?

I am looking to run inference with the Llama 2 70B model on an L40S GPU using NVIDIA NIM containers (inference microservices distributed through the NGC catalog), deployed locally rather than through a hosted API. Is this possible, and how should I go about setting it up?
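
To make the question concrete, here is a minimal sketch of how I would expect to query the model once a NIM container is running, assuming it exposes the OpenAI-compatible HTTP API on its default port 8000 (the model identifier `llama2-70b` below is a placeholder, not a confirmed name; the real id depends on the specific image):

```python
# Minimal client-side sketch: query a locally running NIM container.
# Assumes the container exposes an OpenAI-compatible API on port 8000;
# "llama2-70b" is a placeholder model id, not a confirmed identifier.
import requests

response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "llama2-70b",  # placeholder -- check the image's docs
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 64,
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

If that interface is right, my main uncertainty is the server side: which container image to pull from NGC, how to launch it locally, and whether a single 48 GB L40S can hold the 70B model at all or whether a quantized variant or multi-GPU setup is required.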