LLMs like Llama?

Can I run LLMs like Llama 13B in Int4 on the Orin? Seems like it should be possible, but some confirmation would be great!

Hi @dnhkng, I haven’t seen this attempted yet and don’t know if it would fit in memory or what the performance would be like - although it is certainly an interesting topic to explore. Perhaps others from the community can share their experience if they’ve tried it.

Ok, get me a discount code for the Orin, and I will try it out and post the results 👍

The 16 GB Orin should easily hold the large models in Int4 and offer some super interesting possibilities 🤔
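A quick back-of-envelope check of the memory claim (a rough sketch of weight storage only; KV cache and runtime overhead would add several GB on top and are not counted here):

```python
# Estimate weight memory for a quantized LLM. This counts weights only --
# activation memory, KV cache, and framework overhead are extra.

def model_memory_gb(n_params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB for a given quantization level."""
    bytes_per_weight = bits_per_weight / 8
    return n_params_billion * 1e9 * bytes_per_weight / 1e9

llama_13b_int4 = model_memory_gb(13, 4)   # Int4: 4 bits per weight
llama_13b_fp16 = model_memory_gb(13, 16)  # FP16: 16 bits per weight

print(f"13B @ Int4:  {llama_13b_int4:.1f} GB")  # ~6.5 GB -> fits in 16 GB
print(f"13B @ FP16: {llama_13b_fp16:.1f} GB")   # ~26 GB -> would not fit
```

So a 13B model in Int4 needs roughly 6.5 GB for weights, leaving headroom on a 16 GB Orin, while the same model in FP16 would not fit.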

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.