API error "Input length 1217 exceeds maximum allowed token size 512" even though the API parameters are configured to 4096

I'm building a RAG pipeline using NVIDIA's meta/llama-3.2-3b-instruct model. When I pass the question together with the context retrieved from the vector embeddings, I get the error "Input length 1217 exceeds maximum allowed token size 512". I have increased the parameter size on the NVIDIA site and used a new API key, but I'm still getting the same error. Is there a way to solve this, or does the API have a hard limit of 512 tokens? Kindly let me know.
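
For reference, this is roughly how the request is being made (a minimal sketch, assuming the OpenAI-compatible endpoint at integrate.api.nvidia.com and a prompt built by concatenating the retrieved context with the question; the variable names and prompt format are placeholders, not my exact code):

```python
from openai import OpenAI

# Hypothetical placeholders for the retrieved chunk(s) and the user question.
retrieved_context = "...text returned by the vector store..."
question = "...the user's question..."

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="YOUR_NVIDIA_API_KEY",  # the new API key mentioned above
)

completion = client.chat.completions.create(
    model="meta/llama-3.2-3b-instruct",
    messages=[
        {
            "role": "user",
            "content": f"Context:\n{retrieved_context}\n\nQuestion: {question}",
        }
    ],
    max_tokens=4096,   # the parameter I raised to 4096 (caps generated output tokens)
    temperature=0.2,
)

print(completion.choices[0].message.content)
```

The error appears as soon as the combined prompt (context plus question) goes past 512 tokens, regardless of what max_tokens is set to.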