NIM_MAX_MODEL_LEN environment variable does nothing

Hey,

The documentation here:

States clearly that the NIM_MAX_MODEL_LEN environment variable can control the context window length.
However, setting it has no effect: the underlying vLLM engine is still launched with the default value of None.

In the following question, the top answer from the NVIDIA team states that this parameter does not exist:

Please either fix the docs or introduce the variable.

Thanks for this feedback, Mark! I’ll pass it along!