NVIDIA NIM API invoked via LangChain returns status code 500

Hi! When I invoke the NVIDIA NIM API (hosted by NVIDIA, not self-hosted) via LangChain, using the meta/llama-3.1-70b-instruct model with structured output parsing, I always get this error:

Traceback (most recent call last):
  File "/localhome/wtest/nv_wso copy.py", line 154, in <module>
    agent.invoke(
  File "/usr/local/lib/python3.12/site-packages/langgraph/pregel/__init__.py", line 1334, in invoke
    for chunk in self.stream(
  File "/usr/local/lib/python3.12/site-packages/langgraph/pregel/__init__.py", line 1020, in stream
    _panic_or_proceed(all_futures, loop.step)
  File "/usr/local/lib/python3.12/site-packages/langgraph/pregel/__init__.py", line 1450, in _panic_or_proceed
    raise exc
  File "/usr/local/lib/python3.12/site-packages/langgraph/pregel/executor.py", line 60, in done
    task.result()
  File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/langgraph/pregel/retry.py", line 26, in run_with_retry
    task.proc.invoke(task.input, task.config)
  File "/usr/local/lib/python3.12/site-packages/langchain_core/runnables/base.py", line 2876, in invoke
    input = context.run(step.invoke, input, config, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/langgraph/utils.py", line 102, in invoke
    ret = context.run(self.func, input, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/localhome/wtest/nv_wso copy.py", line 104, in respond
    response = structured_llm.invoke(
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/langchain_core/runnables/base.py", line 2876, in invoke
    input = context.run(step.invoke, input, config, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/langchain_core/runnables/base.py", line 5092, in invoke
    return self.bound.invoke(
           ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 277, in invoke
    self.generate_prompt(
  File "/usr/local/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 777, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 634, in generate
    raise e
  File "/usr/local/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 624, in generate
    self._generate_with_cache(
  File "/usr/local/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 846, in _generate_with_cache
    result = self._generate(
             ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/chat_models.py", line 289, in _generate
    response = self._client.get_req(payload=payload)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/_common.py", line 449, in get_req
    response, session = self._post(self.infer_url, payload)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/_common.py", line 346, in _post
    self._try_raise(response)
  File "/usr/local/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/_common.py", line 439, in _try_raise
    raise Exception(f"{header}\n{body}") from None
Exception: [500] Internal Server Error
'bool' object has no attribute 'get'
RequestID: 75efc63a-f9c1-4891-b83a-c8a76987c2c8

Does anyone know how to fix this error? Separately, I also noticed (both on NVIDIA's web interface and via the API) that tool calls through NIM take very long (around 28 seconds), while the same calls through Ollama complete in normal time. Why is that?

Here is also a screenshot of LangSmith’s output for it:

If anyone knows how to solve these problems, please help. Thanks in advance!

Can you provide a bit more context on how this is being called?