I run this NIM example from Jupyter notebook. It hangs more than 50% of runs. Usually, it runs ok for the first time after restarting the kernel in a Jupyter notebook but not always. When I reran this example, then it freezes for most cases.
from openai import OpenAI
client = OpenAI(
base_url = “https://integrate.api.nvidia.com/v1”,
api_key = “…”
)
completion = client.chat.completions.create(
model=“nvidia/llama-3.1-nemotron-70b-instruct”,
messages=[{“role”:“user”,“content”:“Write a limerick about the wonders of GPU computing.”}],
temperature=0.5,
top_p=1,
max_tokens=1024,
stream=False
)