Bug - (https://build.nvidia.com/moonshotai/kimi-k2.6) Kimi K2.6 Enters Infinite Repetition Loop (Spamming "!") when thinking

masoroonidev · May 1, 2026, 10:42pm

Application: Both NIM Models website and Hermes Agent
Model: Kimi K2.6 (via NVIDIA NIM)
Context Limit: 256k tokens
Notes: Hour-Of-Release on NIM API, expected bugs
Severity: High. I say this as it is entirely unusable when doing basic tasks.

When querying the Kimi K2.6 via standard API calls, the model’s reasoning trace occasionally breaks and enters an infinite repetition loop. Mid thought, it gets stuck on a single low-probability token (specifically “!”) and repeats it endlessly until the entire 256k context window is completely exhausted. I expected Kimi K2.6 to have bugs on release on NIM, and as I am writing this, Im in the exact hour of release on NIM.

Example:

me: how do i setup mem0 for use with hermes agent locally on my raspberry pi 5 8gb, do research please

Kimi’s Thinking:

… (lots of thinking, of course)

"pip3 install mem0ai chromadb --break-system-packages

Or use a venv (recommended)

!!! (~50,000 characters when unstopped, will not stop)

masoroonidev · May 1, 2026, 10:46pm

Follow Up, heres how it looks.

Bigfoot · May 1, 2026, 10:54pm

I also have this issue, it often happens

JohnDeere · May 1, 2026, 11:54pm

dpesnt happen to me

though in my case i barelt got any response at all with 502 and 504 but output is fine

you can try repetition penalty man

set it to 1.1 to 1.2

also tried on the web ui its fine

masoroonidev · May 1, 2026, 11:57pm

I’d recommend using a more complex problem or repeating followups

JohnDeere · May 2, 2026, 12:28am

i see the promblem now

using your exact prompt including case sensitivity Well can you tell me how openAI's chatgpt works do output infinite exclamation mark

but if you use Well, can you tell me how OpenAI's ChatGPT works? instead it works just fine bro

JohnDeere · May 2, 2026, 12:31am

same problem in API when you send exact same prompt

but i have confirmed adding "frequency_penalty”: 1.1 in the API completely fixed it man

before:

“reasoning”:" The!!!

after:

“reasoning”:" The user is asking for an explanation of how OpenAI’s ChatGPT works. This is a general knowledge question about AI/ML technology.

masoroonidev · May 2, 2026, 12:33am

Ah I see. Can this be added on the webUI? Might be a stupid question but I need to know.

Also, I’d like to note that the bug is very hit-or-miss. Sometimes it works and sometimes its ‘!!!’

I thought I fixed it as well when I was tinkering with temperature and some other settings, but it simply just delayed it a bit.

I’d like to have a sure-set solution that patches out the bug rather than delaying it and is something that can be implemented A.S.A.P. by an NVIDIA Dev.

JohnDeere · May 2, 2026, 12:50am

i tried it again and freq pen as low as 0.4 fixed it realiably

0.4: 4/5 coherent

0.3: 3/5

0.2: 2/5

its honestly not that bad as only that specific prompt with specific punctuation and case sensitivity that break it though even if you capitalize a single character to OpenAI’s it will output correctly

masoroonidev · May 2, 2026, 1:25am

Hmmm. I will have to try that.

masoroonidev · May 2, 2026, 3:33am

Update, I’m unable to replicate any success. Its still bugging out and spamming !!!, but now not only does it do it in its thinking, but also if it completes its thinking, the !!!'s are to be found in the main message.

Notes;

Wording is very odd, certain phrasing of the same points can cause it to succeed with or without failure

Adjusting Temperature has an effect, usually positive, however it feels as it is delaying the inevitable**.
Just as adjusting the temperature does, (but at a greater scale), adjusting** frequency_penaltydoes the same, this is thanks to the findings of JohnDeere

If you are able to fix the model, I’d recommend doing your own testing while keeping these observations in mind. Fixing this model A.S.A.P. would be VERY greatly appreciated as Kimi k2.6 is the best model on NVIDIA’s NIM API catalogue at this time.

tirprox · May 2, 2026, 7:06am

It has not been fixed at all

masoroonidev · May 2, 2026, 7:18am

Update, I have been using the model for a bit.

It’s not fixed.

Sorry for the confusion

masoroonidev · May 3, 2026, 12:11am

Confirmed to be still a prevalent issue as of this moment, happening ~1/3 times you ask it to think.

joelsomendiguera · May 3, 2026, 8:31am

still happening, and deepseek pro too has problems with thinking strange letters, so i dont think they will solve this, sad

kirstinchristmas336 · May 3, 2026, 1:50pm

Yeah, in their current state Kimi 2.6 and Deep Seek v4 pro API and Web UI are unusable from the platform.

turbosimon · May 3, 2026, 3:41pm

I’m currently still experiencing the same issue, is there any way to track the resolution status? Do you know any workaround? I’m trying K2-Thinking but I am having a lot of issues also with this model! Please any insight is very much appreciated

entropychannel · May 3, 2026, 6:24pm

I reported similar issues with DeepSeek V4 (Unexpected DeepSeek V4 Pro Behaviour - NVIDIA NIM / Models - NVIDIA Developer Forums) more than a week ago, but nothing seems to get resolved.

firejack200 · May 3, 2026, 8:08pm

it was working just fine a minute ago… and now it’s stopped again. OpenClaw idiots pulling 1000 Requests per second need to go use the Nemoclaw service and not the NIM Api.

JohnDeere · May 4, 2026, 11:47am

guys seems like its actually known issue for all kimi k26. providers

I stopped using nimfree because mine keep getting 502 or 504 and switched to ollama cloud but even there the moddle keep repeating SSSSSSSSS like TheSSSSSS or so its the same proble just different tokens

people on reddit said theirs keep doing the same too

seems like the onlty solution is for us the clinets to implement safeguard mechanism to catch those kidns of errors and therminate the connection wheb caught

Topic		Replies	Views
Kimi k2.6 spamming "the!" when calling the model Models	5	367	May 2, 2026
Kimi K2.6 Models	8	1851	May 2, 2026
Request for Kimi K2.6 API Rate Limit Increase (RPH) for Long-Context Text Processing Models nim	0	140	May 22, 2026
Kimi K2.6 Error (The!) when Thinking NVAPI	0	273	May 3, 2026
Bug Report: NVIDIA NIM Hosted Endpoint Reliability Issues - bugs requiring extensive client-side workarounds Models nim , deepseek	3	378	April 14, 2026
Help with NIM Models undergoing "stream of counciousness" during usage Models nim	2	88	May 6, 2026
Could you please update kimi k2 with the thinking model? Models	4	391	December 12, 2025
Kimi k2.6 500 unhashable type: 'dict' and bug with parallel_tool_calls Models tools , nim	0	228	May 11, 2026
Request for NVIDIA NIM API Rate Limit Increase – Model Evaluation & Personal Development Access/Accounts nim , deepseek , nemotron	1	367	April 29, 2026
Request for NIM API rate limit increase (40 → 200 RPM) — getting started with multi-model exploration Access/Accounts ai , nim , agentic-ai , deepseek	0	30	May 25, 2026

Bug - (https://build.nvidia.com/moonshotai/kimi-k2.6) Kimi K2.6 Enters Infinite Repetition Loop (Spamming "!") when thinking

Or use a venv (recommended)

Related topics