Hello,
I've installed the latest versions of AI Workbench and the Hybrid RAG example project by following the tutorial.
I am using a local model (meta-llama/Meta-Llama-3-8B-Instruct) on an RTX 3090 under Windows 11.
When I submit a prompt, generation suddenly stops and I get the following error message:
```
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/workbench/.local/lib/python3.11/site-packages/typer/main.py", line 699, in wrapper
    return callback(**use_params)
  File "/opt/conda/lib/python3.11/site-packages/text_generation_server/cli.py", line 109, in serve
    server.serve(
  File "/opt/conda/lib/python3.11/site-packages/text_generation_server/server.py", line 274, in serve
    asyncio.run(
  File "/opt/conda/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
  File "/opt/conda/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 641, in run_until_complete
    self.run_forever()
  File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 608, in run_forever
    self._run_once()
  File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
    handle._run()
  File "/opt/conda/lib/python3.11/asyncio/events.py", line 84, in _run
    self._context.run(self._callback, *self._args)
  File "/opt/conda/lib/python3.11/site-packages/grpc_interceptor/server.py", line 165, in invoke_intercept_method
    return await self.intercept(
  File "/opt/conda/lib/python3.11/site-packages/text_generation_server/interceptor.py", line 21, in intercept
    return await response
  File "/opt/conda/lib/python3.11/site-packages/opentelemetry/instrumentation/grpc/_aio_server.py", line 120, in _unary_interceptor
    raise error
  File "/opt/conda/lib/python3.11/site-packages/opentelemetry/instrumentation/grpc/_aio_server.py", line 111, in _unary_interceptor
    return await behavior(request_or_iterator, context)
  File "/opt/conda/lib/python3.11/site-packages/text_generation_server/server.py", line 184, in Decode
    return generate_pb2.DecodeResponse(
ValueError: Value out of range: -410952983

2025-07-18T21:09:16.185815Z ERROR batch{batch_size=1}:decode:decode{size=1}:decode{size=1}: text_generation_router_v3::client: backends/v3/src/client/mod.rs:54: Server error: Value out of range: -410952983
2025-07-18T21:09:16.187287Z ERROR compat_generate{default_return_full_text=true compute_type=Extension(ComputeType("1-nvidia-geforce-rtx-3090"))}:generate_stream{parameters=GenerateParameters { best_of: None, temperature: Some(0.7), repetition_penalty: Some(1.0), frequency_penalty: None, top_k: Some(10), top_p: Some(0.999), typical_p: Some(0.95), do_sample: false, max_new_tokens: Some(402), return_full_text: Some(false), stop: , truncate: None, watermark: false, details: true, decoder_input_details: false, seed: None, top_n_tokens: None, grammar: None, adapter_id: None }}:async_stream:generate_stream:schedule:infer:send_error: text_generation_router_v3::backend: backends/v3/src/backend.rs:488: Request failed during generation: Server error: Value out of range: -410952983

    raise parse_error(resp.status_code, json_payload)
text_generation.errors.GenerationError: Request failed during generation: Server error: Value out of range: -410952983
```
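Since the offending value is negative, my guess (an assumption on my part, not something I can confirm from the logs) is that the server is trying to encode it into an unsigned protobuf field of `DecodeResponse`, e.g. a `uint32` token id or count, which would explain the "Value out of range" error. A quick sanity check of that hypothesis:

```python
# The value from the error message.
value = -410952983

# Valid range of a protobuf uint32 field (assumption: the failing
# DecodeResponse field is unsigned).
UINT32_MAX = 2**32 - 1

# A negative value cannot be represented, so protobuf would reject it
# with exactly a "Value out of range" error.
in_range = 0 <= value <= UINT32_MAX
print(in_range)  # False
```

So it looks to me like the backend is producing a negative number somewhere during decoding, but I don't know why.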
Do you have any idea what is causing this error?
Please let me know if you need any additional information.
Many thanks and best regards,
Max