Internal Server Error, Try Again

Help us respond more quickly by giving basic info about the issue. Fill in the appropriate details in the sections below. Make sure to upload screenshots and logs.

Which Workbench location had the issue?

(local)

What is the Operating System for your local Workbench?

(Windows)

What is the Workbench Desktop App version?

(0.25.30 or 0.28.29; not sure where to check)

Was the issue with the Desktop App or the CLI?

(Desktop App)

Summary of the Issue

(Hello everyone,

I'm trying to run Mistral 7B Instruct.
My GPUs are an NVIDIA 3060 12GB and a 3080 Ti 12GB, with 96GB of RAM and a 5950X CPU.
Loading the model works fine, but when I launch the server I get the error above.

I've tried about 20 times and it only worked once. I kept restarting the environment and it eventually worked, but only once; when I tried again I got the same error. I did manage to get it working a few more times by launching it repeatedly, so I think there is a bug: I launched the server and it worked, then stopped it and launched it again and it never worked, with no changes in between.

Any help please!
)

What are the error messages?

(Internal Server Error)

What are the steps to reproduce?

(Start the server after I click Load Model)

Upload screenshots or logs

{"level":"warn","container-registry":"ghcr.io","time":"2024-03-22T18:29:44+02:00","message":"BaseEnvironmentLatestTag is unknown for images not in NGC registry"}
{"level":"warn","topic":"/home/workbench/nvidia-workbench/nvidia-workbench-example-hybrid-rag","channel":"0xc0008221e0","time":"2024-03-22T18:29:51+02:00","message":"Failed to send to subscriber. Channel full"}
{"level":"warn","topic":"/home/workbench/nvidia-workbench/nvidia-workbench-example-hybrid-rag","channel":"0xc001304600","time":"2024-03-22T18:29:51+02:00","message":"Failed to send to subscriber. Channel full"}
{"level":"warn","topic":"/home/workbench/nvidia-workbench/nvidia-workbench-example-hybrid-rag","channel":"0xc0012baf00","time":"2024-03-22T18:29:51+02:00","message":"Failed to send to subscriber. Channel full"}
{"level":"warn","topic":"/home/workbench/nvidia-workbench/nvidia-workbench-example-hybrid-rag","channel":"0xc000b64b40","time":"2024-03-22T18:29:51+02:00","message":"Failed to send to subscriber. Channel full"}
{"level":"warn","topic":"/home/workbench/nvidia-workbench/nvidia-workbench-example-hybrid-rag","channel":"0xc000a929c0","time":"2024-03-22T18:29:51+02:00","message":"Failed to send to subscriber. Channel full"}
{"level":"warn","topic":"/home/workbench/nvidia-workbench/nvidia-workbench-example-hybrid-rag","channel":"0xc001189320","time":"2024-03-22T18:29:51+02:00","message":"Failed to send to subscriber. Channel full"}
{"level":"warn","topic":"/home/workbench/nvidia-workbench/nvidia-workbench-example-hybrid-rag","channel":"0xc000a938c0","time":"2024-03-22T18:29:51+02:00","message":"Failed to send to subscriber. Channel full"}
{"level":"warn","topic":"/home/workbench/nvidia-workbench/nvidia-workbench-example-hybrid-rag","channel":"0xc0005a75c0","time":"2024-03-22T18:29:51+02:00","message":"Failed to send to subscriber. Channel full"}
{"level":"warn","topic":"/home/workbench/nvidia-workbench/nvidia-workbench-example-hybrid-rag","channel":"0xc000a923c0","time":"2024-03-22T18:29:51+02:00","message":"Failed to send to subscriber. Channel full"}
{"level":"info","time":"2024-03-22T18:29:51+02:00","message":"Processing git status output"}
{"level":"info","time":"2024-03-22T18:29:51+02:00","message":"Processing git status output"}
{"level":"info","time":"2024-03-22T18:29:51+02:00","message":"Processing git status output"}
{"level":"info","time":"2024-03-22T18:29:51+02:00","message":"Processing git diff output"}
{"level":"info","time":"2024-03-22T18:29:51+02:00","message":"Processing git diff output"}
{"level":"info","time":"2024-03-22T18:29:51+02:00","message":"Processing git diff output"}
{"level":"info","time":"2024-03-22T18:29:51+02:00","message":"Processing git diff output"}
{"level":"info","time":"2024-03-22T18:29:51+02:00","message":"Processing git diff output"}
{"level":"info","time":"2024-03-22T18:29:51+02:00","message":"Processing git diff output"}
{"level":"warn","container-registry":"ghcr.io","time":"2024-03-22T18:29:52+02:00","message":"BaseEnvironmentLatestTag is unknown for images not in NGC registry"}
{"level":"info","time":"2024/03/22 - 18:29:54","status":200,"latency":"10.786567563s","client-ip":"127.0.0.1","method":"POST","path":"/v1/query","time":"2024-03-22T18:29:54+02:00","message":"GIN-Request"}
{"level":"info","time":"2024/03/22 - 18:33:38","status":200,"latency":"12.03µs","client-ip":"127.0.0.1","method":"OPTIONS","path":"/v1/query","time":"2024-03-22T18:33:38+02:00","message":"GIN-Request"}
{"level":"info","time":"2024-03-22T18:33:38+02:00","message":"Processing git status output"}
{"level":"info","time":"2024-03-22T18:33:38+02:00","message":"Processing git diff output"}
{"level":"info","time":"2024-03-22T18:33:38+02:00","message":"Processing git diff output"}
{"level":"info","time":"2024/03/22 - 18:33:38","status":200,"latency":"13.305451ms","client-ip":"127.0.0.1","method":"POST","path":"/v1/query","time":"2024-03-22T18:33:38+02:00","message":"GIN-Request"}
{"level":"info","time":"2024-03-22T18:33:38+02:00","message":"Processing git status output"}
{"level":"info","time":"2024-03-22T18:33:38+02:00","message":"Processing git diff output"}
{"level":"info","time":"2024-03-22T18:33:38+02:00","message":"Processing git diff output"}
{"level":"info","time":"2024/03/22 - 18:33:38","status":200,"latency":"48.637507ms","client-ip":"127.0.0.1","method":"POST","path":"/v1/query","time":"2024-03-22T18:33:38+02:00","message":"GIN-Request"}


Bump. Anyone home?

Apologies for the delay. Still getting caught up from the GTC rush.

Try these debugging steps to manually start up the local inference server:

  1. Stop any running environments.
  2. Start the environment from AIWB. Open the Chat and JupyterLab apps.
  3. In the Chat window, click "Local" inference mode on the right-hand side. Wait for the RAG backend to set up properly and switch over to the Local inference settings.
  4. Click the Download button for the Mistral model and wait for the model to download.
  5. Inside the JupyterLab window, open a terminal and run "bash /project/code/scripts/start-local.sh mistralai/Mistral-7B-Instruct-v0.1 bitsandbytes-nf4" (repeated below for convenience).

This should show some logs about what is going on.
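For convenience, here is that step-5 command again, ready to paste into a JupyterLab terminal tab (File > New > Terminal), not into a notebook cell:

    bash /project/code/scripts/start-local.sh mistralai/Mistral-7B-Instruct-v0.1 bitsandbytes-nf4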

WRT finding the Workbench version, there are two ways:

  • In a terminal, run nvwb version and the output will give you the version of the CLI. It's different from, but in step with, the Desktop App version.
  • For the Desktop App, right-click the grey Workbench icon in your system tray and select "About AI Workbench". A window will open showing the Desktop App version.
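For example, from any terminal on the machine running Workbench (the exact output format may vary by release):

    # Prints the installed CLI version; it is paired with the Desktop App version
    nvwb version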

I've attached two screenshots showing how to find the Desktop App version on a Windows machine. Mac and Ubuntu 22.04 will involve a similar set of steps.


Thanks.

My version is: Version 0.44.8


When I try to run your command in JupyterLab I get a syntax error:

Cell In[1], line 1
bash /project/code/scripts/start-local.sh mistralai/Mistral-7B-Instruct-v0.1 bitsandbytes-nf4
^
SyntaxError: invalid decimal literal


If I look at the logs under "Chat", I get this:

2024-04-11T16:29:25.703495Z INFO text_generation_launcher: Args { model_id: "mistralai/Mistral-7B-Instruct-v0.1", revision: None, validation_workers: 2, sharded: None, num_shard: None, quantize: Some(BitsandbytesNF4), speculate: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_top_n_tokens: 5, max_input_length: 4000, max_total_tokens: 5000, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, max_batch_size: None, enable_cuda_graphs: false, hostname: "project-hybrid-rag", port: 9090, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data/"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 0.85, rope_scaling: None, rope_factor: None, json_output: false, otlp_endpoint: None, cors_allow_origin: , watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, tokenizer_config_path: None, disable_grammar_support: false, env: false }
2024-04-11T16:29:25.703611Z INFO download: text_generation_launcher: Starting download process.
Error: http://localhost:9090/info returned HTTP code 000

Got it, thanks. Ah, I meant running that command in a JupyterLab terminal window (e.g. new tab → Terminal → run the command). If running it as a cell in a Jupyter notebook, you would typically need to add a bang (!) in front of the command, or the notebook will interpret the command as Python code.
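For example, with the same start-local.sh invocation as above:

    # In a JupyterLab terminal tab, run the command directly:
    bash /project/code/scripts/start-local.sh mistralai/Mistral-7B-Instruct-v0.1 bitsandbytes-nf4

    # In a Jupyter notebook cell, prefix it with ! so it is passed to the shell
    # instead of being parsed as Python:
    !bash /project/code/scripts/start-local.sh mistralai/Mistral-7B-Instruct-v0.1 bitsandbytes-nf4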

Looks like you are getting a 000 HTTP code when starting the local inference server for Mistral, and it is timing out. We are aware of the issue and are pushing a fix soon to periodically poll for a 200 code before letting the user submit queries.

In the meantime, try this workaround. Restart the environment and open JupyterLab. In code/scripts/start-local.sh, locate the line "sleep 50 # Model warm-up". You can try increasing that warm-up period and see if it helps. Then open Chat and try again. A sketch of the edit is below.
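As a rough sketch of that edit (the exact placement in start-local.sh may differ on your machine, and the curl readiness loop is only a suggestion of mine, not part of the shipped script):

    # In /project/code/scripts/start-local.sh, bump up the warm-up delay, e.g.:
    sleep 90 # Model warm-up (was: sleep 50)

    # Optional extra check: wait until the inference server actually answers
    # on http://localhost:9090/info before moving on.
    until [ "$(curl -s -o /dev/null -w '%{http_code}' http://localhost:9090/info)" = "200" ]; do
      sleep 5
    done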