I am planning to run a Gen AI chatbot application on HGX or SMCI 5U servers (e.g., AS-5126GS-TNRT) directly connected to block storage. The idea is: the application receives the incoming chat text, retrieves the relevant data from an RDBMS running on the block storage, passes the whole bundle to the LLMs running on the server, gets the formatted reply back, and relays it to the user (a rough sketch of this flow is below). I would appreciate some expert opinion on this approach. As for the question about the small amount of shared drive space, I plan to create a volume on the block storage and expose it over NFS.
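
For reference, here is a minimal sketch of the request flow I have in mind, assuming a PostgreSQL database on the block-storage volume and an OpenAI-compatible inference endpoint (e.g., vLLM or NIM) running locally on the server. The DSN, URL, table name, and model name are placeholders, not a tested deployment.

```python
# Hypothetical request handler: retrieve context from the RDBMS on block storage,
# bundle it with the chat text, send it to the local LLM, and relay the reply.
import psycopg2
import requests

DB_DSN = "dbname=chatapp user=chatapp host=10.0.0.10"   # RDBMS on block storage (assumed)
LLM_URL = "http://localhost:8000/v1/chat/completions"    # local inference server (assumed)

def answer_chat(user_id: str, chat_text: str) -> str:
    # 1. Retrieve the user's context from the RDBMS (table name is illustrative).
    with psycopg2.connect(DB_DSN) as conn, conn.cursor() as cur:
        cur.execute("SELECT context FROM customer_context WHERE user_id = %s", (user_id,))
        row = cur.fetchone()
        context = row[0] if row else ""

    # 2. Pass the whole bundle (retrieved data + incoming chat text) to the LLM.
    payload = {
        "model": "llama-3.1-8b-instruct",  # whichever model the server hosts
        "messages": [
            {"role": "system", "content": f"Use this customer context:\n{context}"},
            {"role": "user", "content": chat_text},
        ],
    }
    resp = requests.post(LLM_URL, json=payload, timeout=60)
    resp.raise_for_status()

    # 3. Relay the formatted reply back to the user.
    return resp.json()["choices"][0]["message"]["content"]
```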