| Topic | Replies | Views | Activity |
| --- | --- | --- | --- |
| Qwen3.5-122B-A10B on single Spark: up to 51 tok/s (v2.1 — patches + quick-start + benchmark) | 314 | 8030 | April 19, 2026 |
| Collecting eval results for Spark-sized quants of models | 11 | 326 | April 19, 2026 |
| Request to increase RPM limit for NIM API (Development) | 4 | 156 | April 17, 2026 |
| AI Models That Run on Jetson Orin Nano Super (8GB) — A Practical Guide | 4 | 1106 | April 16, 2026 |
| Request for Rate Limit Increase for NIM API | 1 | 58 | April 16, 2026 |
| Weekend Home Lab: Qwen3.5 9B on Jetson Orin Nano Super with TurboQuant4 (100K token window) | 4 | 192 | April 15, 2026 |
| My DGX Spark Hangs ... is this normal? | 4 | 181 | April 13, 2026 |
| Can not run Qwen3-VL-8B-Instruct-FP8 on Jetson AGX Thor using vllm | 7 | 79 | April 13, 2026 |
| Local-first coding agent that auto-configures llama.cpp for maximum hardware performance | 0 | 147 | April 13, 2026 |
| RedHatAI/Qwen3.5-122B-A10B-NVFP4 seems to be the best option for a single Spark | 74 | 4556 | April 11, 2026 |
| Unable to Accept Terms for Model llama-3.3-nemotron-super-49b-v1.5-1.14.0 | 1 | 30 | April 8, 2026 |
| No luck with Gemma 4 on Jetson Nano Super | 8 | 710 | April 8, 2026 |
| Creating a 50 GB Swap File on Jetson AGX Orin (Root on NVMe) | 2 | 43 | April 7, 2026 |
| Fast Large-file and LLM Downloads with aria2 on NVIDIA Jetson AGX Orin | 1 | 41 | April 7, 2026 |
| Production-Ready Guide: TinyLlama Fine-Tuning on Jetson Orin Nano 8GB with Complete Solution | 3 | 101 | April 1, 2026 |
| NIM qwen3.5-35b-a3b:1.7.0-variant fails on Jetson AGX Thor — Triton ptxas-blackwell does not recognize sm_110a | 4 | 113 | April 1, 2026 |
| Does NanoLLM support Jetpack 7.0? | 1 | 35 | March 31, 2026 |
| Ollama Docker container problem | 5 | 89 | March 30, 2026 |
| The MODULE_SHDN_N signal of T5000 module with our carrier board is pulled low, causing the device to shut down | 10 | 100 | March 30, 2026 |
| Running NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 on the Nvidia Jetson Thor | 9 | 695 | March 30, 2026 |
| Sglang:26.02-py3 requires installation of 3 python modules | 4 | 78 | April 11, 2026 |
| Jetson Orin TensorRT-Edge-LLM not support QWen3.5-2B | 2 | 111 | March 25, 2026 |
| GPU stats in Live VLM WebUI | 7 | 58 | April 6, 2026 |
| [GB10] vLLM + DeepSeek-R1-32B on Blackwell aarch64 — 4 more failure modes (v2 protocol) | 0 | 182 | March 19, 2026 |
| LLM not working getting error | 3 | 46 | March 18, 2026 |
| R3840: How can I run large models offline when deploying them on this platform? | 6 | 82 | March 16, 2026 |
| Seeking Best Practices for Deploying Efficient RAG Systems on NVIDIA Jetson Edge Devices | 2 | 42 | March 16, 2026 |
| Recipes to run Qwen3.5 models on Thor | 1 | 366 | March 16, 2026 |
| How to run vllm on jetpack 6.0 | 3 | 85 | April 7, 2026 |
| TRT LLM for Inference with NVFP4 safetensors slower than LM studio GGUF on the Spark | 9 | 1170 | March 6, 2026 |