| Topic | Replies | Views | Activity |
| --- | --- | --- | --- |
| vLLM 0.17.0 wheel for Jetson Orin with Marlin GPTQ (SM 8.7) | 0 | 13 | March 15, 2026 |
| "RTX 5090 + 5070 Ti Multi-GPU Training: CUDA Driver Crash During Backward Pass (sm_120, PyTorch, gradient_checkpointing)" | 0 | 14 | March 15, 2026 |
| RAG Blueprint on DGX Spark (ARM64 / GB10): NIMs & Milvus OK, but ingestor-server / rag-server fail with exec format error | 5 | 401 | March 15, 2026 |
| Vulkan as alternative backend for llama.cpp | 1 | 47 | March 15, 2026 |
| Building Local + Hybrid LLMs on DGX Spark That Outperform Top Cloud Models | 19 | 3031 | March 15, 2026 |
| To NVIDIA Staff: Is This a Hardware Issue Requiring Repeated Shutdowns and RMA Under High Load? | 23 | 529 | March 14, 2026 |
| LLM library recommendations for maximum token speeds | 10 | 246 | March 14, 2026 |
| NVIDIA GreenBoost kernel modules open-sourced | 1 | 672 | March 14, 2026 |
| Nemotron-3-Super 120B on GB10 — llama.cpp sm_121 build + Ollama GGUF incompatibility fix | 0 | 105 | March 14, 2026 |
| Missing vision reasoning with Qwen3.5-122B Q4 on vLLM (works on llama.cpp) | 4 | 300 | March 13, 2026 |
| How do I run Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled on vllm community docker? | 4 | 291 | March 13, 2026 |
| "unable to allocate CUDA0 buffer" after updating Ubuntu packages | 244 | 13922 | March 13, 2026 |
| DGX crashing after ~8 hours: Stability issues switching from GLM 4.6v (FP8) to Qwen 122B (Q6) in llama.cpp | 1 | 114 | March 13, 2026 |
| HOW-TO: Run Qwen3-Coder-Next on Spark | 89 | 6559 | March 12, 2026 |
| New tool: llama-benchy - llama-bench style benchmarking for ANY LLM backend (vLLM, SGLang, llama.cpp, etc.) | 9 | 776 | March 12, 2026 |
| Single node and dual node llama.cpp build flags | 5 | 62 | March 11, 2026 |
| (sparkrun) Qwen3.5 GGUF Benchmarks over llama.cpp RPC | 3 | 443 | March 11, 2026 |
| HOW-TO: setup-dgx-spark docker inference - A "Sane" Inference Stack for GB10 (Need Contributors!) | 30 | 1058 | March 11, 2026 |
| Distributed Spark | 2 | 78 | March 10, 2026 |
| NVIDIA ACE: Model Archive | 1 | 55 | March 10, 2026 |
| Open-Source CLI Agent Framework for NVIDIA AI Endpoints - Seeking Feedback | 2 | 38 | March 10, 2026 |
| CUDA headers in crt/math_functions.h still broken in debian-13 repo | 1 | 31 | March 9, 2026 |
| CUDA headers in crt/math_functions.h still broken in debian-13 repo | 1 | 19 | March 9, 2026 |
| VSS Jetson Thor: GPU memory increase during summarization causes OOM unless VSS is restarted | 2 | 37 | March 9, 2026 |
| MSI EdgeXpert Suddenly Powers Off During llama-benchy – Possible PD Firmware Issue? | 25 | 294 | March 9, 2026 |
| Missing official native ARM64 NIM images for essential AI models | 5 | 429 | March 9, 2026 |
| Home Assistant on DGX? | 5 | 236 | March 9, 2026 |
| Max observed wattage | 4 | 136 | March 8, 2026 |
| DGX Spark crashes when running tensorrt-llm | 3 | 167 | March 7, 2026 |
| TRT LLM inference with NVFP4 safetensors slower than LM Studio GGUF on the Spark | 9 | 1035 | March 6, 2026 |