Just saw this (GitHub - Tencent/hpc-ops: High Performance LLM Inference Operator Library). It's still in its infancy, so it lacks a lot of the features and supported quants of the big-boy alternatives, but the reported performance gains look nice… hopefully those would also carry over to our sparks.
Thanks for informing the community. Hopefully it can help some people with their projects.