Accelerated Computing CUDA

CUDA Setup and Installation: Installing and configuring your development environment for CUDA C, C++, Fortran, Python (pyCUDA), etc.
CUDA Programming and Performance: General discussion area for algorithms, optimizations, and approaches to GPU Computing with CUDA C, C++, Thrust, Fortran, Python (pyCUDA), etc.
CUDA on Windows Subsystem for Linux: General discussion on WSL 2 using CUDA and containers.
CUDA NVCC Compiler: Discussion forum for CUDA NVCC compiler.

Topic	Replies	Views	Activity
We require more registers CUDA Programming and Performance	4	26	July 28, 2026
How can I verify whether my Tesla V100 32GB PCIe has the correct VBIOS before attempting a firmware update? CUDA Setup and Installation cuda , driver	2	55	July 28, 2026
Is there a particular order to how warps are statically assigned to SM sub partitions? CUDA Programming and Performance cuda , performance , vulkan	2	41	July 28, 2026
Request for clarification: forward compatibility behavior on driver 550.54.14 seems inconsistent with the CUDA compatibility documentation CUDA Setup and Installation cuda , installation , software-and-drivers	4	102	July 23, 2026
RTX PRO 6000 Blackwell falls off the bus under load — Xid 79 → Xid 154, driver 580.159.03 CUDA Setup and Installation	1	104	July 23, 2026
Green-context SM provisioning vs the primary context CUDA Programming and Performance performance , blackwell	1	60	July 23, 2026
Weekend project: Highly efficient acos() implementations with reduced accuracy CUDA Programming and Performance	1	127	July 22, 2026
Weekend project: Exploring the feasibility of replacing MUFU.RCP and MUFU.RSQ CUDA Programming and Performance	5	328	July 22, 2026
Native Incremental and Temporal PageRank for Dynamic Graphs – Design Considerations CUDA Programming and Performance graph-analytics-cugraph	0	23	July 22, 2026
When API setHardwareCompatibilityLevel is implemented in CUDA? CUDA Setup and Installation	1	48	July 22, 2026
Build onnxruntime 1.19.2 fail due to API HardwareCompatibilityLevel CUDA Setup and Installation cuda	2	128	July 22, 2026
Using INT2 for Coarse Prediction Within a Stabilized FP4/FP8 Substrate (Native Hopper Implementation) CUDA Programming and Performance	1	57	July 21, 2026
cuMemSetAccess Failed to create GPU mapping CUDA on Windows Subsystem for Linux question , bug	0	37	July 20, 2026
Failing nccl mpi tests for vllm_25.10-py3.sif CUDA Setup and Installation	0	20	July 20, 2026
Tesla V100-PCIE-32GB detected as V100-SXM2-32GB on Windows (Code 10, driver won’t start) CUDA Setup and Installation cuda , driver , software-and-drivers	4	98	July 20, 2026
CUDA on GTX Titan Black CUDA Setup and Installation cuda	1	428	July 17, 2026
Questions for two undocumented VMM API behavior CUDA Programming and Performance question	2	72	July 17, 2026
When Activation Checkpointing Calls a Stateful Quantizer Twice CUDA Programming and Performance cuda , pytorch , ai-training	0	50	July 17, 2026
Does Blackwell support INT4 native? CUDA Programming and Performance	13	2002	July 16, 2026
How to Calculate the Bank Conflicts CUDA Programming and Performance cuda , nsight-compute	18	295	July 14, 2026
Weekend project: Very accurate double-precision sincos() implementation for a restricted domain CUDA Programming and Performance	1	151	July 13, 2026
Question about `thread_scope_system` release/acquire store/load when `cudaDevP2PAttrNativeAtomicSupported == 0` CUDA Programming and Performance cuda	0	43	July 13, 2026
Slow kernel if I don't specify bounds at compile time CUDA Programming and Performance cuda , hpc-compilers-nvfortran	1	47	July 12, 2026
Sub/virtual warps or serialize access? CUDA Programming and Performance	3	53	July 12, 2026
Code samples for GTC S81772: Don’t Leave Tensors on the Table: Programming and Optimizing Tensor Cores CUDA Programming and Performance gtc , nsight-compute	0	41	July 11, 2026
CUDA files won't be recompiled when an header file has changed if the CCCL library is used CUDA NVCC Compiler cuda	3	188	July 10, 2026
CUDA 13 Fraud AI Benchmark: CPU vs CUDA 12 vs CUDA 13 model-quality comparison CUDA Programming and Performance cuda , docker , pytorch , benchmarks	3	134	July 10, 2026
CUDART64_50_35.DLL in Wondershare like a malware CUDA Setup and Installation	0	28	July 8, 2026
Truncated loop-exit compare in PTX prevents ptxas from unrolling loops CUDA Programming and Performance	0	52	July 7, 2026
Ldmatrix access pattern to shared memory CUDA Programming and Performance	1	72	July 4, 2026