| Topic | Replies | Views | Activity |
|---|---|---|---|
| Is the CUDA tile kernel submitted to GPU still using the cuLaunchKernel? | 1 | 15 | December 9, 2025 |
| FFMA with Uniform register | 3 | 47 | December 9, 2025 |
| How many tensor cores to execute the wmma.mma.sync.aligned.{alayout}.{blayout}.m16n16k16 instruction? | 4 | 49 | December 9, 2025 |
| Can't install CUDA and Nsight - Visual Studio or what? (Updated) | 4 | 19 | December 9, 2025 |
| Is it possible having compressible memory & memory pools over the same array on device? | 0 | 13 | December 9, 2025 |
| CUDA-Q kernel crashes on Tesla V100 (Driver 570.133 / CUDA 12.8) when running VQE | 0 | 10 | December 9, 2025 |
| cudaMemcpyBatchAsync cannot aggregate D2D copy operations | 13 | 88 | December 9, 2025 |
| Training YOLO in the background | 1 | 38 | December 8, 2025 |
| Pixel Shader vs NPP - Which is faster for batch processing NV12 to RGB conversions and display directly to screen? | 2 | 27 | December 8, 2025 |
| Deadlock when using cuStreamWaitValue32/cuStreamWriteValue32 for async cross-stream ordering | 8 | 23 | December 8, 2025 |
| Questions about the Cutile C++ | 1 | 38 | December 8, 2025 |
| Implementing clang-tidy checks for CUDA C++ Guidelines for Safety Critical Programming | 3 | 23 | December 8, 2025 |
| Enabling multi-GPU | 0 | 15 | December 8, 2025 |
| Parallel Pool shuts down on generating testVectors | 0 | 14 | December 8, 2025 |
| Question about CTA/warp lifecycle | 4 | 35 | December 8, 2025 |
| Unable to Install CUDA-Enabled PyTorch for NVIDIA GB10 GPU (Only CPU Version Installed) | 4 | 83 | December 7, 2025 |
| Help needed to execute tcgen05.mma_cta_group::2 instructions | 0 | 19 | December 7, 2025 |
| CUDA-13.1 breaks BF16 operators for sm_86 | 3 | 38 | December 6, 2025 |
| Cuda runfile won't extract | 3 | 96 | December 5, 2025 |
| Which offers lower latency for NV12 to RGB conversion, NPP or CV-CUDA? | 1 | 21 | December 5, 2025 |
| TensorFlow + RTX 5090 + WSL: CUDA 12 Installed in WSL but Windows Driver Uses CUDA 13 | 0 | 20 | December 5, 2025 |
| Any advice for best pipeline of least latency to display NV12 textures to monitor (Windows 10) | 1 | 18 | December 5, 2025 |
| the best way convert nv12 to RGBA | 6 | 3481 | December 5, 2025 |
| When will CUDA toolkit be able to detect Visual Studio 2026 during installation? Soon? | 2 | 713 | December 4, 2025 |
| Cuda core dump does not work properly when many device assert happens | 2 | 132 | December 4, 2025 |
| Cooperative_groups::cluster_group _CG_HAS_CLUSTER_GROUP does not get #define'd | 1 | 66 | December 4, 2025 |
| Cannot allocate any memory with cudaMallocHost or cudaMallocManaged | 0 | 29 | December 4, 2025 |
| cuCtxSynchronize fails with latest nvidia windows drivers (581.80) | 1 | 27 | December 3, 2025 |
| Unexpected Performance Behavior with CUDA Software Prefetcher, Warm-Up Kernel and GEMV | 10 | 72 | December 3, 2025 |
| Asymmetric PCIe bandwidth in bidirectional transfers: H2D drops 56% while D2H maintains performance | 1 | 39 | December 2, 2025 |