| Topic | Replies | Views | Activity |
|---|---|---|---|
| How to report a bug | 0 | 10220 | November 28, 2018 |
| Batched SDDMM is not supported | 1 | 7 | February 8, 2023 |
| Is 2D-array column-major storage explicitly described in CUDA documentation? | 4 | 30 | February 8, 2023 |
| How is elementwise operators fusion done by compiler? | 0 | 10 | February 8, 2023 |
| My application performs 40% slower on docker compared to WSL2, GPU is not achieving full clock speed? | 4 | 20 | February 8, 2023 |
| Unable to perform matrix addition in PyCUDA - Jetson Xavier NX/CUDA 11.4.19 | 4 | 37 | February 7, 2023 |
| Correct usage of ldcg and stcg for inter-block communication | 9 | 90 | February 7, 2023 |
| Are the intrinsics listed anywhere? | 2 | 25 | February 7, 2023 |
| __half and standard operators + * / - | 4 | 39 | February 7, 2023 |
| Does every thread block have its own 32 shared memory banks? | 7 | 75 | February 6, 2023 |
| Using __restrict__ in CUDA is not giving any significant performance benefit | 1 | 29 | February 6, 2023 |
| Parallel training with 4 cards 4090 cannot be performed on AMD 5975WX, stuck at the beginning | 13 | 541 | February 6, 2023 |
| CUDA IPC vs NVSHMEM for shared memory between applications | 4 | 169 | February 6, 2023 |
| Why Sleep blocking all cuda streams? | 4 | 115 | February 6, 2023 |
| How to compile OpenCL code into binary for a GPU I do not physically have? | 1 | 45 | February 5, 2023 |
| How to create CUDA library using Visual Studio | 5 | 2972 | February 5, 2023 |
| What does Compute Unified Device Architecture mean? | 1 | 52 | February 4, 2023 |
| How to launch CUDA Cooperative Groups Standard Deviation example kernel? | 7 | 114 | February 3, 2023 |
| cudaStreamSynchronize is much slower than polling on a flag for kernel completion | 7 | 188 | February 2, 2023 |
| Atomic add running faster than naive add | 1 | 352 | February 2, 2023 |
| Does NVidia know about the 300% perf improvement cuDNN can provide? | 5 | 109 | February 2, 2023 |
| Multi-GPU Training time is slower than single-GPU | 0 | 54 | February 2, 2023 |
| How to get pointer to a kernel by its name? | 2 | 89 | February 2, 2023 |
| Data transfer in/out CUDA API time unstable | 3 | 67 | February 2, 2023 |
| About CPU thread increase when calling CUDA interface | 1 | 80 | February 2, 2023 |
| unable to include cub | 5 | 3833 | February 2, 2023 |
| How to create vector of objects in the device? | 1 | 119 | February 2, 2023 |
| Cuda function increase new thread and kernel-mode process | 0 | 72 | February 1, 2023 |
| Why does cudaEventSynchronize block other streams? | 1 | 80 | February 2, 2023 |
| How to kill unknown process that eating up the GPU memory? | 2 | 252 | February 1, 2023 |