Latest GPU-Accelerated Libraries topics

Topic	Tags	Replies	Views	Activity
About the GPU-Accelerated Libraries category		0	5591	February 1, 2020
Wmma double buffering async + swizzle	kernel	0	19	June 26, 2026
Potential int32 overflow in nvjpeg causing nvjpegEncodeImage failure for large images	nvbugs, nvjpeg, dali	0	22	June 26, 2026
Segmentation fault on RTX 5090 with CUDA 13 during repeated PyTorch CUDA forward passes	cuda, pytorch, neural-network-framework, gpu, deep-learning, rtx	14	288	June 25, 2026
cusolverDnXtrtri not 64 bit ready?	cusolver	2	29	June 24, 2026
Virtual GPU Concept – NissanSkylanRSGXXX		0	18	June 23, 2026
cuDSS and the potential double-single precision arithmetic	cusparse, gaming, cudss	2	36	June 23, 2026
cuFile only running compat mode for RTX 6000 pro & Samsung 9100 Pro SSD on ASUS TUF Gaming B850	gds, gaming	5	113	June 21, 2026
AMGX runtime error with preconditioning	amgx	3	149	June 19, 2026
Sionna in the real hardware		0	19	June 19, 2026
Nvidia NNP Library: nppiYUV422ToRGB_8u_ColorTwist32f_C2C3R_Ctx()	npp	0	22	June 15, 2026
How to efficiently solve transposed problem using cuDSS?	cusparse, cudss	7	245	June 10, 2026
cuBLASLt sm_120 (Blackwell): TF32 split-K nvjet kernel raises "Warp Barrier Arrival Mismatch" — intermittent illegal access / GPU hang (RTX 5090)	cublas, llama	0	53	June 10, 2026
Feature request: support many distributed small systems in cuDSS	cusparse, cudss	1	28	June 9, 2026
Missing documentation for cusolverDnFunction_t	cusolver	0	22	June 9, 2026
Need Lidar_AI_Solution libspconv.so for Jetson Orin JetPack 6 CUDA 12 sm_87	cudnn	0	29	June 9, 2026
OpenCL variable-amount rotate(x, y) extra mask instruction in PTX	opencl	3	60	June 5, 2026
Application Works on Newer GPU but Fails on Older Hardware Generation	gpu	0	23	June 5, 2026
Request for NVIDIA NIM API Rate Limit Increase (40 → 200 RPM)	nim, agentic-ai	0	34	June 4, 2026
Cutensor errors on compute capability 6.1 card (Quadro P6000)	cutensor	1	45	June 3, 2026
Possible typo in CUDA Programming Guide v13.3: cuda.threadIdx and cuda.gridDim descriptions appear to be swapped in the Python table	documentation	2	44	June 3, 2026
LocateAnything with MagiAttention on a 5090?		0	41	June 1, 2026
cuTENSOR example code in Fortran	cutensor	3	68	May 29, 2026
Is there any working example of cuTile with C++?	cuda, cutile	1	107	May 28, 2026
I'm testing NCCL with a B300. Is this the right speed?	hpc-benchmarks	0	75	May 28, 2026
Triton inference on multi GPU has slow inference with incorrect results	cuda, inference-server-triton, linux-driver	2	75	May 26, 2026
Missing files from 2510 HSB IP?	fpga, holoscan	0	37	May 15, 2026
cuBLAS severe underperformance on cublasSgemm for RTX 3060 Laptop GPU	cublas, cutlass	1	59	May 14, 2026
NVIDIA Inception application stuck on "Pending Review" for 2+ months — any advice?	inception	1	103	May 14, 2026
cudaMalloc'ed memory on GraceHopper GH200 is not accessible by CPU	cuda	0	27	May 11, 2026

