Tensor core differences between L40 and L40S? (and RTX 6000 Ada?)

dave64 · August 14, 2023, 4:01pm

Hello, I’m trying to optimize 1-bit tensor-heavy code on an RTX 6000 Ada, and now that the L40S has been introduced, I’m wondering whether deployment differences might crop up between the L40, the L40S, and the RTX 6000 Ada.

For those of you who haven’t looked closely, the L40 and L40S data sheets strongly imply that the L40S tensor cores are twice as fast as the L40’s, except for INT4 (weird). INT1 tensor performance is left unspecified for all three GPUs.

What’s going on? More directly, is INT1 performance the same on the L40, the L40S, and the RTX 6000 Ada, or is one significantly faster or slower than the others?
