Hi, based on my initial tests, the issue seems to be resolved in 25.11, and indeed the performance with difficult kernels appears to be at least as good as in 24.9. It’ll take a while to gather more experience with 25.11, but so far everything looks good! Thank you!