Finite Difference Methods in CUDA C++, Part 2

jwitsoe · November 12, 2013, 9:17am

Originally published at: https://developer.nvidia.com/blog/finite-difference-methods-cuda-c-part-2/

In the previous CUDA C++ post we dove into 3D finite difference computations in CUDA C/C++, demonstrating how to implement the x derivative part of the computation. In this post, let’s continue by exploring how we can write efficient kernels for the y and z derivatives. As with the previous post, code for the examples in this post…

anon42508259 · June 27, 2016, 9:46pm

Hi, Dr. Mark Harris, I wanted to share my benchmarks for 3-dimensional finite difference derivatives with you and the rest of the NVIDIA CUDA dev community. I also had a few questions to throw out: beyond this 64^3 grid, how do this implementation and the concept of pencils extend to arbitrarily large grids? Naively, if you wanted a "big" grid for 3-dimensional finite difference derivatives, e.g. 2560^3, does the "pencil" extend to size 2560 (entries)? Can we go even larger? More generally, are there any implementations out there of 3-dimensional Navier-Stokes equation solvers that use this finite difference approach with CUDA C/C++?
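Editorial note on the pencil-size question: one way to reason about it is to estimate the shared-memory footprint of a tile of full-length pencils. The sketch below is only an illustration under stated assumptions, none of which come from this thread: a tile of `(n + 2 * halo) x pencils` floats, a halo of 4 points per side, 32 pencils per tile, and a 48 KB per-block shared-memory limit. The names `tileBytes`, `fitsInSharedMemory`, and `kSharedLimitBytes` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumed per-block shared-memory limit (48 KB); check your GPU's actual limit.
const std::size_t kSharedLimitBytes = 48 * 1024;

// Hypothetical helper: bytes of shared memory needed by one tile that holds
// `pencils` full-length pencils of `n` points, plus `halo` points on each end.
inline std::size_t tileBytes(std::size_t n, std::size_t pencils,
                             std::size_t halo = 4) {
    assert(pencils > 0);
    return (n + 2 * halo) * pencils * sizeof(float);
}

// True if a tile of full-length pencils fits under the assumed limit.
inline bool fitsInSharedMemory(std::size_t n, std::size_t pencils) {
    return tileBytes(n, pencils) <= kSharedLimitBytes;
}
```

Under these assumptions a 64-point pencil tile needs 9216 bytes and fits easily, while a 2560-point tile with 32 pencils needs 328704 bytes and does not. This suggests that very large grids would need the pencil itself to be split into shorter segments with their own halos, rather than stretched to the full grid length.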

anon42508259 · June 28, 2016, 11:15pm

I asked this in part 1, but it may pertain here: arbitrarily large grids. Naively, I changed mx=my=mz for the grid size (originally 64^3) to 92^3 (i.e. mx=my=mz=92), and for anything above 92 I get a Segmentation fault (core dumped). I was simply curious what was happening; is it a limitation of the GPU hardware? If so, which parameter? I'm on an NVIDIA GeForce GTX 980 Ti.
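Editorial note on the segmentation fault: a "Segmentation fault (core dumped)" is raised on the host, so one possible cause (an assumption, not a diagnosis confirmed in this thread) is that the host-side grid arrays are declared as local arrays and overflow the process stack as the grid grows. The sketch below shows the arithmetic and a heap-backed alternative. The names `gridBytes` and `makeField`, the count of three arrays, and the 8 MiB stack limit are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bytes needed by `arrays` single-precision fields on an n^3 grid.
inline std::size_t gridBytes(std::size_t n, std::size_t arrays) {
    return n * n * n * sizeof(float) * arrays;
}

// Hypothetical helper: allocate one n^3 host field on the heap, zero-filled,
// instead of declaring it as a local (stack) array.
inline std::vector<float> makeField(std::size_t n) {
    assert(n > 0);
    return std::vector<float>(n * n * n, 0.0f);
}
```

With three arrays, a 64^3 grid needs 3145728 bytes (3 MiB), while a 92^3 grid needs 9344256 bytes, which is above a typical 8 MiB default stack limit. The exact cutoff depends on how many arrays live on the stack and on the shell's stack limit, so this sketch does not try to reproduce the reported threshold. A buffer from `makeField(n)` can be handed to `cudaMemcpy` through `.data()`.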

anon53073872 · March 16, 2017, 7:01am

A real-measure approximation via a finite sum of improper polynomials could be quite useful. Note that an improper integral can be written as a sum of improper integrals, which is trivial to parallelize and less computationally costly than, e.g., the rectangle method. I strongly feel that a simple RMSE cost function and a LUT of precomputed polynomial integrals should do the job on a GTX 1060 and a Core 2 Duo.
