Multi-sampling, deferred shading and OpenGL/Direct-X/CUDA interop

sam_hawker · November 30, 2016, 12:09pm

Firstly, I apologize that this query is largely graphics-related rather than CUDA-related. I’d like to use as much CUDA as possible, but my feeling is that I need a graphics API to make the best use of the rasterization hardware. I haven’t completely ruled out writing an entirely CUDA-based rasterizer or ray-tracer, though.

I’m looking at creating an X-ray radiograph simulator along the lines of gvirtualxray ([url]http://gvirtualxray.sourceforge.net/gvirtualxray.php[/url]).

They use a triangle mesh representation for their object and rely on hardware rasterization. They disable depth testing and back face culling to ensure that all triangles are rendered. They use a custom pixel shader to calculate a signed distance from the X-ray source to the triangle (where the sign indicates whether the intersection represents entering or leaving the object). They use render target blending to calculate the sum of these signed distances. The sum then represents the path length through the object. The intensity of a pixel in the simulated radiograph (which represents the amount of X-ray transmission) is proportional to exp(-attenuation_coefficient * path_length).
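To make the accumulation concrete, here is a minimal CPU sketch in Python of what I understand the blending to compute for a single ray. The function names and the list-of-crossings representation are my own illustration, not gvirtualxray's API, and I am assuming a closed mesh, a homogeneous material, and the convention that entering hits count as negative and leaving hits as positive.

```python
import math

def path_length(crossings):
    """Sum signed distances along one ray.

    `crossings` is a list of (distance, entering) pairs, one per triangle
    the ray hits. Assumed convention: entering hits contribute -distance
    and leaving hits contribute +distance. The sum does not depend on the
    order of the hits, which is why additive blending with depth testing
    disabled is sufficient.
    """
    return sum(-d if entering else d for d, entering in crossings)

def transmission(mu, length):
    """Transmitted fraction exp(-mu * length) for attenuation coefficient mu."""
    return math.exp(-mu * length)

# Hypothetical ray that enters at 2.0, leaves at 5.0, re-enters at 7.0 and
# leaves at 8.0 (two lobes of a concave object), listed out of order.
hits = [(8.0, False), (2.0, True), (7.0, True), (5.0, False)]
length = path_length(hits)             # (5 - 2) + (8 - 7)
intensity = transmission(0.5, length)  # 0.5 is a made-up coefficient
```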

The main modification I would like to make is to calculate these path lengths for multiple ray paths within each pixel and then to combine them in a second pass. A second pass is needed because I have to combine intensities rather than path lengths. Ideally I’d like to use CUDA for the second pass so I can make use of shared memory, thread synchronization, atomic operations, etc.
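The reason is that the exponential is nonlinear, so averaging the path lengths and then exponentiating does not give the same result as averaging the per-sample intensities. A small Python sketch with made-up numbers (the attenuation coefficient and the four sub-pixel path lengths are hypothetical):

```python
import math

mu = 1.0                         # hypothetical attenuation coefficient
lengths = [0.0, 0.0, 4.0, 4.0]   # hypothetical sub-pixel path lengths at an object edge

# What I want: convert each sample to an intensity, then average.
mean_of_intensities = sum(math.exp(-mu * L) for L in lengths) / len(lengths)

# What blending path lengths alone would give: average first, then convert.
intensity_of_mean = math.exp(-mu * sum(lengths) / len(lengths))

print(mean_of_intensities, intensity_of_mean)
```

For a pixel that straddles an edge like this, the two values differ substantially (roughly 0.51 versus 0.14), so the per-sample path lengths have to survive until the second pass.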

This is very straightforward if I limit myself to regular grid sampling but I’m interested in experimenting with other sampling patterns.

Would it be possible to perform the first pass using the “sample” interpolation modifier (documented on Microsoft Docs) and a multi-sample render target? Can I still use render target blending on a multi-sample render target?

What about the second pass - can I read the multi-sample render target from a CUDA kernel?

How do I control the number of multi-samples and the pattern (e.g. regular grid, rotated grid, etc.)?
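To be clear about what I mean by these patterns, here is a rough Python sketch that generates sub-pixel sample offsets on the CPU. It only illustrates the patterns themselves and says nothing about how to program them into the hardware. The function names, the rotation angle of atan(1/2), and the wrap-around for samples that rotate outside the pixel are all my own choices.

```python
import math

def regular_grid(n):
    """n x n sample offsets at cell centers inside the unit pixel [0, 1)^2."""
    return [((i + 0.5) / n, (j + 0.5) / n) for j in range(n) for i in range(n)]

def rotated_grid(n, angle=math.atan(0.5)):
    """Regular grid rotated about the pixel center by `angle` radians.

    Samples that land outside the pixel are wrapped back in modulo 1
    (an assumption made here to keep every offset inside the pixel).
    """
    c, s = math.cos(angle), math.sin(angle)
    samples = []
    for x, y in regular_grid(n):
        dx, dy = x - 0.5, y - 0.5          # offset from pixel center
        rx = dx * c - dy * s               # standard 2D rotation
        ry = dx * s + dy * c
        samples.append(((rx + 0.5) % 1.0, (ry + 0.5) % 1.0))
    return samples

grid = regular_grid(2)
rotated = rotated_grid(2)
```

With a 2 x 2 regular grid the four samples share only two distinct x coordinates, whereas the rotated version has four distinct x coordinates, which is the property I am interested in for near-vertical and near-horizontal edges.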

Apparently Maxwell has programmable sampling patterns ([url]http://www.geforce.com/whats-new/articles/multi-frame-sampled-anti-aliasing-delivers-better-performance-and-superior-image-quality[/url]) - where do I find documentation on how to program it?

Finally, I read that recent NVIDIA GPUs use a tile-based rendering approach. Would there be any way to leverage that to avoid the huge memory (and memory bandwidth) requirements of a high level of multi-sampling? Essentially I would like to somehow keep the results of the first pass (for the current tile) in shared memory or L1/L2 cache and then perform the second pass (for the current tile) and only actually output the results of the second pass to global memory. I could of course manually divide the image into tiles and render each one separately but I imagine this will result in a huge amount of redundant vertex shading unless I can somehow cull triangles that do not intersect the current tile.
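For the manual tiling fallback, the culling I have in mind is something like the following conservative bounding-box binning, sketched here in Python on the CPU. The function name, the tile size and the triangle data are hypothetical, and it assumes the vertices have already been projected to pixel coordinates, so it does not by itself avoid the redundant vertex shading.

```python
import math

def bin_triangles(triangles, width, height, tile):
    """Map each tile (tx, ty) to the indices of triangles whose
    screen-space bounding box overlaps it (conservative test)."""
    tiles_x = math.ceil(width / tile)
    tiles_y = math.ceil(height / tile)
    bins = {}
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Tile range covered by the bounding box, clamped to the image.
        tx0 = max(math.floor(min(xs) / tile), 0)
        tx1 = min(math.floor(max(xs) / tile), tiles_x - 1)
        ty0 = max(math.floor(min(ys) / tile), 0)
        ty1 = min(math.floor(max(ys) / tile), tiles_y - 1)
        # Fully off-screen triangles produce an empty range and are skipped.
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins.setdefault((tx, ty), []).append(index)
    return bins

tris = [
    [(1.0, 1.0), (6.0, 2.0), (3.0, 7.0)],        # inside tile (0, 0)
    [(6.0, 6.0), (12.0, 6.0), (9.0, 12.0)],      # spans all four tiles
    [(-5.0, -5.0), (-1.0, -5.0), (-3.0, -1.0)],  # entirely off-screen
]
bins = bin_triangles(tris, width=16, height=16, tile=8)
```

Each tile would then be rendered with only its own triangle list, at the cost of an extra binning step per frame.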
