Comparing transform feedback to CUDA: discussion on the performance differences

Hi all,
Recently I have been working on basic 2D grid-based deformation simulations to compare the performance of transform feedback (TF) with CUDA.
I started with a naive CUDA implementation using only global memory, and it was outperformed by TF. However, once I added shared memory, CUDA beat TF for large grid sizes, while TF still outperformed CUDA (by almost 2x) for small grid sizes. I would like to know if anyone else has experienced this, or can point me to resources (papers) that might help. Has anyone else used TF for this purpose? Any help would be appreciated.
Is it sane to use TF for this?
I am looking for people to discuss this with.
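For context, the kind of change I made (global memory → shared memory) looks roughly like the sketch below. My actual deformation kernel is longer, so this is a hypothetical Jacobi-style 5-point update as a stand-in; the tiled version stages each block's tile plus a one-cell halo in shared memory so neighbour reads hit on-chip memory:

```cuda
#include <cuda_runtime.h>

#define N    256   // grid side length (assumed divisible by TILE)
#define TILE 16    // block side length

// Naive version: every neighbour read goes to global memory.
__global__ void relaxGlobal(const float* in, float* out) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x == 0 || y == 0 || x >= N - 1 || y >= N - 1) return;
    out[y * N + x] = 0.25f * (in[y * N + x - 1] + in[y * N + x + 1] +
                              in[(y - 1) * N + x] + in[(y + 1) * N + x]);
}

// Tiled version: each block loads its TILE x TILE tile plus a halo into
// shared memory, then all four neighbour reads are served on-chip.
__global__ void relaxShared(const float* in, float* out) {
    __shared__ float tile[TILE + 2][TILE + 2];
    int x  = blockIdx.x * TILE + threadIdx.x;
    int y  = blockIdx.y * TILE + threadIdx.y;
    int lx = threadIdx.x + 1, ly = threadIdx.y + 1;

    tile[ly][lx] = in[y * N + x];
    // Edge threads also fetch the halo cell just outside the tile.
    if (threadIdx.x == 0        && x > 0)     tile[ly][0]        = in[y * N + x - 1];
    if (threadIdx.x == TILE - 1 && x < N - 1) tile[ly][TILE + 1] = in[y * N + x + 1];
    if (threadIdx.y == 0        && y > 0)     tile[0][lx]        = in[(y - 1) * N + x];
    if (threadIdx.y == TILE - 1 && y < N - 1) tile[TILE + 1][lx] = in[(y + 1) * N + x];
    __syncthreads();

    if (x == 0 || y == 0 || x >= N - 1 || y >= N - 1) return;
    out[y * N + x] = 0.25f * (tile[ly][lx - 1] + tile[ly][lx + 1] +
                              tile[ly - 1][lx] + tile[ly + 1][lx]);
}
```

The crossover I saw makes some sense with this structure: for small grids the kernel launch and the `__syncthreads()` overhead dominate, while for large grids the reduced global-memory traffic wins.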

I don’t understand exactly what you’re comparing here. Transform feedback is simply a way of storing vertices transformed by the graphics API into a buffer. I would expect a CUDA implementation of the same transformation to perform about equally, since it uses the same computational resources, or better when using shared memory, as you observed.

Generally speaking there’s no advantage to using the graphics API for basic data-parallel stuff unless you’re using fixed-function hardware like the rasterizer that isn’t available from CUDA.

Thanks for your reply Simon.

Well, there are two advantages to TF:

  1. The data remains on the GPU the whole time and can be visualized straight away.

  2. The implementation is much cleaner than doing the same thing in CUDA.

Apart from these, as you correctly pointed out, there isn’t much performance gain to be had from TF. I was looking to see whether anyone else has done something related, so that I could compare my findings.
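To make advantage 1 concrete, here is a rough sketch of the ping-pong loop I use (buffer and program names like `simProgram` and `outPosition` are mine, and error checking is omitted). The vertex shader reads grid positions from one buffer and transform feedback captures the updated positions into the other, so the data never leaves the GPU between simulation and rendering:

```cpp
GLuint buf[2];                 // buf[0]: current state, buf[1]: next state
int src = 0, dst = 1;

// One-time setup: tell the linker which varying to capture, then relink.
const char* varyings[] = { "outPosition" };
glTransformFeedbackVaryings(simProgram, 1, varyings, GL_INTERLEAVED_ATTRIBS);
glLinkProgram(simProgram);

void step() {
    glUseProgram(simProgram);
    glBindBuffer(GL_ARRAY_BUFFER, buf[src]);
    glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, 0);
    glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, buf[dst]);

    glEnable(GL_RASTERIZER_DISCARD);       // simulation pass only, no drawing
    glBeginTransformFeedback(GL_POINTS);
    glDrawArrays(GL_POINTS, 0, gridVertexCount);
    glEndTransformFeedback();
    glDisable(GL_RASTERIZER_DISCARD);

    std::swap(src, dst);                   // next pass reads what we just wrote

    // Render pass: draw buf[src] with the display program straight away.
}
```

Compared with a CUDA version, there is no interop mapping/unmapping of the vertex buffer each frame, which is a large part of why the code ends up cleaner.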

Anyway, thanks for the comments.