Factorial for CUDA? existing implementation of factorial

frossi · February 18, 2008, 1:15am

I need to use a ‘factorial’ function in the calculation of Zernike moments in an image processing application. Simple, right? Nope.

I’ve been searching through the docs and forum for references to ‘factorials’ in hopes that either CUDA supports it or someone has it implemented, but so far no luck. Am I missing something?

seibert · February 18, 2008, 2:22am

A float will run out of significant digits after 10!, and a 32-bit int will run out of digits after 12!, so you could probably do this pretty efficiently with a lookup table in constant or shared memory.
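A minimal sketch of the lookup-table idea, for illustration only. The name `factorial_u32` and the `HD` macro are hypothetical, and returning 0 for out-of-range input is an assumption of the sketch. The table is kept function-local so the same source builds as plain C++ or under nvcc; in a CUDA-only build the same 13 values could instead be placed in a `__constant__` array, as suggested above.

```cpp
#include <cstdint>

// Expands to CUDA qualifiers under nvcc and to nothing in plain C++.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Returns n! for 0 <= n <= 12; 12! is the largest factorial that fits in 32 bits.
// Returns 0 for out-of-range n (a sentinel chosen for this sketch).
HD inline std::uint32_t factorial_u32(int n) {
    const std::uint32_t table[13] = {
        1u, 1u, 2u, 6u, 24u, 120u, 720u, 5040u, 40320u,
        362880u, 3628800u, 39916800u, 479001600u
    };
    return (n >= 0 && n <= 12) ? table[n] : 0u;
}
```

Calling `factorial_u32(5)` gives 120, and anything above 12 gives the sentinel 0 rather than a silently wrapped value.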

If you need larger factorials, you’ll start to lose precision, and you’ll want to rearrange the computation so that it never has to hold factorials that big.
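One possible way to rearrange such a computation, sketched here as an assumption rather than anything posted in the thread: when the factorials only appear as a ratio like n! / (k! (n-k)!), the ratio can be built up step by step so that no full factorial is ever formed. The helper name `binomial` is hypothetical. The Zernike radial coefficient (n-k)! / (k! ((n+m)/2-k)! ((n-m)/2-k)!) can be written as a product of two such binomials, C(n-k, k) · C(n-2k, (n-m)/2-k).

```cpp
#include <cstdint>

// C(n, k) = n! / (k! (n-k)!) computed without forming any factorial.
// After step i the running value equals C(n-k+i, i), so each division is exact.
// Mark the function __host__ __device__ to call it from a CUDA kernel.
inline std::uint64_t binomial(unsigned n, unsigned k) {
    if (k > n) return 0;
    if (k > n - k) k = n - k;            // use symmetry to shorten the loop
    std::uint64_t result = 1;
    for (unsigned i = 1; i <= k; ++i) {
        result = result * (n - k + i) / i;
    }
    return result;
}
```

For example, `binomial(30, 15)` stays well inside 64 bits even though 30! itself does not fit.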

Sarnath · February 18, 2008, 10:05am

That was a smart answer!!!

Also, you could implement big numbers in the form of strings, define a set of functions that implement arithmetic (including multiplication) on such strings, and go ahead with your factorial implementation. (I guess it would be worthless to implement this on CUDA. But who knows, it could go the other way too. Just a matter of trying…)

Or get or manufacture a calculator (a cellphone’s calculator???) that has a USB port to talk to computers, and communicate your math with it. Calculators can deal with big numbers easily.

wumpus · February 18, 2008, 3:05pm

I think it would be best to use some kind of approximation, or otherwise a 1-D texture.
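The post does not say which approximation is meant. One common choice is Stirling's formula, n! ≈ sqrt(2πn) · (n/e)^n, sketched below under that assumption. The name `stirling_factorial` is hypothetical. The relative error is roughly 1/(12n), so this is only useful where about 1% accuracy (at n = 10) is acceptable.

```cpp
#include <cmath>

// Stirling's approximation: n! ~ sqrt(2*pi*n) * (n/e)^n.
// Evaluated in log space so the intermediate values stay small.
// Inputs <= 0 return 1 (0! = 1; negative input is out of scope for this sketch).
inline double stirling_factorial(double n) {
    const double two_pi = 6.283185307179586;
    if (n <= 0.0) return 1.0;
    return std::exp(n * std::log(n) - n + 0.5 * std::log(two_pi * n));
}
```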

eelsen · February 19, 2008, 6:42pm

The gamma function is part of CUDA’s math library, with a maximum ulp error of 6. However, if you only need integer values, the best solution is a lookup.
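A host-side sketch of the gamma-function route, relying on the identity n! = Γ(n + 1). It uses `std::tgamma` from the C++ standard library; in device code the CUDA math library provides `tgamma` and `tgammaf`. The name `gamma_factorial` is hypothetical.

```cpp
#include <cmath>

// x! generalised through the gamma function: x! = Gamma(x + 1).
// Also works for non-integer x, which a lookup table cannot do.
inline double gamma_factorial(double x) {
    return std::tgamma(x + 1.0);
}
```

Because the result is a floating-point approximation, integer factorials should be rounded (for example with `std::round`) or, as noted above, taken from a lookup table instead.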

frossi · February 21, 2008, 9:19pm

That’s a very good point about precision loss from truncation. A lookup is what I’ll likely look into in the future. For the time being I’ve split the tasks so that the Zernike polynomials (which require factorials) are precalculated by the CPU over an image segment. The results are then stored on the GPU, so this only needs to be done once. The more intensive and repetitive calculation (the Zernike moments) is done separately using the stored Zernike polynomials (no factorials required).
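The thread does not include the actual code, so the following is only a sketch of what the CPU-side precalculation might look like, using the standard definition of the Zernike radial polynomial R_n^m(ρ) = Σ_k (-1)^k (n-k)! / (k! ((n+m)/2-k)! ((n-m)/2-k)!) ρ^(n-2k), with k from 0 to (n-m)/2. The names `factorial_f64` and `zernike_radial` are hypothetical. Evaluating it once per pixel on the host and copying the results to the GPU is the assumed workflow.

```cpp
#include <cmath>
#include <cstdlib>

// n! in double precision by direct multiplication (rounded once n gets large).
inline double factorial_f64(int n) {
    double f = 1.0;
    for (int i = 2; i <= n; ++i) f *= i;
    return f;
}

// Zernike radial polynomial R_n^m(rho) for n >= |m| with n - |m| even.
// Returns 0 for any other (n, m) combination.
inline double zernike_radial(int n, int m, double rho) {
    m = std::abs(m);
    if (n < m || (n - m) % 2 != 0) return 0.0;
    double sum = 0.0;
    for (int k = 0; k <= (n - m) / 2; ++k) {
        double coeff = factorial_f64(n - k) /
                       (factorial_f64(k) *
                        factorial_f64((n + m) / 2 - k) *
                        factorial_f64((n - m) / 2 - k));
        double term = coeff * std::pow(rho, n - 2 * k);
        sum += (k % 2 == 0) ? term : -term;   // alternating sign (-1)^k
    }
    return sum;
}
```

As a sanity check, `zernike_radial(2, 0, rho)` reduces to 2ρ² − 1. For high orders the individual factorials lose precision in a double, which matches the precision issue mentioned in the post.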

Unfortunately I’m still faced with precision issues as you pointed out (beyond 12!). Thanks!
