Decimal Floating-Point

Does anyone know of a CUDA library for decimal floating-point computations?

I am not aware of any. What is your use case? What decimal floating-point library do you currently use in C/C++ code on the host? If it is an open-source library, you may want to try porting it. Given that the CUDA tool chain supports most of C++11 at this point, that should not be too hard.

There is an Intel DFP library:

It's the Intel® Decimal Floating-Point Math Library. I don't think this can be ported… Intel uses some specialized instructions.

I am interested to learn more about your use case. What advantages does decimal floating-point arithmetic provide over binary floating-point arithmetic in your specific context? What arithmetic operations do you need? Do you require adherence to the IEEE-754 (2008) standard, or would a custom storage format suffice?

The use case involves arithmetic aggregates over millions of currency values (sums of partially discounted prices, for example). Because binary floating point cannot exactly represent some simple decimal values (0.1, for instance) at any precision, rounding errors accumulate. We are indeed looking into libraries that implement decimal floating-point arithmetic in adherence to IEEE-754-2008, where such computations can be carried out exactly.
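To make the accumulation problem concrete, here is a minimal sketch in plain C++ that sums the same price once as a binary double and once as an integer count of cents. The function names `sum_as_double` and `sum_as_cents` are made up for illustration and do not come from any library mentioned in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Sum `count` copies of a price stored as a binary double (e.g. 0.10 dollars).
// 0.10 has no exact binary representation, so each addition may round.
double sum_as_double(double price, int count) {
    assert(count >= 0);
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        total += price;
    }
    return total;
}

// Sum `count` copies of a price stored as an integer number of cents.
// Integer addition is exact as long as the total does not overflow.
std::int64_t sum_as_cents(std::int64_t price_cents, int count) {
    assert(count >= 0);
    std::int64_t total = 0;
    for (int i = 0; i < count; ++i) {
        total += price_cents;
    }
    return total;
}
```

With IEEE-754 double arithmetic, `sum_as_double(0.10, 10)` does not compare equal to `1.0`, whereas `sum_as_cents(10, 10)` is exactly `100`. Over millions of values the double result drifts further, which is the effect described above.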

The Intel Decimal Floating Point Library:
Or this free implementation claiming dramatic speedup over Intel’s library:

Eventually we would like to run this on the GPU.

The code at the latter link seems to be by Mike Cowlishaw, who was the main driving force behind the addition of decimal floating-point arithmetic to the 2008 revision of the IEEE-754 standard. From cursory perusal of the website, it seems the C source code is made available under a GPL license. Have you tried compiling the code for the GPU?

I have never worked on accounting software, but am under the impression that it frequently uses integer arithmetic, for example by representing all quantities as multiples of 1/1000 of a cent; apparently this technique goes back all the way to how COBOL codes in the 1950s handled this. Other financial computations that can tolerate rounding errors, such as option pricing, appear to be commonly performed using double-precision arithmetic.

Thanks, we plan to look at the C source code. I’m also looking into:

A multiple-precision library seems like overkill to me. A signed 64-bit integer counting multiples of 1/1000 of a cent would allow representation of amounts up to 92 trillion dollars (where a trillion is 10**12). With the US national debt currently creeping up to 19 trillion dollars, that still leaves wiggle room going forward.
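A rough sketch of that fixed-point approach in plain C++ follows. All names (`kUnitsPerDollar`, `apply_discount`, `sum_units`) are hypothetical, and the choice of basis points for the discount and round-half-up for the final digit is an assumption; a real application would use whatever rounding rule its accounting requirements dictate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// One unit = 1/1000 of a cent, so one dollar = 100,000 units.
constexpr std::int64_t kUnitsPerDollar = 100000;

// Largest whole-dollar amount a signed 64-bit count of units can hold.
constexpr std::int64_t kMaxDollars =
    std::numeric_limits<std::int64_t>::max() / kUnitsPerDollar;

// Apply a discount given in basis points (1 bp = 0.01%), rounding half up.
// Assumption: the price is non-negative and small enough that
// price_units * 10000 does not overflow (roughly below 9.2 billion dollars).
std::int64_t apply_discount(std::int64_t price_units, std::int64_t discount_bp) {
    assert(price_units >= 0);
    assert(discount_bp >= 0 && discount_bp <= 10000);
    const std::int64_t keep_bp = 10000 - discount_bp;
    return (price_units * keep_bp + 5000) / 10000;
}

// Exact aggregate: integer addition introduces no rounding error.
std::int64_t sum_units(const std::int64_t* values, std::size_t n) {
    std::int64_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```

Here `kMaxDollars` evaluates to 92,233,720,368,547, which is the 92 trillion dollar figure quoted above. A price of $19.99 is stored as 1,999,000 units, and `apply_discount(1999000, 1500)` (15% off) yields 1,699,150 units, or $16.9915. The only rounding happens once per discount, in a place you control, and the aggregation itself is exact. Since the functions use nothing but 64-bit integer arithmetic, they should in principle be usable in device code as well, though that has not been tried here.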

As you stated, the Cowlishaw code is GPL and is distributed by the Free Software Foundation as part of gcc (libdecnumber). Update: nvcc (the CUDA compiler) does compile it, but I haven't yet tried sending these objects to/from the GPU for computation. That's next.

Thanks for your support.