Hi,

Is there any cuda library for doing arbitrary-precision arithmetic, like for example

multiplication of 2 million decimal-digit numbers?

Thanks much,

mani

You might want to take look at CAMPARY:

https://hal.archives-ouvertes.fr/hal-01312858

Mioara Joldes, Jean-Michel Muller, Valentina Popescu, Warwick Tucker: “CAMPARY: Cuda Multiple Precision Arithmetic Library and Applications”, 5th International Congress on Mathematical Software (ICMS), July 2016, Berlin, Germany

The best link to the software itself that I could find in a five-second search is http://homepages.laas.fr/mmjoldes/campary/, but by all means check the usual open software repositories as well.

An older project is CUMP:

https://github.com/skystar0227/CUMP

T. Nakayama and D. Takahashi: “Implementation of Multiple-Precision Floating-Point Arithmetic Library for GPU Computing”, Proc. 23rd IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS 2011), pp. 343–349 (2011).

Even older is gpuprec:

https://github.com/lumianph/gpuprec

Mian Lu, Bingsheng He, and Qiong Luo: “Supporting extended precision on graphics processors”. DaMoN '10: Proceedings of the Sixth International Workshop on Data Management on New Hardware, June 2010, pp. 19–26.

I have not used any of the above.

Thanks a lot for your time.

Mani

Directly from NVIDIA: https://github.com/NVlabs/xmp

2 million decimal digits is a lot. I am not sure there is a practical general-purpose library that can do it.

Interesting. I wasn't aware of an effort by NVlabs to produce such a library. Is there a published paper available somewhere? Or at least a GTC presentation slide deck?

Voilà:

http://on-demand.gputechconf.com/gtc/2016/presentation/s6349-niall-emmart-xmp-library-internals.pdf


This library is only good (optimized) up to the Maxwell microarchitecture.

For best performance on Pascal, force it to multiply using XMAD, for Volta and Turing, force IMAD.

For peak performance on Pascal, stick with CUDA 8.0; Volta and Turing can use the later CUDA releases.

Hi,

Do you know a place where I can find documentation about CAMPARY?

I was not able to find a guide or neat examples.

The problem that I want to solve requires arrays of numbers with precision higher than doubles (multi-precision). I need to continuously work with them on the CPU, transfer them to the GPU, transfer the results from the GPU back to the CPU, and so on (always preserving the precision).

The operations that I use are just +, -, *, /, and they are implemented in CAMPARY, but I do not know how to declare the arrays on the CPU and GPU and transfer them (preserving the precision) from the CPU (GPU) to the GPU (CPU). If you can provide a basic example, that would be great!

Thanks

I was under the impression that CAMPARY is open-source software. If so, “Use the source, Luke!”

Thanks for the suggestion.

The problem is that the source does not provide documentation, just the `.h` files.

My needs could be met with a basic example (CPU-GPU transfer of multi-precision arrays preserving the precision), instead of diving into the code.

Yes, “use the source” means diving into the code. Not every open-source project comes with docs and/or neat examples. CAMPARY is a header-file library, so the entire source is in those `.h` files.

A quick look at the header files indicates that the multi-precision types are simply arrays of doubles (e.g. quadruple precision: four doubles), so multi-precision operands can be copied trivially between CPU and GPU.

CGBN seems to be a good solution: https://github.com/NVlabs/CGBN (CUDA Accelerated Multiple Precision Arithmetic using Cooperative Groups).

However, it has not been updated for 2 years and does not support the Turing architecture and later. Do the maintainers intend to add support for the latest architectures?

I used the library for my project on the Ampere architecture and I didn't have any problems. Just change the `-arch` flag in your `nvcc` command. I used `-arch=sm_80` for the Ampere architecture. Try updating the Makefiles with this and test on your device.

For example, in the makefile for `samples/sample_01_add` I used this:

```
nvcc $(INC) $(LIB) -I../../include -arch=sm_80 add.cu -o add -lgmp
```