In the course of implementing an algorithm, I find myself having to work with complex numbers whose real and imaginary parts are well in excess of DOUBLE_MAX. Unfortunately, I wasn’t satisfied with the performance of the algorithm on a CPU; after a couple of my friends with some programming experience looked at my code, they said I wasn’t doing anything obviously wrong. The problem is embarrassingly parallel, parallel in 2^(lattice width) to be exact.
I don’t need high precision. Four digits of accuracy is fine, but anything less could be an issue.
Notably, extended-precision arithmetic would make this a non-problem for me, since I am computing Z, and 2^16384 > Z > 2^1024
I have been hoping to use CUDA, but this has me stymied. The problem in particular is: I really have no idea what to expect performance-wise from one approximation technique versus another!
I considered storing the logarithm of the relevant numbers; this works well enough for real numbers, but for complex numbers there’s a singularity encountered in about 0.01% of cases where all ordinary approximations fail and more extensive calculation is required. I don’t know if branch prediction exists on CUDA at all, though, so whether it is possible to fix this situation I am not sure. If branch prediction works at all, my problems probably disappear.
In this case the CUDA implementation of log() and exp() can help if it is fast, though I would need to use them several times in the innermost loop of the program, which itself executes (number of order parameters) * (lattice size) * 2^(lattice width) times, which is at least ten million. The algorithm itself has already undergone extensive tuning.
I have also thought of building my own floats with an integer exponent and mantissa. I guess this can work, though it’d probably be rather slow!
More generally, though, I feel like someone has probably had this problem before. I am aware of the techniques used in e.g. GMP, but I suppose I’ve figured that can’t possibly be of much use to me; I just don’t need high precision. I am really hoping there is some commonly-accepted technique for dealing with combinatorially huge numbers.