Arithmetic differs greatly between emulation mode and a real run


Problem already solved:

It turns out my copying of the array IA was done incorrectly.

In emulation mode, the CPU must have gotten "lucky", for some reason, and found appropriate values every time, making it appear as if everything worked.

On the GPU, however, that "lucky" data doesn't exist, and the problem was revealed.
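For anyone finding this later: in emulation mode the kernel runs on the CPU, so it can dereference a host pointer without complaint; on the device the same pointer reads garbage. A minimal sketch of the allocation/copy that has to happen before the kernel launch (the names, sizes, and launch configuration below are mine, not from the original code):

```cuda
#include <cuda_runtime.h>

__global__ void Kernel(float* IA);  // the kernel from the post below

int main(void)
{
    const int N = 1024;              // hypothetical array length
    float h_IA[1024] = {0};          // host copy of IA, filled elsewhere

    // Allocate a device buffer and copy the host data into it.
    float* d_IA = 0;
    cudaMalloc((void**)&d_IA, N * sizeof(float));
    cudaMemcpy(d_IA, h_IA, N * sizeof(float), cudaMemcpyHostToDevice);

    // Pass the DEVICE pointer to the kernel, never h_IA itself.
    Kernel<<<1, 256>>>(d_IA);

    cudaFree(d_IA);
    return 0;
}
```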

Original Post:

For the sake of brevity I have "inlined" the header file in the code below. Of course it doesn't really look like that.

The code below works perfectly well in emulation mode and produces the expected TP values of around -1000 to 1000.

However, when run in "real" mode, the values go haywire: NaNs, infinities, and sometimes "normal" numbers, but across all ranges.

I'm currently trying to reduce the problem to a smaller size, but I get incorrect values in a lot of places, so I'm not quite sure what to think yet.

Hopefully there's something about arithmetic on the GPU that I haven't quite grasped, and someone can point it out for me.

Otherwise I’ll be back when I’m able to figure something more out.

[codebox]// header contents "inlined" here, as mentioned above
// (the forum ate the original #include line)

const int INPUT_O = 5;
const int BPV = 50;
const int TC = 5;

enum Pos {
	A = -1,
	B = 0,
	C = 1
};

__global__ void Kernel(float* IA)
{
	float TP, EP;
	int BN;
	Pos CP;

	// ... (initialization of EP, BN, CP snipped) ...

	TP = ((IA[BN + INPUT_O] - EP) * CP) * BPV - TC;
}[/codebox]

Wild guess: Is IA a pointer to host memory?

You got very close :P

Thanks :)