hardware random number generator please!

many algorithms used for gpgpu use random numbers, and although one can implement a mersenne twister on a gpgpu, that uses up a lot of processing power and memory. whereas if you put a hardware random number generator on the chip (like a “flat” row that’s just 32 random bit generators in parallel), that would use up very little circuit area and you’d get REAL (or close to it) random numbers in about the same number of clock cycles it takes to do one basic operation. (then if someone wanted to increase the randomness they could just xor a series of them together.)

also, there are many algorithms/methods (e.g. optimization heuristic and metaheuristic algorithms) that could benefit from a much faster random number generator, but are never used because it’s a lot faster to use an approximate algorithm that uses far fewer random numbers. but if you don’t have to use that approximate method, if you can generate random numbers much more cheaply (cheaper than pseudo-random generation), then you can use the better algorithms with better convergence times.

i keep coming back to this problem, where i think “if only i had a hardware random number generator on the gpu”. and i don’t imagine it’s technically that difficult to do (you’ve got a lot of available noise sources), and the circuit area cost would be minimal.

Here you go, a hardware random number generator

i was looking for something a little, err… smaller. i tried placing a few of those on my gpu and it didn’t seem to do anything. i considered placing them on the fan, but i’d have to turn my computer upside-down.

Did you look into the CURAND library in the 3.2 toolkit?

pseudo-random number generators are just not fast/efficient enough for what i’m thinking of. for optimal convergence times i need random number generation in the innermost loop.

without hardware generation (as an opcode with high throughput, say comparable to an integer multiplication), my best option is to use a substantially different algorithm that will have a significantly worse convergence rate. i’m sure there are other problems like this, which is why i think the next generation of cards might benefit from a hardware random number generator. this is a suggestion to nvidia, not a tech question.

With CURAND you can generate the random numbers in advance and then use them in your inner loop by simply reading them.

unless you’re recycling the random numbers, it still amounts to the same total compute time. and if you are recycling them, well, you’re losing randomness. and anyway you could recycle them regardless of the method you used to generate them.

on cpu code that requires a lot of random numbers, i generally reuse my random numbers (using xors to shake things up a bit), but occasionally i have to refresh the set to avoid periodicity. it turns out to be like a 10x speed increase, but the generation is still the bottleneck of that “fast_rand” function, and when “fast_rand” is in the innermost loop, that means the generation, even with recycling, is still the bottleneck.

actually, come to think of it, if i can come up with a good chaotic system (chaos theory) i should be able to mix up a pregenerated set of randoms pretty well, thus getting a high recycling rate at low cpu cost.

here’s some small pseudo-code for chaotic mixing:

int fast_rand() {
	static int pool[64]; // seed this with real random numbers before first use
	static int cur = 0;
	static int last = 0;
	static int count = 0;
	static const int max = 64 * 64;

	// fold a fresh pseudorandom number into the mix occasionally
	if (count == 0) {
		pool[cur] = generateNewRandom(); // any real prng, e.g. mersenne twister
		count = max;
	}
	count--;

	int res = pool[cur];       // get our random number
	pool[cur] += last;         // the "mixing", part 1
	cur = res & 0x3F;          // = res % 64; the "mixing", part 2
	last = (res >> 1) + count; // move the bits around and add an offset to decrease periodicity
	return res;
}

that should be relatively quick and chaotic.

probably good enough.
