I have a problem with arrays on the device and I can't see what I am doing wrong.
Below is a simplified version of my code that shows the problem.
In my __global__ function, I declare a local array local_tableau_mots and an integer k that depends on the thread ID.
If I set k=ThreadId%623, the kernel takes a very long time (1 or 2 seconds).
If I set k=0, or any other fixed value, there is no problem (0.10 s).
In the __device__ function next_normal_MT, I read one element of the array and then write the value I just read back to the same index. If I delete the write, everything is fine...
So something is really wrong... :(
Nothing shows up in EmuDebug mode.
Usually this kind of problem comes from a bad memory access.
Is there a problem with declaring an array of 624 unsigned int in a kernel? I thought there were something like 8000 registers (32-bit words) and 16 kB of shared memory.
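To check those numbers rather than rely on my memory, here is a minimal sketch that asks the runtime for the limits and compares them with the size of my array. It assumes the card is device 0; local_array_bytes is just a helper name I made up for this post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Bytes needed by one thread's private array of n_words 32-bit words.
static size_t local_array_bytes(int n_words)
{
    return static_cast<size_t>(n_words) * sizeof(unsigned int);
}

int main()
{
    const int n_words = 624;  // size of local_tableau_mots in my kernel

    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);  // device 0 is an assumption
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    std::printf("registers per block     : %d\n", prop.regsPerBlock);
    std::printf("shared memory per block : %zu bytes\n", prop.sharedMemPerBlock);
    std::printf("one %d-word array       : %zu bytes\n", n_words, local_array_bytes(n_words));
    // How many threads could keep a full array in registers, if that were possible at all.
    std::printf("arrays fitting in regs  : %d\n", prop.regsPerBlock / n_words);
    return 0;
}
```

With the figures I quoted (about 8000 registers), only about a dozen threads could hold a 624-word array each in registers, so I suppose the array cannot really live there for a normal block size.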
Here is my example. Thanks in advance.
    __global__ void GaussKernelMT(int nb_trajectoires, float nb_jour, int index_premier_k,
                                  float *d_tableau_sommes_partielles, int nb_strike, float *pd_strike,
                                  float forward, float vol, float volvol,
                                  unsigned int *d_tableau_mots, int *d_tableau_k)
    {
        float taux;
        unsigned int local_tableau_mots[624];  // per-thread array
        int jj;
        int k;
        const int THREAD_N = blockDim.x * gridDim.x;
        const int tid = blockDim.x * blockIdx.x + threadIdx.x;

        // Fill the per-thread array with a dummy value.
        for (jj = 0; jj < 624; jj++)
        {
            local_tableau_mots[jj] = 777;
        }

        k = tid % 622;
        k = 0;  // 1.7 s if this line is commented out, 0.11 s if it is active

        for (int traj = tid; traj < nb_trajectoires; traj += THREAD_N)
        {
            // taux does not depend on k, see the code below
            taux = taux_apres_trajectoire_gaussMT(nb_jour, forward, vol, volvol, local_tableau_mots, &k);
            // does not depend on k either
            remplit_tableau_somme_partielles(THREAD_N, tid, forward, taux, nb_strike, pd_strike, d_tableau_sommes_partielles);
        }
    }
    __device__ float taux_apres_trajectoire_gaussMT(float nb_jour, float forward, float vol, float volvol,
                                                    unsigned int local_tableau_n_mots[624], int *pk)
    {
        float taux;
        float vol_actuelle;
        float valeur_alea1;
        float valeur_alea2;

        taux = forward;
        vol_actuelle = vol;
        for (int j = 0; j < nb_jour; j++)
        {
            // valeur_alea1 and valeur_alea2 are always 0.5f (see next_normal_MT below),
            // so taux has the same value for every k.
            next_normal_MT(local_tableau_n_mots, pk, &valeur_alea1, &valeur_alea2);
            taux += vol_actuelle * valeur_alea1;
        }
        return taux;
    }
    // mt_nn is a constant defined elsewhere in the full code.
    __device__ void next_normal_MT(unsigned int ptableau_n_mots[mt_nn], int *pk, float *palea1, float *palea2)
    {
        int k;
        unsigned int prochain_mot;

        k = (*pk);
        prochain_mot = ptableau_n_mots[k];   // read at index k
        ptableau_n_mots[k] = prochain_mot;   // write back at the same index: if I delete this line the problem disappears!
        (*palea1) = 0.5f;
        (*palea2) = 0.5f;
    }
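In case it helps anyone reproduce this, here is a smaller standalone sketch of the same pattern (a per-thread array, then a read and write-back at index k), timed with CUDA events. The names repro_kernel and time_kernel_ms, the launch sizes and the iteration count are made up for this post and are not from my real code. I also cannot promise the compiler treats this reduced version the same way as the full kernel.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define MT_NN 624  // same array size as in my kernel

// Hypothetical reduced kernel: read then write back words[k], with k fixed or per-thread.
__global__ void repro_kernel(float *out, int n_iter, int use_tid_index)
{
    unsigned int words[MT_NN];
    const int tid = blockDim.x * blockIdx.x + threadIdx.x;

    for (int j = 0; j < MT_NN; j++) {
        words[j] = 777u;
    }

    const int k = use_tid_index ? (tid % MT_NN) : 0;
    float acc = 0.0f;
    for (int j = 0; j < n_iter; j++) {
        unsigned int w = words[k];  // read at index k
        words[k] = w;               // write back at the same index
        acc += 0.5f;
    }
    // Use the array in the result so it is not removed entirely.
    out[tid] = acc + (float)(words[k] - 777u);
}

// Launches the kernel once and returns the elapsed time in milliseconds.
static float time_kernel_ms(int use_tid_index, int blocks, int threads, int n_iter)
{
    float *d_out = NULL;
    cudaMalloc(&d_out, (size_t)blocks * threads * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    repro_kernel<<<blocks, threads>>>(d_out, n_iter, use_tid_index);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_out);
    return ms;
}

int main()
{
    const int blocks = 64, threads = 128, n_iter = 1000;  // arbitrary sizes
    std::printf("k = 0         : %f ms\n", time_kernel_ms(0, blocks, threads, n_iter));
    std::printf("k = tid %% %d : %f ms\n", MT_NN, time_kernel_ms(1, blocks, threads, n_iter));
    return 0;
}
```

The first launch also pays the one-time CUDA start-up cost, so it is probably fairer to run each case twice and compare the second timings.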