Device read/write and host-device cudaMemcpy memory bandwidth questions

Hi,
1. My kernel reads elements from one device array, d_Src, and writes them into another, d_Dst. When the array elements are int4, it’s blazing fast; when I change int4 to Pos, where Pos is my own struct:
struct Pos
{
    int x, y, z, w;
};
it’s 4 to 5 times slower. Why is that?
BTW, the int4 bandwidth is roughly 70 GB/s. What’s the peak bandwidth of G80, please?

2. I use cudaMemcpy() to test host-device bandwidth: roughly, copy-in is 0.6 GB/s and copy-out is 0.8 GB/s. What’s the peak copy-in/copy-out bandwidth, please?
Thanks!
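
For reference, figures like the ones quoted above are normally obtained as bytes moved divided by elapsed time. The sketch below shows that arithmetic only; it is not a measurement harness. The function names are hypothetical, 1 GB is taken as 1e9 bytes, and it assumes the kernel figure counts both the read from d_Src and the write to d_Dst.

```cpp
#include <cassert>
#include <cstddef>

// Effective bandwidth in GB/s (1 GB = 1e9 bytes, an assumption of this
// sketch) for a transfer that moved `bytes` bytes in `seconds` seconds.
inline double bandwidthGBs(double bytes, double seconds) {
    assert(seconds > 0.0);  // a zero or negative duration is meaningless
    return bytes / seconds / 1e9;
}

// A device-to-device copy kernel reads each element once and writes it
// once, so it moves 2 * count * elemSize bytes in total.
inline double copyKernelBytes(std::size_t count, std::size_t elemSize) {
    return 2.0 * static_cast<double>(count) * static_cast<double>(elemSize);
}

// A host-device cudaMemcpy moves each byte across the bus once.
inline double memcpyBytes(std::size_t count, std::size_t elemSize) {
    return static_cast<double>(count) * static_cast<double>(elemSize);
}
```

As a made-up example, copying 1,000,000 16-byte elements with a kernel that takes 0.5 ms moves 32,000,000 bytes, which works out to about 64 GB/s.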

See section 6.1.2.1 of the programming guide.

Peter

Thank you very much.

1. If I write

struct align(16) {int x, y, z, w};

then where do I put the struct name “Pos”?

“struct Pos align(16) {int x, y, z, w};”,

“struct align(16) Pos {int x, y, z, w};”,

and every other ordering I tried all give errors.

2. The bandwidth of global memory reads/writes, and that of host-device cudaMemcpy, don’t seem to be mentioned in 6.1.2.1?

Thanks!

  1. struct __align__ Name {};

  2. section 5.1

It doesn’t hurt if you read the entire manual btw.

Peter

Thanks!

1. Has anybody gotten the code below to compile in a .cu file? Thanks!

struct align Pos
{
    int x;
    int y;
    int z;
    int w;
};

2. I still don’t know the bandwidths, in GB/s.

If you change the align to align(16), I think your Pos struct will be identical to the int4 struct defined in vector_types.h.

Mark

Thanks! I’m using SDK 0.8.

This compiles:

struct Pos
{
    int4 x;
};

This also compiles:

struct /*align(16)*/ Pos
{
    int x;
    int y;
    int z;
    int w;
};

But this fails to compile, with the following errors:

struct align(16) Pos
{
    int x;
    int y;
    int z;
    int w;
};

error:
      expected an identifier
struct align(16) Pos
             ^

error:
      expected a ";"
struct align(16) Pos
                ^

OK, maybe I wasn’t verbose enough. The following compiles and works nicely for me:

struct __align__(16) Pos
{
  int x;
  int y;
  int z;
  int w;
};

As I said, section 5.1. I cite

Peter
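
For readers wondering why the alignment specifier matters: a likely explanation for the slowdown is that, without __align__(16), the compiler can only assume 4-byte alignment for Pos, so it has to issue four separate 32-bit loads and stores per element, whereas a 16-byte-aligned struct can be moved with a single 128-bit access, as int4 is. The sketch below only checks the layout side of that claim. The names PosPlain, PosAligned, POS_ALIGN16, isAligned16 and checkArrayAlignment are made up for illustration, and the alignas fallback is an assumption so that the file also builds with a plain host C++11 compiler instead of nvcc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// __align__ is nvcc-specific; fall back to standard alignas when this file
// is built with an ordinary host C++11 compiler (assumption of this sketch).
#ifdef __CUDACC__
#define POS_ALIGN16 __align__(16)
#else
#define POS_ALIGN16 alignas(16)
#endif

// Plain struct: four ints with no padding, but only int-aligned.
struct PosPlain { int x, y, z, w; };

// Same members, but every object starts on a 16-byte boundary.
struct POS_ALIGN16 PosAligned { int x, y, z, w; };

static_assert(sizeof(PosPlain) == 4 * sizeof(int), "no padding expected");
static_assert(sizeof(PosAligned) == 16, "assumes 32-bit int");
static_assert(alignof(PosAligned) == 16, "16-byte aligned");

// Returns true when p sits on a 16-byte boundary.
inline bool isAligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Runtime check: every element of an array of PosAligned is 16-byte
// aligned, because sizeof(PosAligned) is a multiple of its alignment.
inline void checkArrayAlignment() {
    PosAligned a[4] = {};
    for (const PosAligned& p : a) {
        assert(isAligned16(&p));
    }
}
```

Both structs have the same size; only the guaranteed alignment differs, and that guarantee is what the compiler needs before it can treat an element as one 16-byte unit.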

Thank you very much! I was using the old Jan 15 programming guide, sorry for the trouble :) My account currently doesn’t have the permissions to install the Feb 6 release. Please forgive me for following the align(16) sample code in the Jan 15 guide, which might be a typo.

Thanks again :)