I have two functions that do exactly the same job: rasterizing a triangle. They differ only in how the arguments are passed. Here are the signatures:

```
__global__ void rasterizePixel(
    byte *colorBuffer, float *depthBuffer, int width,
    Vertex v0,
    Vertex v1,
    Vertex v2,
    const CTexture* texture,
    float one_over_h0,
    float one_over_h1,
    float one_over_h2,
    int minX,
    int maxX,
    int minY,
    int maxY,
    float one_over_v0ToLine12,
    float one_over_v1ToLine20,
    float one_over_v2ToLine01,
    plane alphaPlane,
    plane betaPlane,
    plane gammaPlane,
    float one_over_alpha_c,
    float one_over_beta_c,
    float one_over_gamma_c,
    float alpha_ffx,
    float beta_ffx,
    float gamma_ffx,
    float alpha_ffy,
    float beta_ffy,
    float gamma_ffy)
```

And the second:

```
__global__ void rasterizePixel(
    byte *colorBuffer, float *depthBuffer, int width,
    TriangleToRasterize t)
```

TriangleToRasterize looks like this:

```
struct TriangleToRasterize
{
    Vertex v0, v1, v2;
    const CTexture* texture;
    float one_over_h0;
    float one_over_h1;
    float one_over_h2;
    int minX;
    int maxX;
    int minY;
    int maxY;
    float one_over_v0ToLine12;
    float one_over_v1ToLine20;
    float one_over_v2ToLine01;
    plane alphaPlane;
    plane betaPlane;
    plane gammaPlane;
    float one_over_alpha_c;
    float one_over_beta_c;
    float one_over_gamma_c;
    float alpha_ffx;
    float beta_ffx;
    float gamma_ffx;
    float alpha_ffy;
    float beta_ffy;
    float gamma_ffy;
};
```

To my surprise, the total frame time is around 8 ms when I call the first version and around 40 ms when I call the second. Any idea why passing a structure by value could be so much slower?
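
Since the numbers above are total frame times, it may help to time each kernel variant on its own with CUDA events, so the kernel cost is separated from everything else in the frame. The following is only a minimal sketch of that measurement, not the rasterizer itself: `Params`, `byArgs`, `byStruct`, `timeByArgs`, `timeByStruct` and the launch sizes are hypothetical stand-ins chosen for illustration.

```
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical launch configuration, chosen only for this sketch.
const int kBlocks  = 256;
const int kThreads = 256;

// Hypothetical stand-in for a large parameter block passed by value.
struct Params {
    float values[32];
};

// Variant 1: scalars passed as individual kernel arguments.
__global__ void byArgs(float *out, float a, float b, float c)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    out[i] = a + b + c;
}

// Variant 2: the same data passed as one struct by value.
__global__ void byStruct(float *out, Params p)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    out[i] = p.values[0] + p.values[1] + p.values[2];
}

// Returns the elapsed GPU time in milliseconds for `iterations` launches of byArgs.
float timeByArgs(float *out, int iterations)
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int i = 0; i < iterations; ++i)
        byArgs<<<kBlocks, kThreads>>>(out, 1.0f, 2.0f, 3.0f);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);   // wait until all launches have finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

// Same measurement for the struct-by-value variant.
float timeByStruct(float *out, int iterations)
{
    Params p = {};
    p.values[0] = 1.0f;
    p.values[1] = 2.0f;
    p.values[2] = 3.0f;

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int i = 0; i < iterations; ++i)
        byStruct<<<kBlocks, kThreads>>>(out, p);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main()
{
    float *out = nullptr;
    cudaMalloc(&out, kBlocks * kThreads * sizeof(float));

    // Warm-up launches so one-time initialization is not counted.
    timeByArgs(out, 1);
    timeByStruct(out, 1);

    printf("separate arguments: %f ms\n", timeByArgs(out, 100));
    printf("struct by value:    %f ms\n", timeByStruct(out, 100));

    cudaFree(out);
    return 0;
}
```

Printing `sizeof(TriangleToRasterize)` on the host alongside these timings would also show how many bytes are copied per launch in the struct version.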