old data passed to function

I have a CUDA function that is invoked multiple times; each call differs only in the contents of an input array.
The array specifies the row, start column, and end column of each line for a region-of-interest computation.

A second array returns the result for each scan line.
I widened this array (more values per scan line) so that it also returns a copy of the input data.

The first two calls return the correct input parameters, as read back from the output array. But on the third call the input/output is the same as on the second call.

I verified that the source data changes on all three calls.
The input and output arrays are allocated and disposed on each call.

How can the array uploaded to the GPU not change from one call to the next?

Here is the calling function:

private bool CalculateOnCUDAThread(ILineCollection tempLines, ref Int64 total)
{
	// pack (row, start column, end column) for each line into a flat array
	int[] h_lines = new int[tempLines.LineCount * 3];
	for (int i = 0, j = 0; i < tempLines.LineCount; i++)
	{
		h_lines[j++] = (int)tempLines.Lines[i].Start.Y;
		h_lines[j++] = (int)tempLines.Lines[i].Start.X;
		h_lines[j++] = (int)tempLines.Lines[i].End.X;
	}
	CudaDeviceVariable<int> d_lines = h_lines;  // allocate on device and upload

	// one row of dimResults values per scan line
	int dimResults = 4;
	CudaDeviceVariable<Int64> d_results = new CudaDeviceVariable<Int64>(tempLines.FrameHeight * dimResults);

	int threadsPerBlock = Math.Min(256, tempLines.LineCount);
	cudaApertureTotal.BlockDimensions = threadsPerBlock;
	cudaApertureTotal.GridDimensions = (tempLines.LineCount + threadsPerBlock - 1) / threadsPerBlock;

	cudaApertureTotal.Run(tempLines.LineCount, d_lines.DevicePointer,
		tempLines.FrameWidth, CUDAResults.d_data.DevicePointer,
		d_results.DevicePointer, dimResults);

	Int64[] h_results = d_results;  // download results to the host

	//  release resources

	// sum the first value of every scan line
	total = 0;
	for (int y = 0; y < tempLines.FrameHeight; y++)
		total += h_results[y * dimResults + 0];

	return true;
}


I guess this is C#/managedCuda?

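Found it. I was creating the return array directly on the device with CUDA, which does not initialize the memory to 0, so any entries the kernel did not write could still hold values from an earlier call.

Changed to creating a C# array (zero-filled by default), assigning it to the device output array, calling the function, and copying the result back to the C# array.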

Now it works as expected.
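
For anyone hitting the same thing, the change can be sketched roughly as below. This is a minimal sketch, not the exact code from my project: the class name ZeroedResultsSketch, the helper SumFirstColumn, and the runKernel delegate (a stand-in for the cudaApertureTotal.Run(...) call) are hypothetical names made up for illustration. It assumes the ManagedCuda package is referenced and a CUDA context is already current on the calling thread.

```csharp
using System;
using ManagedCuda;

static class ZeroedResultsSketch
{
    // Sums the first value of every scan line in a flat
    // [frameHeight x dimResults] result buffer (host-only helper).
    public static long SumFirstColumn(long[] results, int frameHeight, int dimResults)
    {
        long total = 0;
        for (int y = 0; y < frameHeight; y++)
            total += results[y * dimResults];
        return total;
    }

    // runKernel is a hypothetical stand-in for the real kernel launch;
    // it receives the device buffer the kernel should write into.
    public static long RunWithZeroedResults(int frameHeight, int dimResults,
        Action<CudaDeviceVariable<long>> runKernel)
    {
        // C# zero-fills new arrays, so this buffer starts out as all zeros.
        long[] h_results = new long[frameHeight * dimResults];

        // Assigning the host array allocates device memory and uploads the
        // zeros, instead of leaving the device buffer uninitialized.
        using (CudaDeviceVariable<long> d_results = h_results)
        {
            runKernel(d_results);

            // Copy back: rows the kernel never touched still read as 0.
            d_results.CopyToHost(h_results);
        }

        return SumFirstColumn(h_results, frameHeight, dimResults);
    }
}
```

The extra upload costs one host-to-device copy per call; the point is only that scan lines the kernel skips now read back as 0 instead of whatever was left in device memory.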

Thanks for listening!