Hi everybody,
I am a new member, and I have a question about reading data from global memory into
shared memory using float2 or float4.
How does it work?
Look at this code:
__global__ void GPU_Func(CvPoint mask_point, float *masks, CvPoint center, int winSize,
                         int imgW, int imgH, int patch_width)
{
    center.x = (int)threadIdx.x;
    center.y = (int)blockIdx.x + (int)blockIdx.y * 2;
    int th = (int)threadIdx.x;
    if (center.x >= imgW) return;
    if (center.y >= imgH) return;
    extern __shared__ float data[];
    // leftmost position in the masks: each thread owns 5 floats of shared memory
    float *s_mask = data;
    s_mask += th * 5;
    float3 *mask_start_ptr3;
    float2 *mask_start_ptr2;
    float3 *float3_read;
    float2 *float2_read;
    // read mask point: first 3 floats as one float3
    float3_read = (float3 *)s_mask;
    mask_start_ptr3 = (float3 *)(masks + (center.y * winSize + mask_point.y + 1) * imgW
                      - patch_width + center.x * patch_width + mask_point.x - 1);
    float3_read[0] = mask_start_ptr3[0];
    // then 2 more floats as one float2, stored right after the first 3
    s_mask += 3;
    float2_read = (float2 *)s_mask;
    mask_start_ptr2 = (float2 *)(masks + (center.y * winSize + mask_point.y) * imgW *
                      patch_width + center.x * patch_width + mask_point.x);
    float2_read[0] = mask_start_ptr2[0];
    s_mask -= 3;
}
In this code I am trying to read 5 float numbers from an array of floats of size
imgW * imgH * patch_width * winSize.
First I want to read 3 floats from a specified location, then I want to read 2 more
floats.
The first 3 floats are read correctly, but the other 2 floats are not.
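One thing I am unsure about is alignment. As far as I know, a float2 access needs an
address that is a multiple of 8 bytes, while float3 only needs 4-byte alignment. In
shared memory, s_mask + 3 for th = 0 sits 3 floats = 12 bytes past the start of data,
which is not a multiple of 8, so that may be related. For comparison, here is a minimal
sketch of the read I am after, written with plain float loads, which only need 4-byte
alignment. The names read5_scalar, run_check, offset and out are made up for this
sketch and are not part of my real code; the indexing is simplified to one offset.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread copies 5 consecutive floats from global
// memory into its own 5-float slot of shared memory, one float at a time.
__global__ void read5_scalar(const float *src, int offset, float *out)
{
    extern __shared__ float data[];
    float *s_mask = data + threadIdx.x * 5;           // this thread's slot
    const float *g = src + offset + threadIdx.x * 5;  // may be any float address

    for (int i = 0; i < 5; ++i)
        s_mask[i] = g[i];                             // scalar loads, no float2/float3

    // Copy the slot back to global memory so the host can inspect it.
    for (int i = 0; i < 5; ++i)
        out[threadIdx.x * 5 + i] = s_mask[i];
}

// Hypothetical host-side check: returns the number of floats that differ
// from the expected source values.
int run_check()
{
    const int threads = 4, offset = 3;                // odd offset on purpose
    const int n_out = threads * 5, n_src = offset + n_out;

    std::vector<float> h_src(n_src), h_out(n_out, -1.0f);
    for (int i = 0; i < n_src; ++i) h_src[i] = (float)i;

    float *d_src = nullptr, *d_out = nullptr;
    cudaMalloc(&d_src, n_src * sizeof(float));
    cudaMalloc(&d_out, n_out * sizeof(float));
    cudaMemcpy(d_src, h_src.data(), n_src * sizeof(float), cudaMemcpyHostToDevice);

    // Third launch parameter: dynamic shared memory, 5 floats per thread.
    read5_scalar<<<1, threads, n_out * sizeof(float)>>>(d_src, offset, d_out);
    cudaMemcpy(h_out.data(), d_out, n_out * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_src);
    cudaFree(d_out);

    int mismatches = 0;
    for (int k = 0; k < n_out; ++k)
        if (h_out[k] != h_src[offset + k]) ++mismatches;
    return mismatches;
}

int main()
{
    int bad = run_check();
    printf("mismatches: %d\n", bad);
    return bad != 0;
}
```

This version is probably slower than vector loads, but it does not depend on where the
5 floats start. If the float2 version needs to stay, I guess both the global address
and the shared address would have to be 8-byte aligned.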
Can anyone help, please?