Maybe a silly question but…
Do kernel parameters always reside in shared memory?
Say for a kernel like this:
[codebox]__global__ void CudaCSGSphereKernel
(const float3 *p_CudaPointArray,
float *p_CudaResultArray,
const int p_Size,
const int p_NumberOfNodes,
const int p_Label,
const float p_Radius,
const float3 p_Center)
{
...
}[/codebox]
… the .ptx looks like this…
[codebox].reg .u16 %rh<4>;
.reg .u32 %r<16>;
.reg .f32 %f<17>;
.reg .pred %p<3>;
.param .u32 __cudaparm__Z19CudaCSGSphereKernelPK6float3PfiiifS__p_CudaPointArray;
.param .u32 __cudaparm__Z19CudaCSGSphereKernelPK6float3PfiiifS__p_CudaResultArray;
.param .s32 __cudaparm__Z19CudaCSGSphereKernelPK6float3PfiiifS__p_Size;
.param .s32 __cudaparm__Z19CudaCSGSphereKernelPK6float3PfiiifS__p_NumberOfNodes;
.param .s32 __cudaparm__Z19CudaCSGSphereKernelPK6float3PfiiifS__p_Label;
.param .f32 __cudaparm__Z19CudaCSGSphereKernelPK6float3PfiiifS__p_Radius;
.param .align 4 .b8 __cudaparm__Z19CudaCSGSphereKernelPK6float3PfiiifS__p_Center[12];
.loc 14 116 0[/codebox]
I would think these parameters are copied to shared memory once and can then be read quickly by every individual thread (since they are constant). But is this true?
Kind regards,
Daniel Dekkers