Are there any built-in intrinsics for dot and cross products?
I don't see any definitions of those in the headers.
Or is the common way to do this to write something like:
inline __device__ float dot(float3 a, float3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
and let the compiler vectorize the code?
I use a lot of dot and cross products (on float3 and float4 data types)
in my kernel, and I'm curious whether there is any 'optimized intrinsic' for this.
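
For reference, here is a minimal sketch of the hand-written helpers I have in mind, in case there is nothing built in. The names dot3, cross3 and the HD macro are my own placeholders, not anything from the CUDA headers. The non-CUDA branch is only a stand-in so the same file also builds with a plain C++ compiler for a quick host-side check.

```cpp
// Sketch only: dot3, cross3 and HD are hypothetical names, not CUDA built-ins.
#ifdef __CUDACC__
#include <cuda_runtime.h>   // provides float3 and make_float3
#define HD __host__ __device__
#else
// Host-only stand-ins so the sketch also compiles without nvcc.
struct float3 { float x, y, z; };
inline float3 make_float3(float x, float y, float z) { return float3{x, y, z}; }
#define HD
#endif

// Dot product of two 3-component vectors.
HD inline float dot3(float3 a, float3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Cross product of two 3-component vectors (right-handed convention).
HD inline float3 cross3(float3 a, float3 b)
{
    return make_float3(a.y * b.z - a.z * b.y,
                       a.z * b.x - a.x * b.z,
                       a.x * b.y - a.y * b.x);
}
```

Inside a kernel these would be called like any other device function, e.g. float3 n = cross3(e1, e2); with e1 and e2 being two edge vectors. For float4 I would presumably write a fourth term for w in the dot product, but I have left that out here.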