inline assembly

hi,

CUDA 1.1 (i don’t know whether 1.0 could do this) supports inline assembler instructions, e.g.:

asm("ret;");

unfortunately, referencing high-level language variables by name inside the asm string does not work:

  float t = 1;
  float s = 2;
  float result;
  asm("add.f32 result, t, s;");

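the code above leads to PTXAS errors, because the asm string is passed through as-is and the names t, s and result are not declared in the generated PTX. the following lines, which declare registers with those names inside the asm string, are translated without problems (but those registers are separate from the C variables of the same name):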

  float t = 1;
  float s = 2;
  float result;
  asm(".reg .f32 t, s, result; add.f32 result, t, s;");

so currently one can either:

  • stick to HLL

  • code everything in one asm-string

  • or declare variables twice (resulting in really nice code)

is that inline-assembler “feature” likely to be improved in the near future? (i mean TRUE inline assembly without the above-mentioned drawbacks)

has anyone found another, hopefully better way to inline assembly?
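for reference, later CUDA toolkits document GCC-style operand constraints for inline PTX, which let the asm string refer to C variables through %0, %1, ... placeholders. the sketch below assumes such a toolkit; whether CUDA 1.1 accepts this syntax is not confirmed here. the names add_via_ptx, add_kernel and add_via_ptx_host are made up for illustration, and error checking is left out to keep it short.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: adds two floats with a single inline PTX instruction.
__device__ float add_via_ptx(float t, float s)
{
    float result;
    // %0 is the output operand, %1 and %2 are the inputs.
    // The "f" constraint asks the compiler for a .f32 register,
    // so no .reg declaration is needed inside the asm string.
    asm("add.f32 %0, %1, %2;" : "=f"(result) : "f"(t), "f"(s));
    return result;
}

// Hypothetical kernel: a single thread writes the sum to device memory.
__global__ void add_kernel(float t, float s, float *out)
{
    *out = add_via_ptx(t, s);
}

// Hypothetical host wrapper: launches the kernel and copies the result back.
// CUDA error codes are ignored here for brevity.
float add_via_ptx_host(float t, float s)
{
    float *d_out = nullptr;
    float h_out = 0.0f;
    cudaMalloc(&d_out, sizeof(float));
    add_kernel<<<1, 1>>>(t, s, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

calling add_via_ptx_host(1.0f, 2.0f) from host code should then give the sum of t and s without declaring anything twice.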

greetings,

tomschi