error: calling a __host__ function("printf") from a __global__ function is not allowed

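I know this topic has come up before, but since I still fell for it I thought it worth
re-posting the solution: add -arch to the compile command line, e.g.

    nvcc -arch sm_35 -o prog prog.cu

(where prog.cu stands for your own source file and sm_35 for your own GPU's architecture).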

The logic is that only devices of compute capability 2.0 or later support printf directly,
but nvcc defaults to 1.0 rather than to the compute capability of the device attached to
your computer. I.e. the default compute level does not support printf in device code, so
the compiler sees only the host version of printf and rejects the call from the kernel.
This is what the error message has spent the last two days trying to tell me.
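
For anyone who wants to check their own setup, here is a minimal sketch of a kernel that
calls printf. The file name hello.cu and the kernel name hello are made up for
illustration, and sm_35 is just the architecture from my command above; substitute
whatever matches your GPU.

```cuda
// hello.cu (hypothetical file name): minimal device-side printf test.
// Build with an explicit architecture, e.g.:
//     nvcc -arch sm_35 -o hello hello.cu
#include <cstdio>

// Each thread prints its own block and thread index.
__global__ void hello()
{
    printf("Hello from block %d, thread %d\n", blockIdx.x, threadIdx.x);
}

int main()
{
    // Launch 2 blocks of 4 threads each.
    hello<<<2, 4>>>();

    // Device-side printf output is buffered; synchronizing flushes it
    // and also reports any launch or execution error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

If this compiles with -arch given but fails with the error above when -arch is left out,
then the default compute level is the culprit. The order in which the threads' lines
appear is not guaranteed.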

The simplePrintf example in the CUDA SDK samples has a workaround for GPUs earlier than 2.0.


    Dr. W. B. Langdon,
    Department of Computer Science,
    University College London
    Gower Street, London WC1E 6BT, UK

