The driver keeps cards in a low-power state by underclocking the core and memory when the card is idle. When you establish a CUDA context, the card goes back up to full clocks, and the temperature will rise. If you have a Fermi card (and I am guessing you do), then what you are seeing is pretty normal.
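If you want to watch the clocks and temperature change when a CUDA context comes up, here is a rough Python sketch that polls `nvidia-smi`. It assumes a driver whose `nvidia-smi` supports `--query-gpu` with the `temperature.gpu`, `clocks.sm` and `clocks.mem` fields; older drivers and some GeForce cards report these as unsupported, in which case the value comes back as `None`. The function names are mine, not part of any NVIDIA tool.

```python
import subprocess

# Fields requested from nvidia-smi, in the order they appear in its CSV output.
QUERY_FIELDS = ("temperature.gpu", "clocks.sm", "clocks.mem")


def parse_gpu_line(line):
    """Turn one CSV line such as '45, 405, 324' into a dict keyed by field name.

    Values that are not plain integers (for example '[Not Supported]')
    are stored as None instead of raising.
    """
    values = [v.strip() for v in line.split(",")]
    if len(values) != len(QUERY_FIELDS):
        raise ValueError(f"expected {len(QUERY_FIELDS)} fields, got {len(values)}")
    state = {}
    for name, value in zip(QUERY_FIELDS, values):
        try:
            state[name] = int(value)
        except ValueError:
            state[name] = None
    return state


def read_gpu_state(gpu_index=0):
    """Query one GPU via nvidia-smi (assumed to be on PATH) and parse the result."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            f"--id={gpu_index}",
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return parse_gpu_line(out.strip().splitlines()[0])
```

Calling `read_gpu_state()` once while the card is idle and again while a CUDA program is running should show the jump in clocks described above, provided the driver reports those fields.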
The Linux driver does not speed up the fan all the way for some reason, even under full load. I have to enable Coolbits through the X11 config and set the fan manually to 85% or 90%. (Is there a way to do that through the command line or an API?) It gets somewhat noisy, but keeps the temperature under 60 C at full load (GTX 560 with stock cooler and more or less stock clocks).
On Fermi I haven’t been able to do that. On the GT200, nvclock used to work. Our old cluster job script for CUDA work used nvclock to ramp the fan up to full before starting the work and then ramp it back down to idle afterwards. Unfortunately, I haven’t been able to do the same thing with Fermi cards.
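For anyone who does have a working way to set the fan on their card, the ramp-up/ramp-down pattern from that job script could be sketched in Python roughly like this. This is not the original script. `fan_ramp`, `set_fan` and the default percentages (100 while busy, 40 at idle) are assumptions for illustration, and `set_fan` is a placeholder you would replace with whatever command actually works on your hardware.

```python
from contextlib import contextmanager


@contextmanager
def fan_ramp(set_fan, busy_percent=100, idle_percent=40):
    """Raise the fan before the work starts and lower it again afterwards.

    set_fan is any callable taking a fan speed in percent; it is a
    placeholder for whatever tool can control the fan on your card.
    """
    set_fan(busy_percent)
    try:
        yield
    finally:
        # Runs even if the job fails, so the fan is not left at full speed.
        set_fan(idle_percent)


if __name__ == "__main__":
    def set_fan(percent):
        # Placeholder: only prints. Swap in a real fan-control call here.
        print(f"would set fan to {percent}%")

    with fan_ramp(set_fan):
        print("running CUDA job")
```

The `try`/`finally` matters on a cluster: if the job crashes, the fan still drops back to the idle setting instead of staying loud until someone notices.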