Accelerate my program

Dear all,
How can I accelerate my CUDA program? It takes far too long to produce its results.
I have already disabled TDR (Timeout Detection and Recovery).

Since you are not asking about a specific problem, I can only give you a general answer: read, for example, the Best Practices Guide in the CUDA Toolkit Documentation.

These are also good links: