First, sorry for my writing. I'm Brazilian and I do not write English very well.
I upgraded my GPU to a GT 740. Previously, on a GT 430, the kernel of my project ran much faster. Both GPUs have DDR3 memory. I use VS 2013. Additionally, the GT 430 has 96 CUDA cores and 1 GB of RAM, and the GT 740 has 384 CUDA cores and 1 GB of RAM.
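To compare the two cards side by side, a minimal sketch like the one below could print what the CUDA runtime reports for each installed device. It is not part of the project; it only uses `cudaGetDeviceCount` and `cudaGetDeviceProperties`, and the choice of fields to print is just an illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Print a few basic properties for every CUDA device in the machine.
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::printf("Device %d: %s\n", dev, cudaGetErrorString(err));
            continue;
        }
        std::printf("Device %d: %s\n", dev, prop.name);
        std::printf("  compute capability:    %d.%d\n", prop.major, prop.minor);
        std::printf("  multiprocessors (SMs): %d\n", prop.multiProcessorCount);
        std::printf("  max threads per block: %d\n", prop.maxThreadsPerBlock);
        // Cast so the format specifier also works with older MSVC runtimes.
        std::printf("  global memory:         %llu MB\n",
                    (unsigned long long)(prop.totalGlobalMem >> 20));
    }
    return 0;
}
```

Running this once with each card installed would give the numbers the runtime actually sees, which may differ from the values on the box.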
On the GT 430, I compile the project with compute_20,sm_20 and configure the thread and block sizes as 192xNxM.
On the GT 740, I compile the project with compute_30,sm_30 and configure the thread and block sizes as 384xTxS.
In both cases, I clean the project. My CPU is an i5-2320 with 4 GB of RAM.
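To show how the kernel time could be measured the same way on both cards, here is a minimal sketch using CUDA events. Everything in it is an assumption for illustration: `scaleKernel` is a placeholder and not the project's kernel, the launch is read as 384 threads per block on a T x S grid of blocks, and the values 64 x 64 for T and S are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel (hypothetical): multiplies each element by a factor.
__global__ void scaleKernel(float* data, int n, float factor) {
    int block = blockIdx.y * gridDim.x + blockIdx.x;  // flatten the 2D grid
    int i = block * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int threads = 384;   // assumed threads per block (192 in the GT 430 build)
    const dim3 grid(64, 64);   // hypothetical values for T and S
    const int n = threads * (int)grid.x * (int)grid.y;

    float* d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) {
        std::printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm-up launch so one-time startup cost is not included in the timing.
    scaleKernel<<<grid, threads>>>(d, n, 2.0f);
    cudaDeviceSynchronize();

    // Timed launch: events are recorded on the GPU timeline around the kernel.
    cudaEventRecord(start);
    scaleKernel<<<grid, threads>>>(d, n, 2.0f);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        std::printf("kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    std::printf("kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    return 0;
}
```

Built once per card with the matching architecture setting (for example `-gencode arch=compute_30,code=sm_30` for the GT 740), this gives a GPU-side time in milliseconds that leaves out host overhead, so the two cards can be compared on the kernel alone.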