I am currently simulating 10 million particles on a single GPU. Can I increase the simulation to 20 million particles using multiple GPUs? Could you please suggest references and ways to do so?

N. Wilt’s “CUDA Handbook” contains a detailed description of a multi-GPU implementation of the N-body problem in Chapter 9.
I think you could download the source code at
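
As a rough illustration of one common way to split a direct-sum N-body step across GPUs (a minimal sketch of my own, not the code from the book): every GPU gets a full copy of the positions and computes accelerations for its own slice of the bodies. The names `accelSlice` and `accelMultiGpu` are hypothetical, masses are assumed to be stored in the `w` component, and `softening2` is assumed to be strictly positive so that the self-interaction term evaluates to zero.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <vector>

// One thread per body in [first, first + count): sum the acceleration due to
// all n bodies. pos[j].w is the mass of body j. Requires softening2 > 0.
__global__ void accelSlice(const float4* pos, float3* acc,
                           int first, int count, int n, float softening2)
{
    int k = blockIdx.x * blockDim.x + threadIdx.x;
    if (k >= count) return;
    float4 pi = pos[first + k];
    float3 a = make_float3(0.0f, 0.0f, 0.0f);
    for (int j = 0; j < n; ++j) {
        float4 pj = pos[j];
        float dx = pj.x - pi.x, dy = pj.y - pi.y, dz = pj.z - pi.z;
        float invR = rsqrtf(dx * dx + dy * dy + dz * dz + softening2);
        float s = pj.w * invR * invR * invR;   // m_j / r^3
        a.x += dx * s; a.y += dy * s; a.z += dz * s;
    }
    acc[k] = a;   // acc is local to this GPU's slice
}

// Hypothetical host driver: splits the bodies evenly over all visible GPUs.
cudaError_t accelMultiGpu(const std::vector<float4>& pos,
                          std::vector<float3>& acc, float softening2)
{
    int numGpus = 0;
    cudaError_t status = cudaGetDeviceCount(&numGpus);
    if (status != cudaSuccess) return status;
    if (numGpus == 0) return cudaErrorNoDevice;

    const int n = static_cast<int>(pos.size());
    acc.resize(n);
    const int chunk = (n + numGpus - 1) / numGpus;
    std::vector<float4*> dPos(numGpus, nullptr);
    std::vector<float3*> dAcc(numGpus, nullptr);
    std::vector<int> first(numGpus, 0), count(numGpus, 0);
    auto note = [&status](cudaError_t e) { if (status == cudaSuccess) status = e; };

    // Phase 1: upload positions and launch on every GPU. Kernel launches are
    // asynchronous, so the GPUs work on their slices concurrently.
    for (int g = 0; g < numGpus && status == cudaSuccess; ++g) {
        first[g] = std::min(g * chunk, n);
        count[g] = std::min(chunk, n - first[g]);
        if (count[g] <= 0) continue;
        note(cudaSetDevice(g));
        note(cudaMalloc(&dPos[g], n * sizeof(float4)));
        note(cudaMalloc(&dAcc[g], count[g] * sizeof(float3)));
        if (status != cudaSuccess) break;
        note(cudaMemcpy(dPos[g], pos.data(), n * sizeof(float4),
                        cudaMemcpyHostToDevice));
        if (status != cudaSuccess) break;
        const int threads = 256;
        const int blocks = (count[g] + threads - 1) / threads;
        accelSlice<<<blocks, threads>>>(dPos[g], dAcc[g], first[g], count[g],
                                        n, softening2);
        note(cudaGetLastError());
    }

    // Phase 2: collect each slice (the blocking copy waits for that GPU's
    // kernel) and release device memory.
    for (int g = 0; g < numGpus; ++g) {
        if (dPos[g] == nullptr && dAcc[g] == nullptr) continue;
        cudaSetDevice(g);
        if (status == cudaSuccess && count[g] > 0)
            note(cudaMemcpy(acc.data() + first[g], dAcc[g],
                            count[g] * sizeof(float3), cudaMemcpyDeviceToHost));
        cudaFree(dPos[g]);
        cudaFree(dAcc[g]);
    }
    return status;
}
```

Calling `accelMultiGpu(pos, acc, 1e-6f)` once per time step would fill `acc` on the host, after which you integrate and repeat. Note that with this layout memory is not the limit (20 million `float4` positions take about 320 MB per GPU); the cost is the O(N²) pair count, which quadruples when N doubles, so two GPUs alone would still be roughly twice as slow per step as your current run. A production code would also keep the buffers allocated between steps and use a shared-memory tiled kernel instead of this naive inner loop.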