Is it possible to process a lot of pictures simultaneously?


I have a lot of pictures (800 pictures/sec), and each picture is 1280x40.

I have one card with 480 cores.

In my algorithm I use only 1280 threads (3 blocks). Is it possible to process several pictures simultaneously?

What are you doing with these 1280x40 pictures (are you sure they are 1280x40)?

You have one CUDA card with 480 CUDA cores, probably a GeForce GTX 480 or GTX 570, so a Fermi card.

You need at least 2880 threads to occupy the compute units of the card, and you may consider 11520 threads (that is 9 pictures in parallel, given that you use 1280 threads per picture).
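The arithmetic behind these figures can be sketched as below. This is only an illustration: the helper names `total_threads` and `frames_needed` are hypothetical, one thread per picture column is an assumption drawn from the 1280 threads mentioned in the question, and the device query simply reports what the installed card (device 0 is assumed) can keep resident.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: threads launched when `frames` pictures are processed
// together, each with `threads_per_frame` threads.
constexpr int total_threads(int frames, int threads_per_frame) {
    return frames * threads_per_frame;
}

// Hypothetical helper: smallest number of pictures whose combined thread
// count reaches `min_threads` (ceiling division).
constexpr int frames_needed(int min_threads, int threads_per_frame) {
    return (min_threads + threads_per_frame - 1) / threads_per_frame;
}

int main() {
    const int threads_per_frame = 1280;  // from the question (assumed: one thread per column)
    const int min_threads = 2880;        // lower bound quoted in the reply

    // 9 pictures at 1280 threads each give the 11520 threads quoted in the reply.
    assert(total_threads(9, threads_per_frame) == 11520);

    printf("pictures needed to reach %d threads: %d\n",
           min_threads, frames_needed(min_threads, threads_per_frame));
    printf("threads with 9 pictures in flight: %d\n",
           total_threads(9, threads_per_frame));

    // Compare with what the installed device can actually keep resident.
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) == cudaSuccess) {
        int resident = prop.multiProcessorCount * prop.maxThreadsPerMultiProcessor;
        printf("%s: %d multiprocessors, up to %d resident threads\n",
               prop.name, prop.multiProcessorCount, resident);
    } else {
        printf("no CUDA device found\n");
    }
    return 0;
}
```

With 1280 threads per picture, a single picture stays below the 2880-thread lower bound, which is why batching several pictures per kernel launch is suggested.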

If you don't need real-time processing, I would consider:

    Sending 18 pictures (9x2) to the card first

    Processing the first 9 pictures on the card, then sending back the results

    Processing the next 9 while the CPU asynchronously sends the next batch of 9 pictures


More pictures in the buffer might be necessary to limit the transfer time per picture.
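The batching scheme above could be sketched with two CUDA streams and two sets of buffers (double buffering), so that one batch is copied while the other is processed. This is a minimal sketch under stated assumptions, not a definitive implementation: the kernel `find_peaks`, the helper `peak_row`, and the idea that the laser is the brightest pixel of each column are all hypothetical, and `fill_batch` generates synthetic pictures in place of a real camera. Copies can overlap with kernels only on devices that support concurrent copy and execution (`asyncEngineCount > 0` in `cudaDeviceProp`).

```cuda
#include <cassert>
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

constexpr int W = 1280, H = 40;  // picture size from the question
constexpr int BATCH = 9;         // pictures per batch, as suggested above
constexpr int NBATCH = 6;        // hypothetical number of batches for this demo

#define CHECK(call) do { cudaError_t e_ = (call); if (e_ != cudaSuccess) { \
    printf("CUDA error: %s\n", cudaGetErrorString(e_)); return 1; } } while (0)

// Assumed algorithm: the laser is the brightest pixel of each column.
__host__ __device__ int peak_row(const unsigned char* img, int x) {
    int best = 0;
    for (int y = 1; y < H; ++y)
        if (img[y * W + x] > img[best * W + x]) best = y;
    return best;
}

// One thread per column; blockIdx.y selects the picture inside the batch.
__global__ void find_peaks(const unsigned char* batch, int* rows) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int f = blockIdx.y;
    if (x < W) rows[f * W + x] = peak_row(batch + (size_t)f * W * H, x);
}

// Synthetic data: picture f of batch b has one bright line on this row.
constexpr int laser_row(int b, int f) { return (b * BATCH + f) % H; }

void fill_batch(unsigned char* buf, int b) {
    memset(buf, 0, (size_t)BATCH * W * H);
    for (int f = 0; f < BATCH; ++f)
        memset(buf + ((size_t)f * H + laser_row(b, f)) * W, 255, W);
}

void check_batch(const int* rows, int b) {
    for (int f = 0; f < BATCH; ++f)
        for (int x = 0; x < W; ++x) assert(rows[f * W + x] == laser_row(b, f));
}

int main() {
    const size_t in_bytes = (size_t)BATCH * W * H;
    const size_t out_bytes = (size_t)BATCH * W * sizeof(int);
    unsigned char *h_in[2], *d_in[2];
    int *h_out[2], *d_out[2];
    cudaStream_t stream[2];
    for (int i = 0; i < 2; ++i) {
        // Pinned host memory is needed for copies that are truly asynchronous.
        CHECK(cudaMallocHost((void**)&h_in[i], in_bytes));
        CHECK(cudaMallocHost((void**)&h_out[i], out_bytes));
        CHECK(cudaMalloc((void**)&d_in[i], in_bytes));
        CHECK(cudaMalloc((void**)&d_out[i], out_bytes));
        CHECK(cudaStreamCreate(&stream[i]));
    }
    dim3 block(256), grid((W + block.x - 1) / block.x, BATCH);
    for (int b = 0; b < NBATCH; ++b) {
        int i = b % 2;                             // alternate between the two buffer sets
        CHECK(cudaStreamSynchronize(stream[i]));   // batch b-2 used these buffers and is done
        if (b >= 2) check_batch(h_out[i], b - 2);  // consume the results of batch b-2
        fill_batch(h_in[i], b);                    // stands in for grabbing camera frames
        CHECK(cudaMemcpyAsync(d_in[i], h_in[i], in_bytes, cudaMemcpyHostToDevice, stream[i]));
        find_peaks<<<grid, block, 0, stream[i]>>>(d_in[i], d_out[i]);
        CHECK(cudaMemcpyAsync(h_out[i], d_out[i], out_bytes, cudaMemcpyDeviceToHost, stream[i]));
        // The host continues at once; the other stream may still be working.
    }
    CHECK(cudaDeviceSynchronize());
    check_batch(h_out[(NBATCH - 2) % 2], NBATCH - 2);
    check_batch(h_out[(NBATCH - 1) % 2], NBATCH - 1);
    printf("processed %d pictures\n", NBATCH * BATCH);
    for (int i = 0; i < 2; ++i) {
        cudaFreeHost(h_in[i]); cudaFreeHost(h_out[i]);
        cudaFree(d_in[i]); cudaFree(d_out[i]);
        cudaStreamDestroy(stream[i]);
    }
    return 0;
}
```

The host never waits for the kernel it has just launched; it only waits for the stream whose buffers it is about to reuse, so preparing and uploading one batch can proceed while the previous batch is still being processed. A larger `BATCH` trades latency for fewer, larger transfers.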

The picture size is 1280 (width) x 40 (height); the height depends on the height of the object.
I must find the laser position along the y axis in each picture, as in a 3D laser scan. It has to be real time.
The card is a GeForce GTX 570, as you guessed. So is it possible to use more threads (>1280)?
If not, is it possible for the threads to keep computing in a loop while I transfer the image data?
Or must I wait until the threads have finished their job before I transfer the data?