Any support for twelve-core AMD processors?

Dr_Skids · April 13, 2010, 3:16pm

Hello PGF users,
I have access to a system with AMD 12-core processors (2 per node).
The compiler is PGI Fortran 10.3.0 (others available). While it is compiling I notice that it is using “-tp istanbul-64” even though I did not ask for it.

I realise the sysadmin on the site provides a wrapper for users that should set the target processor. However, I thought it would be set to something like “magny-cours”.

Is the istanbul target still valid (the 12-cores are two hex-core dies anyway)?
Is there a “magny-cours” target, or will it be mc12? If so, will there truly be any difference in performance? What can the compiler do differently from the istanbul optimisations?

cheers,

Dr. Skids

MatColgrove · April 13, 2010, 3:38pm

Hi Dr.Skids,

A Magny-Cours uses two Istanbul hex-core dies, so from the compiler's perspective it is the same architecture. The optimizations would be the same for both.

When benchmarking a Magny-Cours system, the main issue I found is memory binding. The two Istanbul dies each have their own memory node, which is different from most systems. So the first time I ran parallel code my time was about half of the expected time because I was binding thinking that it had one node per socket. Use “numactl --hardware” to see the core-to-memory-node mapping.
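As a rough illustration of reading that mapping, here is a minimal Python sketch. The SAMPLE text is a made-up approximation of “numactl --hardware” output for a node with two 12-core sockets (four memory nodes of six cores each), and the function names and “./a.out” are hypothetical. Real core numbering on your system may well differ, so check the actual output before binding.

```python
import re

# Hypothetical sample of `numactl --hardware` output (assumed layout:
# two 12-core sockets, i.e. four memory nodes of six cores each).
SAMPLE = """\
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5
node 0 size: 8192 MB
node 1 cpus: 6 7 8 9 10 11
node 1 size: 8192 MB
node 2 cpus: 12 13 14 15 16 17
node 2 size: 8192 MB
node 3 cpus: 18 19 20 21 22 23
node 3 size: 8192 MB
"""

# Matches lines such as "node 1 cpus: 6 7 8 9 10 11".
CPU_LINE = re.compile(r"^node\s+(\d+)\s+cpus:(.*)$")


def parse_node_cpus(text):
    """Return {memory node id: [core ids]} parsed from numactl text."""
    mapping = {}
    for line in text.splitlines():
        match = CPU_LINE.match(line.strip())
        if match:
            cores = [int(c) for c in match.group(2).split()]
            mapping[int(match.group(1))] = cores
    return mapping


def bind_commands(mapping, program="./a.out"):
    """Build one numactl command per memory node so that a process's
    cores and memory stay on the same node."""
    return [
        f"numactl --cpunodebind={node} --membind={node} {program}"
        for node in sorted(mapping)
    ]


if __name__ == "__main__":
    nodes = parse_node_cpus(SAMPLE)
    for node, cpus in sorted(nodes.items()):
        print(f"memory node {node}: cores {cpus}")
    for cmd in bind_commands(nodes):
        print(cmd)
```

The point is only that there are four memory nodes here rather than two, so binding one process per socket would span two memory nodes.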

Mat

Dr_Skids · April 23, 2010, 9:49am

Hi PG support,
just a further question:

As these hex-cores are Istanbul, should I use -mp=numa or -mp=nonuma?

What is the difference between these options?

(I believe it was mainly to do with libnuma availability in the OS)

So perhaps just using “-mp” with no flag is best?

Dr. Skids

MatColgrove · April 23, 2010, 8:01pm

Hi Dr.Skids,

The default for “-mp” is to use NUMA (-mp=numa) and I would recommend using it on Magny-Cours. NUMA will attempt to place each thread’s data on the memory node associated with the core it’s running on. Without NUMA, the memory may be allocated on any of the memory nodes.

Hope this helps,
Mat