I have an application that I’m running on an 8-core (4 dual-core CPUs) NUMA system running Linux Fedora Core 4. I would like the Portland compiler to spread the application (which is normally 1 process) over more processors to speed it up. I compiled the application with -Mconcur=numa (it’s dynamically linked) and ran it, but I can see that it’s only using 1 CPU. All other CPUs show 0% usage. Also, the runtime appears to be slightly longer than when I don’t use -Mconcur.

Do you know of anything that I could be missing?


Did you set the NCPUS or the OMP_NUM_THREADS environment variables? If so, check that you both compiled and linked with the -Mconcur option.
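
For reference, a minimal sketch of what that could look like from a bash prompt. The names app.c and app are hypothetical stand-ins for your real sources and binary, and the value 8 simply assumes one thread per core on your machine. Adding -Minfo should make the compiler report which loops it parallelized, if any.

```shell
# Hypothetical file names: app.c / app stand in for the real application.
# Assumption: one thread per core on the 8-core system.
NCPUS=8
export NCPUS

if command -v pgcc >/dev/null 2>&1; then
    # -Mconcur has to be given on the compile step AND on the link step.
    # -Minfo prints informational messages, including parallelized loops.
    pgcc -Mconcur=numa -Minfo -c app.c &&
    pgcc -Mconcur=numa -o app app.o &&
    ./app
else
    # Guard so the snippet does nothing harmful where PGI is not installed.
    echo "pgcc not found; skipping build"
fi
```

If -Minfo reports no parallelized loops, the compiler most likely found nothing it could safely auto-parallelize, which would also explain seeing only 1 busy CPU.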

Hope that helps,