[mpich-discuss] some trouble with mpich + openmp thread affinity

Burlen Loring bloring at lbl.gov
Tue Jan 5 13:10:49 CST 2021


Hi All,

I'm trying to bind each of a number of OpenMP threads to a unique physical 
core. My system has 2 hyper-threads per core, and I want to bind only 1 
OpenMP thread per physical core, skipping the hyper-thread "slots".

My system has 10 cores (20 if you count hyper-threads). I want to run 2 
MPI ranks, each with 5 OpenMP threads, with the threads on rank 0 bound 
to cores 0,1,2,3,4 and the threads on rank 1 bound to cores 5,6,7,8,9.

I've tried combinations of -bind-to and -map-by, but the OpenMP threads 
always end up doubled up on physical cores: the hyper-thread slots are 
being used.

For example (a.out here prints the thread affinity):
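a.out is essentially the following minimal MPI + OpenMP test, built with 
something like mpicc -fopenmp (this is a sketch of the idea; the exact 
affinity-printing code in my program differs a bit, but the output format 
matches what's shown below):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <mpi.h>
    #include <omp.h>

    int main(int argc, char **argv)
    {
        int provided = 0, rank = 0, len = 0;
        char host[MPI_MAX_PROCESSOR_NAME] = "";

        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Get_processor_name(host, &len);

        #pragma omp parallel
        {
            /* the set of logical CPUs this thread may run on; with
               OMP_PROC_BIND=true it should be a single CPU */
            cpu_set_t mask;
            CPU_ZERO(&mask);
            sched_getaffinity(0, sizeof(mask), &mask);

            /* format the CPU numbers in the mask as a comma-separated list */
            char cpus[256];
            int n = 0;
            cpus[0] = '\0';
            for (int i = 0; i < CPU_SETSIZE && n < (int)sizeof(cpus) - 8; ++i)
                if (CPU_ISSET(i, &mask))
                    n += sprintf(cpus + n, "%s%d", n ? "," : "", i);

            #pragma omp critical
            printf("Hello from rank %d, thread %d, on %s. (core affinity = %s)\n",
                   rank, omp_get_thread_num(), host, cpus);
        }

        MPI_Finalize();
        return 0;
    }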

$ export OMP_PLACES=threads OMP_PROC_BIND=true OMP_DISPLAY_ENV=TRUE OMP_NUM_THREADS=5
$ mpiexec -np 2 -bind-to core:5 -map-by hwthread:10 ./a.out | sort -k 4

OPENMP DISPLAY ENVIRONMENT BEGIN
   _OPENMP = '201511'
   OMP_DYNAMIC = 'FALSE'
   OMP_NESTED = 'FALSE'
   OMP_NUM_THREADS = '5'
   OMP_SCHEDULE = 'DYNAMIC'
   OMP_PROC_BIND = 'TRUE'
   OMP_PLACES = '{0},{10},{1},{11},{2},{12},{3},{13},{4},{14}'
   OMP_STACKSIZE = '0'
   OMP_WAIT_POLICY = 'PASSIVE'
   OMP_THREAD_LIMIT = '4294967295'
   OMP_MAX_ACTIVE_LEVELS = '2147483647'
   OMP_CANCELLATION = 'FALSE'
   OMP_DEFAULT_DEVICE = '0'
   OMP_MAX_TASK_PRIORITY = '0'
   OMP_DISPLAY_AFFINITY = 'FALSE'
   OMP_AFFINITY_FORMAT = 'level %L thread %i affinity %A'
OPENMP DISPLAY ENVIRONMENT END

OPENMP DISPLAY ENVIRONMENT BEGIN
   _OPENMP = '201511'
   OMP_DYNAMIC = 'FALSE'
   OMP_NESTED = 'FALSE'
   OMP_NUM_THREADS = '5'
   OMP_SCHEDULE = 'DYNAMIC'
   OMP_PROC_BIND = 'TRUE'
   OMP_PLACES = '{5},{15},{6},{16},{7},{17},{8},{18},{9},{19}'
   OMP_STACKSIZE = '0'
   OMP_WAIT_POLICY = 'PASSIVE'
   OMP_THREAD_LIMIT = '4294967295'
   OMP_MAX_ACTIVE_LEVELS = '2147483647'
   OMP_CANCELLATION = 'FALSE'
   OMP_DEFAULT_DEVICE = '0'
   OMP_MAX_TASK_PRIORITY = '0'
   OMP_DISPLAY_AFFINITY = 'FALSE'
   OMP_AFFINITY_FORMAT = 'level %L thread %i affinity %A'
OPENMP DISPLAY ENVIRONMENT END
Hello from rank 0, thread 0, on smic.dhcp. (core affinity = 0)
Hello from rank 0, thread 1, on smic.dhcp. (core affinity = 10)
Hello from rank 0, thread 2, on smic.dhcp. (core affinity = 1)
Hello from rank 0, thread 3, on smic.dhcp. (core affinity = 11)
Hello from rank 0, thread 4, on smic.dhcp. (core affinity = 2)
Hello from rank 1, thread 0, on smic.dhcp. (core affinity = 5)
Hello from rank 1, thread 1, on smic.dhcp. (core affinity = 15)
Hello from rank 1, thread 2, on smic.dhcp. (core affinity = 6)
Hello from rank 1, thread 3, on smic.dhcp. (core affinity = 16)
Hello from rank 1, thread 4, on smic.dhcp. (core affinity = 7)

*How do I get MPICH to skip the hyper-threads (i.e. the logical CPUs 
numbered 10 and above)?*

Here's an example that did work:

    export OMP_PROC_BIND=true OMP_DISPLAY_ENV=TRUE OMP_NUM_THREADS=5
    export OMP_PLACES="{0},{1},{2},{3},{4},{5},{6},{7},{8},{9}"

    mpiexec -np 2 -bind-to core:5 -map-by core:5 ./a.out

    libgomp: Number of places reduced from 10 to 5 because some places
    didn't contain any usable logical CPUs

    OPENMP DISPLAY ENVIRONMENT BEGIN
       _OPENMP = '201511'
       OMP_DYNAMIC = 'FALSE'
       OMP_NESTED = 'FALSE'
       OMP_NUM_THREADS = '5'
       OMP_SCHEDULE = 'DYNAMIC'
       OMP_PROC_BIND = 'TRUE'
       OMP_PLACES = '{0},{1},{2},{3},{4}'
       OMP_STACKSIZE = '0'
       OMP_WAIT_POLICY = 'PASSIVE'
       OMP_THREAD_LIMIT = '4294967295'
       OMP_MAX_ACTIVE_LEVELS = '2147483647'
       OMP_CANCELLATION = 'FALSE'
       OMP_DEFAULT_DEVICE = '0'
       OMP_MAX_TASK_PRIORITY = '0'
       OMP_DISPLAY_AFFINITY = 'FALSE'
       OMP_AFFINITY_FORMAT = 'level %L thread %i affinity %A'
    OPENMP DISPLAY ENVIRONMENT END
    libgomp: Number of places reduced from 10 to 5 because some places
    didn't contain any usable logical CPUs

    OPENMP DISPLAY ENVIRONMENT BEGIN
       _OPENMP = '201511'
       OMP_DYNAMIC = 'FALSE'
       OMP_NESTED = 'FALSE'
       OMP_NUM_THREADS = '5'
       OMP_SCHEDULE = 'DYNAMIC'
       OMP_PROC_BIND = 'TRUE'
       OMP_PLACES = '{5},{6},{7},{8},{9}'
       OMP_STACKSIZE = '0'
       OMP_WAIT_POLICY = 'PASSIVE'
       OMP_THREAD_LIMIT = '4294967295'
       OMP_MAX_ACTIVE_LEVELS = '2147483647'
       OMP_CANCELLATION = 'FALSE'
       OMP_DEFAULT_DEVICE = '0'
       OMP_MAX_TASK_PRIORITY = '0'
       OMP_DISPLAY_AFFINITY = 'FALSE'
       OMP_AFFINITY_FORMAT = 'level %L thread %i affinity %A'
    OPENMP DISPLAY ENVIRONMENT END
    Hello from rank 0, thread 1, on smic.dhcp. (core affinity = 1)
    Hello from rank 0, thread 2, on smic.dhcp. (core affinity = 2)
    Hello from rank 1, thread 0, on smic.dhcp. (core affinity = 5)
    Hello from rank 0, thread 3, on smic.dhcp. (core affinity = 3)
    Hello from rank 0, thread 0, on smic.dhcp. (core affinity = 0)
    Hello from rank 1, thread 4, on smic.dhcp. (core affinity = 9)
    Hello from rank 0, thread 4, on smic.dhcp. (core affinity = 4)
    Hello from rank 1, thread 3, on smic.dhcp. (core affinity = 8)
    Hello from rank 1, thread 1, on smic.dhcp. (core affinity = 6)
    Hello from rank 1, thread 2, on smic.dhcp. (core affinity = 7)

That's exactly what I want!

*Could I accomplish this without setting the OMP_PLACES list explicitly, 
and in the process avoid the libgomp warning? Is there some combination 
of -bind-to and -map-by that would achieve this result?*


Thanks!