[mpich-discuss] [mvapich-discuss] Non-MPI_THREAD_SINGLE mode with MV2 affinity enabled?

Jonathan Perkins perkinjo at cse.ohio-state.edu
Fri Nov 15 12:42:04 CST 2013


Thanks for the suggestion.  This wiki is actually maintained by the
MPICH group at ANL.  I'm including their discuss list in this reply.

I believe that the biggest improvement to the wiki would be to expand
on the syntax accepted by user-defined binding.
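For example, the rule could be spelled out as: the cores assigned to one
rank are joined with '+', and the groups for successive ranks are
separated by ','.  Something like the following (the executable name is
just a placeholder):

  mpiexec -n 4 -bind-to user:0+1+2+3,4+5+6+7,8+9+10+11,12+13+14+15 ./app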

On Fri, Nov 15, 2013 at 1:01 PM, Thiago Quirino - NOAA Federal
<thiago.quirino at noaa.gov> wrote:
> Thank you so much, Jonathan.
>
> Is it possible to add this information to the Wiki? I've searched the net
> for hours but couldn't find any reference to this. The closest I found on
> disabling the internal MVAPICH2 and mpiexec.hydra affinities in favor of
> external MPI process pinning with taskset or numactl was this:
>
> https://scicomp.jlab.org/docs/node/66
>
> Thank you so much again, Jonathan.
> Thiago.
>
>
>
>
> On Fri, Nov 15, 2013 at 10:31 AM, Jonathan Perkins
> <perkinjo at cse.ohio-state.edu> wrote:
>>
>> Hello Thiago.  I've confirmed that the following syntax will give you
>> what you're asking for.  You'll basically need to specify user
>> binding, replace your commas with pluses, and replace your colons with
>> commas.
>>
>> [perkinjo at sandy1 install]$ ./bin/mpiexec -n 4 -bind-to
>> user:0+1+2+3,4+5+6+7,8+9+10+11,12+13+14+15 -env MV2_ENABLE_AFFINITY=0
>> ./cpumask
>> 1111000000000000
>> 0000111100000000
>> 0000000011110000
>> 0000000000001111
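>>
>> (For reference: cpumask above is just a small test program that prints
>> each rank's CPU affinity mask.  A minimal sketch of such a helper,
>> assuming Linux and sched_getaffinity, could look like the following;
>> the actual program may of course differ.)
>>
>>   /* cpumask.c: print one 0/1 digit per core showing where this rank may run */
>>   #define _GNU_SOURCE
>>   #include <sched.h>
>>   #include <stdio.h>
>>   #include <mpi.h>
>>
>>   int main(int argc, char **argv)
>>   {
>>       MPI_Init(&argc, &argv);
>>       cpu_set_t set;
>>       sched_getaffinity(0, sizeof(set), &set);  /* 0 = the calling process */
>>       for (int cpu = 0; cpu < 16; cpu++)        /* 16 cores per node here */
>>           putchar(CPU_ISSET(cpu, &set) ? '1' : '0');
>>       putchar('\n');
>>       MPI_Finalize();
>>       return 0;
>>   }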
>>
>> On Wed, Nov 13, 2013 at 7:02 PM, Thiago Quirino - NOAA Federal
>> <thiago.quirino at noaa.gov> wrote:
>> > Hi, Jonathan.
>> >
>> > Using mpiexec.hydra's binding capability, is it possible to assign a
>> > CPU range to each MPI task on a node? Suppose I want to spawn 4 MPI
>> > tasks per node, where each node has 2 sockets with 8 cores each (16
>> > cores total). I want task 1 to run on CPU range 0-3, task 2 on range
>> > 4-7, task 3 on range 8-11, and task 4 on range 12-15. I used to
>> > accomplish this using the MV2_CPU_MAPPING variable as follows:
>> >
>> > export MV2_CPU_MAPPING=0,1,2,3:4,5,6,7:8,9,10,11:12,13,14,15
>> >
>> > Can I accomplish the same binding configuration with mpiexec.hydra's
>> > binding capability? I only see socket binding options in the Wiki.
>> >
>> > Thanks again, Jonathan.
>> > Thiago.
>> >
>> >
>> >
>> > On Tue, Nov 12, 2013 at 1:56 PM, Jonathan Perkins
>> > <jonathan.lamar.perkins at gmail.com> wrote:
>> >>
>> >> Hello Thiago.  Perhaps you can try an alternative to
>> >> MV2_ENABLE_AFFINITY.  If you use the hydra process manager
>> >> (mpiexec.hydra), you can disable the library affinity and use the
>> >> launcher affinity instead.  In this case the other threading levels
>> >> will be available to you.
>> >>
>> >> Please see
>> >> https://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager#Process-core_Binding
>> >> for more information on how to use this hydra feature.  Also please do
>> >> not forget to set MV2_ENABLE_AFFINITY to 0.
>> >>
>> >> Please let us know if this helps.
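>> >>
>> >> For example (the binding spec and the executable name are just
>> >> placeholders):
>> >>
>> >>   mpiexec.hydra -n 4 -bind-to <binding> -env MV2_ENABLE_AFFINITY=0 ./your_app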
>> >>
>> >> On Fri, Nov 8, 2013 at 6:41 PM, Thiago Quirino - NOAA Federal
>> >> <thiago.quirino at noaa.gov> wrote:
>> >> > Hi, folks. Quick question about MVAPICH2 and affinity support.
>> >> >
>> >> > Is it possible to invoke MPI_Init_thread with any mode other than
>> >> > "MPI_THREAD_SINGLE" and still use "MV2_ENABLE_AFFINITY=1"? In my
>> >> > hybrid application I mix MPI with raw Pthreads (not OpenMP). I start
>> >> > 4 MPI tasks on each 16-core node, where each node has 2 sockets with
>> >> > 8 Sandy Bridge cores each. Each of the 4 MPI tasks then spawns 4
>> >> > pthreads, for a total of 16 pthreads per node, or 1 pthread per core.
>> >> > Within each MPI task the MPI calls are serialized among the 4
>> >> > pthreads, so I can use any MPI_THREAD_* mode, but I don't know which
>> >> > mode will work best. I want to assign each of the 4 MPI tasks on a
>> >> > node a set of 4 cores using MV2_CPU_MAPPING (e.g. export
>> >> > MV2_CPU_MAPPING=0,1,2,3:4,5,6,7:8,9,10,11:12,13,14,15) so that the 4
>> >> > pthreads spawned by each MPI task can migrate to any core within
>> >> > their exclusive CPU set of size 4.
>> >> >
>> >> > Is that possible with modes other than MPI_THREAD_SINGLE? If not, do
>> >> > you foresee any issues with using MPI_THREAD_SINGLE while serializing
>> >> > the MPI calls among the 4 pthreads of each MPI task? That is, is there
>> >> > any advantage to using MPI_THREAD_FUNNELED or MPI_THREAD_SERIALIZED
>> >> > versus MPI_THREAD_SINGLE for serialized calls among pthreads?
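>> >> >
>> >> > To make the scheme concrete, here is a minimal sketch of what I mean
>> >> > by serializing the MPI calls among the pthreads (the names and the
>> >> > worker body are only illustrative):
>> >> >
>> >> >   #include <mpi.h>
>> >> >   #include <pthread.h>
>> >> >
>> >> >   /* One mutex per MPI task; only one pthread at a time enters MPI. */
>> >> >   static pthread_mutex_t mpi_lock = PTHREAD_MUTEX_INITIALIZER;
>> >> >
>> >> >   static void *worker(void *arg)
>> >> >   {
>> >> >       int rank;
>> >> >       pthread_mutex_lock(&mpi_lock);
>> >> >       MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* any MPI call goes here */
>> >> >       pthread_mutex_unlock(&mpi_lock);
>> >> >       /* ... compute anywhere within this task's 4-core set ... */
>> >> >       return NULL;
>> >> >   }
>> >> >
>> >> >   int main(int argc, char **argv)
>> >> >   {
>> >> >       int provided;
>> >> >       /* SERIALIZED: several threads call MPI, but never concurrently */
>> >> >       MPI_Init_thread(&argc, &argv, MPI_THREAD_SERIALIZED, &provided);
>> >> >
>> >> >       pthread_t t[4];
>> >> >       for (int i = 0; i < 4; i++)
>> >> >           pthread_create(&t[i], NULL, worker, NULL);
>> >> >       for (int i = 0; i < 4; i++)
>> >> >           pthread_join(t[i], NULL);
>> >> >
>> >> >       MPI_Finalize();
>> >> >       return 0;
>> >> >   }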
>> >> >
>> >> > Thank you so much, folks. Any help is much appreciated.
>> >> >
>> >> > Best,
>> >> > Thiago.
>> >> >
>> >> >
>> >> > ---------------------------------------------------
>> >> > Thiago Quirino, Ph.D.
>> >> > NOAA Hurricane Research Division
>> >> > 4350 Rickenbacker Cswy.
>> >> > Miami, FL 33139
>> >> > P: 305-361-4503
>> >> > E: Thiago.Quirino at noaa.gov
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Jonathan Perkins
>> >
>> >
>> >
>>
>>
>>
>> --
>> Jonathan Perkins
>> http://www.cse.ohio-state.edu/~perkinjo
>>
>



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo



