[mpich-discuss] mpiexec.hydra binding for multiple compute nodes
Kenneth Raffenetti
raffenet at mcs.anl.gov
Wed May 13 08:29:29 CDT 2015
Ah, you are right. They will be striped odd/even across the nodes, not
sequentially. I don't see a way to achieve that binding with the current
options. It's something we can look at adding in a future release. I'll
open a ticket and add you to the CC list, if you are interested.
Ken
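
A quick way to see where the ranks actually land is to launch a trivial
shell command through the same mpiexec.hydra line and have each process
report its host and CPU binding. This is only a sketch; it assumes the
hwloc utilities (hwloc-bind) are installed on the compute nodes, and
taskset -cp from util-linux can be substituted if they are not:

mpiexec.hydra -bind-to user:0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17 -n 32 \
    sh -c 'echo "$(hostname) pid $$: $(hwloc-bind --get)"'

Each output line shows which node the process was placed on and the
cpuset it was bound to, so the odd/even striping described above is easy
to confirm.
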
On 05/12/2015 02:16 PM, Justin Chang wrote:
> So just to be clear, if I have the following script:
>
> #SBATCH -N 2
> #SBATCH -n 32
>
> mpiexec.hydra -bind-to user:0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17 -n 32 ./my_program <args>
>
> will ranks 0-15 be bound as such on the first node and ranks 16-31 be
> the same for the second node? Or will all the even ranks be on one node
> and the odd ones on the other?
>
> Thanks,
>
> On Mon, May 11, 2015 at 9:27 PM, Kenneth Raffenetti
> <raffenet at mcs.anl.gov> wrote:
>
>     Ah, I see the problem now. I misread the first email. Your original
>     line should work fine! The user bindings are lists of hardware
>     elements, not processes, so your binding will be applied identically
>     on each node.
>
> Ken
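
To make that concrete, a minimal sketch of how such a user list is
consumed on the 20-core, two-socket nodes described further down in this
thread (each entry is a hardware processing-unit index, and the same
list is reused for the local ranks on every node):

# -bind-to user:0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17
#   local rank 0  -> PU 0   (socket 0)
#   ...
#   local rank 7  -> PU 7   (socket 0)
#   local rank 8  -> PU 10  (socket 1)
#   ...
#   local rank 15 -> PU 17  (socket 1)
mpiexec.hydra -bind-to user:0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17 -n 16 ./my_program <args>
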
>
>
> On 05/11/2015 05:03 PM, Justin Chang wrote:
>
> Ken,
>
> "-bind-to core" gives me the following topology:
>
> process 0 binding: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 1 binding: 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 2 binding: 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 3 binding: 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 4 binding: 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 5 binding: 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 6 binding: 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 7 binding: 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
> process 8 binding: 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
> process 9 binding: 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
> process 10 binding: 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
> process 11 binding: 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
> process 12 binding: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
> process 13 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
> process 14 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
> process 15 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
>
> but I want this:
>
> process 0 binding: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 1 binding: 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 2 binding: 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 3 binding: 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 4 binding: 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 5 binding: 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 6 binding: 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
> process 7 binding: 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
> process 8 binding: 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
> process 9 binding: 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
> process 10 binding: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
> process 11 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
> process 12 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
> process 13 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
> process 14 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
> process 15 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
>
>         The latter gives me better performance for my application, and I am
>         guessing it's because I have evenly distributed the processes among
>         the two sockets (sequentially). That is why I resorted to the custom
>         binding I had originally.
>
> Thanks,
>
>         On Mon, May 11, 2015 at 4:53 PM, Kenneth Raffenetti
>         <raffenet at mcs.anl.gov> wrote:
>
> Justin,
>
>             Try using the "-bind-to core" option instead. It should do exactly
>             what you want. See this page with examples for more details:
>             https://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager#Process-core_Binding
>
> Ken
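
For completeness, a sketch of that suggestion on the same 20-core node
(untested here; the resulting map is the one pasted further up on this
page):

mpiexec.hydra -bind-to core -n 16 ./my_program <args>
# rank i is pinned to logical core i, so the 16 ranks fill all ten
# cores of socket 0 and the first six cores of socket 1.
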
>
>
> On 05/11/2015 04:48 PM, Justin Chang wrote:
>
> Hello everyone,
>
>                 I am working with an HPC machine that has this configuration
>                 for a single compute node:
>
>                 Machine (64GB total)
>                   NUMANode L#0 (P#0 32GB)
>                     Socket L#0 + L3 L#0 (25MB)
>                       L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>                       L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
>                       L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
>                       L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
>                       L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4 (P#4)
>                       L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5 (P#5)
>                       L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6 + PU L#6 (P#6)
>                       L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7 + PU L#7 (P#7)
>                       L2 L#8 (256KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8 + PU L#8 (P#8)
>                       L2 L#9 (256KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9 + PU L#9 (P#9)
>                     HostBridge L#0
>                       PCIBridge
>                         PCI 1000:0087
>                           Block L#0 "sda"
>                       PCIBridge
>                         PCI 15b3:1003
>                           Net L#1 "eth0"
>                           Net L#2 "ib0"
>                           OpenFabrics L#3 "mlx4_0"
>                       PCIBridge
>                         PCI 8086:1521
>                           Net L#4 "eth1"
>                         PCI 8086:1521
>                           Net L#5 "eth2"
>                       PCIBridge
>                         PCI 102b:0533
>                       PCI 8086:1d02
>                   NUMANode L#1 (P#1 32GB) + Socket L#1 + L3 L#1 (25MB)
>                     L2 L#10 (256KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10 + PU L#10 (P#10)
>                     L2 L#11 (256KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11 + PU L#11 (P#11)
>                     L2 L#12 (256KB) + L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12 + PU L#12 (P#12)
>                     L2 L#13 (256KB) + L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13 + PU L#13 (P#13)
>                     L2 L#14 (256KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14 + PU L#14 (P#14)
>                     L2 L#15 (256KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15 + PU L#15 (P#15)
>                     L2 L#16 (256KB) + L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16 + PU L#16 (P#16)
>                     L2 L#17 (256KB) + L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17 + PU L#17 (P#17)
>                     L2 L#18 (256KB) + L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18 + PU L#18 (P#18)
>                     L2 L#19 (256KB) + L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19 + PU L#19 (P#19)
>
>                 If I ran my program with 16 processes, I would have the
>                 following batch script:
>
> #!/bin/bash
> #SBATCH -N 1
> #SBATCH -n 20
> #SBATCH -t 0-09:00
> #SBATCH -o output.txt
>
>                 mpiexec.hydra -bind-to user:0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17 -n 16 ./my_program <args>
>
>                 This would give me decent speedup. However, what if I want to
>                 use 32 processes? Since each node only has 20 cores, I would
>                 need #SBATCH -N 2 and #SBATCH -n 40. However, I want ranks
>                 0-15 and 16-31 to have the same mapping as above but on
>                 different compute nodes, so how would I do this? Or would the
>                 above line work so long as I have a multiple of 16 processes?
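
For reference, a sketch of the two-node batch script described here
(untested; see the replies further up on this page for how the ranks
actually end up distributed across the nodes):

#!/bin/bash
#SBATCH -N 2
#SBATCH -n 40
#SBATCH -t 0-09:00
#SBATCH -o output.txt

# Same hardware-element list as the single-node case. Hydra applies the
# list identically on each node, but the global ranks end up striped
# odd/even across the two nodes rather than as blocks 0-15 and 16-31.
mpiexec.hydra -bind-to user:0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17 -n 32 ./my_program <args>
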
>
> Thanks,
>
>                 --
>                 Justin Chang
>                 PhD Candidate, Civil Engineering - Computational Sciences
>                 University of Houston, Department of Civil and Environmental Engineering
>                 Houston, TX 77004
>                 (512) 963-3262
>
>
>
>
>
>
> --
> Justin Chang
> PhD Candidate, Civil Engineering - Computational Sciences
> University of Houston, Department of Civil and Environmental Engineering
> Houston, TX 77004
> (512) 963-3262
>
>
>
>
>
>
> --
> Justin Chang
> PhD Candidate, Civil Engineering - Computational Sciences
> University of Houston, Department of Civil and Environmental Engineering
> Houston, TX 77004
> (512) 963-3262
>
>
>
_______________________________________________
discuss mailing list discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss
More information about the discuss mailing list