Oh okay. I see that one of the binding options within Hydra is to bind to
the motherboard. Would that allow one to assign ranks to certain nodes?
(Assuming that my compute node only has one motherboard.) If I had
#SBATCH -N 2 and #SBATCH -n 32, my guess is that "-bind-to board" would
sequentially bind ranks 0-15 to node 0 and 16-31 to node 1. Or would this
option still result in striped even/odd assignments?

But yes, otherwise I would very much like to be included in that CC if a
ticket is to be made.

Thanks

On Wednesday, May 13, 2015, Kenneth Raffenetti <raffenet@mcs.anl.gov> wrote:

Ah, you are right. They will be striped odd/even across the nodes, not
sequentially. I don't see a way to achieve that binding with the current
options. It's something we can look at adding in a future release. I'll
open up a ticket and add you to the CC list, if you are interested.

Ken
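(One possible workaround, not confirmed in this thread: give Hydra an
explicit machinefile. Hydra's host-file format takes "hostname:count"
lines and, as I understand it, fills each host's count before moving to
the next. A minimal sketch, assuming a SLURM allocation of two nodes:)

# Expand the compressed SLURM node list (e.g. "node[0-1]") into one
# hostname per line, then give each host 16 slots.
scontrol show hostnames "$SLURM_JOB_NODELIST" | sed 's/$/:16/' > machinefile

# If Hydra fills hosts in listed order, ranks 0-15 land on the first
# node and ranks 16-31 on the second, each node reusing the same
# per-node user binding list.
mpiexec.hydra -f machinefile \
    -bind-to user:0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17 \
    -n 32 ./my_program <args>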
On 05/12/2015 02:16 PM, Justin Chang wrote:

So just to be clear, if I have the following script:

#SBATCH -N 2
#SBATCH -n 32

mpiexec.hydra -bind-to user:0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17 -n 32 ./my_program <args>

will ranks 0-15 be bound like that on the first node, and ranks 16-31
the same way on the second node? Or will all the even ranks be on one
node and the odd on the other?

Thanks,

On Mon, May 11, 2015 at 9:27 PM, Kenneth Raffenetti <raffenet@mcs.anl.gov> wrote:

Ah, I see now the problem. I misread the first email. Your original
line should work fine! The user bindings are listings of hw elements,
not processes, so your binding will be applied identically on each node.

Ken
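(Sketching a quick check that is not from the thread: each rank can
print its own Linux CPU mask. This assumes Hydra exports PMI_RANK to
every launched process; Cpus_allowed_list in /proc is the kernel's view
of the binding.)

mpiexec.hydra -bind-to user:0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17 -n 32 \
    sh -c 'echo "rank $PMI_RANK on $(hostname): $(grep Cpus_allowed_list /proc/self/status)"'
# If the binding is applied identically per node, every node should
# report the same set of core ids across its 16 local ranks.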

On 05/11/2015 05:03 PM, Justin Chang wrote:

Ken,

"-bind-to core" gives me the following bindings:

process 0 binding: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
process 1 binding: 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
process 2 binding: 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
process 3 binding: 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
process 4 binding: 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
process 5 binding: 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
process 6 binding: 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
process 7 binding: 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
process 8 binding: 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
process 9 binding: 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
process 10 binding: 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
process 11 binding: 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
process 12 binding: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
process 13 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
process 14 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
process 15 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0

but I want this:

process 0 binding: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
process 1 binding: 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
process 2 binding: 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
process 3 binding: 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
process 4 binding: 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
process 5 binding: 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
process 6 binding: 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
process 7 binding: 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
process 8 binding: 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
process 9 binding: 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
process 10 binding: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
process 11 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
process 12 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
process 13 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
process 14 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
process 15 binding: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0

The latter gives me better performance for my application, and I am
guessing it's because the processes are evenly distributed across the
two sockets (filled sequentially). That is why I resorted to the custom
binding I had originally.

Thanks,
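(For what it's worth, a sketch of deriving that list rather than typing
it out by hand, assuming the fixed layout of two 10-core sockets where
socket s owns cores s*10 through s*10+9:)

# First 8 cores of each socket: 0-7 and 10-17.
LIST=$(for s in 0 1; do for c in $(seq 0 7); do echo $((s * 10 + c)); done; done | paste -sd, -)
echo "$LIST"    # 0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17

mpiexec.hydra -bind-to user:"$LIST" -n 16 ./my_program <args>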

On Mon, May 11, 2015 at 4:53 PM, Kenneth Raffenetti <raffenet@mcs.anl.gov> wrote:

Justin,

Try using the "-bind-to core" option instead. It should do exactly what
you want. See this page, with examples, for more details:
https://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager#Process-core_Binding

Ken
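(For reference, the usage is simply the line below. Note that on this
machine core ids run 0-19 straight across the two sockets, so ranks
0-15 land on cores 0-15, i.e. all of socket 0 plus six cores of
socket 1, which matches the binding output Justin reports above.)

mpiexec.hydra -bind-to core -n 16 ./my_program <args>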

On 05/11/2015 04:48 PM, Justin Chang wrote:

Hello everyone,

I am working with an HPC machine that has this configuration for a
single compute node:

Machine (64GB total)
  NUMANode L#0 (P#0 32GB)
    Socket L#0 + L3 L#0 (25MB)
      L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
      L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
      L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
      L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
      L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4 (P#4)
      L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5 (P#5)
      L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6 + PU L#6 (P#6)
      L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7 + PU L#7 (P#7)
      L2 L#8 (256KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8 + PU L#8 (P#8)
      L2 L#9 (256KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9 + PU L#9 (P#9)
    HostBridge L#0
      PCIBridge
        PCI 1000:0087
          Block L#0 "sda"
      PCIBridge
        PCI 15b3:1003
          Net L#1 "eth0"
          Net L#2 "ib0"
          OpenFabrics L#3 "mlx4_0"
      PCIBridge
        PCI 8086:1521
          Net L#4 "eth1"
        PCI 8086:1521
          Net L#5 "eth2"
      PCIBridge
        PCI 102b:0533
      PCI 8086:1d02
  NUMANode L#1 (P#1 32GB) + Socket L#1 + L3 L#1 (25MB)
    L2 L#10 (256KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10 + PU L#10 (P#10)
    L2 L#11 (256KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11 + PU L#11 (P#11)
    L2 L#12 (256KB) + L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12 + PU L#12 (P#12)
    L2 L#13 (256KB) + L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13 + PU L#13 (P#13)
    L2 L#14 (256KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14 + PU L#14 (P#14)
    L2 L#15 (256KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15 + PU L#15 (P#15)
    L2 L#16 (256KB) + L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16 + PU L#16 (P#16)
    L2 L#17 (256KB) + L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17 + PU L#17 (P#17)
    L2 L#18 (256KB) + L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18 + PU L#18 (P#18)
    L2 L#19 (256KB) + L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19 + PU L#19 (P#19)
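(Aside, not from the thread: a listing like the above is hwloc output;
assuming hwloc is installed on the node, it can be reproduced with:)

lstopo-no-graphics    # text rendering of the node topology, ships with hwloc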

If I ran my program with 16 processes, I would have the following batch
script:

#!/bin/bash
#SBATCH -N 1
#SBATCH -n 20
#SBATCH -t 0-09:00
#SBATCH -o output.txt

mpiexec.hydra -bind-to user:0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17 -n 16 ./my_program <args>

This would give me decent speedup. However, what if I want to use 32
processes? Since each node only has 20 cores, I would need #SBATCH -N 2
and #SBATCH -n 40. However, I want ranks 0-15 and 16-31 to have the same
mapping as above but on different compute nodes, so how would I do this?
Or would the above line work as long as I have a multiple of 16
processes?

Thanks,

--
Justin Chang
PhD Candidate, Civil Engineering - Computational Sciences
University of Houston, Department of Civil and Environmental Engineering
Houston, TX 77004
(512) 963-3262

_______________________________________________
discuss mailing list     discuss@mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss