No subject


Tue Jun 18 13:52:11 CDT 2019


as less latency than 4KB.

I looked for an explanation of this behavior but did not find one.


  1.  MPIR_CVAR_CH3_EAGER_MAX_MSG_SIZE is set to 128KB, so none of the above
message sizes uses the rendezvous protocol. Is there any partitioning inside the
eager protocol (e.g. 0 - 512 bytes, 1KB - 8KB, 16KB - 64KB)? If so, what are the
boundaries? Can I log them with debug-event-logging?


Setup I am using:

- two nodes, each with an Intel Core i7; one has 16GB of memory, the other 8GB

- MPICH 3.2.1, configured and built to use nemesis tcp

- 1Gb Ethernet connection

- NFS is used for sharing

- osu_latency: uses MPI_Send and MPI_Recv

- MPIR_CVAR_CH3_EAGER_MAX_MSG_SIZE = 131072 (128KB)


Can anyone help me with that? Thanks in advance.




Best Regards,

Abu Naser

_______________________________________________
discuss mailing list     discuss at mpich.org<mailto:discuss at mpich.org>
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss




--
Jeff Hammond
jeff.science at gmail.com<mailto:jeff.science at gmail.com>
http://jeffhammond.github.io/







<html><head>
<meta http-equiv=3D"Content-Type" content=3D"text/html; charset=3DWindows-1=
252">
<style type=3D"text/css" style=3D"display:none;"><!-- P {margin-top:0;margi=
n-bottom:0;} --></style>
</head>
<body dir=3D"ltr">
<div id=3D"divtagdefaultwrapper" style=3D"font-size:12pt;color:#000000;font=
-family:Calibri,Helvetica,sans-serif;" dir=3D"ltr">
<div id=3D"divtagdefaultwrapper" style=3D"" dir=3D"ltr">
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt; margin-top: 0px; margin-bottom: 0px;">
Hello Min,</p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt; margin-top: 0px; margin-bottom: 0px;">
<br>
</p>
<p style=3D"margin-top: 0px; margin-bottom: 0px;"></p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt;">
After building mpich-3.2.1 with the sock channel, every program I try to run across the two machines, including the OSU benchmarks and examples/cpi, fails with the following error:</p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt;">
<br>
</p>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">Process 3 of 4 is on dhcp16194</span></=
i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">Process 1 of 4 is on dhcp16194</span></=
i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">Process 0 of 4 is on dhcp16198</span></=
i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">Process 2 of 4 is on dhcp16198</span></=
i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">Fatal error in PMPI_Bcast: Unknown error class, error stack:</span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">PMPI_Bcast(1600)............................: MPI_Bcast(buf=0x7ffc1808542c, count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed</span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIR_Bcast_impl(1452)..................=
.....: </span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIR_Bcast(1476).......................=
.....: </span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIR_Bcast_intra(1249).................=
.....: </span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIR_SMP_Bcast(1081)...................=
.....: </span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIR_Bcast_binomial(285)...............=
.....: </span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIC_Send(303).........................=
.....: </span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIC_Wait(226).........................=
.....: </span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIDI_CH3i_Progress_wait(242)...............: an error occurred while handling an event returned by MPIDU_Sock_Wait()</span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIDI_CH3I_Progress_handle_sock_event(698)..: </span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIDI_CH3_Sockconn_handle_connect_event(597): [ch3:sock] failed to connnect to remote process</span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIDU_Socki_handle_connect(808).............: connection failure (set=0,sock=1,errno=111:Connection refused)</span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIR_SMP_Bcast(1088)...................=
.....: </span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIR_Bcast_binomial(310)...............=
.....: Failure during collective</span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">Fatal error in PMPI_Bcast: Other MPI error, error stack:</span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">PMPI_Bcast(1600)........: MPI_Bcast(buf=0x7ffd9eeebdac, count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed</span></i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIR_Bcast_impl(1452)...: </span><=
/i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIR_Bcast(1476)........: </span><=
/i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIR_Bcast_intra(1249)..: </span><=
/i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIR_SMP_Bcast(1088)....: </span><=
/i></div>
<div style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-se=
rif, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", =
NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", Emo=
jiSymbols; font-size: 16px;">
<i><span style=3D"font-size: 10pt;">MPIR_Bcast_binomial(310): Failure during collective</span></i></div>
<br style=3D"font-family: Calibri, Helvetica, sans-serif, EmojiFont, "=
Apple Color Emoji", "Segoe UI Emoji", NotoColorEmoji, "=
Segoe UI Symbol", "Android Emoji", EmojiSymbols; font-size: =
16px;">
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt;">
</p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 16px;">
<span style=3D"font-size: 12pt;">I checked the MPICH FAQ and the mpich-discuss list. Based on those, I verified that the following are fine on my machines:</span><br>
</p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt;">
<span style=3D"font-size: 12pt;">- the firewall is disabled on both machines</span></p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt;">
<span style=3D"font-size: 12pt;">- I can use passwordless ssh on both machines</span></p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt;">
<span style=3D"font-size: 12pt;">- /etc/hosts on both machines is configured with the correct IP addresses and hostnames</span></p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt;">
<span style=3D"font-size: 12pt;">- I have updated the library path and use the absolute path to mpiexec</span></p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt;">
<span style=3D"font-size: 12pt;">- Most importantly, when I configure and build MPICH with tcp, it works fine.</span></p>
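<p>One further check, offered as a sketch rather than a diagnosis: errno 111 (ECONNREFUSED) means the connection attempt reached an address where nothing was listening on that port. Working ssh only shows that port 22 is reachable, so it may be worth checking what address each hostname resolves to and whether an arbitrary TCP port on the other node accepts connections. The helper names resolve and try_connect, the script name, and the port in the usage note are hypothetical and not part of MPICH.</p>
<pre>
```python
import errno
import socket
import sys


def resolve(hostname):
    """Return the sorted IPv4 addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def try_connect(host, port, timeout=3.0):
    """Attempt a TCP connection; return 'ok' or the symbolic errno name."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError as exc:
        # Timeouts carry no errno, so fall back to the exception text.
        return errno.errorcode.get(exc.errno, str(exc))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python3 check_tcp.py HOSTNAME PORT
    host, port = sys.argv[1], int(sys.argv[2])
    print(host, "resolves to", resolve(host))
    print("connect:", try_connect(host, port))
```
</pre>
<p>To try it, start any listener on one node (for example python3 -m http.server 50000) and run the script from the other node against that hostname and port. If a node's own hostname resolves to a loopback address, remote peers may be handed an address they cannot connect to; whether that is what happens here is only an assumption.</p>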
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt;">
<span style=3D"font-size: 12pt;"><br>
</span></p>
<p style=3D""><span style=3D"font-size: 12pt;">I think I am missing something but have not been able to figure out what. Any help would be appreciated.</span></p>
<p style=3D""><span style=3D"font-size: 12pt;"><br>
</span></p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt;">
<span style=3D"font-size: 12pt;">Thank you.</span></p>
<br>
<p></p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt; margin-top: 0px; margin-bottom: 0px;">
<br>
</p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt; margin-top: 0px; margin-bottom: 0px;">
<br>
</p>
<p style=3D"color: rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-seri=
f, EmojiFont, "Apple Color Emoji", "Segoe UI Emoji", No=
toColorEmoji, "Segoe UI Symbol", "Android Emoji", Emoji=
Symbols; font-size: 12pt; margin-top: 0px; margin-bottom: 0px;">
<br>
</p>
<div id=3D"Signature" style=3D"color: rgb(0, 0, 0); font-family: Calibri, H=
elvetica, sans-serif, EmojiFont, "Apple Color Emoji", "Segoe=
 UI Emoji", NotoColorEmoji, "Segoe UI Symbol", "Android=
 Emoji", EmojiSymbols; font-size: 12pt;">
<div id=3D"divtagdefaultwrapper" dir=3D"ltr" style=3D"font-size:12pt; color=
:rgb(0,0,0); font-family:Calibri,Helvetica,sans-serif,"EmojiFont"=
,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,&q=
uot;Segoe UI Symbol","Android Emoji",EmojiSymbols">
<p><br>
</p>
<p align=3D"left"><span style=3D"font-size:10pt; font-family:Calibri,Helvet=
ica,sans-serif">Best Regards,</span></p>
<span style=3D"font-family:Calibri,Helvetica,sans-serif; font-size:10pt"></=
span>
<div align=3D"left"><span style=3D"font-size:11pt; font-family:Calibri,Helv=
etica,sans-serif"></span></div>
<span style=3D"font-family:Calibri,Helvetica,sans-serif; font-size:10pt"></=
span>
<p align=3D"left"><span style=3D"font-size:10pt; font-family:Calibri,Helvet=
ica,sans-serif">Abu Naser</span><br>
</p>
</div>
</div>
</div>
<hr style=3D"display:inline-block;width:98%" tabindex=3D"-1">
<div id=3D"divRplyFwdMsg" dir=3D"ltr"><font face=3D"Calibri, sans-serif" st=
yle=3D"font-size:11pt" color=3D"#000000"><b>From:</b> Min Si <msi at anl.go=
v><br>
<b>Sent:</b> Tuesday, June 26, 2018 12:54:29 PM<br>
<b>To:</b> discuss at mpich.org<br>
<b>Subject:</b> Re: [mpich-discuss] osu_latency test: why 8KB takes less ti=
me than 4KB and 2KB takes less time than 1KB?</font>
<div> </div>
</div>
<meta content=3D"text/html; charset=3DWindows-1252">
<div style=3D"background-color:#FFFFFF">Hi Abu,<br>
<br>
I think the results are stable enough. Perhaps you could also try the following tests and see whether a similar trend exists:<br>
- MPICH/socket (set `--with-device=ch3:sock` at configure)<br>
- A socket-based pingpong test without MPI. <br>
<br>
At this point, I cannot think of any MPI-specific design for 2k/8k messages. My guess is that it is related to your network connection.
<br>
<br>
Min<br>
<br>
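<p>A socket-based ping-pong of the kind suggested above could be sketched as follows. This is a minimal illustration in Python, not the OSU benchmark: recv_exact, echo_server and pingpong are hypothetical names, TCP_NODELAY is an assumption made here so that Nagle's algorithm does not delay small messages, and there is no warm-up phase. Like osu_latency, it reports half of the average round-trip time.</p>
<pre>
```python
import socket
import threading
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock, looping because recv may return less."""
    chunks = []
    remaining = nbytes
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_server(listener, nbytes, iterations):
    """Accept one client and echo back `iterations` messages of nbytes each."""
    conn, _ = listener.accept()
    with conn:
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(iterations):
            conn.sendall(recv_exact(conn, nbytes))


def pingpong(host, port, nbytes, iterations):
    """Return the mean one-way latency in microseconds (half the round trip)."""
    payload = b"x" * nbytes
    with socket.create_connection((host, port)) as sock:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(iterations):
            sock.sendall(payload)
            recv_exact(sock, nbytes)
        elapsed = time.perf_counter() - start
    return elapsed / (2 * iterations) * 1e6


if __name__ == "__main__":
    # Loopback demo only; for a real test run echo_server on the other node.
    for size in (1024, 2048, 4096, 8192, 16384):
        listener = socket.create_server(("127.0.0.1", 0))
        port = listener.getsockname()[1]
        server = threading.Thread(target=echo_server, args=(listener, size, 1000))
        server.start()
        print(size, "bytes:", round(pingpong("127.0.0.1", port, size, 1000), 2), "us")
        server.join()
        listener.close()
```
</pre>
<p>For a two-node measurement, run echo_server on one machine, bound to an address the other machine can reach, and call pingpong from the other. If the 2k/8k pattern shows up here as well, that would point at the network path rather than at MPICH.</p>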
<div class=3D"x_moz-cite-prefix">On 2018/06/24 11:09, Abu Naser wrote:<br>
</div>
<blockquote type=3D"cite">
<meta content=3D"text/html;=0A=
        charset=3DWindows-1252">
<div id=3D"x_divtagdefaultwrapper" dir=3D"ltr">
<div id=3D"x_divtagdefaultwrapper" dir=3D"ltr">
<p>Hello Min and Jeff,</p>
<p><br>
</p>
<p>Here are my experimental results. The default number of iterations in osu_latency for 0B – 8KB is 10,000. With that setting I ran osu_latency 100 times and found a standard deviation of 33 for the 8KB message size.</p>
<p><br>
</p>
<p>I then set the iteration count to 50,000 and 100,000 for the 1KB – 16KB message sizes, ran osu_latency 100 times for each setting, and took the average and standard deviation.</p>
<p><br>
</p>
<table width=3D"665">
<colgroup><col width=3D"99"><col width=3D"112"><col width=3D"118"><col widt=
h=3D"154"><col width=3D"140"></colgroup>
<tbody>
<tr>
<td width=3D"99">
<p><b>Msg Size in Bytes</b></p>
</td>
<td width=3D"112">
<p><b>Avg time in us (50K iterations)</b></p>
</td>
<td width=3D"118">
<p><b>Avg time in us (100K iterations)</b></p>
</td>
<td width=3D"154">
<p><b>Standard deviation (50K iterations)</b></p>
</td>
<td width=3D"140">
<p><b>Standard deviation (100K iterations)</b></p>
</td>
</tr>
<tr>
<td width=3D"99">
<p>1k</p>
</td>
<td width=3D"112">
<p>85.10</p>
</td>
<td width=3D"118">
<p>84.9</p>
</td>
<td width=3D"154">
<p>0.55</p>
</td>
<td width=3D"140">
<p>0.45</p>
</td>
</tr>
<tr>
<td width=3D"99">
<p>2k</p>
</td>
<td width=3D"112">
<p>75.79</p>
</td>
<td width=3D"118">
<p>74.63</p>
</td>
<td width=3D"154">
<p>5.09</p>
</td>
<td width=3D"140">
<p>4.44</p>
</td>
</tr>
<tr>
<td width=3D"99">
<p>4k</p>
</td>
<td width=3D"112">
<p>273.80</p>
</td>
<td width=3D"118">
<p>274.71</p>
</td>
<td width=3D"154">
<p>4.18</p>
</td>
<td width=3D"140">
<p>2.45</p>
</td>
</tr>
<tr>
<td width=3D"99">
<p>8k</p>
</td>
<td width=3D"112">
<p>258.56</p>
</td>
<td width=3D"118">
<p>249.83</p>
</td>
<td width=3D"154">
<p>21.14</p>
</td>
<td width=3D"140">
<p>28</p>
</td>
</tr>
<tr>
<td height=3D"24" width=3D"99">
<p>16k</p>
</td>
<td width=3D"112">
<p>281.31</p>
</td>
<td width=3D"118">
<p>281.02</p>
</td>
<td width=3D"154">
<p>3.22</p>
</td>
<td width=3D"140">
<p>4.10</p>
</td>
</tr>
</tbody>
</table>
<p><br>
</p>
<p><br>
</p>
<p>The standard deviation for the 8K message is high, which implies that it is not producing a consistent latency. That appears to be why 8K takes less time than 4K on average.</p>
<p><br>
</p>
<p>Meanwhile, 2K has a standard deviation of about 5, but the 1K latencies are much more tightly clustered than the 2K ones. That is probably the explanation for the lower 2K latency.</p>
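<p>The 5% criterion that Min suggested can be applied directly to the table. The sketch below assumes that the percentage refers to the standard deviation relative to the mean, which the thread does not state explicitly; the names RESULTS_50K, relative_stddev and unstable_sizes are made up for this example. The numbers are copied from the 50K-iteration columns.</p>
<pre>
```python
# (message size, mean latency in us, standard deviation in us), copied from
# the 50K-iteration columns of the table.
RESULTS_50K = [
    ("1k", 85.10, 0.55),
    ("2k", 75.79, 5.09),
    ("4k", 273.80, 4.18),
    ("8k", 258.56, 21.14),
    ("16k", 281.31, 3.22),
]


def relative_stddev(mean, stddev):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * stddev / mean


def unstable_sizes(results, threshold_percent=5.0):
    """Return the sizes whose relative standard deviation exceeds the threshold."""
    return [size for size, mean, sd in results
            if relative_stddev(mean, sd) > threshold_percent]


if __name__ == "__main__":
    for size, mean, sd in RESULTS_50K:
        print(f"{size:>4}: {relative_stddev(mean, sd):.2f}%")
    print("above 5%:", unstable_sizes(RESULTS_50K))
```
</pre>
<p>With these inputs the 2k and 8k rows come out above 5% (roughly 6.7% and 8.2%), and those are the two sizes that look faster than expected.</p>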
<p><br>
</p>
<p>Thank you for your suggestions.</p>
<br>
<p><br>
</p>
<div id=3D"x_Signature">
<div id=3D"x_divtagdefaultwrapper" dir=3D"ltr">
<p><br>
</p>
<p><span>Best Regards,</span></p>
<span></span>
<div><span></span></div>
<span></span>
<p><span>Abu Naser</span><br>
</p>
</div>
</div>
</div>
<hr tabindex=3D"-1">
<div id=3D"x_divRplyFwdMsg" dir=3D"ltr"><b>From:</b> Abu Naser<br>
<b>Sent:</b> Wednesday, June 20, 2018 1:48:53 PM<br>
<b>To:</b> <a class=3D"x_moz-txt-link-abbreviated OWAAutoLink" href=3D"mail=
to:discuss at mpich.org" id=3D"LPlnk729146" previewremoved=3D"true">
discuss at mpich.org</a><br>
<b>Subject:</b> Re: [mpich-discuss] osu_latency test: why 8KB takes less ti=
me than 4KB and 2KB takes less time than 1KB?
<div> </div>
</div>
<meta content=3D"text/html; charset=3Diso-8859-1">
<div dir=3D"ltr">
<div id=3D"x_x_divtagdefaultwrapper" dir=3D"ltr">
<div id=3D"x_x_divtagdefaultwrapper" dir=3D"ltr">
<p>Hello Min,</p>
<p><br>
</p>
<p>Thanks for the clarification.  I will do the experiment.<br>
</p>
<p><br>
</p>
<div id=3D"x_x_Signature">
<div id=3D"x_x_divtagdefaultwrapper" dir=3D"ltr">
<p>Thanks.</p>
<p><span>Best Regards,</span></p>
<span></span>
<div><span></span></div>
<span></span>
<p><span>Abu Naser</span><br>
</p>
</div>
</div>
</div>
<hr tabindex=3D"-1">
<div id=3D"x_x_divRplyFwdMsg" dir=3D"ltr"><b>From:</b> Min Si <a class=3D"x=
_moz-txt-link-rfc2396E OWAAutoLink" href=3D"mailto:msi at anl.gov" id=3D"LPlnk=
558260" previewremoved=3D"true">
<msi at anl.gov></a><br>
<b>Sent:</b> Wednesday, June 20, 2018 1:39:30 PM<br>
<b>To:</b> <a class=3D"x_moz-txt-link-abbreviated OWAAutoLink" href=3D"mail=
to:discuss at mpich.org" id=3D"LPlnk472728" previewremoved=3D"true">
discuss at mpich.org</a><br>
<b>Subject:</b> Re: [mpich-discuss] osu_latency test: why 8KB takes less ti=
me than 4KB and 2KB takes less time than 1KB?
<div> </div>
</div>
<meta content=3D"text/html; charset=3DWindows-1252">
<div>Hi Abu,<br>
<br>
I think Jeff means that you should run your experiment with more iterations in order to get stable results.<br>
- Increase the number of loop iterations in each execution (I think the OSU benchmark allows you to set it)<br>
- Run the experiment 10 or 100 times, and take the average and standard deviation.<br>
<br>
If you see a very small standard deviation (e.g., 5% or less), then the trend is stable and you might not see such gaps.<br>
<br>
Best regards,<br>
Min<br>
<div class=3D"x_x_x_moz-cite-prefix">On 2018/06/20 12:14, Abu Naser wrote:<=
br>
</div>
<blockquote type=3D"cite">
<div id=3D"x_x_x_divtagdefaultwrapper" dir=3D"ltr">
<p>Hello Jeff,</p>
<p><br>
</p>
<p>Yes, I am using a switch, and other machines are also connected to that switch.
<br>
</p>
<p>If I remove the other machines and use only my two nodes with the switch, will that improve the performance for 200 ~ 400 iterations?</p>
<p>Meanwhile, I will try a single dedicated cable. <span></span><br>
</p>
<p><br>
</p>
<p>Thank you.<br>
</p>
<div id=3D"x_x_x_Signature">
<div id=3D"x_x_x_divtagdefaultwrapper" dir=3D"ltr">
<p><br>
</p>
<p><span>Best Regards,</span></p>
<span></span>
<div><span></span></div>
<span></span>
<p><span>Abu Naser</span><br>
</p>
</div>
</div>
</div>
<hr tabindex=3D"-1">
<div id=3D"x_x_x_divRplyFwdMsg" dir=3D"ltr"><b>From:</b> Jeff Hammond <a cl=
ass=3D"x_x_x_moz-txt-link-rfc2396E x_x_OWAAutoLink" href=3D"mailto:jeff.sci=
ence at gmail.com" id=3D"LPlnk983157" previewremoved=3D"true">
<jeff.science at gmail.com></a><br>
<b>Sent:</b> Wednesday, June 20, 2018 12:52:06 PM<br>
<b>To:</b> MPICH<br>
<b>Subject:</b> Re: [mpich-discuss] osu_latency test: why 8KB takes less ti=
me than 4KB and 2KB takes less time than 1KB?
<div> </div>
</div>
<meta content=3D"text/html; charset=3Dutf-8">
<div>
<div dir=3D"ltr">Is the Ethernet connection a single dedicated cable between the two machines, or are you running through a switch that handles other traffic?
<div><br>
</div>
<div>My best guess is that this is noise and that you may be able to avoid it by running for a very long time, e.g. 10000 iterations.</div>
<div><br>
</div>
<div>Jeff</div>
</div>
<div class=3D"x_x_x_x_gmail_extra"><br>
<div class=3D"x_x_x_x_gmail_quote">On Wed, Jun 20, 2018 at 6:53 AM, Abu Nas=
er <span dir=3D"ltr">
<<a href=3D"mailto:an16e at my.fsu.edu" target=3D"_blank" id=3D"LPlnk305789=
" class=3D"x_x_OWAAutoLink" previewremoved=3D"true">an16e at my.fsu.edu</a>&gt=
;</span> wrote:<br>
<blockquote class=3D"x_x_x_x_gmail_quote">
<div dir=3D"ltr">
<div id=3D"x_x_x_x_m_6077755676379859201divtagdefaultwrapper" dir=3D"ltr">
<p><br>
</p>
<p>Good day to all,</p>
<p><br>
</p>
<p>I ran the point-to-point osu_latency test on two nodes 200 times. The following are the average times in microseconds for various message sizes:</p>
<div>1KB    84.8514 us<br>
<span>2KB    73.52535</span> us<br>
4KB    272.55275 us<br>
<span>8KB    234.86385</span> us<br>
16KB    288.88 us<br>
32KB    523.3725 us<br>
64KB    910.4025 us</div>
<p><br>
</p>
<p>From the above, it looks like the 2KB message has lower latency than 1KB, and 8KB has lower latency than 4KB.
<br>
</p>
<p>I looked for an explanation of this behavior but did not find one.</p>
<p><br>
</p>
<ol>
<li><span>MPIR_CVAR_CH3_EAGER_MAX_MSG_<wbr>SIZE</span><span> is set to 128KB, so none of the above message sizes uses the rendezvous protocol. Is there any partitioning inside the eager protocol (e.g. 0 - 512 bytes, 1KB - 8KB, 16KB - 64KB)? If so, what are the boundaries? Can I log them with debug-event-logging? </span><br>
</li></ol>
<p><br>
</p>
<p>Setup I am using:</p>
<p>- two nodes, each with an Intel Core i7; one has 16GB of memory, the other 8GB</p>
<p>- MPICH 3.2.1, configured and built to use nemesis tcp</p>
<p>- 1Gb Ethernet connection</p>
<p>- NFS is used for sharing<br>
</p>
<p>- osu_latency: uses MPI_Send and MPI_Recv</p>
<p>- <span>MPIR_CVAR_CH3_EAGER_MAX_MSG_<wbr>SIZE</span> = <span>131072</span> (128KB)<br>
</p>
<p><br>
</p>
<p>Can anyone help me with that? Thanks in advance.<br>
</p>
<p><br>
</p>
<p><br>
</p>
<div id=3D"x_x_x_x_m_6077755676379859201Signature">
<div id=3D"x_x_x_x_m_6077755676379859201divtagdefaultwrapper" dir=3D"ltr">
<p><br>
</p>
<p><span>Best Regards,</span></p>
<span></span>
<div><span></span></div>
<span></span>
<p><span>Abu Naser</span><br>
</p>
</div>
</div>
</div>
</div>
<br>
<br>
</blockquote>
</div>
<br>
<br>
<div><br>
</div>
-- <br>
<div class=3D"x_x_x_x_gmail_signature">Jeff Hammond<br>
<a href=3D"mailto:jeff.science at gmail.com" target=3D"_blank" id=3D"LPlnk3149=
93" class=3D"x_x_OWAAutoLink" previewremoved=3D"true">jeff.science at gmail.co=
m</a><br>
<a href=3D"http://jeffhammond.github.io/" target=3D"_blank" id=3D"LPlnk8614=
34" class=3D"x_x_OWAAutoLink" previewremoved=3D"true">http://jeffhammond.gi=
thub.io/</a></div>
</div>
</div>
<br>
</blockquote>
<br>
</div>
</div>
</div>
</div>
<br>
</blockquote>
<br>
</div>
</div>
</body>
</html>
