[mpich-discuss] osu_latency test: why 8KB takes less time than 4KB and 2KB takes less time than 1KB?

Min Si msi at anl.gov
Tue Jun 26 11:54:29 CDT 2018


Hi Abu,

I think the results are stable enough. Perhaps you could also try the 
following tests and see whether a similar trend exists:
- MPICH over sockets (configure with `--with-device=ch3:sock`)
- A socket-based ping-pong test without MPI (a minimal sketch is included below).

At this point, I cannot think of any MPI-specific handling of 2 KB/8 KB 
messages that would explain this. My guess is that it is related to your network connection.
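
In case it helps, here is a minimal sketch of such a socket-based ping-pong
(no MPI involved). It is illustrative only: the port number, message size,
iteration count, and error handling are arbitrary placeholders, not taken
from any benchmark.

/* pingpong.c - minimal TCP ping-pong latency sketch (no MPI).
   Usage:   server:  ./pingpong server <port> <msg_bytes>
            client:  ./pingpong client <host> <port> <msg_bytes>      */
#include <arpa/inet.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#define ITERS 10000

/* Send or receive exactly n bytes, looping over short transfers. */
static void xfer(int fd, char *buf, size_t n, int do_send)
{
    size_t done = 0;
    while (done < n) {
        ssize_t r = do_send ? send(fd, buf + done, n - done, 0)
                            : recv(fd, buf + done, n - done, 0);
        if (r <= 0) { perror("xfer"); exit(1); }
        done += (size_t)r;
    }
}

int main(int argc, char **argv)
{
    int is_server = (argc == 4 && strcmp(argv[1], "server") == 0);
    size_t size = (size_t)atol(argv[argc - 1]);
    char *buf = calloc(size, 1);
    int fd;

    if (is_server) {
        struct sockaddr_in addr = { 0 };
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons((unsigned short)atoi(argv[2]));
        int lfd = socket(AF_INET, SOCK_STREAM, 0), one = 1;
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);
        bind(lfd, (struct sockaddr *)&addr, sizeof addr);
        listen(lfd, 1);
        fd = accept(lfd, NULL, NULL);
        for (int i = 0; i < ITERS; i++) {   /* echo every message back */
            xfer(fd, buf, size, 0);
            xfer(fd, buf, size, 1);
        }
    } else {
        struct addrinfo hints = { 0 }, *res;
        hints.ai_family = AF_INET;
        hints.ai_socktype = SOCK_STREAM;
        getaddrinfo(argv[2], argv[3], &hints, &res);
        fd = socket(AF_INET, SOCK_STREAM, 0);
        connect(fd, res->ai_addr, res->ai_addrlen);
        struct timeval t0, t1;
        gettimeofday(&t0, NULL);
        for (int i = 0; i < ITERS; i++) {
            xfer(fd, buf, size, 1);   /* ping */
            xfer(fd, buf, size, 0);   /* pong */
        }
        gettimeofday(&t1, NULL);
        double us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
        /* Report one-way latency: half the round-trip time per iteration. */
        printf("%zu bytes: %.2f us\n", size, us / ITERS / 2.0);
        freeaddrinfo(res);
    }
    close(fd);
    free(buf);
    return 0;
}

Run it over the same 1 Gb link as the MPI test, e.g. once per message size of
interest (the port 50007 here is just an example): "./pingpong server 50007 8192"
on one node, "./pingpong client <server-host> 50007 8192" on the other. If
2 KB still comes out faster than 1 KB and 8 KB faster than 4 KB, the effect
is below MPI (NIC, driver, or switch) rather than anything in MPICH.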

Min

On 2018/06/24 11:09, Abu Naser wrote:
>
> Hello Min and Jeff,
>
>
> Here are my experiment results. The default number of iterations in 
> osu_latency for 0 B – 8 KB is 10,000. With that setting I ran 
> osu_latency 100 times and found a standard deviation of 33 for the 
> 8 KB message size.
>
>
> Later I set the iteration count to 50,000 and 100,000 for the 1 KB – 
> 16 KB message sizes, then ran osu_latency 100 times for each setting 
> and took the average and standard deviation.
>
>
> Msg size     Avg time in us      Avg time in us      Std. deviation      Std. deviation
> in bytes     (50K iterations)    (100K iterations)   (50K iterations)    (100K iterations)
> 1K           85.10               84.9                0.55                0.45
> 2K           75.79               74.63               5.09                4.44
> 4K           273.80              274.71              4.18                2.45
> 8K           258.56              249.83              21.14               28
> 16K          281.31              281.02              3.22                4.10
>
>
>
> The standard deviation for the 8 KB message is very high, which implies 
> it is not producing a consistent latency. That appears to be the reason 
> 8 KB shows a lower time than 4 KB.
>
>
> Meanwhile, 2 KB has a standard deviation below 5, but the 1 KB latency 
> timings are more tightly clustered than the 2 KB ones. That is probably 
> the explanation for the lower 2 KB latency.
>
>
> Thank you for your suggestions.
>
>
>
>
> Best Regards,
>
> Abu Naser
>
> ------------------------------------------------------------------------
> From: Abu Naser
> Sent: Wednesday, June 20, 2018 1:48:53 PM
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] osu_latency test: why 8KB takes less 
> time than 4KB and 2KB takes less time than 1KB?
>
> Hello Min,
>
>
> Thanks for the clarification.  I will do the experiment.
>
>
> Thanks.
>
> Best Regards,
>
> Abu Naser
>
> ------------------------------------------------------------------------
> From: Min Si <msi at anl.gov>
> Sent: Wednesday, June 20, 2018 1:39:30 PM
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] osu_latency test: why 8KB takes less 
> time than 4KB and 2KB takes less time than 1KB?
> Hi Abu,
>
> I think Jeff means that you should run your experiment with more 
> iterations in order to get stable results:
> - Increase the number of iterations of the timing loop in each execution 
> (I believe the OSU benchmark lets you set this).
> - Run the experiments 10 or 100 times, and take the average and 
> standard deviation (a small helper for this is sketched below).
>
> If you see a very small standard deviation (e.g., <=5%), then the 
> trend is stable and you might not see such gaps.
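
For example, once the per-run latencies are collected in a plain text file
(one value per line), a small helper like the following sketch can compute
the mean and sample standard deviation. This is illustrative only and is not
part of the OSU suite:

/* stats.c - mean and sample standard deviation of values read from stdin,
   one latency (in microseconds) per line.  Illustrative sketch only. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double x, sum = 0.0, sumsq = 0.0;
    long   n = 0;

    while (scanf("%lf", &x) == 1) {
        sum   += x;
        sumsq += x * x;
        n++;
    }
    if (n < 2) {
        fprintf(stderr, "need at least two values\n");
        return 1;
    }
    double mean = sum / n;
    double sd   = sqrt((sumsq - n * mean * mean) / (n - 1));
    printf("n = %ld   mean = %.2f us   stddev = %.2f us\n", n, mean, sd);
    return 0;
}

A relative standard deviation (stddev divided by mean) below roughly 5%
would indicate that the run-to-run variation is small.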
>
> Best regards,
> Min
> On 2018/06/20 12:14, Abu Naser wrote:
>>
>> Hello Jeff,
>>
>>
>> Yes, I am using a switch, and the other machines are also connected to 
>> that switch.
>>
>> If I remove the other machines and just use my two nodes with the 
>> switch, will that improve the performance over 200 ~ 400 iterations?
>>
>> Meanwhile, I will give it a try with a single dedicated cable.
>>
>>
>> Thank you.
>>
>>
>> Best Regards,
>>
>> Abu Naser
>>
>> ------------------------------------------------------------------------
>> From: Jeff Hammond <jeff.science at gmail.com>
>> Sent: Wednesday, June 20, 2018 12:52:06 PM
>> To: MPICH
>> Subject: Re: [mpich-discuss] osu_latency test: why 8KB takes less 
>> time than 4KB and 2KB takes less time than 1KB?
>> Is the ethernet connection a single dedicated cable between the two 
>> machines or are you running through a switch that handles other traffic?
>>
>> My best guess is that this is noise and that you may be able to avoid 
>> it by running for a very long time, e.g. 10,000 iterations.
>>
>> Jeff
>>
>> On Wed, Jun 20, 2018 at 6:53 AM, Abu Naser <an16e at my.fsu.edu> wrote:
>>
>>
>>     Good day to all,
>>
>>
>>     I had run the point-to-point osu_latency test between two nodes 200
>>     times. The following are the average times in microseconds for
>>     various message sizes:
>>
>>     1KB    84.8514 us
>>     2KB    73.52535 us
>>     4KB    272.55275 us
>>     8KB    234.86385 us
>>     16KB    288.88 us
>>     32KB    523.3725 us
>>     64KB    910.4025 us
>>
>>
>>     From the above, it looks like a 2 KB message has lower latency than
>>     a 1 KB message, and 8 KB has lower latency than 4 KB.
>>
>>     I was looking for an explanation of this behavior but did not find one.
>>
>>
>>      1. MPIR_CVAR_CH3_EAGER_MAX_MSG_SIZE is set to 128 KB, so none of
>>         the above message sizes use the rendezvous protocol. Is there
>>         any partitioning inside the eager protocol (e.g. 0 - 512 bytes,
>>         1 KB - 8 KB, 16 KB - 64 KB)? If so, what are the boundaries, and
>>         can I log them with debug event logging?
>>
>>
>>     Setup I am using:
>>
>>     - two nodes with Intel Core i7 CPUs; one has 16 GB of memory, the other 8 GB
>>
>>     - MPICH 3.2.1, configured and built to use nemesis tcp
>>
>>     - 1 Gb Ethernet connection
>>
>>     - NFS is used for file sharing
>>
>>     - osu_latency: uses MPI_Send and MPI_Recv (a simplified sketch of the
>>       timing loop follows this list)
>>
>>     - MPIR_CVAR_CH3_EAGER_MAX_MSG_SIZE = 131072 (128 KB)
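
For reference, the core of such a latency test is essentially a ping-pong
loop like the simplified sketch below. This is not the actual OSU benchmark
code; the message size and iteration count are placeholders.

/* pingpong_mpi.c - simplified version of an MPI_Send/MPI_Recv latency loop
   (illustrative sketch, not the actual OSU benchmark code). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int iters = 10000, size = 1024;   /* illustrative values */
    char *buf = malloc(size);
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)   /* one-way latency = half of the round-trip time */
        printf("%d bytes: %.2f us\n", size, (t1 - t0) * 1e6 / iters / 2.0);

    free(buf);
    MPI_Finalize();
    return 0;
}

It can be launched across the two nodes with, e.g.,
"mpiexec -n 2 -hosts nodeA,nodeB ./pingpong_mpi" (the host names here are
placeholders).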
>>
>>
>>     Can anyone help me with this? Thanks in advance.
>>
>>
>>
>>
>>     Best Regards,
>>
>>     Abu Naser
>>
>>
>>
>>
>>
>>
>> -- 
>> Jeff Hammond
>> jeff.science at gmail.com
>> http://jeffhammond.github.io/
>>
>>
>
>
>
