[mpich-discuss] osu_latency test: why 8KB takes less time than 4KB and 2KB takes less time than 1KB?
Min Si
msi at anl.gov
Tue Jun 26 11:54:29 CDT 2018
Hi Abu,
I think the results are stable enough. Perhaps you could also try the
following tests to see whether a similar trend appears:
- MPICH/socket (set `--with-device=ch3:sock` at configure)
- A socket-based pingpong test without MPI.
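
A minimal sketch of what such a socket-based pingpong test could look like, in Python rather than C for brevity. The helper names (`recv_exact`, `echo_server`, `pingpong_latency_us`, `run_local_demo`) are made up for this illustration, and setting TCP_NODELAY is an assumption about how the comparison should be configured. It reports half the round-trip time, as osu_latency does, but has no warm-up phase. The demo runs over loopback only; for a real measurement, `echo_server` would run on one node and `pingpong_latency_us` on the other.

```python
import socket
import threading
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock (TCP may deliver a message in pieces)."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_server(listener, size, iterations):
    """Accept one client and echo `iterations` messages of `size` bytes."""
    conn, _ = listener.accept()
    with conn:
        # Assumption: disable Nagle's algorithm so small sends are not delayed.
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(iterations):
            conn.sendall(recv_exact(conn, size))


def pingpong_latency_us(host, port, size, iterations):
    """Return the average one-way latency in microseconds (round trip / 2)."""
    payload = b"x" * size
    with socket.create_connection((host, port)) as sock:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(iterations):
            sock.sendall(payload)
            recv_exact(sock, size)
        elapsed = time.perf_counter() - start
    return elapsed / (2 * iterations) * 1e6


def run_local_demo(size, iterations=200):
    """Loopback self-check; not a substitute for a two-node measurement."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=echo_server,
                              args=(listener, size, iterations))
    server.start()
    try:
        return pingpong_latency_us("127.0.0.1", port, size, iterations)
    finally:
        server.join()
        listener.close()


if __name__ == "__main__":
    for size in (1024, 2048, 4096, 8192, 16384):
        print(f"{size:6d} B  {run_local_demo(size):10.2f} us")
```

If the 2k/8k dips also show up with plain sockets between the two nodes, that would point away from MPICH and toward the network path.
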
At this point, I cannot think of any MPI-specific design that treats
2k/8k messages differently. My guess is that it is related to your
network connection.
Min
On 2018/06/24 11:09, Abu Naser wrote:
>
> Hello Min and Jeff,
>
>
> Here are my experiment results. The default number of iterations in
> osu_latency for 0B – 8KB is 10,000. With that setting I ran
> osu_latency 100 times and found a standard deviation of 33 for the 8KB
> message size.
>
>
> So I then set the iteration count to 50,000 and 100,000 for the 1KB – 16KB
> message sizes, ran osu_latency 100 times for each setting, and
> took the average and standard deviation.
>
>
> Msg size   Avg time in us   Avg time in us   Std deviation   Std deviation
> (bytes)    (50K iters)      (100K iters)     (50K iters)     (100K iters)
> 1k          85.10            84.9             0.55            0.45
> 2k          75.79            74.63            5.09            4.44
> 4k         273.80           274.71            4.18            2.45
> 8k         258.56           249.83           21.14           28
> 16k        281.31           281.02            3.22            4.10
>
>
>
> The standard deviation for the 8K message is high, which implies that it
> is not producing a consistent latency. That looks like
> the reason 8K takes less time than 4K.
>
>
> Meanwhile, 2K has a standard deviation of around 5 (5.09 and 4.44), while
> the 1K message latencies are more tightly clustered than the 2K ones. So
> this is probably the explanation for the lower 2K latency.
>
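
A small sketch of the "average and standard deviation" check discussed in this thread, using the 50K-iteration figures from the table above. It assumes the "<=5%" rule of thumb means the standard deviation relative to the mean; that reading is an interpretation, not something the thread states. The names `relative_stdev_pct`, `summarize` and `REPORTED_50K` are invented for this example.

```python
import statistics


def relative_stdev_pct(mean, stdev):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev / mean


def summarize(samples):
    """Return (mean, sample standard deviation, relative deviation in %)."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    return mean, stdev, relative_stdev_pct(mean, stdev)


# (average in us, standard deviation) from the 50K-iteration columns
REPORTED_50K = {
    "1k": (85.10, 0.55),
    "2k": (75.79, 5.09),
    "4k": (273.80, 4.18),
    "8k": (258.56, 21.14),
    "16k": (281.31, 3.22),
}

if __name__ == "__main__":
    for size, (mean, stdev) in REPORTED_50K.items():
        pct = relative_stdev_pct(mean, stdev)
        verdict = "within 5%" if pct <= 5.0 else "above 5%"
        print(f"{size:>4}  {pct:5.2f}%  {verdict}")
```

Under that reading, only the 2k and 8k rows come out above 5% (roughly 6.7% and 8.2%), and those are the two sizes that show the inversion. `summarize` can be applied directly to the 100 per-run averages if the raw values are available.
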
>
> Thank you for your suggestions.
>
>
>
>
> Best Regards,
>
> Abu Naser
>
> ------------------------------------------------------------------------
> *From:* Abu Naser
> *Sent:* Wednesday, June 20, 2018 1:48:53 PM
> *To:* discuss at mpich.org
> *Subject:* Re: [mpich-discuss] osu_latency test: why 8KB takes less
> time than 4KB and 2KB takes less time than 1KB?
>
> Hello Min,
>
>
> Thanks for the clarification. I will do the experiment.
>
>
> Thanks.
>
> Best Regards,
>
> Abu Naser
>
> ------------------------------------------------------------------------
> *From:* Min Si <msi at anl.gov>
> *Sent:* Wednesday, June 20, 2018 1:39:30 PM
> *To:* discuss at mpich.org
> *Subject:* Re: [mpich-discuss] osu_latency test: why 8KB takes less
> time than 4KB and 2KB takes less time than 1KB?
> Hi Abu,
>
> I think Jeff means that you should run your experiment with more
> iterations in order to get stable results.
> - Increase the number of iterations of the for loop in each execution
> (I think the OSU benchmark allows you to set it)
> - Run the experiments 10 or 100 times, and take the average and
> standard deviation.
>
> If you see a very small standard deviation (e.g., <=5%), then the
> trend is stable and you might not see such gaps.
>
> Best regards,
> Min
> On 2018/06/20 12:14, Abu Naser wrote:
>>
>> Hello Jeff,
>>
>>
>> Yes, I am using a switch, and other machines are also connected to
>> that switch.
>>
>> If I remove the other machines and just use my two nodes with the switch,
>> will it improve the performance by 200 ~ 400 iterations?
>>
>> Meanwhile, I will try a single dedicated cable.
>>
>>
>> Thank you.
>>
>>
>> Best Regards,
>>
>> Abu Naser
>>
>> ------------------------------------------------------------------------
>> *From:* Jeff Hammond <jeff.science at gmail.com>
>> *Sent:* Wednesday, June 20, 2018 12:52:06 PM
>> *To:* MPICH
>> *Subject:* Re: [mpich-discuss] osu_latency test: why 8KB takes less
>> time than 4KB and 2KB takes less time than 1KB?
>> Is the ethernet connection a single dedicated cable between the two
>> machines or are you running through a switch that handles other traffic?
>>
>> My best guess is that this is noise and that you may be able to avoid
>> it by running a very long time, e.g. 10000 iterations.
>>
>> Jeff
>>
>> On Wed, Jun 20, 2018 at 6:53 AM, Abu Naser <an16e at my.fsu.edu> wrote:
>>
>>
>> Good day to all,
>>
>>
>> I ran the point-to-point osu_latency test on two nodes 200
>> times. The following are the average times in microseconds for
>> various message sizes:
>>
>> 1KB 84.8514 us
>> 2KB 73.52535 us
>> 4KB 272.55275 us
>> 8KB 234.86385 us
>> 16KB 288.88 us
>> 32KB 523.3725 us
>> 64KB 910.4025 us
>>
>>
>> From the above, it looks like the 2KB message has lower latency than 1KB
>> and 8KB has lower latency than 4KB.
>>
>> I was looking for an explanation of this behavior but did not find any.
>>
>>
>> 1. MPIR_CVAR_CH3_EAGER_MAX_MSG_SIZE is set to 128KB, so none of
>> the above message sizes uses the rendezvous protocol. Is there
>> any partitioning inside the eager protocol (e.g. 0 - 512 bytes, 1KB
>> - 8KB, 16KB - 64KB)? If yes, what are the boundaries between
>> the partitions? Can I log them with debug-event-logging?
>>
>>
>> Setup I am using:
>>
>> - two nodes with Intel Core i7 CPUs, one with 16 GB of memory and the other with 8 GB
>>
>> - MPICH 3.2.1, configured and built to use nemesis tcp
>>
>> - 1 Gb Ethernet connection
>>
>> - NFS is used for sharing
>>
>> - osu_latency: uses MPI_Send and MPI_Recv
>>
>> - MPIR_CVAR_CH3_EAGER_MAX_MSG_SIZE = 131072 (128KB)
>>
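
As a quick sanity check of the arithmetic in the setup above, the sketch below compares each tested message size with the reported eager limit. It compares payload sizes only and ignores any packet header overhead, which is a simplifying assumption; `below_eager_limit` and the constants are names invented for this example, not MPICH identifiers.

```python
# Value of MPIR_CVAR_CH3_EAGER_MAX_MSG_SIZE reported in the post, in bytes
EAGER_MAX_MSG_SIZE = 131072

# Message sizes from the latency list in the post, in KB
MESSAGE_SIZES_KB = [1, 2, 4, 8, 16, 32, 64]


def below_eager_limit(msg_bytes, limit=EAGER_MAX_MSG_SIZE):
    """Payload-only comparison; header overhead is ignored (assumption)."""
    return msg_bytes <= limit


if __name__ == "__main__":
    # 128 KB expressed in bytes matches the configured value
    assert EAGER_MAX_MSG_SIZE == 128 * 1024
    for kb in MESSAGE_SIZES_KB:
        status = "eager" if below_eager_limit(kb * 1024) else "rendezvous"
        print(f"{kb:3d} KB -> {status}")
```

All seven sizes fall below the limit, which agrees with the statement that none of them should be using the rendezvous protocol.
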
>>
>> Can anyone help me with that? Thanks in advance.
>>
>>
>>
>>
>> Best Regards,
>>
>> Abu Naser
>>
>>
>> _______________________________________________
>> discuss mailing list discuss at mpich.org <mailto:discuss at mpich.org>
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>> <https://lists.mpich.org/mailman/listinfo/discuss>
>>
>>
>>
>>
>> --
>> Jeff Hammond
>> jeff.science at gmail.com <mailto:jeff.science at gmail.com>
>> http://jeffhammond.github.io/
>>
>>