[mpich-discuss] MPI memory allocation.
Anatoly G
anatolyrishon at gmail.com
Thu Dec 5 02:58:18 CST 2013
Hello.
I'm using MPICH2 1.5.
My system consists of a master and 16 slaves.
The system uses a number of communicators.
A single communicator is used for the scenario below:
Each slave continuously sends a 2 KB data buffer using MPI_Isend and waits
for completion using MPI_Wait.
The master starts by posting an MPI_Irecv for each slave.
Then, in an endless loop, it calls MPI_Waitany and posts a new MPI_Irecv
for the slave whose receive MPI_Waitany reported as completed.
Another communicator is used for broadcast communication (commands between
the master and the slaves),
but it is not used in parallel with the previous communicator,
only before or after the data transfer.
The system runs on two computers linked by 1 Gbit/s Ethernet.
The master runs on the first computer and all slaves on the other one.
Network traffic is ~800 Mbit/s.
After 1-2 minutes of execution, the master process starts to increase its
memory allocation and network traffic drops.
The memory growth and network slowdown continue until MPI fails,
without saving a core file.
My program doesn't allocate memory. Can you please explain this behaviour?
How can I make MPI stop the slaves from sending when the master can't keep up
with such traffic, instead of allocating memory and failing?
Thank you,
Anatoly.
P.S.
In my standalone test I simulate similar behaviour, but with a single
thread in each process (master and hosts).
When I run the standalone test, the master stops the slaves until it finishes
processing the accumulated data, and MPI doesn't increase its memory allocation.
When the master is free, the slaves continue to send data.