[mpich-discuss] MPI memory allocation.
Anatoly G
anatolyrishon at gmail.com
Sat Dec 7 03:55:21 CST 2013
OK. I'll try both: MPI_Issend first, and then upgrading MPICH to 3.0.4.
I previously thought that MPICH and MPICH2 were two different branches,
with MPICH2 partially supporting fault tolerance and MPICH not. Now I
understand that I was wrong and that MPICH2 is just the main version of
MPICH.
Thank you very much,
Anatoly.
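
To illustrate the advice quoted below, here is a rough sketch of why a
synchronous send bounds the master's memory. It is a toy model in plain
Python, not MPI code: the function name simulate, the tick-based timing, and
the master_rate of 4 matches per tick are all assumptions made for
illustration. Only the count of 16 senders comes from the thread. In "eager"
mode a sender keeps sending whether or not its previous message was matched
(roughly how a small MPI_Isend behaves), while in "synchronous" mode a sender
stalls until its previous message has been matched (roughly how MPI_Issend
followed by MPI_Wait behaves).

```python
from collections import deque


def simulate(num_senders, master_rate, ticks, synchronous):
    """Return the peak number of unmatched messages buffered at the master.

    Toy model only: each sender tries to send one message per tick and the
    master can match at most `master_rate` messages per tick.
    """
    unexpected = deque()  # messages that arrived but are not yet matched
    waiting = set()       # senders stalled on an unmatched synchronous send
    peak = 0
    for _ in range(ticks):
        for rank in range(num_senders):
            if synchronous and rank in waiting:
                continue  # previous send not matched yet, so this sender waits
            unexpected.append(rank)
            if synchronous:
                waiting.add(rank)
        peak = max(peak, len(unexpected))
        # The master matches the oldest buffered messages first.
        for _ in range(min(master_rate, len(unexpected))):
            rank = unexpected.popleft()
            waiting.discard(rank)
    return peak


if __name__ == "__main__":
    print("eager peak:      ", simulate(16, 4, 1000, synchronous=False))
    print("synchronous peak:", simulate(16, 4, 1000, synchronous=True))
```

With these assumed numbers the eager backlog grows by 12 messages per tick
and peaks at 12004 buffered messages after 1000 ticks, while the synchronous
backlog never exceeds 16, one message per sender.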
On Thu, Dec 5, 2013 at 11:01 PM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
> The master is receiving more incoming messages than it can match
> quickly enough with Irecvs. Try using MPI_Issend instead of MPI_Isend.
>
> Rajeev
>
> On Dec 5, 2013, at 2:58 AM, Anatoly G <anatolyrishon at gmail.com> wrote:
>
> > Hello.
> > I'm using MPICH2 1.5.
> > My system contains a master and 16 slaves.
> > The system uses a number of communicators.
> > A single communicator is used for the scenario below:
> > Each slave continuously sends a 2 KB data buffer using MPI_Isend and
> > waits using MPI_Wait.
> > The master starts by posting an MPI_Irecv for each slave.
> > Then, in an endless loop, it calls MPI_Waitany and posts a new
> > MPI_Irecv for the rank returned by MPI_Waitany.
> >
> > Another communicator is used for broadcast communication (commands
> > between the master and the slaves), but it is not used in parallel with
> > the previous communicator, only before or after the data transfer.
> >
> > The system runs on two computers linked by 1 Gbit/s Ethernet.
> > The master runs on the first computer, and all slaves on the other one.
> > Network traffic is ~800 Mbit/s.
> >
> > After 1-2 minutes of execution, the master process starts to increase
> > its memory allocation and network traffic drops.
> > This memory growth and network slowdown continue until MPI fails,
> > without saving a core file.
> > My program doesn't allocate memory. Can you please explain this
> > behaviour?
> > How can I make MPI stop the slaves from sending when the master can't
> > keep up with the traffic, instead of allocating memory and failing?
> >
> >
> > Thank you,
> > Anatoly.
> >
> > P.S.
> > In my standalone test, I simulate similar behaviour, but with a single
> > thread in each process (master and hosts).
> > When I run the standalone test, the master stalls the slaves until it
> > finishes processing the accumulated data, and MPI doesn't increase its
> > memory allocation.
> > When the master is free, the slaves continue to send data.
> > _______________________________________________
> > discuss mailing list discuss at mpich.org
> > To manage subscription options or unsubscribe:
> > https://lists.mpich.org/mailman/listinfo/discuss
>