[mpich-discuss] discuss Digest, Vol 16, Issue 7

Soheil Hooshdaran shooshdaran577 at gmail.com
Thu Feb 20 11:15:18 CST 2014


It causes no problem at all, since the categories/buckets are large enough.
Only the malloc() function causes the problem. I read that it is not
'interrupt safe'. What does that mean in practice, and what should be done about it?
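
From what I have read so far, 'not interrupt safe' seems to mean that malloc()
is not async-signal-safe: it updates shared heap bookkeeping, so calling it from
a signal handler that interrupted another allocation can deadlock or corrupt the
heap. (Apparently a crash inside malloc() can also simply mean the heap was
already corrupted by an earlier out-of-bounds write.) Below is a minimal sketch
of the workaround I found described -- set a flag in the handler and allocate
from normal program flow. The names are placeholders, not from my actual
program. Is this the right approach?

// Minimal sketch of "don't allocate in a signal handler" (placeholder names,
// not from my actual program): the handler only sets a sig_atomic_t flag,
// and malloc() is called from normal program flow.
#include <csignal>
#include <cstdio>
#include <cstdlib>
#include <unistd.h>

static volatile std::sig_atomic_t g_need_more_buckets = 0;

static void on_sigusr1(int)
{
    g_need_more_buckets = 1;   // only async-signal-safe work here
}

int main()
{
    std::signal(SIGUSR1, on_sigusr1);
    for (;;) {
        if (g_need_more_buckets) {
            g_need_more_buckets = 0;
            // Safe: we are back in normal program flow, not inside the handler.
            int *extra = static_cast<int *>(std::malloc(1024 * sizeof(int)));
            if (extra == NULL) { std::perror("malloc"); return EXIT_FAILURE; }
            std::printf("allocated extra bucket space\n");
            std::free(extra);
        }
        pause();   // sleep until the next signal arrives
    }
}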


On Wed, Feb 19, 2014 at 8:08 AM, <discuss-request at mpich.org> wrote:

> Send discuss mailing list submissions to
>         discuss at mpich.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://lists.mpich.org/mailman/listinfo/discuss
> or, via email, send a message with subject or body 'help' to
>         discuss-request at mpich.org
>
> You can reach the person managing the list at
>         discuss-owner at mpich.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of discuss digest..."
>
>
> Today's Topics:
>
>    1. Re:  dynamic 2D array creation error (Kenneth Raffenetti)
>    2. Re:  urgent-malloc problem (Gilles Gouaillardet)
>    3. Re:  Communication Error when installing MPICH on multi
>       HOSTS. (Balaji, Pavan)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 18 Feb 2014 07:48:39 -0600
> From: Kenneth Raffenetti <raffenet at mcs.anl.gov>
> To: <discuss at mpich.org>
> Subject: Re: [mpich-discuss] dynamic 2D array creation error
> Message-ID: <530364B7.5040908 at mcs.anl.gov>
> Content-Type: text/plain; charset="UTF-8"; format=flowed
>
> This list is for asking questions about the usage of MPICH. General
> programming questions like the one below are better suited to a forum such
> as Stack Overflow.
>
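> That said, assuming iWorldSize, i, and m are declared elsewhere, the snippet
> quoted below compiles; it just allocates every row as a separate heap block.
> If whole buckets are later passed to MPI calls, a single contiguous block is
> usually more convenient. A minimal sketch (illustrative only, with placeholder
> sizes, not the original code):
>
> // Same lBucket[i][j] indexing as the quoted snippet, but backed by one
> // contiguous block, with no manual delete[] needed.
> #include <cstddef>
> #include <cstdio>
> #include <vector>
>
> int main()
> {
>     const int iWorldSize = 4;  // placeholder for the number of processes
>     const int m = 8;           // placeholder bucket capacity
>
>     std::vector<int> storage(static_cast<std::size_t>(iWorldSize) * m, 0);
>     std::vector<int*> lBucket(iWorldSize);
>     for (int i = 0; i < iWorldSize; ++i)
>         lBucket[i] = storage.data() + static_cast<std::size_t>(i) * m;
>
>     lBucket[2][3] = 42;                  // row 2, column 3
>     std::printf("%d\n", lBucket[2][3]);  // prints 42
>     return 0;
> }
>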
> On 02/18/2014 12:44 AM, Soheil Hooshdaran wrote:
> > Hello. What's wrong with this code snippet?
> >
> >     int     **lBucket;
> >
> >     lBucket = new int*[iWorldSize]; // iWorldSize is the number of processors
> >
> >     for (i = 0; i < iWorldSize; ++i)
> >         lBucket[i] = new int[m];
> >
> >
> >
> > Thanks in advance
> >
> >
> >
> >
> > _______________________________________________
> > discuss mailing list     discuss at mpich.org
> > To manage subscription options or unsubscribe:
> > https://lists.mpich.org/mailman/listinfo/discuss
> >
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 19 Feb 2014 12:07:43 +0900
> From: Gilles Gouaillardet <gilles.gouaillardet at iferc.org>
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] urgent-malloc problem
> Message-ID: <53041FFF.4040101 at iferc.org>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hello,
>
> How do you run your program? E.g.
>
> ./a.out <m> <base> <r>
>
> With which values for m, base, and r?
>
> There is no check on boundaries, and depending on the input parameters, the
> indexes in
>
> cat[count][ cursor[count]++ ]
>
> can go out of bounds.
>
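> For illustration only (the real cat/cursor declarations were not posted, so
> the names and sizes below are assumptions): wrapping that write in a checked
> helper makes an overflow fail loudly instead of silently corrupting the heap,
> which is typically what makes a later malloc() crash.
>
> #include <cstdio>
> #include <cstdlib>
>
> // Hypothetical layout: 'cat' has nbuckets rows of 'capacity' ints each,
> // and cursor[count] is the next free slot in row 'count'.
> static void bucket_push(int **cat, int *cursor, int nbuckets, int capacity,
>                         int count, int value)
> {
>     if (count < 0 || count >= nbuckets ||
>         cursor[count] < 0 || cursor[count] >= capacity) {
>         std::fprintf(stderr, "bucket overflow: count=%d capacity=%d\n",
>                      count, capacity);
>         std::abort();                    // fail fast instead of corrupting memory
>     }
>     cat[count][cursor[count]++] = value; // the original unchecked write
> }
>
> int main()
> {
>     const int nbuckets = 4, capacity = 3;  // demo sizes (assumptions)
>     int **cat = new int *[nbuckets];
>     int *cursor = new int[nbuckets]();     // zero-initialised cursors
>     for (int i = 0; i < nbuckets; ++i)
>         cat[i] = new int[capacity];
>
>     for (int v = 0; v < 5; ++v)            // the 4th push into bucket 1 aborts
>         bucket_push(cat, cursor, nbuckets, capacity, 1, v);
>     return 0;
> }
>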
> IMHO, this is not MPICH-related at all.
>
> Best regards,
>
> Gilles
>
>
> On 2014/02/13 0:22, Soheil Hooshdaran wrote:
> > Hello.
> > I have a memory allocation problem (using malloc). I can't figure out its
> > cause. Could you help me please?
> >
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 19 Feb 2014 04:38:38 +0000
> From: "Balaji, Pavan" <balaji at anl.gov>
> To: "discuss at mpich.org" <discuss at mpich.org>
> Subject: Re: [mpich-discuss] Communication Error when installing MPICH
>         on multi HOSTS.
> Message-ID: <CF29912C.5996E%balaji at anl.gov>
> Content-Type: text/plain; charset="utf-8"
>
>
> It's hard to tell, but this does indicate some problem with your
> communication setup. Did you verify your /etc/hosts as described on the
> FAQ page?
>
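> For reference, a sketch of what to check (the 10.0.0.x addresses below are
> placeholders for your actual VM IPs): the Ubuntu installer often adds a
> "127.0.1.1  <hostname>" line, and if a hostname resolves to a loopback
> address, processes on the other machine cannot connect back to it. On both
> VMs, /etc/hosts should look roughly like
>
>   127.0.0.1   localhost
>   10.0.0.2    mpimaster
>   10.0.0.3    mpislaver1
>
> with no loopback entry for either of the two hostnames.
>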
>   -- Pavan
>
> From: Jie-Jun Wu <wu_0317 at qq.com>
> Reply-To: "discuss at mpich.org" <discuss at mpich.org>
> Date: Tuesday, February 18, 2014 at 5:21 AM
> To: discuss <discuss at mpich.org>
> Subject: [mpich-discuss] Communication Error when installing MPICH on
> multi HOSTS.
>
> Hi.
>
> My environment:
> Two VMware VMs running Ubuntu Server 12.04, called mpimaster and mpislaver1;
> they are both attached to the virtual network 10.0.0.1;
> they can ssh to each other without a password;
> I have disabled the firewalls with "sudo ufw disable";
> I installed mpich-3.0.4 on an NFS share served by mpimaster.
>
> I installed mpich-3.0.4 following the "readme.txt". There is a communication
> problem when processes from different hosts communicate with each other.
> [inline screenshot of the terminal session; transcript reproduced below]
>
> From the picture above we can see it is OK to run "cpi" on each host
> separately.
>
> If you can't see the picture, please see the shell output below.
>
> ailab at mpimaster:~/Downloads/mpich-3.0.4$ mpiexec -n 4 ./examples/cpi
> Process 0 of 4 is on mpimaster
> Process 1 of 4 is on mpimaster
> Process 2 of 4 is on mpimaster
> Process 3 of 4 is on mpimaster
> pi is approximately 3.1415926544231239, Error is 0.0000000008333307
> wall clock time = 0.028108
> ailab at mpimaster:~/Downloads/mpich-3.0.4$ mpiexec -hosts mpimaster -n 4
> ./examples/cpi
> Process 2 of 4 is on mpimaster
> Process 0 of 4 is on mpimaster
> Process 1 of 4 is on mpimaster
> Process 3 of 4 is on mpimaster
> pi is approximately 3.1415926544231239, Error is 0.0000000008333307
> wall clock time = 0.027234
> ailab at mpimaster:~/Downloads/mpich-3.0.4$ mpiexec -hosts mpislaver1 -n 4
> ./examples/cpi
> Process 0 of 4 is on mpislaver1
> pi is approximately 3.1415926544231239, Error is 0.0000000008333307
> wall clock time = 0.000093
> Process 1 of 4 is on mpislaver1
> Process 2 of 4 is on mpislaver1
> Process 3 of 4 is on mpislaver1
> ailab at mpimaster:~/Downloads/mpich-3.0.4$ mpiexec -hosts
> mpimaster,mpislaver1 -n 4 ./examples/cpi
> Process 0 of 4 is on mpimaster
> Process 2 of 4 is on mpimaster
> Fatal error in PMPI_Reduce: A process has failed, error stack:
> PMPI_Reduce(1217)...............: MPI_Reduce(sbuf=0x7fff73a51ce8,
> rbuf=0x7fff73a51cf0, count=1, MPI_DOUBLE, MPI_SUM, root=0, MPI_COMM_WORLD)
> failed
> MPIR_Reduce_impl(1029)..........:
> MPIR_Reduce_intra(779)..........:
> MPIR_Reduce_impl(1029)..........:
> MPIR_Reduce_intra(835)..........:
> MPIR_Reduce_binomial(144).......:
> MPIDI_CH3U_Recvq_FDU_or_AEP(667): Communication error with rank 1
> MPIR_Reduce_intra(799)..........:
> MPIR_Reduce_impl(1029)..........:
> MPIR_Reduce_intra(835)..........:
> MPIR_Reduce_binomial(206).......: Failure during collective
>
>
> ================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   EXIT CODE: 1
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>
> ================================================================================
> [proxy:0:1 at mpislaver1] HYD_pmcd_pmip_control_cmd_cb
> (./pm/pmiserv/pmip_cb.c:886)
> [proxy:0:1 at mpislaver1] HYDT_dmxu_poll_wait_for_event
> (./tools/demux/demux_poll.c
> [proxy:0:1 at mpislaver1] main (./pm/pmiserv/pmip.c:206): demux engine error
> waitin
> [mpiexec at mpimaster] HYDT_bscu_wait_for_completion
> (./tools/bootstrap/utils/bscu_
> [mpiexec at mpimaster] HYDT_bsci_wait_for_completion
> (./tools/bootstrap/src/bsci_wa
> [mpiexec at mpimaster] HYD_pmci_wait_for_completion
> (./pm/pmiserv/pmiserv_pmci.c:21
> [mpiexec at mpimaster] main (./ui/mpich/mpiexec.c:331): process manager
> error waiti
> ailab at mpimaster:~/Downloads/mpich-3.0.4$
>
> Please help, thanks!
>
>
> ------------------
> Jie-Jun Wu
> Department of Computer Science,
> Sun Yat-sen University,
> Guangzhou,
> P.R. China
>
>
> ------------------------------
>
> _______________________________________________
> discuss mailing list
> discuss at mpich.org
> https://lists.mpich.org/mailman/listinfo/discuss
>
> End of discuss Digest, Vol 16, Issue 7
> **************************************
>



-- 
The Chosen One (peace be upon him) said: "Have mercy on three: the mighty one
of a people who has been humbled, the rich one of a people who has become poor,
and a learned man who is toyed with by the ignorant."

