[mpich-discuss] mpich on Mac os x

Wesley Bland wbland at mcs.anl.gov
Mon Jul 8 10:53:48 CDT 2013


On Jul 8, 2013, at 10:39 AM, Reem Alraddadi <raba500 at york.ac.uk> wrote:

> Hi Pavan,
> 
> Could you explain what you mean by that? I am sorry, I am still a beginner with MPICH.

This isn't specific to MPICH; it would be the case with any MPI implementation. What he means is that when you are done using a communicator, you need to call MPI_Comm_free on it (MPI_Comm_free(&communicator) in C) so that MPICH can release the internal resources the communicator holds. If you never free your communicators, you will eventually run out, because only a limited number are available internally. The error output you posted counts free context IDs out of a pool of 2048 per process.
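
To make the point about a limited pool concrete, here is a toy Python model of context ID allocation. It is only a sketch under stated assumptions, not MPICH's actual allocator: the names ContextIdPool, create_comm, free_comm and count_creations are made up for illustration, the pool size of 2048 is taken from the "N/2048 free" figure in the error output, and the model assumes a new communicator needs one ID that is free on every participating process.

```python
POOL_SIZE = 2048  # taken from the "N/2048 free" figure in the error output


class ContextIdPool:
    """Toy stand-in for one process's table of free context IDs (hypothetical)."""

    def __init__(self, size=POOL_SIZE):
        self.free = set(range(size))

    def take(self, context_id):
        self.free.remove(context_id)

    def release(self, context_id):
        self.free.add(context_id)


def create_comm(pools):
    """Pick the lowest ID that is free on every participating process.

    Returns the ID, or None if the processes have no free ID in common.
    """
    common = set.intersection(*(p.free for p in pools))
    if not common:
        return None
    context_id = min(common)
    for p in pools:
        p.take(context_id)
    return context_id


def free_comm(pools, context_id):
    """Return the ID to every process's pool (the role MPI_Comm_free plays)."""
    for p in pools:
        p.release(context_id)


def count_creations(attempts, free_after_use):
    """Try to create `attempts` communicators on 4 processes; count successes."""
    pools = [ContextIdPool() for _ in range(4)]
    created = 0
    for _ in range(attempts):
        cid = create_comm(pools)
        if cid is None:
            break  # pool exhausted
        created += 1
        if free_after_use:
            free_comm(pools, cid)
    return created


if __name__ == "__main__":
    print(count_creations(5000, free_after_use=False))  # prints 2048
    print(count_creations(5000, free_after_use=True))   # prints 5000

    # Fragmentation: each process still has free IDs, but none in common.
    a, b = ContextIdPool(size=4), ContextIdPool(size=4)
    for cid in (0, 1):
        a.take(cid)
    for cid in (2, 3):
        b.take(cid)
    print(len(a.free), len(b.free), create_comm([a, b]))  # prints 2 2 None
```

Without freeing, the run stops once all 2048 IDs are in use. With freeing, every attempt succeeds because each ID is returned before the next request. The last line shows the fragmented case: both processes still have free IDs, but no ID is free on both. A real run has fewer usable IDs than this, since predefined communicators such as MPI_COMM_WORLD need context IDs too.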

> 
> Hi Wesley,
> I apologize for my confusing first reply; I forgot to change the subject. As I mentioned, I ran the following command:
> 
> mpirun --np 4 --env MPIR_PARAM_CTXID_EAGER_SIZE 1 ./flash4
> 
> I am wondering whether I made a mistake here, as the problem is still the same.

Yes, that's the correct way to use it, so your problem is most likely the one described above: communicators that are never freed.

> 
> Thanks, 
> Reem
> 
> 
> 
> Message: 1
> Date: Mon, 08 Jul 2013 09:17:00 -0500
> From: Pavan Balaji <balaji at mcs.anl.gov>
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] mpich on Mac os x
> Message-ID: <51DAC9DC.5000502 at mcs.anl.gov>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> 
> Also make sure you are freeing unused communicators.
> 
>   -- Pavan
> 
> On 07/08/2013 08:14 AM, Wesley Bland wrote:
> > It seems that you're creating more communicators than MPICH can handle. It's possible that you might be able to get around this by setting the environment variable MPIR_PARAM_CTXID_EAGER_SIZE to something smaller than its default (which is 2). That frees up a few more communicators, but there is a pathological case where even with fewer communicators than the max, MPICH won't be able to agree on a new communicator id when needed. Try changing that environment variable and see if that fixes things.
> >
> > Wesley
> >
> > On Jul 8, 2013, at 5:33 AM, Reem Alraddadi <raba500 at york.ac.uk> wrote:
> >
> >> Hi all,
> >> I am using mpich-3.0.4 on Mac OS X version 10.7.5 to run the FLASH code. It works fine at the beginning of the run, and then I get the following error:
> >>
> >> Fatal error in MPI_Comm_create: Other MPI error, error stack:
> >> MPI_Comm_create(600).................: MPI_Comm_create(comm=0x84000002, group=0xc8001349, new_comm=0x7fff606a8614) failed
> >> MPI_Comm_create(577).................:
> >> MPIR_Comm_create_intra(241)..........:
> >> MPIR_Get_contextid(799)..............:
> >> MPIR_Get_contextid_sparse_group(1146):  Cannot allocate context ID because of fragmentation (169/2048 free on this process; ignore_id=0)
> >> Fatal error in MPI_Comm_create: Other MPI error, error stack:
> >> MPI_Comm_create(600).................: MPI_Comm_create(comm=0x84000002, group=0xc80012b6, new_comm=0x7fff670cc614) failed
> >> MPI_Comm_create(577).................:
> >> MPIR_Comm_create_intra(241)..........:
> >> MPIR_Get_contextid(799)..............:
> >> MPIR_Get_contextid_sparse_group(1146):  Cannot allocate context ID because of fragmentation (316/2048 free on this process; ignore_id=0)
> >> Fatal error in MPI_Comm_create: Other MPI error, error stack:
> >> MPI_Comm_create(600).................: MPI_Comm_create(comm=0x84000004, group=0xc800000e, new_comm=0x7fff629d5614) failed
> >> MPI_Comm_create(577).................:
> >> MPIR_Comm_create_intra(241)..........:
> >> MPIR_Get_contextid(799)..............:
> >> MPIR_Get_contextid_sparse_group(1146):  Cannot allocate context ID because of fragmentation (2020/2048 free on this process; ignore_id=0)
> >> Fatal error in MPI_Comm_create: Other MPI error, error stack:
> >> MPI_Comm_create(600).................: MPI_Comm_create(comm=0x84000002, group=0xc8000020, new_comm=0x7fff639ae614) failed
> >> MPI_Comm_create(577).................:
> >> MPIR_Comm_create_intra(241)..........:
> >> MPIR_Get_contextid(799)..............:
> >> MPIR_Get_contextid_sparse_group(1146):  Cannot allocate context ID because of fragmentation (2002/2048 free on this process; ignore_id=0
> >>
> >> Is there a way to fix that?
> >>
> >> Thanks,
> >> Reem
> >> _______________________________________________
> >> discuss mailing list     discuss at mpich.org
> >> To manage subscription options or unsubscribe:
> >> https://lists.mpich.org/mailman/listinfo/discuss
> >
> 
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
