[mpich-discuss] mpich on Mac os x

Jeff Hammond jeff.science at gmail.com
Mon Jul 8 10:25:01 CDT 2013


Is this the FLASH astrophysics code developed at the University of
Chicago?  That code is run with MPICH all the time without issues, so one
has to assume that the way you are using it is at least partly causing
the problem.
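
The "Cannot allocate context ID because of fragmentation" message quoted
below can be illustrated with a toy model. This is only a sketch, not
MPICH's actual implementation. It assumes, as the "free on this process"
wording suggests, that each process tracks its own pool of free context
IDs and that a new communicator needs an ID that is free on every
participating process. The names ctx_mask_t, alloc_context_id,
free_context_id and the 64-entry pool are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_IDS 64            /* toy pool size; the quoted log reports 2048 */
#define NO_COMMON_ID (-1)     /* returned when the ranks cannot agree on an ID */

/* One bit per context ID: 1 = free on this rank, 0 = in use. */
typedef uint64_t ctx_mask_t;

/* Number of free IDs on one rank (the "N/2048 free" figure in the log). */
int count_free(ctx_mask_t mask)
{
    int n = 0;
    for (; mask != 0; mask &= mask - 1)   /* clear the lowest set bit */
        n++;
    return n;
}

/*
 * Hypothetical stand-in for the collective agreement step: AND the masks
 * of all ranks, pick the lowest ID that is free everywhere, and mark it
 * as used on every rank. Fails if no single ID is free on all ranks,
 * even when each rank still has plenty of free IDs of its own.
 */
int alloc_context_id(ctx_mask_t *masks, int nranks)
{
    ctx_mask_t common = ~(ctx_mask_t)0;
    for (int r = 0; r < nranks; r++)
        common &= masks[r];
    if (common == 0)
        return NO_COMMON_ID;              /* fragmentation */

    int id = 0;
    while (((common >> id) & 1) == 0)
        id++;
    for (int r = 0; r < nranks; r++)
        masks[r] &= ~((ctx_mask_t)1 << id);
    return id;
}

/* Hypothetical stand-in for freeing a communicator: return its ID to the
 * pool on each rank that held it. */
void free_context_id(ctx_mask_t *masks, int nranks, int id)
{
    assert(id >= 0 && id < NUM_IDS);
    for (int r = 0; r < nranks; r++)
        masks[r] |= (ctx_mask_t)1 << id;
}
```

In this model, take two ranks where the first has only the low 32 IDs
free and the second only the high 32. Each reports 32 of 64 free, yet
alloc_context_id returns NO_COMMON_ID. Once the first rank releases an
ID that is already free on the second, the next call succeeds, which is
why freeing unused communicators (MPI_Comm_free in a real program)
matters here.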

Jeff

On Mon, Jul 8, 2013 at 9:17 AM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
>
> Also make sure you are freeing unused communicators.
>
>  -- Pavan
>
> On 07/08/2013 08:14 AM, Wesley Bland wrote:
>>
>> It seems that you're creating more communicators than MPICH can handle.
>> It's possible that you might be able to get around this by setting the
>> environment variable MPIR_PARAM_CTXID_EAGER_SIZE to something smaller than
>> its default (which is 2). That frees up a few more communicators, but there
>> is a pathological case where even with fewer communicators than the max,
>> MPICH won't be able to agree on a new communicator id when needed. Try
>> changing that environment variable and see if that fixes things.
>>
>> Wesley
>>
>> On Jul 8, 2013, at 5:33 AM, Reem Alraddadi <raba500 at york.ac.uk> wrote:
>>
>>> Hi all,
>>> I am using mpich-3.0.4 on Mac OS X version 10.7.5 to run the FLASH code.
>>> It works fine at the beginning of the run, and then I get the following
>>> error:
>>>
>>> Fatal error in MPI_Comm_create: Other MPI error, error stack:
>>> MPI_Comm_create(600).................: MPI_Comm_create(comm=0x84000002,
>>> group=0xc8001349, new_comm=0x7fff606a8614) failed
>>> MPI_Comm_create(577).................:
>>> MPIR_Comm_create_intra(241)..........:
>>> MPIR_Get_contextid(799)..............:
>>> MPIR_Get_contextid_sparse_group(1146):  Cannot allocate context ID
>>> because of fragmentation (169/2048 free on this process; ignore_id=0)
>>> Fatal error in MPI_Comm_create: Other MPI error, error stack:
>>> MPI_Comm_create(600).................: MPI_Comm_create(comm=0x84000002,
>>> group=0xc80012b6, new_comm=0x7fff670cc614) failed
>>> MPI_Comm_create(577).................:
>>> MPIR_Comm_create_intra(241)..........:
>>> MPIR_Get_contextid(799)..............:
>>> MPIR_Get_contextid_sparse_group(1146):  Cannot allocate context ID
>>> because of fragmentation (316/2048 free on this process; ignore_id=0)
>>> Fatal error in MPI_Comm_create: Other MPI error, error stack:
>>> MPI_Comm_create(600).................: MPI_Comm_create(comm=0x84000004,
>>> group=0xc800000e, new_comm=0x7fff629d5614) failed
>>> MPI_Comm_create(577).................:
>>> MPIR_Comm_create_intra(241)..........:
>>> MPIR_Get_contextid(799)..............:
>>> MPIR_Get_contextid_sparse_group(1146):  Cannot allocate context ID
>>> because of fragmentation (2020/2048 free on this process; ignore_id=0)
>>> Fatal error in MPI_Comm_create: Other MPI error, error stack:
>>> MPI_Comm_create(600).................: MPI_Comm_create(comm=0x84000002,
>>> group=0xc8000020, new_comm=0x7fff639ae614) failed
>>> MPI_Comm_create(577).................:
>>> MPIR_Comm_create_intra(241)..........:
>>> MPIR_Get_contextid(799)..............:
>>> MPIR_Get_contextid_sparse_group(1146):  Cannot allocate context ID
>>> because of fragmentation (2002/2048 free on this process; ignore_id=0)
>>>
>>> Is there a way to fix that?
>>>
>>> Thanks,
>>> Reem
>>> _______________________________________________
>>> discuss mailing list     discuss at mpich.org
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>>
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji



-- 
Jeff Hammond
jeff.science at gmail.com


