[mpich-discuss] mpich on Mac os x

Reem Alraddadi raba500 at york.ac.uk
Mon Jul 8 11:32:25 CDT 2013


Hi Wesley,

How could I do that? From what I read on the internet, I have to set:
   include "mpif.h"
   Call MPI_COMM_FREE

I found where I can set that in the FLASH code (I have attached the file), so
I set:
   integer, parameter :: FLASH_COMM = MPI_COMM_FREE
but I got an error when I compiled the code:

    Included at Driver_abortFlash.F90:34:

   integer, parameter :: FLASH_COMM = MPI_COMM_FREE
                                   1
Error: Symbol 'flash_comm' at (1) already has basic type of INTEGER
make: *** [Driver_abortFlash.o] Error 1

The code used MPI_COMM_WORLD.
So is there a way to fix that?

Thanks,
Reem
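
For reference, MPI_COMM_FREE is a subroutine, not a constant, so it cannot be
assigned to FLASH_COMM. The usual pattern is to leave FLASH_COMM as it was
(presumably MPI_COMM_WORLD) and to call MPI_Comm_free on each communicator the
code creates once it is finished with it. The compiler message above most
likely means that FLASH_COMM ended up declared twice in the included header.
What follows is a minimal sketch of the create/free pattern, not FLASH code:
the program name, the variables world_group, sub_group, sub_comm and ranks,
and the loop count are all hypothetical.

```fortran
! Sketch only, not taken from FLASH. All names below are hypothetical.
program comm_free_sketch
  implicit none
  include "mpif.h"
  integer :: ierr, world_group, sub_group, sub_comm, step
  integer :: ranks(1)

  call MPI_Init(ierr)
  call MPI_Comm_group(MPI_COMM_WORLD, world_group, ierr)

  ranks(1) = 0
  do step = 1, 5000   ! arbitrary count, larger than the 2048 context IDs
     ! Build a group containing only rank 0 and a communicator for it.
     ! MPI_Comm_create is collective: every rank in MPI_COMM_WORLD calls it.
     call MPI_Group_incl(world_group, 1, ranks, sub_group, ierr)
     call MPI_Comm_create(MPI_COMM_WORLD, sub_group, sub_comm, ierr)

     ! ... use sub_comm here ...

     ! Release both handles once they are no longer needed.
     ! Ranks outside the group receive MPI_COMM_NULL, which must not be freed.
     if (sub_comm /= MPI_COMM_NULL) call MPI_Comm_free(sub_comm, ierr)
     call MPI_Group_free(sub_group, ierr)
  end do

  call MPI_Group_free(world_group, ierr)
  call MPI_Finalize(ierr)
end program comm_free_sketch
```

In an application, the thing to look for is every MPI_Comm_create,
MPI_Comm_split or MPI_Comm_dup call that sits inside a loop or a repeatedly
called routine, and to check that each resulting communicator is eventually
passed to MPI_Comm_free.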



Date: Mon, 8 Jul 2013 10:25:38 -0500
> From: Wesley Bland <wbland at mcs.anl.gov>
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] discuss Digest, Vol 9, Issue 13
> Message-ID: <9A8EEBB2-72BB-429B-9368-150983B1C283 at mcs.anl.gov>
> Content-Type: text/plain; charset="iso-8859-1"
>
> That's the correct way to set that environment variable, but I would have
> to agree with Pavan here. If you're running into the cap on context id's
> (especially with so few processes) it would seem that you're creating a lot
> of communicators that you probably don't need. Are you sure that you're
> freeing them correctly after use?
>
> Wesley
>
> On Jul 8, 2013, at 10:05 AM, Reem Alraddadi <raba500 at york.ac.uk> wrote:
>
> > Hi Wesley.
> > I wrote the following:
> > mpirun --np 4 --env MPIR_PARAM_CTXID_EAGER_SIZE 1 ./flash4
> > but the error is still the same. Did I do it the wrong way?
> >
> > Thanks,
> > Reem
> >
> > Message: 5
> > Date: Mon, 8 Jul 2013 08:14:48 -0500
> > From: Wesley Bland <wbland at mcs.anl.gov>
> > To: discuss at mpich.org
> > Subject: Re: [mpich-discuss] mpich on Mac os x
> > Message-ID: <8DC984B2-4E4B-4BFE-806E-203463A7A4E4 at mcs.anl.gov>
> > Content-Type: text/plain; charset=iso-8859-1
> >
> > It seems that you're creating more communicators than MPICH can handle.
> > It's possible that you might be able to get around this by setting the
> > environment variable MPIR_PARAM_CTXID_EAGER_SIZE to something smaller than
> > its default (which is 2). That frees up a few more communicators, but there
> > is a pathological case where even with fewer communicators than the max,
> > MPICH won't be able to agree on a new communicator id when needed. Try
> > changing that environment variable and see if that fixes things.
> >
> > Wesley
> >
> > On Jul 8, 2013, at 5:33 AM, Reem Alraddadi <raba500 at york.ac.uk> wrote:
> >
> > > Hi all,
> > > I am using mpich-3.0.4 on Mac os x version 10.7.5 to run FLASH code. It
> > > works fine in the beginning of the run and then I got the following error:
> > >
> > > Fatal error in MPI_Comm_create: Other MPI error, error stack:
> > > MPI_Comm_create(600).................: MPI_Comm_create(comm=0x84000002, group=0xc8001349, new_comm=0x7fff606a8614) failed
> > > MPI_Comm_create(577).................:
> > > MPIR_Comm_create_intra(241)..........:
> > > MPIR_Get_contextid(799)..............:
> > > MPIR_Get_contextid_sparse_group(1146):  Cannot allocate context ID because of fragmentation (169/2048 free on this process; ignore_id=0)
> > > Fatal error in MPI_Comm_create: Other MPI error, error stack:
> > > MPI_Comm_create(600).................: MPI_Comm_create(comm=0x84000002, group=0xc80012b6, new_comm=0x7fff670cc614) failed
> > > MPI_Comm_create(577).................:
> > > MPIR_Comm_create_intra(241)..........:
> > > MPIR_Get_contextid(799)..............:
> > > MPIR_Get_contextid_sparse_group(1146):  Cannot allocate context ID because of fragmentation (316/2048 free on this process; ignore_id=0)
> > > Fatal error in MPI_Comm_create: Other MPI error, error stack:
> > > MPI_Comm_create(600).................: MPI_Comm_create(comm=0x84000004, group=0xc800000e, new_comm=0x7fff629d5614) failed
> > > MPI_Comm_create(577).................:
> > > MPIR_Comm_create_intra(241)..........:
> > > MPIR_Get_contextid(799)..............:
> > > MPIR_Get_contextid_sparse_group(1146):  Cannot allocate context ID because of fragmentation (2020/2048 free on this process; ignore_id=0)
> > > Fatal error in MPI_Comm_create: Other MPI error, error stack:
> > > MPI_Comm_create(600).................: MPI_Comm_create(comm=0x84000002, group=0xc8000020, new_comm=0x7fff639ae614) failed
> > > MPI_Comm_create(577).................:
> > > MPIR_Comm_create_intra(241)..........:
> > > MPIR_Get_contextid(799)..............:
> > > MPIR_Get_contextid_sparse_group(1146):  Cannot allocate context ID because of fragmentation (2002/2048 free on this process; ignore_id=0
> > >
> > > Is there a way to fix that ?
> > >
> > > Thanks,
> > > Reem
>


> On Mon, Jul 8, 2013 at 9:17 AM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
> >
> > Also make sure you are freeing unused communicators.
> >
> >  -- Pavan
> >
> >
> > --
> > Pavan Balaji
> > http://www.mcs.anl.gov/~balaji
> > _______________________________________________
> > discuss mailing list     discuss at mpich.org
> > To manage subscription options or unsubscribe:
> > https://lists.mpich.org/mailman/listinfo/discuss
>
>
>
> --
> Jeff Hammond
> jeff.science at gmail.com
>
>
> ------------------------------
>
> Message: 3
> Date: Mon, 8 Jul 2013 16:39:03 +0100
> From: Reem Alraddadi <raba500 at york.ac.uk>
> To: discuss at mpich.org
> Subject: [mpich-discuss] mpich on Mac os x
> Message-ID: <CANKsntEJR0b_TA-tbdBO3fOV2O8PkwTG82a718qZPd3FnEBT3w at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi Pavan,
>
> Could you explain more about what you mean by that? I am sorry, I am still a
> beginner in MPICH.
>
> Hi Wesley,
> I apologize for my confusing first reply; I forgot to change the
> subject. However, as I mentioned, I wrote the following line:
>
> mpirun --np 4 --env MPIR_PARAM_CTXID_EAGER_SIZE 1 ./flash4
>
> I am wondering if I made any mistake here, as the problem is still the same.
>
> Thanks,
> Reem
>
>
>
> Message: 1
> > Date: Mon, 08 Jul 2013 09:17:00 -0500
> > From: Pavan Balaji <balaji at mcs.anl.gov>
> > To: discuss at mpich.org
> > Subject: Re: [mpich-discuss] mpich on Mac os x
> > Message-ID: <51DAC9DC.5000502 at mcs.anl.gov>
> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> >
> >
> > Also make sure you are freeing unused communicators.
> >
> >   -- Pavan
> >
>
> ------------------------------
>
> Message: 4
> Date: Mon, 8 Jul 2013 10:53:48 -0500
> From: Wesley Bland <wbland at mcs.anl.gov>
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] mpich on Mac os x
> Message-ID: <CEEC8B51-FFDF-40FE-BE56-E25F9C05BB27 at mcs.anl.gov>
> Content-Type: text/plain; charset="iso-8859-1"
>
> On Jul 8, 2013, at 10:39 AM, Reem Alraddadi <raba500 at york.ac.uk> wrote:
>
> > Hi Pavan,
> >
> > Could you explain more about what you mean by that? I am sorry, I am still a
> > beginner in MPICH.
>
> This isn't specific to MPICH. This would be the case with any MPI
> implementation. What he means is that you need to make sure that when you
> are done using a communicator, you call MPI_Comm_free(communicator) so
> MPICH can free the internal resources used by the communicator. If you
> never free your communicators, then you will eventually run out. There is a
> limited number of communicators available internally.
>
> >
> > Hi Wesley,
> > I apologize for my confusing first reply; I forgot to change the
> > subject. However, as I mentioned, I wrote the following line:
> >
> > mpirun --np 4 --env MPIR_PARAM_CTXID_EAGER_SIZE 1 ./flash4
> >
> > I am wondering if I made any mistake here, as the problem is still the same.
>
> Yes, that's the correct way to use it, so that means your problem is
> probably what we described above.
>
> >
> > Thanks,
> > Reem
>
> ------------------------------
>
> _______________________________________________
> discuss mailing list
> discuss at mpich.org
> https://lists.mpich.org/mailman/listinfo/discuss
>
> End of discuss Digest, Vol 9, Issue 15
> **************************************
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20130708/d8cab3ae/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Flash_mpi.h
Type: text/x-chdr
Size: 389 bytes
Desc: not available
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20130708/d8cab3ae/attachment.bin>

