[mpich-discuss] discuss Digest, Vol 15, Issue 18

bhushan sable golusable at gmail.com
Fri Jan 17 06:05:12 CST 2014


Please - does anybody know how to implement a publish-subscribe (pub-sub) system using MPI on Ubuntu?
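
(For reference, one common starting point for this kind of pattern is MPI's
dynamic-process / name-publishing interface: MPI_Open_port and MPI_Publish_name
on the publishing side, MPI_Lookup_name and MPI_Comm_connect on the subscribing
side. Below is a minimal, untested sketch; the service name "pubsub-demo", the
message text, and the command lines are made up, and it assumes a name server
both jobs can reach, e.g. MPICH's hydra_nameserver passed to mpiexec via
-nameserver.)

/* pubsub_sketch.c -- toy publisher/subscriber pair using MPI name publishing.
 * Build with mpicc, then run the two sides as separate MPI jobs, e.g.:
 *   mpiexec -nameserver <host> -n 1 ./pubsub_sketch pub
 *   mpiexec -nameserver <host> -n 1 ./pubsub_sketch sub
 */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

#define SERVICE "pubsub-demo"            /* made-up service name */

int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME];
    char msg[64];
    MPI_Comm peer;

    MPI_Init(&argc, &argv);

    if (argc > 1 && strcmp(argv[1], "pub") == 0) {
        /* Publisher: open a port, advertise it, wait for one subscriber,
         * then push a single message over the inter-communicator. */
        MPI_Open_port(MPI_INFO_NULL, port);
        MPI_Publish_name(SERVICE, MPI_INFO_NULL, port);
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &peer);

        strcpy(msg, "hello from the publisher");
        MPI_Send(msg, (int)strlen(msg) + 1, MPI_CHAR, 0, 0, peer);

        MPI_Comm_disconnect(&peer);
        MPI_Unpublish_name(SERVICE, MPI_INFO_NULL, port);
        MPI_Close_port(port);
    } else {
        /* Subscriber: look up the advertised port, connect, receive. */
        MPI_Lookup_name(SERVICE, MPI_INFO_NULL, port);
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &peer);

        MPI_Recv(msg, (int)sizeof msg, MPI_CHAR, 0, 0, peer, MPI_STATUS_IGNORE);
        printf("subscriber got: %s\n", msg);

        MPI_Comm_disconnect(&peer);
    }

    MPI_Finalize();
    return 0;
}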

On 1/16/14, discuss-request at mpich.org <discuss-request at mpich.org> wrote:
> Send discuss mailing list submissions to
> 	discuss at mpich.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> 	https://lists.mpich.org/mailman/listinfo/discuss
> or, via email, send a message with subject or body 'help' to
> 	discuss-request at mpich.org
>
> You can reach the person managing the list at
> 	discuss-owner at mpich.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of discuss digest..."
>
>
> Today's Topics:
>
>    1. Re:  Poor performance of Waitany / Waitsome (Jeff Hammond)
>    2. Re:  Poor performance of Waitany / Waitsome (John Grime)
>    3.  hello everyone....can anyone tell me how to do performance
>       analysis by using mpi and cuda. (bhimashankar dhappadhule)
>    4. Re:  hello everyone....can anyone tell me how to do
>       performance analysis by using mpi and cuda. (Mahesh Doijade)
>    5. Re:  Failed to allocate memory for an unexpected	message
>       (Luiz Carlos da Costa Junior)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 15 Jan 2014 15:26:44 -0600
> From: Jeff Hammond <jeff.science at gmail.com>
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] Poor performance of Waitany / Waitsome
> Message-ID:
> 	<CAGKz=uJ6-9dDXtforrm3CMemo4JkzbGxE+SeY3fEYuoFaEWhnA at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> Given that you're using shared memory on a bloated OS (anything
> driving a GUI Window Manager), software overhead is going to be
> significant.  You can only do so much about this.  You might want to
> compile MPICH yourself using all the optimization flags.
>
> For example, I decided that "--enable-static
> --enable-fast=O3,nochkmsg,notiming,ndebug,nompit
> --disable-weak-symbols --enable-threads=single" were configure options
> that someone in search of speed might use.  I have not done any
> systematic testing yet so some MPICH developer might tell me I'm a
> clueless buffoon for bothering to (de)activate some of these options.
>
> If you were to assume that I was going to rerun your test with
> different builds of MPICH on my Mac laptop as soon as I get some
> coffee, you would be correct.  Hence, apathy on your part has no
> impact on the experiments regarding MPICH build variants and speed :-)
>
> Jeff
>
> On Wed, Jan 15, 2014 at 3:10 PM, John Grime <jgrime at uchicago.edu> wrote:
>> Cheers for the help, Jeff!
>>
>> I just tried to mimic Waitall() using a variety of the MPI_Test* routines
>> (code attached), and the results are not what I would expect:
>>
>> Although Waitsome() seems to give consistently the worst performance
>> (Waitall < Waitany < Waitsome), Testsome() *appears* to always be faster
>> than Testany(), and for larger numbers of requests the performance order
>> seems to actually reverse.
>>
>>
>> Now, I may have done something spectacularly dumb here (it would be the
>> 5th
>> such example from today alone), but on the assumption I have not: is this
>> result expected given the underlying implementation?
>>
>> J.
>>
>>
>> ./time_routines.sh 4 50
>>
>> nprocs = 4, ntokens = 16, ncycles = 50
>> Method          : Time         Relative
>>     MPI_Waitall : 1.526000e-03    1.000x
>>     MPI_Waitany : 1.435000e-03    0.940x
>>    MPI_Waitsome : 3.381000e-03    2.216x
>>     MPI_Testall : 3.101000e-03    2.032x
>>     MPI_Testany : 8.080000e-03    5.295x
>>    MPI_Testsome : 3.037000e-03    1.990x
>>    PMPI_Waitall : 1.603000e-03    1.050x
>>    PMPI_Waitany : 1.404000e-03    0.920x
>>   PMPI_Waitsome : 4.666000e-03    3.058x
>>
>>
>>
>> nprocs = 4, ntokens = 64, ncycles = 50
>> Method          : Time         Relative
>>     MPI_Waitall : 3.173000e-03    1.000x
>>     MPI_Waitany : 5.362000e-03    1.690x
>>    MPI_Waitsome : 1.809100e-02    5.702x
>>     MPI_Testall : 1.364200e-02    4.299x
>>     MPI_Testany : 2.309300e-02    7.278x
>>    MPI_Testsome : 1.469800e-02    4.632x
>>    PMPI_Waitall : 2.063000e-03    0.650x
>>    PMPI_Waitany : 9.420000e-03    2.969x
>>   PMPI_Waitsome : 1.890300e-02    5.957x
>>
>>
>>
>> nprocs = 4, ntokens = 128, ncycles = 50
>> Method          : Time         Relative
>>     MPI_Waitall : 4.730000e-03    1.000x
>>     MPI_Waitany : 2.691000e-02    5.689x
>>    MPI_Waitsome : 4.519000e-02    9.554x
>>     MPI_Testall : 4.696900e-02    9.930x
>>     MPI_Testany : 7.285200e-02   15.402x
>>    MPI_Testsome : 3.773400e-02    7.978x
>>    PMPI_Waitall : 5.158000e-03    1.090x
>>    PMPI_Waitany : 2.223200e-02    4.700x
>>   PMPI_Waitsome : 4.205000e-02    8.890x
>>
>>
>>
>> nprocs = 4, ntokens = 512, ncycles = 50
>> Method          : Time         Relative
>>     MPI_Waitall : 1.365900e-02    1.000x
>>     MPI_Waitany : 3.261610e-01   23.879x
>>    MPI_Waitsome : 3.944020e-01   28.875x
>>     MPI_Testall : 5.408010e-01   39.593x
>>     MPI_Testany : 4.865990e-01   35.625x
>>    MPI_Testsome : 3.067470e-01   22.458x
>>    PMPI_Waitall : 1.976100e-02    1.447x
>>    PMPI_Waitany : 3.011500e-01   22.048x
>>   PMPI_Waitsome : 3.791930e-01   27.761x
>>
>>
>>
>> nprocs = 4, ntokens = 1024, ncycles = 50
>> Method          : Time         Relative
>>     MPI_Waitall : 4.087800e-02    1.000x
>>     MPI_Waitany : 1.245209e+00   30.462x
>>    MPI_Waitsome : 1.704020e+00   41.686x
>>     MPI_Testall : 1.940940e+00   47.481x
>>     MPI_Testany : 1.618215e+00   39.586x
>>    MPI_Testsome : 1.133568e+00   27.731x
>>    PMPI_Waitall : 3.970200e-02    0.971x
>>    PMPI_Waitany : 1.344188e+00   32.883x
>>   PMPI_Waitsome : 1.685816e+00   41.240x
>>
>>
>> nprocs = 4, ntokens = 2048, ncycles = 50
>> Method          : Time         Relative
>>     MPI_Waitall : 1.173840e-01    1.000x
>>     MPI_Waitany : 4.600552e+00   39.192x
>>    MPI_Waitsome : 6.840568e+00   58.275x
>>     MPI_Testall : 6.762144e+00   57.607x
>>     MPI_Testany : 5.170525e+00   44.048x
>>    MPI_Testsome : 4.260335e+00   36.294x
>>    PMPI_Waitall : 1.291590e-01    1.100x
>>    PMPI_Waitany : 5.161881e+00   43.974x
>>   PMPI_Waitsome : 7.388439e+00   62.942x
>>
>>
>>
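
(For readers following along, here is a minimal, untested sketch of the kind
of emulation being timed above: completing a whole array of requests by
polling MPI_Testany until no active request remains. The helper name and the
toy self-message in main are only illustrative and are not the attached
benchmark code.)

/* testany_waitall_sketch.c -- emulate MPI_Waitall with an MPI_Testany loop. */
#include <mpi.h>
#include <stdio.h>

/* Poll MPI_Testany until every request in reqs[0..n-1] has completed.
 * Testany sets a completed request to MPI_REQUEST_NULL; once no active
 * requests remain it returns flag = 1 with index = MPI_UNDEFINED. */
static void waitall_via_testany(int n, MPI_Request reqs[])
{
    for (;;) {
        int index, flag;
        MPI_Testany(n, reqs, &index, &flag, MPI_STATUS_IGNORE);
        if (flag && index == MPI_UNDEFINED)
            return;   /* nothing left to complete */
        /* otherwise: either one request just completed (flag == 1) or
         * nothing completed on this pass (flag == 0); keep polling */
    }
}

int main(int argc, char **argv)
{
    int rank, sendbuf = 42, recvbuf = 0;
    MPI_Request reqs[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each rank posts a self send/recv pair just to have two requests. */
    MPI_Irecv(&recvbuf, 1, MPI_INT, rank, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&sendbuf, 1, MPI_INT, rank, 0, MPI_COMM_WORLD, &reqs[1]);

    waitall_via_testany(2, reqs);
    printf("rank %d received %d\n", rank, recvbuf);

    MPI_Finalize();
    return 0;
}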
>> On Jan 15, 2014, at 2:53 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
>>
>>> On Wed, Jan 15, 2014 at 2:23 PM, John Grime <jgrime at uchicago.edu> wrote:
>>>> Hi Jeff,
>>>>
>>>>> If Waitall wasn't faster than Waitsome or Waitany, then it wouldn't
>>>>> exist since obviously one can implement the former in terms of the
>>>>> latter
>>>>
>>>>
>>>> I see no reason it wouldn't exist in such a case, given that it's an
>>>> elegant/convenient way to wait for all requests to complete vs. Waitsome
>>>> /
>>>> Waitany. It makes sense to me that it would be in the API in any case,
>>>> much
>>>> as I appreciate the value of the RISC-y approach you imply.
>>>>
>>>>> it shouldn't be surprising that they aren't as efficient.
>>>>
>>>> I wouldn't expect them to have identical performance - but nor would I
>>>> have expected a performance difference of ~50x for the same number of
>>>> outstanding requests, even given that a naive loop over the request
>>>> array
>>>> will be O(N). That loop should be pretty cheap after all, even given
>>>> that
>>>> you can't use cache well due to the potential for background state
>>>> changes
>>>> in the request object data or whatever (I'm not sure how it's actually
>>>> implemented, which is why I'm asking about this issue on the mailing
>>>> list).
>>>>
>>>>> The appropriate question to ask is whether Waitany is implemented
>>>>> optimally or not.
>>>>
>>>>
>>>> Well, yes. I kinda hoped that question was heavily implied by my
>>>> original
>>>> email!
>>>>
>>>>
>>>>> If you find that emulating Waitany
>>>>> using Testall followed by a loop, then that's useful information.
>>>>
>>>> I accidentally the whole thing, Jeff! ;)
>>>>
>>>> But that's a good idea, thanks - I'll give it a try and report back!
>>>
>>> Testall is the wrong semantic here.  I thought it would test them all
>>> individually but it doesn't.  I implemented it anyways and it is the
>>> worst of all.  I attached your test with my modifications.  Because I
>>> am an evil bastard, I made a ton of whitespace changes in addition to
>>> the nontrivial ones.
>>>
>>> Jeff
>>>
>>> --
>>> Jeff Hammond
>>> jeff.science at gmail.com
>>> <nb_ring.c>_______________________________________________
>>
>>> discuss mailing list     discuss at mpich.org
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>>
>>
>> _______________________________________________
>> discuss mailing list     discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>
>
>
> --
> Jeff Hammond
> jeff.science at gmail.com
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 15 Jan 2014 21:39:33 +0000
> From: John Grime <jgrime at uchicago.edu>
> To: "discuss at mpich.org" <discuss at mpich.org>
> Subject: Re: [mpich-discuss] Poor performance of Waitany / Waitsome
> Message-ID: <F6F38644-1B91-41AF-A59E-49078DA2C16D at uchicago.edu>
> Content-Type: text/plain; charset="Windows-1252"
>
> Hi Jeff,
>
>> Given that you're using shared memory on a bloated OS (anything
>> driving a GUI Window Manager), software overhead is going to be
>> significant.
>
>
> Very true - I would not expect these times to be indicative of what MPICH
> can actually achieve, but nonetheless the general trends seem to be
> reproducible.
>
> I'm hoping that if I can get a good handle on what's happening, I can write
> better MPI code in the general case. The major head-scratcher for me is how
> the MPI_TestX routines seem to be slower than their MPI_WaitX counterparts in
> most situations, when I would imagine they'd be doing something fairly
> similar behind the scenes.
>
> The wildcard appears to be MPI_Testsome().
>
> I must be doing something dumb here, so I'll also caffein-ate and consider!
>
> J.
>
>
>
>
>
> ------------------------------
>
> Message: 3
> Date: Thu, 16 Jan 2014 11:25:32 +0530
> From: bhimashankar dhappadhule <dhappabn at gmail.com>
> To: discuss <discuss at mpich.org>
> Subject: [mpich-discuss] hello everyone....can anyone tell me how to
> 	do performance analysis by using mpi and cuda.
> Message-ID:
> 	<CAL1fN=78MwEiKutf2UyZc5WkCDGvb0yWEq3a0D8aFxUQKhsSDw at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> <http://lists.mpich.org/pipermail/discuss/attachments/20140116/79d271d8/attachment-0001.html>
>
> ------------------------------
>
> Message: 4
> Date: Thu, 16 Jan 2014 13:29:35 +0530
> From: Mahesh Doijade <maheshdoijade at gmail.com>
> To: dhappabn at gmail.com
> Cc: discuss at mpich.org
> Subject: Re: [mpich-discuss] hello everyone....can anyone tell me how
> 	to do performance analysis by using mpi and cuda.
> Message-ID:
> 	<CAFHBtLcM9f_CfAOzRfwkuXUHUksxegK1PWnE2yFzAZ9tsZRt-Q at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hello,
>          For MPI-specific profiling you can make use of JumpShot, which
> comes with MPICH2; for MPI + CUDA, I find TAU (Tuning and Analysis
> Utilities), http://www.nic.uoregon.edu/tau-wiki/Guide:TAUGPU, to be quite
> an effective open-source solution.
>
>
> On Thu, Jan 16, 2014 at 11:25 AM, bhimashankar dhappadhule <
> dhappabn at gmail.com> wrote:
>
>>
>>
>> _______________________________________________
>> discuss mailing list     discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>>
>
>
>
> --
> Regards,
> -- Mahesh Doijade
> http://www.techdarting.com/
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> <http://lists.mpich.org/pipermail/discuss/attachments/20140116/d080fb84/attachment-0001.html>
>
> ------------------------------
>
> Message: 5
> Date: Thu, 16 Jan 2014 16:07:51 -0200
> From: Luiz Carlos da Costa Junior <lcjunior at ufrj.br>
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] Failed to allocate memory for an
> 	unexpected	message
> Message-ID:
> 	<CAOv4ofSTPYXwjh5bN3rX93Lip0OXaNHSrxWKWAfFko0fg8TZ0w at mail.gmail.com>
> Content-Type: text/plain; charset="windows-1252"
>
> Hi Pavan and Antonio,
>
> I implemented the scheme you suggested and it was much easier than I
> thought. Very nice, thanks for your help.
>
> However, I noticed that the execution times were much higher than in the
> cases where the failure didn't occur.
> Is there any reason, apart from some implementation mistake on my part, that
> would explain this behavior?
>
> I don't know if it will help, but below is part of the receiver process's
> Fortran code.
>
> Thanks in advance.
>
> Best regards,
> Luiz
>
> c-----------------------------------------------------------------------
>       subroutine my_receiver
> c     ------------------------------------------------------------------
>       (...)
>
> c     Local
> c     -----
>       integer*4 m_stat(MPI_STATUS_SIZE)
>
>       integer*4 m_request(zrecv)        ! request identifier for asynchronous receives
>       character card(zrecv)*(zbuf)      ! buffer for receiving messages
>
> c     Pre-post RECVs
> c     --------------
>       do irecv = 1, zrecv
>         call MPI_IRECV(card(irecv), zbuf, MPI_CHARACTER,
>      .                 MPI_ANY_SOURCE, MPI_ANY_TAG, M_COMM_SDDP,
>      .                 m_request(irecv), m_ierr )
>       end do !irecv
>
>       do while( keep_receiving )
>
> c       Wait for any of the pre-posted requests to arrive
> c       -------------------------------------------------
>         call MPI_WAITANY(zrecv, m_request, irecv, m_stat, m_ierr)
>
> c       Process message: disk IO
> c       ---------------
>         <DO SOMETHING>
>         if( SOMETHING_ELSE ) then
>           keep_receiving = .false.
>         end if
>
> c       Re-post RECV
> c       ------------
>         call MPI_IRECV(card(irecv), zbuf, MPI_CHARACTER,
>      .                 MPI_ANY_SOURCE, MPI_ANY_TAG, M_COMM_SDDP,
>      .                 m_request(irecv), m_ierr)
>
>       end do
>
> c     Cancel unused RECVs
> c     -------------------
>       do irecv = 1, zrecv
>         call MPI_CANCEL( m_request(irecv), m_ierr )
> c       a cancelled request must still be completed
>         call MPI_WAIT( m_request(irecv), m_stat, m_ierr )
>       end do !irecv
>
>       (...)
>
>       return
>       end
>
>
>
>
> On 1 November 2013 22:14, Luiz Carlos da Costa Junior
> <lcjunior at ufrj.br>wrote:
>
>> Thanks
>>
>>
>> On 1 November 2013 22:00, Pavan Balaji <balaji at mcs.anl.gov> wrote:
>>
>>>
>>> On Nov 1, 2013, at 4:30 PM, Luiz Carlos da Costa Junior
>>> <lcjunior at ufrj.br>
>>> wrote:
>>> > I understand that I will have to have N buffers, one for each posted
>>> MPI_Irecv. I will also have to TEST (using MPI_PROBE or MPI_WAITANY)
>>> until
>>> a message comes. The result of this test will identify which one of the
>>> posted MPI_Irecv has actually received the message and then process the
>>> right buffer. Is this correct?
>>>
>>> Correct.
>>>
>>> > Should I have to change anything at the sender's processes?
>>>
>>> Likely not.  But you need to think through your algorithm to confirm
>>> that.
>>>
>>> > At the end, my receiver process receives a message identifying that it
>>> should exit this routine. What should I do with the already posted
>>> MPI_Irecv's? Can I cancel them?
>>>
>>> Yes, you can with MPI_CANCEL.
>>>
>>>   -- Pavan
>>>
>>> --
>>> Pavan Balaji
>>> http://www.mcs.anl.gov/~balaji
>>> _______________________________________________
>>> discuss mailing list     discuss at mpich.org
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>>>
>>
>>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> <http://lists.mpich.org/pipermail/discuss/attachments/20140116/7ff2858c/attachment.html>
>
> ------------------------------
>
> _______________________________________________
> discuss mailing list
> discuss at mpich.org
> https://lists.mpich.org/mailman/listinfo/discuss
>
> End of discuss Digest, Vol 15, Issue 18
> ***************************************
>


