[mpich-discuss] Persistent Communication using MPI_SEND_INIT, MPI_RECV_INIT etc.
Timothy Stitt
Timothy.Stitt.9 at nd.edu
Tue Mar 26 13:46:13 CDT 2013
Pavan...thanks for the comments.
Our finite element simulation decomposes the problem domain into a 2D grid of sub-domains, with a 1-1 mapping between compute cores and sub-domains. Each sub-domain maintains a ghost zone along its boundaries with its direct neighbors, and the relevant boundary data is exchanged between neighbors at each timestep.
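For reference, a minimal sketch of how that ghost-zone exchange is typically expressed with the persistent calls the code already uses; the buffer and neighbor names here are illustrative, not taken from the actual code:

    /* Persistent halo exchange on a 2D process grid (illustrative names).
     * Assumes the four neighbor ranks are distinct, so messages are
     * matched by source rank alone. */
    #include <mpi.h>

    #define NNBR 4  /* north, south, east, west */

    void halo_setup(double *halo_send[NNBR], double *halo_recv[NNBR],
                    int count, const int nbr[NNBR],
                    MPI_Comm comm, MPI_Request reqs[2*NNBR])
    {
        /* Build the persistent requests once, before the timestep loop. */
        for (int i = 0; i < NNBR; i++) {
            MPI_Recv_init(halo_recv[i], count, MPI_DOUBLE, nbr[i], 0,
                          comm, &reqs[i]);
            MPI_Send_init(halo_send[i], count, MPI_DOUBLE, nbr[i], 0,
                          comm, &reqs[NNBR + i]);
        }
    }

    void halo_step(MPI_Request reqs[2*NNBR])
    {
        /* Each timestep: restart all transfers, then wait for completion. */
        MPI_Startall(2*NNBR, reqs);
        MPI_Waitall(2*NNBR, reqs, MPI_STATUSES_IGNORE);
    }

The appeal of the persistent form is that the argument setup (and, on some implementations, memory registration) happens once in the setup routine, while each timestep only pays for MPI_Startall/MPI_Waitall.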
In general we use MVAPICH2 on the various systems we have access to. When using a Cray, we use their tuned MPI library, which I believe is derived from MPICH2. Given that we are primarily MPICH-based (and with the basic idea of our communication pattern given above), would you recommend looking into other MPI(-3) functionality to achieve better scalability and performance?
Cheers,
Tim.
On Mar 26, 2013, at 12:59 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
> Tim,
>
> It's hard for us to decide which MPI functionality is better for you
> without knowing your algorithmic model. For some algorithms persistent
> send/recv is great. For some, RMA is great. Of course, each one has
> its shortcomings. Please don't jump to one or the other based on a
> short email discussion :-).
>
> Also note that some MPI implementations do optimize persistent
> communication, particularly for memory registration. So even if you
> don't see a benefit on some platforms, it doesn't mean that other MPI
> implementations cannot take advantage of it.
>
> -- Pavan
>
> On 03/26/2013 11:36 AM US Central Time, Timothy Stitt wrote:
>> Thanks for the quick reply Jeff. That information is valuable. I'll
>> follow up on your pointers.
>>
>> Much appreciated,
>>
>> Tim.
>>
>> On Mar 26, 2013, at 12:32 PM, Jeff Hammond <jhammond at alcf.anl.gov> wrote:
>>
>>> You might want to look at neighborhood collectives, which are
>>> discussed in Chapter 7 of MPI-3. This is a new feature so it may not
>>> be implemented in all MPI implementations, but MPICH supports it. I
>>> guess MVAPICH will support it soon enough if not already.
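As a rough illustration of the neighborhood-collective route, assuming a communicator created with MPI_Cart_create (names and buffer layout are assumptions):

    /* Sketch: the same halo exchange written as an MPI-3 neighborhood
     * collective on a 2D Cartesian communicator. */
    #include <mpi.h>

    void halo_exchange_neighbor(const double *sendbuf, double *recvbuf,
                                int count, MPI_Comm comm2d)
    {
        /* comm2d is assumed to come from MPI_Cart_create with ndims = 2.
         * sendbuf and recvbuf each hold `count` doubles per neighbor, in
         * Cartesian order (-x, +x, -y, +y); neighbors that fall off a
         * non-periodic boundary are MPI_PROC_NULL and are simply skipped. */
        MPI_Neighbor_alltoall(sendbuf, count, MPI_DOUBLE,
                              recvbuf, count, MPI_DOUBLE, comm2d);
    }

MPI_Neighbor_alltoallv/w cover ghost regions that differ in size or layout per neighbor.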
>>>
>>> When persistent MPI send/recv is discussed at the MPI Forum, it is
>>> often described as an inadequate solution because it does not specify
>>> a full channel and thus some important optimizations, e.g. for RDMA,
>>> may not be feasible.
>>>
>>> If you can use MPI-3 RMA, that is probably going to be a good idea,
>>> although high-quality support for RMA varies. MPICH-derived
>>> implementations usually do a good job though.
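A minimal sketch of what an MPI-3 RMA version of the ghost-zone update might look like, using fence synchronization; the window layout, names, and offsets are assumptions rather than anything from the actual code:

    /* Sketch: ghost-zone update with MPI-3 one-sided communication. */
    #include <mpi.h>

    /* Expose each rank's ghost region once, before the timestep loop. */
    void halo_rma_setup(double *ghost_region, MPI_Aint ghost_len,
                        MPI_Comm comm, MPI_Win *win)
    {
        MPI_Win_create(ghost_region, ghost_len * sizeof(double),
                       sizeof(double), MPI_INFO_NULL, comm, win);
    }

    /* Each timestep: put this rank's boundary data straight into the
     * neighbors' ghost zones. nbr_offset[i] is where our data lands in
     * neighbor i's window, in units of doubles. */
    void halo_rma_step(const double *boundary, int count,
                       const int nbr[4], const MPI_Aint nbr_offset[4],
                       MPI_Win win)
    {
        MPI_Win_fence(0, win);
        for (int i = 0; i < 4; i++)
            MPI_Put(boundary + (size_t)i * count, count, MPI_DOUBLE,
                    nbr[i], nbr_offset[i], count, MPI_DOUBLE, win);
        MPI_Win_fence(0, win);
    }

At scale, lock-all/flush synchronization (MPI_Win_lock_all, MPI_Win_flush_all, MPI_Win_unlock_all) is often preferred over fences, but the fence form is the shortest to show.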
>>>
>>> Best,
>>>
>>> Jeff
>>>
>>> On Tue, Mar 26, 2013 at 11:25 AM, Timothy Stitt
>>> <Timothy.Stitt.9 at nd.edu> wrote:
>>>> Hi all,
>>>>
>>>> I've been asking this question on various MPI boards to try to reach
>>>> a consensus before I decide to rewrite some MPI code. I'm grateful
>>>> for any advice you can give.
>>>>
>>>> I've inherited an MPI code that was written ~8-10 years ago, and it
>>>> predominantly uses MPI persistent communication routines for data
>>>> transfers, e.g. MPI_SEND_INIT, MPI_RECV_INIT, MPI_START, etc. (which I
>>>> am not familiar with and don't normally hear much discussion about).
>>>> I was wondering whether persistent communication calls are still
>>>> regarded as the most efficient/scalable way to perform communication
>>>> when the communication pattern is known and fixed among neighboring
>>>> processes. We regularly run the code across an IB network, so would
>>>> there be a benefit to rewriting the code using another approach
>>>> (e.g. MPI one-sided communication), or should I leave it as it is?
>>>> The code currently scales up to 10K cores, and I want to push it even
>>>> further, so I was wondering whether there is any benefit in tinkering
>>>> with this persistent MPI communication approach.
>>>>
>>>> Thanks in advance for any advice.
>>>>
>>>> Tim.
>>>>
>>>
>>>
>>>
>>> --
>>> Jeff Hammond
>>> Argonne Leadership Computing Facility
>>> University of Chicago Computation Institute
>>> jhammond at alcf.anl.gov / (630) 252-5381
>>> http://www.linkedin.com/in/jeffhammond
>>> https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
>>
>>
>>
>>
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji