[mpich-devel] ROMIO collective i/o memory use

Mon May 6 16:55:15 CDT 2013

There are at least a few cases where BGQ collective performance is
_better_ with the MPICH built-in algorithms than the PAMI_Collective
implementation, which suggests to me that the assumption that the BG
MPIDI collectives are always faster no longer holds.  This is partly
because PAMI collectives are not necessarily written for BGQ but could
also be because BGQ is less sensitive to certain types of contention
than previous iterations were.

I'm not sure that "post all isend/irecv pairs and waitall" is the right
way to do this if reducing memory is the goal.  There are a variety of
alternatives, but I'd have to run experiments and compare to alltoallv
to know if any of them are better.
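
To make one such alternative concrete, here is a minimal sketch of a
pairwise exchange issued in bounded batches, so that only a fixed number
of messages is in flight at once.  It models the schedule only and makes
no MPI calls.  The helper names (send_partner, recv_partner,
peak_outstanding) and the "window" parameter are hypothetical and are
not taken from ROMIO or MPICH.  In a real implementation each step would
be an MPI_Isend/MPI_Irecv pair, with an MPI_Waitall after each batch.
Whether this beats alltoallv is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: rank we send to in step `step` of a pairwise
 * exchange among `nprocs` ranks (steps run from 1 to nprocs-1). */
int send_partner(int rank, int step, int nprocs)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    assert(step >= 0 && step <= nprocs);
    return (rank + step) % nprocs;
}

/* Hypothetical helper: rank we receive from in the same step.  The
 * extra "+ nprocs" keeps the result non-negative, since C's % can
 * return a negative value for a negative left operand. */
int recv_partner(int rank, int step, int nprocs)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    assert(step >= 0 && step <= nprocs);
    return ((rank - step) % nprocs + nprocs) % nprocs;
}

/* Peak number of outstanding requests per rank when the nprocs-1 steps
 * are issued in batches of at most `window` steps, assuming one send
 * and one receive per step.  A window of nprocs-1 or more corresponds
 * to "post everything, then waitall". */
size_t peak_outstanding(int nprocs, int window)
{
    assert(nprocs > 0 && window > 0);
    int steps = nprocs - 1;
    int batch = window < steps ? window : steps;
    return (size_t)2 * (size_t)batch;
}
```

Under these assumptions, posting everything on 1024 ranks leaves 2046
requests outstanding per rank, while a window of 8 caps it at 16.  If
each request needs its own staging buffer, memory use scales the same
way, at the cost of more synchronization points.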

Jeff

On Mon, May 6, 2013 at 4:44 PM, Rob Latham <robl at mcs.anl.gov> wrote:
> On Mon, May 06, 2013 at 04:35:11PM -0500, Jeff Hammond wrote:
>> Does alltoallv actually run faster than send-recv for the MPIO use case?
>>  For >1MB messages, is alltoallv noticeably faster than a well-written
>> send-recv implementation?
>
> We only have data from Blue Gene/L: Hao Yu's paper only studied
> scalability up to 1k MPI processes.  Alltoallv got the implementation
> an extra 100 MiB/sec or so.
>
> That paper cites Almasi's ICS 2005 paper "MPI collective communication on
> BlueGene/L systems", but surely there has been more recent work in
> this area?
>
> What is a "well written send-recv implementation"?  One that posts a
> bunch of nonblocking sends/receives and lets the MPICH progress engine
> figure it out?  One that tries to schedule the sends/receives in some
> way?
>
> ==rob
>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA

-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
ALCF docs: http://www.alcf.anl.gov/user-guides