[mpich-devel] Fwd: MPI_Comm_Split/Dup scalability on BGQ and K supercomputers
Jeff Hammond
jeff.science at gmail.com
Thu May 29 15:41:49 CDT 2014
On Thu, May 29, 2014 at 2:41 PM, Rob Latham <robl at mcs.anl.gov> wrote:
>
>
> On 05/29/2014 02:21 PM, Kenneth Raffenetti wrote:
>>
>> I do not [know where the source code to IBM's MPI
>> implementation on Blue Gene is]. RobL should know. I imagine you have to
>> add it via softenv?
>
>
> Don't be shy, man! These messages should all be on the mpich-devel list,
> which I'm looping back in now.
>
> - You can add the mpich-ibm repository and then clone it with git. It's a bit
> of a mess, since V1R1M1 and V1R2M1 have different directory structures.
>
> - You can browse the source online. Mira is now (finally!) running V1R2M1,
> so follow the BGQ/IBM_V1R2M1/efix15 tag:
>
> http://git.mpich.org/mpich-ibm.git/tree/81ec186864f8cfe69ed9d99dfbd2d5ffd423e6cd:/mpich2
If you want the canonical MPI BGQ source that IBM packages in the
driver, see https://wiki.alcf.anl.gov/parts/index.php/Blue_Gene/Q#Source_Code

Jeff
>> On 05/29/2014 02:15 PM, Junchao Zhang wrote:
>>>
>>> Ken,
>>> Do you know where the IBM MPI source is on Mira?
>>>
>>> --Junchao Zhang
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: *Thakur, Rajeev* <thakur at mcs.anl.gov <mailto:thakur at mcs.anl.gov>>
>>> Date: Thu, May 29, 2014 at 2:11 PM
>>> Subject: Re: [mpich-devel] MPI_Comm_Split/Dup scalability on BGQ and K
>>> supercomputers
>>> To: "Zhang, Junchao" <jczhang at mcs.anl.gov <mailto:jczhang at mcs.anl.gov>>
>>> Cc: "Balaji, Pavan" <balaji at anl.gov <mailto:balaji at anl.gov>>
>>>
>>>
>>> Can you look at the source code of IBM MPI on Mira to see what they are
>>> doing? I don't know if the source code is in our repo or somewhere else,
>>> but Rob Latham should know.
>>>
>>> Also, there is a bug in Comm_split in the latest version of MPI
>>> installed on Mira. There was a lot of discussion on the early-users
>>> mailing list. Rob Latham knows about it, and he can tell you what the
>>> bug is about.
>>>
>>> Rajeev
>>>
>>>
>>> On May 29, 2014, at 1:49 PM, Junchao Zhang <jczhang at mcs.anl.gov
>>> <mailto:jczhang at mcs.anl.gov>>
>>> wrote:
>>>
>>> > I did more experiments on Mira, but I feel I am lost :) In short, I
>>> think the original MPICH comm_split implementation is good, if not the
>>> best. However, for unknown reasons, the MPI library provided on Mira
>>> performs badly.
>>> > I wrote three micro-benchmarks: 1) qsort.c, which is a sequential C
>>> code, to test the libc qsort performance when sorting a presorted array;
>>> 2) split.c, which measures MPI_Comm_split(MPI_COMM_WORLD, 0, myrank,
>>> ...). Note that all processes have the same color and use their old
>>> rank as the key. The maximum time across processes is reported. 3) dup.c,
>>> which simply does MPI_Comm_dup(MPI_COMM_WORLD, &newcomm).
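For illustration, a minimal split.c-style benchmark might look like the sketch below (an assumed reconstruction, not the exact file used above; a dup.c variant would simply replace the MPI_Comm_split call with MPI_Comm_dup(MPI_COMM_WORLD, &newcomm)):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int myrank;
    double t, tmax;
    MPI_Comm newcomm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    MPI_Barrier(MPI_COMM_WORLD);
    t = MPI_Wtime();
    /* every process uses the same color (0) and its old rank as the key */
    MPI_Comm_split(MPI_COMM_WORLD, 0, myrank, &newcomm);
    t = MPI_Wtime() - t;

    /* report the maximum time across processes */
    MPI_Reduce(&t, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (myrank == 0)
        printf("MPI_Comm_split: %f seconds (max over ranks)\n", tmax);

    MPI_Comm_free(&newcomm);
    MPI_Finalize();
    return 0;
}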
>>> >
>>> > I compiled the code using either mpixlc, which is provided by
>>> +mpiwrapper-gcc on Mira, or mpicc, which is built from mpich/master by
>>> myself, or mpicc-mergesort, which is also built by myself with a
>>> mergesort comm_split.c
>>> >
>>> > QSORT: compiled with mpixlc, run with 1 core on 1 node (also run
>>> with 16 cores simultaneously on 1 node, which gives even better
>>> results). The results indicate the libc qsort linked by mpixlc is good.
>>> > # integers time (s)
>>> > 64000 0.110789
>>> > 125000 0.200747
>>> > 512000 0.445075
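For reference, a qsort.c-style test could be as simple as the following sketch (assumed, not the original file): fill an array with already-sorted integers and time libc qsort on it.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

int main(int argc, char **argv)
{
    int n = (argc > 1) ? atoi(argv[1]) : 64000;   /* e.g. 64000, 125000, 512000 */
    int *a = malloc(n * sizeof(int));
    clock_t t0, t1;
    int i;

    for (i = 0; i < n; i++)
        a[i] = i;                                 /* presorted input */

    t0 = clock();
    qsort(a, n, sizeof(int), cmp_int);
    t1 = clock();

    printf("%d integers: %f seconds\n", n, (double)(t1 - t0) / CLOCKS_PER_SEC);
    free(a);
    return 0;
}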
>>> >
>>> > COMM_SPLIT: compiled with mpixlc. (Note the results match Sam's.)
>>> > # ranks time (s)
>>> > 64000 2.297776
>>> > 125000 8.157885
>>> > 512000 237.079862
>>> >
>>> > COMM_DUP: compiled with mpixlc. (Note the results match Sam's.)
>>> > # ranks      time (s)
>>> > 64000 2.057966
>>> > 125000 7.808858
>>> > 512000 235.948144
>>> >
>>> > COMM_SPLIT: compiled with mpicc, run twice. The variation is large
>>> at 512,000.
>>> > # ranks      run 1 (s)    run 2 (s)
>>> > 64000 0.549024 0.545003
>>> > 125000 0.959198 0.965059
>>> > 512000 6.625385 3.716696
>>> >
>>> > COMM_DUP: compiled with mpicc
>>> > # ranks      time (s)
>>> > 64000 0.028518
>>> > 125000 0.056386
>>> > 512000 0.202677
>>> >
>>> > COMM_SPLIT: compiled with mpicc-mergesort
>>> > # ranks      time (s)
>>> > 64000 0.362572
>>> > 125000 0.712167
>>> > 512000 2.929134
>>> >
>>> > COMM_DUP: compiled with mpicc-mergesort
>>> > # ranks      time (s)
>>> > 64000 0.028835
>>> > 125000 0.055987
>>> > 512000 0.202660
>>> >
>>> > (Another difference worth mentioning is that the mpixlc version reports
>>> "MPICH2 version 1.5". I looked at the log history of comm_split.c;
>>> after MPICH2 1.5, there are no big changes.)
>>> > We can see that Comm_dup performance is independent of Comm_split, and
>>> mpicc-mergesort is slightly better than mpicc. It may still be worth
>>> pushing mergesort to mpich.
>>> >
>>> > I have no idea why mpixlc is bad. Should I report it to ALCF?
>>> >
>>> > --Junchao Zhang
>>> >
>>> >
>>> > On Wed, May 28, 2014 at 7:34 AM, Junchao Zhang <jczhang at mcs.anl.gov
>>> <mailto:jczhang at mcs.anl.gov>> wrote:
>>> > Fixed them. Thanks.
>>> >
>>> > --Junchao Zhang
>>> >
>>> >
>>> > On Tue, May 27, 2014 at 11:02 PM, Thakur, Rajeev <thakur at mcs.anl.gov
>>> <mailto:thakur at mcs.anl.gov>> wrote:
>>> > Junchao,
>>> > Your code looks fine, although I didn't check the merge
>>> sort in detail for potential bugs. You may want to add a comment explaining
>>> that the 2x in the memory allocation for keytable is because the second
>>> half of it will be used as scratch space for the merge sort. Otherwise,
>>> someone looking at the code 3 years from now will miss it :-). Also, a
>>> typo in one comment: sortting -> sorting.
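To make that comment concrete, here is a small self-contained sketch of the layout Rajeev describes (the names splittype and merge_sort are illustrative, not MPICH's comm_split.c verbatim): keytable gets 2*n entries from a single malloc, and the merge sort only writes into the second half as scratch space.

#include <stdio.h>
#include <stdlib.h>

typedef struct { int key; int rank; } splittype;

/* merge sort by key; ties keep their original order (stable); scratch must
 * hold at least n entries */
static void merge_sort(splittype *a, splittype *scratch, int n)
{
    int half = n / 2, i = 0, j = n / 2, k = 0;
    if (n <= 1)
        return;
    merge_sort(a, scratch, half);
    merge_sort(a + half, scratch, n - half);
    while (i < half && j < n)
        scratch[k++] = (a[i].key <= a[j].key) ? a[i++] : a[j++];
    while (i < half)
        scratch[k++] = a[i++];
    while (j < n)
        scratch[k++] = a[j++];
    for (k = 0; k < n; k++)
        a[k] = scratch[k];
}

int main(void)
{
    int n = 8, i;
    /* one malloc of 2*n entries: the first n hold the {key, rank} pairs,
     * the second n are used only as merge-sort scratch space */
    splittype *keytable = malloc(2 * n * sizeof(splittype));
    splittype *scratch = keytable + n;

    for (i = 0; i < n; i++) {
        keytable[i].key = n - i;
        keytable[i].rank = i;
    }
    merge_sort(keytable, scratch, n);

    for (i = 0; i < n; i++)
        printf("key=%d old rank=%d\n", keytable[i].key, keytable[i].rank);
    free(keytable);   /* a single free releases both halves */
    return 0;
}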
>>> >
>>> > Rajeev
>>> >
>>> >
>>> > On May 27, 2014, at 3:42 PM, Junchao Zhang <jczhang at mcs.anl.gov
>>> <mailto:jczhang at mcs.anl.gov>> wrote:
>>> >
>>> > > Here is the code.
>>> > > We do not just sort keys. We also want to keep track of old ranks
>>> for these keys. So we actually sort an array of {key, rank}.
>>> > >
>>> > > --Junchao Zhang
>>> > >
>>> > >
>>> > > On Tue, May 27, 2014 at 12:58 PM, Junchao Zhang
>>> <jczhang at mcs.anl.gov <mailto:jczhang at mcs.anl.gov>> wrote:
>>> > > Downloaded mpich2-1.3.1 and checked it. It uses bubble sort,
>>> which is a stable sort algorithm and needs no extra buffer.
>>> > > Merge sort is also stable, but needs O(N) extra memory. I can allocate
>>> the array and the scratch together in one malloc instead of two separate
>>> mallocs. Let me do that.
>>> > >
>>> > > --Junchao Zhang
>>> > >
>>> > >
>>> > > On Tue, May 27, 2014 at 12:46 PM, Thakur, Rajeev
>>> <thakur at mcs.anl.gov <mailto:thakur at mcs.anl.gov>> wrote:
>>> > > The version I suggested (1.4) was not old enough. It has the new
>>> code with quick sort, not the old code with bubble sort. Instead, see
>>> 1.3.1 (http://www.mpich.org/static/downloads/1.3.1/). I confirmed that
>>> it has the old bubble sort. I believe it used O(p) less memory as well
>>> but I didn't recheck just now.
>>> > >
>>> > > Rajeev
>>> > >
>>> > >
>>> > >
>>> > > On May 27, 2014, at 12:38 PM, Junchao Zhang <jczhang at mcs.anl.gov
>>> <mailto:jczhang at mcs.anl.gov>>
>>> > > wrote:
>>> > >
>>> > > > I read the old code. It is not efficient. The code gathers {key,
>>> color} pairs (the color field is in effect later used to store old ranks)
>>> and puts them in an array. Actually, that is enough info for a stable sort
>>> (i.e., using old ranks for ties). But the old code did not realize that
>>> and created a new type, splittype {key, color, orig_idx}. It assigns
>>> orig_idx the index of the element in the array and uses that for ties.
>>> This is the O(p) memory you mentioned. We can say it adds 50% memory use.
>>> > > > For merge sort, the buffer has to be standalone (i.e., not mixed
>>> with old data). So I deleted the orig_idx field from splittype and
>>> allocated a buffer as big as the original array, in effect adding 100%
>>> memory use.
>>> > > > I think we can have a fast qsort with 0% extra memory for our
>>> case. I will do that experiment later.
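A minimal sketch of that tie-breaking point (illustrative only, not the actual patch): since each gathered entry already carries the old rank, comparing by key and then by old rank yields a total order, so no orig_idx field is needed even when the underlying sort is unstable.

#include <stdio.h>
#include <stdlib.h>

typedef struct { int key; int rank; } splittype;   /* {key, old rank} */

/* sort by key; equal keys fall back to the old rank, which is unique */
static int cmp_key_then_rank(const void *a, const void *b)
{
    const splittype *x = a, *y = b;
    if (x->key != y->key)
        return (x->key > y->key) - (x->key < y->key);
    return (x->rank > y->rank) - (x->rank < y->rank);
}

int main(void)
{
    splittype table[4] = { {1, 2}, {0, 3}, {1, 0}, {0, 1} };
    int i;
    qsort(table, 4, sizeof(splittype), cmp_key_then_rank);
    for (i = 0; i < 4; i++)
        printf("key=%d old rank=%d\n", table[i].key, table[i].rank);
    return 0;
}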
>>> > > >
>>> > > > --Junchao Zhang
>>> > > >
>>> > > >
>>> > > > On Tue, May 27, 2014 at 11:17 AM, Thakur, Rajeev
>>> <thakur at mcs.anl.gov <mailto:thakur at mcs.anl.gov>> wrote:
>>> > > > Junchao,
>>> > > > As far as I remember, when we switched from bubble
>>> sort to quicksort, we had to allocate additional O(nprocs) memory for
>>> the trick to make the qsort stable. You can check the old code in a
>>> release from 2011 (say 1.4 from
>>> http://www.mpich.org/static/downloads/1.4/). It would be good if we could
>>> avoid that additional memory allocation in the new sort. And if, in the
>>> merge sort, you have added one more allocation, that should definitely be
>>> avoided.
>>> > > >
>>> > > > Rajeev
>>> > > >
>>> > > >
>>> > > > On May 27, 2014, at 9:40 AM, Junchao Zhang <jczhang at mcs.anl.gov
>>> <mailto:jczhang at mcs.anl.gov>> wrote:
>>> > > >
>>> > > > > Rajeev,
>>> > > > > The code is at mpich-review/commsplit. I pushed it there
>>> several days ago and the MPICH test results look good. Could you review it?
>>> > > > > Yes, I will look into why Comm_dup improves so much.
>>> > > > > Another thing I want to investigate is this: in Comm_split, we
>>> sort {key, old rank}, i.e., an integer pair. So basically, we already have
>>> a mechanism to avoid qsort's instability. We can copy an
>>> optimized qsort instead of using the one in libc, and test its
>>> performance. The benefit of qsort is that we can avoid the extra buffer
>>> needed in mergesort. Using a customized qsort, we can also avoid
>>> function pointer dereferencing. In libc qsort, one provides a generic
>>> comp function pointer. But our case is simple: we just want to compare
>>> integers, so we can avoid the function call cost for each comparison.
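A sketch of what such a customized sort could look like (a hand-written quicksort with the integer-pair comparison inlined; illustrative only, not a proposed patch):

#include <stdio.h>

typedef struct { int key; int rank; } splittype;

/* compare by key, break ties by old rank, with no comp function pointer */
#define PAIR_LT(a, b) \
    ((a).key < (b).key || ((a).key == (b).key && (a).rank < (b).rank))

static void qsort_pairs(splittype *a, int lo, int hi)
{
    while (lo < hi) {
        splittype p = a[lo + (hi - lo) / 2];   /* middle-element pivot */
        int i = lo, j = hi;
        do {
            while (PAIR_LT(a[i], p)) i++;
            while (PAIR_LT(p, a[j])) j--;
            if (i <= j) {
                splittype t = a[i]; a[i] = a[j]; a[j] = t;
                i++; j--;
            }
        } while (i <= j);
        /* recurse on the smaller part, iterate on the larger one */
        if (j - lo < hi - i) { qsort_pairs(a, lo, j); lo = i; }
        else                 { qsort_pairs(a, i, hi); hi = j; }
    }
}

int main(void)
{
    splittype table[5] = { {2, 0}, {0, 1}, {2, 2}, {0, 3}, {1, 4} };
    int i;
    qsort_pairs(table, 0, 4);
    for (i = 0; i < 5; i++)
        printf("key=%d old rank=%d\n", table[i].key, table[i].rank);
    return 0;
}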
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > > --Junchao Zhang
>>> > > > >
>>> > > > >
>>> > > > > On Tue, May 27, 2014 at 9:24 AM, Thakur, Rajeev
>>> <thakur at mcs.anl.gov <mailto:thakur at mcs.anl.gov>> wrote:
>>> > > > > Thanks. Have you run your new version through the MPICH test
>>> suite to make sure there are no bugs in the new sort?
>>> > > > >
>>> > > > > Can you look at Comm_dup to see why it improved so much?
>>> > > > >
>>> > > > > BG/Q has a relatively slower processor, so the sorting time is
>>> noticeable, unlike on the NERSC system.
>>> > > > >
>>> > > > > I think we should include it in 3.1.1.
>>> > > > >
>>> > > > > Rajeev
>>> > > > >
>>> > > > > On May 27, 2014, at 9:16 AM, Junchao Zhang <jczhang at mcs.anl.gov
>>> <mailto:jczhang at mcs.anl.gov>>
>>> > > > > wrote:
>>> > > > >
>>> > > > > > I tested Sam's HPGMG on Mira. I tried mpiwrapper-gcc and
>>> mpiwrapper-xl provided by the softenv of Mira, and also tried a
>>> self-built mpich with my own comm_split + mergesort. Here are the test
>>> results.
>>> > > > > >
>>> > > > > > mpiwrapper-gcc and mpiwrapper-xl give similar results, like
>>> the following (all times in seconds):
>>> > > > > > # of ranks comm_dup comm_split MGBuild
>>> > > > > > 64000 2.059304 2.291601 28.085873
>>> > > > > > 125000 7.807612 8.158366 88.644299
>>> > > > > > 512000 235.923766 (no result; the job timed
>>> out)
>>> > > > > >
>>> > > > > > mpich-mergesort
>>> > > > > > # of ranks comm_dup comm_split MGBuild
>>> > > > > > 64000 0.046616 0.716383 6.999050
>>> > > > > > 125000 0.065818 1.293382 10.696351
>>> > > > > > 512000 0.229605 4.998839 52.467708
>>> > > > > >
>>> > > > > > I don't know why the MPICH libraries on Mira are bad, since
>>> optimizations in qsort to avoid the worst case are not new. The libc
>>> library on Mira should have an optimized qsort.
>>> > > > > > Another thing I have not looked into is why comm_dup is
>>> affected by comm_split. It seems comm_dup calls comm_split internally.
>>> > > > > >
>>> > > > > > In short, the current mergesort solution largely fixes the
>>> scaling problem. We may further investigate the issue and come up with
>>> better solutions. But if you think this is a key fix, probably we can
>>> push it to 3.1.1.
>>> > > > > >
>>> > > > > >
>>> > > > > > --Junchao Zhang
>>> > > > > >
>>> > > > > >
>>> > > > > > On Sun, May 25, 2014 at 12:25 PM, Junchao Zhang
>>> <jczhang at mcs.anl.gov <mailto:jczhang at mcs.anl.gov>> wrote:
>>> > > > > > Yes, I am going to test that.
>>> > > > > >
>>> > > > > > --Junchao Zhang
>>> > > > > >
>>> > > > > >
>>> > > > > > On Sun, May 25, 2014 at 11:11 AM, Thakur, Rajeev
>>> <thakur at mcs.anl.gov <mailto:thakur at mcs.anl.gov>> wrote:
>>> > > > > > Can you also see if the xlc compiler makes any difference?
>>> > > > > >
>>> > > > > > Rajeev
>>> > > > > >
>>> > > > > > On May 25, 2014, at 8:14 AM, Junchao Zhang
>>> <jczhang at mcs.anl.gov <mailto:jczhang at mcs.anl.gov>>
>>> > > > > > wrote:
>>>
>>> > > > > >
>>> > > > > > > I tested hpgmg on Mira and reproduced the scaling problem.
>>> I used mpiwrapper-gcc.
>>> > > > > > > With 64000 MPI tasks, the output is:
>>> > > > > > > Duplicating MPI_COMM_WORLD...done (2.059304 seconds)
>>> > > > > > > Building MPI subcommunicators..., level 1...done (2.291601
>>> seconds)
>>> > > > > > > Total time in MGBuild 28.085873 seconds
>>> > > > > > > With 125000 MPI Tasks, the output is:
>>> > > > > > > Duplicating MPI_COMM_WORLD...done (7.807612 seconds)
>>> > > > > > > Building MPI subcommunicators..., level 1...done (8.158366
>>> seconds)
>>> > > > > > > Total time in MGBuild 88.644299 seconds
>>> > > > > > >
>>> > > > > > > Let me replace qsort with mergesort in Comm_split to see
>>> what will happen.
>>> > > > > > >
>>> > > > > > > --Junchao Zhang
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > On Sat, May 24, 2014 at 9:07 AM, Sam Williams
>>> <swwilliams at lbl.gov <mailto:swwilliams at lbl.gov>> wrote:
>>> > > > > > > I saw the problem on Mira and K but not Edison. I don't
>>> know if that is due to scale or implementation.
>>> > > > > > >
>>> > > > > > > On Mira, I was running jobs with a 10min wallclock limit.
>>> I scaled to 46656 processes with 64 threads per process (c1,
>>> OMP_NUM_THREADS=64) and all jobs completed successfully. However, I was
>>> only looking at MGSolve times and not MGBuild times. I then decided to
>>> explore 8 threads per process (c8, OMP_NUM_THREADS=8) and started at the
>>> high concurrency. With 373248 processes, the jobs timed out while still
>>> in MGBuild after 10 minutes, and again with a 20-minute limit. At that
>>> point I added the USE_SUBCOMM option to enable/disable the use of
>>> comm_split. I haven't tried scaling with the subcommunicator on Mira
>>> since then.
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > On May 24, 2014, at 6:39 AM, Junchao Zhang
>>> <jczhang at mcs.anl.gov <mailto:jczhang at mcs.anl.gov>> wrote:
>>> > > > > > >
>>> > > > > > > > Hi, Sam,
>>> > > > > > > > Could you give me the exact number of MPI ranks for
>>> your results on Mira?
>>> > > > > > > > I ran hpgmg on Edison with export OMP_NUM_THREADS=1,
>>> aprun -n 64000 -ss -cc numa_node ./hpgmg-fv 6 1. The total time in
>>> MGBuild was about 0.005 seconds. I was wondering how many cores I need to
>>> request to reproduce the problem.
>>> > > > > > > > Thanks.
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > > --Junchao Zhang
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > > On Sat, May 17, 2014 at 9:06 AM, Sam Williams
>>> <swwilliams at lbl.gov <mailto:swwilliams at lbl.gov>> wrote:
>>> > > > > > > > I've been conducting scaling experiments on the Mira
>>> (Blue Gene/Q) and K (Sparc) supercomputers. I've noticed that the time
>>> required for MPI_Comm_split and MPI_Comm_dup can grow quickly with scale
>>> (~P^2). As such, their performance eventually becomes a bottleneck. That
>>> is, although the benefit of using a subcommunicator is huge (multigrid
>>> solves are weak-scalable), the penalty of creating one (multigrid build
>>> time) is also huge.
>>> > > > > > > >
>>> > > > > > > > For example, when scaling from 1 to 46K nodes (= cubes of
>>> integers) on Mira, the time (in seconds) required to build an MG solver
>>> (including a subcommunicator) scales as
>>> > > > > > > > 222335.output: Total time in MGBuild 0.056704
>>> > > > > > > > 222336.output: Total time in MGBuild 0.060834
>>> > > > > > > > 222348.output: Total time in MGBuild 0.064782
>>> > > > > > > > 222349.output: Total time in MGBuild 0.090229
>>> > > > > > > > 222350.output: Total time in MGBuild 0.075280
>>> > > > > > > > 222351.output: Total time in MGBuild 0.091852
>>> > > > > > > > 222352.output: Total time in MGBuild 0.137299
>>> > > > > > > > 222411.output: Total time in MGBuild 0.301552
>>> > > > > > > > 222413.output: Total time in MGBuild 0.606444
>>> > > > > > > > 222415.output: Total time in MGBuild 0.745272
>>> > > > > > > > 222417.output: Total time in MGBuild 0.779757
>>> > > > > > > > 222418.output: Total time in MGBuild 4.671838
>>> > > > > > > > 222419.output: Total time in MGBuild 15.123162
>>> > > > > > > > 222420.output: Total time in MGBuild 33.875626
>>> > > > > > > > 222421.output: Total time in MGBuild 49.494547
>>> > > > > > > > 222422.output: Total time in MGBuild 151.329026
>>> > > > > > > >
>>> > > > > > > > If I disable the call to MPI_Comm_Split, my time scales as
>>> > > > > > > > 224982.output: Total time in MGBuild 0.050143
>>> > > > > > > > 224983.output: Total time in MGBuild 0.052607
>>> > > > > > > > 224988.output: Total time in MGBuild 0.050697
>>> > > > > > > > 224989.output: Total time in MGBuild 0.078343
>>> > > > > > > > 224990.output: Total time in MGBuild 0.054634
>>> > > > > > > > 224991.output: Total time in MGBuild 0.052158
>>> > > > > > > > 224992.output: Total time in MGBuild 0.060286
>>> > > > > > > > 225008.output: Total time in MGBuild 0.062925
>>> > > > > > > > 225009.output: Total time in MGBuild 0.097357
>>> > > > > > > > 225010.output: Total time in MGBuild 0.061807
>>> > > > > > > > 225011.output: Total time in MGBuild 0.076617
>>> > > > > > > > 225012.output: Total time in MGBuild 0.099683
>>> > > > > > > > 225013.output: Total time in MGBuild 0.125580
>>> > > > > > > > 225014.output: Total time in MGBuild 0.190711
>>> > > > > > > > 225016.output: Total time in MGBuild 0.218329
>>> > > > > > > > 225017.output: Total time in MGBuild 0.282081
>>> > > > > > > >
>>> > > > > > > > Although I didn't directly measure it, this suggests the
>>> time for MPI_Comm_Split is growing roughly quadratically with process
>>> concurrency.
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > > I see the same effect on the K machine (8...64K nodes)
>>> where the code uses comm_split/dup in conjunction:
>>> > > > > > > > run00008_7_1.sh.o2412931: Total time in MGBuild
>>> 0.026458 seconds
>>> > > > > > > > run00064_7_1.sh.o2415876: Total time in MGBuild
>>> 0.039121 seconds
>>> > > > > > > > run00512_7_1.sh.o2415877: Total time in MGBuild
>>> 0.086800 seconds
>>> > > > > > > > run01000_7_1.sh.o2414496: Total time in MGBuild
>>> 0.129764 seconds
>>> > > > > > > > run01728_7_1.sh.o2415878: Total time in MGBuild
>>> 0.224576 seconds
>>> > > > > > > > run04096_7_1.sh.o2415880: Total time in MGBuild
>>> 0.738979 seconds
>>> > > > > > > > run08000_7_1.sh.o2414504: Total time in MGBuild
>>> 2.123800 seconds
>>> > > > > > > > run13824_7_1.sh.o2415881: Total time in MGBuild
>>> 6.276573 seconds
>>> > > > > > > > run21952_7_1.sh.o2415882: Total time in MGBuild
>>> 13.634200 seconds
>>> > > > > > > > run32768_7_1.sh.o2415884: Total time in MGBuild
>>> 36.508670 seconds
>>> > > > > > > > run46656_7_1.sh.o2415874: Total time in MGBuild
>>> 58.668228 seconds
>>> > > > > > > > run64000_7_1.sh.o2415875: Total time in MGBuild
>>> 117.322217 seconds
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > > A glance at the implementation on Mira (I don't know if
>>> the implementation on K is stock) suggests it uses qsort to
>>> sort based on keys. Unfortunately, qsort is not performance-robust like
>>> heap/merge sort. If one were to call comm_split in the natural way,
>>> like...
>>> > > > > > > > MPI_Comm_split(...,mycolor,myrank,...)
>>> > > > > > > > then one runs the risk that the keys are presorted. This
>>> hits the worst-case computational complexity for qsort... O(P^2).
>>> Demanding that programmers avoid sending sorted keys seems unreasonable.
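To spell out the worst case Sam describes, here is a toy demonstration (a textbook quicksort with the last element as pivot; purely illustrative, not IBM's or glibc's implementation): on already-sorted keys every partition is maximally unbalanced, so the comparison count grows as roughly P^2/2.

#include <stdio.h>

static long comparisons = 0;

static void naive_qsort(int *a, int lo, int hi)
{
    int pivot, i, j, t;
    if (lo >= hi)
        return;
    pivot = a[hi];                        /* last element as pivot */
    for (i = lo - 1, j = lo; j < hi; j++) {
        comparisons++;
        if (a[j] <= pivot) {
            i++;
            t = a[i]; a[i] = a[j]; a[j] = t;
        }
    }
    t = a[i + 1]; a[i + 1] = a[hi]; a[hi] = t;
    naive_qsort(a, lo, i);                /* on sorted input: everything but the pivot */
    naive_qsort(a, i + 2, hi);            /* on sorted input: empty */
}

int main(void)
{
    enum { P = 10000 };
    static int key[P];
    int i;

    for (i = 0; i < P; i++)
        key[i] = i;                       /* keys already sorted, as with key = myrank */
    naive_qsort(key, 0, P - 1);
    printf("P = %d, comparisons = %ld (~P^2/2 = %d)\n", P, comparisons, P * P / 2);
    return 0;
}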
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > > I should note, I see a similar lack of scaling with
>>> MPI_Comm_dup on the K machine. Unfortunately, my BGQ data used an
>>> earlier version of the code that did not use comm_dup. As such, I can’t
>>> definitively say that it is a problem on that machine as well.
>>> > > > > > > >
>>> > > > > > > > Thus, I'm asking that scalable implementations of
>>> comm_split/dup using merge/heap sort, whose worst-case complexity is
>>> still P log P, be prioritized in the next update.
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > > thanks
>>> > > > > >
>>> > > > > >
>>> > > > > >
>>> > > > >
>>> > > > >
>>> > > >
>>> > > >
>>> > >
>>> > >
>>> > >
>>> > > <comm_split.c>
>>> >
>>> >
>>> >
>>>
>>>
>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
>
--
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/