[mpich-discuss] Bug fix for dims_create

Ian Hutchinson hutch at psfc.mit.edu
Sat Dec 8 15:08:26 CST 2012


Thanks Jeff

 	Ian Hutchinson
 	http://www.psfc.mit.edu/people/hutch/



> Date: Sat, 8 Dec 2012 14:46:03 -0600
> From: Jeff Hammond <jhammond at alcf.anl.gov>
> Reply-To: discuss at mpich.org
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] Bug fix for dims_create
> 
> Hi Ian,
>
> I have created https://trac.mpich.org/projects/mpich/ticket/1765 on
> your behalf so that this issue can be tracked by the developers.
>
> Best,
>
> Jeff
>
> On Sat, Dec 8, 2012 at 2:07 PM, Ian Hutchinson <hutch at psfc.mit.edu> wrote:
>>
>> The file src/mpi/topo/dims_create.c contains the code that determines the
>> result of
>>
>> MPI_DIMS_CREATE
>>
>> It contains a bug that causes it to distribute the processes among the
>> dimensions in a way that does not satisfy the objective of being "as
>> close to each other as possible". For example, if called in 3 dimensions
>> with 16 nodes, the topology returned is 4, 4, 1. It ought to be 4, 2, 2.
>>
>> This bug is caused by some longstanding cobbled-together code that is run
>> when all the factors of nnodes are 2 (which is not an unusual case).
>>
>> I attach (and include below) a patch to correct this bug. It would be great
>> if it could find its way into the distribution.
>>
>> Thanks
>>         Ian Hutchinson
>>         http://www.psfc.mit.edu/people/hutch/
>>
>> =========================================================================
>>
>> --- dims_create.c.dist  2012-12-08 13:46:46.000000000 -0500
>> +++ dims_create.c       2012-12-08 13:48:02.000000000 -0500
>> @@ -317,28 +317,22 @@
>>             int cnt    = factors[0].cnt; /* Numver of factors left */
>>             int cnteach = ( cnt + dims_needed - 1 ) / dims_needed;
>>             int factor_each;
>> -           factor_each = factor;
>> -           for (i=1; i<cnteach; i++) factor_each *= factor;
>>
>> -           for (i=0; i<ndims; i++) {
>> -               if (dims[i] == 0) {
>> -                   if (cnt > cnteach) {
>> -                       dims[i] = factor_each;
>> -                       cnt -= cnteach;
>> -                   }
>> -                   else if (cnt > 0) {
>> -                       factor_each = factor;
>> -                       for (j=1; j<cnt; j++)
>> -                           factor_each *= factor;
>> -                       dims[i] = factor_each;
>> -                       cnt = 0;
>> -                   }
>> -                   else {
>> -                       dims[i] = 1;
>> -                   }
>> +           for (i=0;i<ndims;i++){
>> +               if(dims[i]==0)dims[i]=-1;
>> +           }
>> +           i=0;
>> +           while(cnt > 0){
>> +               if(dims[i] < 0){
>> +                   dims[i]=dims[i]*factor;
>> +                   cnt--;
>>                 }
>> +               if(++i >= ndims)i=0;
>> +           }
>> +           for (i=0;i<ndims;i++){
>> +               if(dims[i] < 0)dims[i]=-dims[i];
>>             }
>> +
>>         }
>>         else {
>>             /* Here is the general case.  */
>> _______________________________________________
>> discuss mailing list     discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
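
The difference between the removed and the added lines of the patch above can be tried outside MPICH with a small standalone sketch. This is only an illustration under stated assumptions: the function names spread_blockwise and spread_round_robin are hypothetical and do not exist in dims_create.c, `factor` is the single repeated prime factor, `cnt` is how many times it occurs, and entries of dims[] equal to 0 are the ones still to be filled (preset entries are assumed positive).

```c
#include <assert.h>

/* Hypothetical helper, not MPICH code: block-wise assignment in the style
 * of the removed lines. Each free entry takes up to cnteach factors at
 * once, so later entries can be left with none. */
static void spread_blockwise(int factor, int cnt, int ndims, int dims[])
{
    int i, j, dims_needed = 0;

    for (i = 0; i < ndims; i++)
        if (dims[i] == 0) dims_needed++;
    assert(dims_needed > 0);

    int cnteach = (cnt + dims_needed - 1) / dims_needed; /* ceiling division */

    for (i = 0; i < ndims; i++) {
        if (dims[i] != 0) continue;            /* preset entry, leave alone */
        int take = (cnt > cnteach) ? cnteach : cnt;
        dims[i] = 1;
        for (j = 0; j < take; j++) dims[i] *= factor;
        cnt -= take;
    }
}

/* Hypothetical helper, not MPICH code: round-robin assignment in the style
 * of the added lines. Free entries are marked negative, receive one factor
 * per pass, and are made positive again at the end. */
static void spread_round_robin(int factor, int cnt, int ndims, int dims[])
{
    int i, free_dims = 0;

    for (i = 0; i < ndims; i++)
        if (dims[i] == 0) { dims[i] = -1; free_dims++; }
    /* Without a free entry the loop below would never terminate. */
    assert(free_dims > 0 || cnt == 0);

    i = 0;
    while (cnt > 0) {
        if (dims[i] < 0) { dims[i] *= factor; cnt--; }
        if (++i >= ndims) i = 0;               /* wrap around */
    }

    for (i = 0; i < ndims; i++)
        if (dims[i] < 0) dims[i] = -dims[i];
}
```

For the case from the report (16 = 2^4 in 3 dimensions, so factor 2 and cnt 4, starting from dims = {0, 0, 0}), the block-wise helper gives 4, 4, 1 and the round-robin helper gives 4, 2, 2. Because the round-robin pass starts at index 0, earlier free entries never end up smaller than later ones.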


