[mpich-discuss] Bug fix for dims_create
Ian Hutchinson
hutch at psfc.mit.edu
Sat Dec 8 15:08:26 CST 2012
Thanks Jeff
Ian Hutchinson
http://www.psfc.mit.edu/people/hutch/
-
> Date: Sat, 8 Dec 2012 14:46:03 -0600
> From: Jeff Hammond <jhammond at alcf.anl.gov>
> Reply-To: discuss at mpich.org
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] Bug fix for dims_create
>
> Hi Ian,
>
> I have created https://trac.mpich.org/projects/mpich/ticket/1765 on
> your behalf so that this issue can be tracked by the developers.
>
> Best,
>
> Jeff
>
> On Sat, Dec 8, 2012 at 2:07 PM, Ian Hutchinson <hutch at psfc.mit.edu> wrote:
>>
>> The file src/mpi/topo/dims_create.c contains the code that determines the
>> result of
>>
>> MPI_DIMS_CREATE
>>
>> It contains a bug that causes it to distribute the processes among the
>> dimensions in a way that does not satisfy the objective of being "as
>> close to each other as possible". For example, when called for 3 dimensions
>> with 16 nodes, it returns the topology 4, 4, 1. It ought to return 4, 2, 2.
>>
>> This bug is caused by some longstanding cobbled-together code that is called
>> when all the factors of the nnodes are 2 (which is not an unusual case).
>>
>> I attach (and include below) a patch to correct this bug. It would be great
>> if it could find its way into the distribution.
>>
>> Thanks
>> Ian Hutchinson
>> http://www.psfc.mit.edu/people/hutch/
>>
>> =========================================================================
>>
>> --- dims_create.c.dist 2012-12-08 13:46:46.000000000 -0500
>> +++ dims_create.c 2012-12-08 13:48:02.000000000 -0500
>> @@ -317,28 +317,22 @@
>> int cnt = factors[0].cnt; /* Numver of factors left */
>> int cnteach = ( cnt + dims_needed - 1 ) / dims_needed;
>> int factor_each;
>> -
>> - factor_each = factor;
>> - for (i=1; i<cnteach; i++) factor_each *= factor;
>>
>> - for (i=0; i<ndims; i++) {
>> - if (dims[i] == 0) {
>> - if (cnt > cnteach) {
>> - dims[i] = factor_each;
>> - cnt -= cnteach;
>> - }
>> - else if (cnt > 0) {
>> - factor_each = factor;
>> - for (j=1; j<cnt; j++)
>> - factor_each *= factor;
>> - dims[i] = factor_each;
>> - cnt = 0;
>> - }
>> - else {
>> - dims[i] = 1;
>> - }
>> + for (i=0;i<ndims;i++){
>> + if(dims[i]==0)dims[i]=-1;
>> + }
>> + i=0;
>> + while(cnt > 0){
>> + if(dims[i] < 0){
>> + dims[i]=dims[i]*factor;
>> + cnt--;
>> }
>> + if(++i >= ndims)i=0;
>> + }
>> + for (i=0;i<ndims;i++){
>> + if(dims[i] < 0)dims[i]=-dims[i];
>> }
>> +
>> }
>> else {
>> /* Here is the general case. */
>> _______________________________________________
>> discuss mailing list discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
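
The difference between the two code paths in the patch above can be reproduced outside MPICH with a small standalone sketch. The function names fill_chunked and fill_round_robin are hypothetical and do not exist in dims_create.c; they only mirror the removed and added lines of the patch, assuming a single prime factor `factor` that occurs `cnt` times and a dims array in which 0 marks a dimension still to be set.

```c
#include <assert.h>

/* Hypothetical helper mirroring the removed lines of the patch: each unset
 * (zero) dimension receives a chunk of ceil(cnt / dims_needed) factors until
 * the factors run out; remaining dimensions are set to 1. */
void fill_chunked(int factor, int cnt, int dims[], int ndims)
{
    int i, j, dims_needed = 0;
    for (i = 0; i < ndims; i++)
        if (dims[i] == 0) dims_needed++;
    assert(dims_needed > 0);

    int cnteach = (cnt + dims_needed - 1) / dims_needed;
    int factor_each = factor;
    for (i = 1; i < cnteach; i++) factor_each *= factor;

    for (i = 0; i < ndims; i++) {
        if (dims[i] != 0) continue;          /* dimension fixed by the caller */
        if (cnt > cnteach) {
            dims[i] = factor_each;
            cnt -= cnteach;
        } else if (cnt > 0) {
            factor_each = factor;
            for (j = 1; j < cnt; j++) factor_each *= factor;
            dims[i] = factor_each;           /* all leftover factors land here */
            cnt = 0;
        } else {
            dims[i] = 1;
        }
    }
}

/* Hypothetical helper mirroring the added lines of the patch: unset
 * dimensions are marked with a negative sign, then factors are dealt out
 * one at a time in round-robin order, and finally the sign is flipped back. */
void fill_round_robin(int factor, int cnt, int dims[], int ndims)
{
    int i, dims_needed = 0;
    for (i = 0; i < ndims; i++)
        if (dims[i] == 0) { dims[i] = -1; dims_needed++; }
    /* Without an unset dimension the loop below would never terminate. */
    assert(dims_needed > 0 || cnt == 0);

    i = 0;
    while (cnt > 0) {
        if (dims[i] < 0) { dims[i] *= factor; cnt--; }
        if (++i >= ndims) i = 0;
    }
    for (i = 0; i < ndims; i++)
        if (dims[i] < 0) dims[i] = -dims[i];
}
```

For 16 nodes in 3 dimensions (factor 2, cnt 4, dims initially 0, 0, 0), fill_chunked yields 4, 4, 1 and fill_round_robin yields 4, 2, 2, matching the example in the report.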