[mpich-discuss] Supporting > 64K ranks in CH4/UCX netmod

Zhou, Hui zhouh at anl.gov
Wed Apr 21 21:32:27 CDT 2021


Hi Min,

Following up: we have a pull request, https://github.com/pmodels/mpich/pull/5201, that should add the configure option --with-ch4-ucx-rankbits=<N>. With it, you can set the number of rank bits for ch4:ucx to any value between 16 and 32. We expect to merge this PR soon.

--
Hui Zhou


From: Zhou, Hui <zhouh at anl.gov>
Date: Thursday, April 8, 2021 at 10:18 AM
To: discuss at mpich.org <discuss at mpich.org>
Cc: M Xie <xmxmxie at gmail.com>
Subject: Re: [mpich-discuss] Supporting > 64K ranks in CH4/UCX netmod
Hi Min,

I think we can do something about it. We’ll follow up when we have updates.

--
Hui Zhou


From: M Xie via discuss <discuss at mpich.org>
Date: Thursday, April 8, 2021 at 12:30 AM
To: discuss at mpich.org <discuss at mpich.org>
Cc: M Xie <xmxmxie at gmail.com>
Subject: [mpich-discuss] Supporting > 64K ranks in CH4/UCX netmod
Hi,

I am using MPICH with the CH4/UCX netmod; the version is mpich-3.4.1.

I noticed that there is a configure parameter "--with-ch4-rank-bits" which sets the value of CH4_RANK_BITS, but it seems that CH4_RANK_BITS is not used anywhere in the code.

I also found that in netmod/ucx/ucx_impl.h, _UCX_init_tag()/_UCX_recv_tag() use only 16 bits to encode the MPI rank in the ucp_tag, so the ucp_tag can no longer distinguish ranks correctly once the number of MPI ranks exceeds 64K (2^16 = 65,536).

In Open MPI, the pml/ucx module uses 20 bits for the rank in the ucp_tag, 20 bits for the context, and 24 bits for the MPI tag, so Open MPI can support up to 1M (2^20) ranks.

Is there any plan to support > 64K ranks in MPICH/CH4/UCX?

Thanks.

Min