[mpich-discuss] Supporting > 64K ranks in CH4/UCX netmod

Zhou, Hui zhouh at anl.gov
Thu Apr 8 10:18:35 CDT 2021


Hi Min,

I think we can do something about it. We’ll follow up when we have updates.

--
Hui Zhou


From: M Xie via discuss <discuss at mpich.org>
Date: Thursday, April 8, 2021 at 12:30 AM
To: discuss at mpich.org <discuss at mpich.org>
Cc: M Xie <xmxmxie at gmail.com>
Subject: [mpich-discuss] Supporting > 64K ranks in CH4/UCX netmod
Hi,

I am using MPICH 3.4.1 with the CH4/UCX netmod.

I noticed that there is a configure parameter, "--with-ch4-rank-bits", which sets the value of CH4_RANK_BITS, but it seems that CH4_RANK_BITS is not used anywhere in the code.

I also found that in netmod/ucx/ucx_impl.h, _UCX_init_tag()/_UCX_recv_tag() use only 16 bits of the ucp_tag for the MPI rank, so the ucp_tag can no longer distinguish ranks correctly once a job exceeds 64K (2^16 = 65536) ranks.

In Open MPI, the pml/ucx module uses 20 bits of the ucp_tag for the rank, 20 bits for the context, and 24 bits for the MPI tag, so the maximum number of ranks in Open MPI can be 1M (2^20).
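
To make the concern concrete, here is a minimal sketch of how a rank field of a given width behaves inside a 64-bit tag. It is not the MPICH or Open MPI code: the helper names (low_bits, pack_tag, unpack_rank), the field order, and the 16-bit context width used for the 16-bit-rank case are all assumptions for illustration. Only the widths 16 (rank, MPICH) and 20/20/24 (rank/context/tag, Open MPI) come from the description above.

```c
#include <stdint.h>

/* Keep only the low `bits` bits of v. `bits` must be in the range 1..63. */
static uint64_t low_bits(uint64_t v, unsigned bits)
{
    return v & ((UINT64_C(1) << bits) - 1);
}

/*
 * Pack three fields into one 64-bit tag, laid out as
 *   [ mpi_tag | rank | context ]
 * from the most to the least significant bits. This field order is an
 * assumption made for the sketch; real implementations may differ.
 * The MPI tag gets whatever bits remain: 64 - rank_bits - context_bits.
 */
static uint64_t pack_tag(uint64_t mpi_tag, uint64_t rank, uint64_t context,
                         unsigned rank_bits, unsigned context_bits)
{
    unsigned tag_bits = 64 - rank_bits - context_bits;

    return (low_bits(mpi_tag, tag_bits) << (rank_bits + context_bits)) |
           (low_bits(rank, rank_bits) << context_bits) |
           low_bits(context, context_bits);
}

/* Recover the rank field from a tag packed with the same field widths. */
static uint64_t unpack_rank(uint64_t tag, unsigned rank_bits,
                            unsigned context_bits)
{
    return low_bits(tag >> context_bits, rank_bits);
}
```

With a 16-bit rank field, pack_tag(5, 0, 3, 16, 16) and pack_tag(5, 65536, 3, 16, 16) produce the same value, because rank 65536 is truncated to 0. With a 20-bit rank field, the two tags differ, and ranks up to 1048575 (2^20 - 1) survive a pack/unpack round trip.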

Is there any plan to support > 64K ranks in MPICH/CH4/UCX?

Thanks.

Min