[mpich-discuss] Supporting > 64K ranks in CH4/UCX netmod
zhouh at anl.gov
Thu Apr 8 10:18:35 CDT 2021
I think we can do something about it. We'll follow up when we have updates.
From: M Xie via discuss <discuss at mpich.org>
Date: Thursday, April 8, 2021 at 12:30 AM
To: discuss at mpich.org <discuss at mpich.org>
Cc: M Xie <xmxmxie at gmail.com>
Subject: [mpich-discuss] Supporting > 64K ranks in CH4/UCX netmod
I am using MPICH with the CH4/UCX netmod; the version is mpich-3.4.1.
I noticed that there is a configure parameter "--with-ch4-rank-bits" that sets the value of CH4_RANK_BITS, but it seems CH4_RANK_BITS is not used anywhere in the code.
I also found that in netmod/ucx/ucx_impl.h, _UCX_init_tag()/_UCX_recv_tag() use only 16 bits to store the MPI rank in the ucp_tag, so the ucp_tag can no longer distinguish ranks correctly once the number of MPI ranks exceeds 64K.
In Open MPI, the pml/ucx module uses 20 bits of the ucp_tag for the rank, 20 bits for the context, and 24 bits for the MPI tag, so the maximum number of ranks in Open MPI is 1M.
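To make the bit arithmetic concrete, here is a minimal sketch of packing a 64-bit tag with the 24/20/20 split described above. It is not taken from the MPICH or Open MPI sources: the helper names (field_mask, rank_field, pack_tag, unpack_rank) and the field order inside the word are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only (field order is an assumption):
 * | MPI tag (24 bits) | source rank (20 bits) | context id (20 bits) | */
enum { TAG_BITS = 24, RANK_BITS = 20, CTX_BITS = 20 };

/* Mask covering the low `bits` bits of a 64-bit word (requires bits < 64). */
static inline uint64_t field_mask(unsigned bits)
{
    return (UINT64_C(1) << bits) - 1;
}

/* The value that survives when a rank is stored in a `bits`-wide field. */
static inline uint64_t rank_field(uint64_t rank, unsigned bits)
{
    return rank & field_mask(bits);
}

/* Pack the three fields into one 64-bit tag; each value must fit its field. */
static inline uint64_t pack_tag(uint64_t mpi_tag, uint64_t rank, uint64_t ctx)
{
    assert(mpi_tag <= field_mask(TAG_BITS));
    assert(rank <= field_mask(RANK_BITS));
    assert(ctx <= field_mask(CTX_BITS));
    return (mpi_tag << (RANK_BITS + CTX_BITS)) | (rank << CTX_BITS) | ctx;
}

/* Recover the source rank from a packed tag. */
static inline uint64_t unpack_rank(uint64_t packed)
{
    return (packed >> CTX_BITS) & field_mask(RANK_BITS);
}
```

With a 16-bit field, rank_field(65541, 16) and rank_field(5, 16) are the same value, which is the aliasing problem above 64K ranks. With a 20-bit field the two ranks stay distinct, and the field can hold 2^20 = 1,048,576 different ranks.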
Is there any plan to support > 64K ranks in MPICH/CH4/UCX?