[mpich-discuss] UCX run-time warnings
Raffenetti, Kenneth J.
raffenet at mcs.anl.gov
Wed Nov 27 09:29:36 CST 2019
Could you try a development build (mpich@develop) and see if the
warnings persist? 3.3.2 only contains critical bug fixes. Our master git
branch has diverged quite a bit, especially in the ch4 device.
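
In case it helps, a rough Spack spec for that (variants copied from your
3.3.2 install below; a sketch, not a tested recipe) would be something
like:

  spack install mpich@develop device=ch4 netmod=ucx pmi=pmi2 +slurm +hydra
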
Ken
On 11/26/19 1:04 PM, Sajid Ali via discuss wrote:
> Hi MPICH-developers,
>
> I’ve built MPICH-3.3.2 from source on my university cluster to ensure that
> jobs can be launched with Slurm's srun (via --mpi=pmi2, Slurm's native
> PMI version). I built MPICH from source via Spack with the ch4:ucx netmod.
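>
> For reference, the launch line I'm using is along these lines (task
> count and benchmark binary here are just an example):
>
>     srun --mpi=pmi2 -n 2 ./osu_latency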
>
> I see the following errors in my outfile (from osu-micro-benchmarks runs):
>
>     ....
>     slurmstepd: error: mpi/pmi2: no value for key in req
>     ....
>     [1574612953.443924] [qnode5038:30333:0] mpool.c:38 UCX WARN object
>     0x1b88d00 was not returned to mpool ucp_rkeys
>     ...
>
> as shown in a typical outfile here.
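>
> In case it helps with triage, I can re-run with more verbose UCX
> logging (UCX_LOG_LEVEL is a standard UCX environment variable; the
> launch line is just a sketch):
>
>     UCX_LOG_LEVEL=debug srun --mpi=pmi2 -n 2 ./osu_latency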
>
> I don’t see such errors when using openmpi-4.0.2, which was also built
> with UCX as the primary transport layer; for that build I compiled a
> newer version of UCX from source.
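>
> If useful, I could also try pinning that same newer UCX under MPICH,
> assuming the Spack mpich package pulls in an external ucx when
> netmod=ucx (the version below is only an example):
>
>     spack install mpich@3.3.2 device=ch4 netmod=ucx pmi=pmi2 +slurm ^ucx@1.6.1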
>
> I’m attaching the build log from the MPICH install for reference.
>
> [sas4990@quser10 ~]$ spack find -ldv mpich%gcc@8.3.0
> ==> 1 installed package
> -- linux-rhel7-ivybridge / gcc@8.3.0 ----------------------------
> btlyk64    mpich@3.3.2 device=ch4 +hydra netmod=ucx +pci pmi=pmi2 +romio+slurm~verbs+wrapperrpath
> eb74crw        libpciaccess@0.13.5
> t6h6tgg        libxml2@2.9.9~python
> 46i2h6v            libiconv@1.16
> dnbfmo2            xz@5.2.4
> xaq5v23            zlib@1.2.11+optimize+pic+shared
> 77kx7yy        slurm@19-05-3-2~gtk~hdf5~hwloc~mariadb~pmix+readline
> [sas4990@quser10 ~]$
>
> Thanks in advance for your help!
>
> --
> Sajid Ali | PhD Candidate
> Applied Physics
> Northwestern University
> s-sajid-ali.github.io
>
> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
>