From alexanda at txcorp.com Fri Sep 4 10:47:40 2020 From: alexanda at txcorp.com (David Alexander) Date: Fri, 4 Sep 2020 09:47:40 -0600 Subject: [mpich-discuss] Help resolving hwloc_pci_compare_busids: Assertion `0' failed error Message-ID: When I execute a distributed copy of MPICH 3.3.2 with a program built against that same copy of MPICH, I see the following error on the target machine: $ mpiexec -np 2 myprogram myprogram: /builds/mpich-3.3.2/src/hwloc/hwloc/pci-common.c:259: hwloc_pci_compare_busids: Assertion `0' failed. myprogram: /builds/mpich-3.3.2/src/hwloc/hwloc/pci-common.c:259: hwloc_pci_compare_busids: Assertion `0' failed. The target machine, where mpiexec is running, is an SGI with SLES 12 and the machine that I built on is CentOS 7. The library dependencies of mpiexec.hydra and myprogram are below: $ lddtree /installation/bin/mpiexec.hydra mpiexec.hydra => /installation/bin/mpiexec.hydra (interpreter => /lib64/ld-linux-x86-64.so.2) libm.so.6 => /lib64/libm.so.6 ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 libudev.so.1 => /lib64/libudev.so.1 librt.so.1 => /lib64/librt.so.1 libcap.so.2 => /lib64/libcap.so.2 libattr.so.1 => /lib64/libattr.so.1 libdw.so.1 => /lib64/libdw.so.1 libelf.so.1 => /lib64/libelf.so.1 libz.so.1 => /lib64/libz.so.1 liblzma.so.5 => /lib64/liblzma.so.5 libbz2.so.1 => /lib64/libbz2.so.1 libdl.so.2 => /lib64/libdl.so.2 libgcc_s.so.1 => /lib64/libgcc_s.so.1 libpciaccess.so.0 => /lib64/libpciaccess.so.0 libxml2.so.2 => /lib64/libxml2.so.2 libpthread.so.0 => /lib64/libpthread.so.0 libc.so.6 => /lib64/libc.so.6 $ lddtree /installation/bin/myprogram myprogram => /installation/bin/myprogram (interpreter => /lib64/ld-linux-x86-64.so.2) libHYPRE.so => /installation/bin/../../lib/libHYPRE.so libz.so.1 => /installation/bin/../../lib/libz.so.1 libgfortran.so.5 => /installation/bin/../../lib/libgfortran.so.5 libquadmath.so.0 => /installation/bin/../../lib/libquadmath.so.0 libpthread.so.0 => /lib64/libpthread.so.0 libutil.so.1 => /lib64/libutil.so.1 libdl.so.2 => /lib64/libdl.so.2 librt.so.1 => /lib64/librt.so.1 libcurand.so.10 => /installation/bin/../../lib/libcurand.so.10 libmpicxx.so.12 => /installation/bin/../../lib/libmpicxx.so.12 libudev.so.1 => /lib64/libudev.so.1 libcap.so.2 => /lib64/libcap.so.2 libattr.so.1 => /lib64/libattr.so.1 libdw.so.1 => /lib64/libdw.so.1 libelf.so.1 => /lib64/libelf.so.1 liblzma.so.5 => /lib64/liblzma.so.5 libbz2.so.1 => /lib64/libbz2.so.1 libpciaccess.so.0 => /lib64/libpciaccess.so.0 libxml2.so.2 => /lib64/libxml2.so.2 libmpi.so.12 => /installation/bin/../../lib/libmpi.so.12 libpython2.7.so.1.0 => /installation/bin/../../lib/libpython2.7.so.1.0 libstdc++.so.6 => /installation/bin/../../lib/libstdc++.so.6 libm.so.6 => /lib64/libm.so.6 libgomp.so.1 => /installation/bin/../../lib/libgomp.so.1 libgcc_s.so.1 => /lib64/libgcc_s.so.1 libc.so.6 => /lib64/libc.so.6 ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 Thanks! From Brice.Goglin at inria.fr Fri Sep 4 11:11:05 2020 From: Brice.Goglin at inria.fr (Brice Goglin) Date: Fri, 4 Sep 2020 18:11:05 +0200 Subject: [mpich-discuss] Help resolving hwloc_pci_compare_busids: Assertion `0' failed error In-Reply-To: References: Message-ID: Hello I think we've seen this issue once in the past, but I couldn't find it in the archives yet. Anyway, it's not MPICH's fault. Can you open an issue on github/open-mpi/hwloc, with the output of lspci -vt? If you can build hwloc, run lstopo to check whether the issue occurs, and get a gdb backtrace, it'd be great.
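(For the record, a rough sketch of capturing that backtrace, assuming lstopo is built with debug symbols and can be run directly:)

$ gdb --args lstopo    # in a hwloc build tree, use libtool --mode=execute gdb utils/lstopo/lstopo instead
(gdb) run
(gdb) bt               # once the Assertion `0' abort stops the process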
I am going to add some debug printf before this assert to avoid having to run gdb next time. In the meantime, try setting HWLOC_COMPONENTS=-pci,-linuxio in the environment so that hwloc's PCI backends are disabled. Hopefully, MPICH won't fail because of this. Brice Le 04/09/2020 ? 17:47, David Alexander via discuss a ?crit?: > When I execute a distributed copy of MPICH 3.3.2 with a program > built against that same copy of MPICH I see the following error on > the target machine: > > $ mpiexec -np 2 myprogram > > myprogram: /builds/mpich-3.3.2/src/hwloc/hwloc/pci-common.c:259: hwloc_pci_compare_busids: Assertion `0' failed. > myprogram: /builds/mpich-3.3.2/src/hwloc/hwloc/pci-common.c:259: hwloc_pci_compare_busids: Assertion `0' failed. > > The target machine where mpiexec is running on is an SGI with SLES12 and the > machine that I built on is Centos7. > > The library dependency of mpiexec.hydra and myprogram are below: > > $ lddtree /installation/bin/mpiexec.hydra > mpiexec.hydra => /installation/bin/mpiexec.hydra (interpreter => /lib64/ld-linux-x86-64.so.2) > libm.so.6 => /lib64/libm.so.6 > ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 > libudev.so.1 => /lib64/libudev.so.1 > librt.so.1 => /lib64/librt.so.1 > libcap.so.2 => /lib64/libcap.so.2 > libattr.so.1 => /lib64/libattr.so.1 > libdw.so.1 => /lib64/libdw.so.1 > libelf.so.1 => /lib64/libelf.so.1 > libz.so.1 => /lib64/libz.so.1 > liblzma.so.5 => /lib64/liblzma.so.5 > libbz2.so.1 => /lib64/libbz2.so.1 > libdl.so.2 => /lib64/libdl.so.2 > libgcc_s.so.1 => /lib64/libgcc_s.so.1 > libpciaccess.so.0 => /lib64/libpciaccess.so.0 > libxml2.so.2 => /lib64/libxml2.so.2 > libpthread.so.0 => /lib64/libpthread.so.0 > libc.so.6 => /lib64/libc.so.6 > > $ lddtree /installation/bin/myprogram > myprogram => /installation/bin/myprogram (interpreter => /lib64/ld-linux-x86-64.so.2) > libHYPRE.so => /installation/bin/../../lib/libHYPRE.so > libz.so.1 => /installation/bin/../../lib/libz.so.1 > libgfortran.so.5 => /installation/bin/../../lib/libgfortran.so.5 > libquadmath.so.0 => /installation/bin/../../lib/libquadmath.so.0 > libpthread.so.0 => /lib64/libpthread.so.0 > libutil.so.1 => /lib64/libutil.so.1 > libdl.so.2 => /lib64/libdl.so.2 > librt.so.1 => /lib64/librt.so.1 > libcurand.so.10 => /installation/bin/../../lib/libcurand.so.10 > libmpicxx.so.12 => /installation/bin/../../lib/libmpicxx.so.12 > libudev.so.1 => /lib64/libudev.so.1 > libcap.so.2 => /lib64/libcap.so.2 > libattr.so.1 => /lib64/libattr.so.1 > libdw.so.1 => /lib64/libdw.so.1 > libelf.so.1 => /lib64/libelf.so.1 > liblzma.so.5 => /lib64/liblzma.so.5 > libbz2.so.1 => /lib64/libbz2.so.1 > libpciaccess.so.0 => /lib64/libpciaccess.so.0 > libxml2.so.2 => /lib64/libxml2.so.2 > libmpi.so.12 => /installation/bin/../../lib/libmpi.so.12 > libpython2.7.so.1.0 => /installation/bin/../../lib/libpython2.7.so.1.0 > libstdc++.so.6 => /installation/bin/../../lib/libstdc++.so.6 > libm.so.6 => /lib64/libm.so.6 > libgomp.so.1 => /installation/bin/../../lib/libgomp.so.1 > libgcc_s.so.1 => /installation/bin/../../lib/libgcc_s.so.1 > libc.so.6 => /lib64/libc.so.6 > ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 > > Thanks! > _______________________________________________ > discuss mailing list discuss at mpich.org > To manage subscription options or unsubscribe: > https://lists.mpich.org/mailman/listinfo/discuss From raffenet at mcs.anl.gov Fri Sep 4 15:41:07 2020 From: raffenet at mcs.anl.gov (Raffenetti, Kenneth J.) 
Date: Fri, 4 Sep 2020 20:41:07 +0000 Subject: [mpich-discuss] Unable to use --enable-fast=O3,ndebug In-Reply-To: References: Message-ID: I updated the "Report a bug" link to go to GitHub. I also created https://github.com/pmodels/mpich/issues/4773 to capture this specific issue. Ken On 8/26/20, 12:54 AM, "Jeff Hammond via discuss" wrote: A GitHub issue seems appropriate here (https://github.com/pmodels/mpich/issues)... Jeff On Tue, Aug 25, 2020 at 2:09 PM William Gropp via discuss wrote: (If there is a better way to submit these, let me know. The "bug report" link doesn't work from mpich.org.) --enable-fast in the top-level configure accepts multiple, comma-separated items. The configure in modules/yaksa/configure.ac expects (without checking) an O level only. This causes the configure step to fail with an unrelated error (yaksa thinks that it can't find pthreads). Bill William Gropp Director and Chief Scientist, NCSA Thomas M. Siebel Chair in Computer Science University of Illinois Urbana-Champaign _______________________________________________ discuss mailing list discuss at mpich.org To manage subscription options or unsubscribe: https://lists.mpich.org/mailman/listinfo/discuss -- Jeff Hammond jeff.science at gmail.com http://jeffhammond.github.io/ From alexanda at txcorp.com Fri Sep 4 18:08:02 2020 From: alexanda at txcorp.com (David Alexander) Date: Fri, 4 Sep 2020 17:08:02 -0600 Subject: [mpich-discuss] Help resolving hwloc_pci_compare_busids: Assertion `0' failed error In-Reply-To: References: Message-ID: Thanks! Setting HWLOC_COMPONENTS=-pci,-linuxio worked! BTW, I am not absolutely sure it was in the MPICH code. I do know that $ mpiexec -np 2 hostname worked and did not give the error, but of course that doesn't do any MPI operations, since the target executable is just "hostname". Thanks again! dave > On Sep 4, 2020, at 10:11 AM, Brice Goglin via discuss wrote: > > Hello > > I think we've seen this issue once in the past, but I couldn't find it > in the archives yet. > > Anyway, it's not MPICH's fault. Can you open an issue on > github/open-mpi/hwloc, with the output of lspci -vt? If you can build > hwloc, run lstopo to check whether the issue occurs, and get a gdb > backtrace, it'd be great. I am going to add some debug printf before > this assert to avoid having to run gdb next time. > > In the meantime, try setting HWLOC_COMPONENTS=-pci,-linuxio in the > environment so that hwloc's PCI backends are disabled. Hopefully, MPICH > won't fail because of this. > > Brice > > > > Le 04/09/2020 à 17:47, David Alexander via discuss a écrit : >> When I execute a distributed copy of MPICH 3.3.2 with a program >> built against that same copy of MPICH, I see the following error on >> the target machine: >> >> $ mpiexec -np 2 myprogram >> >> myprogram: /builds/mpich-3.3.2/src/hwloc/hwloc/pci-common.c:259: hwloc_pci_compare_busids: Assertion `0' failed. >> myprogram: /builds/mpich-3.3.2/src/hwloc/hwloc/pci-common.c:259: hwloc_pci_compare_busids: Assertion `0' failed. >> >> The target machine, where mpiexec is running, is an SGI with SLES 12 and the >> machine that I built on is CentOS 7.
>> >> The library dependency of mpiexec.hydra and myprogram are below: >> >> $ lddtree /installation/bin/mpiexec.hydra >> mpiexec.hydra => /installation/bin/mpiexec.hydra (interpreter => /lib64/ld-linux-x86-64.so.2) >> libm.so.6 => /lib64/libm.so.6 >> ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 >> libudev.so.1 => /lib64/libudev.so.1 >> librt.so.1 => /lib64/librt.so.1 >> libcap.so.2 => /lib64/libcap.so.2 >> libattr.so.1 => /lib64/libattr.so.1 >> libdw.so.1 => /lib64/libdw.so.1 >> libelf.so.1 => /lib64/libelf.so.1 >> libz.so.1 => /lib64/libz.so.1 >> liblzma.so.5 => /lib64/liblzma.so.5 >> libbz2.so.1 => /lib64/libbz2.so.1 >> libdl.so.2 => /lib64/libdl.so.2 >> libgcc_s.so.1 => /lib64/libgcc_s.so.1 >> libpciaccess.so.0 => /lib64/libpciaccess.so.0 >> libxml2.so.2 => /lib64/libxml2.so.2 >> libpthread.so.0 => /lib64/libpthread.so.0 >> libc.so.6 => /lib64/libc.so.6 >> >> $ lddtree /installation/bin/myprogram >> myprogram => /installation/bin/myprogram (interpreter => /lib64/ld-linux-x86-64.so.2) >> libHYPRE.so => /installation/bin/../../lib/libHYPRE.so >> libz.so.1 => /installation/bin/../../lib/libz.so.1 >> libgfortran.so.5 => /installation/bin/../../lib/libgfortran.so.5 >> libquadmath.so.0 => /installation/bin/../../lib/libquadmath.so.0 >> libpthread.so.0 => /lib64/libpthread.so.0 >> libutil.so.1 => /lib64/libutil.so.1 >> libdl.so.2 => /lib64/libdl.so.2 >> librt.so.1 => /lib64/librt.so.1 >> libcurand.so.10 => /installation/bin/../../lib/libcurand.so.10 >> libmpicxx.so.12 => /installation/bin/../../lib/libmpicxx.so.12 >> libudev.so.1 => /lib64/libudev.so.1 >> libcap.so.2 => /lib64/libcap.so.2 >> libattr.so.1 => /lib64/libattr.so.1 >> libdw.so.1 => /lib64/libdw.so.1 >> libelf.so.1 => /lib64/libelf.so.1 >> liblzma.so.5 => /lib64/liblzma.so.5 >> libbz2.so.1 => /lib64/libbz2.so.1 >> libpciaccess.so.0 => /lib64/libpciaccess.so.0 >> libxml2.so.2 => /lib64/libxml2.so.2 >> libmpi.so.12 => /installation/bin/../../lib/libmpi.so.12 >> libpython2.7.so.1.0 => /installation/bin/../../lib/libpython2.7.so.1.0 >> libstdc++.so.6 => /installation/bin/../../lib/libstdc++.so.6 >> libm.so.6 => /lib64/libm.so.6 >> libgomp.so.1 => /installation/bin/../../lib/libgomp.so.1 >> libgcc_s.so.1 => /installation/bin/../../lib/libgcc_s.so.1 >> libc.so.6 => /lib64/libc.so.6 >> ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 >> >> Thanks! >> _______________________________________________ >> discuss mailing list discuss at mpich.org >> To manage subscription options or unsubscribe: >> https://lists.mpich.org/mailman/listinfo/discuss > _______________________________________________ > discuss mailing list discuss at mpich.org > To manage subscription options or unsubscribe: > https://lists.mpich.org/mailman/listinfo/discuss From Kent.Cheung at arm.com Mon Sep 21 04:51:15 2020 From: Kent.Cheung at arm.com (Kent Cheung) Date: Mon, 21 Sep 2020 09:51:15 +0000 Subject: [mpich-discuss] Intermittent hang in MPI_Finalize with PGI 20.1 In-Reply-To: <35D7644C-DA30-43BD-8198-93FC2CD76D9E@mcs.anl.gov> References: <35D7644C-DA30-43BD-8198-93FC2CD76D9E@mcs.anl.gov> Message-ID: Are there any updates on this issue? Thanks. Kent ________________________________ From: Raffenetti, Kenneth J. Sent: 24 June 2020 17:04 To: discuss at mpich.org Cc: Kent Cheung Subject: Re: [mpich-discuss] Intermittent hang in MPI_Finalize with PGI 20.1 Hi Kent, Thanks for your report. We have not seen this issue with any compiler/OS combination in our nightly tests. We are using PGI 19.4 at this time. 
I will request 20.1 be installed so we can investigate further. Ken On 6/23/20, 8:26 AM, "Kent Cheung via discuss" wrote: I'm running into an issue where processes sometimes hang when calling MPI_Finalize. This happens with both versions 3.3.2 and 3.4a2 on a single-node RedHat 7.5 x86-64 machine, when MPICH is compiled with PGI 20.1 with these configuration flags: --enable-debug --enable-shared --enable-debuginfo --enable-sharedlib=gcc If I change the default optimization level (-O2) by configuring with --enable-fast=O1 as well, the hang doesn't occur. Another data point is that the hang does not occur with PGI 19.5 at either optimization level. I have been testing with the cpi.c code in the examples folder built with just mpicc cpi.c mpiexec -n 3 ./a.out Here is the backtrace from one of the processes that is hanging: (gdb) bt #0 MPID_nem_mpich_blocking_recv () at /tmp/mpich-3.3.2/build/../src/mpid/ch3/channels/nemesis/include/mpid_nem_inline.h:1038 #1 MPIDI_CH3I_Progress () at ../src/mpid/ch3/channels/nemesis/src/ch3_progress.c:506 #2 0x00000000004fc88d in MPIDI_CH3U_VC_WaitForClose () at ../src/mpid/ch3/src/ch3u_handle_connection.c:383 #3 0x0000000000442364 in MPID_Finalize () at ../src/mpid/ch3/src/mpid_finalize.c:110 #4 0x0000000000408621 in PMPI_Finalize () at ../src/mpi/init/finalize.c:260 #5 0x00000000004023e5 in main () at cpi.c:59 Is there a potential fix to be made to MPICH to prevent processes hanging when MPICH is compiled with the default optimization level? Thanks, Kent IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhouh at anl.gov Mon Sep 21 08:59:54 2020 From: zhouh at anl.gov (Zhou, Hui) Date: Mon, 21 Sep 2020 13:59:54 +0000 Subject: [mpich-discuss] Intermittent hang in MPI_Finalize with PGI 20.1 In-Reply-To: References: <35D7644C-DA30-43BD-8198-93FC2CD76D9E@mcs.anl.gov> Message-ID: <3FAF7BD3-634B-4769-AE18-F663F00B3995@anl.gov> Hi Kent, I just tried with PGI 20.1 on mpich v3.3.2. I think I hit a hang once when I was checking it manually, but then I couldn't reproduce it even after repeating the run 1000 times. Anyway, we have made some changes to the ch3 header structures that potentially make the code more standards-compliant. Could you try the latest development version on GitHub and see if the issue still occurs on your end? -- Hui Zhou From: Kent Cheung via discuss Reply-To: "discuss at mpich.org" Date: Monday, September 21, 2020 at 4:52 AM To: "discuss at mpich.org" Cc: Kent Cheung Subject: Re: [mpich-discuss] Intermittent hang in MPI_Finalize with PGI 20.1 Are there any updates on this issue? Thanks. Kent ________________________________ From: Raffenetti, Kenneth J. Sent: 24 June 2020 17:04 To: discuss at mpich.org Cc: Kent Cheung Subject: Re: [mpich-discuss] Intermittent hang in MPI_Finalize with PGI 20.1 Hi Kent, Thanks for your report.
We have not seen this issue with any compiler/OS combination in our nightly tests. We are using PGI 19.4 at this time. I will request 20.1 be installed so we can investigate further. Ken On 6/23/20, 8:26 AM, "Kent Cheung via discuss" wrote: I'm running into an issue where processes sometimes hang when calling MPI_Finalize. This happens with both versions 3.3.2 and 3.4a2 on a single-node RedHat 7.5 x86-64 machine, when MPICH is compiled with PGI 20.1 with these configuration flags: --enable-debug --enable-shared --enable-debuginfo --enable-sharedlib=gcc If I change the default optimization level (-O2) by configuring with --enable-fast=O1 as well, the hang doesn't occur. Another data point is that the hang does not occur with PGI 19.5 at either optimization level. I have been testing with the cpi.c code in the examples folder built with just mpicc cpi.c mpiexec -n 3 ./a.out Here is the backtrace from one of the processes that is hanging: (gdb) bt #0 MPID_nem_mpich_blocking_recv () at /tmp/mpich-3.3.2/build/../src/mpid/ch3/channels/nemesis/include/mpid_nem_inline.h:1038 #1 MPIDI_CH3I_Progress () at ../src/mpid/ch3/channels/nemesis/src/ch3_progress.c:506 #2 0x00000000004fc88d in MPIDI_CH3U_VC_WaitForClose () at ../src/mpid/ch3/src/ch3u_handle_connection.c:383 #3 0x0000000000442364 in MPID_Finalize () at ../src/mpid/ch3/src/mpid_finalize.c:110 #4 0x0000000000408621 in PMPI_Finalize () at ../src/mpi/init/finalize.c:260 #5 0x00000000004023e5 in main () at cpi.c:59 Is there a potential fix to be made to MPICH to prevent processes hanging when MPICH is compiled with the default optimization level? Thanks, Kent IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Martin.Audet at cnrc-nrc.gc.ca Tue Sep 22 21:40:56 2020 From: Martin.Audet at cnrc-nrc.gc.ca (Audet, Martin) Date: Wed, 23 Sep 2020 02:40:56 +0000 Subject: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx Message-ID: Hello MPICH_Developers, I am having trouble running the configure script on the two latest mpich releases (3.4a2 and 3.4a3) when I use the flag --with-device=ch4:ucx (no such problems with ch3:sock or ch3:nemesis). It seems that the generated Makefile often contains two strange compilation and link flags: -Iyes/include and -Lyes/lib. To be able to compile the library I had to use the following command to remove those strange flags: find . -name Makefile -exec sed -i -e 's/-Iyes\/include\>//g' -e 's/-Lyes\/lib\>//g' {} \; After this step I am able to compile and the resulting library seems ok and appears to perform well (not extensive testing).
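(To double-check that no generated Makefile still carries the stray flags after that sed pass, a quick search is enough; just a sketch, assuming GNU grep:)

$ grep -rl --include=Makefile -e '-Iyes/include' .   # lists any Makefile still containing the flag
$ grep -rl --include=Makefile -e '-Lyes/lib' .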
For your information I call the configure script like this: ./configure --with-device=ch4:ucx --with-hcoll=/opt/mellanox/hcoll --with-pmix --prefix=/home/publique/depot/mpi/mpich-ch4_ucx-3.4a3 --enable-fast=all --enable-romio --with-file-system=ufs+nfs --enable-shared --enable-sharedlibs=gcc I have experienced this problem on the following three configurations: mpich-3.4a2 CentOS 7.6 MOFED 4.7 mpich-3.4a3 CentOS 7.6 MOFED 4.7 mpich-3.4a3 CentOS 7.8 MOFED 4.9 The architecture is x86_64. Could someone take a look at this problem? Thanks, Martin Audet -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhouh at anl.gov Wed Sep 23 08:46:37 2020 From: zhouh at anl.gov (Zhou, Hui) Date: Wed, 23 Sep 2020 13:46:37 +0000 Subject: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx In-Reply-To: References: Message-ID: > ./configure --with-device=ch4:ucx --with-hcoll=/opt/mellanox/hcoll --with-pmix --prefix=/home/publique/depot/mpi/mpich-ch4_ucx-3.4a3 --enable-fast=all --enable-romio --with-file-system=ufs+nfs --enable-shared --enable-sharedlibs=gcc I believe the offending option is `--with-pmix`, since that defaults to `--with-pmix=yes`. You are supposed to pass in the path to the pmix installation, e.g. `--with-pmix=/usr/local`. I admit this is a bit non-obvious. -- Hui Zhou From: "Audet, Martin via discuss" Reply-To: "discuss at mpich.org" Date: Tuesday, September 22, 2020 at 9:41 PM To: "discuss at mpich.org" Cc: "Audet, Martin" Subject: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx Hello MPICH_Developers, I am having trouble running the configure script on the two latest mpich releases (3.4a2 and 3.4a3) when I use the flag --with-device=ch4:ucx (no such problems with ch3:sock or ch3:nemesis). It seems that the generated Makefile often contains two strange compilation and link flags: -Iyes/include and -Lyes/lib. To be able to compile the library I had to use the following command to remove those strange flags: find . -name Makefile -exec sed -i -e 's/-Iyes\/include\>//g' -e 's/-Lyes\/lib\>//g' {} \; After this step I am able to compile and the resulting library seems ok and appears to perform well (not extensive testing). For your information I call the configure script like this: ./configure --with-device=ch4:ucx --with-hcoll=/opt/mellanox/hcoll --with-pmix --prefix=/home/publique/depot/mpi/mpich-ch4_ucx-3.4a3 --enable-fast=all --enable-romio --with-file-system=ufs+nfs --enable-shared --enable-sharedlibs=gcc I have experienced this problem on the following three configurations: mpich-3.4a2 CentOS 7.6 MOFED 4.7 mpich-3.4a3 CentOS 7.6 MOFED 4.7 mpich-3.4a3 CentOS 7.8 MOFED 4.9 The architecture is x86_64. Could someone take a look at this problem? Thanks, Martin Audet -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Martin.Audet at cnrc-nrc.gc.ca Wed Sep 23 16:38:14 2020 From: Martin.Audet at cnrc-nrc.gc.ca (Audet, Martin) Date: Wed, 23 Sep 2020 21:38:14 +0000 Subject: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx In-Reply-To: References: Message-ID: <22483ce7d34a4197aec806f422b3cd0e@DC01ZWH0012.Corp.nrc.gc.ca> Thanks Zhou for your answer. Now when I add the following options: --with-pmix-include=/usr/include --with-pmix-lib=/usr/lib64 to my configuration options (just after --with-pmix) it works as expected (i.e. I am able to use the pmix mechanism in Slurm). However if I don?t provide --with-pmix, even if --with-pmix-include=/usr/include --with-pmix-lib=/usr/lib64 are provided, pmix is not used (and I have to use the pmi2 mechanism in Slurm). Regards, Martin Audet From: Zhou, Hui [mailto:zhouh at anl.gov] Sent: Wednesday, September 23, 2020 9:47 To: discuss at mpich.org Cc: Audet, Martin Subject: Re: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx > ./configure --with-device=ch4:ucx --with-hcoll=/opt/mellanox/hcoll --with-pmix --prefix=/home/publique/depot/mpi/mpich-ch4_ucx-3.4a3 --enable-fast=all --enable-romio --with-file-system=ufs+nfs --enable-shared --enable-sharedlibs=gcc I believe the offending option is `--with-pmix`, since that defaults to `--with-pmix=yes`. You are supposed to pass in the path to pmix installation, e.g. `--with-pmix=/usr/local`. I admit this is a bit not obvious. -- Hui Zhou From: "Audet, Martin via discuss" > Reply-To: "discuss at mpich.org" > Date: Tuesday, September 22, 2020 at 9:41 PM To: "discuss at mpich.org" > Cc: "Audet, Martin" > Subject: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx Hello MPICH_Developers, I am having trouble running the configure script on two latest mpich release (3.4a2 and 3.4a3) when I use the flag --with-device=ch4:ucx (no such problems with ch3:sock or ch3:nemesis). It seems that the generated Makefile often contains two strange compilation and link flags: -Iyes/include and ?Lyes/lib. To be able to compile the library I had to use the following command to remove those strange flags: find . -name Makefile -exec sed -i -e 's/-Iyes\/include\>//g' -e 's/-Lyes\/lib\>//g' {} \; After this step I am able to compile and the resulting library seems ok and appears to performs well (not extensive testing). For your information I call the configure script like this: ./configure --with-device=ch4:ucx --with-hcoll=/opt/mellanox/hcoll --with-pmix --prefix=/home/publique/depot/mpi/mpich-ch4_ucx-3.4a3 --enable-fast=all --enable-romio --with-file-system=ufs+nfs --enable-shared --enable-sharedlibs=gcc I had experienced this problem on the following three configurations: mpich-3.4a2 CentOS 7.6 MOFED 4.7 mpich-3.4a3 CentOS 7.6 MOFED 4.7 mpich-3.4a3 CentOS 7.8 MOFED 4.9 The architecture is x86_64. Could someone take a look at this problem ? Thanks, Martin Audet -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zhouh at anl.gov Wed Sep 23 18:24:34 2020 From: zhouh at anl.gov (Zhou, Hui) Date: Wed, 23 Sep 2020 23:24:34 +0000 Subject: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx In-Reply-To: <22483ce7d34a4197aec806f422b3cd0e@DC01ZWH0012.Corp.nrc.gc.ca> References: <22483ce7d34a4197aec806f422b3cd0e@DC01ZWH0012.Corp.nrc.gc.ca> Message-ID: Hi Martin, Try `--with-pmix=/usr` -- Hui Zhou From: "Audet, Martin" Date: Wednesday, September 23, 2020 at 4:38 PM To: "discuss at mpich.org" Cc: "Zhou, Hui" Subject: RE: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx Thanks Zhou for your answer. Now when I add the following options: --with-pmix-include=/usr/include --with-pmix-lib=/usr/lib64 to my configuration options (just after --with-pmix) it works as expected (i.e. I am able to use the pmix mechanism in Slurm). However if I don?t provide --with-pmix, even if --with-pmix-include=/usr/include --with-pmix-lib=/usr/lib64 are provided, pmix is not used (and I have to use the pmi2 mechanism in Slurm). Regards, Martin Audet From: Zhou, Hui [mailto:zhouh at anl.gov] Sent: Wednesday, September 23, 2020 9:47 To: discuss at mpich.org Cc: Audet, Martin Subject: Re: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx > ./configure --with-device=ch4:ucx --with-hcoll=/opt/mellanox/hcoll --with-pmix --prefix=/home/publique/depot/mpi/mpich-ch4_ucx-3.4a3 --enable-fast=all --enable-romio --with-file-system=ufs+nfs --enable-shared --enable-sharedlibs=gcc I believe the offending option is `--with-pmix`, since that defaults to `--with-pmix=yes`. You are supposed to pass in the path to pmix installation, e.g. `--with-pmix=/usr/local`. I admit this is a bit not obvious. -- Hui Zhou From: "Audet, Martin via discuss" > Reply-To: "discuss at mpich.org" > Date: Tuesday, September 22, 2020 at 9:41 PM To: "discuss at mpich.org" > Cc: "Audet, Martin" > Subject: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx Hello MPICH_Developers, I am having trouble running the configure script on two latest mpich release (3.4a2 and 3.4a3) when I use the flag --with-device=ch4:ucx (no such problems with ch3:sock or ch3:nemesis). It seems that the generated Makefile often contains two strange compilation and link flags: -Iyes/include and ?Lyes/lib. To be able to compile the library I had to use the following command to remove those strange flags: find . -name Makefile -exec sed -i -e 's/-Iyes\/include\>//g' -e 's/-Lyes\/lib\>//g' {} \; After this step I am able to compile and the resulting library seems ok and appears to performs well (not extensive testing). For your information I call the configure script like this: ./configure --with-device=ch4:ucx --with-hcoll=/opt/mellanox/hcoll --with-pmix --prefix=/home/publique/depot/mpi/mpich-ch4_ucx-3.4a3 --enable-fast=all --enable-romio --with-file-system=ufs+nfs --enable-shared --enable-sharedlibs=gcc I had experienced this problem on the following three configurations: mpich-3.4a2 CentOS 7.6 MOFED 4.7 mpich-3.4a3 CentOS 7.6 MOFED 4.7 mpich-3.4a3 CentOS 7.8 MOFED 4.9 The architecture is x86_64. Could someone take a look at this problem ? Thanks, Martin Audet -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhouh at anl.gov Wed Sep 23 18:31:55 2020 From: zhouh at anl.gov (Zhou, Hui) Date: Wed, 23 Sep 2020 23:31:55 +0000 Subject: [mpich-discuss] Spawns without mpirun In-Reply-To: References: Message-ID: It?s possible but not prioritized. 
Could you open an issue on GitHub? When we have enough users requesting the feature (or when we have a persistent user requesting it), the priority may be escalated. :) Meanwhile, could you describe the scenario in which you have to use singleton init (vs. `mpirun -n 1 prog ...`)? -- Hui Zhou From: Martín Morales via discuss Reply-To: "discuss at mpich.org" Date: Wednesday, September 23, 2020 at 3:49 PM To: "discuss at mpich.org" Cc: Martín Morales Subject: [mpich-discuss] Spawns without mpirun Hi all! I asked here some time ago about dynamically spawning processes from a singleton, but the answer was that unfortunately there was a problem with that and it was just not possible. I wonder now if this functionality is available. Best regards Martín [https://ipmcdn.avast.com/images/icons/icon-envelope-tick-round-orange-animated-no-repeat-v1.gif] Virus-free. www.avast.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From martineduardomorales at gmail.com Wed Sep 23 19:28:12 2020 From: martineduardomorales at gmail.com (=?UTF-8?Q?Mart=C3=ADn_Morales?=) Date: Wed, 23 Sep 2020 21:28:12 -0300 Subject: [mpich-discuss] Spawns without mpirun In-Reply-To: References: Message-ID: Hi Hui, thank you for your reply. Ok I'll create that issue then. We've a quite large PVM application. It has processes handling in a singleton fashion that allows powerful functionality. We needed to port the PVM code to MPI in this exact way to preserve that functionality. We've done this already but with the Open MPI implementation. However we've found some inconsistencies in it and that's why I'm querying about this feature in MPICH. Best regards Martín On Wed, Sep 23, 2020 at 8:31 PM Zhou, Hui wrote: > It's possible but not prioritized. Could you open an issue on GitHub? When > we have enough users requesting the feature (or when we have a persistent > user requesting it), the priority may be escalated. :) Meanwhile, could > you describe the scenario in which you have to use singleton init (vs. `mpirun > -n 1 prog ...`)? > > > > -- > Hui Zhou > > > > > > *From: *Martín Morales via discuss > *Reply-To: *"discuss at mpich.org" > *Date: *Wednesday, September 23, 2020 at 3:49 PM > *To: *"discuss at mpich.org" > *Cc: *Martín Morales > *Subject: *[mpich-discuss] Spawns without mpirun > > > > Hi all! > > I asked here some time ago about dynamically spawning processes from a > singleton, but the answer was that unfortunately there was a problem with > that and it was just not possible. I wonder now if this functionality is > available. > > Best regards > > > > Martín > > > > > > > > > > > > > Virus-free. www.avast.com > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Martin.Audet at cnrc-nrc.gc.ca Wed Sep 23 20:33:28 2020 From: Martin.Audet at cnrc-nrc.gc.ca (Audet, Martin) Date: Thu, 24 Sep 2020 01:33:28 +0000 Subject: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx In-Reply-To: References: <22483ce7d34a4197aec806f422b3cd0e@DC01ZWH0012.Corp.nrc.gc.ca> Message-ID: Hello Zhou, Using simply --with-pmix=/usr as you suggested seems to work the same but is shorter and more elegant. I will use this formulation.
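(For completeness, on the launch side the pmix mechanism in Slurm corresponds to something like the following; a sketch only, since `srun --mpi=list` shows which PMI plugins the local Slurm actually provides, and the program name is just a placeholder:)

$ srun --mpi=list                       # lists the supported PMI plugins, e.g. pmi2, pmix
$ srun --mpi=pmix -n 4 ./my_mpi_program # my_mpi_program is a placeholder name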
Thanks for the suggestion, Martin Audet From: Zhou, Hui [mailto:zhouh at anl.gov] Sent: Wednesday, September 23, 2020 19:25 To: Audet, Martin ; discuss at mpich.org Subject: Re: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx Hi Martin, Try `--with-pmix=/usr` -- Hui Zhou From: "Audet, Martin" > Date: Wednesday, September 23, 2020 at 4:38 PM To: "discuss at mpich.org" > Cc: "Zhou, Hui" > Subject: RE: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx Thanks Zhou for your answer. Now when I add the following options: --with-pmix-include=/usr/include --with-pmix-lib=/usr/lib64 to my configuration options (just after --with-pmix) it works as expected (i.e. I am able to use the pmix mechanism in Slurm). However if I don?t provide --with-pmix, even if --with-pmix-include=/usr/include --with-pmix-lib=/usr/lib64 are provided, pmix is not used (and I have to use the pmi2 mechanism in Slurm). Regards, Martin Audet From: Zhou, Hui [mailto:zhouh at anl.gov] Sent: Wednesday, September 23, 2020 9:47 To: discuss at mpich.org Cc: Audet, Martin > Subject: Re: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx > ./configure --with-device=ch4:ucx --with-hcoll=/opt/mellanox/hcoll --with-pmix --prefix=/home/publique/depot/mpi/mpich-ch4_ucx-3.4a3 --enable-fast=all --enable-romio --with-file-system=ufs+nfs --enable-shared --enable-sharedlibs=gcc I believe the offending option is `--with-pmix`, since that defaults to `--with-pmix=yes`. You are supposed to pass in the path to pmix installation, e.g. `--with-pmix=/usr/local`. I admit this is a bit not obvious. -- Hui Zhou From: "Audet, Martin via discuss" > Reply-To: "discuss at mpich.org" > Date: Tuesday, September 22, 2020 at 9:41 PM To: "discuss at mpich.org" > Cc: "Audet, Martin" > Subject: [mpich-discuss] Problem configuring mpich-3.4a3 and 3.4a2 --with-device=ch4:ucx Hello MPICH_Developers, I am having trouble running the configure script on two latest mpich release (3.4a2 and 3.4a3) when I use the flag --with-device=ch4:ucx (no such problems with ch3:sock or ch3:nemesis). It seems that the generated Makefile often contains two strange compilation and link flags: -Iyes/include and ?Lyes/lib. To be able to compile the library I had to use the following command to remove those strange flags: find . -name Makefile -exec sed -i -e 's/-Iyes\/include\>//g' -e 's/-Lyes\/lib\>//g' {} \; After this step I am able to compile and the resulting library seems ok and appears to performs well (not extensive testing). For your information I call the configure script like this: ./configure --with-device=ch4:ucx --with-hcoll=/opt/mellanox/hcoll --with-pmix --prefix=/home/publique/depot/mpi/mpich-ch4_ucx-3.4a3 --enable-fast=all --enable-romio --with-file-system=ufs+nfs --enable-shared --enable-sharedlibs=gcc I had experienced this problem on the following three configurations: mpich-3.4a2 CentOS 7.6 MOFED 4.7 mpich-3.4a3 CentOS 7.6 MOFED 4.7 mpich-3.4a3 CentOS 7.8 MOFED 4.9 The architecture is x86_64. Could someone take a look at this problem ? Thanks, Martin Audet -------------- next part -------------- An HTML attachment was scrubbed... URL: From martineduardomorales at gmail.com Fri Sep 25 09:09:13 2020 From: martineduardomorales at gmail.com (=?UTF-8?Q?Mart=C3=ADn_Morales?=) Date: Fri, 25 Sep 2020 11:09:13 -0300 Subject: [mpich-discuss] Spawns without mpirun In-Reply-To: References: Message-ID: Hi again Hui. 
My apologies, I overlooked your singleton analogy with mpirun (*mpirun -n 1 ./prog...*). Yes, in fact, it was one of our first tries. The program we use is interactive and requires the *ncurses* library for that. I read some time ago about a known problem with the Open MPI implementation and *ncurses*. Unfortunately with MPICH we've experienced the same issue. Best regards, Martín On Wed, Sep 23, 2020 at 9:28 PM Martín Morales < martineduardomorales at gmail.com> wrote: > Hi Hui, thank you for your reply. Ok I'll create that issue then. We've a > quite large PVM application. It has processes handling in a singleton > fashion that allows powerful functionality. We needed to port the PVM > code to MPI in this exact way to preserve that functionality. We've done > this already but with the Open MPI implementation. However we've found some > inconsistencies in it and that's why I'm querying about this feature in > MPICH. > Best regards > > Martín > > > On Wed, Sep 23, 2020 at 8:31 PM Zhou, Hui wrote: > >> It's possible but not prioritized. Could you open an issue on GitHub? >> When we have enough users requesting the feature (or when we have >> a persistent user requesting it), the priority may be escalated. :) >> Meanwhile, could you describe the scenario in which you have to use singleton >> init (vs. `mpirun -n 1 prog ...`)? >> >> >> >> -- >> Hui Zhou >> >> >> >> >> >> *From: *Martín Morales via discuss >> *Reply-To: *"discuss at mpich.org" >> *Date: *Wednesday, September 23, 2020 at 3:49 PM >> *To: *"discuss at mpich.org" >> *Cc: *Martín Morales >> *Subject: *[mpich-discuss] Spawns without mpirun >> >> >> >> Hi all! >> >> I asked here some time ago about dynamically spawning processes from a >> singleton, but the answer was that unfortunately there was a problem with >> that and it was just not possible. I wonder now if this functionality is >> available. >> >> Best regards >> >> >> >> Martín >> >> >> >> >> >> >> >> >> >> >> >> >> Virus-free. www.avast.com >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhouh at anl.gov Fri Sep 25 10:58:58 2020 From: zhouh at anl.gov (Zhou, Hui) Date: Fri, 25 Sep 2020 15:58:58 +0000 Subject: [mpich-discuss] Spawns without mpirun In-Reply-To: References: Message-ID: <126E5F0F-E3C8-496E-B523-09B4657D8CFB@anl.gov> Hi Martin, I see. Yeah, MPICH (or rather the process manager, hydra) will not work nicely with interactive programs. It might be of interest to add a special mode in which the process manager does not try to take over the input/output. For now, is it possible for your ncurses application to spawn a separate process (via `mpirun -n 1 ...`) that can be used to spawn additional processes? I am thinking that your interactive part can be programmed as purely a frontend to your actual MPI processes. -- Hui Zhou From: Martín Morales Date: Friday, September 25, 2020 at 9:09 AM To: "discuss at mpich.org" , "Zhou, Hui" Subject: Re: [mpich-discuss] Spawns without mpirun Hi again Hui. My apologies, I overlooked your singleton analogy with mpirun (mpirun -n 1 ./prog...). Yes, in fact, it was one of our first tries. The program we use is interactive and requires the ncurses library for that. I read some time ago about a known problem with the Open MPI implementation and ncurses. Unfortunately with MPICH we've experienced the same issue. Best regards, Martín On Wed, Sep 23, 2020 at 9:28 PM Martín Morales > wrote: Hi Hui, thank you for your reply. Ok I'll create that issue then. We've a quite large PVM application.
It has processes handling in a singleton fashion that allows powerful functionality. We needed to port the PVM code to MPI in this exact way to preserve that functionality. We've done this already but with the Open MPI implementation. However we've found some inconsistencies in it and that's why I'm querying about this feature in MPICH. Best regards Martín On Wed, Sep 23, 2020 at 8:31 PM Zhou, Hui > wrote: It's possible but not prioritized. Could you open an issue on GitHub? When we have enough users requesting the feature (or when we have a persistent user requesting it), the priority may be escalated. :) Meanwhile, could you describe the scenario in which you have to use singleton init (vs. `mpirun -n 1 prog ...`)? -- Hui Zhou From: Martín Morales via discuss > Reply-To: "discuss at mpich.org" > Date: Wednesday, September 23, 2020 at 3:49 PM To: "discuss at mpich.org" > Cc: Martín Morales > Subject: [mpich-discuss] Spawns without mpirun Hi all! I asked here some time ago about dynamically spawning processes from a singleton, but the answer was that unfortunately there was a problem with that and it was just not possible. I wonder now if this functionality is available. Best regards Martín [https://ipmcdn.avast.com/images/icons/icon-envelope-tick-round-orange-animated-no-repeat-v1.gif] Virus-free. www.avast.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From martineduardomorales at gmail.com Fri Sep 25 15:10:45 2020 From: martineduardomorales at gmail.com (=?UTF-8?Q?Mart=C3=ADn_Morales?=) Date: Fri, 25 Sep 2020 17:10:45 -0300 Subject: [mpich-discuss] Spawns without mpirun In-Reply-To: <126E5F0F-E3C8-496E-B523-09B4657D8CFB@anl.gov> References: <126E5F0F-E3C8-496E-B523-09B4657D8CFB@anl.gov> Message-ID: Hi Hui. The "special (or interactive) mode" you mention would be great! As for the frontend-backend suggestion you propose, having a "proxy" process in that way would mean a significant code restructuring which we just can't do at the moment. Thank you very much anyway. Best regards, Martín Virus-free. www.avast.com <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> On Fri, Sep 25, 2020 at 12:58 PM Zhou, Hui wrote: > Hi Martin, > > > > I see. Yeah, MPICH (or rather the process manager, hydra) will not work nicely with > interactive programs. It might be of interest to add a > special mode in which the process manager does not try to take over the input/output. > For now, is it possible for your ncurses application to spawn a separate > process (via `mpirun -n 1 ...`) that can be used to spawn additional > processes? I am thinking that your interactive part can be programmed as > purely a frontend to your actual MPI processes. > > > > -- > Hui Zhou > > > > > > *From: *Martín Morales > *Date: *Friday, September 25, 2020 at 9:09 AM > *To: *"discuss at mpich.org" , "Zhou, Hui" > *Subject: *Re: [mpich-discuss] Spawns without mpirun > > > > Hi again Hui. My apologies, I overlooked your singleton analogy with > mpirun (*mpirun -n 1 ./prog...*). Yes, in fact, it was one of our first > tries. The program we use is interactive and requires the *ncurses* library > for that. I read some time ago about a known problem with the Open MPI implementation > and *ncurses*. Unfortunately with MPICH we've experienced the same issue. > > Best regards, > > > > Martín > > > > On Wed, Sep 23, 2020 at 9:28 PM Martín Morales < > martineduardomorales at gmail.com> wrote: > > Hi Hui, thank you for your reply. Ok I'll create that issue then. We've a > quite large PVM application.
It has processes handling in a singleton > fashion that it allows powerful functionality. We needed to port the PVM > code to MPI in this exact way to preserve that functionality. We've done > this already but with the Open MPI implementation. However we've found some > inconsistencies in it and that's why I'm querying about this feature in > MPICH. > > Best regards > > > > Mart?n > > > > > > On Wed, Sep 23, 2020 at 8:31 PM Zhou, Hui wrote: > > It?s possible but not prioritized. Could you open an issue on github? When > we have enough users requesting the feature (or when we have persistent > user requesting it), the priority may be escalated. ? Meanwhile, could > you describe the scenario that you have to use singleton init (vs. `mpirun > -n 1 prog ?`)? > > > > -- > Hui Zhou > > > > > > *From: *Mart?n Morales via discuss > *Reply-To: *"discuss at mpich.org" > *Date: *Wednesday, September 23, 2020 at 3:49 PM > *To: *"discuss at mpich.org" > *Cc: *Mart?n Morales > *Subject: *[mpich-discuss] Spawns without mpirun > > > > Hi all! > > I asked here some time ago about dynamically spawn processes from a > singleton but the answer was that unfortunately, there was a problem with > that and just It was not possible. I wonder now if this functionality is > available. > > Best regards > > > > Mart?n > > > > > > > > > > > > > Virus-free. www.avast.com > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: