[mpich-discuss] error spawning processes in mpich-3.2rc1

Siegmar Gross Siegmar.Gross at informatik.hs-fulda.de
Thu Oct 8 00:02:24 CDT 2015


Hi Min,

thank you very much for your answer.

> We cannot reproduce this error on our test machines (Solaris i386,
> Ubuntu x86_64) using your programs. Unfortunately, we do not have a
> Solaris Sparc machine, so we could not verify it.

The programs work fine on my Solaris x86_64 and Linux machines
as well. I only have a problem on Solaris Sparc.


> Sometimes you need to add "./" in front of the program path; could
> you try that?
>
> For example, in spawn_master.c:
>> #define SLAVE_PROG      "./spawn_slave"

No, that will not work, because the programs are stored in a
different directory ($HOME/{SunOS, Linux}/{sparc, x86_64}/bin),
which is part of PATH (as is ".").
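
For illustration, a spawn call of this kind looks roughly as follows
(only a sketch, not a copy of the attached spawn_master.c; the bare
program name is resolved via PATH, which a "./" prefix would prevent):

    /* Sketch of a spawn that relies on PATH resolution; SLAVE_PROG
     * and NUM_SLAVES are illustrative, not copied from the attached
     * spawn_master.c. */
    #include "mpi.h"

    #define SLAVE_PROG  "spawn_slave"   /* bare name, found via PATH */
    #define NUM_SLAVES  4

    int main (int argc, char *argv[])
    {
      MPI_Comm intercomm;

      MPI_Init (&argc, &argv);
      /* No "./" prefix: the runtime searches PATH, which already
       * contains $HOME/{SunOS, Linux}/{sparc, x86_64}/bin.        */
      MPI_Comm_spawn (SLAVE_PROG, MPI_ARGV_NULL, NUM_SLAVES,
                      MPI_INFO_NULL, 0, MPI_COMM_WORLD,
                      &intercomm, MPI_ERRCODES_IGNORE);
      MPI_Comm_free (&intercomm);
      MPI_Finalize ();
      return 0;
    }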

Can I do anything to track the source of the error?
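For example, would replacing the default fatal error handler give
more detail? Something along these lines (again only a sketch, not
taken from the attached programs; the fragment assumes the usual
MPI_Init context and <stdio.h>):

    /* Sketch: switch to MPI_ERRORS_RETURN so the complete error
     * string can be printed instead of aborting immediately. */
    char errstring[MPI_MAX_ERROR_STRING];
    int  errcode, resultlen;
    MPI_Comm intercomm;

    MPI_Comm_set_errhandler (MPI_COMM_WORLD, MPI_ERRORS_RETURN);
    errcode = MPI_Comm_spawn ("spawn_slave", MPI_ARGV_NULL, 4,
                              MPI_INFO_NULL, 0, MPI_COMM_WORLD,
                              &intercomm, MPI_ERRCODES_IGNORE);
    if (errcode != MPI_SUCCESS) {
      MPI_Error_string (errcode, errstring, &resultlen);
      fprintf (stderr, "MPI_Comm_spawn failed: %s\n", errstring);
    }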


Kind regards

Siegmar

>
> Min
>
> On 10/7/15 5:03 AM, Siegmar Gross wrote:
>> Hi,
>>
>> today I built mpich-3.2rc1 on my machines (Solaris 10 Sparc,
>> Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with gcc-5.1.0
>> and Sun C 5.13. I still get the following errors on my Sparc
>> machine, which I had already reported on September 8th. "mpiexec"
>> is aliased to 'mpiexec -genvnone'. It still doesn't matter whether
>> I use the cc or gcc build of MPICH.
>>
>>
>> tyr spawn 119 mpichversion
>> MPICH Version:          3.2rc1
>> MPICH Release date:     Wed Oct  7 00:00:33 CDT 2015
>> MPICH Device:           ch3:nemesis
>> MPICH configure:        --prefix=/usr/local/mpich-3.2_64_cc
>> --libdir=/usr/local/mpich-3.2_64_cc/lib64
>> --includedir=/usr/local/mpich-3.2_64_cc/include64 CC=cc CXX=CC F77=f77
>> FC=f95 CFLAGS=-m64 CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64 LDFLAGS=-m64
>> -L/usr/lib/sparcv9 -R/usr/lib/sparcv9 --enable-fortran=yes
>> --enable-cxx --enable-romio --enable-debuginfo --enable-smpcoll
>> --enable-threads=multiple --with-thread-package=posix --enable-shared
>> MPICH CC:       cc -m64   -O2
>> MPICH CXX:      CC -m64  -O2
>> MPICH F77:      f77 -m64
>> MPICH FC:       f95 -m64  -O2
>> tyr spawn 120
>>
>>
>>
>> tyr spawn 111 mpiexec -np 1 spawn_master
>>
>> Parent process 0 running on tyr.informatik.hs-fulda.de
>>   I create 4 slave processes
>>
>> Fatal error in MPI_Comm_spawn: Unknown error class, error stack:
>> MPI_Comm_spawn(144)...........: MPI_Comm_spawn(cmd="spawn_slave",
>> argv=0, maxprocs=4, MPI_INFO_NULL, root=0, MPI_COMM_WORLD,
>> intercomm=ffffffff7fffde50, errors=0) failed
>> MPIDI_Comm_spawn_multiple(274):
>> MPID_Comm_accept(153).........:
>> MPIDI_Comm_accept(1057).......:
>> MPIR_Bcast_intra(1287)........:
>> MPIR_Bcast_binomial(310)......: Failure during collective
>>
>>
>>
>>
>> tyr spawn 112 mpiexec -np 1 spawn_multiple_master
>>
>> Parent process 0 running on tyr.informatik.hs-fulda.de
>>   I create 3 slave processes.
>>
>> Fatal error in MPI_Comm_spawn_multiple: Unknown error class, error stack:
>> MPI_Comm_spawn_multiple(162)..: MPI_Comm_spawn_multiple(count=2,
>> cmds=ffffffff7fffde08, argvs=ffffffff7fffddf8,
>> maxprocs=ffffffff7fffddf0, infos=ffffffff7fffdde8, root=0,
>> MPI_COMM_WORLD, intercomm=ffffffff7fffdde4, errors=0) failed
>> MPIDI_Comm_spawn_multiple(274):
>> MPID_Comm_accept(153).........:
>> MPIDI_Comm_accept(1057).......:
>> MPIR_Bcast_intra(1287)........:
>> MPIR_Bcast_binomial(310)......: Failure during collective
>>
>>
>>
>>
>> tyr spawn 113 mpiexec -np 1 spawn_intra_comm
>> Parent process 0: I create 2 slave processes
>> Fatal error in MPI_Comm_spawn: Unknown error class, error stack:
>> MPI_Comm_spawn(144)...........: MPI_Comm_spawn(cmd="spawn_intra_comm",
>> argv=0, maxprocs=2, MPI_INFO_NULL, root=0, MPI_COMM_WORLD,
>> intercomm=ffffffff7fffded4, errors=0) failed
>> MPIDI_Comm_spawn_multiple(274):
>> MPID_Comm_accept(153).........:
>> MPIDI_Comm_accept(1057).......:
>> MPIR_Bcast_intra(1287)........:
>> MPIR_Bcast_binomial(310)......: Failure during collective
>> tyr spawn 114
>>
>>
>> I would be grateful if somebody could fix the problem. Thank you
>> very much in advance for any help. I have attached my programs.
>> Please let me know if you need anything else.
>>
>>
>> Kind regards
>>
>> Siegmar