[mpich-discuss] error spawning processes in mpich-3.2rc1

Siegmar Gross Siegmar.Gross at informatik.hs-fulda.de
Mon Oct 12 09:24:48 CDT 2015


Hi Min,

> It seems you have already enabled the most detailed error output. We
> cannot think of any further clues at the moment. If you can give us
> access to your machine, we would be glad to help you debug on it.

Could you send me your email address? I don't want to send
login data to this list.


Kind regards

Siegmar


>
> Min
>
> On 10/8/15 12:02 AM, Siegmar Gross wrote:
>> Hi Min,
>>
>> thank you very much for your answer.
>>
>>> We cannot reproduce this error with your programs on our test
>>> machines (Solaris i386, Ubuntu x86_64). Unfortunately we do not
>>> have a Solaris Sparc machine, so we cannot verify it there.
>>
>> The programs work fine on my Solaris x86_64 and Linux machines
>> as well. I only have a problem on Solaris Sparc.
>>
>>
>>> Sometimes you need to add "./" in front of the program path;
>>> could you try that?
>>>
>>> For example, in spawn_master.c:
>>>> #define SLAVE_PROG      "./spawn_slave"
>>
>> No, it will not work, because the programs are stored in a
>> different directory ($HOME/{SunOS, Linux}/{sparc, x86_64}/bin),
>> which is part of PATH (as is ".").
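
For what it's worth, the MPI standard also reserves a "path" info key
for MPI_Comm_spawn that names the directory in which the implementation
should look for the executable, which would sidestep the PATH question
altogether. Whether Hydra honours it in this situation is for the MPICH
developers to confirm; the sketch below, with a placeholder directory,
only illustrates the idea.

/* Sketch: hand the directory that holds the slave binary to
 * MPI_Comm_spawn via the reserved "path" info key instead of relying
 * on the shell's PATH. The directory below is only a placeholder. */
#include "mpi.h"

int main (int argc, char *argv[])
{
  MPI_Comm intercomm;
  MPI_Info info;

  MPI_Init (&argc, &argv);
  MPI_Info_create (&info);
  /* placeholder: the per-platform bin directory */
  MPI_Info_set (info, "path", "/home/user/SunOS/sparc/bin");
  MPI_Comm_spawn ("spawn_slave", MPI_ARGV_NULL, 4, info, 0,
                  MPI_COMM_WORLD, &intercomm, MPI_ERRCODES_IGNORE);
  MPI_Info_free (&info);
  MPI_Comm_free (&intercomm);
  MPI_Finalize ();
  return 0;
}
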
>>
>> Can I do anything to track the source of the error?
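
One generic way to dig a little deeper, sketched below on the assumption
that the failing call matches the one echoed in the error stacks further
down (cmd="spawn_slave", maxprocs=4, MPI_INFO_NULL, root 0): switch
MPI_COMM_WORLD to MPI_ERRORS_RETURN and decode the returned code, so the
program survives the failure and reports the class behind the "Unknown
error class" message.

/* Sketch: let the spawn failure return instead of aborting, then
 * print the error class and the full error string. */
#include <stdio.h>
#include "mpi.h"

int main (int argc, char *argv[])
{
  MPI_Comm intercomm;
  char     msg[MPI_MAX_ERROR_STRING];
  int      rc, cls, len;

  MPI_Init (&argc, &argv);
  MPI_Comm_set_errhandler (MPI_COMM_WORLD, MPI_ERRORS_RETURN);
  rc = MPI_Comm_spawn ("spawn_slave", MPI_ARGV_NULL, 4, MPI_INFO_NULL, 0,
                       MPI_COMM_WORLD, &intercomm, MPI_ERRCODES_IGNORE);
  if (rc != MPI_SUCCESS) {
    MPI_Error_class (rc, &cls);
    MPI_Error_string (rc, msg, &len);
    printf ("MPI_Comm_spawn failed: class %d, %s\n", cls, msg);
  } else {
    MPI_Comm_free (&intercomm);
  }
  MPI_Finalize ();
  return 0;
}
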
>>
>>
>> Kind regards
>>
>> Siegmar
>>
>>>
>>> Min
>>>
>>> On 10/7/15 5:03 AM, Siegmar Gross wrote:
>>>> Hi,
>>>>
>>>> today I built mpich-3.2rc1 on my machines (Solaris 10 Sparc,
>>>> Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with gcc-5.1.0
>>>> and Sun C 5.13. I still get the following errors on my Sparc
>>>> machine, which I had already reported on September 8th. "mpiexec"
>>>> is aliased to 'mpiexec -genvnone'. It still does not matter
>>>> whether I use the cc- or the gcc-built version of MPICH.
>>>>
>>>>
>>>> tyr spawn 119 mpichversion
>>>> MPICH Version:          3.2rc1
>>>> MPICH Release date:     Wed Oct  7 00:00:33 CDT 2015
>>>> MPICH Device:           ch3:nemesis
>>>> MPICH configure:        --prefix=/usr/local/mpich-3.2_64_cc
>>>> --libdir=/usr/local/mpich-3.2_64_cc/lib64
>>>> --includedir=/usr/local/mpich-3.2_64_cc/include64 CC=cc CXX=CC F77=f77
>>>> FC=f95 CFLAGS=-m64 CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64 LDFLAGS=-m64
>>>> -L/usr/lib/sparcv9 -R/usr/lib/sparcv9 --enable-fortran=yes
>>>> --enable-cxx --enable-romio --enable-debuginfo --enable-smpcoll
>>>> --enable-threads=multiple --with-thread-package=posix --enable-shared
>>>> MPICH CC:       cc -m64   -O2
>>>> MPICH CXX:      CC -m64  -O2
>>>> MPICH F77:      f77 -m64
>>>> MPICH FC:       f95 -m64  -O2
>>>> tyr spawn 120
>>>>
>>>>
>>>>
>>>> tyr spawn 111 mpiexec -np 1 spawn_master
>>>>
>>>> Parent process 0 running on tyr.informatik.hs-fulda.de
>>>>   I create 4 slave processes
>>>>
>>>> Fatal error in MPI_Comm_spawn: Unknown error class, error stack:
>>>> MPI_Comm_spawn(144)...........: MPI_Comm_spawn(cmd="spawn_slave",
>>>> argv=0, maxprocs=4, MPI_INFO_NULL, root=0, MPI_COMM_WORLD,
>>>> intercomm=ffffffff7fffde50, errors=0) failed
>>>> MPIDI_Comm_spawn_multiple(274):
>>>> MPID_Comm_accept(153).........:
>>>> MPIDI_Comm_accept(1057).......:
>>>> MPIR_Bcast_intra(1287)........:
>>>> MPIR_Bcast_binomial(310)......: Failure during collective
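
The slave side is not shown in this thread. For readers without the
attachment, a minimal slave matching such a master might do no more
than the following; the real spawn_slave is attached to the original
mail.

/* Hypothetical minimal slave: it only looks up the parent
 * inter-communicator and reports its rank. */
#include <stdio.h>
#include "mpi.h"

int main (int argc, char *argv[])
{
  MPI_Comm parent;
  int      rank;

  MPI_Init (&argc, &argv);
  MPI_Comm_get_parent (&parent);
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);
  if (parent != MPI_COMM_NULL)
    printf ("Slave process %d started by a parent.\n", rank);
  MPI_Finalize ();
  return 0;
}
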
>>>>
>>>>
>>>>
>>>>
>>>> tyr spawn 112 mpiexec -np 1 spawn_multiple_master
>>>>
>>>> Parent process 0 running on tyr.informatik.hs-fulda.de
>>>>   I create 3 slave processes.
>>>>
>>>> Fatal error in MPI_Comm_spawn_multiple: Unknown error class, error
>>>> stack:
>>>> MPI_Comm_spawn_multiple(162)..: MPI_Comm_spawn_multiple(count=2,
>>>> cmds=ffffffff7fffde08, argvs=ffffffff7fffddf8,
>>>> maxprocs=ffffffff7fffddf0, infos=ffffffff7fffdde8, root=0,
>>>> MPI_COMM_WORLD, intercomm=ffffffff7fffdde4, errors=0) failed
>>>> MPIDI_Comm_spawn_multiple(274):
>>>> MPID_Comm_accept(153).........:
>>>> MPIDI_Comm_accept(1057).......:
>>>> MPIR_Bcast_intra(1287)........:
>>>> MPIR_Bcast_binomial(310)......: Failure during collective
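
Judging only from the arguments echoed in this stack (count=2, three
slaves in total, error codes ignored), the failing call presumably
looks roughly like the sketch below; the split into maxprocs = {1, 2}
and the command names are assumptions, the real spawn_multiple_master.c
is in the attachment.

/* Reconstruction of an MPI_Comm_spawn_multiple call with count=2 and
 * three slave processes in total; details are assumptions. */
#include "mpi.h"

int main (int argc, char *argv[])
{
  char     *cmds[2]     = { "spawn_slave", "spawn_slave" };
  int       maxprocs[2] = { 1, 2 };
  MPI_Info  infos[2]    = { MPI_INFO_NULL, MPI_INFO_NULL };
  MPI_Comm  intercomm;

  MPI_Init (&argc, &argv);
  MPI_Comm_spawn_multiple (2, cmds, MPI_ARGVS_NULL, maxprocs, infos, 0,
                           MPI_COMM_WORLD, &intercomm, MPI_ERRCODES_IGNORE);
  MPI_Comm_free (&intercomm);
  MPI_Finalize ();
  return 0;
}
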
>>>>
>>>>
>>>>
>>>>
>>>> tyr spawn 113 mpiexec -np 1 spawn_intra_comm
>>>> Parent process 0: I create 2 slave processes
>>>> Fatal error in MPI_Comm_spawn: Unknown error class, error stack:
>>>> MPI_Comm_spawn(144)...........: MPI_Comm_spawn(cmd="spawn_intra_comm",
>>>> argv=0, maxprocs=2, MPI_INFO_NULL, root=0, MPI_COMM_WORLD,
>>>> intercomm=ffffffff7fffded4, errors=0) failed
>>>> MPIDI_Comm_spawn_multiple(274):
>>>> MPID_Comm_accept(153).........:
>>>> MPIDI_Comm_accept(1057).......:
>>>> MPIR_Bcast_intra(1287)........:
>>>> MPIR_Bcast_binomial(310)......: Failure during collective
>>>> tyr spawn 114
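
The program name and the error stack (the program spawns two copies of
itself with maxprocs=2) suggest that spawn_intra_comm merges parent and
children into one intra-communicator. A sketch of that pattern follows;
the merge step is an assumption, the real code is in the attachment.

/* Sketch: self-spawning program that merges parent and children into
 * a single intra-communicator via MPI_Intercomm_merge. */
#include "mpi.h"

int main (int argc, char *argv[])
{
  MPI_Comm parent, intercomm, intracomm;

  MPI_Init (&argc, &argv);
  MPI_Comm_get_parent (&parent);
  if (parent == MPI_COMM_NULL) {
    /* Parent: spawn two copies of this binary (the call that fails). */
    MPI_Comm_spawn ("spawn_intra_comm", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                    0, MPI_COMM_WORLD, &intercomm, MPI_ERRCODES_IGNORE);
    MPI_Intercomm_merge (intercomm, 0, &intracomm);
  } else {
    /* Child: merge with the parent's group. */
    MPI_Intercomm_merge (parent, 1, &intracomm);
  }
  MPI_Comm_free (&intracomm);
  MPI_Finalize ();
  return 0;
}
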
>>>>
>>>>
>>>> I would be grateful if somebody could fix the problem. Thank you
>>>> very much in advance for any help. I have attached my programs.
>>>> Please let me know if you need anything else.
>>>>
>>>>
>>>> Kind regards
>>>>
>>>> Siegmar
>>>>
>>>>
