[mpich-discuss] Fatal error in PMPI_Barrier: A process has failed, error stack:

Tony Ladd tladd at che.ufl.edu
Wed Mar 26 20:28:29 CDT 2014


No, you get the same error. It looks as if process 1 (on the remote
node) is not starting:

svr:tladd(netbench)> mpirun -n 2 -f hosts /global/usr/src/mpich-3.0.4/examples/cpi
Process 0 of 2 is on svr.che.ufl.edu
Fatal error in PMPI_Reduce: A process has failed, error stack:
PMPI_Reduce(1217)...............: MPI_Reduce(sbuf=0x7fff30ecced8, 
rbuf=0x7fff30ecced0, count=1, MPI_DOUBLE,

But if I reverse the order in the host file (pc5 first and then svr),
both processes apparently start:

svr:tladd(netbench)> mpirun -n 2 -f hosts /global/usr/src/mpich-3.0.4/examples/cpi
Process 1 of 2 is on svr.che.ufl.edu
Process 0 of 2 is on pc5
Fatal error in PMPI_Reduce: A process has failed, error stack:
PMPI_Reduce(1217)...............: MPI_Reduce(sbuf=0x7fff4d776348, 
rbuf=0x7fff4d776340, count=1, MPI_DOUBLE,

But the end result is the same: MPI_Reduce still fails.
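
For reference, here is a rough Python sketch of why the rank numbers swap when the host file is reversed. It is only a simplified model of placement in host-file order, not Hydra's actual code. It assumes each line of the host file is either "host" or "host:slots", with one slot when no count is given. The function names parse_hostfile and place_ranks are made up for illustration, and the two-line file contents are my assumption of what "hosts" looks like in each run.

```python
# Simplified model: ranks are handed out to hosts in host-file order.
# Assumption: each line is "host" or "host:slots"; no count means 1 slot.

def parse_hostfile(text):
    """Return a list of (host, slots) pairs, skipping blank lines."""
    hosts = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        host, _, slots = line.partition(":")
        hosts.append((host.strip(), int(slots) if slots.strip() else 1))
    return hosts


def place_ranks(hosts, nprocs):
    """Assign ranks 0..nprocs-1 in file order, wrapping around if needed."""
    # Expand each host into one entry per slot, preserving file order.
    cycle = [host for host, slots in hosts for _ in range(slots)]
    if not cycle:
        raise ValueError("host file provides no slots")
    return [cycle[rank % len(cycle)] for rank in range(nprocs)]


if __name__ == "__main__":
    # Hypothetical host-file contents for the two runs shown above.
    for contents in ("svr\npc5\n", "pc5\nsvr\n"):
        print(place_ranks(parse_hostfile(contents), 2))
```

Under this model, rank 0 lands on whichever host is listed first, which matches the "Process 0 of 2 is on ..." lines in both runs.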

Tony



On 03/26/2014 08:18 PM, Rajeev Thakur wrote:
> Does the cpi example run across two machines?
>
> Rajeev
>
> On Mar 26, 2014, at 7:13 PM, Tony Ladd <tladd at che.ufl.edu>
>   wrote:
>
>> Rajeev
>>
>> Sorry about that. I was switching back and forth between openmpi and mpich, but it does not make a difference. Here is a clean log from a fresh terminal, with no mention of openmpi.
>>
>> Tony
>>
>> PS - it's a CentOS 6.5 install - I should have mentioned it before.
>>
>> -- 
>> Tony Ladd
>>
>> Chemical Engineering Department
>> University of Florida
>> Gainesville, Florida 32611-6005
>> USA
>>
>> Email: tladd-"(AT)"-che.ufl.edu
>> Web    http://ladd.che.ufl.edu
>>
>> Tel:   (352)-392-6509
>> FAX:   (352)-392-9514
>>
>> <mpich.log>
>> _______________________________________________
>> discuss mailing list     discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
