[mpich-discuss] Fatal error in PMPI_Barrier: A process has failed, error stack:

Rajeev Thakur thakur at mcs.anl.gov
Wed Mar 26 16:51:05 CDT 2014


There are numerous instances of "openmpi" in directory paths in the log file. Is the code entirely compiled and linked with MPICH?

Rajeev
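
One way to check the question above is to look at which shared libraries the executable resolves on each node. The following is only a minimal sketch in Python, not part of MPICH or its tools: the function names and the path patterns are assumptions chosen for illustration, and it expects the text printed by `ldd` for the program in question.

```python
import re

# Patterns used to guess which MPI implementation a library line belongs to.
# These are illustrative assumptions, not an exhaustive or official list.
MPI_PATTERNS = {
    "openmpi": re.compile(r"openmpi|open-mpi", re.IGNORECASE),
    "mpich": re.compile(r"mpich", re.IGNORECASE),
}


def classify_mpi_libs(ldd_output):
    """Map each MPI implementation name to the ldd lines that mention it."""
    found = {}
    for line in ldd_output.splitlines():
        line = line.strip()
        for name, pattern in MPI_PATTERNS.items():
            if pattern.search(line):
                found.setdefault(name, []).append(line)
    return found


def is_mixed(found):
    """Return True when libraries from more than one MPI implementation appear."""
    return len(found) > 1
```

To use it, capture the output of `ldd` for the executable on every node and pass the text to `classify_mpi_libs`. If `is_mixed` returns True, or if the nodes disagree, the binary is probably picking up libraries from more than one MPI installation at run time.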

On Mar 26, 2014, at 3:53 PM, Tony Ladd <tladd at che.ufl.edu> wrote:

> I get this error when I try to use mpich across two different nodes. My program works on a single node. I realize this is a common error but I have checked all of the issues mentioned in the FAQ and I did not find any further discussion in the archives.
> 
> I followed the standard installation to /global/usr/bin and /global/usr/lib. The file system /global is automounted on the clients which have passwordless ssh configured.
> 
> Running either my network test code or Intel's IMB code produced a similar error (see attached). Both codes run on a single node under mpich and run across the two nodes using openmpi. Also, netpipe (using TCP sockets) works fine.
> 
> I tried including the LD_LIBRARY_PATH via the -genv option but that did not help. I must have some configuration issue but I do not see what it is. I have had previous versions of mpich/mvapich working without any trouble but I am stuck here. I would be grateful for any hints.
> 
> Tony
> 
> -- 
> Tony Ladd
> 
> Chemical Engineering Department
> University of Florida
> Gainesville, Florida 32611-6005
> USA
> 
> Email: tladd-"(AT)"-che.ufl.edu
> Web    http://ladd.che.ufl.edu
> 
> Tel:   (352)-392-6509
> FAX:   (352)-392-9514
> 
> <mpich.log>



