[mpich-discuss] Fatal error in PMPI_Barrier: A process has failed, error stack:
Tony Ladd
tladd at che.ufl.edu
Wed Mar 26 15:53:13 CDT 2014
I get this error when I try to use mpich across two different nodes. My
program works on a single node. I realize this is a common error but I
have checked all of the issues mentioned in the FAQ and I did not find
any further discussion in the archives.
I followed the standard installation to /global/usr/bin and
/global/usr/lib. The file system /global is automounted on the clients
which have passwordless ssh configured.
Running either my network test code or Intel's IMB code produced a
similar error (see attached). Both codes run on a single node under
mpich and run across the two nodes using openmpi. NetPIPE (using TCP
sockets) also works fine.
I tried passing LD_LIBRARY_PATH via the -genv option but that did
not help. I must have some configuration issue but I do not see what. I
have had previous versions of mpich/mvapich working without any trouble
but I am stuck here. I would be grateful for any hints.
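
As a minimal sketch of the kind of launch described above, the Python
helper below assembles an mpiexec command line that passes
LD_LIBRARY_PATH with -genv. It assumes the launcher is MPICH's default
Hydra mpiexec. The host names node1 and node2, the program name
./nettest, and the helper build_mpiexec_command are hypothetical; only
/global/usr/lib comes from the installation described above.

```python
import shlex


def build_mpiexec_command(hosts, lib_dir, program):
    """Build an argv list for a Hydra-style mpiexec launch.

    Assumptions (not from the original post): one process per host,
    and the launcher accepts -genv NAME VALUE, -hosts h1,h2 and -n N.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    return [
        "mpiexec",
        # Export LD_LIBRARY_PATH to every launched process.
        "-genv", "LD_LIBRARY_PATH", lib_dir,
        # Comma-separated list of nodes to run on.
        "-hosts", ",".join(hosts),
        # One process per listed host.
        "-n", str(len(hosts)),
        program,
    ]


if __name__ == "__main__":
    # Hypothetical host and program names, for illustration only.
    cmd = build_mpiexec_command(["node1", "node2"], "/global/usr/lib", "./nettest")
    print(shlex.join(cmd))
```

Printing the command rather than running it makes it easy to compare
against the exact line that was typed at the shell.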
Tony
--
Tony Ladd
Chemical Engineering Department
University of Florida
Gainesville, Florida 32611-6005
USA
Email: tladd-"(AT)"-che.ufl.edu
Web http://ladd.che.ufl.edu
Tel: (352)-392-6509
FAX: (352)-392-9514
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mpich.log
Type: text/x-log
Size: 21367 bytes
Desc: not available
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20140326/a17563fc/attachment.bin>