[mpich-discuss] Fatal error in PMPI_Barrier: A process has failed, error stack:
Rajeev Thakur
thakur at mcs.anl.gov
Wed Mar 26 16:51:05 CDT 2014
There are numerous instances of "openmpi" in directory paths in the log file. Is the code entirely compiled and linked with MPICH?
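
One rough way to repeat that check is to scan the log for the offending paths. The following is only a sketch: the helper name `find_foreign_mpi_paths` and the default marker "openmpi" are assumptions made for illustration, and the script takes the log file name on the command line.

```python
import sys


def find_foreign_mpi_paths(lines, markers=("openmpi",)):
    """Return (line_number, text) pairs for lines mentioning any marker.

    Hypothetical helper: the name and the default marker list are
    illustrative assumptions, not part of MPICH or any other tool.
    The match is a case-insensitive substring test.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(marker in lowered for marker in markers):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (file name supplied by the caller): python check_log.py mpich.log
    with open(sys.argv[1], errors="replace") as handle:
        for number, text in find_foreign_mpi_paths(handle):
            print(f"{number}: {text}")
```

Any line it prints is a place where the build or the run may be picking up another MPI installation. It does not prove a mismatch on its own; it only points to where to look.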
Rajeev
On Mar 26, 2014, at 3:53 PM, Tony Ladd <tladd at che.ufl.edu> wrote:
> I get this error when I try to use mpich across two different nodes. My program works on a single node. I realize this is a common error but I have checked all of the issues mentioned in the FAQ and I did not find any further discussion in the archives.
>
> I followed the standard installation to /global/usr/bin and /global/usr/lib. The file system /global is automounted on the clients which have passwordless ssh configured.
>
> Running either my network test code or Intel's IMB code produced a similar error (see attached). Both codes run on a single node under mpich and run across the two nodes using openmpi. Also, netpipe (using tcp sockets) works fine.
>
> I tried including the LD_LIBRARY_PATH via the -genv option, but that did not help. I must have some configuration issue, but I do not see what it is. I have had previous versions of mpich/mvapich working without any trouble, but I am stuck here. I would be grateful for any hints.
>
> Tony
>
> --
> Tony Ladd
>
> Chemical Engineering Department
> University of Florida
> Gainesville, Florida 32611-6005
> USA
>
> Email: tladd-"(AT)"-che.ufl.edu
> Web http://ladd.che.ufl.edu
>
> Tel: (352)-392-6509
> FAX: (352)-392-9514
>
> <mpich.log>
> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss