[mpich-discuss] _get_addr error while running application using MPICH

Zhou, Hui zhouh at anl.gov
Mon Nov 26 10:12:37 CST 2018


Hi Pavan,

Is there a difference between mpirun and mpiexec? I understand the official standard only specifies mpiexec, and that mpirun is often (is it always?) a link to mpiexec. Is there an `unofficial` convention for mpirun, such as always being present, always being a link, or being no different from mpiexec at all? It appears to me that mpirun is more widespread than mpiexec (I have always used mpirun).

—
Hui Zhou


On Nov 21, 2018, at 5:34 PM, Balaji, Pavan via discuss <discuss at mpich.org> wrote:


I should clarify one piece of how mpiexec works:

% ./real.exe

and

% mpirun -np 1 ./real.exe

are equivalent.

% mpirun ./real.exe

uses some number of processes depending on the environment.  For unmanaged clusters, that's typically 1.  For clusters that have a job management system (such as Slurm or PBS), mpiexec will figure out how many nodes you allocated and use all of the cores allocated to that job.
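
To make the idea concrete, here is a rough Python sketch of how a launcher could pick a default process count from the environment. It is only an illustration and not what mpiexec actually does internally. The function name guess_default_nprocs is made up, and it assumes the job manager exports SLURM_NTASKS (Slurm) or PBS_NODEFILE (PBS) inside the allocation.

```python
import os


def guess_default_nprocs(env=None):
    """Guess how many processes a launcher might start when -np is omitted.

    Hypothetical helper for illustration; real launchers use more
    elaborate detection than this.
    """
    env = os.environ if env is None else env

    # Assumption: Slurm exported SLURM_NTASKS for this allocation.
    ntasks = env.get("SLURM_NTASKS")
    if ntasks and ntasks.isdigit():
        return int(ntasks)

    # Assumption: PBS_NODEFILE names a file with one line per allocated slot.
    nodefile = env.get("PBS_NODEFILE")
    if nodefile and os.path.isfile(nodefile):
        with open(nodefile) as f:
            return sum(1 for line in f if line.strip())

    # No job manager detected (unmanaged machine): fall back to one process.
    return 1


if __name__ == "__main__":
    print(guess_default_nprocs())
```

If this prints something larger than 1 inside your job, then a plain "mpirun ./real.exe" is most likely starting processes on more than one node.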

My guess is that real.exe has some dependencies that are met on the local machine, but not on other machines.  So when mpiexec tries to launch real.exe on other nodes, it's throwing an error.  This is not an mpiexec problem, but you might want to use the -prepend-pattern option in mpiexec to figure out where the error is coming from.  Something like this:

mpiexec -prepend-pattern %h ./real.exe
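
If the job produces a lot of output, a small script can tally the errors per host. The sketch below assumes the output was captured with a bracketed pattern such as -prepend-pattern "[%h] " (so the hostname is easy to split off) rather than the bare %h above. The hostnames node02 and node03 are invented sample data, and count_errors_by_host is a hypothetical helper.

```python
import re
from collections import Counter

# Assumption: output was captured from
#   mpiexec -prepend-pattern "[%h] " ./real.exe
# so every prefixed line starts with "[hostname] ".
PREFIX = re.compile(r"^\[(?P<host>[^\]]+)\] (?P<msg>.*)$")


def count_errors_by_host(output, needle="error"):
    """Count prefixed lines containing `needle`, grouped by hostname."""
    counts = Counter()
    for line in output.splitlines():
        match = PREFIX.match(line)
        if match and needle in match.group("msg"):
            counts[match.group("host")] += 1
    return counts


# Invented sample: the hostnames are placeholders, not real output.
sample = """\
[node02] real.exe: error: _get_addr: No such file or directory
[node03] real.exe: error: _get_addr: No such file or directory
[node02] real.exe: error: _get_addr: No such file or directory
Fatal error in MPI_Init: Other MPI error, error stack:
"""

if __name__ == "__main__":
    for host, n in sorted(count_errors_by_host(sample).items()):
        print(host, n)
```

Hosts that show up in the tally are the ones to check for the missing dependency.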

 -- Pavan

On Nov 4, 2018, at 10:27 AM, Zhifeng Yang via discuss <discuss at mpich.org> wrote:

Hi

I installed MPICH and used it with a Fortran code. Running the resulting executable, real.exe, with either of the following commands produces an error:

$ ./real.exe
or
$ mpirun ./real.exe
real.exe: error: _get_addr: No such file or directory
real.exe: error: _get_addr: No such file or directory
real.exe: error: _get_addr: No such file or directory
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(784).....:
MPID_Init(1323)...........: channel initialization failed
MPIDI_CH3_Init(120).......:
MPID_nem_init_ckpt(852)...:
MPIDI_CH3I_Seg_commit(364): PMI_Barrier returned -1

But when I run mpirun and specify the number of processes, as follows:
$ mpirun -np 1 ./real.exe
the error no longer appears. I am not sure why. Do you have any explanation? Thank you very much.

Best regards
Zhifeng

_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss
