[mpich-discuss] [EXTERNAL] Re: mpiexec fails to launch any processes

Mccall, Kurt E. (MSFC-EV41) kurt.e.mccall at nasa.gov
Mon Jun 13 16:15:58 CDT 2022


Hui,

That worked too. I guess I’ll have to find a way to pass a “verbose” argument to sbatch to see why Slurm is killing my application.

Thanks,
Kurt

From: Zhou, Hui <zhouh at anl.gov>
Sent: Monday, June 13, 2022 4:11 PM
To: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov>; discuss at mpich.org
Subject: Re: [EXTERNAL] Re: mpiexec fails to launch any processes

Kurt,

Could you try launching hostname with the same command?

    mpiexec -launcher ssh -verbose -print-all-exitcodes -wdir  <directory> -np 20 -ppn 1 hostname

If that works, the problem seems to lie in your application: something in your code is causing Slurm to kill the job.
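
A minimal sketch of how that comparison could be scripted, so that the trivial run and the real run report their exit statuses side by side. The helper name `run_check` is hypothetical and is not part of MPICH or Slurm; it only wraps whatever command it is given.

```shell
#!/bin/sh
# Sketch: run a command, print its exit status, and pass that status on.
# run_check is a hypothetical helper name, not an MPICH or Slurm tool.
run_check() {
    label=$1
    shift
    # Run the remaining arguments as the command, unchanged.
    "$@"
    status=$?
    echo "$label: exit status $status"
    return "$status"
}
```

Assuming the same launch options as above, it could be called once with hostname and once with the real application, for example:

    run_check hostname mpiexec -launcher ssh -verbose -print-all-exitcodes -wdir <directory> -np 20 -ppn 1 hostname

If the hostname run reports status 0 and the application run does not, that is consistent with the application itself being what triggers the kill.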

--
Hui
________________________________
From: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov>
Sent: Monday, June 13, 2022 4:02 PM
To: Zhou, Hui <zhouh at anl.gov>; discuss at mpich.org
Subject: RE: [EXTERNAL] Re: mpiexec fails to launch any processes

Hui,

    $ mpiexec -N 10 -hostfile MySlurmNodeFile2 hostname

works properly, reporting from each of 10 nodes.

Kurt

From: Zhou, Hui <zhouh at anl.gov>
Sent: Monday, June 13, 2022 2:44 PM
To: discuss at mpich.org
Cc: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov>
Subject: [EXTERNAL] Re: mpiexec fails to launch any processes

Hi Kurt,

I don't have a clear idea yet. Are you able to launch a trivial application, for example "hostname"?

--
Hui
________________________________

From: Mccall, Kurt E. (MSFC-EV41) via discuss <discuss at mpich.org>
Sent: Monday, June 13, 2022 12:29 PM
To: discuss at mpich.org
Cc: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov>
Subject: Re: [mpich-discuss] mpiexec fails to launch any processes

Outlook blocked the output file slurm.out that I had attached. Trying to send it again as slurm.txt.

Kurt

Hi,

My mpiexec command fails to launch any processes. I ran it with the -verbose option but didn’t see any obvious errors in the output (attached).

The command is:

    mpiexec -launcher ssh -verbose -print-all-exitcodes -wdir <directory> -np 20 -ppn 1 <more args…>

I am running MPICH 4.0.1 under Slurm 20.11.8. Thanks for any help.

Kurt