[mpich-discuss] MPICH2-1.4.1 on Windows Server, Issues Running on More than One Node

Joshua Moore jdmoore at ncsu.edu
Mon Apr 25 20:51:42 CDT 2016


Hello,

This is a follow-up to my message below.  I am running MPICH2-1.4.1 on
Windows Server 2012 R2 and submitting jobs to the job scheduler from the
PowerShell command line.

I found this page which explains my issue

https://technet.microsoft.com/pt-pt/library/cc720141(v=ws.10).aspx

"An exception is the mpiexec option *-n*. Using this option will cause all
of the processes to be run on the same processor, regardless of how many
processors have been specified using */numprocessors:*."

The page covers an older version, but I think this is the cause of my
problem.

Can I use MPICH2 instead of Windows MPI with the Windows Job Scheduler?  If
I don't specify a "-n" parameter, mpiexec executes only a single task,
regardless of the number of processors I request.

I am trying to use

$j=New-HpcJob
$j | Add-HpcTask -numcores $NUMCORES -command "$MPIEXEC -n $NUMCORES lmp_mpi.exe -in in.lammps"
$j | Submit-HpcJob

where $MPIEXEC is the full path to mpiexec.exe.  I am using MPICH2-1.4.1
because it works with LAMMPS and I am told Windows MPI does not because of
Visual C++ compatibility with LAMMPS.

I can successfully run the cpi.exe example outside the job scheduler on all
cores and nodes.

I can use a machinefile as well, but when I do, the cores and nodes that
mpiexec runs on don't match the nodes allocated by the job scheduler.
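
For illustration only, one way the machinefile could be made to follow the
scheduler's allocation is to build it inside the task from the node list the
scheduler hands to the job.  The sketch below is not a confirmed fix.  It
assumes the scheduler exposes the allocation in the CCP_NODES environment
variable in the form "<node count> <host1> <cores1> <host2> <cores2> ...",
and that mpiexec accepts machinefile lines of the form "hostname:nprocs".
The function name ConvertTo-MachineFileLines and the file name machines.txt
are hypothetical.

```powershell
# Hypothetical helper: turn a CCP_NODES-style string into machinefile lines.
# Assumed input format: "<nodeCount> <host1> <cores1> <host2> <cores2> ..."
function ConvertTo-MachineFileLines {
    param([Parameter(Mandatory = $true)][string]$NodeList)

    # Split on whitespace and drop any empty tokens.
    $tokens = @($NodeList -split '\s+' | Where-Object { $_ -ne '' })
    $nodeCount = [int]$tokens[0]

    # Expect exactly one count token followed by (host, cores) pairs.
    if ($tokens.Count -ne (1 + 2 * $nodeCount)) {
        throw "Unexpected node list format: '$NodeList'"
    }

    # Emit one "hostname:cores" string per allocated node.
    for ($i = 0; $i -lt $nodeCount; $i++) {
        $hostName = $tokens[1 + 2 * $i]
        $cores    = [int]$tokens[2 + 2 * $i]
        "${hostName}:$cores"
    }
}

# Example with a literal node list (host names are placeholders):
ConvertTo-MachineFileLines -NodeList "2 host1 16 host2 16"

# Inside a scheduled task the lines could be written out and passed on
# (assumes CCP_NODES is set and $MPIEXEC / $NUMCORES are defined as above):
#   $lines = @(ConvertTo-MachineFileLines -NodeList $env:CCP_NODES)
#   Set-Content -Path machines.txt -Value $lines
#   & $MPIEXEC -machinefile machines.txt -n $NUMCORES lmp_mpi.exe -in in.lammps
```

Because the node list is only known once the task is running, the task
command would have to be a script that performs these steps rather than a
direct call to mpiexec.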

Is there any way to use MPICH2 with the -n option under the Windows Job
Scheduler and still get jobs to run on more than one node?

Thanks again for any response I get.  I've spent days trying to figure this
out, so I would be grateful for any advice.

Josh



On Sun, Apr 24, 2016 at 11:49 PM, Joshua Moore <jdmoore at ncsu.edu> wrote:

> Hello,
>
> I am having issues running mpiexec when I run on more than one node.
>
> I am using v 1.4.1 because it is compatible with the LAMMPS software I am
> using.
>
> I am running on Windows Server 2012 R2 and trying to get MPICH2 to play
> nice with the Windows batch server.
>
> I've installed mpich2 on each of the nodes and the head node.
>
> 1) msiexec /i mpich2-1.2.1-win-ia32.msi
> 2) mpiexec -register (to register username and password)
> 3) smpd -install (to start the smpd server on each of the nodes and the
> head node)
> 4) On the head node, I've used smpd -sethosts hostname1 hostname2 ... to
> add each of the hosts.  I haven't done this on the compute nodes.
>
> When I submit a job to the Windows batch system through PowerShell, the
> job executes but puts all of the processes on the same node.  For example,
> if my nodes have 16 cores each and I ask for 32 cores, 32 separate
> processes are run on the first node and none on the second, even though
> the Windows scheduler has allocated 2 nodes.
>
> It's as if mpiexec is ignoring the node list set with smpd -sethosts.
>
> I can use a machinefile with mpiexec, and this allows me to execute on
> multiple nodes, but it doesn't seem to follow the nodes that Windows
> allocates in its batch server.
>
> I should also add that when I try to request more than one node through
> the Windows batch server with -requestednodes "host1 host2" with new-hpcjob,
> Windows tells me that I can't do this because I have zero cores available.
> I can ask for only 1 node and up to the maximum number of cores, and it is
> ok with this.
>
> Any suggestions?
>
> Thank you.
>
> Josh
>