[mpich-discuss] Amazon ec2 Windows machine

Nicholas Sgro nsgro060 at gmail.com
Fri Feb 8 16:02:35 CST 2013


Hi,
This is the command I'm using:

mpiexec.exe -machinefile machines.txt -env MPICH2_CHANNEL sock -n 2 cpi.exe

I have tried both a machine file and listing the hosts on the command
line, but I get the same results. The program runs on a single instance
with any number of processes. I also tried running mpiexec on one instance
with the other instance as the only host, and that works too.
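
For reference, the two forms I tried look roughly like this. It is only a
sketch: ec2-host-1 and ec2-host-2 are placeholder names, not the real
instance names, and the -hosts syntax is assumed to be the one accepted by
the smpd-based mpiexec that ships with MPICH2 on Windows.

```bat
rem machines.txt holds one host name per line (placeholder names):
rem   ec2-host-1
rem   ec2-host-2

rem Form 1: hosts taken from the machine file
mpiexec.exe -machinefile machines.txt -env MPICH2_CHANNEL sock -n 2 cpi.exe

rem Form 2: hosts listed on the command line (process count given by -hosts)
mpiexec.exe -hosts 2 ec2-host-1 ec2-host-2 -env MPICH2_CHANNEL sock cpi.exe
```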

-Nicholas

On Fri, Feb 8, 2013 at 12:04 PM, Jayesh Krishna <jayesh at mcs.anl.gov> wrote:

> Hi,
>  How are you running your job (mpiexec command)? Did you try using a
> machine file to specify the hostnames when running the job?
>  Does the program (cpi) execute correctly on a single ec2 instance?
>
> Regards,
> Jayesh
>
> ----- Original Message -----
> From: "Nicholas Sgro" <nsgro060 at gmail.com>
> To: "Jayesh Krishna" <jayesh at mcs.anl.gov>
> Sent: Thursday, February 7, 2013 9:57:55 PM
> Subject: Re: [mpich-discuss] Amazon ec2 Windows machine
>
> I'm using version 1.4.1p1. I tried the sock channel. It doesn't seem to
> work either. With sock, I get to the point where I enter the number of
> intervals, but then it does nothing.
>
> Do you know any reason it wouldn't work with ec2 instances?
>
> On Thu, Feb 7, 2013 at 4:29 PM, Jayesh Krishna <jayesh at mcs.anl.gov> wrote:
>
> Hi,
> Which version of MPICH2 are you using? Did you try the "sock" channel to
> see whether it works?
>
> (PS: We haven't tested MPICH2 on Windows with ec2 instances.)
> Regards,
> Jayesh
>
>
> ----- Original Message -----
> From: "Nicholas Sgro" < nsgro060 at gmail.com >
> To: discuss at mpich.org
> Sent: Thursday, February 7, 2013 11:29:57 AM
> Subject: [mpich-discuss] Amazon ec2 Windows machine
>
>
> Hi all,
>
> I am trying to run the example cpi.exe across two Amazon EC2 instances
> running Windows. I get different problems depending on the channel I
> choose. If I try nemesis, I get the following error:
>
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(392).................:
> MPID_Init(139)........................: channel initialization failed
> MPIDI_CH3_Init(38)....................:
> MPID_nem_init(196)....................:
> MPIDI_CH3I_Seg_commit(366)............:
> MPIU_SHMW_Hnd_deserialize(324)........:
> MPIU_SHMW_Seg_open(863)...............:
> MPIU_SHMW_Seg_create_attach_templ(763): unable to allocate shared memory -
> OpenFileMapping The system cannot find the file specified.
>
> If I try to use shm, cpi.exe uses 100% of the processors on both machines,
> but makes no progress and I have to cancel the job.
>
> I am attaching logs from smpd from both machines from the runs with
> nemesis and shm.
>
> I don't have any experience with mpich, so I have no idea what the problem
> is. Any guidance would be appreciated.
>
> Thanks
>
>