Hi,<br>This is the command I'm using:<br><br>mpiexec.exe -machinefile machines.txt -env MPICH2_CHANNEL sock -n 2 cpi.exe<br><br>I
have tried both a machine file and hosts given on the command line, but I
get the same result either way. The program runs on a single instance with any number of processors. I also tried
running mpiexec on one instance with the other instance as the only host, and
that also works. <br><br>-Nicholas<br><br><div class="gmail_quote">On Fri, Feb 8, 2013 at 12:04 PM, Jayesh Krishna <span dir="ltr"><<a href="mailto:jayesh@mcs.anl.gov" target="_blank">jayesh@mcs.anl.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
How are you running your job (mpiexec command)? Did you try using a machine file to specify the hostnames when running the job?<br>
Does the program (cpi) execute correctly on a single ec2 instance?<br>
<div class="im HOEnZb"><br>
Regards,<br>
Jayesh<br>
<br>
----- Original Message -----<br>
From: "Nicholas Sgro" <<a href="mailto:nsgro060@gmail.com">nsgro060@gmail.com</a>><br>
</div><div class="HOEnZb"><div class="h5">To: "Jayesh Krishna" <<a href="mailto:jayesh@mcs.anl.gov">jayesh@mcs.anl.gov</a>><br>
Sent: Thursday, February 7, 2013 9:57:55 PM<br>
Subject: Re: [mpich-discuss] Amazon ec2 Windows machine<br>
<br>
I'm using version 1.4.1p1. I tried the sock channel, but it doesn't seem to work either: I get to the point where I enter the number of intervals, and then nothing happens.<br>
<br>
Do you know of any reason it wouldn't work with ec2 instances?<br>
<br>
On Thu, Feb 7, 2013 at 4:29 PM, Jayesh Krishna < <a href="mailto:jayesh@mcs.anl.gov">jayesh@mcs.anl.gov</a> > wrote:<br>
<br>
<br>
Hi,<br>
Which version of MPICH2 are you using? Did you try the "sock" channel to see whether it works?<br>
<br>
(PS: We haven't tested MPICH2 on Windows with ec2 instances.)<br>
Regards,<br>
Jayesh<br>
<br>
<br>
----- Original Message -----<br>
From: "Nicholas Sgro" < <a href="mailto:nsgro060@gmail.com">nsgro060@gmail.com</a> ><br>
To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
Sent: Thursday, February 7, 2013 11:29:57 AM<br>
Subject: [mpich-discuss] Amazon ec2 Windows machine<br>
<br>
<br>
Hi all,<br>
<br>
I am trying to run the example cpi.exe across 2 Amazon EC2 instances running Windows. I run into different problems depending on the channel I choose. If I try nemesis, I get the following error:<br>
<br>
Fatal error in MPI_Init: Other MPI error, error stack:<br>
MPIR_Init_thread(392).................:<br>
MPID_Init(139)........................: channel initialization failed<br>
MPIDI_CH3_Init(38)....................:<br>
MPID_nem_init(196)....................:<br>
MPIDI_CH3I_Seg_commit(366)............:<br>
MPIU_SHMW_Hnd_deserialize(324)........:<br>
MPIU_SHMW_Seg_open(863)...............:<br>
MPIU_SHMW_Seg_create_attach_templ(763): unable to allocate shared memory - OpenFileMapping The system cannot find the file specified.<br>
<br>
If I try to use shm, cpi.exe uses 100% of the CPU on both machines but makes no progress, and I have to cancel the job.<br>
<br>
I am attaching the smpd logs from both machines for the nemesis and shm runs.<br>
<br>
I don't have any experience with MPICH, so I have no idea what the problem is. Any guidance would be appreciated.<br>
<br>
Thanks<br>
<br>
<br>
</div></div></blockquote></div><br>