[mpich-discuss] Amazon ec2 Windows machine

Rayson Ho raysonlogin at gmail.com
Sun Feb 10 12:57:11 CST 2013


How did you configure the EC2 security groups? By default, EC2
instances have their inbound traffic blocked, and you will need to
configure security group rules to enable inbound traffic.

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html
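
A quick way to see whether the security group is the problem is to probe
a TCP port on the other instance from the one where you run mpiexec. Below
is a minimal sketch in Python, not something from the MPICH distribution.
The helper names (can_connect, report) are made up for illustration, and
the port you probe is your choice: I am assuming smpd listens on its usual
default of 8676, so check your own smpd configuration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def report(host: str, ports: list[int], timeout: float = 3.0) -> dict[int, bool]:
    """Probe each port on host and return a {port: reachable} mapping."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

For example, report("10.0.0.12", [8676]) probes port 8676, where the
address is a placeholder for the other instance's private IP. A timeout
(rather than an immediate refusal) usually suggests that a firewall or
security group rule is dropping the traffic.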

Also, is there a reason you are creating EC2 HPC clusters manually instead
of using a toolkit? We are fans of MIT's StarCluster: with it we can
start up and shut down clusters very quickly (usually in a few minutes).
It is Linux based and includes MPICH (and/or Open MPI), Open Grid Scheduler /
Grid Engine, and many of the tools needed for doing HPC in EC2:

http://star.mit.edu/cluster/

We also built a 10,000-node cluster in EC2 based on StarCluster late
last year, during SC12:

http://blogs.scalablelogic.com/2012/11/running-10000-node-grid-engine-cluster.html

Rayson

==================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/


On Fri, Feb 8, 2013 at 5:02 PM, Nicholas Sgro <nsgro060 at gmail.com> wrote:
> Hi,
> This is the command I'm using:
>
> mpiexec.exe -machinefile machines.txt -env MPICH2_CHANNEL sock -n 2 cpi.exe
>
> I have tried using both machine file and hosts in the command line, but I
> get the same results. The program runs on a single instance with any number
> of processors. I tried running mpiexec on one instance and using the other
> as a single host and that also works.
>
> -Nicholas
>
>
> On Fri, Feb 8, 2013 at 12:04 PM, Jayesh Krishna <jayesh at mcs.anl.gov> wrote:
>>
>> Hi,
>>  How are you running your job (mpiexec command)? Did you try using a
>> machine file to specify the hostnames when running the job?
>>  Does the program (cpi) execute correctly on a single ec2 instance?
>>
>> Regards,
>> Jayesh
>>
>> ----- Original Message -----
>> From: "Nicholas Sgro" <nsgro060 at gmail.com>
>> To: "Jayesh Krishna" <jayesh at mcs.anl.gov>
>> Sent: Thursday, February 7, 2013 9:57:55 PM
>> Subject: Re: [mpich-discuss] Amazon ec2 Windows machine
>>
>> I'm using version 1.4.1p1. I tried the sock channel. It doesn't seem to
>> work either. With sock, I get to the point where I enter the number of
>> intervals, but then it does nothing.
>>
>> Do you know any reason it wouldn't work with ec2 instances?
>>
>>
>>
>> On Thu, Feb 7, 2013 at 4:29 PM, Jayesh Krishna < jayesh at mcs.anl.gov >
>> wrote:
>>
>>
>> Hi,
>> Which version of MPICH2 are you using? Did you try the "sock" channel (See
>> if it works)?
>>
>> (PS: We haven't tested MPICH2 on Windows with ec2 instances.)
>> Regards,
>> Jayesh
>>
>>
>> ----- Original Message -----
>> From: "Nicholas Sgro" < nsgro060 at gmail.com >
>> To: discuss at mpich.org
>> Sent: Thursday, February 7, 2013 11:29:57 AM
>> Subject: [mpich-discuss] Amazon ec2 Windows machine
>>
>>
>> Hi all,
>>
>> I am trying to run the example cpi.exe across 2 Amazon EC2 instances
>> running Windows. I have different problems depending on the channel I
>> choose. If I try nemesis, I get the following error:
>>
>> Fatal error in MPI_Init: Other MPI error, error stack:
>> MPIR_Init_thread(392).................:
>> MPID_Init(139)........................: channel initialization failed
>> MPIDI_CH3_Init(38)....................:
>> MPID_nem_init(196)....................:
>> MPIDI_CH3I_Seg_commit(366)............:
>> MPIU_SHMW_Hnd_deserialize(324)........:
>> MPIU_SHMW_Seg_open(863)...............:
>> MPIU_SHMW_Seg_create_attach_templ(763): unable to allocate shared memory -
>> OpenFileMapping The system cannot find the file specified.
>>
>> If I try to use shm, cpi.exe uses 100% of the processors on both machines,
>> but makes no progress and I have to cancel the job.
>>
>> I am attaching logs from smpd from both machines from the runs with
>> nemesis and shm.
>>
>> I don't have any experience with mpich, so I have no idea what the problem
>> is. Any guidance would be appreciated.
>>
>> Thanks
>>
>>
>> _______________________________________________
>> discuss mailing list discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>>


