[mpich-discuss] Problems with running make

Gus Correa gus at ldeo.columbia.edu
Mon Mar 3 18:47:45 CST 2014


On 03/03/2014 06:57 PM, Ron Palmer wrote:
> Thanks Gus, I will read up on those queuing systems.
>
> Matlab, not really, I do not know how to use it and I am the only user
> (and I could always re-enable the window manager later if required). I
> do generate gnuplots in my perl scripts and email them to me as the
> inversion progresses. All results get downloaded via ftp to a
> workstation where all graphics (visualisation, interpretation and
> fly-throughs) are done. The cluster is really only horsepower, and all
> processes and OS could quite comfortably run in RAM only, with a token
> harddrive for intermediate files. If they crash, I just start them up
> again, only the last few hours to a few days lost, nothing longer term.
> So many things, so little time...
>
> Yes, not mpi related, so I appropriately apologise to all and sundry!
>
> :-)

Hi Ron

When it comes to the details of how to set up a queuing system,
of course you will get better advice on each system's own mailing list.

However, troubleshooting network setup,
ssh authentication, integration of MPI with a queuing system, etc.
are all related to MPI, and it is hard to draw a dividing line,
as you may have noticed.
IMHO this list is not strict in this regard.
Your postings were focused on how to get MPICH up and running,
so they were well taken and entirely appropriate.

Gus Correa



>
> On 4/03/2014 09:49, Gus Correa wrote:
>> Hi Ron
>>
>> On 03/03/2014 06:19 PM, Ron Palmer wrote:
>>> Reuti, good comment about primary interface for MPI and secondary for
>>> the rest of the world.
>>>
>>> Yes, setting it up from scratch, and I would love a queuing system as
>>> well, I just had no idea such existed (learning by the minute!). These
>>> computers are dedicated to these parallelisation jobs I am running and
>>> will not run anything else. Would you be able to point my nose in a
>>> direction of a simple queuing system?
>>>
>>
>> In the open source / free software world:
>>
>> I use Torque+Maui:
>>
>> http://www.adaptivecomputing.com/products/open-source/torque/
>> http://www.adaptivecomputing.com/products/open-source/maui/
>>
>> Another very popular queuing system is the Sun Grid Engine (maybe this
>> is what Reuti uses?), which I think became the Open Grid Scheduler.
>> Better ask somebody else (Reuti maybe) about this one.
>>
>> http://gridscheduler.sourceforge.net/
>>
>> Yet a third one is LLNL's Slurm:
>>
>> https://computing.llnl.gov/linux/slurm/
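>>
>> Just to give you a feel for it, a minimal Torque/PBS submission script
>> for an MPICH job might look something like this (the job name, node
>> counts, walltime, and binary name are only placeholders, adjust them
>> to your setup):
>>
>>   #!/bin/bash
>>   #PBS -N inversion
>>   #PBS -l nodes=3:ppn=8
>>   #PBS -l walltime=48:00:00
>>   #PBS -j oe
>>   cd $PBS_O_WORKDIR
>>   # Torque lists the allocated hosts, one per core, in $PBS_NODEFILE
>>   mpiexec -f $PBS_NODEFILE -n 24 ./my_inversion_binary
>>
>> You submit it with "qsub myjob.pbs" and monitor it with "qstat".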
>>
>>
>>> All my parallel processing is controlled by my own perl scripts and
>>> commandline commands (provided binaries), nothing GUI at all (which is
>>> kinda nice).
>>>
>>> Do you think shutting down the xserver would make any difference to the
>>> performance of these computers? I am pretty sure that they all have a
>>> window manager running by default... I only got them going last week so
>>> I have not (yet) gone into this level of detail.
>>>
>>
>> If you are sure nobody will ever want to run Matlab,
>> or to show a graph with gnuplot, or perhaps more likely, use a
>> graphical debugger such as DDD (or the equivalent from Intel or PGI),
>> you can live without X-windows.
>>
>> I don't see it as a big burden, though, unless there is somebody doing
>> heavy GUI stuff while other computation is going on concurrently.
>>
>> You could simply start the computers at runlevel 3.
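>>
>> On a RHEL/CentOS 6 style machine (assuming that is what you run, since
>> it is pre-systemd) that is a one-line change in /etc/inittab:
>>
>>   id:3:initdefault:
>>
>> or, to drop X on the running system without a reboot:
>>
>>   init 3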
>>
>> **
>>
>> OK, this is now more about clusters than about MPICH specifically. :)
>>
>> I hope this helps,
>> Gus Correa
>>
>>> Thanks,
>>> Ron
>>>
>>> On 4/03/2014 09:06, Reuti wrote:
>>>> On 03.03.2014 at 23:54, Gus Correa wrote:
>>>>
>>>>> On 03/03/2014 04:36 PM, Ron Palmer wrote:
>>>>>> Thanks Reuti for your comments. I will peruse that FAQ detail.
>>>>>>
>>>>>> I just thought of the fact that each of these rack computers has 4
>>>>>> ethernet sockets, eth0 - eth3... I could connect the cluster on
>>>>>> separate ethernet sockets via an extra switch not connected to the
>>>>>> internet or any other computers, accept all communication among
>>>>>> them, and keep iptables up on the ethX connected to the outside
>>>>>> world. I
>>>>>> guess I would have to set up routing tables or something. Ah, more
>>>>>> reading :-)
>>>>>>
>>>>>> Thanks for your help.
>>>>>> Ron
>>>>>>
>>>>> Hi Ron
>>>>>
>>>>> If those extra interfaces are not in use,
>>>>> and if you have a spare switch,
>>>>> you can setup a separate private subnet exclusively for MPI.
>>>>> You need to configure the interfaces consistently (IP, subnet mask,
>>>>> perhaps a gateway). Configuring them statically is easy:
>>>>>
>>>>> https://access.redhat.com/site/documentation//en-US/Red_Hat_Enterprise_Linux/6/html-single/Deployment_Guide/index.html#s2-networkscripts-interfaces-eth0
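>>>>>
>>>>> For instance, a static configuration of a second interface could look
>>>>> like this in /etc/sysconfig/network-scripts/ifcfg-eth1 (device name
>>>>> and addresses are only an illustration; give each node its own IPADDR):
>>>>>
>>>>>   DEVICE=eth1
>>>>>   ONBOOT=yes
>>>>>   BOOTPROTO=none
>>>>>   IPADDR=10.0.0.1
>>>>>   NETMASK=255.255.255.0
>>>>>
>>>>> then bring it up with "ifup eth1" or "service network restart".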
>>>>>
>>>>>
>>>>>
>>>>> Use a subnet that doesn't overlap the existing/original IP range.
>>>>>
>>>>> http://en.wikipedia.org/wiki/Private_network
>>>>>
>>>>> You could also create host names associated with those IPs (say
>>>>> node01, node02, node03), resolve them via /etc/hosts on each computer,
>>>>> and set up passwordless ssh across these newly named "hosts".
>>>>> This may be simpler/safer than messing with the iptables.
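>>>>>
>>>>> Something like this in /etc/hosts on every node (names and addresses
>>>>> are only an example, keep them consistent with the ifcfg files):
>>>>>
>>>>>   10.0.0.1   node01
>>>>>   10.0.0.2   node02
>>>>>   10.0.0.3   node03
>>>>>
>>>>> and the passwordless ssh is the usual key exchange, done as the user
>>>>> that will launch mpiexec:
>>>>>
>>>>>   ssh-keygen -t rsa      # accept an empty passphrase
>>>>>   ssh-copy-id node02     # repeat for the other nodes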
>>>>>
>>>>> [Actually, the IP addresses you showed, 192.168.X.Y, already sound
>>>>> like a private subnet, not the Internet, but that may be the subnet
>>>>> your organization/school/department already uses. So, you may set up
>>>>> a different one on these three computers for MPI and very-local access.]
>>>> Yep, maybe in the 10.0.0.0/8 range.
>>>>
>>>> BTW: are you setting up a cluster from scratch - will you also add any
>>>> queuing system later on?
>>>>
>>>>
>>>>> OpenMPI allows you to choose the interface that it will use,
>>>>> so you can direct it to your very-local subnet:
>>>>>
>>>>> http://www.open-mpi.org/faq/?category=tcp#tcp-selection
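>>>>>
>>>>> E.g., something along these lines (interface and host file names are
>>>>> just placeholders):
>>>>>
>>>>>   mpirun --mca btl_tcp_if_include eth1 -np 24 -hostfile hosts ./a.out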
>>>> MPICH:
>>>>
>>>> http://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager#Hydra_with_Non-Ethernet_Networks
>>>>
>>>>
>>>>
>>>> It should work for other Ethernet interfaces too.
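>>>>
>>>> E.g., with Hydra (interface name and hostfile are placeholders):
>>>>
>>>>   mpiexec -iface eth1 -f hosts -n 24 ./a.out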
>>>>
>>>> Nevertheless: in any case it might be easier for the applications to
>>>> use the primary interface solely for MPI, and any other one for
>>>> external access (this is usually my setup, as I don't have to specify
>>>> any interface selection explicitly this way).
>>>>
>>>> -- Reuti
>>>>
>>>>
>>>>
>>>
>



