[mpich-discuss] Problems with running make

Ron Palmer ron.palmer at pgcgroup.com.au
Mon Mar 3 17:57:20 CST 2014


Thanks Gus, I will read up on those queuing systems.

Matlab, not really; I do not know how to use it and I am the only user
(and I could always re-enable the window manager later if required). I
do generate gnuplots in my perl scripts, but I email them to myself as
the inversion progresses. All results get downloaded via ftp to a
workstation where all graphics work (visualisation, interpretation and
fly-throughs) is done. The cluster is really only horsepower, and all
processes and the OS could quite comfortably run in RAM only, with a
token harddrive for intermediate files. If they crash, I just start
them up again; only the last few hours to a few days are lost, nothing
longer term. So many things, so little time...

Yes, not MPI related, so I appropriately apologise to all and sundry!

:-)

On 4/03/2014 09:49, Gus Correa wrote:
> Hi Ron
>
> On 03/03/2014 06:19 PM, Ron Palmer wrote:
>> Reuti, good comment about primary interface for MPI and secondary for
>> the rest of the world.
>>
>> Yes, setting it up from scratch, and I would love a queuing system as
>> well; I just had no idea such a thing existed (learning by the
>> minute!). These computers are dedicated to the parallelisation jobs I
>> am running and will not run anything else. Would you be able to point
>> my nose in the direction of a simple queuing system?
>>
>
> In the open source / free software world:
>
> I use Torque+Maui:
>
> http://www.adaptivecomputing.com/products/open-source/torque/
> http://www.adaptivecomputing.com/products/open-source/maui/
>
> Another very popular queuing system is the Sun Grid Engine (maybe this
> is what Reuti uses?), which I think became the Open Grid Scheduler.
> Better ask somebody else (Reuti maybe) about this one.
>
> http://gridscheduler.sourceforge.net/
>
> Yet a third one is LLNL's Slurm:
>
> https://computing.llnl.gov/linux/slurm/
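>
> A minimal Torque/PBS job script for an MPI run could look roughly like
> the sketch below (job name, core counts and binary are placeholders,
> and it assumes Torque+Maui with an MPICH built around Hydra):
>
>   #!/bin/bash
>   #PBS -N inversion            # job name (placeholder)
>   #PBS -l nodes=3:ppn=8        # 3 nodes, 8 cores each; match your hardware
>   #PBS -l walltime=48:00:00    # wall-clock limit
>   #PBS -j oe                   # merge stdout and stderr
>   cd $PBS_O_WORKDIR            # directory the job was submitted from
>   # Torque writes the allocated hosts to $PBS_NODEFILE; Hydra can use it
>   mpiexec -f $PBS_NODEFILE -n 24 ./my_inversion_binary
>
> You would submit it with "qsub job.pbs" and monitor it with "qstat".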
>
>
>> All my parallel processing is controlled by my own perl scripts and
>> command-line commands (provided binaries), nothing GUI at all (which
>> is kinda nice).
>>
>> Do you think shutting down the X server would make any difference to
>> the performance of these computers? I am pretty sure that they all
>> have a window manager running by default... I only got them going
>> last week so I have not (yet) gone into this level of detail.
>>
>
> If you are sure nobody will ever want to run Matlab,
> or to show a graph with gnuplot, or perhaps more likely, use a 
> graphical debugger such as DDD (or the equivalent from Intel or PGI), 
> you can live without X-windows.
>
> I don't see it as a big burden, though, unless there is somebody doing
> heavy GUI stuff while other computation is going on concurrently.
>
> You could simply start the computers at runlevel 3.
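>
> On a RHEL/CentOS 6 style system that is a one-line change in
> /etc/inittab (a sketch; on newer systemd-based distributions the
> equivalent is "systemctl set-default multi-user.target"):
>
>   # /etc/inittab -- boot to multi-user text mode, no X
>   id:3:initdefault: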
>
> **
>
> OK, this is now more about clusters than about MPICH specifically. :)
>
> I hope this helps,
> Gus Correa
>
>> Thanks,
>> Ron
>>
>> On 4/03/2014 09:06, Reuti wrote:
>>> Am 03.03.2014 um 23:54 schrieb Gus Correa:
>>>
>>>> On 03/03/2014 04:36 PM, Ron Palmer wrote:
>>>>> Thanks Reuti for your comments. I will peruse that FAQ detail.
>>>>>
>>>>> I just thought of the fact that each of these rack computers has 4
>>>>> ethernet sockets, eth0 - eth3... I could connect the cluster on
>>>>> separate ethernet sockets via an extra switch not connected to the
>>>>> internet or any other computers, accept all communication among
>>>>> them, and keep iptables up on the ethX connected to the outside
>>>>> world. I guess I would have to set up routing tables or something.
>>>>> Ah, more reading :-)
>>>>>
>>>>> Thanks for your help.
>>>>> Ron
>>>>>
>>>> Hi Ron
>>>>
>>>> If those extra interfaces are not in use,
>>>> and if you have a spare switch,
>>>> you can set up a separate private subnet exclusively for MPI.
>>>> You need to configure the interfaces consistently (IP, subnet mask,
>>>> perhaps a gateway). Configuring them statically is easy:
>>>>
>>>> https://access.redhat.com/site/documentation//en-US/Red_Hat_Enterprise_Linux/6/html-single/Deployment_Guide/index.html#s2-networkscripts-interfaces-eth0 
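>>>>
>>>> As a sketch, something like the following on each node would pin
>>>> eth1 to a static address on the MPI-only subnet (interface name and
>>>> addresses are only illustrative):
>>>>
>>>>   # /etc/sysconfig/network-scripts/ifcfg-eth1
>>>>   DEVICE=eth1
>>>>   BOOTPROTO=none
>>>>   ONBOOT=yes
>>>>   IPADDR=10.0.0.1          # 10.0.0.2 / 10.0.0.3 on the other nodes
>>>>   NETMASK=255.255.255.0
>>>>   # no GATEWAY: this subnet never leaves the local switch
>>>>
>>>> followed by "service network restart" (or "ifup eth1").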
>>>>
>>>>
>>>>
>>>> Use a subnet that doesn't intersect the existing/original IP range.
>>>>
>>>> http://en.wikipedia.org/wiki/Private_network
>>>>
>>>> You could also create host names associated with those IPs (say
>>>> node01, node02, node03), resolve them via /etc/hosts on each
>>>> computer, and set up passwordless ssh across these newly named
>>>> "hosts". This may be simpler/safer than messing with the iptables.
>>>>
>>>> [Actually, the IP addresses you showed, 192.168.X.Y, sound like a
>>>> private subnet already, not Internet, but that may already be the
>>>> subnet for your organization/school/department. So, you may set up a
>>>> different one on these three computers for MPI and very-local access.]
>>> Yep, maybe in the 10.0.0.0/8 range.
>>>
>>> BTW: are you setting up a cluster from scratch - will you also add any
>>> queuing system later on?
>>>
>>>
>>>> OpenMPI allows you to choose the interface that it will use,
>>>> so you can direct it to your very-local subnet:
>>>>
>>>> http://www.open-mpi.org/faq/?category=tcp#tcp-selection
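>>>>
>>>> For example, something along these lines (with eth1 as the
>>>> hypothetical MPI-only interface):
>>>>
>>>>   mpirun --mca btl_tcp_if_include eth1 -np 24 ./my_binary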
>>> MPICH:
>>>
>>> http://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager#Hydra_with_Non-Ethernet_Networks 
>>>
>>>
>>>
>>> It should work for other Ethernet interfaces too.
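>>>
>>> With Hydra that is the -iface option, e.g. (interface name and
>>> process count are only illustrative):
>>>
>>>   mpiexec -iface eth1 -f hosts -n 24 ./my_binary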
>>>
>>> Nevertheless: in any case it might be easier for the applications to
>>> use the primary interface solely for MPI, and any other one for
>>> external access (this is usually my setup, as I then don't have to
>>> specify the interface assignment explicitly).
>>>
>>> -- Reuti
>>>
>>>
>>>
>>
>> _______________________________________________
>> discuss mailing list discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>



