[mpich-discuss] Hydra torque support issues

Brock Palen brockp at umich.edu
Wed Apr 17 14:10:50 CDT 2013


Bharath,

(Sorry, I'm on the road, so this is quick.)

I do this with hydra+mvapich with Matlab and it does work.

In my case I couldn't find a way, when building mpich/mvapich, to tell hydra where libtorque.so lives.

To resolve this I downloaded hydra on its own and built it with explicit options so that it could find the torque library.  You then get the expected behavior.  I don't have the directions for how I did this handy.
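
For anyone wanting to see why the session id matters here, below is a rough Python sketch (not part of hydra or Torque; the helper name child_session_id is made up for illustration). It assumes a POSIX system and that the resource manager follows a job by the session id of the shell it forked, as described in the quoted report. It starts one child that inherits the parent's session and one that calls setsid() and so ends up in a new session.

```python
import os
import subprocess
import sys


def child_session_id(new_session: bool) -> int:
    """Spawn a short-lived child and return the session id it reports.

    Hypothetical helper for illustration only; requires a POSIX system.
    """
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getsid(0))"],
        start_new_session=new_session,  # True makes the child call setsid()
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


parent_sid = os.getsid(0)
inherited_sid = child_session_id(new_session=False)
detached_sid = child_session_id(new_session=True)

print("parent session id:    ", parent_sid)
print("inherited child sid:  ", inherited_sid)  # same session as the parent
print("setsid() child sid:   ", detached_sid)   # a new, separate session
```

A child in the second situation would be missed by anything that accounts for usage by following the original session, which matches the symptom reported in the quoted thread.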

Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
brockp at umich.edu
(734)936-1985



On Apr 17, 2013, at 1:04 PM, Bharath Ramesh <bramesh at vt.edu> wrote:

> On Thu, Mar 07, 2013 at 04:18:58PM -0600, Pavan Balaji wrote:
>> 
>> This was already present in 1.5, though not enabled by default.
>> However, there were some bug fixes in 3.0, so using the latest version
>> is the best bet.
>> 
>> -- Pavan
>> 
>> On 03/07/2013 03:57 PM US Central Time, Dave Goodell wrote:
>>> On Mar 7, 2013, at 3:51 PM CST, Bharath Ramesh <bramesh at vt.edu> wrote:
>>> 
>>>> I am using mvapich2-1.9a2, which is based on mpich2-1.5. We have
>>>> enabled Torque integration with hydra. We are noticing an issue
>>>> wherein Torque is not tracking the resources used by MPI
>>>> applications when they are built with mvapich2. Further
>>>> investigation revealed that the mvapich2 hydra process launcher was
>>>> not setting the session id to the one Torque used for
>>>> forking the shell.  I am wondering if this is a known issue or
>>>> something that has already been fixed. If it has been fixed, what
>>>> would be the best way to upgrade just hydra without affecting the
>>>> rest of the MPI stack?
>>> 
>>> I can't speak to whether the issue is known or fixed (Pavan will know).  But you can install a different version of Hydra from one of the release tarballs or nightly tarballs:
> 
> Sorry about the delayed response. I installed mvapich2-1.9b, based
> on mpich-3.0.2, and I can say that the issue still exists. To
> illustrate, I am attaching the output of ps axf, with the
> relevant portions showing how the hydra launcher's behavior
> differs from that of OpenMPI, which does the correct thing and
> so allows Torque to track the resources used.
> 
> -- 
> Bharath
> <mpi_ps_axf.out>
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss



