[mpich-discuss] Better SGE integration of MPICH3 (spawning several queues)

Reuti reuti at staff.uni-marburg.de
Tue Sep 17 11:58:38 CDT 2013


The $PE_HOSTFILE is parsed in src/pm/hydra/tools/bootstrap/external/sge_query_node_list.c once MPICH detects that it is running inside an SGE job. Although it is not often used, SGE allows a PE (parallel environment) to span several queues. In that case the $PE_HOSTFILE contains more than one entry for the same host, and each entry is added to the node list separately. As a result, a remote host may receive two or more `qrsh -inherit ...` calls, and even the local machine (where the master task of the MPI application runs) sees several batches of forks, each batch sized by the slot count MPICH found on a single line of the $PE_HOSTFILE.

The correct handling would be to sum all the slots granted on a particular machine, regardless of the queue they belong to.
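To illustrate the summing described above, here is a minimal sketch in C. It is not the Hydra code: the struct, the function name add_hostfile_line and the fixed-size limits are hypothetical, and it assumes the usual $PE_HOSTFILE line layout of "host slots queue processor-range", of which only the first two fields are read.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_HOSTS 64   /* hypothetical limit, for the sketch only */
#define MAX_NAME  256

/* Hypothetical record: one entry per distinct host. */
struct host_slots {
    char name[MAX_NAME];
    int  slots;
};

/* Parse one $PE_HOSTFILE line and add its slot count to the matching
 * host, appending a new entry if the host has not been seen yet.
 * Returns the updated number of distinct hosts, or -1 on error. */
static int add_hostfile_line(struct host_slots *hosts, int nhosts,
                             const char *line)
{
    char name[MAX_NAME];
    int slots;

    assert(hosts != NULL && line != NULL);

    /* Only hostname and slot count matter; the queue name is ignored. */
    if (sscanf(line, "%255s %d", name, &slots) != 2 || slots < 0)
        return -1;

    /* Search all known hosts, not just the most recent one. */
    for (int i = 0; i < nhosts; i++) {
        if (strcmp(hosts[i].name, name) == 0) {
            hosts[i].slots += slots;
            return nhosts;
        }
    }

    if (nhosts >= MAX_HOSTS)
        return -1;

    strcpy(hosts[nhosts].name, name);
    hosts[nhosts].slots = slots;
    return nhosts + 1;
}
```

Feeding it the made-up lines "node01 2 short.q@node01 UNDEFINED", "node02 4 short.q@node02 UNDEFINED" and "node01 3 long.q@node01 UNDEFINED" in that order should leave two entries, with node01 holding 5 slots, so node01 would be contacted only once.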

As far as I can see, the function "HYDU_add_to_node_list" in src/pm/hydra/utils/others/others.c is responsible for this. A duplicate host entry is already handled, but only if it matches the last element of the lines read so far, i.e. when the list of nodes is sorted by host. Please find attached others.c (from 3.1b1), which should handle duplicate entries at any position. It has been tested with SGE only, though.
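The idea of merging duplicates at any position can be sketched as follows. This is not the attached others.c and not Hydra's actual data structure: struct node, add_to_node_list and the helper functions are simplified, hypothetical stand-ins. The point is only that the whole list is walked before a new element is appended, instead of comparing against the tail alone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified node record (not Hydra's real struct). */
struct node {
    char *hostname;
    int core_count;
    struct node *next;
};

/* Add num_procs slots for hostname. The whole list is scanned, so a
 * duplicate is merged wherever it sits, not only at the tail.
 * Returns 0 on success, -1 on allocation failure. */
static int add_to_node_list(struct node **list, const char *hostname,
                            int num_procs)
{
    struct node *cur, *tail = NULL;

    assert(list != NULL && hostname != NULL);

    for (cur = *list; cur != NULL; cur = cur->next) {
        if (strcmp(cur->hostname, hostname) == 0) {
            cur->core_count += num_procs;   /* merge into existing entry */
            return 0;
        }
        tail = cur;
    }

    /* Host not seen before: append a new element at the end. */
    cur = malloc(sizeof(*cur));
    if (cur == NULL)
        return -1;
    cur->hostname = malloc(strlen(hostname) + 1);
    if (cur->hostname == NULL) {
        free(cur);
        return -1;
    }
    strcpy(cur->hostname, hostname);
    cur->core_count = num_procs;
    cur->next = NULL;

    if (tail == NULL)
        *list = cur;
    else
        tail->next = cur;
    return 0;
}

/* Count the distinct hosts in the list. */
static int node_count(const struct node *list)
{
    int n = 0;
    for (; list != NULL; list = list->next)
        n++;
    return n;
}

/* Release all list elements. */
static void free_node_list(struct node *list)
{
    while (list != NULL) {
        struct node *next = list->next;
        free(list->hostname);
        free(list);
        list = next;
    }
}
```

With an unsorted input such as hosts 1 2 3 1 2 1 (one slot each), a tail-only comparison would produce six list elements, whereas the full scan yields three elements with 3, 2 and 1 slots. The scan is linear per insertion, which should be acceptable for typical host counts but is a design trade-off of this sketch.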

-- Reuti

PS: I don't know Torque in every detail, but I seem to remember that, depending on its setup, the machines (as node numbers) are listed in either 1 1 1 2 2 3 or 1 2 3 1 2 1 order. In the latter case, this may also have failed for Torque in the past (or at least resulted in a suboptimal process startup).

-------------- next part --------------
A non-text attachment was scrubbed...
Name: others.c
Type: application/octet-stream
Size: 2013 bytes
Desc: not available
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20130917/d376286b/attachment.obj>
