[mpich-discuss] mpich3 3.0.4 and binding on four socket nodes
Kenneth Raffenetti
raffenet at mcs.anl.gov
Wed May 7 14:31:41 CDT 2014
There were significant improvements to the process binding features in the
MPICH 3.1 release. Can you build that version and see if it fixes the issue?
Ken
On 05/07/2014 02:25 PM, Susan A. Schwarz wrote:
> I am using MPICH 3.0.4 built with the Intel 13.0 compilers on CentOS 6.5. I
> am trying to use the 'bind-to core' and 'map-by core:2' options to
> mpiexec. This runs successfully on the nodes in my cluster that have
> 2 sockets. However, when I run it on a 4-socket node
> with 12 cores per socket, I get the following errors:
>
> [proxy:0:0 at f02] get_nbobjs_by_type
> (./tools/topo/hwloc/topo_hwloc.c:189): assert (nb % x == 0) failed
> [proxy:0:0 at f02] handle_bitmap_binding
> (./tools/topo/hwloc/topo_hwloc.c:450): unable to get number of objects
> [proxy:0:0 at f02] HYDT_topo_hwloc_init
> (./tools/topo/hwloc/topo_hwloc.c:527): error binding with bind "core"
> and map "core:2"
> [proxy:0:0 at f02] HYDT_topo_init (./tools/topo/topo.c:60): unable to
> initialize hwloc
> [proxy:0:0 at f02] launch_procs (./pm/pmiserv/pmip_cb.c:520): unable to
> initialize process topology
> [proxy:0:0 at f02] HYD_pmcd_pmip_control_cmd_cb
> (./pm/pmiserv/pmip_cb.c:893): launch_procs returned error
> [proxy:0:0 at f02] HYDT_dmxu_poll_wait_for_event
> (./tools/demux/demux_poll.c:77): callback returned error status
> [proxy:0:0 at f02] main (./pm/pmiserv/pmip.c:206): demux engine error
> waiting for event
> [mpiexec at f02] control_cb (./pm/pmiserv/pmiserv_cb.c:202): assert
> (!closed) failed
> [mpiexec at f02] HYDT_dmxu_poll_wait_for_event
> (./tools/demux/demux_poll.c:77): callback returned error status
> [mpiexec at f02] HYD_pmci_wait_for_completion
> (./pm/pmiserv/pmiserv_pmci.c:197): error waiting for event
> [mpiexec at f02] main (./ui/mpich/mpiexec.c:331): process manager error
> waiting for completion
>
> Any idea what could be causing this problem?
>
> Susan Schwarz
> Research Computing
> Dartmouth