[mpich-devel] segfault calling neighbor collectives in communicator with no topology
Dave Goodell
goodell at mcs.anl.gov
Thu May 2 10:58:42 CDT 2013
Thanks for letting us know. I've created a ticket to track this and commented on your suggestions there:
https://trac.mpich.org/projects/mpich/ticket/1833#comment:1
-Dave
On Apr 30, 2013, at 2:53 AM CDT, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> I'm adding support for the MPI-3 neighborhood collectives to mpi4py.
> By mistake, I called a neighbor collective on COMM_SELF, and got a
> segfault. After running under valgrind, I get the trace below.
>
> It seems that MPICH (I'm running 3.0.4) does not check whether the
> communicator has a topology attached. This should be fixed in
> MPIR_Topo_canon_nhb_count() in src/mpi/topo/topoutil.c by adding a
> NULL check after the following line:
>
> topo_ptr = MPIR_Topology_get(comm_ptr);
>
> BTW, the same kind of check should also be added to MPIR_Topo_canon_nhb().
>
>
> ==14696== Invalid read of size 4
> ==14696== at 0xDE0ED39: MPIR_Topo_canon_nhb_count (topoutil.c:283)
> ==14696== by 0xDFA870A: MPIR_Ineighbor_allgather_default (inhb_allgather.c:50)
> ==14696== by 0xDFA8B8B: MPIR_Ineighbor_allgather_impl (inhb_allgather.c:98)
> ==14696== by 0xDFAE27C: MPIR_Neighbor_allgather_default (nhb_allgather.c:37)
> ==14696== by 0xDFAE350: MPIR_Neighbor_allgather_impl (nhb_allgather.c:58)
> ==14696== by 0xDFAE918: PMPI_Neighbor_allgather (nhb_allgather.c:155)
> ==14696== by 0xDAD7B77: __pyx_pw_6mpi4py_3MPI_9Intracomm_25Neighbor_allgather (mpi4py.MPI.c:87767)
> ==14696== by 0x31784DD280: PyEval_EvalFrameEx (in /usr/lib64/libpython2.7.so.1.0)
> ==14696== by 0x31784DCEF0: PyEval_EvalFrameEx (in /usr/lib64/libpython2.7.so.1.0)
> ==14696== by 0x31784DDCBE: PyEval_EvalCodeEx (in /usr/lib64/libpython2.7.so.1.0)
> ==14696== by 0x317846DA36: ??? (in /usr/lib64/libpython2.7.so.1.0)
> ==14696== by 0x3178449C0D: PyObject_Call (in /usr/lib64/libpython2.7.so.1.0)
> ==14696== Address 0x0 is not stack'd, malloc'd or (recently) free'd
>
>
> --
> Lisandro Dalcin
> ---------------
> CIMEC (INTEC/CONICET-UNL)
> Predio CONICET-Santa Fe
> Colectora RN 168 Km 472, Paraje El Pozo
> 3000 Santa Fe, Argentina
> Tel: +54-342-4511594 (ext 1011)
> Tel/Fax: +54-342-4511169
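
The check proposed in the quoted message might look roughly like the sketch below. It is a standalone illustration, not MPICH code: every type and name prefixed with sketch_ or SKETCH_ is a hypothetical stand-in, and the real MPIR_Topo_canon_nhb_count() has its own structures and error-reporting conventions (a real fix would presumably report an MPI error class such as MPI_ERR_TOPOLOGY). The only point is that the pointer returned by the topology lookup is tested for NULL before it is dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes; a real fix would use MPICH's own error handling. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_TOPOLOGY = 1 };

/* Hypothetical, simplified topology record (not MPICH's real structure). */
typedef struct {
    int indegree;
    int outdegree;
} sketch_topology;

/* Hypothetical communicator: topo is NULL when no topology is attached,
 * as is the case for a communicator like COMM_SELF. */
typedef struct {
    sketch_topology *topo;
} sketch_comm;

/* Stand-in for the topology lookup; may return NULL. */
static sketch_topology *sketch_topology_get(const sketch_comm *comm_ptr)
{
    return comm_ptr->topo;
}

/* Stand-in for the neighbor-count routine, with the missing check added. */
static int sketch_nhb_count(const sketch_comm *comm_ptr,
                            int *indegree, int *outdegree)
{
    sketch_topology *topo_ptr;

    /* Programming errors in the caller, not user errors. */
    assert(comm_ptr != NULL && indegree != NULL && outdegree != NULL);

    topo_ptr = sketch_topology_get(comm_ptr);

    /* The proposed check: fail cleanly instead of reading through NULL. */
    if (topo_ptr == NULL)
        return SKETCH_ERR_TOPOLOGY;

    *indegree = topo_ptr->indegree;
    *outdegree = topo_ptr->outdegree;
    return SKETCH_SUCCESS;
}
```

With a check of this shape, calling the routine on a communicator without a topology returns an error code and leaves the output arguments untouched, rather than producing the invalid read at address 0x0 shown in the trace.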