[mpich-discuss] MPICH-3.2: SIGSEGV in MPID_Request_create () at src/mpid/ch3/src/ch3u_request.c:101
Halim Amer
aamer at anl.gov
Wed Sep 7 17:25:46 CDT 2016
Mark,
Thanks for the toy program. I could reproduce the problem locally. We
also experienced a similar bug with a PathScale compiler. We are still
working on a fix, though. You can follow the progress and potentially
provide more input in the ticket I created to track this issue
(https://trac.mpich.org/projects/mpich/ticket/2350).
Regarding missing symbols from dynamic libraries when using gdb on Mac
OS, I personally never managed to make it work, not just with MPICH, but
also with other libraries. I consistently do static builds to avoid this
problem. If you are able to see symbols from other libraries, we might
want to look into this as well.
--Halim
www.mcs.anl.gov/~aamer
On 8/17/16 3:29 PM, Mark Davis wrote:
> Hello,
>
> Sure. Please see the attached which is a simple single-threaded test
> that simply loops through many broadcasts in a row. Note that when I
> run this with NPROCS=2 it runs fine, but when I turn up the NPROCS to
> 6, about 50% of the time it seems to hang for a couple seconds, and
> then SEGV (backtrace below). Again, this is a transient issue and
> seems to happen on every MPI program I try, assuming the NPROCS is
> large enough; sometimes it runs fine with no "hanging". Note that I'm
> running on an 8-core macbook pro. (When I run the same application on
> my Linux cluster, I never have this problem.)
>
> Secondly, note that I used the recommendation of compiling with
> --enable-g=most,mem and that did allow me to compile master HEAD.
> However, debug symbols are either not being generated or at least not
> loaded by gdb. I did notice that no libpmpi.12.dylib.dSYM directories
> (with DWARF debug format info) were created when I built master from
> source, despite my --enable-g=most,mem flag. I'm pretty sure when I
> built MPICH 3.2 release it did create these. Is there something else I
> can do during the build to force debug symbols?
>
> Thank you,
> Mark
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x1313 of process 62381]
> MPID_Request_init (req=0x105be2098) at src/mpid/ch3/src/ch3u_request.c:56
> 56 req->dev.ext_hdr_ptr = NULL;
> (gdb) bt full
> #0 MPID_Request_init (req=0x105be2098) at src/mpid/ch3/src/ch3u_request.c:56
> No locals.
> #1 0x000000010036aeea in ?? () from /Users/m/local/lib/libpmpi.12.dylib
> No symbol table info available.
> #2 0x0000000100445550 in ?? () from /Users/m/local/lib/libpmpi.12.dylib
> No symbol table info available.
> #3 0x0000000000000000 in ?? ()
> No symbol table info available.
>
> On Tue, Aug 16, 2016 at 7:32 PM, Halim Amer <aamer at anl.gov> wrote:
>> Can you send us a toy program that reproduces this problem?
>>
>> --Halim
>> www.mcs.anl.gov/~aamer
>>
>>
>> On 8/16/16 6:04 PM, Mark Davis wrote:
>>>
>>> I pulled from master (at d8bb1df from yesterday) again and then
>>> recompiled with the --enable-g=most,mem flag instead of just
>>> --enable-g=most.
>>>
>>> The good news is that the --enable-g=most,mem flag compiled successfully.
>>>
>>> The bad news is two-fold:
>>>
>>> 1. I believe I'm still getting the same SEGV as I was getting before,
>>> related to req->dev.ext_hdr_ptr = NULL; (although it's now
>>> pointing to a different line in src/mpid/ch3/src/ch3u_request.c: line
>>> 56 instead of line 101 as before). I'm not sure if the line is
>>> relevant; some other things may have moved around in that file since
>>> the 3.2 release version.
>>>
>>> 2. I no longer have debugging symbols in my library, so my backtraces
>>> are not helpful. It's possible these two issues are related?
>>>
>>> I did double check that I rebuilt my application from scratch so it
>>> linked in the new library and that the library was indeed rebuilt (by
>>> looking at file creation timestamps).
>>>
>>> Any ideas about these two issues? Thank you.
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> [Switching to Thread 0x1313 of process 62381]
>>> MPID_Request_init (req=0x105be2098) at src/mpid/ch3/src/ch3u_request.c:56
>>> 56 req->dev.ext_hdr_ptr = NULL;
>>> (gdb) bt full
>>> #0 MPID_Request_init (req=0x105be2098) at
>>> src/mpid/ch3/src/ch3u_request.c:56
>>> No locals.
>>> #1 0x000000010036aeea in ?? () from /Users/m/local/lib/libpmpi.12.dylib
>>> No symbol table info available.
>>> #2 0x0000000100445550 in ?? () from /Users/m/local/lib/libpmpi.12.dylib
>>> No symbol table info available.
>>> #3 0x0000000000000000 in ?? ()
>>> No symbol table info available.
>>>
>>> On Mon, Aug 15, 2016 at 5:32 PM, Halim Amer <aamer at anl.gov> wrote:
>>>>
>>>> Good catch! The `most` option implies `mem`, but the root configure failed
>>>> to forward the `mem` option to the MPL software layer. We will push a fix,
>>>> but meanwhile you can specify `--enable-g=most,mem` to get the desired
>>>> behavior.
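To illustrate the failure mode in the build log quoted further down, here is a minimal sketch of the "poisoned name" technique in C. It is not MPICH code: `demo_malloc`, `demo_free`, and `demo_roundtrip` are hypothetical names standing in for the MPL wrappers. Only the shape of the poison macro is taken from the log. The assumption is that the poison is safe when the wrappers are real functions compiled before the poison is defined, and that it fires when a wrapper is itself a macro that expands to the raw name at the call site, which matches the `MPL_free` -> `free` expansion chain clang reports.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrappers standing in for MPL_malloc/MPL_free. They are real
 * functions, so the libc calls inside them are compiled before the raw
 * names are poisoned below. */
static void *demo_malloc(size_t nbytes) { return malloc(nbytes); }
static void demo_free(void *ptr) { free(ptr); }

/* Poison the raw allocator names, in the style of the `free` macro shown in
 * the build log: any later call such as free(p) expands to an invalid
 * expression and stops the build. */
#define malloc(a) 'Error use demo_malloc' :::
#define free(a)   'Error use demo_free' :::

/* A macro wrapper would break from here on, because it expands at the call
 * site, after the poison is active:
 *     #define demo_free_macro(a) free((void *)(a))
 *     demo_free_macro(p);     expands to the poisoned text and fails
 */

/* Allocates room for n doubles through the wrappers, fills the buffer, and
 * returns the number of elements written (or -1 if the last element does
 * not hold the expected value). */
int demo_roundtrip(int n)
{
    assert(n > 0);
    double *buf = demo_malloc((size_t) n * sizeof(double));
    assert(buf != NULL);
    for (int i = 0; i < n; i++)
        buf[i] = (double) i;
    int written = (buf[n - 1] == (double) (n - 1)) ? n : -1;
    demo_free(buf);
    return written;
}
```

Calling `demo_roundtrip(7)` from a `main` compiles and runs because only the function wrappers touch the raw allocator. Uncommenting a macro wrapper like the one in the comment and using it should reproduce the same kind of "multi-character character constant" warnings and "expected ';' after expression" errors seen in the log.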
>>>>
>>>> --Halim
>>>> www.mcs.anl.gov/~aamer
>>>>
>>>>
>>>> On 8/12/16 9:37 PM, Mark Davis wrote:
>>>>>
>>>>>
>>>>> I've tried both git HEAD (3ea7589) as well as the August 1 master
>>>>> snapshot and am having trouble building it; I'm getting the same error
>>>>> in both cases. I've configured with --enable-g=most but otherwise it's
>>>>> all default. I'm running on OSX (Darwin - 15.6.0) and clang 3.8.1.
>>>>>
>>>>> It's erroring on compiling src/mpi/attr/lib_libpmpi_la-attr_delete.lo
>>>>> due to an issue with the macro MPL_free
>>>>>
>>>>> Has anyone seen this before? I'm including the full error trace below:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Making all in .
>>>>> CC src/mpi/attr/lib_libpmpi_la-attr_delete.lo
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:217:
>>>>> ./src/include/mpir_request.h:281:13: warning: multi-character
>>>>> character constant [-Wmultichar]
>>>>> MPL_free(req->u.ureq.greq_fns);
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:217:
>>>>> ./src/include/mpir_request.h:281:13: warning: character constant too
>>>>> long for its type
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:217:
>>>>> ./src/include/mpir_request.h:281:13: error: expected ';' after
>>>>> expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:50: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:217:
>>>>> ./src/include/mpir_request.h:281:13: error: expected expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:51: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:225:
>>>>> In file included from ./src/include/mpir_cvars.h:17:
>>>>> In file included from ./src/include/mpitimpl.h:18:
>>>>> ./src/include/mpir_utarray.h:238:43: warning: multi-character
>>>>> character constant [-Wmultichar]
>>>>> *_dst = (*_src == NULL) ? NULL : (char*)utarray_strdup_(*_src);
>>>>> ^
>>>>> ./src/include/mpir_utarray.h:56:33: note: expanded from macro
>>>>> 'utarray_strdup_'
>>>>> #define utarray_strdup_(x_) MPL_strdup(x_)
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:17:20:
>>>>> note: expanded from macro 'MPL_strdup'
>>>>> #define MPL_strdup strdup
>>>>> ^
>>>>> ./src/include/mpir_mem.h:100:27: note: expanded from macro 'strdup'
>>>>> #define strdup(a) 'Error use MPL_strdup' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:225:
>>>>> In file included from ./src/include/mpir_cvars.h:17:
>>>>> In file included from ./src/include/mpitimpl.h:18:
>>>>> ./src/include/mpir_utarray.h:238:43: warning: character constant too
>>>>> long for its type
>>>>> ./src/include/mpir_utarray.h:56:33: note: expanded from macro
>>>>> 'utarray_strdup_'
>>>>> #define utarray_strdup_(x_) MPL_strdup(x_)
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:17:20:
>>>>> note: expanded from macro 'MPL_strdup'
>>>>> #define MPL_strdup strdup
>>>>> ^
>>>>> ./src/include/mpir_mem.h:100:27: note: expanded from macro 'strdup'
>>>>> #define strdup(a) 'Error use MPL_strdup' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:225:
>>>>> In file included from ./src/include/mpir_cvars.h:17:
>>>>> In file included from ./src/include/mpitimpl.h:18:
>>>>> ./src/include/mpir_utarray.h:238:43: error: expected ';' after
>>>>> expression
>>>>> ./src/include/mpir_utarray.h:56:33: note: expanded from macro
>>>>> 'utarray_strdup_'
>>>>> #define utarray_strdup_(x_) MPL_strdup(x_)
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:17:20:
>>>>> note: expanded from macro 'MPL_strdup'
>>>>> #define MPL_strdup strdup
>>>>> ^
>>>>> ./src/include/mpir_mem.h:100:50: note: expanded from macro 'strdup'
>>>>> #define strdup(a) 'Error use MPL_strdup' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:225:
>>>>> In file included from ./src/include/mpir_cvars.h:17:
>>>>> In file included from ./src/include/mpitimpl.h:18:
>>>>> ./src/include/mpir_utarray.h:238:43: error: expected expression
>>>>> ./src/include/mpir_utarray.h:56:33: note: expanded from macro
>>>>> 'utarray_strdup_'
>>>>> #define utarray_strdup_(x_) MPL_strdup(x_)
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:17:20:
>>>>> note: expanded from macro 'MPL_strdup'
>>>>> #define MPL_strdup strdup
>>>>> ^
>>>>> ./src/include/mpir_mem.h:100:51: note: expanded from macro 'strdup'
>>>>> #define strdup(a) 'Error use MPL_strdup' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:225:
>>>>> In file included from ./src/include/mpir_cvars.h:17:
>>>>> In file included from ./src/include/mpitimpl.h:18:
>>>>> ./src/include/mpir_utarray.h:242:14: warning: multi-character
>>>>> character constant [-Wmultichar]
>>>>> if (*eltc) utarray_free_(*eltc);
>>>>> ^
>>>>> ./src/include/mpir_utarray.h:54:33: note: expanded from macro
>>>>> 'utarray_free_'
>>>>> #define utarray_free_(x_) MPL_free(x_)
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:225:
>>>>> In file included from ./src/include/mpir_cvars.h:17:
>>>>> In file included from ./src/include/mpitimpl.h:18:
>>>>> ./src/include/mpir_utarray.h:242:14: warning: character constant too
>>>>> long for its type
>>>>> ./src/include/mpir_utarray.h:54:33: note: expanded from macro
>>>>> 'utarray_free_'
>>>>> #define utarray_free_(x_) MPL_free(x_)
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:225:
>>>>> In file included from ./src/include/mpir_cvars.h:17:
>>>>> In file included from ./src/include/mpitimpl.h:18:
>>>>> ./src/include/mpir_utarray.h:242:14: error: expected ';' after
>>>>> expression
>>>>> ./src/include/mpir_utarray.h:54:33: note: expanded from macro
>>>>> 'utarray_free_'
>>>>> #define utarray_free_(x_) MPL_free(x_)
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:50: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:225:
>>>>> In file included from ./src/include/mpir_cvars.h:17:
>>>>> In file included from ./src/include/mpitimpl.h:18:
>>>>> ./src/include/mpir_utarray.h:242:14: error: expected expression
>>>>> ./src/include/mpir_utarray.h:54:33: note: expanded from macro
>>>>> 'utarray_free_'
>>>>> #define utarray_free_(x_) MPL_free(x_)
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:51: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:61:29: warning: multi-character
>>>>> character constant [-Wmultichar]
>>>>> nIndirect = (int *) MPL_calloc(objmem->indirect_size,
>>>>> sizeof(int));
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>>>>> note: expanded from macro 'MPL_calloc'
>>>>> #define MPL_calloc(a,b) calloc((size_t)(a),(size_t)(b))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:91:27: note: expanded from macro 'calloc'
>>>>> #define calloc(a,b) 'Error use MPL_calloc' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:61:29: warning: character constant too
>>>>> long for its type
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>>>>> note: expanded from macro 'MPL_calloc'
>>>>> #define MPL_calloc(a,b) calloc((size_t)(a),(size_t)(b))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:91:27: note: expanded from macro 'calloc'
>>>>> #define calloc(a,b) 'Error use MPL_calloc' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:61:29: error: expected ';' after
>>>>> expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>>>>> note: expanded from macro 'MPL_calloc'
>>>>> #define MPL_calloc(a,b) calloc((size_t)(a),(size_t)(b))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:91:50: note: expanded from macro 'calloc'
>>>>> #define calloc(a,b) 'Error use MPL_calloc' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:61:29: error: expected expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>>>>> note: expanded from macro 'MPL_calloc'
>>>>> #define MPL_calloc(a,b) calloc((size_t)(a),(size_t)(b))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:91:51: note: expanded from macro 'calloc'
>>>>> #define calloc(a,b) 'Error use MPL_calloc' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:117:9: warning: multi-character
>>>>> character constant [-Wmultichar]
>>>>> MPL_free(nIndirect);
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:117:9: warning: character constant too
>>>>> long for its type
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:117:9: error: expected ';' after
>>>>> expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:50: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:117:9: error: expected expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:51: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:179:9: warning: multi-character
>>>>> character constant [-Wmultichar]
>>>>> MPL_free((*indirect)[i]);
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:179:9: warning: character constant too
>>>>> long for its type
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:179:9: error: expected ';' after
>>>>> expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:50: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:179:9: error: expected expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:51: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:182:9: warning: multi-character
>>>>> character constant [-Wmultichar]
>>>>> MPL_free(indirect);
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:182:9: warning: character constant too
>>>>> long for its type
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:182:9: error: expected ';' after
>>>>> expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:50: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:182:9: error: expected expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>>>>> note: expanded from macro 'MPL_free'
>>>>> #define MPL_free(a) free((void *)(a))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:92:51: note: expanded from macro 'free'
>>>>> #define free(a) 'Error use MPL_free' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:249:30: warning: multi-character
>>>>> character constant [-Wmultichar]
>>>>> *indirect = (void *) MPL_calloc(indirect_num_blocks, sizeof(void
>>>>> *));
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>>>>> note: expanded from macro 'MPL_calloc'
>>>>> #define MPL_calloc(a,b) calloc((size_t)(a),(size_t)(b))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:91:27: note: expanded from macro 'calloc'
>>>>> #define calloc(a,b) 'Error use MPL_calloc' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:249:30: warning: character constant too
>>>>> long for its type
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>>>>> note: expanded from macro 'MPL_calloc'
>>>>> #define MPL_calloc(a,b) calloc((size_t)(a),(size_t)(b))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:91:27: note: expanded from macro 'calloc'
>>>>> #define calloc(a,b) 'Error use MPL_calloc' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:249:30: error: expected ';' after
>>>>> expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>>>>> note: expanded from macro 'MPL_calloc'
>>>>> #define MPL_calloc(a,b) calloc((size_t)(a),(size_t)(b))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:91:50: note: expanded from macro 'calloc'
>>>>> #define calloc(a,b) 'Error use MPL_calloc' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:249:30: error: expected expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>>>>> note: expanded from macro 'MPL_calloc'
>>>>> #define MPL_calloc(a,b) calloc((size_t)(a),(size_t)(b))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:91:51: note: expanded from macro 'calloc'
>>>>> #define calloc(a,b) 'Error use MPL_calloc' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:264:26: warning: multi-character
>>>>> character constant [-Wmultichar]
>>>>> block_ptr = (void *) MPL_calloc(indirect_num_indices, obj_size);
>>>>> ^
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>>>>> note: expanded from macro 'MPL_calloc'
>>>>> #define MPL_calloc(a,b) calloc((size_t)(a),(size_t)(b))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:91:27: note: expanded from macro 'calloc'
>>>>> #define calloc(a,b) 'Error use MPL_calloc' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:264:26: warning: character constant too
>>>>> long for its type
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>>>>> note: expanded from macro 'MPL_calloc'
>>>>> #define MPL_calloc(a,b) calloc((size_t)(a),(size_t)(b))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:91:27: note: expanded from macro 'calloc'
>>>>> #define calloc(a,b) 'Error use MPL_calloc' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:264:26: error: expected ';' after
>>>>> expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>>>>> note: expanded from macro 'MPL_calloc'
>>>>> #define MPL_calloc(a,b) calloc((size_t)(a),(size_t)(b))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:91:50: note: expanded from macro 'calloc'
>>>>> #define calloc(a,b) 'Error use MPL_calloc' :::
>>>>> ^
>>>>> In file included from src/mpi/attr/attr_delete.c:8:
>>>>> In file included from ./src/include/mpiimpl.h:228:
>>>>> ./src/include/mpir_handlemem.h:264:26: error: expected expression
>>>>>
>>>>>
>>>>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>>>>> note: expanded from macro 'MPL_calloc'
>>>>> #define MPL_calloc(a,b) calloc((size_t)(a),(size_t)(b))
>>>>> ^
>>>>> ./src/include/mpir_mem.h:91:51: note: expanded from macro 'calloc'
>>>>> #define calloc(a,b) 'Error use MPL_calloc' :::
>>>>> ^
>>>>> 18 warnings and 18 errors generated.
>>>>> make[2]: *** [src/mpi/attr/lib_libpmpi_la-attr_delete.lo] Error 1
>>>>> make[1]: *** [all-recursive] Error 1
>>>>> Makefile:10270: recipe for target 'all' failed
>>>>> gmake: *** [all] Error 2
>>>>>
>>>>> On Thu, Aug 11, 2016 at 5:21 PM, Halim Amer <aamer at anl.gov> wrote:
>>>>>>
>>>>>>
>>>>>> This should be related to the alignment problem reported before
>>>>>> (http://lists.mpich.org/pipermail/discuss/2016-May/004764.html).
>>>>>>
>>>>>> We plan to include a fix in the 3.2.x bug fix release series. Meanwhile,
>>>>>> please try the repo version (git.mpich.org/mpich.git), which should not
>>>>>> suffer from this problem.
>>>>>>
>>>>>> --Halim
>>>>>> www.mcs.anl.gov/~aamer
>>>>>>
>>>>>>
>>>>>> On 8/11/16 8:48 AM, Mark Davis wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hello, I'm running into a segfault when I run some relatively simple
>>>>>>> MPI programs. In this particular case, I'm running a small program in
>>>>>>> a loop that does MPI_Bcast, once per loop, within MPI_COMM_WORLD. The
>>>>>>> buffer consists of just 7 doubles. I'm running with 6 procs on a
>>>>>>> machine with 8 cores on OSX (Darwin - 15.6.0 Darwin Kernel Version
>>>>>>> 15.6.0: Thu Jun 23 18:25:34 PDT 2016;
>>>>>>> root:xnu-3248.60.10~1/RELEASE_X86_64 x86_64). When I run the same
>>>>>>> program with a smaller number of procs, the error usually doesn't show
>>>>>>> up. My compiler (both for compiling the MPICH source as well as my
>>>>>>> application) is clang 3.8.1.
>>>>>>>
>>>>>>> When I run the same program on linux, also with MPICH-3.2 (I believe
>>>>>>> the same exact source), compiled with gcc 5.3, I do not get this
>>>>>>> error. This seems to be something I get only with
>>>>>>>
>>>>>>> gdb shows the following stack trace. I have a feeling that this has
>>>>>>> something to do with my toolchain and/or libraries on my system given
>>>>>>> that I never get this error on my other system (linux). However, it's
>>>>>>> possible that there's an application bug as well.
>>>>>>>
>>>>>>> I'm running the MPICH-3.2 stable release; I haven't tried anything
>>>>>>> from the repository yet.
>>>>>>>
>>>>>>> Does anyone have any ideas about what's going on here? I'm happy to
>>>>>>> provide more details.
>>>>>>>
>>>>>>> Thank you,
>>>>>>> Mark
>>>>>>>
>>>>>>>
>>>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>>>> MPID_Request_create () at src/mpid/ch3/src/ch3u_request.c:101
>>>>>>> 101 req->dev.ext_hdr_ptr = NULL;
>>>>>>> (gdb) bt full
>>>>>>> #0 MPID_Request_create () at src/mpid/ch3/src/ch3u_request.c:101
>>>>>>> No locals.
>>>>>>> #1 0x00000001003ac4c9 in MPIDI_CH3U_Recvq_FDP_or_AEU
>>>>>>> (match=<optimized out>, foundp=0x7fff5fbfe2bc) at
>>>>>>> src/mpid/ch3/src/ch3u_recvq.c:830
>>>>>>> proc_failure_bit_masked = <error reading variable
>>>>>>> proc_failure_bit_masked (Cannot access memory at address 0x1)>
>>>>>>> error_bit_masked = <error reading variable error_bit_masked
>>>>>>> (Cannot access memory at address 0x1)>
>>>>>>> prev_rreq = <optimized out>
>>>>>>> channel_matched = <optimized out>
>>>>>>> rreq = <optimized out>
>>>>>>> #2 0x00000001003d1ffe in MPIDI_CH3_PktHandler_EagerSend
>>>>>>> (vc=<optimized out>, pkt=0x1004b3fd8 <MPIU_DBG_MaxLevel>,
>>>>>>> buflen=0x7fff5fbfe440, rreqp=0x7fff5fbfe438) at
>>>>>>> src/mpid/ch3/src/ch3u_eager.c:629
>>>>>>> mpi_errno = <error reading variable mpi_errno (Cannot access
>>>>>>> memory at address 0x0)>
>>>>>>> found = <error reading variable found (Cannot access memory at
>>>>>>> address 0xefefefefefefefef)>
>>>>>>> rreq = <optimized out>
>>>>>>> data_len = <optimized out>
>>>>>>> complete = <optimized out>
>>>>>>> #3 0x00000001003f6045 in MPID_nem_handle_pkt (vc=<optimized out>,
>>>>>>> buf=0x102ad07e0 "", buflen=<optimized out>) at
>>>>>>> src/mpid/ch3/channels/nemesis/src/ch3_progress.c:760
>>>>>>> len = 140734799800192
>>>>>>> mpi_errno = <optimized out>
>>>>>>> complete = <error reading variable complete (Cannot access
>>>>>>> memory at address 0x1)>
>>>>>>> rreq = <optimized out>
>>>>>>> #4 0x00000001003f4e41 in MPIDI_CH3I_Progress
>>>>>>> (progress_state=0x7fff5fbfe750, is_blocking=1) at
>>>>>>> src/mpid/ch3/channels/nemesis/src/ch3_progress.c:570
>>>>>>> payload_len = 4299898840
>>>>>>> cell_buf = <optimized out>
>>>>>>> rreq = <optimized out>
>>>>>>> vc = 0x102ad07e8
>>>>>>> made_progress = <error reading variable made_progress (Cannot
>>>>>>> access memory at address 0x0)>
>>>>>>> mpi_errno = <optimized out>
>>>>>>> #5 0x000000010035386d in MPIC_Wait (request_ptr=<optimized out>,
>>>>>>> errflag=<optimized out>) at src/mpi/coll/helper_fns.c:225
>>>>>>> progress_state = {ch = {completion_count = -1409286143}}
>>>>>>> mpi_errno = <error reading variable mpi_errno (Cannot access
>>>>>>> memory at address 0x0)>
>>>>>>> #6 0x0000000100353b10 in MPIC_Send (buf=0x100917c30,
>>>>>>> count=4299945096, datatype=-1581855963, dest=<optimized out>,
>>>>>>> tag=4975608, comm_ptr=0x1004b3fd8 <MPIU_DBG_MaxLevel>,
>>>>>>> errflag=<optimized out>) at src/mpi/coll/helper_fns.c:302
>>>>>>> mpi_errno = <optimized out>
>>>>>>> request_ptr = 0x1004bf7e0 <MPID_Request_direct+1760>
>>>>>>> #7 0x0000000100246031 in MPIR_Bcast_binomial (buffer=<optimized out>,
>>>>>>> count=<optimized out>, datatype=<optimized out>, root=<optimized out>,
>>>>>>> comm_ptr=<optimized out>, errflag=<optimized out>) at
>>>>>>> src/mpi/coll/bcast.c:280
>>>>>>> nbytes = <optimized out>
>>>>>>> mpi_errno_ret = <optimized out>
>>>>>>> mpi_errno = 0
>>>>>>> comm_size = <optimized out>
>>>>>>> rank = 2
>>>>>>> type_size = <optimized out>
>>>>>>> tmp_buf = 0x0
>>>>>>> position = <optimized out>
>>>>>>> relative_rank = <optimized out>
>>>>>>> mask = <optimized out>
>>>>>>> src = <optimized out>
>>>>>>> status = <optimized out>
>>>>>>> recvd_size = <optimized out>
>>>>>>> dst = <optimized out>
>>>>>>> #8 0x00000001002455a3 in MPIR_SMP_Bcast (buffer=<optimized out>,
>>>>>>> count=<optimized out>, datatype=<optimized out>, root=<optimized out>,
>>>>>>> comm_ptr=<optimized out>, errflag=<optimized out>) at
>>>>>>> src/mpi/coll/bcast.c:1087
>>>>>>> mpi_errno_ = <error reading variable mpi_errno_ (Cannot access
>>>>>>> memory at address 0x0)>
>>>>>>> mpi_errno = <optimized out>
>>>>>>> mpi_errno_ret = <optimized out>
>>>>>>> nbytes = <optimized out>
>>>>>>> type_size = <optimized out>
>>>>>>> status = <optimized out>
>>>>>>> recvd_size = <optimized out>
>>>>>>> #9 MPIR_Bcast_intra (buffer=0x100917c30, count=<optimized out>,
>>>>>>> datatype=<optimized out>, root=1, comm_ptr=<optimized out>,
>>>>>>> errflag=<optimized out>) at src/mpi/coll/bcast.c:1245
>>>>>>> nbytes = <optimized out>
>>>>>>> mpi_errno_ret = <error reading variable mpi_errno_ret (Cannot
>>>>>>> access memory at address 0x0)>
>>>>>>> mpi_errno = <optimized out>
>>>>>>> type_size = <optimized out>
>>>>>>> comm_size = <optimized out>
>>>>>>> #10 0x000000010024751e in MPIR_Bcast (buffer=<optimized out>,
>>>>>>> count=<optimized out>, datatype=<optimized out>, root=<optimized out>,
>>>>>>> comm_ptr=0x0, errflag=<optimized out>) at src/mpi/coll/bcast.c:1475
>>>>>>> mpi_errno = <optimized out>
>>>>>>> #11 MPIR_Bcast_impl (buffer=0x1004bf7e0 <MPID_Request_direct+1760>,
>>>>>>> count=-269488145, datatype=-16, root=0, comm_ptr=0x0,
>>>>>>> errflag=0x1004bf100 <MPID_Request_direct>) at
>>>>>>> src/mpi/coll/bcast.c:1451
>>>>>>> mpi_errno = <optimized out>
>>>>>>> #12 0x00000001000f3c24 in MPI_Bcast (buffer=<optimized out>, count=7,
>>>>>>> datatype=1275069445, root=1, comm=<optimized out>) at
>>>>>>> src/mpi/coll/bcast.c:1585
>>>>>>> errflag = 2885681152
>>>>>>> mpi_errno = <optimized out>
>>>>>>> comm_ptr = <optimized out>
>>>>>>> #13 0x0000000100001df7 in run_test<int> (my_rank=2,
>>>>>>> num_ranks=<optimized out>, count=<optimized out>, root_rank=1,
>>>>>>> datatype=@0x7fff5fbfeaec: 1275069445, iterations=<optimized out>) at
>>>>>>> bcast_test.cpp:83
>>>>>>> No locals.
>>>>>>> #14 0x00000001000019cd in main (argc=<optimized out>, argv=<optimized
>>>>>>> out>) at bcast_test.cpp:137
>>>>>>> root_rank = <optimized out>
>>>>>>> count = <optimized out>
>>>>>>> iterations = <optimized out>
>>>>>>> my_rank = 4978656
>>>>>>> num_errors = <optimized out>
>>>>>>> runtime_ns = <optimized out>
>>>>>>> stats = {<std::__1::__basic_string_common<true>> = {<No data
>>>>>>> fields>}, __r_ =
>>>>>>> {<std::__1::__libcpp_compressed_pair_imp<std::__1::basic_string<char,
>>>>>>> std::__1::char_traits<char>, std::__1::allocator<char> >::__rep,
>>>>>>> std::__1::allocator<char>, 2>> = {<std::__1::allocator<char>> = {<No
>>>>>>> data fields>}, __first_ = {{__l = {__cap_ = 17289301308300324847,
>>>>>>> __size_ = 17289301308300324847, __data_ = 0xefefefefefefefef <error:
>>>>>>> Cannot access memory at address 0xefefefefefefefef>}
>>>>>>> _______________________________________________
>>>>>>> discuss mailing list discuss at mpich.org
>>>>>>> To manage subscription options or unsubscribe:
>>>>>>> https://lists.mpich.org/mailman/listinfo/discuss
>>>>>>>