[mpich-discuss] MPICH-3.2: SIGSEGV in MPID_Request_create () at src/mpid/ch3/src/ch3u_request.c:101

Mark Davis markdavisinboston at gmail.com
Tue Aug 16 18:04:01 CDT 2016


I pulled from master (at d8bb1df from yesterday) again and then
recompiled with the --enable-g=most,mem flag instead of just
--enable-g=most.

The good news is that the build configured with --enable-g=most,mem
compiled successfully.

The bad news is two-fold:

1. I believe I'm still getting the same SEGV as before, at the statement
req->dev.ext_hdr_ptr       = NULL; although it now points to a different
line in src/mpid/ch3/src/ch3u_request.c (line 56 instead of line 101, and
in MPID_Request_init rather than MPID_Request_create). I'm not sure the
line number is relevant; other things may have moved around in that file
since the 3.2 release.

2. I no longer have debugging symbols in my library, so my backtraces
are not helpful. It's possible these two issues are related?

I did double check that I rebuilt my application from scratch so it
linked in the new library and that the library was indeed rebuilt (by
looking at file creation timestamps).

Any ideas about these two issues? Thank you.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x1313 of process 62381]
MPID_Request_init (req=0x105be2098) at src/mpid/ch3/src/ch3u_request.c:56
56          req->dev.ext_hdr_ptr       = NULL;
(gdb) bt full
#0  MPID_Request_init (req=0x105be2098) at src/mpid/ch3/src/ch3u_request.c:56
No locals.
#1  0x000000010036aeea in ?? () from /Users/m/local/lib/libpmpi.12.dylib
No symbol table info available.
#2  0x0000000100445550 in ?? () from /Users/m/local/lib/libpmpi.12.dylib
No symbol table info available.
#3  0x0000000000000000 in ?? ()
No symbol table info available.

On Mon, Aug 15, 2016 at 5:32 PM, Halim Amer <aamer at anl.gov> wrote:
> Good catch! The `most` option implies `mem`, but the root configure failed
> to forward the `mem` option to the MPL software layer. We will push a fix,
> but meanwhile you can specify `--enable-g=most,mem` to get the desired
> behavior.
>
> --Halim
> www.mcs.anl.gov/~aamer
>
>
> On 8/12/16 9:37 PM, Mark Davis wrote:
>>
>> I've tried both git HEAD (3ea7589) as well as the August 1 master
>> snapshot and am having trouble building it; I'm getting the same error
>> in both cases. I've configured with --enable-g=most but otherwise it's
>> all default. I'm running on OSX (Darwin - 15.6.0) and clang 3.8.1.
>>
>> The build fails while compiling src/mpi/attr/lib_libpmpi_la-attr_delete.lo
>> because of a problem with the MPL_free macro.
>>
>> Has anyone seen this before? I'm including the full error trace below:
>>
>>
>>
>>
>>
>> Making all in .
>>   CC       src/mpi/attr/lib_libpmpi_la-attr_delete.lo
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:217:
>> ./src/include/mpir_request.h:281:13: warning: multi-character
>> character constant [-Wmultichar]
>>             MPL_free(req->u.ureq.greq_fns);
>>             ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:217:
>> ./src/include/mpir_request.h:281:13: warning: character constant too
>> long for its type
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:217:
>> ./src/include/mpir_request.h:281:13: error: expected ';' after expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:50: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                                                  ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:217:
>> ./src/include/mpir_request.h:281:13: error: expected expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:51: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                                                   ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:225:
>> In file included from ./src/include/mpir_cvars.h:17:
>> In file included from ./src/include/mpitimpl.h:18:
>> ./src/include/mpir_utarray.h:238:43: warning: multi-character
>> character constant [-Wmultichar]
>>   *_dst = (*_src == NULL) ? NULL : (char*)utarray_strdup_(*_src);
>>                                           ^
>> ./src/include/mpir_utarray.h:56:33: note: expanded from macro
>> 'utarray_strdup_'
>> #define utarray_strdup_(x_)     MPL_strdup(x_)
>>                                 ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:17:20:
>> note: expanded from macro 'MPL_strdup'
>> #define MPL_strdup strdup
>>                    ^
>> ./src/include/mpir_mem.h:100:27: note: expanded from macro 'strdup'
>> #define strdup(a)         'Error use MPL_strdup' :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:225:
>> In file included from ./src/include/mpir_cvars.h:17:
>> In file included from ./src/include/mpitimpl.h:18:
>> ./src/include/mpir_utarray.h:238:43: warning: character constant too
>> long for its type
>> ./src/include/mpir_utarray.h:56:33: note: expanded from macro
>> 'utarray_strdup_'
>> #define utarray_strdup_(x_)     MPL_strdup(x_)
>>                                 ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:17:20:
>> note: expanded from macro 'MPL_strdup'
>> #define MPL_strdup strdup
>>                    ^
>> ./src/include/mpir_mem.h:100:27: note: expanded from macro 'strdup'
>> #define strdup(a)         'Error use MPL_strdup' :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:225:
>> In file included from ./src/include/mpir_cvars.h:17:
>> In file included from ./src/include/mpitimpl.h:18:
>> ./src/include/mpir_utarray.h:238:43: error: expected ';' after expression
>> ./src/include/mpir_utarray.h:56:33: note: expanded from macro
>> 'utarray_strdup_'
>> #define utarray_strdup_(x_)     MPL_strdup(x_)
>>                                 ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:17:20:
>> note: expanded from macro 'MPL_strdup'
>> #define MPL_strdup strdup
>>                    ^
>> ./src/include/mpir_mem.h:100:50: note: expanded from macro 'strdup'
>> #define strdup(a)         'Error use MPL_strdup' :::
>>                                                  ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:225:
>> In file included from ./src/include/mpir_cvars.h:17:
>> In file included from ./src/include/mpitimpl.h:18:
>> ./src/include/mpir_utarray.h:238:43: error: expected expression
>> ./src/include/mpir_utarray.h:56:33: note: expanded from macro
>> 'utarray_strdup_'
>> #define utarray_strdup_(x_)     MPL_strdup(x_)
>>                                 ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:17:20:
>> note: expanded from macro 'MPL_strdup'
>> #define MPL_strdup strdup
>>                    ^
>> ./src/include/mpir_mem.h:100:51: note: expanded from macro 'strdup'
>> #define strdup(a)         'Error use MPL_strdup' :::
>>                                                   ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:225:
>> In file included from ./src/include/mpir_cvars.h:17:
>> In file included from ./src/include/mpitimpl.h:18:
>> ./src/include/mpir_utarray.h:242:14: warning: multi-character
>> character constant [-Wmultichar]
>>   if (*eltc) utarray_free_(*eltc);
>>              ^
>> ./src/include/mpir_utarray.h:54:33: note: expanded from macro
>> 'utarray_free_'
>> #define utarray_free_(x_)       MPL_free(x_)
>>                                 ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:225:
>> In file included from ./src/include/mpir_cvars.h:17:
>> In file included from ./src/include/mpitimpl.h:18:
>> ./src/include/mpir_utarray.h:242:14: warning: character constant too
>> long for its type
>> ./src/include/mpir_utarray.h:54:33: note: expanded from macro
>> 'utarray_free_'
>> #define utarray_free_(x_)       MPL_free(x_)
>>                                 ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:225:
>> In file included from ./src/include/mpir_cvars.h:17:
>> In file included from ./src/include/mpitimpl.h:18:
>> ./src/include/mpir_utarray.h:242:14: error: expected ';' after expression
>> ./src/include/mpir_utarray.h:54:33: note: expanded from macro
>> 'utarray_free_'
>> #define utarray_free_(x_)       MPL_free(x_)
>>                                 ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:50: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                                                  ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:225:
>> In file included from ./src/include/mpir_cvars.h:17:
>> In file included from ./src/include/mpitimpl.h:18:
>> ./src/include/mpir_utarray.h:242:14: error: expected expression
>> ./src/include/mpir_utarray.h:54:33: note: expanded from macro
>> 'utarray_free_'
>> #define utarray_free_(x_)       MPL_free(x_)
>>                                 ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:51: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                                                   ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:61:29: warning: multi-character
>> character constant [-Wmultichar]
>>         nIndirect = (int *) MPL_calloc(objmem->indirect_size,
>> sizeof(int));
>>                             ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>> note: expanded from macro 'MPL_calloc'
>> #define MPL_calloc(a,b)  calloc((size_t)(a),(size_t)(b))
>>                          ^
>> ./src/include/mpir_mem.h:91:27: note: expanded from macro 'calloc'
>> #define calloc(a,b)       'Error use MPL_calloc' :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:61:29: warning: character constant too
>> long for its type
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>> note: expanded from macro 'MPL_calloc'
>> #define MPL_calloc(a,b)  calloc((size_t)(a),(size_t)(b))
>>                          ^
>> ./src/include/mpir_mem.h:91:27: note: expanded from macro 'calloc'
>> #define calloc(a,b)       'Error use MPL_calloc' :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:61:29: error: expected ';' after expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>> note: expanded from macro 'MPL_calloc'
>> #define MPL_calloc(a,b)  calloc((size_t)(a),(size_t)(b))
>>                          ^
>> ./src/include/mpir_mem.h:91:50: note: expanded from macro 'calloc'
>> #define calloc(a,b)       'Error use MPL_calloc' :::
>>                                                  ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:61:29: error: expected expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>> note: expanded from macro 'MPL_calloc'
>> #define MPL_calloc(a,b)  calloc((size_t)(a),(size_t)(b))
>>                          ^
>> ./src/include/mpir_mem.h:91:51: note: expanded from macro 'calloc'
>> #define calloc(a,b)       'Error use MPL_calloc' :::
>>                                                   ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:117:9: warning: multi-character
>> character constant [-Wmultichar]
>>         MPL_free(nIndirect);
>>         ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:117:9: warning: character constant too
>> long for its type
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:117:9: error: expected ';' after expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:50: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                                                  ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:117:9: error: expected expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:51: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                                                   ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:179:9: warning: multi-character
>> character constant [-Wmultichar]
>>         MPL_free((*indirect)[i]);
>>         ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:179:9: warning: character constant too
>> long for its type
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:179:9: error: expected ';' after expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:50: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                                                  ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:179:9: error: expected expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:51: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                                                   ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:182:9: warning: multi-character
>> character constant [-Wmultichar]
>>         MPL_free(indirect);
>>         ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:182:9: warning: character constant too
>> long for its type
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:27: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:182:9: error: expected ';' after expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:50: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                                                  ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:182:9: error: expected expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:110:26:
>> note: expanded from macro 'MPL_free'
>> #define MPL_free(a)      free((void *)(a))
>>                          ^
>> ./src/include/mpir_mem.h:92:51: note: expanded from macro 'free'
>> #define free(a)           'Error use MPL_free'   :::
>>                                                   ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:249:30: warning: multi-character
>> character constant [-Wmultichar]
>>         *indirect = (void *) MPL_calloc(indirect_num_blocks, sizeof(void
>> *));
>>                              ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>> note: expanded from macro 'MPL_calloc'
>> #define MPL_calloc(a,b)  calloc((size_t)(a),(size_t)(b))
>>                          ^
>> ./src/include/mpir_mem.h:91:27: note: expanded from macro 'calloc'
>> #define calloc(a,b)       'Error use MPL_calloc' :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:249:30: warning: character constant too
>> long for its type
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>> note: expanded from macro 'MPL_calloc'
>> #define MPL_calloc(a,b)  calloc((size_t)(a),(size_t)(b))
>>                          ^
>> ./src/include/mpir_mem.h:91:27: note: expanded from macro 'calloc'
>> #define calloc(a,b)       'Error use MPL_calloc' :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:249:30: error: expected ';' after
>> expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>> note: expanded from macro 'MPL_calloc'
>> #define MPL_calloc(a,b)  calloc((size_t)(a),(size_t)(b))
>>                          ^
>> ./src/include/mpir_mem.h:91:50: note: expanded from macro 'calloc'
>> #define calloc(a,b)       'Error use MPL_calloc' :::
>>                                                  ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:249:30: error: expected expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>> note: expanded from macro 'MPL_calloc'
>> #define MPL_calloc(a,b)  calloc((size_t)(a),(size_t)(b))
>>                          ^
>> ./src/include/mpir_mem.h:91:51: note: expanded from macro 'calloc'
>> #define calloc(a,b)       'Error use MPL_calloc' :::
>>                                                   ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:264:26: warning: multi-character
>> character constant [-Wmultichar]
>>     block_ptr = (void *) MPL_calloc(indirect_num_indices, obj_size);
>>                          ^
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>> note: expanded from macro 'MPL_calloc'
>> #define MPL_calloc(a,b)  calloc((size_t)(a),(size_t)(b))
>>                          ^
>> ./src/include/mpir_mem.h:91:27: note: expanded from macro 'calloc'
>> #define calloc(a,b)       'Error use MPL_calloc' :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:264:26: warning: character constant too
>> long for its type
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>> note: expanded from macro 'MPL_calloc'
>> #define MPL_calloc(a,b)  calloc((size_t)(a),(size_t)(b))
>>                          ^
>> ./src/include/mpir_mem.h:91:27: note: expanded from macro 'calloc'
>> #define calloc(a,b)       'Error use MPL_calloc' :::
>>                           ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:264:26: error: expected ';' after
>> expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>> note: expanded from macro 'MPL_calloc'
>> #define MPL_calloc(a,b)  calloc((size_t)(a),(size_t)(b))
>>                          ^
>> ./src/include/mpir_mem.h:91:50: note: expanded from macro 'calloc'
>> #define calloc(a,b)       'Error use MPL_calloc' :::
>>                                                  ^
>> In file included from src/mpi/attr/attr_delete.c:8:
>> In file included from ./src/include/mpiimpl.h:228:
>> ./src/include/mpir_handlemem.h:264:26: error: expected expression
>>
>> /Users/m/local/src/mpich-master-v3.2-370-g0d6412303488/src/mpl/include/mpl_trmem.h:109:26:
>> note: expanded from macro 'MPL_calloc'
>> #define MPL_calloc(a,b)  calloc((size_t)(a),(size_t)(b))
>>                          ^
>> ./src/include/mpir_mem.h:91:51: note: expanded from macro 'calloc'
>> #define calloc(a,b)       'Error use MPL_calloc' :::
>>                                                   ^
>> 18 warnings and 18 errors generated.
>> make[2]: *** [src/mpi/attr/lib_libpmpi_la-attr_delete.lo] Error 1
>> make[1]: *** [all-recursive] Error 1
>> Makefile:10270: recipe for target 'all' failed
>> gmake: *** [all] Error 2
>>
>> On Thu, Aug 11, 2016 at 5:21 PM, Halim Amer <aamer at anl.gov> wrote:
>>>
>>> This should be related to the alignment problem reported before
>>> (http://lists.mpich.org/pipermail/discuss/2016-May/004764.html).
>>>
>>> We plan to include a fix in the 3.2.x bug fix release series. Meanwhile,
>>> please try the repo version (git.mpich.org/mpich.git), which should not
>>> suffer from this problem.
>>>
>>> --Halim
>>> www.mcs.anl.gov/~aamer
>>>
>>>
>>> On 8/11/16 8:48 AM, Mark Davis wrote:
>>>>
>>>>
>>>> Hello, I'm running into a segfault when I run some relatively simple
>>>> MPI programs. In this particular case, I'm running a small program in
>>>> a loop that does MPI_Bcast, once per loop, within MPI_COMM_WORLD. The
>>>> buffer consists of just 7 doubles. I'm running with 6 procs on a
>>>> machine with 8 cores on OSX (Darwin - 15.6.0 Darwin Kernel Version
>>>> 15.6.0: Thu Jun 23 18:25:34 PDT 2016;
>>>> root:xnu-3248.60.10~1/RELEASE_X86_64 x86_64). When I run the same
>>>> program with a smaller number of procs, the error usually doesn't show
>>>> up. My compiler (both for compiling the MPICH source as well as my
>>>> application) is clang 3.8.1.
>>>>
>>>> When I run the same program on linux, also with MPICH-3.2 (I believe
>>>> the same exact source), compiled with gcc 5.3, I do not get this
>>>> error. This seems to be something I get only with
>>>>
>>>> gdb shows the following stack trace. I have a feeling that this has
>>>> something to do with my toolchain and/or libraries on my system given
>>>> that I never get this error on my other system (linux). However, it's
>>>> possible that there's an application bug as well.
>>>>
>>>> I'm running the MPICH-3.2 stable release; I haven't tried anything
>>>> from the repository yet.
>>>>
>>>> Does anyone have any ideas about what's going on here? I'm happy to
>>>> provide more details.
>>>>
>>>> Thank you,
>>>> Mark
>>>>
>>>>
>>>> Program received signal SIGSEGV, Segmentation fault.
>>>> MPID_Request_create () at src/mpid/ch3/src/ch3u_request.c:101
>>>> 101             req->dev.ext_hdr_ptr       = NULL;
>>>> (gdb) bt full
>>>> #0  MPID_Request_create () at src/mpid/ch3/src/ch3u_request.c:101
>>>> No locals.
>>>> #1  0x00000001003ac4c9 in MPIDI_CH3U_Recvq_FDP_or_AEU
>>>> (match=<optimized out>, foundp=0x7fff5fbfe2bc) at
>>>> src/mpid/ch3/src/ch3u_recvq.c:830
>>>>         proc_failure_bit_masked = <error reading variable
>>>> proc_failure_bit_masked (Cannot access memory at address 0x1)>
>>>>         error_bit_masked = <error reading variable error_bit_masked
>>>> (Cannot access memory at address 0x1)>
>>>>         prev_rreq = <optimized out>
>>>>         channel_matched = <optimized out>
>>>>         rreq = <optimized out>
>>>> #2  0x00000001003d1ffe in MPIDI_CH3_PktHandler_EagerSend
>>>> (vc=<optimized out>, pkt=0x1004b3fd8 <MPIU_DBG_MaxLevel>,
>>>> buflen=0x7fff5fbfe440, rreqp=0x7fff5fbfe438) at
>>>> src/mpid/ch3/src/ch3u_eager.c:629
>>>>         mpi_errno = <error reading variable mpi_errno (Cannot access
>>>> memory at address 0x0)>
>>>>         found = <error reading variable found (Cannot access memory at
>>>> address 0xefefefefefefefef)>
>>>>         rreq = <optimized out>
>>>>         data_len = <optimized out>
>>>>         complete = <optimized out>
>>>> #3  0x00000001003f6045 in MPID_nem_handle_pkt (vc=<optimized out>,
>>>> buf=0x102ad07e0 "", buflen=<optimized out>) at
>>>> src/mpid/ch3/channels/nemesis/src/ch3_progress.c:760
>>>>         len = 140734799800192
>>>>         mpi_errno = <optimized out>
>>>>         complete = <error reading variable complete (Cannot access
>>>> memory at address 0x1)>
>>>>         rreq = <optimized out>
>>>> #4  0x00000001003f4e41 in MPIDI_CH3I_Progress
>>>> (progress_state=0x7fff5fbfe750, is_blocking=1) at
>>>> src/mpid/ch3/channels/nemesis/src/ch3_progress.c:570
>>>>         payload_len = 4299898840
>>>>         cell_buf = <optimized out>
>>>>         rreq = <optimized out>
>>>>         vc = 0x102ad07e8
>>>>         made_progress = <error reading variable made_progress (Cannot
>>>> access memory at address 0x0)>
>>>>         mpi_errno = <optimized out>
>>>> #5  0x000000010035386d in MPIC_Wait (request_ptr=<optimized out>,
>>>> errflag=<optimized out>) at src/mpi/coll/helper_fns.c:225
>>>>         progress_state = {ch = {completion_count = -1409286143}}
>>>>         mpi_errno = <error reading variable mpi_errno (Cannot access
>>>> memory at address 0x0)>
>>>> #6  0x0000000100353b10 in MPIC_Send (buf=0x100917c30,
>>>> count=4299945096, datatype=-1581855963, dest=<optimized out>,
>>>> tag=4975608, comm_ptr=0x1004b3fd8 <MPIU_DBG_MaxLevel>,
>>>> errflag=<optimized out>) at src/mpi/coll/helper_fns.c:302
>>>>         mpi_errno = <optimized out>
>>>>         request_ptr = 0x1004bf7e0 <MPID_Request_direct+1760>
>>>> #7  0x0000000100246031 in MPIR_Bcast_binomial (buffer=<optimized out>,
>>>> count=<optimized out>, datatype=<optimized out>, root=<optimized out>,
>>>> comm_ptr=<optimized out>, errflag=<optimized out>) at
>>>> src/mpi/coll/bcast.c:280
>>>>         nbytes = <optimized out>
>>>>         mpi_errno_ret = <optimized out>
>>>>         mpi_errno = 0
>>>>         comm_size = <optimized out>
>>>>         rank = 2
>>>>         type_size = <optimized out>
>>>>         tmp_buf = 0x0
>>>>         position = <optimized out>
>>>>         relative_rank = <optimized out>
>>>>         mask = <optimized out>
>>>>         src = <optimized out>
>>>>         status = <optimized out>
>>>>         recvd_size = <optimized out>
>>>>         dst = <optimized out>
>>>> #8  0x00000001002455a3 in MPIR_SMP_Bcast (buffer=<optimized out>,
>>>> count=<optimized out>, datatype=<optimized out>, root=<optimized out>,
>>>> comm_ptr=<optimized out>, errflag=<optimized out>) at
>>>> src/mpi/coll/bcast.c:1087
>>>>         mpi_errno_ = <error reading variable mpi_errno_ (Cannot access
>>>> memory at address 0x0)>
>>>>         mpi_errno = <optimized out>
>>>>         mpi_errno_ret = <optimized out>
>>>>         nbytes = <optimized out>
>>>>         type_size = <optimized out>
>>>>         status = <optimized out>
>>>>         recvd_size = <optimized out>
>>>> #9  MPIR_Bcast_intra (buffer=0x100917c30, count=<optimized out>,
>>>> datatype=<optimized out>, root=1, comm_ptr=<optimized out>,
>>>> errflag=<optimized out>) at src/mpi/coll/bcast.c:1245
>>>>         nbytes = <optimized out>
>>>>         mpi_errno_ret = <error reading variable mpi_errno_ret (Cannot
>>>> access memory at address 0x0)>
>>>>         mpi_errno = <optimized out>
>>>>         type_size = <optimized out>
>>>>         comm_size = <optimized out>
>>>> #10 0x000000010024751e in MPIR_Bcast (buffer=<optimized out>,
>>>> count=<optimized out>, datatype=<optimized out>, root=<optimized out>,
>>>> comm_ptr=0x0, errflag=<optimized out>) at src/mpi/coll/bcast.c:1475
>>>>         mpi_errno = <optimized out>
>>>> #11 MPIR_Bcast_impl (buffer=0x1004bf7e0 <MPID_Request_direct+1760>,
>>>> count=-269488145, datatype=-16, root=0, comm_ptr=0x0,
>>>> errflag=0x1004bf100 <MPID_Request_direct>) at
>>>> src/mpi/coll/bcast.c:1451
>>>>         mpi_errno = <optimized out>
>>>> #12 0x00000001000f3c24 in MPI_Bcast (buffer=<optimized out>, count=7,
>>>> datatype=1275069445, root=1, comm=<optimized out>) at
>>>> src/mpi/coll/bcast.c:1585
>>>>         errflag = 2885681152
>>>>         mpi_errno = <optimized out>
>>>>         comm_ptr = <optimized out>
>>>> #13 0x0000000100001df7 in run_test<int> (my_rank=2,
>>>> num_ranks=<optimized out>, count=<optimized out>, root_rank=1,
>>>> datatype=@0x7fff5fbfeaec: 1275069445, iterations=<optimized out>) at
>>>> bcast_test.cpp:83
>>>> No locals.
>>>> #14 0x00000001000019cd in main (argc=<optimized out>, argv=<optimized
>>>> out>) at bcast_test.cpp:137
>>>>         root_rank = <optimized out>
>>>>         count = <optimized out>
>>>>         iterations = <optimized out>
>>>>         my_rank = 4978656
>>>>         num_errors = <optimized out>
>>>>         runtime_ns = <optimized out>
>>>>         stats = {<std::__1::__basic_string_common<true>> = {<No data
>>>> fields>}, __r_ =
>>>> {<std::__1::__libcpp_compressed_pair_imp<std::__1::basic_string<char,
>>>> std::__1::char_traits<char>, std::__1::allocator<char> >::__rep,
>>>> std::__1::allocator<char>, 2>> = {<std::__1::allocator<char>> = {<No
>>>> data fields>}, __first_ = {{__l = {__cap_ = 17289301308300324847,
>>>> __size_ = 17289301308300324847, __data_ = 0xefefefefefefefef <error:
>>>> Cannot access memory at address 0xefefefefefefefef>}
>>>> _______________________________________________
>>>> discuss mailing list     discuss at mpich.org
>>>> To manage subscription options or unsubscribe:
>>>> https://lists.mpich.org/mailman/listinfo/discuss
>>>>

