[mpich-devel] lots of windows -> lots of comms -> fail

Jeff Hammond jeff.science at gmail.com
Thu Jun 4 14:50:58 CDT 2015


Why does MPICH need to dup the comm for every window?  I guess
performing a collective in MPI_WIN_FENCE is one reason.  Does
MPI_WIN_FREE need an internal comm?  Are there other reasons?

If I were to provide an info key saying that I only use passive-target
synchronization, would you still need to dup the comm?  If there is any
way to avoid one comm per window, I would like to use it.

And yes, I am indeed asserting that 2048 windows is not enough.

Thanks,

Jeff


jrhammon-mac01:ticket459 jrhammon$ mpiexec -n 2 ./test_win_sync.x 10000
Fatal error in MPI_Win_allocate_shared: Other MPI error, error stack:
MPI_Win_allocate_shared(162).........: MPI_Win_allocate_shared(size=4, MPI_INFO_NULL, comm=0x84000004, baseptr=0x7fff53c163c0, win=0x7fff53c0a780) failed
MPID_Win_allocate_shared(248)........:
win_init(294)........................:
MPIR_Comm_dup_impl(57)...............:
MPIR_Comm_copy(1773).................:
MPIR_Get_contextid(829)..............:
MPIR_Get_contextid_sparse_group(1203): Too many communicators (0/2048 free on this process; ignore_id=0)

Fatal error in MPI_Win_allocate_shared: Other MPI error, error stack:
MPI_Win_allocate_shared(162).........: MPI_Win_allocate_shared(size=4, MPI_INFO_NULL, comm=0x84000002, baseptr=0x7fff572853c0, win=0x7fff57279780) failed
MPID_Win_allocate_shared(248)........:
win_init(294)........................:
MPIR_Comm_dup_impl(57)...............:
MPIR_Comm_copy(1773).................:
MPIR_Get_contextid(829)..............:
MPIR_Get_contextid_sparse_group(1203): Too many communicators (0/2048 free on this process; ignore_id=0)


-- 
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/

