[mpich-devel] Win questions
Bob Cernohous
bobc at us.ibm.com
Mon Jun 10 09:23:28 CDT 2013
> Active target separates exposure and access epochs. It is erroneous
> for a process to engage in more than one exposure epoch or more than
> one access epoch per window. So it's ok for many processes to
> expose (post/wait) to me concurrently, but for a given window I can
> access only one (start/complete) at a time.
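>
> As a bare sketch (names made up): on a given window I can have many
> targets in one access epoch, but not two access epochs open at once:
>
> /* ok: one access epoch on win that reaches both exposing processes */
> MPI_Win_start(grp_both, 0, win);  /* grp_both = {a, b}, both have posted */
> MPI_Put(&i, 1, MPI_INT, rank_a, 0, 1, MPI_INT, win);
> MPI_Put(&i, 1, MPI_INT, rank_b, 0, 1, MPI_INT, win);
> MPI_Win_complete(win);
>
> /* erroneous: a second access epoch opened on the same win before the
>    first one completes -- overlapping access epochs on one window */
> MPI_Win_start(grp_a, 0, win);
> MPI_Win_start(grp_b, 0, win);
>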
> Your example code looks ok to me. Are you getting an error message?
Yeah, it's definitely a PAMID problem. As I said, Rank 0 confuses
Rank 4 while Rank 4 is still working on its LR group (5/7). I just wanted to verify the
example was ok and that I wasn't trying to fix something too odd.
> ~Jim.
> On Jun 7, 2013 9:11 AM, "Bob Cernohous" <bobc at us.ibm.com> wrote:
> I haven't worked on RMA before but was working on a problem and ran
> into this **comment in MPI_Win_post:
>
> "Starts an RMA exposure epoch for the local window associated with
> win. **Only the processes belonging to group should access the
> window with RMA calls on win during this epoch. Each process in
> group must issue a matching call to MPI_Win_start. MPI_Win_post does
> not block."
>
> Would overlapping epochs be violating the ** line? I decided I
> probably need to support this but I wondered if it's bending or
> breaking the 'rules'?
>
> The problem (code at the bottom of this email) is using a cartesian
> communicator and alternating 'left/right' accumulates with 'up/down'
> accumulates on a single win. So:
>
> - Ranks 0,1,2,3 are doing a left/right accumulate.
> - Ranks 4,5,6,7 are doing a left/right accumulate.
> - ...
>
> and then sometimes...
>
> - Ranks 0,1,2,3 complete and enter the 'up/down' accumulate epoch
> -- Rank 0 does MPI_Win_post to ranks 4,12
> -- Rank 1 does MPI_Win_post to ranks 5,13
> ...
>
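> For reference, the neighbor ranks and groups used in the code further
> down get built roughly like this (sketch only; assumes a periodic 4x4
> cartesian communicator comm_cart, which matches the ranks above, and a
> window over the single int i):
>
> enum { LEFT = 0, RIGHT = 1, UP = 0, DOWN = 1 };
> int ranks_lr[2], ranks_ud[2];
>
> /* dim 1 runs within a row (left/right), dim 0 across rows (up/down) */
> MPI_Cart_shift(comm_cart, 1, 1, &ranks_lr[LEFT], &ranks_lr[RIGHT]);
> MPI_Cart_shift(comm_cart, 0, 1, &ranks_ud[UP],   &ranks_ud[DOWN]);
>
> MPI_Group grp_world, grp_lr, grp_ud;
> MPI_Comm_group(comm_cart, &grp_world);
> MPI_Group_incl(grp_world, 2, ranks_lr, &grp_lr);
> MPI_Group_incl(grp_world, 2, ranks_ud, &grp_ud);
>
> MPI_Win win;
> MPI_Win_create(&i, (MPI_Aint)sizeof(int), sizeof(int), MPI_INFO_NULL,
>                comm_cart, &win);
>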
> So is Rank 0 posting to Rank 4 while 4 is still in the epoch with 5/
> 6/7 a violation of "Only the processes belonging to group should
> access the window with RMA calls on win during this epoch"? From
> Rank 4's point of view, rank 0 isn't in the group for the current
> win/epoch.
>
> Putting a barrier (or something) in between or using two different
> wins fixes it. I like using two wins since it separates the
> epochs and clearly doesn't use the wrong group/rank on the win; a
> sketch of that two-win variant follows the original code below.
>
> /* RMA transfers in left-right direction */
> MPI_Win_post(grp_lr, 0, win);
> MPI_Win_start(grp_lr, 0, win);
> MPI_Accumulate(&i, 1, MPI_INT, ranks_lr[LEFT] , 0, 1, MPI_INT,
> MPI_SUM, win);
> MPI_Accumulate(&i, 1, MPI_INT, ranks_lr[RIGHT], 0, 1, MPI_INT,
> MPI_SUM, win);
> MPI_Win_complete(win);
> MPI_Win_wait(win);
>
> /* RMA transfers in up-down direction */
> MPI_Win_post(grp_ud, 0, win);
> MPI_Win_start(grp_ud, 0, win);
> MPI_Accumulate(&i, 1, MPI_INT, ranks_ud[UP] , 0, 1, MPI_INT,
> MPI_SUM, win);
> MPI_Accumulate(&i, 1, MPI_INT, ranks_ud[DOWN], 0, 1, MPI_INT,
> MPI_SUM, win);
> MPI_Win_complete(win);
> MPI_Win_wait(win);
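>
> The two-win variant I mean looks roughly like this (sketch only;
> win_lr and win_ud are two windows created with MPI_Win_create over
> the same buffer, one per direction):
>
> /* RMA transfers in left-right direction, on its own window */
> MPI_Win_post(grp_lr, 0, win_lr);
> MPI_Win_start(grp_lr, 0, win_lr);
> MPI_Accumulate(&i, 1, MPI_INT, ranks_lr[LEFT] , 0, 1, MPI_INT,
> MPI_SUM, win_lr);
> MPI_Accumulate(&i, 1, MPI_INT, ranks_lr[RIGHT], 0, 1, MPI_INT,
> MPI_SUM, win_lr);
> MPI_Win_complete(win_lr);
> MPI_Win_wait(win_lr);
>
> /* RMA transfers in up-down direction, on the second window, so this
>    post can never land on the window a neighbor is still using for
>    its left-right epoch */
> MPI_Win_post(grp_ud, 0, win_ud);
> MPI_Win_start(grp_ud, 0, win_ud);
> MPI_Accumulate(&i, 1, MPI_INT, ranks_ud[UP]  , 0, 1, MPI_INT,
> MPI_SUM, win_ud);
> MPI_Accumulate(&i, 1, MPI_INT, ranks_ud[DOWN], 0, 1, MPI_INT,
> MPI_SUM, win_ud);
> MPI_Win_complete(win_ud);
> MPI_Win_wait(win_ud);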