[mpich-devel] progress in flushall

Jeff Hammond jeff.science at gmail.com
Tue Apr 29 22:20:19 CDT 2014


Yeah, I know that's what MPICH gives me.  My point is that this is
undesirable.  All sync operations should make progress, no matter
what.
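
To spell out the behavior I am asking for, here is a toy model in plain C.
It is only a sketch, not MPICH code: toy_win, toy_flush and the other
toy_* names are hypothetical stand-ins, and the model assumes that
completing queued ops at a remote target implies one progress poke.  The
first flush-all skips every target with an empty op queue, as in the
excerpt quoted below; the second adds a single unconditional poke when the
loop did not poke on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, none of it is MPICH's. */
#define TOY_MAX_TARGETS 8

typedef struct {
    int nranks;                        /* number of targets in the window */
    int my_rank;                       /* rank of the calling process */
    int pending_ops[TOY_MAX_TARGETS];  /* queued RMA ops per target */
    int progress_pokes;                /* times the progress engine was poked */
} toy_win;

static void toy_progress_poke(toy_win *win)
{
    win->progress_pokes++;
}

/* Per-target flush: the local target only pokes progress; a remote target
 * completes its queued ops, which this model counts as one poke. */
static void toy_flush(toy_win *win, int rank)
{
    if (rank == win->my_rank) {
        toy_progress_poke(win);
        return;
    }
    if (win->pending_ops[rank] == 0)
        return;
    win->pending_ops[rank] = 0;
    toy_progress_poke(win);
}

/* Mirrors the quoted loop: targets with an empty op queue are skipped, so a
 * process that issued no RMA ops never reaches a progress poke. */
static void toy_flush_all_as_quoted(toy_win *win)
{
    assert(win != NULL && win->nranks <= TOY_MAX_TARGETS);
    for (int i = 0; i < win->nranks; i++) {
        if (win->pending_ops[i] == 0)
            continue;
        toy_flush(win, i);
    }
}

/* Requested behavior: same loop, plus one poke if none happened in it. */
static void toy_flush_all_always_poke(toy_win *win)
{
    int before = win->progress_pokes;
    toy_flush_all_as_quoted(win);
    if (win->progress_pokes == before)
        toy_progress_poke(win);
}
```

With no queued ops on any target, toy_flush_all_as_quoted leaves
progress_pokes unchanged, while toy_flush_all_always_poke increments it
exactly once.  When ops are queued, both behave the same.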

Best,

Jeff

On Tue, Apr 29, 2014 at 10:17 PM, Balaji, Pavan <balaji at anl.gov> wrote:
> Jeff,
>
> If there are no local RMA operations, Win_flush will essentially be a no-op, so no, it will not make any progress.
>
>   — Pavan
>
> On Apr 29, 2014, at 12:18 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
>
>> win_flush_all mentions poking the progress engine for async progress
>> but it won't actually do this if a given process has not initiated any
>> RMA operations.  This defeats the purpose of using win_flush_all to
>> drive progress locally.
>>
>> Maybe I am misreading the code, but I'd sleep better at night if I were
>> absolutely certain that win_flush_all poked the progress engine at least once.
>>
>> Thanks,
>>
>> Jeff
>>
>> (the following excerpts are easily found in src/mpid/ch3/src/ch3u_rma_sync.c)
>>
>> int MPIDI_Win_flush_all(MPID_Win *win_ptr)
>> {
>>    int mpi_errno = MPI_SUCCESS;
>>    int i;
>>    MPIDI_STATE_DECL(MPIDI_STATE_MPIDI_WIN_FLUSH_ALL);
>>
>>    MPIDI_RMA_FUNC_ENTER(MPIDI_STATE_MPIDI_WIN_FLUSH_ALL);
>>
>>    MPIU_ERR_CHKANDJUMP(win_ptr->epoch_state != MPIDI_EPOCH_LOCK &&
>>                        win_ptr->epoch_state != MPIDI_EPOCH_LOCK_ALL,
>>                        mpi_errno, MPI_ERR_RMA_SYNC, "**rmasync");
>>
>>    /* FIXME: Performance -- we should not process the ops separately.
>>     * Ideally, we should be able to use the same infrastructure that's used by
>>     * active target to complete all operations. */
>>
>>    /* Note: Local RMA calls don't poke the progress engine.  This routine
>>     * should poke the progress engine when the local target is flushed to help
>>     * make asynchronous progress.  Currently this is handled by Win_flush().
>>     */
>>    for (i = 0; i < MPIR_Comm_size(win_ptr->comm_ptr); i++) {
>>        if (MPIDI_CH3I_RMA_Ops_head(&win_ptr->targets[i].rma_ops_list) == NULL)
>>            continue;
>>        if (win_ptr->targets[i].remote_lock_state != MPIDI_CH3_WIN_LOCK_NONE) {
>>            mpi_errno = win_ptr->RMAFns.Win_flush(i, win_ptr);
>>            if (mpi_errno != MPI_SUCCESS) { MPIU_ERR_POP(mpi_errno); }
>>        }
>>    }
>>
>>
>> int MPIDI_Win_flush(int rank, MPID_Win *win_ptr)
>> {
>>    int mpi_errno = MPI_SUCCESS;
>>    int wait_for_rma_done_pkt = 0;
>>    MPIDI_RMA_Op_t *rma_op;
>>    MPIDI_STATE_DECL(MPID_STATE_MPIDI_WIN_FLUSH);
>>
>>    MPIDI_RMA_FUNC_ENTER(MPID_STATE_MPIDI_WIN_FLUSH);
>>
>>    MPIU_ERR_CHKANDJUMP(win_ptr->epoch_state != MPIDI_EPOCH_LOCK &&
>>                        win_ptr->epoch_state != MPIDI_EPOCH_LOCK_ALL,
>>                        mpi_errno, MPI_ERR_RMA_SYNC, "**rmasync");
>>
>>    /* Check if win_lock was called */
>>    MPIU_ERR_CHKANDJUMP(win_ptr->targets[rank].remote_lock_state == MPIDI_CH3_WIN_LOCK_NONE,
>>                        mpi_errno, MPI_ERR_RMA_SYNC, "**rmasync");
>>
>>    /* Local flush: ops are performed immediately on the local process */
>>    if (rank == win_ptr->comm_ptr->rank) {
>>        MPIU_Assert(win_ptr->targets[rank].remote_lock_state == MPIDI_CH3_WIN_LOCK_GRANTED);
>>        MPIU_Assert(MPIDI_CH3I_RMA_Ops_isempty(&win_ptr->targets[rank].rma_ops_list));
>>
>>        /* If flush is used as a part of polling for incoming data, we can
>>         * deadlock, since local RMA calls never poke the progress engine.  So,
>>         * make extra progress here to avoid this problem. */
>>        mpi_errno = MPIDI_CH3_Progress_poke();
>>        if (mpi_errno) MPIU_ERR_POP(mpi_errno);
>>        goto fn_exit;
>>    }
>>
>>    /* NOTE: All flush and req-based operations are currently implemented in
>>       terms of MPIDI_Win_flush.  When this changes, those operations will also
>>       need to insert this read/write memory fence for shared memory windows. */
>>
>>    if (win_ptr->shm_allocated == TRUE) {
>>        OPA_read_write_barrier();
>>    }
>>
>>    rma_op = MPIDI_CH3I_RMA_Ops_head(&win_ptr->targets[rank].rma_ops_list);
>>
>>    /* If there is no activity at this target (e.g. lock-all was called, but we
>>     * haven't communicated with this target), don't do anything. */
>>    if (win_ptr->targets[rank].remote_lock_state == MPIDI_CH3_WIN_LOCK_CALLED
>>        && rma_op == NULL)
>>    {
>>        goto fn_exit;
>>    }
>>
>> --
>> Jeff Hammond
>> jeff.science at gmail.com
>> _______________________________________________
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/devel



-- 
Jeff Hammond
jeff.science at gmail.com

