[mpich-commits] [mpich] MPICH primary repository branch, master, updated. v3.2b4-97-g6f020c7

Service Account noreply at mpich.org
Mon Aug 10 20:25:26 CDT 2015


This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "MPICH primary repository".

The branch, master has been updated
       via  6f020c71b5f6349b3b616280ba9c24ed7988de40 (commit)
       via  7da26cd82a0826c4581b4e357a66eb1cbee6c8bc (commit)
       via  e204ed26d4c07e5b6f6b14f0f32b1a47c6cc72e9 (commit)
       via  9c17714a7fe3b4233cc50962e71c6ca7527d521b (commit)
      from  40b1483fb31d9207644783a2574b08149dfa594b (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
http://git.mpich.org/mpich.git/commitdiff/6f020c71b5f6349b3b616280ba9c24ed7988de40

commit 6f020c71b5f6349b3b616280ba9c24ed7988de40
Author: Xin Zhao <xinzhao3 at illinois.edu>
Date:   Mon Aug 10 16:46:19 2015 -0500

    Increase time limit of some GACC-based RMA tests to 4 min.
    
    In 40b1483fb31d9207644783a2574b08149dfa594b, we reduced the value
    of MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD to 64K, based on measurements
    with a one-to-all benchmark involving PUT operations on MXM. This affects
    the performance of some GACC-based tests, so here we increase the time
    limit of those tests to 4 min.
    
    The default value of MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD is just
    a reference value derived from the one-to-all PUT tests. In a real
    application, users should tune the value for their particular
    operations and communication patterns.
    
    Signed-off-by: Pavan Balaji <balaji at anl.gov>

diff --git a/test/mpi/rma/testlist.in b/test/mpi/rma/testlist.in
index c824afd..de7b025 100644
--- a/test/mpi/rma/testlist.in
+++ b/test/mpi/rma/testlist.in
@@ -32,10 +32,10 @@ lock_dt 2
 lock_dt_flush 2
 lock_dt_flushlocal 2
 lockall_dt 4
-lockall_dt_flush 4
-lockall_dt_flushall 4
-lockall_dt_flushlocal 4
-lockall_dt_flushlocalall 4
+lockall_dt_flush 4 timeLimit=240
+lockall_dt_flushall 4 timeLimit=240
+lockall_dt_flushlocal 4 timeLimit=240
+lockall_dt_flushlocalall 4 timeLimit=240
 lock_contention_dt 4
 transpose4 2
 fetchandadd 7

http://git.mpich.org/mpich.git/commitdiff/7da26cd82a0826c4581b4e357a66eb1cbee6c8bc

commit 7da26cd82a0826c4581b4e357a66eb1cbee6c8bc
Author: Xin Zhao <xinzhao3 at illinois.edu>
Date:   Mon Aug 10 14:47:53 2015 -0500

    Bug-fix on RMA working with asynchronous thread.
    
    In RMA operation routines, when the total number of active internal
    requests hits the upper bound, we let the process poke the progress
    engine in a while loop until the number of active requests is reduced.
    However, when ASYNC_PROGRESS is on, this can deadlock the main thread:
    because poking the progress engine is a nonblocking call, it does not
    yield the CPU to the async thread. If the async thread has already
    started receiving the last packet before the main thread starts poking
    the progress engine, the main thread will not receive any more packets,
    so the while loop runs forever. Here we fix this issue by replacing
    the nonblocking progress engine call with a blocking progress engine call.
    
    Signed-off-by: Pavan Balaji <balaji at anl.gov>
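
The reasoning in the log message above can be illustrated with a small
standalone model. This is only a sketch under stated assumptions, not MPICH
code: it assumes POSIX threads, and every name in it (progress_lock,
progress_made, active_req_cnt, async_progress_thread, throttle_blocking,
run_model) is hypothetical. A condition-variable wait stands in for the
blocking progress call; it releases the lock while sleeping, so the helper
thread that completes requests can always run.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for progress-engine state; not MPICH internals. */
static pthread_mutex_t progress_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t progress_made = PTHREAD_COND_INITIALIZER;
static int active_req_cnt = 0;

/* Models the async thread: completes n requests, signalling after each. */
static void *async_progress_thread(void *arg)
{
    int n = *(const int *) arg;
    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&progress_lock);
        active_req_cnt--;
        assert(active_req_cnt >= 0);    /* counter must never go negative */
        pthread_cond_broadcast(&progress_made);
        pthread_mutex_unlock(&progress_lock);
    }
    return NULL;
}

/* Models the throttle loop with a blocking wait: the caller sleeps, with
 * the lock released, until the count drops below the threshold.
 * Returns the count observed when the loop exits. */
static int throttle_blocking(int threshold)
{
    int seen;
    pthread_mutex_lock(&progress_lock);
    while (active_req_cnt >= threshold)
        pthread_cond_wait(&progress_made, &progress_lock);
    seen = active_req_cnt;
    pthread_mutex_unlock(&progress_lock);
    return seen;
}

/* Starts with `issued` outstanding requests, lets the helper thread
 * complete all of them, and throttles the caller against `threshold`.
 * Returns the count seen right after throttling, or -1 on bad input or
 * thread-creation failure. */
int run_model(int issued, int threshold)
{
    pthread_t tid;
    int seen;

    if (issued < 0 || threshold <= 0)
        return -1;
    active_req_cnt = issued;    /* safe: no other thread exists yet */
    if (pthread_create(&tid, NULL, async_progress_thread, &issued) != 0)
        return -1;
    seen = throttle_blocking(threshold);
    pthread_join(tid, NULL);
    return seen;
}
```

In this model, run_model(100, 64) returns a count below 64, because the
waiting thread is woken each time the helper completes a request. A loop that
only polled without ever giving the helper a chance to run is the failure
mode the commit describes.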

diff --git a/src/mpid/ch3/src/ch3u_rma_ops.c b/src/mpid/ch3/src/ch3u_rma_ops.c
index 91df74b..95f153f 100644
--- a/src/mpid/ch3/src/ch3u_rma_ops.c
+++ b/src/mpid/ch3/src/ch3u_rma_ops.c
@@ -194,7 +194,7 @@ int MPIDI_CH3I_Put(const void *origin_addr, int origin_count, MPI_Datatype
         if (MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD >= 0 &&
             MPIDI_CH3I_RMA_Active_req_cnt >= MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) {
             while (MPIDI_CH3I_RMA_Active_req_cnt >= MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) {
-                mpi_errno = poke_progress_engine();
+                mpi_errno = wait_progress_engine();
                 if (mpi_errno != MPI_SUCCESS)
                     MPIU_ERR_POP(mpi_errno);
             }
@@ -363,7 +363,7 @@ int MPIDI_CH3I_Get(void *origin_addr, int origin_count, MPI_Datatype
         if (MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD >= 0 &&
             MPIDI_CH3I_RMA_Active_req_cnt >= MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) {
             while (MPIDI_CH3I_RMA_Active_req_cnt >= MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) {
-                mpi_errno = poke_progress_engine();
+                mpi_errno = wait_progress_engine();
                 if (mpi_errno != MPI_SUCCESS)
                     MPIU_ERR_POP(mpi_errno);
             }
@@ -574,7 +574,7 @@ int MPIDI_CH3I_Accumulate(const void *origin_addr, int origin_count, MPI_Datatyp
         if (MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD >= 0 &&
             MPIDI_CH3I_RMA_Active_req_cnt >= MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) {
             while (MPIDI_CH3I_RMA_Active_req_cnt >= MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) {
-                mpi_errno = poke_progress_engine();
+                mpi_errno = wait_progress_engine();
                 if (mpi_errno != MPI_SUCCESS)
                     MPIU_ERR_POP(mpi_errno);
             }
@@ -826,7 +826,7 @@ int MPIDI_CH3I_Get_accumulate(const void *origin_addr, int origin_count,
         if (MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD >= 0 &&
             MPIDI_CH3I_RMA_Active_req_cnt >= MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) {
             while (MPIDI_CH3I_RMA_Active_req_cnt >= MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) {
-                mpi_errno = poke_progress_engine();
+                mpi_errno = wait_progress_engine();
                 if (mpi_errno != MPI_SUCCESS)
                     MPIU_ERR_POP(mpi_errno);
             }
@@ -1073,7 +1073,7 @@ int MPID_Compare_and_swap(const void *origin_addr, const void *compare_addr,
         if (MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD >= 0 &&
             MPIDI_CH3I_RMA_Active_req_cnt >= MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) {
             while (MPIDI_CH3I_RMA_Active_req_cnt >= MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) {
-                mpi_errno = poke_progress_engine();
+                mpi_errno = wait_progress_engine();
                 if (mpi_errno != MPI_SUCCESS)
                     MPIU_ERR_POP(mpi_errno);
             }
@@ -1215,7 +1215,7 @@ int MPID_Fetch_and_op(const void *origin_addr, void *result_addr,
         if (MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD >= 0 &&
             MPIDI_CH3I_RMA_Active_req_cnt >= MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) {
             while (MPIDI_CH3I_RMA_Active_req_cnt >= MPIR_CVAR_CH3_RMA_ACTIVE_REQ_THRESHOLD) {
-                mpi_errno = poke_progress_engine();
+                mpi_errno = wait_progress_engine();
                 if (mpi_errno != MPI_SUCCESS)
                     MPIU_ERR_POP(mpi_errno);
             }

http://git.mpich.org/mpich.git/commitdiff/e204ed26d4c07e5b6f6b14f0f32b1a47c6cc72e9

commit e204ed26d4c07e5b6f6b14f0f32b1a47c6cc72e9
Author: Xin Zhao <xinzhao3 at illinois.edu>
Date:   Mon Aug 10 14:40:59 2015 -0500

    Add an assert to check MPIDI_CH3I_RMA_Active_req_cnt.
    
    Signed-off-by: Pavan Balaji <balaji at anl.gov>

diff --git a/src/mpid/ch3/src/ch3u_handle_op_req.c b/src/mpid/ch3/src/ch3u_handle_op_req.c
index 5b8af99..6e6283f 100644
--- a/src/mpid/ch3/src/ch3u_handle_op_req.c
+++ b/src/mpid/ch3/src/ch3u_handle_op_req.c
@@ -29,6 +29,7 @@ int MPIDI_CH3_Req_handler_rma_op_complete(MPID_Request * sreq)
     MPID_Win_get_ptr(sreq->dev.source_win_handle, win_ptr);
     MPIU_Assert(win_ptr != NULL);
     MPIDI_CH3I_RMA_Active_req_cnt--;
+    MPIU_Assert(MPIDI_CH3I_RMA_Active_req_cnt >= 0);
 
     if (sreq->dev.request_handle != MPI_REQUEST_NULL) {
         /* get user request */

http://git.mpich.org/mpich.git/commitdiff/9c17714a7fe3b4233cc50962e71c6ca7527d521b

commit 9c17714a7fe3b4233cc50962e71c6ca7527d521b
Author: Xin Zhao <xinzhao3 at illinois.edu>
Date:   Mon Aug 10 14:06:17 2015 -0500

    Delete redundant function call.
    
    Signed-off-by: Pavan Balaji <balaji at anl.gov>

diff --git a/src/mpid/ch3/include/mpid_rma_oplist.h b/src/mpid/ch3/include/mpid_rma_oplist.h
index 4355d9d..bc598ea 100644
--- a/src/mpid/ch3/include/mpid_rma_oplist.h
+++ b/src/mpid/ch3/include/mpid_rma_oplist.h
@@ -517,12 +517,6 @@ static inline int MPIDI_CH3I_Win_get_op(MPID_Win * win_ptr, MPIDI_RMA_Op_t ** e)
         if (new_ptr != NULL)
             break;
 
-        MPIR_T_PVAR_TIMER_START(RMA, rma_rmaqueue_alloc);
-        new_ptr = MPIDI_CH3I_Win_op_alloc(win_ptr);
-        MPIR_T_PVAR_TIMER_END(RMA, rma_rmaqueue_alloc);
-        if (new_ptr != NULL)
-            break;
-
         mpi_errno = MPIDI_CH3I_RMA_Cleanup_ops_aggressive(win_ptr);
         if (mpi_errno != MPI_SUCCESS)
             MPIU_ERR_POP(mpi_errno);

-----------------------------------------------------------------------

Summary of changes:
 src/mpid/ch3/include/mpid_rma_oplist.h |    6 ------
 src/mpid/ch3/src/ch3u_handle_op_req.c  |    1 +
 src/mpid/ch3/src/ch3u_rma_ops.c        |   12 ++++++------
 test/mpi/rma/testlist.in               |    8 ++++----
 4 files changed, 11 insertions(+), 16 deletions(-)


hooks/post-receive
-- 
MPICH primary repository

