[mpich-discuss] How to use non-blocking send/receive without calling MPI_Wait

Lei Shi lshi at ku.edu
Tue Apr 7 02:39:14 CDT 2015


On Tue, Apr 7, 2015 at 2:37 AM, Lei Shi <leishi at ku.edu> wrote:

> Hi Huiwei and Jeff,
>
> I use hybrid OpenMP/MPI to overlap communication with computation: all
> communication goes into one dedicated OpenMP thread and the computation
> runs in the other thread. For this case I'm using the Intel MPI library,
> so perhaps I made some mistakes there.
>
> One version of my code, which uses a dedicated thread for messaging, is
> shown below:
>
> /* hybrid mpi/openmp overlap */
> template<typename T>
> void CPR_NS_3D_Solver<T>::UpdateRes(T **q, T **res)
> {
>   int thread_id, n_thread;
>   int sol_rev_flag = 0, grad_rev_flag = 0;
>
>   // Explicitly disable dynamic teams
>   omp_set_dynamic(0);
>   // Use 2 threads for all consecutive parallel regions
>   omp_set_num_threads(2);
>
>   #pragma omp parallel default(shared) private(thread_id)
>   {
>     thread_id = omp_get_thread_num();
>     n_thread  = omp_get_num_threads();
>
>     /** communication thread **/
>     if (thread_id == 1) {
>       SendInterfaceSol();
>       RevInterfaceSol();
>       #pragma omp flush
>       sol_rev_flag = 1;
>       #pragma omp flush(sol_rev_flag)
>     }
>
>     /** computation thread **/
>     if (thread_id == 0) {
>       ResFromDivInvisFlux(q, res);   // local computation
>
>       #pragma omp flush(sol_rev_flag)
>       while (sol_rev_flag != 1) {
>         #pragma omp flush(sol_rev_flag)
>       }
>       #pragma omp flush
>
>       ResFromFluxCorrection(q, res); // depends on interface sol
>     }
>   } // end of omp parallel
> }
>
> template<typename T>
> void CPR_NS_3D_Solver<T>::SendInterfaceSol()
> {
>   uint *n_if_to_proc = this->grid_->num_iface_proc;
>   uint **if_to_proc  = this->grid_->snd_iface_proc;
>   uint **rev_if_to_f = this->grid_->rev_iface_proc;
>
>   int tag = 52;
>   for (int p2 = 0; p2 < _n_proc; ++p2) {
>     if (p2 != _proc_id) {
>       int nif = n_if_to_proc[p2];
>       // pack data to send ....
>     }
>   }
>
>   /** Exchange interface sol **/
>   int n_proc_exchange = 0;
>   for (int z = 0; z < _n_proc; ++z) {
>     int nif = n_if_to_proc[z];
>
>     // send and receive data
>     if (nif > 0) {
>       MPI_Isend(&snd_buf_[z][0], n_buf_[z], MPI_DOUBLE, z, tag,
>                 MPI_COMM_WORLD, &s_sol_req_[n_proc_exchange]);
>       MPI_Irecv(&rev_buf_[z][0], n_buf_[z], MPI_DOUBLE, z, tag,
>                 MPI_COMM_WORLD, &r_sol_req_[n_proc_exchange]);
>       n_proc_exchange++;
>     }
>   }
> }
>
> template<typename T>
> void CPR_NS_3D_Solver<T>::RevInterfaceSol()
> {
>   uint *n_if_to_proc = this->grid_->num_iface_proc;
>   uint **if_to_proc  = this->grid_->snd_iface_proc;
>   uint **rev_if_to_f = this->grid_->rev_iface_proc;
>
>   // wait for all pending sends and receives
>   if (n_proc_exchange_ > 0) {
>     MPI_Waitall(n_proc_exchange_, s_sol_req_, MPI_STATUSES_IGNORE);
>     MPI_Waitall(n_proc_exchange_, r_sol_req_, MPI_STATUSES_IGNORE);
>   }
>
>   /** store to local data structure **/
>   for (int z = 0; z < _n_proc; ++z) {
>     int nif = n_if_to_proc[z];
>
>     if (nif > 0) {
>       // unpacking ....
>     }
>   }
> }
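>
> For reference, since all of the MPI calls above are issued from a
> non-master OpenMP thread, MPI generally has to be initialized with at
> least MPI_THREAD_SERIALIZED (or MPI_THREAD_MULTIPLE if both threads might
> ever call MPI concurrently). Below is a minimal initialization sketch;
> the main() function and the solver-driver comment are just placeholders,
> not my actual code:
>
> #include <mpi.h>
> #include <cstdlib>
>
> int main(int argc, char **argv) {
>   int provided = 0;
>   // Ask for full thread support; the dedicated communication thread
>   // needs at least MPI_THREAD_SERIALIZED to issue MPI calls safely.
>   MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>   if (provided < MPI_THREAD_SERIALIZED) {
>     // Calling MPI from the communication thread would not be safe here.
>     MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
>   }
>
>   // ... set up the solver and call UpdateRes() here ...
>
>   MPI_Finalize();
>   return 0;
> }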
>
>
>
>
>
>
> Sincerely Yours,
>
> Lei Shi
> ---------
>
> On Fri, Apr 3, 2015 at 4:37 PM, Jeff Hammond <jeff.science at gmail.com>
> wrote:
>
>> As far as I know, Ethernet is not good at making asynchronous progress in
>> hardware the way e.g. InfiniBand is.  I would have thought that a dedicated
>> progress thread would help, but it seems you tried that.  Did you use your
>> own progress thread or MPICH_ASYNC_PROGRESS=1?
>>
>> Jeff
>>
>> On Fri, Apr 3, 2015 at 10:10 AM, Lei Shi <lshi at ku.edu> wrote:
>>
>>> Huiwei,
>>>
>>> Thanks for your email. Your answer leads to my another question about
>>> asynchronous MPI communication.
>>>
>>> I'm trying to overlap communication and computation to speed up my MPI
>>> code. I read some papers comparing different approaches: a "naive"
>>> overlapped implementation, which only uses non-blocking
>>> MPI_Isend/MPI_Irecv, and a hybrid approach that uses OpenMP and MPI
>>> together. In the hybrid approach, a separate thread is used to do all
>>> the non-blocking communication. Exactly as you said, the results
>>> indicate that current MPI implementations do not support true
>>> asynchronous communication.
>>>
>>> If I use the naive approach, my code gives me almost the same
>>> performance (in terms of MPI_Wtime) with non-blocking as with blocking
>>> send/recv; all the communication is postponed to MPI_Wait.
>>>
>>> I have tried calling MPI_Test to push the library to make communication
>>> progress during the iterations, and I have also tried using a dedicated
>>> thread for communication with the other thread doing only computation.
>>> However, the performance gain is very small or nonexistent. I wonder
>>> whether it is due to the hardware; the cluster I tested on uses 10G
>>> Ethernet cards.
>>>
>>>
>>> Best,
>>>
>>> Lei Shi
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Apr 3, 2015 at 8:49 AM, Huiwei Lu <huiweilu at mcs.anl.gov> wrote:
>>>
>>>> Hi Lei,
>>>>
>>>> As far as I know, no current MPI implementation supports true
>>>> asynchronous communication; i.e., if there are no MPI calls in your
>>>> iterations, MPICH will not be able to make progress on the
>>>> communication.
>>>>
>>>> One solution is to poll the MPI runtime regularly to make progress by
>>>> inserting MPI_Test into your iteration (even though you do not want to
>>>> check the data).
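>>>>
>>>> For example, a minimal sketch of that polling pattern might look like
>>>> the following (do_some_local_work(), the request array, and the count
>>>> are just placeholders for whatever your iteration actually uses):
>>>>
>>>> #include <mpi.h>
>>>>
>>>> void do_some_local_work();  // placeholder for one chunk of computation
>>>>
>>>> // reqs/nreqs stand for the pending MPI_Isend/MPI_Irecv requests.
>>>> void overlap_loop(MPI_Request *reqs, int nreqs) {
>>>>   int done = 0;
>>>>   while (!done) {
>>>>     do_some_local_work();
>>>>     // Non-blocking check: this call also drives the MPI progress
>>>>     // engine, so communication advances even before the final wait.
>>>>     MPI_Testall(nreqs, reqs, &done, MPI_STATUSES_IGNORE);
>>>>   }
>>>>   // All requests have completed here; receive buffers are ready to use.
>>>> }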
>>>>
>>>> Another solution is to enable MPI's asynchronous progress thread to
>>>> make progress for you.
>>>>
>>>> --
>>>> Huiwei
>>>>
>>>> On Thu, Apr 2, 2015 at 11:44 PM, Lei Shi <lshi at ku.edu> wrote:
>>>>
>>>>> Hi Junchao,
>>>>>
>>>>> Thanks for your reply. In my case, I don't want to check whether the
>>>>> data have been received or not, so I don't want to call MPI_Test or
>>>>> any other function to verify it. But my problem is that if I skip
>>>>> MPI_Wait and just call Isend/Irecv, my program freezes for several
>>>>> seconds and then continues to run. My guess is that I have probably
>>>>> messed up the MPI library's internal buffers by doing this.
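>>>>>
>>>>> As I understand it, the only safe way to avoid that would be something
>>>>> like the sketch below, which is exactly the synchronization I was
>>>>> hoping to skip (post_send, buf, count, and peer are placeholder names,
>>>>> and req has to start out as MPI_REQUEST_NULL): each request has to be
>>>>> completed before its slot and send buffer are reused.
>>>>>
>>>>> #include <mpi.h>
>>>>>
>>>>> // Placeholder sketch: complete the previous operation on this request
>>>>> // slot before posting a new one, otherwise unfinished requests (and
>>>>> // their buffers) keep piling up inside the MPI library.
>>>>> void post_send(double *buf, int count, int peer, MPI_Request *req) {
>>>>>   if (*req != MPI_REQUEST_NULL) {
>>>>>     MPI_Wait(req, MPI_STATUS_IGNORE);   // or poll with MPI_Test instead
>>>>>   }
>>>>>   MPI_Isend(buf, count, MPI_DOUBLE, peer, /*tag=*/52, MPI_COMM_WORLD, req);
>>>>> }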
>>>>>
>>>>> On Thu, Apr 2, 2015 at 7:25 PM, Junchao Zhang <jczhang at mcs.anl.gov>
>>>>> wrote:
>>>>>
>>>>>> Does MPI_Test fit your needs?
>>>>>>
>>>>>> --Junchao Zhang
>>>>>>
>>>>>> On Thu, Apr 2, 2015 at 7:16 PM, Lei Shi <lshi at ku.edu> wrote:
>>>>>>
>>>>>>> I want to use the non-blocking send/receive calls
>>>>>>> MPI_Isend/MPI_Irecv to do communication. In my case, I don't really
>>>>>>> care what data I get or whether it is ready to use or not, so I
>>>>>>> don't want to waste time on any synchronization by calling MPI_Wait
>>>>>>> or similar APIs.
>>>>>>>
>>>>>>> But when I avoid calling MPI_Wait, my program freezes for several
>>>>>>> seconds after running some iterations (after multiple
>>>>>>> MPI_Isend/Irecv calls), then continues. It even takes more time than
>>>>>>> the version with MPI_Wait. So my question is how to do "true"
>>>>>>> non-blocking communication without waiting for the data to be
>>>>>>> ready. Thanks.
>>>>>>>
>>
>>
>>
>> --
>> Jeff Hammond
>> jeff.science at gmail.com
>> http://jeffhammond.github.io/
>>
>
>
_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss


More information about the discuss mailing list