[mpich-discuss] Understanding recursive doubling impl. in Mpich

Thakur, Rajeev thakur at anl.gov
Tue Nov 29 13:37:29 CST 2016


I don’t remember for sure, but probably because that code is meant to work even for non-power-of-two numbers of processes (although the if statement precludes that right now), and calculating the recvcount accurately in that case is more work.

Rajeev
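
For readers following the thread, the over-estimated receive size can be sketched in a few lines of C without MPI. This is only an illustration, not the MPICH source: the function names are hypothetical, and the bound formula (comm_size - dst_tree_root) * recvcount is an assumption chosen because it reproduces the numbers in the quoted example (3 for process 0, 4 for process 1). The actual send count assumes a power-of-two number of processes, where each process holds 2^step blocks at a given step.

```c
#include <stdio.h>

/* Upper bound posted for the receive at a given step (mask = 1 << step).
 * Assumed formula: room for everything from the partner's subtree root
 * to the end of the receive buffer. */
int posted_recv_bound(int rank, int step, int comm_size, int recvcount)
{
    int dst = rank ^ (1 << step);              /* exchange partner */
    int dst_tree_root = (dst >> step) << step; /* first rank in partner's subtree */
    return (comm_size - dst_tree_root) * recvcount;
}

/* Items the partner actually sends at this step when comm_size is a
 * power of two: it has gathered 2^step blocks of recvcount items. */
int actual_send_count(int step, int recvcount)
{
    return recvcount << step;
}

/* Print the posted bound next to the actual count for every rank and step. */
void print_allgather_counts(int comm_size, int recvcount)
{
    for (int step = 0; (1 << step) < comm_size; step++) {
        for (int rank = 0; rank < comm_size; rank++) {
            printf("step %d, rank %d: partner sends %d, receive posted for up to %d\n",
                   step, rank,
                   actual_send_count(step, recvcount),
                   posted_recv_bound(rank, step, comm_size, recvcount));
        }
    }
}
```

Calling print_allgather_counts(4, 1) lists, for each rank and step, a posted bound that is never smaller than what the partner sends. In the power-of-two case the exact count could be obtained by doubling, whereas the bound also leaves room for a partner subtree that is only partially filled.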

> On Nov 29, 2016, at 1:14 PM, Dorier, Matthieu <mdorier at anl.gov> wrote:
> 
> Hi,
> 
> I'm looking at MPICH's code for MPI_Allgather (src/mpi/coll/allgather.c), lines 183 to 222, where the recursive doubling algorithm is used. I'm under the impression that the MPI_Sendrecv operation used at line 206 is not issued with the same recvcount value as the sendcount value of the sending process.
> 
> Example: computing by hand for an allgather across 4 processes, in the first loop iteration processes 0 and 1 exchange data, and processes 2 and 3 do the same. Focusing on 0 and 1, process 0 sends 1 item and expects 3, while process 1 sends 1 item and expects 4.
> 
> I know that posting a receive with a size larger than what is sent is correct in MPI, and that would explain why at line 220 we have MPIR_Get_count_impl(&status, recvtype, &last_recv_cnt);
> I just want to make sure that I'm not missing something, and if not, I'd like to know why processes overestimate the received size instead of doubling the curr_cnt variable at every step? Is it to allow the next part of the code (which is between /* --BEGIN EXPERIMENTAL-- */ comments) to work?
> 
> Thanks
> 
> Matthieu
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
