[mpich-discuss] MPI_Send and MPI_Bcast

Min Si msi at anl.gov
Wed Sep 7 10:40:30 CDT 2016

Hi Alaa,

Some information is still missing.
- Number of processes per node
- The Word document shows you are comparing three algorithms, but you 
only mentioned two of them. What is the third one?
- What are the sizes of the small/medium/large arrays in your experiments?
- I see Bcast speedup is better than Send in Figure (c), but the numbers 
in the table do not show that. How many processes does each result in 
the table correspond to?

Since we no longer support MPICH on Windows, you may want to use 
MS-MPI for better performance. If you still observe the same 
issue, you can discuss it with the Microsoft MPI team.

On 9/7/16 8:59 AM, alaa nashar wrote:
> Hi Min
> Thanks a lot for your fast response.
> The file attached contains the required information. Regards
> Alaa
> On Wednesday, September 7, 2016 4:21 PM, msi <msi at anl.gov> wrote:
> Hi Alaa,
> Generally, for a small number of processes, MPI_Bcast usually performs 
> about the same as MPI_Isend/MPI_Recv; for a large number of processes, 
> broadcast should be faster.
> Could you please try the latest version of MPICH on your system? 
> MPICH2 is very old.
> Please also give us the following information.
> Number of processes
> Number of processes per node
> The execution time of each algorithm
> Min
> Sent via my cell phone.
> -------- Original message --------
> From: alaa nashar <nashar_al at yahoo.com>
> Date: 9/7/2016 7:04 AM (GMT-06:00)
> To: discuss at mpich.org
> Subject: [mpich-discuss] MPI_Send and MPI_Bcast
> Dear all
> I have implemented the following two algorithms on my home Ethernet 
> LAN with two and three heterogeneous devices using MPICH2.
> Algorithm1:
> 1- Root process reads the contents of an input array.
> 2- It then sends the array data to all other processes.
> 3- All processes including the root process perform specific
> computations on their array copies.
> 4- Once the computations within each process are finished,
> the generated arrays are directly written by the process that
> performed the computations to separate files.
> Algorithm2:
> Same as Algorithm 1 except step 2 is replaced by:
> 2- Root process broadcasts the array data to all other processes.
> The implementations of both algorithms work fine and give the expected 
> results, but Algorithm2, which uses MPI_Bcast, is slower than Algorithm1, 
> which uses MPI_Send.
> To my knowledge, MPI_Bcast should be faster than MPI_Send.
> Please, would you guide me to know whether there is any conflict or 
> misunderstanding.
> Thanks a lot
> Alaa
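
The expectation discussed in this thread (Bcast roughly equal to point-to-point
sends for few processes, better for many) can be illustrated with a rough step
count. This is only a sketch under stated assumptions: it assumes the root
sends to each process one after another in Algorithm1, and that the broadcast
in Algorithm2 uses a binomial tree, which is one common strategy but not
necessarily what a given MPI library selects for a given message size. The
function names are hypothetical, and the model ignores latency, bandwidth, and
differences between devices.

```python
def linear_send_steps(num_procs: int) -> int:
    """Sequential sends the root must issue to reach every other process."""
    return max(num_procs - 1, 0)


def binomial_bcast_rounds(num_procs: int) -> int:
    """Rounds of an assumed binomial-tree broadcast.

    In each round every process that already holds the data forwards it
    to one process that does not, so the number of holders doubles.
    """
    rounds, holders = 0, 1
    while holders < num_procs:
        holders *= 2
        rounds += 1
    return rounds


if __name__ == "__main__":
    # Compare the two step counts for a few process counts.
    for p in (2, 3, 4, 8, 64):
        print(p, linear_send_steps(p), binomial_bcast_rounds(p))
```

Under this model the two counts are identical for 2 and 3 processes, so on a
LAN with two or three devices no speedup from the broadcast would be expected
from the communication pattern alone; the gap only opens up as the process
count grows.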
