[mpich-discuss] CPU usage versus Nodes, Threads
Lu, Huiwei
huiweilu at mcs.anl.gov
Wed Oct 22 22:57:19 CDT 2014
> On Oct 22, 2014, at 10:14 PM, Qiguo Jing <qjing at trinityconsultants.com> wrote:
>
> Hey Huiwei,
>
> Thanks for your quick response.
>
> I think I am using blocking MPI calls. MPI_Send() and MPI_Recv() are blocking calls, right? Each time, the master thread has to collect data from all slaves and then do some calculation/output (only the master thread does this extra work), so I guess non-blocking calls may not be suitable? Maybe I will try non-blocking calls to see if the results differ from the blocking ones.
Though the master thread is waiting for data from the slaves, the communication happens between processes. I don't see any reason you can't replace the blocking calls with non-blocking ones.
However, you need to design non-blocking communication carefully to get any benefit. Simply replacing blocking calls with non-blocking ones will not yield as much improvement as you might expect. The goal of non-blocking calls is to overlap communication with computation, so you need to find that overlap while keeping the data dependencies correct. In your case, is it possible to overlap the communication with the slaves with the master thread's own computation?
>
> All threads will call MPI, so I think it is MPI_THREAD_MULTIPLE?
Yes, it is MPI_THREAD_MULTIPLE.
--
Huiwei
>
> Jing
>
> -----Original Message-----
> From: Lu, Huiwei [mailto:huiweilu at mcs.anl.gov]
> Sent: Wednesday, October 22, 2014 6:40 PM
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] CPU usage versus Nodes, Threads
>
> Low CPU utilization can be caused by blocking communication, load imbalance, or thread contention in your application.
>
> What MPI calls are you using? Are they blocking or non-blocking? Is the load balanced across the nodes?
>
> What thread level are you using? MPI_THREAD_MULTIPLE or MPI_THREAD_FUNNELED?
>
> -
> Huiwei
>
>> On Oct 22, 2014, at 5:02 PM, Qiguo Jing <qjing at trinityconsultants.com> wrote:
>>
>> Hi All,
>>
>> We have a parallel program running on a cluster. We recently found a case in which CPU usage decreases and run time increases as the number of nodes increases. The results are in the table below.
>>
>> The particular run requires a lot of data communication between nodes.
>>
>> Any thoughts about this phenomenon? Or is there any way we can improve CPU usage at higher node counts?
>>
>> Average CPU Usage (%)   Number of Nodes   Number of Threads/Node
>> 100                     1                 8
>> 92                      2                 8
>> 50                      3                 8
>> 40                      4                 8
>> 35                      5                 8
>> 30                      6                 8
>> 25                      7                 8
>> 20                      8                 8
>> 20                      8                 4
>>
>>
>> Thanks!
>>
>> _________________________________________________________________________
>>
>> The information transmitted is intended only for the person or entity
>> to which it is addressed and may contain confidential and/or
>> privileged material. Any review, retransmission, dissemination or
>> other use of, or taking of any action in reliance upon, this
>> information by persons or entities other than the intended recipient
>> is prohibited. If you received this in error, please contact the
>> sender and delete the material from any computer.
>> _________________________________________________________________________
>> _______________________________________________
>> discuss mailing list discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss