[mpich-discuss] HPC cluster network utilisation monitoring tool ?
Raffenetti, Ken
raffenet at mcs.anl.gov
Tue Oct 26 13:08:02 CDT 2021
Apologies, I meant to link to the Nagios Core product, which is free, as an entry point to Nagios monitoring. https://www.nagios.com/products/nagios-core/
Ken
On 10/26/21, 10:01 AM, "Raffenetti, Ken via discuss" <discuss at mpich.org> wrote:
In a previous life, I used Nagios (https://www.nagios.com/solutions/network-monitoring/) to monitor all kinds of things on servers. It should be able to report port-level network utilization. As for your switch, you might need to find something specific to the hardware.
On the MPI side, profilers like mpiP (https://software.llnl.gov/mpiP/) can capture usage statistics and produce a report that gives insight into how MPI is performing for your applications.
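For the port-level utilization mentioned above, a rough per-node measurement can also be taken without any monitoring stack. The following is only a minimal sketch, assuming Linux compute nodes that expose interface byte counters in /proc/net/dev and a 1 Gbit/s link as in the original question. The function names and the interface name "eth0" are hypothetical, and none of this is part of Nagios, mpiP, or MPICH.

```python
import time


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev-style text."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def utilization(before, after, seconds, link_bits_per_s=1_000_000_000):
    """Return (rx, tx) as fractions of link capacity between two samples.

    The link is assumed to be full duplex, so each direction is compared
    against the full link speed separately.
    """
    rx = (after[0] - before[0]) * 8 / seconds / link_bits_per_s
    tx = (after[1] - before[1]) * 8 / seconds / link_bits_per_s
    return rx, tx


def sample(iface, interval=1.0, path="/proc/net/dev"):
    """Read the counters twice, `interval` seconds apart, for one interface."""
    with open(path) as f:
        before = parse_net_dev(f.read())[iface]
    time.sleep(interval)
    with open(path) as f:
        after = parse_net_dev(f.read())[iface]
    return utilization(before, after, interval)


# Hypothetical usage on a node whose interface is named "eth0":
#   rx, tx = sample("eth0", interval=5.0)
#   print(f"rx {rx:.1%}  tx {tx:.1%}")
```

Calling sample() in a loop on each node for the duration of an MPI run, and logging the results with a timestamp, would give a simple per-run record to compare over time. It only sees traffic at the node's own interface, so it says nothing about contention inside the switch.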
Ken
On 10/19/21, 12:52 PM, "Nicholas Yue via discuss" <discuss at mpich.org> wrote:
Hi,
I am mainly using MPICH via the mpiexec that ships with ParaView.
I have a small test cluster and a 1Gbit switch.
What is the recommended way to determine and record the network utilization for a given MPI run?
I was hoping to gather this information over time so that I can plan a network equipment upgrade proactively, before the network becomes the performance bottleneck.
Cheers
--
Nicholas Yue
https://www.linkedin.com/in/nicholasyue/