[mpich-discuss] Excessive disk activity in MPI processes

Mccall, Kurt E. (MSFC-EV41) kurt.e.mccall at nasa.gov
Thu Jun 11 10:49:28 CDT 2020


Hui,

No, I was asking if MPI itself did any optional disk I/O that could be turned off.   Ken Raffenetti said that there wasn't any.   Your idea of launching with a wrapper might help, though, since my application does some writing to stdout that I cannot silence.

Thanks,
Kurt

-----Original Message-----
From: Zhou, Hui via discuss <discuss at mpich.org> 
Sent: Wednesday, June 10, 2020 5:16 PM
To: discuss at mpich.org
Cc: Zhou, Hui <zhouh at anl.gov>
Subject: [EXTERNAL] Re: [mpich-discuss] Excessive disk activity in MPI processes

Correct me if I'm off, but I believe that in the OP's case the application itself is writing heavily to stdout and stderr, judging from his anecdote about telling qsub not to save stdout and stderr. Hydra collects stdout and stderr by default, so I think the I/O comes simply from the volume of output. I think what the OP was asking is whether there is an option to simply drop stdout/stderr.

Kurt, not that I know of. If the application doesn't have an option to silence itself, could you try launching it with a script wrapper, for example:
``` app.sh
[your app] > /dev/null 2>&1 # or redirect to a log file
```

Then launch with `mpirun app.sh`?
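
A slightly fuller sketch of such a wrapper, in case it helps. The function name `run_quiet` and the variable `QUIET_LOG` are made up for illustration and are not part of MPICH or Hydra. The point it shows is the redirection order: stdout has to be pointed at the target first, and only then can `2>&1` send stderr to the same place.

```shell
#!/bin/sh
# app.sh: hypothetical wrapper that discards (or logs) all output of a command.
# QUIET_LOG is an assumed, optional variable; it defaults to /dev/null.
run_quiet() {
    # Redirections are applied left to right: stdout goes to the target
    # first, then stderr is duplicated onto the already-redirected stdout.
    "$@" > "${QUIET_LOG:-/dev/null}" 2>&1
}

# Run the wrapped command only when one was given on the command line;
# its exit status becomes the exit status of the wrapper.
[ "$#" -eq 0 ] || run_quiet "$@"
```

Saved as app.sh and made executable, this could be launched as `mpirun ./app.sh ./mav`, where `./mav` is only a placeholder for the real simulation binary.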

--
Hui Zhou

 

On 6/10/20, 4:57 PM, "Raffenetti, Kenneth J. via discuss" <discuss at mpich.org> wrote:

    MPICH queries a bunch of information from the /proc filesystem using hwloc, but that should all be during MPI_Init. Nevertheless, you can disable it in 3.3.2 using --with-hwloc-prefix=no.

    I'm not able to think of other optional I/O.

    Ken

    On 6/2/20, 2:39 PM, "Mccall, Kurt E. (MSFC-EV41) via discuss" <discuss at mpich.org> wrote:

        Sorry, I should have mentioned that our file system is NFS hosted on the cluster head node, with InfiniBand from the compute nodes to the head node.

        ******

        I am attempting to run an MPI Monte Carlo: multiple instances of a simulation “Mav” under the control of MPI.    Mav does a huge amount of output to disk, but the MPI processes that oversee the Mav instances are lightweight and only wake up every 5 seconds.

        The MPI Monte Carlo is much slower than running each Mav as an independent PBS Torque job, and according to “iotop”, the MPI-supervised Mav instances regularly become blocked waiting for I/O.   iotop often shows the Mav instances spending 99% of their time waiting for I/O.   The independent Torque jobs don’t do that.

        I got a significant improvement in MPI performance by telling qsub not to save the stdout and stderr output to a file (qsub -k n).  My question is: are there any other optional types of disk activity performed by MPI that I can disable?  What else could be happening here?

        I’m using MPICH 3.3.2 compiled with pgc++,   PBS Torque 5.1.1,   CentOS 3.10.0.

        Here are my MPICH configure options:

        '--disable-option-checking' '--prefix=/opt/mpich_pgc' '--with-pbs=/opt/torque' 'CC=pgcc' 'CXX=pgc++' '-enable-debuginfo' '--cache-file=/dev/null' '--srcdir=.' 'CFLAGS= -O2' 'LDFLAGS=' 'LIBS=' 'CPPFLAGS= -I/home/kmccall/mpich-3.3.2/src/mpl/include -I/home/kmccall/mpich-3.3.2/src/mpl/include -I/home/kmccall/mpich-3.3.2/src/openpa/src -I/home/kmccall/mpich-3.3.2/src/openpa/src -D_REENTRANT -I/home/kmccall/mpich-3.3.2/src/mpi/romio/include' 'MPLLIBNAME=mpl'

        Thanks,
        Kurt



    _______________________________________________
    discuss mailing list     discuss at mpich.org
    To manage subscription options or unsubscribe:
    https://lists.mpich.org/mailman/listinfo/discuss
