[mpich-discuss] MPI_File_read_shared on NFS seems to be slow on multi-node

Latham, Robert J. robl at mcs.anl.gov
Thu Jun 14 11:09:15 CDT 2018


On Wed, 2017-12-20 at 17:33 +0530, Aboorva Devarajan wrote:
> MPI_File_read_shared / MPI_File_write_shared on NFS seems to be very
> slow when running in a multi-node environment compared to a single-node
> run. This was originally detected with the MPICH test case (writeshf90).
> 
> Here is a minimal reproducer with MPI_File_read_shared:
> https://gist.github.com/AboorvaDevarajan/db4797e02ac71f60384a8254ae2862c8

We don't have a very good optimization for the shared file pointer
operations.  On a single node, the fcntl() locks are cheap to
manage.  When multiple nodes are involved, every fcntl() lock request
has to go through the NFS server, so the locks become much more
expensive.
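
To make the locking cost concrete, here is a minimal sketch of the general
pattern: the shared offset lives in a small side file, and each process
takes an fcntl() write lock around a read-modify-write of that offset.
This is an illustration only, not the actual ROMIO code; the helper name
claim_shared_offset and the side-file layout (a single int64_t) are
assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a ROMIO function): atomically claim `nbytes`
 * of a shared file.  `ptr_fd` is an open read/write descriptor for a
 * small side file holding the current shared offset as one int64_t.
 * An empty side file is treated as offset 0.
 * Returns the starting offset that was claimed, or -1 on error. */
int64_t claim_shared_offset(int ptr_fd, int64_t nbytes)
{
    struct flock lk;
    int64_t start = 0;
    int64_t next;
    ssize_t got;

    assert(nbytes >= 0);

    /* Write-lock the bytes that hold the offset; F_SETLKW blocks until
     * the lock is granted. */
    memset(&lk, 0, sizeof lk);
    lk.l_type = F_WRLCK;
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = (off_t)sizeof start;
    if (fcntl(ptr_fd, F_SETLKW, &lk) == -1)
        return -1;

    /* Read the current offset (0 if the side file is still empty). */
    got = pread(ptr_fd, &start, sizeof start, 0);
    if (got == 0)
        start = 0;
    else if (got != (ssize_t)sizeof start)
        start = -1;

    /* Advance the offset past the region we just claimed. */
    if (start >= 0) {
        next = start + nbytes;
        if (pwrite(ptr_fd, &next, sizeof next, 0) != (ssize_t)sizeof next)
            start = -1;
    }

    /* Release the lock so the next process can claim its region. */
    lk.l_type = F_UNLCK;
    if (fcntl(ptr_fd, F_SETLK, &lk) == -1)
        return -1;

    return start;
}
```

Every shared read or write pays for one lock, one small read, one small
write, and one unlock on the side file.  On a local file system those are
all cheap; when the side file sits on NFS, each step can turn into a
round trip to the server.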

Years ago we wrote about an RMA-based data structure (and then MPI-3
made that data structure overkill).  Are shared files important to your
workload?

==rob

> [aboorvad at c712f6n06 io]$ cat hostfile_mpich
> c712f6n06:2
> c712f6n07:2
> [aboorvad at c712f6n06 io]$ time mpirun -np 4 --hostfile hostfile_mpich ./read_shared
> real    0m14.709s
> user    0m27.703s
> sys    0m1.438s
> 
> [aboorvad at c712f6n06 io]$ time mpirun -np 4 ./read_shared
> 
> real    0m0.247s
> user    0m0.153s
> sys    0m0.616s
> 
> Is this expected behaviour? If so, why?
> 
> 
> Thanks.
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss