[mpich-discuss] Using MPICH in Python breaks Fortran MPI_IN_PLACE

Patrick McNally rpmcnally at gmail.com
Wed Jun 10 13:08:15 CDT 2020


I hate to keep bugging the list, but this is a pretty serious issue.  I
suspect it is also why we get segfaults when trying to use similar
sentinel variables like MPI_STATUSES_IGNORE.  Any insight would be
appreciated.

Thanks,
Patrick

On Wed, May 27, 2020 at 10:25 AM Patrick McNally <rpmcnally at gmail.com>
wrote:

> Our application consists primarily of a Python head calling into Fortran
> routines to do the heavy lifting.  We have never been able to successfully
> use MPI_IN_PLACE in Fortran but weren't sure why.  Recently, we discovered
> that it works fine in standalone Fortran code and is only broken when the
> Fortran code is run through our Python modules.
>
> The issue appears to be triggered when a library that links only against
> the C libmpi is loaded first, and with RTLD_LOCAL, which is exactly what
> happens when we import mpi4py.  Everything works if something linked
> against libmpifort is loaded first, or if everything is loaded with
> RTLD_GLOBAL.  I'm assuming this has something to do with how MPICH tests
> the address of MPIR_F08_MPI_IN_PLACE, but I don't understand shared-object
> loading well enough to fully grasp the issue.  Below is some standalone
> code that reproduces the problem.  I'd appreciate any insight you can
> provide into why this is happening.
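>
> To make that guess concrete, here is a tiny standalone C sketch (no MPI;
> all of the names are made up and this is not the actual MPICH source) of
> the kind of address-identity test I believe the Fortran wrappers rely on.
> If loading with RTLD_LOCAL ever leaves the Fortran side with its own copy
> of the sentinel symbol, a comparison like this would fail in exactly the
> way we observe:
>
> sentinel.c (illustrative only)
> ------------------------------
> #include <stdio.h>
>
> /* Stand-ins for the in-place sentinel object; in real MPICH this is a
>    symbol along the lines of MPIR_F08_MPI_IN_PLACE that should be shared
>    between libmpi and libmpifort. */
> static int sentinel_seen_by_libmpi;
> static int sentinel_seen_by_libmpifort;  /* hypothetical duplicate copy */
>
> /* Wrapper-style check: the buffer counts as "in place" only if its
>    address is identical to the sentinel's address. */
> static int is_in_place(const void *buf, const void *sentinel)
> {
>   return buf == sentinel;
> }
>
> int main(void)
> {
>   /* One shared copy of the symbol: the sentinel is recognized. */
>   printf("single copy: %d\n",
>          is_in_place(&sentinel_seen_by_libmpi, &sentinel_seen_by_libmpi));
>
>   /* Two copies of the symbol: the Fortran code passes one address,
>      libmpi compares against the other, and in-place is not detected. */
>   printf("two copies:  %d\n",
>          is_in_place(&sentinel_seen_by_libmpifort, &sentinel_seen_by_libmpi));
>   return 0;
> }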
>
> Relevant system details:
> RHEL 7.8
> Python 2.7
> GCC 7.3.0
> MPICH 3.3.2 (and 3.2)
>
> The files below are also available toward the end of the bug report at
> the following link:
>
> https://bitbucket.org/mpi4py/mpi4py/issues/162/mpi4py-initialization-breaks-fortran
>
> Thanks,
> Patrick
>
> makefile
> -----------
> libs = testc.so testf.so
> all: $(libs)
>
> testc.so: testc.c
>         mpicc   -shared -fPIC $< -o $@
>
> testf.so: testf.f90
>         mpifort -shared -fPIC $< -o $@
>
> clean:
>         $(RM) $(libs)
>
> testc.c
> ---------
> #include <stddef.h>
> #include <stdio.h>
> #include <mpi.h>
>
> extern void initc(void);
> extern void testc(void);
>
> void initc(void)
> {
>   MPI_Init(NULL,NULL);
> }
>
> void testc(void)
> {
>   int val = 1;
>   MPI_Allreduce(MPI_IN_PLACE, &val, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
>   printf("C val: %2d\n",val);
> }
>
> testf.f90
> -----------
> subroutine initf() bind(C)
>   use mpi
>   integer ierr
>   call MPI_Init(ierr)
> end subroutine initf
>
> subroutine testf() bind(C)
>   use mpi
>   integer ierr
>   integer val
>   val = 1
>   call MPI_Allreduce(MPI_IN_PLACE, val, 1, MPI_INTEGER, MPI_SUM, &
>                      MPI_COMM_WORLD, ierr)
>   print '(A,I2)', 'F val: ', val
> end subroutine testf
>
> test.py
> ---------
> from ctypes import CDLL, RTLD_LOCAL, RTLD_GLOBAL
>
> mode = RTLD_LOCAL
> cfirst = True
>
> if cfirst: # it does not work!
>     libc = CDLL("./testc.so", mode)
>     libf = CDLL("./testf.so", mode)
> else: # it works!
>     libf = CDLL("./testf.so", mode)
>     libc = CDLL("./testc.so", mode)
>
> libc.initc.restype  = None
> libc.testc.argtypes = []
> libf.initf.restype  = None
> libf.testf.argtypes = []
>
> libc.initc()
> libc.testc()
> libf.testf()
>