[mpich-discuss] MPI_Comm_spawn zombies have risen from the dead.

Thomas Pak thomas.pak at maths.ox.ac.uk
Tue Dec 11 09:10:45 CST 2018


Hi Hui,

Sorry for the late response.
No problem! The application that I am developing is a tool that automates the parallelisation of scientific simulations as part of an advanced parameter sweeping algorithm. One of the key goals is to do this in a modular manner. I have decided to let the simulation be performed by an executable provided by the user, a so-called "simulator". This provides the most flexibility as the user can then write their simulator in any programming or scripting language. Moreover, this also implies that the user can make use of any software package or library.
For regular non-MPI simulators, my application invokes the simulator with the system calls fork() and exec(). These types of simulator work perfectly fine with my application as it currently stands.
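To give a concrete idea, the fork-exec pattern boils down to something like the sketch below. This is a simplified illustration rather than my actual code; the simulator path and the lack of command-line arguments are placeholder assumptions.

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Run one instance of a non-MPI simulator and wait for it to exit.
       "path" stands in for the user-provided executable, e.g. "./simulator". */
    int run_simulator(const char *path)
    {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return -1;
        }
        if (pid == 0) {
            /* Child: replace this process image with the simulator. */
            execl(path, path, (char *) NULL);
            perror("execl");   /* only reached if exec fails */
            _exit(127);
        }
        /* Parent: wait for the child and report its exit status. */
        int status;
        waitpid(pid, &status, 0);
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }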
However, when the simulator itself uses MPI, the fork-exec approach fails because forking another MPI application from within an MPI application is undefined behaviour and causes all kinds of trouble. I have therefore resorted to the native MPI_Comm_spawn function for dynamic process management in those cases.
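In essence, each MPI simulator instance is launched with something like the sketch below. Again, this is a simplified illustration and not my actual code; the executable name "./mpi_simulator" and spawning a single process are placeholder assumptions.

    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        MPI_Init(&argc, &argv);

        /* Spawn one instance of the user-provided MPI simulator.
           "./mpi_simulator" is a placeholder for the actual executable. */
        MPI_Comm intercomm;
        MPI_Comm_spawn("./mpi_simulator", MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                       0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

        /* ... exchange data with the child over intercomm if needed ... */

        /* Disconnect from the child once it has finished, then finalize. */
        MPI_Comm_disconnect(&intercomm);
        MPI_Finalize();
        return 0;
    }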
A possible counterpoint to all of this is that there is no use in running parallel instances of an MPI simulator because the simulator is itself already parallelised. However, this is not always the case: some simulation software packages are built on external libraries that use MPI internally, such as PETSc, even though the application itself is not parallelised. This case might be somewhat niche, but it is exactly the situation that I am in.
In summary, my MPI application calls external executables provided by the user and runs many parallel instances of them. The external executable itself may be an MPI application, in which case MPI_Comm_spawn is necessary.
In a way, the reason I need MPI_Comm_spawn to work is because MPI has become so ubiquitous in scientific computing that even non-parallelised simulators might be using MPI under the hood.
I hope this helps.
Best wishes,
Thomas Pak

On Dec 6 2018, at 4:00 pm, Zhou, Hui <zhouh at anl.gov> wrote:
> Hi Thomas,
>
> Sure. While I am finding time to look into this, could you talk a bit more about your application? Dynamic MPI process management is still new territory despite having been around for many years. One of the reasons is the lack of exposure to such applications.
>
> Hui Zhou
>
>
>
> > On Dec 6, 2018, at 6:11 AM, Thomas Pak <thomas.pak at maths.ox.ac.uk> wrote:
> > Hi Hui,
> > That is exactly right, thanks so much for looking into it! Please let me know if you make any progress on this issue, as the application I am developing critically depends on MPI_Comm_spawn.
> > Best wishes,
> > Thomas Pak
> >
