[mpich-discuss] mpi_comm_spawn - process not destroyed
Min Si
msi at anl.gov
Tue Sep 5 21:27:29 CDT 2017
Hi Stanislav,
I apologize for the late update. It took a while, but I finally found a
chance to try this test. I could not reproduce the problem on my side.
Here is my environment:
- MPICH version: version 3.3a2 (using the same configure options as
shown in your config.log)
- mpi4py: version 2.0.0
- platform: a Fedora25 VM and a Centos7 VM
If you are still facing this problem, please try the following steps to
narrow it down:
1. Try removing the computation in child.py.
2. Try the MPICH test suite under <your MPICH build directory>/test/mpi/spawn/:
make
make testing V=1
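One more thing worth checking (a sketch, not a confirmed fix): MPI_Finalize is collective over all *connected* processes, so if neither side disconnects the parent/child intercommunicator, the child process can block in finalization until the parent exits, which would look exactly like a lingering "ghost" process. A minimal self-spawning variant of your example with explicit Disconnect calls on both ends could look like this (assuming mpi4py; it must be launched with mpirun against an MPI installation):

```python
import sys
from mpi4py import MPI

# Single script playing both roles: run as
#   mpirun -n 1 python spawn_test.py
parent = MPI.Comm.Get_parent()

if parent == MPI.COMM_NULL:
    # Parent role: spawn one copy of this same script as the child.
    child = MPI.COMM_SELF.Spawn(sys.executable, args=[__file__], maxprocs=1)
    child.Barrier()      # synchronize with the child
    child.Disconnect()   # drop the connection so the child can finalize freely
    print("parent: child disconnected")
else:
    # Child role: do the work, then disconnect the other end.
    parent.Barrier()
    parent.Disconnect()
```

If the child exits promptly with the Disconnect calls but lingers without them, that would point at the finalization semantics rather than at a spawn bug.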
Regards,
Min
On 8/13/17 11:07 AM, Stanislav Simko wrote:
> Hi Min,
> attached are the configure and make logs for the build that I did for
> myself on our local cluster with Intel compilers, i.e., the build is
> definitely not guaranteed to be perfect/optimal. But please keep in
> mind that I get the same behaviour with the standard MPICH package in
> the Fedora distribution - I think its configuration options should be
> available online, maybe prepared by someone from MPICH?
>
> Also, I tested with a simple C++ hello-world-like program and the behaviour is the same.
>
> Thank you.
> best,
> stanislav.
>
>
> On Sun, 2017-08-13 at 19:58 +0100, Min Si wrote:
>> Hi Stanislav,
>>
>> This seems interesting. Could you please also attach the MPICH
>> config.log? You can find it under the directory where you built MPICH.
>> I will look into this problem then and keep you updated.
>>
>> Min
>>
>> On 8/11/17 1:31 PM, Stanislav Simko wrote:
>>> Dear all,
>>>
>>> I'm just trying some very basic stuff with MPI_COMM_SPAWN in Python
>>> (i.e. I use the mpi4py package), but I see behaviour that I do not
>>> understand - the child process gets spawned, does its stuff, and then
>>> "should" finish. I see, though, that the process created for the
>>> child stays alive. I see this only with MPICH; OpenMPI does what
>>> I would (naively) expect. In this way I can end up with N "ghost"
>>> processes after calling SPAWN N times. My minimal working example is
>>> the following:
>>>
>>>
>>> ______________________________________________
>>> parent.py
>>>
>>> from __future__ import print_function
>>> import sys  # needed for sys.executable
>>> from mpi4py import MPI
>>> comm = MPI.COMM_WORLD
>>> spawned = MPI.COMM_SELF.Spawn(sys.executable, args=['child.py'], maxprocs=1)
>>> print("parent process is waiting for child")
>>> spawned.Barrier()
>>>
>>>
>>> ______________________________________________
>>> child.py
>>>
>>> from __future__ import print_function
>>> import math  # needed for math.sin / math.pow
>>> from mpi4py import MPI
>>> parent = MPI.Comm.Get_parent()
>>> # just do some stupid stuff that takes a bit of time
>>> for i in range(5000000):
>>>     a = i*i + 1 - (i*10) + math.sin(math.pow(i, i % 8))
>>> parent.Barrier()
>>>
>>> ______________________________________________
>>>
>>> I run with e.g.:
>>> mpirun -n 1 python parent.py
>>>
>>> Am I missing something with the SPAWN method?
>>> (I tested on two independent systems, our local cluster with mpich
>>> v3.0.4, and my laptop - fedora 26, mpich v3.2.8 from repositories)
>>>
>>> Thank you very much for any suggestions.
>>>
>>> Regards,
>>> stanislav.
>>>
>>>
>>> _______________________________________________
>>> discuss mailing list
>>> discuss at mpich.org
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>>
>
>