[mpich-discuss] Error in dynamically spawned processes

Antonio J. Peña apenya at mcs.anl.gov
Wed May 15 09:02:44 CDT 2013


Hi,

We need more information to determine whether this error is caused by an 
MPICH bug. Could you please post the failing code, along with sample input 
for both the failing and non-failing cases?

Thanks,
  Antonio


On Wednesday, May 15, 2013 07:19:59 PM Mahesh Doijade wrote:

Hi,
    I am dynamically creating processes using MPI_Comm_spawn_multiple(). The 
code runs for several inputs without any error, but for some larger input 
sizes it throws the error given below.


Fatal error in PMPI_Barrier: A process has failed, error stack:
PMPI_Barrier(426).........: MPI_Barrier(comm=0x84000002) failed
MPIR_Barrier_impl(333)....: Failure during collective
MPIR_Barrier_impl(315)....: 
MPIR_Barrier_intra(83)....: 
dequeue_and_set_error(823): Communication error with rank 0

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 1
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
[proxy:1:0 at clusterslave3] HYD_pmcd_pmip_control_cmd_cb 
(./pm/pmiserv/pmip_cb.c:883): assert (!closed) failed
[proxy:1:0 at clusterslave3] HYDT_dmxu_poll_wait_for_event 
(./tools/demux/demux_poll.c:77): callback returned error status
[proxy:1:0 at clusterslave3] main (./pm/pmiserv/pmip.c:210): demux engine error 
waiting for event
[mpiexec at clusterslave3] HYDT_bscu_wait_for_completion 
(./tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated 
badly; aborting
[mpiexec at clusterslave3] HYDT_bsci_wait_for_completion 
(./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for 
completion
[mpiexec at clusterslave3] HYD_pmci_wait_for_completion 
(./pm/pmiserv/pmiserv_pmci.c:216): launcher returned error waiting for 
completion
[mpiexec at clusterslave3] main (./ui/mpich/mpiexec.c:325): process manager error 
waiting for completion


-- 

Regards,
-- Mahesh Doijade