[mpich-discuss] mpi assertion error

Jeff Hammond jeff.science at gmail.com
Fri Jun 28 10:14:24 CDT 2013


Sorry, I didn't realize that you attached the code already.  I braved
the unknown and opened it to find only benign text files :-)

Jeff

On Fri, Jun 28, 2013 at 10:11 AM, Jeff Hammond <jeff.science at gmail.com> wrote:
> Null buffer assertions are suggestive of incorrect programs.  Can you
> share the source of this program?
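One cheap way to narrow down a null-buffer assertion is to check every allocation before the pointer reaches a collective call. The sketch below assumes the buffers come from malloc; the helper name `checked_alloc` and its `what` label are hypothetical, not taken from the attached program.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate count elements of elem_size bytes and
 * report a failure instead of silently handing NULL to a later MPI call.
 * Returns NULL on overflow or allocation failure. */
void *checked_alloc(size_t count, size_t elem_size, const char *what)
{
    /* Reject requests whose byte count would overflow size_t. */
    if (elem_size != 0 && count > (size_t)-1 / elem_size) {
        fprintf(stderr, "%s: %zu x %zu bytes overflows size_t\n",
                what, count, elem_size);
        return NULL;
    }

    size_t bytes = count * elem_size;

    /* Ask for at least one byte so a zero-size request never yields NULL
     * on a working allocator. */
    void *p = malloc(bytes ? bytes : 1);
    if (p == NULL)
        fprintf(stderr, "%s: malloc of %zu bytes failed\n", what, bytes);
    return p;
}
```

Calling something like `checked_alloc(n, sizeof(double), "sendbuf")` on every rank (the element type here is only an example) and aborting when it returns NULL would show whether the assertion comes from a failed allocation or from something else.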
>
> As for the inline vs attached files debate, I think that pastebin is a
> superior option for large output since it is plain-text readable from
> any internet-enabled device and doesn't lead to huge messages on the
> list.  But for short messages, inlining is definitely good for email
> reading on phones.
>
> Jeff
>
> On Fri, Jun 28, 2013 at 9:46 AM, Danilo <apeironoriepa at aol.com> wrote:
>> In the last thread I read, people were asked more than once to zip the
>> files, so I did. Anyway, this is the first error:
>> Assertion failed in file helper_fns.c at line 361: ((((char *) sendbuf +
>> sendtype_true_lb))) != NULL
>> internal ABORT - process 2
>>
>> Starting from the second execution I get:
>>
>> =====================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   EXIT CODE: 139
>> =   CLEANING UP REMAINING PROCESSES
>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>> =====================================================================================
>>
>> HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:928): assert (!closed)
>> failed
>> HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback
>> returned error status
>> main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
>> HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:928): assert (!closed)
>> failed
>> HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback
>> returned error status
>> main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
>> HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:928): assert (!closed)
>> failed
>> HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback
>> returned error status
>> main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
>> HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:70): one
>> of the processes terminated badly; aborting
>> HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23):
>> launcher returned error waiting for completion
>> HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:191): launcher
>> returned error waiting for completion
>> main (./ui/mpich/mpiexec.c:405): process manager error waiting for
>> completion
>>
>>
>> Regards,
>> Danilo
>>
>>
>> -----Original Message-----
>> From: Wesley Bland <wbland at mcs.anl.gov>
>> To: discuss <discuss at mpich.org>
>> Sent: Fri, Jun 28, 2013 3:22 pm
>> Subject: Re: [mpich-discuss] mpi assertion error
>>
>> Can you just copy paste your error into the email? Most of us will probably
>> not be all that excited about opening up strange tarballs attached to an
>> email. Also, we get these emails on our phones and tablets where unzipping
>> source code isn't as much of an option.
>>
>> Wesley
>>
>> On Jun 28, 2013, at 8:17 AM, Danilo <apeironoriepa at aol.com> wrote:
>>
>> Good afternoon,
>>
>> I wrote a small application in C to compute a 2D FFT. It was first run on a
>> cluster with a 2007 MPI release installed (I don't remember the package
>> name) and then adapted for a different cluster with MPI 1.4.1 (I had to
>> change the scatter/gather calls because the previous version let me use the
>> same buffer for both sendbuf and recvbuf). With 2 processes it works fine.
>> With 4, 8, 16, 32 and so on, the first run fails with an assertion error,
>> as shown in the attached file, and every later run on more than 2 processes
>> exits with error code 139. The error appears only when running with
>> "realDim=16384" (that is, 16384 rows and 16384x2 columns, since the data
>> holds real/imaginary pairs). I know the code works, since it ran fine on
>> the previous cluster (even with 4, 8, 16 and 32 processes), and I can't
>> work out what the problem is now. Can you help?
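For scale, the realDim=16384 case described above can be sized with a few lines of C. This is only a sketch: the element type is not stated in the message, so `elem_size` is a parameter, the function names are made up for illustration, and the even split across processes is an assumption about the layout.

```c
#include <assert.h>
#include <limits.h>

/* Total number of scalar elements: real_dim rows by 2 * real_dim columns
 * (real and imaginary parts, as described in the message). */
unsigned long long total_elems(unsigned long long real_dim)
{
    return real_dim * (2ULL * real_dim);
}

/* Bytes needed for the whole matrix. elem_size is an assumption
 * (for example 4 for float or 8 for double); the message does not say. */
unsigned long long total_bytes(unsigned long long real_dim,
                               unsigned long long elem_size)
{
    return total_elems(real_dim) * elem_size;
}

/* Elements per process under an even split (hypothetical layout).
 * Returns 0 if the count would not fit in the int that MPI_Scatter and
 * MPI_Gather take as a count argument. */
unsigned long long elems_per_proc(unsigned long long real_dim, int nprocs)
{
    assert(nprocs > 0);
    unsigned long long n = total_elems(real_dim) / (unsigned long long)nprocs;
    return (n > (unsigned long long)INT_MAX) ? 0ULL : n;
}
```

For realDim=16384 the matrix has 536,870,912 elements, which is 2 GiB with 4-byte elements and 4 GiB with 8-byte elements, so it may be worth confirming that every allocation of that size actually succeeded on the new cluster.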
>>
>> As mentioned, my application is attached, along with the error output and
>> MPI info.
>>
>> Regards,
>> Danilo
>> <error+app.tar.gz>
>> _______________________________________________
>> discuss mailing list     discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>
>
>
> --
> Jeff Hammond
> jeff.science at gmail.com



-- 
Jeff Hammond
jeff.science at gmail.com


