[mpich-discuss] mpi assertion error

Danilo apeironoriepa at aol.com
Fri Jun 28 10:47:53 CDT 2013


Probably you should try changing the value realDim=16384 to something more at your machine's level, such as 1024, so that you don't experience a hang :)
My purpose is not to use an already-made FFT algorithm, but just to see what it takes to write one starting from a serial version.. I know that algorithms 10000 times better already exist..
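
For a rough sense of scale, here is a small stand-alone check of what realDim=16384 means in memory, assuming the matrix is stored as doubles with realDim rows and realDim*2 columns (that layout is my assumption, not taken from the actual code):

#include <stdio.h>

int main(void)
{
    /* Assumed layout: realDim rows, realDim*2 columns (real/imag pairs),
       one double per entry. */
    const size_t big   = 16384UL * 16384UL * 2UL * sizeof(double);
    const size_t small = 1024UL  * 1024UL  * 2UL * sizeof(double);

    /* 16384 x 32768 doubles = 4 GiB for a single full copy of the matrix,
       which easily exhausts a small machine or virtual machine. */
    printf("realDim=16384: %.1f GiB\n", big / (1024.0 * 1024.0 * 1024.0));

    /* realDim=1024 needs only 16 MiB. */
    printf("realDim=1024:  %.1f MiB\n", small / (1024.0 * 1024.0));
    return 0;
}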

 

 

-----Original Message-----
From: Jeff Hammond <jeff.science at gmail.com>
To: discuss <discuss at mpich.org>
Sent: Fri, Jun 28, 2013 5:41 pm
Subject: Re: [mpich-discuss] mpi assertion error


Yes, I can tell you're new to programming.  Stop reading K&R C as a
style guide :-)

Your program hung my virtual machine so I cannot help you.  I
recommend you use FFTW instead of rolling your own FFTs.

Best,

Jeff

On Fri, Jun 28, 2013 at 10:28 AM, Danilo <apeironoriepa at aol.com> wrote:
> Hi Jeff,
> the program was tested intensively on the previous cluster. The changes I made
> are in the scatter/gather calls (it seems that in this version sendbuf and
> recvbuf have to be different..). The other main change is due to Hydra,
> because the previous cluster didn't have such a process management system.
> But I'm quite new to programming, so I don't know...
>
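For reference, the usual way to keep a single buffer with a newer MPICH (which enforces the MPI rule that a collective's sendbuf and recvbuf must not alias) is MPI_IN_PLACE at the root. This is only a minimal sketch; the buffer name, element count, and datatype are assumptions, not the actual code:

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int chunk = 4;   /* elements per rank (made-up size) */
    /* Root holds the full array; the other ranks only need their chunk. */
    double *data = malloc((size_t)chunk * (rank == 0 ? nprocs : 1) * sizeof(double));

    /* Scatter: the root keeps its own chunk in place instead of copying
       it to itself, so no second buffer is needed. */
    if (rank == 0)
        MPI_Scatter(data, chunk, MPI_DOUBLE, MPI_IN_PLACE, chunk, MPI_DOUBLE,
                    0, MPI_COMM_WORLD);
    else
        MPI_Scatter(NULL, chunk, MPI_DOUBLE, data, chunk, MPI_DOUBLE,
                    0, MPI_COMM_WORLD);

    /* ... work on the local chunk ... */

    /* Gather: same idea on the way back, the root passes MPI_IN_PLACE as sendbuf. */
    if (rank == 0)
        MPI_Gather(MPI_IN_PLACE, chunk, MPI_DOUBLE, data, chunk, MPI_DOUBLE,
                   0, MPI_COMM_WORLD);
    else
        MPI_Gather(data, chunk, MPI_DOUBLE, NULL, 0, MPI_DOUBLE,
                   0, MPI_COMM_WORLD);

    free(data);
    MPI_Finalize();
    return 0;
}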
>
> Thanks for your help.
>
> Regards
>
> -----Original Message-----
> From: Jeff Hammond <jeff.science at gmail.com>
> To: discuss <discuss at mpich.org>
> Sent: Fri, Jun 28, 2013 5:12 pm
> Subject: Re: [mpich-discuss] mpi assertion error
>
> Null buffer assertions are suggestive of incorrect programs.  Can you
> share the source of this program?
>
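One way such an assertion commonly arises (just a guess at the cause here, with made-up sizes) is an allocation that fails for the large realDim=16384 case but succeeds for smaller ones, with the NULL result then passed straight into a collective:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Hypothetical sizes, not taken from the actual program:
       16384 x 16384*2 doubles is about 4 GiB. */
    size_t n = 16384UL * 16384UL * 2UL;
    double *buf = malloc(n * sizeof(double));

    if (buf == NULL) {
        /* Without this check, the NULL pointer would be handed to
           MPI_Scatter/MPI_Gather, and an internal check like the
           helper_fns.c "sendbuf + sendtype_true_lb != NULL" assertion
           quoted below is what ends up reporting it. */
        fprintf(stderr, "rank %d: allocation of %zu bytes failed\n",
                rank, n * sizeof(double));
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* ... fill buf and use it in the collectives ... */

    free(buf);
    MPI_Finalize();
    return 0;
}
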
> As for the inline vs attached files debate, I think that pastebin is a
> superior option for large output since it is plain-text readable from
> any internet-enabled device and doesn't lead to huge messages on the
> list.  But for short messages, inlining is definitely good for email
> reading on phones.
>
> Jeff
>
> On Fri, Jun 28, 2013 at 9:46 AM, Danilo <apeironoriepa at aol.com> wrote:
>> In the last topic I read, it was asked more than once to zip the files,
>> so I did.. By the way, this is the first error:
>> Assertion failed in file helper_fns.c at line 361: ((((char *) sendbuf + sendtype_true_lb))) != NULL
>> internal ABORT - process 2
>>
>> Starting from the second execution I get:
>>
>>
>> =====================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   EXIT CODE: 139
>> =   CLEANING UP REMAINING PROCESSES
>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>> =====================================================================================
>>
>> HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:928): assert (!closed) failed
>> HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
>> main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
>> HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:928): assert (!closed) failed
>> HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
>> main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
>> HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:928): assert (!closed) failed
>> HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
>> main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
>> HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:70): one of the processes terminated badly; aborting
>> HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
>> HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:191): launcher returned error waiting for completion
>> main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion
>>
>>
>> Regards,
>> Danilo
>>
>>
>> -----Original Message-----
>> From: Wesley Bland <wbland at mcs.anl.gov>
>> To: discuss <discuss at mpich.org>
>> Sent: Fri, Jun 28, 2013 3:22 pm
>> Subject: Re: [mpich-discuss] mpi assertion error
>>
>> Can you just copy paste your error into the email? Most of us will
>> probably
>> not be all that excited about opening up strange tarballs attached to an
>> email. Also, we get these emails on our phones and tablets where unzipping
>> source code isn't as much of an option.
>>
>> Wesley
>>
>> On Jun 28, 2013, at 8:17 AM, Danilo <apeironoriepa at aol.com> wrote:
>>
>> Good afternoon,
>>
>> I wrote a little application in C to compute a 2D FFT. The app was first
>> executed on a cluster that had a 2007 MPI version installed (I don't
>> remember the package name) and was then adapted for a different cluster
>> with MPI 1.4.1 (I had to change the scatter/gather calls because in the
>> previous version I could use the same buffer for both sendbuf and recvbuf).
>> Anyway, when executing with 2 processes it works fine. When trying with
>> 4/8/16/32 and so on, it first gives an assertion error, as shown in the
>> attached file, and starting from the second time you run it on more than
>> 2 procs it gives exit code 139. The error I'm talking about appears only
>> when you run it with realDim=16384 (which means 16384 rows and 16384x2
>> columns, since it is designed for real/imaginary pairs). I know the code
>> works, since everything was fine on the previous cluster (even with
>> 4-8-16-32 procs), and I can't figure out what the problem is now.. Can
>> you help?
>>
>> As said, attached you can find my application as well as the errors and
>> the MPI info..
>>
>> Regards,
>> Danilo
>> <error+app.tar.gz>
>
>
>
> --
> Jeff Hammond
> jeff.science at gmail.com



-- 
Jeff Hammond
jeff.science at gmail.com
_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss

 