[mpich-discuss] MPICH 3.1 fails doesn't create core file.

Lu, Huiwei huiweilu at mcs.anl.gov
Sun Oct 12 22:49:45 CDT 2014


You could add printf calls to your application to report its progress. Then, when it exits (or disappears), the last message it printed will help you locate the error.
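
As a rough sketch of that kind of progress reporting (log_progress and its arguments are made-up names for illustration, not anything MPICH provides), the important detail is to flush after every message. Otherwise the last lines may still sit in the stdio buffer when the process dies:

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper (not part of MPICH): write one progress line and
 * flush it right away, so the text has left the stdio buffer before the
 * process can disappear. */
void log_progress(FILE *out, int rank, long iteration, const char *stage)
{
    fprintf(out, "[%ld] rank %d iter %ld: %s\n",
            (long)time(NULL), rank, iteration, stage);
    fflush(out);   /* hand the line to the OS immediately */
}
```

Calling something like log_progress(stderr, rank, i, "before receive") around the main steps of the endless loop should narrow down where the master process stops.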
—
Huiwei

On Oct 12, 2014, at 10:42 PM, Anatoly G <anatolyrishon at gmail.com> wrote:

> I'm not sure about my current failure.
> I run my application (MPI processes). Each process executes an endless loop. After a couple of hours the master process fails (it disappears; I can't see it with the 'ps' command). There is no core file, so I'm not sure about the reason for the failure.
> It looks strange that a single process (of the 15 running) fails without dropping a core file.
> I suspect that I'm using MPICH in the wrong way, which causes the failure.
> But like I said before, I'm not sure about the failure.
> 
> Regards,
> Anatoly.
> 
> On Mon, Oct 13, 2014 at 6:21 AM, Wesley Bland <wbland at anl.gov> wrote:
> Calling MPI_Abort is not an error that would cause the OS to dump a core. For that, you'll need to do something within your application that would normally cause it to dump a core. MPICH itself won't do it for you. You can add a divide by zero error to your own code and you'll get a core file of your app. What specifically within MPICH are you trying to debug?
> 
> Thanks,
> Wesley
> 
> 
> 
> On Oct 12, 2014, at 10:09 PM, Anatoly G <anatolyrishon at gmail.com> wrote:
> 
>> Thank you.
>> My OS is Kubuntu 14.04.
>> ulimit is already set to unlimited.
>> When I generate an exception (the abort() function or a divide by zero) I always get a core file,
>> but when MPICH fails (for example with MPI_Abort) no core file is created.
>> Maybe I need to change some other settings?
>> 
>> 
>> Regards,
>> Anatoly.
>> 
>> On Sun, Oct 12, 2014 at 4:47 PM, Lu, Huiwei <huiweilu at mcs.anl.gov> wrote:
>> The core file is created by the OS, not MPICH. What’s the output of ‘ulimit -c’ on your machine? Could you set ‘ulimit -c unlimited’ and try again?
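
The limit can also be checked from inside the program, which may help if the processes are started in an environment that does not inherit the shell's setting. This is a minimal sketch using the POSIX getrlimit/setrlimit calls; enable_core_dumps is a made-up name, and it can only raise the soft limit as far as the hard limit allows.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit to the hard
 * limit and print both values. Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }
    rl.rlim_cur = rl.rlim_max;   /* soft limit may go up to the hard limit */
    if (setrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("setrlimit");
        return -1;
    }
    /* An "unlimited" setting shows up here as a very large number. */
    fprintf(stderr, "core limit: soft=%llu hard=%llu\n",
            (unsigned long long)rl.rlim_cur,
            (unsigned long long)rl.rlim_max);
    return 0;
}
```

Calling enable_core_dumps() once near the start of main() in each process would show whether the limit really is what 'ulimit -c' reports in the shell.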
>> 
>> Huiwei
>> 
>> On Oct 12, 2014, at 7:35 AM, Anatoly G <anatolyrishon at gmail.com> wrote:
>> 
>> > Dear MPICH,
>> > I'm using MPICH 3.1.
>> > After the application runs for 4-5 hours, the master process fails, but it doesn't create a core file.
>> > When I used MPI_Abort (in the past), it didn't create a core file either.
>> > Can I make MPICH create a core file on any failure (if it's an MPI failure, of course)?
>> >
>> > Regards,
>> > Anatoly.
>> >
>> > _______________________________________________
>> > discuss mailing list     discuss at mpich.org
>> > To manage subscription options or unsubscribe:
>> > https://lists.mpich.org/mailman/listinfo/discuss
>> 


