[mpich-discuss] Core dump in parse_args

kumar.tarun at siemens.com kumar.tarun at siemens.com
Fri Sep 22 16:23:23 CDT 2023


I tried the -configfile option, and it also crashed, this time with a memory error.

The config file itself has 1199 words (23777 characters). Overall it has 4 lines, one for each of the 4 partitions.


*** Error in `vmpiexec': double free or corruption (out): 0x00000000007001e0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81329)[0x7ffff7206329]
vmpiexec[0x4262a9]
vmpiexec[0x4220ed]
vmpiexec[0x409d23]
vmpiexec[0x4035da]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7ffff71a7555]
vmpiexec[0x404c05]
======= Memory map: ========
00400000-0047c000 r-xp 00000000 00:77 2085572302                         ./mpich2/linux_x86_64/bin/mpiexec.hydra
0067b000-0067e000 rw-p 0007b000 00:77 2085572302                        ./mpich2/linux_x86_64/bin/mpiexec.hydra
0067e000-00bd9000 rw-p 00000000 00:00 0                                  [heap]
7ffff0000000-7ffff0021000 rw-p 00000000 00:00 0
7ffff0021000-7ffff4000000 ---p 00000000 00:00 0
7ffff6b45000-7ffff6b6a000 r-xp 00000000 08:02 45522                      /usr/lib64/liblzma.so.5.2.2
7ffff6b6a000-7ffff6d69000 ---p 00025000 08:02 45522                      /usr/lib64/liblzma.so.5.2.2
7ffff6d69000-7ffff6d6a000 r--p 00024000 08:02 45522                      /usr/lib64/liblzma.so.5.2.2
7ffff6d6a000-7ffff6d6b000 rw-p 00025000 08:02 45522                      /usr/lib64/liblzma.so.5.2.2
7ffff6d6b000-7ffff6d80000 r-xp 00000000 08:02 447697                     /usr/lib64/libz.so.1.2.7
7ffff6d80000-7ffff6f7f000 ---p 00015000 08:02 447697                     /usr/lib64/libz.so.1.2.7
7ffff6f7f000-7ffff6f80000 r--p 00014000 08:02 447697                     /usr/lib64/libz.so.1.2.7
7ffff6f80000-7ffff6f81000 rw-p 00015000 08:02 447697                     /usr/lib64/libz.so.1.2.7
7ffff6f81000-7ffff6f83000 r-xp 00000000 08:02 45003                      /usr/lib64/libdl-2.17.so
7ffff6f83000-7ffff7183000 ---p 00002000 08:02 45003                      /usr/lib64/libdl-2.17.so
7ffff7183000-7ffff7184000 r--p 00002000 08:02 45003                      /usr/lib64/libdl-2.17.so
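
The vmpiexec frames in the backtrace above are raw addresses. A minimal sketch of how they could be collected and handed to binutils `addr2line` follows. It assumes the binary is not position-independent (the mapping at 00400000 suggests this) and that a copy with symbols is available; without symbols `addr2line` will only print `??`. The helper names are illustrative and not part of MPICH.

```python
import re

# Matches glibc backtrace lines such as
#   vmpiexec[0x4262a9]
#   /lib64/libc.so.6(+0x81329)[0x7ffff7206329]
FRAME = re.compile(
    r"^(?P<module>[^\s(\[]+)(?:\([^)]*\))?\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


def frames_for(backtrace, module):
    """Return the addresses of all frames that belong to `module`, in order."""
    addrs = []
    for line in backtrace.splitlines():
        m = FRAME.match(line.strip())
        if m and m.group("module") == module:
            addrs.append(m.group("addr"))
    return addrs


def addr2line_command(binary, addrs):
    """Build an addr2line invocation: -f prints function names, -e names the binary."""
    return ["addr2line", "-f", "-e", binary, *addrs]
```

For example, `addr2line_command("./mpich2/linux_x86_64/bin/mpiexec.hydra", frames_for(log_text, "vmpiexec"))` yields an argument list that can be passed to `subprocess.run`, where `log_text` is the pasted backtrace.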

Regards
Tarun

From: Kumar, Tarun (DI SW ICS DVT RD QSCE)
Sent: Monday, September 18, 2023 3:16 PM
To: Zhou, Hui <zhouh at anl.gov>; discuss at mpich.org; Thakur, Rajeev <thakur at anl.gov>
Cc: Raffenetti, Ken <raffenet at anl.gov>
Subject: RE: [mpich-discuss] Core dump in parse_args

Thanks. Good to know. I will explore further.

Regards
Tarun

From: Zhou, Hui <zhouh at anl.gov>
Sent: Monday, September 18, 2023 2:34 PM
To: discuss at mpich.org; Thakur, Rajeev <thakur at anl.gov>
Cc: Raffenetti, Ken <raffenet at anl.gov>; Kumar, Tarun (DI SW ICS DVT RD QSCE) <kumar.tarun at siemens.com>
Subject: Re: [mpich-discuss] Core dump in parse_args

Hydra supports a -configfile {name} option, so one could try putting all the command-line arguments in a file and passing that file to Hydra.
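
A minimal sketch of generating such a file from a script, assuming one line per executable segment (as in the 4-line file described elsewhere in this thread). The function names, `./solver`, and `--part` are hypothetical. Quoting rules inside the config file are not covered in this thread, so the sketch simply rejects arguments that contain whitespace.

```python
from pathlib import Path


def write_configfile(path, segments):
    """Write one line per segment; each segment is a list of argument strings.

    Returns the number of lines written.
    """
    lines = []
    for seg in segments:
        for arg in seg:
            # Assumption: no quoting is attempted, so arguments that are
            # empty or contain whitespace are refused rather than guessed at.
            if not arg or any(ch.isspace() for ch in arg):
                raise ValueError(f"argument would need quoting: {arg!r}")
        lines.append(" ".join(seg))
    Path(path).write_text("\n".join(lines) + "\n")
    return len(lines)


def mpiexec_command(path, mpiexec="mpiexec"):
    """Build the launcher invocation that reads its arguments from `path`."""
    return [mpiexec, "-configfile", str(path)]
```

For example, `write_configfile("args.cfg", [["-n", "1", "./solver", "--part", str(i)] for i in range(4)])` writes four lines, and `mpiexec_command("args.cfg")` gives the argument list to launch with.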

Hui

________________________________
From: Raffenetti, Ken via discuss <discuss at mpich.org>
Sent: Monday, September 18, 2023 4:22 PM
To: discuss at mpich.org; Thakur, Rajeev <thakur at anl.gov>
Cc: Raffenetti, Ken <raffenet at anl.gov>; kumar.tarun at siemens.com
Subject: Re: [mpich-discuss] Core dump in parse_args


I’m not aware of a hard-coded limit, but I am able to reproduce a segfault reliably by passing 1000 arguments to a toy executable. Will update with more info when we have it.
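
A minimal sketch of what such a reproducer could look like. The exact command used is not given in this thread, so the dummy argument names, the choice of `/bin/true` as the toy executable, and the helper names are all assumptions.

```python
import subprocess


def build_command(n_args, mpiexec="mpiexec", program="/bin/true"):
    """Return an mpiexec command line that passes n_args dummy arguments."""
    dummy = [f"arg{i}" for i in range(n_args)]
    return [mpiexec, "-n", "1", program, *dummy]


def run(n_args):
    """Launch the command and return the exit status.

    On POSIX, subprocess reports death by signal as a negative return
    code, so a segfault in the launcher itself shows up as -11.
    """
    return subprocess.run(build_command(n_args)).returncode
```

Calling `run(n)` for increasing `n` is one way to narrow down the argument count at which the launcher starts to fail.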



Ken



From: "kumar.tarun--- via discuss" <discuss at mpich.org>
Reply-To: "discuss at mpich.org" <discuss at mpich.org>
Date: Friday, September 15, 2023 at 4:57 PM
To: "Thakur, Rajeev" <thakur at anl.gov>, "discuss at mpich.org" <discuss at mpich.org>
Cc: "kumar.tarun at siemens.com" <kumar.tarun at siemens.com>
Subject: Re: [mpich-discuss] Core dump in parse_args



Thanks Rajeev for your reply.

I don’t have the exact number right now, but it’s definitely more than 1000. Is there a hard-coded limit?



Regards
Tarun



From: Thakur, Rajeev <thakur at anl.gov>
Sent: Friday, September 15, 2023 2:40 PM
To: discuss at mpich.org
Cc: Kumar, Tarun (DI SW ICS DVT RD QSCE) <kumar.tarun at siemens.com>
Subject: Re: [mpich-discuss] Core dump in parse_args



How many arguments?



Rajeev



From: "kumar.tarun--- via discuss" <discuss at mpich.org>
Reply-To: "discuss at mpich.org" <discuss at mpich.org>
Date: Friday, September 15, 2023 at 4:22 PM
To: "discuss at mpich.org" <discuss at mpich.org>
Cc: "kumar.tarun at siemens.com" <kumar.tarun at siemens.com>
Subject: [mpich-discuss] Core dump in parse_args



Hi,

I recently encountered a crash: mpiexec crashes when it is run from a bash script with a large number of arguments. The backtrace from the core dump is as follows:



#0  0x0000000000408ec3 in parse_args ()
#1  0x0000000000409f26 in HYD_uii_mpx_get_parameters ()
#2  0x000000000040397a in main ()



If I reduce the number of arguments then I don’t see the crash. Is it a known issue?



Regards

Tarun

