[mpich-discuss] Problems running MPICH jobs under SLURM

Biddiscombe, John A. biddisco at cscs.ch
Sun Jun 9 01:03:42 CDT 2013


For reasons unclear to me, with this tarball I get no mpiexec compiled.

I'm doing a diff between the latest tarballs and trying to find the problem ...

JB

ll install/bin/
total 68
lrwxrwxrwx 1 biddisco biddisco     6 Jun  9 07:55 mpic++ -> mpicxx
-rwxr-xr-x 1 biddisco biddisco  9905 Jun  9 07:55 mpicc
-rwxr-xr-x 1 biddisco biddisco  9300 Jun  9 07:55 mpichversion
-rwxr-xr-x 1 biddisco biddisco  9458 Jun  9 07:55 mpicxx
-rwxr-xr-x 1 biddisco biddisco 11551 Jun  9 07:55 mpif77
-rwxr-xr-x 1 biddisco biddisco 13375 Jun  9 07:55 mpif90
-rwxr-xr-x 1 biddisco biddisco  3430 Jun  9 07:55 parkill
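
A minimal sketch (not from the original thread) of how the listing above could be checked automatically after an install. The function name missing_executables and the default list of expected names are assumptions made for illustration; only mpiexec is named in this message as missing.

```python
import os

def missing_executables(bin_dir, expected=("mpiexec",)):
    """Return the names in `expected` that are not executable files in bin_dir.

    `expected` is a hypothetical list; extend it with whatever the
    build is supposed to install.
    """
    missing = []
    for name in expected:
        path = os.path.join(bin_dir, name)
        # isfile/access follow symlinks, so a dangling link counts as missing.
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing

# Example use, assuming the install prefix from this thread:
#   missing_executables("install/bin", ("mpicc", "mpiexec"))
```

Run against the directory listed above, this would be expected to report mpiexec, since no such entry appears in the listing.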



-----Original Message-----
From: Pavan Balaji [mailto:balaji at mcs.anl.gov] 
Sent: 09 June 2013 07:13
To: discuss at mpich.org
Cc: Biddiscombe, John A.
Subject: Re: [mpich-discuss] Problems running MPICH jobs under SLURM

John,

Can you try the latest nightly snapshot?

http://www.mpich.org/static/tarballs/nightly/master/mpich/

  -- Pavan

On 06/08/2013 03:23 PM, Pavan Balaji wrote:
>
> Thanks, John.  I'll look into it and get back to you if I need any 
> more information.
>
> Btw, you should not need sudo at all.  You might have some previously
> left over files with root permissions that might have caused the issue.
> If you delete the entire directory and start from scratch, this
> issue should not be there.
>
>   -- Pavan
>
> On 06/08/2013 03:02 PM, Biddiscombe, John A. wrote:
>> Following your instructions (I only have 1 node, so changed N2 n4 to
>> N1 n2), same error, listed below...
>>
>> [NB. One interesting thing is that I cannot do a make as user
>> biddisco and have to sudo make, as it otherwise gives me a
>> permission error:
>>
>> mkdir -p '/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/etc'
>>   /usr/bin/install -c -m 644 src/env/mpicc.conf src/env/mpif77.conf src/env/mpif90.conf src/env/mpicxx.conf '/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/etc'
>> /usr/bin/install: cannot remove `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/etc/mpicc.conf': Permission denied
>> /usr/bin/install: cannot remove `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/etc/mpif77.conf': Permission denied
>> /usr/bin/install: cannot remove `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/etc/mpif90.conf': Permission denied
>> /usr/bin/install: cannot remove `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/etc/mpicxx.conf': Permission denied
>> make[3]: *** [install-sysconfDATA] Error 1
>> make[3]: Leaving directory `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79'
>> make[2]: *** [install-am] Error 2
>> make[2]: Leaving directory `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79'
>> make[1]: *** [install-recursive] Error 1
>> make[1]: Leaving directory `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79'
>> make: *** [install] Error 2
>> ]
>>
>> JB
>>
>> biddisco@breno2 ~/build/mpich-master-v3.0.4-259-gf322ce79 $ ./install/bin/mpiexec -n 2 ./hello
>> *** glibc detected *** ./hello: double free or corruption (fasttop): 0x00000000017b2340 ***
>> ======= Backtrace: =========
>> /lib/x86_64-linux-gnu/libc.so.6(+0x7eb96)[0x7fcaeb4a2b96]
>> *** glibc detected *** ./hello: double free or corruption (fasttop): 0x00000000011e7340 ***
>> ======= Backtrace: =========
>> /lib/x86_64-linux-gnu/libc.so.6(+0x7eb96)[0x7fd1a77ddb96]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPIDI_Populate_vc_node_ids+0x3f9)[0x7fd1a7bc75a9]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPID_Init+0x136)[0x7fd1a7bc1da6]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPIR_Init_thread+0x22f)[0x7fd1a7c78f1f]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPI_Init+0xae)[0x7fd1a7c788be]
>> ./hello[0x40081e]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPIDI_Populate_vc_node_ids+0x3f9)[0x7fcaeb88c5a9]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPID_Init+0x136)[0x7fcaeb886da6]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPIR_Init_thread+0x22f)[0x7fcaeb93df1f]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPI_Init+0xae)[0x7fcaeb93d8be]
>> ./hello[0x40081e]
>> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7fcaeb44576d]
>> ./hello[0x400719]
>> ======= Memory map: ========
>> 00400000-00401000 r-xp 00000000 08:01 10625669 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/hello
>> 00600000-00601000 r--p 00000000 08:01 10625669 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/hello
>> 00601000-00602000 rw-p 00001000 08:01 10625669 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/hello
>> 017b1000-017d2000 rw-p 00000000 00:00 0 [heap]
>> 7fcaeabe1000-7fcaeabe4000 rw-p 00000000 00:00 0
>> 7fcaeabe4000-7fcaeabf9000 r-xp 00000000 08:01 9047556 /lib/x86_64-linux-gnu/libgcc_s.so.1
>> 7fcaeabf9000-7fcaeadf8000 ---p 00015000 08:01 9047556 /lib/x86_64-linux-gnu/libgcc_s.so.1
>> 7fcaeadf8000-7fcaeadf9000 r--p 00014000 08:01 9047556 /lib/x86_64-linux-gnu/libgcc_s.so.1
>> 7fcaeadf9000-7fcaeadfa000 rw-p 00015000 08:01 9047556 /lib/x86_64-linux-gnu/libgcc_s.so.1
>> 7fcaeadfa000-7fcaeae12000 r-xp 00000000 08:01 9050338 /lib/x86_64-linux-gnu/libpthread-2.15.so
>> 7fcaeae12000-7fcaeb011000 ---p 00018000 08:01 9050338 /lib/x86_64-linux-gnu/libpthread-2.15.so
>> 7fcaeb011000-7fcaeb012000 r--p 00017000 08:01 9050338 /lib/x86_64-linux-gnu/libpthread-2.15.so
>> 7fcaeb012000-7fcaeb013000 rw-p 00018000 08:01 9050338 /lib/x86_64-linux-gnu/libpthread-2.15.so
>> 7fcaeb013000-7fcaeb017000 rw-p 00000000 00:00 0
>> 7fcaeb017000-7fcaeb01e000 r-xp 00000000 08:01 9050343 /lib/x86_64-linux-gnu/librt-2.15.so
>> 7fcaeb01e000-7fcaeb21d000 ---p 00007000 08:01 9050343 /lib/x86_64-linux-gnu/librt-2.15.so
>> 7fcaeb21d000-7fcaeb21e000 r--p 00006000 08:01 9050343 /lib/x86_64-linux-gnu/librt-2.15.so
>> 7fcaeb21e000-7fcaeb21f000 rw-p 00007000 08:01 9050343 /lib/x86_64-linux-gnu/librt-2.15.so
>> 7fcaeb21f000-7fcaeb223000 r-xp 00000000 08:01 8661134 /home/biddisco/apps/mpich-3.0.4/lib/libmpl.so.1.0.0
>> 7fcaeb223000-7fcaeb422000 ---p 00004000 08:01 8661134 /home/biddisco/apps/mpich-3.0.4/lib/libmpl.so.1.0.0
>> 7fcaeb422000-7fcaeb423000 r--p 00003000 08:01 8661134 /home/biddisco/apps/mpich-3.0.4/lib/libmpl.so.1.0.0
>> 7fcaeb423000-7fcaeb424000 rw-p 00004000 08:01 8661134 /home/biddisco/apps/mpich-3.0.4/lib/libmpl.so.1.0.0
>> 7fcaeb424000-7fcaeb5d9000 r-xp 00000000 08:01 9050358 /lib/x86_64-linux-gnu/libc-2.15.so
>> 7fcaeb5d9000-7fcaeb7d8000 ---p 001b5000 08:01 9050358 /lib/x86_64-linux-gnu/libc-2.15.so
>> 7fcaeb7d8000-7fcaeb7dc000 r--p 001b4000 08:01 9050358 /lib/x86_64-linux-gnu/libc-2.15.so
>> 7fcaeb7dc000-7fcaeb7de000 rw-p 001b8000 08:01 9050358 /lib/x86_64-linux-gnu/libc-2.15.so
>> 7fcaeb7de000-7fcaeb7e3000 rw-p 00000000 00:00 0
>> 7fcaeb7e3000-7fcaeba03000 r-xp 00000000 08:01 11675463 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10.0.4
>> 7fcaeba03000-7fcaebc03000 ---p 00220000 08:01 11675463 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10.0.4
>> 7fcaebc03000-7fcaebc10000 r--p 00220000 08:01 11675463 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10.0.4
>> 7fcaebc10000-7fcaebc16000 rw-p 0022d000 08:01 11675463 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10.0.4
>> 7fcaebc16000-7fcaebc4e000 rw-p 00000000 00:00 0
>> 7fcaebc4e000-7fcaebc70000 r-xp 00000000 08:01 9050344 /lib/x86_64-linux-gnu/ld-2.15.so
>> 7fcaebe54000-7fcaebe56000 rw-p 00000000 00:00 0
>> 7fcaebe6d000-7fcaebe70000 rw-p 00000000 00:00 0
>> 7fcaebe70000-7fcaebe71000 r--p 00022000 08:01 9050344 /lib/x86_64-linux-gnu/ld-2.15.so
>> 7fcaebe71000-7fcaebe73000 rw-p 00023000 08:01 9050344 /lib/x86_64-linux-gnu/ld-2.15.so
>> 7fff671e5000-7fff67206000 rw-p 00000000 00:00 0 [stack]
>> 7fff673b4000-7fff673b5000 r-xp 00000000 00:00 0 [vdso]
>> ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
>>
>> ===================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   EXIT CODE: 6
>> =   CLEANING UP REMAINING PROCESSES
>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>> ===================================================================================
>> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
>> This typically refers to a problem with your application.
>> Please see the FAQ page for debugging suggestions
>>
>
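
The quoted suggestion that leftover root-owned files caused the "Permission denied" errors can be checked before deleting anything. The following is a minimal sketch under stated assumptions, not something posted in the thread: the helper name files_not_owned_by is hypothetical, and it only compares numeric owner ids. Directories are reported as well as files, because removing a file needs write permission on the directory that contains it.

```python
import os

def files_not_owned_by(root, uid):
    """Yield root and every path below it whose owner is not `uid`."""
    # lstat reports a symlink's own owner rather than its target's.
    if os.lstat(root).st_uid != uid:
        yield root
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_uid != uid:
                yield path

# Example use, assuming the build directory from this thread:
#   for p in files_not_owned_by("install", os.getuid()):
#       print(p)
```

If this prints nothing for the build tree, ownership is probably not the cause and the permission bits themselves would be the next thing to look at.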

--
Pavan Balaji
http://www.mcs.anl.gov/~balaji


