[mpich-discuss] Problems running MPICH jobs under SLURM

Pavan Balaji balaji at mcs.anl.gov
Sun Jun 9 00:13:09 CDT 2013


John,

Can you try the latest nightly snapshot?

http://www.mpich.org/static/tarballs/nightly/master/mpich/
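
Separately, on the permission errors quoted below: here is a rough sketch (untested on your machine) for listing anything under the install tree that is not owned by your own user. The function name list_foreign_files and the PREFIX variable are only illustrative; the default path is copied from your error messages.

```shell
#!/bin/sh
# Sketch: print every file or directory under $1 that is NOT owned by the
# current user. Leftovers from an earlier "sudo make install" would show up here.
list_foreign_files() {
    # find's -user test accepts a numeric user ID as well as a name
    find "$1" ! -user "$(id -u)" -print
}

# Assumed default, taken from the paths in the quoted error output.
PREFIX="${PREFIX:-$HOME/build/mpich-master-v3.0.4-259-gf322ce79/install}"

# Only scan if the directory exists, so the script is harmless elsewhere.
if [ -d "$PREFIX" ]; then
    list_foreign_files "$PREFIX"
fi
```

If this prints nothing, file ownership is probably not the cause. If it lists the *.conf files under etc/, removing the whole directory and rebuilding as your normal user should be enough.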

  -- Pavan

On 06/08/2013 03:23 PM, Pavan Balaji wrote:
>
> Thanks, John.  I'll look into it and get back to you if I need any more
> information.
>
> Btw, you should not need sudo at all.  The permission errors were
> probably caused by leftover files with root permissions from an
> earlier attempt.  If you delete the entire directory and start from
> scratch, the issue should go away.
>
>   -- Pavan
>
> On 06/08/2013 03:02 PM, Biddiscombe, John A. wrote:
>> I followed your instructions (I only have 1 node, so I changed N2 n4
>> to N 1 n2) and get the same error, listed below.
>>
>> [NB: one interesting thing is that I cannot run make as user biddisco
>> and have to use sudo make, because otherwise it fails with a
>> permission error:
>> mkdir -p
>> '/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/etc'
>>   /usr/bin/install -c -m 644 src/env/mpicc.conf src/env/mpif77.conf
>> src/env/mpif90.conf src/env/mpicxx.conf
>> '/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/etc'
>> /usr/bin/install: cannot remove
>> `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/etc/mpicc.conf':
>> Permission denied
>> /usr/bin/install: cannot remove
>> `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/etc/mpif77.conf':
>> Permission denied
>> /usr/bin/install: cannot remove
>> `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/etc/mpif90.conf':
>> Permission denied
>> /usr/bin/install: cannot remove
>> `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/etc/mpicxx.conf':
>> Permission denied
>> make[3]: *** [install-sysconfDATA] Error 1
>> make[3]: Leaving directory
>> `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79'
>> make[2]: *** [install-am] Error 2
>> make[2]: Leaving directory
>> `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79'
>> make[1]: *** [install-recursive] Error 1
>> make[1]: Leaving directory
>> `/home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79'
>> make: *** [install] Error 2
>> ]
>>
>> JB
>>
>> biddisco at breno2 ~/build/mpich-master-v3.0.4-259-gf322ce79 $
>> ./install/bin/mpiexec -n 2 ./hello
>> *** glibc detected *** ./hello: double free or corruption (fasttop): 0x00000000017b2340 ***
>> ======= Backtrace: =========
>> /lib/x86_64-linux-gnu/libc.so.6(+0x7eb96)[0x7fcaeb4a2b96]
>> *** glibc detected *** ./hello: double free or corruption (fasttop): 0x00000000011e7340 ***
>> ======= Backtrace: =========
>> /lib/x86_64-linux-gnu/libc.so.6(+0x7eb96)[0x7fd1a77ddb96]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPIDI_Populate_vc_node_ids+0x3f9)[0x7fd1a7bc75a9]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPID_Init+0x136)[0x7fd1a7bc1da6]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPIR_Init_thread+0x22f)[0x7fd1a7c78f1f]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPI_Init+0xae)[0x7fd1a7c788be]
>> ./hello[0x40081e]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPIDI_Populate_vc_node_ids+0x3f9)[0x7fcaeb88c5a9]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPID_Init+0x136)[0x7fcaeb886da6]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPIR_Init_thread+0x22f)[0x7fcaeb93df1f]
>> /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10(MPI_Init+0xae)[0x7fcaeb93d8be]
>> ./hello[0x40081e]
>> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7fcaeb44576d]
>> ./hello[0x400719]
>> ======= Memory map: ========
>> 00400000-00401000 r-xp 00000000 08:01 10625669 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/hello
>> 00600000-00601000 r--p 00000000 08:01 10625669 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/hello
>> 00601000-00602000 rw-p 00001000 08:01 10625669 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/hello
>> 017b1000-017d2000 rw-p 00000000 00:00 0 [heap]
>> 7fcaeabe1000-7fcaeabe4000 rw-p 00000000 00:00 0
>> 7fcaeabe4000-7fcaeabf9000 r-xp 00000000 08:01 9047556 /lib/x86_64-linux-gnu/libgcc_s.so.1
>> 7fcaeabf9000-7fcaeadf8000 ---p 00015000 08:01 9047556 /lib/x86_64-linux-gnu/libgcc_s.so.1
>> 7fcaeadf8000-7fcaeadf9000 r--p 00014000 08:01 9047556 /lib/x86_64-linux-gnu/libgcc_s.so.1
>> 7fcaeadf9000-7fcaeadfa000 rw-p 00015000 08:01 9047556 /lib/x86_64-linux-gnu/libgcc_s.so.1
>> 7fcaeadfa000-7fcaeae12000 r-xp 00000000 08:01 9050338 /lib/x86_64-linux-gnu/libpthread-2.15.so
>> 7fcaeae12000-7fcaeb011000 ---p 00018000 08:01 9050338 /lib/x86_64-linux-gnu/libpthread-2.15.so
>> 7fcaeb011000-7fcaeb012000 r--p 00017000 08:01 9050338 /lib/x86_64-linux-gnu/libpthread-2.15.so
>> 7fcaeb012000-7fcaeb013000 rw-p 00018000 08:01 9050338 /lib/x86_64-linux-gnu/libpthread-2.15.so
>> 7fcaeb013000-7fcaeb017000 rw-p 00000000 00:00 0
>> 7fcaeb017000-7fcaeb01e000 r-xp 00000000 08:01 9050343 /lib/x86_64-linux-gnu/librt-2.15.so
>> 7fcaeb01e000-7fcaeb21d000 ---p 00007000 08:01 9050343 /lib/x86_64-linux-gnu/librt-2.15.so
>> 7fcaeb21d000-7fcaeb21e000 r--p 00006000 08:01 9050343 /lib/x86_64-linux-gnu/librt-2.15.so
>> 7fcaeb21e000-7fcaeb21f000 rw-p 00007000 08:01 9050343 /lib/x86_64-linux-gnu/librt-2.15.so
>> 7fcaeb21f000-7fcaeb223000 r-xp 00000000 08:01 8661134 /home/biddisco/apps/mpich-3.0.4/lib/libmpl.so.1.0.0
>> 7fcaeb223000-7fcaeb422000 ---p 00004000 08:01 8661134 /home/biddisco/apps/mpich-3.0.4/lib/libmpl.so.1.0.0
>> 7fcaeb422000-7fcaeb423000 r--p 00003000 08:01 8661134 /home/biddisco/apps/mpich-3.0.4/lib/libmpl.so.1.0.0
>> 7fcaeb423000-7fcaeb424000 rw-p 00004000 08:01 8661134 /home/biddisco/apps/mpich-3.0.4/lib/libmpl.so.1.0.0
>> 7fcaeb424000-7fcaeb5d9000 r-xp 00000000 08:01 9050358 /lib/x86_64-linux-gnu/libc-2.15.so
>> 7fcaeb5d9000-7fcaeb7d8000 ---p 001b5000 08:01 9050358 /lib/x86_64-linux-gnu/libc-2.15.so
>> 7fcaeb7d8000-7fcaeb7dc000 r--p 001b4000 08:01 9050358 /lib/x86_64-linux-gnu/libc-2.15.so
>> 7fcaeb7dc000-7fcaeb7de000 rw-p 001b8000 08:01 9050358 /lib/x86_64-linux-gnu/libc-2.15.so
>> 7fcaeb7de000-7fcaeb7e3000 rw-p 00000000 00:00 0
>> 7fcaeb7e3000-7fcaeba03000 r-xp 00000000 08:01 11675463 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10.0.4
>> 7fcaeba03000-7fcaebc03000 ---p 00220000 08:01 11675463 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10.0.4
>> 7fcaebc03000-7fcaebc10000 r--p 00220000 08:01 11675463 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10.0.4
>> 7fcaebc10000-7fcaebc16000 rw-p 0022d000 08:01 11675463 /home/biddisco/build/mpich-master-v3.0.4-259-gf322ce79/install/lib/libmpich.so.10.0.4
>> 7fcaebc16000-7fcaebc4e000 rw-p 00000000 00:00 0
>> 7fcaebc4e000-7fcaebc70000 r-xp 00000000 08:01 9050344 /lib/x86_64-linux-gnu/ld-2.15.so
>> 7fcaebe54000-7fcaebe56000 rw-p 00000000 00:00 0
>> 7fcaebe6d000-7fcaebe70000 rw-p 00000000 00:00 0
>> 7fcaebe70000-7fcaebe71000 r--p 00022000 08:01 9050344 /lib/x86_64-linux-gnu/ld-2.15.so
>> 7fcaebe71000-7fcaebe73000 rw-p 00023000 08:01 9050344 /lib/x86_64-linux-gnu/ld-2.15.so
>> 7fff671e5000-7fff67206000 rw-p 00000000 00:00 0 [stack]
>> 7fff673b4000-7fff673b5000 r-xp 00000000 00:00 0 [vdso]
>> ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
>>
>> ===================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   EXIT CODE: 6
>> =   CLEANING UP REMAINING PROCESSES
>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>> ===================================================================================
>>
>> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
>> This typically refers to a problem with your application.
>> Please see the FAQ page for debugging suggestions
>>
>

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji


