[mpich-discuss] 3.2 build help

Galloway, Michael D. gallowaymd at ornl.gov
Mon Mar 28 11:33:06 CDT 2016


Good Day All,

We’re trying to build MPICH 3.2 for our CentOS 7 HPC environment using InfiniBand (IB). We don’t have MXM installed, so I’m trying this:

./configure --prefix=/software/tools/apps/mpich/gnu/3.2 --with-pm=hydra --with-device=ch3:nemesis --with-ibverbs=/usr --with-pbs=/opt/torque

But we end up with backtraces of the form:

mpi_script_launcher.run:17440 terminated with signal 7 at PC=7f8caca067f4 SP=7fff7c4ffb20. Backtrace:
mpi_script_launcher.run:17441 terminated with signal 7 at PC=7fece77297f4 SP=7ffd3b5ff100. Backtrace:
/software/tools/apps/mpich/gnu/3.2/lib/libmpi.so.12(MPID_nem_init+0x964)[0x7f8caca067f4]
/software/tools/apps/mpich/gnu/3.2/lib/libmpi.so.12(MPID_nem_init+0x964)[0x7fece77297f4]
/software/tools/apps/mpich/gnu/3.2/lib/libmpi.so.12(MPIDI_CH3_Init+0x29)[0x7f8cac9f7609]
/software/tools/apps/mpich/gnu/3.2/lib/libmpi.so.12(MPIDI_CH3_Init+0x29)[0x7fece771a609]
/software/tools/apps/mpich/gnu/3.2/lib/libmpi.so.12(MPID_Init+0x18b)[0x7f8cac9ececb]
/software/tools/apps/mpich/gnu/3.2/lib/libmpi.so.12(MPID_Init+0x18b)[0x7fece770fecb]
/software/tools/apps/mpich/gnu/3.2/lib/libmpi.so.12(MPIR_Init_thread+0x34c)[0x7f8cac95345c]
/software/tools/apps/mpich/gnu/3.2/lib/libmpi.so.12(MPIR_Init_thread+0x34c)[0x7fece767645c]
/software/tools/apps/mpich/gnu/3.2/lib/libmpi.so.12(MPI_Init+0x7e)[0x7f8cac952ede]
/home/m8a/mpi_script_launcher/mpi_script_launcher.run[0x4008d0]
/software/tools/apps/mpich/gnu/3.2/lib/libmpi.so.12(MPI_Init+0x7e)[0x7fece7675ede]
/home/m8a/mpi_script_launcher/mpi_script_launcher.run[0x4008d0]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f8cac4fcb15]
/home/m8a/mpi_script_launcher/mpi_script_launcher.run[0x4007c9]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fece721fb15]
/home/m8a/mpi_script_launcher/mpi_script_launcher.run[0x4007c9]
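
For readers of the backtrace: the "signal 7" is reported only as a number, and numeric signal values are platform-specific. On x86-64 Linux, signal 7 is SIGBUS. The short Python sketch below is an illustration, not part of the original report; it looks up the symbolic name on whichever machine runs it. The helper name `describe_signal` is hypothetical.

```python
import signal


def describe_signal(signum):
    """Return the symbolic name of a signal number on the current platform.

    `describe_signal` is an illustrative helper, not part of MPICH or Hydra.
    """
    try:
        return signal.Signals(signum).name
    except ValueError:
        # The number does not correspond to a signal known on this platform.
        return "unknown signal {}".format(signum)


# The backtraces report "terminated with signal 7".
print(describe_signal(7))
```

Running this on a compute node (rather than a workstation with a different OS) gives the name that applies to the failing processes.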



Similarly, with MVAPICH we get failures of this form:


mpirun -n 4 /home/m8a/mpi_script_launcher/MVAPICH/mpi_script_launcher.run  /home/m8a/mpi_script_launcher/mpi_bash_script_example.sh

[cli_0]: aborting job:
Fatal error in MPI_Init:
Other MPI error, error stack:
MPIR_Init_thread(514)..........:
MPID_Init(365).................: channel initialization failed
MPIDI_CH3_Init(495)............:
MPIDI_CH3I_SHMEM_Helper_fn(921): write: Success
===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 17901 RUNNING AT mod-condo-login02.ornl.gov
=   EXIT CODE: 1
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
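
Both failures are reported from initialization routines (MPID_nem_init in the MPICH trace, MPIDI_CH3I_SHMEM_Helper_fn in the MVAPICH one) whose names suggest shared-memory setup. One thing that may be worth ruling out, offered as an assumption and not a diagnosis, is a lack of free space in the locations commonly used for shared-memory backing files. The sketch below assumes those are /dev/shm and /tmp, which may not match this cluster; `free_mib` is a hypothetical helper.

```python
import os
import shutil


def free_mib(path):
    """Return the free space at `path` in MiB, or None if the path is missing.

    `free_mib` is an illustrative helper; the paths checked below are
    assumptions about where shared-memory files might be created.
    """
    if not os.path.exists(path):
        return None
    return shutil.disk_usage(path).free / (1024 * 1024)


for path in ("/dev/shm", "/tmp"):
    free = free_mib(path)
    if free is None:
        print("{}: not present on this host".format(path))
    else:
        print("{}: {:.1f} MiB free".format(path, free))
```

The check is only meaningful when run on the same host as the failing job (here, the node named in the termination banner).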



I suspect we are doing something silly here, but I’m not sure what; Open MPI code on the same cluster runs fine.

Is there a current recommendation for IB/PBS/Torque build flags?

— michael
