<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><div dir="ltr">Hello,<div><br></div><div>I am experiencing issues on the SDSC Comet system when using the Intel compilers with MVAPICH2. The scheduling system on Comet is Slurm. The code appears to segfault inside `MPI_Comm_dup`, but before that it appears to reject a connection request to "self" (i.e., same IP to same IP). </div><div><br></div><div>The modules loaded are:</div><div><br></div><div>```<br>$ module list<br><br>Currently Loaded Modulefiles:<br> 1) intel/2013_sp1.2.144 2) mvapich2_ib/2.1 3) gnutools/2.69<br>```</div><div><br></div><div>I am attempting to use the `ib0` interface. In my job script, I am launching 3 different applications. I am **not** using Slurm's `--multi-prog` option; instead, I am using 3 separate `srun` commands. My job has to be launched this way.</div><div><br></div><div>With Open MPI, I can set the MCA parameters to allow connections from `self` at the byte transfer layer (BTL), i.e., `OMPI_MCA_btl="self,openib"`, and tell Slurm to use `--mpi=pmi2`. </div><div><br></div><div>I think the MVAPICH2 errors that I am experiencing stem from the "self" connection being rejected (i.e., node to itself). Is there a way to tell MVAPICH2 to allow the self connection? I think I want the <font color="#000000"><span style="white-space:pre-wrap">`--with-device=ch3:nemesis:ib` configure option in some capacity, but I'm not sure whether that would be enough to allow the connection from the node to itself. Is the self connection inherently a TCP connection? Do I still need `--mpi=pmi2` for `srun`? Can I use `srun`, or do I need to use `mpiexec` explicitly?</span></font></div><div><span style="color:rgb(0,0,0);white-space:pre-wrap"><br></span></div><div><span style="color:rgb(0,0,0);white-space:pre-wrap">Could this also be the cause of the error described by this FAQ? 
</span><font color="#000000"><span style="white-space:pre-wrap"><a href="https://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_All_my_processes_get_rank_0">https://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_All_my_processes_get_rank_0</a>.</span></font></div><div><font color="#000000"><span style="white-space:pre-wrap"><br></span></font></div><div><font color="#000000"><span style="white-space:pre-wrap">Any help you can provide is greatly appreciated.</span></font></div><div><br></div><div>-Melissa</div>
</div>