[mpich-discuss] MPICH Connection to Self Rejected

Kenneth Raffenetti raffenet at mcs.anl.gov
Mon May 1 14:28:36 CDT 2017


Hi Melissa,

It is probably best to post this question to
mvapich-discuss at cse.ohio-state.edu and go from there.

Thanks,
Ken

On 05/01/2017 01:15 PM, Melissa Romanus wrote:
> Hello,
>
> I am experiencing issues on the SDSC Comet system when using the Intel
> compilers with MVAPICH2. The scheduling system on Comet is Slurm. The
> code appears to segfault inside `MPI_Comm_dup`, but before that it
> appears to reject a connection request to "self" (i.e., same IP to
> same IP).
>
> The modules loaded are:
>
> ```
> $ module list
>
> Currently Loaded Modulefiles:
>   1) intel/2013_sp1.2.144   2) mvapich2_ib/2.1        3) gnutools/2.69
> ```
>
> I am attempting to use the `ib0` interface. In my job script, I am
> launching 3 different applications. I am **not** using Slurm's
> `--multi-prog` option; instead, I am using 3 separate `srun` commands.
> My job has to be launched this way.
>
> With Open MPI, I can set the MCA parameters to allow `self` connections
> at the byte-transfer layer (i.e., `OMPI_MCA_btl="self,openib"`) and
> tell Slurm to use `--mpi=pmi2`.
>
> I think the MVAPICH errors that I am experiencing stem from the fact
> that the "self" connection is rejected (i.e., node to itself). Is there
> a way to tell MVAPICH to allow the self connection? I think I want the
> `--with-device=ch3:nemesis:ib` configure option in some capacity, but
> I'm not sure whether that would be enough to allow the connection from
> the node to itself. Is the self connection inherently a TCP connection?
> Do I still need `--mpi=pmi2` for `srun`? Can I use `srun`, or do I need
> to use `mpiexec` explicitly?
>
> Could this also be the cause of the error described by this FAQ?
> https://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Q:_All_my_processes_get_rank_0.
>
> Any help you can provide is greatly appreciated.
>
> -Melissa
>
>
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
>
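
For reference, the Open MPI setup described in the quoted message could be sketched roughly as the job-script fragment below. This is only an illustration under stated assumptions: the application names (`./app_a`, `./app_b`, `./app_c`), the task counts, the `launch` helper, and the choice to run the three steps concurrently are all hypothetical and do not come from the original post. Only `OMPI_MCA_btl="self,openib"` and `--mpi=pmi2` are taken from the message. By default the script only prints the `srun` commands it would issue.

```shell
#!/bin/bash
# Sketch of the Open MPI-side setup described in the quoted message.
# Assumptions (not from the original post): the application names,
# the task counts, and the concurrent launch are hypothetical.

# Allow the loopback ("self") and InfiniBand (openib) byte-transfer layers.
export OMPI_MCA_btl="self,openib"

# Print each command by default; set DRY_RUN=0 to run it under Slurm.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: issue one srun job step using the PMI2 plugin.
launch() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "srun --mpi=pmi2 $*"
    else
        # Assumed: the three applications run concurrently in the background.
        srun --mpi=pmi2 "$@" &
    fi
}

# Three separate srun invocations rather than a single --multi-prog step.
launch -n 4 ./app_a
launch -n 4 ./app_b
launch -n 2 ./app_c

# Wait for any backgrounded job steps to finish (no-op in dry-run mode).
wait
```

Running the fragment with `DRY_RUN=0` inside a Slurm allocation would issue the three job steps for real; whether the equivalent works with MVAPICH2 is exactly the open question in this thread.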