[mpich-discuss] Running mpich over InfiniBand Open Fabrics

"Antonio J. Peña" apenya at mcs.anl.gov
Thu Jan 15 10:21:10 CST 2015


Hi Ramiro,

The people who contributed the ib netmod are looking into this issue. In 
the meantime, I'd suggest trying the MXM netmod, which targets IB networks 
and uses the MXM API instead of Verbs to interact with the HCA.
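
For reference, here is a rough sketch of what an MXM build might look like, keeping the other options from the original configure line. The MXM install path is an assumption (MXM_DIR is just a placeholder, adjust it to wherever MXM is installed on your nodes), and the script only prints the command so it can be reviewed before actually running it.

```shell
#!/bin/sh
# Dry-run sketch: compose and print a configure command for an MXM build.
# ASSUMPTION: MXM is installed under /opt/mellanox/mxm; override MXM_DIR
# in the environment if it lives elsewhere.
MXM_DIR="${MXM_DIR:-/opt/mellanox/mxm}"

# Same options as the original build, with the ib netmod swapped for mxm.
CONFIGURE_ARGS="--enable-fortran=yes \
--with-device=ch3:nemesis:mxm \
--with-mxm=$MXM_DIR \
--with-pm=hydra:gforker"

# Print only; run the printed command from the MPICH source tree.
echo "./configure $CONFIGURE_ARGS"
```

If the printed line looks right, it can be pasted into a shell in the mpich-3.1.3 source directory.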

Best,
   Antonio


On 01/15/2015 05:52 AM, Ramiro Alba wrote:
> Hi all,
>
> I've compiled mpich-3.1.3 on centos 6.5 with the following options:
>
>         --enable-fortran=yes \
>         --with-device=ch3:nemesis:ib \
>         --with-pm=hydra:gforker \
>
> and the package 'libibverbs-devel' installed.
>
> When I try to run a test hello program on two IB DDR nodes with the
> command:
>
> mpiexec.hydra -np 16 -bind-to core -launcher rsh -iface ib0 -hosts jff201,jff202 mpi_hello
>
> I get the errors below, even when running as root.
>
> If I compile with:
>
> --with-device=ch3:nemesis
>
> it works with no errors.
>
> I am also using both openmpi and mvapich2 on InfiniBand, and they work
> fine.
>
> Am I doing something wrong when compiling and/or running?
> Any suggestions are welcome.
>
> Thanks in advance
> Regards
>
> ##########################################################################
> MPICH OVER IB: RUN ERRORS
> ##########################################################################
>
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(498):
> MPID_Init(177).......: channel initialization failed
> MPIDI_CH3_Init(89)...:
> MPID_nem_init(320)...:
> MPID_nem_ib_init(264): MPID_nem_ib_com_open failed
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(498):
> MPID_Init(177).......: channel initialization failed
> MPIDI_CH3_Init(89)...:
> MPID_nem_init(320)...:
> MPID_nem_ib_init(264): MPID_nem_ib_com_open failed
> [root at jff201 mpich]# mpirun -np 2 -iface eth0 mpi_hello-mpich
> IB device not foundFatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(498):
> MPID_Init(177).......: channel initialization failed
> MPIDI_CH3_Init(89)...:
> MPID_nem_init(320)...:
> MPID_nem_ib_init(264): MPID_nem_ib_com_open failed
> IB device not foundFatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(498):
> MPID_Init(177).......: channel initialization failed
> MPIDI_CH3_Init(89)...:
> MPID_nem_init(320)...:
> MPID_nem_ib_init(264): MPID_nem_ib_com_open failed
> ########################################################################## 
>
>
>
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss


-- 
Antonio J. Peña
Postdoctoral Appointee
Mathematics and Computer Science Division
Argonne National Laboratory
9700 South Cass Avenue, Bldg. 240, Of. 3148
Argonne, IL 60439-4847
apenya at mcs.anl.gov
www.mcs.anl.gov/~apenya
