[mpich-discuss] Issue with OrangeFS 2.9.7 direct interface and MPICH 3.3.1 using CH4 device

Latham, Robert J. robl at mcs.anl.gov
Tue Oct 8 10:10:30 CDT 2019


On Sun, 2019-10-06 at 12:20 -0500, Kun Feng via discuss wrote:
> Hi Min,
> 
> If that is the case, please ignore this email. Nothing is wrong
> without OrangeFS direct interface. I will try "ch4:ucx". Thank you
> for the info.

Does the 'pvfs2' driver still work?  The liborangefsposix library might
be intercepting system calls MPICH expects to use natively.

The liborangefsposix library is intended more for non-MPI applications
-- Hadoop workflows, for example.  MPICH's pvfs2 driver (PVFS2 is the
old name for OrangeFS) speaks directly to the OrangeFS servers.  It also
uses a few optimizations not available if MPICH treats OrangeFS like a
traditional UNIX-like file system.
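
To check the pvfs2 driver in isolation, a small test along these lines
might help.  It is only a sketch under a few assumptions: the helper
romio_prefixed_path and the path /mnt/orangefs/testfile are made up for
illustration, and the program is meant to be linked without
-lorangefsposix.  ROMIO picks its driver from a prefix on the file name,
so "pvfs2:" asks for the pvfs2 driver explicitly instead of relying on
file system detection.

```c
/* Sketch only: force ROMIO's pvfs2 driver by prefixing the file name.
 * Intended build (no -lorangefsposix): mpicc -DUSE_MPI -o hello_io hello_io.c
 * The helper name and the example path below are hypothetical. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "<driver>:<path>" into out.
 * Returns 0 on success, -1 if the result does not fit in outlen bytes. */
int romio_prefixed_path(const char *driver, const char *path,
                        char *out, size_t outlen)
{
    assert(driver != NULL && path != NULL && out != NULL);
    int n = snprintf(out, outlen, "%s:%s", driver, path);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

#ifdef USE_MPI
#include <mpi.h>

int main(int argc, char **argv)
{
    char name[4096];
    MPI_File fh;
    int rc;

    MPI_Init(&argc, &argv);

    /* Hypothetical OrangeFS path; substitute a file on your own volume. */
    if (romio_prefixed_path("pvfs2", "/mnt/orangefs/testfile",
                            name, sizeof(name)) != 0) {
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* MPI file operations return error codes by default, so check rc. */
    rc = MPI_File_open(MPI_COMM_WORLD, name,
                       MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);
    if (rc != MPI_SUCCESS) {
        char msg[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(rc, msg, &len);
        fprintf(stderr, "MPI_File_open(%s) failed: %s\n", name, msg);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
#endif
```

If this opens and closes the file cleanly while the build linked against
liborangefsposix still exits silently, that would point at the
interposition library rather than at ROMIO or the CH4 device.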

==rob

> 
> On Sun, Oct 6, 2019 at 10:25 AM Si, Min via discuss <
> discuss at mpich.org> wrote:
> > Hi Kun,
> > 
> > Can you please try to reproduce the issue in a simple MPI program
> > that does not use OrangeFS? It is hard for the MPICH community to
> > help when MPI and OrangeFS are mixed together, because we are not
> > OrangeFS experts.
> > 
> > Besides, for InfiniBand networks, you might want to use `ch4:ucx`
> > instead of `ch4:ofi`. But I do not think it causes the failure in
> > your use case.
> > 
> > Best regards,
> > Min
> > 
> > On 2019/10/04 12:21, Kun Feng via discuss wrote:
> > > To Whom It May Concern,
> > > 
> > > Recently, I switched to the CH4 device in MPICH 3.3.1 for better
> > > network performance over the RoCE network we are using.
> > > I found that my code fails to run when I use the direct interface
> > > of OrangeFS 2.9.7. It exits without any error, and even a simple
> > > helloworld program prints nothing. This happens only when I enable
> > > the direct interface of OrangeFS by linking with -lorangefsposix.
> > > Could you please help me with this issue?
> > > Here is some information that might be useful:
> > > Output of ibv_devinfo of 40Gbps Mellanox ConnectX-4 Lx adapter:
> > > hca_id: mlx5_0
> > >         transport:                      InfiniBand (0)
> > >         fw_ver:                         14.20.1030
> > >         node_guid:                      248a:0703:0015:a800
> > >         sys_image_guid:                 248a:0703:0015:a800
> > >         vendor_id:                      0x02c9
> > >         vendor_part_id:                 4117
> > >         hw_ver:                         0x0
> > >         board_id:                       LNV2430110027
> > >         phys_port_cnt:                  1
> > >                 port:   1
> > >                         state:                  PORT_ACTIVE (4)
> > >                         max_mtu:                4096 (5)
> > >                         active_mtu:             1024 (3)
> > >                         sm_lid:                 0
> > >                         port_lid:               0
> > >                         port_lmc:               0x00
> > >                         link_layer:             Ethernet
> > > 
> > > hca_id: i40iw0
> > >         transport:                      iWARP (1)
> > >         fw_ver:                         0.2
> > >         node_guid:                      7cd3:0aef:3da0:0000
> > >         sys_image_guid:                 7cd3:0aef:3da0:0000
> > >         vendor_id:                      0x8086
> > >         vendor_part_id:                 14289
> > >         hw_ver:                         0x0
> > >         board_id:                       I40IW Board ID
> > >         phys_port_cnt:                  1
> > >                 port:   1
> > >                         state:                  PORT_ACTIVE (4)
> > >                         max_mtu:                4096 (5)
> > >                         active_mtu:             1024 (3)
> > >                         sm_lid:                 0
> > >                         port_lid:               1
> > >                         port_lmc:               0x00
> > >                         link_layer:             Ethernet
> > > MPICH 3.3.1 configuration command:
> > > ./configure --with-device=ch4:ofi --with-pvfs2=/home/kfeng/install
> > > --enable-shared --enable-romio --with-file-system=ufs+pvfs2+zoidfs
> > > --enable-fortran=no --with-libfabric=/home/kfeng/install
> > > OrangeFS 2.9.7 configuration command:
> > > ./configure --prefix=/home/kfeng/install --enable-shared --enable-jni
> > > --with-jdk=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-0.el7_5.x86_64
> > > --with-kernel=/usr/src/kernels/3.10.0-862.el7.x86_64
> > > Make command:
> > > mpicc -o ~/hello ~/hello.c -L/home/kfeng/install/lib -lorangefsposix
> > > The verbose outputs of mpiexec are attached.
> > > 
> > > Thanks
> > > Kun
> > > 
> > > 
> > > _______________________________________________
> > > discuss mailing list     discuss at mpich.org
> > > To manage subscription options or unsubscribe:
> > > https://lists.mpich.org/mailman/listinfo/discuss


