[mpich-discuss] Mpich over RDMA sample

Niyaz Murshed Niyaz.Murshed at arm.com
Wed Jun 12 15:59:26 CDT 2024


I think the image did not go through. Here it is: https://urldefense.us/v3/__https://ibb.co/RDGnJfy__;!!G_uCfscf7eWS!ZufMq3Gfh2mhDMaE0Mnc5iPAtr9Q-CZrc_SW5jN5zAhPp8BAR4q0XAKDb2RWKz4MyQf_AvzIMgMa2AYITC0$

From: Niyaz Murshed <Niyaz.Murshed at arm.com>
Date: Wednesday, June 12, 2024 at 3:56 PM
To: Zhou, Hui <zhouh at anl.gov>, discuss at mpich.org <discuss at mpich.org>
Cc: nd <nd at arm.com>
Subject: Re: Mpich over RDMA sample
Thank you Hui for the reply.

I was expecting to see RoCE messages in the packet capture.

When I use verbs with the libfabric sample tests, I see the RoCE messages, as shown below:
[cid:image001.png at 01DABCE1.0D6BE810]



From: Zhou, Hui <zhouh at anl.gov>
Date: Wednesday, June 12, 2024 at 3:41 PM
To: Niyaz Murshed <Niyaz.Murshed at arm.com>, discuss at mpich.org <discuss at mpich.org>
Cc: nd <nd at arm.com>
Subject: Re: Mpich over RDMA sample
Niyaz,

ofi_rxm:verbs is the correct provider. The ofi_rxm layer provides the verbs provider with the additional messaging semantics that MPI needs.
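
One way to double-check which provider names show up in an info-level libfabric log is to pull the provider field out of each line. The sketch below assumes the colon-separated log format quoted in this thread (the layout may differ between libfabric versions) and uses two lines copied from that log as sample input.

```shell
#!/bin/sh
# Two sample lines copied from the log quoted in this thread.
sample_log='libfabric:412098:1718221892::ofi_rxm:av:ofi_av_insert_addr():313<info> fi_addr: 1
libfabric:412099:1718221892::ofi_rxm:ep_ctrl:rxm_stop_listen():864<info> stopping CM thread'

# Assumption: in this log format, field 5 of a colon-separated line is the provider name.
providers=$(printf '%s\n' "$sample_log" \
    | awk -F: '$1 == "libfabric" { print $5 }' \
    | sort -u)

echo "$providers"
```

For a real run, the same awk filter could be fed from the job's stderr with FI_LOG_LEVEL=info set in the environment.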
--
Hui Zhou


From: Niyaz Murshed <Niyaz.Murshed at arm.com>
Date: Wednesday, June 12, 2024 at 2:54 PM
To: Zhou, Hui <zhouh at anl.gov>, discuss at mpich.org <discuss at mpich.org>
Cc: nd <nd at arm.com>
Subject: Re: Mpich over RDMA sample
Thank you for the reply.

I did that; however, it selects ofi_rxm:verbs


libfabric:412098:1718221892::ofi_rxm:av:ofi_av_insert_addr():313<info> fi_addr: 1
libfabric:412099:1718221892::ofi_rxm:av:ofi_av_insert_addr():313<info> fi_addr: 1

options:
  backend:        cpu
  iters:          16
  warmup_iters:   16
  cache:          1
  min_elem_count: 1
  max_elem_count: 1
  elem_counts:    [1]
  validate:       last
  window_size:    64
#------------------------------------------------------------
# Benchmarking: Bandwidth
# #processes: 2
#------------------------------------------------------------

        #bytes      #repetitions        Mbytes/sec
             4                16              0.02

# All done

libfabric:412098:1718221892::ofi_rxm:ep_ctrl:rxm_stop_listen():864<info> stopping CM thread
libfabric:412099:1718221892::ofi_rxm:ep_ctrl:rxm_stop_listen():864<info> stopping CM thread



As per https://urldefense.us/v3/__https://www.intel.com/content/www/us/en/docs/mpi-library/developer-guide-linux/2021-6/ofi-providers-support.html__;!!G_uCfscf7eWS!ZufMq3Gfh2mhDMaE0Mnc5iPAtr9Q-CZrc_SW5jN5zAhPp8BAR4q0XAKDb2RWKz4MyQf_AvzIMgMaziARNwo$ , we need to add FI_PROVIDER=^ofi_rxm, but if I do that, it falls back to the sockets provider.
Is there a way to combine ^ofi_rxm and verbs?

From: Zhou, Hui <zhouh at anl.gov>
Date: Wednesday, June 12, 2024 at 2:41 PM
To: Niyaz Murshed <Niyaz.Murshed at arm.com>, discuss at mpich.org <discuss at mpich.org>
Cc: nd <nd at arm.com>
Subject: Re: Mpich over RDMA sample
Niyaz,

All you need to do is to set an environment variable `FI_PROVIDER=verbs`.
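
As a rough illustration of the suggestion above, the variable can be exported before launching the job. The host names (node1, node2) and the ./bw_test binary are placeholders that do not come from this thread, and the `-hosts` option assumes the Hydra launcher that MPICH normally ships with. The guard only prints the command when mpiexec or the binary is missing.

```shell
#!/bin/sh
# Restrict libfabric to the verbs provider for everything launched from this shell.
FI_PROVIDER=verbs
export FI_PROVIDER

# node1, node2 and ./bw_test are hypothetical placeholders.
run_cmd="mpiexec -n 2 -hosts node1,node2 ./bw_test"

if command -v mpiexec >/dev/null 2>&1 && [ -x ./bw_test ]; then
    # Intentionally unquoted so the string splits into command and arguments.
    $run_cmd
else
    echo "would run: FI_PROVIDER=$FI_PROVIDER $run_cmd"
fi
```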

--
Hui Zhou


From: Niyaz Murshed <Niyaz.Murshed at arm.com>
Date: Wednesday, June 12, 2024 at 1:23 PM
To: Zhou, Hui <zhouh at anl.gov>, discuss at mpich.org <discuss at mpich.org>
Cc: nd <nd at arm.com>
Subject: Re: Mpich over RDMA sample
When testing with libfabric, the verbs provider is selected. I did have to use “-e msg -d mlx5_1” so that it selects verbs.
I was checking whether there is anything like that for the MPICH sample tests. Otherwise I might need to hack the code to force the selection of verbs.

From: Zhou, Hui <zhouh at anl.gov>
Date: Wednesday, June 12, 2024 at 1:15 PM
To: discuss at mpich.org <discuss at mpich.org>
Cc: Niyaz Murshed <Niyaz.Murshed at arm.com>, nd <nd at arm.com>
Subject: Re: Mpich over RDMA sample
Libfabric supports multiple providers. It sounds like it was selecting the sockets or tcp provider rather than a provider that supports RoCE. I am not exactly sure whether the verbs provider will do that. If you can confirm the provider using the libfabric tests, then you can try forcing MPICH to use that provider by setting the FI_PROVIDER environment variable.
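
A possible way to do that check from the shell is to ask libfabric's fi_info utility whether it can find anything for the provider in question. This is only a sketch: the helper name check_provider is made up for illustration, and it assumes fi_info exits non-zero when nothing matches the `-p` filter.

```shell
#!/bin/sh
# Provider to look for; verbs is the one discussed in this thread.
provider="verbs"

# check_provider is a hypothetical helper, not part of libfabric.
check_provider() {
    if ! command -v fi_info >/dev/null 2>&1; then
        echo "fi_info not found; skipping check for $1"
        return 0
    fi
    # -p filters by provider name; a non-zero exit is treated as "no match".
    if fi_info -p "$1" >/dev/null 2>&1; then
        echo "$1 is available"
    else
        echo "$1 is not available"
    fi
}

result=$(check_provider "$provider")
echo "$result"
```

If the provider is reported as available, exporting FI_PROVIDER with the same name before running mpiexec is the next step described above.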

--
Hui Zhou


From: Niyaz Murshed via discuss <discuss at mpich.org>
Date: Wednesday, June 12, 2024 at 9:03 AM
To: discuss at mpich.org <discuss at mpich.org>
Cc: Niyaz Murshed <Niyaz.Murshed at arm.com>, nd <nd at arm.com>
Subject: [mpich-discuss] Mpich over RDMA sample
Hello,

I am trying to learn about MPICH and its performance over RDMA.
I am using libfabric and installed mpich using the below configure.

./configure --prefix=/opt/mpich/  --with-ofi=/opt/libfabric/

When I run any application between two directly connected servers with Mellanox NICs, I see that communication happens over TCP and not over RoCE.
Is there any way to test communication over RoCE?

For example, I was able to test it for libfabric using the sample below, which ships with libfabric, to test RMA.
Is there something similar for MPICH, or a parameter that makes the current samples use RoCE?

Server :
fi_rma_bw -s   192.168.1.100  -e msg   -d mlx5_1 -S 1024 -I 1
Client :
fi_rma_bw -s   192.168.1.200  -e msg   -d mlx5_3  192.168.1.100  -S 1024 -I 1


Regards,
Niyaz

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 88812 bytes
Desc: image001.png
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20240612/98b31b60/attachment-0001.png>

