<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hi Edric,</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
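I am not sure which part is hanging, but you don't need to enable <code>ofi:shm</code> (the libfabric shm provider). The ch4 device comes with its own shared-memory functionality, which handles communication between processes on the same node independently of the libfabric provider.</div>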
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
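<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
As a minimal, untested sketch of what that could look like: the lines below reuse the configure options from the original message, drop the <code>shm</code> provider from the device string, and select the <code>tcp</code> provider at run time through <code>FI_PROVIDER</code>, which was reported to work. Whether a plain <code>ch4:ofi</code> device picks <code>tcp</code> by default on this system is an assumption, hence the explicit setting; <code>&lt;prefix&gt;</code> is the same placeholder as in the original configure line.</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<code>$ ./configure --prefix &lt;prefix&gt; --with-device=ch4:ofi --enable-shared --with-libfabric=embedded --enable-fortran --enable-efa=no</code><br>
<code>$ FI_PROVIDER=tcp ./src/pm/hydra/mpiexec.hydra -n 2 ./examples/cpi</code></div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>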
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
-- <br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hui<br>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Edric Ellis via discuss &lt;discuss@mpich.org&gt;<br>
<b>Sent:</b> Wednesday, December 13, 2023 7:05 AM<br>
<b>To:</b> discuss@mpich.org &lt;discuss@mpich.org&gt;<br>
<b>Cc:</b> Edric Ellis &lt;eellis@mathworks.com&gt;<br>
<b>Subject:</b> [mpich-discuss] Hang during MPI_Finalize using ch4:ofi:shm in mpich-4.1.2</font>
<div> </div>
</div>
<div dir="ltr">
<div class="x_elementToProof" style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
I'm working on getting a build of mpich-4.1.2 ready to replace our old build of mpich-3.3.2. With older MPICH releases, we used the "nemesis" channel via ch3 to provide support for shared-memory configurations as well as TCP/IP. In ch4, I thought the nearest
equivalent would be:</div>
<div class="x_elementToProof" style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">--with-device=ch4:ofi:tcp,shm</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">The "tcp" portion of this seems to work just fine, but "shm" hangs during (I think) MPI_Finalize, requiring a CTRL-C to kill it. For example,
in the build area,</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">$ ./src/pm/hydra/mpiexec.hydra -n 2 ./examples/cpi</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Process 0 of 2 is on uk-eellis-l</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Process 1 of 2 is on uk-eellis-l</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">pi is approximately 3.1415926544231318, Error is 0.0000000008333387</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">wall clock time = 0.000019</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">^C[mpiexec@uk-eellis-l] Sending Ctrl-C to processes as requested</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">[mpiexec@uk-eellis-l] Press Ctrl-C again to force abort</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">===================================================================================</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">= PID 829015 RUNNING AT uk-eellis-l</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">= EXIT CODE: 2</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">= CLEANING UP REMAINING PROCESSES</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">===================================================================================</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Interrupt (signal 2)</span></div>
<div><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">This typically refers to a problem with your application.</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Please see the FAQ page for debugging suggestions</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Things work fine if I force FI_PROVIDER=tcp. Am I missing something? </span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Here's the configure line I'm using:</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">$ ./configure --prefix &lt;prefix&gt; --with-device=ch4:ofi:tcp,shm --enable-shared --with-libfabric=embedded --enable-fortran --enable-efa=no</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">This is running on a Debian 11 system, gcc 10.3.0. </span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Cheers,</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">Edric.</span></div>
<div class="x_elementToProof"><span style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"><br>
</span></div>
</div>
</body>
</html>