<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Menlo;
panose-1:2 11 6 9 3 8 4 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="#0563C1" vlink="#954F72" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">I believe this is the same issue as <a href="https://github.com/pmodels/mpich/issues/5309">
https://github.com/pmodels/mpich/issues/5309</a>. Until it is resolved, you could try the patch mentioned in the issue, or build the latest release configured with the ch4 device (the assertion below comes from the ch3 nemesis channel).<o:p></o:p></p>
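<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal">As a rough sketch of the second option, a ch4 build from a source tarball might look like the following. The <span style="font-size:8.5pt;font-family:Menlo">&lt;version&gt;</span> placeholder, the <span style="font-size:8.5pt;font-family:Menlo">$HOME/mpich-ch4</span> install prefix, and the choice of the <span style="font-size:8.5pt;font-family:Menlo">ofi</span> netmod are assumptions for illustration only, not taken from the issue; substitute the release you actually download.<o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo">$ tar xf mpich-&lt;version&gt;.tar.gz<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo">$ cd mpich-&lt;version&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo">$ ./configure --prefix=$HOME/mpich-ch4 --with-device=ch4:ofi<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo">$ make &amp;&amp; make install<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.5pt;font-family:Menlo">$ $HOME/mpich-ch4/bin/mpichversion<o:p></o:p></span></p>
<p class="MsoNormal">The last command prints the build configuration; its device line should report ch4 rather than ch3:nemesis. The application then needs to be rebuilt with the compiler wrappers from that prefix.<o:p></o:p></p>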
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<div>
<p class="MsoNormal">-- <br>
Hui Zhou<o:p></o:p></p>
</div>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="mso-margin-top-alt:0in;margin-right:0in;margin-bottom:12.0pt;margin-left:.5in">
<b><span style="font-size:12.0pt;color:black">From: </span></b><span style="font-size:12.0pt;color:black">Fabrice Ducos via discuss &lt;discuss@mpich.org&gt;<br>
<b>Date: </b>Wednesday, June 16, 2021 at 3:54 AM<br>
<b>To: </b>discuss@mpich.org &lt;discuss@mpich.org&gt;<br>
<b>Cc: </b>Fabrice Ducos &lt;fabrice.ducos@univ-lille.fr&gt;<br>
<b>Subject: </b>[mpich-discuss] the MPI daemon triggers an assertion on an ARM-based Linux system<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Greetings,<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">When running an atmospheric science application with MPICH on an AWS ARM (not x86) instance running Ubuntu Server 20.04,<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">our process crashes at the end of processing.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">MPICH was installed as a precompiled package with the Ubuntu apt package manager:<o:p></o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">$ sudo apt install -y mpich<o:p></o:p></span></p>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">$ apt list<o:p></o:p></span></p>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">[only relevant lines displayed for brevity]<o:p></o:p></span></p>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">lib<b><span style="color:#B42419">mpich</span></b>-dev/focal,now 3.3.2-2build1 arm64 [residual-config]<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">lib<b><span style="color:#B42419">mpich</span></b>12/focal,now 3.3.2-2build1 arm64 [installed,auto-removable]<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><b><span style="font-size:8.5pt;font-family:Menlo;color:#B42419">mpich</span></b><span style="font-size:8.5pt;font-family:Menlo">-doc/focal 3.3.2-2build1 all<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><b><span style="font-size:8.5pt;font-family:Menlo;color:#B42419">mpich</span></b><span style="font-size:8.5pt;font-family:Menlo">/focal 3.3.2-2build1 arm64<o:p></o:p></span></p>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Luckily, we got some debug information from
<span style="font-size:8.5pt;font-family:Menlo">mpid</span> that may be useful:<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">Assertion failed in file src/mpid/ch3/channels/nemesis/src/ch3_progress.c at line 530: payload_len >= sizeof (MPIDI_CH3_Pkt_t)<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">0xffff86833f5f ???<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">???:0<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">0xffff86881eef ???<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">???:0<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">0xffff8683793f ???<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">???:0<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">0xffff8686c543 ???<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">???:0<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">0xffff8676d4b3 ???<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">???:0<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">0xffff8637e6eb ???<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">???:0<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">0xffff8637e85b ???<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">???:0<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">0xffff86369093 ???<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">???:0<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">0xaaaac5491e6f ???<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">???:0<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:8.5pt;font-family:Menlo">internal ABORT - process 22<o:p></o:p></span></p>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">The same application has been used for years with several MPI implementations (MPICH, OpenMPI, Intel MPI) on x86 systems without problem.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">It was successfully tested with MPICH on Ubuntu Server 20.04 x86 at about the same time as the ARM test.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">We also tested the application with another MPI implementation (namely, OpenMPI) on the same ARM instance, and it worked.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">We are perfectly fine using another MPI implementation in this specific case, but we thought that this issue would be of some interest to the MPICH maintenance team.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Best regards<o:p></o:p></p>
</div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Fabrice Ducos<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Ingénieur d’études CNRS<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Laboratoire d’Optique Atmosphérique - UMR CNRS 8518<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Faculté des Sciences et Technologies<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Bâtiment P5 - Bureau 325<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Université de Lille - Cité Scientifique<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">59655 Villeneuve d’Ascq<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
</body>
</html>