No subject


Tue Jun 18 13:52:11 CDT 2019


I have a small network of machines all running the same OS (Linaro Ubuntu Linux); they were all cloned from the same disk image and differ only in their machine names (UNIT1 through UNIT4).

I can ssh between them at will: trust has been established and I am no longer asked for a password upon connecting.  MPICH is installed from the Ubuntu repository (not quite the latest version: mpiexec reports version OpenRTE 1.8.1, the mpich package is 3.0.4-6ubuntu1), and I can run a demo like cpi with no issues, using a small bash script, mpi-run.sh (the default shell is tcsh, however):

(begin script)
#!/bin/bash

set -e

ESDK=${EPIPHANY_HOME}
ELIBS=${ESDK}/tools/host/lib:${LD_LIBRARY_PATH}
EHDF=${EPIPHANY_HDF}

echo "Running cpi on machines.u2.mpi"
LD_LIBRARY_PATH=${ELIBS} mpiexec --allow-run-as-root -machinefile /home/linaro/.machines.u2.mpi -n 1 /home/linaro/myMPI/cpi
echo "Done!"
(end script)

.machines.u2.mpi consists of the one line:

linaro@UNIT2

From UNIT1, if I do:

$ ./mpi-run.sh
Running cpi on machines.u2.mpi
Process 0 of 1 is on UNIT2
pi is approximately 3.1415926544231341, Error is 0.0000000008333410
wall clock time = 0.001327
Done!

If I edit the script to change the mpiexec line like this:

sudo -E LD_LIBRARY_PATH=${ELIBS} mpiexec --allow-run-as-root -machinefile /home/linaro/.machines.u2.mpi -n 1 /home/linaro/myMPI/cpi

Now I get (edited for brevity):

$ ./mpi-run.sh
Running cpi on machines.u2.mpi
linaro@UNIT2's password:
PATH=/usr/local/bin[...]: Command not found.
export: Command not found.
LD_LIBRARY_PATH=/usr/local/lib[...]: Command not found.
export: Command not found.
DYLD_LIBRARY_PATH: Undefined variable.

And it just stops there.  Note that the LD_LIBRARY_PATH being reported is *not* the one passed in by the script.  I don't think it is managing to reach the MPI execution stage itself.

If the machinefile lists more than one host, the password prompts appear two at a time and interfere with each other such that no login succeeds (although all machines have the same password).

Googling around, I've seen this series of error messages in a wide variety of other contexts, including Open MPI but also some completely unrelated application suites and SDKs.

My problem is that the MPI binaries I need to run on the hosts absolutely require sudo elevation.  Is sudo mpiexec the way to go?  What is going on in my example case?

Daniel U. Thibault
RDDC - Centre de recherches de Valcartier | DRDC - Valcartier Research Centre
NAC : 918V QSDJ <http://www.travelgis.com/map.asp?addr=918V%20QSDJ>
Gouvernement du Canada | Government of Canada
<http://www.valcartier.drdc-rddc.gc.ca/>
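
One possible line of investigation for the password prompt described above, offered as a sketch rather than a diagnosis: under sudo, ssh runs as root, and root may not share the key-based trust that was set up for linaro. The snippet below prints (or, with RUN=1, runs) a non-interactive login test for each machinefile entry, once as the current user and once through sudo. The function name check_ssh_trust and the RUN switch are hypothetical, and the snippet assumes one user@host entry per line, with the host in the first field.

```shell
#!/bin/bash
# Sketch only: check_ssh_trust and RUN are hypothetical names,
# not part of MPICH, Open MPI, or OpenSSH.
# Assumes the machinefile holds one host entry per line (first field).
check_ssh_trust() {
    local machinefile=$1 host prefix cmd
    while read -r host _ || [ -n "$host" ]; do
        # Skip blank lines and comment lines.
        case $host in ''|'#'*) continue ;; esac
        for prefix in "" "sudo "; do
            # BatchMode=yes makes ssh fail instead of prompting for a password.
            cmd="${prefix}ssh -o BatchMode=yes $host true"
            if [ "${RUN:-0}" = 1 ]; then
                # Redirect stdin so ssh does not consume the machinefile.
                if $cmd </dev/null; then
                    echo "ok:   $cmd"
                else
                    echo "FAIL: $cmd"
                fi
            else
                # Default is a dry run: only print the command.
                echo "$cmd"
            fi
        done
    done < "$machinefile"
}
```

For example, `RUN=1 check_ssh_trust /home/linaro/.machines.u2.mpi` would report whether each login succeeds without a prompt. If the sudo variant fails while the plain one succeeds, missing SSH trust for root is a plausible (though unconfirmed) cause of the password prompt.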

_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss

