<style> #mailBodyContentDiv { font-family : Dotum, Verdana, Arial, Helvetica ; background-color: #ffffff;font-size:10pt;} #mailBodyContentDiv BODY { background-color: #ffffff;} #mailBodyContentDiv BODY, TD, TH { color: black; font-family: Dotum, Verdana, Arial, Helvetica; font-size: 10pt; } #mailBodyContentDiv P { margin: 0px; padding:2px;} </style>
<div id="mailBodyContentDiv" style="width:100%">
<span style="font-family: Dotum; font-size: small;">I set up a 2-node cluster and installed MPICH2 built with the Intel compilers. The OS is CentOS 6.4.</span>
<div><span style="font-family: Dotum; font-size: small;"><br></span></div>
<div><span style="font-family: Dotum; font-size: small;">The nodes are connected by Ethernet. They share the directories where the compiler and MPICH2 are installed over NFS.</span></div>
<div><span style="font-family: Dotum; font-size: small;"><br></span></div>
<div><span style="font-family: Dotum; font-size: small;">Passwordless SSH works between the nodes.</span></div>
<div><span style="font-family: Dotum; font-size: small;"><br></span></div>
<div><span style="font-family: Dotum; font-size: small;">As an experiment, I compiled a simple MPI_File_open example with mpicc and ran it with mpiexec from the NFS directory.</span></div>
<div><span style="font-family: Dotum; font-size: small;"><br></span></div>
<div><span style="font-family: Dotum; font-size: small;">The code is:</span></div>
<div><p style="margin-top: 0.5em; margin-bottom: 0.9em; font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 11.818181991577148px; line-height: 17.017045974731445px;">#include "mpi.h"<br>
#include &lt;stdio.h&gt;</p>
<p style="margin-top: 0.5em; margin-bottom: 0.9em; font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 11.818181991577148px; line-height: 17.017045974731445px;">int main( int argc, char *argv[] )<br>
{<br>
MPI_Fint handleA, handleB;<br>
int rc;<br>
int errs = 0;<br>
int rank;<br>
MPI_File cFile;</p>
<p style="margin-top: 0.5em; margin-bottom: 0.9em; font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 11.818181991577148px; line-height: 17.017045974731445px;">MPI_Init( &amp;argc, &amp;argv );<br>
MPI_Comm_rank( MPI_COMM_WORLD, &amp;rank );<br>
rc = MPI_File_open( MPI_COMM_WORLD, "temp", MPI_MODE_RDWR | MPI_MODE_DELETE_ON_CLOSE | MPI_MODE_CREATE, MPI_INFO_NULL, &amp;cFile );<br>
if (rc) {<br>
printf( "Unable to open file \"temp\"\n" );fflush(stdout);<br>
}<br>
else {<br>
MPI_File_close( &amp;cFile );<br>
}<br>
MPI_Finalize();<br>
return 0;<br>
}</p>
<p style="margin-top: 0.5em; margin-bottom: 0.9em; font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 11.818181991577148px; line-height: 17.017045974731445px;"><br></p></div>
<div><span style="font-family: Dotum; font-size: small;">This example just opens a file and closes it in parallel.</span></div>
<div><span style="font-family: Dotum; font-size: small;"><br></span></div>
<div><span style="font-family: Dotum; font-size: small;">On a single machine, it works.</span></div>
<div><span style="font-family: Dotum; font-size: small;"><br></span></div>
<div><span style="font-family: Dotum; font-size: small;">However, it fails when processes on both machines run the code in parallel.</span></div>
<div><span style="font-family: Dotum; font-size: small;"><br></span></div>
<div><span style="font-family: Dotum; font-size: small;">The error message:</span></div>
<div><span style="font-family: Dotum; font-size: small;"><br></span></div>
<div><span style="font-family: Gulim; font-size: medium;">Internal Error: invalid error code 209e0e (Ring ids do not match) in MPIR_Bcast_intra:1119</span><br style="font-family: Gulim; font-size: medium;">
<span style="font-family: Gulim; font-size: medium;">Fatal error in PMPI_Bcast: Other MPI error, error stack:</span><br style="font-family: Gulim; font-size: medium;">
<span style="font-family: Gulim; font-size: medium;">PMPI_Bcast(1478)......: MPI_Bcast(buf=0x1deb980, count=1, MPI_CHAR, root=0, comm=0x84000004) failed</span><br style="font-family: Gulim; font-size: medium;">
<span style="font-family: Gulim; font-size: medium;">MPIR_Bcast_impl(1321).: </span><br style="font-family: Gulim; font-size: medium;">
<span style="font-family: Gulim; font-size: medium;">MPIR_Bcast_intra(1119): </span><br style="font-family: Gulim; font-size: medium;"></div>
<div><span style="font-family: Dotum; font-size: small;"><br></span></div>
<div><div><span style="font-family: Dotum; font-size: small;">When I add the prefix nfs: to the filename, as in "nfs:temp", it works. </span></div>
<div><span style="font-family: Dotum; font-size: small;"><br></span></div>
<div><span style="font-family: Dotum; font-size: small;">However, I don't want to add the prefix because I need to run a very large code on the cluster. </span></div>
<div><span style="font-family: Dotum; font-size: small;"><br></span></div>
<div><span style="font-family: Dotum; font-size: small;">Modifying that code would be very hard. </span><span style="font-family: Dotum; font-size: small;">Could you tell me what the problem is and how to solve it?</span></div></div>
<div><br></div>
<div><span style="font-family: Dotum; font-size: small;">The NFS options are:</span></div>
<div><span style="font-family: Dotum; font-size: small;">(client)</span></div>
<div><span style="font-family: Dotum; font-size: small;">mount -t nfs -o noac,nfsvers=3 "server_directory" "mount_directory"</span></div>
<div><span style="font-family: Dotum; font-size: small;">(server, /etc/exports)</span></div>
<div><span style="font-family: Dotum; font-size: small;">sync,rw,no_root_squash</span></div>
<div><span style="font-family: Dotum; font-size: small;"><br></span></div>
<div><span style="font-family: Dotum; font-size: small;">I share /opt and /home/username. </span></div>
<div><br></div>
<div>The MPICH2 configuration is:</div>
<div><span style="font-family: Gulim; font-size: medium;">MPICH2 Version: 1.4.1p1</span><br style="font-family: Gulim; font-size: medium;">
<span style="font-family: Gulim; font-size: medium;">MPICH2 Release date: Thu Sep 1 13:53:02 CDT 2011</span><br style="font-family: Gulim; font-size: medium;">
<span style="font-family: Gulim; font-size: medium;">MPICH2 Device: ch3:nemesis</span><br style="font-family: Gulim; font-size: medium;">
<span style="font-family: Gulim; font-size: medium;">MPICH2 configure: --prefix=/home/master/lib/mpich/mpich2-1.4.1p1 CC=icc F77=ifort CXX=icpc FC=ifort --enable-romio --with-file-system=nfs+ufs --with-pm=hydra</span><br style="font-family: Gulim; font-size: medium;">
<span style="font-family: Gulim; font-size: medium;">MPICH2 CC: icc -O2</span><br style="font-family: Gulim; font-size: medium;">
<span style="font-family: Gulim; font-size: medium;">MPICH2 CXX: icpc -O2</span><br style="font-family: Gulim; font-size: medium;">
<span style="font-family: Gulim; font-size: medium;">MPICH2 F77: ifort -O2</span><br style="font-family: Gulim; font-size: medium;">
<span style="font-family: Gulim; font-size: medium;">MPICH2 FC: ifort -O2</span><br style="font-family: Gulim; font-size: medium;"></div>
<div><br></div> </div>
<table border='0' cellpadding='0' cellspacing='0' style='width:100%; padding:30 10 10 10'><tr><td style='font-family:Gulim; font-size:12px;'> <style> #signDiv {background-color: #ffffff;} #signDiv BODY{ background-color: #ffffff;} BODY, TD, TH { color: black; font-family: 굴림; font-size: 12px; } TD { border: 0px} #signDiv P { margin: 0px; padding:0px} </style> <div id="signDiv" ><P>------------------------------------------------------<BR></P>
<P>Jaeyong Jeong</P>
<P>Department of Mechanical Engineering</P>
<P>Pohang University of Science and Technology</P>
<P> </P>
<P> </P> </div>
</td><td width='15'></td></tr></table>