[mpich-discuss] Problem with Mpich3.0.4 build for WRF run across multiple nodes in a cluster.

Rob Latham robl at mcs.anl.gov
Tue Jun 21 10:20:47 CDT 2016



On 06/21/2016 05:58 AM, Teck-Bin Arthur Lim wrote:
> Hi,
>
> I ran into a basic problem trying to get an old MPICH version (3.0.4) to do
> parallel runs across different machines in a mini-cluster. The error
> messages from invoking mpirun are:
>
> *****************error messages********************
>
> [limtba@fdns00_ws testenv-testlib]$ ./mpi-script
>
> [limtba@fdns00_ws testenv-testlib]$ more log.aout
>
> [mpiexec@fdns00_ws] HYDU_process_mfile_token (./utils/args/args.c:299): token cpu not supported at this time
>
> [mpiexec@fdns00_ws] HYDU_parse_hostfile (./utils/args/args.c:347): unable to process token

What does your machine file look like? It sounds like you've got
something in there that Hydra does not expect.

==rob
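
One way to check a machine file before handing it to mpiexec is sketched below. It assumes Hydra wants one `hostname[:numprocs]` entry per line and that `#` starts a comment; the "token cpu" in the error suggests a line shaped like `fdns00_ws cpu=4`, which is only a guess at what the file contains. The function name `find_unexpected_tokens` and the host `fdns01_ws` are made up for illustration, and the check is stricter than Hydra itself may be, since it flags every extra token on a line.

```python
import re

# Assumed entry shape: a hostname, optionally followed by ":<number of processes>".
ENTRY = re.compile(r"^[A-Za-z0-9_.\-]+(:\d+)?$")


def find_unexpected_tokens(text):
    """Return (line_number, token) pairs that do not look like host[:n]."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        tokens = line.split()
        # The first token on a line should be the host entry.
        if not ENTRY.match(tokens[0]):
            problems.append((lineno, tokens[0]))
        # Anything after the host entry is reported (stricter than Hydra may be).
        for tok in tokens[1:]:
            problems.append((lineno, tok))
    return problems


if __name__ == "__main__":
    # Hypothetical machine file contents; the real file was not posted.
    sample = "fdns00_ws cpu=4\nfdns01_ws:4\n"
    for lineno, token in find_unexpected_tokens(sample):
        print(f"line {lineno}: unexpected token {token!r}")
```

If the file does contain such entries, rewriting them in the `hostname:numprocs` form (for example `fdns00_ws:4`) would be the first thing to try.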
>
> [mpiexec@fdns00_ws] mfile_fn (./ui/mpich/utils.c:341): error parsing hostfile
>
> [mpiexec@fdns00_ws] match_arg (./utils/args/args.c:153): match handler returned error
>
> [mpiexec@fdns00_ws] HYDU_parse_array (./utils/args/args.c:175): argument matching returned error
>
> [mpiexec@fdns00_ws] parse_args (./ui/mpich/utils.c:1609): error parsing input array
>
> [mpiexec@fdns00_ws] HYD_uii_mpx_get_parameters (./ui/mpich/utils.c:1660): unable to parse user arguments
>
> [mpiexec@fdns00_ws] main (./ui/mpich/mpiexec.c:153): error parsing parameters
>
> Command exited with non-zero status 255
>
> *****************error messages********************
>
> This old version was downloaded from the WRF site
> (http://www2.mmm.ucar.edu/wrf/OnLineTutorial/compilation_tutorial.php#STEP2)
> and was built with essentially all the default configuration settings,
> with no other option arguments given in the configure/make/make install
> process:
>
> ./configure --prefix=$DIR/mpich
>
> make
>
> make install
>
> There were no error messages during the build process, and mpirun
> works fine for parallel runs using multiple processors on a single
> node, but I get the above error messages when attempting a parallel run
> across multiple machines.
>
> I need some advice on how to get this old MPICH 3.0.4 working across
> machines. The OS on these machines is CentOS 5.5, with gcc 4.1.2 and
> gcc 4.4.7 installed. As WRF needs gcc 4.4 or higher, I have
> built MPICH 3.0.4 using gcc 4.4.7.
>
> I would appreciate any help and advice.
>
> Many Thanks.
>
>
>
> _______________________________________________
> discuss mailing list     discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
>