[mpich-discuss] coupling
fereshteh komijani
fereshtehkomijani at gmail.com
Tue Oct 29 02:53:01 CDT 2013
Dear Gus and Dear Huiwei
I would like to offer my heartfelt appreciation for your kind
consideration.
After Huiwei's last post I have been examining his suggestions. I have
installed the current version of mpich2, but it did not solve the problem.
Using the same or various values for NtileI and NtileJ does not solve the
problem either. I then checked the SWAN model again. Now I am sure it is
the cause of the errors. I am trying to sort out what causes those errors.
I have also found this model's mailing list.
Thanks again
Cheers
fereshte
On Mon, Oct 28, 2013 at 8:03 PM, Gus Correa <gus at ldeo.columbia.edu> wrote:
> Hi Fereshteh
>
> 1) I would look for log or error messages in the model
> output *before* the final one ("MPI_Abort ...").
>
> Although it aborted with 2 processors,
> that was a graceful termination by the program.
> So what comes before the last error message may shed some light
> on why the model aborted,
> and on what it requires to run correctly.
>
> **
>
> 2) MCT is also used for coupling climate models that we use
> here. Some of these models run in SPMD mode (i.e. a single
> executable is launched by mpiexec).
> Others run in MPMD mode (i.e. several executables
> are launched by mpiexec, which requires an mpiexec command
> line with more parameters).
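>
> For example (the executable names below are just placeholders,
> not necessarily your model's), an MPMD command line launching
> two separate executables with MPICH's mpiexec could look like:
>
>   mpiexec -np 4 ./modelA : -np 1 ./modelB
>
> while SPMD mode uses a single-executable command line like yours.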
>
> Your mpiexec command line suggests that the model relies on a
> single executable (oceanG).
> However, this may or may not be the case, since you also mentioned
> three separate items: ROMS, SWAN, and the coupler.
>
> Does your model use a single executable,
> or more than one?
> [It may help if you describe how you compile the model.]
>
> **
>
> 3) Are you trying to run it in hybrid mode, i.e.
> using both MPI and OpenMP?
>
> I am not familiar with this model,
> so I am just guessing.
>
> The conventional wisdom is to use parameter names like
> "Nthreads" for OpenMP threads, although parameter names vary
> from model to model.
> OpenMP would add another twist to your processor configuration,
> as you would need to provision additional processors
> for OpenMP, besides those for MPI.
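>
> If the model does use OpenMP, a hybrid launch would typically look
> something like this (4 MPI ranks with 2 OpenMP threads each; the
> numbers are only illustrative, and whether oceanG honors
> OMP_NUM_THREADS depends on how it was built):
>
>   export OMP_NUM_THREADS=2
>   mpiexec -np 4 ./oceanG coupling_inlet-test.in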
>
> **
>
> 4) Does the domain decomposition (NtileI and NtileJ) have to
> be the same for both models?
>
>
> Have you tried something like:
> NtileI=2
> NtileJ=2
> i.e. 2*2=4 processors for the ocean
> plus one processor for SWAN, for a total of 5 processors,
> then:
>
> mpiexec -np 5 ...
>
> In some models we run here (e.g. MITgcm) you need to match precisely
> the number of processors to the tiles used to decompose your domain
> (and possibly also the OpenMP threads).
>
> When you tried mpiexec -np 2, did you choose NtileI=NtileJ=Nthreads=1
> for both models?
>
> Do you perhaps have to add one processor (or more)
> specifically for the coupler?
>
> **
>
> 5) Do your input files (describing the ocean initial state,
> and perhaps the wave state) have to be organized based on the
> "tile" configuration, or can they span the whole domain?
> This could be another source of error.
>
> 6) Do ROMS and SWAN have mailing lists that can perhaps help
> you more than the generic MPICH list?
>
>
> I hope this helps,
> Gus Correa
>
>
>
> On 10/27/2013 06:33 AM, fereshteh komijani wrote:
>
>> Sure.
>>
>> ROMS and SWAN are ocean and wave models, respectively.
>> Coupling them requires 3 input files: coupling_test.in,
>> swan_test.in, and roms_test.in, of which the first is the
>> coupled input file,
>>
>> and also one build.bash file, in which the user specifies the
>> required libraries, cpp options, header file, and compilers (for
>> me, gcc and gfortran). For example, in the build.bash file I set
>>
>> USE_MPI=on
>> USE_MPIF90=on
>>
>> WHICH_MPI=mpich2
>>
>> FORT = gfortran
>>
>> Also, since MCT (the Model Coupling Toolkit) is necessary for
>> coupling the models, I have installed it and set its include and
>> lib directories in the build.bash file.
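>>
>> For reference, the MCT lines in my build.bash look roughly like
>> this (the paths below are specific to my machine, and the exact
>> variable names may differ between model versions):
>>
>> export MCT_INCDIR=${HOME}/MCT/include
>> export MCT_LIBDIR=${HOME}/MCT/lib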
>>
>> In coupling_test.in the user specifies swan_test.in and
>> roms_test.in as the wave and ocean models' input files.
>>
>> In roms_test.in some coefficients, the solving technique, and
>> some input forcing files are determined.
>>
>> For choosing the number of nodes for each model there are NtileI
>> and NtileJ in roms_test.in, and Nthreads (ocean) and Nthreads (wave)
>> in the coupling_test.in file, where
>> Nthreads (ocean) = NtileI * NtileJ,
>> and the total number of nodes equals
>> Nthreads (ocean) + Nthreads (wave).
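>>
>> For example (just to illustrate the arithmetic): NtileI = 2 and
>> NtileJ = 2 give Nthreads (ocean) = 2*2 = 4; with Nthreads (wave) = 1
>> the total is 4 + 1 = 5 nodes, so the run would be launched with
>>
>> mpirun -np 5 ./oceanG coupling_inlet-test.in >> mpi.log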
>>
>> Whenever I set one node for the wave model (SWAN)
>> (Nthreads (wave) = 1) and then run
>>
>> mpirun -np 2 ./oceanG coupling_inlet-test.in >> mpi.log
>>
>>
>> it replies:
>>
>> application called MPI_Abort(comm=0x84000002, 4) - process 0
>>
>> But when I set Nthreads (wave) > 1 (and anything for
>> Nthreads (ocean)), for example
>>
>> mpirun -np 8 ./oceanG coupling_inlet-test.in >> mpi.log
>>
>>
>> the mpi.log file (attached) shows that the ROMS model does not
>> have problems and its nodes are active, but nothing happens for
>> the SWAN model even after 2 weeks of running.
>>
>> I hope this information is sufficient.
>>
>> All the best
>>
>> fereshte
>>
>>
>>
--
***Angel***