[mpich-discuss] mpirun not running in batch mode.
Gus Correa
gus at ldeo.columbia.edu
Wed Jun 11 14:28:19 CDT 2014
On 06/11/2014 05:24 AM, Reuti wrote:
> Hi,
>
> On 11.06.2014 at 10:55, Francisco Pastor wrote:
>
>> Hi mpich users
>>
>> I know I am using an old version of mpich, but this
>> one has worked for a long time, so when building
>> a new cluster we decided to use the same one.
>> The point is that we have installed the RAMS meteorological
>> model with mpich2-1.0.5, and it runs fine when started
>> from the console but not with cron or at.
>
> Is it running as root then, or as a normal user?
> The path may be different when starting with these methods too.
>
> As it's a new cluster: did you look into the option to install a queuing system?
>
> -- Reuti
>
+1
Using cron or at to launch batch jobs is shooting oneself in the foot.
A job queueing system is a clean solution, and not very hard to set up.
SGE, Torque and Slurm are open source/free job queueing systems
that have good mailing list / community support.
Gus Correa
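
Reuti's point about the path can be checked directly. What follows is only a minimal sketch, not something posted in the thread: it records the environment a job actually sees, so that a run started by cron or at can be compared with a run from the console. The function name `snapshot_env` and the output file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a small snapshot of the environment to a file,
# so a console run and a cron/at run can be compared with diff.
snapshot_env() {
    out="$1"
    {
        echo "PATH=$PATH"
        echo "PWD=$(pwd)"
        echo "USER=$(id -un)"
        # Report which mpirun would be picked up, if any.
        echo "mpirun=$(command -v mpirun || echo 'not found')"
    } > "$out"
}

# Example: call this at the top of the script that cron or at starts.
snapshot_env "${TMPDIR:-/tmp}/rams_env.$$"
```

Running the script once from the console and once from cron or at, then comparing the two files with `diff`, shows whether the scheduled job finds a different `mpirun` or starts in a different directory. If `PATH` turns out to be the difference, one option is to set it explicitly at the top of the script, for example by prepending the MPICH `bin` directory of the installation in use.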
>
>> If the script that runs the model is scheduled with cron
>> it just starts processes on the main node but not in the rest of the cluster nodes.
>>
>> While waiting for your help I will try to upgrade to a newer mpich version.
>>
>> I can't find the reason why this is happening; I probably made some mistake. Can you help me? Any idea will be welcome and tried.
>>
>> Below I attach some information that may be useful for understanding my problem.
>>
>> meteo@ventus:~$ mpich2version
>> Version: 1.0.5
>> Device: ch3:sock
>> Configure Options:
>> CC: gcc
>> CXX: c++
>> F77: f95
>> F90: f95
>>
>>
>> Processes on the main node when run with `at`
>>
>> meteo@ventus:~/RAMS/RUN$ ps -ef | grep rams
>> meteo 20324 18460 0 09:32 ? 00:00:00 python2 /usr/local/mpich2-1.0.5p4/bin/mpirun -envall -n 40 ./rams60 -f RAMSIN.operatiu
>> meteo 20331 20326 0 09:32 ? 00:00:00 ./rams60 -f RAMSIN.operatiu
>> meteo 20332 20327 0 09:32 ? 00:00:00 ./rams60 -f RAMSIN.operatiu
>> meteo 20333 20328 0 09:32 ? 00:00:00 ./rams60 -f RAMSIN.operatiu
>> meteo 20334 20329 0 09:32 ? 00:00:00 ./rams60 -f RAMSIN.operatiu
>> meteo 20335 20330 0 09:32 ? 00:00:00 ./rams60 -f RAMSIN.operatiu
>> meteo 20336 20325 0 09:32 ? 00:00:00 ./rams60 -f RAMSIN.operatiu
>> meteo 20343 18113 0 09:33 pts/4 00:00:00 grep --color=auto rams
>>
>> Log of mpirun
>>
>> meteo@ventus:~/RAMS/RUN$ cat log_rams_operatiu-INITIAL-20140611-0000
>> in main 1024
>> numarg: 2
>> in main 1024
>> numarg: 2
>> in main 1024
>> numarg: 2
>> in main 1024
>> numarg: 2
>> in main 1024
>> numarg: 2
>> in main 1024
>> numarg: 2
>> in main 1024
>> numarg: 2
>> in main 1024
>> numarg: 2
>> par init numargs: 3 ./rams60 1024 0
>> par init args: 0 argvp[i] ./rams60
>> par init args: 1 argvp[i] -f
>> par init args: 2 argvp[i] RAMSIN.operatiu
>> par init RAMS_MPI defined
>> par init numargs: 3 ./rams60 1024 0
>> par init args: 0 argvp[i] ./rams60
>> par init args: 1 argvp[i] -f
>> par init args: 2 argvp[i] RAMSIN.operatiu
>> par init RAMS_MPI defined
>> par init numargs: 3 ./rams60 1024 0
>> par init args: 0 argvp[i] ./rams60
>> par init args: 1 argvp[i] -f
>> par init args: 2 argvp[i] RAMSIN.operatiu
>> par init RAMS_MPI defined
>> par init numargs: 3 ./rams60 1024 0
>> par init args: 0 argvp[i] ./rams60
>> par init args: 1 argvp[i] -f
>> par init args: 2 argvp[i] RAMSIN.operatiu
>> par init RAMS_MPI defined
>> par init numargs: 3 ./rams60 1024 0
>> par init args: 0 argvp[i] ./rams60
>> par init args: 1 argvp[i] -f
>> par init args: 2 argvp[i] RAMSIN.operatiu
>> par init RAMS_MPI defined
>> par init numargs: 3 ./rams60 1024 0
>> par init args: 0 argvp[i] ./rams60
>> par init args: 1 argvp[i] -f
>> par init args: 2 argvp[i] RAMSIN.operatiu
>> par init RAMS_MPI defined
>> par init numargs: 3 ./rams60 1024 0
>> par init args: 0 argvp[i] ./rams60
>> par init args: 1 argvp[i] -f
>> par init args: 2 argvp[i] RAMSIN.operatiu
>> par init RAMS_MPI defined
>> par init numargs: 3 ./rams60 1024 0
>> par init args: 0 argvp[i] ./rams60
>> par init args: 1 argvp[i] -f
>> par init args: 2 argvp[i] RAMSIN.operatiu
>> par init RAMS_MPI defined
>>
>> MPICH options used to compile the RAMS model
>>
>> MPI_PATH=/usr/local/mpich2-1.0.5p4
>> PAR_INCS=-I$(MPI_PATH)/src/include
>> PAR_LIBS=-L$(MPI_PATH)/lib/ -lmpich
>> PAR_DEFS=-DRAMS_MPI
>>
>> Thank you very much for your help.
>>
>> --
>> -----------
>> Dr. Francisco Pastor
>> Meteorology department, Instituto Universitario CEAM-UMH
>> http://www.ceam.es
>> -----------
>> Mendeley profile: http://www.mendeley.com/profiles/francisco-pastor1/
>> Google Scholar: http://scholar.google.com/citations?user=V3mmCdkAAAAJ&hl=es
>> Researcher ID: http://www.researcherid.com/rid/B-8331-2008
>> Cosis profile: http://www.cosis.net/profile/francisco.pastor
>> -----------
>> mail: paco at ceam.es
>> skype: paco.pastor.guzman
>> -----------
>> Parque Tecnologico, C/ Charles R. Darwin, 14
>> 46980 PATERNA (Valencia), Spain
>> Tlf. 96 131 82 27 - Fax. 96 131 81 90
>>
>> _______________________________________________
>> discuss mailing list discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>>