[mpich-discuss] MPICH v3.2 and SLURM
Guo, Yanfei
yguo at anl.gov
Tue Dec 1 12:39:00 CST 2015
Hi Andreas,
I am guessing that you are using the "--cpu-freq" option of srun. One way to go is to set the SLURM_CPU_FREQ_REQ environment variable manually. srun is supposed to pick that up.
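A minimal sketch of what that could look like in a job script follows. The frequency value, the process count, and the application name ./my_app are placeholders I made up for illustration, not values from this thread; check which frequencies your site's SLURM setup accepts. The script only launches if mpiexec and the application are actually present, and otherwise prints the command it would have run.

```shell
#!/bin/sh
# Sketch: request a CPU frequency through the environment instead of
# passing --cpu-freq to srun on the command line.
# Assumption: 2400000 (kHz) is a placeholder value, not a recommendation.
SLURM_CPU_FREQ_REQ=2400000
export SLURM_CPU_FREQ_REQ

# Assumption: APP and NP are hypothetical names used only in this sketch.
APP=${APP:-./my_app}
NP=${NP:-10}

if command -v mpiexec >/dev/null 2>&1 && [ -x "$APP" ]; then
    # mpiexec launches through srun in SLURM environments, so the
    # exported variable should be visible to srun.
    mpiexec -n "$NP" "$APP"
else
    # Dry run when mpiexec or the application is not available.
    echo "would run: SLURM_CPU_FREQ_REQ=$SLURM_CPU_FREQ_REQ mpiexec -n $NP $APP"
fi
```

Whether the request is honored depends on how SLURM is configured on the cluster, so it is worth verifying the resulting frequency on a compute node.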
Yanfei Guo
Postdoctoral Appointee
MCS Division, ANL
On 12/1/15, 11:22 AM, "Andreas Gocht" <andreas.gocht at tu-dresden.de> wrote:
>Hey
>
>yeah mpiexec is working quite well. I would just have liked to use slurm, as our
>implementation allows setting the CPU frequency on a node. Is there a way
>to pass flags to srun using mpiexec?
>
>Thanks for your help.
>
>Kind Regards
>
>Andreas
>
>On 01.12.2015 at 17:41, Balaji, Pavan wrote:
>> Looks like SLURM is telling MPICH that two processes are on the same node, even though they are on different nodes. It looks like a bug in the SLURM PMI implementation. Did you try simply using mpiexec instead? You'll need to remove the --with-pmi, --with-pm, and LDFLAGS/LIBS options and rebuild mpich for that. Note that mpiexec will internally use srun on slurm environments.
>>
>> -- Pavan
>>
>>> On Dec 1, 2015, at 5:56 AM, Andreas Gocht <andreas.gocht at tu-dresden.de> wrote:
>>>
>>> Hey
>>>
>>> I tried to build and use MPICH with slurm, sbatch and srun. Unfortunately, it looks like MPI_Init doesn't work with srun.
>>>
>>> I got the following error:
>>>
>>> Fatal error in MPI_Init: Other MPI error, error stack:
>>> MPIR_Init_thread(474).................:
>>> MPID_Init(190)........................: channel initialization failed
>>> MPIDI_CH3_Init(89)....................:
>>> MPID_nem_init(272)....................:
>>> MPIDI_CH3I_Seg_commit(366)............:
>>> MPIU_SHMW_Hnd_deserialize(324)........:
>>> MPIU_SHMW_Seg_open(865)...............:
>>> MPIU_SHMW_Seg_create_attach_templ(637): open failed - No such file or directory
>>> In: PMI_Abort(4239887, Fatal error in MPI_Init: Other MPI error, error stack:
>>> MPIR_Init_thread(474).................:
>>> MPID_Init(190)........................: channel initialization failed
>>> MPIDI_CH3_Init(89)....................:
>>> MPID_nem_init(272)....................:
>>> MPIDI_CH3I_Seg_commit(366)............:
>>> MPIU_SHMW_Hnd_deserialize(324)........:
>>> MPIU_SHMW_Seg_open(865)...............:
>>> MPIU_SHMW_Seg_create_attach_templ(637): open failed - No such file or directory)
>>>
>>> I configured MPICH with "./configure --prefix=<some/prefix> --with-pmi=slurm --with-pm=none --with-slurm=<path/to/slurm>" and compiled my application with the "-L<path_to_slurm_lib> -lpmi" linker flags.
>>>
>>> (as described in
>>>
>>> https://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Note_that_the_default_build_of_MPICH_will_work_fine_in_SLURM_environments._No_extra_steps_are_needed.
>>>
>>> and
>>>
>>> https://computing.llnl.gov/linux/slurm/mpi_guide.html#mpich2
>>>
>>> )
>>>
>>> I am running with 10 nodes and one task per node. Is there something I am missing during the configuration of MPICH?
>>>
>>> Best,
>>>
>>> Andreas
>>>
>>> --
>>> M.Sc. Andreas Gocht
>>>
>>> Technische Universität Dresden
>>> Center for Information Services and
>>> High Performance Computing (ZIH)
>>> D-01062 Dresden
>>> Germany
>>>
>>> Contact:
>>> Willersbau, Room A 104
>>> Phone: (+49) 351 463-36415
>>> Fax: (+49) 351 463-3773
>>> e-mail: andreas.gocht at tu-dresden.de
>>>
>>>
>>> _______________________________________________
>>> discuss mailing list discuss at mpich.org
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>