[mpich-discuss] Threaded Listener

Eduardo erocha.ssa at gmail.com
Sat Jan 17 14:23:29 CST 2015


The system is a micro-OS running on an embedded processor. There will be
several of these processors, each running a single MPI rank. I'm starting
to think that I will have to do myself what Hydra does. Is there any
documentation about what an MPI job expects from Hydra?

Eduardo
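
On the question of what an MPI job expects from Hydra, here is a rough
sketch. It assumes the process talks to its launcher through MPICH's
"simple" PMI client. The variable names (PMI_RANK, PMI_SIZE, PMI_FD,
PMI_PORT, PMI_ID) are recalled from that client and should be checked
against the MPICH source for the version in use. The helper name
describe_launch_env is made up for illustration, and Python is used only
to keep the sketch short.

```python
import os

# Assumption: these are the variables MPICH's "simple" PMI client looks for.
# Verify against the MPICH source before relying on them.
PMI_VARS = ("PMI_RANK", "PMI_SIZE", "PMI_FD", "PMI_PORT", "PMI_ID")


def describe_launch_env(env):
    """Return (found, has_channel) for a mapping of environment variables.

    found       -- the PMI-related variables present in env
    has_channel -- True if env names a way to reach the process manager
                   (an inherited descriptor or a host:port to connect to)
    """
    found = {name: env[name] for name in PMI_VARS if name in env}
    has_channel = "PMI_FD" in found or "PMI_PORT" in found
    return found, has_channel


if __name__ == "__main__":
    found, has_channel = describe_launch_env(os.environ)
    print("PMI variables seen:", found)
    print("process manager reachable:", has_channel)
```

Run from a plain shell, this should report no channel. A hand-written
launcher would presumably have to set these variables and also answer the
PMI requests on the other end of the descriptor or port.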

On Sat, Jan 17, 2015 at 5:58 PM, Jeff Hammond <jeff.science at gmail.com>
wrote:

> How do you create processes on this system?  If you are using an MPI
> implementation that has a 1:1 mapping between MPI processes and OS
> processes, you've got to be able to create multiple processes somehow.
>
> Do you have any platform details you can share?
>
> Jeff
>
> On Sat, Jan 17, 2015 at 11:46 AM, Eduardo <erocha.ssa at gmail.com> wrote:
> > I tried to use a newer MPICH but, as you said, Hydra forks as part of
> > mpiexec. Is it possible to launch a job without mpiexec/mpirun? That
> > should probably work in my environment (assuming there is no listener
> > running as a heavy process, as is the case in the default
> > mpich-1.2.7p1).
> >
> > The problem is not that the fork symbol is missing. The problem is that
> > it is being called. My environment actually has a dummy fork, but it
> > only causes the program to abort when called.
> >
> > So, in summary, the problem is that I cannot use mpiexec to launch a
> > program, because it calls fork. In mpich-1.2.7p1, I can launch the
> > program myself, as if I were launching it for debugging (as in
> > section 3.5.6 of the CH_P4 manual). In addition, MPICH itself cannot
> > fork a listener-like task; it can only create threads.
> >
> > Eduardo
> >
> > On Sat, Jan 17, 2015 at 4:23 PM, Jeff Hammond <jeff.science at gmail.com>
> > wrote:
> >>
> >> MPICH shouldn't fork. Hydra probably uses fork to launch processes as
> >> part of mpiexec.
> >>
> >> Blue Gene doesn't support fork either and MPICH runs there, but the
> >> process launcher is not Hydra. Same for Cray last time I checked. So I'm
> >> sure that fork isn't required to use MPICH, at least if dynamic
> >> processes are not used.
> >>
> >> Is the issue that you cannot link an MPI program because the fork symbol
> >> is missing or that you cannot launch jobs with Hydra?
> >>
> >> All mainstream MPI implementations use OS processes to implement MPI
> >> processes, but the standard doesn't require this. FG-MPI uses threads to
> >> implement MPI processes, but as it still uses Hydra (AFAIK), it may or
> >> may not work for you.
> >>
> >> Can you describe in detail what happens when you try to build and run
> >> MPICH-latest on your system?
> >>
> >> Jeff
> >>
> >> Sent from my iPhone
> >>
> >> On Jan 17, 2015, at 9:41 AM, Eduardo <erocha.ssa at gmail.com> wrote:
> >>
> >> I would use a newer version if I could. However, I cannot issue a fork
> >> in my embedded environment. I can create threads though.
> >>
> >> So, are there any newer versions of MPICH that do not create heavy
> >> processes? I can live without MPI_Comm_spawn and the like.
> >>
> >> Regards,
> >>
> >> Eduardo
> >>
> >> On Jan 16, 2015 5:42 PM, "Wesley Bland" <wbland at anl.gov> wrote:
> >>>
> >>> Can you try using a more recent version of MPICH? The version you are
> >>> using is years old and we don't support it anymore. Our latest version
> >>> is 3.1.3. You might see if the issue is still present there.
> >>>
> >>> Thanks,
> >>> Wesley
> >>>
> >>> On Fri, Jan 16, 2015 at 10:54 AM, Eduardo <erocha.ssa at gmail.com>
> >>> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> I am trying to compile and use mpich-1.2.7p1 with the threaded listener
> >>>> (i.e. configured with --enable-threaded-listener). However, I cannot
> >>>> even run a simple MPI example with the resulting build.
> >>>>
> >>>> I need the threaded listener because the environment I am compiling
> >>>> for (a kind of embedded environment) does not have fork (no heavy
> >>>> processes).
> >>>>
> >>>> The error I get with the mpich with threaded listener is:
> >>>>
> >>>> rm_2889: 1103279872:  p4_error: listener select: -1
> >>>>     p4_error: latest msg from perror: Bad file descriptor
> >>>> p0_2710: (2.097656) net_recv failed for fd = 5
> >>>> p0_2710: 3778266880:  p4_error: net_recv read, errno = : 104
> >>>>
> >>>> Has anyone experienced a similar problem?
> >>>>
> >>>> Thanks in advance,
> >>>> Eduardo
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> discuss mailing list     discuss at mpich.org
> >>> To manage subscription options or unsubscribe:
> >>> https://lists.mpich.org/mailman/listinfo/discuss
> >>
>
>
>
> --
> Jeff Hammond
> jeff.science at gmail.com
> http://jeffhammond.github.io/
>
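
For reference, the listener arrangement and the failure mode in the
original report can be reproduced in miniature. This is a sketch under
assumptions, not the p4 implementation. It runs a listener as a thread
that waits in select() on a pipe, then shows that select() on a
descriptor that has already been closed fails with EBADF ("Bad file
descriptor"), the same errno text that follows "p4_error: listener
select: -1" in the output above. All function names are made up, POSIX
is assumed, and Python stands in for C for brevity. It does not establish
why the descriptor was invalid in the mpich-1.2.7p1 run.

```python
import errno
import os
import select
import threading


def listener(fd, events):
    """Wait for input on fd with select(); record what happens in events."""
    while True:
        try:
            readable, _, _ = select.select([fd], [], [], 2.0)
        except OSError as exc:
            # A closed or otherwise invalid descriptor ends up here.
            events.append(("select failed", errno.errorcode[exc.errno]))
            return
        if not readable:
            events.append(("timeout", None))
            return
        data = os.read(fd, 64)
        if not data:
            # Zero-length read: the writer closed its end of the pipe.
            events.append(("eof", None))
            return
        events.append(("data", data))


def run_healthy():
    """Listener thread on a live pipe: one message, then end of file."""
    r, w = os.pipe()
    events = []
    t = threading.Thread(target=listener, args=(r, events))
    t.start()
    os.write(w, b"ping")
    os.close(w)
    t.join()
    os.close(r)
    return events


def run_closed():
    """Listener thread handed a descriptor that was closed beforehand."""
    r, w = os.pipe()
    os.close(w)
    os.close(r)
    events = []
    t = threading.Thread(target=listener, args=(r, events))
    t.start()
    t.join()
    return events


if __name__ == "__main__":
    print(run_healthy())
    print(run_closed())
```

One point the sketch makes concrete: threads share a single descriptor
table, whereas a forked listener gets its own copy. Whether that
difference plays any part in the reported error is not known from this
thread.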