[mpich-devel] Sessions Considering Changing MPI_COMM_WORLD

Lisandro Dalcin dalcinl at gmail.com
Tue Oct 31 03:26:49 CDT 2017


I concur with Jed Brown. However, I would stress that "link-time
constants" are handled differently by POSIX and Windows dynamic loaders:
on Windows you cannot write "static MPI_Comm world = MPI_COMM_WORLD;".
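
A minimal C sketch of the usual workaround, assuming the handle is the
address of a library-owned object. Fake_Comm, fake_world_object and
FAKE_COMM_WORLD are hypothetical stand-ins (not taken from any MPI
header); the idea is to resolve the handle at run time, on first use,
instead of in a static initializer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT a real MPI header. The handle is the
   address of an object that a real library would define elsewhere. */
struct fake_comm { int id; };
typedef struct fake_comm *Fake_Comm;
static struct fake_comm fake_world_object = { 42 };
#define FAKE_COMM_WORLD (&fake_world_object)

/* The pattern to avoid when the object lives in another shared library:
   static Fake_Comm world = FAKE_COMM_WORLD; */

/* Resolve the handle lazily, so no static initializer depends on it. */
static Fake_Comm get_world(void)
{
    static Fake_Comm world = NULL;
    if (world == NULL)
        world = FAKE_COMM_WORLD;
    return world;
}

/* Example consumer: always goes through the accessor. */
static int world_id(void)
{
    Fake_Comm w = get_world();
    assert(w != NULL);
    return w->id;
}
```

Callers use get_world() wherever they would have used the static
variable; the accessor returns the same handle on every call.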

Now I'll go off topic with some minor criticism of the proposal.

@JeffSquyres I prefer to complain with patches rather than words, but
in this case there is no code to patch :-)

I really hope that at some point this MPI Sessions proposal is watered down.

There are quite a few things in the proposed API that look
superfluous and may not really be needed.



* Look at these:
MPI_FLAG_THREAD_NONCONCURRENT_SINGLE
MPI_FLAG_THREAD_NONCONCURRENT_FUNNELED
MPI_FLAG_THREAD_NONCONCURRENT_SERIALIZED
MPI_FLAG_THREAD_CONCURRENT

Do we really need these new constants?
We already have four thread-related constants, and we would need to
keep them for compatibility with MPI_Init_thread().



* Look at this one:
MPI_Session_get_names(
IN MPI_Session session,
OUT char **set_names)

Why not also return the number of entries in the output array?
What about Fortran wrappers?
"The caller is responsible for freeing the returned array of strings."
How? C stdlib free()? MPI_Free_mem()? Just the array, or also the
string entries within the array? Won't someone think of the Fortrans!
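
One API shape that would sidestep both the ownership and the Fortran
questions is a count query plus an indexed query that copies into a
caller-owned buffer. A minimal sketch, with hypothetical names that are
not part of the proposal (demo_session stands in for MPI_Session):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a session; not the proposed MPI_Session. */
typedef struct {
    const char *const *names;  /* set names owned by the library */
    int count;                 /* number of entries in names */
} demo_session;

/* Report how many set names the session exposes. Returns 0 on success. */
static int demo_session_get_num_names(const demo_session *s, int *count)
{
    assert(s != NULL && count != NULL);
    *count = s->count;
    return 0;
}

/* Copy the n-th name into a caller-owned buffer, so nothing has to be
   freed by the caller. Returns 0 on success, 1 if n is out of range,
   2 if the buffer is too small for the name and its terminator. */
static int demo_session_get_nth_name(const demo_session *s, int n,
                                     char *buf, size_t buflen)
{
    size_t len;

    assert(s != NULL && buf != NULL);
    if (n < 0 || n >= s->count)
        return 1;
    len = strlen(s->names[n]);
    if (len + 1 > buflen)
        return 2;
    memcpy(buf, s->names[n], len + 1);
    return 0;
}
```

With this shape the caller allocates and releases every buffer, and a
fixed-length character buffer maps naturally onto Fortran.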



* Look at this one:
MPI_Create_comm_from_group(
IN MPI_Group group,
IN const char *tag, // for matching (see next slide)
IN MPI_Info info,
IN MPI_Errhandler errhandler,
OUT MPI_Comm *comm)

Maybe it should be called MPI_Intracomm_create_from_group(), with a
matching MPI_Intercomm_create_from_group().
Not a strong complaint, just a question/observation: do we really
need tag to be "const char*" and not just "int"? We already have the
precedent of MPI_Intercomm_create() using integer tags to
disambiguate. What about Fortran wrappers?
Do we really need to pass MPI_Errhandler? Why not use the one passed
to the Session create call? Of course, this would complicate the
implementation: groups created from sessions would have to keep track
of the session errhandler.
Or maybe we should also pass the MPI_Session handle to the
{intra|inter}comm create calls, i.e. something like
MPI_{Inter|Intra}comm_create_from_session(IN MPI_Session session, ...)
?

* Look at all these ones:
MPI_Create_cart_comm_from_group()
MPI_Create_graph_comm_from_group(…)
MPI_Create_dist_graph_comm_from_group(…)
MPI_Create_dist_graph_adjacent_comm_from_group(…)
MPI_Create_win_from_group()
MPI_Create_file_from_group()

Do we really need this API bomb exploding in our faces?
It seems to me that all of them can be implemented by first creating
an intracomm out of the session, then creating the others with
the calls we already have in MPI.
Of course performance matters, but I would argue that the potential
performance issues have to be demonstrated first, then weighed against
the extra API complexity.



* Finally, in the "Additional notes" section at the end, we have
+objects derived from Session A cannot be used to communicate with
objects derived from Session B
+MPI cannot communicate between Sessions (Send, receive, put, get,
intercommunicator creation, connect/accept, etc.)
+Cannot have requests from different Sessions in a single call to the
array TEST/WAIT functions

So it seems that Sessions are rather limited in comparison to
MPI_Init()/MPI_Finalize(). I'm wondering what's the point of slide 12
here: https://blogs.cisco.com/performance/mpi-sessions-a-proposal-for-the-mpi-forum
Given the proposed limitations, it would be better to keep the
API as small as possible, and extend it in a future revision of the
standard if deemed crucial for performance.



On 31 October 2017 at 07:00, Jed Brown <jed at jedbrown.org> wrote:
> Note that MPI_COMM_WORLD isn't required to be a compile-time constant in
> the current standard and Open MPI doesn't treat it as one.  (It is a
> link-time constant.)  So regardless of potential performance impact,
> standard-compliant user code cannot treat it as a compile-time constant
> (e.g., by using it in a case statement).
>
> I'm also skeptical about ability to measure an impact on user code and
> thought your SC paper was about compile-time specialization in the MPI
> implementation.
>
>
> As one data point, PETSc does not reference MPI_COMM_WORLD except for
> some optional diagnostics and it would be pretty easy to remove.
>
> Kenneth Raffenetti <raffenet at mcs.anl.gov> writes:
>
>> Strictly speaking, compile-time objects like MPI_COMM_WORLD impose less
>> overhead than dynamically created ones. See our SC paper for more
>> details :).
>>
>> Ken
>>
>> On 10/30/2017 05:51 PM, Jeff Hammond wrote:
>>> This is going to cause horrible problems. I hate to say it, but the time
>>> for MPI to have avoided hidden state was in the 1990s.
>>>
>>> Lots of MPI libraries create their own communicator from the
>>> compile-time constant rather than a communicator argument. Breaking even
>>> a few of these will ruin MPI’s exalted reputation in the HPC world.
>>>
>>> The better option is to provide new headers and libraries for sessions
>>> that break backwards compatibility. Or just have users opt-in to
>>> sessions by not using MPI_COMM_WORLD at all. That way you’re only
>>> breaking code that wants this feature rather than many that don’t.
>>>
>>> What’s your Fortran story? If you don’t have a trivial solution for
>>> users of mpif.h, you’ve lost approximately half of MPI’s users.
>>>
>>> Jeff
>>>
>>> On Mon, Oct 30, 2017 at 11:33 AM Bland, Wesley <wesley.bland at intel.com
>>> <mailto:wesley.bland at intel.com>> wrote:
>>>
>>>     The sessions working group is working on a proposal that, among
>>>     other things, would change the way MPI_COMM_WORLD works. As much as
>>>     I don't want to take things out of context for those who aren't
>>>     familiar with the proposal, I'm not going to summarize the whole
>>>     thing here. If you need that, look here:
>>>     https://github.com/mpiwg-sessions/sessions-issues/wiki/sessions_cheat_sheet
>>>
>>>
>>>     On today's call, we were trying to decide the result of a backward
>>>     compatibility break for the MPI_COMM_WORLD symbol. We want to keep
>>>     the MPI_COMM_WORLD symbol to allow legacy codes to work, but we
>>>     don't want to make it a compiler symbol anymore. It would only exist
>>>     if you call MPI_INIT. To that end, MPI_INIT would look something
>>>     like this:
>>>
>>>     MPI_Init(...) {
>>>     MPI_Comm *comm;
>>>     MPI_Session *session;
>>>
>>>     MPI_Session_init(..., session);
>>>     MPI_Group_create_from_session_name(&session, "mpi://WORLD", comm);
>>>     MPI_COMM_WORLD = *comm;
>>>     }
>>>
>>>     The problem here is that MPI_COMM_WORLD is no longer the
>>>     compile-time constant that it was before. For example, if you're
>>>     using MPI_COMM_WORLD in a library, this would cause problems.
>>>
>>>     The working group is trying to figure out the results of this to
>>>     decide whether this would cause horrible problems. It seems that as
>>>     long as applications are reasonably well behaved and check if the
>>>     library is initialized first, they should work correctly.
>>>
>>>     What is this community's opinion of this?
>>>
>>>     Thanks,
>>>     Wesley
>>>     _______________________________________________
>>>     To manage subscription options or unsubscribe:
>>>     https://lists.mpich.org/mailman/listinfo/devel
>>>
>>> --
>>> Jeff Hammond
>>> jeff.science at gmail.com <mailto:jeff.science at gmail.com>
>>> http://jeffhammond.github.io/
>>>
>>>



-- 
Lisandro Dalcin
============
Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 0109
Thuwal 23955-6900, Kingdom of Saudi Arabia
http://www.kaust.edu.sa

Office Phone: +966 12 808-0459

