From zhouh at anl.gov Mon Jan 6 16:45:17 2020
From: zhouh at anl.gov (Zhou, Hui)
Date: Mon, 6 Jan 2020 22:45:17 +0000
Subject: [mpich-devel] Proposed CH4 device API -- MPID_Pre_init
Message-ID:

Hi MPICH developers,

There is a pending PR on GitHub, https://github.com/pmodels/mpich/pull/4214, that proposes adding a new device layer API for ch4: MPID_Pre_init.

The current ADI has a single MPID_Init(int *argc, int **argv, int required_thread_level, int *provided_thread_level). It parses the command-line arguments and decides the thread level, among other device-layer initializations. There are MPIR-layer initializations that run before MPID_Init and need to know the thread level. Even inside MPID_Init, different components may need to know the thread level, but it is not certain at which point the thread level is determined. Currently, in MPIR_Init_thread, we initialize with a thread level based on the user argument, which may later be changed by MPID_Init. That creates a potential inconsistency as well as complications.

The PR addresses it by splitting the argument-parsing part of init into MPID_Pre_init, which is called as the very first thing during the init process, guaranteeing that the thread level each component sees is the reliable final thread level.

While addressing the thread-level support is the motivation, once the MPID_Pre_init API is there, it will allow more flexible device initializations as well.

The PR has already gone through internal review and is ready to be merged. We welcome any comments.

-- Hui Zhou

From zhouh at anl.gov Mon Jan 6 22:25:57 2020
From: zhouh at anl.gov (Zhou, Hui)
Date: Tue, 7 Jan 2020 04:25:57 +0000
Subject: [mpich-devel] Correction [Re: Proposed CH4 device API -- MPID_Pre_init]
In-Reply-To:
References:
Message-ID: <827BB8FE-68EE-471F-9040-0BB0E11D78E0@anl.gov>

The last email about the ADI change was accidentally attributed to CH4.
Correction: it is an MPICH ADI addition that applies to all devices, both CH3 and CH4. While the reasons and justifications given in the last email still apply, please note that this change will affect all devices. Hopefully, we'll all see this impact as trivial.

Again, any comments are welcome.

-- Hui Zhou
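A minimal sketch of the two-phase idea described in this thread, for readers who want to see the ordering spelled out. It assumes nothing about the actual PR: every name below (demo_pre_init, demo_init, demo_init_thread, the "--demo-no-threads" flag) is hypothetical and is not the MPICH ADI. The only point it illustrates is that the thread level is settled once, up front, and everything after that sees the same final value.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Thread levels in increasing order, mirroring the MPI_THREAD_* ordering. */
enum demo_thread_level { DEMO_SINGLE, DEMO_FUNNELED, DEMO_SERIALIZED, DEMO_MULTIPLE };

/* Hypothetical phase 1: inspect only the arguments and settle the final
 * thread level. Nothing else is initialized here, so it can run first. */
static int demo_pre_init(int argc, char **argv, int required, int *provided)
{
    int level = required;
    for (int i = 1; i < argc; i++) {
        /* "--demo-no-threads" is an invented flag used only for this sketch. */
        if (strcmp(argv[i], "--demo-no-threads") == 0)
            level = DEMO_SINGLE;
    }
    *provided = level;
    return 0;
}

/* Hypothetical phase 2: components are initialized with the final level. */
static int demo_init(int final_level)
{
    assert(final_level >= DEMO_SINGLE && final_level <= DEMO_MULTIPLE);
    printf("initializing components with thread level %d\n", final_level);
    return 0;
}

/* Upper-layer entry point: pre-init runs first, then everything else. */
static int demo_init_thread(int argc, char **argv, int required, int *provided)
{
    int rc = demo_pre_init(argc, argv, required, provided);
    if (rc != 0)
        return rc;
    /* From here on, *provided never changes again. */
    return demo_init(*provided);
}
```

Calling demo_init_thread with DEMO_MULTIPLE and no extra arguments leaves the level at DEMO_MULTIPLE; adding the invented flag lowers it to DEMO_SINGLE before any component is initialized.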
From balaji at anl.gov Thu Jan 16 09:16:50 2020
From: balaji at anl.gov (Balaji, Pavan)
Date: Thu, 16 Jan 2020 15:16:50 +0000
Subject: [mpich-devel] Request for comments: MPICH device-specific timers
Message-ID: <90C4BF6E-07A3-459B-BEA2-1C4FB7F51042@anl.gov>

Folks,

I'm considering removing the device timers from MPICH and fully relying on MPL timers. This means, we'll no longer have MPID_Wtime and friends, but simply use MPL_Wtime (which internally would use the OS provided timers).

Why did we have device-specific timers in the first place?

The intent of the device-specific timers was to allow for platforms that provide node-synchronized timers to provide their own timers for MPI_Wtime and MPI_Wtick. The last set of platforms that I know of that gave such synchronized timers were the Blue Gene machines.

What has changed now?

AFAICT from reading online, Blue Gene/Q eventually integrated these timers to update the TSC register, so the OS-provided timers such as clock_gettime (and hence MPL) would give the same time. Plus, going forward, it seems more likely that vendors would integrate such synchronized timers into the OS timers anyway. Thus the value of the device-provided timers doesn't seem to exist any longer.

Why can't we leave the current code in MPICH as-is?

The problem with allowing for device timers is that they are initialized with the device (e.g., in MPID_Init or in MPID_Wtime_init). More importantly, this initialization of timers is now collective over all processes for platforms that have device-specific timers. This creates a problem in the order of initialization of the various components because the timers need to be initialized early before things like logging can be initialized. By moving to MPL timers, such initialization can be done completely at the MPI layer and completely locally. A local-only initialization would be a significant improvement in the maintainability of the code.
Also, as we are looking to modify the initialization to integrate additional functionality (such as threading improvements), the initialization is becoming a big spaghetti mess, which this would help with.

AFAIK, none of the MPICH derivatives rely on this feature, but I'd love to hear thoughts from other developers if my understanding is incorrect. If they are OK with this change, I'd also appreciate a note saying so.

Regards,

  -- Pavan

From raffenet at mcs.anl.gov Thu Jan 16 10:15:24 2020
From: raffenet at mcs.anl.gov (Raffenetti, Kenneth J.)
Date: Thu, 16 Jan 2020 16:15:24 +0000
Subject: [mpich-devel] proposed changes to function enter/exit logging
In-Reply-To: <52449d9b-ee08-2bca-847b-fb6ae58579cd@mcs.anl.gov>
References: <936e3c1f-2e2e-1c4b-4a2f-7f9fb126e203@mcs.anl.gov> <5e401725-8f4f-7a61-813d-a847ce006580@anl.gov> <82f6ed27-0a23-c614-38e2-08a2da8a18e9@mcs.anl.gov> <52449d9b-ee08-2bca-847b-fb6ae58579cd@mcs.anl.gov>
Message-ID: <7da63558-f94a-e2f6-3400-590c1291c6d7@mcs.anl.gov>

[UPDATE]

An alternative approach to the function enter/exit logging changes has been posted here:

https://github.com/pmodels/mpich/pull/4276

The previous -finstrument-functions PR only supported Linux for address->func_name conversion in log files, which was problematic for some. The new approach uses scripts to add/remove the enter/exit macros on demand. The benefit is that it will work on all systems and be less error-prone than adding the macros by hand.

Please comment on the PR or this thread if you have thoughts.

Thanks,
Ken

On 11/7/19 9:35 AM, Raffenetti, Kenneth J. via devel wrote:
> I should also note that currently the address->function translation
> scripts will only work on Linux.
>
> Ken
>
> On 11/7/19 8:30 AM, Ken Raffenetti wrote:
>> Here's what I've found so far:
>>
>> XL compilers support -qfunctrace with slightly different user-supplied
>> tracing function signatures. Should be trivial to add support to MPICH.
>>
>> https://www.ibm.com/support/knowledgecenter/SSGH2K_12.1.0/com.ibm.xlc121.aix.doc/compiler_ref/opt_functrace.html
>>
>> PGI supports -Minstrument with the same tracing function signatures as
>> GCC and Clang.
>>
>> Intel supports -finstrument-functions.
>>
>> I do not see any support in Oracle suncc.
>>
>> Ken
>>
>> On 11/7/19 4:08 AM, Finkel, Hal J. via devel wrote:
>>> Clang does support -finstrument-functions (and also provides
>>> -finstrument-functions-after-inlining and XRay-based options:
>>> https://llvm.org/docs/XRay.html).
>>>
>>>   -Hal
>>>
>>> On 11/6/19 6:22 PM, Jeff Hammond via devel wrote:
>>>> Do you have a list of what compilers support this? Is this just GCC,
>>>> just GCC and Clang, or a bigger list? If Clang supports it, do you
>>>> know if all of the Clang derivatives (e.g. IBM XLC) support it?
>>>>
>>>> Thanks,
>>>>
>>>> Jeff
>>>>
>>>> On Wed, Nov 6, 2019 at 2:04 PM Raffenetti, Kenneth J. via devel
>>>> wrote:
>>>>
>>>>     Folks,
>>>>
>>>>     I want to draw attention to a pull request posted to the MPICH
>>>>     github.
>>>>
>>>>     https://github.com/pmodels/mpich/pull/4139
>>>>
>>>>     This proposes to remove all MPIR_FUNC_VERBOSE_ENTER and
>>>>     MPIR_FUNC_VERBOSE_EXIT macros from MPICH and instead use compiler
>>>>     function instrumentation when configured for debug logging:
>>>>     https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html.
>>>>
>>>>     There are 2 main benefits:
>>>>
>>>>     1. De-clutter the code. PR stats show a net deletion of >10,000
>>>>     lines!
>>>>     2. Less error-prone. The compiler will reliably instrument *all*
>>>>     functions, and will not suffer from copy/paste errors or typos.
>>>>
>>>>     Drawbacks:
>>>>
>>>>     1. Requires compiler support for instrumentation. May affect users
>>>>     debugging certain compiler configurations.
>>>>     2. Compiler function instrumentation logs function addresses, not
>>>>     symbols. An additional step is needed to convert addresses to
>>>>     function names via script (included in PR).
>>>>
>>>>     What do people think? Will these changes affect anyone using debug
>>>>     logging in any regular capacity? We're looking for feedback good
>>>>     or bad. Let us know.
>>>>
>>>>     Thanks,
>>>>     Ken
>>>>     _______________________________________________
>>>>     To manage subscription options or unsubscribe:
>>>>     https://lists.mpich.org/mailman/listinfo/devel
>>>>
>>>> --
>>>> Jeff Hammond
>>>> jeff.science at gmail.com
>>>> http://jeffhammond.github.io/
>>>
>>> --
>>> Hal Finkel
>>> Lead, Compiler Technology and Programming Languages
>>> Leadership Computing Facility
>>> Argonne National Laboratory

From jeff.science at gmail.com Thu Jan 16 13:07:19 2020
From: jeff.science at gmail.com (Jeff Hammond)
Date: Thu, 16 Jan 2020 11:07:19 -0800
Subject: [mpich-devel] Request for comments: MPICH device-specific timers
In-Reply-To: <90C4BF6E-07A3-459B-BEA2-1C4FB7F51042@anl.gov>
References: <90C4BF6E-07A3-459B-BEA2-1C4FB7F51042@anl.gov>
Message-ID:

I have consulted with the relevant Intel experts, and precise time measurement on Intel products will support clock_gettime or similar, including remote sources of time synchronized via 1588/PTP. Moving timers to MPL is fine from this perspective.

This is not a response on behalf of the Intel MPI team.
Jeff

--
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/
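For the timers thread above, a minimal sketch of what a purely local, OS-backed wall-clock timer can look like. This is not MPL's actual implementation; demo_wtime and demo_wtick are hypothetical names, and the choice of CLOCK_MONOTONIC is an assumption made for the example. It only illustrates the property the proposal relies on: the timer needs no device and no collective step, so it can be used before anything else is initialized.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime/clock_getres */
#endif
#include <assert.h>
#include <time.h>

/* Hypothetical local timer: seconds since an arbitrary fixed point.
 * It reads only the OS clock; no communication with other processes. */
static double demo_wtime(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void) rc;                    /* silence unused warning under NDEBUG */
    return (double) ts.tv_sec + 1.0e-9 * (double) ts.tv_nsec;
}

/* Hypothetical tick query: clock resolution in seconds, as reported by the OS. */
static double demo_wtick(void)
{
    struct timespec res;
    int rc = clock_getres(CLOCK_MONOTONIC, &res);
    assert(rc == 0);
    (void) rc;
    return (double) res.tv_sec + 1.0e-9 * (double) res.tv_nsec;
}
```

Because both functions are stateless, a logging component could call demo_wtime() at any point in startup, which is the ordering freedom the proposal is after.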
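For the function enter/exit logging thread above, a minimal sketch of the compiler-instrumentation mechanism that the earlier PR was built on. The two hook names and their signatures are the ones documented by GCC for -finstrument-functions; everything else (the demo_depth counter, the log format) is made up for illustration and is not taken from the MPICH PR. The output also shows the drawback noted in the thread: the hooks receive addresses, not function names.

```c
#include <assert.h>
#include <stdio.h>

/* Illustrative call-depth counter; not thread-safe, sketch only. */
static int demo_depth = 0;

/* Called on entry to every instrumented function. The attribute keeps the
 * hook itself from being instrumented, which would recurse endlessly. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void) call_site;
    fprintf(stderr, "%*senter %p\n", 2 * demo_depth, "", this_fn);
    demo_depth++;
}

/* Called just before every instrumented function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void) call_site;
    demo_depth--;
    fprintf(stderr, "%*sexit  %p\n", 2 * demo_depth, "", this_fn);
}
```

With GCC or Clang, compiling the remaining sources with -finstrument-functions and linking this file in makes the compiler insert calls to the two hooks around every function body; turning the logged addresses back into names is the extra step the thread refers to.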