[mpich-devel] Request for comments: MPICH device-specific timers

Jeff Hammond jeff.science at gmail.com
Thu Jan 16 13:07:19 CST 2020


I have consulted the relevant Intel experts: precise time measurement on
Intel products will be available through clock_gettime or a similar
interface, including remote time sources synchronized via IEEE 1588/PTP.
Moving the timers to MPL is fine from this perspective.
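
For concreteness, here is a minimal sketch of what a purely local,
OS-backed timer could look like. The sketch_* names are hypothetical and
are not the actual MPL API, and the choice of CLOCK_MONOTONIC is an
assumption; a platform whose synchronized time source is exposed through a
different clock ID would substitute that ID. Nothing here needs
communication with other processes, so it can run before any device
initialization.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical names for illustration only; not the real MPL interface. */
typedef struct timespec sketch_time_t;

/* Assumption: a monotonic OS clock is the right source for wall time. */
#define SKETCH_CLOCK_ID CLOCK_MONOTONIC

/* Read the current time stamp; returns 0 on success, -1 on failure. */
static int sketch_wtime(sketch_time_t *t)
{
    return clock_gettime(SKETCH_CLOCK_ID, t);
}

/* Convert a time stamp to seconds, as MPI_Wtime would report it. */
static double sketch_wtime_todouble(const sketch_time_t *t)
{
    assert(t != NULL);
    return (double) t->tv_sec + 1.0e-9 * (double) t->tv_nsec;
}

/* Clock resolution in seconds, as MPI_Wtick would report it;
 * returns a negative value if the resolution cannot be queried. */
static double sketch_wtick(void)
{
    struct timespec res;
    if (clock_getres(SKETCH_CLOCK_ID, &res) != 0)
        return -1.0;
    return (double) res.tv_sec + 1.0e-9 * (double) res.tv_nsec;
}
```

An elapsed interval is then the difference of two converted time stamps
taken on the same process.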

This is not a response on behalf of the Intel MPI team.

Jeff

On Thu, Jan 16, 2020 at 7:17 AM Balaji, Pavan via devel <devel at mpich.org>
wrote:

> Folks,
>
> I'm considering removing the device timers from MPICH and fully relying on
> MPL timers.  This means, we'll no longer have MPID_Wtime and friends, but
> simply use MPL_Wtime (which internally would use the OS provided timers).
>
> Why did we have device-specific timers in the first place?
>
> The intent of the device-specific timers was to allow for platforms that
> provide node-synchronized timers to provide their own timers for MPI_Wtime
> and MPI_Wtick.  The last set of platforms that I know of that gave such
> synchronized timers were the Blue Gene machines.
>
> What has changed now?
>
> AFAICT from reading online, Blue Gene/Q eventually integrated these timers
> to update the TSC register, so the OS-provided timers such as clock_gettime
> (and hence MPL) would give the same time.  Plus, going forward, it seems
> more likely that vendors would integrate such synchronized timers into the
> OS timers anyway.  Thus the value of the device-provided timers doesn't
> seem to exist any longer.
>
> Why can't we leave the current code in MPICH as-is?
>
> The problem with allowing for device timers is that they are initialized
> with the device (e.g., in MPID_Init or in MPID_Wtime_init).  More
> importantly, this initialization of timers is now collective over all
> processes for platforms that have device-specific timers.  This creates a
> problem in the order of initialization of the various components because
> the timers need to be initialized early before things like logging can be
> initialized.  By moving to MPL timers, such initialization can be done
> completely at the MPI layer and completely locally.  A local-only
> initialization would be a significant improvement in the maintainability of
> the code.  Also, as we are looking to modify the initialization to
> integrate additional functionality (such as threading improvements), the
> initialization is becoming a big spaghetti mess, which this would help with.
>
> AFAIK, none of the MPICH derivatives rely on this feature, but I'd love to
> hear thoughts from other developers if my understanding is incorrect.  If
> they are OK with this change, I'd also appreciate a note saying so.
>
> Regards,
>
>   -- Pavan


-- 
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/