[mpich-devel] MPIR_ERRTEST_ARGNULL and count=0 (or type_size=0)

William Gropp wgropp at illinois.edu
Mon Jun 3 08:25:48 CDT 2013


The reason you are not seeing MPIR_ERRTEST_USERBUFFER is that the original design intent was for only the easy error tests to be carried out in the error test block; anything else must be handled elsewhere.  For example, some of the group operations have consistency checks in the body of the code rather than in the error test block.  Because MPI_BOTTOM with an MPI datatype containing non-zero offsets is always valid, the simple checks for NULL were originally not performed in any of the routines.  The ERRTEST_USERBUFFER check will catch many (though not all) null buffers at relatively low cost, but since other parts of the code must also check for null, it should not be considered either necessary or sufficient.
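To make the distinction concrete, here is a minimal sketch of the two kinds of check. It is not the actual MPICH macro: the `sketch_dtype` struct and both function names are hypothetical stand-ins, and it assumes (as in MPICH) that MPI_BOTTOM is represented as a null pointer. The point is only that a blanket null test rejects valid MPI_BOTTOM calls, while a buffer-aware test flags a null pointer only when data would actually be accessed at address 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a datatype descriptor; not MPICH's real struct. */
typedef struct {
    int  is_builtin;  /* nonzero for predefined types such as MPI_INT */
    long size;        /* bytes of data described by one element */
    long true_lb;     /* lowest byte offset actually touched */
} sketch_dtype;

/* Naive check in the style of ERRTEST_ARGNULL: rejects every null pointer,
 * including a null pointer that stands for MPI_BOTTOM. */
int naive_null_check(const void *buf)
{
    return buf == NULL;
}

/* Buffer-aware check: returns nonzero only when a null buffer would
 * really be dereferenced at offset 0. */
int userbuffer_check(const void *buf, long count, const sketch_dtype *dt)
{
    if (buf != NULL || count <= 0)
        return 0;               /* non-null buffer, or nothing to transfer */
    if (dt->is_builtin)
        return 1;               /* builtin types always start at offset 0 */
    /* Derived type: suspicious only if it has data starting at offset 0. */
    return dt->size > 0 && dt->true_lb == 0;
}

/* Small self-check showing where the two tests disagree. */
void sketch_self_check(void)
{
    sketch_dtype absolute = {0, 8, 4096};   /* offsets are absolute addresses */
    assert(naive_null_check(NULL));                   /* wrongly rejected */
    assert(!userbuffer_check(NULL, 1, &absolute));    /* correctly accepted */
}
```

Even the second check is only a heuristic: a null buffer paired with a derived type whose lower bound happens to be non-zero passes it, which is why the deeper code paths still need their own checks.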

It's probably worth adding a comment, in every routine that has a communication buffer, that it is incorrect to use ERRTEST_ARGNULL for that buffer, since this issue comes up every few years.

Also, ERRTEST_USERBUFFER should take the parameter name and produce an error message that includes it, since many routines have multiple communication buffers to which this check can be applied.
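One possible shape for this, as a rough sketch only: the helper `errtest_userbuffer_named`, the macro `SKETCH_ERRTEST_USERBUFFER`, and the message text are all hypothetical and are not part of MPICH. The datatype test is left out to keep the example short. The macro uses the preprocessor's stringify operator so the argument name at the call site ends up in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns nonzero and fills msg when a null buffer
 * is passed with a positive count; otherwise clears msg and returns 0.
 * (A real check would also consult the datatype.) */
int errtest_userbuffer_named(const void *buf, long count,
                             const char *param_name,
                             char *msg, size_t msg_len)
{
    if (buf != NULL || count <= 0) {
        if (msg_len > 0)
            msg[0] = '\0';
        return 0;
    }
    snprintf(msg, msg_len, "Null buffer pointer in argument '%s'", param_name);
    return 1;
}

/* Hypothetical wrapper: #buf turns the argument expression into a string,
 * so each call site reports its own parameter name. */
#define SKETCH_ERRTEST_USERBUFFER(buf, count, msg, msg_len) \
    errtest_userbuffer_named((buf), (count), #buf, (msg), (msg_len))
```

With two buffers in one routine (say origin_addr and result_addr), each call to the wrapper then yields a message that identifies which of the two was null.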

Bill

William Gropp
Director, Parallel Computing Institute
Deputy Director for Research
Institute for Advanced Computing Applications and Technologies
Thomas M. Siebel Chair in Computer Science
University of Illinois Urbana-Champaign




On May 30, 2013, at 6:40 AM, Jeff Hammond wrote:

> The check for NO_OP in *get_accumulate was most certainly present.
> That was the first thing I checked when
> MPI_Get_accumulate(MPI_REPLACE) failed to behave like MPI_Get w.r.t.
> ARMCI-MPI test codes.
> 
> I'm going to supplement the RMA test suite so that this issue and ones
> like it would have been caught by MPICH rather than waiting until the
> very last ARMCI-MPI test to be revealed.
> 
> Jeff
> 
> On Thu, May 30, 2013 at 4:53 AM, Jim Dinan <james.dinan at gmail.com> wrote:
>> Ugh, deep shame upon me and whoever reviewed my patch!  Lisandro is
>> correct.  :)
>> 
>> For accumulate ops, the op can also cause a buffer to be ignored (e.g.
>> GACC/FOP with NO_OP).  I think there is code to handle this, but keep an
>> eye out for it.
>> 
>> ~Jim.
>> 
>> On 05/29/2013 03:08 PM, Lisandro Dalcin wrote:
>>> On 29 May 2013 22:49, Jeff Hammond <jhammond at alcf.anl.gov> wrote:
>>>> It seems that I didn't understand MPI_BOTTOM completely, so the error
>>>> checks needed to be different than described below, but the high-level
>>>> issues are the same.
>>>> 
>>>> Interested parties may consult
>>>> http://trac.mpich.org/projects/mpich/ticket/1863 for details and
>>>> further discussion.
>>>> 
>>> 
>>> Please take a look at the macro MPIR_ERRTEST_USERBUFFER .
>>> 
>>> 
>>> 
>>> --
>>> Lisandro Dalcin
>>> ---------------
>>> CIMEC (INTEC/CONICET-UNL)
>>> Predio CONICET-Santa Fe
>>> Colectora RN 168 Km 472, Paraje El Pozo
>>> 3000 Santa Fe, Argentina
>>> Tel: +54-342-4511594 (ext 1011)
>>> Tel/Fax: +54-342-4511169
>>> 
>> 
> 
> 
> 
> -- 
> Jeff Hammond
> Argonne Leadership Computing Facility
> University of Chicago Computation Institute
> jhammond at alcf.anl.gov / (630) 252-5381
> http://www.linkedin.com/in/jeffhammond
> https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
> ALCF docs: http://www.alcf.anl.gov/user-guides
