[mpich-discuss] Possible integer overflows at scale in gather.c
Rob Latham
robl at mcs.anl.gov
Thu Oct 1 16:13:54 CDT 2015
On 10/01/2015 01:09 PM, Ignacio Laguna wrote:
> I would be happy to continue the discussion too. It's just that since
> I'm new here I don't want to violate the rules (if there are any). I
> know that non-MPICH topics should be discussed on Stack Overflow and
> places like that. If there are more questions, just let me know.
I think the question we all have is "when can we add this to our git
commit hooks?" -- debugging these problems without the aid of tooling is
basically impossible.
==rob
>
> Ignacio
>
>
> On 10/1/15 10:50 AM, Jeff Hammond wrote:
>> I would be happy to see this discussion continue on the MPICH devel list
>> even if it gets into LLVM weeds. Tools for ensuring count-safety are
>> vital to MPICH and related efforts in the MPI community, and I think
>> many of the people who are subscribed to devel at mpich.org
>> would be interested in this topic. At the very
>> least, they should tolerate it, since everyone on that list knows how to
>> use email filters ;-)
>>
>> Best,
>>
>> Jeff
>>
>> On Thu, Oct 1, 2015 at 10:35 AM, Ignacio Laguna <lagunaperalt1 at llnl.gov> wrote:
>>
>> Rob,
>>
>> Thanks for replying! I'm answering your questions inline below.
>>
>> On 10/1/15 9:15 AM, Rob Latham wrote:
>>
>>
>>
>> On 09/30/2015 04:08 PM, Ignacio Laguna wrote:
>>
>> Hi,
>>
>> I found some 'potential' integer overflows when running mpich at large
>> scale and/or with large inputs in gather.c. I believe they are related
>> to ticket #1767 (http://trac.mpich.org/projects/mpich/ticket/1767), but
>> I didn't see any other bug reports about them, so I thought I should
>> confirm with the mpich developers.
>>
>> In addition to this case in line 171:
>>
>>  98     int tmp_buf_size, missing;
>> ...
>> 169     if (nbytes < MPIR_CVAR_GATHER_VSMALL_MSG_SIZE) tmp_buf_size++;
>> 170
>> 171     tmp_buf_size *= nbytes;
>>
>> which I believe is fixed in mpich-3.2b4 (where tmp_buf_size is declared
>> as a 64-bit MPI_Aint),
>>
>>
>> I fixed this (I sure hope!) over the summer in commit
>> 68f8c7aa798f7009
>>
>>
>> Yes, this is fixed (according to my analysis), so don't worry, I
>> just used it as a reference.
>>
>>
>> I found the following cases in the same file src/mpi/coll/gather.c
>> (in both mpich-3.1.4 and mpich-3.2b4):
>>
>> Case 1:
>>
>> 222     mpi_errno = MPIC_Recv(((char *)recvbuf +
>> 223                            (((rank + mask) % comm_size)*recvcount*extent)),
>> 224                           recvblks * recvcount, recvtype, src,
>> 225                           MPIR_GATHER_TAG, comm,
>> 226                           &status, errflag);
>>
>> In line 223 I believe we get an integer overflow as follows. Suppose I
>> run 2^20 = 1,048,576 ranks and do a gather with 4,096 elements. In this
>> case (if I understand the algorithm well), ((rank + mask) % comm_size)
>> would be 2^20 / 2 = 524,288, and recvcount = 4,096. Then the
>> ((rank + mask) % comm_size)*recvcount expression would overflow:
>> 524,288 * 4,096 = 2,147,483,648, and become negative.
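>>
>> For illustration, a minimal sketch (not the actual mpich patch) of how
>> the offset could be forced into MPI_Aint before anything can wrap in
>> 32 bits, reusing the variable names from the excerpt above:
>>
>>     /* hypothetical sketch: widen first so no 32-bit intermediate exists */
>>     MPI_Aint offset = (MPI_Aint) ((rank + mask) % comm_size)
>>                       * recvcount * extent;
>>     mpi_errno = MPIC_Recv(((char *) recvbuf) + offset,
>>                           recvblks * recvcount, recvtype, src,
>>                           MPIR_GATHER_TAG, comm, &status, errflag);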
>>
>>
>> We're doing address math here, so I hope the compiler is not trying to
>> store intermediate results in an integer. However, if LLVM is saying
>> otherwise, then sure, let's promote those types.
>>
>> ((rank + mask) % comm_size)*recvcount is of type int*int, which is then
>> multiplied by an MPI_Aint... yeah, looks suspicious.
>>
>>
>> Even though these are intermediate operations, the compiler can place
>> the result of ((rank + mask) % comm_size)*recvcount in a 32-bit integer
>> register. In fact, that is exactly what llvm does at the -O0
>> optimization level. This is the portion of the code in llvm IR:
>>
>> %mul220 = mul nsw i32 %rem219.pre, %recvcount
>> %conv221 = sext i32 %mul220 to i64
>> %mul222 = mul nsw i64 %conv221, %extent.1851
>> %add.ptr223 = getelementptr inbounds i8* %recvbuf, i64 %mul222
>>
>> The first line multiplies the result of ((rank + mask) % comm_size) by
>> recvcount. Note that this is a 32-bit multiplication (it says i32). The
>> second line then sign-extends the result to 64 bits (unfortunately
>> %mul220 has already wrapped around at this point) and stores it in
>> %conv221. That is then multiplied by %extent, and the element address
>> is computed with getelementptr.
>>
>> Higher optimization levels might compile this differently, but the C
>> types say the first multiply is done in 32 bits, so we should not rely
>> on the optimizer to save us.
>>
>> I also verified that this propagates to actual machine registers by
>> looking at the assembly code. Here's an example:
>>
>> $ cat test.c
>> #include <stdio.h>
>> int main()
>> {
>>     int val1, val2;
>>     long long tmp, final;
>>     val1 = (1 << 20) / 2;
>>     val2 = 4096;
>>     tmp = 4;
>>     /* val1 * val2 is evaluated in int and wraps before the 64-bit multiply */
>>     final = val1 * val2 * tmp;
>>     printf("val1: %d, val2: %d, final: %lld\n", val1, val2, final);
>>     return 0;
>> }
>>
>> I compiled it with clang and -O0 and this is what I get when I run
>> it:
>>
>> $ ./test
>> val1: 524288, val2: 4096, final: -8589934592
>>
>> This is the assembly code on my Mac:
>>
>> $ otool -vVt test
>> test:
>> (__TEXT,__text) section
>> _main:
>> 0000000100000f00 pushq %rbp
>> 0000000100000f01 movq %rsp, %rbp
>> 0000000100000f04 subq $0x30, %rsp
>> 0000000100000f08 leaq 0x6f(%rip), %rdi
>> 0000000100000f0f movl $0x0, -0x4(%rbp)
>> 0000000100000f16 movl $0x80000, -0x8(%rbp)
>> 0000000100000f1d movl $0x1000, -0xc(%rbp)
>> 0000000100000f24 movq $0x4, -0x18(%rbp)
>> 0000000100000f2c movl -0x8(%rbp), %eax
>> 0000000100000f2f imull -0xc(%rbp), %eax
>> 0000000100000f33 movslq %eax, %rcx
>> 0000000100000f36 imulq -0x18(%rbp), %rcx
>>
>> The first multiplication (imull) stores its result in %eax, and the
>> second one in %rcx, but at that point %eax has already overflowed.
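>>
>> For comparison, widening the first operand in the same test program
>> (one possible fix, keeping the same values) does the whole product in
>> 64 bits:
>>
>>     final = (long long) val1 * val2 * tmp;
>>
>> which should print final: 8589934592 instead of the negative value.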
>>
>>
>> There are quite a few places where we do math on recvcount and
>> recvblocks -- does your static analysis show problems with all of those
>> spots?
>>
>>
>> I checked all MPI calls in mpich-3.2b4 and the analysis doesn't
>> report anything else. Now, this doesn't mean there are no more
>> bugs--just that it could not find anything else :-)
>>
>> When multiplied with 'extent', which is size_t or MPI_Aint, it will
>> become negative, I believe, or a huge positive value, which in either
>> case will point to the wrong location in the recvbuf buffer -- unless
>> of course this wraparound behavior is intended.
>>
>>
>> Nope, wraparound sure is not intended! We would have no hope of
>> cleaning up all these locations if it were not for clang's
>> -Wshorten-64-to-32 flag.
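>>
>> (For reference, that flag warns about implicit 64-to-32 truncations;
>> an illustrative fragment, not mpich code:
>>
>>     MPI_Aint nbytes = get_total_bytes();  /* hypothetical helper */
>>     int n = nbytes;  /* clang -Wshorten-64-to-32 flags this assignment */
>>
>> so it points at every spot where a wide count silently becomes an int.)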
>>
>>
>> Thanks! I was just double checking. In some programs wraparounds are
>> actually intended.
>>
>> Case 2:
>>
>> There might be a similar problem in line 224 of the above code. With
>> 2^20 ranks, recvblks becomes 524,288 (again, if I understand the
>> algorithm well), so the recvblks * recvcount operation will also
>> overflow.
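>>
>> If this one is real, the same kind of widening would presumably apply,
>> assuming the count parameter is (or becomes) MPI_Aint as discussed
>> below (illustrative sketch only, not a patch; the elided arguments stay
>> as in the excerpt above):
>>
>>     mpi_errno = MPIC_Recv(..., (MPI_Aint) recvblks * recvcount,
>>                           recvtype, src, MPIR_GATHER_TAG, comm,
>>                           &status, errflag);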
>>
>> I might be wrong on this -- I'm catching these issues with
>> LLVM symbolic
>> analysis -- so they can be totally false positives,
>>
>>
>> Tell me more about LLVM symbolic analysis! We have started using
>> Coverity for static analysis, but yes, there are a lot of messages
>> obscuring the genuine bugs.
>>
>>
>> I can tell you more in a separate email (I know this mailing list is
>> only for mpich discussions).
>>
>> Basically we are looking for scale-dependent bugs, i.e., bugs that only
>> manifest at large scale and/or with large inputs. The tool does
>> data-flow analysis to find code that depends on the size of the problem
>> (in mpich and/or MPI apps this is basically code that depends on the
>> size of the communicators, for example). It then tests whether things
>> could go wrong at large scale, e.g., by symbolically substituting large
>> values such as ranks = 1 million, and finally reports the file and line
>> where the problem is.
>>
>> Having said that, Coverity should find many more bugs; this analysis is
>> very specialized for scale-dependent ones.
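>>
>> As a toy illustration of the pattern it flags (made-up code, not from
>> mpich or from the tool's output):
>>
>>     /* a value derived from the communicator size flows into
>>        32-bit arithmetic before being widened */
>>     long long recv_offset(int comm_size, int recvcount, long long extent)
>>     {
>>         int src = comm_size / 2;          /* scale-dependent value */
>>         return src * recvcount * extent;  /* int*int wraps when both are large */
>>     }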
>>
>>
>> but I just wanted to check with the mpich developers whether they are
>> valid issues or not. If they are, I believe the fixes are easy to
>> implement (just make all of these computations size_t).
>>
>>
>> I promoted the prototype for MPIC_Recv to this:
>>
>> int MPIC_Recv(void *buf, MPI_Aint count,
>> MPI_Datatype datatype, int source, int tag,
>> MPID_Comm *comm_ptr, MPI_Status *status,
>> MPIR_Errflag_t *errflag)
>>
>> Top-level MPI routines are stuck using an 'int' count, but
>> internally we
>> can (and now do) use a larger MPI_Aint type.
>>
>> True, int*int can overflow, but since that parameter to
>> MPIC_Recv is of
>> the larger MPI_Aint type, we're ok here. LLVM disagrees?
>>
>>
>> Once you have a version I can download with the patches, I can run
>> the analysis again and see if I can catch more things. Just let me
>> know and I would be happy to test it.
>>
>> In general, overflows can also occur in intermediate computations: even
>> when the parameter itself is MPI_Aint, an int*int expression at the
>> call site is still evaluated in 32 bits before the result is widened,
>> so the wrap can happen before the value ever reaches the larger type.
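>>
>> A tiny standalone example of what I mean (take_count is just a made-up
>> stand-in for a callee with an MPI_Aint-like 64-bit count):
>>
>>     #include <stdio.h>
>>
>>     static void take_count(long long count)
>>     {
>>         printf("count = %lld\n", count);
>>     }
>>
>>     int main(void)
>>     {
>>         int recvblks = 524288, recvcount = 4096;
>>         take_count(recvblks * recvcount);             /* wraps in int first */
>>         take_count((long long) recvblks * recvcount); /* prints 2147483648 */
>>         return 0;
>>     }
>>
>> The first call typically prints a negative number (formally the signed
>> overflow is undefined), even though the parameter is 64 bits wide.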
>>
>> Thanks!
>>
>> Ignacio
>>
>> ==rob
>>
>>
>>
>>
>>
>> --
>> Jeff Hammond
>> jeff.science at gmail.com
>> http://jeffhammond.github.io/
--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
_______________________________________________
discuss mailing list discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss