[mpich-discuss] Possible integer overflows at scale in gather.c
Jeff Hammond
jeff.science at gmail.com
Thu Oct 1 12:50:14 CDT 2015
I would be happy to see this discussion continue on the MPICH devel list
even if it gets into LLVM weeds. Tools for ensuring count-safety are vital
to MPICH and related efforts in the MPI community, and I think many of the
people who are subscribed to devel at mpich.org would be interested in this
topic. At the very least, they should tolerate it, since everyone on that
list knows how to use email filters ;-)
Best,
Jeff
On Thu, Oct 1, 2015 at 10:35 AM, Ignacio Laguna <lagunaperalt1 at llnl.gov>
wrote:
> Rob,
>
> Thanks for replying! I'm answering your questions inline below.
>
> On 10/1/15 9:15 AM, Rob Latham wrote:
>
>>
>>
>> On 09/30/2015 04:08 PM, Ignacio Laguna wrote:
>>
>>> Hi,
>>>
>>> I found some 'potential' integer overflows when running mpich at large
>>> scale and/or with large inputs in gather.c. I believe that they are
>>> related to ticket #1767
>>> (http://trac.mpich.org/projects/mpich/ticket/1767), but I didn't see any
>>> other bug reports about them so I thought I should confirm with mpich
>>> developers.
>>>
>>> In addition to this case in line 171:
>>>
>>> 98 int tmp_buf_size, missing;
>>> ...
>>> 169 if (nbytes < MPIR_CVAR_GATHER_VSMALL_MSG_SIZE)
>>> tmp_buf_size++;
>>> 170
>>> 171 tmp_buf_size *= nbytes;
>>>
>>> which I believe it's fixed in mpich-3.2b4 (where tmp_buf_size is
>>> declared as a 64-bit MPI_Aint),
>>>
>>
>> I fixed this (I sure hope!) over the summer in commit 68f8c7aa798f7009
>>
>>
> Yes, this is fixed (according to my analysis), so don't worry, I just used
> it as a reference.
>
>
>> I found the following cases in the same
>>> file src/mpi/coll/gather.c (in both mpich-3.1.4 and mpich-3.2b4):
>>>
>>> Case 1:
>>>
>>> 222 mpi_errno = MPIC_Recv(((char *)recvbuf +
>>> 223 (((rank + mask) % comm_size)*recvcount*extent)),
>>> 224 recvblks * recvcount, recvtype, src,
>>> 225 MPIR_GATHER_TAG, comm,
>>> 226 &status, errflag);
>>>
>>> In line 223 I believe we get an integer overflow as follows. Suppose I
>>> run 2^20 = 1,048,576 ranks and do a gather with 4,096 elements. In this
>>> case (if I understand the algorithm well), ((rank + mask) % comm_size)
>>> would be 2^20 / 2 = 524,288, and recvcount = 4,096. Then the ((rank +
>>> mask) % comm_size)*recvcount expression would overflow: 524,288 * 4,096
>>> = 2,147,483,648, and become negative.
>>>
>>
>> We're doing address math here, so I hope the compiler is not trying to
>> store intermediate results in an integer. However, if LLVM is saying
>> otherwise, then sure, let's promote those types.
>>
>> ((rank + mask) % comm_size)*recvcount is of type int*int, which is
>> then multiplied by an MPI_Aint... yeah, looks suspicious.
>>
>
> Even when they are intermediate operations, the compiler could place the
> result of ((rank + mask) % comm_size)*recvcount in an 32-bit integer
> register. In fact, in llvm with -O0 optimization level that's what it does.
> This is the portion of the code in llvm IR:
>
> %mul220 = mul nsw i32 %rem219.pre, %recvcount
> %conv221 = sext i32 %mul220 to i64
> %mul222 = mul nsw i64 %conv221, %extent.1851
> %add.ptr223 = getelementptr inbounds i8* %recvbuf, i64 %mul222
>
> In the first line it multiplies the result of ((rank + mask) % comm_size)
> with recvcount. Note that this is a 32-bit multiplication (it says i32).
> Then in the 2nd line it extends the result to a 64-bit register
> (unfortunately at this point %mul220 is already wrapped around) and stores
> it in %conv221. Then it multiplies it by %extent, and then gets an element
> of the array (with getelementptr).
>
> I'm not sure if this would be a problem with higher optimization levels,
> though, but we should not rely on that.
>
> I also verified these things are propagated to actual machine registers by
> looking at the assembly code. Here's an example:
>
> $ cat test.c
> #include <stdio.h>
> int main()
> {
> int val1, val2;
> long long tmp, final;
> val1 = (1 << 20) / 2;
> val2 = 4096;
> tmp = 4;
> final = val1 * val2 * tmp;
> printf("val1: %d, val2: %d, final: %lld\n", val1, val2, final);
> return 0;
> }
>
> I compiled it with clang and -O0 and this is what I get when I run it:
>
> $ ./test
> val1: 524288, val2: 4096, final: -8589934592
>
> This is the assembly code in my Mac:
>
> $ otool -vVt test
> test:
> (__TEXT,__text) section
> _main:
> 0000000100000f00 pushq %rbp
> 0000000100000f01 movq %rsp, %rbp
> 0000000100000f04 subq $0x30, %rsp
> 0000000100000f08 leaq 0x6f(%rip), %rdi
> 0000000100000f0f movl $0x0, -0x4(%rbp)
> 0000000100000f16 movl $0x80000, -0x8(%rbp)
> 0000000100000f1d movl $0x1000, -0xc(%rbp)
> 0000000100000f24 movq $0x4, -0x18(%rbp)
> 0000000100000f2c movl -0x8(%rbp), %eax
> 0000000100000f2f imull -0xc(%rbp), %eax
> 0000000100000f33 movslq %eax, %rcx
> 0000000100000f36 imulq -0x18(%rbp), %rcx
>
> The first multiplication with imull is stored in %eax, and then second one
> in %rcx, but at this point %eax already has an overflow.
>
>
>> There are quite a few places where we do math on recvcount and
>> recvblocks -- does your static analysis show problems with all of those
>> spots?
>>
>>
> I checked all MPI calls in mpich-3.2b4 and the analysis doesn't report
> anything else. Now, this doesn't mean there are no more bugs--just that it
> could not find anything else :-)
>
> When multiplied with 'extent', which is size_t or MPI_Aint, it will
>>> become negative I believe or a huge positive which in any case will
>>> point to the wrong location in the recvbuf buffer, unless of course this
>>> wraparound behavior is intended.
>>>
>>
>> Nope, wraparound sure is not intended! We would have no hope of
>> cleaning up all these locations if it were not for clang's
>> -Wshorten64-to-32 flag.
>>
>>
> Thanks! I was just double checking. In some programs wraparounds are
> actually intended.
>
> Case 2:
>>>
>>> There might be a similar problem in line 224 in the above code. With
>>> 2^20 ranks, recvblks becomes 524,288 (again if I understand well the
>>> algorithm), so the recvblks * recvcount operation will also overflow.
>>>
>>> I might be wrong on this -- I'm catching these issues with LLVM symbolic
>>> analysis -- so they can be totally false positives,
>>>
>>
>> Tell me more about LLVM symbolic analysis! We have started using
>> coverity for static analysis but yes, there are a lot of messages
>> obscuring the genuine bugs.
>>
>
> I can tell you more in a separate email (I know this mailing list is only
> for mpich discussions).
>
> Basically we are looking for scale-dependent bugs, i.e., bugs that will
> only manifest at large scale and/or with large inputs. It does data-flow
> analysis to understand what code is dependent on things like the size of
> the problem (in mpich and/or MPI apps this is basically code that depends
> on the size of the communicators, for example). It then performs tests to
> see if things could go wrong at large scale, e.g., by replacing ranks = 1
> million values and so on. And then it reports what line of the code and
> file the problem is.
>
> Having said that, coverity should find much more bugs. This is very
> specialized for scale-dependent bugs.
>
>
>> but I just wanted to
>>> check with the mpich developers if they are valid issues or not. If they
>>> are, I believe fixes can be easy to implement (just make all these
>>> computations size_t).
>>>
>>
>> I promoted the prototype for MPIC_Recv to this:
>>
>> int MPIC_Recv(void *buf, MPI_Aint count,
>> MPI_Datatype datatype, int source, int tag,
>> MPID_Comm *comm_ptr, MPI_Status *status,
>> MPIR_Errflag_t *errflag)
>>
>> Top-level MPI routines are stuck using an 'int' count, but internally we
>> can (and now do) use a larger MPI_Aint type.
>>
>> True, int*int can overflow, but since that parameter to MPIC_Recv is of
>> the larger MPI_Aint type, we're ok here. LLVM disagrees?
>>
>>
> Once you have a version I can download with the patches, I can run the
> analysis again and see if I can catch more things. Just let me know and I
> would be happy to test it.
>
> In general, overflows can also occur in intermediate computations,
> depending on how code is optimized, I believe.
>
> Thanks!
>
> Ignacio
>
> ==rob
>>
>> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
>
--
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mpich.org/pipermail/discuss/attachments/20151001/b63c7e76/attachment.html>
-------------- next part --------------
_______________________________________________
discuss mailing list discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss
More information about the discuss
mailing list