[mpich-discuss] Possible integer overflows at scale in gather.c
Ignacio Laguna
lagunaperalt1 at llnl.gov
Thu Oct 1 12:35:32 CDT 2015
Rob,
Thanks for replying! I'm answering your questions inline below.
On 10/1/15 9:15 AM, Rob Latham wrote:
>
>
> On 09/30/2015 04:08 PM, Ignacio Laguna wrote:
>> Hi,
>>
>> I found some 'potential' integer overflows when running mpich at large
>> scale and/or with large inputs in gather.c. I believe that they are
>> related to ticket #1767
>> (http://trac.mpich.org/projects/mpich/ticket/1767), but I didn't see any
>> other bug reports about them so I thought I should confirm with mpich
>> developers.
>>
>> In addition to this case in line 171:
>>
>>  98     int tmp_buf_size, missing;
>> ...
>> 169     if (nbytes < MPIR_CVAR_GATHER_VSMALL_MSG_SIZE)
>>             tmp_buf_size++;
>> 170
>> 171     tmp_buf_size *= nbytes;
>>
>> which I believe is fixed in mpich-3.2b4 (where tmp_buf_size is
>> declared as a 64-bit MPI_Aint),
>
> I fixed this (I sure hope!) over the summer in commit 68f8c7aa798f7009
>
Yes, this is fixed (according to my analysis), so don't worry, I just
used it as a reference.
>
>> I found the following cases in the same
>> file src/mpi/coll/gather.c (in both mpich-3.1.4 and mpich-3.2b4):
>>
>> Case 1:
>>
>> 222     mpi_errno = MPIC_Recv(((char *)recvbuf +
>> 223                            (((rank + mask) % comm_size)*recvcount*extent)),
>> 224                           recvblks * recvcount, recvtype, src,
>> 225                           MPIR_GATHER_TAG, comm,
>> 226                           &status, errflag);
>>
>> In line 223 I believe we get an integer overflow as follows. Suppose I
>> run 2^20 = 1,048,576 ranks and do a gather with 4,096 elements. In this
>> case (if I understand the algorithm well), ((rank + mask) % comm_size)
>> would be 2^20 / 2 = 524,288, and recvcount = 4,096. Then the ((rank +
>> mask) % comm_size)*recvcount expression would overflow: 524,288 * 4,096
>> = 2,147,483,648, and become negative.
>
> We're doing address math here, so I hope the compiler is not trying to
> store intermediate results in an integer. However, if LLVM is saying
> otherwise, then sure, let's promote those types.
>
> ((rank + mask) % comm_size)*recvcount is of type int*int, which is
> then multiplied by an MPI_Aint... yeah, looks suspicious.
Even though these are intermediate operations, the compiler can place the
result of ((rank + mask) % comm_size)*recvcount in a 32-bit integer
register. In fact, with LLVM at the -O0 optimization level that's exactly
what it does. This is the relevant portion of the code in LLVM IR:
%mul220 = mul nsw i32 %rem219.pre, %recvcount
%conv221 = sext i32 %mul220 to i64
%mul222 = mul nsw i64 %conv221, %extent.1851
%add.ptr223 = getelementptr inbounds i8* %recvbuf, i64 %mul222
In the first line it multiplies the result of ((rank + mask) %
comm_size) by recvcount. Note that this is a 32-bit multiplication (the
type is i32). In the second line it sign-extends the result to a 64-bit
register (unfortunately, at this point %mul220 has already wrapped
around) and stores it in %conv221. It then multiplies that by %extent
and computes the element address (with getelementptr).
I'm not sure whether this would still be a problem at higher
optimization levels, but we should not rely on that.
I also verified that this behavior carries through to actual machine
registers by looking at the assembly code. Here's an example:
$ cat test.c
#include <stdio.h>

int main()
{
    int val1, val2;
    long long tmp, final;
    val1 = (1 << 20) / 2;   /* 524,288 */
    val2 = 4096;
    tmp = 4;
    /* val1 * val2 is evaluated in 32-bit int and wraps around
     * before the 64-bit multiplication by tmp */
    final = val1 * val2 * tmp;
    printf("val1: %d, val2: %d, final: %lld\n", val1, val2, final);
    return 0;
}
I compiled it with clang and -O0 and this is what I get when I run it:
$ ./test
val1: 524288, val2: 4096, final: -8589934592
This is the assembly code in my Mac:
$ otool -vVt test
test:
(__TEXT,__text) section
_main:
0000000100000f00 pushq %rbp
0000000100000f01 movq %rsp, %rbp
0000000100000f04 subq $0x30, %rsp
0000000100000f08 leaq 0x6f(%rip), %rdi
0000000100000f0f movl $0x0, -0x4(%rbp)
0000000100000f16 movl $0x80000, -0x8(%rbp)
0000000100000f1d movl $0x1000, -0xc(%rbp)
0000000100000f24 movq $0x4, -0x18(%rbp)
0000000100000f2c movl -0x8(%rbp), %eax
0000000100000f2f imull -0xc(%rbp), %eax
0000000100000f33 movslq %eax, %rcx
0000000100000f36 imulq -0x18(%rbp), %rcx
The first multiplication (imull) stores its result in %eax, and the
second one (imulq) in %rcx, but at that point %eax has already
overflowed.
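For what it's worth, here is a minimal sketch of the kind of fix I have
in mind for the line-223 expression (just an illustration of the cast
placement, not a tested patch): promoting one operand to MPI_Aint before
the multiplication makes the whole product 64-bit:

    /* sketch: cast before multiplying so the offset and count are
     * computed as MPI_Aint (64-bit) instead of int */
    mpi_errno = MPIC_Recv(((char *)recvbuf +
                           ((MPI_Aint)((rank + mask) % comm_size) *
                            recvcount * extent)),
                          (MPI_Aint)recvblks * recvcount, recvtype, src,
                          MPIR_GATHER_TAG, comm,
                          &status, errflag);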
>
> There are quite a few places where we do math on recvcount and
> recvblocks -- does your static analysis show problems with all of those
> spots?
>
I checked all MPI calls in mpich-3.2b4 and the analysis doesn't report
anything else. Now, this doesn't mean there are no more bugs -- just
that it could not find anything else :-)
>> When multiplied by 'extent', which is size_t or MPI_Aint, it will, I
>> believe, become negative or a huge positive value, which in either
>> case will point to the wrong location in the recvbuf buffer, unless
>> of course this wraparound behavior is intended.
>
> Nope, wraparound sure is not intended! We would have no hope of
> cleaning up all these locations if it were not for clang's
> -Wshorten64-to-32 flag.
>
Thanks! I was just double checking. In some programs wraparounds are
actually intended.
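(As a generic illustration, nothing from mpich: unsigned arithmetic in
C is defined to wrap, and simple hash functions rely on exactly that:

    /* intended wraparound: unsigned overflow is well-defined in C */
    unsigned int hash(const char *s)
    {
        unsigned int h = 0;
        while (*s)
            h = h * 31u + (unsigned char)*s++;  /* wraps by design */
        return h;
    }

The signed int arithmetic in gather.c is a different story, of course.)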
>> Case 2:
>>
>> There might be a similar problem in line 224 in the above code. With
>> 2^20 ranks, recvblks becomes 524,288 (again if I understand well the
>> algorithm), so the recvblks * recvcount operation will also overflow.
>>
>> I might be wrong on this -- I'm catching these issues with LLVM
>> symbolic analysis -- so they could well be false positives,
>
> Tell me more about LLVM symbolic analysis! We have started using
> coverity for static analysis but yes, there are a lot of messages
> obscuring the genuine bugs.
I can tell you more in a separate email (I know this mailing list is
only for mpich discussions).
Basically, we are looking for scale-dependent bugs, i.e., bugs that only
manifest at large scale and/or with large inputs. The tool does
data-flow analysis to find code that depends on the size of the problem
(in mpich and/or MPI applications this is mostly code that depends on
the size of the communicators, for example). It then tests whether
things could go wrong at large scale, e.g., by substituting large
values such as one million ranks, and reports the file and line where
the problem occurs.
Having said that, coverity should find many more bugs. This tool is
specialized for scale-dependent bugs.
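To make that concrete, here is a toy example (hypothetical, not from
mpich) of the pattern the analysis flags -- arithmetic that depends on
the communicator size and only overflows once the scale is large:

    #include <mpi.h>
    #include <stddef.h>

    /* toy scale-dependent bug: fine at small scale, but
     * comm_size * count wraps in 32-bit int once it exceeds
     * INT_MAX, e.g., 2^20 ranks * 4,096 elements */
    size_t recv_bytes(MPI_Comm comm, int count, size_t extent)
    {
        int comm_size;
        MPI_Comm_size(comm, &comm_size);
        return comm_size * count * extent;  /* int*int wraps before
                                               the widening multiply */
    }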
>
>> but I just wanted to
>> check with the mpich developers if they are valid issues or not. If they
>> are, I believe fixes can be easy to implement (just make all these
>> computations size_t).
>
> I promoted the prototype for MPIC_Recv to this:
>
> int MPIC_Recv(void *buf, MPI_Aint count,
>               MPI_Datatype datatype, int source, int tag,
>               MPID_Comm *comm_ptr, MPI_Status *status,
>               MPIR_Errflag_t *errflag)
>
> Top-level MPI routines are stuck using an 'int' count, but internally we
> can (and now do) use a larger MPI_Aint type.
>
> True, int*int can overflow, but since that parameter to MPIC_Recv is of
> the larger MPI_Aint type, we're ok here. LLVM disagrees?
>
Once there is a version with the patches that I can download, I can run
the analysis again and see whether it catches anything more. Just let me
know and I would be happy to test it.
In general, overflows can also occur in intermediate computations,
depending on how the code is optimized: even with the wider MPI_Aint
parameter, the argument expression recvblks * recvcount is still
evaluated in 32-bit int before the implicit conversion.
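A tiny example of what I mean (take_count is a hypothetical stand-in
for the widened MPIC_Recv prototype):

    #include <stdio.h>

    /* the parameter is 64-bit, like the new MPI_Aint count... */
    static long long take_count(long long count) { return count; }

    int main(void)
    {
        int recvblks  = 1 << 19;  /* 524,288 */
        int recvcount = 1 << 12;  /* 4,096 */
        /* ...but recvblks * recvcount is computed in 32-bit int and
         * wraps *before* the implicit conversion to long long
         * (strictly, signed overflow is undefined behavior; in
         * practice it typically wraps) */
        printf("%lld\n", take_count(recvblks * recvcount));
        return 0;
    }

With typical wraparound behavior this prints -2147483648 rather than
2147483648.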
Thanks!
Ignacio
> ==rob
>
_______________________________________________
discuss mailing list discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss