[mpich-discuss] Possible integer overflows at scale in gather.c
Ignacio Laguna
lagunaperalt1 at llnl.gov
Thu Oct 1 12:35:32 CDT 2015
Rob,
Thanks for replying! I'm answering your questions inline below.
On 10/1/15 9:15 AM, Rob Latham wrote:
>
>
> On 09/30/2015 04:08 PM, Ignacio Laguna wrote:
>> Hi,
>>
>> I found some 'potential' integer overflows when running mpich at large
>> scale and/or with large inputs in gather.c. I believe that they are
>> related to ticket #1767
>> (http://trac.mpich.org/projects/mpich/ticket/1767), but I didn't see any
>> other bug reports about them so I thought I should confirm with mpich
>> developers.
>>
>> In addition to this case in line 171:
>>
>>  98     int tmp_buf_size, missing;
>> ...
>> 169     if (nbytes < MPIR_CVAR_GATHER_VSMALL_MSG_SIZE)
>>             tmp_buf_size++;
>> 170
>> 171     tmp_buf_size *= nbytes;
>>
>> which I believe is fixed in mpich-3.2b4 (where tmp_buf_size is
>> declared as a 64-bit MPI_Aint),
>
> I fixed this (I sure hope!) over the summer in commit 68f8c7aa798f7009
>
Yes, this is fixed (according to my analysis), so don't worry, I just
used it as a reference.
>
>> I found the following cases in the same
>> file src/mpi/coll/gather.c (in both mpich-3.1.4 and mpich-3.2b4):
>>
>> Case 1:
>>
>> 222     mpi_errno = MPIC_Recv(((char *)recvbuf +
>> 223                            (((rank + mask) % comm_size)*recvcount*extent)),
>> 224                           recvblks * recvcount, recvtype, src,
>> 225                           MPIR_GATHER_TAG, comm,
>> 226                           &status, errflag);
>>
>> In line 223 I believe we get an integer overflow as follows. Suppose I
>> run 2^20 = 1,048,576 ranks and do a gather with 4,096 elements. In this
>> case (if I understand the algorithm well), ((rank + mask) % comm_size)
>> would be 2^20 / 2 = 524,288, and recvcount = 4,096. Then the ((rank +
>> mask) % comm_size)*recvcount expression would overflow: 524,288 * 4,096
>> = 2,147,483,648, and become negative.
>
> We're doing address math here, so I hope the compiler is not trying to
> store intermediate results in an integer. However, if LLVM is saying
> otherwise, then sure, let's promote those types.
>
> ((rank + mask) % comm_size)*recvcount is of type int*int, which is
> then multiplied by an MPI_Aint... yeah, looks suspicious.
Even though these are intermediate operations, the compiler can place the
result of ((rank + mask) % comm_size)*recvcount in a 32-bit integer
register. In fact, with LLVM at the -O0 optimization level that's exactly
what it does. This is the relevant portion of the code in LLVM IR:
%mul220 = mul nsw i32 %rem219.pre, %recvcount
%conv221 = sext i32 %mul220 to i64
%mul222 = mul nsw i64 %conv221, %extent.1851
%add.ptr223 = getelementptr inbounds i8* %recvbuf, i64 %mul222
In the first line it multiplies the result of ((rank + mask) %
comm_size) by recvcount. Note that this is a 32-bit multiplication (the
type is i32). In the second line it sign-extends the result to a 64-bit
register (unfortunately, at this point %mul220 has already wrapped
around) and stores it in %conv221. It then multiplies that by %extent
and computes the element address (with getelementptr).
I'm not sure whether this would still be a problem at higher
optimization levels, but we should not rely on that.
I also verified that this behavior carries through to actual machine
registers by looking at the assembly code. Here's an example:
$ cat test.c
#include <stdio.h>

int main()
{
    int val1, val2;
    long long tmp, final;
    val1 = (1 << 20) / 2;   /* 524,288 */
    val2 = 4096;
    tmp = 4;
    /* val1 * val2 is evaluated in 32-bit int and wraps around
     * before the 64-bit multiplication by tmp */
    final = val1 * val2 * tmp;
    printf("val1: %d, val2: %d, final: %lld\n", val1, val2, final);
    return 0;
}
I compiled it with clang and -O0 and this is what I get when I run it:
$ ./test
val1: 524288, val2: 4096, final: -8589934592
This is the assembly code in my Mac:
$ otool -vVt test
test:
(__TEXT,__text) section
_main:
0000000100000f00 pushq %rbp
0000000100000f01 movq %rsp, %rbp
0000000100000f04 subq $0x30, %rsp
0000000100000f08 leaq 0x6f(%rip), %rdi
0000000100000f0f movl $0x0, -0x4(%rbp)
0000000100000f16 movl $0x80000, -0x8(%rbp)
0000000100000f1d movl $0x1000, -0xc(%rbp)
0000000100000f24 movq $0x4, -0x18(%rbp)
0000000100000f2c movl -0x8(%rbp), %eax
0000000100000f2f imull -0xc(%rbp), %eax
0000000100000f33 movslq %eax, %rcx
0000000100000f36 imulq -0x18(%rbp), %rcx
The first multiplication (imull) stores its result in %eax, and the
second one (imulq) in %rcx, but at that point %eax has already
overflowed.
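For what it's worth, here is a minimal sketch of the kind of fix I have
in mind for the line-223 expression (just an illustration of the cast
placement, not a tested patch): promoting one operand to MPI_Aint before
the multiplication makes the whole product 64-bit:

    /* sketch: cast before multiplying so the offset and count are
     * computed as MPI_Aint (64-bit) instead of int */
    mpi_errno = MPIC_Recv(((char *)recvbuf +
                           ((MPI_Aint)((rank + mask) % comm_size) *
                            recvcount * extent)),
                          (MPI_Aint)recvblks * recvcount, recvtype, src,
                          MPIR_GATHER_TAG, comm,
                          &status, errflag);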
>
> There are quite a few places where we do math on recvcount and
> recvblocks -- does your static analysis show problems with all of those
> spots?
>
I checked all MPI calls in mpich-3.2b4 and the analysis doesn't report
anything else. Now, this doesn't mean there are no more bugs -- just
that it could not find anything else :-)
>> When multiplied by 'extent', which is size_t or MPI_Aint, it will, I
>> believe, become negative or a huge positive value, which in either
>> case will point to the wrong location in the recvbuf buffer, unless
>> of course this wraparound behavior is intended.
>
> Nope, wraparound sure is not intended! We would have no hope of
> cleaning up all these locations if it were not for clang's
> -Wshorten64-to-32 flag.
>
Thanks! I was just double checking. In some programs wraparounds are
actually intended.
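(As a generic illustration, nothing from mpich: unsigned arithmetic in
C is defined to wrap, and simple hash functions rely on exactly that:

    /* intended wraparound: unsigned overflow is well-defined in C */
    unsigned int hash(const char *s)
    {
        unsigned int h = 0;
        while (*s)
            h = h * 31u + (unsigned char)*s++;  /* wraps by design */
        return h;
    }

The signed int arithmetic in gather.c is a different story, of course.)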
>> Case 2:
>>
>> There might be a similar problem in line 224 in the above code. With
>> 2^20 ranks, recvblks becomes 524,288 (again if I understand well the
>> algorithm), so the recvblks * recvcount operation will also overflow.
>>
>> I might be wrong on this -- I'm catching these issues with LLVM
>> symbolic analysis -- so they could well be false positives,
>
> Tell me more about LLVM symbolic analysis! We have started using
> coverity for static analysis but yes, there are a lot of messages
> obscuring the genuine bugs.
I can tell you more in a separate email (I know this mailing list is
only for mpich discussions).
Basically, we are looking for scale-dependent bugs, i.e., bugs that only
manifest at large scale and/or with large inputs. The tool does
data-flow analysis to find code that depends on the size of the problem
(in mpich and/or MPI applications this is mostly code that depends on
the size of the communicators, for example). It then tests whether
things could go wrong at large scale, e.g., by substituting large
values such as one million ranks, and reports the file and line where
the problem occurs.
Having said that, coverity should find many more bugs. This tool is
specialized for scale-dependent bugs.
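To make that concrete, here is a toy example (hypothetical, not from
mpich) of the pattern the analysis flags -- arithmetic that depends on
the communicator size and only overflows once the scale is large:

    #include <mpi.h>
    #include <stddef.h>

    /* toy scale-dependent bug: fine at small scale, but
     * comm_size * count wraps in 32-bit int once it exceeds
     * INT_MAX, e.g., 2^20 ranks * 4,096 elements */
    size_t recv_bytes(MPI_Comm comm, int count, size_t extent)
    {
        int comm_size;
        MPI_Comm_size(comm, &comm_size);
        return comm_size * count * extent;  /* int*int wraps before
                                               the widening multiply */
    }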
>
>> but I just wanted to
>> check with the mpich developers if they are valid issues or not. If they
>> are, I believe fixes can be easy to implement (just make all these
>> computations size_t).
>
> I promoted the prototype for MPIC_Recv to this:
>
> int MPIC_Recv(void *buf, MPI_Aint count,
>               MPI_Datatype datatype, int source, int tag,
>               MPID_Comm *comm_ptr, MPI_Status *status,
>               MPIR_Errflag_t *errflag)
>
> Top-level MPI routines are stuck using an 'int' count, but internally we
> can (and now do) use a larger MPI_Aint type.
>
> True, int*int can overflow, but since that parameter to MPIC_Recv is of
> the larger MPI_Aint type, we're ok here. LLVM disagrees?
>
Once there is a version with the patches that I can download, I can run
the analysis again and see whether it catches anything more. Just let me
know and I would be happy to test it.
In general, overflows can also occur in intermediate computations,
depending on how the code is optimized: even with the wider MPI_Aint
parameter, the argument expression recvblks * recvcount is still
evaluated in 32-bit int before the implicit conversion.
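A tiny example of what I mean (take_count is a hypothetical stand-in
for the widened MPIC_Recv prototype):

    #include <stdio.h>

    /* the parameter is 64-bit, like the new MPI_Aint count... */
    static long long take_count(long long count) { return count; }

    int main(void)
    {
        int recvblks  = 1 << 19;  /* 524,288 */
        int recvcount = 1 << 12;  /* 4,096 */
        /* ...but recvblks * recvcount is computed in 32-bit int and
         * wraps *before* the implicit conversion to long long
         * (strictly, signed overflow is undefined behavior; in
         * practice it typically wraps) */
        printf("%lld\n", take_count(recvblks * recvcount));
        return 0;
    }

With typical wraparound behavior this prints -2147483648 rather than
2147483648.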
Thanks!
Ignacio
> ==rob
>
_______________________________________________
discuss mailing list discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss