[mpich-discuss] Problem with target_disp Parameter and 1-sided Messages

Corey A. Henderson cahenderson at wisc.edu
Sun Jan 19 12:12:36 CST 2014


Jeff,

Thank you for the help. Indeed, changing the disp_unit argument of
MPI_Win_allocate from MPI_INT to sizeof(int) does eliminate the seg faults
in the subsequent window accesses.
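
For the archives, here is a minimal sketch of the corrected allocation
(same buffer size as in my test program below):

    int *buffer;
    MPI_Win window;

    /* disp_unit is a byte count (a plain int), not a datatype handle */
    MPI_Win_allocate(10 * sizeof(int), sizeof(int), MPI_INFO_NULL,
                     MPI_COMM_WORLD, &buffer, &window);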

Rajeev, I am not getting any compiler warnings for using int variables or
literals for target_disp on my machine, but I did change all instances to
MPI_Aint just in case. Perhaps some other compiler would warn about it. I
am using mpic++/g++ on Ubuntu.

I vaguely remember from the MPI-3.0 standard (but can't seem to find it
now) that target_disp is always scaled by the disp_unit the target
specified at window creation; the datatype passed to Get/Put does not
determine the byte stride used with target_disp. If that is right, asking
to Put/Get data with a target_disp of 1 against a disp_unit of 1275069445
would address memory far outside the window, and since the buffer and
window are only a few tens of bytes long, a seg fault is no surprise. But
as I said earlier, I am by no means an expert. Still learning my way
around.
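
To make the arithmetic concrete, my understanding of the target-side
address computation is roughly the following (a sketch; the variable
names are just for illustration):

    /* target_addr = window_base + target_disp * disp_unit */
    MPI_Aint target_disp = 1;
    int disp_unit = 1275069445;   /* the MPI_INT handle read as an int */
    MPI_Aint offset = target_disp * (MPI_Aint) disp_unit;
    /* offset is 1275069445 bytes, i.e. over 1 GiB past the base of a
       40-byte window, hence the seg fault */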

Thanks again to both of you for the help. I am not sure I would've ever
caught that myself!

Corey


On Sat, Jan 18, 2014 at 4:56 PM, Jeff Hammond <jeff.science at gmail.com> wrote:

> Hi Corey,
>
> It was a simple bug.  You passed MPI_INT as the disp_unit rather than
> sizeof(int), which is almost certainly what you meant.  MPI_INT is an
> opaque handle that is 1275069445 when interpreted as an integer.
>
> I am not entirely sure why using such a large disp_unit caused the
> segfault though.  I haven't yet figured out where disp_unit is used in
> the MPI_Get code.
>
> Best,
>
> Jeff
>
> On Sat, Jan 18, 2014 at 2:41 PM, Corey A. Henderson
> <cahenderson at wisc.edu> wrote:
> > I am having a problem where I cannot set the target_disp parameter to a
> > positive value in any of the 1-sided calls I've tried (e.g., MPI_Put,
> > MPI_Get, MPI_Fetch_and_op, etc.)
> >
> > I am trying to use a shared (lock_all) approach with flushes. When I set
> > target_disp to zero, the messaging works as expected. If I use a positive
> > value I always get a seg fault.
> >
> > Obligatory disclaimer: I am not a C or MPI expert, so it's entirely
> > possible I've made some newbie error here. But I am at my wit's end
> > trying to figure this out and could use help.
> >
> > Info: MPICH 3.0.4 built on Ubuntu 12.04 LTS running one node on an
> > Intel® Core™ i5-3570K CPU @ 3.40GHz × 4
> >
> > I've attached the code I've isolated to show the problem. With the
> > targetDisp int set to 0, the data is properly transferred. If it is set
> > to 1, or to sizeof(int), I get the following seg fault from mpiexec:
> >
> > corey@UbuntuDesktop:~/workspace/TargetDispBug/Release$ mpiexec -n 2
> > ./TargetDispBug
> >
> >
> > ===================================================================================
> > =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> > =   EXIT CODE: 139
> > =   CLEANING UP REMAINING PROCESSES
> > =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> > ===================================================================================
> > YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
> > This typically refers to a problem with your application.
> > Please see the FAQ page for debugging suggestions
> >
> > However, for targetDisp == 0 I get (as expected):
> >
> > corey@UbuntuDesktop:~/workspace/TargetDispBug/Release$ mpiexec -n 2
> > ./TargetDispBug
> > Received: 42.
> >
> > The seg fault occurs at the MPI_Win_flush on both processes when
> > targetDisp > 0, whether it follows the Put, the Get, or both.
> >
> > Any help with this would be great.
> >
> > Code follows:
> >
> > #include "mpi.h"
> > #include <stdio.h>   /* for printf */
> >
> > int main(int argc, char* argv[]){
> >
> >     // Test main for one sided message queueing
> >     int rank, numranks, targetDisp = 0;
> >     int sizeInBytes = 10*sizeof(int), *buffer;
> >     MPI_Win window;
> >
> >     MPI_Init(&argc, &argv);
> >
> >     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >     MPI_Comm_size(MPI_COMM_WORLD, &numranks);
> >
> >     /* This is the bug identified above: MPI_INT is an opaque datatype
> >        handle, not a displacement unit; it should be sizeof(int). */
> >     MPI_Win_allocate(sizeInBytes, MPI_INT, MPI_INFO_NULL, MPI_COMM_WORLD,
> >                      &buffer, &window);
> >
> >     MPI_Win_lock_all(0, window);
> >
> >     int *sendBuffer;
> >     int *receiveBuffer;
> >
> >     MPI_Alloc_mem(sizeof(int), MPI_INFO_NULL, &sendBuffer);
> >     MPI_Alloc_mem(sizeof(int), MPI_INFO_NULL, &receiveBuffer);
> >
> >     if (rank == 1) {
> >
> >         sendBuffer[0] = 42;
> >
> >         MPI_Put(sendBuffer, 1, MPI_INT, 0, targetDisp, 1, MPI_INT,
> >                 window);
> >
> >         MPI_Win_flush(0, window);
> >
> >     }
> >
> >     MPI_Barrier(MPI_COMM_WORLD);
> >
> >     if (rank == 0) {
> >
> >         MPI_Get(receiveBuffer, 1, MPI_INT, 0, targetDisp, 1, MPI_INT,
> >                 window);
> >
> >         MPI_Win_flush(0, window);
> >
> >         printf("Received: %d.\n", receiveBuffer[0]);
> >
> >     }
> >
> >     MPI_Win_unlock_all(window);
> >
> >     MPI_Free_mem(sendBuffer);
> >     MPI_Free_mem(receiveBuffer);
> >
> >     MPI_Win_free(&window);
> >
> >     MPI_Finalize();
> >     return 0;
> >
> > }
> >
> >
> > _______________________________________________
> > discuss mailing list     discuss at mpich.org
> > To manage subscription options or unsubscribe:
> > https://lists.mpich.org/mailman/listinfo/discuss
>
>
>
> --
> Jeff Hammond
> jeff.science at gmail.com
>

