<div dir="ltr"><div><div>Thank you so much! I figure it out the problem. <br><br></div>it is really amazing by this debugging method with my limited knowledge. I was shocked. just like a magic. Thank you so much for your help!<br>
<br></div>Sufeng<br><div><div><div><div><div><div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Jul 10, 2013 at 5:18 PM, <span dir="ltr"><<a href="mailto:discuss-request@mpich.org" target="_blank">discuss-request@mpich.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Send discuss mailing list submissions to<br>
<a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
<br>
To subscribe or unsubscribe via the World Wide Web, visit<br>
<a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
or, via email, send a message with subject or body 'help' to<br>
<a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a><br>
<br>
You can reach the person managing the list at<br>
<a href="mailto:discuss-owner@mpich.org">discuss-owner@mpich.org</a><br>
<br>
When replying, please edit your Subject line so it is more specific<br>
than "Re: Contents of discuss digest..."<br>
<br>
<br>
Today's Topics:<br>
<br>
1. Re: MPI_Win_fence failed (Jim Dinan)<br>
<br>
<br>
----------------------------------------------------------------------<br>
<br>
Message: 1<br>
Date: Wed, 10 Jul 2013 18:18:32 -0400<br>
From: Jim Dinan <<a href="mailto:james.dinan@gmail.com">james.dinan@gmail.com</a>><br>
To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
Message-ID:<br>
<<a href="mailto:CAOoEU4GHbqbzmB7eZmfG1Zp23QCjAT5-YjnvEqjxhdgOomXNKA@mail.gmail.com">CAOoEU4GHbqbzmB7eZmfG1Zp23QCjAT5-YjnvEqjxhdgOomXNKA@mail.gmail.com</a>><br>
Content-Type: text/plain; charset="iso-8859-1"<br>
<br>
I did a quick grep of your code. The following looks like it could be a<br>
bug:<br>
<br>
---8<---<br>
<br>
$ grep MPI_Win_create *<br>
...<br>
udp_server.c: MPI_Win_create(image_buff,<br>
2*image_info->image_size*sizeof(uint16), sizeof(uint16), MPI_INFO_NULL,<br>
MPI_COMM_WORLD, win);<br>
$ grep MPI_Get *<br>
...<br>
rms.c: MPI_Get(strip_buff, image_info->buffer_size, MPI_INT, 0,<br>
(rank-1)*image_info->buffer_size, image_info->buffer_size, MPI_INT, *win);<br>
<br>
---8<---<br>
<br>
The window size is "2*image_info->image_size*sizeof(uint16)" and the<br>
displacement is "(rank-1)*image_info->buffer_size". The displacements<br>
expect a window that is proportional to the number of ranks, but the window<br>
has a fixed size. It looks like this would cause your gets to wander<br>
outside of the exposed buffer at the target.<br>
<br>
~Jim.<br>
<br>
<br>
On Wed, Jul 10, 2013 at 6:11 PM, Jim Dinan <<a href="mailto:james.dinan@gmail.com">james.dinan@gmail.com</a>> wrote:<br>
<br>
> From that backtrace, it looks like the displacement/datatype that you gave<br>
> in the call to MPI_Get() caused the target process to access an invalid<br>
> location in memory. MPICH does not check whether the window accesses at a<br>
> process targeted by RMA operations are constrained to the window. I would<br>
> start by making sure that your gets are contained within the window at the<br>
> target process.<br>
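> For illustration, a defensive check along these lines on the origin side makes<br>
> an out-of-range get visible before the fence. It is only a sketch, not code<br>
> from the project: win_elems is a hypothetical name for the number of elements<br>
> the target exposed in MPI_Win_create.<br>
><br>
> /* sketch: fail fast if the requested range falls outside the window */<br>
> MPI_Aint first = (MPI_Aint)(rank - 1) * image_info->buffer_size;<br>
> assert(win_elems - first >= image_info->buffer_size);   /* needs assert.h */<br>
> MPI_Get(strip_buff, image_info->buffer_size, MPI_UINT16_T, 0,<br>
>         first, image_info->buffer_size, MPI_UINT16_T, *win);<br>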
><br>
> ~Jim.<br>
><br>
><br>
> On Wed, Jul 10, 2013 at 1:33 PM, Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>> wrote:<br>
><br>
>> Hi Jeff<br>
>><br>
>> Sorry for sending so many emails and cluttering up the discuss group.<br>
>><br>
>> I found that the scientific image is too large to upload to GitHub, so I<br>
>> put it on the ftp:<br>
>> <a href="ftp://ftp.xray.aps.anl.gov/pub/sector8/" target="_blank">ftp://ftp.xray.aps.anl.gov/pub/sector8/</a> there is 55Fe_run5_dark.tif file.<br>
>><br>
>> Just put the tif file alongside the source code. Sorry again for the<br>
>> frequent email broadcasts. Thank you so much for your debugging help.<br>
>><br>
>> Sufeng<br>
>><br>
>><br>
>> On Wed, Jul 10, 2013 at 12:08 PM, <<a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a>> wrote:<br>
>><br>
>>> Send discuss mailing list submissions to<br>
>>> <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>><br>
>>> To subscribe or unsubscribe via the World Wide Web, visit<br>
>>> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> or, via email, send a message with subject or body 'help' to<br>
>>> <a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a><br>
>>><br>
>>> You can reach the person managing the list at<br>
>>> <a href="mailto:discuss-owner@mpich.org">discuss-owner@mpich.org</a><br>
>>><br>
>>> When replying, please edit your Subject line so it is more specific<br>
>>> than "Re: Contents of discuss digest..."<br>
>>><br>
>>><br>
>>> Today's Topics:<br>
>>><br>
>>> 1. Re: MPI_Win_fence failed (Jeff Hammond)<br>
>>> 2. Re: MPI_Win_fence failed (Sufeng Niu)<br>
>>><br>
>>><br>
>>> ----------------------------------------------------------------------<br>
>>><br>
>>> Message: 1<br>
>>> Date: Wed, 10 Jul 2013 12:05:09 -0500<br>
>>> From: Jeff Hammond <<a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a>><br>
>>> To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
>>> Message-ID:<br>
>>> <CAGKz=<br>
>>> <a href="mailto:uJ-aoHqK5A_tS6YfWaaxjw5AjhHM7xL1A0XaUSUjKvDcQ@mail.gmail.com">uJ-aoHqK5A_tS6YfWaaxjw5AjhHM7xL1A0XaUSUjKvDcQ@mail.gmail.com</a>><br>
>>> Content-Type: text/plain; charset=ISO-8859-1<br>
>>><br>
>>> use dropbox, pastebin, etc. for attachments. it makes life a lot<br>
>>> easier for everyone.<br>
>>><br>
>>> jeff<br>
>>><br>
>>> On Wed, Jul 10, 2013 at 11:57 AM, Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>> wrote:<br>
>>> > Sorry, I found that this discussion list cannot accept figures or<br>
>>> > attachments.<br>
>>> ><br>
>>> > the backtrace information is below:<br>
>>> ><br>
>>> > processes                              Location                     PC            Host           Rank  ID          Status<br>
>>> > 7            _start                       0x00402399<br>
>>> > `-7          _libc_start_main             0x3685c1ecdd<br>
>>> >   `-7        main                         0x00402474<br>
>>> >     `-7      dkm                          ...<br>
>>> >       |-6      image_rms                  0x004029bb<br>
>>> >       | `-6      rms                      0x00402d44<br>
>>> >       |   `-6      PMPI_Win_fence         0x0040c389<br>
>>> >       |     `-6      MPIDI_Win_fence      0x004a45f4<br>
>>> >       |       `-6      MPIDI_CH3I_RMAListComplete  0x004a27d3<br>
>>> >       |         `-6      MPIDI_CH3I_Progress  ...<br>
>>> >       `-1      udp                        0x004035cf<br>
>>> >         `-1      PMPI_Win_fence           0x0040c389<br>
>>> >           `-1      MPIDI_Win_fence        0x004a45a0<br>
>>> >             `-1      MPIDI_CH3I_Progress  0x004292f5<br>
>>> >               `-1      MPIDI_CH3_PktHandler_Get  0x0049f3f9<br>
>>> >                 `-1      MPIDI_CH3_iSendv  0x004aa67c<br>
>>> >                   `-       memcpy         0x3685c89329  164.54.54.122  0     20.1-13994  Stopped<br>
>>> ><br>
>>> ><br>
>>> ><br>
>>> > On Wed, Jul 10, 2013 at 11:39 AM, <<a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a>> wrote:<br>
>>> >><br>
>>> >> Send discuss mailing list submissions to<br>
>>> >> <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >><br>
>>> >> To subscribe or unsubscribe via the World Wide Web, visit<br>
>>> >> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> or, via email, send a message with subject or body 'help' to<br>
>>> >> <a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a><br>
>>> >><br>
>>> >> You can reach the person managing the list at<br>
>>> >> <a href="mailto:discuss-owner@mpich.org">discuss-owner@mpich.org</a><br>
>>> >><br>
>>> >> When replying, please edit your Subject line so it is more specific<br>
>>> >> than "Re: Contents of discuss digest..."<br>
>>> >><br>
>>> >><br>
>>> >> Today's Topics:<br>
>>> >><br>
>>> >> 1. Re: MPI_Win_fence failed (Sufeng Niu)<br>
>>> >><br>
>>> >><br>
>>> >> ----------------------------------------------------------------------<br>
>>> >><br>
>>> >> Message: 1<br>
>>> >> Date: Wed, 10 Jul 2013 11:39:39 -0500<br>
>>> >><br>
>>> >> From: Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> >> To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
>>> >> Message-ID:<br>
>>> >><br>
>>> >> <CAFNNHkz8pBfX33icn=+3rdXvqDfWqeu58odpd=<a href="mailto:mOXLciysHgfg@mail.gmail.com">mOXLciysHgfg@mail.gmail.com</a>><br>
>>> >> Content-Type: text/plain; charset="iso-8859-1"<br>
>>> >><br>
>>> >><br>
>>> >> Sorry, I forgot to add the screenshot of the backtrace; it is<br>
>>> >> attached.<br>
>>> >><br>
>>> >> Thanks a lot!<br>
>>> >><br>
>>> >> Sufeng<br>
>>> >><br>
>>> >><br>
>>> >><br>
>>> >> On Wed, Jul 10, 2013 at 11:30 AM, <<a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a>> wrote:<br>
>>> >><br>
>>> >> > Send discuss mailing list submissions to<br>
>>> >> > <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> ><br>
>>> >> > To subscribe or unsubscribe via the World Wide Web, visit<br>
>>> >> > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > or, via email, send a message with subject or body 'help' to<br>
>>> >> > <a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a><br>
>>> >> ><br>
>>> >> > You can reach the person managing the list at<br>
>>> >> > <a href="mailto:discuss-owner@mpich.org">discuss-owner@mpich.org</a><br>
>>> >> ><br>
>>> >> > When replying, please edit your Subject line so it is more specific<br>
>>> >> > than "Re: Contents of discuss digest..."<br>
>>> >> ><br>
>>> >> ><br>
>>> >> > Today's Topics:<br>
>>> >> ><br>
>>> >> > 1. Re: MPI_Win_fence failed (Sufeng Niu)<br>
>>> >> ><br>
>>> >> ><br>
>>> >> ><br>
>>> ----------------------------------------------------------------------<br>
>>> >> ><br>
>>> >> > Message: 1<br>
>>> >> > Date: Wed, 10 Jul 2013 11:30:36 -0500<br>
>>> >> > From: Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> >> > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
>>> >> > Message-ID:<br>
>>> >> > <<br>
>>> >> > <a href="mailto:CAFNNHkyLj8CbYMmc_w2DA9_%2Bq2Oe3kyus%2Bg6c99ShPk6ZXVkdA@mail.gmail.com">CAFNNHkyLj8CbYMmc_w2DA9_+q2Oe3kyus+g6c99ShPk6ZXVkdA@mail.gmail.com</a>><br>
>>> >> > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> >> ><br>
>>> >> > Hi Jim,<br>
>>> >> ><br>
>>> >> > Thanks a lot for your reply. The basic way for me to debug is<br>
>>> >> > barrier + printf; right now I only have an evaluation version of<br>
>>> >> > TotalView. The backtrace from TotalView is shown below: udp does the<br>
>>> >> > UDP collection and creates the RMA window, and image_rms does MPI_Get<br>
>>> >> > to access the window.<br>
>>> >> ><br>
>>> >> > There is a segmentation violation, but I don't know why the program<br>
>>> >> > stopped at MPI_Win_fence.<br>
>>> >> ><br>
>>> >> > Thanks a lot!<br>
>>> >> ><br>
>>> >> ><br>
>>> >> ><br>
>>> >> ><br>
>>> >> ><br>
>>> >> ><br>
>>> >> ><br>
>>> >> > On Wed, Jul 10, 2013 at 10:12 AM, <<a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a>><br>
>>> wrote:<br>
>>> >> ><br>
>>> >> > > Send discuss mailing list submissions to<br>
>>> >> > > <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > ><br>
>>> >> > > To subscribe or unsubscribe via the World Wide Web, visit<br>
>>> >> > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > > or, via email, send a message with subject or body 'help' to<br>
>>> >> > > <a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a><br>
>>> >> > ><br>
>>> >> > > You can reach the person managing the list at<br>
>>> >> > > <a href="mailto:discuss-owner@mpich.org">discuss-owner@mpich.org</a><br>
>>> >> > ><br>
>>> >> > > When replying, please edit your Subject line so it is more<br>
>>> specific<br>
>>> >> > > than "Re: Contents of discuss digest..."<br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> > > Today's Topics:<br>
>>> >> > ><br>
>>> >> > > 1. Re: MPICH3.0.4 make fails with "No rule to make target..."<br>
>>> >> > > (Wesley Bland)<br>
>>> >> > > 2. Re: Error in MPI_Finalize on a simple ring test over TCP<br>
>>> >> > > (Wesley Bland)<br>
>>> >> > > 3. Restrict number of cores, not threads (Bob Ilgner)<br>
>>> >> > > 4. Re: Restrict number of cores, not threads (Wesley Bland)<br>
>>> >> > > 5. Re: Restrict number of cores, not threads (Wesley Bland)<br>
>>> >> > > 6. Re: Error in MPI_Finalize on a simple ring test over TCP<br>
>>> >> > > (Thomas Ropars)<br>
>>> >> > > 7. MPI_Win_fence failed (Sufeng Niu)<br>
>>> >> > > 8. Re: MPI_Win_fence failed (Jim Dinan)<br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> > ><br>
>>> ----------------------------------------------------------------------<br>
>>> >> > ><br>
>>> >> > > Message: 1<br>
>>> >> > > Date: Wed, 10 Jul 2013 08:29:06 -0500<br>
>>> >> > > From: Wesley Bland <<a href="mailto:wbland@mcs.anl.gov">wbland@mcs.anl.gov</a>><br>
>>> >> > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > Subject: Re: [mpich-discuss] MPICH3.0.4 make fails with "No rule<br>
>>> to<br>
>>> >> > > make target..."<br>
>>> >> > > Message-ID: <<a href="mailto:F48FC916-31F7-4F82-95F8-2D6A6C45264F@mcs.anl.gov">F48FC916-31F7-4F82-95F8-2D6A6C45264F@mcs.anl.gov</a>><br>
>>> >> > > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> >> > ><br>
>>> >> > > Unfortunately, due to the lack of developer resources and<br>
>>> interest,<br>
>>> >> > > the<br>
>>> >> > > last version of MPICH which was supported on Windows was 1.4.1p.<br>
>>> You<br>
>>> >> > > can<br>
>>> >> > > find that version on the downloads page:<br>
>>> >> > ><br>
>>> >> > > <a href="http://www.mpich.org/downloads/" target="_blank">http://www.mpich.org/downloads/</a><br>
>>> >> > ><br>
>>> >> > > Alternatively, Microsoft maintains a derivative of MPICH which<br>
>>> should<br>
>>> >> > > provide the features you need. You can also find a link to that on the<br>
>>> >> > > downloads page above.<br>
>>> >> > ><br>
>>> >> > > Wesley<br>
>>> >> > ><br>
>>> >> > > On Jul 10, 2013, at 1:16 AM, Don Warren <<a href="mailto:don.warren@gmail.com">don.warren@gmail.com</a>><br>
>>> wrote:<br>
>>> >> > ><br>
>>> >> > > > Hello,<br>
>>> >> > > ><br>
>>> >> > > > As requested in the installation guide, I'm informing this list<br>
>>> of a<br>
>>> >> > > failure to correctly make MPICH3.0.4 on a Win7 system. The<br>
>>> specific<br>
>>> >> > error<br>
>>> >> > > encountered is<br>
>>> >> > > > "make[2]: *** No rule to make target<br>
>>> >> > > `/cygdrive/c/FLASH/mpich-3.0.4/src/mpi/romio/Makefile.am', needed<br>
>>> by<br>
>>> >> > > `/cygdrive/c/FLASH/mpich-3.0.4/src/mpi/romio/Makefile.in'. Stop."<br>
>>> >> > > ><br>
>>> >> > > > I have confirmed that both Makefile.am and Makefile.in exist in<br>
>>> the<br>
>>> >> > > directory listed. I'm attaching the c.txt and the m.txt files.<br>
>>> >> > > ><br>
>>> >> > > > Possibly of interest is that the command "make clean" fails at<br>
>>> >> > > > exactly<br>
>>> >> > > the same folder, with exactly the same error message as shown in<br>
>>> m.txt<br>
>>> >> > and<br>
>>> >> > > above.<br>
>>> >> > > ><br>
>>> >> > > > Any advice you can give would be appreciated. I'm attempting<br>
>>> to get<br>
>>> >> > > FLASH running on my computer, which seems to require MPICH.<br>
>>> >> > > ><br>
>>> >> > > > Regards,<br>
>>> >> > > > Don Warren<br>
>>> >> > > ><br>
>>> >> ><br>
>>> <config-make-outputs.zip>_______________________________________________<br>
>>> >> > > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > > To manage subscription options or unsubscribe:<br>
>>> >> > > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > ><br>
>>> >> > > -------------- next part --------------<br>
>>> >> > > An HTML attachment was scrubbed...<br>
>>> >> > > URL: <<br>
>>> >> > ><br>
>>> >> ><br>
>>> >> ><br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/69b497f1/attachment-0001.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/69b497f1/attachment-0001.html</a><br>
>>> >> > > ><br>
>>> >> > ><br>
>>> >> > > ------------------------------<br>
>>> >> > ><br>
>>> >> > > Message: 2<br>
>>> >> > > Date: Wed, 10 Jul 2013 08:39:47 -0500<br>
>>> >> > > From: Wesley Bland <<a href="mailto:wbland@mcs.anl.gov">wbland@mcs.anl.gov</a>><br>
>>> >> > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > Subject: Re: [mpich-discuss] Error in MPI_Finalize on a simple<br>
>>> ring<br>
>>> >> > > test over TCP<br>
>>> >> > > Message-ID: <<a href="mailto:D5999106-2A75-4091-8B0F-EAFA22880862@mcs.anl.gov">D5999106-2A75-4091-8B0F-EAFA22880862@mcs.anl.gov</a>><br>
>>> >> > > Content-Type: text/plain; charset=us-ascii<br>
>>> >> > ><br>
>>> >> > > The value of previous for rank 0 in your code is -1. MPICH is<br>
>>> >> > > complaining<br>
>>> >> > > because all of the requests to receive a message from -1 are still<br>
>>> >> > pending<br>
>>> >> > > when you try to finalize. You need to make sure that you are<br>
>>> receiving<br>
>>> >> > from<br>
>>> >> > > valid ranks.<br>
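>>> >> > > A common fix is to wrap the neighbour indices so rank 0 receives from the<br>
>>> >> > > last rank instead of -1. A sketch, assuming the usual ring variables<br>
>>> >> > > (sendbuf and recvbuf are placeholder names):<br>
>>> >> > ><br>
>>> >> > > int prev = (rank - 1 + size) % size;   /* rank 0 wraps to size-1, never -1 */<br>
>>> >> > > int next = (rank + 1) % size;<br>
>>> >> > > MPI_Sendrecv(sendbuf, 1, MPI_INT, next, 0,<br>
>>> >> > >              recvbuf, 1, MPI_INT, prev, 0,<br>
>>> >> > >              MPI_COMM_WORLD, MPI_STATUS_IGNORE);<br>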
>>> >> > ><br>
>>> >> > > On Jul 10, 2013, at 7:50 AM, Thomas Ropars <<a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a><br>
>>> ><br>
>>> >> > wrote:<br>
>>> >> > ><br>
>>> >> > > > Yes sure. Here it is.<br>
>>> >> > > ><br>
>>> >> > > > Thomas<br>
>>> >> > > ><br>
>>> >> > > > On 07/10/2013 02:23 PM, Wesley Bland wrote:<br>
>>> >> > > >> Can you send us the smallest chunk of code that still exhibits<br>
>>> this<br>
>>> >> > > error?<br>
>>> >> > > >><br>
>>> >> > > >> Wesley<br>
>>> >> > > >><br>
>>> >> > > >> On Jul 10, 2013, at 6:54 AM, Thomas Ropars <<br>
>>> <a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a>><br>
>>> >> > > wrote:<br>
>>> >> > > >><br>
>>> >> > > >>> Hi all,<br>
>>> >> > > >>><br>
>>> >> > > >>> I get the following error when I try to run a simple<br>
>>> application<br>
>>> >> > > implementing a ring (each process sends to rank+1 and receives<br>
>>> from<br>
>>> >> > > rank-1). More precisely, the error occurs during the call to<br>
>>> >> > MPI_Finalize():<br>
>>> >> > > >>><br>
>>> >> > > >>> Assertion failed in file<br>
>>> >> > > src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c at line 363:<br>
>>> >> > sc->pg_is_set<br>
>>> >> > > >>> internal ABORT - process 0<br>
>>> >> > > >>><br>
>>> >> > > >>> Does anybody else also noticed the same error?<br>
>>> >> > > >>><br>
>>> >> > > >>> Here are all the details about my test:<br>
>>> >> > > >>> - The error is generated with mpich-3.0.2 (but I noticed the<br>
>>> exact<br>
>>> >> > > same error with mpich-3.0.4)<br>
>>> >> > > >>> - I am using IPoIB for communication between nodes (The same<br>
>>> thing<br>
>>> >> > > happens over Ethernet)<br>
>>> >> > > >>> - The problem comes from TCP links. When all processes are on<br>
>>> the<br>
>>> >> > same<br>
>>> >> > > node, there is no error. As soon as one process is on a remote<br>
>>> node,<br>
>>> >> > > the<br>
>>> >> > > failure occurs.<br>
>>> >> > > >>> - Note also that the failure does not occur if I run a more<br>
>>> >> > > >>> complex<br>
>>> >> > > code (eg, a NAS benchmark).<br>
>>> >> > > >>><br>
>>> >> > > >>> Thomas Ropars<br>
>>> >> > > >>> _______________________________________________<br>
>>> >> > > >>> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > >>> To manage subscription options or unsubscribe:<br>
>>> >> > > >>> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > > >> _______________________________________________<br>
>>> >> > > >> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > >> To manage subscription options or unsubscribe:<br>
>>> >> > > >> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > > >><br>
>>> >> > > >><br>
>>> >> > > ><br>
>>> >> > > > <ring_clean.c>_______________________________________________<br>
>>> >> > > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > > To manage subscription options or unsubscribe:<br>
>>> >> > > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> > > ------------------------------<br>
>>> >> > ><br>
>>> >> > > Message: 3<br>
>>> >> > > Date: Wed, 10 Jul 2013 16:41:27 +0200<br>
>>> >> > > From: Bob Ilgner <<a href="mailto:bobilgner@gmail.com">bobilgner@gmail.com</a>><br>
>>> >> > > To: <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
>>> >> > > Subject: [mpich-discuss] Restrict number of cores, not threads<br>
>>> >> > > Message-ID:<br>
>>> >> > > <<br>
>>> >> > ><br>
>>> <a href="mailto:CAKv15b-QgmHkVkoiTFmP3EZXvyy6sc_QeqHQgbMUhnr3Xh9ecA@mail.gmail.com">CAKv15b-QgmHkVkoiTFmP3EZXvyy6sc_QeqHQgbMUhnr3Xh9ecA@mail.gmail.com</a>><br>
>>> >> > > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> >> > ><br>
>>> >> > > Dear all,<br>
>>> >> > ><br>
>>> >> > > I am working on a shared memory processor with 256 cores. I am<br>
>>> working<br>
>>> >> > from<br>
>>> >> > > the command line directly.<br>
>>> >> > ><br>
>>> >> > > Can I restrict the number of cores that I deploy? The command<br>
>>> >> > ><br>
>>> >> > > mpirun -n 100 myprog<br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> > > will automatically start on 100 cores. I wish to use only 10 cores and<br>
>>> >> > > have 10 threads on each core. Can I do this with MPICH? Remember that<br>
>>> >> > > this is an SMP and I cannot identify each core individually (as in a<br>
>>> >> > > cluster).<br>
>>> >> > ><br>
>>> >> > > Regards, bob<br>
>>> >> > > -------------- next part --------------<br>
>>> >> > > An HTML attachment was scrubbed...<br>
>>> >> > > URL: <<br>
>>> >> > ><br>
>>> >> ><br>
>>> >> ><br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/ec659e91/attachment-0001.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/ec659e91/attachment-0001.html</a><br>
>>> >> > > ><br>
>>> >> > ><br>
>>> >> > > ------------------------------<br>
>>> >> > ><br>
>>> >> > > Message: 4<br>
>>> >> > > Date: Wed, 10 Jul 2013 09:46:38 -0500<br>
>>> >> > > From: Wesley Bland <<a href="mailto:wbland@mcs.anl.gov">wbland@mcs.anl.gov</a>><br>
>>> >> > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > Cc: <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
>>> >> > > Subject: Re: [mpich-discuss] Restrict number of cores, not threads<br>
>>> >> > > Message-ID: <<a href="mailto:2FAF588E-2FBE-45E4-B53F-E6BC931E3D51@mcs.anl.gov">2FAF588E-2FBE-45E4-B53F-E6BC931E3D51@mcs.anl.gov</a>><br>
>>> >> > > Content-Type: text/plain; charset=iso-8859-1<br>
>>> >> > ><br>
>>> >> > > Threads in MPI are not ranks. When you say you want to launch<br>
>>> with -n<br>
>>> >> > 100,<br>
>>> >> > > you will always get 100 processes, not threads. If you want 10<br>
>>> threads<br>
>>> >> > > on<br>
>>> >> > > 10 cores, you will need to launch with -n 10, then add your<br>
>>> threads<br>
>>> >> > > according to your threading library.<br>
>>> >> > ><br>
>>> >> > > Note that threads in MPI do not get their own rank currently.<br>
>>> They all<br>
>>> >> > > share the same rank as the process in which they reside, so if you<br>
>>> >> > > need<br>
>>> >> > to<br>
>>> >> > > be able to handle things with different ranks, you'll need to use<br>
>>> >> > > actual<br>
>>> >> > > processes.<br>
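>>> >> > > For example, a sketch of that combination (10 processes, each spawning 10<br>
>>> >> > > OpenMP threads; the thread count and the MPI_THREAD_FUNNELED level are<br>
>>> >> > > assumptions for illustration, not taken from your program):<br>
>>> >> > ><br>
>>> >> > > /* compile with mpicc -fopenmp; needs mpi.h, omp.h, stdio.h */<br>
>>> >> > > int main(int argc, char **argv) {<br>
>>> >> > >     int provided, rank;<br>
>>> >> > >     MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);<br>
>>> >> > >     MPI_Comm_rank(MPI_COMM_WORLD, &rank);<br>
>>> >> > > #pragma omp parallel num_threads(10)<br>
>>> >> > >     {<br>
>>> >> > >         /* every thread shares the MPI rank of its process */<br>
>>> >> > >         printf("rank %d, thread %d\n", rank, omp_get_thread_num());<br>
>>> >> > >     }<br>
>>> >> > >     MPI_Finalize();<br>
>>> >> > >     return 0;<br>
>>> >> > > }<br>
>>> >> > ><br>
>>> >> > > launched as: mpirun -n 10 ./myprog<br>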
>>> >> > ><br>
>>> >> > > Wesley<br>
>>> >> > ><br>
>>> >> > > On Jul 10, 2013, at 9:41 AM, Bob Ilgner <<a href="mailto:bobilgner@gmail.com">bobilgner@gmail.com</a>><br>
>>> wrote:<br>
>>> >> > ><br>
>>> >> > > > Dear all,<br>
>>> >> > > ><br>
>>> >> > > > I am working on a shared memory processor with 256 cores. I am<br>
>>> >> > > > working<br>
>>> >> > > from the command line directly.<br>
>>> >> > > ><br>
>>> >> > > > Can I restict the number of cores that I deploy.The command<br>
>>> >> > > ><br>
>>> >> > > > mpirun -n 100 myprog<br>
>>> >> > > ><br>
>>> >> > > ><br>
>>> >> > > > will automatically start on 100 cores. I wish to use only 10<br>
>>> cores<br>
>>> >> > > > and<br>
>>> >> > > have 10 threads on each core. Can I do this with mpich ?<br>
>>> Rememebre<br>
>>> >> > > that<br>
>>> >> > > this an smp abd I can not identify each core individually(as in a<br>
>>> >> > cluster)<br>
>>> >> > > ><br>
>>> >> > > > Regards, bob<br>
>>> >> > > > _______________________________________________<br>
>>> >> > > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > > To manage subscription options or unsubscribe:<br>
>>> >> > > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> > > ------------------------------<br>
>>> >> > ><br>
>>> >> > > Message: 6<br>
>>> >> > > Date: Wed, 10 Jul 2013 16:50:36 +0200<br>
>>> >> > > From: Thomas Ropars <<a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a>><br>
>>> >> > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > Subject: Re: [mpich-discuss] Error in MPI_Finalize on a simple<br>
>>> ring<br>
>>> >> > > test over TCP<br>
>>> >> > > Message-ID: <<a href="mailto:51DD74BC.3020009@epfl.ch">51DD74BC.3020009@epfl.ch</a>><br>
>>> >> > > Content-Type: text/plain; charset=UTF-8; format=flowed<br>
>>> >> > ><br>
>>> >> > > Yes, you are right, sorry for disturbing.<br>
>>> >> > ><br>
>>> >> > > On 07/10/2013 03:39 PM, Wesley Bland wrote:<br>
>>> >> > > > The value of previous for rank 0 in your code is -1. MPICH is<br>
>>> >> > > complaining because all of the requests to receive a message from<br>
>>> -1<br>
>>> >> > > are<br>
>>> >> > > still pending when you try to finalize. You need to make sure<br>
>>> that you<br>
>>> >> > are<br>
>>> >> > > receiving from valid ranks.<br>
>>> >> > > ><br>
>>> >> > > > On Jul 10, 2013, at 7:50 AM, Thomas Ropars <<br>
>>> <a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a>><br>
>>> >> > > wrote:<br>
>>> >> > > ><br>
>>> >> > > >> Yes sure. Here it is.<br>
>>> >> > > >><br>
>>> >> > > >> Thomas<br>
>>> >> > > >><br>
>>> >> > > >> On 07/10/2013 02:23 PM, Wesley Bland wrote:<br>
>>> >> > > >>> Can you send us the smallest chunk of code that still exhibits<br>
>>> >> > > >>> this<br>
>>> >> > > error?<br>
>>> >> > > >>><br>
>>> >> > > >>> Wesley<br>
>>> >> > > >>><br>
>>> >> > > >>> On Jul 10, 2013, at 6:54 AM, Thomas Ropars <<br>
>>> <a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a>><br>
>>> >> > > wrote:<br>
>>> >> > > >>><br>
>>> >> > > >>>> Hi all,<br>
>>> >> > > >>>><br>
>>> >> > > >>>> I get the following error when I try to run a simple<br>
>>> application<br>
>>> >> > > implementing a ring (each process sends to rank+1 and receives<br>
>>> from<br>
>>> >> > > rank-1). More precisely, the error occurs during the call to<br>
>>> >> > MPI_Finalize():<br>
>>> >> > > >>>><br>
>>> >> > > >>>> Assertion failed in file<br>
>>> >> > > src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c at line 363:<br>
>>> >> > sc->pg_is_set<br>
>>> >> > > >>>> internal ABORT - process 0<br>
>>> >> > > >>>><br>
>>> >> > > >>>> Does anybody else also noticed the same error?<br>
>>> >> > > >>>><br>
>>> >> > > >>>> Here are all the details about my test:<br>
>>> >> > > >>>> - The error is generated with mpich-3.0.2 (but I noticed the<br>
>>> >> > > >>>> exact<br>
>>> >> > > same error with mpich-3.0.4)<br>
>>> >> > > >>>> - I am using IPoIB for communication between nodes (The same<br>
>>> >> > > >>>> thing<br>
>>> >> > > happens over Ethernet)<br>
>>> >> > > >>>> - The problem comes from TCP links. When all processes are<br>
>>> on the<br>
>>> >> > > same node, there is no error. As soon as one process is on a<br>
>>> remote<br>
>>> >> > > node,<br>
>>> >> > > the failure occurs.<br>
>>> >> > > >>>> - Note also that the failure does not occur if I run a more<br>
>>> >> > > >>>> complex<br>
>>> >> > > code (eg, a NAS benchmark).<br>
>>> >> > > >>>><br>
>>> >> > > >>>> Thomas Ropars<br>
>>> >> > > >>>> _______________________________________________<br>
>>> >> > > >>>> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > >>>> To manage subscription options or unsubscribe:<br>
>>> >> > > >>>> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > > >>> _______________________________________________<br>
>>> >> > > >>> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > >>> To manage subscription options or unsubscribe:<br>
>>> >> > > >>> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > > >>><br>
>>> >> > > >>><br>
>>> >> > > >> <ring_clean.c>_______________________________________________<br>
>>> >> > > >> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > >> To manage subscription options or unsubscribe:<br>
>>> >> > > >> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > > > _______________________________________________<br>
>>> >> > > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > > To manage subscription options or unsubscribe:<br>
>>> >> > > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > > ><br>
>>> >> > > ><br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> > > ------------------------------<br>
>>> >> > ><br>
>>> >> > > Message: 7<br>
>>> >> > > Date: Wed, 10 Jul 2013 10:07:21 -0500<br>
>>> >> > > From: Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> >> > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > Subject: [mpich-discuss] MPI_Win_fence failed<br>
>>> >> > > Message-ID:<br>
>>> >> > > <<br>
>>> >> > ><br>
>>> <a href="mailto:CAFNNHkz_1gC7hfpx0G9j24adO-gDabdmwZ4VuT6jip-fDMhS9A@mail.gmail.com">CAFNNHkz_1gC7hfpx0G9j24adO-gDabdmwZ4VuT6jip-fDMhS9A@mail.gmail.com</a>><br>
>>> >> > > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> >> > ><br>
>>> >> > > Hello,<br>
>>> >> > ><br>
>>> >> > > I used MPI RMA in my program, but the program stops at MPI_Win_fence. I<br>
>>> >> > > have a master process that receives data from a UDP socket; the other<br>
>>> >> > > processes use MPI_Get to access the data.<br>
>>> >> > ><br>
>>> >> > > master process:<br>
>>> >> > ><br>
>>> >> > > MPI_Win_create(...)<br>
>>> >> > > for(...){<br>
>>> >> > > /* udp recv operation */<br>
>>> >> > ><br>
>>> >> > > MPI_Barrier // to let other process know data received from udp<br>
>>> is<br>
>>> >> > > ready<br>
>>> >> > ><br>
>>> >> > > MPI_Win_fence(0, win);<br>
>>> >> > > MPI_Win_fence(0, win);<br>
>>> >> > ><br>
>>> >> > > }<br>
>>> >> > ><br>
>>> >> > > other processes:<br>
>>> >> > ><br>
>>> >> > > for(...){<br>
>>> >> > ><br>
>>> >> > > MPI_Barrier // sync for udp data ready<br>
>>> >> > ><br>
>>> >> > > MPI_Win_fence(0, win);<br>
>>> >> > ><br>
>>> >> > > MPI_Get();<br>
>>> >> > ><br>
>>> >> > > MPI_Win_fence(0, win); <-- program stopped here<br>
>>> >> > ><br>
>>> >> > > /* other operation */<br>
>>> >> > > }<br>
>>> >> > ><br>
>>> >> > > I found that the program stopped at second MPI_Win_fence, the<br>
>>> terminal<br>
>>> >> > > output is:<br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> ><br>
>>> >> ><br>
>>> ===================================================================================<br>
>>> >> > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES<br>
>>> >> > > = EXIT CODE: 11<br>
>>> >> > > = CLEANING UP REMAINING PROCESSES<br>
>>> >> > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES<br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> ><br>
>>> >> ><br>
>>> ===================================================================================<br>
>>> >> > > YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation<br>
>>> fault<br>
>>> >> > > (signal 11)<br>
>>> >> > > This typically refers to a problem with your application.<br>
>>> >> > > Please see the FAQ page for debugging suggestions<br>
>>> >> > ><br>
>>> >> > > Do you have any suggestions? Thank you very much!<br>
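>>> >> > > For comparison, a minimal fence/get round that runs cleanly looks like the<br>
>>> >> > > sketch below (self-contained, one int per rank; it illustrates the pattern<br>
>>> >> > > only and is not the real application):<br>
>>> >> > ><br>
>>> >> > > /* compile with mpicc; needs mpi.h and stdio.h */<br>
>>> >> > > int main(int argc, char **argv) {<br>
>>> >> > >     int rank, nprocs, val, remote;<br>
>>> >> > >     MPI_Win win;<br>
>>> >> > >     MPI_Init(&argc, &argv);<br>
>>> >> > >     MPI_Comm_rank(MPI_COMM_WORLD, &rank);<br>
>>> >> > >     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);<br>
>>> >> > >     val = rank * 100;<br>
>>> >> > >     MPI_Win_create(&val, sizeof(int), sizeof(int),<br>
>>> >> > >                    MPI_INFO_NULL, MPI_COMM_WORLD, &win);<br>
>>> >> > >     MPI_Win_fence(0, win);                      /* open the access epoch */<br>
>>> >> > >     MPI_Get(&remote, 1, MPI_INT, (rank + 1) % nprocs, 0, 1, MPI_INT, win);<br>
>>> >> > >     MPI_Win_fence(0, win);                      /* complete the gets     */<br>
>>> >> > >     printf("rank %d got %d\n", rank, remote);<br>
>>> >> > >     MPI_Win_free(&win);<br>
>>> >> > >     MPI_Finalize();<br>
>>> >> > >     return 0;<br>
>>> >> > > }<br>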
>>> >> > ><br>
>>> >> > > --<br>
>>> >> > > Best Regards,<br>
>>> >> > > Sufeng Niu<br>
>>> >> > > ECASP lab, ECE department, Illinois Institute of Technology<br>
>>> >> > > Tel: <a href="tel:312-731-7219" value="+13127317219">312-731-7219</a><br>
>>> >> > > -------------- next part --------------<br>
>>> >> > > An HTML attachment was scrubbed...<br>
>>> >> > > URL: <<br>
>>> >> > ><br>
>>> >> ><br>
>>> >> ><br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/375a95ac/attachment-0001.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/375a95ac/attachment-0001.html</a><br>
>>> >> > > ><br>
>>> >> > ><br>
>>> >> > > ------------------------------<br>
>>> >> > ><br>
>>> >> > > Message: 8<br>
>>> >> > > Date: Wed, 10 Jul 2013 11:12:45 -0400<br>
>>> >> > > From: Jim Dinan <<a href="mailto:james.dinan@gmail.com">james.dinan@gmail.com</a>><br>
>>> >> > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
>>> >> > > Message-ID:<br>
>>> >> > > <CAOoEU4F3hX=y3yrJKYKucNeiueQYBeR_3OQas9E+mg+GM6Rz=<br>
>>> >> > > <a href="mailto:w@mail.gmail.com">w@mail.gmail.com</a>><br>
>>> >> > > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> >> > ><br>
>>> >> > > It's hard to tell where the segmentation fault is coming from.<br>
>>> Can<br>
>>> >> > > you<br>
>>> >> > use<br>
>>> >> > > a debugger to generate a backtrace?<br>
>>> >> > ><br>
>>> >> > > ~Jim.<br>
>>> >> > ><br>
>>> >> > ><br>
>>> >> > > On Wed, Jul 10, 2013 at 11:07 AM, Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> >> > > wrote:<br>
>>> >> > ><br>
>>> >> > > > Hello,<br>
>>> >> > > ><br>
>>> >> > > > I used MPI RMA in my program, but the program stop at the<br>
>>> >> > MPI_Win_fence,<br>
>>> >> > > I<br>
>>> >> > > > have a master process receive data from udp socket. Other<br>
>>> processes<br>
>>> >> > > > use<br>
>>> >> > > > MPI_Get to access data.<br>
>>> >> > > ><br>
>>> >> > > > master process:<br>
>>> >> > > ><br>
>>> >> > > > MPI_Create(...)<br>
>>> >> > > > for(...){<br>
>>> >> > > > /* udp recv operation */<br>
>>> >> > > ><br>
>>> >> > > > MPI_Barrier // to let other process know data received from<br>
>>> udp is<br>
>>> >> > ready<br>
>>> >> > > ><br>
>>> >> > > > MPI_Win_fence(0, win);<br>
>>> >> > > > MPI_Win_fence(0, win);<br>
>>> >> > > ><br>
>>> >> > > > }<br>
>>> >> > > ><br>
>>> >> > > > other processes:<br>
>>> >> > > ><br>
>>> >> > > > for(...){<br>
>>> >> > > ><br>
>>> >> > > > MPI_Barrier // sync for udp data ready<br>
>>> >> > > ><br>
>>> >> > > > MPI_Win_fence(0, win);<br>
>>> >> > > ><br>
>>> >> > > > MPI_Get();<br>
>>> >> > > ><br>
>>> >> > > > MPI_Win_fence(0, win); <-- program stopped here<br>
>>> >> > > ><br>
>>> >> > > > /* other operation */<br>
>>> >> > > > }<br>
>>> >> > > ><br>
>>> >> > > > I found that the program stopped at second MPI_Win_fence, the<br>
>>> >> > > > terminal<br>
>>> >> > > > output is:<br>
>>> >> > > ><br>
>>> >> > > ><br>
>>> >> > > ><br>
>>> >> > > ><br>
>>> >> > ><br>
>>> >> ><br>
>>> >> ><br>
>>> ===================================================================================<br>
>>> >> > > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES<br>
>>> >> > > > = EXIT CODE: 11<br>
>>> >> > > > = CLEANING UP REMAINING PROCESSES<br>
>>> >> > > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES<br>
>>> >> > > ><br>
>>> >> > > ><br>
>>> >> > ><br>
>>> >> ><br>
>>> >> ><br>
>>> ===================================================================================<br>
>>> >> > > > YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation<br>
>>> fault<br>
>>> >> > > > (signal 11)<br>
>>> >> > > > This typically refers to a problem with your application.<br>
>>> >> > > > Please see the FAQ page for debugging suggestions<br>
>>> >> > > ><br>
>>> >> > > > Do you have any suggestions? Thank you very much!<br>
>>> >> > > ><br>
>>> >> > > > --<br>
>>> >> > > > Best Regards,<br>
>>> >> > > > Sufeng Niu<br>
>>> >> > > > ECASP lab, ECE department, Illinois Institute of Technology<br>
>>> >> > > > Tel: <a href="tel:312-731-7219" value="+13127317219">312-731-7219</a><br>
>>> >> > > ><br>
>>> >> > > > _______________________________________________<br>
>>> >> > > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > > To manage subscription options or unsubscribe:<br>
>>> >> > > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > > ><br>
>>> >> > > -------------- next part --------------<br>
>>> >> > > An HTML attachment was scrubbed...<br>
>>> >> > > URL: <<br>
>>> >> > ><br>
>>> >> ><br>
>>> >> ><br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/48c5f337/attachment.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/48c5f337/attachment.html</a><br>
>>> >> > > ><br>
>>> >> > ><br>
>>> >> > > ------------------------------<br>
>>> >> > ><br>
>>> >> > > _______________________________________________<br>
>>> >> > > discuss mailing list<br>
>>> >> > > <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> > ><br>
>>> >> > > End of discuss Digest, Vol 9, Issue 27<br>
>>> >> > > **************************************<br>
>>> >> > ><br>
>>> >> ><br>
>>> >> ><br>
>>> >> ><br>
>>> >> > --<br>
>>> >> > Best Regards,<br>
>>> >> > Sufeng Niu<br>
>>> >> > ECASP lab, ECE department, Illinois Institute of Technology<br>
>>> >> > Tel: <a href="tel:312-731-7219" value="+13127317219">312-731-7219</a><br>
>>> >> > -------------- next part --------------<br>
>>> >> > An HTML attachment was scrubbed...<br>
>>> >> > URL: <<br>
>>> >> ><br>
>>> >> ><br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/57a5e76f/attachment.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/57a5e76f/attachment.html</a><br>
>>> >> > ><br>
>>> >> ><br>
>>> >> > ------------------------------<br>
>>> >> ><br>
>>> >> > _______________________________________________<br>
>>> >> > discuss mailing list<br>
>>> >> > <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >> ><br>
>>> >> > End of discuss Digest, Vol 9, Issue 28<br>
>>> >> > **************************************<br>
>>> >> ><br>
>>> >><br>
>>> >><br>
>>> >><br>
>>> >> --<br>
>>> >> Best Regards,<br>
>>> >> Sufeng Niu<br>
>>> >> ECASP lab, ECE department, Illinois Institute of Technology<br>
>>> >> Tel: <a href="tel:312-731-7219" value="+13127317219">312-731-7219</a><br>
>>> >> -------------- next part --------------<br>
>>> >> An HTML attachment was scrubbed...<br>
>>> >> URL:<br>
>>> >> <<br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/48296a33/attachment.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/48296a33/attachment.html</a><br>
>>> ><br>
>>> >> -------------- next part --------------<br>
>>> >> A non-text attachment was scrubbed...<br>
>>> >> Name: Screenshot.png<br>
>>> >> Type: image/png<br>
>>> >> Size: 131397 bytes<br>
>>> >> Desc: not available<br>
>>> >> URL:<br>
>>> >> <<br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/48296a33/attachment.png" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/48296a33/attachment.png</a><br>
>>> ><br>
>>> >><br>
>>> >><br>
>>> >> ------------------------------<br>
>>> >><br>
>>> >> _______________________________________________<br>
>>> >> discuss mailing list<br>
>>> >> <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> >> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> >><br>
>>> >> End of discuss Digest, Vol 9, Issue 29<br>
>>> >> **************************************<br>
>>> ><br>
>>> ><br>
>>> ><br>
>>> ><br>
>>> > --<br>
>>> > Best Regards,<br>
>>> > Sufeng Niu<br>
>>> > ECASP lab, ECE department, Illinois Institute of Technology<br>
>>> > Tel: <a href="tel:312-731-7219" value="+13127317219">312-731-7219</a><br>
>>> ><br>
>>> > _______________________________________________<br>
>>> > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > To manage subscription options or unsubscribe:<br>
>>> > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>><br>
>>><br>
>>><br>
>>> --<br>
>>> Jeff Hammond<br>
>>> <a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a><br>
>>><br>
>>><br>
>>> ------------------------------<br>
>>><br>
>>> Message: 2<br>
>>> Date: Wed, 10 Jul 2013 12:08:19 -0500<br>
>>> From: Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
>>> Message-ID:<br>
>>> <CAFNNHkzu0GYT0qSdWx1VQz0+V7mg5d=<br>
>>> <a href="mailto:tZFQm-MHPVoCyKfiYSA@mail.gmail.com">tZFQm-MHPVoCyKfiYSA@mail.gmail.com</a>><br>
>>> Content-Type: text/plain; charset="iso-8859-1"<br>
>>><br>
>>> Oh yeah, that would be an easier way. I just created a repository on<br>
>>> GitHub; you can<br>
>>> git clone <a href="https://github.com/sufengniu/mpi_app_test.git" target="_blank">https://github.com/sufengniu/mpi_app_test.git</a><br>
>>><br>
>>> To run the program you need to install a tiff library; on Ubuntu that is<br>
>>> sudo apt-get install libtiff4-dev.<br>
>>> After you download the code, just run make and two binaries will be built.<br>
>>><br>
>>> Please change the hostfile to your machines. First start the MPI side: ./run.perl main<br>
>>><br>
>>> then run ./udp_client 55Fe_run5_dark.tif<br>
>>><br>
>>> Thanks a lot!<br>
>>> Sufeng<br>
>>><br>
>>><br>
>>><br>
>>> On Wed, Jul 10, 2013 at 11:57 AM, <<a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a>> wrote:<br>
>>><br>
>>> > Send discuss mailing list submissions to<br>
>>> > <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> ><br>
>>> > To subscribe or unsubscribe via the World Wide Web, visit<br>
>>> > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > or, via email, send a message with subject or body 'help' to<br>
>>> > <a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a><br>
>>> ><br>
>>> > You can reach the person managing the list at<br>
>>> > <a href="mailto:discuss-owner@mpich.org">discuss-owner@mpich.org</a><br>
>>> ><br>
>>> > When replying, please edit your Subject line so it is more specific<br>
>>> > than "Re: Contents of discuss digest..."<br>
>>> ><br>
>>> ><br>
>>> > Today's Topics:<br>
>>> ><br>
>>> > 1. Re: MPI_Win_fence failed (Jeff Hammond)<br>
>>> > 2. Re: MPI_Win_fence failed (Sufeng Niu)<br>
>>> ><br>
>>> ><br>
>>> > ----------------------------------------------------------------------<br>
>>> ><br>
>>> > Message: 1<br>
>>> > Date: Wed, 10 Jul 2013 11:46:08 -0500<br>
>>> > From: Jeff Hammond <<a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a>><br>
>>> > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
>>> > Message-ID:<br>
>>> > <CAGKz=<br>
>>> > <a href="mailto:uLiq6rur%2B15MBip5U-_AS2JWefYOHfX07b1dkR8POOk6A@mail.gmail.com">uLiq6rur+15MBip5U-_AS2JWefYOHfX07b1dkR8POOk6A@mail.gmail.com</a>><br>
>>> > Content-Type: text/plain; charset=ISO-8859-1<br>
>>> ><br>
>>> > Just post the code so we can run it.<br>
>>> ><br>
>>> > Jeff<br>
>>> ><br>
>>> > On Wed, Jul 10, 2013 at 11:39 AM, Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> wrote:<br>
>>> > > Sorry I forget to add screen shot for backtrace. the screen shot is<br>
>>> > > attached.<br>
>>> > ><br>
>>> > > Thanks a lot!<br>
>>> > ><br>
>>> > > Sufeng<br>
>>> > ><br>
>>> > ><br>
>>> > ><br>
>>> > > On Wed, Jul 10, 2013 at 11:30 AM, <<a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a>> wrote:<br>
>>> > >><br>
>>> > >> Send discuss mailing list submissions to<br>
>>> > >> <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >><br>
>>> > >> To subscribe or unsubscribe via the World Wide Web, visit<br>
>>> > >> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> or, via email, send a message with subject or body 'help' to<br>
>>> > >> <a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a><br>
>>> > >><br>
>>> > >> You can reach the person managing the list at<br>
>>> > >> <a href="mailto:discuss-owner@mpich.org">discuss-owner@mpich.org</a><br>
>>> > >><br>
>>> > >> When replying, please edit your Subject line so it is more specific<br>
>>> > >> than "Re: Contents of discuss digest..."<br>
>>> > >><br>
>>> > >><br>
>>> > >> Today's Topics:<br>
>>> > >><br>
>>> > >> 1. Re: MPI_Win_fence failed (Sufeng Niu)<br>
>>> > >><br>
>>> > >><br>
>>> > >><br>
>>> ----------------------------------------------------------------------<br>
>>> > >><br>
>>> > >> Message: 1<br>
>>> > >> Date: Wed, 10 Jul 2013 11:30:36 -0500<br>
>>> > >> From: Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> > >> To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
>>> > >> Message-ID:<br>
>>> > >><br>
>>> > >> <<a href="mailto:CAFNNHkyLj8CbYMmc_w2DA9_%2Bq2Oe3kyus%2Bg6c99ShPk6ZXVkdA@mail.gmail.com">CAFNNHkyLj8CbYMmc_w2DA9_+q2Oe3kyus+g6c99ShPk6ZXVkdA@mail.gmail.com</a><br>
>>> ><br>
>>> > >> Content-Type: text/plain; charset="iso-8859-1"<br>
>>> > >><br>
>>> > >><br>
>>> > >> Hi Jim,<br>
>>> > >><br>
>>> > >> Thanks a lot for your reply. the basic way for me to debugging is<br>
>>> > >> barrier+printf, right now I only have an evaluation version of<br>
>>> > totalview.<br>
>>> > >> the backtrace using totalview shown below. the udp is the udp<br>
>>> collection<br>
>>> > >> and create RMA window, image_rms doing MPI_Get to access the window<br>
>>> > >><br>
>>> > >> There is a segment violation, but I don't know why the program<br>
>>> stopped<br>
>>> > at<br>
>>> > >> MPI_Win_fence.<br>
>>> > >><br>
>>> > >> Thanks a lot!<br>
>>> > >><br>
>>> > >><br>
>>> > >><br>
>>> > >><br>
>>> > >><br>
>>> > >><br>
>>> > >><br>
>>> > >> On Wed, Jul 10, 2013 at 10:12 AM, <<a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a>><br>
>>> wrote:<br>
>>> > >><br>
>>> > >> > Send discuss mailing list submissions to<br>
>>> > >> > <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> ><br>
>>> > >> > To subscribe or unsubscribe via the World Wide Web, visit<br>
>>> > >> > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> > or, via email, send a message with subject or body 'help' to<br>
>>> > >> > <a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a><br>
>>> > >> ><br>
>>> > >> > You can reach the person managing the list at<br>
>>> > >> > <a href="mailto:discuss-owner@mpich.org">discuss-owner@mpich.org</a><br>
>>> > >> ><br>
>>> > >> > When replying, please edit your Subject line so it is more<br>
>>> specific<br>
>>> > >> > than "Re: Contents of discuss digest..."<br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> > Today's Topics:<br>
>>> > >> ><br>
>>> > >> > 1. Re: MPICH3.0.4 make fails with "No rule to make target..."<br>
>>> > >> > (Wesley Bland)<br>
>>> > >> > 2. Re: Error in MPI_Finalize on a simple ring test over TCP<br>
>>> > >> > (Wesley Bland)<br>
>>> > >> > 3. Restrict number of cores, not threads (Bob Ilgner)<br>
>>> > >> > 4. Re: Restrict number of cores, not threads (Wesley Bland)<br>
>>> > >> > 5. Re: Restrict number of cores, not threads (Wesley Bland)<br>
>>> > >> > 6. Re: Error in MPI_Finalize on a simple ring test over TCP<br>
>>> > >> > (Thomas Ropars)<br>
>>> > >> > 7. MPI_Win_fence failed (Sufeng Niu)<br>
>>> > >> > 8. Re: MPI_Win_fence failed (Jim Dinan)<br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> ><br>
>>> ----------------------------------------------------------------------<br>
>>> > >> ><br>
>>> > >> > Message: 1<br>
>>> > >> > Date: Wed, 10 Jul 2013 08:29:06 -0500<br>
>>> > >> > From: Wesley Bland <<a href="mailto:wbland@mcs.anl.gov">wbland@mcs.anl.gov</a>><br>
>>> > >> > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > Subject: Re: [mpich-discuss] MPICH3.0.4 make fails with "No rule<br>
>>> to<br>
>>> > >> > make target..."<br>
>>> > >> > Message-ID: <<a href="mailto:F48FC916-31F7-4F82-95F8-2D6A6C45264F@mcs.anl.gov">F48FC916-31F7-4F82-95F8-2D6A6C45264F@mcs.anl.gov</a>><br>
>>> > >> > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> > >> ><br>
>>> > >> > Unfortunately, due to the lack of developer resources and<br>
>>> interest,<br>
>>> > the<br>
>>> > >> > last version of MPICH which was supported on Windows was 1.4.1p.<br>
>>> You<br>
>>> > can<br>
>>> > >> > find that version on the downloads page:<br>
>>> > >> ><br>
>>> > >> > <a href="http://www.mpich.org/downloads/" target="_blank">http://www.mpich.org/downloads/</a><br>
>>> > >> ><br>
>>> > >> > Alternatively, Microsoft maintains a derivative of MPICH which<br>
>>> should<br>
>>> > >> > provide the features you need. You also find a link to that on the<br>
>>> > >> > downloads page above.<br>
>>> > >> ><br>
>>> > >> > Wesley<br>
>>> > >> ><br>
>>> > >> > On Jul 10, 2013, at 1:16 AM, Don Warren <<a href="mailto:don.warren@gmail.com">don.warren@gmail.com</a>><br>
>>> wrote:<br>
>>> > >> ><br>
>>> > >> > > Hello,<br>
>>> > >> > ><br>
>>> > >> > > As requested in the installation guide, I'm informing this list<br>
>>> of a<br>
>>> > >> > failure to correctly make MPICH3.0.4 on a Win7 system. The<br>
>>> specific<br>
>>> > >> > error<br>
>>> > >> > encountered is<br>
>>> > >> > > "make[2]: *** No rule to make target<br>
>>> > >> > `/cygdrive/c/FLASH/mpich-3.0.4/src/mpi/romio/Makefile.am', needed<br>
>>> by<br>
>>> > >> > `/cygdrive/c/FLASH/mpich-3.0.4/src/mpi/romio/Makefile.in'. Stop."<br>
>>> > >> > ><br>
>>> > >> > > I have confirmed that both Makefile.am and Makefile.in exist in<br>
>>> the<br>
>>> > >> > directory listed. I'm attaching the c.txt and the m.txt files.<br>
>>> > >> > ><br>
>>> > >> > > Possibly of interest is that the command "make clean" fails at<br>
>>> > exactly<br>
>>> > >> > the same folder, with exactly the same error message as shown in<br>
>>> m.txt<br>
>>> > >> > and<br>
>>> > >> > above.<br>
>>> > >> > ><br>
>>> > >> > > Any advice you can give would be appreciated. I'm attempting<br>
>>> to get<br>
>>> > >> > FLASH running on my computer, which seems to require MPICH.<br>
>>> > >> > ><br>
>>> > >> > > Regards,<br>
>>> > >> > > Don Warren<br>
>>> > >> > ><br>
>>> > >> > ><br>
>>> ><br>
>>> <config-make-outputs.zip>_______________________________________________<br>
>>> > >><br>
>>> > >> > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > > To manage subscription options or unsubscribe:<br>
>>> > >> > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> ><br>
>>> > >> > -------------- next part --------------<br>
>>> > >> > An HTML attachment was scrubbed...<br>
>>> > >> > URL: <<br>
>>> > >> ><br>
>>> > >> ><br>
>>> ><br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/69b497f1/attachment-0001.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/69b497f1/attachment-0001.html</a><br>
>>> > >> > ><br>
>>> > >> ><br>
>>> > >> > ------------------------------<br>
>>> > >> ><br>
>>> > >> > Message: 2<br>
>>> > >> > Date: Wed, 10 Jul 2013 08:39:47 -0500<br>
>>> > >> > From: Wesley Bland <<a href="mailto:wbland@mcs.anl.gov">wbland@mcs.anl.gov</a>><br>
>>> > >> > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > Subject: Re: [mpich-discuss] Error in MPI_Finalize on a simple<br>
>>> ring<br>
>>> > >> > test over TCP<br>
>>> > >> > Message-ID: <<a href="mailto:D5999106-2A75-4091-8B0F-EAFA22880862@mcs.anl.gov">D5999106-2A75-4091-8B0F-EAFA22880862@mcs.anl.gov</a>><br>
>>> > >> > Content-Type: text/plain; charset=us-ascii<br>
>>> > >> ><br>
>>> > >> > The value of previous for rank 0 in your code is -1. MPICH is<br>
>>> > >> > complaining<br>
>>> > >> > because all of the requests to receive a message from -1 are still<br>
>>> > >> > pending<br>
>>> > >> > when you try to finalize. You need to make sure that you are<br>
>>> receiving<br>
>>> > >> > from<br>
>>> > >> > valid ranks.<br>
>>> > >> ><br>
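A minimal sketch of the wrap-around neighbor computation described above (an<br>
illustration, not code from the thread), assuming a plain blocking ring exchange:<br>
<br>
---8<---<br>
#include <mpi.h><br>
#include <stdio.h><br>
<br>
int main(int argc, char **argv)<br>
{<br>
    int rank, size, token;<br>
<br>
    MPI_Init(&argc, &argv);<br>
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);<br>
    MPI_Comm_size(MPI_COMM_WORLD, &size);<br>
<br>
    /* Wrap the neighbors so rank 0 receives from size-1 instead of -1. */<br>
    int next = (rank + 1) % size;<br>
    int prev = (rank - 1 + size) % size;<br>
<br>
    token = rank;<br>
    MPI_Sendrecv_replace(&token, 1, MPI_INT, next, 0, prev, 0,<br>
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);<br>
    printf("rank %d got token %d from rank %d\n", rank, token, prev);<br>
<br>
    MPI_Finalize();<br>
    return 0;<br>
}<br>
---8<---<br>
<br>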
>>> > >> > On Jul 10, 2013, at 7:50 AM, Thomas Ropars <<a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a><br>
>>> ><br>
>>> > >> > wrote:<br>
>>> > >> ><br>
>>> > >> > > Yes sure. Here it is.<br>
>>> > >> > ><br>
>>> > >> > > Thomas<br>
>>> > >> > ><br>
>>> > >> > > On 07/10/2013 02:23 PM, Wesley Bland wrote:<br>
>>> > >> > >> Can you send us the smallest chunk of code that still exhibits<br>
>>> this<br>
>>> > >> > error?<br>
>>> > >> > >><br>
>>> > >> > >> Wesley<br>
>>> > >> > >><br>
>>> > >> > >> On Jul 10, 2013, at 6:54 AM, Thomas Ropars <<br>
>>> <a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a>><br>
>>> > >> > wrote:<br>
>>> > >> > >><br>
>>> > >> > >>> Hi all,<br>
>>> > >> > >>><br>
>>> > >> > >>> I get the following error when I try to run a simple<br>
>>> application<br>
>>> > >> > implementing a ring (each process sends to rank+1 and receives<br>
>>> from<br>
>>> > >> > rank-1). More precisely, the error occurs during the call to<br>
>>> > >> > MPI_Finalize():<br>
>>> > >> > >>><br>
>>> > >> > >>> Assertion failed in file<br>
>>> > >> > src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c at line 363:<br>
>>> > >> > sc->pg_is_set<br>
>>> > >> > >>> internal ABORT - process 0<br>
>>> > >> > >>><br>
>>> > >> > >>> Has anybody else noticed the same error?<br>
>>> > >> > >>><br>
>>> > >> > >>> Here are all the details about my test:<br>
>>> > >> > >>> - The error is generated with mpich-3.0.2 (but I noticed the<br>
>>> exact<br>
>>> > >> > same error with mpich-3.0.4)<br>
>>> > >> > >>> - I am using IPoIB for communication between nodes (The same<br>
>>> thing<br>
>>> > >> > happens over Ethernet)<br>
>>> > >> > >>> - The problem comes from TCP links. When all processes are on<br>
>>> the<br>
>>> > >> > >>> same<br>
>>> > >> > node, there is no error. As soon as one process is on a remote<br>
>>> node,<br>
>>> > the<br>
>>> > >> > failure occurs.<br>
>>> > >> > >>> - Note also that the failure does not occur if I run a more<br>
>>> > complex<br>
>>> > >> > code (eg, a NAS benchmark).<br>
>>> > >> > >>><br>
>>> > >> > >>> Thomas Ropars<br>
>>> > >><br>
>>> > >> > >>> _______________________________________________<br>
>>> > >> > >>> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > >>> To manage subscription options or unsubscribe:<br>
>>> > >> > >>> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> > >> _______________________________________________<br>
>>> > >> > >> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > >> To manage subscription options or unsubscribe:<br>
>>> > >> > >> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> > >><br>
>>> > >> > >><br>
>>> > >> > ><br>
>>> > >> > > <ring_clean.c>_______________________________________________<br>
>>> > >><br>
>>> > >> > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > > To manage subscription options or unsubscribe:<br>
>>> > >> > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> > ------------------------------<br>
>>> > >> ><br>
>>> > >> > Message: 3<br>
>>> > >> > Date: Wed, 10 Jul 2013 16:41:27 +0200<br>
>>> > >> > From: Bob Ilgner <<a href="mailto:bobilgner@gmail.com">bobilgner@gmail.com</a>><br>
>>> > >> > To: <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
>>> > >> > Subject: [mpich-discuss] Restrict number of cores, not threads<br>
>>> > >> > Message-ID:<br>
>>> > >> > <<br>
>>> > >> ><br>
>>> <a href="mailto:CAKv15b-QgmHkVkoiTFmP3EZXvyy6sc_QeqHQgbMUhnr3Xh9ecA@mail.gmail.com">CAKv15b-QgmHkVkoiTFmP3EZXvyy6sc_QeqHQgbMUhnr3Xh9ecA@mail.gmail.com</a>><br>
>>> > >> > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> > >> ><br>
>>> > >> > Dear all,<br>
>>> > >> ><br>
>>> > >> > I am working on a shared memory processor with 256 cores. I am<br>
>>> working<br>
>>> > >> > from<br>
>>> > >> > the command line directly.<br>
>>> > >> ><br>
>>> > >> > Can I restrict the number of cores that I deploy? The command<br>
>>> > >> ><br>
>>> > >> > mpirun -n 100 myprog<br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> > will automatically start on 100 cores. I wish to use only 10 cores and<br>
>>> > >> > have 10 threads on each core. Can I do this with MPICH? Remember that<br>
>>> > >> > this is an SMP and I cannot identify each core individually (as in a<br>
>>> > >> > cluster).<br>
>>> > >> ><br>
>>> > >> > Regards, bob<br>
>>> > >> > -------------- next part --------------<br>
>>> > >> > An HTML attachment was scrubbed...<br>
>>> > >> > URL: <<br>
>>> > >> ><br>
>>> > >> ><br>
>>> ><br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/ec659e91/attachment-0001.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/ec659e91/attachment-0001.html</a><br>
>>> > >> > ><br>
>>> > >> ><br>
>>> > >> > ------------------------------<br>
>>> > >> ><br>
>>> > >> > Message: 4<br>
>>> > >> > Date: Wed, 10 Jul 2013 09:46:38 -0500<br>
>>> > >> > From: Wesley Bland <<a href="mailto:wbland@mcs.anl.gov">wbland@mcs.anl.gov</a>><br>
>>> > >> > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > Cc: <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
>>> > >> > Subject: Re: [mpich-discuss] Restrict number of cores, not threads<br>
>>> > >> > Message-ID: <<a href="mailto:2FAF588E-2FBE-45E4-B53F-E6BC931E3D51@mcs.anl.gov">2FAF588E-2FBE-45E4-B53F-E6BC931E3D51@mcs.anl.gov</a>><br>
>>> > >> > Content-Type: text/plain; charset=iso-8859-1<br>
>>> > >> ><br>
>>> > >> > Threads in MPI are not ranks. When you say you want to launch<br>
>>> with -n<br>
>>> > >> > 100,<br>
>>> > >> > you will always get 100 processes, not threads. If you want 10<br>
>>> threads<br>
>>> > >> > on<br>
>>> > >> > 10 cores, you will need to launch with -n 10, then add your<br>
>>> threads<br>
>>> > >> > according to your threading library.<br>
>>> > >> ><br>
>>> > >> > Note that threads in MPI do not get their own rank currently.<br>
>>> They all<br>
>>> > >> > share the same rank as the process in which they reside, so if you<br>
>>> > need<br>
>>> > >> > to<br>
>>> > >> > be able to handle things with different ranks, you'll need to use<br>
>>> > actual<br>
>>> > >> > processes.<br>
>>> > >> ><br>
>>> > >> > Wesley<br>
>>> > >> ><br>
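A hedged sketch of the launch pattern suggested above: start 10 MPI processes<br>
and let each one spawn its own threads (OpenMP here, purely for illustration);<br>
MPI_THREAD_FUNNELED is enough as long as only the main thread makes MPI calls:<br>
<br>
---8<---<br>
#include <mpi.h><br>
#include <omp.h><br>
#include <stdio.h><br>
<br>
int main(int argc, char **argv)<br>
{<br>
    int provided, rank;<br>
<br>
    /* Only the master thread calls MPI, so FUNNELED is sufficient. */<br>
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);<br>
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);<br>
<br>
    omp_set_num_threads(10);               /* 10 threads per MPI process */<br>
    #pragma omp parallel<br>
    {<br>
        printf("rank %d, thread %d of %d\n",<br>
               rank, omp_get_thread_num(), omp_get_num_threads());<br>
    }<br>
<br>
    MPI_Finalize();<br>
    return 0;<br>
}<br>
---8<---<br>
<br>
Launched as "mpirun -n 10 ./myprog", this occupies roughly 10 cores with 10<br>
threads each; the exact placement is up to the OS scheduler and any<br>
process-binding options in use.<br>
<br>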
>>> > >> > On Jul 10, 2013, at 9:41 AM, Bob Ilgner <<a href="mailto:bobilgner@gmail.com">bobilgner@gmail.com</a>><br>
>>> wrote:<br>
>>> > >> ><br>
>>> > >> > > Dear all,<br>
>>> > >> > ><br>
>>> > >> > > I am working on a shared memory processor with 256 cores. I am<br>
>>> > working<br>
>>> > >> > from the command line directly.<br>
>>> > >> > ><br>
>>> > >> > > Can I restrict the number of cores that I deploy? The command<br>
>>> > >> > ><br>
>>> > >> > > mpirun -n 100 myprog<br>
>>> > >> > ><br>
>>> > >> > ><br>
>>> > >> > > will automatically start on 100 cores. I wish to use only 10 cores and<br>
>>> > >> > > have 10 threads on each core. Can I do this with MPICH? Remember that<br>
>>> > >> > > this is an SMP and I cannot identify each core individually (as in a<br>
>>> > >> > > cluster).<br>
>>> > >> > ><br>
>>> > >> > > Regards, bob<br>
>>> > >><br>
>>> > >> > > _______________________________________________<br>
>>> > >> > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > > To manage subscription options or unsubscribe:<br>
>>> > >> > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> > ------------------------------<br>
>>> > >> ><br>
>>> > >> > Message: 6<br>
>>> > >> > Date: Wed, 10 Jul 2013 16:50:36 +0200<br>
>>> > >> > From: Thomas Ropars <<a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a>><br>
>>> > >> > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > Subject: Re: [mpich-discuss] Error in MPI_Finalize on a simple<br>
>>> ring<br>
>>> > >> > test over TCP<br>
>>> > >> > Message-ID: <<a href="mailto:51DD74BC.3020009@epfl.ch">51DD74BC.3020009@epfl.ch</a>><br>
>>> > >> > Content-Type: text/plain; charset=UTF-8; format=flowed<br>
>>> > >> ><br>
>>> > >> > Yes, you are right, sorry for disturbing.<br>
>>> > >> ><br>
>>> > >> > On 07/10/2013 03:39 PM, Wesley Bland wrote:<br>
>>> > >> > > The value of previous for rank 0 in your code is -1. MPICH is<br>
>>> > >> > complaining because all of the requests to receive a message from<br>
>>> -1<br>
>>> > are<br>
>>> > >> > still pending when you try to finalize. You need to make sure<br>
>>> that you<br>
>>> > >> > are<br>
>>> > >> > receiving from valid ranks.<br>
>>> > >> > ><br>
>>> > >> > > On Jul 10, 2013, at 7:50 AM, Thomas Ropars <<br>
>>> <a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a>><br>
>>> > >> > wrote:<br>
>>> > >> > ><br>
>>> > >> > >> Yes sure. Here it is.<br>
>>> > >> > >><br>
>>> > >> > >> Thomas<br>
>>> > >> > >><br>
>>> > >> > >> On 07/10/2013 02:23 PM, Wesley Bland wrote:<br>
>>> > >> > >>> Can you send us the smallest chunk of code that still exhibits<br>
>>> > this<br>
>>> > >> > error?<br>
>>> > >> > >>><br>
>>> > >> > >>> Wesley<br>
>>> > >> > >>><br>
>>> > >> > >>> On Jul 10, 2013, at 6:54 AM, Thomas Ropars <<br>
>>> <a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a><br>
>>> > ><br>
>>> > >> > wrote:<br>
>>> > >> > >>><br>
>>> > >> > >>>> Hi all,<br>
>>> > >> > >>>><br>
>>> > >> > >>>> I get the following error when I try to run a simple<br>
>>> application<br>
>>> > >> > implementing a ring (each process sends to rank+1 and receives<br>
>>> from<br>
>>> > >> > rank-1). More precisely, the error occurs during the call to<br>
>>> > >> > MPI_Finalize():<br>
>>> > >> > >>>><br>
>>> > >> > >>>> Assertion failed in file<br>
>>> > >> > src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c at line 363:<br>
>>> > >> > sc->pg_is_set<br>
>>> > >> > >>>> internal ABORT - process 0<br>
>>> > >> > >>>><br>
>>> > >> > >>>> Has anybody else noticed the same error?<br>
>>> > >> > >>>><br>
>>> > >> > >>>> Here are all the details about my test:<br>
>>> > >> > >>>> - The error is generated with mpich-3.0.2 (but I noticed the<br>
>>> > exact<br>
>>> > >> > same error with mpich-3.0.4)<br>
>>> > >> > >>>> - I am using IPoIB for communication between nodes (The same<br>
>>> > thing<br>
>>> > >> > happens over Ethernet)<br>
>>> > >> > >>>> - The problem comes from TCP links. When all processes are<br>
>>> on the<br>
>>> > >> > same node, there is no error. As soon as one process is on a<br>
>>> remote<br>
>>> > >> > node,<br>
>>> > >> > the failure occurs.<br>
>>> > >> > >>>> - Note also that the failure does not occur if I run a more<br>
>>> > complex<br>
>>> > >> > code (eg, a NAS benchmark).<br>
>>> > >> > >>>><br>
>>> > >> > >>>> Thomas Ropars<br>
>>> > >><br>
>>> > >> > >>>> _______________________________________________<br>
>>> > >> > >>>> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > >>>> To manage subscription options or unsubscribe:<br>
>>> > >> > >>>> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> > >>> _______________________________________________<br>
>>> > >> > >>> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > >>> To manage subscription options or unsubscribe:<br>
>>> > >> > >>> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> > >>><br>
>>> > >> > >>><br>
>>> > >> > >> <ring_clean.c>_______________________________________________<br>
>>> > >><br>
>>> > >> > >> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > >> To manage subscription options or unsubscribe:<br>
>>> > >> > >> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> > > _______________________________________________<br>
>>> > >> > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > > To manage subscription options or unsubscribe:<br>
>>> > >> > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> > ><br>
>>> > >> > ><br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> > ------------------------------<br>
>>> > >> ><br>
>>> > >> > Message: 7<br>
>>> > >> > Date: Wed, 10 Jul 2013 10:07:21 -0500<br>
>>> > >> > From: Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> > >> > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > Subject: [mpich-discuss] MPI_Win_fence failed<br>
>>> > >> > Message-ID:<br>
>>> > >> > <<br>
>>> > >> ><br>
>>> <a href="mailto:CAFNNHkz_1gC7hfpx0G9j24adO-gDabdmwZ4VuT6jip-fDMhS9A@mail.gmail.com">CAFNNHkz_1gC7hfpx0G9j24adO-gDabdmwZ4VuT6jip-fDMhS9A@mail.gmail.com</a>><br>
>>> > >> > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> > >><br>
>>> > >> ><br>
>>> > >> > Hello,<br>
>>> > >> ><br>
>>> > >> > I used MPI RMA in my program, but the program stops at MPI_Win_fence. I<br>
>>> > >> > have a master process that receives data from a UDP socket; the other<br>
>>> > >> > processes use MPI_Get to access the data.<br>
>>> > >> ><br>
>>> > >> > master process:<br>
>>> > >> ><br>
>>> > >> > MPI_Create(...)<br>
>>> > >> > for(...){<br>
>>> > >> > /* udp recv operation */<br>
>>> > >> ><br>
>>> > >> > MPI_Barrier // to let other process know data received from udp<br>
>>> is<br>
>>> > >> > ready<br>
>>> > >> ><br>
>>> > >> > MPI_Win_fence(0, win);<br>
>>> > >> > MPI_Win_fence(0, win);<br>
>>> > >> ><br>
>>> > >> > }<br>
>>> > >> ><br>
>>> > >> > other processes:<br>
>>> > >> ><br>
>>> > >> > for(...){<br>
>>> > >> ><br>
>>> > >> > MPI_Barrier // sync for udp data ready<br>
>>> > >> ><br>
>>> > >> > MPI_Win_fence(0, win);<br>
>>> > >> ><br>
>>> > >> > MPI_Get();<br>
>>> > >> ><br>
>>> > >> > MPI_Win_fence(0, win); <-- program stopped here<br>
>>> > >> ><br>
>>> > >> > /* other operation */<br>
>>> > >> > }<br>
>>> > >> ><br>
>>> > >> > I found that the program stopped at the second MPI_Win_fence; the terminal<br>
>>> > >> > output is:<br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> ><br>
>>> ><br>
>>> ===================================================================================<br>
>>> > >> > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES<br>
>>> > >> > = EXIT CODE: 11<br>
>>> > >> > = CLEANING UP REMAINING PROCESSES<br>
>>> > >> > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES<br>
>>> > >> ><br>
>>> > >> ><br>
>>> > >> ><br>
>>> ><br>
>>> ===================================================================================<br>
>>> > >> > YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation<br>
>>> fault<br>
>>> > >> > (signal 11)<br>
>>> > >> > This typically refers to a problem with your application.<br>
>>> > >> > Please see the FAQ page for debugging suggestions<br>
>>> > >> ><br>
>>> > >> > Do you have any suggestions? Thank you very much!<br>
>>> > >> ><br>
>>> > >> > --<br>
>>> > >> > Best Regards,<br>
>>> > >> > Sufeng Niu<br>
>>> > >> > ECASP lab, ECE department, Illinois Institute of Technology<br>
>>> > >> > Tel: 312-731-7219<br>
>>> > >> > -------------- next part --------------<br>
>>> > >> > An HTML attachment was scrubbed...<br>
>>> > >> > URL: <<br>
>>> > >> ><br>
>>> > >> ><br>
>>> ><br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/375a95ac/attachment-0001.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/375a95ac/attachment-0001.html</a><br>
>>> > >> > ><br>
>>> > >> ><br>
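For reference, a compilable sketch of the fence pattern described above (buffer<br>
sizes and variable names are illustrative, not taken from the actual udp/rms<br>
code). Two details matter: every rank calls the same pair of fences, and the<br>
window on rank 0 is sized for all readers so that each reader's displacement<br>
stays inside the exposed memory:<br>
<br>
---8<---<br>
#include <mpi.h><br>
#include <stdint.h><br>
#include <stdlib.h><br>
<br>
#define BUFFER_SIZE 1024                    /* illustrative chunk per reader */<br>
<br>
int main(int argc, char **argv)<br>
{<br>
    int rank, size;<br>
    uint16_t *win_buf = NULL;<br>
    MPI_Win win;<br>
<br>
    MPI_Init(&argc, &argv);<br>
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);<br>
    MPI_Comm_size(MPI_COMM_WORLD, &size);<br>
<br>
    /* Rank 0 exposes one chunk per reader; the other ranks expose nothing. */<br>
    MPI_Aint win_size = (rank == 0)<br>
        ? (MPI_Aint)(size - 1) * BUFFER_SIZE * sizeof(uint16_t) : 0;<br>
    if (rank == 0)<br>
        win_buf = malloc(win_size);<br>
    MPI_Win_create(win_buf, win_size, sizeof(uint16_t), MPI_INFO_NULL,<br>
                   MPI_COMM_WORLD, &win);<br>
<br>
    uint16_t *local = malloc(BUFFER_SIZE * sizeof(uint16_t));<br>
<br>
    /* ... rank 0 fills win_buf here, e.g. from the UDP socket ... */<br>
<br>
    MPI_Win_fence(0, win);                  /* open the exposure epoch */<br>
    if (rank != 0)<br>
        MPI_Get(local, BUFFER_SIZE, MPI_UINT16_T, 0,<br>
                (MPI_Aint)(rank - 1) * BUFFER_SIZE, BUFFER_SIZE,<br>
                MPI_UINT16_T, win);<br>
    MPI_Win_fence(0, win);                  /* close it; the gets are complete */<br>
<br>
    MPI_Win_free(&win);<br>
    free(local);<br>
    free(win_buf);<br>
    MPI_Finalize();<br>
    return 0;<br>
}<br>
---8<---<br>
<br>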
>>> > >> > ------------------------------<br>
>>> > >> ><br>
>>> > >> > Message: 8<br>
>>> > >> > Date: Wed, 10 Jul 2013 11:12:45 -0400<br>
>>> > >> > From: Jim Dinan <<a href="mailto:james.dinan@gmail.com">james.dinan@gmail.com</a>><br>
>>> > >> > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
>>> > >> > Message-ID:<br>
>>> > >> > <CAOoEU4F3hX=y3yrJKYKucNeiueQYBeR_3OQas9E+mg+GM6Rz=<br>
>>> > >> > <a href="mailto:w@mail.gmail.com">w@mail.gmail.com</a>><br>
>>> > >> > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> > >><br>
>>> > >> ><br>
>>> > >> > It's hard to tell where the segmentation fault is coming from.<br>
>>> Can<br>
>>> > you<br>
>>> > >> > use<br>
>>> > >> > a debugger to generate a backtrace?<br>
>>> > >> ><br>
>>> > >> > ~Jim.<br>
>>> > >> ><br>
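If no parallel debugger is at hand, one possible fallback (a glibc-specific<br>
sketch, not something from this thread) is to install a SIGSEGV handler that<br>
dumps a raw backtrace before the process dies; compiling with -g -rdynamic<br>
makes the symbol names readable:<br>
<br>
---8<---<br>
#include <execinfo.h><br>
#include <signal.h><br>
#include <unistd.h><br>
<br>
/* Dump the call stack to stderr when a segfault is caught (glibc only). */<br>
static void segv_handler(int sig)<br>
{<br>
    void *frames[64];<br>
    int n = backtrace(frames, 64);<br>
    (void)sig;<br>
    /* backtrace_symbols_fd writes directly to the fd, no malloc involved. */<br>
    backtrace_symbols_fd(frames, n, STDERR_FILENO);<br>
    _exit(1);<br>
}<br>
<br>
/* Call once near the top of main(), e.g. right after MPI_Init(). */<br>
void install_segv_handler(void)<br>
{<br>
    signal(SIGSEGV, segv_handler);<br>
}<br>
---8<---<br>
<br>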
>>> > >> ><br>
>>> > >> > On Wed, Jul 10, 2013 at 11:07 AM, Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> > wrote:<br>
>>> > >> ><br>
>>> > >> > > Hello,<br>
>>> > >> > ><br>
>>> > >> > > I used MPI RMA in my program, but the program stops at MPI_Win_fence. I<br>
>>> > >> > > have a master process that receives data from a UDP socket; the other<br>
>>> > >> > > processes use MPI_Get to access the data.<br>
>>> > >> > ><br>
>>> > >> > > master process:<br>
>>> > >> > ><br>
>>> > >> > > MPI_Create(...)<br>
>>> > >> > > for(...){<br>
>>> > >> > > /* udp recv operation */<br>
>>> > >> > ><br>
>>> > >> > > MPI_Barrier // to let other process know data received from<br>
>>> udp is<br>
>>> > >> > > ready<br>
>>> > >> > ><br>
>>> > >> > > MPI_Win_fence(0, win);<br>
>>> > >> > > MPI_Win_fence(0, win);<br>
>>> > >> > ><br>
>>> > >> > > }<br>
>>> > >> > ><br>
>>> > >> > > other processes:<br>
>>> > >> > ><br>
>>> > >> > > for(...){<br>
>>> > >> > ><br>
>>> > >> > > MPI_Barrier // sync for udp data ready<br>
>>> > >> > ><br>
>>> > >> > > MPI_Win_fence(0, win);<br>
>>> > >> > ><br>
>>> > >> > > MPI_Get();<br>
>>> > >> > ><br>
>>> > >> > > MPI_Win_fence(0, win); <-- program stopped here<br>
>>> > >> > ><br>
>>> > >> > > /* other operation */<br>
>>> > >> > > }<br>
>>> > >> > ><br>
>>> > >> > > I found that the program stopped at the second MPI_Win_fence; the<br>
>>> > >> > > terminal output is:<br>
>>> > >> > ><br>
>>> > >> > ><br>
>>> > >> > ><br>
>>> > >> > ><br>
>>> > >> ><br>
>>> > >> ><br>
>>> ><br>
>>> ===================================================================================<br>
>>> > >> > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES<br>
>>> > >> > > = EXIT CODE: 11<br>
>>> > >> > > = CLEANING UP REMAINING PROCESSES<br>
>>> > >> > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES<br>
>>> > >> > ><br>
>>> > >> > ><br>
>>> > >> ><br>
>>> > >> ><br>
>>> ><br>
>>> ===================================================================================<br>
>>> > >> > > YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation<br>
>>> fault<br>
>>> > >> > > (signal 11)<br>
>>> > >> > > This typically refers to a problem with your application.<br>
>>> > >> > > Please see the FAQ page for debugging suggestions<br>
>>> > >> > ><br>
>>> > >> > > Do you have any suggestions? Thank you very much!<br>
>>> > >> > ><br>
>>> > >> > > --<br>
>>> > >> > > Best Regards,<br>
>>> > >> > > Sufeng Niu<br>
>>> > >> > > ECASP lab, ECE department, Illinois Institute of Technology<br>
>>> > >> > > Tel: 312-731-7219<br>
>>> > >> > ><br>
>>> > >> > > _______________________________________________<br>
>>> > >> > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > > To manage subscription options or unsubscribe:<br>
>>> > >> > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> > ><br>
>>> > >> > -------------- next part --------------<br>
>>> > >> > An HTML attachment was scrubbed...<br>
>>> > >> > URL: <<br>
>>> > >> ><br>
>>> > >> ><br>
>>> ><br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/48c5f337/attachment.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/48c5f337/attachment.html</a><br>
>>> > >> > ><br>
>>> > >> ><br>
>>> > >> > ------------------------------<br>
>>> > >><br>
>>> > >> ><br>
>>> > >> > _______________________________________________<br>
>>> > >> > discuss mailing list<br>
>>> > >> > <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >> ><br>
>>> > >> > End of discuss Digest, Vol 9, Issue 27<br>
>>> > >> > **************************************<br>
>>> > >><br>
>>> > >> ><br>
>>> > >><br>
>>> > >><br>
>>> > >><br>
>>> > >> --<br>
>>> > >> Best Regards,<br>
>>> > >> Sufeng Niu<br>
>>> > >> ECASP lab, ECE department, Illinois Institute of Technology<br>
>>> > >> Tel: 312-731-7219<br>
>>> > >> -------------- next part --------------<br>
>>> > >> An HTML attachment was scrubbed...<br>
>>> > >> URL:<br>
>>> > >> <<br>
>>> ><br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/57a5e76f/attachment.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/57a5e76f/attachment.html</a><br>
>>> > ><br>
>>> > >><br>
>>> > >> ------------------------------<br>
>>> > >><br>
>>> > >><br>
>>> > >> _______________________________________________<br>
>>> > >> discuss mailing list<br>
>>> > >> <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > >> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > >><br>
>>> > >> End of discuss Digest, Vol 9, Issue 28<br>
>>> > >> **************************************<br>
>>> > ><br>
>>> > ><br>
>>> > ><br>
>>> > ><br>
>>> > > --<br>
>>> > > Best Regards,<br>
>>> > > Sufeng Niu<br>
>>> > > ECASP lab, ECE department, Illinois Institute of Technology<br>
>>> > > Tel: 312-731-7219<br>
>>> > ><br>
>>> > > _______________________________________________<br>
>>> > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > To manage subscription options or unsubscribe:<br>
>>> > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> ><br>
>>> ><br>
>>> ><br>
>>> > --<br>
>>> > Jeff Hammond<br>
>>> > <a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a><br>
>>> ><br>
>>> ><br>
>>> > ------------------------------<br>
>>> ><br>
>>> > Message: 2<br>
>>> > Date: Wed, 10 Jul 2013 11:57:31 -0500<br>
>>> > From: Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
>>> > Message-ID:<br>
>>> > <<br>
>>> > <a href="mailto:CAFNNHkzKmAg8B6hamyrr7B2anU9EP_0yxmajxePVr35UnHVavw@mail.gmail.com">CAFNNHkzKmAg8B6hamyrr7B2anU9EP_0yxmajxePVr35UnHVavw@mail.gmail.com</a>><br>
>>> > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> ><br>
>>> > Sorry, I found that I cannot attach figures or files to this discussion list.<br>
>>> ><br>
>>> > The backtrace information is below:<br>
>>> ><br>
>>> > processes    Location                      PC             Host           Rank  ID          Status<br>
>>> > 7            _start                        0x00402399<br>
>>> > `-7          _libc_start_main              0x3685c1ecdd<br>
>>> > `-7          main                          0x00402474<br>
>>> > `-7          dkm                           ...<br>
>>> > |-6          image_rms                     0x004029bb<br>
>>> > | `-6        rms                           0x00402d44<br>
>>> > | `-6        PMPI_Win_fence                0x0040c389<br>
>>> > | `-6        MPIDI_Win_fence               0x004a45f4<br>
>>> > | `-6        MPIDI_CH3I_RMAListComplete    0x004a27d3<br>
>>> > | `-6        MPIDI_CH3I_Progress           ...<br>
>>> > `-1          udp                           0x004035cf<br>
>>> > `-1          PMPI_Win_fence                0x0040c389<br>
>>> > `-1          MPIDI_Win_fence               0x004a45a0<br>
>>> > `-1          MPIDI_CH3I_Progress           0x004292f5<br>
>>> > `-1          MPIDI_CH3_PktHandler_Get      0x0049f3f9<br>
>>> > `-1          MPIDI_CH3_iSendv              0x004aa67c<br>
>>> > `-           memcpy                        0x3685c89329   164.54.54.122  0     20.1-13994  Stopped<br>
>>> ><br>
>>> ><br>
>>> ><br>
>>> > On Wed, Jul 10, 2013 at 11:39 AM, <<a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a>> wrote:<br>
>>> ><br>
>>> > ><br>
>>> > > Today's Topics:<br>
>>> > ><br>
>>> > > 1. Re: MPI_Win_fence failed (Sufeng Niu)<br>
>>> > ><br>
>>> > ><br>
>>> > ><br>
>>> ----------------------------------------------------------------------<br>
>>> > ><br>
>>> > > Message: 1<br>
>>> > > Date: Wed, 10 Jul 2013 11:39:39 -0500<br>
>>> > > From: Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
>>> > > Message-ID:<br>
>>> > > <CAFNNHkz8pBfX33icn=+3rdXvqDfWqeu58odpd=<br>
>>> > > <a href="mailto:mOXLciysHgfg@mail.gmail.com">mOXLciysHgfg@mail.gmail.com</a>><br>
>>> > > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> > ><br>
>>> > > Sorry, I forgot to add the screenshot of the backtrace; it is attached.<br>
>>> > ><br>
>>> > > Thanks a lot!<br>
>>> > ><br>
>>> > > Sufeng<br>
>>> > ><br>
>>> > ><br>
>>> > ><br>
>>> > > On Wed, Jul 10, 2013 at 11:30 AM, <<a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a>> wrote:<br>
>>> > ><br>
>>> > > ><br>
>>> > > > Today's Topics:<br>
>>> > > ><br>
>>> > > > 1. Re: MPI_Win_fence failed (Sufeng Niu)<br>
>>> > > ><br>
>>> > > ><br>
>>> > > ><br>
>>> ----------------------------------------------------------------------<br>
>>> > > ><br>
>>> > > > Message: 1<br>
>>> > > > Date: Wed, 10 Jul 2013 11:30:36 -0500<br>
>>> > > > From: Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> > > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
>>> > > > Message-ID:<br>
>>> > > > <<br>
>>> > > > <a href="mailto:CAFNNHkyLj8CbYMmc_w2DA9_%2Bq2Oe3kyus%2Bg6c99ShPk6ZXVkdA@mail.gmail.com">CAFNNHkyLj8CbYMmc_w2DA9_+q2Oe3kyus+g6c99ShPk6ZXVkdA@mail.gmail.com</a><br>
>>> ><br>
>>> > > > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> > > ><br>
>>> > > > Hi Jim,<br>
>>> > > ><br>
>>> > > > Thanks a lot for your reply. The basic way for me to debug is<br>
>>> > > > barrier+printf; right now I only have an evaluation version of TotalView.<br>
>>> > > > The backtrace from TotalView is shown below. The udp process does the UDP<br>
>>> > > > collection and creates the RMA window; image_rms does MPI_Get to access<br>
>>> > > > the window.<br>
>>> > > ><br>
>>> > > > There is a segmentation violation, but I don't know why the program<br>
>>> > > > stopped at MPI_Win_fence.<br>
>>> > > ><br>
>>> > > > Thanks a lot!<br>
>>> > > ><br>
>>> > > ><br>
>>> > > ><br>
>>> > > ><br>
>>> > > ><br>
>>> > > ><br>
>>> > > ><br>
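Since barrier+printf is the debugging fallback here, a tiny illustrative helper<br>
(not from the original program) that tags each message with the calling rank<br>
and flushes immediately, so the last line before a crash is not lost in<br>
buffering:<br>
<br>
---8<---<br>
#include <mpi.h><br>
#include <stdio.h><br>
<br>
/* Rank-tagged, flushed debug print, e.g. dbg("before second MPI_Win_fence"); */<br>
static void dbg(const char *where)<br>
{<br>
    int rank;<br>
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);<br>
    fprintf(stderr, "[rank %d] %s\n", rank, where);<br>
    fflush(stderr);    /* make sure the line survives a crash right after */<br>
}<br>
---8<---<br>
<br>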
>>> > > > On Wed, Jul 10, 2013 at 10:12 AM, <<a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a>><br>
>>> wrote:<br>
>>> > > ><br>
>>> > > > > Send discuss mailing list submissions to<br>
>>> > > > > <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > ><br>
>>> > > > > To subscribe or unsubscribe via the World Wide Web, visit<br>
>>> > > > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > > > > or, via email, send a message with subject or body 'help' to<br>
>>> > > > > <a href="mailto:discuss-request@mpich.org">discuss-request@mpich.org</a><br>
>>> > > > ><br>
>>> > > > > You can reach the person managing the list at<br>
>>> > > > > <a href="mailto:discuss-owner@mpich.org">discuss-owner@mpich.org</a><br>
>>> > > > ><br>
>>> > > > > When replying, please edit your Subject line so it is more<br>
>>> specific<br>
>>> > > > > than "Re: Contents of discuss digest..."<br>
>>> > > > ><br>
>>> > > > ><br>
>>> > > > > Today's Topics:<br>
>>> > > > ><br>
>>> > > > > 1. Re: MPICH3.0.4 make fails with "No rule to make<br>
>>> target..."<br>
>>> > > > > (Wesley Bland)<br>
>>> > > > > 2. Re: Error in MPI_Finalize on a simple ring test over TCP<br>
>>> > > > > (Wesley Bland)<br>
>>> > > > > 3. Restrict number of cores, not threads (Bob Ilgner)<br>
>>> > > > > 4. Re: Restrict number of cores, not threads (Wesley Bland)<br>
>>> > > > > 5. Re: Restrict number of cores, not threads (Wesley Bland)<br>
>>> > > > > 6. Re: Error in MPI_Finalize on a simple ring test over TCP<br>
>>> > > > > (Thomas Ropars)<br>
>>> > > > > 7. MPI_Win_fence failed (Sufeng Niu)<br>
>>> > > > > 8. Re: MPI_Win_fence failed (Jim Dinan)<br>
>>> > > > ><br>
>>> > > > ><br>
>>> > > > ><br>
>>> > ----------------------------------------------------------------------<br>
>>> > > > ><br>
>>> > > > > Message: 1<br>
>>> > > > > Date: Wed, 10 Jul 2013 08:29:06 -0500<br>
>>> > > > > From: Wesley Bland <<a href="mailto:wbland@mcs.anl.gov">wbland@mcs.anl.gov</a>><br>
>>> > > > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > Subject: Re: [mpich-discuss] MPICH3.0.4 make fails with "No rule<br>
>>> to<br>
>>> > > > > make target..."<br>
>>> > > > > Message-ID: <<a href="mailto:F48FC916-31F7-4F82-95F8-2D6A6C45264F@mcs.anl.gov">F48FC916-31F7-4F82-95F8-2D6A6C45264F@mcs.anl.gov</a>><br>
>>> > > > > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> > > > ><br>
>>> > > > > Unfortunately, due to the lack of developer resources and<br>
>>> interest,<br>
>>> > the<br>
>>> > > > > last version of MPICH which was supported on Windows was 1.4.1p.<br>
>>> You<br>
>>> > > can<br>
>>> > > > > find that version on the downloads page:<br>
>>> > > > ><br>
>>> > > > > <a href="http://www.mpich.org/downloads/" target="_blank">http://www.mpich.org/downloads/</a><br>
>>> > > > ><br>
>>> > > > > Alternatively, Microsoft maintains a derivative of MPICH which<br>
>>> should<br>
>>> > > > > provide the features you need. You also find a link to that on<br>
>>> the<br>
>>> > > > > downloads page above.<br>
>>> > > > ><br>
>>> > > > > Wesley<br>
>>> > > > ><br>
>>> > > > > On Jul 10, 2013, at 1:16 AM, Don Warren <<a href="mailto:don.warren@gmail.com">don.warren@gmail.com</a>><br>
>>> > wrote:<br>
>>> > > > ><br>
>>> > > > > > Hello,<br>
>>> > > > > ><br>
>>> > > > > > As requested in the installation guide, I'm informing this<br>
>>> list of<br>
>>> > a<br>
>>> > > > > failure to correctly make MPICH3.0.4 on a Win7 system. The<br>
>>> specific<br>
>>> > > > error<br>
>>> > > > > encountered is<br>
>>> > > > > > "make[2]: *** No rule to make target<br>
>>> > > > > `/cygdrive/c/FLASH/mpich-3.0.4/src/mpi/romio/Makefile.am',<br>
>>> needed by<br>
>>> > > > > `/cygdrive/c/FLASH/mpich-3.0.4/src/mpi/romio/Makefile.in'.<br>
>>> Stop."<br>
>>> > > > > ><br>
>>> > > > > > I have confirmed that both Makefile.am and Makefile.in exist<br>
>>> in the<br>
>>> > > > > directory listed. I'm attaching the c.txt and the m.txt files.<br>
>>> > > > > ><br>
>>> > > > > > Possibly of interest is that the command "make clean" fails at<br>
>>> > > exactly<br>
>>> > > > > the same folder, with exactly the same error message as shown in<br>
>>> > m.txt<br>
>>> > > > and<br>
>>> > > > > above.<br>
>>> > > > > ><br>
>>> > > > > > Any advice you can give would be appreciated. I'm attempting<br>
>>> to<br>
>>> > get<br>
>>> > > > > FLASH running on my computer, which seems to require MPICH.<br>
>>> > > > > ><br>
>>> > > > > > Regards,<br>
>>> > > > > > Don Warren<br>
>>> > > > > ><br>
>>> > > ><br>
>>> ><br>
>>> <config-make-outputs.zip>_______________________________________________<br>
>>> > > > > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > > To manage subscription options or unsubscribe:<br>
>>> > > > > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > > > ><br>
>>> > > > > -------------- next part --------------<br>
>>> > > > > An HTML attachment was scrubbed...<br>
>>> > > > > URL: <<br>
>>> > > > ><br>
>>> > > ><br>
>>> > ><br>
>>> ><br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/69b497f1/attachment-0001.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/69b497f1/attachment-0001.html</a><br>
>>> > > > > ><br>
>>> > > > ><br>
>>> > > > > ------------------------------<br>
>>> > > > ><br>
>>> > > > > Message: 2<br>
>>> > > > > Date: Wed, 10 Jul 2013 08:39:47 -0500<br>
>>> > > > > From: Wesley Bland <<a href="mailto:wbland@mcs.anl.gov">wbland@mcs.anl.gov</a>><br>
>>> > > > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > Subject: Re: [mpich-discuss] Error in MPI_Finalize on a simple<br>
>>> ring<br>
>>> > > > > test over TCP<br>
>>> > > > > Message-ID: <<a href="mailto:D5999106-2A75-4091-8B0F-EAFA22880862@mcs.anl.gov">D5999106-2A75-4091-8B0F-EAFA22880862@mcs.anl.gov</a>><br>
>>> > > > > Content-Type: text/plain; charset=us-ascii<br>
>>> > > > ><br>
>>> > > > > The value of previous for rank 0 in your code is -1. MPICH is<br>
>>> > > complaining<br>
>>> > > > > because all of the requests to receive a message from -1 are<br>
>>> still<br>
>>> > > > pending<br>
>>> > > > > when you try to finalize. You need to make sure that you are<br>
>>> > receiving<br>
>>> > > > from<br>
>>> > > > > valid ranks.<br>
>>> > > > ><br>
>>> > > > > On Jul 10, 2013, at 7:50 AM, Thomas Ropars <<br>
>>> <a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a>><br>
>>> > > > wrote:<br>
>>> > > > ><br>
>>> > > > > > Yes sure. Here it is.<br>
>>> > > > > ><br>
>>> > > > > > Thomas<br>
>>> > > > > ><br>
>>> > > > > > On 07/10/2013 02:23 PM, Wesley Bland wrote:<br>
>>> > > > > >> Can you send us the smallest chunk of code that still exhibits<br>
>>> > this<br>
>>> > > > > error?<br>
>>> > > > > >><br>
>>> > > > > >> Wesley<br>
>>> > > > > >><br>
>>> > > > > >> On Jul 10, 2013, at 6:54 AM, Thomas Ropars <<br>
>>> <a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a><br>
>>> > ><br>
>>> > > > > wrote:<br>
>>> > > > > >><br>
>>> > > > > >>> Hi all,<br>
>>> > > > > >>><br>
>>> > > > > >>> I get the following error when I try to run a simple<br>
>>> application<br>
>>> > > > > implementing a ring (each process sends to rank+1 and receives<br>
>>> from<br>
>>> > > > > rank-1). More precisely, the error occurs during the call to<br>
>>> > > > MPI_Finalize():<br>
>>> > > > > >>><br>
>>> > > > > >>> Assertion failed in file<br>
>>> > > > > src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c at line 363:<br>
>>> > > > sc->pg_is_set<br>
>>> > > > > >>> internal ABORT - process 0<br>
>>> > > > > >>><br>
>>> > > > > >>> Does anybody else also noticed the same error?<br>
>>> > > > > >>><br>
>>> > > > > >>> Here are all the details about my test:<br>
>>> > > > > >>> - The error is generated with mpich-3.0.2 (but I noticed the<br>
>>> > exact<br>
>>> > > > > same error with mpich-3.0.4)<br>
>>> > > > > >>> - I am using IPoIB for communication between nodes (The same<br>
>>> > thing<br>
>>> > > > > happens over Ethernet)<br>
>>> > > > > >>> - The problem comes from TCP links. When all processes are<br>
>>> on the<br>
>>> > > > same<br>
>>> > > > > node, there is no error. As soon as one process is on a remote<br>
>>> node,<br>
>>> > > the<br>
>>> > > > > failure occurs.<br>
>>> > > > > >>> - Note also that the failure does not occur if I run a more<br>
>>> > complex<br>
>>> > > > > code (eg, a NAS benchmark).<br>
>>> > > > > >>><br>
>>> > > > > >>> Thomas Ropars<br>
>>> > > > > >>> _______________________________________________<br>
>>> > > > > >>> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > >>> To manage subscription options or unsubscribe:<br>
>>> > > > > >>> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > > > > >> _______________________________________________<br>
>>> > > > > >> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > >> To manage subscription options or unsubscribe:<br>
>>> > > > > >> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > > > > >><br>
>>> > > > > >><br>
>>> > > > > ><br>
>>> > > > > > <ring_clean.c>_______________________________________________<br>
>>> > > > > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > > To manage subscription options or unsubscribe:<br>
>>> > > > > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > > > ><br>
>>> > > > ><br>
>>> > > > ><br>
>>> > > > > ------------------------------<br>
>>> > > > ><br>
>>> > > > > Message: 3<br>
>>> > > > > Date: Wed, 10 Jul 2013 16:41:27 +0200<br>
>>> > > > > From: Bob Ilgner <<a href="mailto:bobilgner@gmail.com">bobilgner@gmail.com</a>><br>
>>> > > > > To: <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
>>> > > > > Subject: [mpich-discuss] Restrict number of cores, not threads<br>
>>> > > > > Message-ID:<br>
>>> > > > > <<br>
>>> > > > ><br>
>>> <a href="mailto:CAKv15b-QgmHkVkoiTFmP3EZXvyy6sc_QeqHQgbMUhnr3Xh9ecA@mail.gmail.com">CAKv15b-QgmHkVkoiTFmP3EZXvyy6sc_QeqHQgbMUhnr3Xh9ecA@mail.gmail.com</a>><br>
>>> > > > > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> > > > ><br>
>>> > > > > Dear all,<br>
>>> > > > ><br>
>>> > > > > I am working on a shared memory processor with 256 cores. I am<br>
>>> > working<br>
>>> > > > from<br>
>>> > > > > the command line directly.<br>
>>> > > > ><br>
>>> > > > > Can I restict the number of cores that I deploy.The command<br>
>>> > > > ><br>
>>> > > > > mpirun -n 100 myprog<br>
>>> > > > ><br>
>>> > > > ><br>
>>> > > > > will automatically start on 100 cores. I wish to use only 10<br>
>>> cores<br>
>>> > and<br>
>>> > > > have<br>
>>> > > > > 10 threads on each core. Can I do this with mpich ? Rememebre<br>
>>> that<br>
>>> > > this<br>
>>> > > > an<br>
>>> > > > > smp abd I can not identify each core individually(as in a<br>
>>> cluster)<br>
>>> > > > ><br>
>>> > > > > Regards, bob<br>
>>> > > > > -------------- next part --------------<br>
>>> > > > > An HTML attachment was scrubbed...<br>
>>> > > > > URL: <<br>
>>> > > > ><br>
>>> > > ><br>
>>> > ><br>
>>> ><br>
>>> <a href="http://lists.mpich.org/pipermail/discuss/attachments/20130710/ec659e91/attachment-0001.html" target="_blank">http://lists.mpich.org/pipermail/discuss/attachments/20130710/ec659e91/attachment-0001.html</a><br>
>>> > > > > ><br>
>>> > > > ><br>
>>> > > > > ------------------------------<br>
>>> > > > ><br>
>>> > > > > Message: 4<br>
>>> > > > > Date: Wed, 10 Jul 2013 09:46:38 -0500<br>
>>> > > > > From: Wesley Bland <<a href="mailto:wbland@mcs.anl.gov">wbland@mcs.anl.gov</a>><br>
>>> > > > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > Cc: <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
>>> > > > > Subject: Re: [mpich-discuss] Restrict number of cores, not<br>
>>> threads<br>
>>> > > > > Message-ID: <<a href="mailto:2FAF588E-2FBE-45E4-B53F-E6BC931E3D51@mcs.anl.gov">2FAF588E-2FBE-45E4-B53F-E6BC931E3D51@mcs.anl.gov</a>><br>
>>> > > > > Content-Type: text/plain; charset=iso-8859-1<br>
>>> > > > ><br>
>>> > > > > Threads in MPI are not ranks. When you say you want to launch<br>
>>> with -n<br>
>>> > > > 100,<br>
>>> > > > > you will always get 100 processes, not threads. If you want 10<br>
>>> > threads<br>
>>> > > on<br>
>>> > > > > 10 cores, you will need to launch with -n 10, then add your<br>
>>> threads<br>
>>> > > > > according to your threading library.<br>
>>> > > > ><br>
>>> > > > > Note that threads in MPI do not get their own rank currently.<br>
>>> They<br>
>>> > all<br>
>>> > > > > share the same rank as the process in which they reside, so if<br>
>>> you<br>
>>> > need<br>
>>> > > > to<br>
>>> > > > > be able to handle things with different ranks, you'll need to use<br>
>>> > > actual<br>
>>> > > > > processes.<br>
>>> > > > ><br>
>>> > > > > Wesley<br>
>>> > > > ><br>
>>> > > > > On Jul 10, 2013, at 9:41 AM, Bob Ilgner <<a href="mailto:bobilgner@gmail.com">bobilgner@gmail.com</a>><br>
>>> wrote:<br>
>>> > > > ><br>
>>> > > > > > Dear all,<br>
>>> > > > > ><br>
>>> > > > > > I am working on a shared memory processor with 256 cores. I am<br>
>>> > > working<br>
>>> > > > > from the command line directly.<br>
>>> > > > > ><br>
>>> > > > > > Can I restict the number of cores that I deploy.The command<br>
>>> > > > > ><br>
>>> > > > > > mpirun -n 100 myprog<br>
>>> > > > > ><br>
>>> > > > > ><br>
>>> > > > > > will automatically start on 100 cores. I wish to use only 10<br>
>>> cores<br>
>>> > > and<br>
>>> > > > > have 10 threads on each core. Can I do this with mpich ?<br>
>>> Rememebre<br>
>>> > > that<br>
>>> > > > > this an smp abd I can not identify each core individually(as in a<br>
>>> > > > cluster)<br>
>>> > > > > ><br>
>>> > > > > > Regards, bob<br>
>>> > > > > > _______________________________________________<br>
>>> > > > > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > > To manage subscription options or unsubscribe:<br>
>>> > > > > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > > > ><br>
>>> > > > ><br>
>>> > > > ><br>
>>> > > > > ------------------------------<br>
>>> > > > ><br>
>>> > > > > Message: 5<br>
>>> > > > > Date: Wed, 10 Jul 2013 09:46:38 -0500<br>
>>> > > > > From: Wesley Bland <<a href="mailto:wbland@mcs.anl.gov">wbland@mcs.anl.gov</a>><br>
>>> > > > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > Cc: <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
>>> > > > > Subject: Re: [mpich-discuss] Restrict number of cores, not<br>
>>> threads<br>
>>> > > > > Message-ID: <<a href="mailto:2FAF588E-2FBE-45E4-B53F-E6BC931E3D51@mcs.anl.gov">2FAF588E-2FBE-45E4-B53F-E6BC931E3D51@mcs.anl.gov</a>><br>
>>> > > > > Content-Type: text/plain; charset=iso-8859-1<br>
>>> > > > ><br>
>>> > > > > Threads in MPI are not ranks. When you say you want to launch<br>
>>> with -n<br>
>>> > > > 100,<br>
>>> > > > > you will always get 100 processes, not threads. If you want 10<br>
>>> > threads<br>
>>> > > on<br>
>>> > > > > 10 cores, you will need to launch with -n 10, then add your<br>
>>> threads<br>
>>> > > > > according to your threading library.<br>
>>> > > > ><br>
>>> > > > > Note that threads in MPI do not get their own rank currently.<br>
>>> They<br>
>>> > all<br>
>>> > > > > share the same rank as the process in which they reside, so if<br>
>>> you<br>
>>> > need<br>
>>> > > > to<br>
>>> > > > > be able to handle things with different ranks, you'll need to use<br>
>>> > > actual<br>
>>> > > > > processes.<br>
>>> > > > ><br>
>>> > > > > Wesley<br>
>>> > > > ><br>
>>> > > > > On Jul 10, 2013, at 9:41 AM, Bob Ilgner <<a href="mailto:bobilgner@gmail.com">bobilgner@gmail.com</a>><br>
>>> wrote:<br>
>>> > > > ><br>
>>> > > > > > Dear all,<br>
>>> > > > > ><br>
>>> > > > > > I am working on a shared memory processor with 256 cores. I am<br>
>>> > > working<br>
>>> > > > > from the command line directly.<br>
>>> > > > > ><br>
>>> > > > > > Can I restict the number of cores that I deploy.The command<br>
>>> > > > > ><br>
>>> > > > > > mpirun -n 100 myprog<br>
>>> > > > > ><br>
>>> > > > > ><br>
>>> > > > > > will automatically start on 100 cores. I wish to use only 10<br>
>>> cores<br>
>>> > > and<br>
>>> > > > > have 10 threads on each core. Can I do this with mpich ?<br>
>>> Rememebre<br>
>>> > > that<br>
>>> > > > > this an smp abd I can not identify each core individually(as in a<br>
>>> > > > cluster)<br>
>>> > > > > ><br>
>>> > > > > > Regards, bob<br>
>>> > > > > > _______________________________________________<br>
>>> > > > > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > > To manage subscription options or unsubscribe:<br>
>>> > > > > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > > > ><br>
>>> > > > ><br>
>>> > > > ><br>
>>> > > > > ------------------------------<br>
>>> > > > ><br>
>>> > > > > Message: 6<br>
>>> > > > > Date: Wed, 10 Jul 2013 16:50:36 +0200<br>
>>> > > > > From: Thomas Ropars <<a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a>><br>
>>> > > > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > Subject: Re: [mpich-discuss] Error in MPI_Finalize on a simple<br>
>>> ring<br>
>>> > > > > test over TCP<br>
>>> > > > > Message-ID: <<a href="mailto:51DD74BC.3020009@epfl.ch">51DD74BC.3020009@epfl.ch</a>><br>
>>> > > > > Content-Type: text/plain; charset=UTF-8; format=flowed<br>
>>> > > > ><br>
>>> > > > > Yes, you are right, sorry for disturbing.<br>
>>> > > > ><br>
>>> > > > > On 07/10/2013 03:39 PM, Wesley Bland wrote:<br>
>>> > > > > > The value of previous for rank 0 in your code is -1. MPICH is<br>
>>> > > > > complaining because all of the requests to receive a message<br>
>>> from -1<br>
>>> > > are<br>
>>> > > > > still pending when you try to finalize. You need to make sure<br>
>>> that<br>
>>> > you<br>
>>> > > > are<br>
>>> > > > > receiving from valid ranks.<br>
>>> > > > > ><br>
>>> > > > > > On Jul 10, 2013, at 7:50 AM, Thomas Ropars <<br>
>>> <a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a>><br>
>>> > > > > wrote:<br>
>>> > > > > ><br>
>>> > > > > >> Yes sure. Here it is.<br>
>>> > > > > >><br>
>>> > > > > >> Thomas<br>
>>> > > > > >><br>
>>> > > > > >> On 07/10/2013 02:23 PM, Wesley Bland wrote:<br>
>>> > > > > >>> Can you send us the smallest chunk of code that still<br>
>>> exhibits<br>
>>> > this<br>
>>> > > > > error?<br>
>>> > > > > >>><br>
>>> > > > > >>> Wesley<br>
>>> > > > > >>><br>
>>> > > > > >>> On Jul 10, 2013, at 6:54 AM, Thomas Ropars <<br>
>>> > <a href="mailto:thomas.ropars@epfl.ch">thomas.ropars@epfl.ch</a>><br>
>>> > > > > wrote:<br>
>>> > > > > >>><br>
>>> > > > > >>>> Hi all,<br>
>>> > > > > >>>><br>
>>> > > > > >>>> I get the following error when I try to run a simple<br>
>>> application<br>
>>> > > > > implementing a ring (each process sends to rank+1 and receives<br>
>>> from<br>
>>> > > > > rank-1). More precisely, the error occurs during the call to<br>
>>> > > > MPI_Finalize():<br>
>>> > > > > >>>><br>
>>> > > > > >>>> Assertion failed in file<br>
>>> > > > > src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c at line 363:<br>
>>> > > > sc->pg_is_set<br>
>>> > > > > >>>> internal ABORT - process 0<br>
>>> > > > > >>>><br>
>>> > > > > >>>> Does anybody else also noticed the same error?<br>
>>> > > > > >>>><br>
>>> > > > > >>>> Here are all the details about my test:<br>
>>> > > > > >>>> - The error is generated with mpich-3.0.2 (but I noticed the<br>
>>> > exact<br>
>>> > > > > same error with mpich-3.0.4)<br>
>>> > > > > >>>> - I am using IPoIB for communication between nodes (The same<br>
>>> > thing<br>
>>> > > > > happens over Ethernet)<br>
>>> > > > > >>>> - The problem comes from TCP links. When all processes are<br>
>>> on<br>
>>> > the<br>
>>> > > > > same node, there is no error. As soon as one process is on a<br>
>>> remote<br>
>>> > > node,<br>
>>> > > > > the failure occurs.<br>
>>> > > > > >>>> - Note also that the failure does not occur if I run a more<br>
>>> > > complex<br>
>>> > > > > code (eg, a NAS benchmark).<br>
>>> > > > > >>>><br>
>>> > > > > >>>> Thomas Ropars<br>
>>> > > > > >>>> _______________________________________________<br>
>>> > > > > >>>> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > >>>> To manage subscription options or unsubscribe:<br>
>>> > > > > >>>> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > > > > >>> _______________________________________________<br>
>>> > > > > >>> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > >>> To manage subscription options or unsubscribe:<br>
>>> > > > > >>> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > > > > >>><br>
>>> > > > > >>><br>
>>> > > > > >> <ring_clean.c>_______________________________________________<br>
>>> > > > > >> discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > >> To manage subscription options or unsubscribe:<br>
>>> > > > > >> <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > > > > > _______________________________________________<br>
>>> > > > > > discuss mailing list <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > > To manage subscription options or unsubscribe:<br>
>>> > > > > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > > > > ><br>
>>> > > > > ><br>
>>> > > > ><br>
>>> > > > ><br>
>>> > > > ><br>
>>> > > > > ------------------------------<br>
>>> > > > ><br>
>>> > > > > Message: 7<br>
>>> > > > > Date: Wed, 10 Jul 2013 10:07:21 -0500<br>
>>> > > > > From: Sufeng Niu <<a href="mailto:sniu@hawk.iit.edu">sniu@hawk.iit.edu</a>><br>
>>> > > > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > Subject: [mpich-discuss] MPI_Win_fence failed<br>
>>> > > > > Message-ID:<br>
>>> > > > > <<br>
>>> > > > ><br>
>>> <a href="mailto:CAFNNHkz_1gC7hfpx0G9j24adO-gDabdmwZ4VuT6jip-fDMhS9A@mail.gmail.com">CAFNNHkz_1gC7hfpx0G9j24adO-gDabdmwZ4VuT6jip-fDMhS9A@mail.gmail.com</a>><br>
>>> > > > > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> > > > ><br>
>>> > > > > Hello,<br>
>>> > > > ><br>
>>> > > > > I am using MPI RMA in my program, but the program stops at MPI_Win_fence.<br>
>>> > > > > I have a master process that receives data from a UDP socket; the other<br>
>>> > > > > processes use MPI_Get to access that data.<br>
>>> > > > ><br>
>>> > > > > master process:<br>
>>> > > > ><br>
>>> > > > > MPI_Win_create(...);<br>
>>> > > > > for (...) {<br>
>>> > > > >     /* udp recv operation */<br>
>>> > > > ><br>
>>> > > > >     MPI_Barrier(...);  // let the other processes know the data received from udp is ready<br>
>>> > > > ><br>
>>> > > > >     MPI_Win_fence(0, win);<br>
>>> > > > >     MPI_Win_fence(0, win);<br>
>>> > > > > }<br>
>>> > > > ><br>
>>> > > > > other processes:<br>
>>> > > > ><br>
>>> > > > > for (...) {<br>
>>> > > > >     MPI_Barrier(...);  // sync for udp data ready<br>
>>> > > > ><br>
>>> > > > >     MPI_Win_fence(0, win);<br>
>>> > > > ><br>
>>> > > > >     MPI_Get(...);<br>
>>> > > > ><br>
>>> > > > >     MPI_Win_fence(0, win);  // <-- program stopped here<br>
>>> > > > ><br>
>>> > > > >     /* other operations */<br>
>>> > > > > }<br>
>>> > > > ><br>
>>> > > > > I found that the program stops at the second MPI_Win_fence; the terminal output is:<br>
>>> > > > ><br>
>>> > > > > ===================================================================================<br>
>>> > > > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES<br>
>>> > > > > = EXIT CODE: 11<br>
>>> > > > > = CLEANING UP REMAINING PROCESSES<br>
>>> > > > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES<br>
>>> > > > > ===================================================================================<br>
>>> > > > > YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)<br>
>>> > > > > This typically refers to a problem with your application.<br>
>>> > > > > Please see the FAQ page for debugging suggestions<br>
>>> > > > ><br>
>>> > > > > Do you have any suggestions? Thank you very much!<br>
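>>> > > > ><br>
>>> > > > > For reference, here is a minimal, self-contained sketch of the fence/MPI_Get pattern above.<br>
>>> > > > > It is not my actual program: the choice of rank 0 as master, the window size, and the<br>
>>> > > > > per-rank slice size NUM_ELEMS are placeholders for illustration only.<br>
>>> > > > ><br>
>>> > > > > /* Sketch: rank 0 exposes one slice of NUM_ELEMS ints per rank; the other<br>
>>> > > > >  * ranks read their own slice with MPI_Get between two fences. */<br>
>>> > > > > #include <mpi.h><br>
>>> > > > > #include <stdio.h><br>
>>> > > > ><br>
>>> > > > > #define NUM_ELEMS 4   /* assumed slice size per rank */<br>
>>> > > > ><br>
>>> > > > > int main(int argc, char **argv)<br>
>>> > > > > {<br>
>>> > > > >     int rank, size;<br>
>>> > > > >     int *win_buf = NULL;<br>
>>> > > > >     MPI_Win win;<br>
>>> > > > ><br>
>>> > > > >     MPI_Init(&argc, &argv);<br>
>>> > > > >     MPI_Comm_rank(MPI_COMM_WORLD, &rank);<br>
>>> > > > >     MPI_Comm_size(MPI_COMM_WORLD, &size);<br>
>>> > > > ><br>
>>> > > > >     if (rank == 0) {<br>
>>> > > > >         /* master: window holds one slice per rank, so every displacement below stays in bounds */<br>
>>> > > > >         MPI_Alloc_mem((MPI_Aint)size * NUM_ELEMS * sizeof(int), MPI_INFO_NULL, &win_buf);<br>
>>> > > > >         for (int i = 0; i < size * NUM_ELEMS; i++)<br>
>>> > > > >             win_buf[i] = i;   /* stands in for the data received over UDP */<br>
>>> > > > >         MPI_Win_create(win_buf, (MPI_Aint)size * NUM_ELEMS * sizeof(int),<br>
>>> > > > >                        sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &win);<br>
>>> > > > >     } else {<br>
>>> > > > >         /* workers expose no memory; they only read from rank 0 */<br>
>>> > > > >         MPI_Win_create(NULL, 0, sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &win);<br>
>>> > > > >     }<br>
>>> > > > ><br>
>>> > > > >     MPI_Win_fence(0, win);                /* open the epoch */<br>
>>> > > > >     int local[NUM_ELEMS];<br>
>>> > > > >     if (rank != 0)<br>
>>> > > > >         MPI_Get(local, NUM_ELEMS, MPI_INT, 0,<br>
>>> > > > >                 (MPI_Aint)rank * NUM_ELEMS, NUM_ELEMS, MPI_INT, win);<br>
>>> > > > >     MPI_Win_fence(0, win);                /* close the epoch; the get is now complete */<br>
>>> > > > ><br>
>>> > > > >     if (rank != 0)<br>
>>> > > > >         printf("rank %d got %d\n", rank, local[0]);<br>
>>> > > > ><br>
>>> > > > >     MPI_Win_free(&win);<br>
>>> > > > >     if (rank == 0) MPI_Free_mem(win_buf);<br>
>>> > > > >     MPI_Finalize();<br>
>>> > > > >     return 0;<br>
>>> > > > > }<br>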
>>> > > > ><br>
>>> > > > > --<br>
>>> > > > > Best Regards,<br>
>>> > > > > Sufeng Niu<br>
>>> > > > > ECASP lab, ECE department, Illinois Institute of Technology<br>
>>> > > > > Tel: 312-731-7219<br>
>>> > > > > ------------------------------<br>
>>> > > > ><br>
>>> > > > > Message: 8<br>
>>> > > > > Date: Wed, 10 Jul 2013 11:12:45 -0400<br>
>>> > > > > From: Jim Dinan <<a href="mailto:james.dinan@gmail.com">james.dinan@gmail.com</a>><br>
>>> > > > > To: <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > Subject: Re: [mpich-discuss] MPI_Win_fence failed<br>
>>> > > > > Message-ID: <CAOoEU4F3hX=y3yrJKYKucNeiueQYBeR_3OQas9E+mg+GM6Rz=w@mail.gmail.com><br>
>>> > > > > Content-Type: text/plain; charset="iso-8859-1"<br>
>>> > > > ><br>
>>> > > > > It's hard to tell where the segmentation fault is coming from. Can you use<br>
>>> > > > > a debugger to generate a backtrace?<br>
>>> > > > ><br>
>>> > > > > ~Jim.<br>
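>>> > > > ><br>
>>> > > > > A generic way to get such a backtrace from a crashing or hung MPI job (a sketch only,<br>
>>> > > > > not something prescribed in this thread): have each rank print its PID and spin until<br>
>>> > > > > a debugger attaches, then attach gdb to the failing rank, release it, and run bt when<br>
>>> > > > > it faults. For example:<br>
>>> > > > ><br>
>>> > > > > /* Debug-attach sketch (illustration only, not the original code):<br>
>>> > > > >  * each rank prints its PID and spins until a debugger clears 'holding'. */<br>
>>> > > > > #include <mpi.h><br>
>>> > > > > #include <stdio.h><br>
>>> > > > > #include <unistd.h><br>
>>> > > > ><br>
>>> > > > > int main(int argc, char **argv)<br>
>>> > > > > {<br>
>>> > > > >     volatile int holding = 1;<br>
>>> > > > >     char host[256];<br>
>>> > > > ><br>
>>> > > > >     MPI_Init(&argc, &argv);<br>
>>> > > > >     gethostname(host, sizeof(host));<br>
>>> > > > >     printf("PID %d on %s waiting for debugger attach\n", (int)getpid(), host);<br>
>>> > > > >     fflush(stdout);<br>
>>> > > > ><br>
>>> > > > >     while (holding)   /* in gdb: attach the PID, then "set var holding = 0", "continue", and "bt" after the fault */<br>
>>> > > > >         sleep(1);<br>
>>> > > > ><br>
>>> > > > >     /* ... the rest of the program (window creation, fences, gets) goes here ... */<br>
>>> > > > ><br>
>>> > > > >     MPI_Finalize();<br>
>>> > > > >     return 0;<br>
>>> > > > > }<br>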
>>> > > > ><br>
>>> > > > ><br>
>>> > > > > ------------------------------<br>
>>> > > > ><br>
>>> > > > > _______________________________________________<br>
>>> > > > > discuss mailing list<br>
>>> > > > > <a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
>>> > > > > <a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
>>> > > > ><br>
>>> > > > > End of discuss Digest, Vol 9, Issue 27<br>
>>> > > > > **************************************<br>
<br>
------------------------------<br>
<br>
_______________________________________________<br>
discuss mailing list<br>
<a href="mailto:discuss@mpich.org">discuss@mpich.org</a><br>
<a href="https://lists.mpich.org/mailman/listinfo/discuss" target="_blank">https://lists.mpich.org/mailman/listinfo/discuss</a><br>
<br>
End of discuss Digest, Vol 9, Issue 34<br>
**************************************<br>
</blockquote></div><br><br clear="all"><br>-- <br>Best Regards,<div>Sufeng Niu</div><div>ECASP lab, ECE department, Illinois Institute of Technology</div><div>Tel: 312-731-7219</div>
</div></div></div></div></div></div></div></div>