<font size=2 face="sans-serif">I think my problem with MPIU_THREAD_GRANULARITY_GLOBAL
is a CS_YIELD: the global critical section gets released while this code
is still walking BsendBuffer.active. </font>
<br>
<br><font size=2 face="sans-serif">MPIU_THREAD_CS_ENTER(ALLFUNC,);
</font>
<br><font size=2 face="sans-serif">....</font>
<br>
<br><font size=2 face="sans-serif">MPIR_Bsend_data_t *<b>active = BsendBuffer.active</b>,
*next_active;
</font>
<br><font size=2 face="sans-serif"> while (active) {
</font>
<br><font size=2 face="sans-serif"> fprintf(stderr,"%2.2u:%u:<b>active</b>
%p (0x%08x kind=%d) refcount %d\n",
</font>
<br><font size=2 face="sans-serif">
Kernel_ProcessorID(),__LINE__,
</font>
<br><font size=2 face="sans-serif">
(active->request),
</font>
<br><font size=2 face="sans-serif">
(active->request)->handle,
</font>
<br><font size=2 face="sans-serif">
active->request->kind,
</font>
<br><font size=2 face="sans-serif">
MPIU_Object_get_ref((active->request)));</font>
<br><font size=2 face="sans-serif">...</font>
<br>
<br><font size=2 face="sans-serif">There are one or more yields somewhere
in test and/or progress. I haven't tracked them down yet, and I'm out tomorrow.
I end up with 3 threads (26, 54, 48) working on the same <b>active
</b>request. Thread 26 frees it and moves on to the next active request. Thread 48
then chokes on the freed request.</font>
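<br>
<br><font size=2 face="sans-serif">A minimal sketch of that hazard and one
possible way around it, not the actual MPICH code. All names here (node_t,
active_list_t, check_active, the yield hook, the generation counter) are
hypothetical stand-ins. The assumption is that the yield drops and retakes the
global lock, so any pointer held across it may be stale; the walker detects
that through a counter bumped on every unlink and restarts from the head.</font>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the active list; not MPICH's real types. */
typedef struct node {
    struct node *next;
    int done;                 /* 1 once the request has completed */
} node_t;

typedef struct {
    node_t *head;
    unsigned generation;      /* bumped on every unlink */
} active_list_t;

/* Hypothetical yield hook: models releasing the global lock, letting
 * other threads run, and reacquiring it. */
typedef void (*yield_fn)(active_list_t *list, void *arg);

static void list_push(active_list_t *l, node_t *n)
{
    n->next = l->head;
    l->head = n;
}

static void list_unlink(active_list_t *l, node_t *n)
{
    node_t **pp = &l->head;
    while (*pp && *pp != n)
        pp = &(*pp)->next;
    if (*pp) {
        *pp = n->next;
        l->generation++;
    }
}

/* Simulated interference: on its first call, "another thread" unlinks
 * and frees the head entry while we are yielded. */
static void other_thread_frees_head(active_list_t *l, void *arg)
{
    int *fired = arg;
    if (!*fired && l->head) {
        node_t *victim = l->head;
        list_unlink(l, victim);
        free(victim);
        *fired = 1;
    }
}

/* Walk the list and free completed entries. After every yield, compare
 * the generation; if it changed, cur may point at freed memory, so
 * restart from the head instead of dereferencing it.
 * Returns the number of entries freed by this caller. */
static int check_active(active_list_t *l, yield_fn yield, void *arg)
{
    assert(l != NULL);
    int freed = 0;
    node_t *cur = l->head;
    while (cur) {
        unsigned gen = l->generation;
        if (yield)
            yield(l, arg);            /* lock dropped and retaken here */
        if (gen != l->generation) {   /* list changed under us */
            cur = l->head;
            continue;
        }
        node_t *next = cur->next;
        if (cur->done) {
            list_unlink(l, cur);
            free(cur);
            freed++;
        }
        cur = next;
    }
    return freed;
}
```

<br><font size=2 face="sans-serif">Without the generation check, the walker
would dereference the entry the other thread freed, which matches what thread 48
hits in the trace below. Restarting from the head is the simplest choice, but it
can loop for a long time if the list changes on every yield.</font>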
<br>
<br><font size=2 face="sans-serif">stderr[0]: threaded exit</font>
<br><font size=2 face="sans-serif">stderr[0]: <b>26</b>:441:<b>active 0x15d1c78
</b>(0xac000003 kind=1) refcount 2</font>
<br><font size=2 face="sans-serif">stderr[0]: 26:decr 0x15d1aa8 (0xac000001
kind=REQUEST) refcount to 1</font>
<br><font size=2 face="sans-serif">stderr[0]: 26:decr 0x15d1c78 (0xac000003
kind=REQUEST) refcount to 1</font>
<br><font size=2 face="sans-serif">stderr[0]: yield</font>
<br>
<br><font size=2 face="sans-serif">stderr[0]: <b>54</b>:441:<b>active 0x15d1c78
</b>(0xac000003 kind=1) refcount 1</font>
<br><font size=2 face="sans-serif">stderr[0]: 54:set 0x15d1d60 (0xac000004
kind=REQUEST) refcount to 1</font>
<br><font size=2 face="sans-serif">stderr[0]: 54:set 0x15d1d60 (0xac000004
kind=REQUEST) refcount to 2</font>
<br><font size=2 face="sans-serif">stderr[0]: 54:decr 0x15d1d60 (0xac000004
kind=REQUEST) refcount to 1</font>
<br><font size=2 face="sans-serif">stderr[0]: yield</font>
<br>
<br><font size=2 face="sans-serif">stderr[0]: <b>48</b>:441:<b>active 0x15d1c78
</b>(0xac000003 kind=1) refcount 1</font>
<br><font size=2 face="sans-serif">stderr[0]: 48:set 0x15d1e48 (0xac000005
kind=REQUEST) refcount to 1</font>
<br><font size=2 face="sans-serif">stderr[0]: 48:set 0x15d1e48 (0xac000005
kind=REQUEST) refcount to 2</font>
<br><font size=2 face="sans-serif">stderr[0]: 48:decr 0x15d1e48 (0xac000005
kind=REQUEST) refcount to 1</font>
<br><font size=2 face="sans-serif">stderr[0]: yield</font>
<br>
<br><font size=2 face="sans-serif">stderr[0]: 26:decr 0x15d1c78 (0xac000003
kind=REQUEST) refcount to 0</font>
<br><font size=2 face="sans-serif">stderr[0]: 26:decr 0x1560f78 (0x44000000
kind=COMM) refcount to 3</font>
<br><font size=2 face="sans-serif">stderr[0]:<b> 26</b>:<b>free 0x15d1c78</b>
(0xac000003 kind=0) refcount 0</font>
<br><font size=2 face="sans-serif">stderr[0]: 26:356:prev 0x15d1c78, active
0x15d1aa8 (0xac000001 kind=1) refcount 1</font>
<br><font size=2 face="sans-serif">stderr[0]: 26:441:active 0x15d1aa8 (0xac000001
kind=1) refcount 1</font>
<br>
<br><font size=2 face="sans-serif">stderr[0]: yield</font>
<br><font size=2 face="sans-serif">stderr[0]: 32:441:active 0x15d1aa8 (0xac000001
kind=1) refcount 1</font>
<br>
<br><font size=2 face="sans-serif">stderr[0]: yield</font>
<br><font size=2 face="sans-serif">stderr[0]: <b>48</b>:badcase <b>0x15d1c78
</b>(0xac000003 kind=0) refcount 0</font>
<br><font size=2 face="sans-serif">stderr[0]: Abort(1) on node 0 (rank
0 in comm 1140850688): Fatal error in MPI_Bsend: Internal MPI error!, error
stack:</font>
<br><font size=2 face="sans-serif">stderr[0]: MPI_Bsend(181)..............:
MPI_Bsend(buf=0x19c8a06d70, count=1024, MPI_CHAR, dest=1, tag=0, MPI_COMM_WORLD)
failed</font>
<br><font size=2 face="sans-serif">stderr[0]: MPIR_Bsend_isend(226).......:
</font>
<br><font size=2 face="sans-serif">stderr[0]: MPIR_Bsend_check_active(474):
</font>
<br><font size=2 face="sans-serif">stderr[0]: MPIR_Test_impl(65)..........:
</font>
<br><font size=2 face="sans-serif">stderr[0]: MPIR_Request_complete(239)..:
INTERNAL ERROR: unexpected value in case statement (value=0)</font>
<br>
<br>
<br><font size=2 face="sans-serif">I'm guessing the problem with MPIU_THREAD_GRANULARITY_PER_OBJECT
is that there's no lock at all, so the threads are all over each other and no
yield is needed to trigger it. The code just isn't thread safe with the static BsendBuffer.</font>
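<br>
<br><font size=2 face="sans-serif">For the per-object case, a rough sketch of
what "a lock on the static buffer" could look like. This is an illustration
using plain pthreads, not MPICH's thread macros; bsend_buffer, seg_t,
buffer_add, buffer_take and run_workers are made-up names, and the real
BsendBuffer has more state than a single list.</font>

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical segment type; the real buffer entries carry more state. */
typedef struct seg {
    struct seg *next;
} seg_t;

/* Static buffer with its own mutex, so it is protected even when no
 * global critical section exists. */
static struct {
    pthread_mutex_t lock;
    seg_t *active;
    int count;
} bsend_buffer = { PTHREAD_MUTEX_INITIALIZER, NULL, 0 };

static void buffer_add(seg_t *s)
{
    pthread_mutex_lock(&bsend_buffer.lock);
    s->next = bsend_buffer.active;
    bsend_buffer.active = s;
    bsend_buffer.count++;
    pthread_mutex_unlock(&bsend_buffer.lock);
}

static seg_t *buffer_take(void)
{
    pthread_mutex_lock(&bsend_buffer.lock);
    seg_t *s = bsend_buffer.active;
    if (s) {
        bsend_buffer.active = s->next;
        bsend_buffer.count--;
    }
    pthread_mutex_unlock(&bsend_buffer.lock);
    return s;
}

enum { NTHREADS = 4, PER_THREAD = 1000 };
static seg_t pool[NTHREADS][PER_THREAD];

/* Each worker adds its own row of segments to the shared list. */
static void *worker(void *arg)
{
    seg_t *mine = arg;
    for (int i = 0; i < PER_THREAD; i++)
        buffer_add(&mine[i]);
    return NULL;
}

/* Start the workers, wait for them, and return the final entry count. */
static int run_workers(void)
{
    pthread_t t[NTHREADS];
    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&t[i], NULL, worker, pool[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return bsend_buffer.count;
}
```

<br><font size=2 face="sans-serif">The lock only covers the list operations
themselves. Anything that holds an entry pointer after the lock is released
would still need a reference count or similar to stay valid.</font>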
<br>
<br><font size=2 face="sans-serif"><br>
Bob Cernohous: (T/L 553) 507-253-6093<br>
<br>
BobC@us.ibm.com<br>
IBM Rochester, Building 030-2(C335), Department 61L<br>
3605 Hwy 52 North, Rochester, MN 55901-7829<br>
<br>
> Chaos reigns within.<br>
> Reflect, repent, and reboot.<br>
> Order shall return.<br>
</font>