[mpich-discuss] error spawning processes in mpich-3.2rc1

Min Si msi at il.is.s.u-tokyo.ac.jp
Tue Nov 10 13:14:02 CST 2015


Hi Siegmar,

We have a quick fix for the issue in your spawn programs, but unaligned 
memory access may occur for every packet received inside MPICH, so we 
decided to fix it completely and will include that fix in a future 
release (not mpich-3.2rc2).
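
To illustrate the problem with a minimal sketch (hypothetical names, not the 
actual MPICH code): on an alignment-sensitive CPU such as SPARC, casting a 
byte pointer that falls at an arbitrary offset inside a receive buffer to a 
header struct and dereferencing it can raise SIGBUS, whereas copying the 
header into an aligned local first is always safe; that is essentially what 
the attached patch does for the unaligned case.

#include <stdint.h>
#include <string.h>

/* Hypothetical packet header, standing in for MPIDI_CH3_Pkt_t. */
typedef struct {
    int32_t type;
    int64_t sender_req_id;
} pkt_hdr_t;

/* Unsafe on SPARC: buf may be misaligned for pkt_hdr_t, so loads through
 * the casted pointer can fault (SIGBUS). */
int32_t read_type_unsafe(const char *buf)
{
    return ((const pkt_hdr_t *) buf)->type;
}

/* Safe everywhere: if buf is misaligned, copy the header bytes into an
 * aligned local before reading any field. */
int32_t read_type_safe(const char *buf)
{
    pkt_hdr_t hdr;
    if (((uintptr_t) buf % sizeof(int64_t)) != 0) {
        memcpy(&hdr, buf, sizeof(hdr));
        return hdr.type;
    }
    return ((const pkt_hdr_t *) buf)->type;
}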

The attached patch is our quick fix for received eager packets; it solves 
the issue in your spawn programs. You can apply it to mpich-3.2rc1 as a 
temporary workaround:

cd <mpich-3.2rc1 source dir>
gpatch -p1 < unalign_access.patch
autoconf -f
./configure <your options>
make && make install

Please also track the ticket (https://trac.mpich.org/projects/mpich/ticket/2309) 
for the official fix.

Best regards,
Min

On 11/10/15 3:40 AM, Siegmar Gross wrote:
> Hi Min,
>
> yesterday I installed mpich-3.2rc2 and it still breaks. Are you still
> trying to solve the problem?
>
>
> Best regards
>
> Siegmar
>
>>>> Min Si <msi at il.is.s.u-tokyo.ac.jp> 10/21/15 6:36 PM >>>
> Hi Siegmar,
>
> Thanks for providing us with the test machine. We have confirmed that this
> failure is caused by unaligned memory access inside MPICH, so it only
> happens on SPARC, which is alignment-sensitive.
>
> We will fix it. You can track the progress from this ticket.
> https://trac.mpich.org/projects/mpich/ticket/2309#ticket
>
> Because we do not have another SPARC platform, would you mind if we use
> your machine for testing while we work on the fix?
>
> Best regards,
> Min
> On 10/12/15 9:24 AM, Siegmar Gross wrote:
>> Hi Min,
>>
>>> It seems you have already enabled the most detailed error output. We
>>> cannot think of any further clue for now. If you can give us access to
>>> your machine, we would be glad to help you debug on it.
>> Can you send me your email address? I don't want to send
>> login data to this list.
>>
>>
>> Kind regards
>>
>> Siegmar
>>
>>
>>> Min
>>>
>>> On 10/8/15 12:02 AM, Siegmar Gross wrote:
>>>> Hi Min,
>>>>
>>>> thank you very much for your answer.
>>>>
>>>>> We cannot reproduce this error on our test machines (Solaris i386,
>>>>> Ubuntu x86_64) using your programs. Unfortunately, we do not have a
>>>>> Solaris SPARC machine, so we could not verify it.
>>>> The programs work fine on my Solaris x86_64 and Linux machines
>>>> as well. I only have a problem on Solaris Sparc.
>>>>
>>>>
>>>>> Sometimes you may need to add "./" in front of the
>>>>> program path; could you try it?
>>>>>
>>>>> For example, in spawn_master.c:
>>>>>> #define SLAVE_PROG "./spawn_slave"
>>>> No, it will not work, because the programs are stored in a
>>>> different directory ($HOME/{SunOS, Linux}/{sparc, x86_64}/bin),
>>>> which is part of PATH (as well as ".").
>>>>
>>>> Can I do anything to track the source of the error?
>>>>
>>>>
>>>> Kind regards
>>>>
>>>> Siegmar
>>>>
>>>>> Min
>>>>>
>>>>> On 10/7/15 5:03 AM, Siegmar Gross wrote:
>>>>>> Hi,
>>>>>>
>>>>>> today I've built mpich-3.2rc1 on my machines (Solaris 10 Sparc,
>>>>>> Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with gcc-5.1.0
>>>>>> and Sun C 5.13. I still get the following errors on my Sparc machine,
>>>>>> which I'd already reported on September 8th. "mpiexec" is aliased to
>>>>>> 'mpiexec -genvnone'. It still doesn't matter whether I use the cc- or
>>>>>> the gcc-built version of MPICH.
>>>>>>
>>>>>>
>>>>>> tyr spawn 119 mpichversion
>>>>>> MPICH Version:          3.2rc1
>>>>>> MPICH Release date:     Wed Oct  7 00:00:33 CDT 2015
>>>>>> MPICH Device:           ch3:nemesis
>>>>>> MPICH configure: --prefix=/usr/local/mpich-3.2_64_cc
>>>>>> --libdir=/usr/local/mpich-3.2_64_cc/lib64
>>>>>> --includedir=/usr/local/mpich-3.2_64_cc/include64 CC=cc CXX=CC
>>>>>> F77=f77
>>>>>> FC=f95 CFLAGS=-m64 CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64
>>>>>> LDFLAGS=-m64
>>>>>> -L/usr/lib/sparcv9 -R/usr/lib/sparcv9 --enable-fortran=yes
>>>>>> --enable-cxx --enable-romio --enable-debuginfo --enable-smpcoll
>>>>>> --enable-threads=multiple --with-thread-package=posix --enable-shared
>>>>>> MPICH CC:       cc -m64   -O2
>>>>>> MPICH CXX:      CC -m64  -O2
>>>>>> MPICH F77:      f77 -m64
>>>>>> MPICH FC:       f95 -m64  -O2
>>>>>> tyr spawn 120
>>>>>>
>>>>>>
>>>>>>
>>>>>> tyr spawn 111 mpiexec -np 1 spawn_master
>>>>>>
>>>>>> Parent process 0 running on tyr.informatik.hs-fulda.de
>>>>>>    I create 4 slave processes
>>>>>>
>>>>>> Fatal error in MPI_Comm_spawn: Unknown error class, error stack:
>>>>>> MPI_Comm_spawn(144)...........: MPI_Comm_spawn(cmd="spawn_slave",
>>>>>> argv=0, maxprocs=4, MPI_INFO_NULL, root=0, MPI_COMM_WORLD,
>>>>>> intercomm=ffffffff7fffde50, errors=0) failed
>>>>>> MPIDI_Comm_spawn_multiple(274):
>>>>>> MPID_Comm_accept(153).........:
>>>>>> MPIDI_Comm_accept(1057).......:
>>>>>> MPIR_Bcast_intra(1287)........:
>>>>>> MPIR_Bcast_binomial(310)......: Failure during collective
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> tyr spawn 112 mpiexec -np 1 spawn_multiple_master
>>>>>>
>>>>>> Parent process 0 running on tyr.informatik.hs-fulda.de
>>>>>>    I create 3 slave processes.
>>>>>>
>>>>>> Fatal error in MPI_Comm_spawn_multiple: Unknown error class, error
>>>>>> stack:
>>>>>> MPI_Comm_spawn_multiple(162)..: MPI_Comm_spawn_multiple(count=2,
>>>>>> cmds=ffffffff7fffde08, argvs=ffffffff7fffddf8,
>>>>>> maxprocs=ffffffff7fffddf0, infos=ffffffff7fffdde8, root=0,
>>>>>> MPI_COMM_WORLD, intercomm=ffffffff7fffdde4, errors=0) failed
>>>>>> MPIDI_Comm_spawn_multiple(274):
>>>>>> MPID_Comm_accept(153).........:
>>>>>> MPIDI_Comm_accept(1057).......:
>>>>>> MPIR_Bcast_intra(1287)........:
>>>>>> MPIR_Bcast_binomial(310)......: Failure during collective
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> tyr spawn 113 mpiexec -np 1 spawn_intra_comm
>>>>>> Parent process 0: I create 2 slave processes
>>>>>> Fatal error in MPI_Comm_spawn: Unknown error class, error stack:
>>>>>> MPI_Comm_spawn(144)...........:
>>>>>> MPI_Comm_spawn(cmd="spawn_intra_comm",
>>>>>> argv=0, maxprocs=2, MPI_INFO_NULL, root=0, MPI_COMM_WORLD,
>>>>>> intercomm=ffffffff7fffded4, errors=0) failed
>>>>>> MPIDI_Comm_spawn_multiple(274):
>>>>>> MPID_Comm_accept(153).........:
>>>>>> MPIDI_Comm_accept(1057).......:
>>>>>> MPIR_Bcast_intra(1287)........:
>>>>>> MPIR_Bcast_binomial(310)......: Failure during collective
>>>>>> tyr spawn 114
>>>>>>
>>>>>>
>>>>>> I would be grateful if somebody could fix the problem. Thank you very
>>>>>> much in advance for any help. I've attached my programs. Please let
>>>>>> me know if you need anything else.
>>>>>>
>>>>>>
>>>>>> Kind regards
>>>>>>
>>>>>> Siegmar
>>>>>>
>>>>>>
>

-------------- next part --------------
diff --git a/configure.ac b/configure.ac
index 9ac2e21..54cbb9b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2980,6 +2980,11 @@ else
    AC_MSG_RESULT([int or better])
 fi
 
+# Define strict alignment memory access for alignment-sensitive platform (i.e., SPARC)
+if test "$host_cpu" = "sparc" ; then
+    AC_DEFINE(HAVE_STRICT_ALIGNMENT,1,[Define if strict alignment memory access is required])
+fi
+
 # There are further alignment checks after we test for int64_t etc. below.
 
 # Get the size of the C types for encoding in the basic datatypes and for
diff --git a/src/mpid/ch3/channels/nemesis/src/ch3_progress.c b/src/mpid/ch3/channels/nemesis/src/ch3_progress.c
index 245fe61..0ad94a7 100644
--- a/src/mpid/ch3/channels/nemesis/src/ch3_progress.c
+++ b/src/mpid/ch3/channels/nemesis/src/ch3_progress.c
@@ -752,12 +752,19 @@ int MPID_nem_handle_pkt(MPIDI_VC_t *vc, char *buf, MPIDI_msg_sz_t buflen)
                 MPIDI_msg_sz_t len = buflen;
                 MPIDI_CH3_Pkt_t *pkt = (MPIDI_CH3_Pkt_t *)buf;
 
+#ifdef HAVE_STRICT_ALIGNMENT
+                MPIDI_CH3_Pkt_t aligned_pkt_hdr;
+                MPIDI_CH3_Pkt_t *pkt_ptr = pkt;
+
+                MPIDI_CH3_Pkt_get_aligned_ptr(&aligned_pkt_hdr, buf, &pkt);
+#endif
+
                 MPIU_DBG_MSG(CH3_CHANNEL, VERBOSE, "received new message");
 
                 /* invalid pkt data will result in unpredictable behavior */
                 MPIU_Assert(pkt->type >= 0 && pkt->type < MPIDI_CH3_PKT_END_ALL);
 
-                mpi_errno = pktArray[pkt->type](vc, pkt, &len, &rreq);
+                mpi_errno = pktArray[pkt->type](vc, buf, &len, &rreq);
                 if (mpi_errno) MPIR_ERR_POP(mpi_errno);
                 buflen -= len;
                 buf    += len;
diff --git a/src/mpid/ch3/include/mpidimpl.h b/src/mpid/ch3/include/mpidimpl.h
index 51d1420..a5944c6 100644
--- a/src/mpid/ch3/include/mpidimpl.h
+++ b/src/mpid/ch3/include/mpidimpl.h
@@ -1894,4 +1894,35 @@ int MPIDI_CH3I_Progress_deactivate_hook(int id);
 #define MPID_Progress_activate_hook(id_) MPIDI_CH3I_Progress_activate_hook(id_)
 #define MPID_Progress_deactivate_hook(id_) MPIDI_CH3I_Progress_deactivate_hook(id_)
 
+
+#ifdef HAVE_STRICT_ALIGNMENT
+/* The received packet may be stored at unaligned memory address (i.e., eager message
+ * in TCP netmod), direct access is not allowed on alignment-sensitive platforms
+ * (i.e., SPARC). This function checks whether the buffer address is aligned and
+ * returns a pointer to the packet header for safe access.
+ * If the address is not aligned, then copy the packet header to the aligned
+ * temporary buffer and returns the pointer to the aligned buffer; otherwise
+ * returns the pointer to the original buffer. Note that the caller is responsible
+ * for managing the aligned temporary buffer. */
+static inline void MPIDI_CH3_Pkt_get_aligned_ptr(MPIDI_CH3_Pkt_t * aligned_buf,
+                                                 MPIDI_CH3_Pkt_t * buf, MPIDI_CH3_Pkt_t ** pkt_ptr)
+{
+    int align_sz = 8;           /* default aligns everything to 8-byte boundaries */
+#ifdef HAVE_MAX_STRUCT_ALIGNMENT
+    if (align_sz > HAVE_MAX_STRUCT_ALIGNMENT) {
+        align_sz = HAVE_MAX_STRUCT_ALIGNMENT;
+    }
+#endif
+
+    /* check alignment */
+    if (((uintptr_t) buf % align_sz) != 0) {
+        memcpy(aligned_buf, buf, sizeof(MPIDI_CH3_Pkt_t));
+        (*pkt_ptr) = aligned_buf;
+    }
+    else {
+        (*pkt_ptr) = buf;
+    }
+}
+#endif
+
 #endif /* !defined(MPICH_MPIDIMPL_H_INCLUDED) */
diff --git a/src/mpid/ch3/src/ch3u_eager.c b/src/mpid/ch3/src/ch3u_eager.c
index 1a81c5a..a17fce0 100644
--- a/src/mpid/ch3/src/ch3u_eager.c
+++ b/src/mpid/ch3/src/ch3u_eager.c
@@ -291,11 +291,21 @@ int MPIDI_CH3_EagerContigShortSend( MPID_Request **sreq_p,
 int MPIDI_CH3_PktHandler_EagerShortSend( MPIDI_VC_t *vc, MPIDI_CH3_Pkt_t *pkt, 
 					 MPIDI_msg_sz_t *buflen, MPID_Request **rreqp )
 {
-    MPIDI_CH3_Pkt_eagershort_send_t * eagershort_pkt = &pkt->eagershort_send;
+    MPIDI_CH3_Pkt_eagershort_send_t * eagershort_pkt = NULL;
     MPID_Request * rreq;
     int found;
     int mpi_errno = MPI_SUCCESS;
 
+#ifdef HAVE_STRICT_ALIGNMENT
+    MPIDI_CH3_Pkt_t aligned_pkt_hdr;
+    MPIDI_CH3_Pkt_t *pkt_ptr = pkt;
+
+    MPIDI_CH3_Pkt_get_aligned_ptr(&aligned_pkt_hdr, pkt, &pkt_ptr);
+    eagershort_pkt = &pkt_ptr->eagershort_send;
+#else
+    eagershort_pkt = &pkt->eagershort_send;
+#endif
+
     MPID_THREAD_CS_ENTER(POBJ, MPIR_THREAD_POBJ_MSGQ_MUTEX);
 
     /* printf( "Receiving short eager!\n" ); fflush(stdout); */
@@ -607,7 +617,7 @@ int MPIDI_CH3_EagerContigIsend( MPID_Request **sreq_p,
 int MPIDI_CH3_PktHandler_EagerSend( MPIDI_VC_t *vc, MPIDI_CH3_Pkt_t *pkt, 
 				    MPIDI_msg_sz_t *buflen, MPID_Request **rreqp )
 {
-    MPIDI_CH3_Pkt_eager_send_t * eager_pkt = &pkt->eager_send;
+    MPIDI_CH3_Pkt_eager_send_t * eager_pkt = NULL;
     MPID_Request * rreq;
     int found;
     int complete;
@@ -615,6 +625,16 @@ int MPIDI_CH3_PktHandler_EagerSend( MPIDI_VC_t *vc, MPIDI_CH3_Pkt_t *pkt,
     MPIDI_msg_sz_t data_len;
     int mpi_errno = MPI_SUCCESS;
 
+#ifdef HAVE_STRICT_ALIGNMENT
+    MPIDI_CH3_Pkt_t aligned_pkt_hdr;
+    MPIDI_CH3_Pkt_t *pkt_ptr = pkt;
+
+    MPIDI_CH3_Pkt_get_aligned_ptr(&aligned_pkt_hdr, pkt, &pkt_ptr);
+    eager_pkt = &pkt_ptr->eager_send;
+#else
+    eager_pkt = &pkt->eager_send;
+#endif
+
     MPID_THREAD_CS_ENTER(POBJ, MPIR_THREAD_POBJ_MSGQ_MUTEX);
 
     MPIU_DBG_MSG_FMT(CH3_OTHER,VERBOSE,(MPIU_DBG_FDEST,
@@ -697,7 +717,7 @@ int MPIDI_CH3_PktHandler_EagerSend( MPIDI_VC_t *vc, MPIDI_CH3_Pkt_t *pkt,
 int MPIDI_CH3_PktHandler_ReadySend( MPIDI_VC_t *vc, MPIDI_CH3_Pkt_t *pkt,
 				    MPIDI_msg_sz_t *buflen, MPID_Request **rreqp )
 {
-    MPIDI_CH3_Pkt_ready_send_t * ready_pkt = &pkt->ready_send;
+    MPIDI_CH3_Pkt_ready_send_t * ready_pkt = NULL;
     MPID_Request * rreq;
     int found;
     int complete;
@@ -705,6 +725,16 @@ int MPIDI_CH3_PktHandler_ReadySend( MPIDI_VC_t *vc, MPIDI_CH3_Pkt_t *pkt,
     MPIDI_msg_sz_t data_len;
     int mpi_errno = MPI_SUCCESS;
     
+#ifdef HAVE_STRICT_ALIGNMENT
+    MPIDI_CH3_Pkt_t aligned_pkt_hdr;
+    MPIDI_CH3_Pkt_t *pkt_ptr = pkt;
+
+    MPIDI_CH3_Pkt_get_aligned_ptr(&aligned_pkt_hdr, pkt, &pkt_ptr);
+    ready_pkt = &pkt_ptr->ready_send;
+#else
+    ready_pkt = &pkt->ready_send;
+#endif
+
     MPIU_DBG_MSG_FMT(CH3_OTHER,VERBOSE,(MPIU_DBG_FDEST,
 	"received ready send pkt, sreq=0x%08x, rank=%d, tag=%d, context=%d",
 			ready_pkt->sender_req_id, 
-------------- next part --------------
_______________________________________________
discuss mailing list     discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss


More information about the discuss mailing list