[mpich-discuss] Buffer corruption due to an excessive number of messages

Joachim Jenke jenke at itc.rwth-aachen.de
Thu Sep 14 15:10:16 CDT 2023


Hi Kurt,

Just a thought: are you running single-threaded or multi-threaded?

If you run multi-threaded, you should look into MPI_Improbe/MPI_Mrecv
to make sure that you really receive the message you probed for.
Even in single-threaded execution, it may be worth trying these
functions to see whether they fix your issue.

Best
Joachim

On 14.09.23 at 22:02, Mccall, Kurt E. (MSFC-EV41) via discuss wrote:
> It seems that when I send a process too many non-blocking messages (with
> MPI_Isend), MPI_Iprobe/MPI_Recv sometimes returns a buffer with corrupted
> data for some of the messages. Usually the corrupted data objects are at
> the end of the array that was sent. I checked the buffers passed to
> MPI_Isend, and they are uncorrupted.
> 
>  1. Is there a way to detect this kind of overload with an MPI call?
>  2. Is there an upper bound on the number of messages that can be “in
>     flight”?
>  3. Is there an upper bound on message length?
> 
> Thanks,
> 
> Kurt

-- 
Dr. rer. nat. Joachim Jenke

IT Center
Group: High Performance Computing
Division: Computational Science and Engineering
RWTH Aachen University
Seffenter Weg 23
D 52074  Aachen (Germany)
Tel: +49 241 80- 24765
Fax: +49 241 80-624765
jenke at itc.rwth-aachen.de
www.itc.rwth-aachen.de
