[mpich-devel] Issue with MPI_Error_string() for user-defined codes/classes

Lisandro Dalcin dalcinl at gmail.com
Tue Apr 22 04:04:11 CDT 2014


The MPI-3 standard says (p. 354, lines 39-40):

"""
If MPI_ERROR_STRING is called when no string has been set, it will
return a empty string (all spaces in Fortran, "" in C).
"""

The following simple test segfaults. A quick fix would be to return
an empty string from MPIR_Err_get_dynerr_string() (file dynerrutil.c)
when no string has been set, e.g.:

diff --git a/src/mpi/errhan/dynerrutil.c b/src/mpi/errhan/dynerrutil.c
index 943e8c3..fb40469 100644
--- a/src/mpi/errhan/dynerrutil.c
+++ b/src/mpi/errhan/dynerrutil.c
@@ -297,11 +297,13 @@ const char *MPIR_Err_get_dynerr_string( int code )
     if (errcode) {
  if (errcode < first_free_code) {
     errstr = user_code_msgs[errcode];
+    if (!errstr) errstr = "";
  }
     }
     else {
  if (errclass < first_free_class) {
     errstr = user_class_msgs[errclass];
+    if (!errstr) errstr = "";
  }
     }
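
To show the idea outside of MPICH, here is a minimal standalone sketch of
the same guard. It is not the MPICH code: the table msgs, the counter
first_free and the helpers lookup_msg() and copy_msg() are hypothetical
names made up for this illustration, and the table contents are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N_SLOTS 4  /* hypothetical table size */

/* Hypothetical stand-in for the user message table: slot 1 has been
   allocated but no string was ever set for it, so it is still NULL. */
static const char *msgs[N_SLOTS] = { "class zero", NULL, "class two", NULL };
static const int first_free = 3;  /* slots 0..2 count as allocated */

/* Returns the message for an allocated slot, "" if the slot has no
   string set, and NULL if the slot was never allocated. */
static const char *lookup_msg(int idx)
{
    const char *s = NULL;
    if (idx >= 0 && idx < first_free) {
        s = msgs[idx];
        if (!s) s = "";  /* same guard as in the patch above */
    }
    return s;
}

/* Copies the message for idx into dst (capacity n) and returns its
   length, or -1 if idx was never allocated. */
static int copy_msg(int idx, char *dst, size_t n)
{
    const char *s = lookup_msg(idx);
    assert(dst != NULL && n > 0);
    if (!s) return -1;
    strncpy(dst, s, n - 1);  /* s is never NULL here */
    dst[n - 1] = '\0';
    return (int)strlen(dst);
}
```

With the guard in place, copy_msg(1, buf, sizeof buf) leaves buf empty and
returns 0. Without it, the copy would read through a NULL pointer, which
is the kind of invalid read valgrind reports for the test program.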


[dalcinl at kw2060 ~]$ cat error_string2.c
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[])
{
  int errorclass;
  char errorstring[MPI_MAX_ERROR_STRING] = {64,0};
  int slen;
  MPI_Init(&argc, &argv);
  MPI_Add_error_class(&errorclass);
  MPI_Error_string(errorclass, errorstring, &slen);
  printf("errorclass:%d errorstring:'%s' len:%d\n",
         errorclass, errorstring, slen);
  MPI_Finalize();
  return 0;
}

[dalcinl at kw2060 ~]$ mpicc error_string2.c
[dalcinl at kw2060 ~]$ ./a.out
Segmentation fault (core dumped)

[dalcinl at kw2060 ~]$ valgrind -q ./a.out
==14176== Invalid read of size 1
==14176==    at 0x4C6F5E7: MPIU_Strncpy (safestr.c:65)
==14176==    by 0x4C61864: MPIR_Err_get_string (errutil.c:601)
==14176==    by 0x4DA4DB4: PMPI_Error_string (error_string.c:80)
==14176==    by 0x400888: main (in /home/dalcinl/Devel/BUGS-MPI/~/a.out)
==14176==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==14176==
==14176==
==14176== Process terminating with default action of signal 11 (SIGSEGV)
==14176==  Access not within mapped region at address 0x0
==14176==    at 0x4C6F5E7: MPIU_Strncpy (safestr.c:65)
==14176==    by 0x4C61864: MPIR_Err_get_string (errutil.c:601)
==14176==    by 0x4DA4DB4: PMPI_Error_string (error_string.c:80)
==14176==    by 0x400888: main (in /home/dalcinl/Devel/BUGS-MPI/~/a.out)
==14176==  If you believe this happened as a result of a stack
==14176==  overflow in your program's main thread (unlikely but
==14176==  possible), you can try to increase the size of the
==14176==  main thread stack using the --main-stacksize= flag.
==14176==  The main thread stack size used in this run was 8720384.
Segmentation fault (core dumped)



-- 
Lisandro Dalcin
---------------
CIMEC (UNL/CONICET)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1016)
Tel/Fax: +54-342-4511169

