[mpich-discuss] Error creating 272 processes on a multicore, single CPU

Gajbe, Manisha manisha.gajbe at intel.com
Mon Aug 1 11:24:01 CDT 2016


Hi Kenneth,

Here is the output of "mpichversion" and "mpiexec -info":

[mmgajbe at fm05wcon0025 bin]$ pwd
/usr/local/my-mpich-3.2/64bit/bin

[mmgajbe at fm05wcon0025 bin]$ ./mpichversion
MPICH Version:          3.2
MPICH Release date:     Wed Nov 11 22:06:48 CST 2015
MPICH Device:           ch3:nemesis
MPICH configure:        --prefix=/usr/local/my-mpich-3.2/64bit
MPICH CC:       icc    -O2
MPICH CXX:      icpc   -O2
MPICH F77:      ifort   -O2
MPICH FC:       ifort   -O2


[mmgajbe at fm05wcon0025 bin]$ mpiexec -info
HYDRA build details:
    Version:                                 3.2
    Release Date:                            Wed Nov 11 22:06:48 CST 2015
    CC:                              icc
    CXX:                             icpc
    F77:                             ifort
    F90:                             ifort
    Configure options:                       '--disable-option-checking' '--prefix=/usr/local/my-mpich-3.2/64bit' '--cache-file=/dev/null' '--srcdir=.' 'CC=icc' 'CFLAGS= -O2' 'LDFLAGS=' 'LIBS=-lpthread ' 'CPPFLAGS= -I/home/mmgajbe/Work/Software/mpich-3.2/src/mpl/include -I/home/mmgajbe/Work/Software/mpich-3.2/src/mpl/include -I/home/mmgajbe/Work/Software/mpich-3.2/src/openpa/src -I/home/mmgajbe/Work/Software/mpich-3.2/src/openpa/src -D_REENTRANT -I/home/mmgajbe/Work/Software/mpich-3.2/src/mpi/romio/include'
    Process Manager:                         pmi
    Launchers available:                     ssh rsh fork slurm ll lsf sge manual persist
    Topology libraries available:            hwloc
    Resource management kernels available:   user slurm ll lsf sge pbs cobalt
    Checkpointing libraries available:
    Demux engines available:                 poll select


Thanks,
~ Manisha

-----Original Message-----
From: discuss-request at mpich.org [mailto:discuss-request at mpich.org] 
Sent: Monday, August 1, 2016 7:36 AM
To: discuss at mpich.org
Subject: discuss Digest, Vol 46, Issue 1

Send discuss mailing list submissions to
	discuss at mpich.org

To subscribe or unsubscribe via the World Wide Web, visit
	https://lists.mpich.org/mailman/listinfo/discuss
or, via email, send a message with subject or body 'help' to
	discuss-request at mpich.org

You can reach the person managing the list at
	discuss-owner at mpich.org

When replying, please edit your Subject line so it is more specific than "Re: Contents of discuss digest..."


Today's Topics:

   1. Re:  Error creating 272 processes on a multicore, single CPU
      (Kenneth Raffenetti)
   2.  Broken links on the web site (Michele De Stefano)
   3. Re:  Broken links on the web site (Halim Amer)
   4. Re:  Minor compilation problem with 3 mpich (Edric Ellis)
   5. Re:  Broken links on the web site (Kenneth Raffenetti)


----------------------------------------------------------------------

Message: 1
Date: Thu, 28 Jul 2016 10:51:51 -0500
From: Kenneth Raffenetti <raffenet at mcs.anl.gov>
To: <discuss at mpich.org>
Subject: Re: [mpich-discuss] Error creating 272 processes on a
	multicore, single CPU
Message-ID: <e5ffbe8a-8c20-c022-3b42-0bc384e5b349 at mcs.anl.gov>
Content-Type: text/plain; charset="utf-8"; format=flowed

Can you also send the output of:
   /usr/local/my-mpich-3.2/64bit/bin/mpichversion
   /usr/local/my-mpich-3.2/64bit/bin/mpiexec -info

Something in your error output doesn't look right to me. I'm having a hard time finding the code that would execute a PMI command that causes the string buffer to overflow and omit a newline.

Ken

On 07/26/2016 11:25 AM, Gajbe, Manisha wrote:
> Hi Kenneth,
>
> The configure script is default except the "prefix". I used Intel compilers version 16.0.2.
>
> $ ./configure --prefix=/usr/local/my-mpich-3.2/64bit
>
> Hi Halim,
>
> I removed all the writes to stdout; the output did not change. Also, the code runs with Intel MPI 5.1.3 without any issues, even with lots of writes to stdout. Do you have any suggestions on what a reasonable pipe size limit would be?
>
> Below is output from my ulimit
>
> [mmgajbe at fm05wcon0025 Test]$ ulimit -a
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 62912
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 2048
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 4096
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>
>
> Hi Husen,
>
> I tried on multiple systems with different OSes, such as RHEL 7.0, openSUSE, and Ubuntu 12.04. However, the error is observed only with MPICH and not with Intel MPI.
>
>
> ~ Manisha
> -----Original Message-----
> From: discuss-request at mpich.org [mailto:discuss-request at mpich.org]
> Sent: Tuesday, July 26, 2016 9:00 AM
> To: discuss at mpich.org
> Subject: discuss Digest, Vol 45, Issue 7
>
> Today's Topics:
>
>    1. Re:  Error creating 272 processes on a multicore, single CPU
>       (Husen R)
>    2.  Segfault with MPICH 3.2+Clang but not GCC (Andreas Noack)
>    3. Re:  Segfault with MPICH 3.2+Clang but not GCC (Jeff Hammond)
>    4. Re:  Segfault with MPICH 3.2+Clang but not GCC (Rob Latham)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 26 Jul 2016 10:58:01 +0700
> From: Husen R <hus3nr at gmail.com>
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] Error creating 272 processes on a
> 	multicore, single CPU
> Message-ID:
> 	<CACPfdUsN7N5TwU++tsoPiUGA1kiS3Cm-VGBaXotDfkKcquhiag at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> It seems the number of processes that can be created on one machine is limited by the operating system.
>
> On Sat, Jul 23, 2016 at 11:19 AM, Gajbe, Manisha 
> <manisha.gajbe at intel.com>
> wrote:
>
>> Hi,
>>
>> I have installed mpich-3.2 on a multicore platform. When I spawn 272
>> processes, I get the error message shown below. I am able to
>> create up to 271 processes successfully.
>>
>> /usr/local/my-mpich-3.2/64bit/bin/mpirun -n 272 ./hello_c
>>
>> [cli_0]: write_line: message string doesn't end in newline: :cmd=put
>> kvsname=kvs_8846_0 key=r2h1
>> value=r0#0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,2
>> 3,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,4
>> 6,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,6
>> 9,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,9
>> 2,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,11
>> 1,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128
>> ,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,
>> 146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,1
>> 63,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,18
>> 0,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197
>> ,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,
>> 215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,2
>> 32,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,24
>> 9,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266
>> ,267,268,269,270,271$
 :
>>
>> *Manisha Gajbe*
>>
>> *MVE PQV Content*
>>
>> SCE - System Content Engineering
>>
>>
>>
>> _______________________________________________
>> discuss mailing list     discuss at mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>>
>
>
>
> --
> Post Graduate Student
> Faculty of Computer Science
> University of Indonesia
> Depok
>
> ------------------------------
>
> Message: 2
> Date: Tue, 26 Jul 2016 11:17:05 -0400
> From: Andreas Noack <andreasnoackjensen at gmail.com>
> To: discuss at mpich.org
> Subject: [mpich-discuss] Segfault with MPICH 3.2+Clang but not GCC
> Message-ID:
> 	<CAFKYB6mgy8bO7Mw05it4w78XDRoh70KxUu1epkZfv34=KqL=yQ at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> On my El Capitan macbook I get a segfault when running the program below with more than a single process but only when MPICH has been compiled with Clang.
>
> I don't get that good debug info but here is some of what I got
>
> (lldb) c
> Process 61129 resuming
> Process 61129 stopped
> * thread #1: tid = 0x32c438, 0x00000003119d0432 libpmpi.12.dylib`MPID_Request_create + 244, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
>     frame #0: 0x00000003119d0432 libpmpi.12.dylib`MPID_Request_create + 244
> libpmpi.12.dylib`MPID_Request_create:
> ->  0x3119d0432 <+244>: movaps %xmm0, 0x230(%rax)
>     0x3119d0439 <+251>: movq   $0x0, 0x240(%rax)
>     0x3119d0444 <+262>: movl   %ecx, 0x210(%rax)
>     0x3119d044a <+268>: popq   %rbp
>
> My version of Clang is
>
> Apple LLVM version 7.3.0 (clang-703.0.31)
> Target: x86_64-apple-darwin15.6.0
> Thread model: posix
> InstalledDir: /Library/Developer/CommandLineTools/usr/bin
>
> and the bug has been confirmed by my colleague who is running Linux and compiling with Clang 3.8. The program runs fine with OpenMPI+Clang.
>
> #include <mpi.h>
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(int argc, char *argv[])
> {
>     MPI_Init(&argc, &argv);
>
>     MPI_Comm comm = MPI_COMM_WORLD;
>     uint64_t *A, *C;
>     int rnk;
>
>     MPI_Comm_rank(comm, &rnk);
>     A = calloc(1, sizeof(uint64_t));
>     C = calloc(2, sizeof(uint64_t));
>     A[0] = rnk + 1;
>
>     MPI_Allgather(A, 1, MPI_UINT64_T, C, 1, MPI_UINT64_T, comm);
>
>     MPI_Finalize();
>     return 0;
> }
>
>
> Best regards
>
> Andreas Noack
> Postdoctoral Associate
> Computer Science and Artificial Intelligence Laboratory Massachusetts 
> Institute of Technology
>
> ------------------------------
>
> Message: 3
> Date: Tue, 26 Jul 2016 08:56:03 -0700
> From: Jeff Hammond <jeff.science at gmail.com>
> To: MPICH <discuss at mpich.org>
> Subject: Re: [mpich-discuss] Segfault with MPICH 3.2+Clang but not GCC
> Message-ID:
> 	<CAGKz=uL+3P=d7W1TP9R9Teafip44aMnnk9c6nzuL_yDF_u+bsw at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> I cannot reproduce this.  I am using Darwin 15.5.0 instead of 15.6.0, but the compiler is identical.  I am using MPICH Git master from June 29.
>
> At this point, it is unclear to me if the bug is in MPICH or Clang.
>
> Jeff
>
> vsanthan-mobl1:BUGS jrhammon$ /opt/mpich/dev/clang/default/bin/mpichversion
> MPICH Version:    3.2
> MPICH Release date: unreleased development copy
> MPICH Device:    ch3:nemesis
> MPICH configure: CC=clang CXX=clang++ FC=false F77=false --enable-cxx --disable-fortran --with-pm=hydra --prefix=/opt/mpich/dev/clang/default --enable-cxx --enable-wrapper-rpath --disable-static --enable-shared
> MPICH CC: clang    -O2
> MPICH CXX: clang++   -O2
> MPICH F77: false
> MPICH FC: false
>
> vsanthan-mobl1:BUGS jrhammon$ /opt/mpich/dev/clang/default/bin/mpicc -v
> mpicc for MPICH version 3.2
> Apple LLVM version 7.3.0 (clang-703.0.31)
> Target: x86_64-apple-darwin15.5.0
> Thread model: posix
> InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
> clang: warning: argument unused during compilation: '-I /opt/mpich/dev/clang/default/include'
>
> --
> Jeff Hammond
> jeff.science at gmail.com
> http://jeffhammond.github.io/
>
> ------------------------------
>
> Message: 4
> Date: Tue, 26 Jul 2016 11:00:00 -0500
> From: Rob Latham <robl at mcs.anl.gov>
> To: <discuss at mpich.org>
> Subject: Re: [mpich-discuss] Segfault with MPICH 3.2+Clang but not GCC
> Message-ID: <57978900.8030207 at mcs.anl.gov>
> Content-Type: text/plain; charset="windows-1252"; format=flowed
>
>
>
> On 07/26/2016 10:17 AM, Andreas Noack wrote:
>> On my El Capitan macbook I get a segfault when running the program 
>> below with more than a single process but only when MPICH has been 
>> compiled with Clang.
>>
>> I don't get that good debug info but here is some of what I got
>
>
> valgrind is pretty good at sussing out these sorts of things:
>
> ==18132== Unaddressable byte(s) found during client check request
> ==18132==    at 0x504D1D7: MPIR_Localcopy (helper_fns.c:84)
> ==18132==    by 0x4EC8EA1: MPIR_Allgather_intra (allgather.c:169)
> ==18132==    by 0x4ECA5EC: MPIR_Allgather (allgather.c:791)
> ==18132==    by 0x4ECA7A4: MPIR_Allgather_impl (allgather.c:832)
> ==18132==    by 0x4EC8B5C: MPID_Allgather (mpid_coll.h:61)
> ==18132==    by 0x4ECB9F7: PMPI_Allgather (allgather.c:978)
> ==18132==    by 0x4008F5: main (noack_segv.c:18)
> ==18132==  Address 0x6f2f138 is 8 bytes after a block of size 16 alloc'd
> ==18132==    at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==18132==    by 0x4008B0: main (noack_segv.c:15)
> ==18132==
> ==18132== Invalid write of size 8
> ==18132==    at 0x4C326CB: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==18132==    by 0x504D31B: MPIR_Localcopy (helper_fns.c:84)
> ==18132==    by 0x4EC8EA1: MPIR_Allgather_intra (allgather.c:169)
> ==18132==    by 0x4ECA5EC: MPIR_Allgather (allgather.c:791)
> ==18132==    by 0x4ECA7A4: MPIR_Allgather_impl (allgather.c:832)
> ==18132==    by 0x4EC8B5C: MPID_Allgather (mpid_coll.h:61)
> ==18132==    by 0x4ECB9F7: PMPI_Allgather (allgather.c:978)
> ==18132==    by 0x4008F5: main (noack_segv.c:18)
> ==18132==  Address 0x6f2f138 is 8 bytes after a block of size 16 alloc'd
> ==18132==    at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==18132==    by 0x4008B0: main (noack_segv.c:15)
>
>
>>
>>      MPI_Comm_rank(comm, &rnk);
>>      A = calloc(1, sizeof(uint64_t));
>>      C = calloc(2, sizeof(uint64_t));
>>      A[0] = rnk + 1;
>>
>>      MPI_Allgather(A, 1, MPI_UINT64_T, C, 1, MPI_UINT64_T, comm);
>
> Your 'buf count tuple' is ok for A: every process sends one uint64.
>
> Your 'buf count tuple' is too small for C if there are any more than 2 processes: C holds only 2 elements, but MPI_Allgather writes one element per process.
>
> When you say "more than one"... do you mean 2?
>
> ==rob
>
>
> ------------------------------
>
> End of discuss Digest, Vol 45, Issue 7
> **************************************
>

------------------------------

Message: 2
Date: Sun, 31 Jul 2016 08:20:05 +0200
From: Michele De Stefano <micdestefano at gmail.com>
To: discuss at mpich.org
Subject: [mpich-discuss] Broken links on the web site
Message-ID:
	<CAJ7vBf03M2N7+fdNMBZaL+-2ux+FkUKSLHCXjb-PkSzsK2XERQ at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Dear MPICH staff,

Over the past few days I have been trying to download MPICH 3.2 and have found it impossible. The links simply keep redirecting to the download page, but no download begins.

I've also tried to check the Installer's Guide and other guides, but those links are broken too (error 404).

Can you please verify that your web site is working properly?

I suspect several links on other pages are broken as well.

Thanks.
Best regards.

--
Michele De Stefano
Linked In <http://it.linkedin.com/in/micheledestefano>
mds-utils: a general purpose Open Source library <http://sourceforge.net/projects/mds-utils/>
Personal Web Site <http://www.micheledestefano.altervista.org>

------------------------------

Message: 3
Date: Sun, 31 Jul 2016 08:16:20 -0500
From: Halim Amer <aamer at anl.gov>
To: <discuss at mpich.org>
Subject: Re: [mpich-discuss] Broken links on the web site
Message-ID: <4ae20041-f87f-5bbc-1e94-b3cbe1f8d4e9 at anl.gov>
Content-Type: text/plain; charset="windows-1252"; format=flowed

Thank you for reporting the problem. We are aware of the issue and are working on fixing it ASAP.

--Halim
www.mcs.anl.gov/~aamer



------------------------------

Message: 4
Date: Mon, 1 Aug 2016 12:16:29 +0000
From: Edric Ellis <Edric.Ellis at mathworks.co.uk>
To: "discuss at mpich.org" <discuss at mpich.org>
Subject: Re: [mpich-discuss] Minor compilation problem with 3 mpich
Message-ID:
	<306c1630f6074534b27f9b0b6eab256e at ex13emea-00-uk.ad.mathworks.com>
Content-Type: text/plain; charset="us-ascii"

Here's a hopefully-comprehensible diff for what I changed:

--- REPOSITORY/mpich/src/mpi/comm/comm_agree.c	2015-08-24 16:09:33.000000000 -0400
+++ SANDBOX/mpich/src/mpi/comm/comm_agree.c	2015-08-24 16:09:33.000000000 -0400
@@ -25,7 +25,6 @@
 #ifndef MPICH_MPI_FROM_PMPI
 #undef MPIX_Comm_agree
 #define MPIX_Comm_agree PMPIX_Comm_agree
-#endif
 
 #undef FUNCNAME
 #define FUNCNAME MPIR_Comm_agree
@@ -106,6 +105,8 @@
     goto fn_exit;
 }
 
+#endif /* !defined(MPICH_MPI_FROM_PMPI) */
+
 #undef FUNCNAME
 #define FUNCNAME MPIX_Comm_agree
 #undef FCNAME
--- REPOSITORY/mpich/src/mpi/comm/comm_shrink.c	2015-08-24 16:09:33.000000000 -0400
+++ SANDBOX/mpich/src/mpi/comm/comm_shrink.c	2015-08-24 16:09:33.000000000 -0400
@@ -39,7 +39,6 @@
 #ifndef MPICH_MPI_FROM_PMPI
 #undef MPIX_Comm_shrink
 #define MPIX_Comm_shrink PMPIX_Comm_shrink
-#endif
 
 #undef FUNCNAME
 #define FUNCNAME MPIR_Comm_shrink
@@ -93,6 +92,8 @@
     goto fn_exit;
 }
 
+#endif /* !defined(MPICH_MPI_FROM_PMPI) */
+
 #undef FUNCNAME
 #define FUNCNAME MPIX_Comm_shrink
 #undef FCNAME
--- REPOSITORY/mpich/src/glue/romio/all_romio_symbols.c	2015-08-24 16:09:33.000000000 -0400
+++ SANDBOX/mpich/src/glue/romio/all_romio_symbols.c	2015-08-24 16:09:33.000000000 -0400
@@ -36,6 +36,9 @@
 #include "mpi.h"
 
 void MPIR_All_romio_symbols(void);
+
+#ifndef MPICH_MPI_FROM_PMPI
+
 void MPIR_All_romio_symbols(void)
 {
 #ifdef MPI_MODE_RDONLY
@@ -525,3 +528,4 @@
     }
 #endif /* MPI_MODE_RDONLY */
 }
+#endif /* !defined(MPICH_MPI_FROM_PMPI) */

Note that the problem doesn't show up when building MPICH in any normal manner - what I'm doing is unpacking the static libraries libmpi.a and libpmpi.a on Mac and re-packaging all the object files into a single .dylib file - and that fails because of the duplicate symbols.

Cheers,
Edric.

-----Original Message-----
Message: 5
Date: Thu, 28 Jul 2016 10:48:23 -0500
From: Kenneth Raffenetti <raffenet at mcs.anl.gov>
To: <discuss at mpich.org>
Subject: Re: [mpich-discuss] Minor compilation problem with 3 mpich
	source files
Message-ID: <47b96cec-9753-01d1-f1bf-a7d4571b4bb2 at mcs.anl.gov>
Content-Type: text/plain; charset="utf-8"; format=flowed

Thanks, I believe I've seen this issue before but our nightly builds don't currently show any errors. Might be specific to a compiler/linker version.

Do you have a patch you could share to show your fix? If not, I will try to reimplement the solution.

Ken

On 07/28/2016 09:36 AM, Edric Ellis wrote:
> Hi there,
>
> I found three source files:
>
> src/mpi/comm/comm_agree.c
> src/mpi/comm/comm_shrink.c
> src/glue/romio/all_romio_symbols.c
>
> where the MPIR functions were ending up in both the "mpi" and "pmpi"
> object files (we need to repackage the libraries in a slightly odd way
> on Mac, and this was causing that to fail). I fixed this by moving (or
> adding) the "#ifndef MPICH_MPI_FROM_PMPI" guards to ensure the MPIR
> symbols didn't end up in the "mpi" object files. This problem appears
> to still be present in the latest version 3.2 (I'm actually currently
> building 3.1.4).


------------------------------

Message: 5
Date: Mon, 1 Aug 2016 09:35:50 -0500
From: Kenneth Raffenetti <raffenet at mcs.anl.gov>
To: <discuss at mpich.org>
Subject: Re: [mpich-discuss] Broken links on the web site
Message-ID: <5b895fc7-aa50-1c77-56cc-95893d83c7b4 at mcs.anl.gov>
Content-Type: text/plain; charset="utf-8"; format=flowed

All links should now be restored.



------------------------------

End of discuss Digest, Vol 46, Issue 1
**************************************