[mpich-discuss] discuss Digest, Vol 8, Issue 39
Jeff Hammond
jhammond at alcf.anl.gov
Mon Jun 24 16:14:20 CDT 2013
MPI_Barrier will not sync threads. If N threads call MPI_Barrier, you will get at best the same result as if you call MPI_Barrier N times from the main thread.
If you want to sync threads, you need to sync them with the appropriate thread API. OpenMP and Pthreads both have barrier calls. If you want a fast Pthread barrier, you should not use pthread_barrier though. The Internet has details.
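For concreteness, here is a minimal sketch of the combination: a pthread barrier (the standard API, which as noted above is not the fastest option) syncs the threads within each process, and a single thread per process calls MPI_Barrier to sync across processes. The worker structure and thread count are illustrative, not from the original thread.

#include <mpi.h>
#include <pthread.h>

#define NTHREADS 4

static pthread_barrier_t thread_barrier;

static void *worker(void *arg)
{
    long id = (long)arg;

    /* ... per-thread work ... */

    pthread_barrier_wait(&thread_barrier);   /* sync threads in this process */
    if (id == 0)
        MPI_Barrier(MPI_COMM_WORLD);         /* one thread syncs across processes */
    pthread_barrier_wait(&thread_barrier);   /* release the other threads */
    return NULL;
}

int main(int argc, char **argv)
{
    int provided;
    pthread_t t[NTHREADS];

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    pthread_barrier_init(&thread_barrier, NULL, NTHREADS);

    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);

    pthread_barrier_destroy(&thread_barrier);
    MPI_Finalize();
    return 0;
}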
Jeff
----- Original Message -----
From: "Antonio J. Peña" <apenya at mcs.anl.gov>
To: discuss at mpich.org
Sent: Monday, June 24, 2013 3:49:08 PM
Subject: Re: [mpich-discuss] discuss Digest, Vol 8, Issue 39
Sufeng,
I'd say you're OK syncing your threads by having all of them call MPI_Barrier, as it's a thread-safe function.
Antonio
On Monday, June 24, 2013 11:41:44 AM Sufeng Niu wrote:
Hi, Antonio
Thanks a lot! Now it makes sense. Say I am running a mixed MPI and multithreaded program: if I call MPI_Barrier in each thread,
what will happen? Will the threads be synced by MPI_Barrier, or should I use thread-level sync?
Thank you!
Sufeng
On Sun, Jun 23, 2013 at 6:54 PM, <discuss-request at mpich.org> wrote:
Send discuss mailing list submissions to
discuss at mpich.org
To subscribe or unsubscribe via the World Wide Web, visit
https://lists.mpich.org/mailman/listinfo/discuss
or, via email, send a message with subject or body 'help' to
discuss-request at mpich.org
You can reach the person managing the list at
discuss-owner at mpich.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of discuss digest..."
Today's Topics:
1. Re: Error with MPI_Spawn (Jeff Hammond)
2. Re: discuss Digest, Vol 8, Issue 37 (Antonio J. Peña)
----------------------------------------------------------------------
Message: 1
Date: Sun, 23 Jun 2013 17:03:09 -0500
From: Jeff Hammond <jeff.science at gmail.com>
To: "discuss at mpich.org" <discuss at mpich.org>
Subject: Re: [mpich-discuss] Error with MPI_Spawn
Message-ID: <-1578386667793986954 at unknownmsgid>
Content-Type: text/plain; charset=ISO-8859-1
This is the wrong way to use PETSc, and the wrong way to parallelize a
code with a parallel library in general.
Write to the PETSc users list and they will explain how to
parallelize your code properly with PETSc.
Jeff
Sent from my iPhone
On Jun 23, 2013, at 4:59 PM, Nitel Muhtaroglu <muhtaroglu.n at gmail.com> wrote:
> Hello,
>
> I am trying to integrate the PETSc library into a serial program. The idea is that the serial program creates a linear equation system, then calls the PETSc solver via MPI_Comm_spawn, which solves the system in parallel. But when I execute the spawn, the following error message occurs and the solver is never called. I couldn't find a solution to this error. Does anyone have an idea about it?
>
> Kind Regards,
> --
> Nitel
>
> **********************************************************
> Assertion failed in file socksm.c at line 590: hdr.pkt_type == MPIDI_NEM_TCP_SOCKSM_PKT_ID_INFO || hdr.pkt_type == MPIDI_NEM_TCP_SOCKSM_PKT_TMPVC_INFO
> internal ABORT - process 0
> INTERNAL ERROR: Invalid error class (66) encountered while returning from
> MPI_Init. Please file a bug report.
> Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack:
> (unknown)(): connection failure
> [cli_0]: aborting job:
> Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack:
> (unknown)(): connection failure
> **********************************************************
> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss
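For reference, a hypothetical sketch of the spawn pattern described above; the solver executable name "petsc_solver" and the process count are placeholders, not from the original post, and this is exactly the approach Jeff advises against.

/* Hypothetical sketch: a serial driver spawning a parallel solver binary.
 * The executable name and process count are placeholders. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm solver_comm;

    MPI_Init(&argc, &argv);

    /* ... assemble the linear system serially ... */

    /* Launch 4 copies of the (hypothetical) parallel solver binary. */
    MPI_Comm_spawn("petsc_solver", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                   0, MPI_COMM_SELF, &solver_comm, MPI_ERRCODES_IGNORE);

    /* ... ship the system to the solver over the intercommunicator ... */

    MPI_Finalize();
    return 0;
}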
------------------------------
Message: 2
Date: Sun, 23 Jun 2013 18:54:38 -0500
From: Antonio J. Peña <apenya at mcs.anl.gov>
To: discuss at mpich.org
Subject: Re: [mpich-discuss] discuss Digest, Vol 8, Issue 37
Message-ID: <8872211.IGCl89YM8r at localhost.localdomain>
Content-Type: text/plain; charset="iso-8859-1"
Sufeng,
The correct way is to use the MPI_Init_thread function with
MPI_THREAD_MULTIPLE. This tells the MPI implementation to be thread-safe.
It supports both OpenMP and POSIX Threads (OpenMP primitives on most
systems are likely implemented on top of Pthreads).
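A minimal sketch of that initialization; the check on the provided level is illustrative, not from the original message:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not available (got %d)\n", provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* ... threads created here (Pthreads or OpenMP) may now make MPI calls ... */

    MPI_Finalize();
    return 0;
}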
Antonio
On Sunday, June 23, 2013 11:13:31 AM Sufeng Niu wrote:
Hi, Antonio
Thanks a lot for your reply. I just figured out that it was a firewall issue; after I
configured the firewall, it works now. Thanks again.
But I still have a few questions on mixed MPI and multithreaded programming.
Currently, I try to run one process on each server, with each process
using a thread pool (pthread library) to run multiple threads. I am not sure
whether this is the correct way or not. I wrote it as:
MPI_Init();
...
/* create and initialize the thread pool */
...
/* fetch jobs into the thread pool */
...
MPI_Finalize();
When I checked books and notes, I found that people use
MPI_Init_thread() with MPI_THREAD_MULTIPLE,
but some docs said it supports OpenMP. Is it possible to use it
with the pthread library?
I am new to this hybrid programming and am not sure of the proper
way to do it. Any suggestions are appreciated. Thank you!
Sufeng
On Sat, Jun 22, 2013 at 12:12 PM, <discuss-request at mpich.org> wrote:
Message: 1
From: Sufeng Niu <sniu at hawk.iit.edu>
To: discuss at mpich.org
Message-ID: <CAFNNHkwpqdGfZXctL0Uz3hpeL25mZZMtB93qGXjc_+tjnV4csA at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
Hi,
Sorry to bother you guys with this question. Last time I re-installed the OS on
all blades to keep them on the same version. After I mounted and set up keyless ssh,
the terminal gave the error below:
[proxy:0:1 at iocfccd3.aps.anl.gov] HYDU_sock_connect (./utils/sock/sock.c:174): unable to connect from "iocfccd3.aps.anl.gov" to "iocfccd1.aps.anl.gov" (No route to host)
[proxy:0:1 at iocfccd3.aps.anl.gov] main (./pm/pmiserv/pmip.c:189): unable to connect to server iocfccd1.aps.anl.gov at port 38242 (check for firewalls!)
I can ssh from iocfccd1 to iocfccd3 without a password. Should I shut down all
firewalls on each server? I cannot find out where the problem is. Thank you
--
Best Regards,
Sufeng Niu
ECASP lab, ECE department, Illinois Institute of Technology
Tel: 312-731-7219
------------------------------
Message: 2
Date: Fri, 21 Jun 2013 10:58:26 -0500
From: Antonio J. Peña <apenya at mcs.anl.gov>
To: discuss at mpich.org
... "iocfccd1.aps.anl.gov" from iocfccd3?
Antonio
------------------------------
Message: 3Date: Sat, 22 Jun 2013 12:49:10 -0400From: Jiri Simsa
< jsimsa at cs.cmu.edu [15]>To: discuss at mpich.org [2]
CAHs9ut-_6W6SOHTJ_rD+shQ76bo4cTCuFVAy1f9x-
J0gioakHg at mail.gmail.com [16]>Content-Type: text/plain;
charset="iso-8859-1"
Hi,
-------------- next part --------------
An HTML attachment was scrubbed...
URL: < http://lists.mpich.org/pipermail/discuss/attachments/20130623/a6104115/attachment.html >
------------------------------
_______________________________________________
discuss mailing list
discuss at mpich.org
https://lists.mpich.org/mailman/listinfo/discuss
End of discuss Digest, Vol 8, Issue 39
**************************************
--
Best Regards,
Sufeng Niu
ECASP lab, ECE department, Illinois Institute of Technology
Tel: 312-731-7219
--
Antonio J. Peña
Postdoctoral Appointee
Mathematics and Computer Science Division
Argonne National Laboratory
9700 South Cass Avenue, Bldg. 240, Of. 3148
Argonne, IL 60439-4847
(+1) 630-252-7928
apenya at mcs.anl.gov
_______________________________________________
discuss mailing list discuss at mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss
More information about the discuss mailing list