[mpich-discuss] discuss Digest, Vol 8, Issue 39

Jeff Hammond jhammond at alcf.anl.gov
Mon Jun 24 16:14:20 CDT 2013


MPI_Barrier will not sync threads; it synchronizes the MPI processes in a communicator.  If N threads call MPI_Barrier, you will get at best the same result as if you had called MPI_Barrier N times from the main thread.

If you want to sync threads, you need to sync them with the appropriate thread API.  OpenMP and Pthreads both have barrier calls.  If you want a fast Pthread barrier, though, you should not use pthread_barrier.  The Internet has details.

Jeff

----- Original Message -----
From: "Antonio J. Peña" <apenya at mcs.anl.gov>
To: discuss at mpich.org
Sent: Monday, June 24, 2013 3:49:08 PM
Subject: Re: [mpich-discuss] discuss Digest, Vol 8, Issue 39






Sufeng, 



I'd say you're OK syncing your threads by all of them calling MPI_Barrier, as it's a thread-safe function. 



Antonio 





On Monday, June 24, 2013 11:41:44 AM Sufeng Niu wrote: 


Hi, Antonio 



Thanks a lot! Now it makes sense. Let's say I am running a program that uses both MPI and multiple threads. If I call MPI_Barrier in each thread, what will happen? Will the threads be synced by MPI_Barrier, or should I use thread-level sync? 



Thank you! 


Sufeng 






On Sun, Jun 23, 2013 at 6:54 PM, < discuss-request at mpich.org > wrote: 


Send discuss mailing list submissions to 
discuss at mpich.org 

To subscribe or unsubscribe via the World Wide Web, visit 
https://lists.mpich.org/mailman/listinfo/discuss 
or, via email, send a message with subject or body 'help' to 
discuss-request at mpich.org 

You can reach the person managing the list at 
discuss-owner at mpich.org 

When replying, please edit your Subject line so it is more specific 
than "Re: Contents of discuss digest..." 


Today's Topics: 

1. Re: Error with MPI_Spawn (Jeff Hammond) 
2. Re: discuss Digest, Vol 8, Issue 37 (Antonio J. Peña) 


---------------------------------------------------------------------- 

Message: 1 
Date: Sun, 23 Jun 2013 17:03:09 -0500 
From: Jeff Hammond < jeff.science at gmail.com > 
To: " discuss at mpich.org " < discuss at mpich.org > 
Subject: Re: [mpich-discuss] Error with MPI_Spawn 
Message-ID: <-1578386667793986954 at unknownmsgid> 
Content-Type: text/plain; charset=ISO-8859-1 

This is the wrong way to use PETSc and to parallelize a code with a 
parallel library in general. 

Write the PETSc user list and they will explain to you how to 
parallelize your code properly with PETSc. 

Jeff 

Sent from my iPhone 

On Jun 23, 2013, at 4:59 PM, Nitel Muhtaroglu < muhtaroglu.n at gmail.com > wrote: 

> Hello, 
> 
> I am trying to integrate the PETSc library into a serial program. The idea is that the serial program creates a linear equation system, then calls the PETSc solver via MPI_Spawn, which solves the system in parallel. But when I execute MPI_Spawn, the following error message occurs and the solver is not called. I couldn't find a solution to this error. Does anyone have an idea about it? 
> 
> Kind Regards, 
> -- 
> Nitel 
> 
> ********************************************************** 
> Assertion failed in file socksm.c at line 590: hdr.pkt_type == MPIDI_NEM_TCP_SOCKSM_PKT_ID_INFO || hdr.pkt_type == MPIDI_NEM_TCP_SOCKSM_PKT_TMPVC_INFO 
> internal ABORT - process 0 
> INTERNAL ERROR: Invalid error class (66) encountered while returning from 
> MPI_Init. Please file a bug report. 
> Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: 
> (unknown)(): connection failure 
> [cli_0]: aborting job: 
> Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: 
> (unknown)(): connection failure 
> ********************************************************** 
> _______________________________________________ 
> discuss mailing list discuss at mpich.org 
> To manage subscription options or unsubscribe: 
> https://lists.mpich.org/mailman/listinfo/discuss 


------------------------------ 

Message: 2 
Date: Sun, 23 Jun 2013 18:54:38 -0500 
From: Antonio J. Peña < apenya at mcs.anl.gov > 
To: discuss at mpich.org 
Subject: Re: [mpich-discuss] discuss Digest, Vol 8, Issue 37 
Message-ID: <8872211.IGCl89YM8r at localhost.localdomain> 
Content-Type: text/plain; charset="iso-8859-1" 


Sufeng, 

The correct way is to use the MPI_Init_thread function with 
MPI_THREAD_MULTIPLE. This asks the MPI implementation to be thread 
safe; check the level it actually returns in the provided argument. 
It works with both OpenMP and POSIX threads (OpenMP primitives in most 
systems are likely to be implemented on top of Pthreads). 

Antonio 


On Sunday, June 23, 2013 11:13:31 AM Sufeng Niu wrote: 


Hi, Antonio 


Thanks a lot for your reply. I just figured out that it was a firewall issue; after I 
configured the firewall, it works now. Thanks again. 


But I still have a few questions about mixing MPI and multithreaded programming. 
Currently, I run one process on each server, and each process 
uses a thread pool to run multiple threads (pthread lib). I am not sure 
whether this is the correct way or not. I wrote it as: 


MPI_Init() 
.... 
... 
/* create and initialize the thread pool */ 
...... 
/* fetch jobs into the thread pool */ 
...... 


MPI_Finalize(); 


When I checked books and notes, I found that people use 


MPI_Init_thread() with MPI_THREAD_MULTIPLE 


but some docs said it supports OpenMP. Is it possible to use it 
with the pthread library? 
I am new to hybrid programming, and I am not sure which is the proper 
way to do it. Any suggestions are appreciated. Thank you! 


Sufeng 




On Sat, Jun 22, 2013 at 12:12 PM, < discuss-request at mpich.org > wrote: 


From: < sniu at hawk.iit.edu > 
To: discuss at mpich.org 
Message-ID: <CAFNNHkwpqdGfZXctL0Uz3hpeL25mZZMtB93qGXjc_+tjnV4csA at mail.gmail.com> 
Content-Type: text/plain; charset="iso-8859-1" 

Hi, 

Sorry to bother you guys with this stupid question. Last time I reinstalled the OS on 
all blades to keep them at the same version. After I mount and set up keyless ssh, 
the terminal gives the error below: 

[proxy:0:1 at iocfccd3.aps.anl.gov] 
HYDU_sock_connect(./utils/sock/sock.c:174): unable to connect from 
"iocfccd3.aps.anl.gov" to "iocfccd1.aps.anl.gov" (No route to host) 
[proxy:0:1 at iocfccd3.aps.anl.gov] main (./pm/pmiserv/pmip.c:189): 
unable to connect to server iocfccd1.aps.anl.gov at port 38242 (check 
for firewalls!) 

I can ssh from iocfccd1 to iocfccd3 without a password. Should I shut down all 
firewalls on each server? I cannot find out where the problem is. Thank you 

-- 
Best Regards, 
Sufeng Niu 
ECASP lab, ECE department, Illinois Institute of Technology 
Tel: 312-731-7219 
< http://lists.mpich.org/pipermail/discuss/attachments/20130621/5503b1bc/attachment-0001.html > 

------------------------------ 

Message: 2 
Date: Fri, 21 Jun 2013 10:58:26 -0500 
From: Antonio J. Peña < apenya at mcs.anl.gov > 
To: discuss at mpich.org 

iocfccd1.aps.anl.gov" from iocfccd3? 

Antonio 


On Friday, June 21, 2013 10:51:50 AM Sufeng Niu wrote: 


Hi, 


Sorry to bother you guys with this stupid question. Last time I reinstalled the OS 
on all blades to keep them at the same version. After I mount and set up keyless 
ssh, the terminal gives the error below: 


[proxy:0:1 at iocfccd3.aps.anl.gov] 
HYDU_sock_connect(./utils/sock/sock.c:174): unable to connect from 
"iocfccd3.aps.anl.gov" to "iocfccd1.aps.anl.gov" (No route to host) 
[proxy:0:1 at iocfccd3.aps.anl.gov] main (./pm/pmiserv/pmip.c:189): 
unable to connect to server iocfccd1.aps.anl.gov at port 38242 (check 
for firewalls!) 



I can ssh from iocfccd1 to iocfccd3 without a password. Should I shut down all 
firewalls on each server? I cannot find out where the problem is. Thank you 




-- 
Best Regards, 
Sufeng Niu 
ECASP lab, ECE department, Illinois Institute of Technology 
Tel: 312-731-7219 


< http://lists.mpich.org/pipermail/discuss/attachments/20130621/01b37902/attachment-0001.html > 

------------------------------ 

Message: 3 
Date: Sat, 22 Jun 2013 12:49:10 -0400 
From: Jiri Simsa < jsimsa at cs.cmu.edu > 
To: discuss at mpich.org 
Message-ID: <CAHs9ut-_6W6SOHTJ_rD+shQ76bo4cTCuFVAy1f9x-J0gioakHg at mail.gmail.com> 
Content-Type: text/plain; charset="iso-8859-1" 

Hi, 
-------------- next part -------------- 
An HTML attachment was scrubbed... 
URL: < http://lists.mpich.org/pipermail/discuss/attachments/20130623/a6104115/attachment.html > 

------------------------------ 


End of discuss Digest, Vol 8, Issue 39 
************************************** 





-- 
Best Regards, 

Sufeng Niu 

ECASP lab, ECE department, Illinois Institute of Technology 

Tel: 312-731-7219 





-- 

Antonio J. Peña 

Postdoctoral Appointee 

Mathematics and Computer Science Division 

Argonne National Laboratory 

9700 South Cass Avenue, Bldg. 240, Of. 3148 

Argonne, IL 60439-4847 

(+1) 630-252-7928 

apenya at mcs.anl.gov 




