[mpich-discuss] MPI server setup issue
Sufeng Niu
sniu at hawk.iit.edu
Sat Jun 15 15:17:41 CDT 2013
Hi, Antonio
Thanks a lot for your reply. I am running a 64-bit OS on every node.
Do you know how I can overcome this OS problem? Should I add a compile flag,
as in mpicc -m64 ....?
Thanks a lot!
Sufeng
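
A note on the error quoted below: /lib/ld-linux.so.2 is the loader that 32-bit x86 binaries ask for, so the message suggests that hydra_pmi_proxy itself was built as a 32-bit program. Adding -m64 to mpicc would only change how the user program is compiled, not the installed proxy. The following is a rough sketch for comparing the two on each node. It assumes the file utility is installed and handles only the common x86 machine names; machine_bits and elf_bits are helper names made up for illustration.

```shell
#!/bin/sh
# Sketch: compare a node's word size with the ELF class of the Hydra proxy.
# The default path is the one from the quoted error message; override PROXY if needed.

# Map a `uname -m` machine name to a word size (common x86 names only).
machine_bits() {
    case "$1" in
        x86_64|amd64) echo 64 ;;
        i386|i486|i586|i686) echo 32 ;;
        *) echo unknown ;;
    esac
}

# Extract the ELF class from one line of `file` output.
elf_bits() {
    case "$1" in
        *"ELF 64-bit"*) echo 64 ;;
        *"ELF 32-bit"*) echo 32 ;;
        *) echo unknown ;;
    esac
}

PROXY=${PROXY:-/mnt/mpi/mpich-install/bin/hydra_pmi_proxy}
if [ -e "$PROXY" ]; then
    echo "node:  $(uname -m) -> $(machine_bits "$(uname -m)")-bit"
    echo "proxy: $(elf_bits "$(file -L "$PROXY")")-bit"
fi
```

If the two numbers differ, the likely options are rebuilding MPICH on a machine that matches the nodes, or installing the 32-bit C library on the nodes; which one applies depends on the setup.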
On Sat, Jun 15, 2013 at 10:03 AM, <discuss-request at mpich.org> wrote:
> Today's Topics:
>
> 1. Re: MPI server setup issue (Antonio J. Peña)
> 2. Re: Running an mpi program that needs to access /dev/mem
> (Jim Dinan)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 14 Jun 2013 16:46:20 -0500
> From: Antonio J. Peña <apenya at mcs.anl.gov>
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] MPI server setup issue
> Message-ID: <3198378.OIJ6uL42Ef at localhost.localdomain>
> Content-Type: text/plain; charset="iso-8859-1"
>
>
> Hi Sufeng,
>
>
> > On Friday, June 14, 2013 04:35:39 PM Sufeng Niu wrote:
>
>
> > Hello,
> >
> > I am a beginner in MPI programming, and right now I am working on an
> > MPI project. I have a few questions about implementation issues:
> >
> > 1. When I run a simple MPI hello world on multiple nodes (I have already
> > installed the mpich3 library on the master node, mounted the NFS share,
> > shared the executable file and the MPI library, and set up keyless ssh
> > to the slave node), my program stops with:
> > bash: /mnt/mpi/mpich-install/bin/hydra_pmi_proxy: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory.
> > I have not been able to get rid of it for a long time, even after
> > resetting everything (I have already added
> > PATH=/mnt/mpi/mpich-install/bin:$PATH to .bash_profile). Do you have
> > any clue about this problem?
> >
>
>
> This issue may be related to a mismatch between 32- and 64-bit libraries.
> Are you running 64- or 32-bit operating systems consistently on all of
> your nodes?
>
> > 2. I have multiple servers, each with a 10G Ethernet card. For example,
> > one network card is eth5 with address 10.0.5.55. If I want MPI
> > communication to go through the 10G card, should I set the hostfile as
> > 10.0.5.55:$(PROCESS_NUM), or use iface eth5?
>
>
> You can address those nodes by either IP or DNS name in the hostfile,
> depending on how your system is configured. Using IP addresses is
> completely OK.
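
As an illustration of the hostfile answer above, here is a small sketch that writes one "host:nprocs" line per node, which is the form Hydra's hostfile accepts. The second address, the process counts, and the program name ./hello are placeholders, and make_hostfile is a helper name made up for this example.

```shell
#!/bin/sh
# Sketch: print a Hydra-style hostfile, one "host:nprocs" line per node.

# usage: make_hostfile NPROCS HOST [HOST ...]
make_hostfile() {
    nprocs=$1
    shift
    for host in "$@"; do
        printf '%s:%s\n' "$host" "$nprocs"
    done
}

# Example invocations (placeholders, not run here):
#   make_hostfile 4 10.0.5.55 10.0.5.56 > hostfile
#   mpiexec -f hostfile -n 8 ./hello
#   mpiexec -f hostfile -iface eth5 -n 8 ./hello   # select the interface by name instead
```

Listing the 10G addresses directly and passing -iface are two separate ways of steering traffic; whether both are needed depends on how the host names resolve on the cluster.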
>
>
> Best,
> Antonio
>
> >
>
>
> > Thanks a lot!
> >
>
>
> > -- Best Regards,
> > Sufeng Niu
> > ECASP lab, ECE department, Illinois Institute of Technology
> > Tel: 312-731-7219
> >
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Sat, 15 Jun 2013 10:03:02 -0500
> From: Jim Dinan <james.dinan at gmail.com>
> To: discuss at mpich.org
> Subject: Re: [mpich-discuss] Running an mpi program that needs to
> access /dev/mem
> Message-ID:
> <CAOoEU4E87SNHZS2KmbtywMLF=
> T0q4Kq2a7kDJHV2q54WT34nBg at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Eibhlin,
>
> Did you make those permissions changes on every node where your program
> runs? What happens if you run "mpiexec touch /dev/mem"?
>
> ~Jim.
>
>
> On Fri, Jun 14, 2013 at 4:43 PM, Lee, Eibhlin
> <eibhlin.lee10 at imperial.ac.uk> wrote:
>
> > Pavan,
> > Sorry, when I do run mpiexec id the output is
> > uid=1000(pi) gid=1000(pi) groups=1000(pi),4(adm),20(dialout),24(cdrom),27(sudo),29(audio),44(video),46(plugdev),60(games),100(users),105(netdev),999(input)
> >
> > regardless of whether I'm in root or my usual user. root at raspi or
> > pi at raspi. Is this output what you would expect?
> >
> > Jim,
> > I have tried changing the permissions of /dev/mem with
> > chmod 755 /dev/mem so that the output of ls -l /dev/mem is
> > crwxr-xr-x 1 root kmem 1, 1 Jan 1 1970 /dev/mem
> > but I still can't open /dev/mem inside my program. I also tried with
> > mode 777.
> >
> > I tried adding my user to the kmem group by doing
> > usermod -a -G kmem pi
> > but this doesn't fix the problem.
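
On the permissions shown above: mode 755 leaves write access with the owner (root) only, so membership in kmem gives read access but not read-write access. The sketch below just decodes an ls -l mode string to make that visible; rw_access is a helper name made up for illustration. Separately, and as far as I know, Linux also requires the CAP_SYS_RAWIO capability to open /dev/mem, which would explain why even 777 does not help a non-root process. Treat that last point as an assumption to verify on the Pi.

```shell
#!/bin/sh
# Sketch: report whether a permission class in an `ls -l` mode string
# has both read and write access.

# usage: rw_access MODE CLASS   (CLASS is user, group or other)
rw_access() {
    case "$2" in
        user)  bits=$(printf '%s' "$1" | cut -c2-3) ;;
        group) bits=$(printf '%s' "$1" | cut -c5-6) ;;
        other) bits=$(printf '%s' "$1" | cut -c8-9) ;;
        *)     bits= ;;
    esac
    if [ "$bits" = "rw" ]; then echo yes; else echo no; fi
}

# Mode string copied from the ls -l output quoted above (after chmod 755).
for class in user group other; do
    echo "$class: $(rw_access crwxr-xr-x "$class")"
done
```

For the quoted mode this reports read-write access for user only, which matches the observation that adding pi to kmem changed nothing.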
> >
> >
> > Have I gotten totally confused and pi isn't my user?
> >
> > Thank you in advance,
> > Eibhlin
> > ------------------------------
> > *From:* discuss-bounces at mpich.org [discuss-bounces at mpich.org] on behalf
> > of Jim Dinan [james.dinan at gmail.com]
> > *Sent:* 14 June 2013 21:31
> >
> > *To:* discuss at mpich.org
> > *Subject:* Re: [mpich-discuss] Running an mpi program that needs to
> > access /dev/mem
> >
> > I don't know if this has been suggested, but you could also add your
> > user to the kmem group and chmod /dev/mem so that you have the access you
> > need.
> >
> > ~Jim.
> >
> >
> > On Fri, Jun 14, 2013 at 1:24 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
> >
> >>
> >> You can run mpich as root. There's no restriction on that. You still
> >> haven't tried out my suggestion of running "id" to check what user ID
> >> you are running your processes as. My guess is that you are not setting
> >> your user ID correctly.
> >>
> >> -- Pavan
> >>
> >>
> >> On 06/14/2013 06:27 AM, Lee, Eibhlin wrote:
> >>
> >>> I found that the reason we want to access /dev/mem is to set up memory
> >>> regions to access the peripherals. (We are trying to read the output
> >>> of an ADC.) At this point it becomes more a Linux/Raspberry Pi specific
> >>> problem than an MPICH problem, although the fact that you can't run a
> >>> program that needs access to memory mapping (even as the root user)
> >>> seems something that MPICH could improve on for future versions. I know
> >>> I am using smpd instead of hydra so this problem may already be solved.
> >>> But if someone could confirm that, it would be really helpful.
> >>> ________________________________________
> >>> From: discuss-bounces at mpich.org [discuss-bounces at mpich.org] on behalf
> >>> of Lee, Eibhlin [eibhlin.lee10 at imperial.ac.uk]
> >>> Sent: 14 June 2013 11:20
> >>> To: discuss at mpich.org
> >>> Subject: Re: [mpich-discuss] Running an mpi program that needs
> >>> to access /dev/mem
> >>>
> >>> Gus,
> >>> I tried running cpi, as is included in the installation of MPI, on two
> >>> machines with two processes. The output message confirmed that it had
> >>> started only 1 process instead of 2.
> >>> Process 0 of 1 is on raspi
> >>> pi is approximately...
> >>>
> >>> Then it just hung. I think this is because the other machine didn't
> >>> know where to output the data?
> >>>
> >>> When I tried running two processes on the one machine using the wrapper
> >>> you suggested the output was the same but doubled. It didn't hang. This
> >>> confirms that every process was started with rank 0.
> >>>
> >>> I'm not entirely sure why /dev/mem is needed. I'm working in a group
> >>> and another member set up io and gpio and it seemed it needed access to
> >>> /dev/mem. I am going to do an strace as suggested by Pavan Balaji to see
> >>> where it is used and see if I can somehow work around it.
> >>>
> >>> Thank you for your help.
> >>> Eibhlin
> >>> ________________________________________
> >>> From: discuss-bounces at mpich.org [discuss-bounces at mpich.org] on behalf
> >>> of Gus Correa [gus at ldeo.columbia.edu]
> >>> Sent: 13 June 2013 21:11
> >>> To: Discuss Mpich
> >>> Subject: Re: [mpich-discuss] Running an mpi program that needs to
> >>> access /dev/mem
> >>>
> >>> Hi Eibhlin
> >>>
> >>> On 06/13/2013 12:59 PM, Lee, Eibhlin wrote:
> >>>
> >>>> Gus,
> >>>> I believe your first assumption is correct. Unfortunately it just
> >>>> seemed to hang. I think this might be because each one is being made
> >>>> to have the same rank...
> >>>>
> >>>
> >>> Darn! I was afraid that it might give only rank 0 to all MPI processes.
> >>> So, with the script wrapper the process being launched by mpiexec may
> >>> indeed be sudo, not the actual mpi executable (main) :(
> >>> Then it may actually launch a bunch of separate rank 0 replicas of your
> >>> program, instead of assigning to them different ranks.
> >>> However, without any output or error message, it is hard to tell.
> >>>
> >>> No output at all?
> >>> No error message, just hangs?
> >>> Have you tried a verbose flag (-v) to mpiexec?
> >>> (Not sure if it exists in MPICH mpiexec, you'd need to check.)
> >>>
> >>> Would you care to try it with another mpi program,
> >>> one that doesn't deal with /dev/mem (a risky business),
> >>> say cpi.c (in the examples directory), or an mpi version of Hello, world,
> >>> just to see if the mpiexec+sudo_script_wrapper works as expected or
> >>> if everybody gets rank 0?
> >>>
> >>>
> >>>> It may already be obvious but this is the first time I am using Linux.
> >>>> I had tried sudo $(which mpiexec ....) and sudo $(which mpiexec) ...
> >>>> both without success.
> >>>>
> >>>
> >>> "which mpiexec" will return the path to mpiexec, but won't execute it.
> >>>
> >>> You could try this (with backquotes):
> >>>
> >>> `which mpiexec` -n 2 ~/main
> >>>
> >>> On a side note, make sure the mpiexec you're using matches the
> >>> mpicc/mpif90/MPI library from the MPICH that
> >>> you used to compile the program.
> >>> Often times computers have several flavors of MPI installed, and mixing
> >>> them just doesn't work.
> >>>
> >>>> Is putting the full path to it similar to/is a symlink? (This still
> >>>> doesn't make main have super user privileges though.)
> >>>>
> >>>
> >>> No, nothing to do with sudo privileges.
> >>>
> >>> This suggestion was just to avoid messing up your /usr/bin,
> >>> which is a directory that despite the somewhat misleading name (/usr,
> >>> for historical reasons I think),
> >>> is supposed to hold system (Linux) programs (that users can use), but
> >>> not user-installed programs.
> >>> Normally things that are installed in /usr get there via some Linux
> >>> package manager program
> >>> (yum, rpm, apt-get, etc), to keep consistency with libraries, etc.
> >>>
> >>> I believe MPICH would install by default in /usr/local/ (and put mpiexec
> >>> in /usr/local/bin),
> >>> which is kind of a default location for non-system applications.
> >>>
> >>> The full path suggestion would be something like:
> >>> /path/to/where/you/installed/mpiexec -n 2 ~/main
> >>>
> >>> However, this won't solve the other problem w.r.t. sudo and /dev/mem.
> >>>
> >>> You must know what you are doing, but it made me wonder,
> >>> even if your program were sequential, why would you want to mess with
> >>> /dev/mem directly?
> >>> Just curious about it.
> >>>
> >>> Gus Correa
> >>>
> >>>
> >>>
> >>>> Eibhlin
> >>>> ________________________________________
> >>>> From: discuss-bounces at mpich.org [discuss-bounces at mpich.org] on behalf
> >>>> of Gus Correa [gus at ldeo.columbia.edu]
> >>>> Sent: 13 June 2013 15:37
> >>>> To: Discuss Mpich
> >>>> Subject: Re: [mpich-discuss] Running an mpi program that needs to
> >>>> access /dev/mem
> >>>>
> >>>> Hi Lee
> >>>>
> >>>> How about replacing "~/main" in the mpiexec command line
> >>>> by a one-liner script?
> >>>> Say, "sudo_main.sh", something like this:
> >>>>
> >>>> #! /bin/bash
> >>>> sudo ~/main
> >>>>
> >>>> After all, it is "main" that accesses /dev/mem,
> >>>> and needs "sudo" permissions, not mpiexec, right?
> >>>> [Or do the mpiexec-launched processes inherit
> >>>> the "sudo" stuff from mpiexec?]
> >>>>
> >>>> Not related, but, instead of putting mpiexec in /usr/bin,
> >>>> can't you just use the full path to it?
> >>>>
> >>>> I hope this helps,
> >>>> Gus Correa
> >>>>
> >>>> On 06/13/2013 10:09 AM, Lee, Eibhlin wrote:
> >>>>
> >>>>> Pavan,
> >>>>> I had a lot of trouble getting hydra to work without having to enter
> >>>>> a password/passphrase. I saw the option to pass a phrase in the mpich
> >>>>> installer's guide. I eventually found that for that command you needed
> >>>>> to use the smpd process manager. That's the only reason I chose smpd
> >>>>> over hydra.
> >>>>> As to your other suggestion, I ran ./main and the same error (Can't
> >>>>> open /dev/mem...) appeared. sudo ./main works but of course without
> >>>>> multiple processes.
> >>>>> Eibhlin
> >>>>> ________________________________________
> >>>>> From: discuss-bounces at mpich.org [discuss-bounces at mpich.org] on behalf
> >>>>> of Pavan Balaji [balaji at mcs.anl.gov]
> >>>>> Sent: 13 June 2013 14:34
> >>>>> To: discuss at mpich.org
> >>>>> Subject: Re: [mpich-discuss] Running an mpi program that needs to
> >>>>> access /dev/mem
> >>>>>
> >>>>> I just saw your older email. Why are you using smpd instead of the
> >>>>> default process manager (hydra)?
> >>>>>
> >>>>> -- Pavan
> >>>>>
> >>>>> On 06/13/2013 08:05 AM, Pavan Balaji wrote:
> >>>>>
> >>>>>> What's "-phrase"? That's not a recognized option. I'm not sure where
> >>>>>> the /dev/mem check is coming from. Try running ~/main without mpiexec
> >>>>>> first.
> >>>>>>
> >>>>>> -- Pavan
> >>>>>>
> >>>>>> On 06/13/2013 06:56 AM, Lee, Eibhlin wrote:
> >>>>>>
> >>>>>>> Hello all,
> >>>>>>>
> >>>>>>> I am trying to use two raspberry-pi to sample and then process some
> >>>>>>> data. The first process samples while the second processes and vice
> >>>>>>> versa. To do this I use gpio and also mpich-3.0.4 with the process
> >>>>>>> manager smpd. I have successfully run cpi on both machines (from the
> >>>>>>> master machine). I have also managed to run a similar program but
> >>>>>>> without the MPI, this involved compiling with gcc and when running
> >>>>>>> putting sudo in front of the binary file.
> >>>>>>>
> >>>>>>> When I combine these two processes I get various error messages.
> >>>>>>> For input:
> >>>>>>> mpiexec -phrase cat -machinefile machinefile -n 2 ~/main
> >>>>>>> the error is:
> >>>>>>> Can't open /dev/mem
> >>>>>>> Did you forget to use 'sudo .. ?'
> >>>>>>>
> >>>>>>> For input:
> >>>>>>> sudo mpiexec -phrase cat -machinefile machinefile -n 2 ~/main
> >>>>>>> the error is:
> >>>>>>> sudo: mpiexec: Command not found
> >>>>>>>
> >>>>>>> I therefore put mpiexec into /usr/bin
> >>>>>>>
> >>>>>>> now for input:
> >>>>>>> sudo mpiexec -phrase cat -machinefile machinefile -n 2 ~/main
> >>>>>>> the error is:
> >>>>>>> Can't open /dev/mem
> >>>>>>> Did you forget to use 'sudo .. ?'
> >>>>>>>
> >>>>>>> Does anyone know how I can work around this?
> >>>>>>> Thanks,
> >>>>>>> Eibhlin
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> discuss mailing list discuss at mpich.org
> >>>>>>> To manage subscription options or unsubscribe:
> >>>>>>> https://lists.mpich.org/mailman/listinfo/discuss
> >>>>>>>
> >>>>>>> --
> >>>>> Pavan Balaji
> >>>>> http://www.mcs.anl.gov/~balaji
> >> --
> >> Pavan Balaji
> >> http://www.mcs.anl.gov/~balaji
>
> ------------------------------
>
> End of discuss Digest, Vol 8, Issue 29
> **************************************
>
--
Best Regards,
Sufeng Niu
ECASP lab, ECE department, Illinois Institute of Technology
Tel: 312-731-7219