[mpich-discuss] MPI_init very slow with more than 3 nodes: Solved (in part)
Pavan Balaji
balaji at mcs.anl.gov
Thu Dec 5 10:11:03 CST 2013
Excellent. Thanks for letting us know.
— Pavan
On Dec 5, 2013, at 9:24 AM, Bixente BODO GOMEZ <bixente.bodo at ehu.es> wrote:
> Hi!
>
> The problem is NFS, though I don't know whether the server or the clients are at fault.
> I have installed MPICH locally on every node, and now the system runs well:
>
> mpiu@u105251:~$ date; mpirun -f machinefile -np 28 test/hello; date
> jue dic 5 16:08:54 CET 2013
> ...
> Hola desde el procesador u103972. 4 de 28
> ...
> jue dic 5 16:08:57 CET 2013
>
> And
>
> mpiu@u105251:~$ date; mpirun -f machinefile -np 8 test/cilindro; date
> jue dic 5 16:17:17 CET 2013
>
> Calculo de superficie y volumen de un cilindro
> de radio r y altura h
> Numero de procesadores: 8
>
> Introduce el radio r: 1
> la altura h: 22
> delta: 1e-8
>
> Calculando.....
>
> Valor conocido del volumen : 69.1150383770
> Valor calculado : 69.1150349775
> Valor conocido del area : 138.2300767540
> Valor calculado : 138.2300699549
>
> Tiempo empleado (sec) : 0.95427608
>
> jue dic 5 16:17:26 CET 2013
>
>
> mpiu@u105251:~$ date; mpirun -f machinefile -np 28 test/cilindro; date
> jue dic 5 16:17:46 CET 2013
>
> Calculo de superficie y volumen de un cilindro
> de radio r y altura h
> Numero de procesadores: 28
>
> Introduce el radio r: 1
> la altura h: 22
> delta: 1e-8
>
> Calculando.....
>
> Valor conocido del volumen : 69.1150383770
> Valor calculado : 69.1150348402
> Valor conocido del area : 138.2300767540
> Valor calculado : 138.2300696804
>
> Tiempo empleado (sec) : 0.30431104
>
> jue dic 5 16:17:55 CET 2013
>
> Thanks.
>
> Pavan Balaji <balaji at mcs.anl.gov> wrote:
>
>> One thing you could try is to install mpich on a non-NFS directory such as /tmp and run a program that is also not on NFS to see if the problem persists.
>>
>> You can build mpich on an NFS directory with --prefix=/tmp/mpich and then do “make install” on each machine.
>>
>> Similarly for the “application”, you can run “/bin/hostname” or something similar.
>>
>> — Pavan
>>
>> On Dec 3, 2013, at 12:06 PM, Antonio J. Peña <apenya at mcs.anl.gov> wrote:
>>
>>>
>>> Yes, it may be a good idea to look into that. Hopefully somebody on this list with NFS knowledge can give you a hint.
>>>
>>> Antonio
>>>
>>>
>>> On 12/03/2013 09:48 AM, Bixente BODO GOMEZ wrote:
>>>> I am sending an extract of the tcpdump capture. The problem seems to lie in the file attributes or locks of NFS. I changed fstab on the clients to add these options:
>>>>
>>>> nfsvers=3,rw,bg,noac,rsize=8192,wsize=8192
>>>>
>>>> but nothing has improved.
>>>>
>>>> "Antonio J. Peña" <apenya at mcs.anl.gov> wrote:
>>>>
>>>>> I don't think this is helping. Maybe tcpdump / wireshark captures could be helpful.
>>>>>
>>>>> Antonio
>>>>>
>>>>>
>>>>> On 12/02/2013 11:44 AM, Bixente BODO GOMEZ wrote:
>>>>>> The attachment...
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> discuss mailing list discuss at mpich.org
>>>>>> To manage subscription options or unsubscribe:
>>>>>> https://lists.mpich.org/mailman/listinfo/discuss
>>>>>
>>>>>
>>>>> --
>>>>> Antonio J. Peña
>>>>> Postdoctoral Appointee
>>>>> Mathematics and Computer Science Division
>>>>> Argonne National Laboratory
>>>>> 9700 South Cass Avenue, Bldg. 240, Of. 3148
>>>>> Argonne, IL 60439-4847
>>>>> (+1) 630-252-7928
>>>>> apenya at mcs.anl.gov
>>>>> www.mcs.anl.gov/~apenya
>>>>
>>>
>>
>> --
>> Pavan Balaji
>> http://www.mcs.anl.gov/~balaji
>
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji
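
The non-NFS test suggested in this thread (build with --prefix=/tmp/mpich, run "make install" on each machine, then launch /bin/hostname) could be scripted roughly as below. This is only a sketch under assumptions that are not from the thread: passwordless ssh to every node, a hypothetical source path in SRC that is visible on all nodes, and a machinefile with one "host" or "host:slots" entry per line. The helper names are made up for illustration.

```shell
#!/bin/sh
# Sketch of the non-NFS test discussed in this thread.
# Assumptions (not from the thread): passwordless ssh, SRC shared by all nodes.

PREFIX=/tmp/mpich                       # local, non-NFS prefix named in the thread
SRC=${SRC:-$HOME/mpich-src}             # hypothetical location of the MPICH source tree
MACHINEFILE=${MACHINEFILE:-machinefile}

# Succeed if the given filesystem type name is an NFS variant (nfs, nfs4, ...).
is_nfs_type() {
    case "$1" in
        nfs*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Strip an optional ":slots" suffix from a machinefile entry.
host_of() {
    printf '%s\n' "${1%%:*}"
}

# Build once in the shared tree, then install onto each node's local disk.
install_everywhere() {
    (cd "$SRC" && ./configure --prefix="$PREFIX" && make) || return 1
    while read -r entry _; do
        case "$entry" in ''|'#'*) continue ;; esac   # skip blanks and comments
        ssh -n "$(host_of "$entry")" "cd '$SRC' && make install" || return 1
    done < "$MACHINEFILE"
}

# Launch a trivial program that is not on NFS, bracketed by timestamps.
time_launch() {
    date
    "$PREFIX/bin/mpirun" -f "$MACHINEFILE" -np "$1" /bin/hostname
    date
}
```

After sourcing the file, "install_everywhere && time_launch 28" would repeat the 28-process launch with nothing read from NFS. On a system with GNU coreutils (an assumption), is_nfs_type "$(stat -f -c %T "$PREFIX")" is one way to confirm that the prefix is really on a local filesystem.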