[mpich-discuss] Can't receive messages
Jeff Hammond
jeff.science at gmail.com
Wed Jan 1 16:09:48 CST 2014
> On Jan 1, 2014, at 1:50 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
>
>
>> On Jan 1, 2014, at 1:25 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
>> Isn't the process manager infrastructure hetero-safe? At the very least, ssh-ing "uname -a" around the ring of procs can identify the problem at O(1) cost.
>
> 1. The process manager is not heterogeneous-safe.
Is it possible to get a proc table sufficient for 3, though?
> 2. "uname -a" is not an accurate representation, since two different Linux distributions on the same architecture are not considered heterogeneous. For that, the datatype representation has to be different.
>
Well, query whatever you need to query, then. MPICH could hash all the datatype sizes during configuration and store that value somewhere useful.
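A minimal sketch of the hashing idea, in Python for brevity. This is not how MPICH does anything today; the type list, the description format, and the function names are all assumptions made up for illustration. It builds a string from the byte order and the sizes of a few C types on the local machine, then hashes it, so two nodes with the same fingerprint agree on at least those properties.

```python
import ctypes
import hashlib
import sys

# Hypothetical list of C types to fingerprint. A real check would need to
# cover every type that backs an MPI datatype, and probably alignment too.
C_TYPES = [
    ("char", ctypes.c_char),
    ("short", ctypes.c_short),
    ("int", ctypes.c_int),
    ("long", ctypes.c_long),
    ("long long", ctypes.c_longlong),
    ("float", ctypes.c_float),
    ("double", ctypes.c_double),
    ("long double", ctypes.c_longdouble),
    ("void *", ctypes.c_void_p),
]


def layout_description():
    """Describe the local byte order and C type sizes as one string."""
    parts = ["byteorder=" + sys.byteorder]
    parts += ["%s=%d" % (name, ctypes.sizeof(ctype)) for name, ctype in C_TYPES]
    return ";".join(parts)


def layout_fingerprint(description=None):
    """Hash a layout description; defaults to the local machine's layout."""
    if description is None:
        description = layout_description()
    return hashlib.sha256(description.encode("ascii")).hexdigest()
```

Two Linux distributions on the same architecture would produce the same fingerprint here, while a change in byte order or in any listed size would produce a different one.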
> 3. Doing an extra ssh to all the nodes every time is expensive.
>
You just need a ring to verify homogeneity. Forming a ring is O(1) per node.
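To make the ring argument concrete, here is a small single-process simulation in Python. It is only a sketch: `ring_check` is a hypothetical name, and the list stands in for the value each node would send to its right-hand neighbour. Each node does one comparison against the value from its left neighbour, so the work per node is constant. If no node reports a mismatch, all values are equal by transitivity around the ring. Collecting those per-node results into a single global verdict is a separate step that this sketch does not cover.

```python
def ring_check(fingerprints):
    """Simulate a ring comparison over per-node fingerprints.

    fingerprints[i] is the value held by node i. Each node compares its own
    value with the one received from its left neighbour, (i - 1) mod n.
    Returns the ranks that saw a mismatch; an empty list means homogeneous.
    """
    n = len(fingerprints)
    mismatches = []
    for rank in range(n):
        left = (rank - 1) % n
        if fingerprints[left] != fingerprints[rank]:
            mismatches.append(rank)
    return mismatches
```

With one odd node in the ring, two ranks report a mismatch: the odd node itself and its right-hand neighbour.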
Jeff
> — Pavan
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>
> _______________________________________________
> discuss mailing list discuss at mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss