[mpich-discuss] Can't receive messages

Jeff Hammond jeff.science at gmail.com
Wed Jan 1 16:09:48 CST 2014


> On Jan 1, 2014, at 1:50 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
>> On Jan 1, 2014, at 1:25 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
>> Isn't the process manager infrastructure hetero-safe? At the very least, ssh-ing "uname -a" around the ring of procs can identify the problem at O(1) cost.
> 1. The process manager is not heterogeneous-safe.

Is it possible to get a proc table that is sufficient for point 3, though?

> 2. "uname -a" is not an accurate representation, since two different Linux distributions on the same architecture are not considered heterogeneous.  The datatype representation has to be different.

Well, query whatever you need to query, then. MPICH could hash all the datatype sizes during configuration and store that value somewhere useful.
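
As a rough illustration of the hashing idea, here is a minimal sketch in Python (chosen for brevity; MPICH itself is written in C). It is not how MPICH does anything today. The list of types, the inclusion of byte order, and the name datatype_fingerprint are all assumptions made for this example; a real check would need to cover whatever the datatype engine actually depends on.

```python
import ctypes
import hashlib
import sys

# Hypothetical selection of C types for illustration only; a real check
# would cover every type the MPI datatype representation depends on.
C_TYPES = [
    ("char", ctypes.c_char),
    ("short", ctypes.c_short),
    ("int", ctypes.c_int),
    ("long", ctypes.c_long),
    ("long long", ctypes.c_longlong),
    ("float", ctypes.c_float),
    ("double", ctypes.c_double),
    ("long double", ctypes.c_longdouble),
    ("void *", ctypes.c_void_p),
]


def datatype_fingerprint(types=C_TYPES, byteorder=sys.byteorder):
    """Return a hex digest summarizing C type sizes and byte order."""
    # Build a canonical string such as "byteorder=little;char=1;short=2;..."
    parts = ["byteorder=" + byteorder]
    for name, ctype in types:
        parts.append("%s=%d" % (name, ctypes.sizeof(ctype)))
    canonical = ";".join(parts)
    # Two nodes with the same sizes and byte order get the same digest.
    return hashlib.sha256(canonical.encode("ascii")).hexdigest()


if __name__ == "__main__":
    print(datatype_fingerprint())
```

Two nodes whose fingerprints match agree on every size in the list and on byte order, no matter which Linux distribution they run. A mismatch flags the pair as heterogeneous.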

> 3. Doing an extra ssh to all the nodes is expensive to do every time.

You just need a ring to verify homogeneity. Forming a ring is O(1) per node.
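
To make the ring argument concrete, here is a small Python simulation, assuming each node already holds some fingerprint of its datatype representation. The function name ring_is_homogeneous and the list-of-strings input are hypothetical; in practice the comparison would be one message to a neighbor, not a loop in a single process.

```python
def ring_is_homogeneous(fingerprints):
    """Check homogeneity using only neighbor comparisons around a ring.

    fingerprints[i] is the (hypothetical) fingerprint held by node i.
    Node i compares its value only with its successor (i + 1) % n, so
    each node does O(1) work. If every adjacent pair matches, all values
    are equal by transitivity.
    """
    n = len(fingerprints)
    mismatches = []
    for i in range(n):
        succ = (i + 1) % n
        if fingerprints[i] != fingerprints[succ]:
            # Record which link of the ring disagrees.
            mismatches.append((i, succ))
    return len(mismatches) == 0, mismatches


if __name__ == "__main__":
    print(ring_is_homogeneous(["abc", "abc", "abc"]))
    print(ring_is_homogeneous(["abc", "xyz", "abc"]))
```

Each node does a constant amount of work, but the total number of comparisons is still n, and any mismatch still has to be reported back to whoever launched the job.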


>  — Pavan
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
