[mpich-discuss] Questions about using MPIR_proctable interface in a parallel debugger
ashley at pittman.co.uk
Mon Dec 23 11:13:39 CST 2013
On 23 Dec 2013, at 14:17, David Wootton <drwootton at hvc.rr.com> wrote:
> I have a couple questions
> 1) Is this the correct way to do this or is there another method I should be using to get this information?
It’s a valid way to do it ;) It’s the most universal method, but it isn’t as convenient as some of the alternatives. There are often easier ways to get this information directly from whatever resource manager you’re using; for example, Slurm has an “scontrol listpids” command.
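For reference, here is a rough sketch of what walking the table looks like. The struct layout follows the MPIR process acquisition interface as I understand it (host name, executable name, pid per rank). The dump_proctable helper is a made-up name for illustration, and it works on a local copy of the table. A real debugger would instead read MPIR_proctable and MPIR_proctable_size out of the starter process's memory once it stops in MPIR_Breakpoint.

```c
#include <stdio.h>
#include <stddef.h>

/* One entry per MPI rank, as laid out by the MPIR interface. */
typedef struct {
    char *host_name;        /* host the rank runs on */
    char *executable_name;  /* path of the rank's executable */
    int   pid;              /* pid of the rank on that host */
} MPIR_PROCDESC;

/* Hypothetical helper: print one line per rank from a local copy of
 * the table and return the number of entries printed. */
int dump_proctable(FILE *out, const MPIR_PROCDESC *table, int size)
{
    int printed = 0;

    if (table == NULL || size <= 0)
        return 0;

    for (int rank = 0; rank < size; rank++) {
        const char *host = table[rank].host_name ? table[rank].host_name : "(null)";
        const char *exe  = table[rank].executable_name ? table[rank].executable_name : "(null)";

        fprintf(out, "rank %d: host=%s exe=%s pid=%d\n",
                rank, host, exe, table[rank].pid);
        printed++;
    }
    return printed;
}
```

The index into the table is the rank in MPI_COMM_WORLD, so entry 0 describes rank zero.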
> 2) I read somewhere that the MPIR_proctable was limited to 64 processes initially. Is this only an initial size limitation or does MPIR_proctable grow to hold the info for all processes regardless of how many processes exist in the application?
I’m not familiar with this restriction and I’ve certainly used MPIR_proctable with systems much larger than that.
> I also ran into a problem experimenting with this at the MPICH 3.0.4 level. It seems that the pid in all entries of MPIR_proctable is the same for each process and is the pid of the process that is rank zero. All processes are running on the same host that mpirun is also running on.
This sounds like a bug in MPICH.
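If you want to confirm the symptom from inside your tool, a quick sanity check is to look for entries that share both host and pid, which should never happen for distinct ranks. This is only a sketch: count_duplicate_pids is a hypothetical helper name, it assumes you have already copied the table out of the starter process, and it uses a simple quadratic scan that is fine for small jobs.

```c
#include <string.h>

/* One entry per MPI rank, as laid out by the MPIR interface. */
typedef struct {
    char *host_name;
    char *executable_name;
    int   pid;
} MPIR_PROCDESC;

/* Hypothetical helper: count entries whose (host, pid) pair matches
 * an earlier entry in the table. A healthy table returns 0. */
int count_duplicate_pids(const MPIR_PROCDESC *table, int size)
{
    int dups = 0;

    for (int i = 1; i < size; i++) {
        for (int j = 0; j < i; j++) {
            if (table[i].pid == table[j].pid &&
                table[i].host_name != NULL &&
                table[j].host_name != NULL &&
                strcmp(table[i].host_name, table[j].host_name) == 0) {
                dups++;   /* entry i repeats an earlier entry */
                break;    /* count each entry at most once */
            }
        }
    }
    return dups;
}
```

With the behaviour you describe (every entry carrying the pid of rank zero on one host), this would return the table size minus one.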