[mpich-discuss] Problems running MPICH jobs under SLURM

Pavan Balaji balaji at mcs.anl.gov
Mon Jun 3 08:30:31 CDT 2013


On 06/03/2013 01:57 AM, Markus Geimer wrote:
> (reply intentionally not sent to the list -- I don't like such logs
> to show up in mailing list archives...)

Ok, I'm cc'ing discuss at mpich.org back.

>> Just to make sure, are these the only configure options you are using:
>>
>> --prefix=... --enable-shared --enable-debuginfo
>
> Yes, these are the only options (besides explicitly specifying the
> GNU compiler, but this shouldn't do any harm). Please find the full
> configure log attached.

Thanks.

>> Also, can you run mpiexec with the -verbose option for one of the
>> failing tests (probably just mpiexec -n 4 ./hello) and send me the output?
>
> Output attached.

Hmm.  The double free error seems to be coming from the executable, 
rather than from mpiexec or the proxy.  So we might be looking in the 
wrong place.

1. Can you run your application processes under "ddd" or some other
debugger to see where the double free is coming from?  You might have to
rebuild MPICH with --enable-g=dbg to get the debug symbols in.

2. Can you send me the output with the ssh launcher as well?  I want to 
see if there are any critical differences in the environment variables 
being propagated (e.g., LD_LIBRARY_PATH/LD_PRELOAD) that might affect 
shared library builds.
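
For item 2, one way to compare the two environments is sketched below.
This is only a sketch under assumptions: the helper name env_diff and the
file names are made up for illustration, and it assumes the environment
seen under each launcher has already been captured as one NAME=value
entry per line (for example by running "env" as the launched program).

```shell
#!/bin/sh
# env_diff FILE_A FILE_B  (hypothetical helper, not part of MPICH)
# Prints the NAME=value lines that appear in only one of the two dumps.
# Lines from FILE_A are prefixed with "<", lines from FILE_B with ">".
env_diff() {
    a_sorted=$(mktemp) || return 1
    b_sorted=$(mktemp) || return 1

    # Sort both dumps so that ordering differences are not reported.
    sort "$1" > "$a_sorted"
    sort "$2" > "$b_sorted"

    # Keep only the changed lines from the diff output. Both diff and
    # grep exit non-zero in normal cases here, hence the "|| true".
    diff "$a_sorted" "$b_sorted" | grep '^[<>]' || true

    rm -f "$a_sorted" "$b_sorted"
}

# Possible usage, assuming Hydra's -launcher option is available and
# that "slurm" is the launcher in use on the failing runs:
#
#   mpiexec -launcher ssh   -n 1 env > env.ssh.txt
#   mpiexec -launcher slurm -n 1 env > env.slurm.txt
#   env_diff env.ssh.txt env.slurm.txt | grep -E 'LD_LIBRARY_PATH|LD_PRELOAD'
```

Variables whose values contain newlines will show up as several
separate lines, so treat the output as a hint rather than an exact
comparison.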

Feel free to send me the logs off-list.

Thanks,

  -- Pavan

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji


