[mpich-discuss] Option '-iface' broken when MPICH is configured with '--enable-strict'

Jan Bierbaum jan.bierbaum at tudos.org
Tue Feb 4 15:52:26 CST 2014


Hi!

Although I'm not sure whether this should be considered a bug or just an 
annoyance, I thought it would be a good idea to bring it up here because 
it took me quite some time to figure out the underlying problem.

For those not interested in the details: My suggestion is to disable 
'-iface' completely when MPICH is configured with '--enable-strict'. 
When the option is used in such a situation, some descriptive error 
message should be displayed.


Now, for the full story: I routinely pass the '--enable-strict' option 
to configure. According to 'configure --help', this option is supposed to 
"Turn on strict compilation testing", which sounds like a good thing when 
you are working with unfamiliar code.

Recently I also started to play around with the '-iface' option of 
mpiexec and, to my surprise, MPICH just produced a rather useless error 
message:

> Fatal error in MPI_Init: Other MPI error

A debug build of MPICH 3.0.4 was more informative:

> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(433)..............:
> MPID_Init(176).....................: channel initialization failed
> MPIDI_CH3_Init(70).................:
> MPID_nem_init(286).................:
> MPID_nem_tcp_init(108).............:
> MPID_nem_tcp_get_business_card(354):
> MPID_nem_tcp_init(246).............: The network interface, "eth0", specified in MPICH_NETWORK_IFACE was not found.

To make a long story short, '--enable-strict' apparently passes '-ansi' 
to the compiler. The compiler then defines the internal macro 
'__STRICT_ANSI__'. On my machine with gcc (Debian 4.7.2-5) 4.7.2 this 
will eventually cause 'net/if.h' to hide the definition of 'struct 
ifreq' and 'struct ifconf'.
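
A minimal sketch of the effect, assuming glibc on Linux: code that needs 
'struct ifreq' can ask for the hidden declarations itself by defining 
'_GNU_SOURCE' before the first include. Without that define, compiling 
the file below with 'gcc -ansi' would be expected to fail because 
'struct ifreq' is not declared. The helper name 'iface_ipv4' is 
hypothetical and is not taken from MPICH.

```c
/* Must come before every #include: asks glibc to expose the BSD/misc
 * declarations it hides when __STRICT_ANSI__ is defined (e.g. -ansi).
 * Assumption: glibc; other C libraries may use different macros. */
#define _GNU_SOURCE

#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <net/if.h>

/* Hypothetical helper, not MPICH code: store the IPv4 address of the
 * interface `ifname` in *out. Returns 0 on success, -1 on failure. */
int iface_ipv4(const char *ifname, struct in_addr *out)
{
    struct ifreq req;
    struct sockaddr_in sin;
    int fd, rc;

    assert(ifname != NULL && out != NULL);
    if (strlen(ifname) >= IFNAMSIZ)
        return -1;                      /* name does not fit in ifr_name */

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&req, 0, sizeof req);
    strcpy(req.ifr_name, ifname);       /* length was checked above */
    req.ifr_addr.sa_family = AF_INET;

    rc = ioctl(fd, SIOCGIFADDR, &req);  /* fails for unknown interfaces */
    close(fd);
    if (rc < 0)
        return -1;

    /* ifr_addr is a generic sockaddr; copy it out as sockaddr_in. */
    memcpy(&sin, &req.ifr_addr, sizeof sin);
    *out = sin.sin_addr;
    return 0;
}
```

Calling 'iface_ipv4("eth0", &addr)' returns 0 only if an interface of 
that name exists and has an IPv4 address; any other case returns -1.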

'configure' detects that these structures are missing and creates an 
MPICH build that uses an empty, always-failing version of 
'MPIDI_Get_IP_for_iface'. This failure, in turn, results in the 
aforementioned rather useless error message.

IMHO there should either be some mechanism to enable strict code 
checking without passing the '-ansi' option or '-iface' should be 
disabled cleanly when '--enable-strict' is used. At least some 
explanatory error message would be nice ;-)
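
To illustrate what I mean by an explanatory message, here is a rough 
sketch of a fallback that still always fails but says why. All names 
are hypothetical: 'HAVE_IFREQ_SKETCH' stands in for whatever macro 
configure actually defines, and 'get_ip_for_iface_sketch' is not the 
real 'MPIDI_Get_IP_for_iface' signature.

```c
#include <assert.h>
#include <stddef.h>

/* HAVE_IFREQ_SKETCH is a hypothetical stand-in for the configure
 * result "struct ifreq and struct ifconf are usable". It is left
 * undefined here, so the fallback branch below is what gets compiled. */
#ifndef HAVE_IFREQ_SKETCH

/* Hypothetical fallback: always fails, but hands the caller a reason
 * that can be put into the error stack or printed by mpiexec. */
int get_ip_for_iface_sketch(const char *ifname, const char **reason)
{
    assert(reason != NULL);
    (void) ifname;  /* unused: no lookup is possible in this build */

    *reason = "option '-iface' is not supported by this build: "
              "struct ifreq/ifconf were not available at configure "
              "time (for example with '--enable-strict')";
    return -1;
}

#endif /* !HAVE_IFREQ_SKETCH */
```

A caller would check the return value and report the reason string 
instead of claiming that the interface "was not found".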


Regards, Jan


