<tt><font size=2>> From: "Rob Latham" &lt;robl@mcs.anl.gov&gt;</font></tt>
<br>
<br><tt><font size=2>> How did this single rank get a negative offset?
Was there some<br>
> integer math that overflowed?<br>
</font></tt>
<br><tt><font size=2>That's for the app developer to figure out. My
issue is that if all ranks had failed the write, he probably would have
started figuring that out a few days ago, and I wouldn't have gotten involved
:) It's the weird hw error that dragged me into this: the non-failing
ranks entered allreduce in ROMIO while the failing ranks entered allreduce
in the app.</font></tt>
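<br>
<br><tt><font size=2>On the overflow question, a minimal sketch of how a
single rank could end up with a negative offset. It assumes, purely for
illustration, that the app computes a byte offset as rank * elements per rank
* element size; the formula and both function names are hypothetical and not
taken from the application.</font></tt>
<br>

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit offset math. The product is reduced modulo 2^32 and
   then reinterpreted as a signed value, which mimics what wrapping int
   arithmetic does on common platforms (signed overflow itself is undefined
   behavior in C, so it is emulated here with unsigned math). */
int32_t offset_32bit(int32_t rank, int32_t elems_per_rank, int32_t elem_size)
{
    uint32_t wrapped = (uint32_t)rank * (uint32_t)elems_per_rank
                     * (uint32_t)elem_size;
    int32_t result;
    memcpy(&result, &wrapped, sizeof result);
    return result;
}

/* Same formula carried out in 64 bits. Returns -1 if an input is negative
   or if the result would not fit in int64_t, so the caller can refuse to
   issue the write instead of passing a bad offset along. */
int64_t offset_64bit(int32_t rank, int32_t elems_per_rank, int32_t elem_size)
{
    if (rank < 0 || elems_per_rank < 0 || elem_size < 0)
        return -1;
    /* Two non-negative 32-bit values multiply to less than 2^62. */
    int64_t elems_before = (int64_t)rank * elems_per_rank;
    if (elem_size != 0 && elems_before > INT64_MAX / elem_size)
        return -1;
    return elems_before * elem_size;
}
```

<br><tt><font size=2>With 100000000 elements of 8 bytes per rank, rank 2
still fits in 32 bits, but rank 3 wraps to a negative value while the 64-bit
version stays positive. Whether this is what actually happened in the app is
unknown.</font></tt>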
<br>
<br><tt><font size=2>Like I said:</font></tt>
<br>
<br><tt><font size=2>> > Just wondering if there's something I can
fix here in addition to the <br>
> > application.<br>
</font></tt>
<br><tt><font size=2>Not the highest priority, really. But I coincidentally
just got another report (from ANL this time) that an app is hung with half
the ranks in write_at_all and half the ranks in a later barrier. It
could be something similar. I don't have enough information yet to
know, but I've suggested they check for errors returned from the write.</font></tt>
<br>
<br>