<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><div dir="ltr">Hi,<div>I'm running HDF5 with MPI-IO on a single compute node (32 cores) against a Lustre v2.5 file system with 10 OSTs. </div><div>I submit the job with 3 processes that write to a shared file of about 3 GB, </div><div>and each process writes 1/3 of the file. For example, </div><div>if the array is a 4D double array of size 3*32*1024*128, then each process writes a contiguous 32*1024*128 block to the file. <br clear="all"><div><br></div><div>I observed some weird performance numbers with both independent I/O and collective I/O.</div><div>With independent I/O, the ranks seem to block each other and finish writing one after another. With collective I/O, all three ranks report the same I/O cost; I think this is because there is only one aggregator. </div><div>My question is: with independent I/O, do the writes from different ranks block each other when accessing the shared file?<br></div><div>If they do not block, can I expect linear speedup on a single node as I increase the number of processes?</div><div><br></div><div>Best,</div><div>Jialin</div>Lawrence Berkeley Lab
</div></div>
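<div>To make the decomposition above concrete, here is a minimal sketch of the per-rank selection and byte range. It assumes a contiguous (non-chunked), row-major dataset whose first dimension equals the number of ranks; the helper name rank_hyperslab and its parameters are hypothetical and not part of HDF5 or MPI-IO.</div>

```python
from math import prod

DOUBLE_BYTES = 8  # size of one double-precision value


def rank_hyperslab(rank, dims, itemsize=DOUBLE_BYTES):
    """Return (start, count, byte_offset, nbytes) for one rank.

    Assumption: dims[0] equals the number of ranks, so each rank owns
    exactly one slab along the slowest-varying dimension, and that slab
    is contiguous in a row-major, non-chunked layout.
    """
    start = (rank,) + (0,) * (len(dims) - 1)   # first element owned by this rank
    count = (1,) + tuple(dims[1:])             # one full slab along dimension 0
    nbytes = prod(count) * itemsize            # bytes written by this rank
    byte_offset = rank * nbytes                # offset within the raw dataset data
    return start, count, byte_offset, nbytes


if __name__ == "__main__":
    dims = (3, 32, 1024, 128)  # example shape from the post
    for rank in range(dims[0]):
        print(rank, rank_hyperslab(rank, dims))
```

<div>For the example shape, each rank's slab is 33,554,432 bytes (32 MiB), and the three byte ranges do not overlap.</div>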