Description
The test/nonblocking/mcoll_perf.c test detects incorrect data when it compares two files that were written in two different ways and should have identical content.
cd test/nonblocking
srun -n2 ./mcoll_perf /unifyfs/testfile.nc
<snip>
P0: diff at line 282 variable[2] var1_2: NC_INT buf1 != buf2 at position 32762
Tracing the pwrite and pread calls under a debugger shows that both ranks write to the same byte offsets without any synchronization in between. In this case, rank 1 writes a fill value and rank 0 later writes actual data, so it is a race as to which value actually ends up in the file.
The fill call is here:
When filling variable 2, rank 1 writes to (offset=648, length=8) and (offset=680, length=8).
And the write call is here:
In that write, rank 0 writes to (offset=640, length=16) and (offset=672, length=16), which overlaps with the region that rank 1 wrote to during the fill operation.
The test case can be fixed by adding a call to ncmpi_sync(ncid) between the fill loop and the put loop:
    for (i=2; i<nvars; i++){
        /* fill record variables to silence valgrind complaining about uninitialised bytes */
        for (j=0; j<array_of_gsizes[0]; j++) {
            err = ncmpi_fill_var_rec(ncid, varid[i], j);
            CHECK_ERR
        }
    }
    ncmpi_sync(ncid); // <--- add sync here to fix the test case
    for (i=0; i<nvars; i++){
        err = ncmpi_put_vara_all(ncid, varid[i], starts[i], counts[i], buf[i], bufcounts[i], MPI_INT);
        CHECK_ERR
    }
For reference, here is the sequence of (offset, length) values for writes from different ranks when k==0. There are multiple overlapping writes, one of which is marked below:
offset, length values for writes
--------     --------
rank 0       rank 1
--------     --------
0, 336
512, 32      544, 32
576, 32      608, 32
640, 8       648, 8      <--- this "fill" by rank 1
4, 4
672, 8       680, 8
4, 4
704, 8       712, 8
4, 4
736, 8       744, 8
4, 4
656, 8       664, 8
688, 8       696, 8
720, 8       728, 8
752, 8       760, 8
512, 32      544, 32
576, 32      608, 32
640, 16      704, 16     <--- overlaps with this "put" by rank 0
672, 16      736, 16
656, 16      720, 16
688, 16      752, 16