[MITgcm-devel] local (tiled) MDSIO
Jean-Michel Campin
jmc at ocean.mit.edu
Tue Jun 2 10:15:11 EDT 2009
Hi Martin and others,
Thanks for the test (for the numbers I reported, I changed
dumpFreq to 1 to get much more IO, so by comparison
the timings you are getting are not bad).
But if, later on, you have a chance to pick some numbers from
a "real" simulation (the kind you generally run) on this SX8,
that could be interesting.
Otherwise, I could push this tiled IO further and read/write all levels
at a time, but I would need bigger buffers (with sizes that are not known
at compile time), so it requires more changes. And if we choose to
go this way, I would prefer to combine this change (all levels)
with modifications that make those IO routines able to read/write
non-shared arrays (not in a common block) for multi-threaded runs.
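
For illustration only (this is not the MDSIO code): a minimal Python sketch
of how the three chunking options discussed in this thread trade the number
of IO requests against buffer size, for one tile of one 3-D field. sNx is
the tile line length mentioned in the thread; sNy (lines per tile level),
Nr (number of levels), the function name and the example sizes are all
assumptions made for the sketch.

```python
def io_requests(sNx, sNy, Nr, mode):
    """Return (number of IO requests, buffer length in elements)
    needed to read or write one tile of one 3-D field.

    sNx, sNy : tile size in x and y (sNy is an assumed name here)
    Nr       : number of vertical levels (assumed name)
    mode     : "line", "level" or "all_levels"
    """
    if mode == "line":
        # one request per line of length sNx
        return sNy * Nr, sNx
    if mode == "level":
        # one request per 1-level tile
        return Nr, sNx * sNy
    if mode == "all_levels":
        # one request for the whole tile, all levels at once;
        # the buffer grows with the number of levels
        return 1, sNx * sNy * Nr
    raise ValueError("unknown mode: " + mode)


# Hypothetical tile size, chosen only to make the comparison concrete.
for mode in ("line", "level", "all_levels"):
    n_req, buf_len = io_requests(sNx=32, sNy=16, Nr=20, mode=mode)
    print(mode, "requests:", n_req, "buffer elements:", buf_len)
```

With any sizes, the per-line scheme issues sNy*Nr requests per tile, the
per-level scheme Nr, and the all-levels scheme a single one, at the cost
of a buffer Nr times larger than the per-level one.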
Cheers,
Jean-Michel
On Tue, Jun 02, 2009 at 12:36:53PM +0200, Martin Losch wrote:
> Hi Jean-Michel,
>
> here's what I find for verification/deep_anelastic (no modifications of
> data files, 4 tiles)
> on 2009-05-31 04:20 (so before your changes):
> (PID.TID 0000.0001) Seconds in section "ALL [THE_MODEL_MAIN]":
> (PID.TID 0000.0001) User time: 13.26000034343451
> (PID.TID 0000.0001) System time: 1.430000022053719
> (PID.TID 0000.0001) Wall clock time: 15.95677185058594
> today (after your changes):
> (PID.TID 0000.0001) Seconds in section "ALL [THE_MODEL_MAIN]":
> (PID.TID 0000.0001) User time: 12.71000026725233
> (PID.TID 0000.0001) System time: 1.395000047981739
> (PID.TID 0000.0001) Wall clock time: 15.20528006553650
>
> So faster, but not significantly. The reason is probably that, for the
> GSFS of the SX8, the batches of IO are still very small. The system
> considers basically everything below 1GB as small (o:
>
> Martin
>
>
>
> On Jun 1, 2009, at 4:34 PM, Jean-Michel Campin wrote:
>
>> Hi Martin,
>>
>> I've checked in a modification to the MDSIO pkg such that tiled IO is
>> now done in chunks of a 1-level tile (instead of 1 line of length sNx).
>> I remember you reported that non-SingleCpuIO was slower than
>> SingleCpuIO because of many small read/write pieces.
>> This modification should improve the speed of those IO,
>> and it would be interesting to see if it really does (because it's
>> still a matter of platform/disk system ...).
>>
>> I did some short tests with lots of IO, and in the most favorable one
>> (verification/deep_anelastic), without any optimisation, I get:
>> std_outp.new User: 13.3799661 System: 0.490924996 Wall clock: 14.110111
>> std_outp.ref User: 15.7935989 System: 6.53900592 Wall clock: 22.7398989
>> In other cases, I've also seen a reduction of the System time,
>> but the wall-clock time improvement was not as big.
>>
>> Cheers,
>> Jean-Michel
>> _______________________________________________
>> MITgcm-devel mailing list
>> MITgcm-devel at mitgcm.org
>> http://mitgcm.org/mailman/listinfo/mitgcm-devel
>