[MITgcm-devel] local (tiled) MDSIO
Matthew Mazloff
mmazloff at MIT.EDU
Tue Jun 2 10:41:12 EDT 2009
Hi JMC,
I updated to this code and will let you know if it changes my
performance on Ranger. Thanks for working to improve the I/O routines!
-Matt
On Jun 2, 2009, at 7:15 AM, Jean-Michel Campin wrote:
> Hi Martin and others,
>
> Thanks for the test (for the numbers I reported, I changed
> dumpFreq to 1 to get much more IO, so by comparison
> the timings you are getting are not bad).
> But if, later on, you have a chance to pick up some numbers from
> a "real" simulation (one you generally run) on this SX8,
> that could be interesting.
>
> Otherwise, I could push this tiled IO further and read/write all levels
> at a time, but I would need bigger buffers (with sizes that are not known
> at compile time), so it requires more changes. And if we choose to
> go this way, I would prefer to combine this change (all levels)
> with modifications that make those IO routines able to read/write
> non-shared arrays (not in a common block) for multi-threaded runs.
>
> Cheers,
> Jean-Michel
>
> On Tue, Jun 02, 2009 at 12:36:53PM +0200, Martin Losch wrote:
>> Hi Jean-Michel,
>>
>> here's what I find for verification/deep_anelastic (no modifications
>> of data files, 4 tiles)
>> on 2009-05-31 04:20 (so before your changes):
>> (PID.TID 0000.0001) Seconds in section "ALL [THE_MODEL_MAIN]":
>> (PID.TID 0000.0001) User time: 13.26000034343451
>> (PID.TID 0000.0001) System time: 1.430000022053719
>> (PID.TID 0000.0001) Wall clock time: 15.95677185058594
>> today (after your changes):
>> (PID.TID 0000.0001) Seconds in section "ALL [THE_MODEL_MAIN]":
>> (PID.TID 0000.0001) User time: 12.71000026725233
>> (PID.TID 0000.0001) System time: 1.395000047981739
>> (PID.TID 0000.0001) Wall clock time: 15.20528006553650
>>
>> So faster, but not significantly. The reason is probably that, for the
>> GSFS of the SX8, the batches of IO are still very small. The system
>> considers basically everything below 1GB as small (o:
>>
>> Martin
>>
>>
>>
>> On Jun 1, 2009, at 4:34 PM, Jean-Michel Campin wrote:
>>
>>> Hi Martin,
>>>
>>> I've checked in a modification to the MDSIO pkg so that tiled IO is
>>> now done in chunks of one 1-level tile (instead of one line of length sNx).
>>> I remember you reported that non-SingleCpuIO was slower than
>>> SingleCpuIO because of the many small read/write pieces.
>>> This modification should improve the speed of that IO,
>>> and it would be interesting to see if it really does (because it's
>>> still a matter of platform/disk system ...).
>>>
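A rough sketch of why the chunk size matters, assuming one read/write request per chunk. The tile dimensions used below (sNx, sNy, Nr values) and the 8-byte value size are made up for illustration and are not taken from any experiment in this thread; the helper names are hypothetical, not MDSIO routines.

```python
def io_requests_per_tile(sNy, Nr, chunk):
    """Count read/write requests needed for one 3-D tile field.

    chunk = "line":  one request per line of length sNx (old behaviour)
    chunk = "level": one request per 1-level tile (new behaviour)
    Assumption: exactly one request is issued per chunk.
    """
    if chunk == "line":
        return sNy * Nr
    if chunk == "level":
        return Nr
    raise ValueError("chunk must be 'line' or 'level'")


def bytes_per_request(sNx, sNy, chunk, bytes_per_value=8):
    """Size of a single request in bytes (8-byte reals assumed)."""
    if chunk == "line":
        return sNx * bytes_per_value
    if chunk == "level":
        return sNx * sNy * bytes_per_value
    raise ValueError("chunk must be 'line' or 'level'")


if __name__ == "__main__":
    # Hypothetical tile size, for illustration only.
    sNx, sNy, Nr = 30, 20, 15
    for chunk in ("line", "level"):
        n = io_requests_per_tile(sNy, Nr, chunk)
        size = bytes_per_request(sNx, sNy, chunk)
        print(f"{chunk:5s}: {n:4d} requests of {size} bytes each")
```

Under these assumptions the number of requests per tile drops by a factor of sNy, and each request grows by the same factor; whether that translates into wall-clock gains still depends on the platform and disk system.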
>>> I did some short tests with a lot of IO, and in the most favorable one
>>> (verification/deep_anelastic), without any optimisation, I get:
>>> std_outp.new User: 13.3799661 System: 0.490924996 Wall clock: 14.110111
>>> std_outp.ref User: 15.7935989 System: 6.53900592 Wall clock: 22.7398989
>>> In other cases, I've also seen a reduction of the System time,
>>> but the wall-clock time improvement was not as big.
>>>
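For reference, the relative change in the std_outp.ref and std_outp.new timings quoted above can be worked out with a few lines of Python. The numbers are copied from the message; the dictionary layout and the function name are just illustrative choices.

```python
# Timings in seconds, copied from the std_outp.ref / std_outp.new lines above.
ref = {"user": 15.7935989, "system": 6.53900592, "wall": 22.7398989}
new = {"user": 13.3799661, "system": 0.490924996, "wall": 14.110111}


def reduction_percent(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    for key in ("user", "system", "wall"):
        pct = reduction_percent(ref[key], new[key])
        print(f"{key:6s}: {ref[key]:10.4f} -> {new[key]:10.4f} s "
              f"({pct:.1f}% lower)")
```

Most of the gain shows up in the System time (down by roughly 92%), while the wall-clock time drops by roughly 38% in this most favorable case.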
>>> Cheers,
>>> Jean-Michel
>>> _______________________________________________
>>> MITgcm-devel mailing list
>>> MITgcm-devel at mitgcm.org
>>> http://mitgcm.org/mailman/listinfo/mitgcm-devel
>>