[MITgcm-devel] local (tiled) MDSIO

Tue Jun 2 10:41:12 EDT 2009

Hi JMC,

I updated to this code and will let you know if it changes my  
performance on Ranger.  Thanks for working to improve the I/O routines!

-Matt

On Jun 2, 2009, at 7:15 AM, Jean-Michel Campin wrote:

> Hi Martin and others,
>
> Thanks for the test (for the numbers I reported, I changed
> dumpFreq to 1 to get much more IO, so by comparison
> the timings you are getting are not bad).
> But if, later on, you have a chance to pick up some numbers from
> a "real" simulation (the kind you generally run) on this SX8,
> that could be interesting.
>
> Otherwise, I could push this tiled IO further and read/write all levels
> at a time, but I would need bigger buffers (with sizes that are not known
> at compile time), so it requires more changes. And if we choose to
> go this way, I would prefer to combine this change (all levels)
> with modifications that make those IO routines able to read/write
> non-shared arrays (not in a common block) for multi-threaded runs.
>
> Cheers,
> Jean-Michel
>
> On Tue, Jun 02, 2009 at 12:36:53PM +0200, Martin Losch wrote:
>> Hi Jean-Michel,
>>
>> here's what I find for verification/deep_anelastic (no modifications
>> of data files, 4 tiles)
>> on 2009-05-31 04:20 (so before your changes):
>> (PID.TID 0000.0001)   Seconds in section "ALL [THE_MODEL_MAIN]":
>> (PID.TID 0000.0001)           User time:  13.26000034343451
>> (PID.TID 0000.0001)         System time:  1.430000022053719
>> (PID.TID 0000.0001)     Wall clock time:  15.95677185058594
>> today (after your changes):
>> (PID.TID 0000.0001)   Seconds in section "ALL [THE_MODEL_MAIN]":
>> (PID.TID 0000.0001)           User time:  12.71000026725233
>> (PID.TID 0000.0001)         System time:  1.395000047981739
>> (PID.TID 0000.0001)     Wall clock time:  15.20528006553650
>>
>> So faster, but not significantly. The reason is probably that, for the
>> GSFS of the SX8, the batches of IO are still very small. The system
>> considers basically everything below 1GB as small (o:
>>
>> Martin
>>
>>
>>
>> On Jun 1, 2009, at 4:34 PM, Jean-Michel Campin wrote:
>>
>>> Hi Martin,
>>>
>>> I've checked in a modification to the MDSIO pkg such that tiled IO is
>>> now done in chunks of one tile level (instead of one line of length sNx).
>>> I remember you reported that non-SingleCpuIO was slower than
>>> SingleCpuIO because of the many small read/write pieces.
>>> This modification should improve the speed of that IO,
>>> and it would be interesting to see if it really does (because it's
>>> still a matter of platform/disk system ...).
>>>
>>> I did some short tests with a lot of IO, and in the most favorable one
>>> (verification/deep_anelastic), without any optimisation, I get:
>>> std_outp.new   User: 13.3799661 System: 0.490924996 Wall clock: 14.110111
>>> std_outp.ref   User: 15.7935989 System: 6.53900592  Wall clock: 22.7398989
>>> In other cases, I've also seen a reduction of the System time,
>>> but the wall-clock time improvement was not as big.
>>>
>>> Cheers,
>>> Jean-Michel
>>> _______________________________________________
>>> MITgcm-devel mailing list
>>> MITgcm-devel at mitgcm.org
>>> http://mitgcm.org/mailman/listinfo/mitgcm-devel
>>
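
As an illustration of the change described in the oldest message of this thread (tiled IO done in chunks of one tile level instead of one line of length sNx), the Python sketch below counts the write calls needed under each scheme. It is not the MDSIO Fortran code. The function names, the tiny tile dimensions, the byte order, and the in-memory buffers are all assumptions made for the example, and it assumes each tile goes to its own file so that one level of the tile is contiguous.

```python
import io
import struct


def write_by_line(buf, tile, sNx, sNy, Nr):
    """Write the tile one line (sNx values) per call; return the call count."""
    calls = 0
    for k in range(Nr):
        for j in range(sNy):
            start = (k * sNy + j) * sNx
            line = tile[start:start + sNx]
            # Byte order is an arbitrary choice for this sketch.
            buf.write(struct.pack(f">{sNx}d", *line))
            calls += 1
    return calls


def write_by_level(buf, tile, sNx, sNy, Nr):
    """Write the tile one level (sNx*sNy values) per call; return the call count."""
    calls = 0
    n = sNx * sNy
    for k in range(Nr):
        level = tile[k * n:(k + 1) * n]
        buf.write(struct.pack(f">{n}d", *level))
        calls += 1
    return calls


# Hypothetical tile dimensions, chosen small for illustration only.
sNx, sNy, Nr = 4, 3, 2
tile = [float(i) for i in range(sNx * sNy * Nr)]

by_line, by_level = io.BytesIO(), io.BytesIO()
n_line = write_by_line(by_line, tile, sNx, sNy, Nr)
n_level = write_by_level(by_level, tile, sNx, sNy, Nr)

print("writes per tile, line by line:  ", n_line)    # sNy * Nr calls
print("writes per tile, level by level:", n_level)   # Nr calls
print("same bytes on disk:", by_line.getvalue() == by_level.getvalue())
```

Both schemes produce the same bytes; only the number of write calls changes, from sNy*Nr to Nr per tile. Whether fewer, larger writes reduce wall-clock time still depends on the platform and file system, as the timings quoted in the thread suggest.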