[MITgcm-support] tiles vs processors
Constantinos Evangelinos
ce107 at ocean.mit.edu
Wed Feb 2 15:29:19 EST 2005
On Wednesday 02 February 2005 07:24, chris hill wrote:
> It's also what is used to support true shared memory parallelism.
> mnc_assembly should be able to work for tiles and processes, not sure
> why it doesn't - I'll ask Ed.
>
> Chris
>
> Martin Losch wrote:
> > Chris,
> > something I never quite understood: Why do we have the capability for
> > multiple tiles per processor (the bi-bj loops)? The only reason I see
> > is the cubed sphere grid. Are there any other situations in which it is
> > advantageous to have more than one tile per processor?
Beyond that, there is an opportunity to speed up the code by choosing tile
sizes such that a tile's memory footprint is close to a large divisor of (or
at most equal to) the size of some level of the cache (usually L2, maybe L3,
depending on their relative size and speed). So far this rests only on
preliminary indications from a quick serial test on an old Origin 2000
platform.
I plan to run further tests of this down the line, and it may be possible to
devise some method of determining a (sub)optimal tile size for a given
problem on a given architecture.
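
To make the idea concrete, here is a rough sketch (in Python, not part of
MITgcm) of how one might enumerate candidate tile sizes whose footprint fits
in a given cache. Everything in it is an assumption for illustration: the
function names are made up, the footprint formula simply counts a fixed
number of 3-D double-precision fields over the tile plus its overlap region,
and the default overlap width and field count are placeholders. The real
working set depends on which packages and loops are active, so timings would
still be needed to confirm any choice.

```python
def divisors(n):
    """Return all positive divisors of n in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def tile_bytes(snx, sny, nr, olx, oly, n_fields, word_bytes=8):
    """Approximate footprint of one tile, overlap (halo) included.

    Assumption: the working set is n_fields full 3-D arrays of
    word_bytes-sized elements; this is a crude stand-in for the real one.
    """
    return (snx + 2 * olx) * (sny + 2 * oly) * nr * n_fields * word_bytes


def candidate_tiles(nx, ny, nr, cache_bytes,
                    olx=3, oly=3, n_fields=10, word_bytes=8):
    """List (sNx, sNy, bytes) for tilings whose tile fits in the cache.

    Only tile extents that divide the global grid evenly are considered.
    The result is sorted with the largest footprint first, i.e. the tiles
    that come closest to filling the cache without exceeding it.
    """
    fits = []
    for snx in divisors(nx):
        for sny in divisors(ny):
            size = tile_bytes(snx, sny, nr, olx, oly, n_fields, word_bytes)
            if size <= cache_bytes:
                fits.append((snx, sny, size))
    return sorted(fits, key=lambda t: t[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical problem: 128 x 64 x 15 grid and a 4 MiB cache level.
    for snx, sny, size in candidate_tiles(128, 64, 15, 4 * 1024 * 1024)[:5]:
        print(snx, sny, size)
```

Smaller tiles spend a larger fraction of their footprint on the overlap
region, so the candidate that best fills the cache is not automatically the
fastest one; the list is only a starting point for measurements.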
Constantinos
--
Dr. Constantinos Evangelinos
Department of Earth, Atmospheric and Planetary Sciences
Massachusetts Institute of Technology