[MITgcm-support] building with MPI on a dual-core mac
Klymak Jody
jklymak at uvic.ca
Wed Jul 22 22:52:14 EDT 2009
Thanks a lot Constantinos,
that is very clear!
(PID.TID 0000.0001) Seconds in section "ALL [THE_MODEL_MAIN]":
(PID.TID 0000.0001) User time: 1318.5500146746635
(PID.TID 0000.0001) System time: 522.38001954555511
(PID.TID 0000.0001) Wall clock time: 1977.2751381397247
This is for 2 machines over gigabit Ethernet. For me, the imbalance of under
10% is probably acceptable given the cost of InfiniBand cards.
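
As a quick check of the numbers in the log above, here is a minimal Python sketch of the arithmetic. The variable names are illustrative and not part of MITgcm, and treating everything outside user+system as idle time is an assumption taken from the explanation quoted below. With these values the gap comes out to just under 7% of the wall clock.

```python
# Timing values copied from the log above (seconds).
user = 1318.5500146746635
system = 522.38001954555511
wall = 1977.2751381397247

cpu = user + system          # time charged to the process (u + s)
idle = wall - cpu            # wall-clock time not charged to the process
idle_fraction = idle / wall  # share of the wall clock spent waiting

print(f"u+s  = {cpu:.2f} s")
print(f"idle = {idle:.2f} s ({100 * idle_fraction:.1f}% of wall clock)")
```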
Cheers, Jody
On 22-Jul-09, at 7:33 PM, Constantinos Evangelinos wrote:
> On Tuesday 21 July 2009 5:22:00 pm Klymak Jody wrote:
>
>> While I suppose I can guess, what is the technical difference between
>> "user", "system" and "wallclock"? I suppose a large difference
>> between "wallclock" and "system" means lots of MPI overhead?
>
> user time "u" is processor time spent on behalf of a process in user
> (non-privileged) code.
> system time "s" is processor time spent on behalf of a process in the
> operating system kernel (or equivalent, depending on the operating
> system)
> wallclock time "w" is self explanatory.
>
> Essentially all of your computational code should be user time. Part of your
> I/O will count as user and part as system time (how big each part is depends
> on the O/S). Communication time is treated similarly to I/O (communicating
> over Ethernet has a significant system component - and corresponding overhead
> of switching to kernel mode - communicating over a high speed interconnect
> like Myrinet, Infiniband etc. should be mainly user time). Moreover time
> spent waiting for data (with the process not relinquishing the cpu busy
> spinning away) counts as user or system time (e.g. time spent waiting for
> data from main memory or time spent waiting for data from disk). If however
> the O/S switches control of the cpu away from an idling process, that time
> only counts as wallclock time.
>
> Given the above, u+s <= w (to within precision - u and s usually are to 0.01s
> while w is to 1us.) and ideally u+s=w (that would indicate a process that
> spends no time idling waiting for data). A large discrepancy between u+s and
> w indicates either load imbalance or significant network or disk I/O issues.
>
> Constantinos
> --
> Dr. Constantinos Evangelinos
> Department of Earth, Atmospheric and Planetary Sciences
> Massachusetts Institute of Technology
>
> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mitgcm.org/mailman/listinfo/mitgcm-support
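
The user/system/wallclock distinction described in the quoted message can be seen on a small scale with the Python standard library. This is only a sketch: the helper names `timed`, `busy` and `idle` are hypothetical, and it says nothing about how MITgcm collects its own timers. A CPU-bound loop should show up mostly as user time, while a sleep hands the CPU back to the O/S and so adds wallclock time only.

```python
import os
import time

def timed(fn):
    """Run fn() and return (user, system, wall) seconds for this process."""
    t0 = os.times()
    w0 = time.perf_counter()
    fn()
    t1 = os.times()
    w1 = time.perf_counter()
    return t1.user - t0.user, t1.system - t0.system, w1 - w0

def busy():
    # CPU-bound loop: expected to be charged almost entirely as user time.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

def idle():
    # Sleeping relinquishes the CPU, so this adds wall-clock time only.
    time.sleep(0.5)

for name, fn in (("busy", busy), ("idle", idle)):
    u, s, w = timed(fn)
    print(f"{name}: user={u:.2f} system={s:.2f} wall={w:.2f}")
```

On a run like this, u+s for `busy` should be close to w, and u+s for `idle` should be near zero while w is about half a second.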