
Time, Only Time

Keeping track of time is a sensitive problem. Designing a good system for it is in many ways one of the most basic and most crucial tasks a game engine developer has. One reason is that most (if not all) game engines are simulation engines at their core, and without proper handling of time, not much else can be handled properly.

Anyway, I’m not going to talk about the design of the time system that we have implemented in Zorvan (that’s what we call our game engine) but about its implementation; and not about the whole implementation either, but about how we actually read time from the system, and the pitfalls and problems associated with that.

Basically, when you write your programs in C, on Windows, on a PC, you have a range of options for reading the absolute time. The basis of this absolute time is not important, because we’ll only be working with time deltas and almost never with the value of the time itself. Let’s call each source of such an absolute time a “time source”(!)

There are a few parameters that we should be concerned about in a time source. The first is precision: the smallest time value that the source can actually represent, which follows from its frequency (or, put differently, from the number of meaningful digits in its return value). The next one is the update interval. For example, a time source may advertise that it measures time in micro- or nanoseconds, but its value may only change (or be updated) once a millisecond. The third parameter is the overhead of actually reading time from that source. If a source only writes its value to a disk file and you have to read it from there, it’s gonna be no good for you whether it measures time in femtoseconds or not. You are going to spend a few milliseconds (if not more) reading that number anyway, and by then it will be stale.

The two last parameters can be combined, but I thought I’d make a distinction because they are indeed different to a programmer and they present different symptoms in the application.
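To make the “update interval” idea concrete, here is a minimal C sketch that spins on a time source until its value changes and reports the smallest jump it sees. The `time_source_fn` type and all function names are made up for illustration. `fake_coarse_source` is a stand-in that pretends to count milliseconds but only ever moves in steps of 15, so its result is predictable.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical shape of a time source: returns an absolute tick count. */
typedef uint64_t (*time_source_fn)(void);

/* A real source: the C runtime clock(), in units of 1/CLOCKS_PER_SEC s. */
uint64_t clock_source(void)
{
    return (uint64_t)clock();
}

/* A fake source for illustration: its value only changes once every
   1000 calls, and when it does, it jumps by 15. */
uint64_t fake_coarse_source(void)
{
    static uint64_t calls = 0;
    return (calls++ / 1000u) * 15u;
}

/* Spin until the source's value changes, `samples` times over, and return
   the smallest jump observed (in ticks of that source). */
uint64_t smallest_observed_jump(time_source_fn source, int samples)
{
    uint64_t smallest = UINT64_MAX;
    for (int i = 0; i < samples; ++i) {
        uint64_t start = source();
        uint64_t now;
        do {
            now = source();
        } while (now == start);
        if (now - start < smallest)
            smallest = now - start;
    }
    return smallest;
}
```

Passing `clock_source` instead of the fake gives the figure for a real source on whatever machine you run it on. Dividing the total spin time by the number of calls would give a rough idea of the call overhead as well.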

OK, enough with the pep talk. Let’s get down to business. The most obvious time sources are the CRT’s clock(), the Win32 API’s GetTickCount() and the Win32 Multimedia API’s timeGetTime(). These are all millisecond sources. That is, they all make you think they have millisecond precision. On Windows, the first two actually have an awful update interval of 15-16 milliseconds (any old DOS programmer should find this number painfully nostalgic!). Of course, this number is not fixed and you should not assume it, but I don’t remember encountering any different behaviour from these two calls in recent years. The multimedia timer actually does have millisecond precision, and it does indeed get updated every actual millisecond, but the API documentation asserts that you cannot count on that either (you can, however, ask the OS to make an effort to update this timer at an interval of your choosing, given in whole milliseconds). Among these three, the multimedia timer is the only one that really delivers its advertised millisecond, at a low call overhead, and is obviously the best of them.
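As a rough sketch of how the multimedia timer might be used, the C code below asks the OS for a 1 ms update interval with timeBeginPeriod(), reads timeGetTime() around a piece of work, and restores the interval afterwards. The helper names `elapsed_ms` and `time_call_ms` are hypothetical. The Windows part is guarded so that the plain arithmetic compiles anywhere, and I am assuming you link against winmm.lib.

```c
#include <stdint.h>

/* Millisecond counters like timeGetTime() are 32-bit and eventually wrap
   around. Unsigned subtraction still gives the right delta, as long as
   fewer than 2^32 milliseconds passed between the two readings. */
uint32_t elapsed_ms(uint32_t earlier, uint32_t later)
{
    return later - earlier;
}

#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h>   /* timeGetTime, timeBeginPeriod, timeEndPeriod */

/* Hypothetical helper: time one call to `work` in milliseconds. */
uint32_t time_call_ms(void (*work)(void))
{
    uint32_t start, end;
    /* Ask for a 1 ms update interval; the OS may refuse. */
    int raised = (timeBeginPeriod(1) == TIMERR_NOERROR);

    start = (uint32_t)timeGetTime();
    work();
    end = (uint32_t)timeGetTime();

    if (raised)
        timeEndPeriod(1);   /* match every successful timeBeginPeriod */
    return elapsed_ms(start, end);
}
#endif
```

In a real engine you would more likely raise the timer resolution once at startup and lower it at shutdown, rather than around every measurement.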

In any case, millisecond precision is not enough for most of the timing needs of a game engine, but it may be enough for the most important one: measuring the frame time. For games with a framerate below 100, this should work out well enough for now, but you should think about the future too. In general, don’t use these sources to measure any duration of less than a second in a game.

The next time source on Windows, with more precision and supposedly better behavior, is the high performance timer (or whatever), aka QueryPerformanceCounter(). This allegedly uses one of the hardware counters on your system and produces close to microsecond (or a little better) precision. The frequency of this time source is not fixed from system to system and you have to query it from the OS (using QueryPerformanceFrequency()) at runtime, but it is guaranteed not to change while the computer is up and running (I don’t know what happens across system hibernations, and frankly, I don’t want to find out!). On my system, and a few others I’ve tested, the frequency of this source is 3,579,545 ticks per second, which gives it a precision of about 280 nanoseconds. As my tests show, though, each invocation of this function takes about 2 microseconds (on my test system), which makes the overhead and latency rather high compared to its precision. Still, for timing frames and other non-critical code (outside your precious inner loops) this is probably your best bet.
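The arithmetic behind those numbers is simple enough to sketch in C. The function names below are my own, not part of any API. The first two functions are plain arithmetic on a tick delta and a frequency; the guarded part shows one way the real QueryPerformanceCounter() and QueryPerformanceFrequency() calls might feed them on Windows.

```c
#include <stdint.h>

/* Convert a tick delta to seconds, given the source's ticks per second. */
double ticks_to_seconds(int64_t start_ticks, int64_t end_ticks,
                        int64_t ticks_per_second)
{
    return (double)(end_ticks - start_ticks) / (double)ticks_per_second;
}

/* Length of a single tick in nanoseconds, i.e. the source's precision. */
double tick_period_ns(int64_t ticks_per_second)
{
    return 1.0e9 / (double)ticks_per_second;
}

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: time one call to `work` in seconds. */
double time_call_seconds(void (*work)(void))
{
    LARGE_INTEGER frequency, start, end;
    QueryPerformanceFrequency(&frequency);  /* must be queried at runtime */
    QueryPerformanceCounter(&start);
    work();
    QueryPerformanceCounter(&end);
    return ticks_to_seconds(start.QuadPart, end.QuadPart,
                            frequency.QuadPart);
}
#endif
```

With a frequency of 3,579,545 ticks per second, `tick_period_ns` comes out at roughly 279.4 nanoseconds, which is where the 280 nanosecond figure comes from. Since the frequency does not change while the system is running, a real engine would query it once and cache it.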

One largely curious (not to mention disturbing) behavior we observed recently in Zorvan while using this method for time calculations was that it returned the same number in two invocations a few milliseconds apart which resulted in all sorts of erratic behaviors, but I haven’t had time to investigate it in depth and I haven’t been able to reproduce it again.

Perhaps the most pervasive method for obtaining time in game engines is the rdtsc (Read Time-Stamp Counter) x86 instruction, which has been available since the original Pentium. It takes no parameters and returns the number of CPU cycles that have passed since the CPU was last reset, as a 64-bit number in EDX:EAX. The instruction is lightweight and low-overhead, has a very high precision (almost the highest possible, because any lapse of time smaller than a CPU clock cycle is hardly meaningful or measurable in general computing) and is available everywhere (on all PCs anyway, and it’s obviously not Windows-specific).

For those who are afraid of inline assembly, there is even a convenient intrinsic available in Visual C++ (include “intrin.h” and call __rdtsc()). (The GCC inline assembly call is left as an exercise for the reader!)
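For readers who would rather skip that exercise, one common way to write it with GCC-style extended inline assembly looks something like the sketch below. The names `combine_edx_eax` and `read_tsc` are mine. The instruction itself is guarded so that the file still compiles on non-x86 targets, where only the register-combining arithmetic is available.

```c
#include <stdint.h>

/* rdtsc leaves the high 32 bits in EDX and the low 32 bits in EAX;
   this glues the two halves into one 64-bit cycle count. */
uint64_t combine_edx_eax(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Read the time-stamp counter with GCC-style inline assembly.
   "=a" binds the output to EAX and "=d" binds it to EDX. */
uint64_t read_tsc(void)
{
    uint32_t eax, edx;
    __asm__ __volatile__("rdtsc" : "=a"(eax), "=d"(edx));
    return combine_edx_eax(edx, eax);
}
#endif
```

Note that this does nothing to stop the CPU from reordering the read relative to the surrounding instructions, so treat very short measurements with suspicion.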

But nothing is free. There are a few problems attached to the use of rdtsc which fall mostly in two categories: problems caused by multi-CPU systems and those that result from its interaction with CPU power saving schemes.

In multi-CPU systems (including multi-core ones), the different CPUs may not have started counting cycles at exactly the same time, which makes the results read from different cores slightly different. This can bite very easily, because your code normally runs on different CPUs at different times (across task switches), and it can cause the time seen by the application to appear to go backwards (imagine trying to feed a negative time delta to your physics solver!). The first time I saw this was on an AMD Athlon 64 X2. Although the problem is solvable with an official patch from AMD, it has left me forever afraid of encountering it again! In any case, I haven’t seen this problem on Intel Core architecture CPUs and I haven’t had access to an AMD Phenom or Opteron to test.
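Whatever the cause, a cheap line of defence is to never let a backwards reading reach the rest of the engine. The C sketch below is one possible way to do that, not what Zorvan does: it remembers the highest tick value seen so far and reports a delta of zero whenever a new reading falls behind it. The `monotonic_filter` type and `monotonic_delta` function are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical filter state: the highest tick value seen so far. */
typedef struct {
    uint64_t last_ticks;
} monotonic_filter;

/* Return the ticks elapsed since the previous call, or 0 if the raw
   reading went backwards (e.g. after a move to another core). */
uint64_t monotonic_delta(monotonic_filter *filter, uint64_t now_ticks)
{
    uint64_t delta;
    if (now_ticks < filter->last_ticks)
        return 0;               /* keep the old high-water mark */
    delta = now_ticks - filter->last_ticks;
    filter->last_ticks = now_ticks;
    return delta;
}
```

This hides the symptom rather than fixing the cause: while the new core catches up, the caller sees a few zero-length deltas, so whatever consumes them has to tolerate a delta of zero.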

The second group of problems happens because modern CPUs may vary and adapt their clock rates at runtime to different loads (this is especially common on mobile-class CPUs). This means that the CPU clock rate you measure (e.g. at the start of your application) may not stay the same during the lifetime of your application; it may go down, or up (if the initial load was low). In either case, it will wreak havoc on your time calculations.
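To see why, it helps to spell out the arithmetic that turns cycles into seconds. The C sketch below assumes you calibrate by counting cycles over an interval timed with some other reference source. Both function names are invented for this example.

```c
#include <stdint.h>

/* Estimate the cycle counter's rate from the cycles counted during an
   interval that was timed with a reference source. */
double estimate_cycles_per_second(uint64_t cycle_delta,
                                  uint64_t ref_tick_delta,
                                  uint64_t ref_ticks_per_second)
{
    double ref_seconds = (double)ref_tick_delta
                       / (double)ref_ticks_per_second;
    return (double)cycle_delta / ref_seconds;
}

/* Convert a cycle delta to seconds using a previously calibrated rate. */
double cycles_to_seconds(uint64_t cycle_delta, double cycles_per_second)
{
    return (double)cycle_delta / cycles_per_second;
}
```

If the rate was calibrated at 2 GHz and the counter later advances at only 1 GHz, one real second produces 1,000,000,000 cycles, and `cycles_to_seconds` reports 0.5 seconds. Every time delta is then off by a factor of two until you calibrate again.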

But don’t despair! Solving the first issue in application code is rather ugly: you have to bind your time-reading thread to a single CPU (it’s called setting the “CPU affinity” for that thread; STFW yourselves). Solving the second problem in application code is impractical, but it can be solved rather easily (IMHO) in the CPU itself: it just has to always advance the time-stamp counter at a fixed rate, that of the highest clock rate. And I suspect that CPUs actually do this, because I haven’t encountered problems of the second category yet and it’s only a theoretical problem for me for the time being. I may just be lucky, but I don’t believe so!
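For those who would rather not search, a minimal sketch of the affinity trick on Windows might look like this in C. SetThreadAffinityMask() and GetCurrentThread() are the real Win32 calls; the two wrapper functions and their names are my own. The Windows part is guarded so that the mask arithmetic compiles anywhere.

```c
#include <stdint.h>

/* Build an affinity mask with only one CPU's bit set.
   cpu_index must be below 64, or the shift is undefined. */
uint64_t affinity_mask_for_cpu(unsigned cpu_index)
{
    return (uint64_t)1 << cpu_index;
}

#ifdef _WIN32
#include <windows.h>

/* Bind the calling thread to one CPU. Returns nonzero on success.
   On 32-bit Windows the mask is only 32 bits wide, so cpu_index
   must be below 32 there. */
int pin_current_thread_to_cpu(unsigned cpu_index)
{
    DWORD_PTR mask = (DWORD_PTR)affinity_mask_for_cpu(cpu_index);
    /* SetThreadAffinityMask returns the previous mask, or 0 on failure. */
    return SetThreadAffinityMask(GetCurrentThread(), mask) != 0;
}
#endif
```

Calling `pin_current_thread_to_cpu(0)` from the thread that reads the time keeps all of its rdtsc readings on the same counter, at the cost of taking away the scheduler’s freedom to move that thread around.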

Anyway, for your reference and for purposes of comparison, I have measured some relevant timing values for the 5 time sources discussed above on my laptop (a T7200 CPU, i.e. an Intel Core 2 Duo at 2 GHz). They are presented in the table below:

| Time Source | Call Overhead (microseconds) | Minimum Value Jump In Two Successive Calls | Frequency (Hz) | Precision (MHz^-1, i.e. microseconds) |
|---|---|---|---|---|
| clock() | 0.03822 | 15 | 1000 | 1000 |
| GetTickCount() | 0.00313 | 15 | 1000 | 1000 |
| timeGetTime() | 0.02026 | 1 | 1000 | 1000 |
| QueryPerformanceCounter() | 1.921 | 5 | 3579545 | 0.279365 |
| rdtsc | 0.000516 | 60 | 2000000000 | 0.0005 |

Note: each number is measured and averaged over about 100 million iterations.
There are of course a couple more methods of reading time from other hardware sources, but their availability and parameters are rather system dependent, so I won’t go any further into the topic.

Author: Yaser Zhian | Categories: Engine
  1. May 9th, 2009 at 04:03 | #1

    I forgot to say. On Linux, your best bet is either the clock() function (don’t!), the rdtsc instruction or the gettimeofday() function, which gives a theoretical microsecond precision. Thought someone might be interested.

  1. May 9th, 2009 at 04:00 | #1