Timestepping
Version 1
My name is Peter Sundqvist and I'm a game programmer at Turborilla. I'm writing this article because I haven't found any other text which covers all important aspects of timestepping in a game or simulation loop. If you've read articles like [1] or [2] you're familiar with the problem and know why you'd want a fixed timestep. Unfortunately though, these articles don't mention some significant issues you ought to consider. So let's go through timestepping carefully. Still lots of details are left out, like actual implementations, but shouldn't be too hard to fill in by an intermediate programmer.In case something is unclear, errorous or if you have comments or suggestions, then please drop a mail to peter
What's new?
If you've seen articles like this before you probably wonder why you're supposed to read yet another one. Here's a list of the important key points.
- The algorithms presented are not new but they may be in variations you haven't seen. The important part is that this article shows ways to evaluate them and the pros and cons of each method.
- Focus is on vertical synchronization and why your algorithms must take this into account.
- You should calculate the likely timing errors your algorithm will induce.
- Are interpolated draws, as desribed in [1] and [2], worth the trouble? What is gained and what is lost?
- Threads and triple buffering, how can they help?
- Choose the best timestep algorithm for your game. The needs are not the same for all types of games.
The basics
Almost every animated computer game will have some kind of a time stepping loop. In every cycle of this loop the game mechanics are integrated by some small delta time. This procedure we can call an
update
. After an update the game state is typically drawn on screen. Simply call this a
draw
. We should say that a draw doesn't change the game state in any way. It will therefore make no sense to do two draws in a row since the game would be rendering the same frame twice. It becomes important that the updates are synchronized with real time so that the accumulated delta times add up to the actual elapsed wall clock time. Otherwise the game may run too fast or too slow and likely with irregular speed. A simple solution to this is to use a variable delta time in the updates as follows:
start = current_time()
loop
dt = current_time() – start
start = current_time()
update(dt)
draw()
end
Listing 1
: Variable time step.
Here the game mechanics are moved along with a time of
dt
each update. It is called variable since dt will be different each cycle depending on the time it takes to update and draw. The method above is easy and commonly used but not deterministic. This can in a bad scenario mean that the game behaves well in one run while blowing up in your face in the next. Much care must be taken when programming a variable time step game to make sure it works reasonably well with very different timesteps. Things that become especially tricky to handle are physics, collision detection, networking and predictions for AI. Therefore many programmers suggest a fixed time step to make life easier and save some computation time. Fixed time steps is mainly what we'll focus on here. If you're not convinced this is a good idea please read the previously mentioned articles. Lets look at the simplest possible fixed time step algorithm:
dt = 1/60
loop
update(dt)
draw()
end
Listing 2
: Fixed time step without speed control.
Here the delta time is fixed to 1/60 of second each update i.e. 60 frames per second. The game is now deterministic to the level of floating point precision being different on different machines. This is probably close enough to deterministic for your game. It does however not at all correspond to real time. Your game will behave the same each time you run it on every machine but the speed will obviously differ widely. Let's look at the case where the computer does updates and draws too fast. We may try to solve it with something like:
dt = 1/60
start = current_time()
num = 0
loop
update(dt)
draw()
num = num + 1
wait(start + num*dt)
end
Listing 3
: Fixed time step with speed control.
The wait function should be interpreted as "wait until we've reached this time stamp". What happens in the first frame is that the game waits until 1/60 of a second has passed before doing the next update. This guarantees that the interval between updates will be correct if the computer can do the update and draw fast enough. Note that we always measure the time elapsed since the game loop started and not only the time of the current frame. This way we don't introduce any unnecessary clock drift from per frame calculation errors. All our algorithms work this way because we want to synchronize with real time also in the long run.
Now all seems well if we have a hasty computer, but now comes the real problem and the reason for writing this article. What matters after all is at what time the user gets to see the image of the game's state on screen. The drawing may likely use double buffering and the users computer can either have vertical synchronization enabld or disabled. If you don't know what vertical synchronization is please read
this
.
Vertical synchronization
If vsync is not enabled the new image will immediately be displayed on screen as soon as the draw function finishes. So is that a problem? Yes. Since the time it takes to do an update and draw may vary also the time between frames displayed on screen will vary. The game may appear jerky, sometimes moving faster and sometimes slower, because the rendered frames always show the game state logically 1/60 of a second apart. I will call this a
time variation flaw
in lack of a better name. The error can be up to 1.0 dt = 1/60 second but is likely smaller than that. The average speed of the game in the long run will however remain the same on all fast machines. Depending on the game this error of likely much less than
dt
and may not even be noticeable. 2D games genarlly suffer more than 3D games. Another, worse problem, is the tearing we get when vsync is disabled. The game may show parts of the old frame and new frame at the same time. This makes animations appear cut off and is very visible when the game has big, fast moving objects such as a scrolling background.
To solve the timing error we can break out the common command to swap buffers from the draw procedure and instead insert it after the wait in listing 3. The frame is now fully rendered to the back buffer in the draw phase, we then do the waiting and lastly swap the buffers right in time to get a nice 1/60 interval between frames. Tearing will still kill some games (Yes, I'm picky) but without vsync there is nothing we can do about it.
So what if vsync is enabled? Actually if vsync was set to 60 Hz the program in listing 2 (not 3) would be the perfect choice for a fast computer. Swapping buffers will now wait for the next vertical blank which happens with 1/60 of a second interval and our program is automatically synchronized with real time. However it is usually not possible to really know if vsync is enabled or in what frequency it runs. You may get a good guess but no guarantees. So what happens with the method in listing 3, with the swap buffer addition, if we consider 60 Hz vsync on a fast enough computer?
Say that the buffers where just swapped and we start on the next frame making first the update, then the draw. After the draw and wait exactly 1/60 second should've passed but since the measuring is not perfect we may miss the vblank by a mircosecond and we're forced to wait for another whole 1/60 second before the program can do the swap. All later frames will then need to wait a negative amount of time and we're out of sync forever. Too bad. Now you might notice that if the swap buffer is moved to before the wait all is well in wonderland, but we're back to the method which was bad for disabled vsync. Indeed it's contradictive.
The first solution we may think of is to wait only 95% of dt, swap buffers and then wait for the full dt to elapse. This does however require our time measuring to be synchronized with the vblanks so that a full dt is elapsed exactly when the vblank comes. It all seems dangerous. But before trying to fix anything let's look at the case when the vsync frequency is different from the update frequency.
Interpolated draws
Consider a vsync of 75 Hz and the update delta time 1/60 (60 Hz). It gives us five vblanks on only four updates. This means that every fifth frame must be the same as the previous. Figure 1 tries to illustrate this simple fact. It is far worse than anything we've looked at so far since the game will seem to halt for a short time when we render the same frame twice. This is very noticable, especially in 2D games. So what can we do about it without losing the nice deterministic bahavior?
Figure 1 : With four updates (illustrated as U) spread on five vblanks, two frames are bound to be the same.
A common idea is to save the current and the next game state and interpolate these before doing a draw. Of course we only need to save states relevant to the drawing like positions, rotations, scaling, and other animated properties. The code can look like this:
dt = 1/60
start = current_time()
num = 0
loop
while (current_time() – start) > num*dt
update(dt)
num = num + 1
end
alpha = (current_time() – start)/dt – (num-1);
draw_interpolated(alpha)
swap()
end
Listing 4
. Fixed time step with speed control and interpolated draws.
When an update occures in the code above we calculate what the game state will be 1/60 of a second in the future. As long as this time has not passed we will not do another update but instead draw as many times as we can and interpolate towards, but not beyond, this future game state. Notice that the draw now has a state which is changed by the interpolation variable alpha. It's become both a draw and lightweight update procedure. Figure 2 tries to explain a possible scenario. With a fast computer this method can produce unique frames for each vblank in 75 Hz. The result is better than before in the sense that the user will less likely notice the irregularities.
Let dt = 1/60 be the wanted frame time as usual.
- The first update U takes 0.2
dt
- A draw D(0.2) is hence done with interpolation 0.2.
- Swap S waits until vblank at 0.8 dt = 1/75 seconds
- Only 0.8 dt < dt has passed so we set out on a new draw with interpolation D(0.8)
- Swap S waits until vblank at 1.6 dt = 2/75 seconds
- Update U takes 0.2 dt
- Draw D(0.8) is done using the latest data from the update we just did i.e. we're drawing the game state at time 1.8
dt
.
- Swap S waits until vblank at 2.4 dt = 3/75 seconds
- and so on…
So now a problem with this method appears. In the end what the user see on her screen will be a frame with interpolation 0.2 at a time when 0.8 is expected, 0.8 when 1.6 is expected and so on. The errors are, in order, 0.6, 0.8, 0.6, 0.6 and 0.6. So overall we can say it's a lag of 0.6 dt and the error/jerk is just 0.2 dt each fifth frame. This will probably be good enough for most games. Of course the updates will not always take 0.2 dt but instead be in the range of 0.05 dt to 0.5 dt giving us the time variation flaw again.
So what if we have a vsync of 60 Hz? Wouldn't this be the ideal for a game that runs its updates at 60 Hz? Figure 3 tries to explain the case. The game updates at a steady rate but the game state seen on screen will be closer to the old game state (interpolation 0.2) rather than the one we ought to have seen (interpolation 1.0). We have two flaws; a constant lag of around 0.8 dt and an interpolation difference between frames which varies by the same amount as the update time variation. If we can put an upper bound on the updates we could completely remove the time variation flaw by waiting for the upper bound time to elapse before doing a draw.
One could go on and check what happens at other frequencies like 80 Hz or 120 Hz. The reasoning is the same. You'll get a bound for the lag and error in orders of dt.
In a real implementation we should also adjust the condition time > num*dt so we don't set out on another draw even though it's only a microsecond left until the update should be done, one can for example subtract 5% of dt .
A good merit of the interpolated draws is that we can run the game smoothly in slowmotion just by doing the updates less frequently.
Threads
Another approach which may pop up in your mind is to use threads. One thread could be doing all updates and another thread does the drawing. The program can look like the one in listing 5.
[Thread 1]
dt = 1/60
start = current_time()
num = 0
loop
update(dt)
num = num + 1
wait(start + num*dt)
end
[Thread 2]
loop
draw() // Can also be interpolated
swap()
end
Listing 5
. Fixed time step and drawing in a separate thread.
There are some alternatives to how the draw and update would interact here. The draw can for example (a) always use the latest fully completed update, (b) use an interpolation of the two latest updates or (c) always use the absolutely latest data from the currently ongoing update. Alternative (c) implies that we draw parts of the old and new state in the same frame. All these flavors come with more or less complicated resource locking and need for data redundancy but we shall not go into that. Just know that whatever the setup is all our problems with vsync, time variation and lag remains.
Too see this we just need to look at some examples. Say that we choose the easy method (a). The first draw must wait for the first update to finish and then there will be no point in doing another draw before the next update is done. On a fast enough computer it would be the same result as using the program in listing 3. We still have the vsync waiting problem where we can get inconsistent times between frames if vsync is disabled or miss a swap if it is enabled. The upside is that if a draw would take a long time to complete we can still proceed with an update at the same time. You can reason on method (b) and (c) in the same way.
So the only gain is that we can start on the next update even when the draw is waiting for a vsync or is busy rendering. Still all the problems from before remain. The cost is data redundancy and added difficulty of concurrency issues.
There are of course plenty of other ways to use threads to speed up the individual updates and draws, but this is not an article on optimization.
Triple buffering
Something we also should consider is if triple buffering can help us. If we have three draw buffers we can use one on the screen, put one on queue to the screen and do our drawing to the third. The queued buffer makes it possible to wait for vsync while doing other computations.
Triple buffering will only have effect if vsync is enabled and it will make the swap command return immediately. The gain is that we can begin another draw or update at the same time as we're waiting for the next vblank. It doesn't make a difference for the fast computer scenarios that's discussed above. It can be useful if the display frequency is lower than the update frequency or if the time of draws often become larger than the update frequency so we are forced to drop frames. The important thing to remember is that all the problems above still apply.
This is kind of like a simple version of the thread approach. The gain is that we can, like with threads, start a new update while waiting for vsync. Unlike threads however we can't start the update while still drawing.
Slow computers
So far we've assumed only fast enough computers with no temporary slowdowns. It's rarely the case in practice though. If the update and draw takes longer time than
dt
then we are simply forced to drop frames. If vsync is not enabled we can hope that sometime in the future a frame will be produced in less time than
dt
and get us back on track. If vsync is enabled however we must make up for this lost frame by doing two updates in a row. If we're always forced to do this the framerate will be half of the update frequency, e.g. 30 Hz with 60 Hz updates. This may be acceptable and it's not possible to do anything about besides getting a faster computer.
The pseudo code in listing 5 below does many updates in a row if necessary to get back on track with the real time.
dt = 1/60
start = current_time()
num = 0
loop
do
update(dt)
num = num+1
while current_time() – start > num * dt
draw()
swap()
wait(start + num*dt)
end
Listing 5
: Fixed time step with lower and upper bound speed control.
It's a simple but effective way to make slow computers keep up. Note that if the program should halt for a long time it will do a number of subsequent updates without doing any draws. I would recommend to set a bound on the number of consecutive updates in the while loop, e.g. a maximum of 5 before a draw is done. The game will then run in fast forward until it's back on track with realtime instead of completely freezing. This is left out from the code to keep it clean and simple. If the updates take longer time than dt we simply can't run the game on the computer in question, our only option will be to buy a new one.
Alternating framerate
It may be an annoyance if the framerate keeps alternating. This can happen if the draw load varies much or if the computer is only close to fast enough to render each frame. Sometimes the program is hitting the next vblank and other times missing it. A common case is a 60 Hz game going back and forth from 30 Hz. It's then better to step down and always do 30 Hz. This can be done by measuring the times it takes to produce each frame and if it's above dt too often we assume that we need to always do more updates. If the computer can't do 60 Hz we do 2 updates per draw and get 30 Hz. If it's still too slow we do 3 updates and get 20 Hz and so on. Listing 6 shows this idea in code.
extra_time = 0
loop
frame_start = current_time()
… update to state at current_time() + extra_time * dt …
… draw …
elapsed = current_time() – frame_start
… wait until at least (number of updates – 1) * dt has passed …
swap()
if elapsed > (1+extra_time) * dt
lag = lag + UP_STEP
if lag >= MAX_THRESHOLD
extra_time = extra_time + 1
end
else if elapsed < extra_time * dt
lag = lag – DOWN_STEP
if lag <= MIN_THRESHOLD
extra_time = extra_time – 1
end
else
lag = lag – 1 * sign(lag)
end
end
Listing 6. Prevent an alternating framerate by always doing extra updates if too many vblanks are missed.
This codes contains quite a lot of parameter fiddling. In the normal case we're producing frames in rougly the same rate. The lag varaible then slowly goes towards zero (the line after the last else statement). Instead if the time elapsed is too much we increase the lag varaible and check the threshold. We can for example have an UP_STEP of 3 and MAX_THRESHOLD of 15 which tells us that if 5 frames in a row are too slow we increment the number of updates per draw. Maybe only 3 frames in a row are too slow and then we get 2 fast frames and then again 3 slow. This will still trigger an increase of updates since the lag variable is only slowly decreasing to zero compared to the UP_STEP.
The same happens if we later start to render fast again, for example if the draw load is reduced. The lag variable becomes negative and lower then our tresholds which signals we can go back to 60 Hz or whatever was our original frame rate.
Conclusion
Is vsync enabled or disabled?
What is the vsync frequency?
If we know the answer to these quetions it's a much easier task to device the best timestepping algorithm. For example if vsync is on in 75 Hz and we have 60 Hz updates we would use interpolated draws with the precalculated fractions from figure 2. If vsync is disabled we use the program in listing 3 with the swap command added after the wait. However we usually must treat these as unknowns. The system can give us a good hint on what they are but no guarantees. So here are two suggestions on how to handle the problem:
(A) Favor the right settings
The idea is that people with the right settings, and a fast computer, should get the most of out the game and don't suffer any tearing, lag or frame time variation. This means that we should do everything in our power to enable vsync in the frequency we like and use the program from listing 5. If we see that it's not possible to get the settings we want we might display a warning and tell the user that the game will run smoother if the settings are changed manually.
This method is not only made to please the game enthusiast but it makes programming easier. We don't need to do the render interpolation work. It is of course also deterministic and synchronizd with real time even if the user has the wrong settings, it's just the visuals which will be less smooth.
To check if vsync is enabled and in what frequency it runs we can either trust the system hint or do measuring by drawing and swapping a few empty frames at the start of the game, during loading screens or at other well chosen locations. Without vsync the measured average time will be very small and with vsync it will be close to e.g. 1/50, 1/60, 1/75, 1/80, 1/120 or any other common rate. This may not always work because of some unfortunate temporary slowdown at the measuring point but it's likely to work most of the time. It's a good enough basis for displaying a warning.
I would in most cases recommend you to use this simple method. People with the wrong vsync are not worth the trouble. Depending on your game it may not even be noticable. Still if you want interpolations we can pull some tricks.
(B) Interpolate with prediction
Instead of blindly rendering until it's time for a new update, like the commonly suggested program in listing 4, we can try to predict when the next swap will occur and aim for that fraction of dt. For example if we predict that the next swap will happen at 3.2 dt we make sure we've done 4 updates and interpolate between update 3 and 4 with alpha 0.2. Listing 7 shows a way to do this.
dt = 1/60
num = 0
swap()
last_swap = current_time()
start = current_time()
next_swap_time = start
loop
while num * dt < next_swap_time – start
update(dt)
num = num + 1
end
alpha = (next_swap_time – start) / dt – (num – 1)
draw(alpha)
swap()
swap_time = current_time() – last_swap
last_swap = current_time()
next_swap_time = current_time() + swap_time
end
Listing 7 . Upper and lower bound speed control with predicted interpolated draws.
For 75 Hz vsync we would in the ideal case get the table below.
| Iteration
|
next_swap_time
|
num (after while loop)
|
alpha
|
| 0
|
0 dt
|
0
|
1.0
|
| 1
|
0.8 dt
|
1
|
0.8
|
| 2
|
1.6 dt
|
2
|
0.6
|
| 3
|
2.4 dt
|
3
|
0.4
|
The next_swap_time values will of course not be this exact but still our alpha values will be closer to the time the frame is actually displayed than without prediction. We get less lag and less frame time variation. You could try to make a table for 60 Hz vsync and also consider what will happen if we miss a swap and have to wait for the next. If you do, please let me know.
There is still a big flaw though. If vsync is not enabled the program in listing 7 will potentially run wild with draws. To fix this one could say that if swap_time is below dt /2 it will be set to dt .
Practical considerations
In all listings dt and other timing variables are presented in seconds e.g. 1/60 but you should not in practice save it as a double precision number 1.0/60.0. Instead use integers in all calculations. If you measure time in microseconds do comparisions like the one below.
current_time() – start > (num * 1000) / 60
If you target another frame rate then change the number 60, and if you measure in nanoseconds add some extra zeroes. The dt value sent to the update procedure can still be the fixed floating point number 1.0/60.0.
References
http://www.flipcode.com/archives/Main_Loop_with_Fixed_Time_Steps.shtml
[2] "Fix Your Timestep" by Glenn Fiedler
http://gafferongames.com/game-physics/fix-your-timestep/
-
Blitzen permalinkWow man thanks for that, very interesting!
