A Cybernetic Understanding of Fitts' Law

information or interaction

Most treatments of Fitts' Law say WHAT is true, but not WHY. However, if one understands why, it is easier to predict where the law will hold and where it will fail. Whilst Fitts' original paper uses an analogy with Shannon and Weaver's information theory, it does little more than postulate some neurological information rate.

In fact it is easier to understand Fitts' Law if one considers the control task of hand-eye coordination. This is an interaction: you move the mouse (or other pointer), your eyes see the movement, you correct, and so on. This all happens in a fraction of a second.

Timing is critical, as the total delay from seeing something, through the processing in your brain, to the signal travelling down your arm to your muscles is between 150 and 200 milliseconds. Your arm moves a long way in that time, typically 70-90% of the way to the target. This is rather like controlling one of those Mars robots, where the transmission delay can be 20 minutes, or even more when Mars is further away. You could move it very slowly, inch by inch, or you could work out approximately where a given speed and direction would take it in 20 minutes and send it trundling off.

The human process is faster, but similar in nature.

(In fact, for the Mars Rover the times are so long that the vehicle has its own autonomous control as well. This is rather like the way your hand pulls back from heat before the signals ever get to your brain.)

the control process

Imagine we are writing the control circuitry for the human hand-eye pointing task.

Let's look at a basic motor control cycle:

  • (i) eyes see target and pointer
  • (ii) brain assesses (with some inaccuracy) direction to move pointer
  • (iii) brain tells arm/wrist/fingers to move
  • (iv) muscles move as requested (with some inaccuracy)
  • (v) eyes monitor movement
  • (vi) back to (ii)
  • (vii) when target hit shout "hooray"

This whole cycle has a minimum time associated with it due to the delays in processing in your brain, sending signals to the arms etc.

Let's assume three further things:

  • (a) the inaccuracies of perception and of movement are proportional to the distance
  • (b) the inaccuracy of direction of muscle movement is largely independent of speed
  • (c) our brain tells our muscles to move faster when the distance is longer

Now as a continuous process this is hard to imagine, especially because of the delays. It is like a general commanding an army in the days before radio: you only ever know where your soldiers were several days ago.

However, if instead we imagine this more as a series of discrete movements it is easier to imagine. Each movement corresponds to the hand-eye period.

Whilst this is a bit of a simplification it gives the general idea quite well, and indeed corresponds very closely to the observed behaviour for certain pointing devices. For others, in particular mouse movement, the process is not move-stop-view-plan, but more one of constant correction, but the time for the correction cycle is similar.

Fitts' law as discrete movements

Figure 1 shows the discrete steps through 4 cycles of sensing and movement. The diagram shows how the movements in each step get gradually smaller as the target gets closer.

Because of the processing delay, the shorter paths cannot be executed faster than the minimum delay. So it is reasonable to assume (c): that the brain tells the muscles to move slower and slower as the target gets closer. That is, the time for each movement is constant, not dependent on the distance moved.

Finally, because the errors are proportional to distance, the movements get smaller geometrically.

Figure 1. step-wise movement towards target

So we have a sequence of moves, each of which reduces the distance to the target geometrically and each of which takes the same time. When the remaining distance is such that the error circle of the remaining movement is smaller than the target, we can actually move and get inside the target.
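The sequence of moves just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `lam` (fraction of the remaining distance covered per move) and `sigma` (error radius as a fraction of the distance moved) are made up for this sketch, and the default values 0.8 and 0.1 are illustrative rather than measured.

```python
def discrete_steps(distance, target_size, lam=0.8, sigma=0.1):
    """Step-wise pointing model (illustrative sketch, not a fitted model).

    Each corrective move covers a fraction `lam` of the remaining distance.
    The error circle of the next move has radius sigma * lam * remaining.
    Corrective moves continue until that error circle is smaller than the
    target, at which point one final move can land inside it.

    Returns (number of corrective moves, list of remaining distances).
    """
    if not 0 < lam < 1:
        raise ValueError("lam must be strictly between 0 and 1")
    if sigma <= 0 or distance <= 0 or target_size <= 0:
        raise ValueError("sigma, distance and target_size must be positive")

    remaining = [float(distance)]
    # Keep correcting while the error circle is still at least target-sized.
    while sigma * lam * remaining[-1] >= target_size:
        remaining.append((1 - lam) * remaining[-1])
    return len(remaining) - 1, remaining


if __name__ == "__main__":
    for d in (100, 1000, 10000, 100000):
        n, path = discrete_steps(d, 10)
        print(d, n, [round(x, 1) for x in path])
```

With these assumed values, multiplying the distance by a constant factor adds a roughly constant number of moves, which is the logarithmic behaviour the argument predicts.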

try it yourself!

You can do a little experiment to see what this is like.

  • (i) choose an icon or screen control as your target
  • (ii) position the mouse somewhere else on screen
  • (iii) look carefully at the target
  • (iv) close your eyes
  • (v) with your eyes closed do a single movement of the mouse to try and get to the target (where you recall it to be)
  • (vi) open your eyes
  • (vii) if you have not hit the target, repeat from (iii)

I find I can hit toolbar-sized icons in about 3 steps and things like the window open and close boxes in no more than 4.

You may be surprised too at how accurate you are on the first movement.

I also find I tend to always undershoot, hence Figure 1 is drawn like that, although the calculations below do not depend on it. In fact it is reasonable that we subconsciously tend to undershoot in positioning tasks, as real world positioning is often to grab something. If you overshoot you will hurt your hand, or knock over the thing you are trying to grab.
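If you would rather not close your eyes, the experiment can be loosely imitated with a small simulation. This is a rough sketch under strong assumptions, not a model of real hands: movement is one-dimensional, each "blind" move aims at a fraction `aim_fraction` of the remaining distance (to mimic undershooting), and the error is Gaussian with a spread proportional to the intended move. All names and default values here are hypothetical.

```python
import random


def blind_pointing_trial(distance, target_radius, sigma=0.1,
                         aim_fraction=0.9, rng=None, max_moves=100):
    """Count single 'eyes closed' moves until the pointer is in the target.

    Assumptions (illustrative only): 1-D movement, Gaussian error with
    standard deviation sigma * |intended move|, and a deliberate
    undershoot given by aim_fraction.
    """
    if rng is None:
        rng = random.Random()
    remaining = float(distance)  # signed distance still to go
    moves = 0
    while abs(remaining) > target_radius and moves < max_moves:
        intended = aim_fraction * remaining
        actual = intended + rng.gauss(0.0, sigma * abs(intended))
        remaining -= actual  # "open your eyes" and see where you are
        moves += 1
    return moves


def mean_moves(distance, target_radius, trials=2000, seed=1):
    """Average number of moves over many simulated trials."""
    rng = random.Random(seed)
    total = sum(blind_pointing_trial(distance, target_radius, rng=rng)
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for d in (50, 500, 5000):
        print(d, mean_moves(d, 10))
```

The point is only the trend: under these assumptions the average number of moves grows slowly as the distance grows, rather than in proportion to it.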

back to information

Although the analogy with Shannon and Weaver's information theory is not sufficient in itself to explain Fitts' Law, the fact that the two formulae correspond so closely is no accident.

Think of the "infinite" Fitts' task of trying to hit an exact (size zero) target. Of course you could never do this, but it is really just to give the flavour. The pointing task could be thought of as trying to communicate the location from the eye to the mouse pointer ... but the channel (the visual-muscle system) has errors and hence is noisy. The noise level is related to the distance moved and hence sets a maximum rate on the amount of information the channel can carry.

For the finite Fitts' task we only care about positioning to an accuracy of S and hence we only need log(D/S) bits of information to be "communicated".
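As a small worked example of the log(D/S) figure, here is a sketch that takes the logarithm to base 2 (an assumption made here, since the text speaks of bits). The function name `bits_needed` is made up for this illustration.

```python
import math


def bits_needed(distance, target_size):
    """Bits needed to locate a target of size S at distance D.

    Uses log base 2 of D/S; the base is an assumption of this sketch.
    """
    return math.log2(distance / target_size)


if __name__ == "__main__":
    print(bits_needed(512, 16))   # D/S = 32
    print(bits_needed(1024, 16))  # doubling D adds one bit
```

Doubling the distance, or halving the target size, adds exactly one bit, which is why the time grows so slowly with distance.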

the limits of Fitts' Law

The cybernetic description also makes its assumptions explicit, and these show us the limits where Fitts' Law will fail.

One of the critical human abilities is to be able to tell your muscles where to move and to be able to predict where this will take the pointer. In the case of indirect pointing through a mouse, joystick or other devices, this means our brains have to 'understand' (in a tacit sense) the acceleration and other non-linear mappings between movement and location.

It is in fact quite amazing that our brains are able to learn these complex mappings (see Alan's cyborg driving essay for an explanation why!). However, it takes time and practice. This is why Fitts' Law depends (although this is usually not stated) on 'over learnt' behaviour, that is, use of a device that is so practised that the user has attained peak performance.

Not only does the brain need to know where the expected location of the pointer will be after one hand-eye cycle, but also, to avoid overshoot, how accurate that estimate is. So, for a device with some inaccuracy or noise in addition to those of your muscles, your brain needs to be able to 'learn' this level of inaccuracy to be able to assess how far short of the target to aim.

This is why menus at the top or bottom of the screen help: overshoot matters less there, so your brain can afford to aim to hit the target in one movement rather than deliberately falling short of it.

Any delay in the device adds to the total time of the hand-eye coordination loop. This gives rise to a slow down in the whole process and the 'B' figure gets larger by a factor of $(\tau_h + \tau_d) / \tau_h$, where $\tau_h$ is the hand-eye coordination time and $\tau_d$ is the device delay. This effect has been observed in experiments.
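To see where this factor comes from, one can substitute the longer cycle time into the expression for B given in the calculations at the end of this essay. The numbers in the last line (200 ms and 50 ms) are purely illustrative assumptions, not measurements.

```latex
% Requires amsmath for align* and \text.
\begin{align*}
  B  &= -\frac{\tau_h}{\log(1-\lambda)}
       && \text{(no device delay: each step takes } \tau_h \text{)} \\
  B' &= -\frac{\tau_h + \tau_d}{\log(1-\lambda)}
      = B \cdot \frac{\tau_h + \tau_d}{\tau_h}
       && \text{(each step now takes } \tau_h + \tau_d \text{)} \\
  \frac{B'}{B} &= \frac{200 + 50}{200} = 1.25
       && \text{(assumed: } \tau_h = 200\,\text{ms},\ \tau_d = 50\,\text{ms)}
\end{align*}
```

So, under these assumed figures, a 50 ms device lag would steepen the slope of the Fitts' Law line by a quarter.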

The logarithmic number of steps to the target is also dependent on the maximum speed of the device allowing a virtually complete movement to target within one hand-eye cycle. If this is not the case then a series of smaller steps will need to be taken and a different timing behaviour would be observed. For a screen size of around 1000 pixels, this means the device must be able to support movement speeds of the order of 5000 pixels per second.

Similarly at the lower end, the minimum (non-stationary) movement speed must be such that the target is not missed entirely within one hand-eye cycle. For example, if the target is 10 pixels across, the minimum speed needs to be around 50 pixels per second.
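Both speed limits follow from dividing a distance by the hand-eye cycle time. The sketch below assumes a cycle time of 0.2 seconds (the upper end of the 150-200 millisecond range given earlier, and the value the figures of 5000 and 50 pixels per second imply); the function name and parameters are made up for illustration.

```python
def required_speed_range(screen_size_px, target_size_px, cycle_time_s=0.2):
    """Rough device speed limits for step-wise pointing (illustrative).

    max speed: cross the whole screen within one hand-eye cycle.
    min speed: slowest non-zero speed must not carry the pointer
               further than one target width within one cycle.
    Returns (min_speed, max_speed) in pixels per second.
    """
    if cycle_time_s <= 0:
        raise ValueError("cycle_time_s must be positive")
    max_speed = screen_size_px / cycle_time_s
    min_speed = target_size_px / cycle_time_s
    return min_speed, max_speed


if __name__ == "__main__":
    low, high = required_speed_range(1000, 10)
    print(f"slowest speed at most about {low:.0f} px/s, "
          f"fastest speed at least about {high:.0f} px/s")
```

A device whose speed range does not span both ends would be expected to break the step-wise behaviour, and with it the logarithmic timing.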

In informal experiments with game controllers, Kiel Gilleade found that most of the small thumb joysticks on these did not obey Fitts' Law because they broke one or more of the constraints above. This is not to say they are not good controllers, just that they are not Fitts' Law ones. In fact, Kiel observes that real gamers simply push the controllers to max all the time anyway!

some references

P. M. Fitts and M. I. Posner. Human Performance. Wadsworth, Wokingham, 1967.
This is 'the' Fitts! He does not use a model like the above, but uses a parallel with Shannon and Weaver's Information Theory.
I. Scott MacKenzie. Motor Behaviour Models for Human-Computer Interaction. In HCI Models, Theories, and Frameworks: Toward a Multidisciplinary Science. John Carroll (ed.) ISBN 1-55860-808-7. Morgan Kaufmann, 2003. pp. 27-54
This chapter summarises a lot of current knowledge on Fitts' Law and related models of motor movement, but in common with most of the field focusing more on what the behaviour is rather than why it occurs.
I. S. MacKenzie, A. Sellen, and W. Buxton. A comparison of input devices in elemental pointing and dragging tasks. In S. P. Robertson, G. M. Olson, and J. S. Olson, editors, Reaching through technology - CHI'91 conference proceedings, pages 161-166. Human Factors in Computing Systems, ACM Press, New York, April, 1991.
Shows that the Fitts' Law constant varies depending on whether a mouse button is held down. Quite reasonably, our wrist and arm movement accuracy depends on whether we are trying to hold the mouse button down.
Y. Guiard, M. Beaudouin-Lafon, J. Bastin, D. Pasveer, S. Zhai. View size and pointing difficulty in multi-scale navigation. In Proceedings of the working conference on Advanced Visual Interfaces, AVI04. Gallipoli, Italy, 2004. pp. 117-124.
This shows that Fitts' Law continues to hold if one is able to zoom the interface in continuously as well as move towards the target, and also for cases where the target is off screen and one is guided by bulls-eye style circles radiating from it.
R. Balakrishnan. "Beating" Fitts’ law: virtual enhancements for pointing facilitation. Int. J. Human-Computer Studies 61 (2004) 857–874
A good review of the state of the use of varying mappings between physical movement and screen movement in order to try to improve on Fitts' Law performance. The conclusion is that improvements can be made when there are a small number of targets, but these drop off quickly as the number and density of targets grows.
[ Note: This use of manipulations in design goes back a long way, certainly in the early 1980's there were 'magnetic' buttons and scrollbars with 'gravity wells' to help you select more easily. This was largely forgotten until a few years ago when it has become more popular again. ]

See also CHI papers on Fitts' Law of path following and the way the Fitts' Law constant changes as the movement changes between different muscle groups (arm vs. wrist vs. finger movement).

A. Dix (2002). driving as a cyborg experience (working paper)
This essay tries to understand how it is that we humans can extend our idea of 'self' into artefacts such as cars, knife and fork, and mouse-screen interactions. It looks at this from an evolutionary psychology standpoint, looking for similar abilities needed in pre-technological life.

doing the calculations

Mathematically, let the distance to the target at each stage be $D_i$ and the distance moved $d_i$:

$D_0 = D$ - the initial distance

$d_i = \lambda D_i$ - where $\lambda$ is some empirical constant

$D_{i+1} = (1-\lambda) D_i$

$D_i = (1-\lambda)^i D$

The error circle has radius $r_i$:

$r_i = \sigma d_i$ - where $\sigma$ is some empirical constant

$r_i = \sigma \lambda (1-\lambda)^i D$

The process stops after $n$ steps when the radius $r_n$ is less than the target size $S$:

$\sigma \lambda (1-\lambda)^n D < S$

So $n$ is approximately:

$n \approx -\log(\sigma \lambda D / S) / \log(1-\lambda)$

Each step takes a fixed time $\tau$, and there will be some initial time for the brain to "get started", say $\alpha$. So the total time $T$ is given by:

$T = \alpha + \tau n$

$\;\; = \alpha + \tau \left( -\log(\sigma \lambda D / S) / \log(1-\lambda) \right)$

That is:

$T = A + B \log(D / S)$

where:

$A = \alpha - \tau \log(\sigma \lambda) / \log(1-\lambda)$

$B = -\tau / \log(1-\lambda)$

N.B. $\log(1-\lambda)$ is negative so $B$ is positive
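As a quick numerical check of this algebra, the sketch below computes A and B from the model constants and confirms that A + B log(D/S) matches α + τn. The function names and the parameter values used in the example are illustrative assumptions only; natural logarithms are used throughout, and any base works provided it is used consistently.

```python
import math


def fitts_constants(alpha, tau, lam, sigma):
    """A and B of T = A + B log(D/S), from the step-wise model constants."""
    log_shrink = math.log(1 - lam)  # negative, since 0 < lam < 1
    A = alpha - tau * math.log(sigma * lam) / log_shrink
    B = -tau / log_shrink           # positive
    return A, B


def time_from_steps(D, S, alpha, tau, lam, sigma):
    """T = alpha + tau * n, using the continuous approximation for n."""
    n = -math.log(sigma * lam * D / S) / math.log(1 - lam)
    return alpha + tau * n


if __name__ == "__main__":
    # Illustrative values only: 0.2 s per cycle, 80% of the distance per move.
    params = dict(alpha=0.1, tau=0.2, lam=0.8, sigma=0.1)
    A, B = fitts_constants(**params)
    for D, S in ((100, 10), (1000, 10), (1000, 50)):
        direct = time_from_steps(D, S, **params)
        fitted = A + B * math.log(D / S)
        print(D, S, round(direct, 4), round(fitted, 4))
```

The two columns agree for every D and S, since the second form is just a rearrangement of the first.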



Alan Dix © 2003,2005