by chance - enhancing interaction with large data sets through statistical sampling

Alan Dix
Lancaster University, UK.
email: alan@hcibook.com
Geoffrey Ellis
Huddersfield University, UK.
email: g.p.ellis@hud.ac.uk


Paper accepted for AVI'2002, Advanced Visual Interfaces, May 22-24, 2002, Trento, Italy.

Full reference:

A. Dix and G. Ellis (2002). By chance - enhancing interaction with large data sets through statistical sampling. Proceedings of Advanced Visual Interfaces - AVI2002, Trento, Italy, ACM Press. pp. 167-176.
http://www.hcibook.com/alan/papers/avi2002/

See also:
more on randomness at: http://www.hcibook.com/alan/topics/random/
related work on visualisation at: http://www.hcibook.com/alan/topics/vis/


Abstract

The use of random algorithms in many areas of computer science has enabled the solution of otherwise intractable problems. In this paper we propose that random sampling can make the visualisation of large datasets both more computationally efficient and more perceptually effective. We review the explicit uses of randomness and the related deterministic techniques in the visualisation literature. We then discuss how sampling can augment existing systems. Furthermore, we demonstrate a novel 2D zooming interface, the Astral Telescope Visualiser, a visualisation suggested and enabled by sampling. We conclude by considering some general usability and technical issues raised by sampling-based visualisation.

keywords: random sampling, visualisation, very large data sets, Astral Telescope Visualiser, sampling from databases


Contents

1 Introduction and background
In which we consider some of the problems of visualising large data sets and also some of the uses of randomness in other areas of computing.
2 Existing randomness and alternatives
In which we review existing visualisation techniques that use random effects and also techniques that achieve similar aims.
3 Using randomness
In which we suggest ways of using randomness to enhance or enable different forms of visualisation and interaction.
4 Randomness and interaction
In which we discuss some of the issues sampling raises for interaction and how to choose correct sampling levels.
5 Sampling databases
In which we examine ways of extracting random samples from existing databases, look at some research literature on sampling from large data sets and see how this may be used to help design bespoke data storage.
6 Conclusions
In which we sum up that randomness is a jolly good thing and the next AVI should be held in Monte Carlo :-)


References


http://www.hcibook.com/alan/papers/avi2002/ Alan Dix 31/1/2002