next up previous
Next: Theoretical Complexity Up: Data Structures and Algorithms Previous: Selection

Querying

occurs continuously as the user drags the mouse to update the selected range slider. During querying, the algorithm recomputes the hit set and updates the output on the screen. For the purpose of timing, we say that a single query occurs each time the DQI detects a change in the position of the mouse on the range slider. Experiments show that DQIs must process each query in about 0.1 seconds in order to give a continuous response [1].

During the selection operation, the DQI computes the maximum hit set, which is the hit set determined by the extreme values for the selected attribute and the current ranges for the other attributes. Then it partitions the maximum hit set into p buckets, one for each user-specifiable value for the current attribute. This is essentially a linear time counting sort of the maximum hit set. We store the size of each bucket and all left-to-right partial sums of these sizes (an example range slider with bucket information included is given in Figure 3).

  figure63
Figure 3: A sample range slider with bucket information. This example shows 10 records distributed on the range slider sorted in ascending order for that attribute. The arrows from the range slider to the hit set show the related positions in the sorted maximum hit set that each discrete pixel position of the range slider maps to where the numbers in this hit set gives the values of the records for that attribute represented by the slider. The lower array of boxes represents the partial sums of counts for a given pixel position of the range slider. This example contains only 7 hypothetical pixels for simplicity.

In selection, we also compute the information that facilitates computation of histograms and tight coupling of range sliders, i.e., the redisplay of all slider ranges when any range is changed. To achieve this goal a two dimensional array for each slider is kept (i.e., size tex2html_wrap_inline394 , shown in Figure 4).

  figure68
Figure: A table used to keep the histogram and the tight coupling information. This figure uses the same data and example from Figure 3. Each box in the square holds a count. The user sets the range for attribute 1. For a given specified range of attribute 1 we can find the valid range for attribute 2 by projecting the hit set's bounding rectangle onto attribute 2. Each row is a prefix sum of counts from left to right. So if we subtract column j from column i we can deal with the resulting difference array to find the valid range for attribute 2. The highest nonzero row (k) and the lowest nonzero row (m) give the valid range for attribute 2. Note that the histogram information for attribute 2 is just the difference array.

Thus, any time the range slider is updated, even by a large number of pixels (as might happen if the user moves the mouse extremely fast), we can determine the number of hits in constant time. We can also determine ranges for the other sliders in constant time per slider, and histograms for the other sliders in constant time per point. Changes to the display are determined by scanning the buckets between the older attribute value and the new one. The display is updated in time that is linear in the number of changes.

If the user changes the axis parameters for the starfield display, then the hits are redisplayed (projected on the new pair of coordinates) but none of the internal data structures is changed. The only thing that is updated is the starfield information.

The starfield information is equivalent to a two dimensional ``dirty-pixel'' array where for each starfield pixel we keep a count. This count tells us that whether that pixel position is dirty or not and if it is, how many points overlap at that pixel. Whenever a pixel count reaches zero that pixel color is set to the background color (and visa versa i.e., if it was zero and becomes a positive integer then we plot that pixel on the starfield display using the foreground color). A clever programmer can deal with these counts to give another form of histogram information to the user using different scales of different colors. Also we can immediately find the necessary pixels on the starfield display that must be updated (i.e., only the pixel positions where the counts vary).


next up previous
Next: Theoretical Complexity Up: Data Structures and Algorithms Previous: Selection

Egemen Tanin
Fri May 9 11:28:05 EDT 1997