IEEE Transactions on Software Engineering

December 18, 2002

 

To Whom It May Concern:

Please accept our submission to TSE, titled "Toolkit Design for Interactive Structured Graphics," for consideration. Please send all correspondence concerning this manuscript to me:

Ben Bederson

3171 A.V. Williams Building

Computer Science Department

University of Maryland

College Park, MD 20742

(301) 405-2764

bederson@cs.umd.edu

 

Thank you,

Benjamin B. Bederson


Toolkit Design for Interactive Structured Graphics


Benjamin B. Bederson, Jesse Grosjean, Jon Meyer

Human-Computer Interaction Laboratory

Institute for Advanced Computer Studies

Computer Science Department

University of Maryland, College Park, MD 20742

+1 301 405-2764

{bederson, jesse, meyer}@cs.umd.edu

 

ABSTRACT

In this paper, we analyze three approaches to building graphical applications with rich user interfaces. We compare hand-crafted custom code to polylithic and monolithic toolkit-based solutions. Polylithic toolkits follow a design philosophy similar to 3D scene graphs supported by toolkits including Java3D and OpenInventor. Monolithic toolkits are more akin to 2D Graphical User Interface toolkits such as Swing or MFC. We describe Jazz (a polylithic toolkit) and Piccolo (a monolithic toolkit), each of which we built to support interactive 2D structured graphics applications in general, and Zoomable User Interface applications in particular. We examine the trade-offs of each approach in terms of performance, memory requirements, and programmability. We conclude that, for most applications, a monolithic-based toolkit is more effective than either a hand-crafted or a polylithic solution for building interactive structured graphics, but that each has advantages in certain situations.

Keywords

Monolithic toolkits, Polylithic toolkits, Zoomable User Interfaces (ZUIs), Animation, Structured Graphics, Graphical User Interfaces (GUIs), Pad++, Jazz, Piccolo.

INTRODUCTION

Application developers rely on User Interface (UI) toolkits such as Microsoft's MFC and .NET Windows Forms, and Sun's Swing and AWT, to create visual user interfaces. However, while these toolkits are effective for traditional forms-based applications, they fall short when the developer needs to build a new kind of user interface component, one that is not bundled with the toolkit. These components might be simple widgets, such as a range slider, or more complex objects, including interactive graphs and charts, sophisticated data displays, timeline editors, zoomable user interfaces, or fisheye visualizations.

Developing application-specific components usually requires large amounts of custom code to manage a range of features, many of which are similar from one component to the next. These include managing which areas of the window need repainting (called region management), repainting those regions efficiently, sending events to the internal object that is under the mouse pointer, managing multiple views, and integrating with the underlying windowing system.
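
As a rough illustration of two of these chores, the sketch below tracks a damaged region as the union of changed bounds and finds the topmost item under a pointer. The class and method names (DamageTracker, invalidate, takeDamage, pick) are hypothetical and are not taken from any toolkit discussed in this paper; only java.awt.Rectangle is a real library class.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: names are illustrative, not from any real toolkit.
class DamageTracker {
    private final List<Rectangle> items = new ArrayList<>(); // back-to-front order
    private Rectangle damage = null;                          // null: nothing to repaint

    // Adds an item's bounds and marks that area as needing a repaint.
    void add(Rectangle bounds) {
        items.add(bounds);
        invalidate(bounds);
    }

    // Region management: grow the damaged area to cover the changed bounds.
    void invalidate(Rectangle changed) {
        damage = (damage == null) ? new Rectangle(changed) : damage.union(changed);
    }

    // Returns the area to repaint and resets the tracker; null if nothing changed.
    Rectangle takeDamage() {
        Rectangle result = damage;
        damage = null;
        return result;
    }

    // Picking: index of the topmost item containing the point, or -1 if none.
    int pick(int x, int y) {
        for (int i = items.size() - 1; i >= 0; i--) {
            if (items.get(i).contains(x, y)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        DamageTracker tracker = new DamageTracker();
        tracker.add(new Rectangle(0, 0, 10, 10));
        tracker.add(new Rectangle(5, 5, 10, 10));
        System.out.println("item under (7, 7): " + tracker.pick(7, 7));
        System.out.println("repaint area: " + tracker.takeDamage());
    }
}
```

A single bounding rectangle is the simplest possible damage model; it over-paints when two small changes are far apart, which is one reason hand-written versions of this code tend to grow.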

Writing this code is cumbersome, yet most standard 2D UI toolkits provide only rudimentary support for creating custom components: typically just a set of methods for drawing 2D shapes, and methods for listening to low-level events.

Some toolkits such as Tcl/Tk [18] include a structured canvas component, which supports basic structured graphics. These canvases typically contain a collection of graphical 2D objects, including shapes, text and images. These components could in principle be used to create application-specific components. However, structured canvases are designed primarily to display graphical data, not to support new kinds of interaction components. Thus, for example, they usually do not allow the application to extend the set of objects that can be placed within the canvas. They also do not adequately address complex, dynamic, or interactive content, or data binding. We have found that many developers bypass these structured canvas components and follow a roll-your-own design philosophy, rewriting large quantities of code and increasing engineering overhead, particularly in terms of reliability and programmability. There are also commercial toolkits available such as Flash [7] and Adobe SVG Viewer [3], but these are often difficult to extend and integrate into an application.

We believe future user interface toolkits must address these problems by providing higher-level libraries for supporting custom interface components. However, there is still an open question regarding which design philosophy to adopt for these higher-level toolkits.

In this paper, we consider two distinct design philosophies for toolkits to support creation of custom graphical components: monolithic and polylithic. We describe the key properties of monolithic and polylithic designs, and examine two toolkits that we built, Jazz[1], a polylithic toolkit, and Piccolo[2], a monolithic toolkit. Finally, we provide a qualitative and quantitative analysis to compare hand-crafted code with code written using these two toolkits, looking at application speed and size, memory usage, and programmability.

In this paper, we are concerned primarily with issues related to data presentation, painting, event management, layout and animation. We do not address many issues that modern UIs often include such as accessibility, localization, keyboard navigation, etc. In addition, our analysis is about our two specific toolkits. While our experimental results are clearly tied to these specific toolkits, we believe that the main lessons we learned are generalizable to other monolithic and polylithic toolkits.

REQUIREMENTS FOR NEW UI COMPONENTS

When creating a new kind of UI component, the choice between using a toolkit or writing hand-crafted code must be made based upon the requirements of the particular component being built. A very simple new component, such as a range slider (where users can control two parameters instead of just one), may not warrant a toolkit-based solution. On the other hand, a more complex component such as an interactive graph probably does.

Let us start by defining our requirements for such a toolkit. In our research, we are particularly interested in new visualization techniques, such as Zoomable User Interfaces (ZUIs) [10, 11, 12, 13] and fisheye visualizations [9, 16]. We are also interested in animation and in dynamic data displays. For components that support our needs, a range of toolkit requirements arise:

1)      The toolkit must be small and easy to learn and use with an existing GUI framework.

2)      The toolkit must manage painting, picking and event dispatch, with enough flexibility for applications to customize these features.

3)      It must be possible to write interaction handlers that provide for user manipulation of individual elements, and groups of objects.

4)      The toolkit must provide support for graphics that are non-rectangular or transparent, scaled, translated and rotated, as well as support for traditional interactive widgets such as buttons and sliders.

5)      Large numbers of objects must be supported so that rendering and interaction performance is maintained with complex scenes.

6)      View navigations (pans and zooms) should be available, and should be animated.

7)      Multiple views onto the surface should be supported, both via multiple windows, and via camera objects that are placed on the surface, used as "portals" or "lenses".

RELATED WORK

There are a number of research [17, 21] and commercial [5, 18] structured canvas toolkits available. However, most structured canvas components provide a fixed vocabulary of the kinds of shapes they support within the canvas. It can be difficult to create new classes of objects to place on the canvas. The Tk Canvas [18], for example, supports object-oriented 2D graphics, but it has no hierarchies or extensibility.

The InterViews framework [20] supports structured graphics and user interface components. Fresco [28] was derived from InterViews and unifies structured graphics and user interface widgets into a single hierarchy. Both Fresco and later versions of InterViews support lightweight glyphs and provide a hierarchy of graphical objects. However, these systems handle large numbers of visual objects poorly, and do not support multiple views onto a single scene graph, or dynamic scene graphs. They also do not support advanced visualization techniques such as fisheye views and context-sensitive objects.

A number of 2D GUI toolkits provide higher-level support for creating custom application widgets, or provide support for structured graphics. Amulet [21] is a toolkit that supports widgets and custom graphics, but it has no support for arbitrary transformations (such as scaling), semantic zooming, or multiple views.

The GUI toolkit that perhaps comes closest to meeting the needs for custom widgets is SubArctic [17]. It is typical of other GUI toolkits in that it is oriented towards more traditional graphical user interfaces. While SubArctic is innovative in its use of constraints for widget layout and rich input model, it does not support multiple cameras or arbitrary 2D transformations (including scale) on objects and views.

Morphic [4, 26] is another interesting toolkit that supports many of our listed requirements. Morphic's greatest strength is in the toolkit's uniform and concrete implementation of structured graphics, making it both flexible and easy to learn. But Morphic's support for arbitrary node transforms and full screen zooming and panning is weak. It also provides no support for multiple cameras, making it problematic for creating our zooming interfaces.

There were several prior implementations of Zoomable User Interface toolkits as well. These include the original Pad system [22], and more recently Pad++ [11, 12, 14], as well as other systems [15, 23, 24], and a few commercial ZUIs that are not widely accessible [1, 25; Chapter 6, 30]. All of these previous ZUI systems are implemented in terms of a hierarchy of objects. However, like GUI toolkits, they use a monolithic class structure that places a large amount of functionality in a single top-level Node class. In this paper we compare and contrast these kinds of toolkits with Jazz, a new toolkit we have developed which follows a polylithic design, and with Piccolo, a lightweight monolithic toolkit.

MONOLITHIC VERSUS POLYLITHIC DESIGNS

Object-oriented software engineers advocate the use of concrete class hierarchies in which there is a strong mapping between software objects and real-world things. These hierarchies tend to be easier for people to learn [19]. Modern GUI toolkits typify this design, using classes that strongly mirror real-world objects such as buttons, sliders, and containers. Similarly, toolkits for two-dimensional structured graphics usually adopt a class hierarchy whose root class is a visual object, with subclasses for the various shapes, lines, labels and images (Figure 1).

Figure 1: Class hierarchy of a GUI toolkit (left) and a structured-graphics toolkit (right).

In these toolkits, runtime parent/child relationships are used to define a visual tree, where each object in the tree is mapped to a portion of the display, and has a visual representation. Many of the complex mechanisms necessary for modern graphical interfaces (navigation, rendering, event propagation) are buried within the class structure.

Three-dimensional graphics toolkits provide an important counterexample. Toolkits such as Java3D [6] and OpenInventor [2] use a more abstract model. Here, distinct classes are used to represent materials, lighting, camera views, layout, behavior and visual geometry. Instances of these classes are organized at runtime in a semantic graph (usually a DAG) called a scene graph. Some nodes in the scene graph correspond to visual objects on the screen, but many of the nodes in the scene graph represent non-visual data such as behaviors, coordinate transforms, cameras, or lights (Figure 2). This design provides opportunities for introducing abstractions and promoting code reuse, though the downside is that it tends to yield a greater number of overall classes. While scene graphs are very common in 3D graphics, they are rarely used with 2D graphics.
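
The role of non-visual nodes can be sketched in a few lines. In the example below, transform nodes draw nothing themselves and only change the coordinate system seen by their descendants. The node types (Node, TransformNode, PointNode) are hypothetical stand-ins written for this illustration and do not reproduce the Java3D or OpenInventor API; java.awt.geom.AffineTransform is the standard Java2D class.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;
import java.util.ArrayList;
import java.util.List;

// Hypothetical scene graph types; not the Java3D or OpenInventor API.
class SceneGraphSketch {
    // Base node: holds children and forwards traversal to them.
    static class Node {
        final List<Node> children = new ArrayList<>();

        Node add(Node child) {
            children.add(child);
            return this;
        }

        // Walks the subtree, passing down the transform accumulated so far.
        void collect(AffineTransform current, List<Point2D> out) {
            for (Node child : children) {
                child.collect(current, out);
            }
        }
    }

    // Non-visual node: contributes only a coordinate transform to its subtree.
    static class TransformNode extends Node {
        final AffineTransform local;

        TransformNode(AffineTransform local) {
            this.local = local;
        }

        @Override
        void collect(AffineTransform current, List<Point2D> out) {
            AffineTransform combined = new AffineTransform(current);
            combined.concatenate(local); // local applies first, then the ancestors
            super.collect(combined, out);
        }
    }

    // Visual node: a point whose on-screen position depends on its ancestors.
    static class PointNode extends Node {
        final Point2D position;

        PointNode(double x, double y) {
            position = new Point2D.Double(x, y);
        }

        @Override
        void collect(AffineTransform current, List<Point2D> out) {
            out.add(current.transform(position, null));
            super.collect(current, out);
        }
    }

    // Returns the screen positions of all visual nodes under root.
    static List<Point2D> screenPositions(Node root) {
        List<Point2D> out = new ArrayList<>();
        root.collect(new AffineTransform(), out);
        return out;
    }

    public static void main(String[] args) {
        Node root = new Node();
        Node move = new TransformNode(AffineTransform.getTranslateInstance(10, 20));
        Node zoom = new TransformNode(AffineTransform.getScaleInstance(2, 2));
        root.add(move.add(zoom.add(new PointNode(3, 4))));
        System.out.println(screenPositions(root));
    }
}
```

Here the point at (3, 4) is first scaled to (6, 8) and then translated to (16, 28). Three of the four nodes in this graph have no visual representation of their own.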

Figure 2: Class Hierarchy of a typical 3D graphics toolkit

We call the concrete design approach adopted by most 2D toolkits monolithic, because these toolkits have a few large classes containing all the core functionality likely to be used by applications. We call the 3D toolkit design approach polylithic, because it consists of many small classes, each representing an isolated bit of functionality where several are often linked together to represent one semantic unit.

Monolithic toolkits suffer from a common problem: the toolkit classes tend to be complex and have large numbers of methods, and the functionality provided by each class is hard to reuse in new widgets. To support code reuse, toolkit designers often place large amounts of generally useful code in the top-level Component class that is inherited by all of the widgets in the toolkit. This decision leads to a very complex, hard-to-learn top-level class. In Microsoft MFC, the top-level CWnd class has over 300 methods. The original Java Component class has over 160 methods. And the newer Java Swing top-level JComponent class has over 280 methods. Even the new Microsoft .NET base class for GUI widgets (named Control) has 288 interactors (6 constructors, 72 properties, 153 methods, and 57 events). In addition, application developers are forced to accept the functionality provided by the toolkit's top-level class; they often cannot add their own reusable mechanisms to enhance the toolkit.
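
Readers who want to see how large these top-level classes are on their own platform can count methods with reflection, as in the sketch below. The helper name publicMethods is ours. The numbers it prints will differ from those quoted above, since they depend on the JDK version and on the counting rule; this version counts every public method, including inherited ones.

```java
// Counts public methods on toolkit base classes; results vary by JDK version.
class MethodCount {
    // Number of public methods visible on a class, including inherited ones.
    static int publicMethods(Class<?> type) {
        return type.getMethods().length;
    }

    public static void main(String[] args) {
        System.out.println("java.awt.Component:     "
                + publicMethods(java.awt.Component.class));
        System.out.println("javax.swing.JComponent: "
                + publicMethods(javax.swing.JComponent.class));
    }
}
```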

Polylithic designs, on the other hand, can potentially offer both reusability and customizability, because they compose functionality through runtime instantiation rather than through sub-classing. This promise of better toolkit maintainability and extensibility led us to the polylithic design of Jazz.

Composing Functionality

A design goal of polylithic systems is to compose functionality by using a runtime graph of nodes. Each node in the runtime graph contributes a specific piece of functionality according to its type. Polylithic systems thus shift complexity from the static class hierarchy into the runtime data structure. This contrasts strongly with monolithic systems, which rely heavily on the static class inheritance hierarchy to compose functionality. For example, consider defining a new kind of Button object. In a monolithic GUI toolkit, you might use a class hierarchy as shown on the left in Table 1:

Table 1: Use of inheritance in monolithic and polylithic designs

// Monolithic approach

class Component {

}

 

class Transform {

}

 

class Label extends Component {

}

 

class Button extends Label {

}

// Polylithic approach

class Node {

}

 

class TransformNode extends Node {

}

 

class LabelNode extends Node {

}

 

class ButtonBehavior extends Node {

}

The functionality is derived by statically extending the Label class and adding more methods. Button instances are created and added directly to the visual graph:

Button b = new Button("Click Me");

b.setLabel("Click Me");

b.setTranslation(20, 20);

 

Now consider a polylithic design, as seen on the right in Table 1. First, TransformNode, ButtonBehavior and LabelNode are all defined as subclasses of Node; they are otherwise unrelated entities. To create a new button, the developer creates a transform, a button behavior object and a label object independently, and adds them to the runtime scene graph explicitly to define the relationship between the button and its label, for example:

TransformNode root = new TransformNode();

ButtonBehavior behavior = new ButtonBehavior();

LabelNode label = new LabelNode("Click Me");

root.setTranslation(20, 20);

root.add(behavior);

behavior.add(label);

 

In this example, the label is added as a child to the ButtonBehavior object, which is added as a child to the root transform object.

By adopting this approach to composing functionality, the same ButtonBehavior class could conceivably be reused for many different kinds of buttons (e.g. image-based buttons), not just for buttons with labels.
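
The reuse argument can be made concrete with a small runnable sketch. The classes below are minimal stand-ins for the node types of Table 1, written only for this illustration; ImageNode, makeButton, describe and click are hypothetical names and do not correspond to the Jazz API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

// Minimal stand-ins for the node types of Table 1; every name is an assumption.
class ButtonReuseSketch {
    static class Node {
        final List<Node> children = new ArrayList<>();

        void add(Node child) {
            children.add(child);
        }

        // Describes the subtree as text; a real toolkit would paint it instead.
        String describe() {
            String name = getClass().getSimpleName();
            if (children.isEmpty()) {
                return name;
            }
            StringJoiner parts = new StringJoiner(", ", name + "(", ")");
            for (Node child : children) {
                parts.add(child.describe());
            }
            return parts.toString();
        }
    }

    static class LabelNode extends Node {
        final String text;
        LabelNode(String text) { this.text = text; }
    }

    // Hypothetical image node; it stores only a name and loads nothing.
    static class ImageNode extends Node {
        final String name;
        ImageNode(String name) { this.name = name; }
    }

    // Behavior node: reacts to a click without knowing what its children look like.
    static class ButtonBehavior extends Node {
        private final Runnable action;
        ButtonBehavior(Runnable action) { this.action = action; }
        void click() { action.run(); }
    }

    // Wraps any visual node in the same button behavior.
    static ButtonBehavior makeButton(Node visual, Runnable action) {
        ButtonBehavior behavior = new ButtonBehavior(action);
        behavior.add(visual);
        return behavior;
    }

    public static void main(String[] args) {
        ButtonBehavior textButton =
                makeButton(new LabelNode("Click Me"), () -> System.out.println("label clicked"));
        ButtonBehavior imageButton =
                makeButton(new ImageNode("icon"), () -> System.out.println("image clicked"));
        System.out.println(textButton.describe());  // ButtonBehavior(LabelNode)
        System.out.println(imageButton.describe()); // ButtonBehavior(ImageNode)
        textButton.click();
        imageButton.click();
    }
}
```

ButtonBehavior is written once and never refers to LabelNode or ImageNode, so adding a third kind of button requires only a new visual node type.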

Of course, similar functionality can be achieved in both monolithic and polylithic toolkits. In polylithic toolkits, new functionality is created by composing instances, whereas in monolithic toolkits, new functionality is introduced through sub-classing. In this sense, polylithic designs are more similar to prototype-based programming systems such as Self [29] or ECMAScript [4], which use runtime instancing to create derived types.

The example above immediately demonstrates the main drawback of polylithic systems: the application code is about twice as long as that for the monolithic system. More importantly, the programmer has to understand and manage four different node types. Monolithic systems also tend to be more familiar to programmers used to languages like Java or C#. On the other hand, because polylithic systems explicitly separate node types based on their functionality, they potentially encourage designers to think of useful abstractions from the outset. The polylithic design approach yields more flexible class hierarchies.

This flexibility is likely to be useful when applications and objects are built dynamically at run-time. This frequently happens in prototyping systems and within design tools. In these contexts, it could be quite powerful to dynamically load a new object (potentially downloaded from the web) and insert it into an existing scene graph, changing the behavior or look of an object in ways not imagined by the author of the original one. Thus, there is a trade-off between application code complexity and flexibility.
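
A minimal sketch of such a run-time insertion follows. It splices a decorator node between an existing parent and child, so the child's appearance changes without any change to the child's class. All names (TextNode, BracketDecorator, insertAbove, paint) are hypothetical, and painting is reduced to building a string so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node types; painting is simulated by concatenating text.
class RuntimeInsertionSketch {
    static class Node {
        final List<Node> children = new ArrayList<>();

        Node add(Node child) {
            children.add(child);
            return this;
        }

        // Default painting: paint the children in order.
        String paint() {
            StringBuilder out = new StringBuilder();
            for (Node child : children) {
                out.append(child.paint());
            }
            return out.toString();
        }
    }

    static class TextNode extends Node {
        final String text;
        TextNode(String text) { this.text = text; }

        @Override
        String paint() {
            return text + super.paint();
        }
    }

    // Decorator: changes how everything beneath it appears.
    static class BracketDecorator extends Node {
        @Override
        String paint() {
            return "[" + super.paint() + "]";
        }
    }

    // Splices decorator between parent and child, keeping the child's position.
    static void insertAbove(Node parent, Node child, Node decorator) {
        int index = parent.children.indexOf(child);
        if (index < 0) {
            throw new IllegalArgumentException("child is not under parent");
        }
        parent.children.set(index, decorator);
        decorator.add(child);
    }

    public static void main(String[] args) {
        Node root = new Node();
        Node first = new TextNode("a");
        Node second = new TextNode("b");
        root.add(first).add(second);
        System.out.println(root.paint()); // ab
        insertAbove(root, second, new BracketDecorator());
        System.out.println(root.paint()); // a[b]
    }
}
```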

THE JAZZ POLYLITHIC TOOLKIT

Jazz is a general-purpose toolkit for creating structured graphics with explicit support for Zoomable User Interface (ZUI) applications. Jazz is built entirely in Java and uses the Java2D renderer. Figure 3 shows a screen snapshot of PhotoMesa [8], a zoomable photo browser application we built using Jazz.

Jazz is a polylithic toolkit, offering functionality by composing a number of simple objects within a scene graph hierarchy. These objects are frequently non-visual (e.g. layout nodes), or serve to decorate nodes beneath them in the hierarchy with additional appearance or functionality (e.g. selection nodes). Jazz therefore tackles the complexity of a graphical application by dividing object functionality into small, easily understood and reused node types.

The base ZNode class in Jazz has 64 public methods (16 are related to events, 15 are related to the object structure, 14 are related to coordinates, and the rest are for other functions such as painting, saving, properties, and debugging.)

Figure 3: Screen snapshot of the PhotoMesa application, written using Jazz. It uses a Zoomable User Interface to give users the ability to see many images at once, grouped by directory. Because users can smoothly zoom in and out, they maintain control over the trade-off between number and resolution of images while simultaneously maintaining context, making it easier to avoid getting lost. PhotoMesa is available at http://www.cs.umd.edu/hcil/photomesa.

Jazz borrows many of the structural elements common to 3D scene graph systems. By using a basic hierarchical scene graph model with cameras, Jazz is able to directly support a variety of common as well as forward-looking interface mechanisms. This includes hierarchical groups of objects with affine transforms (translation, scale, rotation and shear), layers, zooming, internal cameras (portals), lenses, semantic zooming, and multiple representations.
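
To illustrate how a camera's view transform supports zooming, the sketch below scales the view about a chosen surface point so that the point stays fixed on screen. CameraSketch and its methods are hypothetical names used only for this illustration and are not the Jazz camera API; the arithmetic relies on the standard java.awt.geom.AffineTransform class.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Hypothetical camera; not the Jazz API. Maps surface coordinates to the screen.
class CameraSketch {
    private final AffineTransform view = new AffineTransform();

    // Pans the view by an offset expressed in surface units.
    void pan(double dx, double dy) {
        view.translate(dx, dy);
    }

    // Zooms by the given factor while keeping surface point (x, y) fixed on screen.
    void zoomAbout(double factor, double x, double y) {
        view.translate(x, y);
        view.scale(factor, factor);
        view.translate(-x, -y);
    }

    // Converts a surface point to screen coordinates.
    Point2D toScreen(double x, double y) {
        return view.transform(new Point2D.Double(x, y), null);
    }

    double magnification() {
        return view.getScaleX();
    }

    public static void main(String[] args) {
        CameraSketch camera = new CameraSketch();
        camera.zoomAbout(2.0, 100, 50);
        System.out.println(camera.toScreen(100, 50)); // the anchor stays at (100, 50)
        System.out.println(camera.toScreen(110, 50)); // a neighbor moves to (120, 50)
    }
}
```

Because each camera owns its own transform, several cameras can look at the same surface with different magnifications. An animated zoom would call zoomAbout repeatedly with small factors.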

Figure 4 shows a complete standalone Jazz program that displays "Hello World!". Default navigation event handlers let the user pan with the left mouse button, and zoom with the right mouse button by dragging right or left to zoom in or out, respectively. Jazz automatically updates the portion of the screen that has been changed, so no manual repaint calls are needed.

import edu.umd.cs.jazz.*;

import edu.umd.cs.jazz.util.*;

import edu.umd.cs.jazz.component.*;

 

public class ZHelloWorld extends ZFrame {

public void initialize() {

ZText text = new ZText("Hello World!");

ZVisualLeaf leaf = new ZVisualLeaf(