information visualization toolkit

introduction > toolkit structure

The design of the prefuse toolkit is based upon the information visualization reference model, a software architecture pattern that breaks up the visualization process into a series of discrete steps [1], from data acquisition and modeling to the visual encoding of data to the presentation of interactive displays. This process is illustrated in the figure below.

Figure. Diagram depicting the information visualization reference model. Source data is mapped into data tables that back a visualization. These backing data tables are then used to construct a visual abstraction of the data, modeling visual properties such as position, color, and geometry. The visual abstraction is then used to create interactive views of the data, with user interaction potentially effecting change at any level of the framework.
  • The first step is the collection of the source data to visualize. This could be a table of figures, a social network graph, a file directory structure, or any other data set.
  • This source data is then used to construct data tables, internal representations of the data as it is to be visualized. The process of going from source data to data tables might only involve reading in the data from a formatted file or database, but could potentially involve any number of data transformations.
  • The resulting data tables (which, despite the name, can also represent networked data structures such as graphs and trees) are then subject to visual mappings to create a visual abstraction, a data model that includes visual features such as spatial layout, color, size, and shape. The visual abstraction is responsible for containing all the information needed to draw a visual representation of the data.
  • The actual rendering of the data in the visual abstraction is done through a process of view transformations, in which rendering components draw the contents of the visual abstraction into any number of interactive views. These views can provide varying perspectives onto the data, for example by supporting panning and zooming operations to home in on specific regions, or by using an array of "small multiples" displays to show different snapshots of a fluctuating data variable.
  • User interaction with the visualization (most commonly through mouse and keyboard input) can feed back into this process, causing changes or updates at any stage of the visualization pipeline. Examples include dragging an item, zooming into a view, or opening a different data file.
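
The stages above can be sketched in plain Java. This is only an illustration of the reference model under simplifying assumptions, not prefuse code: every class and method name here (ReferenceModelSketch, Row, VisualRow, toDataTable, toVisualAbstraction, renderView) is hypothetical, the source data is assumed to be "name,value" text lines, and "drawing" is reduced to producing strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical names throughout; none of these are prefuse classes.
class ReferenceModelSketch {

    // One row of a data table: the internal representation of a source record.
    static final class Row {
        final String name;
        final double value;
        Row(String name, double value) { this.name = name; this.value = value; }
    }

    // One entry of the visual abstraction: backing row plus visual properties.
    static final class VisualRow {
        final Row data;
        final double x, y;
        final String color;
        VisualRow(Row data, double x, double y, String color) {
            this.data = data; this.x = x; this.y = y; this.color = color;
        }
    }

    // Data transformation: parse "name,value" lines of source data into a table.
    static List<Row> toDataTable(String source) {
        List<Row> table = new ArrayList<>();
        for (String line : source.trim().split("\n")) {
            String[] parts = line.split(",");
            table.add(new Row(parts[0].trim(), Double.parseDouble(parts[1].trim())));
        }
        return table;
    }

    // Visual mapping: assign position and color from the data values.
    static List<VisualRow> toVisualAbstraction(List<Row> table) {
        List<VisualRow> items = new ArrayList<>();
        for (int i = 0; i < table.size(); i++) {
            Row row = table.get(i);
            String color = row.value >= 50 ? "red" : "blue";
            items.add(new VisualRow(row, i * 10.0, row.value, color));
        }
        return items;
    }

    // View transformation: apply this view's pan and zoom while "drawing" to text.
    static List<String> renderView(List<VisualRow> items, double zoom, double panX) {
        List<String> drawn = new ArrayList<>();
        for (VisualRow item : items) {
            drawn.add(String.format(Locale.ROOT, "%s@(%.1f,%.1f) %s",
                    item.data.name, (item.x + panX) * zoom, item.y * zoom, item.color));
        }
        return drawn;
    }

    public static void main(String[] args) {
        List<VisualRow> items = toVisualAbstraction(toDataTable("a,10\nb,80"));
        // Two views over the same visual abstraction, differing only in pan and zoom.
        System.out.println(renderView(items, 1.0, 0.0));
        System.out.println(renderView(items, 2.0, 5.0));
    }
}
```

Keeping pan and zoom out of the visual abstraction is the point of the separate view transformation step: several views can show the same items from different perspectives without recomputing the visual mapping.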

The reference model described above is quite similar to the popular model-view-controller design pattern for implementing user interfaces. This pattern breaks up a user interface component (e.g., a slider or combo box) into

  • a model containing backing data values
  • one or more views displaying the contents of this model, and
  • controllers for processing user input and appropriately updating the model and view in response.
The information visualization reference model extends this common pattern by adding a level. The data tables serve as a baseline data model which can back any number of visualizations, with each visual abstraction serving as a visualization-specific model with its own set of views and controllers.
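
As a rough illustration of this extra level, the sketch below shows one shared data table backing two visualization-specific models. It is plain Java, not prefuse code, and the names SharedTableSketch, DataTable, VisualAbstraction, and positions are hypothetical; the "visual state" is assumed to be a single scale factor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names; a plain-Java illustration, not prefuse code.
class SharedTableSketch {

    // Baseline data model: can back any number of visualizations.
    static final class DataTable {
        final List<Double> values = new ArrayList<>();
    }

    // Visualization-specific model: reads the shared table, owns its visual state.
    static final class VisualAbstraction {
        private final DataTable table;
        private final double scale; // visual state private to this visualization

        VisualAbstraction(DataTable table, double scale) {
            this.table = table;
            this.scale = scale;
        }

        // Positions that a view of this visualization would draw.
        List<Double> positions() {
            List<Double> result = new ArrayList<>();
            for (double v : table.values) result.add(v * scale);
            return result;
        }
    }

    public static void main(String[] args) {
        DataTable table = new DataTable();
        table.values.add(1.0);
        table.values.add(2.0);
        VisualAbstraction a = new VisualAbstraction(table, 10.0);
        VisualAbstraction b = new VisualAbstraction(table, 100.0);
        table.values.add(3.0); // a change to the shared table reaches both models
        System.out.println(a.positions());
        System.out.println(b.positions());
    }
}
```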

The figure below illustrates how the different packages and classes of the prefuse toolkit implement the information visualization reference model, providing support for each stage of the visualization pipeline.

Figure. Diagram depicting the relation of different prefuse packages and classes to the infovis reference model.
  • The package provides Table, Graph, and Tree data structures for representing data, providing the data tables of the reference model. Table rows are represented by the Tuple class, while the Node and Edge classes represent the members of graph and tree structures. The Graph and Tree classes are implemented using Table instances to store the node and edge data. These data structures are memory efficient and, as discussed later in the manual, can also be queried for specific data ranges or values.
  • As an advanced feature, prefuse also provides an interpreted expression language. This language can be used to write queries over prefuse data structures and to create derived data columns as functions of existing data fields (thus providing an easy form of data transformation). The expression language is implemented using the classes of the package and textual expressions are parsed by the ExpressionParser class.

  • The package provides classes for reading and writing table, graph, and tree data from formatted files. For tables, CSV (comma-separated values) and delimited text (tab-delimited, pipe-delimited, etc.) files are supported. For network structures, the XML-based GraphML and TreeML file formats are supported. The package provides facilities for issuing queries to a SQL database and then returning the result within a prefuse Table. Appropriately structured tables returned from a database can also be used as the node and edge tables in a graph or tree.
  • A visual abstraction of a data set can be created by adding the data to the prefuse Visualization class. This creates a special data structure that includes the original data but also introduces new visualization-specific data fields, such as x,y coordinates, and color, size, and font values. For any backing Tuple, Node, or Edge added to the visualization, corresponding VisualItem instances are created. VisualItems provide access to both the visual attributes and the underlying data values. NodeItem and EdgeItem are VisualItem instances that also provide access to a backing graph structure.
  • Specific visual mappings are provided by Action modules. These are independent processing modules for setting item visibility, computing layouts, assigning color values, and any number of other processing tasks over the VisualItem instances in a Visualization. The prefuse.action package and its sub-packages provide a rich library of Action components for layout, visual encodings, distortion (e.g., fisheye views), and animation. Custom visualizations often involve creating new Action subclasses to provide application-specific processing tasks.
  • The actual appearance of VisualItem instances is determined by Renderer modules. Renderers are responsible for drawing items and computing item bounds (how much space an item takes up on the screen). Prefuse provides Renderers for drawing various shapes, labels, and images. Furthermore, the Renderer interface is quite simple (just three methods), easing the process of creating custom renderers. Which Renderer to use for a given VisualItem is determined by a RendererFactory, which is asked for the appropriate Renderer each time a VisualItem is to be drawn to the screen.
  • Interactive views are provided by the Display component, which acts as a camera onto the contents of a Visualization. The Display draws all the items within its current view, and can be panned, zoomed, and rotated as desired. A single Visualization can be associated with multiple Display instances, enabling different multi-view configurations, including overview + detail views and small multiples displays. Display instances are first-class user interface components, and can be added into Java applications and applets.
  • Each Display also supports any number of interactive Controls, which process mouse or keyboard actions on the Display and on individual VisualItems. The prefuse.controls package provides pre-built controls for selecting focus items, dragging items around, and panning, zooming, and rotating the Display view. Furthermore, it is easy to create custom Controls by subclassing the ControlAdapter class.
  • Finally, interaction can also occur through the use of the dynamic query bindings provided in the package. These classes create a binding between a column of table data and an expression Predicate (or query) over that column. These bindings can automatically generate appropriate user interface components (e.g., sliders, radio buttons, check boxes, text search boxes, etc.) for directly manipulating the settings of the query. As seen later in the example application, this can be used to interactively filter for data items of interest.
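
To make the division of labor among these components more concrete, here is a minimal plain-Java sketch of the roles described above. It deliberately does not use the prefuse classes: Item, ItemAction, ItemRenderer, rendererFor, and display are hypothetical stand-ins for VisualItem, Action, Renderer, RendererFactory, and Display respectively, and the real interfaces differ in their details. Drawing is again reduced to producing strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-ins for the roles described above; not the prefuse API.
class ToolkitRolesSketch {

    // Stand-in for a visual item: backing data plus visual attributes.
    static final class Item {
        final String label;
        final double value;
        double x;
        String color = "gray";
        boolean visible = true;
        Item(String label, double value) { this.label = label; this.value = value; }
    }

    // Stand-in for an action: a processing module run over all items.
    interface ItemAction { void run(List<Item> items); }

    // Stand-in for a renderer: turns one item into something drawable (text here).
    interface ItemRenderer { String draw(Item item); }

    static ItemAction layout() {
        return items -> { for (int i = 0; i < items.size(); i++) items.get(i).x = i * 20.0; };
    }

    static ItemAction colorBy(double threshold) {
        return items -> { for (Item it : items) it.color = it.value >= threshold ? "red" : "blue"; };
    }

    // A query predicate drives visibility, as a dynamic query filter would.
    static ItemAction filter(Predicate<Item> query) {
        return items -> { for (Item it : items) it.visible = query.test(it); };
    }

    // Stand-in for a renderer factory: picks a renderer per item at draw time.
    static ItemRenderer rendererFor(Item item) {
        if (item.label.length() > 3) return it -> it.color + ":[" + it.label + "]";
        return it -> it.color + ":*";
    }

    // Stand-in for a display: draws every visible item.
    static List<String> display(List<Item> items) {
        List<String> drawn = new ArrayList<>();
        for (Item it : items) if (it.visible) drawn.add(rendererFor(it).draw(it));
        return drawn;
    }

    public static void main(String[] args) {
        List<Item> items = new ArrayList<>();
        items.add(new Item("ab", 5));
        items.add(new Item("long", 50));
        List<ItemAction> pipeline =
                Arrays.asList(layout(), colorBy(10), filter(it -> it.value >= 1));
        for (ItemAction action : pipeline) action.run(items);
        System.out.println(display(items));
    }
}
```

Rerunning only the filter action with a different predicate changes what the display draws without touching layout or color, which is the kind of update a slider bound to a query would trigger.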


  1. The information visualization reference model was developed in the Ph.D. thesis work of Ed Chi, under the name of the data state model. Chi showed that the framework successfully modeled a wide array of visualization applications and later showed that the model was functionally equivalent to the data flow model used in existing graphics toolkits such as VTK. In their book Readings in Information Visualization: Using Vision to Think, Card, Mackinlay, and Shneiderman present their own interpretation of this pattern, dubbing it the information visualization reference model.
