Enhancing State-Space Tree Diagrams for
Collaborative Problem Solving
Steven L. Tanimoto
University of Washington, Seattle WA 98195, USA,
WWW home page: http://www.cs.washington.edu/homes/tanimoto/
Abstract. State-space search methods in problem solving have often
been illustrated using tree diagrams.We explore a set of issues related to
coordination in collaborative problem solving and design, and we present
a variety of interactive features for state-space search trees intended to
facilitate such activity.Issues include how to show provenance of deci-
sions, how to combine work and views produced separately, and how to
represent work performed by computer agents.Some of the features have
been implemented in a kit called TStar and a design tool called PRIME
Problem solving and design processes tend to confront more and more complex
challenges each year. A solution to a single problem often requires expertise from
several domains. For example, Boston’s “Big Dig” required expertise from civil
engineers, geologists, urban planners, and politicians, among many others.
A group at the University of Washington is studying the use of the classical
theory of problem solving as a methodology in support of collaborative design.
One part of the approach involves the use of interactive computer displays to
anchor the shared experience of a team. This paper describes a set of issues and
possible features to address those issues. The issues and features pertain to the
use of interactive tree diagrams that show portions of a design space or solution
space, and that support a team in managing their work.
State-Space Search Trees
Problem solving has been cast in the formal framework of “state space search”
by early researchers in artiﬁcial intelligence, such as Newell and Simon (see
Simon, 1969). A solver starts with an initial state, representing an absence of
commitments to the particulars of a solution, and step by step, tries adding
particular elements or modifying the current state, in an attempt to reach a
“goal state”. The goal state, or perhaps the sequence of steps taken to reach
it, constitute a solution to the problem. The states visited in such a search can
usually be plotted as nodes of a graph. Each move from one state to another,
in turn, corresponds to an edge of the graph. If we decide that each state is
described by the sequence of operators applied, starting at the initial state,
then the graph has the structure of a tree, and it can be drawn as a tree. Such
diagrams have been used to illustrate textbooks of artiﬁcial intelligence, often
in conjunction with classical puzzles such as the Eight puzzle (Nilsson, 1971)
or the game of Tic-Tac-Toe as shown in Fig. 1. The use of state-space search
trees in the design of image-processing procedures was discussed in the context
of face-image analysis (Cinque et al, 2007) and that diagrammatic methodology
is shown in Fig. 2.
Fig. 1. A familiar kind of state-space search tree.This diagram corresponds to a partial
search in the middle of a Tic-Tac-Toe game.
This paper is concerned with a set of issues that arise when one chooses to use
state-space search tree diagrams as interactive representations of the progress of
a design or problem-solving team. Some of these issues are the following.
Provenance. Each node represents a partial design or solution. Who contributed
what to this design? From what other designs was it derived? Supports for
querying and viewing provenance are needed.
Fig. 2. Tree representing an exploration in a space of transformed images.
Negotiating Shared Views. A designer’s personal view of a design tree reﬂects
her priorities and preferences. In a shared view, priorities and preferences may
collide, leading to the need for compromises. Algorithms can oﬀer certain com-
promises automatically, but shared views that are most satisfactory to a group
may require give and take. Facilities to manage such negotiation are needed.
Visualization of Opportunity. The members of a team may understand their
own components of a design or solution, but it’s often diﬃcult to understand
the potential interactions of the pieces. In some cases, separately designed pieces
can be combined, and the diagram can indicate the compatibility of a piece for a
particular role. Some opportunities result from the applicability of operators to
newly computed states. Others result from the availability of resources in relation
to deﬁned tasks. Helping users see and evaluate opportunities is an important
goal for collaborative design and problem-solving tools.
Work by Agents. A collaborative team traditionally consists of people. However,
computer agents may be present and available to help out. Agents generally
work very diﬀerently from people, and special aﬀordances are required to deﬁne
and display tasks for agents and the results of those tasks.
While much research has been done separately on problem solving, tree diagram
layout, operations on trees, and interacting with trees, we mention in this section
a mix of foundations and some work that bridges these topics.
Theory of ProblemSolving. What we’ll call the “classical theory of problem
solving” was developed during the 1960s. An introduction can be found in Nilsson
(1971). The important concepts for understanding the roles of the tree diagrams
are presented below in section 2. The rationale for this somewhat reductionist
view of design and problem solving was provided by Simon (1969).
Flexible Tree Layout. In order to display a large tree so that various parts of the
tree can be distinguished, or so that a minimum or limited area is used, many
methods have been proposed. A survey of tree display methods was presented
by Hanrahan (2001). Popular methods include “hyperbolic trees” (Lamping et
al, 1995), subdivision diagrams, and cluster diagrams. Usually, the goal is to
display a large tree in such a way that individual nodes get their fair share of
screen real estate.
Interactive Tree Display. A family of interactive techniques for trees has been
developed in the context of operating-system ﬁle hierarchies. The best-known
example of a hierarchical ﬁle system browser is the Microsoft Windows Explorer
(not Internet Explorer). Interior nodes of the tree shown in Explorer correspond
to folders in the ﬁle system, and the leaves correspond to individual ﬁles. The
user can click on folder icons (situated at the nodes) to open and close the folders.
When a folder that contains ﬁles is closed, it has a hotspot labeled with a plus
(+) sign. Clicking on the plus sign opens the folder to reveal its contained ﬁles
and subfolders. A more general set of aﬀordances for hiding and revealing folders
is described in a US patent (see Chu and Haynes, 2007). Here it is possible to
have two nodes in a relationship where A is an ancestor of B, where A is closed,
but B is visible and open. In such a view, all of A’s descendants, except B and
its descendants (and any other nodes explicitly opened) are hidden.
Interactive Layout. There have been several developments that have combined
ﬂexible layout with interactivity. A notable example is “Degree of Interest Trees”
(Card and Nation, 2002). In such a tree, the user can indicate interest or disin-
terest in particular nodes of the tree by clicking on them, and the layout of the
tree immediately adapts to the new distribution of interest, using a continuous
animation to move from the current layout to the new layout.
Here we deﬁne the basic concepts on which our tree diagrams are based.
States and Moves. We assume that a problem, possibly a design problem, has
been deﬁned. We assume that a solution can be developed by starting with either
a blank slate or an arbitrary conﬁguration of elements (that we call the initial
state) and taking a series of steps called moves. (The term “move” comes from
the steps involved in solving puzzles and playing games like checkers.) Each move
leads to a new state. A state represents a snapshot, a particular point of progress,
in the process of ﬁnding a solution. A state that satisﬁes the requirements for
being a solution to the problem being solved is called a goal state. The set of all
states that can, in principle, be reached from the initial state my making moves
(possibly inﬁnite sequences of moves)
Operators. A move comes about by applying a method called an operator to a
given state. Any given operator may or may not be applicable to a particular
state. For example, when designing a house, an operator to “connect the ﬁrst
and second ﬂoors with a staircase in the northwest corner of the house” is only
applicable if both the ﬁrst ﬂoor and second ﬂoor have already been created
as parts of the design and no staircase already connects them in the northwest
corner of the house. The applicability of an operator is determined by a predicate
that is associated with it called its precondition. There are two kinds of operators:
ﬁxed operators and parameterized operators. A ﬁxed operator doesn’t require
any additional speciﬁcation in order to be applied. A parameterized operator
requires more information in order to apply it. For example, an operator to
“add a back deck” might require a (width, length) speciﬁcation. The allowable
parameters may depend on the state to which the operator is being applied.
Search Trees. In many puzzles, games, and real-world problem-solving situa-
tions, it is possible to arrive at a particular state via two or more diﬀerent se-
quences of moves. If displayed nodes always corresponded one-to-one with states,
the diagrams corresponding to such collections of states and moves would not
necessarily be trees but directed graphs that might even contain cycles.
We will avoid dealing with cycles and multiple paths to the same state by
deﬁning a node not strictly in terms of an underlying state but in terms of the
sequence of moves taken from the initial state. This ensures that the diagrams
we work with will always be trees. Not only that, it respects the diﬀerent prove-
nances of nodes even when they represent equivalent states.
We deﬁne a search tree to be an acyclic, directed graph, with at least one node,
the root, which corresponds to the initial state for a problem. Each node other
than the root has a unique parent node from which it is derived by applying
one operator (ﬁxed or parameterized). The sequence of moves (or practically
speaking, operators and any needed parameters) to get from the root to a node
N identiﬁes the node N. A search tree represents a portion of a state space. If
the state space is ﬁnite and small, the search tree could possibly represent the
whole space, but in general it will usually represent only a small fraction of the
entire state space. It typically represents an “explored portion” of a state space.
The way that the tree is drawn can be modiﬁed to reﬂect the properties of
nodes, operators, parameters, and to take into account viewing preferences and
status of current activities.
Potential vs Realized Nodes. A search tree, representing only a portion of a
state space, contains only “realized” nodes. The others are “potential nodes.” A
potential node corresponds to a sequence of operators that could be applied to
the initial state to obtain a state for the node, but that has not yet been applied.
When a tree is grown, some potential nodes are converted into realized nodes
by applying operators and thus making new moves.
A node schema corresponds to a set of potential and/or realized nodes all
of which can be produced by a sequence of operators some of which may be
parameterized operators whose parameters have not been speciﬁed. Changing
any parameter of a parameterized operator in the sequence always changes the
ﬁnal node, and so the schema can correspond to a multiplicity of nodes. A schema
can be considered a plan for exploration.
3Application to Collaborative and Design
We use the classical theory of problem solving outlined above for two reasons:
(1) provide a common language for the problem solving process to design teams
whose members represent diﬀerent disciplines, and (2) help humans interact, via
the computer interface, not only with each other, but with computational agents
that perform design or problem-solving services.
For convenience, let’s call our problem solving or design process simply “the de-
sign process,” with the understanding that the methodology serves either pur-
pose. We assume that the entire process of design can be broken down into small
steps that we call design acts. There are several types of design acts, including
communication acts, design steps, evaluation acts, and administrative acts.
Communication acts include writing and reading messages associated with
the project. The messages may be annotations attached to a diagram, or they
may be email or chat messages decoupled from precise design contexts.
Design steps are primarily applications of operators to existing nodes to
produce new nodes in a search tree. Selecting a node or adjusting a selection, in
order to apply an operator, may also be considered a design step.
An evaluation act is either a judgment or an application of an evaluation
measure to a node or set of nodes. A judgment is a human-created quantitative
or qualitative estimation of the value of a node on some scale with regard to some
particular characteristic. An evaluation measure is a mathematical function that
can be applied to a node to return a value (usually numeric). Such a value may
correspond to the degree to which a partial design meets a particular criterion.
For example, one evaluation measure for designs of houses is the number of
rooms in the house. Another is the square footage of area in the designed house
so far. A judgment might correspond to an aesthetic evaluation – how pleasing
is this ﬂoor layout to the eye of designer Frank?
Administrative acts include actions such as the deﬁnition of tasks and sub-
goals for the design team, commitments of eﬀort to tasks, and adjustments to
views of the progress made so far. It may be diﬃcult to separate administrative
from communicative acts, since a commitment is usually accompanied by an
expression of that commitment to others.
Roles of the Diagram
By using tree diagrams, it is possible to provide computer assistance, at some
level, for each type of design act mentioned above. Many communication acts
will relate to parts of a tree. Messages can refer to parts of a tree in two ways:
(1) by naming a labeled node or by describe the path to it from the root, and
(2) by being embedded in the node as an annotation. Messages can also refer
to nodes via descriptions or expressions in a node-speciﬁcation language. Such
a language is a kind of query language with features for identifying nodes not
only via their properties and annotations, but via their “kinship” relations.
Design steps are supported via interactive tree-building functionality. The
selection of nodes and application of operators can be performed using a direct-
manipulation style of construction. (An example of a design tree built with a
tool called PRIME Designer is shown in Fig. 3).
Evaluation is supported via annotation and editing tools for judgments and
via commands to apply built-in functions to sets of nodes.
Administrative acts such as task deﬁnition can often be expressed in terms
of work to be done exploring diﬀerent branches of a tree. Naming of subtrees or
work associated with them can make use of node annotation facilities.
In addition to helping with speciﬁc design acts, the diagrams provide an easily
browsable record of the history of the design process. They also provide contex-
tual information for particular tasks. Thus a state-space search tree provides an
organizing framework to a team for pursuit of solutions.
Layout and Visibility Aﬀordances
State-space search trees have the potential to grow very large. In order to make
large trees manageable on an interactive display, the following techniques are
commonly used: scrolling, overall scaling, and opening/closing (revealing/hiding)
of subtrees. We discuss variations on each of these techniques that can provide
additional user control in the case of search trees.
Scrolling is the business of translating the viewport on the tree to locations of
interest. It can be done manually by the user with scrollbars, or automatically
by the program according to policies or special commands. Manual scrolling is
well understood by users, but it can be diﬃcult to navigate when the viewport
is small relative to the entire diagram.
Automatic scrolling “by policy” means that the program changes the location
of the viewport when certain kinds of events occur. When a new node is created
as a result of a user’s instruction, automatically scrolling to the new node can be
helpful. This facilitates inspection of the new node, and it anticipates the likely
next step in search as one that builds further in the same direction, i.e., from
the new node (i.e., depth-ﬁrst).
Fig. 3. Design tree showing multiple paths developed by separate members of a team.
Also shown are “miniﬁed” nodes, an ellipsis under the ﬁrst miniﬁed node, and an
enlarged node.The four quadrants in each node correspond to diﬀerent roles on the
design team.This view is an “all-roles” view.
Automatic scrolling “by command” means that the program manages the
adjustment of the viewport in order to obey a command from the user, such
as “scroll to the root”, or “scroll to the parent of node 113.” Relative-motion
commands can also be useful: “move to left sibling.”
Scrolling is also assisted by birds-eye views, scrolling histories, and options to
“snap to tree nodes” (analogous to snap-to-grid policies in drawing programs).
Overall scaling, equivalent to zooming, is an essential viewing technique for large
trees. A challenge for designers of visualizations is providing controls so that
users can perform diﬀerential scaling, so that diﬀerent parts of the tree can have
diﬀerent scale factors. A form of diﬀerential scaling can be found in dynamic
tree layout methods such as hyperbolic trees (Lamping et al, 1995) and “degree-
of-interest” trees (Card and Nation, 2002). These methods are convenient for
giving prominence to regions of a tree, a region being a subtree or the nodes in
a graph neighborhood of a given node.
Another approach to diﬀerential scaling allows the user to associate, with
any node or nodes s/he deems special, a particular scale factor for that node.
That factor is then used as a relative factor, and zooming of the overall view
maintains the diﬀerential. Our TStar software supports this technique.
While scrolling and scaling are often considered to be geometric operations, they
are also means of information hiding, since scrolling is a way of showing some and
hiding other information in a diagram, and scaling makes some details visible or
invisible and shows more or less of the area of the entire diagram.
Now we describe two additional forms of information hiding, one of which we
call “hidden nodes” and the other “ellipsis.” They are closely related. They diﬀer
in three respects: how aﬀected portions of the diagram are rendered, what kinds
of sets of nodes can be hidden, and how the user interacts with them. We make
these distinctions in order to provide a rich set of view controls to users, and to
suggest additional possibilities for dealing with the complex but important issue
of working with state-space tree diagrams.
We deﬁne a hidden node to be an invisible, but fully computed node object,
that is neither rendered on the screen nor has any eﬀect on the rendering of the
rest of the tree. From the standpoint of screen appearance, a hidden node may
just as well be a node that was never created or that has been deleted. If it has
any children, they, too, must be hidden. In our TStar system, the user can hide a
subtree in order to create a simpler view, or unhide a subtree to (re)include it in
the diagram. Any overhead that would be associated with deleting or recreating
corresponding states is avoided in the hiding and unhiding operations.
In order for the user to know that a hidden node exists in a tree, a small
indication is available, on demand, in the form of a highlight on nodes having
hidden children. Hiding the root is not permitted in TStar.
A node in ellipsis is a tree node that is not displayed but is represented by an
ellipsis indication. One or more nodes, in an ancestor chain, can be represented
by a single ellipsis indication. An ellipsis is a collection of all the nodes in ellipsis
that are represented by the same ellipsis indication.
An ellipsis therefore represents a set of nodes in a tree whose existence is im-
portant enough to indicate, but whose details and whose number are suﬃciently
unimportant that they are hidden. For example, the nodes in an ellipsis may
have an important descendant, displayed in detail, but not have any details of
their own worth showing.
The technique of ellipsis gives us a means to show nodes that have been
explored but which we show in a very abbreviated form.
While we have just deﬁned an ellipsis as a static set, in practice the user may
work with dynamic ellipses. A basic dynamic ellipsis is a changing set of nodes
in ellipsis, and a user-controlled dynamic ellipsis is a basic dynamic ellipsis with
a boolean variable “currently-in-ellipsis” that can be set true or false by the user
to adjust the visibility of the nodes without losing the grouping.
In principle a single node might be an element of zero or more separate
user-controlled dynamic ellipses, but if a node belongs to two or more, and their
variable values conﬂict, a priority mechanism must be used to determine the sta-
tus of the node’s visibility. One simple priority mechanism is event-based: when
the variable associated with an ellipsis changes, update all its nodes to be consis-
tent with it, at the possible expense of consistency with other ellipses. Another
system would involve assigning numeric priority values to each user-controlled
dynamic ellipsis. Our system TStar currently implements basic dynamic ellipses
but not user-controlled dynamic ellipses or any priority systems.
The traditional means of displaying an ellipsis in text, formulas, and many
kinds of diagrams is three dots in a line: “. . .”. In a tree, an ellipsis can be
indicated by using a dotted line for a path containing nodes in ellipsis. An
example of this can be seen in Fig. 3 under the ﬁrst small blue node.
There is a layout issue for ellipses. How long should the dotted line be? On the
one hand, a reason for using ellipsis is to save space, and this suggests “as short
as possible.” However, it’s important that the line communicate the ellipsis, and
so if it’s dotted, three or more dots should be visible. If the line leads to a fully
visible node, then the length of the line might be suggestive of the depth of that
node. If the node chain in ellipsis is hundreds of levels long, and the rendered
line is the same length as a single normal tree edge, that might be misleading.
One layout choice could be that an ellipsis line should be 1.5 times as long as a
normal edge, and that if space permits, a numeric label next to the line should
indicate the actual length of the path represented by the ellipsis. The factor of
1.5 will intentionally tend to misalign the visible descendants of the nodes in
ellipsis from nodes on the left and right, reminding the viewer that they are not
at the same depth.
Ellipses not only hide node details and reduce clutter (see, e.g., Taylor et al,
2006) but they save space by reducing the amount of space allocated to the nodes