© 2009-2012 by Fred Lakin
The work described in these essays has two goals: to both understand and support the
phenomenon of human text graphic activity. And if the phrase "phenomenon of human
text graphic activity" bothers you, just substitute "blackboard behavior." All we need in order
to proceed with the discussion is a general category that includes the doing of visual stuff by
people on blackboards (and whiteboards too, what the heck).
From the beginning it was clear that the two goals were related. Generating text and graphics over
time, the activity in question, is by its very nature artifactual. Therefore, as long as an artifact must
be constructed anyway to support the activity, why not also design the artifact to help reveal the
structure of that activity?
And in turn, better understanding of the activity will then allow better artifacts to be designed for understanding and supporting it (and so on, and so on, bootstrap).
Ergo, I set out to design and build such a dual purpose tool.
I tried numerous times to write a paragraph called "Who would enjoy this book." But with each
version, I was afraid that many of the very folks (the very visual, logical, and inquisitive folks
whom I hoped to entertain) would read the paragraph, take it the wrong way, self define outside
my enclosure, and thus not venture further when really they should. Eventually I gave up and
instead drew and wrote a very graphic intro to the intro (see next section), in hopes of enticing one
and all.
[Exception condition, you know who you are: (first (last *footnotes*)) ]
The Visual Punchline: lumpy oatmeal, visual grammars for visual languages, and Dave the visual agent
First, for the very very visually impatient, this section is simply a sequence of pictures with explanations. The images and text are intended to show you: that expressions in a visual language are like coagulated lumps in oatmeal, all the same basic stuff, only just more highly structured; that "trees" (thin spider webs that actually look like roots) are a diagrammatic way to represent structure; and, that both humans and software can use that structure in dealing with images.
To start things off, here is a spread of metaphorical text graphic oatmeal with lumps:
And then here is the notation I will use to represent details of structure. The structures are called "trees"
even though they look more like roots. The fine lines show the grouping structure of text graphic
objects without violating their visual integrity. The spider webs are simply an overlay to be turned
on or off; they show the objects to be members of ordered, recursive lists (during manual
manipulation of objects, the order is often ignored and the lists are just used as simple groupings).
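The idea of objects as members of ordered, recursive lists can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Atom` class, its fields, and the helper names `atoms`, `depth` and `outline` are hypothetical and are not taken from PAM or vmacs.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Atom:
    """A drawn primitive placed at (x, y). Field names are hypothetical."""
    kind: str    # "text" or "line" (assumed labels)
    label: str
    x: float
    y: float


# A pattern is an ordered list whose members are atoms or other patterns.
Pattern = List[Union[Atom, "Pattern"]]


def atoms(obj):
    """Yield every atom in the tree, depth first, in list order."""
    if isinstance(obj, Atom):
        yield obj
    else:
        for member in obj:
            yield from atoms(member)


def depth(obj):
    """Nesting depth: 0 for an atom, 1 + the deepest member for a group."""
    if isinstance(obj, Atom):
        return 0
    return 1 + max((depth(member) for member in obj), default=0)


def outline(obj, indent=0):
    """Render the grouping structure as indented lines of text,
    a textual stand-in for the spider web overlay."""
    pad = "  " * indent
    if isinstance(obj, Atom):
        return [f"{pad}{obj.kind}: {obj.label}"]
    lines = [f"{pad}group"]
    for member in obj:
        lines.extend(outline(member, indent + 1))
    return lines


title = Atom("text", "Agenda", 0, 0)
bullet = [Atom("line", "dot", 0, 1), Atom("text", "budget", 1, 1)]
picture = [title, [bullet]]

print("\n".join(outline(picture)))
```

Because the grouping lives in the lists rather than in the atoms, the overlay can be computed (or ignored) without touching the drawn objects themselves.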
But, for the visually impatient and ruthlessly practical, maybe not so much. A better
understanding of visual languages, as evidenced by visual grammars to represent underlying
structure? Well, too high falutin'. Where's the meat? Instead, how about this: visual grammars
provide a way for a piece of software called a visual agent to recognize patterns in the
text graphic flow and intervene to assist the human in her visual activity.
Bottom line, trees are the answer. During the fact, providing a framework for improvisation; after the fact, providing a framework for analysis. And, for the grand finale combo shot, both in vivo and in cogitare, where Dave the visual agent in real time uses the spatial arrangement of objects generated by the performer, along with the structure specified by the trees in the grammar, to recognize patterns in the performer's activity and take immediate action.
The Plan
OK, so we have the phenomenon, text graphic activity, and we know the payoff, visual grammars
to represent the underlying structure of the activity for use by a visual agent. It merely remains to
connect the two in a useful way via a plan, which plan will unfold as the details of our graphic story ...
Having skipped ahead to the end, we must now return to the beginning of the project, where the first step was a definitional one. Or rather, the lack thereof. Another thing which had become clear to me was that the distinction between text and graphics was arbitrary and context dependent, varying widely from one visual communication system to another, and therefore must be deferred as long as possible.
Hence I decided early on to simply refer to all visual objects as "text graphic," leaving it a task
for later as to which was which, and how and when to define the difference[1]. Some more
examples of text graphic objects which the visual tool was designed to handle:
Instrumentality
After constructing various paper based systems like the "Wall Scroll" ...


and the "Vacuum Board Vertical Desk", and recording their use with time lapse
photography, and then studying the recordings, I decided to try building a visual instrument
which employed computer graphics.
Although my original intent was simply to emulate my previous paper systems using a computer controlled display, I soon realized that computer science had some concepts which might be of great use to my project.
A key insight was the realization that John McCarthy's LISP could be generalized from
"Computing with (textual) Symbolic Expressions" to "Computing with Text Graphic Forms".
That is, visual objects could direct the processing of other visual objects. This system, called
PAM for "PAttern Manipulating" (even as LISP stands for "LISt Processing")
would be a simple and clear framework for describing text graphic objects and manipulations
upon them. And graphics would be first class objects in PAM, able to be executed as instructions
by a computer for performing manipulations on other graphic objects.
Because PAM was an extension of LISP, it would of course also be a general purpose programming language, enjoying all the power of LISP. That is, PAM is simply a programming language extremely well suited for building visual performing instruments[2] and analyzing their use.
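To make "visual objects directing the processing of other visual objects" a little more concrete, here is a minimal Python sketch of a LISP-style evaluator in which the head of a form is itself a graphic object. It is not PAM: the `Glyph` class, the `eval_form` function, and the two operator glyphs are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    """A stand-in for a graphic object; frozen so it can be a dict key."""
    name: str
    width: float = 1.0


def eval_form(form, env):
    """Evaluate a form. A Glyph evaluates to itself; a list applies the
    operation bound to its head glyph to its evaluated members."""
    if isinstance(form, Glyph):
        return form
    head, *members = form
    operation = env[head]
    return operation(*(eval_form(member, env) for member in members))


# Hypothetical operator glyphs: graphics that act as instructions.
WIDEN = Glyph("double-arrow")
ROW = Glyph("bracket")

env = {
    WIDEN: lambda g: Glyph(g.name, g.width * 2),
    ROW: lambda *gs: Glyph("+".join(g.name for g in gs),
                           sum(g.width for g in gs)),
}

box, dot = Glyph("box", 2.0), Glyph("dot", 1.0)
result = eval_form([ROW, [WIDEN, box], dot], env)
print(result)
```

The point of the sketch is only that the thing in function position is a graphic, not a textual symbol; everything else is ordinary list evaluation.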
Trees
The basic tree structures of LISP (and PAM) were very handy for supporting the dual aims of the
project (you remember, building a visual performing instrument which was also a processing/measuring
instrument for visual performance activity). Here is the underlying tree structure for a canonical
text graphic object:
So if we contemplate for a moment hyper skilled visual performer David Sibbet in action during a
Group Graphics session:

Then we can use trees to represent a reasonable underlying grouping structure for the
complex text graphic image that was generated by David during the session:
And then (once again) we can write/draw a visual grammar for a simplified version of the visual
language used by David to organize the text and graphics on the paper display
"); a bullet list is a piece of text over three bullet
Recursion is pretty cool. It allows very concise grammars to describe visual expressions of arbitrary complexity. Here is the underlying structure for a recursive Sibtran expression as recovered by the parser using the grammar and the spatial arrangement of the visual atoms:
Visual Linguistics and the contents of this book
The various essays in this book all deal with "visual linguistics"[3], either indirectly through design of performing instruments for generating visual expressions, or directly by building tools for analyzing such expressions.
The basic tree structure for visual objects and computational use thereof is described in the
chapter Computing with Text Graphic Forms [A], where the resulting framework is
referred to as PAM for visual "PAttern Manipulating." Here is the result of evaluating
a visual mapping function:
The visual performing instrument built using PAM is called vmacs; vmacs is described in
the chapters A Structure from Manipulation for Text Graphic Objects and A
Performing Medium for Working Group Graphics. Nuts and bolts vmacs functionality as an
environment for graphic editing and programming (non scrolling, manual display
management, discretionary eval) is described in Viz Literals [B].
A good use of the vmacs instrument to facilitate the communication of distributed groups
is described in The Visual Telefacilitation Project at PGC.
The kind of visual linguistics which can be done on graphic communication activity performed
with such an instrument, when the linguist has available visual grammars (that is,
text graphic patterns which direct the processing of other text graphic patterns in the task of
spatial parsing [4]), is described in Visual Grammars for Visual Languages [C].
In that essay, I say:
"The purpose of spatial parsing is to aid in the processing of visual languages. As an operational definition of visual language, we say: A visual language is a set of spatial arrangements of textgraphic symbols with a semantic interpretation that is used in carrying out communicative actions in the world. Spatial parsing deals with the spatial arrangement of the textgraphic symbols in a visual phrase from a visual language: Spatial parsing is the process of recovering the underlying syntactic structure of a visual communication object from its spatial arrangement."
And let me add:
"A visual grammar is a graphic representation of the rules for laying out pieces of text and graphics in a spatial arrangement which expresses the syntactic structure of a properly constructed visual phrase in a particular visual language."
When people perform text graphic activity, there are patterns in the text graphic stuff they leave
behind. Blackboard activity is live and spontaneous, general and unstructured "conversational
graphics". Visual language expressions often arise in the midst of conversational graphics like the
coagulation of lumps in oatmeal. Such a "lump," or, expression in a particular visual language, is
highly structured according to the rules for laying out pieces of text and graphics to construct
phrases in that language. Those lay out rules are the grammar for that language. If the rules in that
grammar are represented visually (written and drawn) then we have a visual grammar for a
visual language.
In visual conversations, over time the amount of spatial organization in the text graphics tends
to increase (or, "visual entropy" decreases). Loosely speaking, we can call this the "local
coagulation of lumps in the text graphic oatmeal." This odd phrase points out the important fact
that the basic text graphic stuff remains the same during the course of the activity, but that as it
continues, a discernible infrastructure arises among the pieces. Where "over time" means both
visual conversation time in minutes and cultural time in years. And where it is the job of the
visual linguist to try and tease out that underlying structure, and then to represent it in a visual
grammar.
Let's take another run at showing how visual grammars work. This time we will not come from the very complex end of the diagrammatic spectrum and try to grapple with the richness of a Sibbet panorama. Instead, we will use a very simple example, a "toy" domain as the computer guys like to say.
So, on the left is a complete grammar for a minimalistic family of bar charts, and on the right, one member of that very impoverished family.
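For readers who prefer code to pictures, the same toy domain can be approximated textually. The sketch below is an assumption-laden stand-in, not the grammar shown in the figure: the rules in `GRAMMAR`, the spatial predicates in `RELATIONS`, the `Atom` fields and the function names are all hypothetical. It parses a region by plain backtracking, treating each rule as a template for where to look next.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    kind: str   # "axis", "bar" or "label" (hypothetical atom types)
    x: float
    y: float
    name: str = ""


# Hypothetical grammar: each rule lists alternatives; each item is
# (symbol, spatial relation to the previous item in that alternative).
GRAMMAR = {
    "chart":  [[("axis", None), ("bars", "above")]],
    "bars":   [[("column", None), ("bars", "right_of")], [("column", None)]],
    "column": [[("bar", None), ("label", "below")]],
}

# Assumed spatial tests between the previous anchor atom and a candidate.
RELATIONS = {
    "above":    lambda prev, a: a.y >= prev.y,
    "below":    lambda prev, a: a.y < prev.y and abs(a.x - prev.x) < 0.5,
    "right_of": lambda prev, a: a.x > prev.x and a.y == prev.y,
}


def parse(symbol, pool, grammar, prev=None, rel=None):
    """Yield (tree, anchor, remaining) for each way `symbol` matches `pool`."""
    if symbol not in grammar:  # terminal: match a single atom by kind
        for a in pool:
            if a.kind == symbol and (rel is None or RELATIONS[rel](prev, a)):
                yield a, a, [b for b in pool if b is not a]
        return
    for alternative in grammar[symbol]:
        for parts, anchor, remaining in expand(alternative, pool, grammar, prev, rel):
            yield (symbol, parts), anchor, remaining


def expand(items, pool, grammar, prev, first_rel, parts=(), anchor=None):
    """Match the items of one alternative in order, backtracking on failure."""
    if not items:
        yield list(parts), anchor, pool
        return
    (symbol, rel), *rest = items
    use_rel = rel if parts else first_rel  # first item inherits the caller's relation
    for tree, a, remaining in parse(symbol, pool, grammar, prev, use_rel):
        first = a if anchor is None else anchor
        yield from expand(rest, remaining, grammar, a, first_rel, parts + (tree,), first)


def spatial_parse(region, grammar, start="chart"):
    """Return the first parse tree that uses every atom in the region, else None."""
    for tree, _, remaining in parse(start, list(region), grammar):
        if not remaining:
            return tree
    return None


def leaves(tree):
    """List the atoms of a parse tree in the order the grammar visited them."""
    if isinstance(tree, Atom):
        return [tree]
    _, parts = tree
    return [leaf for part in parts for leaf in leaves(part)]


region = [
    Atom("label", 2, -1, "B"), Atom("bar", 1, 0, "3"), Atom("axis", 0, 0, "axis"),
    Atom("bar", 2, 0, "5"), Atom("label", 1, -1, "A"),
]
tree = spatial_parse(region, GRAMMAR)
print([a.name for a in leaves(tree)])
```

Note that the atoms arrive in no particular order; the structure is recovered from where they sit, not from when they were drawn. Swapping in a different `GRAMMAR` dictionary changes what the same parser accepts.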
Additional articles in the book describe other graphical/computational tools for doing visual
linguistics. Some of the tools can be used to study human text graphic activity over time. In
Mapping Design Information, the focus is on design ideation over days and weeks as
recorded in an electronic design notebook. And in Measuring Text Graphic Activity, the
focus is on visual performance over seconds and minutes; here is an automatically generated
graph of attention shifting during image construction:
Visual Grammars, the big payoff: presenting Dave the visual agent
Visual Agents are software entities which assist people in performing graphical tasks. One
useful and interesting graphical task is to make a text and graphic record of a group meeting.
In A Visual Agent for Performance Graphics[I], a visual agent named "Dave"
[5] is described. Dave acts as a whiteboard assistant for group graphics,
helping a person to graphically record the conversation and concepts of a working group on a
large display. Here is Dave's response when the user draws two connecting lines.
Visual agents in vmacs have complete access to user actions and the state of the visual world in
the midst of text graphic performance. Thanks to the visual grammar for Sibtran mentioned
earlier, Dave is able to recognize certain patterns of text and graphics. And when the human
does create such a pattern, Dave then generates a text graphic response and this object is
displayed on the screen. The response is appropriate both to the general type of visual pattern
which triggered it, and to specific elements in each individual triggering pattern. The human
then looks at Dave's response, incorporates it (or not) into the text graphic stream of visual
recording, and continues working. And so on and so on.
When two entities are trading text graphics back and forth in this manner, we call it a "text graphic dialog;"
in this case the dialog is between a human and a visually adept machine.
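The shape of such a dialog can be sketched in Python. This is a toy under loud assumptions: the `Line`, `Text` and `Agent` classes, the "two lines sharing an endpoint" test, and the wording of the reply are all hypothetical, and the sketch uses a hand-written recognizer rather than a visual grammar.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Line:
    start: Point
    end: Point


@dataclass(frozen=True)
class Text:
    content: str
    at: Point


def shared_endpoint(a: Line, b: Line) -> Optional[Point]:
    """Return the point where two lines connect end to end, if any."""
    for p in (a.start, a.end):
        if p in (b.start, b.end):
            return p
    return None


class Agent:
    """Watches user actions and offers a response when a pattern appears."""

    def __init__(self) -> None:
        self.seen: List[object] = []

    def observe(self, obj) -> Optional[Text]:
        """Record one user action; return a suggested object or None."""
        previous = self.seen[-1] if self.seen else None
        self.seen.append(obj)
        if isinstance(obj, Line) and isinstance(previous, Line):
            joint = shared_endpoint(previous, obj)
            if joint is not None:
                # The reply depends on the pattern type (two connected lines)
                # and on a specific element of this instance (the joint).
                return Text("connected here: label it?", joint)
        return None


canvas: List[object] = []
agent = Agent()
for stroke in [Line((0, 0), (1, 0)), Line((1, 0), (1, 1))]:
    canvas.append(stroke)
    reply = agent.observe(stroke)
    if reply is not None:
        canvas.append(reply)  # the human could also choose to ignore it
```

The agent only proposes; whether the response joins the stream of recording is left to the performer, which is what keeps the exchange a dialog.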
Footnotes
[1] In fact, let's leave it as a PAM programming exercise for the reader (i.e. to write algorithms for distinguishing between text and graphics in different visual contexts, a task for which PAM is well suited by design).
A further example which confounds most ways of distinguishing text from graphics (and vice versa):
[2] Amongst other features, PAM like LISP has both an interpreter and a compiler which are in sync, so that the entire language is available interactively for executing visual objects as well as textual commands. And of course PAM can easily write programs which write programs, a capability paramount in visual language processing (for VLs both designed and natural).
[3] Many lay claim to the term "visual linguistics", and I do not wish to dispute with them. All I know is that after being exposed to the concept of visual language from the graphic design point of view in the early 70's, and then computational linguistics in the mid 70's, I came up with the term for myself in that same decade, and first used it in print at the beginning of the next decade (A Structure from Manipulation for Text Graphic Objects, ACM SIGGRAPH 1980).
[4] The problem with procedurally directed parsing is that knowledge about the syntax of each language is embedded implicitly in procedures, making it hard to understand and modify. In Visual Grammars for Visual Languages, a spatial parser is described that utilizes context free grammars which are both visual and machine readable. The parser takes two inputs: a region of image space and a visual grammar. The parser employs the grammar in recovering the structure for the elements of the graphic communication object lying within the region. One advantage of a visual grammar is that it makes the syntactic features of the visual language it portrays graphically explicit and obvious (the visual linguist can literally "draw" the grammar using the indigenous text graphic symbols of the language being parsed). Grammars also increase modularity: by parameterizing one parser with different grammars, it is easy to change the behavior of the parser to handle new visual languages.
[5] I feel obligated to mention that "Dave" the visual agent has no relation to David Sibbet (whom I would never call Dave). My viz agent is in fact named after the affable and finally capable Kevin Kline character in the eponymous movie. And I didn't even know Dave Gray at the time when I built the agent.
[6] Preface to the 1978 manuscript: Structure and Manipulation in Text Graphic Images, a phenomenological approach
"Presented in this book is a way of thinking about graphic images and manipulating them. This way of thinking is called PAM, which stands for PAttern Manipulation.
"If you are not a computer scientist, then you can think of PAM as a formal notation for describing graphic objects and manipulations of them, just as algebra is a notation system for describing numbers and manipulations of them. Non computer people might use PAM as a precise way of talking about: text graphics images (in order to design them); visual languages; human text graphic activity (for anthropology, linguistics and psychology); and models of text graphic understanding.
"If you are a computer scientist, then you can consider PAM to be a machine independent description of how to think about graphic objects and manipulations of them. PAM is a simple and powerful system for describing algorithmic devices like: graphics editors (handPAM, the Electric Blackboard); evaluation functions for graphic forms (evaluate, the PAM interpreter); graphic programming environments, graphic text editors, circuit diagramming aids and architectural drafting systems.
"In acknowledgement, this book could never have been written without the inspiration of John McCarthy's LISP and the LISP editor concepts of Peter Deutsch and Warren Teitelman."
[7] To programmers of the McCarthyesque persuasion, I simply say that the heart of the project is to make graphics do LISP. This is completely different than making LISP do graphics (which usually involves using LISP to control output to a screen via side effects). Instead, making graphics do LISP generalizes "Computing with Symbolic Expressions" to "Computing with Text Graphic Forms," where graphics on the screen are first class objects and can direct the processing of other graphics. The system is called PAM for "PAttern Manipulating" (even as LISP stands for "LISt Processing").
Many consequences follow naturally from the generalization and are described in this book. Lisp is good for processing textual languages, both programming and natural; hence PAM is good for defining and implementing visual programming languages, and also facilitates spatial parsing (via visual grammars) for natural graphic languages (like found on blackboards at day's end, or walls in the inner city).
Frankly, once the crucial leap (textual symbols to text graphic forms) has been made, and the spider web notation concocted to show structure, then most of the programmatic consequences are pretty obvious. The graphics structure editor, vmacs, is based on the Teitelman/Deutsch program structure editor. Spatial parsing of visual languages just uses a routine backtracking strategy for a context free grammar where the rules serve as spatial search templates. And this VennLISP version of labels is a minor hack on the vizeval interpreter defined in Computing with Text Graphic Forms.
Chapters
[A] Computing with Text Graphic Forms, proceedings of the first ever LISP Conference at Stanford University, August 1980.
As for the rest, almost every chapter is also a paper of the same name. See those chapters for citations.
[B] The one exception is the Viz Literals chapter, which is as yet unpublished and instead has its own web site (which I guess counts as published these days).
[C] Visual Grammars for Visual Languages, proceedings of AAAI-87, the conference of the American Association for Artificial Intelligence, Seattle, Washington, July 1987.
Introduction
Fred Lakin, cabin on the ridge, Santa Cruz, CA, February 2010
Chapters
Computing with Text Graphic Forms
Viz Literals
Visual Grammars for Visual Languages
A Structure from Manipulation for Text Graphic Objects
A Performing Medium for Working Group Graphics
The Visual Telefacilitation Project at PGC
Visual Languages for Cooperation
On the Syntax of Diagrams
Executable Graphics
Computing for the Right Brain
Mapping Design Information
Measuring Text Graphic Activity
A Visual Agent for Performance Graphics
Author Bio