1. Visual Languages in Human/Computer Interaction

When a person employs a text and graphic object in communication, that object has meaning under a system of interpretation, or ``visual language.'' Visual languages can be used to communicate with computers, and are becoming an important kind of human/computer interaction. Phrases in a formal visual language can be used to direct searches in a data base [Odesta85]; construct simulations [Budge82]; provide communication for aphasics [Steele85]; or serve as expressions in a general purpose programming language [Sutherland65, Christianson69, Futrelle78, Lakin80c, Robinett81, Tanimoto82, Lanier84, Kim84, Glinert84].

2. Drawbacks of Special Purpose Visual Language Interfaces

A visual language interface should provide the user with two capabilities: the agility to create and modify phrases in the visual language, and the processing power to interpret those phrases and take appropriate action. To the author's knowledge, all currently available interfaces that allow creation and processing of visual objects employ some kind of special purpose, syntax-driven editor. Such editors achieve graphical agility and interpretative power, but at the expense of generality: they substitute restriction for understanding. From a practical point of view, they limit the user's freedom: the user cannot spontaneously arrange text and graphics in new ways, add a piece of text to an object already defined as graphical, edit the text in a pull-down menu, or create a new kind of diagram. From a theoretical point of view, such editors never deal with the general issues of understanding diagrams: the meaning has been built into the structures and procedures of the predefined object categories (compare this approach with that of SRI's TEAM system)*.

* A parallel can be drawn between special purpose, syntax-driven graphics editors and menu-driven so-called `natural language' interfaces to data bases. The latter interfaces allow the user to construct a natural language query by choosing from a series of menus containing predefined natural language fragments. As each succeeding fragment is selected, it can be put immediately into its proper place in the final query because the offering of subsequent menus is guided by the logical form of a query. Parsing (and understanding) has been finessed. Compare this approach to a general purpose natural language front end such as LUNAR [Woods74] or TEAM [Martin83]. These ``English understanding'' systems are much more complex, but allow the user to type in query sentences that he or she makes up. The menu-driven approach has short-term practical advantages: it runs faster on cheaper computers, and because malformed sentences cannot be constructed, they never have to be handled. On the other hand, the comprehensive approach used by the LUNAR and TEAM projects has long-term advantages: it gives the user freedom and tries to handle `the sentence he was thinking of' as opposed to forcing construction from predefined pieces; it can handle arbitrary embedding of phrases; and insofar as the projects are successful, general principles about computer understanding of natural language will be discovered.
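
The menu-driven construction described in this footnote can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the table `MENUS`, its state names, the query fragments, and the helpers `offered` and `build_query` are hypothetical and are not taken from any of the systems cited. The sketch shows only how letting the query's form decide which menu is offered next removes the need for a parser.

```python
# Hypothetical menu table: for each state, the fragments offered to the
# user and the state that follows each fragment. The states stand in for
# the "logical form of a query" that guides which menu appears next.
MENUS = {
    "start": {"find": "object", "count": "object"},
    "object": {"ships": "filter", "ports": "filter"},
    "filter": {"in the Pacific": "end", "built after 1970": "end"},
}


def offered(state):
    """Return the fragments the menu shows in the given state."""
    return sorted(MENUS.get(state, {}))


def build_query(choices):
    """Assemble a query from a sequence of menu selections.

    Each selection is appended in place and determines the next menu,
    so no parsing step is ever needed. A selection that is not on the
    current menu cannot occur in a real menu interface; here it raises.
    """
    state, fragments = "start", []
    for choice in choices:
        if choice not in MENUS.get(state, {}):
            raise ValueError(f"{choice!r} is not on the {state!r} menu")
        fragments.append(choice)
        state = MENUS[state][choice]
    if state != "end":
        raise ValueError("query is incomplete")
    return " ".join(fragments)


print(build_query(["find", "ships", "in the Pacific"]))
```

The sketch also makes the cost visible: a user who wants a query outside the table, or a phrase embedded within another phrase, has no way to express it, which is the restriction the comprehensive approach tries to avoid.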