Visual Agent Architecture
At this stage in the development of visual agents, a relatively simple mechanism that reacts
appropriately to complex human activity in a rich visual environment is a useful first step.
The difficult part was developing an architecture that readily supports agents interacting
with free-form human graphical behavior. Now that this connection has been achieved, it will be
relatively easy to add sophistication to agent responses using standard technologies (see
Future Work). vmacs provides a basic framework for visual agent implementation:
- User actions are free-form. The user writes and draws whatever she wants; if the
agent does not recognize a pattern, it does nothing (libertarian) [foot3].
- Each user action is represented symbolically.
- Agents can examine each user action before and after execution.
- Agents can examine the state of the visual world before and after each user action.
- Agents can change the state of the visual world before and after each user action.
- Agents can use mechanisms based on visual parsing and logic programming.
- Agents can communicate with other agents through a visual blackboard which
accepts multi-media postings (visual as well as symbolic objects; see Figure 4).
Figure 3. Visual Agent architecture showing relation between
Controller and Visual Parser.
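The framework properties listed above can be illustrated with a minimal sketch in Python. This is not the vmacs implementation; the names UserAction, Blackboard, Agent, Controller, and LineNoticer, and the dictionary used as the "visual world", are all hypothetical and chosen only to show the shape of the idea: each action is a symbolic record, every agent gets a hook before and after execution, and agents communicate through a shared blackboard.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class UserAction:
    """Symbolic record of one user action (hypothetical fields)."""
    kind: str                                   # e.g. "draw-line", "move"
    args: Dict[str, Any] = field(default_factory=dict)


class Blackboard:
    """Shared posting area; a posting may be any symbolic or visual object."""

    def __init__(self) -> None:
        self.postings: List[Any] = []

    def post(self, item: Any) -> None:
        self.postings.append(item)


class Agent:
    """Base agent. By default it recognizes nothing and therefore does nothing."""

    def before(self, action: UserAction, world: dict, board: Blackboard) -> None:
        pass

    def after(self, action: UserAction, world: dict, board: Blackboard) -> None:
        pass


class Controller:
    """Runs every agent's hooks around each user action (assumed control flow)."""

    def __init__(self, world: dict, agents: List[Agent]) -> None:
        self.world = world
        self.agents = agents
        self.board = Blackboard()

    def execute(self, action: UserAction,
                apply: Callable[[dict, UserAction], None]) -> None:
        for agent in self.agents:               # agents see the action first
            agent.before(action, self.world, self.board)
        apply(self.world, action)               # the action changes the world
        for agent in self.agents:               # agents see the result
            agent.after(action, self.world, self.board)


class LineNoticer(Agent):
    """Toy agent: posts the running line count after each drawn line."""

    def after(self, action, world, board):
        if action.kind == "draw-line":
            board.post(("line-count", len(world["lines"])))


def apply_action(world: dict, action: UserAction) -> None:
    """Toy world update: only line drawing changes the world."""
    if action.kind == "draw-line":
        world["lines"].append(action.args["points"])


world = {"lines": []}
controller = Controller(world, [LineNoticer()])
controller.execute(UserAction("draw-line", {"points": [(0, 0), (1, 1)]}), apply_action)
controller.execute(UserAction("scribble"), apply_action)   # unrecognized: ignored
```

In this sketch the second action matches no pattern, so the agent stays silent and the blackboard holds a single posting. A real agent would presumably consult the visual parser rather than a plain dictionary.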
At this time, eight different visual agents have been implemented:
- Dave, the group graphics assistant.
- A Fitts' Law daemon to group drawn lines during sketching [foot4].
- A logic-based activity characterizer for conceptual design drawing.
- A logic-based pointing trainer (cursor control).
- A marking trainer (drawing skills).
- A visual language trainer for aphasics.
- A "graphic speedometer" which measures text-graphic manipulations per second.
- An agent which maintains spatial connectivity between straight arrows
and their objects.
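To give a sense of how small such an agent can be, here is a hedged Python sketch of something like the "graphic speedometer". The source does not say how the rate is computed; the sliding time window, the class name GraphicSpeedometer, and its methods are assumptions made purely for illustration.

```python
from collections import deque


class GraphicSpeedometer:
    """Counts manipulations per second over a sliding window (assumed design)."""

    def __init__(self, window_seconds: float = 5.0) -> None:
        self.window = window_seconds
        self.times = deque()            # timestamps of recent manipulations

    def _trim(self, now: float) -> None:
        # Drop timestamps that have fallen out of the window.
        while self.times and now - self.times[0] > self.window:
            self.times.popleft()

    def record(self, timestamp: float) -> None:
        """Call once per text-graphic manipulation, with its time in seconds."""
        self.times.append(timestamp)
        self._trim(timestamp)

    def rate(self, now: float) -> float:
        """Manipulations per second, averaged over the window ending at `now`."""
        self._trim(now)
        return len(self.times) / self.window


meter = GraphicSpeedometer(window_seconds=2.0)
for t in [0.0, 0.5, 1.0, 1.5]:
    meter.record(t)
```

Under the architecture described earlier, an agent of this kind would call record from its after-action hook and could post the current rate to the blackboard.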