Programmer's ROM - The Computer Language Library

home *** CD-ROM | disk | FTP | other *** search

/ Programmer's ROM - The Computer Language Library / programmersrom.iso / ada / ai / alsdsgn.doc next >

Wrap

Text File | 1988-05-03 | 90.3 KB | 2,704 lines

C2 Support Software in Ada(R) Top Level Design Specification SAE-DC-85-R001 January 27, 1986 Software Architecture & Engineering, Inc. 1500 Wilson Boulevard, Suite 800 Arlington, Virginia 22209 (R) Ada is a registered trademark of the U.S. Government - Ada Joint Program Office 1. Introduction Recognition is growing rapidly of the potential appli- cations of Artifical Intelligence (AI) technology in Depart- ment of Defense (DOD) Command and Control (C2) and mission critical systems. AI techniques will play key roles in many of these systems in the near future. One factor that will slow this trend is the fact that most currently successful AI applications are implemented, either directly or indirectly, in LISP. Ada is the required language for new DOD (embedded) application software. The creation of a package library to aid in the development of AI applica- tions, along with tools that allow efficient translation of existing LISP-based AI applications into Ada, will allow the DOD to take advantage of the opportunities offerred by AI technology. Logic programming is an important development in AI programming techniques. Its implementation in the form of PROLOG has produced a language of tremendous potential, par- ticularly in the area of knowledge-based expert systems. The development of a logic programming capability within Ada, again through a package library, will allow programmers to capitalize on this emerging technology within new AI applications. 1.1. Purpose The capabilities offerred by a synthesis of the conven- tional software engineering technology supported by Ada in combination with symbolic computation and logic programming techniques will produce a C2 programming support environment of impressive power. This document describes the top-level design of a collection of Ada packages sufficient to provide such capabilities for developers of C2 applications in Ada, and to provide the basis for translation of LISP-based AI applications into Ada. This document is the final delivery of CDRL A002 (Informal Technical Information) for contract N66001-85-C- 0039 between Software Architecture and Engineering, Inc., and the Naval Oceans Systems Center, 271 Catalina Boulevard, Building A33, San Diego, CA 92152. 1.2. Scope AI technology, and, more specifically, knowledge-based expert systems, are proving to be of value in decision sup- port, target classification, indications and warning, and many other C2-related areas. AI offers the opportunity to vastly increase the amount of information that can be analyzed in a timely fashion, information that might other- wise go unused due to the sheer volume generated by present day sensors and intelligence systems. This volume will only grow, as these systems evolve, without some type of filter- ing agent. The effective use of AI techniques can have the effect of significantly increasing the power of our current C2 systems by providing that filtering mechanism. Today, LISP (of some type) is the most prevalent language used to implement AI applications. The reason for this is that the major current LISP dialects provide power- ful symbolic processing facilities not provided by the majority of modern, block-structured, higher-order, conven- tional programming languages. These conventional languages do not exclude symbolic processing, they just do not provide language-based facilities for such manipulations. As a result, the AI design paradigms used with LISP are not easily implemented in languages like Ada. Ada can be used to implement AI applications if it is augmented with capa- bilities which emulate some of the unique aspects of the symbolic manipulation facilities of LISP. This can be accomplished through specially designed Ada packages. Logic programming systems are a direct implementation, in the form of a programming language, of the general con- cept of a production system, discussed in detail later in this document. Logic programming languages, like PROLOG and the OPS family, are directly useful in implementing knowledge-based expert systems where knowledge is primarily representable in the form of rules. However, in general, the use of a logic programming facility provides a powerful, yet extraordinarily concise, technique for developing appli- cations which must do any kind of symbolic reasoning, that is, reasoning about logical relationships among symbolic data items. LISP-based AI applications which need to per- form this function implement reasoning functions and data structures tailored to the application. Rather than require an application developer to invent unique reasoning systems, Ada provides the mechanism to augment the language with fun- damental logic programming facilities. These facilities can then support the development of generalized reasoning sys- tems fully integrated with the symbolic processing package. Software A & E will implement a library of Ada packages to support such capabilites. The effort is limited in this first phase to the minimal required support for many LISP- based and logic programming applications. Follow-on efforts could expand the base provided by this first increment to include conversion aid tools, enhanced logic programming paradigms, and additional low level AI component packages. 2 1.3. References ANSI/MIL-STD-1815A, 1983. Ada Programming Language. U. S. Government, Washington, DC. Barr, A., and Feigenbaum, E. A. (Eds.) 1981. The Handbook of Artificial Intelligence (Vol. 1 & 2). Los Altos, CA: Wil- liam Kaufmann, Inc. Booch, G. 1983. Software Engineering with Ada. Menlo Park, CA: Benjamin/Cummings Publishing Co. Campbell, J. A. (Ed.) 1984. Implementations of PROLOG. New York, NY: Halstead Press. Charniak, E., Reisbeck, C. K., and McDermott, D. V. 1980. Artificial Intelligence Programming. Hillsdale, NJ: Lawrence Erlbaum Associates. Knowledge Engineering System, Knowledge Base Author's Refer- ence Manual. 1983. Software Architecture and Engineering, Inc. Kowalski, R. 1979. Logic for Problem Solving. New York, NY: Elsevier Science Publishing Co., Inc. Nilsson, N. J. 1980. Priciples of Artificial Intelligence. Palo Alto, CA: Tioga Publishing Co. Parnas, D. L. 1972. On the Criteria to be used in Decompos- ing Systems into Modules. CACM 15:12, pp 1053-1058. Schwartz, R. L., and Melliar-Smith, P. M. 1980. The Suita- bility of Ada for Artificial Intelligence Applications. Menlo Park, CA: SRI International. Steel, G. L. 1984. COMMON LISP, The Language. Burlington, MA: Digital Press. Winston, P. H., and Horn, B. K. P. 1984. LISP, Second Edi- tion. Reading, MA: Addison-Wesley Publishing Co., Inc. 3 1.4. Overview An AI support environment for Ada would ultimately con- sist of the following categories of components: (a) LISP-capability emulation: packages which emulate the symbolic manipulation capabilities of LISP that are not directly available in Ada. (b) Translation aids: packages that support syntactic and semantic mapping of LISP-based programs into Ada con- structs and facilities provided by the packages in category (a). (c) AI mechanisms: packages that represent commonly occur- ring AI mechanisms, e.g. a backward inferencing con- trol strategy. Software A & E recommended an incremental approach to development of the software in these three categories in the contract proposal. The first increment, a fundamental sub- set of categories (a) and (c), is addressed by this docu- ment. This fundamental environment will consist of abstract data types, objects, and associated operations commonly used in AI applications, but not directly supported in Ada. The functional areas supported by the basic package to be developed are briefly described in the Statement of Work (SOW) as follows: (a) AI data types: abstract data types will be developed that characterize classes of symbolic data and associ- ated operations commonly used in LISP-based AI applica- tions; these data types will provide a common base for symbolic computation to be shared by AI applications without concern for details of representation. (b) Dynamic storage management: a typical AI program is characterized by the highly dynamic way in which large numbers of primitive data objects are created and dis- carded; facilities for management of dynamic allocation and reuse of object storage space are required to sup- port this. In addition to this basic functionality, the SOW requires the following: (c) Generic AI objects: in general, flexibility can be attained in AI programs through an object oriented view of data that allows the details of object representa- tion and typing to be hidden; packages will be developed to provide definitions of objects generally found useful in LISP-based AI programs. 4 1.4.1. Product Goals (a) The product will support production development of AI applications, as opposed to the evolutionary develop- ment that is typical of AI research programming in LISP. The emphasis here is on decisions made during application construction as part of various application life-cycle methodologies for specification, design, implementation, and revision of large software systems. (b) The product will provide a collection of abstract data objects (with operations) that is both necessary and sufficient to support object-oriented design and imple- mentation of AI applications in Ada. These objects should be arbitrarily composable with each other and with the predefined objects of Ada in the same manner that the Ada primitive objects are composable. These object types should not duplicate predefined objects of Ada unnecessarily. (c) Ada is a language of strong static typing and strong dynamic sub-typing, and the facilities implemented by the product should adhere to the same design princi- ples. The product should build upon a (collection of) basic data type(s). These should be the minimum addi- tion to Ada's predefined data types that will give an Ada programmer the means to manipulate dynamic objects in a strongly typed environment. The intent is not to build a LISP or PROLOG environment on top of Ada, but rather to support the AI application development para- digms of these environments in the style of Ada. 5 2. General Description 2.1. Product Perspective Artificial intelligence is the part of computer science concerned with designing computer systems that exhibit characteristics commonly associated with intelligence in humans: problem solving, logical reasoning, language syn- thesis and understanding, learning, expertise and vision. Recently, the area of expert systems has begun to emerge as a tool for developing practical applications of AI tech- niques. Typical examples of current expert systems interact with a human user in a dialog in a manner similar to the way the user would interact with a human expert. Such consulta- tion systems have achieved high levels of performance in tasks which are amenable to this approach, such as geologi- cal data analysis and medical diagnosis. The next step in the evolution of such systems is to embed them within other application systems in such a way that the application sys- tem, as the user, can interact with the expert system to solve problems. The ability to behave with intelligence is usually described in terms of knowledge. Since AI involves the design of computer systems that exhibit intelligent behavior, knowledge representation is an important issue. A representation of knowledge is some combination of data structures and procedures which lead to such intelligent behavior when used in an appropriate manner. Work on knowledge representation has involved the design of several classes of data structures for storing information as well as the development of procedures that allow manipulation of these data structures to make intelligent deductions. 2.1.1. Production Systems A common type of knowledge representation system is the production system, which is a modular scheme finding increasing popularity. The major components of such a sys- tem are a data base, a knowledge base, and a control stra- tegy. The data base is actually a temporary working storage area used during construction of the solution of a specific problem. A production system operates by manipulating the data base using the operations defined in the knowledge base under the control of the control strategy. The operations defined by the knowledge base in a production system are in the form of rules. Each rule has a pre-condition that is either satisfied or not by the data base. If the pre- condition is satisfied, the rule can be applied. Applica- tion of a rule changes the data base. The choice of which applicable rule to actually apply is done by the control 6 strategy, which also has the responsibility of recognizing a termination condition on the data base after each rule is applied. The primary task of a production system is to infer (sometimes, deduce or prove) goals from the initial contents of the data base using the rules. In a conventional production system, the data base, knowledge base and control strategy are each globally defined for the application. The control strategy is usu- ally a standard predefined component of a generalized pro- duction system, which is supplied with an application- specific knowledge base. There is no notion of a local sub- set of the data base accessible by a subset of the knowledge base. In fact, the data base is the only medium of communi- cation between rules, which do not "call" one another. This results in an application structure which is extremely modu- lar in its knowledge base part, but which is otherwise rela- tively unstructured. In general, the ordering of the appli- cation of rules is irrelevant, which allows changes to the knowledge base to be made relatively independently of each other. There are two basic types of control strategy: (a) irrevocable - an application rule is selected and applied without provision for reconsideration. Irrevo- cable control strategies are only used when there is infallible knowledge about how to proceed toward a goal from a state of the computation. (b) tentative - an applicable rule is selected (either arbitrarily or using some heuristics), the rule is applied, but provision is made to undo the effects of the selection and subsequent computations and apply some other rule. The most common tentative control strategy is backtracking. Backtracking is the process whereby, if a subsequent computation fails, the state of the computation reverts to a prior state, the choices made at that prior state are discarded, and new choices are made. The states at which backtracking can return to are points in the computation determined by the control strategy. Typically, the selection of a rule from a set of applicable rules is established as a backtracking point. AI applications require knowledge of the problem domain in three basic categories in order to achieve efficient operations. These categories are assertional, procedural, and control knowledge. Assertional knowledge expresses explicit knowledge about a problem or problem domain; pro- cedural knowledge expresses rules for manipulating the assertional knowledge; control knowledge expresses knowledge about the problem solving process for the problem domain, 7 that is, the appropriate control strategy. Production sys- tems are generally accepted as a convenient and natural mechanism for expressing assertional and procedural knowledge. Production systems have been found to be useful mechanisms for controlling the interaction between asser- tional and procedural knowledge. Assertional knowledge is separable into the two sub-categories of rules and facts, whereas procedural knowledge is expressed only by rules. Rules are usually stated in the form of implications: if A then B or A => B and can express generalized knowledge, while facts are sim- ple non-implicational statements expressing specific knowledge. In an implication, the pre-condition, or if- part, is called the antecedent, while the result, or then- part, is called the consequent. The general form of the antecedent and consequent of a rule is a sequence of terms joined by conjunction (AND) and/or disjunction (OR) opera- tors. A term is usually expressed in functional notation as a predicate with arguments, which may be arbitrary symbolic expressions. Control knowledge will be discussed later. An inference is either antecedent driven or consequent driven. In an antecedent driven inference, the occurrence of a fact in the data base which matches the antecedent of a rule makes that rule applicable. In a consequent driven inference, the system attempts to obtain a specific goal by locating a rule whose consequent matches that goal, and then attempts to obtain the antecedent of that rule as a sub- goal. Antecedent driven inferencing is usually referred to as forward inferencing, while consequent driven inferencing is usually referred to as backward inferencing, since the coventional interpretation of the "flow" of an implication is from the antecedent to the consequent. Conventional pro- duction systems typically employ control strategies which are either strictly forward or strictly backward inferencing systems. Although a combined forward and backward control strategy is certainly possible, it is inherently more com- plicated due to the global nature of the structure of these conventional production systems. In forward inferencing systems, the rules of the knowledge base are applied to a data base (initialized with a given set of facts) until a termination condition involv- ing a goal defined by the particular problem is reached. Termination occurs when at least one term of the goal expression is inferred. In backward inferencing systems, the rules of the knowledge base are applied to a data base (initialized with a goal expression) until a termination condition involving facts in the knowledge base is reached. 8 Termination occurs when the goal expression is composed of facts. In order to use control knowledge in a production sys- tem, there must be a suitable representation for it. Vari- ous approaches have been considered, such as treating the control strategy as a series of problems to be solved by another (meta) production system, embedding control knowledge into discrimination functions supplied to a gen- eral control strategy, or embedding control knowledge into the rules themselves. This latter approach is taken by the "logic programming" languages, such as PROLOG or the OPS family. The general logic programming paradigm is to treat a rule like a subprogram of a conventional programming language. The control strategy invokes rules only when new goals or facts are infered, depending on the inferencing mechanism used. A simple control strategy, similar to that found in PROLOG, is to select all applicable rules whenever a new goal is created and invoke the first one selected. An "executing" rule may assert new facts, goals, or even rules, which may cause the "executing" rule to be suspended if new rules are invoked. A rule terminates either with success or failure, depending on whether its sub-goals can be inferred or not. In either case, control is returned to the invoking rule, but in the failure case, backtracking occurs and a new rule is invoked. 9 2.1.2. Logic Programming Logic programming systems are an important development in the AI field. These languages, in their pure form, use assertions about logical relationships between data elements to specify programs, rather that specifying procedurally how operations are to be performed. Logic programming permits extraordinarily concise programs because its statements are declarative rather than prescriptive. Unfortunately, this programming conciseness is not without costs. In PROLOG, for example, the control strategy works by generating virtu- ally all possible implications of a given initial data base configuration, whether or not they are relevant. Further, PROLOG, in its pure form, is not deterred from searching for more than one inference sequence for the same goal. Most PROLOG implementations have been augmented with control strategy directives to constrain a program's search path. PROLOG is a language of tremendous potential which it has, to date, failed to realize, in many cases, due to a number of questionable design decisions. One of these is the choice of backtracking as the sole control structure of the language. This paradigm fits certain problems well, but yields programs that are inefficient as well as incomprehen- sible when confronted with the many situations for which backtracking is unsuited. The great power conferred by the availability of a logic programming capability strongly argues for its inclu- sion in the AI programming environment. The combination of a logic processing facility and a symbolic processing facil- ity, both interfacing smoothly with each other and with the Ada host language, produces a facility of impressive power. This situation is provided by having such a logic program- ming facility available to be invoked as needed embedded in the Ada host language and interfacing with the symbolic com- putation tools. In particular, since the control structures of the Ada language are available, the difficulties engen- dered by being confined to backtracking are obviated, and efficient logic programming becomes possible. There is no reason to restrict a programmer in the same manner as is done in conventional production systems to a single knowledge base, a single working storage data base, and a single control strategy. Given Ada's capabilities, it is straightforward to define logic programming control stra- tegy operations which take a data base and a knowledge base as arguments. Furthermore, facilities could easily be pro- vided implementing both backward inferencing, as in PROLOG, as well as forward inferencing, as in the OPS language fam- ily, control strategies. With the support for concurrent task implementations provided by Ada, it may even be possi- ble to ultimately provide implementations of concurrent inferencing control strategies. 10 2.2. Product Functions 2.2.1. Symbolic Expressions The essential capability that has allowed LISP to retain its dominant position as an AI programming language is its ability to perform abstract manipulations on dynami- cally shaped data structures. This unconstrained form of data structuring is an important characteristic of AI pro- gramming languages, as it allows data structures of unpredictable size and shape to be conveniently constructed and manipulated. The data structures are, at this level of LISP, interpreted only in a structural fashion by the opera- tors which manipulate them. Interpretations of the elemen- tary data items, and interpretations of their structural composition as representations for relationships between them, is left up to the individual application program. A disadvantage of the LISP approach to data structuring is that there is no built-in mechanism to enforce any application-specific constraints on the size and shape of the structures, as determined by the intended interpreta- tion. COMMON LISP addresses this problem somewhat by defin- ing facilities that allow a programmer to do explicit con- straint checking for data types. This approach to data pro- cessing is usually referred to as symbolic computation, and the data structures are usually referred to as symbolic expressions. Most introductory LISP references talk about lists and list processing when explaining the essential capabilities of LISP. This tends to obscure the real power behind LISP, since the notion of a "list of components" is a particular interpretation of a symbolic expression, where another interpretation may be more appropriate in the specific application, such as a "function with arguments." Part of the reason for the conventional approach to LISP is that the procedural semantics of the LISP symbolic expressions are defined in terms of operations on the representation of sym- bolic expressions, with conventions that treat the represen- tation as a linked list, even though it is actually a binary tree structure. The fundamental characteristic of the LISP symbolic expression is that it is a recursive data type, when viewed as an abstraction independently of the all-too-visible implementation details of LISP. It is the recursive nature of the type that gives LISP its expressive power. The most common characterization of a recursive data structure is the tree structure. There are a large number of concepts dealt with in typical AI applications that can be easily represented as tree structures, given an appropriate set of access and construction operations. 11 The other important issue with respect to the specifi- cation of the symbolic expression type is structure sharing. Symbolic processing applications typically have large memory requirements. To lessen this burden on available storage, symbolic expression structures will be shared. That is, the result of a symbolic processing operation will typically return a reference to a specific portion of the input argu- ment rather than returning a brand-new copy of the desired result. However, this raises a problem if destructive operations such as LISP's replaca and replacd, are provided. The solution is simply not to provide any of these destruc- tive operations. Thus, the choice of an internal implemen- tation using sharing can not impact any user of the facil- ity. All operations defined on symbolic expressions will be non-destructive or constructive (no existing values are cor- rupted, and results of operations are defined as new values). 12 2.2.2. Pattern Matching A pattern is a symbolic expression with special atomic components referred to as variables. Two patterns are said to match if their variables can be replaced by values to make both patterns identical, where multiple occurrences of the same variable in a pattern must be substituted con- sistently with the same value. The values which are matched with variables are usually of interest to the application using the pattern matching operation. Consequently, the matching operations is usually defined to return as a result the correspondences between variables and values generated during a successful matching operation. The correspondence between a variable and its matching value is referred to as the variable binding, and the result of the matching operation is the variable binding list. Pattern matching is typically implemented as a sequen- tial operation which considers each corresponding pair of components from the two patterns in turn, with a recursive call to the matching operation occurring when a component is non-atomic. Under this approach, variable bindings are gen- erated immediately upon the first match for each variable. Subsequent occurrences of such a bound variable then use the bound value rather than the variable itself in the component matching test. A variable can be bound only if it is currently unbound, e.g., the value of a bound variable can- not change. Variables may be bound to other variables, in which case the variables are said to be linked rather than bound. Linked variables remain candidates for binding until one of the linked variables becomes bound, at which point all variables linked to that bound variable become bound to that same value. One restriction that is typically placed on this pro- cess is that the scope of a pattern variable is a single pattern. In other words, a variable in a pattern is dis- tinct from any variable in any other pattern, even if the representation of the symbols used to denote two different variables on an external device are the same. Even with this restriction, it is possible for a variable to become linked to itself, or bound to an expression which includes itself as a component. This can occur when the variable is matched against a variable of the other pattern which was linked or bound at a previous component position. Linking a variable to itself causes no conceptual prob- lems, but the case of binding a variable to an expression which contains that same variable as a component causes cir- cularity problems. Detection of this situation requires a potentially expensive check during each component match 13 step. In this case, the match operation is usually defined to fail, as it is difficult to deal with circular data structures. The checking required is called the occurs check in the literature. If multiple copies of a pattern are used in pattern matching operations simultaneously, it must be possible to prevent binding conflicts between the various usages. This can occur as the result of using a global pattern in a multi-tasking application, or in sequential processing within a self-recursive subprogram. This latter situation is especially important to the product design, as patterns will be the basis for the implementation of rules, and the algorithms used for proving goals from rules are essentially recursive pattern matching operations. During a single recursive search for a proof, the algorithm may utilize the same rule multiple times at different levels of the recur- sion. The obvious solution is for the pattern matching operations to systematically rename variables in each pat- tern before attempting the match. Optimization of this pro- duction of variants of patterns is essential to an efficient implementation. The classic implementation of variable binding is as a list of pairs of variable name and bound value, accessed by linear lookup on the variable name. This is easy to imple- ment and to understand, but the time to find the binding of a variable is proportional to the number of bound variables. The worst case behavior is terrible, but the normal case behavior is reasonable for finding the binding of a bound variable, since the binding list will typically be short. Determination that a variable is unbound requires traversal of the entire list, accentuated by the fact that long chains of linked variables cause a list traversal for each link, as the linking is represented as a binding pair with two vari- able names. In this case, an occurs check is needed to dis- cover the situation of linking a variable to itself, which will cause a circularity in this implementation. For- tunately, the check is simple in this special case. The matching operation is defined to succeed, but no binding results. An alternative to using a single list representing all bindings is to have a binding list associated with each pat- tern. The trick here is that, when a variable is bound to a value, the binding is represented by a pair consisting of the value and the binding list associated with the pattern from which the value was extracted. To discover the value bound to a variable may require several list traversals in the case of variable linking, but each list will typically be quite small. The major advantage of this scheme is that the association of a binding list with each pattern can be designed to implicitly take care of the renaming of 14 variables when a pattern is used multiple times con- currently, provided that the association of binding lists with patterns is local to the location of usage so that each copy of the pattern in use has a unique binding list. An important issue is how to handle variable linking, especially within the context of using patterns to implement rules, as long chains of linked variables can result during a single attempt to prove a goal. In the classic implemen- tation, this is represented by a binding a variable to another variable and repeatedly looking up the next binding until an unbound variable or a value is encountered. 15 2.2.3. Logic Programming The major workhorse of a logic programming system is the unifier, which matches goals with consequents of asser- tions, possibly generating new sub-goals, which must be proved recursively. Assertions represent both rules and facts, where facts are represented by rules with no antecedent (or sometimes with a literal representing true as the antecedent). This allows uniform treatment of rules and facts. If a (sub-)goal is matched with a fact consequent, then it is proved, while if it is matched with a rule conse- quent, then the antecedent of that rule must be proved as a sub-goal. During this recursive matching process, the unif- ier binds variables matched to component expressions and uses those bindings during subsequent sub-goal processing. The initial goal is referred to as a query, and is usually represented as a rule with no consequent, so that the unif- ier is always matching antecedents representing (sub-)goals with consequents of assertions. A general term referring to both assertions and queries is the rule. The unification algorithm matches a pair of arguments, which are generally components extracted from an antecedent of one rule and a consequent of some other rule, at each step, doing so within an environment which contains any current variable bindings. As for the case of patterns, variables are required to be unique to each rule. In addi- tion, variables within rules are generally interpreted as being universally quantified over the entire rule (e.g., the rule is interpreted as applying for any value of each vari- able). If the unification succeeds for a (sub-)goal, it adds any new variable bindings to the environment, and returns the extended environment to the caller, which uses the new environment when attempting to prove additional sub-goals. Since it is possible for an assertion to be used more than once during this recursive unification process, it is important to prevent binding conflicts in the environ- ment, which occur when the variables of an assertion are bound to different values for each use of that assertion. This is why the classical unification algorithm renames variables before using an assertion. However, as mentioned during the discussion on pattern matching, there are ways to avoid this expensive operation. The binding of variables during unification has add- tional requirements beyond those for the simple case of pat- tern matching. Symbolic expressions containing unbound variables can become bound to variables after unifying the consequent of an assertion with a (sub-)goal. These unbound variables may later become bound (or not) while processing the antecedent of that assertion as a new sub-goal. Conse- quently, the binding environment generated during unifica- tion cannot, in general, disappear until after the original 16 query has been proved. At least that subset of the environ- ment which is required to keep track of such unbound vari- ables returned out of sub-goal unifications must be retained. Variables can also become unbound during back- tracking when sub-goal unification fails to find a match. For this initial phase of the implementation, the prim- itive objects and operations necessary to implement a logic programming control strategy and to otherwise support the implementation of a logic programming facility are required. Additionally, a simple control strategy implementing essen- tially a PROLOG style of backward inferencing is necessary in order to demonstrate the fundamental facilities. The required objects are, essentially, generalized rules and rulebases. A rulebase object can be used to implement both knowledge bases and working storage data bases if goals and facts are treated as special cases of the generalized rules. 17 2.3. General Constraints 2.3.1. Design Methodology Software A & E will use modern software engineering techniques in the implementation of this product. In the design phase, the particular techniques employed will be information hiding and object-oriented design. 2.3.1.1. Information Hiding The decomposition criteria supporting step-wise refine- ment of a software system design due to Parnas, known as information hiding, is an important design technique. The basic idea is to decompose a software system into modules in such a way that each module allows the implementation details of a single design aspect of the system to be hid- den. This technique explicitly captures the design struc- ture of a software system in the software at the level at which each design decision was made. The advantage of this design approach is that, if the need arises to modify the system, the effects of the modification should be localized to one module. An information hiding module defines a set of software functions while hiding the information about how the func- tions are actually implemented. This technique supports the development of software which is designed for easily imple- mented enhancements and modifications. One of the principle design goals of Ada was to allow exploitation of these pro- ductivity advantages. One disadvantage of this approach is that the resulting application may suffer from the perfor- mance standpoint when compared to an equivalent application which has its functions coded inline. Ada provides an elegant solution to this problem by allowing the developer to tell the compiler to compile certain functions as if they had been written inline. It is a general design principle of Ada that the presence of expensive capabilities (in terms of performance) should not penalize the efficiency for usage of simple capabilities. A software system designed using this technique is organized into a hierarchy of modules. Within the hierar- chy, a function of a module is only permitted to use func- tions of other modules which are beneath it in the hierarchy to perform its function. The levels in the hierarchy can be viewed as abstract machines. Each abstract machine provides functional capabilities to higher-level software components that are more powerful and application specific than the lower levels of abstraction. Use of information hiding as the decomposition criteria fully supports incremental 18 development, as each level of the hierarchy only depends on the visible functions provided by lower levels. Unless the visible functions change, modifications are localized. 2.3.1.2. Object-oriented Design Object-oriented design is a methodology which attempts to map a solution to a problem directly to the solution's users' view of the problem. The methodology is character- ized by its balanced treatment between the software com- ponents representing the objects of the problem space and the software operations representing the actions taken by or applied to these objects. The software objects are viewed as actors, each with its own set of applicable operations. The basic approach is to initially develop an informal stra- tegy for the solution to a problem that parallels the world view of the users. This solution strategy is expressed in terms of the concepts from this problem space. Given this informal strategy, the objects and actions are identified, and the relationships among the objects are established. At this point, a formal description of the visible interfaces to each object can be developed, which explicitly defines the operations that may be performed on or by each object. The scope and visibility of each entity is also defined. The object-oriented design methodology supports the principles of abstraction through information hiding. Abstractions are directly implemented from the problem space, providing a strategy for decomposing a system into modules in which design decisions are localized to match the "world view." 19 2.3.2. Dynamic Storage Management The concept of the name of an object is central to the design of a dynamic storage management mechanism. A name is an access path to an object. Variables occurring in the usual fashion in compiled programs are one kind of name, in this case the access path (at compile time) to the storage location allocated by the compiler for the variable at run- time. A variable is a constant name; an example of a dynamic name is the value of an Ada access type, which is the name of an object dynamically allocated at run-time. An access to integer variable is the name of an access object, whose value is the name of an integer object whose value is an integer. The problem of managing dynamically allocated storage is, at the level of Ada source code, a problem in the dynamic management of name values. A data object which is accessible via two different names is said to have aliases. A simple example is seen with the Ada access type: if X contains an access value referencing an integer object, and X is assigned to Y (assuming both X and Y are assignment compatible), then both X and Y contain alias names for the same integer object. Changing the value of that integer object can be accom- plished using the value assigned either to X or to Y as its name. Alias names cause a problem when there is an operation which causes a dynamically allocated object to become deal- located. A deallocated object is no longer in use, and so can be reused to satisfy another allocation request for dynamic storage space. If a particular object has alias names, and one of those names is used as an argument to the operation which deallocates the object, then the alias names will reference a part of storage which is supposed to be currently inaccessible and available for reuse. This situa- tion is referred to as dangling references. In order to address the dangling reference problem caused by alias names, there are various storage reclamation schemes which ensure that an object is not reused while there are outstanding names for the object. The Ada stan- dard does not require an implementation to provide such a mechanism, although it is permitted. The implementation of dynamic objects assumes that storage reclamation is not pro- vided. A storage reclamation scheme must either know the set of names referring to a particular object at all times, or else must know what names are available for referencing any object at all times and know the set of objects currently allocated. The first alternative can be simplified to merely record the number of names for a particular object. 20 This is called reference counting, and permits incremental storage reclamation, since as soon as there are no names for an object, it can be deallocated. The second alternative is referred to as garbage collection. When the available free space for allocating new dynamic objects is exhausted, all objects accessible from the known set of available name values are marked as in use. All dynamic objects not marked can then be reused. Either version of storage reclamation mentioned above is satisfactory within the run-time environment supporting either an interpreted language, such as LISP, or a compiled language, such as Ada. Both an interpreter and a compiler can ensure that reference counting occurs where required, and both can ensure that the garbage collector knows the set of names which may reference dynamic storage. Unfor- tunately, neither implementation is completely satisfactory if built on top of an interpreted or compiled language. The reasons are discussed in the following subsections. 21 2.3.2.1. Reference Counting The reference counting scheme requires that a count of the number of different names by which a dynamic object can be referenced be maintained at all times. In particular, this means that whenever a name value is assigned to a name variable, the reference count of the object named be incre- mented, and if the name variable originally referred to some other object, that object's reference count be decremented. Since the Ada assignment statement does not do this for us, an assignment operation must be defined and use of the Ada assignment statement should be prohibited. Ada provides a mechanism for prohibiting the assignment statement by defining the type representing dynamic object names as lim- ited private. Unfortunately, since assignment is a state- ment, overloading of the ":=" symbol is not permitted, so the assignment operation must be defined as a procedure. For any other symbolic expression operations, the internal implementation can maintain reference counts as required. If this were sufficient to reclaim all inaccessible dynamic object storage, the implementation would be extermely powerful and would hide all details of the recla- mation process from a programmer, even to the extent of hid- ing its existance. The latter comment is significant, in that an implementation of symbolic expressions running on an Ada run-time system which supports automatic storage recla- mation could allow the run-time system to do the work, while an implementation running on a more restricted run-time sys- tem could take care of it, and the programmer would not have to consider the issue in his program. Nevertheless, the above approach is not without its problems. While declaring types to be limited private does prevent the use of the assignment statement, this also has the unfortunate effect of prohibiting initializations, con- stants and default parameters and precludes a programmer from ignoring the storage management facilities if the Ada run-time system provides storage management. In addition, there are three important situations that arise when implementing reference counting on top of a run- time system which make it impossible to hide the fact that storage reclamation exists from a programmer. These prob- lems affect the manner in which a programmer will have to write his code. These situations are: (a) Aliasing of names via subprogam parameters. (b) Exit from the scope of a locally declared name vari- able. 22 (c) Returning a name value as the result of a function. The real problem is that there is no compiler support for locating name values. The only access to name values available to a Ada programmer is via name variables or name constants, whereas the compiler itself knows where all of the name values will be at run-time. 23 2.3.2.1.1. Name Aliasing via Subprogram Parameters Within a subprogram with a parameter of the symbolic expression type, the formal parameter establishes a new alias for the object named by the actual parameter. Since parameter passing is implemented below the level of storage management in this situation, the reference count of that object has not been properly adjusted. If the programmer is allowed to use an operation on the formal parameter which adjusts the reference count downward, the dynamic object's reference count may reach zero prematurely, and be treated as inaccessible by the storage reclamation process. At this point, the actual parameter may now contain a dangling reference. On the other hand, if the actual parameter is a visible global (with respect to the subprogram) variable and this variable is used as an argument to an operation which adjusts the reference count downward, resulting in the object becoming reclaimed, the formal parameter name con- tains a dangling reference. This latter case is a side effect that happens to be defined as an erroneous program by the Ada standard <6.2, 13>, since the standard then defines the formal parameter to have an undefined value. Assume that the symbolic expression type is defined as limited private in order to prevent use of the Ada assign- ment statement, forcing the type to be used only as a sub- program parameter of mode in. If all operations which can have the internal effect of modifying a reference count have the appropriate parameters defined as mode in out, then the programmer will be prohibited from using these operations on subprogram formal parameters, thus enforcing the Ada interpretation of a formal parameter of mode in as a local constant. This will force all operations which may reclaim object space to be applied only to variables. This will not prevent the programmer from getting into the second situa- tion, unless the run-time system can enforce it. The disadvantage of this solution is that, since the programmer cannot pass symbolic expressions as output param- eters from a subprogram, it will encourage use of global variables, which tends to be discouraged by most of the recent program design methodologies. This solution does, on the other hand, tend to encourage a functional programming style of program design, where the only way to get new values is as the returned value of a function subprogram with no side effects on function parameters. Unfortunately, the process of returning a symbolic expression value from a function has its own problems, discussed later. 24 2.3.2.1.2. Scope Exit for a Local Variable If a dynamic object is created and its name value assigned only to a name variable declared in a local declaration scope, then on exit from that scope, the dynamic object can be reclaimed since there are no longer any refer- ences to it. However, with no interpreter/compiler support for object reclamation, the programmer must be relied upon to explicitly indicate this situation. Therefore, he must be given a special operation to be used to indicate to the dynamic storage reclamation process that a name value is to be eliminated, causing the reference count of the named object to be decremented. 25 2.3.2.1.3. Returning a Value from a Function In order to conveniently manipulate dynamic objects, the objects themselves are seldom actually copied. Instead, names of dynamic objects are manipulated as values (in Ada, as values of some access type). As previously discussed, all formal parameters of the symbolic expression type will be local constant name values within a subprogram, and so cannot be used as name variables to construct a new symbolic expression value to return as a function result. Also, the programmer must explicitly apply a deallocation operation to all local name variables before leaving a declaration scope, which includes the body of a subprogram. There are two aspects of this situation which cause problems, which are really the two sides of the more general issue of how to deal with a temporary name value. On the one hand, a name value which is not currently assigned to any name variable needs to be treated as naming a dynamic object with reference count zero, so that when that value is assigned to a name variable, the assignment operation, which increments reference counts, will leave the reference count at one. On the other hand, within the function body, the name value is most likely assigned to a local name variable which will be used as the expression of the Ada return statement to define the result value of the function call. Since the value is assigned to a variable, it has reference count equal to (at least) one, so the value of the function call, when assigned to a variable of the outer dynamic range, will have a reference count equal to (at least) two. In this situation, the dynamic object named appears to have more alias names than it actually has, so the dynamic object storage will never be reclaimed by the storage reclamation process. As additional problem arises if the returned value is never assigned to a variable, but rather is immediately passed as an actual parameter of some other subprogram call, which never internally assigns it to a variable. In this situation, there is no name variable available with which to free the dynamic object. A function result can be arranged to have the appropri- ate reference count only if the programmer is given a spe- cial operation to be used only as the result expression of a return statement within a function subprogram which needs to get its return value from a local variable. The parameter of this function could only be a name variable which con- tains the name value of the desired result of the function call. Since there is no way to enforce proper use of this special operation, it would have to be up to the programmer to use it where it is required, and only where it is required, which is an undesirable but inescapable situation. 26 The problem arising from never assigning a name value returned from a function cannot be solved without explaining the situation to the programmer and leaving it up to him to do the right thing in each particular situation. An easy convention would be, of course, to always assign the result of a function call to a variable. Alternatives considered, but abandoned as needlessly complex, included the following: (a) Attach the names of all newly created dynamic objects, or those which are returned from a function via a spe- cial result operation with reference count equal to zero, to some globally known list of temporary values. The assignment operation applied to a name value on this list removes it from the list. Name values left hanging around on this list must be reclaimed by a spe- cial operation invoked by the programmer when he knows that all possible temporary values are no longer required. (b) Provide dynamic scope start and end operations. All symbolic expression variables would have to be attached to some dynamic scope by use of a dynamic instantiation operation separate from the Ada declaration. All tem- porary values not assigned to a variable are also attached to the current dynamic scope; values returned from functions via a special result operation with reference count equal to zero are attached to the dynamic scope enclosing the location of the function call. On execution of the dynamic scope end operation, all values local to the dynamic scope are reclaimed. 27 2.3.2.2. Garbage Collection Implementing a garbage collector is somewhat cumbersome in a language like Ada, since the only useful characteriza- tion of names is the name variable. The only entity that knows about the locations of name values is the compiler. Therefore, name value tracking has to be implemented as name variable tracking, using some mechanism for maintaining a collection of dynamic name variables. The approaches are all variations of the same basic theme of implementing the symbolic expression type as a reference to some dynamically allocated component of an object known to the storage allo- cation and reclamation operations. Each such component is, or contains, a slot which is a name variable. Given this approach, whenever desired, the storage rec- lamation process knows where all of the name values for dynamic objects can be found, and, if it can also gain access to all allocated dynamic objects, it can mark all objects accessible from any name value and then locate all unmarked objects, which can be reclaimed. If each variable of type symbolic expression is allo- cated to a unique name variable in this global set of names, then the problems associated with aliasing via subprogram parameters, as discussed for reference counting, go away in the sense that side effects are consistently propagated to all aliases. Operations which reclaim dynamic storage do not change the set of currently active names, so both aliases generated by an actual/formal parameter association always share the same dynamic object (or lack thereof). This has the disadvantage of potentially violating the spirit of Ada's parameter mode in, since a formal parameter could be modified and thus modify the actual parameter even in this mode. The problems associated with exit from the scope of a locally declared symbolic expression variable and with returning a symbolic expression as the result of a function call are essentially unchanged, requiring the same basic solutions, which are not repeated here. 28 2.3.2.3. Summary Based upon the previous discussion, two subprograms will be provided for each AI data type to assist in storage management. (a) A procedure to indicate that a variable is at the end of its scope and that its contents should be deallo- cated. (b) A function which can decrement the reference count of a variable which is used to return a value from a func- tion. In addition, it has been determined that the use of limited private types will not provide sufficient assistance with storage management to overcome the disadvantages they impose. Therefore, it has been decided that the AI data types developed in these packages be declared as private to provide sufficient data encapsulation without undue restric- tions. 29 2.3.3. Multi-Tasking Synchronization Ada provides a powerful and general multi-tasking mechanism as an integral part of the language. A task is an independent flow of control which interfaces in a discip- lined way with other flows of control, each with its own internal data. The tasking constructs provide intertask communication using message passing. The fact that tasks provide a high-level synchronization capability, and are the only Ada data type to provide such a capability, means that there will be many situations in which tasks will be used purely as a synchronizing mechanism to implement other con- trol abstractions, such as co-routines. In addition to high-level communication constructs, tasks provide high-level access to facilities commonly pro- vided in a machine-dependent fashion in other languages. For example, the delay statement enables a task to suspend its execution until at least the specified time interval has elapsed, allowing a program to control the frequency of its execution. Once the use of tasks as a primitive construct out of which more complex abstractions are built becomes common- place, complex applications composed of explicitly con- current or parallel processes will become readily accepted solutions. Under these conditions, the problems of effi- ciently synchronizing access to shared global data must be dealt with. This will become especially important when, for example, expert systems technology is applied to airborne applications and is installed within embedded computers in a multi-tasking environment. Expecting that synchronization can be efficiently accomplished outside of the implementa- tion package for symbolic expressions is probably naive. This is especially true if the implementation of symbolic expressions includes hidden structure sharing. It is more reasonable to expect that a version of the symbolic expres- sion package will be available which provides, internally, the proper synchronization efficiently tailored to the par- ticular implementation. The initial implementation will not provide such inter- nal synchronization mechanisms. However, the interface to the symbolic expression facilities will be designed to allow such synchronization to be installed at the lowest implemen- tation levels within another version of the product without requiring changes to the visible interface. 30 3. Functional Specifications All visible Ada type identifiers which represent the visible objects defined below will be defined as private. This will ensure that only the provided operations will be used to manipulate these objects. 3.1. Rulebases 3.1.1. Introduction The two major design decisions for rules are how to represent them and where to put them for fast retrieval. Retrieval of rules is independent of rule representation, so the approach taken is to encapsulate the retrieval issues within objects representing rulebases. The representation issues for rules will be deferred to a lower level of abstraction. The term rulebase, as used here, refers to a collection of rules and facts representing a knowledge base, or to a temporary collection of facts, rules and a query represent- ing the current state of computation for a problem. Structure sharing is an important criteria for this implementation in order to avoid generating an exploding requirement for storage when rule data bases are manipulated with set operations which may be used to implement certain control strategies. An obvious solution to the requirement for fast retrieval is to provide some type of internal indexing of the rules present in a data base. Since it is assumed that the structure of a rule is a set of terms in predicate form, separated by ANDs and ORs, a simple approach would be to build a hash table indexing the predicate symbols of each of the terms in a rule. 3.1.2. Objects (a) Rulebase, which provides associative access, or content addressing, for its contents. Retrieval of the con- tents is achieved by pattern-directed retrieval, which will allow access of the data base contents by partial specification of the desired content of a retrieval. Rulebase variables will allow implementation of extremely flexible control strategies for expert system and logic programming implementation. For example, conditional execution of subsets of rules can be imple- mented by separating rules into individual rulebases which are selected for use at each unification step 31 using Ada's conditional control structures. The set of rulebases could be implemented as an array of rule- bases. This would provide a powerful mixture of non- procedural programming with the more conventional pro- cedural facilities of Ada. This also eliminates the restriction to a single knowledge base common to con- ventional logic programming and production systems. (b) The null rulebase, a constant representing a rulebase which has no rules. 3.1.3. Operations 3.1.3.1. Rulebase Operations (a) Determine if a rulebase is empty. (b) Determine if two rulebases are equivalent. (c) Convert a rulebase to a symbolic expression. (d) Convert a symbolic expression to a rulebase. (e) Assign values to variables of type rulebase. (f) Delete an unused rulebase. (g) Return a rulebase bound to a local variable as the value of a function. (h) Add a rule to a rulebase. (i) Delete a rule from a rulebase. (j) Associative retrieval using a rule as a query, yielding a symbolic expression containing instantiated values which satisfy the query. (k) Set operations on rulebases - union ("and"), intersec- tion ("or"), difference ("-") and exclusive union ("xor"). (l) Input/output operations for rulebases. The conversion from/to symbolic exrepssions provides a primitive inteface between the symbolic computation facility and the logic programming facility. The control strategies for logic programming all need some form of matching with variable binding, which suggests that the antecedent and consequent of a generalized rule should be implemented using patterns with pattern matching operations defined. It is 32 intended that the contents of a rulebase are uninstantiated rules. 33 3.2. Rules 3.2.1. Introduction The representation of rules depends primarily upon how variables are represented. We will implement rules in terms of patterns and defer this issue to a lower level in the abstraction hierarchy. 3.2.2. Objects Other than as stated below, no interpretation is imposed by this level of abstraction on the contents of a rule. Rules are represented as a data structure consisting of a template and a (possibly null) binding context. The template is a pattern consisting of two sub-patterns which represent the (possibly null) antecedent and consequent parts. When a rule is created, the two sub-patterns are treated as a single pattern so that the variable binding context is shared by both components of the rule template. This allows variable bindings to apply across an entire rule. (a) Rules, which are a subtype of rule composed of a conse- quent, or head, and an antecedent, or body; the interpretation of a rule is that the truth of the antecedent implies the truth of the consequent; (b) Facts, which are a subtype of rule composed of a conse- quent only; the interpretation of a fact is that the truth of the consequent does not depend on anything; (c) Querys, which are a subtype of rule composed of an antecedent only; the interpretation of a query is that the truth of the antecedent is unknown, but may be inferred from some collection of facts and rules with an appropriate inference mechanism; (d) Variable binding context, which are associated with an instance of a rule during rule matching operations, and keep track of variable bindings. (e) The null rule, a constant representing a rule which has a null antecedent, consequent and binding context. 3.2.3. Operations (a) Determine if a rule is null. 34 (b) Determine if one rule is equal to another. (c) Determine if a rule is a fact, query or general rule. (d) Create a rule by conversion from a symbolic expression. (e) Convert a rule back to a symbolic expression. (f) Extract the antecedent or consequent from a rule as a pattern. The variable binding context of the extracted component is shared with the original rule. (g) Extract the variable binding context from a rule. (h) Associate a specific variable binding context with a rule. (i) Match two rules, retaining variable bindings in the associated variable binding contexts. (j) Instantiate a rule, yielding a symbolic expression with all variables replaced by the values to which they are bound. (k) Make a rule unique by uniquely renaming all variables within the rule. (l) Assign values to variables of type rule. (m) Delete an unused rule value or rule instance value. (n) Return a rule value bound to a local variable as the value of a function. (o) Input/output operations for rules. 35 3.3. Patterns 3.3.1. Introduction The representation of patterns depends upon how vari- ables are to be represented. Our approach is based upon structure sharing of the literals of a pattern (the tem- plate) with binding list which is unique to each instance of a pattern. 3.3.2. Objects (a) Pattern variables, which are special atomic components of a symbolic expression used to indicate component positions in an expression which will match an arbi- trary component of another expression. (b) Patterns, which are symbolic expressions consisting of a pattern (possibly null) template and a (possibly null) binding context. (c) Variable binding contexts, a collection of variable binding associations used to hold the result of a match operation for variables of a given pattern. Variable binding contexts are unique to a particular pattern, but may be shared by a pattern which is extracted from some other pattern. (d) The null pattern, a constant representing a pattern with a null template and binding context. 3.3.3. Operations (a) Determine if a pattern is null. (b) Determine if one pattern is equal to another. (c) Create a pattern by conversion from a symbolic expres- sion. (d) Convert a pattern back to a symbolic expression. (e) Extract the first component from a pattern. The vari- able binding context of the extracted component is shared with the original pattern. (f) Extract all components except the first from a pattern. The variable binding context of the extracted com- ponents is shared with the original pattern. 36 (g) Extract the variable binding context from a pattern. (h) Associate a specific variable binding context with a pattern. (i) Match two patterns, retaining variable bindings in the associated variable binding contexts. (j) Instantiate a pattern, yielding a symbolic expression with all variables replaced by the values to which they are bound. (k) Make a pattern unique by uniquely renaming all vari- ables within the pattern. (l) Assign values to variables of type pattern. (m) Delete an unused pattern. (n) Return a pattern bound to a local variable as the value of a function. (o) Input/output operations for patterns. 37 3.4. Symbolic Expressions 3.4.1. Introduction A symbolic expression is either null, or is a single atomic component, or is an ordered collection of components, each of which is itself a symbolic expression. At each level of recursion in a symbolic expression, the components are ordered sequentially. Consequently, it is easiest to deal with symbolic expressions in terms of this sequential component ordering. A non-atomic symbolic expression is considered to have a first, component, which can be extracted directly. The only other primitive access operation is to extract everything except the first com- ponent, that is, to extract the rest of the symbolic expres- sion. The primitive construction operation prefix is the direct inverse of these two access operations, and makes two symbolic expressions into the first and rest components of a new symbolic expression. Most of the other symbolic expres- sion manipulation operations are provided for efficiency and ease-of-use only, as they can all be defined in terms of these primitives. 3.4.2. Objects (a) Atomic expressions, which contain values of some application-defined type. It is assumed that, for any AI production program, the values of an atomic expres- sion are from at most a finite set of types. The application developer can define the atomic type as a record with variant part encompassing the set of types if there is more than one atomic type required. (b) Non-atomic expressions, which consist of an ordered collection of components, each of which is itself a symbolic expression. (c) The null symbolic expression, which may be considered either atomic or non-atomic, depending on the context. The actual structure of a symbolic expression will be hidden from the user. 3.4.3. Operations (a) Determine if a symbolic expression is null. (b) Determine if a symbolic expression is atomic. 38 (c) Determine if a symbolic expression is non-atomic. (d) Determine if a symbolic expression is a variable. (e) Determine if one symbolic expression is equal to another. (f) Determine if one symbolic expression is a member of another. (g) Create a symbolic expression from a user-supplied atomic value. (h) Create a symbolic expression variable with a given name. (i) Set the tag associated with a symbolic expression vari- able. (j) Extract the user-supplied atomic value from a given atomic symbolic expression. (k) Extract the name from a given symbolic expression vari- able. (l) Extract the tag from a given symbolic expression vari- able. (m) Assign values to variables of type symbolic expression. (n) Delete an unused symbolic expression. (o) Return a symbolic expression bound to a local variable as the value of a function. (p) Prefix one symbolic expression onto another. (q) Append one symbolic expression onto another. (r) Determine the length (number of top-level components) of a symbolic expression. (s) Extract the first component of a symbolic expression. (t) Extract all components except the first of a symbolic expression. (u) Extract the last component of a symbolic expression. (v) Extract the nth component of a symbolic expression. (w) Extract the first component of a symbolic expression n times using the result of the previous extraction as 39 the new argument. (x) Extract all components except the first of a symbolic expression n times using the result of the previous extraction as the new argument. (y) Reverse the order of components within a symbolic expression. (z) Delete all top-level occurrences of one symbolic expression from another. (aa) Replace all top-level occurrences of one symbolic expression within another with a new symbolic expres- sion. (bb) Flatten a symbolic expression by extracting all non- atomic components while retaining all atomic components of the given symbolic expression. (cc) Set operations on symbolic expressions - union ("and"), intersection ("or"), difference ("-") and exclusive union ("xor"). (dd) Extract either the first or all the non-atomic com- ponents of a symbolic expression whose first component matches a given symbolic expression. (ee) Input/output operations for symbolic expressions. 40