Review of [Geis 1995]

Hans Dybkjær, Review in Natural Language Engineering, Volume 4, Issue 2 (June 1998), Pages 175-190, ISSN: 1351-3249.

This book presents Dynamic Speech Act Theory (DSAT) as an attempt to account for competence in naturally occurring dialogue. The book is relevant both to conversation theorists and to developers of conversational devices such as advanced spoken dialogue systems. The book is well-written and interesting reading.

DSAT departs from speech act theory, conversation analysis, and artificial intelligence approaches to natural language processing. The "dynamic" part of DSAT comes from the claim that for a theory of conversational competence to be useful, it must be incorporable in a computational model of utterance generation and understanding. The presentation of and argumentation behind DSAT are based on analyses of transcriptions of real-life conversations, a very sound approach.

The book starts out with a review and critique of traditional speech act theory à la Searle. The traditional theory is claimed to fail for two main reasons: 1) There is no effective mapping of complex multiturn interactions into traditional speech acts. 2) Even if there were such a mapping, associating speech acts with individual utterances does not explain the interactions in which they occur.

Geis states that the claim of traditional speech act theory that any sentence has a primary force is false. Communicative acts in dialogue are not the results of single utterances but of the continued effort of the participants. Moreover, in accordance with conversation analysis, speech acts are social, not linguistic. The alternative offered by DSAT is that utterances have transactional significance related to the ostensible goal and/or interactional significance reflecting "face work", the interpersonal side of interaction.

The central means of administering the utterance effects during conversation is the DSAT interaction structure. This structure contains (final) transactional and interactional effects, initial-state conditions, satisfaction conditions (ability and willingness), and a set of domain predicates for each action referred to in the effects and conditions. DSAT interaction structures are highly domain dependent. Geis uses a "slot and filler" representation for interaction structures. They seem to originate from AI-based natural language processing and clearly have roots back to Marvin Minsky's frames. Geis does not explain how such interaction structures (frames) may be designed, nor how to exploit the similarity of many such frames or of their domain predicates. However, for the construction of restricted task-domain systems such as current task-oriented spoken language dialogue systems the approach seems practical and promising. Two related approaches to dialogue modelling are the interaction-as-filling-missing-axioms [Smith and Hipp 1994] and the use of dialogue patterns [Novick and Sutton 1996].
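
As an illustration only, the slot-and-filler idea can be sketched in Python. This is not Geis's notation: the class name, the field names, and the ticket-purchase domain are all hypothetical and were chosen merely to mirror the components listed above (effects, initial-state conditions, satisfaction conditions, domain predicates).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InteractionStructure:
    """Hypothetical frame for one encounter type (all names are illustrative)."""
    encounter: str
    transactional_effect: str
    interactional_effect: str
    initial_state_conditions: List[str]
    satisfaction_conditions: Dict[str, str]   # keys: "ability", "willingness"
    domain_predicates: List[str]              # slots to be filled by utterances
    fillers: Dict[str, str] = field(default_factory=dict)

    def fill(self, predicate: str, value: str) -> None:
        """Instantiate one domain-predicate slot from an utterance."""
        if predicate not in self.domain_predicates:
            raise KeyError(f"not a slot of this structure: {predicate}")
        self.fillers[predicate] = value

    def open_slots(self) -> List[str]:
        """Return the slots that no utterance has instantiated yet."""
        return [p for p in self.domain_predicates if p not in self.fillers]


# Invented example domain: a ticket purchase treated as a service request.
ticket_request = InteractionStructure(
    encounter="service request",
    transactional_effect="clerk sells ticket to customer",
    interactional_effect="face of both participants is maintained",
    initial_state_conditions=["customer wants a ticket"],
    satisfaction_conditions={
        "ability": "clerk is able to sell the ticket",
        "willingness": "clerk is willing to sell the ticket",
    },
    domain_predicates=["destination", "date"],
)

ticket_request.fill("destination", "Copenhagen")
print(ticket_request.open_slots())  # prints ['date']
```

The sketch also makes the domain dependence visible: every field except the two satisfaction-condition keys has to be written anew for each encounter type, which is exactly the design effort that the book leaves unaddressed.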

The interaction structure corresponds to Searle's speech act structure as follows: a) The transactional effect corresponds to Searle's essential condition. An important point of DSAT is the addition of an interactional effect. b) The initial-state conditions correspond closely to Searle's sincerity condition, but emphasise the psychological state more strictly. c) The satisfaction conditions correspond to Searle's preparatory conditions, but add willingness to ability, and restrict the scope to conditions that must be satisfied before the [...] d) The domain predicates replace Searle's propositional content condition because, while single utterances might have a propositional content, multiturn interactions cannot.

According to Geis, terms like request, offer, etc. are useful in informal discussions of what participants are doing in multiturn interaction, but do not play a role in achieving and recognising goals. DSAT instead maps utterances in multiturn interactions into interaction structure conditions and domain predicates, instantiating slots of the DSAT interaction structure. Given that Searle's five general categories of speech acts (assertives, directives, commissives, expressives, and declaratives) seem to have psychological significance [Searle 1983], and that sets of concrete instances of speech acts are widely employed in dialogue managers in natural language dialogue systems, the relation between these "traditional" speech acts and the DSAT interaction structure mappings deserves more consideration than Geis gives it. In DSAT the acts seem to pop up again as multiturn communicative actions, called encounters, such as service requests and invitations. Each interaction structure corresponds to an encounter in a certain domain (note that a given interaction may address several interaction structures). I believe that this approach and Geis' claim that speech acts are not properties of individual utterances may be right, but I would like to see more real-data tests of the respective qualities of DSAT and individual utterance speech acts.

A nice consequence of the framelike interaction structures is that computations of Gricean implicatures are reduced or even avoided. Conditions and subconditions are directly instantiated from one another according to the structure hierarchy, essentially providing "precompiled inferences".

Also, Geis argues and illustrates that simply storing representations of the transactional and interactional significance of utterances in DSAT interaction structures accounts for adjacency pairs and insertion sequences. Adjacency pairs are in focus in conversation analysis but are criticised by e.g. Searle [1992].

Moreover, the complexity of indirect speech acts is replaced with the simple notion of direct communication (utterances that instantiate primary conditions of interaction structures) and indirect communication (utterances that instantiate preconditions of primary conditions).
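
The direct/indirect distinction can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the condition strings, the `PRECONDITIONS` table, and the `classify` function are hypothetical and do not come from the book. The table plays the role of the structure hierarchy, so that finding the primary condition behind an indirect utterance is a lookup rather than an inference.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical hierarchy: each primary condition of an interaction
# structure is mapped to the preconditions stored beneath it.
PRECONDITIONS: Dict[str, List[str]] = {
    "customer_wants(ticket)": ["customer_needs(travel)"],
    "clerk_able(sell(ticket))": ["seats_available(flight)"],
}


def classify(condition: str) -> Tuple[str, Optional[str]]:
    """Classify the condition an utterance instantiates.

    Returns ("direct", primary) if it is itself a primary condition,
    ("indirect", primary) if it is a precondition of one, and
    ("unrelated", None) otherwise.
    """
    if condition in PRECONDITIONS:
        return ("direct", condition)
    for primary, preconditions in PRECONDITIONS.items():
        if condition in preconditions:
            return ("indirect", primary)
    return ("unrelated", None)


# "Can you sell me a ticket?" addresses the clerk's ability directly.
print(classify("clerk_able(sell(ticket))"))
# "Are there seats left on that flight?" addresses a precondition of it.
print(classify("seats_available(flight)"))
```

In this toy setting a question about available seats is indirect because it instantiates a precondition of the clerk's ability to sell, and the lookup immediately yields the primary condition it stands for.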

Finally, I should mention that for the purpose of illustrating the computability of DSAT it is embedded in the Discourse Representation Theory (DRT) of Kamp and Reyle. Arguably this embedding enables DRT to account for conversational interaction. This part of the presentation is illustrative and sound, even though I still fail to see the first practical use of DRT.

Taking all this together, Geis convincingly presents Dynamic Speech Act Theory as part of the basis for further theoretical and practical work on conversation, relating the structure of conversation to utterance understanding and utterance generation via mappings to an underlying interaction structure.