Illustration of guidelines through examples

In the following, each individual guideline for cooperative human-machine dialogue is explained and illustrated. The presentation is organised by aspect of interaction: for each aspect, the relevant generic guidelines are described, and for each generic guideline, the specific guidelines it subsumes are described.

Each guideline is expressed in a short-form version, immediately followed by its full-form version. For each guideline a justification is provided. The justification explains the nature of the problems that may occur in user-system interaction if the guideline is violated.

The justification is followed by one or more examples of violations and/or correct use of the guideline. Examples include excerpts of user-system interaction which are shown in the left-hand columns of the tables below. Comments are provided in the right-hand columns. Comments on guideline violations include suggestions for repair of the problem illustrated by the violations.

In the examples, S means system and U means user. [...] indicates that part of the transcribed interaction has been omitted.

Text in square brackets in the left-hand column provides context for the interaction excerpt. After each comment in the right-hand column a reference in square brackets refers to the dialogue from which the example in question was drawn. All examples derive from user dialogues with the Danish Dialogue System during various stages of its development: from early Wizard of Oz (WOZ) simulations (indicated by WOZxSyDz, i.e. WOZ iteration x, subject y, dialogue z) and from the user test of the implemented system (indicated by 'user test', followed by a unique dialogue identifier). All examples have been translated from Danish.

[< | >] Contents

Aspect 1: Informativeness
GG1 Say enough
SG1 State commitments explicitly
SG2 Provide immediate feedback
GG2 Don't say too much
Aspect 2: Truth and evidence
GG3 Don't lie
GG4 Check what you say
Aspect 3: Relevance
GG5 Be relevant
Aspect 4: Manner
GG6 Avoid obscurity
GG7 Avoid ambiguity
SG3 Ensure uniformity
GG8 Be brief
GG9 Be orderly
Aspect 5: Partner asymmetry
GG10 Highlight asymmetries
SG4 State your capabilities
SG5 State how to interact
Aspect 6: Background knowledge
GG11 Be aware of users' background knowledge
SG6 Be aware of user inferences
SG7 Adapt to target group
GG12 Be aware of user expectations
SG8 Cover the domain
Aspect 7: Meta-communication
GG13 Enable meta-communication
SG9 Enable system repair
SG10 Enable inconsistency clarification
SG11 Enable ambiguity clarification

In the section headings the links '<', ' | ', and '>' refer to the previous aspect or guideline, the contents of this section, and the next aspect or guideline, respectively.

[< | >] Interaction aspect 1: Informativeness

The first group of two generic and two specific guidelines addresses the amount of information that should be provided in an interaction turn.

[< | >] GG1: Say enough
Make your contribution as informative as is required (for the current purposes of the exchange)

If the system's contribution is not sufficiently informative, this will typically lead to misunderstandings which may only be detected much later during interaction or, at best, to an immediate request for clarification by the user. Requests for clarification tend to be difficult for the system to handle.

Examples of GG1: Say enough
S:
Are you particularly interested in discount?
U:
Yes.
[...]
S:
At which time?
U:
It must be on Saturday at 7:20.
S:
There is no departure at 7:20. The closest other red departure is at 10:50.
Violation: The system withholds important (non-requested) information. The user is not being told that there is a blue departure at 7:20. The system should have said that there is no red departure but that there is a blue departure at the chosen hour. [User test, T32a]
S:
There is no departure at 21:00. The closest other red departures are at 19:45 and 22:40. Do you want one of these departures?
U:
Yes.
S:
Which?
U:
Repeat.
S:
There is no departure at 21:00. Which one?
Violation: The system withholds requested information. It only repeats part of its previous response excluding the list of possible departures. The easiest solution is to let the system repeat its entire previous utterance since it is impossible to know in advance which part a user will be most interested in having repeated. [User test, G33a]
S:
Do you want more?
U:
Yes, I have two people who are going to Aalborg on Monday January 23 in the morning.
Violation: The system question is too open and non-specific. It invites the user to take the initiative and say all sorts of things. The question should be re-phrased into: 'Do you want to make another reservation?', since this is essentially what was intended and what the system can handle. [User test, T31aB]
S:
At which time?
U:
Afternoon.
S:
In the afternoon on Sunday January 29th there is a departure from Sønderborg to Copenhagen at 17:00. Do you want this departure?
Full departure information as requested: The system provides a full list of existing departures (in this case only one) within the time interval indicated by the user. [User test, G13b]

The two specific guidelines subsumed by GG1 elaborate on how to say enough in an interaction turn, given the context.

[< | >] SG1 (GG1): State commitments explicitly
Be fully explicit in communicating to users the commitments they have made

In spoken human-computer interaction design, as in spoken human-human conversation, it is good practice to summarise the commitments made during the dialogue to make sure that the key information exchanged has been correctly understood. This is sometimes called summarising feedback. A commitment means that the user has undertaken to do, buy, etc. something; examples are reserving a flight ticket, booking a hotel room and agreeing on a meeting date and hour. In information tasks, as distinct from ordering (or reservation) tasks, there are no such user commitments and hence no need to follow SG1. In the absence of summarising feedback, the user does not get an opportunity to detect misunderstandings that occurred earlier in the interaction. When the flight ticket for the wrong day arrives, it may be too late to do anything about it.
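
As a minimal illustration of summarising feedback, the sketch below assembles an explicit confirmation of the commitments collected during a reservation dialogue. It is a sketch only: the slot names and the phrasing are illustrative assumptions, not the actual implementation of the Danish Dialogue System.

    # Minimal sketch of summarising feedback (SG1). Slot names and phrasing
    # are illustrative assumptions, not the system's actual implementation.
    def summarise_commitments(reservation):
        """Build an explicit summary of the commitments the user has made."""
        return (
            "You have now booked {tickets} ticket(s) for {traveller} "
            "from {origin} to {destination} on {out_date} at {out_time}. "
            "Your reference number is {reference}. "
            "The total price is {price} kroner."
        ).format(**reservation)

    reservation = {
        "tickets": 1,
        "traveller": "Jens Kristiansen",
        "origin": "Aalborg",
        "destination": "Copenhagen",
        "out_date": "Monday January 16th",
        "out_time": "7:20",
        "reference": 52,
        "price": 1410,
    }
    print(summarise_commitments(reservation))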

Examples of SG1: State commitments explicitly
[The user has made a change to a ticket which was booked earlier. Another person is going to travel instead.]
S:
Do you want to make other changes to this reservation?
U:
No I don't.
S:
Do you want more?
U:
Ah no ... I mean is it okay then?
S:
The reservation with reference number 420 has been changed into a ticket for Aksel Hansen with ID-number 23.
U:
Yes that's fine.
S:
Do you want more?
Violation: The interaction had been designed such that no summarising feedback was provided on the commitments made by the user. A summary of the commitments made during the interaction allows the user to easily detect if everything has been correctly understood. In the example, the wizard improvises a confirmation when directly asked for one. [WOZ6 S12D2]
S:
You have now booked one ticket for ID-number 2 Jens Kristiansen from Aalborg to Copenhagen on Monday January 16th at 7:20 returning on Monday January 16th at 17:45. Your reference number is 52. The total price is 1410 kroner.
Commitment to buy a ticket: When, towards the end of a reservation dialogue, the system has all the information needed to book a flight ticket, it communicates explicitly to the user the commitments s/he has made. [User test, G14a]

[< | >] SG2 (GG1): Provide immediate feedback
Provide feedback on each piece of information provided by the user

In interactive speech systems design it is good practice to provide some kind of immediate feedback on each of those pieces of information provided by the user which are intended to contribute to the achievement of the goal of the dialogue, such as making a flight ticket reservation. Among other things, speech recognisers are still fragile and this is one reason why it is more likely that misunderstandings will occur in human-computer interaction than in human-human interaction. The sooner misunderstandings can be corrected, the better.

Immediate feedback provides users with an opportunity to detect misunderstandings immediately. A problem with immediate feedback is that it tends to make the interaction somewhat "heavier" than corresponding human-human exchanges. This is why various forms of "lightweight" feedback have been introduced as shown in the examples below, reserving full explicit feedback for the more difficult cases in which something has already gone wrong or there is a high risk of understanding failure. Falling back on full, immediate, explicit feedback in such cases is an example of graceful degradation.
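
The choice between lightweight and full feedback can be sketched as a simple policy, assuming the recogniser delivers a confidence score for each slot value; the threshold and phrasing below are illustrative assumptions only.

    # Minimal sketch of selecting a feedback form (SG2). The confidence
    # threshold and the phrasing are illustrative assumptions.
    LOW_CONFIDENCE = 0.5

    def feedback(slot_value, confidence, prior_failure=False):
        """Return the feedback part of the system's next utterance."""
        if prior_failure or confidence < LOW_CONFIDENCE:
            # Graceful degradation: fall back on full, explicit feedback
            # which asks the user to confirm.
            return f"{slot_value}. Is that correct?"
        # Lightweight echo feedback: repeat the key content and move on.
        return f"{slot_value}."

    print(feedback("From Aalborg", 0.92))   # echo feedback
    print(feedback("One person", 0.31))     # explicit feedback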

Examples of SG2: Provide immediate feedback
S:
Do you want a return ticket?
U:
No, we will book that one later. [Pause] It's okay. They are not going back.
S:
Red discount may be obtained on certain weekday departures and green discount may be obtained in weekends. In case of red discount the home journey may only start after two days at the earliest. Are you particularly interested in discount?
Violation: Boldface shows overlapping user speech. There is no immediate system feedback on the user's answer to the question whether s/he wants a return ticket. This makes it difficult to detect misunderstandings at once unless one knows the system and/or the domain well. Discount is not offered on one-way tickets. However, if the user does not know this (which in the current case s/he apparently does not), the misunderstanding will only show later in the interaction when the system asks for a return date. To solve the problem, the system should provide immediate feedback. [User test, T21a]
S:
Where does the journey start?
U:
The journey starts in Aalborg.
S:
From Aalborg. Where does the journey end?
U:
In Copenhagen.
S:
In Copenhagen. Do you want a return ticket?
Echo feedback: The system echoes the key contents of the user's input. This allows the user to detect a misunderstanding immediately. No extra turns are needed if the input was correctly understood because the system does not ask the user for explicit confirmation. [User test, T32a]
S:
At which time?
U:
Afternoon.
S:
In the afternoon on Sunday January 29th there is a departure from Sønderborg to Copenhagen at 17:00. Do you want this departure?
Implicit feedback: The system provides implicit feedback by including in its answer which date and time of the day it is talking about. [User test, G13b]
S:
One person. Please state the ID number of this person.
U:
Change.
S:
One person. Is that correct?
Explicit feedback: The system provides explicit feedback by asking the user to confirm whether the information it has is correct. [User test, G11b]

[< | >] GG2: Don't say too much
Do not make your contribution more informative than is required

The user may become inattentive or try to take the initiative if too much information is being provided in a single system turn. Also, the user may become confused about the actual interaction focus and about what the point of relevance is, initiating, for instance, clarification meta-communication as a result. However, guideline GG2 must be fairly heavily violated to produce an interaction problem that may endanger the success of the interaction. In most cases, GG2 problems will be revealed through users' subsequent comments and opinions on the system.

Example of GG2: Don't say too much
U:
I want information on discount for children.
S:
Accompanied children between 2 and 11 years of age may obtain discount on return journeys: red discount at 323 kroner or green discount at 400 kroner. Children between 12 and 19 years of age may obtain green discount at 550 kroner. Red and green discount can only be obtained on certain departures. Children less than two years old do not need a separate ticket if accompanied by an adult. Do you want information on other discount possibilities?
U:
Hmm .. I want to know more on children. What about children travelling alone?
Violation: A user in WOZ5 complained that the system talked far too much, that she got bored during interaction and that she had forgotten what she wanted when she was finally allowed to speak. This complaint probably does not refer to any particular point in the quoted system turn but rather to its overall design. In particular, the system talked too much when providing information on travel conditions. The quoted system answer is perhaps not exceptionally long in itself. However, if most system turns are this long, and if the user does not really get the information s/he wants because navigation in system-directed information tasks tends to be cumbersome, then users may easily feel that the system talks far too much. In such cases, the entire dialogue structure should be carefully examined to remove superfluous material. [WOZ5S3D5]

[< | >] Interaction aspect 2: Truth and evidence

The aspect of truth and evidence is addressed by two generic guidelines which state that if one believes something to be false, or if one does not have sufficient evidence that something is in fact true, then one should not make others believe that it is true. These guidelines are evidently quite fundamental to cooperative task-oriented spoken interaction in which the whole point of using the system is that it acts as a complete domain expert within its stated domain of competence. Guidelines GG3 and GG4 differ from the rest of the guidelines in that they refer to mental states, such as those of believing something or having evidence for something. All other guidelines appear to be more behavioural in nature, stating what the system should do or avoid doing. Some might question whether present-day computers have anything akin to mental states. Whatever the answer to this philosophical question, however, it is crucial to design the system's interaction, and the system more generally, such that it does not produce false beliefs in its users.

[< | >] GG3: Don't lie
Do not say what you believe to be false

It is obviously important that the user can trust what the system says. Users have good reason to become annoyed if the system provides them with false information on, e.g., departure times, prices or meeting venues. Still, this may happen despite the developers' best efforts to the contrary, for instance because of bugs in the database or wizard errors, as the examples demonstrate.

Examples of GG3: Don't lie
S:
At which time?
U:
Morning.
S:
The green morning departures are fully booked.
Violation: According to the flight file the first departure on this particular day is 9:15. This hour is not defined as 'morning' but as 'before noon' by the database (the Danish term 'morgen' denoting a more restricted time interval than the English 'morning'). The corresponding flight is not fully booked. Therefore, the system's message should have been that there are no morning departures. Apparently there is a bug in the database which should be found and corrected. [User test, G13b]
S:
Id-number 89, Carl Frederiksen has two reservations. One on Friday September 11 from Aalborg to Copenhagen and one on Saturday September 10 from Aalborg to Copenhagen. Is it one of these?
U:
Ahm ...something is wrong because Saturday September 10 must be a Thursday if September 11 is a Friday. Hmm ... Friday September 11 at 7:20.
Violation: The system's information on dates clearly cannot be true, as pointed out by the user. The error was made by a wizard. Careful domain model design is needed to avoid such problems. [WOZ6S6D4]

[< | >] GG4: Check what you say
Do not say that for which you lack adequate evidence

It is not sufficient to believe that something is probably right. For the system, this means that it must verify its information before passing it on to the user. Otherwise, the implication may be very much the same as for GG3, namely discontented users.

Example of GG4: Check what you say
Violation: We did not observe any violations of GG4 during the WOZ experiments or in the user test. An example could have been the following. A user asks for two tickets. The system checks and finds that there are just two free seats left on the desired flight. It then confirms that everything is OK and that the reservation has been made. However, if confirmation is given to the customer before the database has actually made and confirmed the reservations, another customer calling the system at approximately the same time might have managed to book the same two seats. Contrary to the system's confirmation, the former customer gets no tickets. In fact, the travel agency in which we made our corpus recordings had had this problem because they did not have on-line access to the database.

[< | >] Interaction aspect 3: Relevance

Irrelevant system utterances are those which do not address the current topic of the interaction. Just as system developers are likely to be vigilant in ensuring database correctness (cf. GG3 and GG4), interaction designers can be expected to generally avoid designing irrelevant system utterances. When relevance problems occur, they are often produced by factors other than the interaction design itself.

[< | >] GG5: Be relevant
Be relevant, i.e. be appropriate to the immediate needs at each stage of the transaction

Lack of relevance in the system's utterances will typically lead to clarification dialogue. The user will want to ask what the interaction is about in the first place or what the system meant by what it just said. System irrelevance may be caused by misrecognition. In such cases it may be hard to avoid violation of the guideline. The system's reply may be perfectly relevant given its interpretation of the user's utterance but totally irrelevant given what the user actually said.

Example of GG5: Be relevant
S:
At which date will the journey start?
U:
The journey starts on January 23.
S:
Today is not the 23rd of January.
Violation: A system misrecognition (grammar failure) triggers an error message which may appear rather irrelevant to the user who did not mention today at all. In this particular case the system clearly should be able to correctly recognise what was said. So the grammar needs modification. Recognition failure is more difficult. Threshold values (score values) for evaluating the probability of correct recognition might be helpful. These would allow the system to initiate repair if the score is too low rather than producing a response which in all likelihood is based on misunderstanding. [User test, G22a]
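
A minimal sketch of the score-based repair suggested in the comment above, assuming the recogniser returns its best hypothesis together with a score; the threshold value is an illustrative assumption.

    # Minimal sketch of score-based repair initiation (GG5). The threshold
    # value is an illustrative assumption.
    REJECT_THRESHOLD = 0.4

    def answer(hypothesis):
        # Placeholder for normal domain processing of the recognised input.
        return f"(answer based on: {hypothesis})"

    def next_move(hypothesis, score):
        """Initiate repair rather than answer when the recognition score is low."""
        if score < REJECT_THRESHOLD:
            # Likely misrecognition: ask for a repetition instead of producing
            # a response based on a misunderstood input.
            return "Sorry, I did not understand. Please repeat."
        return answer(hypothesis)

    print(next_move("the journey starts on January 23", 0.2))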

[< | >] Interaction aspect 4: Manner

The aspect of manner concerns the way in which intended meaning is being expressed. Whereas it is rather unlikely for developers to forget to design databases and interaction according to the guidelines of truth, evidence and relevance discussed above, it is much easier to design infelicitous expressions which violate the guidelines of manner. However, the misleading effects on users, if this happens, can be disastrous.

[< | >] GG6: Avoid obscurity
Avoid obscurity of expression

Obscurity naturally leads to doubt and need for clarification in the user. As emphasised above, the system is unlikely to be able to understand and respond appropriately to most clarification sub-dialogues initiated by the user.

Examples of GG6: Avoid obscurity
S:
At which time?
U:
He wants to depart at 10:45.
S:
There is no departure at 10:45, the closest other departure is at 17:00. Do you want this departure?
Violation: The system provides misleading information since the departure at 17:00 is the only departure that day. The user is left in doubt as to whether there might be an earlier departure. S/he therefore proposes an earlier departure no less than three times during the subsequent interaction (not shown) but is still only offered the departure at 17:00 hrs. The system should have said that the departure at 17:00 is the only departure that day. [User test, T53aA]
S:
At which time?
U:
Before noon.
S:
In the morning on Friday January 27th there is a departure from Copenhagen to Karup at 9:10 and 11:50 sold out. Do you want this departure?
Violation: The system's answer is difficult to understand because of its incorrect grammar. The intended meaning is that only the 9:10 departure has free seats whereas the departure at 11:50 is already fully booked. The output phrase should be rewritten in a grammatically correct and non-obscure form. [User test, G11b]

[< | >] GG7: Avoid ambiguity
Avoid ambiguity

Ambiguity creates a need for clarification if detected by the user. If undetected, as often happens, the effects of ambiguity can be even worse. If the user unknowingly selects a non-intended meaning of a word or phrase uttered by the system, all sorts of things can go wrong leading to repair meta-communication or even transaction failure.

Examples of GG7: Avoid ambiguity
S:
Hello, this is the DanLuft reservation service for flights between Copenhagen and Aalborg. Please answer the following questions briefly and one at a time. You may answer "interrupt" whenever you want. Do you want to have information, make a reservation or change a reservation?
[...]
S:
Do you want more?
U:
No thanks. I interrupt.
Violation: The keyword "interrupt" was intended to act as a command which would suspend domain communication and allow meta-communication to start. However, the intended meaning of the keyword was never explained to users. Users interpreted the keyword as shown in the example, i.e. as a signal that one wants to break off the interaction. To break off is probably the most frequent meaning of the word "interrupt" in Danish. The keyword "interrupt" should either be explained or, preferably, replaced by a contextually non-ambiguous command. [WOZ6S11D3]
S:
Where does the journey start?
U:
It starts in Hellerup.
Violation: Hellerup is not the name of an airport. It is a suburb of Copenhagen and is not in the system's vocabulary. The user rightly considers that the journey starts at home, that is in Hellerup, rather than at the airport. A more precise and unambiguous system question would be to explicitly ask for the departure airport. [User test, T52a]

The specific guideline subsumed by GG7 expresses a strategy for avoiding ambiguity in a relatively limited class of situations.

[< | >] SG3 (GG7): Ensure uniformity
Provide same formulation of the same question (or address) to users everywhere in the system's interaction turns

Given similar contexts, different formulations of, for instance, the same question will not necessarily be interpreted the same way by users. Even a uniform formulation of, e.g., the same question may be interpreted differently if there are no domain restrictions. However, in limited domains, such as those of interactive speech systems, use of uniform formulations in similar contexts tends to remove all ambiguity. An added advantage is that the use of uniform formulations helps reduce users' vocabulary because users tend to model the phrases used by the system.

Examples of SG3: Ensure uniformity
S:
At which time does the flight you want leave?
U:
I don't know.
S:
At which time?
U:
On Friday evening.
Violation: In the early WOZ iterations there were many examples of different formulations of the same system question or answer. In this example, the first question may be interpreted as a request for a precise hour of departure which the user apparently does not know. The second question, however, is interpreted more broadly and gives the user the opportunity to provide a time interval. [WOZ3S0D4]
S:
At which time?
[...]
S:
At which time?
Same formulation wrt. hour of departure: The system uses the same formulation when it asks for the hour of departure no matter whether the topic is the out journey or the return journey. [User test, T24a]

[< | >] GG8: Be brief
Be brief (avoid unnecessary prolixity)

The user may become bored and inattentive or may try to take over the initiative if the system talks too much. This guideline must be fairly heavily violated to cause a real interaction problem. Rather, the problem will be revealed through users' comments and opinions on the system.

Examples of GG8: Be brief
S:
Which time of the day are we talking about?
U:
We are talking about 17:45.
Violation: The lengthy formulation of the question is unnecessary. Moreover, the user is being made to model the superfluous part of the question. This results in an unnecessarily long user answer. Compare the following example. [WOZ5S2D2]
S:
At which time?
U:
7:20.
Brief question on hour of departure: The system question is stated very briefly and so is the user's answer. [User test, G14a]

[< | >] GG9: Be orderly
Be orderly

The system should address the task-relevant topics of interaction in an order which is as close as possible to the order expected by the user. If the user expects some topic to come up early during interaction, that topic's non-occurrence at its expected "place" may cause a user-initiated clarification sub-dialogue which the system cannot cope with. The study of the structure of human-human conversation in the domain for which the system is being designed may support orderly interaction design.

Examples of GG9: Be orderly
S:
On which date will the return journey start?
U:
Preferably Sunday.
S:
At which time do you want a departure?
U:
I would like a departure late in the afternoon. Is there ah ... any kind of discount possibility?
Violation: In the WOZ3 iteration we found several problems of the kind illustrated in the example. Having expected the topic of discount to come up for some time, users began to inquire about discount when approaching the end of the interaction. From WOZ6 onwards, users were asked early on whether they are interested in discount fares, thus blocking impatient questions about discount possibilities. [WOZ3S0D6]
Orderly dialogue structure: The interaction structure for flight reservation was carefully designed to be as naturally ordered as possible. Among other things, this was done by studying the structure of human-human conversation concerning the same task. [See dialogue T11b(A) of the user test]

[< | >] Interaction aspect 5: Partner asymmetry

Interaction partner asymmetry means that important differences exist between the interlocutors which are likely to influence the course and eventual success of the interaction. When learning to speak, we implicitly learn what a "normal" or "standard" partner in spoken interaction is. Unless otherwise told, we assume that our partner(s) in interaction is "normal" or "standard". If it turns out that this is not the case, we are trained to adjust our manner of speaking to the partner's abilities, such as when speaking to children, the hard of hearing or interlocutors who find themselves in noisy environments. The computer is in many respects a non-standard partner in spoken interaction and strongly needs to make its users aware of this fact, on pain of generating all sorts of miscommunication which it cannot possibly handle.

[< | >] GG10: Highlight asymmetries
Inform interaction partners of important non-normal characteristics which they should take into account in order to behave cooperatively during interaction. Ensure the feasibility of what is required of them

GG10 is an important and slightly complex guideline. In addition to guidelines stating how the system itself should behave to act cooperatively during interaction, such as the ones discussed above, guidelines are needed according to which the system should transfer part of the responsibility for cooperation to the user. Failure to do so will demonstrate that the system is not a cooperative speech partner. The guideline is that the non-normal interaction partner should inform its partners of the particular non-normal characteristics which they should take into account in order to act cooperatively. Consider an example from human-human conversation. To be cooperative, the human interlocutor who is in a noisy environment should inform the partners in conversation that this is the case. Having done that, it will be up to the partners to demonstrate their cooperativity by suitably modifying their conversational behaviour. In human-human conversation, this is normally accomplished without significant difficulty. Designers of spoken human-computer interaction can rely on the same flexibility in their intended human users, provided that the system actually does inform its users about its important non-normal characteristics, as stated in the first part of GG10. However, interaction designers may easily ignore the crucial precondition stated in the second part of GG10, namely that what they demand of the system's users should actually be feasible for the users to do. In other words, it is not sufficient for cooperative interaction design that the system tells the users that, to be understood by the system, they have to do X. Doing X may not be cognitively feasible for the users: due to cognitive overload or other factors, they may simply be unable to do what the system requires them to do during interaction. And if they are unable to do so, it will be difficult for them to get their task done through interaction with the system. Therefore, the user cooperativity asked for by the system should be feasible for its users. If it is not, difficult or impossible cases of miscommunication may proliferate.

Examples of GG10: Highlight asymmetries
U:
Yes, please. I would like to have 9:40.
S:
22:40. At which date will the return journey start?
U:
No, 9:40.
Violation: In its introduction, the system has asked its users to use the keyword 'change' if they want to take the initiative in order to correct the system's understanding of some previous utterance of theirs. During the actual interaction, however, the user is too preoccupied with the ticket reservation problem at hand to remember to use the designated keyword 'change'. It is simply unnatural for the user to use a specific keyword in situations such as the one illustrated. More natural mixed-initiative meta-communication should be allowed. [User test, T22a(A)]
S:
At which time?
U:
They must arrive at 9:30 at the latest.
Violation: In its introduction, the system has asked its users to answer its questions briefly and one at a time. In this example, the user is in fact being brief and only answers the question asked by the system. However, instead of providing a straight answer, the user provides an indirect answer. In human-human conversation, indirect answers such as the one illustrated are perfectly correct answers. An indirect response simply indicates that the interlocutor does not possess the information necessary to provide a direct answer. In response to the indirect user answer shown, a human travel agent would list the relevant departures. An interactive speech system, however, having limited inferential capabilities, will be unable to cope with most indirect responses. It is the task of the interaction model designer to either prevent indirect responses, such as the one illustrated, from happening, or enable the system to understand indirect responses of this type. [User test, T34a]
S:
At which time?
U:
What are the possibilities?
Violation: In its introduction, the system has asked its users to answer its questions and only take the initiative by using one of the two keywords 'change' and 'repeat'. Nevertheless, the user in this example asks a question and thus takes the initiative. What the example shows is that reservation dialogue, in its very nature, so to speak, is informed reservation dialogue. Users who are going to make a reservation or, more generally, to order something, do not always possess the full information needed to decide what to do. In such cases, they naturally ask for the information. Since the system poses as a perfect domain expert, this is legitimate. What users overlook, however, despite what was said in the system's introduction, is that the system does not have the skills to process their questions. Again, the interaction designers have failed to take into account that users should only be asked to do what is feasible for them. [User test, T12a]
S:
On which date will the journey start?
U:
The journey starts in on Friday at 8:15.
Violation: In its introduction, the system has asked users to answer its questions briefly and one at a time. However, in the shown example the user answers two questions at a time. What this suggests is that users sometimes naturally store information in "packages" consisting of several pieces of information. In such cases, they are unlikely to consistently split these packages into single pieces of information despite having been told to do so by the system. This must be taken into account when designing the system's input understanding capabilities. [User test, T31a(A)]

The specific guidelines subsumed by GG10 concern communication of what the system can and cannot do and instructions on how to interact with the system.

[< | >] SG4 (GG10): State your capabilities
Provide clear and comprehensible communication of what the system can and cannot do

It should be made clear to users what the system knows about, what it does not know about and what its limitations are as a partner in spoken interaction. Users should not waste time on interaction which eventually will show that the system does not possess the knowledge they look for. Nor should they be misled into attributing to the system more advanced comprehension skills than it actually possesses. Making these things sufficiently clear to users can be difficult indeed. SG4 is of particular importance in walk-up-and-use systems where users cannot be expected to have access to independent, e.g. written, information about the system.

Examples of SG4: State your capabilities
S:
Customer number 2. Is that correct?
U:
No id-numbers 1 and 4.
S:
Customer number 4.
U:
Wrong, the customer number is 2.
S:
How many people will travel?
Violation: Italics show overlapping user speech. The user tries to make a change. However, the system does not listen during its own turns. Users were not informed about this as they should have been. [User test, T31a(B)]
S:
At 19:50.
U:
Yes.
S:
You have now booked ...
Violation: Boldface shows overlapping user speech. The example shows a phatic expression which overlaps with the system's turn. This is a common way for humans to express that the conversation is being followed with interest during the interlocutor's turn. Pauses, in particular, may be filled this way. In such cases it does not matter that the system does not listen during its own turns. However, users were not informed about this as they should have been. [User test, T41a]
S:
Hello, this is the DanLuft reservation service for domestic flights. Do you know how to use this system?
U:
No.
S:
The system can reserve tickets for Danish domestic flights.
Communication of what the system can do: The system informs the user from the very beginning on which task it can solve. [User test, T11a]

[< | >] SG5 (GG10): State how to interact
Provide clear and sufficient instructions to users on how to interact with the system

Like SG4, SG5 addresses both the system's task capabilities and its communication capabilities. If the system is unable to handle some task in a standard way or is only able to handle the task in one of several standard ways, this should be communicated to users to prevent interaction failure. And if the system needs particularly reduced forms of user input, this should be stated or otherwise clearly communicated. SG5 is of particular importance in walk-up-and-use systems where users cannot be expected to have access to independent, e.g. written, information about the system.

Examples of SG5: State how to interact
S:
At which time?
U:
At 18:10.
S:
There is no departure at 18:10. The closest other departures are at 18:00 and 22:00. Do you want one of these departures?
U:
No.
S:
At which time?
U:
15:45 from Esbjerg.
Violation: The user is supposed to book a round-trip ticket and, as is common, perceives the task as one of booking a return ticket. Round-trip tickets, however, can only be handled by the system as two separate single reservations. When asked about the home departure hour for the second time, the user tries and fails to input a departure airport for the home journey that is different from the arrival airport for the out journey. The system does not provide information on how to handle round-trips as it should have done. [User test, G54a]
S:
The system can inform about prices, times and travel conditions as well as reserve tickets for Danish domestic flights. You use it by answering the system's questions. In addition you may use the two special commands "repeat" and "change" to have the most recent information repeated or changed. The system will only understand you when you answer its questions briefly and one at a time.
Instructions on how to interact: In its introduction, the system informs its users on how to interact with it. However, it turned out that what was required of users was often not feasible for them (cf. the examples under GG10). The use of keywords for repair and clarification meta-communication is not natural. Moreover, information is often naturally packaged, such as date and time together, which means that it may be difficult to answer the system's questions one at a time. [WOZ7S22D1]

[< | >] Interaction aspect 6: Background knowledge

This aspect addresses the fact that the system and its developers need to recognise relevant differences among users and user groups in terms of background knowledge. These differences not only concern the novice/expert distinction among users but also other types of background knowledge differences, and may even include recognition of widespread erroneous background assumptions among users. The system must also take into account the expectations users may have with respect to the system's background knowledge. Based on this recognition, the system either has a model of the user built prior to interaction, or adaptively builds one during interaction, which serves to guide system cooperativity.

[< | >] GG11: Be aware of users' background knowledge
Take users' relevant background knowledge into account

The system needs to adjust to users' background knowledge and inferences based thereupon. Otherwise, the users may fail to understand the system and initiate clarification meta-communication. See the example of GG11. We do not currently have any example of a violation of GG11 that is not also a violation of one of its subsumed specific guidelines.

Example of GG11: Be aware of users' background knowledge
S:
Red discount may be obtained on certain weekday departures and green discount may be obtained in weekends. In case of red discount the home journey may only start after two days at the earliest.
Information on discount types: In particular novice users cannot be expected to have any background knowledge on discount types. The system therefore provides information on these to users who have said earlier that they do not know the system. [User test, G12b]

The specific guidelines subsumed by GG11 concern inferences by analogy and the distinction between novices and experts.

[< | >] SG6 (GG11): Be aware of user inferences
Take into account possible (and possibly erroneous) user inferences by analogy from related task domains

If the system does not take into account possible user inferences by analogy, this may invite users to ask clarification questions or leave them, probably discontentedly so, with unanswered questions.

Example of SG6: Be aware of user inferences
S:
Do you want more?
U:
Did I get a discount?
Violation: The user wants discount and does not know that this is unavailable on one-way journeys. The system should take into account users' background knowledge by mentioning that, if the user has chosen one-way tickets, discount cannot be obtained. [User test, T13a]

[< | >] SG7 (GG11): Adapt to target group
Separate whenever possible between the needs of novice and expert users (user-adaptive interaction)

There are major differences between the needs of novice and expert users of the system, one such difference being that the expert users already possess the information needed to understand system functionality. If the system favours expert users, it is likely to fail as a walk-up-and-use system. If it favours novice users, it is likely to be perceived as cumbersome and redundant by expert users.

Example of SG7: Adapt to target group
S:
Hello, this is the DanLuft reservation service for domestic flights. Do you know this system?
U:
No.
S:
The system can reserve tickets for Danish domestic flights. You use it by answering the system's questions. In addition you may use the two special commands "repeat" and "change" to have the most recent information repeated or changed. The system will only understand you when you answer its questions briefly and one at a time.
S:
Please state your customer number.
Optional system introduction: The example shows the system introduction which users will get if they answer 'no' to the first system question. If they answer 'yes' the system will immediately ask for a customer number. [User test, T11a]

[< | >] GG12: Be aware of user expectations
Take into account legitimate user expectations as to your own background knowledge

When the system can be assumed to be a perfect expert within its declared domain of expertise, it must possess the amount and types of background knowledge which a user legitimately may expect it to have. If this is not the case, users may become confused or annoyed with what they rightly regard as a deficient system. GG12 is illustrated below. This illustration might have been used for SG8 as well.

Example of GG12: Be aware of user expectations
S:
At which time?
U:
20:10.
S:
The departure at 20:10 is fully booked. The closest other departures are at 19:00 and at 22:05. Do you want one of these departures?
Violation: At first sight nothing is wrong here. However, when you call a travel agency and the flight you want is fully booked, you are usually offered the option of being put on a waiting list. The user pointed this out in a telephone interview after her interaction with the system. She rightly missed this option during interaction and might well have started asking questions about it during her interaction with the system. The waiting list functionality should be included in the system's capabilities. [WOZ7S16D3]

The specific guideline subsumed by GG12 concerns the system's task domain knowledge and inference capabilities.

[< | >] SG8 (GG12): Cover the domain
Provide sufficient task domain knowledge and inference

An interactive speech system is often if not always supposed to be a perfect expert within its declared domain of expertise. This includes task domain coverage and relevant inferences. The system must be able to provide the appropriate domain information when and as required by its users. The system must also be able to make appropriate inferences to avoid lengthy and inefficient turn taking which only serves to clarify something which the system could have inferred on its own.

Examples of SG8: Cover the domain
S:
At which time [out journey]?
U:
A quarter past seven.
[...]
S:
At which time [home journey, same day]?
U:
A quarter to six.
S:
The home journey must not be earlier than the out journey.
Violation: The problem concerns the hour of departure for the home journey. The system checks the temporal relationship with the out journey. The home departure time 'a quarter to six' is understood as AM by default. However, it is not ambiguous in the present context and should have been understood as 17:45. The system should provide a general solution to the class of problems exemplified here: if the out and home journeys are on the same day and the home departure time is 'a quarter to six', then understand the answer as '17:45'. A sketch of such an inference follows the examples below. [User test, T14a]
S:
The tickets will be mailed in a couple of days.
U:
Change.
S:
Do you want more?
U:
Change.
S:
The tickets should be sent. Is this correct?
U:
No.
S:
Should the tickets be sent or will they be picked up at the airport?
Violation: There are only two possibilities concerning ticket delivery: tickets may either be mailed or picked up at the airport. Although the user has stated that the tickets should not be mailed, the system does not make the obvious inference but asks again whether the tickets should be sent or will be picked up at the airport. The system should not ask superfluous questions but deduce what the user wants when this is obvious: if only two user answers are possible, then understand rejection of one of them as confirmation of the other (see the sketch following these examples). [User test, G44a]
S:
On which date will the return journey start?
U:
Same day.
S:
Thursday January 26th
Date inference: The user has asked for a journey starting on January 26. The system correctly infers that when the user then answers 'same day' to the question of return journey, this means January 26. [User test, G34a]
S:
At which time?
U:
Before noon.
S:
Before noon on Wednesday February 1 there is a red departure from Aalborg to Copenhagen at 10:50. Do you want this departure?
Departure information: The system provides appropriate information on departures in the given context (the user has asked for discount and a departure time before noon). [User test, G12b]
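
The two violation comments above (T14a and G44a) suggest simple inference rules. The sketch below shows one possible formulation; the data representation and the rules themselves are illustrative assumptions, not the system's actual implementation.

    # Minimal sketch of two domain inferences (SG8). Data representation and
    # rules are illustrative assumptions.
    from datetime import time

    def resolve_home_time(hour, minute, out_departure, same_day):
        """Read an ambiguous clock time as PM when the AM reading would place
        the home departure before the out departure on the same day."""
        candidate = time(hour, minute)                     # AM reading, e.g. 5:45
        if same_day and hour < 12 and candidate <= out_departure:
            candidate = time(hour + 12, minute)            # PM reading, e.g. 17:45
        return candidate

    def infer_from_rejection(options, rejected):
        """With only two possible answers, rejecting one confirms the other."""
        if len(options) == 2 and rejected in options:
            return next(o for o in options if o != rejected)
        return None

    # 'A quarter to six' on the same day as a 7:15 out journey is read as 17:45.
    print(resolve_home_time(5, 45, time(7, 15), same_day=True))
    # Rejecting 'mailed' confirms 'picked up at the airport'.
    print(infer_from_rejection(["mailed", "picked up at the airport"], "mailed"))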

[< | >] Interaction aspect 7: Meta-communication

This aspect addresses the need for meta-communication design. Meta-communication is typically initiated for purposes of clarification and repair. We do that in human-human spoken conversation when we say, e.g., "Please repeat - I didn't hear what you just said" (repair); "Sorry, I said 'recognise speech', not 'wreck a nice beach'" (repair), "Is 'speech recognition' the same as 'speech understanding'?" (clarification); or "What do you mean by 'red departure'?" (clarification). Meta-communication is likely to be needed in all interactive speech systems but should be reduced to a minimum to optimise interaction efficiency.

[< | >] GG13: Enable meta-communication
Enable repair or clarification meta-communication in case of communication failure

Users as well as systems need to initiate clarification or repair meta-communication from time to time. The need may have many different causes, such as system violation of a cooperativity guideline, user inattention, or system misunderstanding. GG13 covers user initiated repair and clarification meta-communication as well as system initiated meta-communication. As said before, user initiated clarification meta-communication is generally very difficult to handle and should be prevented as far as possible. This is probably true not only for system-directed interaction as in the Danish dialogue system.

Examples of GG13: Enable meta-communication
S:
ID number 1, Jens Hansen. Where does the journey start?
U:
Change.
S:
ID number 1, Jens Hansen. Is that correct?
U:
No, ID number 4.
User initiated repair: The system has misunderstood the user's input as regards the ID number of the traveller. The user correctly initiates repair to change the misunderstood information. [User test, T11b(A)].
S:
There is no departure at 21:00, the closest other red departures are at 19:45 and 22:40. Do you want one of these departures?
U:
Yes.
S:
Which?
U:
Repeat.
User initiated repair: The user probably forgot which departure times were mentioned and asks the system to repeat them. [User test, G33a].
S:
Friday evening from Copenhagen to Aalborg there are the following departures at 17:45, at 19:30 and at 22:15.
U:
Are any of them green departures?
User initiated clarification: The user wants a discount ticket. Since the system does not provide any information on discount along with the listed departure hours, the user asks for clarification. [WOZ3S1D2]

The specific guidelines subsumed by GG13 address system initiated repair and clarification meta-communication in cases of missing understanding, inconsistent input and ambiguous input.

[< | >] SG9 (GG13): Enable system repair
Initiate repair meta-communication if system understanding has failed

If user input cannot be interpreted as meaningful in the context, the system needs the ability to ask for repetition or to otherwise indicate that it did not understand what was said.

Example of SG9: Enable system repair
S:
At which time?
U:
It must be as cheap as possible.
S:
Sorry. At which time?
U:
Cheap.
S:
Sorry, I did not understand.
System-initiated repair: The word 'cheap' is not in the system's vocabulary and neither of the two user utterances in the example is recognised as meaningful in the context. The system therefore correctly initiates repair, informing the user that the input was not understood. [User test, T13a].

[< | >] SG10 (GG13): Enable inconsistency clarification
Initiate clarification meta-communication in case of inconsistent user input

If the user's input is inconsistent, clarification becomes necessary. The system should not try to guess the user's priorities because if the guess is wrong, the user will have to initiate meta-communication instead, possibly in the form of clarification.

Example of SG10: Enable inconsistency clarification
S:
Are you particularly interested in discount?
U:
Yes.
S:
On which date will the journey start?
U:
Monday.
S:
Monday January 16. At which time?
U:
At 7:20.
S:
There is no departure at 7:20. The closest other red departure is at 10:50.
Violation: The user inconsistently wants both the 7:20 departure and discount. There is, in fact, a departure at 7:20 but it does not offer discount. The system implicitly assigns priority to discount over departure time without sufficient reason. Instead, the system should ask for the user's priorities, saying, for instance: "7:20 is not a discount departure. Red discount can be obtained on the departures at x, y and z. Which departure do you want?" [User test, G14a].
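
A minimal sketch of the suggested repair: instead of silently giving discount priority over the requested hour, the system states the conflict and asks for the user's priority. The departure data and phrasing are illustrative assumptions.

    # Minimal sketch of clarification on inconsistent input (SG10). Departure
    # data and phrasing are illustrative assumptions.
    departures = {"7:20": False, "10:50": True, "13:45": True}  # hour -> red discount?

    def respond(requested_hour, wants_discount):
        if requested_hour not in departures:
            return f"There is no departure at {requested_hour}."
        if wants_discount and not departures[requested_hour]:
            options = " and ".join(h for h, red in departures.items() if red)
            return (f"{requested_hour} is not a discount departure. Red discount "
                    f"can be obtained on the departures at {options}. "
                    f"Which departure do you want?")
        return f"{requested_hour}. Do you want this departure?"

    print(respond("7:20", wants_discount=True))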

[< | >] SG11 (GG13): Enable ambiguity clarification
Initiate clarification meta-communication in case of ambiguous user input

If the user's input is ambiguous, clarification becomes necessary. The system should not try to guess what the user actually means because if the guess is wrong, the user will have to initiate meta-communication instead, possibly in the form of clarification.

Example of SG11: Enable ambiguity clarification
U:
Around 7 o'clock.
S:
19 o'clock. At which date will the return journey start?
U:
Change.
Violation: All departure times which may be understood as being before noon (AM), such as 7 o'clock, are by default understood as AM. If there is no flight at the AM time, the dialogue manager automatically adds 12 hours to the departure time; if there is a flight at the resulting PM time, the input is interpreted as PM, yielding, in the example, 19 o'clock. This rule does not work in practice. Instead, the system should ask the user for clarification in case of ambiguous temporal input which cannot be resolved by context. [User test, G32a].
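
Complementing the inference sketch under SG8, the sketch below only resolves an ambiguous clock time when dialogue context rules out one reading, and otherwise asks a clarification question; the context test used here is an illustrative assumption.

    # Minimal sketch of clarification on ambiguous input (SG11). The context
    # test (a same-day home journey cannot leave before the out journey) is an
    # illustrative assumption.
    from datetime import time

    def interpret(hour, minute=0, out_departure=None, same_day=False):
        am, pm = time(hour, minute), time(hour + 12, minute)
        readings = [am, pm]
        if same_day and out_departure is not None:
            readings = [t for t in readings if t > out_departure]
        if len(readings) == 1:
            return readings[0].strftime("%H:%M")               # context resolves it
        return (f"Do you mean {am.strftime('%H:%M')} in the morning "
                f"or {pm.strftime('%H:%M')} in the evening?")  # otherwise, ask

    print(interpret(7))                                              # ambiguous: ask
    print(interpret(7, out_departure=time(10, 50), same_day=True))   # resolved: 19:00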