Matthew Bell

Allen et al.: The program-level infrastructure is laid out in a modular fashion, with a central hub and message-passing interfaces. The authors imply that this would support modules informing each other's processing, while a psycholinguistics class I took critiqued the Chomskyan modular view of language on precisely the grounds that different aspects of language could influence the processing of it, i.e. that there are emergent effects of syntax, semantics, pragmatics, etc. that cannot be computed effectively given a modular approach. My question is: Is the modular approach adopted here different from the modular language processing model taken up by Chomsky and attacked by non-Chomskyan psycholinguists? If so, how? If not, does a success here provide empirical support for modular language processing views as opposed to connectionist and distributed processing ones?

Larsson and Traum: I must admit, I found this paper somewhat opaque. It seems they advocate having a central knowledge structure which represents a particular theory of discourse as it operates within a given discourse genre, then having that structure subsume and define its own computations -- which amounts to denying that there is a "general shell" of the sort proposed by Allen et al. (2000). Is this the upshot of their claim? If not, what are they claiming?

I'm puzzled by their proposed data structure. They seem to think they have one, but they do not describe it sufficiently in the paper to illustrate how one would encode a theory of discourse in it. Are they proposing a data structure and knowledge representation formalism for discourse theories, or doing something less specific, such as presenting the idea that it would be good to have such a formalism? Or am I missing the point of the paper entirely?
-------------------------------------------------------------------------------
Noboru Matsuda

Allen et al.
(2000)
- Don't have much to say, but just wondering how much domain-dependent knowledge has to be codified to use their shell for a particular application.

Smith et al. (1995)
- A theorem prover requires a complete domain theory. How does the system deal with the frame problem? In a real-world situation, for example, a wire could be rusted and hence can't be connected, or it could be just a plastic tube.
- Could the system have an abstract (generic) schema of the world to handle the above frame problem? An abstract schema matches a particular situation while omitting some details. The above example could be an instance of a "can't_connect" schema that doesn't care much about why it can't be connected.
-------------------------------------------------------------------------------
Antonio Roque

In "An Architecture for a Generic Dialogue Shell", Allen et al.'s "Domain-independence Hypothesis" (p. 2) is exciting, and its limitations are worth investigating: Components such as the Discourse Manager may be able to deal with such things as goals and intentions in an abstract way (p. 7), and there may be quick domain-specific training methods for components such as speech recognition (p. 9-11) and parsing (p. 11-12), but the section on reference resolution (p. 12-13) acknowledges a wider range of reference behavior in practical dialogues than in other language genres. It seems possible that reference behavior would also vary across task domains, and that solutions to reference resolution would need to draw on more than just semantic type information. Also, if resolving the reference "the green route" cited on p. 13 only needs a call to the Display Manager, without needing to know the semantics of routes or greenness, doesn't this just shift the domain-specific responsibility to the Display Manager? Are these just minor obstacles, or a suggestion of a fundamental dependence of dialogue agent components on domain-specific knowledge?
What kind of obstacle could break the Domain-independence Hypothesis? If the hypothesis breaks down as a dialogue system moves from the confines of task-specific dialogue competence towards full human conversational competence, would this tell us something about general (i.e. human and machine) cognition and language use?

Regarding Larsson and Traum's "Information State and Dialogue Management in the TRINDI Dialogue Move Engine Toolkit": In what ways does the toolkit constrain the theories that could be tested in it? What theoretical assumptions does the toolkit itself make?
-------------------------------------------------------------------------------
Eric Williams

Allen, et al 2000

p. 2 The authors present two hypotheses. From what scientific theories have they developed these hypotheses? The scientific method would suggest that any new hypothesis should be based upon accepted theories, laws, and/or new observations, none of which seem to be mentioned by the authors.

p. 3 What is the difference between scripted demonstration and scripted interaction?

p. 5 What are the authors' academic backgrounds? This approach reminds me of hardware and software design methods used by engineers. Also, I see their architecture as indicative of a move toward strong AI in a field seemingly dominated by weak AI. This, in my highly biased opinion, is a very good move and long needed. Thoughts?

p. 8 Is anyone in the class familiar with Minsky's "society of mind" ideas? From what little I've read of it, this architecture seems to be a small-scale realization of it. Or am I reading too much into what is really just a well-designed modular program, much like so many others found in object-oriented environments?

p. 10 Who is Rayner and what did he/she contribute to the field?

general Would it be possible to review the concept of an n-gram?

Larsson and Traum 2000

p. 4 Reading this "debate" about component design strategies, I am reminded of behavioral versus cognitive psychology arguments.
I was under the impression that this was essentially a closed case - and cognitive psychology won. Am I wrong? If I am, please explain, as it will help this "debate" make more sense to me. If I am right, then why is this "debate" even relevant?

p. 4 The comparison to game theory is a fascinating one, but is the problem sufficiently "well-behaved" to be modelled as a game? It seems to me dialog is not deterministic enough for this approach. In this game model, how would interruptions, abrupt changes of mind, and antagonism be handled? It seems to me a game model requires a highly predictable problem with an obvious end or goal state. Are these conditions met?

p. 6 I'm still a little fuzzy on what a dialog move may consist of.

p. 6 Why did they choose to fire the first rule that applies? What if the domain were like the emergency dispatch one mentioned in Allen, et al 2000? This policy would be very poor if another rule ought to take precedence due to greater importance.

p. 9-10 How is implicit information handled? For instance, common sense assumptions can often be made based on what has been said and how. Are they represented as private data, such as beliefs, or as public data, because they are presumably data that all participants agree to be valid? Does the TMP field account for these implied facts until they become explicit?
-------------------------------------------------------------------------------
H. Chad Lane

Allen, et al 2000

They suggest that the task of tutoring falls under the "practical dialogue" classification. Does it? The other examples provided (top of p. 2) involve only domain-specific knowledge (on top of their generic shell), whereas tutoring changes the game in more ways than just the domain (e.g., less user competence, different communicative goals, etc.).

To resolve a referring expression, the first step is to construct a list of known properties (p. 12).
In the next paragraph, they mention that "entire stretches of discourse" are viable referents. What does the list of properties for such a referent look like? Is it as easy to throw such a wide variety of referents into one bag as they seem to suggest?

Smith, et al, 1995

Interactions only occur when the theorem proving fails and needs information (p. 286). What sorts of limitations arise from this architectural decision? One effect is that tasks might get completed without user comprehension... an interesting twist on the typical direction of human-machine relations!

The Smith sample dialogues (p. 284-5) contain mostly short user utterances with extended system utterances. The Allen samples (from TRAINS papers), on the other hand, tend to do the opposite (very short system utterances and longer user utterances). Are these typical of both systems? If so, what is it about each that leads to such dialogues?
-------------------------------------------------------------------------------
Alan D. Berfield

Prolog-Style: How and when does initiative level change during a given dialog? Or is this something that must be set beforehand?

Generic Dialogue Shell: Are the KQML performatives for messages believed to be sufficient for all domains?

TRINDI: What kind of drawbacks are there to using such a toolkit?
-------------------------------------------------------------------------------
Andy P. Gaydos

The Prolog-based system initially makes some assumptions about the user's knowledge but will remove these assertions later if the user shows he does not have this knowledge. If the user supplies incorrect information, can the system determine that some assertions must be wrong and make a plan to recover?
-------------------------------------------------------------------------------
Roy Wilson

(1) Smith et al. hypothesized that machine directive mode would yield longer completion times. Are you satisfied with the statistical part of their evaluation as it pertains to completion times?
(2) Larsson and Traum intend the architecture they describe to support comparison of dialogue systems. Ignoring TRINDIKIT, at what level(s) of granularity could such comparisons be made on the basis of the architecture they describe?

(3) Recalling earlier comments by Diane and Amy, the underlying technology constrains and enables evaluation as well as design: compare, for example, the experiments briefly described by Allen et al. and those described by Smith et al.
-------------------------------------------------------------------------------
Ilya Goldin

Smith, D.R. Hipp, and A.W. Biermann. An Architecture for Voice Dialog Systems Based on Prolog-Style Theorem Proving. Computational Linguistics, 21(3), 1995.

James Allen, Donna Byron, Myroslava Dzikovska, George Ferguson, Lucian Galescu, and Amanda Stent. An Architecture for a Generic Dialogue Shell. Natural Language Engineering, 6(3), 2000.

Staffan Larsson and David Traum. Information State and Dialogue Management in the TRINDI Dialogue Move Engine Toolkit. Natural Language Engineering, 6(3-4), 2000.

In the Allen et al. paper, Table 2 lists the various modules that comprise the TRIPS architecture and explains their functions. It seems to me that a useful way of looking at dialog system architectures in general is to modify this chart: reverse the order of the columns, rename "Module" to "Allen's Module", and add another column (e.g., "Hipp's Module") for every other system architecture we examine. This allows us to ask questions such as the following:
- What functions are missing from system X? How can it get away with missing them?
- What functions do we need at the minimum to claim that we have a dialog system?
- What modules contain the intelligence?
- How separable (loosely coupled) are these modules?

Q1: What other questions do we need to ask about dialogue systems? What analytical tools can we use to ask them?

At the same time, this chart I propose does not accommodate the TRINDI view.
I would argue, however, that TRINDI considers the problem from the point of view of a theory of dialogue, rather than a theory of dialogue systems. TRINDI's theory of dialogue is a computational theory, which makes system-building feasible.

Q2: How can we bridge the TRINDI and TRIPS perspectives? Does one subsume the other? Are they compatible? Orthogonal?

Q3: TRINDI makes the claim that some knowledge in a dialogue is shared, and some is private (and possibly some is semi-private). It's easy to create hypothetical dialogues where the distinction does not apply, or at least cannot be reliably detected (much less predicted) by a computer. Can we evaluate this claim of the theory? Can we empirically claim that a system that makes an arbitrary decision either way will be "good enough"?
-------------------------------------------------------------------------------
Stefanie Bruninghaus

As for the Smith et al. paper: It seems to me that the way the dialog is carried out very much depends on the way the domain is represented. So, if the task structure is modeled in great detail, the dialog will reflect that detail. It seems that the method presented in the paper does not give a lot of guidance to ensure that (1) the dialog is always at the right level of detail, and that (2) a consistent level of detail is maintained throughout the entire system when multiple people work on development.

What happens if the system makes a mistake and has to change some of the basic facts in its user model (e.g., user knows where the dial is located)? It seems that in the extreme, the whole user model can just collapse if there is a mistake in a very important, basic fact.

Allen paper: Where do they maintain a user model in that architecture? Overall, I had the impression that this is not a very practical architecture (useful as a discussion basis, though).
The outlined system requires a lot of coordination and messaging between the components (see Figure 1), which seems to contradict the authors' claim that this architecture is robust and easy to debug.
-------------------------------------------------------------------------------
Theresa Wilson

In "An Architecture for Voice Dialog Systems Based on Prolog-Style Theorem Proving", the authors say the paper presents "a theory of voice dialog systems". Later in section 4, the authors discuss "a theory of task-oriented language." As the authors present their work in this paper, are these really theories? It seems more like they are presenting an implementation, or perhaps a hypothesis for the implementation and handling of various theoretical issues of dialog systems (such as those they discuss in section 10, Theoretical Issues from the Literature).