******************************************************************************
1. How general should a dialogue system's model be?

Jurafsky & Martin pg. 738 defines AI-complete problems as those requiring a complete artificial intelligence to solve and suggests using less sophisticated methods for dialogue act comprehension. This seems to be a tension in AI-related fields as a whole: to what extent should a researcher pursue techniques that are more sophisticated vs. those that work well without the theoretical sophistication? It would be nice to pursue AI-complete problems, but clearly it's impractical if doing so won't produce results in the time-frames required by funders and other academic sponsors. Furthermore, it may be impossible to solve AI-complete problems given the current lack of understanding regarding such things as "meaning" and "knowledge".

However, even if effective AI-complete systems are currently out of our reach, it still seems limiting to build systems that "do the job" and nothing more. Doing so would turn our discipline into nothing more than a collection of heuristics; there would be no true scientific progress. A good approach seems to be to aim for reusable systems: dialogue agents that, for example, work over multiple domains or that can be easily adapted to domains other than those for which they were first created. Related to this is the notion that human-dependent tasks such as knowledge engineering should be automated as much as possible. However, there is an important distinction: automation is a practical issue, while reusability is a theoretical issue. Reusability implies generalization, which in turn implies a model that is more likely to reflect fundamental truths about cognition.

Should a dialogue system be just sophisticated enough to solve the problem, and no more? Should it be just general enough to solve any related problem? Should it be built with the goal in mind of advancing the field towards a general AI?
Should a researcher even try to make a "theoretical point" while building a dialogue system? How should a researcher make this decision when designing a dialogue system?

2. What would a good theory tying together syntax, semantics, and pragmatics look like?

In Grosz et al. p. 230, Scott and Kamp say: "contemporary linguistics ... is still struggling to achieve a genuine integration of semantics and pragmatics." Further down the page they say a unifying theory would have to have a precise formal definition and have semantics and pragmatics be "suitably attuned" to each other, given that one often depends on the other. What specifically would such a theory look like? In what ways do the four theories that Scott and Kamp present lack the requisites of an effective semantic-pragmatic theory?

3. What are the precise limitations of Dialogue Grammars and Plan-Based Models?

In Grosz et al. p. 235, Cohen says: "In general, no consensus exists on the appropriate research goals, methodologies, and evaluation procedures for modeling dialogue." Later he specifically critiques Dialogue Grammars as lacking explanatory power (p. 236) and Plan-Based Models as lacking a "crisp theoretical base", by not having a "specification of what the system should do" (p. 239). What are the specifics of these limitations? In what kind of circumstances would these shortcomings be apparent? It seems that they complement each other, but obviously the solution isn't that simple.

******************************************************************************
On Chapter 6 from Discourse and Dialogue:

The authors state that computational theories of discourse need to develop means of distinguishing between "textual, rhetorical, intentional, or informational" relations.
These don't seem to be defined, and although intuitive definitions exist for each, I feel weak in my understanding of what would define "textual" relations or "rhetorical" relations, or what could distinguish between "rhetorical" and "intentional" relations. (Pg. 232)

The discussion of discourse grammars on pg. 236 states that even humans find annotating for "communicative action" difficult. How difficult do humans find it? Do humans find it easier when asked to exhaustively describe each of many goals met by an utterance?

It seems like MUDs would be an interesting way to obtain a corpus for studying these phenomena, as users not only converse but often define the world in which the conversation is grounded by "building" rooms, objects, etc. according to some theme. I think I recall this being mentioned in the NLP class last semester. Has this been done (for corpus building and analysis specifically)?

Questions from Allen's book:

The point was made in the Discourse and Dialogue chapter, and elsewhere (I recall this from the KR course), that logical inference systems are working with computationally intractable problems. Still, Allen seems to rely heavily on such techniques for his proposed conversational agent. Is the goal of his chapter mostly philosophical, asking how understanding could in principle be achieved, or what is minimally involved in understanding conversation? Or is the logic he describes tractable for the sorts of examples he gives?

Jurafsky and Martin:

On 729 - 730 the authors describe the DAMSL annotations briefly, mentioning forward and backward looking functions, but the difference between these isn't quite clear. What is the difference?

In the section contrasting cue-based and plan-based approaches, the third possibility of combining the two doesn't seem to be mentioned. Trying this seems natural -- has it been done? It also seems that any real implementation of the plan-based approach would require some facility at recognizing cues.
******************************************************************************
In reading about belief models and modeling communication acts, I couldn't help but wonder if and how they handle miscommunication. Certainly the nested beliefs support this, but what about when we are also dealing with shared knowledge? In section 17.5 of Defining a Conversational Agent, they give the example of Sam telling Helen that it is raining. Assuming that Sam is our conversational agent, after he tells Helen that it is raining, would Sam put into his shared knowledge database that both he and Helen believe it is raining? If so, what if miscommunication somehow occurred and Helen believes that Sam wants her to believe that it is snowing (not raining)? If Sam put the belief that "Both Helen and I believe it is raining" in his shared knowledge base, is this correct?

******************************************************************************
Ch. 6 Discourse and Dialogue
----------------------------
- how successful have attempts to combine informational and intentional approaches been?
- what does "anaphora" mean?

Ch. 17 Defining a Conversational Agent
--------------------------------------
- have conflicting beliefs (irrational, but happens) ever been considered?
- this paper didn't mention anything about degree of belief...

Ch. 19 Dialogue and Conversational Agents
-----------------------------------------
- how useful are the DAMSL tags? do these tags significantly help reasoning systems?

******************************************************************************
Grosz/Sidner:

I had a really hard time with this paper, and I don't think I really got a handle on it. However, after the Herb Clark talk, I was wondering what the relation is between the Clark work and the Grosz/Sidner paper - both talk about different aspects of a joint activity and a dialog, respectively, and both line up the dialog and task to be accomplished and the speakers' intentions, respectively.
To me, that seemed very similar. Also, and that's a question I already had when reading the Jurafsky/Martin chapter: What's the difference between the dominance and the satisfaction relation? I have not been able to figure that out at all.

Grosz chapter:

What I noticed most about this chapter was that I never saw any mention of task/domain-specific models; all discussions seemed to assume that the methods are generic. Why is that?

Allen chapter:

As much as I am sceptical of Schank's work, wouldn't it be advantageous to use situation-based scripts, rather than the planning approach? It seems that there's a lot of reasoning required to cover standard situations, where one typically should have a pretty good idea what the participants are likely to do.

******************************************************************************
From "Discourse and Dialogue", regarding Cohen's last drawback of plan-based models of dialogue... (p. 239)

Why is it difficult to "express precisely what are the various constructs (plans, goals, intentions, etc.)"? I certainly believe it, but I'd like to hear more.

He goes on to suggest that what is lacking is a precise notion of correctness. Section 6.4 discusses some evaluation issues, with a conclusion that user satisfaction is the "ultimate criterion." How can something so informal and subjective be used as part of a "crisp theoretical base"?

******************************************************************************
Questions related to J&M, Ch. 19

How well have general techniques performed for determining what kind of speech act / dialogue act a given input represents, as compared to domain-specific techniques or even application-specific techniques?

Do dialogue systems even need to determine what kind of speech act a given input represents, or can they use a more specific (i.e., domain- or application-specific) set of categories for classifying input? What would be the disadvantage of the latter?
Is it correct to say that many dialogue systems do not allow all kinds of illocutionary acts?

Terminology: Are these terms synonyms: illocutionary acts, speech acts, dialogue acts?

Plan-based v. cue-based interpretation of dialogue acts - the plan-based approach has the disadvantage of being AI-complete (p. 738). Also, while it is very attractive from a theoretical standpoint, it seems to make little sense psychologically to assume that a hearer might go through the 8-step chain of reasoning shown on p. 733 upon hearing "Can you give me a list of the flights from Atlanta to Boston?" Could we say that a cue-based approach is more plausible psychologically?

******************************************************************************