Matthew Bell
Allen et al.
The program level infrastructure is laid out in a modular fashion, with
a central hub and message-passing interfaces. The authors imply that
this would support modules informing each other's processing.
However, a psycholinguistics class I took critiqued the
Chomskyan modular view of language on precisely the grounds that
different aspects of language can influence one another's
processing, i.e. that there are emergent effects of syntax, semantics,
pragmatics, etc. that cannot be computed effectively given a modular
approach. My question is: Is the modular approach adopted here
different from the modular language processing model taken up by
Chomsky and attacked by non-Chomskyan psycholinguists? If so, how? If
not, does a success here provide empirical support for
modular language processing views as opposed to connectionist and
distributed processing ones?
Larsson and Traum:
I must admit, I found this paper somewhat opaque as far as
understanding it went. It seems they advocate having a central
knowledge structure which represents a particular theory of discourse
as it operates within a given discourse genre, then having that
structure subsume and define its own computations -- which amounts to
denying that there is a "general shell" of the sort proposed by Allen
et al. (2000). Is this the upshot of their claim? If not, what are
they claiming?
I'm puzzled by their proposed data structure. They seem to think they
have one, but they do not describe it sufficiently in the paper to
illustrate how one would encode a theory of discourse in it.
Are they proposing a data structure and knowledge representation
formalism for discourse theories, or doing something less specific,
such as presenting the idea that it would be good to have such a
formalism? Or am I missing the point of the paper entirely?
-------------------------------------------------------------------------------
Noboru Matsuda
Allen et al. (2000)
- Don't have much to say, but just wondering how much domain-dependent
knowledge has to be codified to use their shell for a particular
application.
Smith et al. (1995)
- A theorem prover requires a complete domain theory. How does the
system deal with the frame problem? In a real-world situation, for
example, a wire could be rusted and hence impossible to connect, or
it could turn out to be just a plastic tube.
- Could the system have an abstract (generic) schema of the world to
handle the above frame problem? An abstract schema matches a particular
situation while omitting some details. The above example could be
an instance of a "can't_connect" schema that doesn't care much about
why it can't be connected.
-------------------------------------------------------------------------------
Antonio Roque
In "An Architecture for a Generic Dialogue Shell", Allen et al.'s
"Domain-independence Hypothesis" (p. 2) is exciting, and its
limitations are worth investigating:
Components such as the Discourse Manager may be able to deal with such
things as goals and intentions in an abstract way (p. 7), and there may
be quick domain-specific training methods for components such as
speech recognition (pp. 9-11) and parsing (pp. 11-12). But the section
on reference resolution (pp. 12-13) acknowledges a wider range of
reference behavior in practical dialogues than in other language
genres. It seems possible that reference behavior would also vary
across task domains, and that resolving such references would need
more than just drawing on semantic type information. Also, if resolving
the reference to "the green route" cited on p. 13 only needs a call to
the Display Manager, without needing to know the semantics of routes or
greenness, doesn't this just shift the domain-specific responsibility
to the Display Manager? Are these just minor obstacles, or a suggestion
of a fundamental dependence of dialogue agent components on
domain-specific knowledge? What kind of obstacle could break the
Domain-independence Hypothesis?
If the hypothesis breaks down as a dialogue system moves from the
confines of task-specific dialogue competence towards full human
conversational competence, would this tell us something about general
(i.e. human and machine) cognition and language use?
Regarding Larsson and Traum's "Information State and Dialogue
Management in the TRINDI Dialogue Move Engine Toolkit": In what ways
does the toolkit constrain the theories that could be tested in it?
What theoretical assumptions does the toolkit itself make?
-------------------------------------------------------------------------------
Eric Williams
Allen, et al 2000
p.2
The authors present two hypotheses. From what scientific theories have
they developed these hypotheses? The scientific method would suggest
that any new hypothesis should be based upon accepted theories, laws,
and/or new observations, none of which seem to be mentioned by the
authors.
p. 3
What is the difference between scripted demonstration and scripted
interaction?
p. 5
What are the authors' academic backgrounds? This approach reminds me of
hardware and software design methods used by engineers. Also, I see
their architecture as indicative of a move toward strong AI in a field
seemingly dominated by weak AI. This, in my highly biased opinion, is a
very good move and long needed. Thoughts?
p. 8
Is anyone in the class familiar with Minsky's "society of mind"
ideas? From what little I've read of it, this architecture seems to be
a small-scale realization of it. Or am I reading too much into what is
really just a well-designed modular program, much like so many others
found in object-oriented environments?
p. 10
Who is Rayner and what did he/she contribute to the field?
general
Would it be possible to review the concept of an n-gram?
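As a starting point for that review, here is a minimal sketch (not taken from any of the papers): an n-gram is a contiguous run of n tokens, and an n-gram language model estimates the probability of a word from the n-1 words before it. The function names and the toy sentence are made up for illustration, and the estimate is a plain maximum-likelihood count ratio with no smoothing.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous window of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bigram_probability(tokens, prev, word):
    """Unsmoothed estimate of P(word | prev): count(prev word) / count(prev)."""
    bigram_counts = Counter(ngrams(tokens, 2))
    # Only tokens that have a successor can serve as a bigram context.
    context_counts = Counter(tokens[:-1])
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]


# Toy corpus, invented for this example.
tokens = "send the train to avon then send the truck to bath".split()

print(ngrams(tokens, 3)[:2])
print(bigram_probability(tokens, "the", "train"))
```

A speech recognizer can use such estimates to prefer word sequences that are likely in the task domain, which is why a model trained on domain text matters.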
Larsson and Traum 2000
p. 4
Reading this "debate" about component design strategies, I am reminded
of behavioral versus cognitive psychology arguments. I was under the
impression that this was essentially a closed case - and cognitive
psychology won. Am I wrong? If I am, please explain, as it will help
this "debate" make more sense to me. If I am right, then why is this
"debate" even relevant?
p. 4
The comparison to game theory is a fascinating one, but is the problem
sufficiently "well-behaved" to be modelled as a game? It seems to me
dialog is not deterministic enough for this approach. In this game
model, how would interruptions, abrupt changes of mind, and antagonism
be handled? It seems to me a game model requires a highly predictable
problem with an obvious end or goal state. Are these conditions met?
p. 6
I'm still a little fuzzy on what a dialog move may consist of.
p. 6
Why did they choose to fire the first rule that applies? What if the
domain were like the emergency dispatch one mentioned in Allen, et al
2000? This policy would be very poor if another rule ought to take
precedence due to greater importance.
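To make the concern concrete, here is a small sketch contrasting the two selection policies. It is not the toolkit's actual API: the Rule class, the priority field, the rule names, and the state keys are all hypothetical, chosen only to illustrate the difference.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, bool]


@dataclass
class Rule:
    name: str
    priority: int  # hypothetical importance score, higher is more urgent
    applies: Callable[[State], bool]


def first_applicable(rules: List[Rule], state: State) -> Optional[Rule]:
    """Fire the first rule, in listed order, whose precondition holds."""
    for rule in rules:
        if rule.applies(state):
            return rule
    return None


def highest_priority(rules: List[Rule], state: State) -> Optional[Rule]:
    """Collect every applicable rule, then pick the most important one."""
    candidates = [rule for rule in rules if rule.applies(state)]
    return max(candidates, key=lambda rule: rule.priority, default=None)


# Invented rules for an emergency-dispatch-like setting.
RULES = [
    Rule("answer_question", 1, lambda s: s.get("question_pending", False)),
    Rule("report_emergency", 10, lambda s: s.get("new_emergency", False)),
]

state = {"question_pending": True, "new_emergency": True}
print(first_applicable(RULES, state).name)
print(highest_priority(RULES, state).name)
```

Under first-applicable selection, the outcome depends entirely on how the rules are ordered, so importance has to be encoded in that ordering by the rule author.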
p. 9-10
How is implicit information handled? For instance, common sense
assumptions can often be made based on what has been said and how. Are
they represented as private data, such as beliefs, or as public data,
because they are presumably data that all participants agree to be
valid?
Does the TMP field account for these implied facts until they become
explicit?
-------------------------------------------------------------------------------
H. Chad Lane
Allen, et al 2000
They suggest that the task of tutoring falls under the
"practical dialogue" classification. Does it? The other
examples provided (top of p.2) involve only domain specific
knowledge (on top of their generic shell), whereas tutoring
changes the game in more ways than just the domain (e.g.,
less user competence, different communicative goals, etc.).
To resolve a referring expression, the first step is to
construct a list of known properties (p.12). In the next
paragraph, they mention that "entire stretches of discourse"
are viable referents. What does the list of properties for
such a referent look like? Is it as easy to throw such a
wide variety of referents into one bag as they seem to
suggest?
Smith, et al, 1995
Interactions only occur when the theorem proving fails
and needs information (p.286). What sorts of limitations
arise from this architectural decision? One effect is that
tasks might get completed without user comprehension... an
interesting twist on the typical direction of human-machine
relations!
The Smith sample dialogues (p.284-5) contain mostly short
user utterances with extended system utterances. The Allen
samples (from TRAINS papers), on the other hand, tend to do
the opposite (very short system utterances and longer user
utterances). Are these typical of both systems? If so,
what is it about each that leads to such dialogues?
-------------------------------------------------------------------------------
Alan D. Berfield
Prolog-Style:
How and when does initiative level change during a given dialog? Or is
this something that must be set beforehand?
Generic Dialogue Shell:
Are the KQML performatives for messages believed to be sufficient for
all domains?
TRINDI:
What kind of drawbacks are there to using such a toolkit?
-------------------------------------------------------------------------------
Andy P. Gaydos
The Prolog-based system initially makes some assumptions about the
user's knowledge but will remove these assertions later if the user
shows he does not have this knowledge. If the user supplies incorrect
information, can the system determine some assertions must be wrong and
make a plan to recover?
-------------------------------------------------------------------------------
Roy Wilson
(1) Smith et al. hypothesized that machine directive mode would yield
longer completion times. Are you satisfied with the statistical part of
their evaluation as it pertains to completion times?
(2) Larsson and Traum intend the architecture they describe to support
comparison of dialogue systems. Ignoring TRINDIKIT, at what level(s) of
granularity could such comparisons be made on the basis of the
architecture they describe?
(3) Recalling earlier comments by Diane and Amy, the underlying
technology constrains and enables evaluation as well as design:
compare, for example, the experiments briefly described by Allen et al.
and those described by Smith et al.
-------------------------------------------------------------------------------
Ilya Goldin
Smith, D.R. Hipp, and A.W. Biermann. An Architecture for Voice Dialog
Systems Based on Prolog-Style Theorem Proving. Computational
Linguistics, 21:3, 1995.
An Architecture for a Generic Dialogue Shell. James Allen, Donna Byron,
Myroslava Dzikovska, George Ferguson, Lucian Galescu, and Amanda Stent.
Natural Language Engineering, 6(3), 2000.
Information state and dialogue management in the TRINDI Dialogue Move
Engine Toolkit. Staffan Larsson and David Traum. Natural Language
Engineering, 6(3-4), 2000.
In the Allen et al paper, Table 2 lists various modules that comprise
the TRIPS architecture, and explains their functions. It seems to me a
useful way of looking at dialog system architectures in general is to
modify this chart: reverse the order of the columns, rename "Module" to
"Allen's Module" and add another column (e.g., "Hipp's Module") for
every other system architecture we examine. This allows us to ask
questions such as the following:
- What functions are missing from system X? How can it get away with
missing them?
- What functions do we need at the minimum to claim that we have a
dialog system?
- What modules contain the intelligence?
- How separable (loosely coupled) are these modules?
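One way to picture the proposed chart is as a mapping from each function to the module that performs it in each system. The sketch below is only an illustration: the system names and module labels are placeholders, not the contents of Table 2, and the function names are just ones that happen to come up in the readings.

```python
# Rows are functions; each row maps a system to the module that covers it.
# "System A"/"System B" and the module labels are placeholders, not real data.
CHART = {
    "speech recognition": {"System A": "module A1", "System B": "module B1"},
    "parsing": {"System A": "module A2", "System B": "module B1"},
    "reference resolution": {"System A": "module A3"},
}


def missing_functions(chart, system):
    """Functions that no module of the given system covers."""
    return sorted(f for f, modules in chart.items() if system not in modules)


def common_functions(chart, systems):
    """Functions covered by every listed system (a candidate 'minimum')."""
    return sorted(
        f for f, modules in chart.items() if all(s in modules for s in systems)
    )


print(missing_functions(CHART, "System B"))
print(common_functions(CHART, ["System A", "System B"]))
```

The first two questions in the list then become simple lookups over the chart; the questions about intelligence and coupling would need more than this table can express.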
Q1: What other questions do we need to ask about dialogue systems?
What analytical tools can we use to ask them?
At the same time, this chart I propose does not accommodate the TRINDI
view. I would argue, however, that TRINDI considers the problem from
the point of view of a theory of dialogue, rather than a theory of
dialogue systems. TRINDI's theory of dialogue is a computational
theory, which makes system-building feasible.
Q2: How can we bridge the TRINDI and TRIPS perspectives? Does one
subsume the other? Are they compatible? Orthogonal?
Q3: TRINDI makes the claim that some knowledge in a dialogue is shared,
and some is private (and possibly some is semi-private). It's easy to
create hypothetical dialogues where the distinction does not apply, or
at least cannot be reliably detected (much less predicted) by a
computer. Can we evaluate this claim of the theory? Can we empirically
claim that a system that makes an arbitrary decision either way will be
"good enough"?
-------------------------------------------------------------------------------
Stefanie Bruninghaus
As for the Smith et al. paper:
It seems to me that the way the dialog is carried out very much depends
on the way the domain is represented. So, if the task structure is
modeled in great detail, the dialog will reflect that detail. It seems
that the method presented in the paper does not give a lot of guidance
to ensure that (1) the dialog is always at the right level of detail,
and that (2) a consistent level of detail is maintained throughout the
entire system when multiple people work on development.
What happens if the system makes a mistake and has to change some of
the basic facts in its user model (e.g., that the user knows where the
dial is located)? It seems that in the extreme, the whole user model
could just collapse if there is a mistake in a very important, basic
fact.
Allen paper:
Where do they maintain a user model in that architecture?
Overall, I had the impression that this is not a very practical
architecture (useful as a discussion basis, though). The outlined
system requires a lot of coordination and messaging between the
components (see Figure 1), which seems to contradict the authors' claim
that this architecture is robust and easy to debug.
-------------------------------------------------------------------------------
Theresa Wilson
In "An Architecture for Voice Dialog Systems Based on Prolog-Style
Theorem Proving", the authors say the paper presents "a theory of
voice dialog systems". Later, in section 4, the authors discuss
"a theory of task-oriented language."
As the authors present their work in this paper, are these really
theories? The authors seem rather to be presenting an implementation,
or perhaps a hypothesis for the implementation and handling of various
theoretical issues of dialog systems (such as they discuss in section
10, Theoretical Issues from the Literature).