Homework 1
Natural Language Processing (CS 3730 / ISSP 3120)
Fall 2001
Assigned: September 12, 2001
Due: September 26, 2001

1. (10 pts) Amelia Bedelia is a "literal-minded but charming housekeeper who never fails to confound the Rogers family" (from the jacket of Amelia Bedelia, by Peggy Parish). For example, when told to "Draw the drapes when the sun comes in," Amelia waits for the right time of day, takes out her pad, and sketches the drapes. What do you think Amelia does when told to "Dress the chicken" after it is delivered by the meat market? Explain why this book has kept children laughing, using the parlance of Chapter 1.

2. Consider recognizing a simple numerical language, consisting of integers and real numbers, where there is an optional preceding plus or minus sign and a required digit before the decimal point. Leading zeros are not OK, but trailing zeros are. Thus 12, 3.110, 12.34, +12, -12.34, and 0.1 are in the language, but .1 and 01 are not.
   a) (10 pts) Design an NFSA using epsilon transitions to recognize this language.
   b) (10 pts) Design a DFSA that removes the epsilon transitions.
   c) (10 pts) Write a regular expression using only the primitive operators to recognize this language.
- You may use some sort of shorthand for classes of symbols, e.g. in a furniture domain, furniture instead of desk, chair, table.
- If you write your FSAs in a table representation, please also draw the equivalent graph version (by hand is fine).

3. (60 pts) Implement the ND-Recognize algorithm from Jurafsky and Martin (page 44) using depth-first search. Test your algorithm using sheep language NFSA #1 (Fig. 2.18 in Jurafsky and Martin), as well as the NFSA developed for question 2 above. For each test, show several examples where your algorithm accepts an input string, and examples where it does not.
- Your code should be table-driven (i.e., represent the FSA as a table).
- Use whatever programming language you like.
- Don't worry about out-of-alphabet symbols.
- Your output should include the input and outputs of the algorithm.
- Also, your output should trace the execution of the machine to illustrate the handling of backup.
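
To make the requirements for question 3 concrete, here is a minimal sketch of a table-driven, depth-first nondeterministic recognizer. It is not the textbook's pseudocode verbatim: the names nd_recognize, SHEEP_NFSA, SHEEP_ACCEPT, and EPSILON are hypothetical, the visited set is an added guard against epsilon cycles, and the sheep-language table is an assumed encoding of baa+! that should be checked against Fig. 2.18 before use.

```python
# Sketch only: a table-driven nondeterministic recognizer explored depth-first.
# All names here are illustrative assumptions, not taken from the textbook.

EPSILON = ""  # assumed marker for the epsilon column of the table

# Transition table: state -> {symbol -> list of possible next states}.
# Assumed encoding of the sheep language baa+! (verify against Fig. 2.18).
SHEEP_NFSA = {
    0: {"b": [1]},
    1: {"a": [2]},
    2: {"a": [2, 3]},  # the nondeterministic choice: loop on 2 or move to 3
    3: {"!": [4]},
    4: {},
}
SHEEP_ACCEPT = {4}


def nd_recognize(tape, table, accept_states, start=0, trace=None):
    """Return True if the machine accepts tape, searching depth-first.

    A search state is a pair (machine state, tape index). If trace is a
    list, every search state that gets explored is appended to it, so the
    points where the search backs up are visible in the output.
    """
    agenda = [(start, 0)]  # used as a stack, so the search is depth-first
    visited = set()        # added guard: avoids looping on epsilon cycles
    while agenda:
        state, index = agenda.pop()
        if (state, index) in visited:
            continue
        visited.add((state, index))
        if trace is not None:
            trace.append((state, index))
        # Accept only at the end of the tape in an accepting state.
        if index == len(tape) and state in accept_states:
            return True
        row = table.get(state, {})
        # Epsilon moves change the state without consuming input.
        successors = [(nxt, index) for nxt in row.get(EPSILON, [])]
        # Ordinary moves consume the current tape symbol.
        if index < len(tape):
            successors += [(nxt, index + 1) for nxt in row.get(tape[index], [])]
        agenda.extend(successors)
    return False  # agenda exhausted: every path failed


if __name__ == "__main__":
    for text in ["baa!", "baaa!", "ba!", "baa", "abaa!"]:
        steps = []
        verdict = nd_recognize(text, SHEEP_NFSA, SHEEP_ACCEPT, trace=steps)
        print(repr(text), "accept" if verdict else "reject", steps)
```

With this table, the input baaa! forces one backup: the search first tries state 3 at index 3, finds no transition on "a", and returns to the alternative (2, 3) still waiting on the agenda. Using a list as a stack is what makes the search depth-first; popping from the front instead would give breadth-first search.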