Project 1: Lexing MiniJava
Due: Sunday, February 11, 2018, at 11:59pm

Description

In class, we discussed using regular expressions to describe the lexical tokens that make up a programming language. In this project, we will use the JFlex tool to specify the tokens of the MiniJava language presented in Appendix A of the textbook, and then print each token, with its position, in a formatted table.

If you do not have the book, the grammar for MiniJava can be found at http://www.cambridge.org/resources/052182060X/MCIIJ2e/grammar.html

A token in the grammar is anything in double quotes or angle brackets (“” or <>) that is not hyperlinked. Note that we have two value tokens: INT_LITERAL and IDENTIFIER. The remaining hyperlinked categories are grammar nonterminals that will be used when we write a MiniJava parser.
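
To make the two value tokens concrete, here is a minimal sketch of regular expressions that might describe them. It uses java.util.regex rather than JFlex syntax, and it assumes the usual MiniJava definitions (an identifier is a letter followed by letters, digits, or underscores; an integer literal is a run of decimal digits), so check both against the grammar before relying on them. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Illustration only: JFlex has its own regular-expression syntax.
// Both patterns are assumptions based on the usual MiniJava definitions.
final class ValueTokenPatterns {
    // Assumed: a letter followed by any number of letters, digits, or underscores.
    static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");

    // Assumed: a nonempty run of decimal digits.
    static final Pattern INT_LITERAL = Pattern.compile("[0-9]+");

    // True if the whole string is a valid identifier under the assumed pattern.
    static boolean isIdentifier(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    // True if the whole string is a valid integer literal under the assumed pattern.
    static boolean isIntLiteral(String s) {
        return INT_LITERAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("num_aux"));
        System.out.println(isIntLiteral("10"));
        System.out.println(isIdentifier("10"));
    }
}
```

Note that a keyword such as class also matches the identifier pattern. JFlex prefers the longest match and, among equally long matches, the rule listed first, so keyword rules would normally be placed before the identifier rule.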

Example

Input

class Factorial {
	public static void main( String[] a ) {
		System.out.println( new Fac().ComputeFac( 10 ) );
	}
}

class Fac {
	public int ComputeFac( int num ) {
		int num_aux;
		if( num < 1 )
			num_aux = 1;
		else
			num_aux = num * ( this.ComputeFac( num - 1 ) );
		return num_aux;
	}
}

Example Output

Line    Column  Token       Value
======================================================================
1       1       CLASS
1       7       ID          Factorial
1       17      LBRACE
2       2       PUBLIC
2       9       STATIC
2       16      VOID
2       21      ID          main
2       25      LPAREN
2       27      STRING
2       33      LBRACKET
2       34      RBRACKET
2       36      ID          a
2       38      RPAREN
2       40      LBRACE
3       3       PRINT
3       21      LPAREN
3       23      NEW
3       27      ID          Fac
3       30      LPAREN
3       31      RPAREN
3       32      DOT
3       33      ID          ComputeFac
3       43      LPAREN
3       45      INT         10
3       48      RPAREN
3       50      RPAREN
3       51      SEMICOLON
4       2       RBRACE
5       1       RBRACE

… omitted output for the Fac class

Requirements

·         The lexer should return a token from the lexical rule (action) associated with each regular expression.

o   Tokens can be represented with a set of public static final int constants and the Symbol class. The Symbol class is part of the runtime jar file of the parser generator (JavaCUP). To use it in the lexer, add the JavaCUP jar file to the classpath and import the class.

o   Do not use anything else from JavaCUP, and do not write a parser.

·         The lexer should be driven by a program that prints out the line and column numbers of each token encountered in the program. You can track this information in each token. An example is shown above.

·         The lexer should discard all comments.

·         The lexer should report any invalid character with the message “Illegal character ‘%c’ at line %d column %d\n”
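
The output-related requirements above could be met with a small formatting helper along the following lines. This is a sketch under stated assumptions, not a required design: the class name TokenReport, its methods, the four token codes, the column widths, and the separator width of 70 are all hypothetical, and the straight quotes around %c are an assumption about the intended error message. It does not call JFlex or JavaCUP; a real driver would take the line, column, and value from each token the lexer returns.

```java
// Sketch only: class, method, and constant names here are hypothetical.
final class TokenReport {
    // A few token codes as public static final ints; a full lexer needs one per token.
    public static final int CLASS = 0;
    public static final int ID = 1;
    public static final int INT = 2;
    public static final int LBRACE = 3;

    // Printable names, indexed by token code.
    private static final String[] NAMES = {"CLASS", "ID", "INT", "LBRACE"};

    // Column headings followed by a separator line (the width of 70 is an assumption).
    static String header() {
        String rule = new String(new char[70]).replace('\0', '=');
        return String.format("%-8s%-8s%-12s%s%n%s", "Line", "Column", "Token", "Value", rule);
    }

    // One output row; value may be null for tokens that carry no value.
    static String row(int line, int column, int token, String value) {
        return String.format("%-8d%-8d%-12s%s",
                line, column, NAMES[token], value == null ? "" : value);
    }

    // The message reported for an invalid character (quote style is an assumption).
    static String illegalCharacter(char c, int line, int column) {
        return String.format("Illegal character '%c' at line %d column %d\n", c, line, column);
    }

    public static void main(String[] args) {
        System.out.println(header());
        System.out.println(row(1, 1, CLASS, null));
        System.out.println(row(1, 7, ID, "Factorial"));
        System.out.println(row(1, 17, LBRACE, null));
        System.out.print(illegalCharacter('#', 2, 5));
    }
}
```

If you enable JFlex's %line and %column options, the yyline and yycolumn counters start at 0, so the one-based positions in the example output would need 1 added to each.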

Submission

By the deadline, you need to submit:

1.       Your JFlex file containing your lexer

2.       Your Java files containing main() and any auxiliary Java files you have used

3.       A README text file describing how to build and run your program.

4.       Any examples that you have tested your program on

Create a zip file of the above files. Use SFTP to copy your zip file to unixs.cis.pitt.edu and then copy it to:

     ~jrmst106/submit/1622