CS 1501 Fall 2003

Midterm Exam Solutions

1)      Fill in the Blanks (20 points -- 2 points each). Complete the statements below with the MOST APPROPRIATE words/phrases.

a)         If a program's run-time can be modeled by the function   20N^3 + 4N^2(2N + N lg N) , the program's Theta runtime growth rate is Theta(N^3 lg N)

b)         An algorithm that is known to run in time Theta(N^2) requires 40 seconds to run on a problem of size K.  How long will the algorithm take to run on a problem of size 2K on a computer that is twice as fast? 80 seconds (doubling N quadruples the work to 160 seconds, and the faster machine halves that)

c)         An algorithm that is known to run in time Theta(N^2) requires 40 seconds to run on a problem of size K.  How large an input can this algorithm finish in 40 seconds on a new computer that is 100 times as fast? 10K (the new machine can do 100 times the work, and (10K)^2 = 100K^2)

d)         Given a radix search trie in which 32-bit keys are compared 4 bits at a time, the maximum height of the tree is  8  and interior nodes will each have up to 16 children.

e)         The InsertionSort algorithm has a worst case asymptotic runtime of  Theta(N^2) and a best case asymptotic runtime of Theta(N)

f)          The reason that hashing with linear probing results in slower access times than double hashing when the hash table is full is commonly called  primary clustering.

g)         The simple divide and conquer integer multiplication algorithm has a run-time of  Theta(N^2) and Karatsuba's improved algorithm has a run-time of Theta(N^(lg 3)), which is roughly Theta(N^1.585).

h)         Using the recursive, efficient GCD algorithm that we discussed in class, finish the equation sequence (show each recursive call and the result): GCD(154, 70) =  GCD(70, 14) = GCD(14, 0) = 14

i)           If x and y are 1000 bit numbers then the product x*y contains about 2000  bits, and x raised to the power of y, that is x^y, contains about 1000 * 2^1000 bits (x^y has about y * lg x bits, and a 1000-bit y can be as large as 2^1000).

j)           The reason that it is standard practice to ignore multiplicative constants when computing the running times of algorithms/programs is that they are implementation dependent.
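The equation sequence in item h) can be reproduced with a short sketch of the recursive Euclidean algorithm. The class name GcdSketch and the trace helper are illustrative assumptions, not something taken from the course materials.

```java
final class GcdSketch {
    // Euclid's algorithm: GCD(a, b) = GCD(b, a mod b), and GCD(a, 0) = a.
    static long gcd(long a, long b) {
        if (b == 0) return a;
        return gcd(b, a % b);
    }

    // Builds the equation sequence by writing out each call in turn.
    static String trace(long a, long b) {
        StringBuilder sb = new StringBuilder();
        while (true) {
            sb.append("GCD(").append(a).append(", ").append(b).append(") = ");
            if (b == 0) return sb.append(a).toString();
            long r = a % b;   // the next call is GCD(b, a mod b)
            a = b;
            b = r;
        }
    }
}
```

Calling GcdSketch.trace(154, 70) produces the same chain as the answer to item h).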

2)      True/False (10 points -- 2 points each). Indicate whether each of the following is TRUE or FALSE, explaining why in an informative way for false answers.

a)         If used as intended, the Vernam, or one-time pad cipher, is provably unbreakable if the key has not been compromised. True

b)         If I am using hashing with separate chaining, the maximum possible load factor (α) for the hash table is 1. False, when closed-address hashing (separate chaining) is used the table can hold an arbitrary number of elements, so α can exceed 1.

c)         By using a random pivot for QuickSort, we improve the worst case run-time to Theta(NlgN). False, if one is very very unlucky a runtime of Theta(N^2) is possible.

d)         In order to enable decompression of an LZW-compressed file, we need to prepend extra information to the beginning of the compressed file. False, LZW is an adaptive compression scheme.

e)         The Miller-Rabin Witness algorithm is used to verify the authenticity of sent messages. False, the Miller-Rabin Witness test is used to test primality.


3)      (14 points – 2 + 6 + 6) Consider de la Briandais Trees (dlBs), as we discussed in lecture.

a)         Give the type and variable declarations to create a de la Briandais tree in either Java, C or C++.

Answers will vary slightly.  Below is one possibility in Java:

 

public class DLBNode
{
   public byte letter;                 // character stored in this node
   public DLBNode rsibling, lchild;    // next alternative at this level, first node of the next level
}

public class DLB
{
   private DLBNode root;
}

 

b)         Given an initially empty dlB, draw how it would look if the words shown below were inserted in the order shown below:

gone   go        goes    so        zone

 

 

No answer given.  See TA for answer.

 

                      

c)         Assume that you have a dictionary of 50,000 words over the standard Roman/English alphabet of 26 letters. Assume that the maximum length of a word is 18 characters. Given your declarations above, calculate the maximum amount of space in bytes required by a de la Briandais tree for this dictionary. You need not simplify your expression. Justify your answer.

The worst case occurs when the overlap between the words is minimal. An upper bound on the number of nodes is then (50,000)(18), so the space is at most (50,000)(18)(space per node). For the declarations above, assuming 4-byte references, we would get something like (50,000)(18)(2*4+1) bytes.
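As a rough illustration of how the node count in part c) arises, here is a minimal sketch of dlB insertion built on declarations like those in part a). Several details are assumptions made for the example: the class name DlbSketch, a char field instead of byte, and a '$' node marking the end of each word (the end-of-word convention used in lecture is not stated here).

```java
final class DlbSketch {
    static final class Node {
        char letter;             // the declarations in part a) use byte; char keeps the sketch simple
        Node rsibling, lchild;   // next alternative at this level, first node of the next level
        Node(char letter) { this.letter = letter; }
    }

    static final char END = '$'; // assumed end-of-word marker
    private Node root;
    private int nodeCount;

    void insert(String word) {
        String key = word + END;
        Node parent = null;
        for (int k = 0; k < key.length(); k++) {
            char c = key.charAt(k);
            // Walk the sibling list at this level looking for c.
            Node cur = (parent == null) ? root : parent.lchild;
            Node prev = null;
            while (cur != null && cur.letter != c) { prev = cur; cur = cur.rsibling; }
            if (cur == null) {                    // not found: append a new sibling
                cur = new Node(c);
                nodeCount++;
                if (prev != null) prev.rsibling = cur;
                else if (parent == null) root = cur;
                else parent.lchild = cur;
            }
            parent = cur;                         // descend one level
        }
    }

    boolean contains(String word) {
        String key = word + END;
        Node level = root;
        for (int k = 0; k < key.length(); k++) {
            Node cur = level;
            while (cur != null && cur.letter != key.charAt(k)) cur = cur.rsibling;
            if (cur == null) return false;
            level = cur.lchild;
        }
        return true;
    }

    int nodeCount() { return nodeCount; }
}
```

Under these assumptions, inserting gone, go, goes, so, zone in that order creates 17 nodes, because shared prefixes such as "go" are stored only once. The bound in part c) corresponds to the case where no prefixes are shared at all.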

 

4)      (4 points) Assume that you have the following hash table indexed from 0 to 9 with entries as shown. You are using double hashing. You insert a new item x with primary hash function h1(x)=5 and secondary hash function h2(x)=3. State the resulting probe sequence and where x is inserted: probe sequence = 5, 8, 1, 4, 7 and the item is stored in location 7.

 

 

Index:   0    1    2    3    4    5    6    7    8    9
Entry:   -    R    T    -    A    F    -    -    E    -

(- marks an empty slot)
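The probe sequence can be checked with a small simulation. This sketch assumes the table is read as having R, T, A, F, E in slots 1, 2, 4, 5, 8 with the remaining slots empty; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class DoubleHashProbe {
    // Returns every slot probed, in order; the last entry is where the item is stored.
    static List<Integer> probe(boolean[] occupied, int h1, int h2) {
        List<Integer> sequence = new ArrayList<>();
        int m = occupied.length;
        int slot = h1 % m;
        for (int tries = 0; tries < m; tries++) {
            sequence.add(slot);
            if (!occupied[slot]) return sequence;   // free slot found
            slot = (slot + h2) % m;                 // step by the secondary hash value
        }
        throw new IllegalStateException("no free slot reached by this probe sequence");
    }
}
```

With h1(x) = 5 and h2(x) = 3 on a table of size 10, the slots visited are 5, 8, 1, 4 (all occupied) and then 7, which is free.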

5)      (16 points 8+4+4)  Consider the following algorithm for filling in words in a crossword puzzle board, and assume that the 26 letters A through Z are considered.

Each possible letter in the alphabet is tried for each square on the board.  However, no tests for validity using the dictionary are done until after each assignment of the entire board.   The procedure continues until a solution is found or all possibilities are tried.  For example, the first board that would be tested would be a board with the letter A in each square.  The second board that would be tested would be a board with the letter A in each square except the last square, which would contain a B.

Consider a 4x4 board with 16 total squares, all of which are initially valid empty squares.

a)         Write C/C++  or Java code for this algorithm. Assume that you have a function CheckBoard which will determine if a completely filled in board is a valid crossword.

 

Many correct answers exist for this problem.  Any correct answer should have the following:

An array variable to store the current board assignment.

Index values to determine the current square (likely as parameters to the recursive function).

A recursive function that loops through each valid letter for that position, making a recursive call to the next position for each one.

A recursive call that is made without any checking of board validity.

A base case that occurs when the last square is filled; only in this case is the validity of the board actually checked.
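One possible shape for such an answer is sketched below. It is only an illustration: the Checker interface stands in for the CheckBoard function from the question, and the class name, the boardsTested counter, and the configurable board and alphabet sizes are assumptions added so the sketch can be run on small inputs.

```java
final class BruteForceBoard {
    // Stand-in for the CheckBoard function assumed by the question.
    interface Checker { boolean checkBoard(char[] board); }

    private final char[] board;     // current assignment, one letter per square
    private final int letters;      // alphabet size (26 in the exam question)
    private final Checker checker;
    long boardsTested = 0;          // how many complete boards were handed to the checker

    BruteForceBoard(int squares, int letters, Checker checker) {
        this.board = new char[squares];
        this.letters = letters;
        this.checker = checker;
    }

    // Fills square pos and every square after it; returns true once a valid board is found.
    boolean fill(int pos) {
        if (pos == board.length) {            // base case: every square has a letter
            boardsTested++;
            return checker.checkBoard(board); // the only place validity is checked
        }
        for (int k = 0; k < letters; k++) {
            board[pos] = (char) ('A' + k);
            if (fill(pos + 1)) return true;   // recurse with no validity test
        }
        return false;                         // all letters tried for this square
    }
}
```

For the exam's board the call would be new BruteForceBoard(16, 26, checker).fill(0). When the checker rejects every board, boardsTested ends at letters raised to the number of squares, which is easy to confirm on a tiny case such as 3 squares and 4 letters (64 boards).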

 

b)         How many boards must be tested if no solution is found?  Be precise in terms of the number of letters and the number of squares in the board and justify your answer. Each of the 16 board positions can take any of 26 possible values. Thus there will be 26^16 leaves in the recursive exhaustive search tree, and each leaf is one board that must be tested.

 

c)         Assuming the dictionary is stored in a de la Briandais tree, how can this algorithm be improved to consider fewer possible boards, while still guaranteeing that a solution will be found if one exists?   Explain your answer. You do not need to give code, but make sure to state where in your code, for your answer to part a), you would modify the code. Fill the board in from left to right, and from top to bottom. Then, before you make a recursive call, verify that each string constructed so far is a prefix of a valid word. This will greatly prune the size of the exhaustive search tree.

6)      (8 points – 4 + 4)  Consider the mismatched character heuristic of the Boyer-Moore string matching algorithm.  For each of the pattern and text strings shown below, state and justify how many total character comparisons must be done in order to match each pattern within the text string.  Justify your answers using the skip arrays for each pattern.

a)         Text:               ABCDXABCDYABCDZABCDE

Pattern:           ABCDE

 

Skip(A)=4, Skip(B)=3, Skip(C)=2, Skip(D)=1 and Skip(E)=0. All other Skip array entries are 5. The total number of character comparisons is 8. For each of the first 3 mismatches the maximum skip value of 5 is used.  The final 5 comparisons are used to match the pattern right to left.

 

 

 

b)         Text:               XXXXXXXXXXXXXXXXXXXY

Pattern:           XXXXY

 

Skip(X)=1 and Skip(Y)=0. All other Skip array entries are 5. The total number of character comparisons is 20. For each of the first 15 character mismatches, the pattern advances only 1 position. The final 5 comparisons are used to match the pattern right to left.
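Both comparison counts can be reproduced with a sketch of the mismatched character heuristic. It assumes 8-bit characters and follows the common textbook formulation in which the text index advances by the larger of the skip value and the number of pattern characters already examined; the class name and the comparisons counter are illustrative additions.

```java
import java.util.Arrays;

final class MismatchedCharSearch {
    static int comparisons;   // character comparisons made by the most recent search

    // skip[c] = distance from the last occurrence of c to the end of the pattern,
    // or the full pattern length if c does not occur in the pattern.
    static int[] buildSkip(String pattern) {
        int m = pattern.length();
        int[] skip = new int[256];            // assumes characters in the range 0..255
        Arrays.fill(skip, m);
        for (int j = 0; j < m; j++) skip[pattern.charAt(j)] = m - 1 - j;
        return skip;
    }

    // Returns the index of the first match, or -1 if the pattern does not occur.
    static int search(String text, String pattern) {
        int m = pattern.length(), n = text.length();
        int[] skip = buildSkip(pattern);
        comparisons = 0;
        int i = m - 1;                        // current position in the text
        int j = m - 1;                        // current position in the pattern
        while (j >= 0) {
            if (i >= n) return -1;            // ran off the end of the text
            comparisons++;
            if (text.charAt(i) == pattern.charAt(j)) {
                i--;                          // keep matching right to left
                j--;
            } else {
                i += Math.max(m - j, skip[text.charAt(i)]);
                j = m - 1;                    // restart at the right end of the pattern
            }
        }
        return i + 1;
    }
}
```

On the text and pattern from part a) the sketch makes 8 comparisons, and on those from part b) it makes 20, finding the pattern at index 15 in both cases.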

7)      (6 points)  Consider the forest of single-node trees with frequencies shown below.  Draw the Huffman tree that would result from this forest of nodes.  Show your work.

 


8)      (8 points)  Consider the MTF heuristic used for compression in Assignment 3.  Assume that your arrays are size 256 (or 2^8) and are in their initialized state.  Show (in bits) the output of the compression of the text shown below.   Note that the ASCII value for A = 65, B = 66, etc.  Show your work to receive full credit.

 

ABAACB

 

110  1000001

110  1000010

000  1

000  0

110  1000011

001  10
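The output above can be reproduced with a short move-to-front sketch. The output format is inferred from the answer itself and is an assumption about Assignment 3: each symbol is written as a 3-bit field holding (number of position bits - 1), followed by the symbol's current list position in binary. The class name MtfSketch is hypothetical.

```java
final class MtfSketch {
    // Encodes text with move-to-front; returns one "length position" pair per line.
    static String encode(String text) {
        int[] list = new int[256];
        for (int k = 0; k < 256; k++) list[k] = k;        // initialized state: identity order
        StringBuilder out = new StringBuilder();
        for (int c = 0; c < text.length(); c++) {
            int symbol = text.charAt(c);                  // assumes 8-bit characters
            int pos = 0;
            while (list[pos] != symbol) pos++;            // find the symbol's current position
            String bits = Integer.toBinaryString(pos);    // position without leading zeros
            String len = Integer.toBinaryString(bits.length() - 1);
            while (len.length() < 3) len = "0" + len;     // pad the length field to 3 bits
            if (out.length() > 0) out.append('\n');
            out.append(len).append(' ').append(bits);
            System.arraycopy(list, 0, list, 1, pos);      // shift earlier entries right by one
            list[0] = symbol;                             // move the symbol to the front
        }
        return out.toString();
    }
}
```

For ABAACB the positions are 65, 66, 1, 0, 67, 2: A and B start at their ASCII positions, the repeated A is found near the front, and C is still at 67 because only symbols that preceded it have been moved.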


9)      (6 points)  Consider the algorithm we discussed in lecture for raising an N-bit integer to an integer power (i.e. X^Y for N-bit integers X and Y). Your program should run in a reasonable amount of time for 1000 bit numbers. Write a C/C++ or Java code fragment to calculate this value.  Assume that a primitive type verylong exists which can store arbitrary length integers.

 

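verylong answer=1;      // running product, starts at 1
verylong square=X;      // holds X^(2^k) at the start of iteration k
verylong temp=Y;        // exponent bits not yet processed

while (temp != 0)
            {
            if ( temp % 2 != 0 )  answer=answer*square;   // low bit set: include this power
            square=square*square;
            temp= temp / 2 ;
            }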

Note: A recursive solution is also acceptable
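Since verylong is a hypothetical type, the same repeated-squaring loop can be tried out in Java with java.math.BigInteger. This is a sketch under the assumption that the exponent is non-negative; the class and method names are illustrative.

```java
import java.math.BigInteger;

final class PowerSketch {
    // Computes x^y by repeated squaring, examining the bits of y from low to high.
    static BigInteger power(BigInteger x, BigInteger y) {
        BigInteger answer = BigInteger.ONE;   // running product
        BigInteger square = x;                // x^(2^k) at the start of iteration k
        BigInteger temp = y;                  // exponent bits not yet processed (assumed >= 0)
        while (temp.signum() != 0) {
            if (temp.testBit(0)) answer = answer.multiply(square);
            square = square.multiply(square);
            temp = temp.shiftRight(1);
        }
        return answer;
    }
}
```

The loop runs once per bit of y, so the number of multiplications grows with the length of the exponent rather than with its value.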

 

10)      (8 points) Consider the QuickSort algorithm that we discussed in class.  Assume that a separate function / method partition is already defined to partition the array as we discussed in lecture, and it also returns the index of the location of the pivot after partition has been completed.  Write the Java or C++ code for QuickSort, using one of the headers below.

    public static void QuickSort(int [] A, int left, int right)  // Java

              void QuickSort(int A[], int left, int right)        // C++

 

 

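void QuickSort(int A[], int left, int right)
  {
    int i;
    if (right > left)
      {
        i = partition(A, left, right);   // pivot ends up at index i
        QuickSort(A, left, i-1);         // sort the elements left of the pivot
        QuickSort(A, i+1, right);        // sort the elements right of the pivot
      }
  }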

 

Note: This is the EXACT same problem given in the practice test.
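For a version that can be run, the Java header can be paired with a concrete partition. The partition below is a Lomuto-style scheme that uses the rightmost element as the pivot; it is an assumption chosen for brevity and may differ from the partition presented in lecture.

```java
final class QuickSortSketch {
    // Assumed partition: moves A[right] to its final sorted position and returns that index.
    static int partition(int[] A, int left, int right) {
        int pivot = A[right];
        int i = left;                          // next slot for an element smaller than the pivot
        for (int j = left; j < right; j++) {
            if (A[j] < pivot) {
                int t = A[i]; A[i] = A[j]; A[j] = t;
                i++;
            }
        }
        int t = A[i]; A[i] = A[right]; A[right] = t;   // place the pivot
        return i;
    }

    public static void QuickSort(int[] A, int left, int right) {
        if (right > left) {
            int p = partition(A, left, right);
            QuickSort(A, left, p - 1);         // elements before the pivot
            QuickSort(A, p + 1, right);        // elements after the pivot
        }
    }
}
```

The initial call for an array A is QuickSortSketch.QuickSort(A, 0, A.length - 1).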