CS 1501 Fall 2003
Midterm Exam Solutions
1) Fill in the Blanks (20 points -- 2 points each). Complete the statements below with the MOST APPROPRIATE words/phrases.
a) If a program's run-time can be modeled by the function 20N^3 + 4N^2(2N + N lg N), the program's Theta runtime growth rate is Theta(N^3 lg N).
b) An algorithm that is known to run in time Theta(N^2) requires 40 seconds to run on a problem of size K. How long will the algorithm take to run on a problem of size 2K on a computer that is twice as fast? 80 seconds (doubling the input quadruples the work to 160 seconds, and the faster machine halves that).
c) An algorithm that is known to run in time Theta(N^2) requires 40 seconds to run on a problem of size K. How large an input can this algorithm finish in 40 seconds on a new computer that is 100 times as fast? 10K (100 times the work allows an input sqrt(100) = 10 times as large).
d) Given a radix search trie in which 32-bit keys are compared 4 bits at a time, the maximum height of the tree is 8 and interior nodes will each have up to 16 children.
e) The InsertionSort algorithm has a worst case asymptotic runtime of Theta(N^2) and a best case asymptotic runtime of Theta(N).
f) The reason that hashing with linear probing results in slower access times than double hashing when the hash table is full is commonly called primary clustering.
g) The simple divide and conquer integer multiplication algorithm has a run-time of Theta(N^2) and Karatsuba's improved algorithm has a run-time of Theta(N^(lg 3)), which is approximately Theta(N^1.6).
h) Using the recursive, efficient GCD algorithm that we discussed in class, finish the equation sequence (show each recursive call and the result): GCD(154, 70) = GCD(70, 14) = GCD(14, 0) = 14
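The call sequence in part (h) can be reproduced with a short recursive routine. This is a minimal sketch of the Euclidean algorithm, assuming non-negative arguments; the class name Gcd is only illustrative and is not taken from the course materials.

```java
class Gcd {
    // Euclid's algorithm: GCD(a, b) = GCD(b, a mod b), with GCD(a, 0) = a.
    // Assumes a >= 0 and b >= 0.
    static long gcd(long a, long b) {
        if (b == 0) {
            return a;              // base case: GCD(a, 0) = a
        }
        return gcd(b, a % b);      // one recursive call per step of the sequence
    }
}
```

Calling Gcd.gcd(154, 70) makes the recursive calls gcd(70, 14) and gcd(14, 0), matching the sequence written out above.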
i) If x and y are 1000-bit numbers, then the product x*y contains about 2000 bits, and x raised to the power of y, that is x^y, contains about 1000 * 2^1000 bits (since lg(x^y) = y lg x, and y is itself about 2^1000).
j) The reason that it is standard practice to ignore multiplicative constants when computing the running times of algorithms/programs is that they are implementation dependent.
2) True/False (10 points -- 2 points each). Indicate whether each of the following is TRUE or FALSE, explaining why in an informative way for false answers.
a) If used as intended, the Vernam, or one-time pad cipher, is provably unbreakable if the key has not been compromised. True
b) If I am using hashing with separate chaining, the maximum possible load factor (alpha) for the hash table is 1. False. When closed-address hashing (separate chaining) is used, each slot holds a list, so the table can hold an arbitrary number of elements and the load factor can exceed 1.
c) By using a random pivot for QuickSort, we improve the worst case run-time to Theta(N lg N). False. If one is very, very unlucky, a runtime of Theta(N^2) is still possible.
d) In order to enable decompression of an LZW-compressed file, we need to prepend extra information to the beginning of the compressed file. False. LZW is an adaptive compression scheme: the decompressor rebuilds the same codebook as it reads the codewords.
e) The Miller-Rabin Witness algorithm is used to verify the authenticity of sent messages. False. The Miller-Rabin Witness test is used to test primality.
3) (14 points – 2 + 6 + 6) Consider de la Briandais Trees (dlBs), as we discussed in lecture.
a) Give the type and variable declarations to create a de la Briandais tree in either Java, C or C++.
Answers will vary slightly. Below is one possibility in Java:
public class DLBNode
{
    public byte letter;                 // character stored at this node
    public DLBNode rsibling, lchild;    // next alternative letter, first letter of the next level
}

public class DLB
{
    private DLBNode root;
}
b) Given an initially empty dlB, draw how it would look if the words shown below were inserted in the order shown below:
gone go goes so zone
No answer given. See TA for answer.
c) Assume that you have a dictionary of 50,000 words over the standard Roman/English alphabet of 26 letters. Assume that the maximum length of a word is 18 characters. Given your declarations above, calculate the maximum amount of space in bytes required by a de la Briandais tree for this dictionary. You need not simplify your expression. Justify your answer.
The worst case occurs when the overlap between the words is minimal. An upper bound on the number of nodes is (50,000)(18), so the space is at most (50,000)(18)(space per node). For the declarations above, with two 4-byte references and one 1-byte letter per node, we would get something like (50,000)(18)(2*4+1) bytes.
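The declarations from part (a) can be exercised with a small insertion routine. The sketch below is one possible version, not the lecture's code: it assumes that each word is terminated by a '$' sentinel node (the declarations above have no end-of-word flag), and the nodeCount field and the method names are illustrative additions.

```java
class DLBNode {
    byte letter;                  // character stored at this node
    DLBNode rsibling, lchild;     // next alternative at this level, first node of next level
    DLBNode(byte letter) { this.letter = letter; }
}

class DLB {
    private static final byte END = '$';   // assumed end-of-word sentinel
    private DLBNode root;
    private int nodeCount;                  // illustrative: number of nodes allocated

    // Insert a word, creating only the nodes that do not already exist.
    void insert(String word) {
        String key = word + (char) END;
        DLBNode parent = null;
        for (int i = 0; i < key.length(); i++) {
            byte c = (byte) key.charAt(i);
            DLBNode cur = (parent == null) ? root : parent.lchild;
            DLBNode prev = null;
            // Walk the sibling list looking for the letter c.
            while (cur != null && cur.letter != c) {
                prev = cur;
                cur = cur.rsibling;
            }
            if (cur == null) {                 // letter not present at this level
                cur = new DLBNode(c);
                nodeCount++;
                if (prev != null) prev.rsibling = cur;
                else if (parent == null) root = cur;
                else parent.lchild = cur;
            }
            parent = cur;
        }
    }

    // True only if the whole word, including the sentinel, is present.
    boolean contains(String word) {
        String key = word + (char) END;
        DLBNode level = root;
        for (int i = 0; i < key.length(); i++) {
            byte c = (byte) key.charAt(i);
            DLBNode cur = level;
            while (cur != null && cur.letter != c) cur = cur.rsibling;
            if (cur == null) return false;
            level = cur.lchild;
        }
        return true;
    }

    int size() { return nodeCount; }
}
```

Shared prefixes are stored once, which is why the bound in part (c) is only reached when the words overlap as little as possible. The space estimate is then size() multiplied by the per-node cost.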
4) (4 points) Assume that you have the following hash table indexed from 0 to 9 with entries as shown. You are using double hashing. You insert a new item x with primary hash function h1(x)=5 and secondary hash function h2(x)=3. State the resulting probe sequence and where x is inserted: probe sequence = 5, 8, 1, 4, 7 and the item is stored in location 7.
Index:   0   1   2   3   4   5   6   7   8   9
Entry: |   | R | T |   | A | F |   |   | E |   |
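The probe sequence for this table can be checked mechanically. The following is a minimal sketch of double-hashing probing, assuming empty slots are represented by null; the class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

class DoubleHashingProbe {
    // Returns the slots examined, in order, ending with the first empty slot.
    // Probe i examines (h1 + i * h2) mod m, where m is the table size.
    static List<Integer> probeSequence(Object[] table, int h1, int h2) {
        List<Integer> probes = new ArrayList<>();
        int m = table.length;
        int idx = h1 % m;
        for (int i = 0; i < m; i++) {
            probes.add(idx);
            if (table[idx] == null) {
                return probes;            // empty slot found: the item goes here
            }
            idx = (idx + h2) % m;         // step by the secondary hash value
        }
        return probes;                    // no empty slot reached within m probes
    }
}
```

With the table above (R, T, A, F and E in slots 1, 2, 4, 5 and 8), h1 = 5 and h2 = 3, the slots visited are 5, 8, 1, 4 and 7.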
5) (16 points 8+4+4) Consider the following algorithm for filling in words in a crossword puzzle board, and assume that the 26 letters A through Z are considered. Each possible letter in the alphabet is tried for each square on the board. However, no tests for validity using the dictionary are done until after each assignment of the entire board. The procedure continues until a solution is found or all possibilities are tried. For example, the first board that would be tested would be a board with the letter A in each square. The second board that would be tested would be a board with the letter A in each square except the last square, which would contain a B.
Consider a 4x4 board with 16 total squares, all of which are initially valid empty squares.
a) Write C/C++ or Java code for this algorithm. Assume that you have a function CheckBoard which will determine if a completely filled in board is a valid crossword.
Many correct answers exist to this problem. Any correct answer should have the following:
- An array variable to store the current board assignment.
- Index values to determine the current square (likely as parameters to the recursive function).
- A recursive function that loops through each valid letter for that position, for each one making a recursive call to the next position.
- The recursive call should be made without any checking of board validity.
- The base case occurs when the last square is filled; only in this case is the validity of the board actually checked.
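One possible shape for such an answer is sketched below in Java. It is only an illustration of the requirements listed above: the BoardChecker interface stands in for the CheckBoard function the question assumes, the board is stored as a flat array of squares, and the boardsTested counter and the letters parameter are additions made so the sketch can be tried on small boards.

```java
class BruteForceCrossword {
    // Stand-in for the CheckBoard function assumed by the question.
    interface BoardChecker {
        boolean checkBoard(char[] board);
    }

    static long boardsTested;   // illustrative counter of complete boards checked

    // Fill squares square..board.length-1 with every combination of the
    // first `letters` letters of the alphabet. No validity test is made
    // until the whole board has been assigned.
    static boolean fill(char[] board, int square, int letters, BoardChecker checker) {
        if (square == board.length) {          // base case: board completely filled
            boardsTested++;
            return checker.checkBoard(board);
        }
        for (int k = 0; k < letters; k++) {
            board[square] = (char) ('A' + k);  // try each letter in this square
            if (fill(board, square + 1, letters, checker)) {
                return true;                   // stop at the first valid board
            }
        }
        return false;                          // all possibilities exhausted
    }
}
```

For the 4x4 board in the question the call would be fill(new char[16], 0, 26, checker). Because the last square varies fastest, the first board tested is all A's and the second differs only by a B in the last square.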
b) How many boards must be tested if no solution is found? Be precise in terms of the number of letters and the number of squares in the board and justify your answer.
Each of the 16 board positions can take any of 26 possible values. Thus there will be 26^16 leaves in the recursive exhaustive search tree.
c) How can this algorithm be improved to consider fewer possible boards, while still guaranteeing a solution will be found if it exists, if the dictionary is a de la Briandais tree? Explain your answer. You do not need to give code, but make sure to state where in your code, for your answer to part a), you would modify the code.
Fill the board in from left to right, and from top to bottom. Then, before you make a recursive call, verify that each string constructed so far is a prefix of a valid word. This will greatly prune the size of the exhaustive search tree.
6) (8 points – 4 + 4) Consider the mismatched character heuristic of the Boyer-Moore string matching algorithm. For each of the pattern and text strings shown below, state and justify how many total character comparisons must be done in order to match each pattern within the text string. Justify your answers using the skip arrays for each pattern.
a) Text: ABCDXABCDYABCDZABCDE
Pattern: ABCDE
Skip(A)=4, Skip(B)=3, Skip(C)=2, Skip(D)=1 and Skip(E)=0. All other Skip array entries are 5. The total number of character comparisons is 8. For each of the first 3 mismatches the maximum skip value of 5 is used. The final 5 comparisons are used to match the pattern right to left.
b) Text: XXXXXXXXXXXXXXXXXXXY
Pattern: XXXXY
Skip(X)=1 and Skip(Y)=0. All other Skip array entries are 5. The total number of character comparisons is 20. For each of the first 15 character mismatches, the pattern advances only 1 position. The final 5 comparisons are used to match the pattern right to left.
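The comparison counts in parts (a) and (b) can be checked with a short program. This is a sketch of the mismatched character heuristic only, assuming characters with codes below 256; the comparisons counter and the names used here are illustrative and are not the lecture's code.

```java
import java.util.Arrays;

class MismatchedCharSearch {
    static int comparisons;   // character comparisons made by the most recent search

    // skip[c] = distance from the last occurrence of c in the pattern to the
    // end of the pattern, or the pattern length if c does not occur.
    static int[] buildSkip(String pattern) {
        int m = pattern.length();
        int[] skip = new int[256];
        Arrays.fill(skip, m);
        for (int j = 0; j < m; j++) {
            skip[pattern.charAt(j)] = m - 1 - j;
        }
        return skip;
    }

    // Returns the index of the first match, or -1 if there is none.
    static int search(String text, String pattern) {
        int n = text.length(), m = pattern.length();
        int[] skip = buildSkip(pattern);
        comparisons = 0;
        int s = 0;                                  // current alignment of the pattern
        while (s + m <= n) {
            int j = m - 1;                          // compare right to left
            while (j >= 0) {
                comparisons++;
                if (text.charAt(s + j) != pattern.charAt(j)) break;
                j--;
            }
            if (j < 0) return s;                    // whole pattern matched
            // Line up the mismatched text character with its last occurrence
            // in the pattern, always advancing by at least one position.
            int shift = skip[text.charAt(s + j)] - (m - 1 - j);
            s += Math.max(1, shift);
        }
        return -1;
    }
}
```

Running search on the two text/pattern pairs above and reading the comparisons field afterwards gives the totals stated in the answers.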
7) (6 points) Consider the forest of single-node trees with frequencies shown below. Draw the Huffman tree that would result from this forest of nodes. Show your work.

8) (8 points) Consider the MTF heuristic used for compression in Assignment 3. Assume that your arrays are size 256 (or 2^8) and are in their initialized state. Show (in bits) the output of the compression of the text shown below. Note that the ASCII value for A = 65, B = 66, etc. Show your work to receive full credit.
110 1000001
110 1000010
000 1
000 0
110 1000011
001 10
9) (6 points) Consider the algorithm we discussed in lecture for raising an N-bit integer to an integer power (i.e. X^Y for N-bit integers X and Y). Your program should run in a reasonable amount of time for 1000 bit numbers. Write a C/C++ or Java code fragment to calculate this value. Assume that a primitive type verylong exists which can store arbitrary length integers.
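verylong answer = 1;      // running product, starts at X^0
verylong square = X;      // holds X^(2^i) at iteration i
verylong temp = Y;        // remaining bits of the exponent
while (temp != 0)
{
    if (temp % 2 != 0) answer = answer * square;   // low bit set: include this power
    square = square * square;
    temp = temp / 2;
}
Note: A recursive solution is also acceptable.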
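Since verylong is a hypothetical type, the same loop can be tried out in Java with java.math.BigInteger standing in for it. This is a sketch under that assumption, for exponents Y >= 0; the class and method names are illustrative.

```java
import java.math.BigInteger;

class FastPower {
    // Repeated squaring: one squaring per bit of y, plus one multiplication
    // for each 1 bit. Assumes y >= 0.
    static BigInteger power(BigInteger x, BigInteger y) {
        BigInteger answer = BigInteger.ONE;    // X^0
        BigInteger square = x;                 // X^(2^i) at iteration i
        BigInteger temp = y;                   // remaining exponent bits
        while (temp.signum() != 0) {
            if (temp.testBit(0)) {             // low bit set: include this power
                answer = answer.multiply(square);
            }
            square = square.multiply(square);
            temp = temp.shiftRight(1);         // drop the low bit
        }
        return answer;
    }
}
```

The loop runs once per bit of the exponent, so the number of multiplications is proportional to the bit length of Y, not to its value.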
10) (8 points) Consider the QuickSort algorithm that we discussed in class. Assume that a separate function/method partition is already defined to partition the array as we discussed in lecture, and that it also returns the index of the location of the pivot after the partition has been completed. Write the Java or C++ code for QuickSort, using one of the headers below.
public static void QuickSort(int [] A, int left, int right) // Java
void QuickSort(int A[], int left, int right) // C++
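void QuickSort(int A[], int left, int right)
{
    int i;
    if (right > left)
    {
        i = partition(A, left, right);   // pivot ends up at index i
        QuickSort(A, left, i-1);         // sort the elements left of the pivot
        QuickSort(A, i+1, right);        // sort the elements right of the pivot
    }
}
Note: This is the EXACT same problem given in the practice test.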
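To run the QuickSort answer, some partition routine has to be supplied. The Java sketch below uses the Java header from the question together with a hypothetical partition that takes the last element as the pivot (the Lomuto scheme); the lecture's partition may well differ, and only its contract (return the pivot's final index) matters here.

```java
class QuickSortDemo {
    // Hypothetical partition: uses A[right] as the pivot, moves smaller
    // elements to its left, and returns the pivot's final index.
    static int partition(int[] A, int left, int right) {
        int pivot = A[right];
        int i = left;                      // next slot for an element < pivot
        for (int j = left; j < right; j++) {
            if (A[j] < pivot) {
                swap(A, i, j);
                i++;
            }
        }
        swap(A, i, right);                 // put the pivot in its final place
        return i;
    }

    static void swap(int[] A, int i, int j) {
        int tmp = A[i];
        A[i] = A[j];
        A[j] = tmp;
    }

    public static void QuickSort(int[] A, int left, int right) {
        if (right > left) {
            int i = partition(A, left, right);
            QuickSort(A, left, i - 1);     // elements left of the pivot
            QuickSort(A, i + 1, right);    // elements right of the pivot
        }
    }
}
```

A whole array is sorted with QuickSort(A, 0, A.length - 1). With this fixed pivot choice, an already sorted input triggers the Theta(N^2) worst case.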