My primary technical areas of interest are compilers, programming
languages, and their interaction with computer architecture. The overarching
question that has motivated my research is how limitations of purely static
analysis and optimization can be overcome by exploiting information available
only at run time. Taking advantage of run-time information becomes increasingly
important in a number of domains because of the new ways software is deployed,
the languages it is written in, and the emerging computer architectures programs
are executed on. Harnessing run-time information can both improve performance
and facilitate the use of abstraction in programs by removing overheads
typically associated with generic language constructs.
I am currently exploring some of these ideas in the Chasqui project.
Moreover, run-time
information can augment static analyses when their results are too imprecise to
be useful in practice. This creates potential applications beyond optimization,
most conspicuously in the software engineering domain, in the maintenance and
evolution of software.
Previous Work
As a member of the UW Dynamic Compilation Project group I have been one of the principle designers of the DyC dynamic compilation system. DyC is an annotation-driven dynamic compilation system, i.e., annotations added to C code by the programmer control what optimizations are performed at run time. The annotations specify what code should be dynamically optimized, how aggressive the optimizations should be, and they also control a number of tradeoffs between dynamic compilation cost and benefit. Specifically, I have been responsible for DyC's annotation language, and I have designed and implemented its Binding Time Analysis (BTA), which controls what code is generated at run time and is central to DyC's main optimization, run-time code specialization.
Making DyC annotation-driven, allowed us to separate the mechanisms of dynamic compilation from the policy of where dynamic compilation should be applied. As it turns out, deciding where dynamic compilation may be beneficial, and determining the best cost-benefit tradeoffs for the optimization process, is at least equally challenging. As the major part of my dissertation, I have designed and implemented Calpa, a system to automate selective dynamic compilation. Calpa uses static program analysis to identify program regions and variables that might benefit from DyC's optimizations. It uses detailed value-profiling and a cost-benefit model to estimate the dynamic compilation costs based on the program's run-time characteristics and the resulting benefit for candidate annotations identified by the static analysis. Calpa selects the annotation that promises the best overall cost-benefit tradeoff. Calpa has found annotations of equal quality as those that had been previously been found by hand, however, at a fraction of the time -- minutes or hours of machine time versus days or weeks of human time, which it had taken us previously to annotate some of our benchmarks by hand.
Recently, I have looked at the run-time behavior of pointers and the potential applications of dynamic pointer information. Jointly with Manuvir Das I showed that the results of some static points-to analysis algorithms are one to two orders of magnitude worse than dynamically observed pointer behavior (published in PASTE'01). This has applications beyond optimization, particularly in program understanding and its applications, such as program maintenance and evolution. Together with Darren C. Atkinson I have explored how dynamic pointer information can improve program slicing for C (appears in Foundations of Software Engineering '02).