Compiler-Directed Energy Management for 

Time-sensitive Embedded Applications


 

Problem

Reducing device energy has become one of the most important challenges to embedded systems designers. Processors with dynamic voltage scaling have been introduced that permit trading performance for reduced energy consumption on-the-fly as a program executes. 

In this work, we present a novel hybrid scheme that uses dynamic voltage scaling to adjust the performance of embedded applications to reduce energy consumption while also meeting time constraints. Our fine-grained approach uses the compiler to insert power management hints in the application code. These hints convey path-specific run-time information about the program’s progress to power management points invoked by the operating system that adjust processor performance.

Methodology

 
At compile time, the compiler inserts PMH in the application’s code (based on profiled timing information) for computing the WCR at each of these hints. 
At run time, the OS periodically interrupts the application to set a new voltage/frequency based on the dynamic behavior of the application. This behavior is known through the WCR set by the complier 

Step 1: Collect Timing Information
Goal: Collect fine-grain timing information about each executed path in the CFG

Methodology: Divide the control flow graph (CFG) into regions. A region is the aggregation of one or more basic block. The region segmentation reduces the size of profiled information (compared to profiling on the BB level) and improves the worst-case estimation precision.

Step 2: Set Interrupt Interval
 Goal: Find the optimal number of executed PMP in an application to achieve minimum energy consumption.

Methodology: Use the theoretical solution presented in [2] to get the optimal program segment size in cycles. The segment size represents the interrupt interval length.  

Step 3: Insert PMH in application’s code
Goal: Insert PMHs no further apart than size of the interrupt interval in worst case scenario 

Methodology: Use a PMH algorithm that traverses an application CFG, covering all the paths. The algorithm uses a cycle counter that get incremented by the worst-case cycles of each traversed region in the CFG. A hint is inserted before the cycle exceeds the interrupt interval length. 

Step 4: Compute Worst Case Remaining Cycles 
Goal: Compute dynamically (at run-time) the WC remaining cycles for the application at each of the inserted hints based on the followed execution path 

Methodology: the compiler inserts in the application the code necessary for computing run-time evaluated hints. Run-time computed hints are placed

1.      Inside loops – because the WCR inside the loop is dependent on the loop iteration counter.

2.      At procedure calls – because the WCR for a procedure instance is dependent on the execution path, this specific procedure instance is called from.   

 

Results

The results presented below show a reduction of the energy consumption up to 56% and 47% over the no power management and static power management respectively for the Transmeta Crusoe processor model. The results also show a reduction up to 79% and 50% over the no power management and static power management respectively for the Intel XScale processor model.

Automatic Target Recognition application:

·        ATR results (Transmeta and XScale)

MPEG2 video decoder:

·        Transmeta results

·        XScale results

MELP audio encoder:

(Graphs will be posted soon)

Publications

[1]   N. AbouGhazaleh , B. Childers, D. Mossé, R. Melhem, M. Craven. “Collaborative Compiler-OS Power Management for Time-Sensitive Applications”. Technical Report TR-02-103, 2002. 

[2]   N. AbouGhazaleh, D. Mossé, B. Childers and R. Melhem ``Toward the Placement of Power Management Points in Real Time Applications'', COLP'01 (Workshop on Compilers and Operating Systems for Low Power), Barcelona, Spain, 2001