PROJECT 1: WebServer Cluster

Announcement


Overview

In this project, you will design policies to achieve the "best" service in a web server cluster. The web server used in this system is Apache, which supports HTTP/1.1. The cluster contains n machines:

  1. one is a FrontEnd, which functions as the main webserver (e.g., www.cs.pitt.edu) to which requests are sent
  2. k are clients, which send out web requests
  3. n-k-1 are servers (all of which contain the same data) that respond to client requests.  In other words, all servers can be used interchangeably.

The entire working procedure is as follows (see the following figure for intuition; each color represents the life cycle of one request):

 

1. Each client sends its web requests to the FrontEnd, which chooses the server that the client will use. The server is chosen according to the distribution policies you will implement. After choosing a server, the FrontEnd sends a redirection message back to the client (arrows with dotted lines); the client then redirects its request to the specific server chosen by the FrontEnd.

2. After receiving the redirection messages from the FrontEnd, clients establish connections with the assigned servers (arrows with solid lines).
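The two steps above can be sketched as follows. This is only an illustration: it assumes the redirection message is a plain HTTP/1.1 302 response and uses a uniformly random choice as a placeholder policy. The function names `choose_server` and `redirect_response` are hypothetical and are not part of the provided Apache modules, which may implement redirection differently.

```python
import random


def choose_server(on_servers, rng=random):
    """Placeholder policy: pick one of the currently ON servers at random."""
    if not on_servers:
        raise RuntimeError("no server is currently ON")
    return rng.choice(on_servers)


def redirect_response(server, path):
    """Build a minimal HTTP/1.1 302 response (assumed redirection format)
    that points the client at `server` for the same `path`."""
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: http://{server}{path}\r\n"
        "Content-Length: 0\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    servers = ["l5.mate.cs.pitt.edu", "l6.mate.cs.pitt.edu"]
    print(redirect_response(choose_server(servers), "/index.html"))
```

The client then opens a new connection to the host named in the `Location` header, which corresponds to the solid arrows in step 2.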

The idea is to minimize the energy consumed by the webcluster while complying with performance constraints imposed by the clients.


Experiment Environment

The machine cluster you will use is the MATE cluster. Its 15 machines are named l1.mate.cs.pitt.edu, l2.mate.cs.pitt.edu, ..., l15.mate.cs.pitt.edu.

Log on to each machine using your CS AFS account. Students who don't have a CS account need to apply for one through the CS webpage.

All experiments will be run on Apache. More details are here.


Algorithms Design

There are two main components of the FrontEnd, namely (a) request distribution (or server assignment); and (b) energy conservation.

(a) When the FrontEnd receives a request, it should decide which server will respond to this request, and tell the client which server to connect to. The request distribution policy you will implement on the FrontEnd will take into account the power/energy consumed by the entire system, the workload, and the request response time.

(b) The second aspect of the FrontEnd is to choose the "best" number of servers to be on at any time, in order to satisfy the performance requirements and at the same time, minimize energy consumption by the cluster.

Intuitively, if there are few requests, not all servers need to be ON; some of them can be OFF. But when the number of requests exceeds the capacity of the ON servers, the response time suffers and one or more additional servers should be turned ON.
Also, if the response time is too long, even with a small number of requests, you may turn some servers ON to reduce the response time.
Conversely, if the response time is shorter than required, a few servers can be turned OFF.
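A minimal sketch of this intuition as a threshold rule is shown below. Every parameter here is an assumption made for illustration: `capacity_per_server` (requests per second one server can sustain), `max_resp_time` (the performance requirement), and `slack` (how far below the requirement the response time must be before a server is turned OFF) are not values given by the project, and you would need to measure or choose them yourself.

```python
def adjust_server_count(n_on, n_total, load, resp_time,
                        capacity_per_server, max_resp_time, slack=0.5):
    """Return the number of servers that should be ON after one control step.

    load      -- observed request rate (requests/second)
    resp_time -- observed response time (seconds)
    """
    overloaded = load > n_on * capacity_per_server
    too_slow = resp_time > max_resp_time
    if (overloaded or too_slow) and n_on < n_total:
        return n_on + 1  # more processing power is needed

    # Turn a server OFF only if the remaining servers can still carry the
    # load and the response time is comfortably below the requirement.
    fits_on_fewer = load <= (n_on - 1) * capacity_per_server
    fast_enough = resp_time < slack * max_resp_time
    if fits_on_fewer and fast_enough and n_on > 1:
        return n_on - 1

    return n_on
```

Changing one server per step and requiring slack before turning a server OFF are design choices meant to avoid oscillating between ON and OFF; other rules are equally valid.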

NOTE that these two algorithms are not independent; they influence each other.


Implementation

You are expected to do the following in this project:

  1. Numbers of Clients and Servers in the cluster
    There are 15 machines in the cluster, 1 for FrontEnd, 14 for clients and servers.

    Individuals: You may assume that there are 4 clients and 10 servers in the cluster.

    Groups: You must determine the number of clients and servers that gives the best throughput in the cluster.  You will need to carry out analysis and/or extensive experimentation to support your results.

  2. Algorithm for turning on/off server(s)

    You will implement this algorithm based on the workload in the cluster, that is, the requests sent by the clients. In practice, turning a server on or off does not happen immediately; it takes some time.

    Individuals: You may assume that a server can be turned on/off instantaneously and with no energy penalty.  Thus, in your algorithm, you don't need to predict the workload in the period following the turn on/off action.

    Groups: Each server takes t1 seconds to boot and t2 seconds to shut down. For example, if you turn on server a at time t, it will be on after t1 seconds (that is, at time t+t1), but in the period from time t to time t+t1 it is still off.  Therefore, during this interval no requests should be redirected to server a.
    This means that your algorithm cannot be based on the current instantaneous workload alone; you must define a workload prediction function that predicts the trend of the workload, and use it to decide whether to turn server(s) on or off at a given time.
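    One possible shape for such a prediction function is sketched below. It uses double exponential smoothing (a level plus a linear trend), which is a choice made for this illustration and not a method prescribed by the project. The class `TrendPredictor`, the helper `servers_needed`, the smoothing constants `alpha` and `beta`, and `capacity_per_server` are all hypothetical names and values.

```python
import math


class TrendPredictor:
    """Double exponential smoothing: tracks a level and a linear trend."""

    def __init__(self, alpha=0.5, beta=0.3):
        self.alpha = alpha   # weight of the newest observation
        self.beta = beta     # weight of the newest trend estimate
        self.level = None
        self.trend = 0.0

    def update(self, observed):
        """Feed one workload sample (e.g. requests/second in the last interval)."""
        if self.level is None:
            self.level = float(observed)
            return
        prev = self.level
        self.level = self.alpha * observed + (1 - self.alpha) * (prev + self.trend)
        self.trend = self.beta * (self.level - prev) + (1 - self.beta) * self.trend

    def forecast(self, steps_ahead):
        """Predicted workload `steps_ahead` sampling intervals from now."""
        if self.level is None:
            raise ValueError("no samples observed yet")
        return max(0.0, self.level + steps_ahead * self.trend)


def servers_needed(predicted_load, capacity_per_server, n_total):
    """Servers required for the predicted load, clamped to [1, n_total]."""
    needed = math.ceil(predicted_load / capacity_per_server)
    return min(n_total, max(1, needed))
```

    With a sampling interval of s seconds, forecasting about t1/s steps ahead estimates the load at the moment a server booted now would become usable. A server that is still booting must stay out of the set of redirect targets until its t1 seconds have elapsed.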

  3. Algorithm for choosing best servers to redirect requests to

    To keep energy consumption at a minimum, the algorithm in item 2) will turn off machines when the workload falls below a certain threshold.  However, this should not be the only criterion, because the load in item 2) is a predicted workload and not necessarily the real workload.  When the response time of requests goes beyond a certain bound, more processing power is needed to bring the response time down.

In this project we will use weighted average response time (WART) to evaluate the best policy.  This simply means that each type of request is assigned a weight, and the total response time is the sum of all weighted response times for each type of request. Clearly, the most important requests will get a larger weight.
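As an illustration, WART could be computed from measured response times as sketched below. The request types (`"static"`, `"dynamic"`) and their weights are made up for this example; the project does not specify them here. Dividing by the total weight, so that the result is an average and not a plain weighted sum, is also an assumption; pass `normalize=False` to get the weighted sum described above.

```python
def wart(samples, weights, normalize=True):
    """Weighted average response time.

    samples -- dict: request type -> list of measured response times (seconds)
    weights -- dict: request type -> weight (larger means more important)
    """
    total = 0.0
    weight_sum = 0.0
    for req_type, times in samples.items():
        if not times:
            continue  # no measurements for this type
        mean_time = sum(times) / len(times)
        total += weights[req_type] * mean_time
        weight_sum += weights[req_type]
    if not normalize:
        return total
    if weight_sum == 0:
        raise ValueError("no weighted samples to average")
    return total / weight_sum


if __name__ == "__main__":
    samples = {"static": [0.1, 0.3], "dynamic": [1.0]}   # hypothetical data
    weights = {"static": 1, "dynamic": 3}                # hypothetical weights
    print(wart(samples, weights))
```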

2. Modules involved in this project

The Apache modules involved in this project are in the mod_backhand-1.2.2-new, mod_cupfreq_new and mod_on_off folders, especially mod_backhand.c/mod_backhand.h, mod_cupfreq.c/mod_cupfreq.h, and mod_on_off.c/mod_on_off.h. (More details about these modules)

3. Tips for implementation

code1 for random distribution policy.

code2 for sending out requests and collecting response times. You can use this code directly.


Submission Issues

Due Date: Feb 20, 5pm

You should submit a report describing how you implemented the algorithms (explaining the important parts of your code in particular), how your code is compiled and executed (a README and a Makefile are required), which files you modified and created, and your experimental results (the more graphs, the better).  All results and your report should be under your own directory. You must email us the path before the deadline and give mosse and changliu read, write and modify permission on your project folder. To do this, you can issue the following commands:

fs sa directory_name mosse all
fs sa directory_name changliu all
fs sa directory_name system:anyuser none

Format of files:

1. The executable file for sending out requests must be named "distrib.exe".

2. The report must be named "yourlastname_report.doc".

Checklist for submission:

Report, the code files you modified and created, and a README file (if necessary)


 

Acknowledgment

The sample code and the Apache modules are provided by Cosmin. Thank you very much!