Multicast communication raises many data management issues that either do not arise in unicast communication or require solutions different from the standard methods used in unicast settings. As an illustrative example, consider a highly scalable Web server. The objective of the Web server application is to scale to a large client population, and this scalability is achieved through the middleware. In the middleware, the server can disseminate data using any combination of the following three schemes: multicast push, multicast pull, and unicast pull. In multicast push, the server repeatedly sends information to the clients without explicit client requests (television, for example, is a classic multicast push system). Multicast push is an ideal fit for asymmetric communication links, such as satellite and base station links, where there is little or no bandwidth from the client to the server. Because it requires no client requests, multicast push is also well suited to achieving maximal scalability for Internet hot spots. However, since repeated transmission consumes bandwidth whether or not any client currently wants the data, multicast push should generally be restricted to hot resources. In multicast pull, the clients make explicit requests for resources, and the server broadcasts the responses to all members of the multicast group. If multiple clients request the same resource at approximately the same time, the server may aggregate these requests and broadcast the resource only once. One would expect this aggregation to improve user-perceived performance for the same reason that proxy caches do: different users commonly request the same resource. Multicast pull is a good fit for ``warm resources'', for which repetitive multicast push cannot be justified but for which there is still an advantage in aggregating concurrent client requests. Traditional unicast pull is reserved for cold documents.
The end-user should not perceive that Web resources are downloaded with a variety of methods, as the browser and the middleware shield the user from the details of the multi-tier dissemination protocol.
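The request aggregation described above for multicast pull can be illustrated with a minimal sketch. The class name `PullAggregator`, its methods, and the first-come-first-served order are assumptions made for illustration only; the text does not specify how the server queues or orders pending requests.

```python
from collections import OrderedDict


class PullAggregator:
    """Hypothetical queue that merges concurrent requests for the same URI."""

    def __init__(self):
        # Maps each pending URI to the set of clients waiting for it,
        # kept in order of the first request for that URI.
        self.pending = OrderedDict()

    def request(self, client_id, uri):
        # A repeated request for a pending URI only adds a waiter;
        # it does not schedule a second transmission.
        self.pending.setdefault(uri, set()).add(client_id)

    def next_broadcast(self):
        # Pop the oldest pending URI (an assumed FIFO policy).
        # A single broadcast of it serves every waiting client.
        if not self.pending:
            return None
        return self.pending.popitem(last=False)


agg = PullAggregator()
agg.request("client-1", "/news.html")
agg.request("client-2", "/news.html")
agg.request("client-3", "/sports.html")
first = agg.next_broadcast()
second = agg.next_broadcast()
```

Here three client requests result in only two broadcasts, since the two requests for `/news.html` are served by one transmission.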
In the Web server application, the document selection unit periodically gathers statistics on document popularity. Once statistics have been collected, the server partitions the resources into hot, warm, and cold documents. When a client wishes to retrieve a Web document, it either downloads it from a multicast channel or requests it explicitly, depending on whether the document is hot or not. The server also broadcasts an index of sorted URIs, which allows the client to determine quickly whether the requested resource is in the hot broadcast set. In more detail, the client determines the multicast channel, downloads the appropriate portions of the index, and determines whether the resource is upcoming in the cyclic broadcast. If the resource is not in the hot broadcast set, the client makes an explicit request to the server and simultaneously starts listening to the warm multicast channel, if one is available. If the page is cold, the requested resource is returned on the same connection. If the page is warm, the client waits on the warm multicast channel until the requested resource is transmitted. The multicast pull scheduling component resolves contention among client requests for the use of the warm multicast channel and establishes the order in which pages are sent over that channel.
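The partitioning step and the client-side index lookup can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how popularity statistics are turned into hot, warm, and cold sets, so the two hit-count thresholds are hypothetical, as are the function names and the returned action labels.

```python
from bisect import bisect_left


def partition(hit_counts, hot_threshold, warm_threshold):
    """Split URIs into hot, warm, and cold lists by hit count.

    The threshold rule is an assumption; the source does not
    specify the partitioning criterion.
    """
    hot, warm, cold = [], [], []
    for uri, hits in hit_counts.items():
        if hits >= hot_threshold:
            hot.append(uri)
        elif hits >= warm_threshold:
            warm.append(uri)
        else:
            cold.append(uri)
    # Sorting the hot set yields the index of sorted URIs.
    return sorted(hot), sorted(warm), sorted(cold)


def in_hot_index(sorted_index, uri):
    """Binary search over the sorted URI index."""
    i = bisect_left(sorted_index, uri)
    return i < len(sorted_index) and sorted_index[i] == uri


def client_action(sorted_index, uri):
    """Decide what the client does for a requested URI."""
    if in_hot_index(sorted_index, uri):
        return "wait-on-push-channel"
    # Not hot: request explicitly and also listen on the warm channel,
    # since the client cannot tell warm from cold by itself.
    return "request-and-listen-on-warm-channel"


hit_counts = {"/index.html": 900, "/about.html": 40, "/old.html": 2}
hot, warm, cold = partition(hit_counts, hot_threshold=100, warm_threshold=10)
```

Because the index is sorted, the client needs only a logarithmic number of comparisons to decide whether a resource is in the hot broadcast set.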
In multicast push, the server periodically broadcasts hot resources to the clients. The server chunks hot resources into nearly equal-size pages, each of which fits into one datagram, and then cyclically sends them, along with index pages, on either a single or a layered multicast channel. The frequency and ordering of the pages within the multicast push channel are determined by the multicast push scheduling component. Upon receipt of the desired pages, the client can buffer them to reconstruct the original resource and can cache resources to satisfy future requests. The set of hot pages is cyclically multicast, and so received pages are current in that they cannot be more than one cycle out-of-date. Furthermore, certain types of consistency semantics can be guaranteed by transmitting additional information along with the control pages.
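The chunking and cyclic transmission just described can be sketched as below. The sketch assumes a flat cycle in which every page is sent once per cycle in URI order, which is only one possible outcome of the scheduling component; the page tuple layout `(uri, seq, total, payload)` and all function names are hypothetical, and index pages are omitted.

```python
from itertools import cycle, islice


def chunk(data, max_page_size):
    """Split data into nearly equal-size pages, none above max_page_size."""
    n_pages = max(1, -(-len(data) // max_page_size))  # ceiling division
    base, extra = divmod(len(data), n_pages)
    pages, start = [], 0
    for i in range(n_pages):
        # The first `extra` pages carry one additional byte.
        size = base + (1 if i < extra else 0)
        pages.append(data[start:start + size])
        start += size
    return pages


def build_cycle(resources, max_page_size):
    """Lay out all pages of all hot resources as one broadcast cycle."""
    cycle_pages = []
    for uri in sorted(resources):
        pages = chunk(resources[uri], max_page_size)
        for seq, payload in enumerate(pages):
            cycle_pages.append((uri, seq, len(pages), payload))
    return cycle_pages


def reassemble(received, uri):
    """Rebuild a resource from buffered pages, or return None if incomplete."""
    parts = {seq: payload for (u, seq, _, payload) in received if u == uri}
    totals = {total for (u, _, total, _) in received if u == uri}
    if len(totals) != 1 or len(parts) != totals.pop():
        return None
    return b"".join(parts[i] for i in range(len(parts)))


resources = {"/a": b"hello world", "/b": b"xyz"}
pages = build_cycle(resources, max_page_size=4)
# A client that tunes in mid-cycle still sees every page within one cycle.
heard = list(islice(cycle(pages), 2, 2 + len(pages)))
rebuilt = reassemble(heard, "/a")
```

The client in this example starts listening at the third page of the cycle and still reconstructs `/a` after hearing one full cycle's worth of pages, which reflects the bound that a received resource is at most one cycle out-of-date.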