
Introduction

 

A major challenge in the design of high-speed integrated services networks is the implementation of a bounded-delay service, that is, a communication service with deterministically bounded delays for all packets in a connection. In such a network, the number of connections with a bounded-delay service requirement that can be supported is mostly determined by (i) traffic characterizations used to describe the traffic of connections, (ii) the packet scheduling disciplines at each server or switch in the network, and (iii) the accuracy of the delay analysis used for connection admission control tests.

In order to guarantee that all connections can meet their deadline requirements, an effective and efficient method is needed to derive an upper bound on the end-to-end delay experienced by a connection's traffic. By effective, we mean that the method produces delay bounds that are relatively tight; a method that overestimates the delay bounds reduces the utilization of the network. By efficient, we mean that the method is simple and fast enough to be used as part of on-line connection admission control. During the past decade, a number of service scheduling disciplines that aim to provide per-connection performance guarantees have been proposed in the context of high-speed packet switching networks, such as Fair Queueing, Virtual Clock, Self-Clocked Fair Queueing, Stop-and-Go Queueing, Earliest Deadline First, Static-Priority Scheduling, Rate-Controlled Service Disciplines, and SCED Scheduling. Along with these service scheduling disciplines, various delay analysis techniques have been devised to evaluate upper bounds on the end-to-end delays experienced by connections in a network.

We can group these delay analysis techniques into two classes, depending on whether they decompose the network into isolated servers that are analyzed separately, or whether they integrate individual servers in the network into larger superservers. We therefore distinguish decomposition-based from service-curve based methods. A brief description of each follows.

Decomposition-Based Methods

The basic idea of any decomposition-based method is to partition the network into isolated servers and to base the end-to-end delay analysis on local delay analyses at these isolated servers. First, the traffic is characterized on a per-connection basis at each server inside the network; this local traffic depends on the source traffic of the connection and on the delay experienced by the traffic at previous servers. Next, the local delay bounds are computed independently. Finally, the upper bound on the end-to-end delay of the connection is computed as the sum of the local delay bounds at the individual servers on the path of the connection. The fundamental approach was proposed in [8, 9] and has been widely adopted in various forms.
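The summation step above can be sketched in a few lines; this is an illustrative sketch, and the per-hop bounds are hypothetical inputs that would in practice come from a local delay analysis at each server:

```python
# Minimal sketch of the decomposition-based computation: the end-to-end
# delay bound is the sum of the local worst-case delay bounds along the
# connection's path. The per-hop values below are hypothetical; in
# practice each would come from a local delay analysis at that server.

def end_to_end_delay_bound(local_delay_bounds):
    """Sum the local worst-case delay bounds of the servers on the path."""
    return sum(local_delay_bounds)

# Hypothetical local bounds (in ms) for a three-server path:
print(end_to_end_delay_bound([2.0, 3.5, 1.5]))  # 7.0
```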

Decomposition-based methods are very simple to use and are suitable for networks with arbitrary topology. On the other hand, they often overestimate the end-to-end delay suffered by a connection's traffic and so reduce network resource utilization. This is because the approach assumes that a packet suffers the worst-case delay at every server along its path. This assumption is conservative: while a packet may suffer the worst-case delay at one server, it may not incur the worst-case delay at a successive server. It follows that some real-time connections may be rejected by a decomposition-based admission control algorithm even though the network could guarantee their QoS requirements.

Service-Curve Based Methods

The basic idea in service-curve based methods is to find a representation of a sequence of servers on the path of a connection as a single server. Successive servers are thereby integrated, and dependencies between delays at successive servers can be taken into account. Servers are represented by their service curve $S_i^k(t)$, which defines the minimum amount of service (in bits transferred) that server k can give to a particular connection i during the time interval [0, t) [10, 5].
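As a concrete illustration (our own example, not taken from the text above), a widely used service-curve shape is the rate-latency curve, which guarantees a service rate r once an initial latency T has elapsed; the parameter names r and T are our assumptions:

```python
# Illustrative rate-latency service curve S(t) = max(0, r * (t - T)):
# the server guarantees service at rate r after an initial latency T.
# The parameter names r and T are our own, not the paper's.

def rate_latency_curve(r, T):
    """Return a service curve guaranteeing rate r after latency T."""
    return lambda t: max(0.0, r * (t - T))

S = rate_latency_curve(r=10.0, T=2.0)
print(S(1.0))   # 0.0  (still within the latency period)
print(S(5.0))   # 30.0 (= 10 * (5 - 2))
```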

Cruz [10] describes how the service curve can be used to effectively evaluate the end-to-end delay suffered by a connection. Suppose that Connection i passes through m servers and that the k-th server offers the connection a service curve $S_i^k(t)$. Furthermore, suppose that the amount of traffic entering the network on Connection i during the time interval [0, t) is bounded by $F_i(t)$. Then the end-to-end delay of Connection i is bounded by

$$ D_i \;\le\; \max_{t \ge 0} \, \min \left\{\, d \ge 0 \;:\; F_i(t) \le S_i^{net}(t + d) \,\right\} $$

where $S_i^{net}$ is called the network service curve of Connection i and is defined as

$$ S_i^{net}(t) \;=\; \left( S_i^1 \otimes S_i^2 \otimes \cdots \otimes S_i^m \right)(t) \;=\; \min_{\substack{t_1 + \cdots + t_m = t \\ t_k \ge 0}} \; \sum_{k=1}^{m} S_i^k(t_k) $$
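On a discrete time grid, these two steps can be sketched directly: the network service curve is the min-plus convolution of the per-server service curves, and the delay bound is the maximum horizontal distance between the arrival bound and the network service curve. This is an illustrative sketch under assumed curve shapes and parameter values, not code from the paper:

```python
# Sketch of the service-curve delay computation on a discrete time grid.
# Curves are lists indexed by time slot; all parameter values below are
# illustrative assumptions.

def min_plus_convolve(f, g):
    """(f (*) g)(t) = min over 0 <= s <= t of f(s) + g(t - s)."""
    n = min(len(f), len(g))
    return [min(f[s] + g[t - s] for s in range(t + 1)) for t in range(n)]

def network_service_curve(curves):
    """Convolve the per-server service curves along the connection's path."""
    result = curves[0]
    for c in curves[1:]:
        result = min_plus_convolve(result, c)
    return result

def delay_bound(F, S_net):
    """Max over t of the smallest d with F(t) <= S_net(t + d)."""
    worst = 0
    for t in range(len(F)):
        d = 0
        while t + d < len(S_net) and S_net[t + d] < F[t]:
            d += 1
        worst = max(worst, d)
    return worst

H = 20                                          # time horizon (slots)
S_k = [max(0, 2 * (t - 2)) for t in range(H)]   # rate 2, latency 2 per server
S_net = network_service_curve([S_k, S_k])       # two servers in tandem
F = [4 + t for t in range(H)]                   # burst 4, rate 1 arrival bound
print(delay_bound(F, S_net))                    # 6 slots
```

Convolving two rate-2, latency-2 curves yields a rate-2, latency-4 network service curve, and the worst horizontal gap to the arrival bound occurs at t = 0.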

Service curves can be used in two ways for delay computation, depending on whether scheduling algorithms are derived from pre-defined service curves, or whether service curves are derived from pre-defined scheduling algorithms.

Allocated Service Curve Method

First, service curves are assigned to every connection at each server. Then, the end-to-end delay bound is derived based on the source traffic characterization and network service curve, which can be computed from the service curves of all servers on the path of the connection. The scheduling disciplines on the servers can be synthesized in a separate step from the service curves that were assigned earlier. See [10, 31] for some examples.

In theory, this method fully utilizes network resources and can be applied to networks with arbitrary topology. However, the scheduling discipline synthesized from the service curves always relies on a dynamic priority assignment. The scheduling overhead is therefore not negligible and will itself impair the utilization of network resources.

Induced Service Curve Method

As opposed to the allocated service curve method, here servers are assigned scheduling disciplines first. Then, service curves are derived for each server based on the local server scheduling discipline. Next, the network service curve is derived based on these service curves. Finally, the end-to-end delay bound is derived based on the source traffic characterization and the network service curve [28].

Once the service curve is known for the scheduling disciplines in the system, delay analysis is straightforward. Unfortunately, except for guaranteed-rate scheduling algorithms [18], deriving service curves is very difficult, if not impossible. This is indeed the case for static-priority (SP) schedulers, simple earliest-deadline-first (EDF) schedulers, and first-in-first-out (FIFO) schedulers. In this paper we will derive an approximation for the service curve of a FIFO server and use it to compare the performance of a service-curve based approach with the integrated approach we present.

Integrating Servers

As noted above, neither of the two general approaches to end-to-end delay computation works well for non-guaranteed-rate scheduling disciplines. Decomposition-based approaches overestimate end-to-end delays for all disciplines because they do not take into consideration self-regulating effects as traffic traverses the network on common paths. Service-curve based approaches work well for guaranteed-rate disciplines, but fail for other disciplines. Indeed, we will illustrate later with an example of a chain of FIFO servers (Section 4) that service-curve based approaches can perform substantially worse than decomposition-based ones.

In this paper we propose an integrated approach to analyzing networks of non-guaranteed-rate servers. The general approach is to determine an accurate integrated service description for a collection of servers. Similar to the network service curve described earlier, this description allows output traffic descriptors to be computed for connections leaving the collection of servers under consideration. End-to-end delays can then be computed by partitioning the network into collections of servers and applying a decomposition-based method to these collections instead of to individual servers, thus greatly reducing the amount of over-estimation occurring in the delay computation.

We describe the new delay analysis method in Section 2 for a simple subnetwork containing two servers. While the approach itself is generic for a large class of service disciplines, we focus our attention on systems with FIFO servers. In Section 3 we apply the results of Section 2 to define an algorithm for end-to-end delay computation. In Section 4 we provide a detailed evaluation of the new algorithm by comparing it with a decomposition-based and a service-curve based algorithm. Section 5 concludes the paper.



Riccardo Bettati
Wed Jul 14 18:25:49 CDT 1999