
Managing Redundancy in Distributed Computer
Networks A State Transition Graph Approach for the
Stashing Problem.
Managing redundant information is becoming an important
issue in today's increasingly large distributed
computer networks. As total redundancy is extremely
costly to achieve, it has been proposed to keep
perfectly updated information only at the servers,
while keeping old copies of that information on
local computers. For such copies to be useful, a
maximum lifetime length is assigned to them. Before
the lifetime has elapsed, the local devices must
be stashed with a new updated copy. The problem
of optimizing the updates so that the maximum lifetime
length constraints are respected has been previously
formulated as a binary problem and proved to be
NP-hard through a reduction to the Steiner tree
problem in graphs. In this paper we explore the
properties of another formulation, based on a state
transition graph approach. We prove that only a
subset of states and transitions will be in the
optimal solution and that, thanks to those properties,
it is possible to greatly reduce the size of the
graph. A solution algorithm that is based on an
efficient evaluation of similar Steiner tree problems
with similar properties is presented. We discuss
extensions of this problem to future applications
of broadband multicast services.
(Received October 1994; revision received July
1996; accepted August 1996)
As computer networks evolve toward larger and more
distributed systems, and with the advent of the
world wide web and the so-called "information
super-highway," managing redundant information
is becoming a very prominent problem. In the computer
science literature, the idea of stashing information
locally as a means of maintaining redundancy goes
back to the end of the 1980s (Birrel 1988). Unfortunately,
perfect redundancy is difficult and costly to achieve
(Walker et al. 1983 and Mullender 1989); therefore,
some authors suggested that "less-than-perfect"
information be maintained locally to achieve a less
costly, though limited, type of redundancy (Alonso
and Cova 1990, Alonso et al. 1990. and Cova 1990).
The idea is that, for information that changes over
time (such as in databases or web pages), only key
computers called "servers" will have an
exact replica of the information, and that files
will be stashed locally on "client" computers
of the network. While the clients need not be updated
constantly, after a certain time or a certain number
of updates, the stashed file must be replaced by
a new, updated copy. In the computer network implementation
of this idea (Cova 1990) the local user is given
the flexibility to choose when the stashed files
must be updated. This flexibility is provided in
terms of what was called a "predicate,"
which can be seen as a maximum number of updates
(or the maximum time elapsed) for a local copy to
be replaced. Of course, this flexibility can be
transformed into a more complex system management
as the stashing strategy increases in complexity.
The problem of finding the optimal stashing strategy
in a fixed length of time so that the clients predicate
requirements are met was given the name Dynamic
Predicate Stashing Copy Problem (DPSCP) by Sanso
and Soumis (1993). In their work, they formally
defined the Problem in mathematical terms through
a binary programming formulation. As the problem
was classified as NP-hard and the binary formulation
was not appropriate for the steady state, the authors
suggested that a state transition graph approach
be explored. The purpose of this paper is to present
a new algorithm based on a systematic exploration
of the properties of the transition graph and on
an efficient exploitation of the similarities of
the Steiner tree substructures that must be solved
in order to find the global solution.
The problem we present is closely related to the
future expansions of broadband multicast services.
The term multicast (Halsall 1996) is generally used
to define a broadcast connection linking a particular
computer to a subset of users. The connection is
carried out through a tree of minimum cost, therefore
the different variants of the multicast problem
are associated to variants of the Steiner tree problem
in graphs (Plunket et al. 1994, Salama et al. 1995).
In foreseen Internet services (Clarck and Ammar
1995, Deering et al. 1994) it is being proposed
to use the multicast protocol to provide subset
of users with copy of key information (web pages,
for instance), at particular instants of time. If
the users are given the choice of the maximum acceptable
time before refreshing their pages, then the problem
becomes a variant of the DPSCP.
This paper is organized as follows. In Section
1 the problem is formally defined, and some normalization
properties of the problem are described. The state
transition graph formulation is presented in Section
2, where the size and some properties of the transition
graph are discussed. To be able to produce an efficient
algorithm, it is necessary to prove that not all
potential transitions can be used in the optimal
solution. The properties that yield such results
are proven in Section 3. The state graph construction
algorithm and the solution approach for the problem
are taken up in Section 4. Computational results
are presented in Section 5. Conclusions, suggestions
for further work, and additional comments about
the implication of the problem can be found in Section
6.