Reference no: EM133197379
Imagine that we have a network of processors, each one of which needs access to a common database in order to carry out its work (for example, a network of banking machines accessing client information). To simplify, assume all operations are queries (no operation changes the data). Our problem is to determine which processor(s) should store the database (these processors will be called "servers") in order to have reasonably fast access time (including the load on each server), as well as having as little duplication of the database as possible.
At one extreme, each processor could store its own copy of the database. This would ensure the fastest possible access time, but at the cost of replicating the database many times. At the other extreme, the database could be stored in a single server (all other processors would connect to the server to access the data). This would ensure the smallest amount of duplication, but at the cost of longer access times and server load. In either case, there is a significant cost if the network is large.
A reasonable middle ground between the two extremes would be to fix a distance parameter d and to store the database on as few servers as possible, chosen so that every processor is within distance d of at least one server (the distance is measured as the minimum number of links required to send communications from one processor to the next). Unfortunately, it seems impossible to solve this problem efficiently.
Formally, show that the following SERVERLOCATION problem is NP-complete by first showing that SERVERLOCATION is in NP and then showing that it is NP-hard. You may use the NP-complete problem from the following SUBSET-SUM for your reduction where the SUBSET-SUM is an NP-Complete problem defined as follows:
SUBSET-SUM:
Input: Set of Integers S= {w1, ... , wn}, integer W
Output: Is there S' ⊆ S that adds up exactly to W?
And the SERVERLOCATION problem is defined as follows:
Input: An undirected graph G (the network of processors) and non-negative integers k and d.
Output: Is there some subset S of no more than k vertices (the "servers") such that every processor is within distance d of some server in S? (The "distance" between vertices u, v is measured as the number of edges on a shortest path from u to v, or infinity (∞) if there is no path from u to v.)
Formally, show that the following SERVERLOCATION problem is NP-complete by first showing that SERVERLOCATION is in NP and then showing that it is NP-hard