Distributed Databases
Initially, distributed systems still maintained a single central
database. However, the use of distributed data soon developed, because
with a central database a large amount of data had to be moved through
the network, and a mainframe or telecommunications failure isolated the
local processor from the data that it needed.
In a distributed database system, elements of the database are
distributed throughout the network and are stored at the location
where they are needed. Various approaches are used to implement this.
- Partition. The database is partitioned, with each node on the
network holding the section of the database that relates to it, for
example the records of customers served at that node. Other data is
held centrally, and any changes to central data can be dealt with
overnight by a batch update.
An extreme case of this approach is when the entire database is
replicated to local processors. Here again the central copy will be
updated by batch processing, probably overnight.
Both of these approaches allow copies of the data to become
inconsistent between batch updates, and will therefore be unsuitable
for applications such as holiday bookings, where data changes at one
node need to be available to other nodes immediately. They will,
however, work well in situations where local data processing is
compartmentalised and has no immediate effect on other nodes. An
example of this is supermarket stock control.
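
The effect of deferring changes to a batch run can be shown with a small
sketch. This is only an illustration under stated assumptions: the class
names (CentralStore, Node), the method names and the sample keys are all
hypothetical, and the "overnight" batch is reduced to a single method
call.

```python
class CentralStore:
    """Holds shared data plus a queue of changes awaiting the batch run."""

    def __init__(self, shared):
        self.shared = dict(shared)
        self.pending = []  # changes submitted by nodes during the day

    def submit(self, key, value):
        # Changes are only queued; nothing is applied until run_batch().
        self.pending.append((key, value))

    def run_batch(self):
        # The "overnight" update: apply queued changes in arrival order.
        for key, value in self.pending:
            self.shared[key] = value
        self.pending.clear()


class Node:
    """A site holding its own partition plus a copy of the shared data."""

    def __init__(self, name, central):
        self.name = name
        self.central = central
        self.local = {}                          # this node's partition
        self.shared_copy = dict(central.shared)  # snapshot of central data

    def update_shared(self, key, value):
        self.shared_copy[key] = value    # visible at this node at once
        self.central.submit(key, value)  # other nodes wait for the batch

    def refresh(self):
        # Pick up whatever the last batch run produced.
        self.shared_copy = dict(self.central.shared)


central = CentralStore({"price:tea": 2.50})
north = Node("north", central)
south = Node("south", central)

north.local["customer:17"] = "A. Smith"  # partitioned data stays local
north.update_shared("price:tea", 2.75)

stale = south.shared_copy["price:tea"]   # south still sees the old value

central.run_batch()
south.refresh()
fresh = south.shared_copy["price:tea"]   # consistent again after the batch
```

Between the update at one node and the batch run, the two nodes disagree
about the shared value. That window is acceptable for compartmentalised
work but not for something like a booking system.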
- Another approach is to hold only one working copy of the data, with
each node storing the data that is most closely associated with it. The
database is in fact distributed throughout the network. If a node needs
access to records that are not held locally, they are obtained through
the network, possibly by first consulting a central index to find the
location of the data. Software handles access to the database so that
the fact that it is spread over a number of sites is not apparent to
the user.
This approach requires more constant and heavier use of the network,
but it eliminates the problems of data redundancy and it removes the
need for overnight reconciliation.
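
The single-copy approach with a central index might be sketched as
follows. This is a minimal illustration, not a description of any
particular product: the names Site, DistributedDatabase, put and get are
hypothetical, and the network is replaced by a direct in-memory lookup
that is simply counted.

```python
class Site:
    """One node on the network, holding the records closest to it."""

    def __init__(self, name):
        self.name = name
        self.records = {}


class DistributedDatabase:
    """Hides from the caller which site actually holds a record."""

    def __init__(self, sites):
        self.sites = {site.name: site for site in sites}
        self.index = {}        # central index: record key -> site name
        self.remote_reads = 0  # stands in for network traffic

    def put(self, home_site, key, value):
        # Exactly one working copy is stored, at the record's home site.
        self.sites[home_site].records[key] = value
        self.index[key] = home_site

    def get(self, requesting_site, key):
        local = self.sites[requesting_site].records
        if key in local:
            return local[key]      # served locally, no network use
        owner = self.index[key]    # consult the central index
        self.remote_reads += 1     # fetch across the network
        return self.sites[owner].records[key]


db = DistributedDatabase([Site("leeds"), Site("york")])
db.put("leeds", "booking:1", "Room 4")

from_leeds = db.get("leeds", "booking:1")  # local read
from_york = db.get("york", "booking:1")    # remote read via the index
```

Both calls return the same record, so the caller cannot tell where it is
stored. Only the second call adds to network traffic, which is the
trade-off described above: no redundant copies, but more network use.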
Advantages of distributed databases include:
- Faster response to local queries
- Reduction in amount of network traffic
- Effect of breakdowns is minimised
- Better local control over the system
- Less powerful, cheaper processors needed
There are, however, some disadvantages in the use of distributed
databases:
- Increased tendency to data redundancy and data inconsistency.
- System is dependent on high quality telecommunication lines, which
may be vulnerable.
- Need to maintain and enforce consistent standards and data
definitions over a wide area.
- Increased security problems: security procedures must be enforced
over a wider area, and data transmission adds further risks.