Proceedings of the IASTED International Conference Software Engineering and Applications (SEA'99), October 6-8, 1999, Scottsdale, Arizona, USA

AN OBJECT-ORIENTED APPROACH FOR MANAGING A NETWORK OF DATABASES

Shu-Ching Chen
School of Computer Science, Florida International University, Miami, FL 33199

Mei-Ling Shyu
School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907

Chi-Min Shu
Department of Environmental Safety Engineering, National Yunlin University of Science and Technology, Yunlin, Taiwan, R.O.C.

ABSTRACT

A large scale network may consist of hundreds of disparate and autonomous databases. Users in such an information-providing environment usually access information from those databases in the same or similar application domains, so there is no need to handle all the entities from all the databases. That is, in most cases, only a subset of the databases is required for the users' requests. As the number of databases increases, the need to manage such a network of databases increases. In this paper, we present a split/cluster approach using the object-oriented technique to allow users to incrementally and dynamically access the information they want without being overwhelmed with all of the unstructured information. The approach is based on the affinity relationships of the databases and is performed recursively to split these databases into clusters. Then a cluster hierarchy is formed to provide different levels of abstraction for the users. This framework provides a flexible means of sharing information to all the databases. Theoretical terms along with a running example are presented.

Key words: Object-oriented databases, affinity, clustering, splitting.

1. INTRODUCTION

Integrating heterogeneous databases is a challenging problem since incompatibilities exist among all the databases [3]. To provide as transparent a database schema as possible, conflicts need to be resolved before it can provide a view to the users. A number of researchers have investigated the problem of integrating heterogeneous databases [1] [2] [5]. However, the issues of conflict resolution are not discussed in this paper, since we focus on managing the network of databases to help users better utilize the information in the databases.

In such a large scale database network, queries tend to traverse data that relate to the same or similar application domains and reside in different databases. Most of the queries request information from a small fraction of the databases in the network without the need to show all the entities of all the databases. This motivates us to split the network of databases recursively into beneficial clusters based on the access behavior of application queries. An example of grouping close to our concept is the Internet [4]. The Internet is a computer network consisting of several connected subnetworks. Every subnetwork follows its own communication protocols and is usually set up to serve some special purposes. One difference between these two concepts is that all the subnetworks provide almost the same set of information, while each cluster in the proposed approach can provide diverse sets of information.

In this paper, an object-oriented split/cluster approach is proposed. The object-oriented paradigm is adopted since things in the world around us have properties or features; we can think of data as an object class with its defining attributes. The affinity measures between every pair of databases are formalized and calculated based on the access behavior of application queries. Each query may be activated several times, and hence each query has its access frequency. Therefore, the access frequency of a query per time period should be taken into account in the affinity measures. The splitting procedure is based on the affinity relationships of the databases and is performed recursively to split these databases into clusters. After the split/cluster step, a cluster hierarchy is generated.
The cluster hierarchy provides different levels of abstraction and hence allows users to incrementally and dynamically access the pieces of information they want without being overwhelmed with all of the unstructured information. The constructed clusters can be used as the unit not only for query processing but also for discovering object-oriented relationships such as superclass, subclass, and equivalence relationships, which is the subject of a forthcoming paper. Users who wish to access only parts of the databases can access the data from the appropriate clusters without going through the whole network of databases. In other words, the proposed approach provides a flexible means of sharing information to all the databases.

This paper is organized as follows. In Section 2, the proposed object-oriented approach with relative affinity formulations and the split/cluster procedure is introduced. A simple example is given to illustrate the steps of the split/cluster procedure. Section 4 concludes this paper.

2. PROPOSED OBJECT-ORIENTED APPROACH

A set of historical queries issued to the databases in the network is used a priori for the split/cluster procedure. We use relative affinity values to measure how frequently two databases have been accessed together in the set of historical queries. Realistically, it cannot be expected that user applications are able to specify these affinity values, and hence formulas need to be defined.

2.1. Relative Affinity Measures

Let $Q = \{q_1, q_2, \ldots, q_q\}$ be the set of queries that run on the set of databases $D = \{d_1, d_2, \ldots, d_d\}$ in the large scale database environment. Define the variables:

$use_i(\cdot)$ = a vector of length $q$ indicating the usage patterns of $d_i$ with respect to all the queries in $Q$. For each database $d_i$, $use_i(\cdot)$ is defined as follows, and the $k$th entry of $use_i(\cdot)$ denotes the usage pattern of $d_i$ with respect to $q_k$:

$$use_i(q_k) = \begin{cases} 1 & \text{if object classes in } d_i \text{ are accessed by } q_k \\ 0 & \text{otherwise} \end{cases}$$

$access(\cdot)$ = a vector of length $q$ indicating the access frequencies of the queries in $Q$ per time period. The $k$th entry of $access(\cdot)$ denotes the access frequency for query $q_k$.

$$rel(i, j) = \sum_{k=1}^{q} use_i(q_k) \, use_j(q_k) \, access(q_k)$$

= the affinity value of databases $d_i$ and $d_j$.

M = a matrix of size $g \times g$ indicating the affinity measures of the databases in a group DB_GROUP_IJ with respect to all queries in $Q$, assuming DB_GROUP_IJ = $\{d_1, d_2, \ldots, d_g\}$. The rel(i, j) value is placed at the (i,j)th entry in M.
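The definition of rel(i, j) can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the function name `relative_affinity` and the three-database, four-query data set are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from typing import List


def relative_affinity(use: List[List[int]], access: List[int]) -> List[List[int]]:
    """Build the affinity matrix M with M[i][j] = rel(i, j).

    use[i][k] is 1 if database i is accessed by query k, else 0.
    access[k] is the access frequency of query k per time period.
    """
    n = len(use)
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # rel(i, j) = sum over k of use_i(q_k) * use_j(q_k) * access(q_k)
            value = sum(ui * uj * a for ui, uj, a in zip(use[i], use[j], access))
            # The product use_i * use_j is commutative, so rel(i, j) == rel(j, i).
            m[i][j] = value
            m[j][i] = value
    return m


# Hypothetical data (not taken from the paper): 3 databases, 4 queries.
use = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
access = [10, 5, 20, 2]
M = relative_affinity(use, access)
```

With this data, databases 0 and 1 share only the first query, so their affinity equals that query's access frequency, while databases 1 and 2 share no query and get affinity 0.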
Note that M is a symmetric matrix, so for simplicity each pair of databases is computed only once. The (i,j)th entry has the same result as the (j,i)th entry, and the (i,i)th entry is not used in the split/cluster procedure.

PP(i,j) = a closeness difference function which calculates the closeness difference between column i for $d_i$ and column j for $d_j$. Let O represent a temporary matrix in the split/cluster procedure which contains the first several columns of M. For every possible pair of neighbors $d_i$ and $d_j$, PP(i,j) is defined as follows:

PP(i,j) = M(1,i) - O(1,j) if $d_i$ is put to the left of $d_j$
PP(j,i) = O(1,j) - M(1,i) if $d_i$ is put to the right of $d_j$

2.2. Split/Cluster Procedure

The objective of the split/cluster procedure is to find several clusters of databases that are accessed together more frequently by the set of queries. For a large scale database environment, this split/cluster procedure should be invoked iteratively to form the cluster hierarchy. The split/cluster procedure takes the primary data as inputs, computes the entries of the matrix M, calculates the closeness difference values, permutes its columns, and then generates an updated matrix M. A function PP(i,j) is defined to calculate the difference of two affinity values of the nearby neighbors for each possible position $(d_i, d_j)$ based on the entries in M. The permutation is done by considering the minimum of the PP values for each database.

The PP(i,j) function is designed to be the closeness difference for two columns i and j. Let column i be the one that needs to be placed in the temporary matrix O, where O consists of the first several columns of matrix M. Column i can be placed on the left or right of column j in O. The main idea is to position column i in the place which satisfies two conditions: its affinity measure should be less than or equal to the affinity measure of its left neighbor and greater than or equal to the affinity measure of its right neighbor.
For the leftmost or the rightmost position of O, only one of the above two conditions is considered, because the column has only one neighbor in such cases. Each closeness difference value is checked to see whether it is less than zero. If so, that possible position is ignored, since a negative difference means the required conditions are not satisfied.

Since the procedure computes only the closeness differences of the nearby neighbors and considers the minimum of the differences, it tends to partition the matrix M into two clusters: one in the upper left corner and the other in the lower right corner. In general, the border for the split is not very clear-cut. For this purpose, a splitting phase is proposed to decide the split point. The splitting phase compares the mean value of the first column with each individual value in that column in M. If the individual value is greater than or equal to the mean value, then it belongs to the upper left corner group. Otherwise, it belongs to the lower right corner group. Two clusters can therefore be generated at each iteration. The mean value of the first column is chosen to be the splitting criterion since the first column tends to have the larger affinity values.

However, there must be some stopping criteria to end the iterations. There are two stopping criteria for each split/cluster procedure iteration: (1) when the size of a cluster is one, i.e., the number of databases in the cluster is one, and (2) when the size of a cluster is less than four. If one of the above conditions is satisfied, then there is no more splitting for that cluster, since it makes no sense to have a cluster with only one element in the cluster. Otherwise, each cluster executes the split/cluster procedure iteratively until one of the conditions is met. Initially, the split/cluster procedure is applied to all the databases in the network. The procedure is iterated until no more splitting is permitted.

Steps for the split/cluster procedure:

1. Preparation of the primary data: The primary data required are $access(\cdot)$ and $use_i(\cdot)$, where $i = 1, \ldots, d$ ($d$ is the total number of databases). These vectors are given a priori from a set of historical queries. However, since the application queries issued to the databases can be recorded per time period (say monthly or annually), the required data can be updated accordingly.

2. Computation of the entries in M: $rel(i, j) = \sum_{k=1}^{q} use_i(q_k) \, use_j(q_k) \, access(q_k)$, where $i, j = 1, \ldots, d$.

3. Determination of the cluster size: Each cluster in the cluster hierarchy is an input to the split/cluster procedure. The size of a cluster ($g$) is the number of databases in the cluster. Initially, $g = d$ since the input cluster consists of all the databases in the network. Assuming DB_GROUP_IJ = $\{d_1, d_2, \ldots, d_g\}$, $g \le d$ $\Longrightarrow$ the size of DB_GROUP_IJ = $g$.

4.
   for loop1 = 1 to g-1
       /* initialization of the matrix O */
       /* place the first loop1+1 columns of M into O */
       O(·,1) = M(·,1); O(·,2) = M(·,2); ... O(·,loop1+1) = M(·,loop1+1);
       for loop2 = loop1+2 to g
           For each column in the remaining g-(loop1+1) columns, calculate a PP vector for the loop2 possible positions for that column. Select the position by the minimum PP value. /* position selections */
           Once the position for the column is determined, permute the columns in O if necessary. Place the column into its corresponding position in O. /* column permutations */
       end
       Once the positions for the remaining g-(loop1+1) columns are determined, the permutations of the corresponding rows are performed so that the relative positions in O are maintained. /* row permutations */
       M = O /* update M */
   end

5. Splitting phase: Compute the mean value of the first column of the matrix M. This mean value is then used as the criterion for the splitting phase. Two clusters are generated from the matrix M after the splitting phase is applied.

6. Stopping criteria checking for each cluster generated in step 5: If the size of the cluster is one, then there is no splitting for this cluster; stop. Else go to step 2 for each cluster. If the size of the cluster is less than four, then there is no splitting for this cluster; stop. Else go to step 2 for each cluster.

7. Generating a cluster hierarchy: After all the clusters execute the split/cluster procedure and finish the stopping criteria checking, a cluster hierarchy for all the databases in the network can be created.

3. AN EXAMPLE

In this section, a simple example is used to illustrate the proposed split/cluster procedure. Once the network of databases is partitioned into several clusters and each cluster consists of one or more databases which have high affinity relationships, the cost of query processing can be reduced.

Example: Suppose there are 10 databases in the network and the historical data consists of 8 queries. Let $D = \{d_1, d_2, \ldots, d_{10}\}$ and $Q = \{q_1, q_2, \ldots, q_8\}$.
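Before the example data is introduced, the splitting phase and the stopping criteria from the procedure can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' code: the names `split_by_first_column` and `may_split`, the string labels, and the 4 x 4 matrix are hypothetical, and the matrix is assumed to have already gone through the column and row permutations.

```python
from typing import List, Tuple


def split_by_first_column(order: List[str], m: List[List[float]]) -> Tuple[List[str], List[str]]:
    """Split a permuted affinity matrix into two clusters.

    order[r] names the database in row r of m. A database goes to the
    upper-left cluster when its first-column value is >= the column mean,
    and to the lower-right cluster otherwise.
    """
    first_col = [row[0] for row in m]
    mean = sum(first_col) / len(first_col)
    upper = [name for name, v in zip(order, first_col) if v >= mean]
    lower = [name for name, v in zip(order, first_col) if v < mean]
    return upper, lower


def may_split(cluster: List[str]) -> bool:
    """Stopping criteria: a cluster of size one, or of size less than four, is not split."""
    return len(cluster) >= 4


# Hypothetical permuted matrix for four databases (values are illustrative only).
order = ["a", "b", "c", "d"]
m = [
    [90, 70, 10, 0],
    [70, 80, 0, 0],
    [10, 0, 40, 30],
    [0, 0, 30, 30],
]
upper, lower = split_by_first_column(order, m)
print(upper, lower)
print(may_split(upper), may_split(lower))
```

Here the first-column mean is 42.5, so "a" and "b" form one cluster and "c" and "d" the other; both resulting clusters have fewer than four members and would not be split again.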
Assume the following $use_i(\cdot)$, where $i = 1, \ldots, 10$, and $access(\cdot)$ values are the required primary data obtained from the set of historical data.

use_1(·) = [1 0 0 1 1 0 0 0];
use_2(·) = [0 0 0 1 0 0 1 0];
use_3(·) = [0 1 0 0 1 0 1 1];
use_4(·) = [0 0 1 0 0 0 0 1];
use_5(·) = [1 0 0 1 1 1 0 0];
use_6(·) = [0 0 1 0 0 0 0 1];
use_7(·) = [1 0 0 1 1 0 0 0];
use_8(·) = [0 1 0 1 1 0 0 0];
use_9(·) = [0 1 0 0 1 0 1 1];
use_10(·) = [0 0 1 0 0 0 0 1];
access(·) = [1 0 3 0 0 3 30];

With the availability of the primary data, the relative affinity values can be calculated. For example, the affinity measure for the entity M(1,) can be obtained in the following way.

M(1,) = rel(1, ) = $\sum_{k=1}^{8} use_1(q_k) \, use(q_k) \, access(q_k)$ = access($q_1$) + access($q_4$) + access($q_5$) = 1.

Similarly, all the rel(i,j) entities for M can be computed. The initial affinity measure matrix M is shown in Figure 1.

Figure 1: Initial Affinity Measure Matrix M. Each entity (i,j) in M has the relative affinity value rel(i,j), where i,j = 1 to 10.

As shown in Figure 1, each entity (i,j) in M has the value of rel(i,j), which indicates the relative affinity measure for databases $d_i$ and $d_j$. In addition, M is symmetric, so the entity (i,j) has the same value as the entity (j,i). For example, M(1,) and M(,1) have the same value 1.

Take the initial affinity measure matrix M and execute the first iteration, i.e., when loop1 = 1. According to our proposed split/cluster procedure, initially the first two columns of M are placed into the temporary matrix O and column 3 (i.e., $d_3$) is considered next. There are three possible positions for column 3: to the left of column 1 (computing PP(3,1)), in between columns 1 and 2 (computing PP(1,3) and PP(3,2)), and to the right of column 2 (computing PP(2,3)).

Position 1: to the left of column 1, PP(3,1) = M(1,3) - O(1,1) = - 1 < 0;
Position 2: in between column 1 and column 2, PP(1,3) = O(1,1) - M(1,3) = 1 - = ; PP(3,2) = M(1,3) - O(1,2) = - 0 < 0;
Position 3: to the right of column 2, PP(2,3) = O(1,2) - M(1,3) = 0 - = 4;

Since the PP values for positions 1 and 2 are negative, these two possibilities are ignored. Therefore, we select to place $d_3$ to the right of column 2. Similarly, all the other databases are calculated to get an updated matrix. Finally, the rows are permuted to be in the same order as the columns. The same steps are applied for iterations 2 to 9 to get the final affinity measure matrix M, which is then used to illustrate the splitting phase.

Figure 2: Final Affinity Measure Matrix. The dashed lines separate two clusters.

First, the mean of the first column is calculated.

mean = (1+1+1++0+++0+0+0)/ = 4.;

According to the proposed splitting phase, the mean value is used to consider the splitting of the matrix M. Therefore, two clusters are created: one is in the upper left corner and the other is in the lower right corner (see the dashed line in Figure 2). Let the upper left corner cluster be DB_GROUP_21 = $\{d_1, d_2, d_5, d_7, d_8\}$ and the lower right corner cluster be DB_GROUP_22 = $\{d_3, d_4, d_6, d_9, d_{10}\}$. Since both clusters contain more than three databases, each cluster needs to execute the split/cluster phase iteratively. Again, the mean value for each cluster needs to be calculated and used as the splitting criterion. Based on the mean values, the clusters DB_GROUP_21 and DB_GROUP_22 are further split into two more clusters individually (the dashed lines in Figure 3(a) and 3(b)). Since the numbers of databases in all four clusters are less than four, the procedure stops.

mean = (1+1+1++0)/ = (for DB_GROUP_21)
mean = (10+10+30+30+30)/ = 4 (for DB_GROUP_22)

Figure 3: Splitting for the two clusters, panels (a) and (b). Each cluster can be split into two more clusters (as shown in the dashed lines).

After all the clusters execute the split/cluster procedure and the stopping criteria checking, a cluster hierarchy for all the databases in the network can be created. As shown in Figure 4, the cluster DB_GROUP_11, which consists of all the databases in the network, is at the root of the hierarchy. Initially, the split/cluster procedure starts with DB_GROUP_11 and partitions it into two clusters, DB_GROUP_21 and DB_GROUP_22. Then, DB_GROUP_21 can be partitioned into DB_GROUP_31 and DB_GROUP_32, and DB_GROUP_22 can be partitioned into DB_GROUP_33 and DB_GROUP_34. Each finer cluster consists of its own databases. The databases in the same cluster should be highly affiliated and be accessed together for query information. This cluster hierarchy is then used to decide where a query should be searched for the requested information, to reduce the cost of query processing.

Figure 4: The resulting cluster hierarchy. The lowest level of the hierarchy consists of the individual databases in the network. The root cluster of the hierarchy consists of all the databases. The clusters at each level have their own member databases.

4. CONCLUSIONS

In this paper, we propose an object-oriented approach to partition a large scale network of databases into a set of clusters. We have formalized a new set of relative affinity measures to represent how frequently two databases have been accessed together by a set of historical queries. Affinity-based measures are both intuitively reasonable and understandable since they consider the access frequencies of queries. We gave a split/cluster procedure for clustering the databases. The split/cluster procedure includes a splitting phase and two stopping criteria, and is executed iteratively. A simple example is run to illustrate the steps of the proposed split/cluster procedure. A cluster hierarchy which provides different levels of abstraction for users to incrementally and dynamically access the information is formed. Since a set of databases belonging to a certain application domain is placed in the same cluster and is required consecutively on some query access path, the number of platter switches for query processing can be reduced. Moreover, the constructed clusters can be used as the unit not only for query processing but also for discovering object-oriented relationships such as superclass, subclass, and equivalence relationships.

References

[1] D.M. Dilts and W. Wu, Using knowledge-based technology to integrate CIM databases, IEEE Trans. Knowledge Data Eng., vol. 3(2), June 1991.
[2] W. Gotthard, P.C. Lockemann, and A. Neufeld, System-guided view integration for object-oriented databases, IEEE Trans. Knowledge Data Eng., vol. 4(1), Feb. 1992.
[3] W. Litwin, L. Mark, and N. Roussopoulos, Interoperability of multiple autonomous databases, ACM Computing Surveys, 22, 1990, pp. 267-293.
[4] J.S. Quarterman and J.C. Hoskins, Notable computer networks, Communications of the ACM, vol. 29(10), 1986, pp. 932-971.
[5] M.P. Reddy, B.E. Prasad, P.G. Reddy, and A. Gupta, A methodology for integration of heterogeneous databases, IEEE Trans. Knowledge Data Eng., vol. 6(6), December 1994.