Data management in distributed data bases*


by C. V. RAMAMOORTHY and BENJAMIN W. WAH
University of California
Berkeley, California

* Research supported by Ballistic Missile Defense Contract DASG60-77-C

INTRODUCTION

The recent advances in large-scale integrated logic and memory technology, coupled with the explosion in size and complexity of the application areas, have led to the design of distributed architectures. Basically, a Distributed Computer System (DCS) is considered as an interconnection of digital systems called Processing Elements (PEs), each having certain processing capabilities and communicating with each other. This definition encompasses a wide range of configurations, from a uniprocessor system with different functional units to a multiplicity of general-purpose computers (e.g. ARPANET). In general, the notion of "distributed systems" varies in character and scope with different people [30]. So far, there is no accepted definition and basis for classifying these systems. In this paper, we limit our discussion to a class of DCSs which have an interconnection of dedicated/shared, programmable, functional PEs working on a set of jobs which may be related or unrelated.

Due to the information explosion and the need for more stringent requirements, the design of efficient coordination schemes for the management of data on a DCS is a very critical problem. Data on a DCS are managed through a data base. A Data Base is a collection of stored operational data used by the application systems of some particular enterprise [6,12], and a Distributed Data Base (DDB) can be thought of as the data stored at different locations of a DCS. It can be considered to exist only when data elements at multiple locations are interrelated and/or there is a need to access data stored at some locations from another location. Due to the ever-increasing demand for on-line processing, there is a need for decomposing very large data bases into physically or geographically dispersed units and/or integrating existing data bases held in physically isolated nodes into a single, coherent data base that will be available to each of the distributed nodes.

In this paper, the design issues and solutions for resource management of data on a DDB are studied. The different aspects of resource management are categorized in the next section. These management issues are part of the issues related to the operational control of a DDB and are concerned with the management of data as resources. They can be divided into three related levels, namely, the query level, the file level and the task level. The query level is concerned with the processing of user queries and requests so that parallelism in processing can be maximized and the amount of communications on the system can be minimized. On the file level, the related issues are the compression of data files for efficient storage and communication, as well as the placement and migration of files for efficient accesses. On the task level, the objective is to schedule the requests so that overlap in processing can be maximized. These issues and some of the corresponding solution algorithms are studied in detail in the third to sixth sections respectively. Finally, the seventh section provides some concluding remarks.

RESOURCE MANAGEMENT OF DATA IN DDBS

There are many issues in the design of a data base, among which are the issues in logical organization, architectural designs, operational control and evolution. These issues have been discussed in Reference 31 and will not be repeated here. A summary of the issues in the design of a DDB is shown in Figure 1. In this paper, the resource management issues of data and files on a DDB are studied. The specific data management issues investigated are:

Query decomposition on DDBs

A query is an access request made by a user or a program in which one or more files have to be accessed. When multiple files are accessed by the same query on a DDB, these files usually have to reside at a common location before the query can be processed.
Substantial communication overhead may be involved if these files are geographically distributed and a copy of each file has to be transferred to a common location. It is therefore necessary to decompose the query into sub-queries so that each sub-query accesses a single file. These sub-queries may then be processed in parallel at any location which has a copy of the required file. The results after the processing are then sent back to the requesting location. It is generally true that the amount of communications needed to transmit the results is much smaller than the amount needed to transmit the files. This approach has been proposed in the design of the centralized version of INGRES [41] and is extended to the design of SDD-1 [42] and distributed INGRES [8]. However, in some cases decomposition is impossible and some file transfers are still necessary. In order to avoid these extra transfers, a technique is proposed in the third section so that redundant information is added to the files and non-decomposable queries can still be processed without any file movements.

National Computer Conference, 1979

[Figure 1: Classification of issues in distributed data base systems (DDBs).]

Data compression

Data compression is any reversible encoding technique that produces a measurable reduction in the size of the data encoded. By reversible, it is meant that the original data is recoverable from the compressed form. Due to the growth in the size of information processing, it is necessary to develop good data compression techniques which reduce the size of the stored information and the amount of internode communications. This issue is discussed in the fourth section.

File placement and migration

This issue relates to the distribution and migration of data base components, namely, files and control programs, on the DDB with the objective of minimizing the overall storage, migration, updating and access costs on the system. A file assignment algorithm is proposed in the fifth section.

Task scheduling

Requests on the DDB must be scheduled so that high parallelism and overlap can be achieved. The request may be a single word fetch or it may be a page or file access. The parallelism on the DDB is important because, in order to attain high throughput, the parallel hardware and resources must be efficiently utilized. The control of task scheduling can be distributed or centralized. In distributed control, each node may act independently and coordinate with the others. In centralized control, there is a primary node in which all scheduling control is performed. The decision of which is the better control mechanism depends very heavily on the interconnection structure and the communication overhead involved. This issue is discussed in the sixth section.

The relationships among the various data management issues are shown in Figure 2, where a relation → is said to exist between two design issues α, β, i.e. α → β, if the solution of β is transparent to the solution of α. That is, the solution of α is not affected by the solution to β, but not vice versa. The solution to α can therefore be developed independent of β. In Figure 2, it is seen that generally, task scheduling is transparent to file placement and migration, which in turn could be transparent to data compression and query decomposition. Algorithms for data compression and query decomposition can therefore be developed independently. In developing algorithms for file placements and migrations, the solutions for data compression and query decomposition should be taken into account. However, in most cases, assumptions can be made about their solutions and the file placement and migration problem can be solved independently. For example, it may be assumed that all queries which access multiple files may be decomposed into sub-queries that access single files. The file placement and migration problem for multiple files is therefore decomposed into many single file optimization sub-problems.

It must be noted that other operational control requirements may also impose restrictions on the solutions to the data management issues. For instance, different reliability requirements may demand different lower bounds on the number of copies of a file on the DDB; different concurrency control mechanisms may have different costs on the file placement problem; etc. Reasonable assumptions must therefore be made about these operational control requirements in order to determine their effects on the resource management issues and to solve these issues independently.

[Figure 2: Relationships among various data management issues (query level, file level, task level).]

QUERY DECOMPOSITION ON DDBS

The approach using query decomposition is geared towards relational data bases [4]. In a relational data base, data is viewed as relations of varying degree, the degree being the number of distinct domains participating in the relation. Each instance of a relation is known as a tuple, which has a value for each domain of the relation. Thus a relation can be simply represented in tabular form with columns as domains and rows as tuples.

In query decomposition, optimization is performed on the processing of a single query originated at a node. The objective is to decompose a multiple relation query into as many single relation sub-queries as possible so that data (relation) movements from one node to another can be minimized. However, there exist non-decomposable queries which require all the relations that they access to be present at a common location. A large number of relation transfers may be needed if these relations are geographically distributed. In order to avoid these extra relation transfers, a technique utilizing redundant information is proposed here. Instead of decomposing queries that access multiple files, it may be sufficient to provide redundant information in each relation so that multiple relations do not need to reside at a single location before the query can be processed. This will be illustrated later in this section.

We begin by first examining the different types of queries on a relational data base. A query on a relational data base consists of two parts: the part specifying the domains of the relation to be retrieved, and the part specifying the predicate, which is a quantification representing the defining properties of the set to be accessed. Let S be a relation of domains s#, sname, status, city; and SP be a relation of domains s#, p#, qty. The queries on a relational data base can be classified into the following categories [6]:

1. Retrieval Operations
   a. Single Relation Retrieval: The predicate representing the defining property of the set to be retrieved is defined on the same relation as the set. E.g.
        GET (S.sname): S.city = "Paris" AND S.status > 10
   b. Multiple Relation Retrieval: The predicate, as well as the set to be retrieved, may be defined over multiple relations. E.g.
        GET (S.sname): (SP.s# = S.s# AND SP.p# = "P2")
      Relations SP and S must be available simultaneously before the query can be processed.
2. Storage Operations
   a. Single Relation Update
   b. Multiple Relation Update
   c. Insertion
   d. Deletion
3. Library Functions
   These represent more complicated operations on the predicate than the equality operations, e.g. counting the number of occurrences, selecting the maximum/minimum, etc.

A query which is defined over multiple relations is not decomposable into single relation sub-queries when it has a logical relation defined over a common domain of these multiple relations. For example, the query:

     GET (S.sname): (SP.s# = S.s# AND SP.p# = "P2")

is not decomposable into single relation retrievals because there is a logical relation "=" which is defined over a common domain s# of the relations S and SP. These relations must be available simultaneously at a common location before the retrieval or update operations can be performed.

It is noted that the common domains of these multiple relations actually represent multiple copies of the same domain on these relations (although the information they contain may not be identical). A lot of transfers can be eliminated if their common information is represented in both relations. For example, in processing the query:

     GET (S.sname): (SP.s# = S.s# AND SP.p# = "P2")

on two geographically separated relations, S and SP (Figure 3a), it may be necessary to transfer relation S to the node where SP resides and then process the query there, or vice versa. However, if the information SP.s# = S.s# is compiled beforehand into the two relations (Figure 3b), then it is only necessary to send the query to the location where S or SP resides and the query can be processed there.

This technique poses several problems. First, it is necessary to take one extra bit for each tuple in order to compile this piece of information. If the amount of information to be added is large (e.g. when the number of different predicates defined on a common domain of two relations is large), the size of the extra storage space may be significant. Second, when the common domain of one relation is modified, it is necessary to "multiple update" all the common domains of the other relations in the data base. Referring to Figure 3b, if an extra tuple with s#=2 and sname="Boston" is added to relation S, then it is necessary to update the SP.s# = S.s# information in relation SP because relation SP contains a tuple with s#=2. If updating activity is frequent, then the "multiple update" cost is large.
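
As a rough illustration of the compiled-information idea and of its "multiple update" cost, the following Python sketch uses the tuples of Figure 3. The function names and the dictionary/list representation are assumptions made for illustration only; they are not part of the original design. The sketch answers, at SP's node alone, which suppliers of a part are known to exist in S, and it shows that inserting a tuple into S forces a flag update at SP's node.

```python
# Relations of Figure 3a, assumed to be held at two different nodes.
S = {1: "New York", 3: "San Francisco", 5: "Chicago"}          # s# -> sname
SP = [(1, "A1"), (2, "A1"), (3, "A2"), (4, "A2"), (5, "P2")]   # (s#, p#)

def compile_flags(s_keys, sp_rows):
    """Extend each SP tuple with one bit: 1 if SP.s# = S.s# holds for some tuple of S."""
    return [(s_no, p_no, 1 if s_no in s_keys else 0) for s_no, p_no in sp_rows]

def suppliers_of(part, sp_flagged):
    """Evaluated at SP's node only: s# values supplying `part` that also exist in S."""
    return [s_no for s_no, p_no, flag in sp_flagged if p_no == part and flag == 1]

def insert_supplier(s_no, sname, s_rel, sp_flagged):
    """Insert a tuple into S, then 'multiple update' the flags stored with SP."""
    s_rel[s_no] = sname
    return [(k, p, 1 if k == s_no else f) for k, p, f in sp_flagged]

sp_flagged = compile_flags(set(S), SP)
print(suppliers_of("P2", sp_flagged))       # no transfer of S is needed for this step

# Adding (s#=2, "Boston") to S changes the flag of the SP tuple with s#=2.
sp_flagged = insert_supplier(2, "Boston", S, sp_flagged)
print(sp_flagged[1])
```

The extra bit per tuple and the rewrite performed by `insert_supplier` correspond to the storage and update costs discussed in the text.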
Third, this technique requires the data base designer to be able to estimate the amount of additional information to be compiled into the relations. A possible technique is to pre-analyze the type of predicates used in retrievals and updates and to determine what is the essential information to be compiled into the relations. A compromise should be made between introducing extra information, with additional storage space and higher cost in multiple updates, and reducing the amount of relation transfers. It would be advantageous for the more frequently used predicates and less advantageous for others.

Figure 3a: Relations S and SP.

    S                         SP
    s#   sname                s#   p#
    1    New York             1    A1
    3    San Francisco        2    A1
    5    Chicago              3    A2
                              4    A2
                              5    P2

Figure 3b: Relations S and SP with (S.s# = SP.s#) information compiled into the relations.

    S                                      SP
    s#   S.s#=SP.s#   sname                s#   S.s#=SP.s#   p#
    1    1            New York             1    1            A1
    3    1            San Francisco        2                 A1
    5    1            Chicago              3    1            A2
                                           4                 A2
                                           5    1            P2

DATA COMPRESSION

With the increase in the amount of information processing, it is important to keep the utilization of the memory high. The information content of data stored in large alphanumeric data bases is usually low. Further, as the processing becomes distributed, the communication overhead of transferring data from one location to another is usually substantial. In order to keep the utilization of the storage sub-system high, and to keep the amount of data transferred over communication links low, data compression is a natural solution to the problem. However, the use of compression codes which remove the redundancy of data seems to be in direct conflict with the use of redundant coding, e.g. parity check codes, which increase the reliability. What is needed then is an exploration of efficient error limiting codes which can be applied to compressed data, and an analysis of the error rate of various compression schemes.

Desirable properties of compression codes

In designing a compression code, it should possess to some degree each of the following properties:

1. The technique should be reversible, i.e. the original data should be fully recoverable from the compressed form. This property can be relaxed in certain situations when the data is repeated elsewhere, e.g. the keys in a directory structure are usually repeated across levels.
2. The coding scheme should cause a measurable reduction in the size of the stored data. In comparing compression codes, a standard measure called percent compression is generally used:

   \[ \text{percent compression} = \frac{[\text{size of input data}] - [\text{size of output data}]}{[\text{size of input data}]} \times 100\% \]

3. The technique should be reasonably efficient to implement.
4. The technique should be general enough to be equally applicable to all alphanumeric data files.

Two other properties which are often desirable in compression schemes are:

5. The prefix property, i.e. no code is the prefix of another code. This assures that the decoder never has to back up on any portion of the text.
6. The lexicographic ordering property, i.e. if the input data is in a sorted order, then after encoding, the output data is still in sorted order. This property is useful for indexes.

Existing compression techniques, which possess part or all of the above properties, can be classified into the following broad categories: (1) Run length encoding; (2) Differencing; (3) Statistical encoding; (4) Value set schemes.

Run length encoding: In a data base, there are frequent occurrences in which the data occur in a continuous sequence of identical characters, e.g. a sequence of zeroes. This sequence can be replaced by the character followed by a count. Run length encoding is a technique by which a string of continuous characters, or a "clump", is replaced by a repeat flag for the character followed by the size of the "clump", or run length. In practice, however, since very long clumps are highly improbable, one can limit the run length encoded and combine the flag and length in a single byte. This is the technique used in WYLBUR [10]. Run length encoding of a single character type is potentially the most successful, with diminishing returns for more characters. Huang has discovered an upper bound for the entropy of run length encoding [19].

Differencing: Differencing refers to techniques which compare a current record to a pattern record and retain only the differences between them. It is particularly successful with large files of records with fixed length alphanumeric fields where most corresponding fields are the same or are blanks and zeroes. This is the approach normally used for sequential files, where the pattern record is taken to be the previous record in the file. When differencing is applied to direct access files, however, the first record of each block is left uncompressed and used as a pattern for the remaining records in the block. The unit of information on which differencing is performed can be the bit, the byte, the field or some logical data in the record. Byte-level differencing is the most common case since byte access is convenient and cheap. In field-level differencing, bit maps are often used to indicate the presence of a field, or its absence when it is identical to the previous one. Two examples of the use of differencing in relational data base systems are Titman's experimental system [38] and the Peterlee Relational Test Vehicle [39].

Statistical encoding: Statistical encoding is a transformation of an input alphabet so that each character is assigned a code bit string whose length is inversely related to the frequency of its occurrence in the text. Since different characters occur with different frequencies, a statistical encoding scheme will usually compress the text. The Huffman coding scheme [20] is an optimum, elegant and simple algorithm to assign variable length bit codes with the prefix property to characters, given their frequencies of occurrence in a text. There are other techniques such as the Hu-Tucker algorithm [18], which has both the prefix and the lexicographic ordering property. The major drawbacks of statistical encoding are that it does not exploit the natural radix of the computer (e.g. byte, word, etc.), and that it does not take into account some special characteristics of the data, e.g. strings of repeating characters, and the distinction between numeric and character data. A solution to this is the use of fixed length encoding which manipulates data in units of bytes [32,2]. Further, the fact that the size of each character is variable also causes problems when the data are modified, and the reliability of the data is difficult to assure because the character stream would not be recognizable once a bit is destroyed.
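
To make the statistical encoding idea concrete, here is a minimal Python sketch of Huffman code construction using the standard `heapq` module. The function names and the sample text are assumptions chosen for illustration; the sketch is not taken from any of the systems cited above. It builds a prefix code from character frequencies and checks that encoding is reversible.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a prefix code (symbol -> bit string) from the symbol frequencies in `text`."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial code}); the unique
    # tie_breaker guarantees that the dictionaries are never compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)      # the two least frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def encode(text, code):
    return "".join(code[ch] for ch in text)

def decode(bits, code):
    """Decode left to right; the prefix property means no backing up is needed."""
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

text = "AAAAAAAABBBCCD"
code = huffman_code(text)
bits = encode(text, code)
print(code, len(bits), "bits instead of", 8 * len(text))
```

In this example the most frequent character receives the shortest code. Flipping a single bit of `bits` generally changes where the later code boundaries fall, which is the reliability problem noted in the text.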
Value set schemes: A value set scheme in a data base system is a coding scheme in which repeated storage of data elements in their full character representation is avoided. Instead, each data element is stored once in the system and all subsequent occurrences of the same data element are referred to the first stored occurrence. An example of this technique is shown in the MacAIMS Data Management (MADAM) System [14], in which a reference number is assigned to a new entering data element and all subsequent operations on the data element use the reference number. However, the fact that reference numbers are unique only within a relation could lead to problems in the reliability of the data management system and the integrity of the data. The MADAM System also uses a binary tree scheme for maintaining reference numbers which is inefficient for insertion and costly in storage space for large sets of data. There are other schemes which represent a better tradeoff between storage efficiency and processing efficiency [24].

The decision of which code to use is highly dependent on the applications. For example, in a data base where the order of data is not important, the lexicographic ordering property is not important. The required properties of the applications must therefore be identified by the designers before the code is selected.

Future directions of research

While there are many reported results on data compression, the future directions of research are seen to be concentrated in the following areas:

Identify and characterize data redundancy: In a data base, there are many levels of data. For example, there are the file level, the record level, the field level and the byte level. The type of data redundancy at each level must be identified. This would aid in selecting data compression schemes best suited to the particular type of redundancy. Further, it leads us to the possibility of multi-level compression schemes, wherein data is compressed through a set of cascaded stages. Each level of the data is possibly compressed using a different technique. The compression code must be selected so that it minimizes the effects on other levels of the data.
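
A cascaded scheme of this kind can be sketched in Python as follows, assuming fixed length records: a record-level differencing stage (pattern = previous record) feeds a byte-level run length stage. The marker character, the function names and the sample records are hypothetical choices made for this illustration.

```python
from itertools import groupby

MARK = "\x00"   # "same as pattern" marker; assumed never to occur in the data itself

def diff_encode(records):
    """Stage 1: byte-level differencing against the previous record."""
    out, pattern = [], None
    for rec in records:
        if pattern is None:
            out.append(rec)                  # the first record is left uncompressed
        else:
            out.append("".join(MARK if c == p else c for c, p in zip(rec, pattern)))
        pattern = rec
    return out

def diff_decode(encoded):
    out, pattern = [], None
    for rec in encoded:
        if pattern is not None:
            rec = "".join(p if c == MARK else c for c, p in zip(rec, pattern))
        out.append(rec)
        pattern = rec
    return out

def rle_encode(s):
    """Stage 2: replace each run of identical characters by (character, run length)."""
    return [(ch, len(list(run))) for ch, run in groupby(s)]

def rle_decode(pairs):
    return "".join(ch * n for ch, n in pairs)

def compress(records):
    return [rle_encode(r) for r in diff_encode(records)]

def decompress(blocks):
    return diff_decode([rle_decode(b) for b in blocks])

# Three 20-character records with fixed length fields.
records = [
    "SMITH".ljust(10) + "PARIS".ljust(6) + "0010",
    "SMITH".ljust(10) + "PARIS".ljust(6) + "0020",
    "SMYTH".ljust(10) + "PARIS".ljust(6) + "0020",
]
blocks = compress(records)
print([len(b) for b in blocks])              # number of (character, count) pairs per record
```

Differencing turns the nearly identical records into long runs of the marker, which the run length stage then collapses; this is one way in which the choice made at one level affects the redundancy seen at the next.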

Develop a comparison model for various compression schemes: The comparison model must be able to measure the amount of storage reduction and the computation cost for encoding and decoding. A simple measure is the percent compression defined earlier. The computation cost can be broken down into the CPU cost, the memory usage cost and the input/output cost. In order to calculate the storage reduction for a given compression scheme, the number of encodable units or tokens in a record or file must be predicted. This can be obtained from an assumed input distribution such as a uniform distribution, a normal distribution or Zipf's distribution at the given level of data.

Study adaptive Huffman coding techniques which respond to update activity: As the data base gets updated, the initial Huffman code assignment based on the a priori character frequency distribution may no longer be optimal. A threshold for the expected compression ratio has to be determined, at which the variable length codes are dynamically reassigned for the new frequency distribution. Further, the threshold selected should not cause excessive re-coding. The problem of updates which change the size of the data, and the reliability problems, should also be studied.

Investigate the feasibility of implementing, in a microprocessor, a simple self-measuring self-adjusting encoder/decoder: Experience has shown that the current implementations of data base systems are I/O bound within a node and communication bound on the DCS. A microprocessor encoder/decoder, by performing compression and decompression, would cause communications to be done more efficiently, at the same time distributing or relieving this function from the processor sub-systems. Such a device would perform the following functions: (a) encode and decode data; (b) measure and adjust the code assignments; (c) detect errors and automatically re-initiate the operation; and (d) control concurrent accesses. The advantage of this design is that it would make data compression transparent to the rest of the system.
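
The "measure and adjust" function, together with the re-coding threshold discussed above, might be sketched as follows in Python. The 10 percent threshold, the function names and the sample frequencies are assumptions for illustration; the sketch also assumes that updates do not introduce symbols outside the original alphabet.

```python
import heapq
from collections import Counter

def huffman_lengths(freq):
    """Code length of each symbol in a Huffman code built from `freq` (symbol -> count)."""
    if len(freq) == 1:
        return {s: 1 for s in freq}
    heap = [(w, i, [s]) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    tie = len(heap)                          # unique tie breaker; lists are never compared
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1                  # each merge pushes these symbols one level deeper
        heapq.heappush(heap, (w1 + w2, tie, s1 + s2))
        tie += 1
    return lengths

def avg_bits(lengths, freq):
    """Average number of code bits per character under the given frequencies."""
    total = sum(freq.values())
    return sum(lengths[s] * n for s, n in freq.items()) / total

def needs_recoding(current_lengths, new_freq, threshold=0.10):
    """True if the code in use is more than `threshold` worse than a fresh Huffman code."""
    best = avg_bits(huffman_lengths(new_freq), new_freq)
    now = avg_bits(current_lengths, new_freq)
    return (now - best) / best > threshold

old_freq = Counter("AAAAAAAABBBCCD")         # a priori distribution
old_lengths = huffman_lengths(old_freq)
new_freq = {"A": 1, "B": 2, "C": 3, "D": 8}  # distribution after many updates
print(needs_recoding(old_lengths, old_freq), needs_recoding(old_lengths, new_freq))
```

A larger threshold causes fewer re-codings at the price of a poorer compression ratio between re-codings, which is the tradeoff the text asks to be studied.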
coclusio, the use of data compressio allows data to be stored more efficietly ad data commuicatio to be doe with shorter messages. However, may issues relatig to the feasibility, the desig of codig techiques, the reliability of the resultat codes, the implemetatio issues, etc., must be solved. t is coteded that such solutios do ot exist ow ad future study is ecessary. is affected directly by the query decompositio strategies ad rather lightly by the data compressio techiques. f the query is always decomposable ito sigle file sub-queries, the the placemet of each file may be optimized idepedetly. Otherwise, the distributio of the files o the DCS must be optimized joitly ad this icreases the complexity of the problem sigificatly. O the other had, data compressio techiques geerally affect the amout of data requested at a ode ad therefore the cost of a access is govered by the type of compressio techiques used. By makig certai assumptios o the query decompositio strategy ad the compressio techique, the F AP ca be studied idepedetly. Motivatios for file placemet ad migratio The major reaso for allocatig multiple copies of a file to certai parts of the system at certai times ad the uecessariess of keepig a copy of every file at every ode all the time is because users have localities of access i ay time iterval. At ay particular time, a file may be used by a group of users ad it will cotiue to be used by the same group for a certai legth of time. For a particular user, the file that he wats to access may be available locally, i which case, he ca access the file with very little cost. f the file is ot available locally, he would have to pay a cost i terms of delay i accessig the file ad also itroducig traffic i the etwork before he ca make the access. t is uder this situatio that we should cosider movig a copy of the file to his ode. troducig a ew copy would also icrease the cost i terms of storage space ad the additioal overhead i lockig ad cocurrecy cotrol. 
Therefore, the decisio of whether to itroduce a ew copy of a file ivolves a balace of the cost betwee the two cases. The costs, e.g. commuicatio costs, are a fuctio of the topology of the system, the type of commuicatio protocols used ad most importatly, the extesiveess of usage at a particular ode: Further, as the request frequecies chage, the file allocatio o the system must also chage accordigly. However, i this case, the cost i migratig the file from oe ode to aother must also be take ito accout i the file placemet algorithm. PROGRAMDAT A PLACEMENT AND MGRATON Defiitio o/the problem The problem is defied as follows: give a umber of computers that process commo iformatio files, how ca oe allocate the files so that the allocatio yields miimum overall operatig costs. This problem has bee called the File Allocatio Problem (FAP).9 A more geeral problem is the Dyamic File Allocatio Problem (DFAP) i which the files are allowed to migrate over time so as to adapt to chagig access requiremets. The solutio to this problem Previous work Most of the previous studies o optimizatio are based o static distributio, that is, the allocatio does ot chage with time. Some variatio of dyamic distributio ivolves the applicatio of static algorithms wheever eed arises. A summary of the previous researches i this area is show i Table. These algorithms are very expesive to ru i real time. A particular solutio to this problem ivolvig a 30 site etwork required about a hour o a BM 360/91 computer. 16 The difficulty i optimizatio is also exemplified i Referece 33. Moreover, most of the algorithms are show

7 Data Maagemet i Distributed Data Bases 673 TABLE.-A Survey of Previous Researches i File Placemet/Migratio Network Flow Techiques Mathematical Programmig & Exhaustive Searches Heuristic Stoe J ey 2l.l7 Chu 3 Caseyl Levi & Foster Loomis & Mahmoud & Morga Ghoshl 3 et. al. ll Popek Riordo 28 Assumptio Complete Complete Complete All objects relatios amog relatios relatios idepedet. objects: No amog objects, amog redudat cop;e, objects: File of objects. access is poisso. Parameters A vg. amout of Fuctioal Storage cost: Storage cost: comm. traffic equatios Trasmissio Query tras. amog obj.: represets cost: File cost: Update Executio cost costraits o legth: tras. cost: o a computer: which process Request rate Query rate Overhead i placemets betwee betwee odes: migratio. deped; files: Update Update rate Commuica- rate betwee betwee odes. tio demads files: betwee Maximum processes. allowable access time: Storage capacity. Algorithm Network flow Network flow teger Path search o used & predicate programmig cost graph calculus. Remarks Static: Optimal Do ot Algorithm Algorithm for 2 processors: cosider very efficiet: Sub-optimal for multiple copy complex: depedece of 3 processor allocatio: Cosider objects reduces sys,tem: Ca Mi-cut alg. delay from allocatio of calculate critical produces etwork multiple file to load factor for 2 optimal queuig sigle file. processor subprocess approach. system. groupigs: Miimize commuicatio overhead. Oly Program- All objects Star etwork: Complete dep. obj's: data relatio idepedet. All objects probabilistic Query & ref exists betwee idepedet. relatio amog traffic divided objects. objects. equally amog alloc. odes. Commuica- Data base with Queuig time ter-ode Commuicatio tio cost for multiple target & service time tras. cost: cost: File query: segmet types: for Node storage cost: Commuica- Queries with trasactios: capability: File Query/update tio cost for multiple target Storage legth: traffic & update: Traffic segmet types capacity: A vg. 
to be NP-complete.**[9] Although polynomial algorithms could exist for some special cases of the problem, e.g. the allocation of files in a two-processor system,[34] their use in practical applications is very limited. This result suggests that the distributed system designer should focus his attention on efficient heuristics.

** NP-complete problems[22] are a class of problems for which there are no known optimal algorithms with a computation time which increases polynomially with the size of the problem. The computation times for all known optimal algorithms for this class of problems increase exponentially with problem size, i.e., if $n$ represents the size of the problem, then the computation time goes up as $k^n$ where $k > 1$.

[Table residue: comparison of earlier file allocation studies. Recoverable topics: model parameters (query and update rates for each file at each node, availability requirements, average file length, access frequency, H/W and S/W characteristics), solution methods (path search on cost graph, combinatorial search through possible solutions, queueing network, clustering algorithm, integer programming, add-drop heuristic), objectives and remarks. The cell layout is not recoverable.]

Heuristics for file distribution on a DDB are usually interactive algorithms. A feasible solution can be generated. Users of some decision algorithms then have to decide whether to improve the solution or not and how to improve it. The disadvantages of these types of algorithms are that they usually find a local optimum instead of a global optimum and that the validation of the algorithm is very difficult. For most cases, the heuristics can be shown to perform satisfactorily for some example values, but the algorithm is so complex that its worst case behavior is very difficult to determine. We first classify the three most commonly used heuristics, then we discuss the application of a file assignment algorithm on this problem.

Hierarchical designs

This is a heuristic procedure in which attention is first restricted to the more important features of a system. In a file allocation problem, attention can first be restricted to geographical regions. After analysis has been performed and the files have been distributed to different geographical regions, attention can be directed to the less important details such as allocating files within a geographical region. This stepwise refinement procedure can continue down many levels. At each level of optimization, it is hoped that the effects on the optimization of the current level from the levels above and the levels below are very small. Nevertheless, iterations and design cycles may exist to refine the solution.

National Computer Conference, 1979

Clustering algorithms

Clustering algorithms are horizontal design processes which have a similar objective to hierarchical algorithms, namely, to reduce the complexity of the analysis in a large system. In a DDB, clusters can be formed on the geographical distribution of access frequencies. The files are then allocated to clusters. The file allocation within a cluster may further be refined as in hierarchical algorithms.[26,27]

Add-drop algorithms

In applying this algorithm, a feasible distribution of files is first found. The total cost of the system can be improved by successive addition or deletion of file copies. When a feasible solution with a lower cost is found, it is adopted as a new starting solution and the process continues. Eventually, a local optimum is reached in which addition or deletion does not reduce the cost. The whole procedure can be repeated with a different starting feasible solution and several local optima can be obtained. By taking the minimum of all the local minima obtained, it is hoped that we can get very close to the global optimum.[28]

The above techniques are by no means complete. A combination of these techniques may be chosen by the designer. In the next section, we introduce a file assignment heuristic which utilizes some of the principles of add-drop algorithms.

File assignment algorithm

In this section, we present an algorithm which can be used to optimize the file placements on a DCS. The assumptions that we use in developing the model are:

1. File accesses are independent. By this, it is meant that there are no interactions among the files and all the accesses on the system are single file accesses. The placements of each file can therefore be optimized independently.
2. It is assumed that all the constraints on the system can be represented in the form of costs. For instance, paths linking two nodes in the network which violate some constraints, such as the response time constraint, have a high inter-communication cost induced on them.
3. It is assumed that the time interval considered is divided into periods. The file access behavior for a period is assumed to be estimated at the beginning of the period, and the access behavior for the subsequent periods cannot be estimated at that point. With this assumption, it is possible to optimize the file allocations of each period independently, and it is not necessary to use dynamic programming to optimize the allocations for all the periods as done in Reference 25. No assumption is made on the length of each period. Their lengths need not be identical and may be determined dynamically. The algorithm described in this section determines the file placements for each period, but no provision is made for determining the length of each period.

The symbols used in the model are:

$n$: number of nodes in the distributed system
$a$: index for files
$t$: length of the current period of consideration, $T$
$q_i^a$: a random variable indicating the total amount of query accesses (including updates) at node $i$ to file $a$ (since we are optimizing each file independently, we will not write the superscript $a$ in the remaining part of the discussion)
$\alpha_i^a$: a random variable indicating the fraction of queries at node $i$ that are updates to file $a$
$S_{i,j}^a$: per unit cost of accessing file $a$ from node $i$ to node $j$
$M_{i,j}^a$: per unit cost of multiple updating file $a$ from node $i$ to node $j$
$N_{i,j}^a$: per unit cost of moving file $a$ from node $i$ to node $j$
$f_i^a$: per unit cost of storing file $a$ at node $i$
$l^a$: length of file $a$ in bits
$Y_i$: 0 if file $a$ does not exist at node $i$ during period $T$; 1 otherwise
$X_i$: 0 if file $a$ does not exist at node $i$ during period $T-1$; 1 otherwise
$K_0 = \{j : Y_j = 0\}$: set of nodes without a copy assigned
$K_1 = \{j : Y_j = 1\}$: set of nodes with a copy assigned
$K_2 = \{j : Y_j \text{ unassigned}\}$: set of nodes unassigned
$K = K_0 \cup K_1 \cup K_2$, $n = |K|$ (cardinality of $K$)

Consider the problem of allocating file $a$ on the system at the beginning of period $T$ of length $t$. The total amounts of retrievals and updates at node $i$ in this interval are estimated to be $q_i(1-\alpha_i)$ and $\alpha_i q_i$. The per unit costs of accessing, updating and transferring file $a$ from node $i$ to node $j$ are $S_{i,j}$, $M_{i,j}$ and $N_{i,j}$ respectively.
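As a small illustration of the notation just introduced, the following sketch splits the estimated accesses at each node into retrievals $q_i(1-\alpha_i)$ and updates $\alpha_i q_i$. The class name `FileDemand`, its field names and the numerical values are hypothetical and chosen only for the example; they do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileDemand:
    """Access estimates for one file over one period (hypothetical container)."""
    q: List[float]      # q_i: total accesses (retrievals + updates) issued at node i
    alpha: List[float]  # alpha_i: fraction of the accesses at node i that are updates

    def retrievals(self) -> List[float]:
        # q_i * (1 - alpha_i): accesses that only read the file
        return [qi * (1.0 - ai) for qi, ai in zip(self.q, self.alpha)]

    def updates(self) -> List[float]:
        # alpha_i * q_i: accesses that must be applied to every copy
        return [ai * qi for qi, ai in zip(self.q, self.alpha)]


# Made-up estimates for a three-node system.
demand = FileDemand(q=[100.0, 40.0, 60.0], alpha=[0.25, 0.5, 0.0])
retrieval_volume = demand.retrievals()
update_volume = demand.updates()
```

Retrievals are served by the cheapest copy, whereas updates are charged against every copy, which is why the two volumes are kept separate.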
We assume that whenever a user at node $i$ makes a request to a file not residing at node $i$, he will make the access at a node which has a copy of the file and which has the lowest cost of access from node $i$. Our objective is to minimize the cost in the system. Our objective function is:

$$Z = \sum_{i=1}^{n} q_i(1-\alpha_i) \min_{j,\,Y_j=1} S_{i,j} + \sum_{j=1}^{n} Y_j \Bigl(f_j + \min_{i,\,X_i=1} N_{i,j}\Bigr) l^a + \sum_{j=1}^{n} \sum_{i=1}^{n} Y_j \alpha_i q_i M_{i,j}$$

The first term in the above equation represents the query access cost; the second term represents the fixed cost of the period (cost of storage + cost of file transfers at the beginning of the period); while the third term represents the multiple update cost of the system. We can rewrite the integer program as follows:

$$\min Z = \sum_{i=1}^{n} Q_i \min_{j,\,Y_j=1} S_{i,j} + \sum_{j=1}^{n} Y_j F_j \qquad (1)$$

where

$$Q_i = q_i(1-\alpha_i) \qquad (2)$$

$$F_j = \Bigl(f_j + \min_{i,\,X_i=1} N_{i,j}\Bigr) l^a + \sum_{i=1}^{n} \alpha_i q_i M_{i,j} \qquad (3)$$

subject to

$$Y_j = 0, 1 \qquad j = 1, \ldots, n \qquad (4)$$

The file assignment algorithm proposed here consists of the following basic parts:

1. Property or condition to assign or not to assign a copy of the file to a node.
2. Computation of a representative value for a candidate problem. (The state of a candidate problem is made up of the states of allocation to the different nodes of the DCS. In general, the nodes of the DCS can be partitioned into three sets, $K_0$, $K_1$ and $K_2$.) The function of the representative value is to indicate the minimum of the candidate problem without actually enumerating all the allocations for the unassigned nodes.
3. Stopping criterion.

The general steps of the algorithm are shown in Figure 4. We discuss each of these steps briefly here.

M-1. This step initializes the candidate problem; all nodes are unassigned at this point. The candidate list, which is a list of states in which an extra node from $K_2$ is added to $K_0$ or $K_1$ together with the corresponding representative value, is assigned the empty set.

M-2 to M-5. These four steps essentially achieve the following: a node is selected from the unassigned set, $K_2$, and is either assigned a copy or not assigned a copy of the file. A representative value, which is chosen to be a lower bound estimated by solving the integer program (Equation 1) without the integrality constraints (Equation 4), is calculated for each of the corresponding candidate problems. The derivation of the linear programming lower bound for a candidate problem is shown in the appendix. The computed lower bound and the corresponding assignments are attached to the candidate list. These steps are then repeated for each of the nodes in $K_2$.

M-6. This step selects, from the candidate list, the candidate problem with the minimum lower bound and the corresponding assignment of nodes, and uses it for the next iteration. Steps M-2 to M-6 therefore select a node and decide whether a copy should be placed at that node. This node is removed from the $K_2$ list.

M-7. Steps M-2 to M-6 are repeated until the $K_2$ list is empty.

The overall computational complexity of the algorithm is $O(n^4)$.

To further illustrate the steps of the algorithm, it is applied on the following example. Suppose the following matrix represents the query cost $S_{i,j}$ for a five-node system.

[Example data: the $5 \times 5$ matrix $S = [S_{i,j}]$ and the vectors $Q = [Q_i]$ and $F = [F_i]$; the numerical entries are not recoverable.]

By enumerating the possible allocations, it is found that a copy of the file should be allocated to nodes 1, 4 and 5, giving a cost of 705. The detailed application of the heuristic is shown in Figure 5, giving a solution of 717. In general, this method gives a solution very close to the optimal solution, and the computational complexity is very low when compared with that of generating the optimal solution. The five examples on a 19-node problem solved by Casey[1] are compared with the solutions obtained using the proposed heuristic in Table II. It is seen that the results do not deviate substantially from the optimum solutions.

The results indicated here are somewhat preliminary. For the sake of simplicity, a more complicated algorithm is not presented. This algorithm, together with the analytical results and the theoretical studies, will be presented in a future paper.

TASK SCHEDULING

Definition of the problem

This problem is related very strongly to the problem of query decomposition. After the query has been decomposed, the Query Scheduling Problem (QSP) is to sequence the processing of the sub-queries on the DDB for a given distribution of the files on the DCS defined by the FAP. Depending on the ways in which the sub-queries are processed, the QSP can further be classified into the Sequential Query Scheduling Problem (SQSP) and the Parallel Query Scheduling Problem (PQSP).
TABLE II-Comparison between Casey's Solutions on a 19-node Problem and the Solutions of the Proposed Algorithm
[Columns: Problem; Update/Query Percent; Casey's Optimum Cost; Cost using Proposed Algorithm; Time on CDC 6400 (sec.). The table entries are not recoverable.]

Figure 4-File assignment algorithm.
[Flowchart with step labels M-1 to M-7. Box contents, in reading order:
Initialize candidate problem: $K_0 = \emptyset$, $K_1 = \emptyset$, $K_2 = \{1, 2, \ldots, n\}$; $K_2' \leftarrow K_2$; candidate list $\leftarrow \emptyset$.
Form candidate problem $C_i$ where $K_{0,i} \leftarrow K_0$, $K_{1,i} \leftarrow K_1 \cup \{i\}$, $K_{2,i} \leftarrow K_2 - \{i\}$.
Find lower bound of $C_i$; attach to candidate list.
Form candidate problem $C_i$ where $K_{0,i} \leftarrow K_0 \cup \{i\}$, $K_{1,i} \leftarrow K_1$, $K_{2,i} \leftarrow K_2 - \{i\}$.
Find lower bound of $C_i$; attach to candidate list.
Select minimum of candidate list, $j$; set $K_0 \leftarrow K_{0,j}$, $K_1 \leftarrow K_{1,j}$, $K_2 \leftarrow K_{2,j}$ for the selected candidate problem; $K_2' \leftarrow K_2$; candidate list $\leftarrow \emptyset$.
Two decision boxes with NO branches close the loops.]

In SQSP, the sub-queries are processed in sequential order. Using the results produced by the processing of the previous sub-query, the processing of the present sub-query will produce some results to be used by the next sub-query in sequence. If the files used by the sub-queries are separated geographically, intermediate results have to be transferred over communication lines. The objective is to minimize the amount of communications required. In PQSP, multiple queries are sent to different nodes and they are processed in parallel. The results after the processing are sent back to the requesting location. In this case, the response time may be smaller because all the communications are done in parallel (it is assumed that the major overhead is in communications and not in processing). For a compromise between the amount of communications and the response time, a combination of sequential and parallel query processing may be used.

The QSP is very similar to problems which have been studied in other areas, e.g. the deterministic scheduling of multi-processors, the scheduling of requests in a computer system, etc. The results obtained there may therefore be extended to this study. In order to solve the QSP, the notion of task must be defined. A task is defined to be a simple request which uses a resource for a finite amount of time. A request is said to be simple if no other resource is needed during the processing of this request. A complex request can always be broken down into a sequence of simple requests.
A resource on a DDB can be physical, such as a communication channel, a processor, etc., or it can be logical, such as a file. The tasks are usually governed by a precedence graph so that a task cannot be processed until its predecessor has finished processing. The task scheduling problem is to sequence the processing of the tasks, subject to precedence constraints, so that some overall optimization criterion is satisfied. The criterion can be the maximum completion time of all the tasks if the objective is to maximize the throughput of the system; or it can be the sum of the completion times of all the tasks if the objective is to minimize the average response time.

To schedule the processing of queries, they are first decomposed into multiple tasks and the tasks are subsequently scheduled. The general task precedence graph for the processing of a query in the PQSP which requires the use of geographically distributed files is shown in Figure 6. On a DCS, the communication overhead, which includes the time to set up the communication path and the queueing delay to transmit the messages, is usually much larger than the processing overhead for a query. Therefore, the time required to process a task at a node in Figure 6 is usually negligible as compared to the time to pass the results over the communication sub-system. The communication overhead on a DCS is dictated by the configuration of the interconnection mechanism. Many models have been designed to study the behavior of these delays, e.g. in Reference 23.

The QSP is usually solved with distributed control, that is, there does not exist a primary node which schedules all the processing of the queries on the DCS. Further, complete information for optimal scheduling is usually not available due to the high overhead in distributing it. The tasks are usually scheduled at each node sub-optimally without assembling all the necessary information before the scheduling.

Assumptions used to simplify the problem

Certain assumptions are often used so that the problem can be simplified.
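To make the task model of the previous section concrete, the sketch below computes the completion time of every task in a small precedence graph and then evaluates the two criteria mentioned there: the maximum completion time and the sum of the completion times. It is a minimal sketch under two assumptions that are not in the paper: every task starts as soon as all of its predecessors have finished (contention for shared resources is ignored), and the task names and durations are invented for the example, loosely following the fan-out pattern of a parallel query.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def completion_times(duration, preds):
    """Return {task: finish time} when each task starts as soon as all of
    its predecessors have finished (no resource contention is modelled)."""
    graph = {task: preds.get(task, set()) for task in duration}
    finish = {}
    for task in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in graph[task]), default=0.0)
        finish[task] = start + duration[task]
    return finish


# Hypothetical query at node i that needs files held at nodes j and k.
duration = {
    "decompose_i": 1, "send_j": 4, "process_j": 1, "reply_j": 3,
    "send_k": 2, "process_k": 1, "reply_k": 2, "merge_i": 1,
}
preds = {
    "send_j": {"decompose_i"}, "process_j": {"send_j"}, "reply_j": {"process_j"},
    "send_k": {"decompose_i"}, "process_k": {"send_k"}, "reply_k": {"process_k"},
    "merge_i": {"reply_j", "reply_k"},
}

finish = completion_times(duration, preds)
makespan = max(finish.values())        # criterion 1: maximum completion time
total_completion = sum(finish.values())  # criterion 2: sum of completion times
```

In this example the communication tasks dominate the completion times, which mirrors the observation that processing time at a node is usually negligible compared with communication time.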

Figure 5-Application of file assignment on Casey's 5 node example.
[Search tree of candidate problems, each labelled with its lower bound (LB); the layout of the tree is not recoverable. The final node carries LB = 717.]

Communication overhead

The processing overhead is usually much smaller than the communication overhead and is usually ignored. This assumption eliminates many tasks in the precedence graph.

Static versus dynamic algorithms

Static algorithms schedule a set of tasks available at the time of scheduling and a set of tasks that are known to arrive at fixed future times. The schedule does not change during the processing of these tasks. On the other hand, dynamic algorithms are more flexible and they reschedule all the available tasks whenever a new task comes in. The advantage of dynamic algorithms is that they allow task initiations to be dynamic and do not restrict the schedule to the order determined initially, but they have the disadvantage of larger overhead. With the use of dynamic algorithms, the assumption that there are precedence constraints among the tasks can also be relaxed. Whenever a task enters the system, all the tasks in the system are rescheduled dynamically.

The choice between the use of static and dynamic algorithms is system dependent. If the arrivals of requests are indeterminate, dynamic algorithms are usually better. On the other hand, if the arrivals of requests can be determined precisely, static algorithms should be used. The choice between static and dynamic may also be dictated by the overhead in each type of algorithm, and a judicious choice must be made by the designer.

Deterministic versus probabilistic processing time

The processing time for a task can be assumed to be deterministic or probabilistic.
In the deterministic case, it is possible to determine which order can best satisfy the optimization criterion and therefore all the tasks can be scheduled in a specific order. However, in a probabilistic case, it is difficult to do so when the processing times of all the tasks are governed by a common distribution. Certain assumptions, e.g. the distribution of job size, have to be made before an analytical evaluation is possible.[5] The theory of scheduling developed so far is mostly applicable to the deterministic case.[15] It can be used to approximate the probabilistic case when the average or the worst case processing times are used. Lastly, the difficulty of the scheduling problem can be assessed easily in most cases under the deterministic assumption. NP-completeness of the problem can usually be shown or a polynomial algorithm can be found.

Figure 6-Task precedence graph for the processing of a query in the PQSP which requires the use of geographically distributed files.
[Diagram regions: Node i; Communication Sub-system; Neighboring Nodes. Tasks shown: decomposing and processing the query at node i; communicating a request to a neighboring node j, k or m; processing the request at node j, k or m; communicating the result to node i from node j, k or m; processing the query at node i.]

The QSP, under the independent query assumption, can be shown to be NP-complete. In this situation, the designer has to look for good heuristics which can be executed within the real time constraints. However, the evaluation of these heuristics is usually difficult. Evaluation methods and techniques are usually of three kinds: analytical techniques, simulations and approximation algorithms. Analytical techniques generally have to make some simplifying assumptions about the system parameters in order for the solution to be tractable, and the results obtained are usually not accurate. On the other hand, simulations are almost always expensive to run, and it is difficult to exhaust all the possible cases of the system. The third type of evaluation technique is approximation algorithms.[40] There are two classes of these approximations, one guaranteeing a near-optimal solution always, and the other producing an optimal or near-optimal solution "almost everywhere."
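As an illustration of the first class (a guaranteed near-optimal solution), the sketch below uses a well-known rule from deterministic multiprocessor scheduling rather than an algorithm from this paper: longest-processing-time-first (LPT) list scheduling of independent tasks on $m$ identical processors, whose maximum completion time is known to be at most $(4/3 - 1/(3m))$ times the optimum. The function name `lpt_schedule` and the task durations are assumptions made for the example.

```python
import heapq


def lpt_schedule(durations, m):
    """Assign independent tasks to m identical processors, longest task
    first, always choosing the currently least loaded processor.
    Returns (makespan, {task index: processor index})."""
    loads = [(0.0, p) for p in range(m)]  # min-heap of (current load, processor)
    heapq.heapify(loads)
    assignment = {}
    for task in sorted(range(len(durations)), key=lambda t: -durations[t]):
        load, p = heapq.heappop(loads)
        assignment[task] = p
        heapq.heappush(loads, (load + durations[task], p))
    makespan = max(load for load, _ in loads)
    return makespan, assignment


# Made-up instance on three processors. An optimal schedule has length 9
# (5+4, 5+4, 3+3+3), while LPT produces 11, which meets the bound exactly.
durations = [5, 5, 4, 4, 3, 3, 3]
makespan, assignment = lpt_schedule(durations, m=3)
```

The instance is chosen so that the guarantee is tight; on most inputs the rule does considerably better than its worst-case ratio.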
These types of algorithms are still in the research stage and a unifying approach in designing algorithms of this type is still lacking. The future trend is in the direction of investigating good approximation algorithms for scheduling queries.

CONCLUSION

In this paper, we have studied in detail the issues of resource management of data on a distributed data base. These issues are divided into three related levels, namely, the query level, the file level and the task level.

On the query level, the major issue is the decomposition of user queries so that parallelism in processing can be maximized and the amount of communications on the system can be minimized. It is shown that the approach using decomposition is deficient when the query is non-decomposable. In this case, the files needed to process the query must be moved to a common location before the query can be processed. An algorithm is proposed in this paper which pre-analyzes the type of accesses on the system and introduces redundant information onto the files so that file transfers can be reduced.

On the file level, the issues are the compression of data for efficient storage and communication and the placement and migration of files for efficient accesses. In data compression, the existing techniques have been classified into four areas: run length encoding, differencing, statistical encoding and value set schemes. A multi-level compression scheme is proposed so that data is compressed through a set of cascaded stages. In the area of file placement and migration, a file assignment algorithm has been proposed. In general, this algorithm gives a solution very close to the optimal solution, and the computational complexity is very low when compared with that of generating the optimal solution.

On the task level, the problem is to sequence the processing of the sub-queries for a given distribution of the files on the distributed system so that overlap in processing can be maximized. It is shown that the problem of query scheduling on a distributed data base is NP-complete. The future directions of research are therefore in the search for effective approximation algorithms.

The issues we have discussed in this paper encompass the spectrum from the processing of the query to the scheduling of the requests. However, many other issues may arise in the design of the data base. These include other issues in operational control, such as directory management, concurrency control, security and privacy, etc., and they may affect the strategies used in data management. The study of these issues, however, is beyond the scope of this paper.

APPENDIX-DERIVATION OF THE LOWER BOUND OF A CANDIDATE PROBLEM

We derive in this appendix the lower bound of a candidate problem, given its state. We can rewrite the objective function (Equation 1) conditioned on $K_0$, $K_1$ and $K_2$. Nodes in $K_1$ hold a copy and are taken to access it locally at zero cost ($S_{i,i} = 0$), so they contribute no query-cost term. We have

$$\min Z = \sum_{i \in K_1} F_i + \sum_{i \in K_0} Q_i \min_{j,\,Y_j=1} S_{i,j} + \sum_{i \in K_2} Q_i \min_{j,\,Y_j=1} S_{i,j} + \sum_{i \in K_2} F_i Y_i$$

$$\min Z = \sum_{i \in K_1} F_i + \sum_{i \in K_0 \cup K_2} Q_i \min_{j,\,Y_j=1} S_{i,j} + \sum_{i \in K_2} F_i Y_i \qquad \text{(A-1)}$$

subject to $Y_i = 0, 1$, where $Q_i$ is defined in Equation 2 and $F_i$ is defined in Equation 3.

Equation A-1 is a non-linear integer program; we can rewrite it in the form of a linear program. Let

$U_{i,j}$ = fraction of accesses made from node $i$ to node $j$;
$P_i$ = set of indexes of those nodes that can access node $i$;
$n_i$ = cardinality of $P_i$.

$$\min Z = \sum_{i=1}^{n} \sum_{j=1}^{n} Q_i S_{i,j} U_{i,j} + \sum_{i=1}^{n} F_i Y_i \qquad \text{(A-2)}$$

subject to

$$\sum_{j=1}^{n} U_{i,j} = 1 \qquad i = 1, \ldots, n \qquad \text{(A-3)}$$

$$0 \le \sum_{i \in P_j} U_{i,j} \le n_j Y_j \qquad j = 1, \ldots, n \qquad \text{(A-4)}$$

$$Y_j = 0, 1 \qquad j = 1, \ldots, n \qquad \text{(A-5)}$$

Equation A-3 is true because the fractions of accesses made from a node must sum to 1. Equation A-4 is derived by summing, over all $i \in P_j$, the inequality $0 \le U_{i,j} \le Y_j$, which says that only nodes with a copy of the file can supply users' demands.
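The following sketch shows one way the program (A-2) to (A-5) can be used inside steps M-1 to M-7 of the file assignment algorithm. It is an illustration under stated assumptions, not the authors' implementation. The closed form used in `lower_bound` is the standard solution of the relaxed plant-location program (each access is routed to the node minimizing $Q_i S_{i,j} + F_j / n_j$, with the fixed-cost share charged only for unassigned nodes); it is assumed here to correspond to the solution cited from Reference 7. The sketch also assumes $n_j = n$ (every node can access every other node) and $F_j \ge 0$. All function names and the four-node data are hypothetical.

```python
from math import inf


def cost(Q, S, F, copies):
    """Equation 1 for a fixed placement; `copies` is the set of nodes with Y_j = 1."""
    if not copies:
        return inf  # no copy anywhere: queries cannot be served
    access = sum(Q[i] * min(S[i][j] for j in copies) for i in range(len(Q)))
    return access + sum(F[j] for j in copies)


def lower_bound(Q, S, F, K0, K1, K2):
    """Bound for a candidate problem (assumed closed form of the relaxed LP).
    K0: nodes fixed without a copy, K1: fixed with a copy, K2: unassigned."""
    n = len(Q)
    usable = K1 | K2
    if not usable:
        return inf
    total = sum(F[j] for j in K1)  # fixed cost already committed
    for i in range(n):
        # an unassigned node j is charged the share F_j / n per accessing node
        total += min(Q[i] * S[i][j] + (F[j] / n if j in K2 else 0.0) for j in usable)
    return total


def assign_file(Q, S, F):
    """Greedy assignment following steps M-1 to M-7."""
    K0, K1, K2 = set(), set(), set(range(len(Q)))                  # M-1
    while K2:                                                       # M-7
        candidates = []
        for i in sorted(K2):                                        # M-2 to M-5
            candidates.append((lower_bound(Q, S, F, K0, K1 | {i}, K2 - {i}), i, True))
            candidates.append((lower_bound(Q, S, F, K0 | {i}, K1, K2 - {i}), i, False))
        _, i, place_copy = min(candidates)                          # M-6
        (K1 if place_copy else K0).add(i)
        K2.remove(i)
    return K1, cost(Q, S, F, K1)


# Made-up four-node instance (not the example from the paper).
Q = [10.0, 20.0, 5.0, 15.0]
S = [[0, 4, 6, 8], [4, 0, 3, 7], [6, 3, 0, 2], [8, 7, 2, 0]]
F = [30.0, 25.0, 40.0, 20.0]
copies, z = assign_file(Q, S, F)
```

Each of the $n$ iterations evaluates at most $2n$ candidate problems, and each bound takes $O(n^2)$ operations, which is consistent with the $O(n^4)$ complexity stated for the algorithm.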
By relaxing constraint A-5, the program becomes a linear program and the value of $Z$ obtained provides a lower bound to the original integer program. The linear program (Equation A-2 to Equation A-4) has been solved in Reference 7. The solution is given by Equations (A-6) and (A-7).

[Equations (A-6) and (A-7): not legible; the surviving fragment indicates a two-case expression taking the values 1 and 0.]

The complexity of the solution is $O(n^2)$.

REFERENCES

1. Casey, R. G., "Allocation of Copies of a File in an Information Network," AFIPS SJCC, 1972.
2. Chen, T. C., and I. T. Ho, "Storage Efficient Representation of Decimal Data," CACM, Vol. 18, No. 1, Jan. 1975.
3. Chu, W. W., "Multiple File Allocation in a Multiple Computer System," IEEE Trans. on Computers, Vol. C-18, No. 10, Oct. 1969.
4. Codd, E. F., "A Relational Model of Data for Large Shared Data Bases," CACM, Vol. 13, No. 6, June.
5. Coffman, E. G., Jr., and P. J. Denning, Operating System Theory, Prentice Hall Inc., N.J.
6. Date, C. J., An Introduction to Data Base Systems, 2nd Edition, Addison-Wesley.
7. Efroymson, M. A., and T. C. Ray, "A Branch and Bound Algorithm for Plant Location," Operations Research, 1966.
8. Epstein, et al., "Distributed Query Processing in a Relational Data Base System," Report No. UCB/ERL M78/18, Electronics Research Laboratory, University of California, Berkeley.
9. Eswaran, K. P., "Placement of Records in a File and File Allocation in a Computer Network," Information Processing 74, IFIPS, North Holland Publishing Co.
10. Fajman, R., and J. Borgelt, "WYLBUR: An Interactive Text Editing and Remote Job Entry System," CACM, Vol. 15, No. 5, May 1973.
11. Foster, D. V., et al., "File Assignment in Star Network," Proc. of the 1977 Sigmetrics/CMG V Conf. on Comp. Perf.: Modelling, Measurement and Management, Washington, D.C., Nov. 1977.
12. Fry, J. P., and E. H. Sibley, "Evolution of Data Base Management Systems," Computer Surveys, Vol. 8, No. 1, March 1976.
13. Ghosh, S. P., "Distributing A Data Base with Logical Associations on a Computer Network for Parallel Searching," IEEE TSE, Vol. SE-2, No. 2, June 1976.
14. Goldstein, R. C., and A. J. Strnad, "The MacAIMS Data Management System," Proc. ACM-SIGFIDET Workshop, Nov. 1970.
15. Graham, R. L., et al., "Optimization and Approximation in Deterministic Sequencing and Scheduling: A Survey," Proc. of Discrete Optimization 1977, Vancouver, Aug.
16. Grapa, E., and G. G. Belford, "Some Theorems to Aid in Solving the File Allocation Problem," CACM, Vol. 20, No. 11, Nov. 1977.
17. Hofri, M., and C. J. Jenny, "On the Allocation of Processes in Distributed Computing Systems," IBM Research Report, RZ905.
18. Hu, T. C., and J. Tucker, "Optimal Computer Search Trees and Variable Length Alphabetic Codes," SIAM J. of App. Math., Vol. 21, 1971.
19. Huang, T., "An Upper Bound on the Entropy of Run Length Coding," IEEE Trans. on Info. Theory, Vol. IT-20, No. 9, Sept. 1974.
20. Huffman, D. A., "A Method for the Construction of Minimum Redundancy Codes," Proc. IRE, Vol. 40, Sept. 1952.


More information

Modern Systems Analysis and Design Seventh Edition

Modern Systems Analysis and Design Seventh Edition Moder Systems Aalysis ad Desig Seveth Editio Jeffrey A. Hoffer Joey F. George Joseph S. Valacich Desigig Databases Learig Objectives ü Cocisely defie each of the followig key database desig terms: relatio,

More information

The Magma Database file formats

The Magma Database file formats The Magma Database file formats Adrew Gaylard, Bret Pikey, ad Mart-Mari Breedt Johaesburg, South Africa 15th May 2006 1 Summary Magma is a ope-source object database created by Chris Muller, of Kasas City,

More information

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence _9.qxd // : AM Page Chapter 9 Sequeces, Series, ad Probability 9. Sequeces ad Series What you should lear Use sequece otatio to write the terms of sequeces. Use factorial otatio. Use summatio otatio to

More information

Computers and Scientific Thinking

Computers and Scientific Thinking Computers ad Scietific Thikig David Reed, Creighto Uiversity Chapter 15 JavaScript Strigs 1 Strigs as Objects so far, your iteractive Web pages have maipulated strigs i simple ways use text box to iput

More information

Image Segmentation EEE 508

Image Segmentation EEE 508 Image Segmetatio Objective: to determie (etract) object boudaries. It is a process of partitioig a image ito distict regios by groupig together eighborig piels based o some predefied similarity criterio.

More information

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov Sortig i Liear Time Data Structures ad Algorithms Adrei Bulatov Algorithms Sortig i Liear Time 7-2 Compariso Sorts The oly test that all the algorithms we have cosidered so far is compariso The oly iformatio

More information

Improving Information Retrieval System Security via an Optimal Maximal Coding Scheme

Improving Information Retrieval System Security via an Optimal Maximal Coding Scheme Improvig Iformatio Retrieval System Security via a Optimal Maximal Codig Scheme Dogyag Log Departmet of Computer Sciece, City Uiversity of Hog Kog, 8 Tat Chee Aveue Kowloo, Hog Kog SAR, PRC dylog@cs.cityu.edu.hk

More information

Improvement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation

Improvement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation Improvemet of the Orthogoal Code Covolutio Capabilities Usig FPGA Implemetatio Naima Kaabouch, Member, IEEE, Apara Dhirde, Member, IEEE, Saleh Faruque, Member, IEEE Departmet of Electrical Egieerig, Uiversity

More information

Chapter 3 Classification of FFT Processor Algorithms

Chapter 3 Classification of FFT Processor Algorithms Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As

More information

arxiv: v2 [cs.ds] 24 Mar 2018

arxiv: v2 [cs.ds] 24 Mar 2018 Similar Elemets ad Metric Labelig o Complete Graphs arxiv:1803.08037v [cs.ds] 4 Mar 018 Pedro F. Felzeszwalb Brow Uiversity Providece, RI, USA pff@brow.edu March 8, 018 We cosider a problem that ivolves

More information

CS200: Hash Tables. Prichard Ch CS200 - Hash Tables 1

CS200: Hash Tables. Prichard Ch CS200 - Hash Tables 1 CS200: Hash Tables Prichard Ch. 13.2 CS200 - Hash Tables 1 Table Implemetatios: average cases Search Add Remove Sorted array-based Usorted array-based Balaced Search Trees O(log ) O() O() O() O(1) O()

More information

Lecture 28: Data Link Layer

Lecture 28: Data Link Layer Automatic Repeat Request (ARQ) 2. Go ack N ARQ Although the Stop ad Wait ARQ is very simple, you ca easily show that it has very the low efficiecy. The low efficiecy comes from the fact that the trasmittig

More information

Adaptive Resource Allocation for Electric Environmental Pollution through the Control Network

Adaptive Resource Allocation for Electric Environmental Pollution through the Control Network Available olie at www.sciecedirect.com Eergy Procedia 6 (202) 60 64 202 Iteratioal Coferece o Future Eergy, Eviromet, ad Materials Adaptive Resource Allocatio for Electric Evirometal Pollutio through the

More information

Python Programming: An Introduction to Computer Science

Python Programming: An Introduction to Computer Science Pytho Programmig: A Itroductio to Computer Sciece Chapter 1 Computers ad Programs 1 Objectives To uderstad the respective roles of hardware ad software i a computig system. To lear what computer scietists

More information

Appendix A. Use of Operators in ARPS

Appendix A. Use of Operators in ARPS A Appedix A. Use of Operators i ARPS The methodology for solvig the equatios of hydrodyamics i either differetial or itegral form usig grid-poit techiques (fiite differece, fiite volume, fiite elemet)

More information

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems

More information

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 12: Virtual Memory Prof. Yajig Li Uiversity of Chicago A System with Physical Memory Oly Examples: most Cray machies early PCs Memory early all embedded systems

More information

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control EE 459/500 HDL Based Digital Desig with Programmable Logic Lecture 13 Cotrol ad Sequecig: Hardwired ad Microprogrammed Cotrol Refereces: Chapter s 4,5 from textbook Chapter 7 of M.M. Mao ad C.R. Kime,

More information

1. SWITCHING FUNDAMENTALS

1. SWITCHING FUNDAMENTALS . SWITCING FUNDMENTLS Switchig is the provisio of a o-demad coectio betwee two ed poits. Two distict switchig techiques are employed i commuicatio etwors-- circuit switchig ad pacet switchig. Circuit switchig

More information

Chapter 11. Friends, Overloaded Operators, and Arrays in Classes. Copyright 2014 Pearson Addison-Wesley. All rights reserved.

Chapter 11. Friends, Overloaded Operators, and Arrays in Classes. Copyright 2014 Pearson Addison-Wesley. All rights reserved. Chapter 11 Frieds, Overloaded Operators, ad Arrays i Classes Copyright 2014 Pearso Addiso-Wesley. All rights reserved. Overview 11.1 Fried Fuctios 11.2 Overloadig Operators 11.3 Arrays ad Classes 11.4

More information

Bayesian approach to reliability modelling for a probability of failure on demand parameter

Bayesian approach to reliability modelling for a probability of failure on demand parameter Bayesia approach to reliability modellig for a probability of failure o demad parameter BÖRCSÖK J., SCHAEFER S. Departmet of Computer Architecture ad System Programmig Uiversity Kassel, Wilhelmshöher Allee

More information

Algorithms for Disk Covering Problems with the Most Points

Algorithms for Disk Covering Problems with the Most Points Algorithms for Disk Coverig Problems with the Most Poits Bi Xiao Departmet of Computig Hog Kog Polytechic Uiversity Hug Hom, Kowloo, Hog Kog csbxiao@comp.polyu.edu.hk Qigfeg Zhuge, Yi He, Zili Shao, Edwi

More information

COMP Parallel Computing. PRAM (1): The PRAM model and complexity measures

COMP Parallel Computing. PRAM (1): The PRAM model and complexity measures COMP 633 - Parallel Computig Lecture 2 August 24, 2017 : The PRAM model ad complexity measures 1 First class summary This course is about parallel computig to achieve high-er performace o idividual problems

More information

A Generalized Set Theoretic Approach for Time and Space Complexity Analysis of Algorithms and Functions

A Generalized Set Theoretic Approach for Time and Space Complexity Analysis of Algorithms and Functions Proceedigs of the 10th WSEAS Iteratioal Coferece o APPLIED MATHEMATICS, Dallas, Texas, USA, November 1-3, 2006 316 A Geeralized Set Theoretic Approach for Time ad Space Complexity Aalysis of Algorithms

More information

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments Ruig Time Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects. The

More information

Lecture 5. Counting Sort / Radix Sort

Lecture 5. Counting Sort / Radix Sort Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018

More information

Fast Fourier Transform (FFT) Algorithms

Fast Fourier Transform (FFT) Algorithms Fast Fourier Trasform FFT Algorithms Relatio to the z-trasform elsewhere, ozero, z x z X x [ ] 2 ~ elsewhere,, ~ e j x X x x π j e z z X X π 2 ~ The DFS X represets evely spaced samples of the z- trasform

More information

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS APPLICATION NOTE PACE175AE BUILT-IN UNCTIONS About This Note This applicatio brief is iteded to explai ad demostrate the use of the special fuctios that are built ito the PACE175AE processor. These powerful

More information

Parallel Polygon Approximation Algorithm Targeted at Reconfigurable Multi-Ring Hardware

Parallel Polygon Approximation Algorithm Targeted at Reconfigurable Multi-Ring Hardware Parallel Polygo Approximatio Algorithm Targeted at Recofigurable Multi-Rig Hardware M. Arif Wai* ad Hamid R. Arabia** *Califoria State Uiversity Bakersfield, Califoria, USA **Uiversity of Georgia, Georgia,

More information

Pattern Recognition Systems Lab 1 Least Mean Squares

Pattern Recognition Systems Lab 1 Least Mean Squares Patter Recogitio Systems Lab 1 Least Mea Squares 1. Objectives This laboratory work itroduces the OpeCV-based framework used throughout the course. I this assigmet a lie is fitted to a set of poits usig

More information

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments Ruig Time ( 3.1) Aalysis of Algorithms Iput Algorithm Output A algorithm is a step- by- step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects.

More information

Analysis of Algorithms

Analysis of Algorithms Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Ruig Time Most algorithms trasform iput objects ito output objects. The

More information

Analysis of Server Resource Consumption of Meteorological Satellite Application System Based on Contour Curve

Analysis of Server Resource Consumption of Meteorological Satellite Application System Based on Contour Curve Advaces i Computer, Sigals ad Systems (2018) 2: 19-25 Clausius Scietific Press, Caada Aalysis of Server Resource Cosumptio of Meteorological Satellite Applicatio System Based o Cotour Curve Xiagag Zhao

More information

Data Structures and Algorithms. Analysis of Algorithms

Data Structures and Algorithms. Analysis of Algorithms Data Structures ad Algorithms Aalysis of Algorithms Outlie Ruig time Pseudo-code Big-oh otatio Big-theta otatio Big-omega otatio Asymptotic algorithm aalysis Aalysis of Algorithms Iput Algorithm Output

More information

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design College of Computer ad Iformatio Scieces Departmet of Computer Sciece CSC 220: Computer Orgaizatio Uit 11 Basic Computer Orgaizatio ad Desig 1 For the rest of the semester, we ll focus o computer architecture:

More information

AN OPTIMIZATION NETWORK FOR MATRIX INVERSION

AN OPTIMIZATION NETWORK FOR MATRIX INVERSION 397 AN OPTIMIZATION NETWORK FOR MATRIX INVERSION Ju-Seog Jag, S~ Youg Lee, ad Sag-Yug Shi Korea Advaced Istitute of Sciece ad Techology, P.O. Box 150, Cheogryag, Seoul, Korea ABSTRACT Iverse matrix calculatio

More information

IMP: Superposer Integrated Morphometrics Package Superposition Tool

IMP: Superposer Integrated Morphometrics Package Superposition Tool IMP: Superposer Itegrated Morphometrics Package Superpositio Tool Programmig by: David Lieber ( 03) Caisius College 200 Mai St. Buffalo, NY 4208 Cocept by: H. David Sheets, Dept. of Physics, Caisius College

More information

Introduction. Nature-Inspired Computing. Terminology. Problem Types. Constraint Satisfaction Problems - CSP. Free Optimization Problem - FOP

Introduction. Nature-Inspired Computing. Terminology. Problem Types. Constraint Satisfaction Problems - CSP. Free Optimization Problem - FOP Nature-Ispired Computig Hadlig Costraits Dr. Şima Uyar September 2006 Itroductio may practical problems are costraied ot all combiatios of variable values represet valid solutios feasible solutios ifeasible

More information

Algorithm. Counting Sort Analysis of Algorithms

Algorithm. Counting Sort Analysis of Algorithms Algorithm Coutig Sort Aalysis of Algorithms Assumptios: records Coutig sort Each record cotais keys ad data All keys are i the rage of 1 to k Space The usorted list is stored i A, the sorted list will

More information

Review: The ACID properties

Review: The ACID properties Recovery Review: The ACID properties A tomicity: All actios i the Xactio happe, or oe happe. C osistecy: If each Xactio is cosistet, ad the DB starts cosistet, it eds up cosistet. I solatio: Executio of

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045 Oe Brookigs Drive St. Louis, Missouri 63130-4899, USA jaegerg@cse.wustl.edu

More information

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein 068.670 Subliear Time Algorithms November, 0 Lecture 6 Lecturer: Roitt Rubifeld Scribes: Che Ziv, Eliav Buchik, Ophir Arie, Joatha Gradstei Lesso overview. Usig the oracle reductio framework for approximatig

More information

Administrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today

Administrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today Admiistrative Fial project No office hours today UNSUPERVISED LEARNING David Kauchak CS 451 Fall 2013 Supervised learig Usupervised learig label label 1 label 3 model/ predictor label 4 label 5 Supervised

More information

BOOLEAN MATHEMATICS: GENERAL THEORY

BOOLEAN MATHEMATICS: GENERAL THEORY CHAPTER 3 BOOLEAN MATHEMATICS: GENERAL THEORY 3.1 ISOMORPHIC PROPERTIES The ame Boolea Arithmetic was chose because it was discovered that literal Boolea Algebra could have a isomorphic umerical aspect.

More information

Bezier curves. Figure 2 shows cubic Bezier curves for various control points. In a Bezier curve, only

Bezier curves. Figure 2 shows cubic Bezier curves for various control points. In a Bezier curve, only Edited: Yeh-Liag Hsu (998--; recommeded: Yeh-Liag Hsu (--9; last updated: Yeh-Liag Hsu (9--7. Note: This is the course material for ME55 Geometric modelig ad computer graphics, Yua Ze Uiversity. art of

More information

Master Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1

Master Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1 Advaced Architectures Master Iformatics Eg. 2017/18 A.J.Proeça Memory Hierarchy (most slides are borrowed) AJProeça, Advaced Architectures, MiEI, UMiho, 2017/18 1 Itroductio Programmers wat ulimited amouts

More information

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5.

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5. Morga Kaufma Publishers 26 February, 208 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Virtual Memory Review: The Memory Hierarchy Take advatage of the priciple

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations Applied Mathematical Scieces, Vol. 1, 2007, o. 25, 1203-1215 A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045, Oe

More information

% Sun Logo for. X3T10/95-229, Revision 0. April 18, 1998

% Sun Logo for. X3T10/95-229, Revision 0. April 18, 1998 Su Microsystems, Ic. 2550 Garcia Aveue Moutai View, CA 94045 415 960-1300 X3T10/95-229, Revisio 0 April 18, 1998 % Su Logo for Joh Lohmeyer Chairperso, X3T10 Symbios Logic Ic. 1635 Aeroplaza Drive Colorado

More information

Exact Minimum Lower Bound Algorithm for Traveling Salesman Problem

Exact Minimum Lower Bound Algorithm for Traveling Salesman Problem Exact Miimum Lower Boud Algorithm for Travelig Salesma Problem Mohamed Eleiche GeoTiba Systems mohamed.eleiche@gmail.com Abstract The miimum-travel-cost algorithm is a dyamic programmig algorithm to compute

More information

1 Graph Sparsfication

1 Graph Sparsfication CME 305: Discrete Mathematics ad Algorithms 1 Graph Sparsficatio I this sectio we discuss the approximatio of a graph G(V, E) by a sparse graph H(V, F ) o the same vertex set. I particular, we cosider

More information

CS 683: Advanced Design and Analysis of Algorithms

CS 683: Advanced Design and Analysis of Algorithms CS 683: Advaced Desig ad Aalysis of Algorithms Lecture 6, February 1, 2008 Lecturer: Joh Hopcroft Scribes: Shaomei Wu, Etha Feldma February 7, 2008 1 Threshold for k CNF Satisfiability I the previous lecture,

More information

One advantage that SONAR has over any other music-sequencing product I ve worked

One advantage that SONAR has over any other music-sequencing product I ve worked *gajedra* D:/Thomso_Learig_Projects/Garrigus_163132/z_productio/z_3B2_3D_files/Garrigus_163132_ch17.3d, 14/11/08/16:26:39, 16:26, page: 647 17 CAL 101 Oe advatage that SONAR has over ay other music-sequecig

More information

Chapter 10. Defining Classes. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 10. Defining Classes. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 10 Defiig Classes Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 10.1 Structures 10.2 Classes 10.3 Abstract Data Types 10.4 Itroductio to Iheritace Copyright 2015 Pearso Educatio,

More information

n Learn how resiliency strategies reduce risk n Discover automation strategies to reduce risk

n Learn how resiliency strategies reduce risk n Discover automation strategies to reduce risk Chapter Objectives Lear how resiliecy strategies reduce risk Discover automatio strategies to reduce risk Chapter #16: Architecture ad Desig Resiliecy ad Automatio Strategies 2 Automatio/Scriptig Resiliet

More information

Accuracy Improvement in Camera Calibration

Accuracy Improvement in Camera Calibration Accuracy Improvemet i Camera Calibratio FaJie L Qi Zag ad Reihard Klette CITR, Computer Sciece Departmet The Uiversity of Aucklad Tamaki Campus, Aucklad, New Zealad fli006, qza001@ec.aucklad.ac.z r.klette@aucklad.ac.z

More information

condition w i B i S maximum u i

condition w i B i S maximum u i ecture 10 Dyamic Programmig 10.1 Kapsack Problem November 1, 2004 ecturer: Kamal Jai Notes: Tobias Holgers We are give a set of items U = {a 1, a 2,..., a }. Each item has a weight w i Z + ad a utility

More information

1.2 Binomial Coefficients and Subsets

1.2 Binomial Coefficients and Subsets 1.2. BINOMIAL COEFFICIENTS AND SUBSETS 13 1.2 Biomial Coefficiets ad Subsets 1.2-1 The loop below is part of a program to determie the umber of triagles formed by poits i the plae. for i =1 to for j =

More information

Dynamic Programming and Curve Fitting Based Road Boundary Detection

Dynamic Programming and Curve Fitting Based Road Boundary Detection Dyamic Programmig ad Curve Fittig Based Road Boudary Detectio SHYAM PRASAD ADHIKARI, HYONGSUK KIM, Divisio of Electroics ad Iformatio Egieerig Chobuk Natioal Uiversity 664-4 Ga Deokji-Dog Jeoju-City Jeobuk

More information

Math 10C Long Range Plans

Math 10C Long Range Plans Math 10C Log Rage Plas Uits: Evaluatio: Homework, projects ad assigmets 10% Uit Tests. 70% Fial Examiatio.. 20% Ay Uit Test may be rewritte for a higher mark. If the retest mark is higher, that mark will

More information

3D Model Retrieval Method Based on Sample Prediction

3D Model Retrieval Method Based on Sample Prediction 20 Iteratioal Coferece o Computer Commuicatio ad Maagemet Proc.of CSIT vol.5 (20) (20) IACSIT Press, Sigapore 3D Model Retrieval Method Based o Sample Predictio Qigche Zhag, Ya Tag* School of Computer

More information

Multi-Threading. Hyper-, Multi-, and Simultaneous Thread Execution

Multi-Threading. Hyper-, Multi-, and Simultaneous Thread Execution Multi-Threadig Hyper-, Multi-, ad Simultaeous Thread Executio 1 Performace To Date Icreasig processor performace Pipeliig. Brach predictio. Super-scalar executio. Out-of-order executio. Caches. Hyper-Threadig

More information

Greedy Algorithms. Interval Scheduling. Greedy Algorithms. Interval scheduling. Greedy Algorithms. Interval Scheduling

Greedy Algorithms. Interval Scheduling. Greedy Algorithms. Interval scheduling. Greedy Algorithms. Interval Scheduling Greedy Algorithms Greedy Algorithms Witer Paul Beame Hard to defie exactly but ca give geeral properties Solutio is built i small steps Decisios o how to build the solutio are made to maximize some criterio

More information

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 11: More Caches Prof. Yajig Li Uiversity of Chicago Lecture Outlie Caches 2 Review Memory hierarchy Cache basics Locality priciples Spatial ad temporal How to access

More information

Outline. CSCI 4730 Operating Systems. Questions. What is an Operating System? Computer System Layers. Computer System Layers

Outline. CSCI 4730 Operating Systems. Questions. What is an Operating System? Computer System Layers. Computer System Layers Outlie CSCI 4730 s! What is a s?!! System Compoet Architecture s Overview Questios What is a?! What are the major operatig system compoets?! What are basic computer system orgaizatios?! How do you commuicate

More information

Chapter 4 Threads. Operating Systems: Internals and Design Principles. Ninth Edition By William Stallings

Chapter 4 Threads. Operating Systems: Internals and Design Principles. Ninth Edition By William Stallings Operatig Systems: Iterals ad Desig Priciples Chapter 4 Threads Nith Editio By William Stalligs Processes ad Threads Resource Owership Process icludes a virtual address space to hold the process image The

More information

Evaluation scheme for Tracking in AMI

Evaluation scheme for Tracking in AMI A M I C o m m u i c a t i o A U G M E N T E D M U L T I - P A R T Y I N T E R A C T I O N http://www.amiproject.org/ Evaluatio scheme for Trackig i AMI S. Schreiber a D. Gatica-Perez b AMI WP4 Trackig:

More information

n n B. How many subsets of C are there of cardinality n. We are selecting elements for such a

n n B. How many subsets of C are there of cardinality n. We are selecting elements for such a 4. [10] Usig a combiatorial argumet, prove that for 1: = 0 = Let A ad B be disjoit sets of cardiality each ad C = A B. How may subsets of C are there of cardiality. We are selectig elemets for such a subset

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 20 Itroductio to Trasactio Processig Cocepts ad Theory Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Trasactio Describes local

More information

A Study on the Performance of Cholesky-Factorization using MPI

A Study on the Performance of Cholesky-Factorization using MPI A Study o the Performace of Cholesky-Factorizatio usig MPI Ha S. Kim Scott B. Bade Departmet of Computer Sciece ad Egieerig Uiversity of Califoria Sa Diego {hskim, bade}@cs.ucsd.edu Abstract Cholesky-factorizatio

More information