The Magma Database file formats

Andrew Gaylard, Brent Pinkney, and Mart-Mari Breedt
Johannesburg, South Africa
15th May 2006

1 Summary

Magma is an open-source object database created by Chris Muller, of Kansas City, U.S.A. Magma is implemented entirely in Smalltalk. These notes explain the formats of the files used by Magma to store its data. This information was obtained by reverse-engineering the Magma source code with Chris Muller's permission. It is sufficient [1] to allow for a clean-room implementation of reading and writing the raw data files. This work was done using the version MagmaTester-157 and its dependencies.

2 Background

The core idea behind Magma and other object databases is to allow for objects to be saved to stable storage in a database ("persisted") and later re-created. In Magma, object persistence is transparent; that is, there is no need to explicitly map any object from its memory representation to its file-based representation. Magma uses Smalltalk's detailed class inspection features ("reflection") to persist objects automatically. It makes use of lightweight proxy-objects to re-create the objects on demand from the database: when a proxy-object is agitated, that is, when any access to it is performed, the proxy-object is replaced by the dynamically-created real object.

Magma also allows objects to be persisted in collections, which are analogous to tables in a conventional relational database. These collections may be indexed on one or more given member-variables to facilitate rapid (that is, better than linear) access.

3 Object identifiers

In Magma, each object is identified by an object identifier, abbreviated to OID. In the revision of Magma used for this work, OIDs are 48 bits wide. They are analogous to pointers in languages such as C, but exist not only in the computer's memory, but also in Magma's data files. OIDs are central to the way Magma manages its object data. There is a one-to-one mapping between instances and OIDs. OIDs are used not only to identify instances.
For certain objects which are both small and frequently used, the OID not only identifies the object, it also contains the object. This is possible because 31-bit integers and 32-bit IEEE-754 floating point numbers, among other classes, fit easily into a 48-bit OID. To make this scheme work, the 48-bit OID-space is divided into ranges which are set apart for each class of object that can be represented. The details are contained in table 1, which was obtained from the MaOidCalculator Smalltalk class.

[1] Well, nearly sufficient.

Table 1: Ranges in OID-space

| start of range | end of range | object |
|---|---|---|
| 0 | 0 | nil |
| 1 | 1 | false |
| 2 | 2 | true |
| 3 | 3 | Signifies an unused hash index slot |
| 4 | $4 + 2^{16} - 1 = 65539$ | DB character set |
| 65540 | $4 + 2^{16} + 1000 = 66540$ | Reserved for future use |
| 66541 | $66541 + 4000000 = 4066541$ | New-object OIDs; up to four million and one in a single commit |
| 4066542 | $2^{48} - 2^{31} - 2^{32} - 1$ | User objects |
| $2^{48} - 2^{31} - 2^{32}$ | $2^{48} - 2^{31} - 1$ | 32-bit IEEE floating point numbers |
| $2^{48} - 2^{31}$ | $2^{48} - 1$ | Small (31-bit) integers |

4 The objects.idx file format

Figure 1: Binary layout of the objects.idx file

| byte offset | contents |
|---|---|
| 0 | OID and offset size (the value 6) |
| 1 | offset for OID $f + 1 = 4066542$ |
| 7 | offset for OID $f + 2 = 4066543$ |
| 13 | offset for OID $f + 3 = 4066544$ |
| 19 onwards | array of further offsets |

This file can be considered to be the master index. It is used to map any given OID to an offset in the objects file. The first byte in the file is the OID width in bytes. For this version of Magma, it is always the value 6. Refer to figure 1.

The rest of the file is an array of offsets, each one OID-width (i.e. 6) bytes wide. Each array entry corresponds to an OID. The array is offset from the constant firstuserobjectoid in accordance with the formula

$$n = o - f - 1$$

where $n$ is the array entry number, $o$ is the OID in question, and $f$ is the firstuserobjectoid constant. Each array entry contains the offset into the objects file where the object corresponding to this OID can be found. The offset is in little-endian byte-order, as are all other binary data in Magma. Since the offsets are 48-bit numbers, the objects file is limited to a maximum length of $2^{48} - 1$ bytes [2].

5 The objects file format

Figure 2: Binary layout of the objects file

| offset | contents |
|---|---|
| 0 | The Magma objects file header |
| 0x400 | a MaObjectBuffer |
| | a MaObjectBuffer |
| | a MaObjectBuffer |
| | (further MaObjectBuffers) |

This file is the only place where objects are actually stored. It consists of a 1024-byte header, and a list of variable-length records. Each record represents one instance. For details, see figure 2. The header contains a binary signature, a version number, the definition OID, the class definition OID, and the anchor object OID. For details, refer to table 2. Note that the OIDs are 64-bit numbers in this header; some older versions of Magma used 64-bit OIDs, and the possibility remains of doing so again in the future.

Table 2: Metadata in the objects file header

| field | starting offset | size in bytes |
|---|---|---|
| signature | 0 | 8 |
| version | 8 | 2 |
| boolean flags | 10 | 1 |
| definition OID | 11 | 8 |
| class definition OID | 27 | 8 |
| anchor object OID | 43 | 8 |

After the header, the objects file is a flat list of variable-length records. Each record represents one object instance. When a record needs to be updated, it can be updated in place if the new size is identical to the old size. If the new size differs [3], then a new record is simply appended to the file's end, and the offset in the master index is updated to point to the new record. While this scheme does leave stale objects in the file, it does have some benefits: appending to a file is a quick operation, and if the system should fail before writing the offset, the old offset will still point to the old object. Tools also exist for removing stale records.

[2] Strictly speaking, the last record in the objects file must start no further than $2^{48} - 1$ bytes from the front of the file. It might extend up to $2^{24} - 7$ bytes beyond that limit.

[3] It's not clear to me what happens if the new object is smaller than the old one. While the new one would fit in the space, there would be unused bytes left, and it might be difficult to discover where the next record started (it would no longer be related to the length field).
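The OID ranges of table 1 and the master-index arithmetic of section 4 can be illustrated with a short Python sketch. It is not Magma code: the names `classify_oid`, `index_position` and `lookup_offset` are hypothetical, the value of f is inferred from figure 1 (where f + 1 = 4066542), and the little-endian default follows the statement in section 4.

```python
import io

OID_WIDTH = 6                      # bytes; also the value of the first byte of objects.idx
FIRST_USER_OBJECT_OID = 4066541    # f, inferred from figure 1 (f + 1 = 4066542)

# Rows of table 1 as (start, end, description); both bounds are inclusive.
OID_RANGES = [
    (0, 0, "nil"),
    (1, 1, "false"),
    (2, 2, "true"),
    (3, 3, "unused hash index slot"),
    (4, 4 + 2**16 - 1, "DB character set"),
    (65540, 4 + 2**16 + 1000, "reserved for future use"),
    (66541, 66541 + 4000000, "new-object OIDs"),
    (4066542, 2**48 - 2**31 - 2**32 - 1, "user objects"),
    (2**48 - 2**31 - 2**32, 2**48 - 2**31 - 1, "32-bit IEEE floats"),
    (2**48 - 2**31, 2**48 - 1, "small (31-bit) integers"),
]


def classify_oid(oid):
    """Return the table 1 description of the range that contains oid."""
    for start, end, description in OID_RANGES:
        if start <= oid <= end:
            return description
    raise ValueError(f"{oid} is outside the 48-bit OID space")


def index_position(oid):
    """Byte position in objects.idx of the entry for oid (n = o - f - 1)."""
    n = oid - FIRST_USER_OBJECT_OID - 1
    if n < 0:
        raise ValueError("only user-object OIDs have an entry in objects.idx")
    return 1 + n * OID_WIDTH       # skip the one-byte width field


def lookup_offset(index_file, oid, byteorder="little"):
    """Read the objects-file offset for oid from an open objects.idx file."""
    index_file.seek(0)
    width = index_file.read(1)[0]
    if width != OID_WIDTH:
        raise ValueError(f"unexpected OID width {width}")
    index_file.seek(index_position(oid))
    raw = index_file.read(width)
    if len(raw) != width:
        raise ValueError("OID lies beyond the end of the index")
    return int.from_bytes(raw, byteorder)
```

For a real database the file would be opened with `open("objects.idx", "rb")`; an `io.BytesIO` holding the same bytes behaves identically, which makes the arithmetic easy to check against figure 1.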

Each record is one of the five possible types shown in figure 3 (solid lines denote concrete classes). Magma uses each type of record for persisting certain types of objects; for instance, strings are always saved as MaByteObjectBuffer records.

Figure 3: Hierarchy of Magma object record classes (class diagram showing MaObjectBuffer, MaFixedObjectBuffer, MaVariableBuffer, MaByteObjectBuffer, MaVariableObjectBuffer, MaStorageObjectBuffer and MaVariableWordBuffer)

All of these records start with a header consisting of a size, a control-field, a class ID number, and a class version number. After the header comes either an array of OIDs (each 48 bits wide, as usual), or binary data. For details, see figure 4. Note that all numbers, such as the 24-bit size, are stored in binary, with the MSB first. Since the size field has the OID width (in bytes) added to it, the maximum size of any instance is $2^{24} - 6 = 16777210$ bytes. The control field maps to the type of object buffer which this record represents. The class ID is used to identify which class to build when re-creating an object from a record. The class version allows for classes to evolve over time; each time a class structure is changed, the version number is incremented.

In the case of MaByteObjectBuffer records, the header is followed by binary data. In the case of MaFixedObjectBuffer and MaVariableObjectBuffer records, the header is followed by an array of OIDs, which may be immediate objects (small integers or 32-bit floats), or may point to other objects. MaStorageObjectBuffer and MaVariableWordBuffer records were not unpacked beyond the header fields during this work, and are likely to differ from this layout.

6 The hash-index (OID-member.hdx) file format

The other files which are part of Magma's data storage are hash-index files. They allow for large collections of objects to be searched rapidly. The hash-index files do not themselves store objects; the objects are stored only in the main objects file. Instead, they store keys (which are a hashed representation of some member variable), and values, which are the OIDs. The OIDs in turn allow for the object to be located. Thus, given a value of a particular member variable, the corresponding OID can be quickly looked up in a hash-index.

Each file is named with the OID of the collection it indexes, and the relevant member variable. Each indexed collection also has an index storing purely the OIDs it contains, to enable rapid confirmation that an OID is in a collection. This file is simply named OID.hdx. For example, a collection which has the OID 456789, indexed on the foo member variable, would thus be stored as two files: 456789.hdx and 456789foo.hdx.
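The naming convention can be captured in a few lines of Python. This is only a sketch: `hdx_filename` and `parse_hdx_filename` are hypothetical helper names, and the parser assumes that a member-variable name never begins with a digit, which the text does not state.

```python
import re

# Leading digits are the collection OID; whatever follows is the member variable.
_HDX_NAME = re.compile(r"(\d+)(.*)\.hdx")


def hdx_filename(collection_oid, member=None):
    """Build the file name of a hash index; member=None gives the OID-only index."""
    return f"{collection_oid}{member or ''}.hdx"


def parse_hdx_filename(name):
    """Split a hash-index file name into (collection OID, member name or None)."""
    match = _HDX_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a hash-index file name: {name!r}")
    oid_text, member = match.groups()
    return int(oid_text), (member or None)
```

Applied to the example above, the two helpers round-trip the names 456789.hdx and 456789foo.hdx.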

Figure 4: Binary layout of a Magma object record

| bytes | contents |
|---|---|
| 0 to 7 | record header: size + OID width, control field, class ID, class version |
| 8 onwards (variable length) | object data: either the first OID (bytes 8 to 13) followed by the second and subsequent OIDs, or string text or binary data (for example the characters "hello," in bytes 8 to 13) |

Figure 5: The binary format of Magma's hash-index files

| position | contents |
|---|---|
| byte 0 | key size k in bits - 1 |
| byte 1 | value size in bits - 1 |
| byte 2 | number of slots |
| record 0 | a MaHashIndexRecord (recordsize bytes) |
| record 1 | a MaHashIndexRecord |
| record 2 | a MaHashIndexRecord |
| | (further MaHashIndexRecords) |

Each hash-index file consists of a three-byte header, and an array of hash-index records, as shown in figure 5. Each record in turn consists of a header and an array of slots, which may be unused or may be hash-index record entries. Each entry contains a single key and value pair. It also contains the number of child entries and a record-number where these may be found. For details, see figure 6.

6.1 The process of adding records to a hash index file

The process of adding records to a hash index file is best illustrated by means of examples. We consider 4 scenarios, as depicted in figure 7.

In scenario 1, we have an empty index, consisting of a single record. The hash of the key can range from 0 to 999 (for simplicity's sake). We add an item where the key hashes to 777. An entry is created containing the hash and the OID. All the other entries, which are unused, will contain a record number of zero and the OID value 3, both of which reflect an unused hash index slot.
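The three-byte header and scenario 1 can be modelled in memory with a small Python sketch. Several things here are assumptions rather than facts from the Magma source: the names (`parse_hdx_header`, `Entry`, `Record`, `add_to_empty_slot`) are hypothetical, the header bytes are read as "size in bits minus one", the record is given ten equal slots, and `None` is merely this model's marker for a slot without a key.

```python
from dataclasses import dataclass
from typing import List, Optional

UNUSED_OID = 3     # table 1: OID 3 signifies an unused hash index slot
NO_RECORD = 0      # unused entries carry a record number of zero


def parse_hdx_header(header):
    """Decode the three-byte hash-index file header of figure 5.

    Assumption: bytes 0 and 1 hold (size in bits - 1), so 1 is added back.
    """
    if len(header) != 3:
        raise ValueError("a hash-index header is exactly 3 bytes long")
    key_bits = header[0] + 1
    value_bits = header[1] + 1
    slot_count = header[2]
    return key_bits, value_bits, slot_count


@dataclass
class Entry:
    key: Optional[int] = None          # None: no key stored (model-only marker)
    oid: int = UNUSED_OID
    record_number: int = NO_RECORD


@dataclass
class Record:
    low: int                           # range start
    high: int                          # range end
    slots: List[Entry]


def empty_record(low, high, slot_count):
    """A record whose slots are all unused."""
    return Record(low, high, [Entry() for _ in range(slot_count)])


def add_to_empty_slot(record, key_hash, oid):
    """Scenario 1 only: store an entry in its slot, which must still be unused."""
    span = record.high + 1 - record.low
    slot = (key_hash - record.low) * len(record.slots) // span
    if record.slots[slot].oid != UNUSED_OID:
        raise NotImplementedError("bumping entries into child records is not modelled")
    record.slots[slot] = Entry(key=key_hash, oid=oid)
    return slot
```

With `record = empty_record(0, 999, 10)`, adding a key that hashes to 777 places it in slot 7 (the slot covering 700 to 799), while the nine remaining slots keep the OID 3 and record number 0.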

Figure 6: Binary layout of a Magma hash index record. A MaHashIndexRecord (recordsize bytes in total) consists of a range start (keysize/8 bytes), a range end (keysize/8 bytes), and an array of noofentries slots, each holding an entry. An entry holds a key, a value, the no of kids and a rec#; the field widths shown are valuesize/8, keysize/8, keysize/8 and sizeofrecordnumber.

In scenario 2, we start with the index from scenario 1, still consisting of a single record. We add an item where the key hashes to 744; the existing entry is updated to indicate that there are two children, one in this entry, and another one in a subsequent record. A new record is created to hold entries with keys in the range 700 to 799. A new entry is created to hold the hash of 777 and its OID. Thus the existing entry is bumped down to a new record when a new entry with a lower hash takes its place. The record number field in the original slot is updated to reflect the record number where further child entries can be found. Record numbers are zero-based (i.e. the first record in the hash-index after the three-byte header is number 0).

In scenario 3, we start with the index from scenario 2, now consisting of two records. We add an item where the key hashes to 721. A new entry is created and added to the second record since it falls within the record's range of 700 to 799. Again, the higher-valued hash entry is bumped down. The first record (i.e. the parent record) gets its numberofentriesinteger [4] set accordingly.

In scenario 4, we start with the index from scenario 3, consisting of two records. We add an item where the key hashes to 745. A new record is created to span the range from 740 to 749. Two new entries are created to hold keys with the hashes 744 and 745, and are added to the record. This new record's parent (the second one) has its entry for the range 740 to 749 updated to reflect two child-entries. In turn, that record's parent (the first one) has its entry in the range 700 to 799 updated to reflect four child objects.

[4] This field may have been renamed to noofchildren.

Figure 7: The process of adding records to a hash index file (before and after diagrams for scenarios 1 to 4)

6.2 Calculations of ranges and slots in the general case

This is, however, not the complete story. Complications arise when a record has a starting and ending range with fewer integer values than it has slots. Magma's code calls this a record with duplicates, as a single key value can be stored into several possible slots. Consider the third record in scenario number 4: its start key is 740 and its end key is 749, giving a range of 10 integers, which conveniently matches the 10 slots we have available. Since we could continue to add entries with the key 745 (since hash collisions are both possible and permissible), and since we could elect ab initio to have more than 10 slots per record, it is clear that further refinement to the process is required.

6.2.1 Adding a new record

Each record's header contains its range start and range end fields. These values correspond to the lowest and highest possible keys that this record can hold. When adding a record, the new lowest possible key $l'$ is calculated from the current record and slot number:

$$l' = l + \frac{s\,(h + 1 - l)}{n}$$

where $s$ is the zero-based slot number, $n$ is the number of slots in each hash-index record, $l$ is the lowest key in the range, and $h$ is the highest key in the range.

The new highest possible key $h'$ is the same if this record can hold duplicates. Otherwise, it is

$$h' = l + \frac{(s + 1)(h + 1 - l)}{n} - 1$$

6.2.2 Records with no duplicate slots

If a record has no duplicates for a given key, then the slot number $s$ is given by

$$s = \frac{(k - l)\,n}{h + 1 - l}$$

where $k$ is the given key. In addition, an adjustment to the slot number is done: if

$$k > l + \frac{(s + 1)(h + 1 - l)}{n}$$

then

$$s \leftarrow s + 1$$

6.2.3 Records with duplicate slots

If a record has several slots with the same key, then the highest possible slot is used when adding new entries. If it is full, then the next-highest slot is used, and so on down until the lowest possible slot is attempted. If all the permissible slots in a record are filled, then a new record is appended, the highest slot's record number field is updated, and its numberofchildren field is incremented.

The lowest possible slot number $s_l$ is calculated as for $s$ above in 6.2.2. The highest possible slot number $s_h$ is calculated as

$$s_h = \frac{(k + 1 - l)\,n}{h + 1 - l} - 1$$

Again, the adjustment is done: if

$$k > l + \frac{s_h\,(h + 1 - l)}{n}$$

then

$$s_h \leftarrow s_h + 1$$
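The range formulas of 6.2.1 and the slot formula of 6.2.2 can be checked numerically with the Python sketch below. It makes assumptions the text does not confirm: integer (floor) division is used throughout, the function names `child_range` and `slot_for_key` are hypothetical, and the slot calculation is restricted to records whose key range divides evenly among the slots. The adjustment steps and the duplicates case of 6.2.3 are deliberately left out, because the rounding convention they rely on cannot be recovered from these notes.

```python
def child_range(low, high, slot, slot_count):
    """Key range (l', h') of a new record hanging off the given slot.

    Implements l' = l + s(h + 1 - l)/n and h' = l + (s + 1)(h + 1 - l)/n - 1,
    assuming floor division.
    """
    span = high + 1 - low
    new_low = low + slot * span // slot_count
    new_high = low + (slot + 1) * span // slot_count - 1
    return new_low, new_high


def slot_for_key(key, low, high, slot_count):
    """Slot number s = (k - l) n / (h + 1 - l) for an evenly divided record."""
    if not low <= key <= high:
        raise ValueError("key lies outside this record's range")
    span = high + 1 - low
    if span % slot_count != 0:
        raise ValueError("uneven slots need the adjustment step, which is not modelled")
    return (key - low) * slot_count // span
```

Under these assumptions the functions reproduce the scenarios of 6.1: in a record spanning 0 to 999 with ten slots, key 777 falls in slot 7, whose child record spans 700 to 799; within that record, key 744 falls in slot 4, whose child record spans 740 to 749.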

Figure 8: A hash-index record with a lowest key of $l$, a highest key of $h$, a key-space of $K$ bits and $n$ slots, showing the lowest-possible keys for each slot in general terms. The slots begin at $l + 0(2^K/n)$, $l + 1(2^K/n)$, $l + 2(2^K/n)$, ..., $l + (n-1)(2^K/n)$; the record spans $l$ to $h = l + n(2^K/n) - 1$.

6.3 Parameter tuning

The first record's lowest ($l_0$) and highest ($h_0$) possible keys are a function of the hash index's key size $K$, which is constrained to be an integral multiple of 8 bits; the size of the key-space, $2^K$, will hence always be a power of 2. Thus

$$l_0 = 0$$

$$h_0 = 2^K - 1$$

The slots in each record cover this range, as shown in figure 8.

Older versions of Magma used hash-indices containing records with 10 slots each. Each record, therefore, could not divide its keyspace into equally-sized slots. Some slots would be greater than others; that is, they would contain more keys than others. This is because all the quantities (low key, high key, record key, number of slots, slot) involved are by definition integers. This unevenness necessitated the adjustments described in 6.2.2 and 6.2.3.

Therefore, if every record $r$ were to contain $n$ slots where

$$\log_2 n \in \mathbb{Z}$$

then the range $(l_r..h_r)$ would always be evenly divisible by $n$. For illustration, consider the example shown in figure 9. Newer versions of Magma enforce this constraint to ensure that each record contains slots of uniform size; this provides a small performance improvement.

Figure 9: The first four records of a hash-index where the keysize $K = 6$ and the number of slots $n = 4$

- record 0: $l_0 = 0$, $h_0 = 2^6 - 1 = 63$; slots 0 to 3 cover 0..15, 16..31, 32..47 and 48..63
- record 1: $l_1 = 0 + \frac{0(63+1-0)}{4} = 0$, $h_1 = 0 + \frac{(0+1)(63+1-0)}{4} - 1 = 15$; slots cover 0..3, 4..7, 8..11 and 12..15
- record 2: $l_2 = 4 + \frac{0(15+1-0)}{4} = 4$, $h_2 = 4 + \frac{(0+1)(15+1-0)}{4} - 1 = 7$; slots hold the keys 4, 5, 6 and 7
- record 3: $l_3 = 6 + \frac{0(3+1-0)}{4} = 6$, $h_3 = 6 + \frac{(0+1)(3+1-0)}{4} - 1 = 6$; all four slots hold the key 6

7 Extracting all objects from a database

Each of the (non-stale) objects in the database may be found in one of two ways:

1. Objects which are stored in a hash-index may be obtained by examining each slot in each record. Slots which are not empty will contain OIDs. This operation is an array-walk.

2. The anchor object by definition holds references, in the form of OIDs, to all other reachable objects which are not also in hash-indices. The anchor object's OID is obtainable from the objects file header. This operation is a recursive tree-walk.

Given an OID, the offset into the objects file may be found by a simple array dereference operation on the objects.idx file. Once the offset is known, the data in the objects file may be read, starting with the length field, which indicates how far to read. After that, the class ID and class version fields may be read since their offsets are known a priori. This provides enough information to convert the remaining bytes in the MaObjectBuffer into the corresponding object.

Thus to extract all objects from a Magma database, it is necessary to walk each (OID-only) hash index and then to traverse the object tree starting at the anchor object.
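The two-step extraction can be outlined in Python as follows. This is a sketch under stated assumptions, not Magma's own procedure: `read_child_oids` is a hypothetical callback that would read the record for an OID from the objects file and return the OIDs it contains, the user-object bounds are taken from table 1, and an explicit stack replaces recursion so that deep object graphs do not exhaust the call stack.

```python
FIRST_USER_OID = 4066542                      # table 1: start of the user-object range
LAST_USER_OID = 2**48 - 2**31 - 2**32 - 1     # table 1: end of the user-object range


def is_user_object(oid):
    """True if oid points at a stored record rather than holding an immediate value."""
    return FIRST_USER_OID <= oid <= LAST_USER_OID


def reachable_oids(anchor_oid, read_child_oids):
    """Collect every user-object OID reachable from the anchor object."""
    seen = set()
    pending = [anchor_oid]
    while pending:
        oid = pending.pop()
        if oid in seen or not is_user_object(oid):
            continue                          # already visited, or an immediate object
        seen.add(oid)
        pending.extend(read_child_oids(oid))
    return seen


def all_object_oids(hash_index_oids, anchor_oid, read_child_oids):
    """Union of the OIDs found by the array-walk and by the tree-walk."""
    found = {oid for oid in hash_index_oids if is_user_object(oid)}
    return found | reachable_oids(anchor_oid, read_child_oids)
```

The `seen` set guards against cycles, which object graphs may contain; OIDs outside the user-object range (nil, booleans, small integers, floats) are skipped because they need no lookup in the objects file.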