Transparent Access to Legacy Data in Java. Olivier Gruber. IBM Almaden Research Center. San Jose, CA Abstract

Similar documents
On Object Orientation as a Paradigm for General Purpose. Distributed Operating Systems

TOPLink for WebLogic. Whitepaper. The Challenge: The Solution:

Understanding Impact of J2EE Applications On Relational Databases. Dennis Leung, VP Development Oracle9iAS TopLink Oracle Corporation

Efficient Object-Relational Mapping for JAVA and J2EE Applications or the impact of J2EE on RDB. Marc Stampfli Oracle Software (Switzerland) Ltd.

Project # 1: Database Programming

CocoBase Delivers TOP TEN Enterprise Persistence Features For JPA Development! CocoBase Pure POJO

second_language research_teaching sla vivian_cook language_department idl

Object Persistence Design Guidelines

foreword to the first edition preface xxi acknowledgments xxiii about this book xxv about the cover illustration

WORLD WIDE NEWS GATHERING AUTOMATIC MANAGEMENT

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Distributed transactions (quick refresh) Layers of an information system

Chapter 18: Parallel Databases

Stackable Layers: An Object-Oriented Approach to. Distributed File System Architecture. Department of Computer Science

Java in Next-Generation Database Systems

Appendix A - Glossary(of OO software term s)

Middleware for Heterogeneous and Distributed Information Systems Sample Solution Exercise Sheet 5

An Introduction to Subtyping

Distributed Systems. The main method of distributed object communication is with remote method invocation

Advanced Database Applications. Object Oriented Database Management Chapter 13 10/29/2016. Object DBMSs

Server software accepts requests for data from client software and returns the results to the client

CAS 703 Software Design

DESIGN PATTERN - INTERVIEW QUESTIONS

FIVE BEST PRACTICES FOR ENSURING A SUCCESSFUL SQL SERVER MIGRATION

AOSA - Betriebssystemkomponenten und der Aspektmoderatoransatz

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Layers of an information system. Design strategies.

Web site Image database. Web site Video database. Web server. Meta-server Meta-search Agent. Meta-DB. Video query. Text query. Web client.

Schema and Database Evolution in Object. Database Systems. Project Progress Report. Parag Mahalley. Jayesh Govindrajan. Swathi Subramanium

NoSQL systems: introduction and data models. Riccardo Torlone Università Roma Tre

Advanced Database Systems

Enterprise Features & Requirements Analysis For EJB3 JPA & POJO Persistence. CocoBase Pure POJO

Chapter 18: Parallel Databases Chapter 19: Distributed Databases ETC.

Chapter 2. DB2 concepts

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

CS 403/534 Distributed Systems Midterm April 29, 2004

Hyper-V Top performance and capacity tips

Performance investigation into selected object persistence stores

Microsoft Access Database How to Import/Link Data

JDBC SHORT NOTES. Abstract This document contains short notes on JDBC, their types with diagrams. Rohit Deshbhratar [ address]

Automatic Code Generation for Non-Functional Aspects in the CORBALC Component Model

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 14 Database Connectivity and Web Technologies

Persistence. Chapter Introducing Persistence

PyROOT: Seamless Melting of C++ and Python. Pere MATO, Danilo PIPARO on behalf of the ROOT Team

C. E. McDowell August 25, Baskin Center for. University of California, Santa Cruz. Santa Cruz, CA USA. abstract

OBJECT ORIENTED SYSTEM DEVELOPMENT Software Development Dynamic System Development Information system solution Steps in System Development Analysis

CPSC 320 Sample Solution, Playing with Graphs!

Chapter 20: Database System Architectures

Violating Independence

"Charting the Course... Agile Database Design Techniques Course Summary

Utilizing Linux Kernel Components in K42 K42 Team modified October 2001

Java OOP: Java Documentation

The CESAR Project using J2EE for Accelerator Controls

Machine-Independent Virtual Memory Management for Paged June Uniprocessor 1st, 2010and Multiproce 1 / 15

Network. Department of Statistics. University of California, Berkeley. January, Abstract

Several major software companies including IBM, Informix, Microsoft, Oracle, and Sybase have all released object-relational versions of their

Chapter 2 Distributed Information Systems Architecture

The Data Access Layer:

KB_SQL 2016 (Version 5.8) Release Notes 03/28/2016. KBS eservice Center ( KBS.NET Download Agent...

Xton Access Manager GETTING STARTED GUIDE

Distributed Scheduling for the Sombrero Single Address Space Distributed Operating System

Assignment 5. Georgia Koloniari

KNSP: A Kweelt - Niagara based Quilt Processor Inside Cocoon over Apache

Storing Java Objects in any Database

Automatic Generation of Workflow Provenance

Chapter 13 Introduction to SQL Programming Techniques

Database Management Systems

Lecture 23 Database System Architectures

Agent-Enabling Transformation of E-Commerce Portals with Web Services

5. Distributed Transactions. Distributed Systems Prof. Dr. Alexander Schill

user with a QBE-like interface. The QBE-like interface creates a single virtual environment displaying database information about all current connecti

Microsoft SharePoint Server 2013 Plan, Configure & Manage

Chapter 18: Database System Architectures.! Centralized Systems! Client--Server Systems! Parallel Systems! Distributed Systems!

CS558 Programming Languages Winter 2013 Lecture 8

Transparent Java access to mediated database objects

Enterprise Workflow Resource Management

ODBMS: PROTOTYPES & PRODUCTS. The ODBMS Manifesto [M.Atkinson & al 89]

A Tool for Managing Evolving Security Requirements

Chapter 40 Another Solution to Publish Distributed SGML/XML Documents onto the Web

Best practices to achieve optimal memory allocation and remote desktop user experience

Object-Relational Modeling

Executing Legacy Applications on a Java Operating System

Ecient Redo Processing in. Jun-Lin Lin. Xi Li. Southern Methodist University

Persistent Identifier (PID) Partition. Indirectory free-list. Object Object Object Free space

UNIT III - JDBC Two Marks

ECLIPSE PERSISTENCE PLATFORM (ECLIPSELINK) FAQ

Database Administration and Tuning

ACRONYMS AND GLOSSARY

The Jini Architecture Bruno Souza Java Technologist, Sun Microsystems

Opal. Robert Grimm New York University

Overview. Distributed Systems. Distributed Software Architecture Using Middleware. Components of a system are not always held on the same host

2 Data Reduction Techniques The granularity of reducible information is one of the main criteria for classifying the reduction techniques. While the t

A taxonomy of race. D. P. Helmbold, C. E. McDowell. September 28, University of California, Santa Cruz. Santa Cruz, CA

Monitoring the Usage of the ZEUS Analysis Grid

GAVIN KING RED HAT CEYLON SWARM

CS370: Operating Systems [Spring 2017] Dept. Of Computer Science, Colorado State University

JiST: Java in Simulation Time

White Paper: Delivering Enterprise Web Applications on the Curl Platform

Programming in C++ Prof. Partha Pratim Das Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

COPYRIGHTED MATERIAL. Introducing the Project: The SmartCA Application. The Problem

Trusted Components. Reuse, Contracts and Patterns. Prof. Dr. Bertrand Meyer Dr. Karine Arnout

Joint Entity Resolution

Transcription:

Transparent Access to Legacy Data in Java Olivier Gruber IBM Almaden Research Center San Jose, CA 95120 Abstract We propose in this paper an extension to PJava in order to provide a transparent access to corporate data (les, relational systems, etc.). This extension relies on the concept of external object faulter which is conceptually close to the external pager one in operating systems. Our approach is a good basis for building middleware. Our prototype shows the feasibility of our approach. 1 Introduction Accessing corporate data is a more and more crucial issue for any desktop environment, Java is no exception. Java currently oers the Java DataBase Connectivity (JDBC) framework which aims at playing in the Java realm the same role that ODBC played in the PC world. However, JDBC does not provide transparent access to corporate data. It is interesting to realize that such a transparent access is complementary as well as contradictory with persistence in programming languages. Persistence in programming language is concerned by making some created objects to persist accross program executions. Criteria to decide which objects have to persist dier depending upon the adopted persistence model (reachability, inheritance, persistent allocation). However, the universal issue is how to keep on stable storage some objects, created during earlier program executions, in order to re-access them at a later time. Transparent access to corporate data targets the converse problem of accessing existing legacy data and mapping them into a programming language. Providing such a transparent access has the same advantages than providing transparent persistence. The basic idea is to be able to view legacy data as collections of objects. For instance, relations of tuples or les of records become collections of objects. These collections can be browsed but also queried if one is to provide a query processor and a query engine in the language. Transparent access to corporate data can be looked at as an extension to a persistent language. In this paper, we will assume the existence of a persistence 1

framework such as the one in PJava [2] which provides both persistence by reachibility and a transactional framework. Our approach is based on the notion of an external object faulter, conceptually very close to the concept of external pagers in operating systems. The external object faulter is in charged of fetching or putting away object states, being driven by the PJava run-time. The mapping between object classes and the legacy systems is under the responsability of each external object faulter which wraps a legacy system. For example, a relation of tuples in a relational database describing employees could be mapped onto a collection of employee objects. The external object faulter would be responsible for mapping tuple attributes onto class attributes, foreign key attributes onto object references, etc. This paper is structured as follows. In section 2, we sketch a typical partition architecture for a persistent language which allows for an external object faulting mechanism. Section 3 describes in detail the protocol between an external object faulter and the PJava runtime. In Section 4, we present our current validation prototype. Our implementation is totally in Java and do not rely on PJava which is simulated. In Section 5, we conclude. 2 A Partitioned Approach An external object faulter (XOF) is a similar concept to the external pager one in operating systems, except that XOFs apply to objects and not pages. We decided to look at XOFs as an extension to a persistent language. It may be possible to provide XOFs for non persistent languages, however we feel having a persistent host language simplies the design and provides potentially more power, especially when one considers transactional persistent languages such as PJava [2]. The key point addressed by external object faulters is that some objects persist outside the standard persistence mechanism of the language. This has two consequences. First, the language run-time cannot be responsible for fetching and saving all object states. Second, it is not solely responsible any more for the persistent identity of objects. We believe in an open approach where the language run-time delegates parts of the persistence task to some externally dened agents (XOFs) while remaining the overall coordinator. The key idea is to partition the object store beneath the Java virtual machine, each partition being backed up by a dierent XOF or by the native persistent manager. In such a context, PJava plays the role of a coodinator. Stabilize calls as well as transaction commits become distributed two-phase protocols between XOFs. Not only that, but the notion of identity has to take into account the dierent partitions, that is, dealing with local and cross-partition references. Each XOF provides persistent identity in a partition-local manner (LID). However, objects may reference each other across partition boundaries. This requires 2

PJava runtime to provide OIDs which are system-wide object identiers. Like in most partitioned or distributed systems, a system-wide OID is composed of the identier of the partition and the local identier. This is a well-known mechanism. PJava maintains the cross-partition reference tables. Notice that LIDs are XOF-dependent, that is, the content and structure of a LID depends of the legacy sources which is wrapped. LIDs may take many forms from simple number (oset in les) to multi-valued primary keys in relational database systems. Also notice that some legacy system may not be able to accept OIDs in place of LIDs. Relational systems are a good example if this case. The problems appears if one assumes that legacy applications will still run on top of the relational database or that SQL will still be used from PJava to benet from the power of queries. In both cases, relational foreign keys cannot be replaced with OIDs: neither legacy applications nor SQL would be able to handle these. 3 External Object Faulter The role of the external object faulter is to fetch or ush object states on behalf of the PJava runtime. It is interesting to detail the object faulting mechanism. 3.1 Object Faulting One of the main issue here is to determine which mechanism should be provided to write XOFs. On the one hand, XOFs can be thought of as plug-ins. In this case, XOFs are trusted software modules (like the current JDBC-ODBC bridge is) and raw C pointers are used to hydrate object shells. This approach provides maximum performance but very limited portability. On the other hand, one may wish to have XOFs implemented in Java. Indeed, XOFs for mapping les or JDBC sources would be great if they could be downloaded as applets. In this later case, it is important that security be not compromized by XOFs. Therefore, care must be taken in the way XOFs are allowed to ll in empty object shells. Our idea is to introduce an interface XObject. XObject provides generic methods to set and get attributes. These methods access attributes on a name basis, that is, the class name and the attribute name 1. Note that a complete type orthogonality is achieved since classes do not have to be dened as supporting the XObject interface nor they have to implement the get/set methods. XObject interface is in fact how an XOF sees an object shell, not through the object real type. The behavior of the get/set methods are obvious for basic types such as int or char. The string and array cases are more delicate. Although strings and arrays are Java objects, most legacy systems treat them 1 Having both a class and an attribute names is necessary because attribute shadowing is allowed in Java between a subclass and a super class. 3

as values. For instance, a relational database cannot provide an identity for strings, only tuples have an identity. Therefore, the get/set methods for string or array attributes will have two avors: a value avor and an object avor. In the value avor, the get/set methods are simply taking a Java string/array as parameter. The case of the object avor belongs to the general case of object attribute. What should be the behavior of get/set methods when the attribute is an object reference? Well, this is where swizzling and unswizzling occurs. Everything starts by binding a root (a named object). The PJava runtime keeps track of the association (name,xof). When an XOF binds a root, it will ask for a creation of an object from the PJava runtime, providing the oid of that object, it will get an XObject back. Using that XObject interface, the XOF will be able to set the attribute values. For object attributes, the XOF must provide the OID of the referenced object. The PJava runtime will remember the OID and swizzle it checking for already resident objects (thereby avoiding duplicates). Upon ushing, the XOF will get back OIDs when asking for the value of object attributes. In some sense, the XOF never sees the Java references to the object it loads. It only sees XObject and OIDs. Notice the above scheme simplies XOF implementations. Indeed, all the burden of providing a fuledge object manager is under PJava responsability. This requires PJava to maintain the object cache, to keep track of resident objects (avoiding duplicates) and of updates. Also, it is responsible for swizzling/unswizzling object references. 3.2 Schema Mapping An important issue in wrapping legacy sources is the mapping technology used. Both upon object faults and ushes, XOFs have to know the type information of objects as well as the necessary information for mapping the object states onto their legacy system. One could state this is the XOF implementor problem. The XOF has then to keep schema information on objects. This means that given an OID 2, an XOF has to be able to nd the corresponding schema description. If this works ne, this induces much redundant work. Indeed, the PJava runtime already keeps a detailled schema about classes and is able to relate objects with their class description. Allowing the XOF to access that information would really simplify the XOF implementation. However, this is not enough. Most of the time, a mapping is necessary. For instance, a name mapping may be needed between the column names in a relational database and the attribute names in the object. Similarily, the table name needs to be mapped to the class name. Therefore, XOFs need to associate a class denition with mapping information. Again, if PJava would keep that 2 Provided along with the XObject interface 4

information, the implementation of XOFs would be even more simplied. In other words, PJava would allow XOFs to attach mapping information to type denitions, mapping information provided along with the XObject interface upon object faults and ushes. This way, XOFs leverage to its full potential the fact that PJava has to manage a persistent schema. 4 Implementation Our current implementation diers from the above design, mainly because PJava is not yet available and because we lacked man power to modify the Java interpreter. However, the philosophy is respected. Our prototype is entirely written in Java and composed of a schema and two XOFs, one for les of records and another for relational database systems. The XOF for relational database systems is underconstruction, it uses JDBC to access relational sources. Having a schema is necessary to keep track of object types, persistent roots and collections, as well as the mapping from legacy data sources to object classes. Our schema is based on Java type system and is totally implemented in Java. We developed a class loader in Java which is able to read class le format. We thereby have a fuledge run-time description of Java classes, within Java. The mapping framework is very primitive, mostly supporting name mapping and foreign key translation into object references. Our prototype essentially diverges from the above design around the object swizzling mechanism. The only possible strategy for swizzling in a safe language like Java seems to be an half-eager half-lazzy approach. In a eager approach, when an object is hydrated (loaded with its state), all the references it contains are swizzled, that is, the pointed-to objects are created and their state loaded in memory. In a lazzy approach, the pointed-to objects are neither created nor their state loaded. Without support from the VM, none of the above schemes is possible. However, an intermediate approach works. When hydrating an object, the responsible XOF will create the pointed-to objects, but will not hydrate them (their state will be not be loaded). By creating the pointed-to objects, the XOF gets the Java references it needs for the hydrate process. In other words, if an object A references an object B, when A is hydrated, the object B is created as a side eect, but as an empty shell. That way, the XOF obtains a Java reference for B for swizzling the reference in A pointing to B. Given that we create an empty shell, we avoid the recursive swizzling of an eager strategy. Of course, some special course of actions has to be taken to fetch states of empty object shells before they are actually used. Objects which are managed by XOFs having to be instances of classes deriving from a special persistent class (PObject). So long for transparency, but again, this is due to the abscence of collaboration from the Java VM. The PObject class provides a method called hydrate() which needs to be called before an object state is accessed. Also, the 5

PObject class provides a method mark dirty() which signals updates. 5 Conclusion We proposed in this paper an extension to PJava in order to provide a transparent access to corporate data. This extension relies on the concept of external object faulter. The necessary protocols between PJava runtime and external object faulter are simple. Security is not violated, allowing external object faulters to be written in Java and therefore being safely downloadable. External object faulters can be used to wrap legacy systems, thereby providing a seamless access to corporate data which reside in le systems or relational database systems. This is an extremely interesting cornerstone for building middlewares such as Garlic [1]. Middleware of this categories target seamless access but want to retain ecient query capabilities across multiple legacy systems. The challenge is to support queries while retaining an impedance mismatch as minimal as possible. Our prototype shows the feasibility of our approach, even though PJava functionalities were simulated. We also intend to pursue in the middleware direction, implementing a query processor and a query engine in Java. The biggest challenges of our approach are undoubtly the transactional issues which are unavoidable when one talks about legacy systems. The impedance mismatch today is maximal regarding transactions. Transactional support varies greatly in legacy systems, from nothing to mostly short at two-phase locking transactions. This is far from enough for most applications which end developing a lot of dicult and error-prone code. An airline booking system demonstrates the short comings of current transactional supports and the amount of coding eort which is needed. Providing the right transactional framework is a key issue in the success of PJava for the market segments of middlewares and business objects. Acknowledgement I would like to thanks the people in the Garlic group for introducing me to middleware problems and architectures. Our discussion undoubtly seeded the ideas presented in this paper. References [1] M. Carey et al. Towards heterogeneous multimedia information systems: The garlic approach. In Proceedings of IEEE RIDE Workshop, 1995. [2] M.P. Atkinson, M.J. Jordan, L. Daynes and S. Spence. Design issues for persistent java: a type-safe, object-oriented, orthogonally persistent system. In Seventh International Workshop on Persistent Object Systems, 1996. 6