Point-in-Polygon linking with OGSA-DAI


Xiaochi Ma
MSc in High Performance Computing
The University of Edinburgh
Year of Presentation:


Abstract

Distributed data resources can be heterogeneous in their formats, schemas, volume, hardware and software implementation, management policy and so on. To access and share such heterogeneous data resources, software is needed that hides the heterogeneity of the resources and exposes the functionality of the shared resources in a standard way. This dissertation presents a procedure for integrating a particular kind of geographic information data on the Grid. The basic procedure and the related performance issues are discussed, and an implementation based on the OGSA-DAI Grid middleware is described.

1. Introduction
   Concept of Grid
   Concept of Web Service
   OGSA-DAI Grid middleware
   Point-in-Polygon linking in spatial database
   Aims of this project
   Content of rest chapters
2. Overview of OGSA-DAI
   Document-Oriented interface
   Data Service Resource
   OGSA-DAI Activity
   How data flow through activities
   Synchronous and asynchronous request
   Summary
3. Data migration in OGSA-DAI
   Basic transportation procedure
   Transportation model
   Measuring activity performance experiment
   Experiment aims
   How to measure performance at client side
   Experiment method
   Experiment Result
   Conclusion
   Experimental setup
4. Creating DataBuffer Activity
   Why creating DataBuffer activity
   How it improves performance
   Synchronization objects for DataBuffer
   Implementation detail
5. Spatial linking with OGSA-DAI
   Spatial database working with OGSA-DAI
   How the linking operation is performed
   How to improve efficiency of spatial join
   Spatial data migration in OGSA-DAI
   The procedure of data linking
   Migrating tables of normal relational data
   Creating spatial data column
   Summary
6. Further works
   Migrating table schema between databases
   Transferring spatial data
   Create a data resource accessor for PostGIS database
7. Conclusion
Reference

1. Introduction

This chapter introduces some basic concepts of the Grid and of web services. It also outlines the motivation for developing the OGSA-DAI middleware and the role this middleware plays in the project. The application scenario and the aims of the project are discussed, with an example illustrating the kind of data integration problem that the Grid can solve. The last section summarizes the contents of the remaining chapters.

1.1. Concept of Grid

Grid technology offers new opportunities for integrating and sharing large-scale, dynamic, autonomous and geographically distributed computing resources. Compared with legacy distributed computing systems, the participants in a Grid computing environment can be more dynamic and more varied. For example, in a conventional distributed database system, most databases are supplied by the same vendor, and the implementation and management of these data resources follow the same principles. In a Grid environment, the resources have different implementations and belong to different organizations and security domains, so a legacy distributed system does not seem flexible enough to share them among the participants.

The desired Grid environment can be considered a pool of resources: users do not need to know the details of each resource, and resources can be added to and withdrawn from the pool dynamically. In a conventional distributed computing environment, by contrast, the user is aware of the capabilities of each resource and the resource configuration is usually static. Sharing heterogeneous resources dynamically is one goal of Grid technology. Another initial goal is to expose the functionality of shared resources through a standard interface, so that applications and users on different platforms can all access these resources and use their special functionality.

The standards for communication and interaction between client and resource in the Grid should be widely accepted and platform-independent. The Grid should have a mechanism that hides the implementation heterogeneity of resources from the user and exposes these shared resources through a standard interface that can be accessed from different platforms. Ideally, the service offered by the Grid is seamless and standardized. In summary, the motivation of the Grid is to gather different kinds of resources together and share them among different users and applications.

1.2. Concept of Web Service

To achieve the desired functions of the Grid, participants running on varied software and hardware systems must interoperate well. Implementing the Grid therefore borrows heavily from basic web technology. Web service is a technology that allows applications to communicate with each other in a platform- and programming language-independent manner [1]. Because of this interoperability, web services are well suited to implementing the initial goals of the Grid concept. The promise of web service is to enable a distributed environment in which any number of applications, or application components, can interoperate seamlessly among and between organizations in a platform- and language-neutral fashion [2].

The Grid is a higher-level concept. To implement its goals, it needs a stack of protocols that standardize the functionality of the Grid, e.g. exposing resources, accessing resources and invoking special functionality. Web service technology is a lower-level concept that can be used to realize those goals, much as object-oriented techniques serve software design. Web service technology has the following characteristics:

XML based
XML is the core foundation of web services because it provides platform independence and interoperability. Using the XML standard as the communication format removes the dependence on the protocols or platforms of the different participants in the Grid.

Loosely coupled
A consumer of a web service is not tied to that web service directly; the web service interface can change over time without compromising the client's ability to interact with the service [3]. In the Grid environment, resources should be added and withdrawn dynamically, so loose coupling enables dynamic resource configuration in the Grid.
Supports Remote Procedure Calls (RPCs)
Web service allows clients to invoke procedures, functions, and methods on remote objects using an XML-based protocol [3]. A client can use this mechanism to invoke functionality of remote resources in the Grid. Operations on a resource can be exposed as a service interface, and the input and output of an execution are transferred between client and service in an XML-based format.

Three technologies support the core functionality of web services:

Simple Object Access Protocol (SOAP)
SOAP is an XML-based protocol for exchanging information in a distributed environment. It provides a standard for encapsulating data and transporting the document over a lower-level internet protocol, e.g. HTTP or SMTP. With SOAP, communication between heterogeneous participants becomes standardized and interpretable.

Web Service Description Language (WSDL)
WSDL is used to describe the web service interface. WSDL is an XML grammar for describing a collection of access endpoints capable of exchanging messages in a document-oriented manner [3].

Universal Description, Discovery, and Integration (UDDI)
UDDI provides a registry and directory model for advertising and discovering web services.

Web service technology is not intended to solve the heterogeneity of resources itself. It does, however, provide a stack of protocols that standardize the interaction between the client and the desired resources. The initial goal of the Grid is to provide a standard interface for heterogeneous resources; the loosely coupled pattern allows a resource to be deployed without notifying the client, and the RPC mechanism allows a distributed resource to be accessed from a local client. Because web services and the Grid share many characteristics and behaviours, the Grid concept is implemented on top of the stack of web service standards.

1.3. OGSA-DAI Grid middleware

With web service technology and the corresponding protocols, we can develop Grid middleware for resource configuration and management. Open Grid Service Architecture Data Access and Integration (OGSA-DAI) provides such an implementation, which exposes data resources via a web service interface. A data resource here usually means a database system or a plain text file that can store or output data. OGSA-DAI is a middleware product that allows data resources, such as relational or XML databases, to be accessed via web services.
An OGSA-DAI web service allows data to be queried, updated, transformed and delivered. OGSA-DAI web services can be used to provide web services that offer data integration services to clients [4]. One motivation for developing OGSA-DAI is to support the integration of data from various data resources: by interacting with standard web service interfaces, users can integrate data sets that are stored in different kinds of database. With the OGSA-DAI client toolkit, users can either extend the OGSA-DAI functionality, e.g. by developing a new activity supported by a data service resource, or develop client-side software for an OGSA-DAI Grid. In this project, each database is exposed through an OGSA-DAI data service interface, and the data integration procedure is controlled by client software developed with the OGSA-DAI client toolkit.

In a data Grid enabled by OGSA-DAI, the actual data resource is hidden from the user behind a standard web service interface. The client does not interact with the resource directly; instead, it communicates with the data service interface. A data service interface can expose more than one data service resource, each of which is associated with a single data resource. To operate on a data resource, the client needs to know the address of the web service interface and the ID of the data service resource. The web service address indicates where the perform document is sent, and the resource ID tells the service on which data resource the operation is performed. After execution, the result is returned to the client in an XML-encoded document.

Figure 1: Interacting data resource via OGSA-DAI data service (components shown: Client, Data Service, Data Service Resource, Data Resource)

1.4. Point-in-Polygon linking in spatial database

In an ordinary relational database, a join over two tables requires that both tables have one or more identical columns. Most of the time, the equals operator (=) can be used in SQL to join tables that have identical columns. In a spatial database, however, it can be meaningless to compare the value of a geographic coordinate point with that of a polygon. To tell whether or not a point is inside a polygon, the database itself provides a specific function that calculates the distance between the point and the polygon. In a spatial database, we can therefore join tables that hold related geographic information.
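To make the idea of a point-in-polygon test concrete, the following is a minimal Java sketch of the even-odd (ray casting) rule. It is an illustration only: a spatial database performs this kind of test with its own built-in functions, and the class and method names here (PointInPolygon, contains) are assumptions introduced for the example, not part of OGSA-DAI or of any database.

```java
// Illustrative sketch only: even-odd (ray casting) point-in-polygon test.
// PointInPolygon and contains are hypothetical names, not an OGSA-DAI or database API.
final class PointInPolygon {

    /**
     * Returns true if (px, py) lies inside the polygon whose vertices are
     * (xs[i], ys[i]) listed in order. Points exactly on an edge are not
     * treated specially.
     */
    static boolean contains(double[] xs, double[] ys, double px, double py) {
        boolean inside = false;
        int n = xs.length;
        for (int i = 0, j = n - 1; i < n; j = i++) {
            // Does the edge from vertex j to vertex i straddle the line y = py?
            boolean straddles = (ys[i] > py) != (ys[j] > py);
            if (straddles) {
                // x coordinate at which the edge crosses the line y = py
                double xCross = xs[j] + (py - ys[j]) * (xs[i] - xs[j]) / (ys[i] - ys[j]);
                if (px < xCross) {
                    inside = !inside; // a ray cast to the right crosses this edge
                }
            }
        }
        return inside;
    }

    public static void main(String[] args) {
        double[] xs = {0, 10, 10, 0};
        double[] ys = {0, 0, 10, 10};
        System.out.println(contains(xs, ys, 5, 5));  // point inside the square
        System.out.println(contains(xs, ys, 15, 5)); // point outside the square
    }
}
```

A point is counted as inside when a horizontal ray from it crosses the polygon boundary an odd number of times, which is why the flag is toggled at each crossing.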
The point-in-polygon linking operation joins tables that have either a coordinate column or a bounding polygon column. In the application scenario, the tables that contain spatial columns are stored in a remote database. An ordinary relational database may be unable to perform spatial linking operations, so there are two choices as to where the spatial join is carried out. One is to join the data locally, which requires the user to set up a spatial database on the local machine; the other is to join the data in the remote database where the geographic data resides. If the special functionality of the spatial database can be invoked via an OGSA-DAI data service interface, the problem becomes easy to solve.

Another problem with joining data held in different databases is that data must be moved between databases. It is important to measure and analyze the performance of data transportation in OGSA-DAI and to explore an efficient mechanism for doing this.

1.5. Aims of this project

The aim of this project is to explore the issues relating to the following application scenario. The user is aware of the following public databases:

A database that contains data of interest to the user, e.g. house prices, average salary, regional education level and so on. These public data are categorized by geographic zone. Each zone is a polygon defined by a set of geographic coordinate points.

Another database that has an intermediate table mapping each postcode to the corresponding coordinate point.

Access to these databases goes through the web service interface exposed by an OGSA-DAI server. The user has his own database, which is also exposed by OGSA-DAI and contains private data sets with a postcode column.

In the application scenario, the user specifies the related columns of the joined tables and the required public data; the client application performs the data integration across tables and moves the final result back to the user's database. To obtain the final data set, the client needs to perform an ordinary join among the normal relational tables and a spatial join (the point-in-polygon linking) in the spatial database. When a linking operation is performed, the participating tables must be stored in the same database; if the tables are stored in different databases, the client must know how to move and join the data efficiently.
A simple data integration scenario is presented here. Table A contains a public data set in a spatial database, and the user wants to find the corresponding house price and salary for his local region. The user can also create a table and move data into it.

Table A:

    Polygon                        House Price   Salary
    Polygon(PointA, PointB,...)    120,000       25,000
    Polygon(PointC, PointD,...)    140,000       20,000

Table B maps a postcode to a geographic coordinate; this table is assumed to belong to a different spatial database. It works as an index: by comparing the value of the Postcode column with the postcode specified by the user, we can find the geographic position of the user's postcode. The point-in-polygon linking is performed between this table and table A.

    Postcode    osgridcoord
    EH3 5LW     PointM
    EH5 2PU     PointN

Table C is the user's private data set, stored in the user's own database. The operator = can be used to join this table with table B; this linking operation can be carried out in a normal relational database.

    User data            Postcode
    (id, address,...)    EH5 2PU
    (id, address,...)    EH3 5LW

The user wants to obtain the following data set:

    User data            postcode    house price   salary
    (id, address,...)    EH5 2PU     120,000       25,000
    (id, address,...)    EH3 5LW     140,000       20,000

Most of the postcodes in the user's database are likely to belong to a small area, which means only a few polygons. The polygon table in the public database, however, could store geographic information covering a worldwide region, which needs a much larger number of polygon rows. If the unnecessary polygon data can be removed before joining with the point data set specified by the user, both the performance of data transportation and the efficiency of spatial linking could improve. This project includes an experiment on performing the spatial linking efficiently; the details are discussed in chapter 5.
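The idea of discarding unnecessary polygons before the join can be sketched in Java as below. This is a hedged illustration, not the mechanism used in the project (that is described in chapter 5): it assumes a simple bounding-box comparison as the filter, and the names ZonePrefilter, Zone and prefilter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: keep only the zones whose bounding box overlaps the
// bounding box of the user's points. The bounding-box filter is an assumption
// made for illustration; it is not taken from the dissertation.
final class ZonePrefilter {

    /** A zone polygon given by its vertex coordinates. */
    static final class Zone {
        final String name;
        final double[] xs;
        final double[] ys;

        Zone(String name, double[] xs, double[] ys) {
            this.name = name;
            this.xs = xs;
            this.ys = ys;
        }
    }

    /** Returns the zones that could contain at least one of the given points. */
    static List<Zone> prefilter(List<Zone> zones, double[] px, double[] py) {
        double minX = min(px), maxX = max(px);
        double minY = min(py), maxY = max(py);
        List<Zone> kept = new ArrayList<>();
        for (Zone z : zones) {
            boolean overlaps = min(z.xs) <= maxX && max(z.xs) >= minX
                    && min(z.ys) <= maxY && max(z.ys) >= minY;
            if (overlaps) {
                kept.add(z); // only these rows need an exact point-in-polygon test
            }
        }
        return kept;
    }

    static double min(double[] v) {
        double m = v[0];
        for (double d : v) m = Math.min(m, d);
        return m;
    }

    static double max(double[] v) {
        double m = v[0];
        for (double d : v) m = Math.max(m, d);
        return m;
    }
}
```

Only the zones that survive the filter would then need to be transported and tested exactly, which is the saving the paragraph above anticipates.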

1.6. Content of rest chapters

The remaining chapters are organized as follows:

Chapter 2 introduces each component of OGSA-DAI, focusing mainly on activities and how they work.

Chapter 3 focuses on the data delivery mechanism of OGSA-DAI and its performance issues; it analyzes the performance of the activities in the chain used for data transportation between services.

Chapter 4 introduces the DataBuffer activity and explains how it improves performance by making the activities work in parallel on both the sink and the source side.

Chapter 5 presents the procedure for performing spatial linking operations within an OGSA-DAI Grid environment, together with an efficient mechanism for the point-in-polygon linking operations.

Chapter 6 discusses further work for this project.

Chapter 7, the last chapter, summarizes the work done in this project.

2. Overview of OGSA-DAI

In this project, all data sets are manipulated via OGSA-DAI web service interfaces. In the application scenario, users only need to interact with these data service interfaces; the details of data manipulation are hidden from the client. To understand how OGSA-DAI works, it is important to understand its working procedure and architecture. This chapter discusses the concepts of data service, data service resource and activity, and presents the working procedure of an activity chain.

The OGSA-DAI middleware is a web service implementation of a data Grid. In an OGSA-DAI Grid, each data resource is exposed via a data service interface, and the intrinsic functionality of the resource is exposed as a kind of web service. An OGSA-DAI web service allows data to be queried, updated, transformed and delivered. OGSA-DAI web services can be used to provide web services that offer data integration services to clients [4]. The architecture of the OGSA-DAI middleware has several components and interfaces. How the components interact is presented in Figure 2.

Figure 2: interaction via a document-oriented interface [4]

OGSA-DAI can provide web services that follow two kinds of specification: Web Service Inter-operability (WSI) and Web Service Resource Framework (WSRF). In this project, all experiments are implemented with OGSA-DAI (WSI).

2.1. Document-Oriented interface

The OGSA-DAI middleware is designed to offer web services to users for data integration purposes. For interoperability, the service should be accessible from various platforms, so it needs a stack of protocols to regulate the communication between client and service. As a widely accepted format, XML is chosen as the message format for web services. A Web service is an interface that describes a collection of operations that are network-accessible through standardized XML messaging [3].

In OGSA-DAI, the interaction between the client and the data service interface is achieved by sending a perform document (client to service) and receiving a response document (service to client). Both kinds of document are written in XML. The content of a perform document is a chain of OGSA-DAI activities that are pipelined together. The perform document tells the service which operations are to be carried out on the data resource. The data service itself does not parse the content of the perform document; on receipt, the document is forwarded to the data service resource, where the activities are executed. The data service interface is the endpoint for the interaction between client and service. When the data service resource finishes executing the activities in the perform document, the result or execution status is written into the response document and sent back to the client. In OGSA-DAI, then, communication between client and service is achieved by exchanging documents.

2.2. Data Service Resource

In the OGSA-DAI architecture, the data service is used only as the point of contact with the client, and the data service resource is where the user's instructions are understood and carried out. Separating the data service from the data service resource hides the server-side implementation details from the client, which helps to provide a seamless service. One data service can be associated with several data service resources, so to obtain an instance of the web service the client must specify both the address of the data service and the ID of the associated resource.

Inside an OGSA-DAI web service, the perform document is consumed by the data service resource, where its content is parsed and executed. The data service resource translates the activities into the corresponding data manipulations and wraps the execution result from the database system into the response document.
Each data service resource can be associated with only one data resource.

2.3. OGSA-DAI Activity

In OGSA-DAI, the capabilities of a data service resource are referred to as activities; each activity represents a kind of operation that the data service resource can perform on the data resource. Activities are the operations that data service resources can perform on behalf of a client. These normally expose an intrinsic capability of the underlying data resource or may be functionality added at the service layer [4]. The activity is the key concept of OGSA-DAI: all of the user's instructions to the service are represented by the activities in the perform document. Usually the activities are pipelined in a chain, which is achieved by connecting the output of one activity to the input of the next. When the data service resource executes the activity chain, the data is streamed from the first activity to the last. An example of an activity chain is illustrated in Figure 3.

Figure 3: Activity chain (DataStore activity, DataBuffer activity, Tokenizer activity)

Because an activity can be additional functionality of the service layer, users can extend OGSA-DAI by adding specific operations at the service side.

The client uses the perform document to tell the data service resource which operations to perform on the data resource. All the activities specified by the client are written into the perform document, which is XML-based. The following example of a perform document contains an activity chain for a synchronous query, consisting of the activities SQLQuery and WebRowSet.

    <?xml version="1.0" encoding="utf-8"?>
    <perform xmlns="...">
      <documentation>
        Perform a simple SELECT statement and transform the results into WebRowSet XML.
      </documentation>
      <sqlquerystatement name="statement">
        <expression>select * from littleblackbook where id=10</expression>
        <resultstream name="statementoutput"/>
      </sqlquerystatement>
      <sqlresultstoxml name="webrowset">
        <resultset from="statementoutput"/>
        <webrowset name="webrowsetoutput"/>
      </sqlresultstoxml>
    </perform>
    [4]

In this perform document, the element sqlquerystatement represents the SQLQuery activity; it has two sub-elements, expression and resultstream. The content of the expression element is the SQL query statement that is executed on the data resource. The name attribute of the resultstream element specifies the name of the output stream carrying the query result. The from attribute of the resultset element specifies which stream is used as the input of the WebRowSet activity. Both attributes have the same value, "statementoutput", which means that the output of SQLQuery is connected to the input of WebRowSet.
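The wiring rule just described, where a from attribute must match the name of some output stream, can be checked mechanically. The Java sketch below uses only the standard javax.xml DOM parser; the class PerformWiringCheck and its method isWired are hypothetical helpers written for this illustration and are not part of the OGSA-DAI client toolkit. The element names are the ones shown in the example above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of OGSA-DAI: checks that every "from"
// attribute in a perform document refers to a declared "name" attribute.
final class PerformWiringCheck {

    static boolean isWired(String performXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(performXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed XML document", e);
        }
        NodeList all = doc.getElementsByTagName("*"); // every element in the document

        // First pass: collect every declared name (activities and output streams).
        Set<String> declared = new HashSet<>();
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (e.hasAttribute("name")) {
                declared.add(e.getAttribute("name"));
            }
        }
        // Second pass: every input must be connected to a declared name.
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (e.hasAttribute("from") && !declared.contains(e.getAttribute("from"))) {
                return false;
            }
        }
        return true;
    }
}
```

For the example document, the only from attribute is "statementoutput", which matches the name of the resultstream element, so the check would succeed; renaming either side breaks the chain.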

2.4. How data flow through activities

When the data service resource executes an activity chain, only one activity is being executed at any moment. An activity does not ask for more data from the activity connected to its input until it has finished processing the current block. Consider, for example, a request whose chain contains the SQLQuery, WebRowSet and SQLBulkLoad activities. When the data service resource begins to execute the request, the query is not carried out first. Instead, the first activity to be executed is SQLBulkLoad: it asks for one block of data from its input port, which is connected to the output of WebRowSet, and WebRowSet is then executed. The query does not start until WebRowSet asks it for data. Figure 4 shows how data goes through the activities.

Figure 4: Data goes through activities (SQLBulkLoad asks WebRowSet for data through its input; WebRowSet asks SQLQuery for data through its input)

Data is streamed through the activities in units of blocks. A block can be a Java Object of any type although usually they are Strings or byte arrays [4]. In the example above, the data block output by SQLQuery is a ResultSet object, and for WebRowSet a single data block is the XML-formatted data of one row of the query result.

2.5. Synchronous and asynchronous request

An activity request can be executed synchronously or asynchronously; the execution pattern depends on the activities being executed. A request whose last activity has no attached output, e.g. DeliverToNull, is executed asynchronously: when such an activity is appended to the end of the activity chain, the response document is returned immediately after the data service receives the perform document. An asynchronous request is very useful if the user does not want to wait a long time for the result or if the output is not required.
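The pull-driven flow of section 2.4 can be mimicked in a few lines of Java. This is a sketch under stated assumptions: the Stage interface and the query, toXml and load methods are invented for the illustration and only imitate the roles of SQLQuery, WebRowSet and SQLBulkLoad; they do not use the OGSA-DAI activity API.

```java
import java.util.Iterator;
import java.util.List;

// Illustrative model of a pull-driven activity chain. All names are
// hypothetical; this does not use the OGSA-DAI activity classes.
final class PullChain {

    /** A stage yields one block per call and returns null when exhausted. */
    interface Stage {
        Object nextBlock();
    }

    /** Plays the role of SQLQuery: produces rows only when asked. */
    static Stage query(List<String> rows, List<String> log) {
        Iterator<String> it = rows.iterator();
        return () -> {
            if (!it.hasNext()) return null;
            String row = it.next();
            log.add("query:" + row);
            return row;
        };
    }

    /** Plays the role of WebRowSet: converts one upstream block per request. */
    static Stage toXml(Stage upstream, List<String> log) {
        return () -> {
            Object row = upstream.nextBlock(); // ask the previous activity for data
            if (row == null) return null;
            log.add("xml:" + row);
            return "<row>" + row + "</row>";
        };
    }

    /** Plays the role of SQLBulkLoad: the last stage drives the whole chain. */
    static int load(Stage upstream, List<String> log) {
        int loaded = 0;
        for (Object b = upstream.nextBlock(); b != null; b = upstream.nextBlock()) {
            log.add("load:" + b);
            loaded++;
        }
        return loaded;
    }
}
```

Building the chain with query and toXml performs no work at all; rows are produced only once load starts pulling, and each row passes through all three stages before the next row is requested.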

2.6. Summary

This chapter discussed the basic components and concepts of OGSA-DAI. The data service is a general interface that tells the client where to find the shared data resource. To access a particular data resource, the user needs to specify the ID of the data service resource that is associated with the underlying resource. All operations performed on the data resource are represented as activities of the data service resource. An activity can also be functionality added at the service layer.

3. Data migration in OGSA-DAI

In the application scenario, joining tables from different databases requires some data sets to be duplicated by moving data from one database to another. Before the join begins, the data sets to be joined must be stored in the same database, and after the join, the linked data set must also be transferred to a scratch database for later retrieval. The performance of data transportation is therefore crucial to the performance of the whole linking procedure. It is important to understand the performance issues and to find an efficient mechanism for migrating data between OGSA-DAI data services.

OGSA-DAI supports various ways of delivering data. The transportation mechanism used in this project connects the output data to a session stream exposed by the data service resource of the source service; the sink service then pulls the data from the stream and inserts it into the target table. The session is used to store status information for the interaction between multiple activity requests executed on the same data service resource. When the request is executed at the source data service resource, a new session requirement is specified and the output stream activity creates a session stream. If no session requirement is specified for an activity request, the request joins an implicit session, which is terminated when execution is over. For the transportation to work correctly, the session stream must exist until all the required data has been delivered to the sink service, so an explicit session is created when the request is processed at the source data service.

3.1. Basic transportation procedure

To have the data pulled from the source service to the sink service, the client sends three perform documents. The first is for the source service; it contains the activities that query the required data and connect it to the output stream. The second is executed by the data service resource on the sink side; it tells the sink service where the output stream is and how to handle the delivered data. When the data delivery is finished, the client sends a third perform document to the source service to terminate the session used for the transportation.

An example of pulling data between services is illustrated in Figure 5.

Figure 5: Pulling data from source service (source service, within the source session: SQLQuery, WebRowSet, DTOutput; sink service: DeliverFromDT, SQLBulkLoad, DeliverToNull)

The activity chain in the first perform document, which is sent to the source service, ends with DTOutput, an activity without an attached output. The request is therefore executed asynchronously, and at this moment no data has yet been streamed to the output stream. Data is pulled from the source to the sink service when the SQLBulkLoad activity is executed by the data service resource on the sink side. Each time, the bulk load activity asks for one block of data from DeliverFromDT, which retrieves data from the output stream exposed by the data service resource of the source service. Figure 6 illustrates how the bulk load activity retrieves the data from the source service. The bulk load activity has to wait for the data transportation to complete.

Figure 6: How data is moved across activities (SQLBulkLoad, DeliverFromDT, output stream; calls shown: processblock, getnblock)
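The role of the explicit source session can be illustrated with a small, self-contained Java model. It is only a sketch: SourceSession, getBlocks, terminate and pullAll are hypothetical names standing in for the session stream, the block requests made by the sink, and the final terminating request; none of them is an OGSA-DAI API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of a source-side session stream that stays available
// until the client explicitly terminates it. Not an OGSA-DAI API.
final class SessionStreamSketch {

    static final class SourceSession {
        private final Queue<String> stream = new ArrayDeque<>();
        private boolean terminated = false;

        /** Source side: the output activity writes a block into the stream. */
        void write(String block) {
            stream.add(block);
        }

        /** Sink side: returns up to n blocks; an empty list means the stream is drained. */
        List<String> getBlocks(int n) {
            if (terminated) {
                throw new IllegalStateException("session terminated");
            }
            List<String> out = new ArrayList<>();
            while (out.size() < n && !stream.isEmpty()) {
                out.add(stream.remove());
            }
            return out;
        }

        /** Third request in the procedure: the client ends the session. */
        void terminate() {
            terminated = true;
            stream.clear();
        }
    }

    /** Sink side: keep pulling batches until the source stream is drained. */
    static List<String> pullAll(SourceSession session, int blocksPerRequest) {
        List<String> loaded = new ArrayList<>();
        List<String> batch;
        while (!(batch = session.getBlocks(blocksPerRequest)).isEmpty()) {
            loaded.addAll(batch);
        }
        return loaded;
    }
}
```

In this model the sink can only read while the session is alive, which mirrors why the session must outlive the first request and be ended by a separate request once delivery is complete.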

3.2. Transportation model

OGSA-DAI has two models for data transportation, Block and Full. In the example above, the transportation between services uses the Simple Object Access Protocol (SOAP) for data delivery. In a SOAP message the data is encapsulated in blocks. In the Block model, the maximum number of blocks per message is a constant; if the number of data blocks to deliver exceeds that limit, the source service has to construct another SOAP message to hold the remaining data, which costs some time. In the Full model, all the data is written into one message, so the source service only needs to produce one SOAP message, but if the number of data blocks is very large, constructing the message can require more memory.

3.3. Measuring activity performance experiment

To analyze the performance of data migration between OGSA-DAI data services, the processing time of different perform documents must be measured. An activity chain contains several activities, and performance analysis requires knowing the efficiency of each activity in the chain. One way to obtain the processing time of an activity and of a perform document is to put time measurement code on the server side. In many situations, however, users are not able to add profiling code to servers administered by other people. For such users, it is interesting and useful to explore a mechanism for measuring performance and identifying the bottleneck of an activity request from the client side.

3.4. Experiment aims

The whole data migration procedure has three stages. The first stage queries the data and converts it into WebRowSet format for delivery and insertion, which requires executing the activities SQLQuery and WebRowSet. The second stage connects the query result to the output stream of the source data service and delivers it to the input stream of the sink service, which are the tasks of DeliverToDT and DeliverFromDT. The last stage inserts the data into the target table with the activity SQLBulkLoad. These activities are used to deliver data between data services, and their behaviour directly affects the performance of data transportation; the experiment measures the performance of each stage. By analyzing the results, we can determine the efficiency of these activities and find the inefficient part of the transportation procedure.
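The trade-off between the Block and Full models of section 3.2 comes down to simple arithmetic on the number of messages. The Java sketch below is illustrative only: the class TransportModel and its methods are hypothetical, and the block limit is a parameter here because the text does not state the actual constant.

```java
// Illustrative arithmetic only; TransportModel is a hypothetical class and the
// per-message block limit is a parameter, since the text does not give its value.
final class TransportModel {

    /** Block model: messages needed = ceil(blocks / maxBlocksPerMessage). */
    static long messagesInBlockModel(long blocks, long maxBlocksPerMessage) {
        if (blocks < 1 || maxBlocksPerMessage < 1) {
            throw new IllegalArgumentException("both arguments must be at least 1");
        }
        // Integer ceiling division without floating point.
        return (blocks + maxBlocksPerMessage - 1) / maxBlocksPerMessage;
    }

    /** Full model: all blocks are written into a single message. */
    static long messagesInFullModel(long blocks) {
        if (blocks < 1) {
            throw new IllegalArgumentException("blocks must be at least 1");
        }
        return 1;
    }

    /** Full model: the single message has to hold every block at once. */
    static long blocksHeldPerMessageInFullModel(long blocks) {
        return blocks;
    }
}
```

With a hypothetical limit of 100 blocks per message, 1001 blocks would need 11 messages in the Block model, whereas the Full model would build one message holding all 1001 blocks, which is where the extra memory cost comes from.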

3.5. How to measure performance at client side

To get the processing time of a perform document, we can record the system time before sending the perform document and after receiving the response document. In the implementation code, it looks like this:

    // Time a synchronous request from the client side.
    long start = System.currentTimeMillis();
    Service.perform(request);
    long stop = System.currentTimeMillis();
    long time = stop - start;

The method System.currentTimeMillis() records the time before the perform document is sent and after the response document is received. The difference tells us how much time was used to process the activities in the perform document. This approach does not work for an asynchronous request, however, because the response document is returned before the activities of the request have finished executing. The method pollUntilRequestCompleted can be used to wait until the execution of an asynchronous request is finished.

    ...
    // Time an asynchronous transfer: wait until the source session completes.
    long start = System.currentTimeMillis();
    Sink.perform(sinkreq);
    Source.pollUntilRequestCompleted(source_session.getsessionid(), 100);
    long stop = System.currentTimeMillis();

The polling interval is a parameter of the poll method; in the code above, the execution status of the source service is checked every 100 milliseconds until it is set to COMPLETED. To measure the performance of an asynchronous request for data transportation, the poll method can be used to obtain the processing time of the perform document executed on the source side.

In the experiment, the client did not check the processing status of the source session every millisecond, because handling status requests from the client slows down the data transportation work of the source server. Instead, the status of the source session is checked every 100 milliseconds. However, this introduces some error into the results, because the session could finish just after the client sends its last status request.
In that situation the measured session processing time will include almost 100 extra milliseconds. When a source data service delivers a relatively large amount of data, for example hundreds of rows, to a sink service, the whole processing time is much larger than 100 milliseconds. For instance, when the number of delivered rows is 125, the extra 100 milliseconds accounts for only 2.135% of the total processing time. This error is therefore tolerable, because the further performance analysis is based on thousands of rows.

After obtaining the processing times of the activity requests, the cost of each activity can be calculated by comparing them. For example, if one request contains activities A, B and C and another request contains only activities A and C, the processing time of activity B is the difference between the processing times of the two requests.

In the data migration procedure, the activity chain can be divided into three sub parts:

1. Query the source database and transform the result into the specified format for transportation and insertion into the sink database. The corresponding activities are SQLQuery and WebRowSet.
2. Serialize and deserialize the data for transportation. The corresponding activities are DeliverToDT and DeliverFromDT.
3. Load the data into tables at the sink side. The output of SQLBulkLoad is the number of updated rows, which the user usually does not need to know, so DeliverToNull is added to the end of the activity chain in order to make the request be processed asynchronously.

The performance of parts 1 and 3 can be measured simply at the client side by timing the execution of the activity request. To get the time used for delivering data between data services, however, the client needs to measure the performance of the asynchronous request at the source service by polling it until the session status is completed.

3.6. Experiment method

There are two activity requests:

Request#1: SQLQuery + WebRowSet + DeliverToNull;
Request#2: SQLQuery + WebRowSet + SQLBulkLoad;

Executing the two requests on the same data service resource gives the processing times of requests #1 and #2, T1 and T2.
Since the DeliverToNull activity does not manipulate data, its processing time is much smaller than that of the time-consuming activities SQLQuery and WebRowSet. Therefore, the overhead of DeliverToNull in request#1 can be ignored. As a result, the difference T2 - T1 can be taken as the execution time of the SQLBulkLoad activity.
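The client-side measurement described in sections 3.5 and 3.6 can be sketched in plain Java as follows. This is only a minimal illustration under stated assumptions: the class RequestTiming and its methods timeMillis, pollUntil, stageTime and pollingErrorPercent are hypothetical names, a Runnable stands in for the call that sends a perform document, and Thread.sleep simulates the work of a request. No OGSA-DAI classes are used.

```java
import java.util.function.BooleanSupplier;

/** Sketch of the client-side timing method; the request bodies are stand-ins. */
class RequestTiming {

    /** Wall-clock time in milliseconds taken by one synchronous call. */
    static long timeMillis(Runnable request) {
        long start = System.currentTimeMillis();
        request.run();                       // stand-in for sending a perform document
        return System.currentTimeMillis() - start;
    }

    /**
     * Blocks until isCompleted reports true, checking every intervalMillis.
     * The observed finish time can be late by up to one interval.
     */
    static void pollUntil(BooleanSupplier isCompleted, long intervalMillis)
            throws InterruptedException {
        while (!isCompleted.getAsBoolean()) {
            Thread.sleep(intervalMillis);
        }
    }

    /** Time of the activity that is present in the longer request only. */
    static long stageTime(long withStage, long withoutStage) {
        return withStage - withoutStage;
    }

    /** Worst-case share of one polling interval in a measured total, in percent. */
    static double pollingErrorPercent(long intervalMillis, long totalMillis) {
        return 100.0 * intervalMillis / totalMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        long t1 = timeMillis(() -> sleep(20));   // hypothetical request#1
        long t2 = timeMillis(() -> sleep(50));   // hypothetical request#2
        System.out.println("stage time [ms]: " + stageTime(t2, t1));

        // Simulated asynchronous request that completes after about 30 ms.
        long deadline = System.currentTimeMillis() + 30;
        pollUntil(() -> System.currentTimeMillis() >= deadline, 10);

        System.out.println("error of a 100 ms interval in a 10 s run: "
                + pollingErrorPercent(100, 10_000) + "%");
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The longer the measured request runs relative to the polling interval, the smaller the relative error, which is why a coarse interval is acceptable for requests that deliver thousands of rows.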

In the following experiment, the overhead of the DeliverToNull activity is always ignored, as it is insignificant when measuring the execution time of requests. The performance of DeliverToDT and DeliverFromDT can be obtained as follows:

Request#3:
Source side: SQLQuery + WebRowSet + DTOutput
Sink side: DeliverFromDT + DeliverToNull

Executing this request actually needs two perform documents, which are sent to the source and sink services individually. The pollUntilRequestCompleted method is used for measuring the processing time of the asynchronous request. Subtracting T1 from the processing time T3 gives the processing time for delivering data from the output stream of the source service to the input stream of the sink service.

To get stable performance data, every activity request is executed 30 times in the experiment and the result is the average value. The transportation model is set to Block, and the number of data blocks is 4000 for one go.

3.7. Experiment Result

The performance is measured in seconds, and the amount of delivered data is increased starting from 125 rows. The following table shows the processing time of each stage in the transportation of different numbers of rows.

[Table: processing time in seconds of each stage by number of rows; columns: Rows, SQLQuery+WebRowSet, DTOutput+DeliverFromDT, SQLBulkLoad; the values are not recoverable from the source]

Because the above results are average values, the standard deviation was also recorded when collecting the performance data. The standard deviation indicates the variation between the results of the individual executions.

[Table: standard deviation of each stage by number of rows; columns: Rows, SQLQuery+WebRowSet, DTOutput+DeliverFromDT, SQLBulkLoad; the values are not recoverable from the source]
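The averaging over repeated executions and the accompanying standard deviation can be computed as in the following sketch. The class RunStatistics and the sample values are hypothetical, and the use of the sample standard deviation (division by n - 1) is an assumption, since the text does not say which form was used.

```java
/** Mean and sample standard deviation of repeated timing runs. */
class RunStatistics {

    static double mean(double[] samples) {
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    /** Sample standard deviation (divides by n - 1); needs at least two samples. */
    static double standardDeviation(double[] samples) {
        double m = mean(samples);
        double squared = 0.0;
        for (double s : samples) {
            squared += (s - m) * (s - m);
        }
        return Math.sqrt(squared / (samples.length - 1));
    }

    public static void main(String[] args) {
        // Made-up processing times in seconds, for illustration only.
        double[] times = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
        System.out.println("mean = " + mean(times));
        System.out.println("std  = " + standardDeviation(times));
    }
}
```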

Putting data into the output stream, delivering it between the services and getting it from the input stream at the sink side is the most costly part of the data transportation procedure. As the number of delivered rows increases, the time consumed by this part becomes dominant in the whole processing time. The cost of SQLBulkLoad also increases with the amount of transported data. The percentage of execution time is illustrated in Figure 7.

[Figure 7: The proportion of processing time of activities. Vertical axis: share of processing time in percent; horizontal axis: # of rows; series: SQLBulkLoad, DTOutput+DeliverFromDT, SQLQuery+WebRowSet]

As the number of delivered rows increases, the time consumed by the serialization and deserialization activities DTOutput and DeliverFromDT becomes dominant: almost 65% of the whole processing time is consumed by these two activities. Another costly activity is SQLBulkLoad, which takes about 33% of the processing time, especially when the number of delivered rows is large.
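The proportions plotted in Figure 7 follow from the per-stage times by simple normalization, as in the sketch below. The class StageShares and the times in the example are hypothetical and are not the measured values of the experiment.

```java
/** Converts per-stage processing times into percentage shares of the total. */
class StageShares {

    static double[] percentages(double[] stageTimes) {
        double total = 0.0;
        for (double t : stageTimes) {
            total += t;
        }
        double[] shares = new double[stageTimes.length];
        for (int i = 0; i < stageTimes.length; i++) {
            shares[i] = 100.0 * stageTimes[i] / total;
        }
        return shares;
    }

    public static void main(String[] args) {
        String[] stages = {"SQLQuery+WebRowSet", "DTOutput+DeliverFromDT", "SQLBulkLoad"};
        double[] times = {1.0, 6.0, 3.0};   // made-up seconds, not measured values
        double[] shares = percentages(times);
        for (int i = 0; i < stages.length; i++) {
            System.out.printf("%-24s %5.1f%%%n", stages[i], shares[i]);
        }
    }
}
```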

[Figure 8: Performance of activities. Vertical axis: Time [s]; horizontal axis: # of rows; series: SQLQuery+WebRowSet, DTOutput+DeliverFromDT, SQLBulkLoad]

The above figure shows how the amount of delivered data affects the performance of each part of the data transportation procedure. As the number of rows increases, putting the data into the output stream exposed by the data service resource of the source service and retrieving data from the exposed stream into the sink service consumes the largest part of the processing time for data transportation.

3.8. Conclusion

By comparing the processing times of different activity requests at the client side, the user is able to measure the performance of the data migration procedure and identify the bottleneck activity. According to the performance data, when delivering a large amount of data, most of the processing time is used by DTOutput, DeliverFromDT and SQLBulkLoad. Two of the most inefficient activities are executed by the data service resource at the sink service side. So if we can improve the performance at the sink side, the performance of the whole transportation procedure could improve.

3.9. Experimental setup

The benchmarks presented here were performed using the OGSA-DAI client toolkit, with server and client software running on the same machine. The hardware and software configuration is described below. The two OGSA-DAI data services both run on the same machine, and the associated databases both belong to the same MySQL database server.

OS            Windows XP
CPU           Intel (mobile) Pentium III 1.0GHz
Memory        512MB
Java          Sun J2SE 1.5.0_06
Database      MySQL 5.0
Row length    66 bytes
Row schema    int(11), varchar(64), varchar(128), varchar(20)

Both client and server were running in a JVM started with the following flags: -Xms128m -Xmx128m. All experiment results are obtained from 30 executions and reported as average values.

4. Creating DataBuffer Activity

The activity is an extension point of the OGSA-DAI middleware. Besides the activities that represent the intrinsic capabilities of the underlying data resource, the other activities are functionalities of the service layer. This chapter introduces a user-created activity, DataBuffer, which is used to improve the performance at the sink side during data migration. The implementation procedure and working principle are discussed, and the improved performance is also described.

4.1. Why creating DataBuffer activity

From the above experiment data, we know the processing time of the SQLBulkLoad activity and that of retrieving data from the output stream of the source service into the sink service. At the sink side, the SQLBulkLoad activity gets its data directly from the output of DeliverFromDT, so in the conventional data transportation procedure the data delivery and insertion are executed in sequence. However, when we compare the sum of the processing times of the individual parts with the processing time of the whole data transportation procedure, we find that a request containing both the data delivery and insertion activities takes longer than the sum of the processing times of requests that contain only the transportation activity or only the insertion activity. The following example demonstrates this. Suppose there are two activity requests:

Request#1:
Source side: SQLQuery + WebRowSet + DTOutput
Sink side: DeliverFromDT + SQLBulkLoad + DeliverToNull

Request#2:
Source side: SQLQuery + WebRowSet + DTOutput
Sink side: DeliverFromDT + DeliverToNull

The processing time of request#1, T1, is much larger than that of request#2, because in request#2 the data transportation does not need to stop and wait for the execution of SQLBulkLoad. The yellow curve in the following figure represents the sum of the processing time of request#2 and the individual execution time of the SQLBulkLoad activity. If we can make the delivery and insertion of data happen at the same time within a single request, then the performance of that request could be close to the yellow curve.

[Figure 9: Performance of requests and activity. Vertical axis: Time [s]; horizontal axis: # of rows; series: Request#1, Request#2, Request#2 + SQLBulkLoad]

When the source data service resource finishes the execution of the DTOutput activity, a new output stream is exposed at the source service; at this moment the real data transportation has not started yet. The delivery actually starts when the sink data service resource begins the execution of the SQLBulkLoad activity. This activity inserts multiple rows into a relational table, and each time it loads one block of data from the previous activity in the activity chain. Compared with a request that contains a SQLBulkLoad activity, a request that only delivers the data from source to sink without inserting it into a table consumes less processing time, because the SQLBulkLoad activity does not load the next data block until it finishes the current insertion.

If we add a buffer activity that lets transportation and loading work in parallel, the performance could be improved, because the SQLBulkLoad activity no longer needs to wait for the transportation of the next data block when it finishes the previous insertion. Once the DataBuffer activity is executed by the data service resource, it pulls data from the source service and stores it in the buffer in order of arrival; in the meantime it also supplies the next activity in the chain, e.g. SQLBulkLoad, with buffered data. The execution of the DataBuffer activity does not stop until there is no more incoming data and no more buffered data.

4.2. How it improves performance

Adding the DataBuffer activity between DeliverFromDT and SQLBulkLoad lowers the transportation time, because it keeps retrieving data from the input stream while moving the data that has already arrived to the next activity connected to it. The following activity requests are executed to compare the performance:

Request#1:
Source side: SQLQuery + WebRowSet + DTOutput
Sink side: DeliverFromDT + DeliverToNull

Request#2:
Source side: SQLQuery + WebRowSet + DTOutput
Sink side: DeliverFromDT + SQLBulkLoad + DeliverToNull

Request#3:
Source side: SQLQuery + WebRowSet + DTOutput
Sink side: DeliverFromDT + DataBuffer + SQLBulkLoad + DeliverToNull

[Figure 10: Improved performance by DataBuffer activity. Vertical axis: Time [s]; horizontal axis: # of rows; series: Request#1, Request#2, Request#3]

[Table: processing time in seconds of Request#1, Request#2 and Request#3 by number of rows; the values are not recoverable from the source]

When the number of delivered rows exceeds the number of blocks in a SOAP message, the source service generates another SOAP envelope and sends it to the sink service. When executing request#2, the data is not pulled from the source side until the sink data service resource finishes the execution of the SQLBulkLoad activity. The DataBuffer activity can load the rest of the data from the source side while the SQLBulkLoad activity is being executed, so the sink service does not need to stop and wait for the data.

4.3. Synchronization objects for DataBuffer

To make the transportation and loading work in parallel, two or more threads should be created when the implementation code of the DataBuffer activity is executed: one writes the data from the activity input into the buffer, and one moves the buffered data to the activity output. To achieve synchronization between the ReadBuffer and WriteBuffer threads, the implementation of the Buffer class should be carefully considered. Otherwise a deadlock can occur when multiple threads try to access the shared buffer instance at the same time. To make the parallel threads work correctly, the following points should be considered:

- The same buffer instance should be used to initialize both the read and write threads.
- The reading and writing operations on the buffer should be synchronized; when these methods are executed, none of them returns until the operation is finished.
- Adding data to and removing data from the buffer should notify all waiting threads. At any moment only one active thread operates on the buffer; the others wait for the active one to finish.
- The data stored in the buffer should be moved following the first-in, first-out principle.
- If the buffer is full, writing operations should block until there is available space, and reading operations can only proceed when the buffer has data in it.

The life cycle of the WriteBuffer thread should be as follows:

Stage 1: Check the activity input port. If there are no more incoming data blocks, call the setCompleted method of the buffer class to notify the other threads working on the buffer, and terminate. Otherwise get one data block and go to stage 2.
Stage 2: Check whether the buffer is full. If not, insert the block at the end of the queue. Otherwise keep waiting until the buffer is available for writing.
Stage 3: Notify the other threads that the buffer is available and go to stage 1.

The life cycle of the ReadBuffer thread is as follows:

Stage 1: Check whether there is a data block in the buffer or a working WriteBuffer thread. If there is neither, the thread terminates; otherwise it gets one block from the head of the queue.
Stage 2: Put the data block into the activity output port and go to stage 1.
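The requirements and thread life cycles listed above can be illustrated with a small bounded FIFO buffer built on Java's intrinsic locks. This is a sketch under stated assumptions, not the dissertation's Buffer class: the names BlockBuffer, put, take and transfer are hypothetical, the completion flag is kept inside the buffer, and strings stand in for data blocks.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Bounded FIFO buffer shared by one writer thread and one reader thread. */
class BlockBuffer {
    private final Deque<Object> queue = new ArrayDeque<>();
    private final int capacity;          // must be at least 1
    private boolean completed = false;   // set once the writer has no more input

    BlockBuffer(int capacity) {
        this.capacity = capacity;
    }

    /** Appends a block at the tail, waiting while the buffer is full. */
    synchronized void put(Object block) throws InterruptedException {
        while (queue.size() >= capacity) {
            wait();
        }
        queue.addLast(block);
        notifyAll();                     // wake a reader waiting for data
    }

    /** Removes the oldest block; returns null once the buffer is drained and completed. */
    synchronized Object take() throws InterruptedException {
        while (queue.isEmpty() && !completed) {
            wait();
        }
        if (queue.isEmpty()) {
            return null;                 // completed and nothing left
        }
        Object block = queue.removeFirst();
        notifyAll();                     // wake a writer waiting for space
        return block;
    }

    /** Called by the writer when the input is exhausted. */
    synchronized void setCompleted() {
        completed = true;
        notifyAll();
    }

    /** Demo: pushes n numbered blocks through a buffer and returns them in arrival order. */
    static List<Object> transfer(int n, int capacity) {
        BlockBuffer buffer = new BlockBuffer(capacity);
        Thread writer = new Thread(() -> {
            try {
                for (int i = 0; i < n; i++) {
                    buffer.put("block-" + i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                buffer.setCompleted();
            }
        });
        writer.start();
        List<Object> received = new ArrayList<>();
        try {
            Object block;
            while ((block = buffer.take()) != null) {
                received.add(block);
            }
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(transfer(5, 2));
    }
}
```

The wait calls sit inside while loops so that a woken thread re-checks its condition before touching the queue, which is what keeps the reader and writer from acting on a stale view of the buffer.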

The manipulation methods on the buffer object should be synchronized between threads; this avoids the write/read race caused by multiple parallel threads. The return value of the setCompleted method of the buffer object is used as a signal to notify the ReadBuffer thread that no more data is coming from the activity input and that the WriteBuffer thread has terminated.

4.4. Implementation detail

Implementing the DataBuffer activity class involves the following stages:

- Create the activity implementation class and make it extend the abstract class uk.org.ogsadai.activity.Activity.
- Implement the constructor, which takes the corresponding element in the perform document as its input parameter. The constructor initializes the activity instance with the information extracted from the element; for the DataBuffer activity this information contains the name of the activity that supplies the input data, the place where the output data goes, and the size of the data pool.
- Implement the initialise method. The initialisation stage of an activity's lifecycle is the first point at which the activity context and session can be accessed [4]. This method is invoked before the processing of data blocks; for the DataBuffer activity, the references to the input and output ports should be set up here, before data blocks are read from and written to the ports.
- Implement the processBlock method. This is the method where the bulk of an activity's processing is usually performed [4]. To make the reading and writing of data run in parallel, two Java threads are created during the execution of this method. One thread is in charge of reading data from the input port and writing it into the buffer; the other puts the buffered data into the output port and frees the corresponding buffer space. The processBlock method is invoked repeatedly until the setCompleted method has been called. To reduce the overhead of repeatedly calling the same method, the main thread executing processBlock should block until the parallel Java threads have finished all their tasks. By doing this, the method is invoked only once.
- Implement the sub classes ReadBuffer and WriteBuffer, both of which implement the Runnable interface. Instances of these classes are used to initialize the Java threads created during the execution of the processBlock method. The ReadBuffer thread gets data blocks from the buffer and outputs them; it does not return until there is no more data in the buffer and the WriteBuffer thread has finished. The task of the WriteBuffer thread is to get data from the activity input port and write it into the buffer; the thread keeps running until no more data comes from the input port, and before it exits, setCompleted is called to inform the main thread that the execution of processBlock is completed.
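The threading structure of the processBlock stage can be sketched as follows. This is an illustration under explicit assumptions rather than the actual activity: it does not extend any OGSA-DAI class, an Iterator stands in for the activity input port and a Consumer for the output port, and the standard java.util.concurrent.ArrayBlockingQueue with an end marker is swapped in for the hand-written Buffer class. The names DataBufferSketch, processBlock, copy and END are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Sketch of a processBlock-style method that runs a writer and a reader thread. */
class DataBufferSketch {
    private static final Object END = new Object();   // marks the end of the input

    /** Copies every block from input to output through a bounded buffer. */
    static void processBlock(Iterator<?> input, Consumer<Object> output, int poolSize)
            throws InterruptedException {
        BlockingQueue<Object> buffer = new ArrayBlockingQueue<>(poolSize);

        // WriteBuffer role: input port -> buffer.
        Thread writeBuffer = new Thread(() -> {
            try {
                while (input.hasNext()) {
                    buffer.put(input.next());              // blocks while the buffer is full
                }
                buffer.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // ReadBuffer role: buffer -> output port.
        Thread readBuffer = new Thread(() -> {
            try {
                Object block;
                while ((block = buffer.take()) != END) {   // blocks while the buffer is empty
                    output.accept(block);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        writeBuffer.start();
        readBuffer.start();
        // The calling thread waits here, so the method only needs to run once.
        writeBuffer.join();
        readBuffer.join();
    }

    /** Convenience wrapper used by the demo below. */
    static List<Object> copy(List<?> blocks, int poolSize) {
        List<Object> received = new ArrayList<>();
        try {
            processBlock(blocks.iterator(), received::add, poolSize);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(copy(Arrays.asList("b0", "b1", "b2", "b3"), 2));
    }
}
```

Joining both threads before returning corresponds to blocking the main thread until the parallel threads have finished, so the surrounding framework would see a single, complete invocation.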

5. Spatial linking with OGSA-DAI

This chapter presents the experiment of exposing a spatial database through an OGSA-DAI service and performing the point-in-polygon linking operation. An efficient way of doing point-in-polygon linking is introduced, and the whole procedure of data migration and linking is presented in several steps.

5.1. Spatial database working with OGSA-DAI

The PostgreSQL database has been tested with the current release of OGSA-DAI; in the experiment the database used to store geographic information is a PostgreSQL database with the PostGIS extension. PostGIS adds support for geographic objects to the PostgreSQL object-relational database. In effect, PostGIS "spatially enables" the PostgreSQL server, allowing it to be used as a backend spatial database for geographic information systems (GIS) [5]. Because PostgreSQL is a supported data resource of OGSA-DAI, a PostGIS-enabled PostgreSQL database can be exposed by an OGSA-DAI web service, and it is possible to retrieve geographic information through an OGSA-DAI data service interface.

In the application scenario, the user needs to join tables that have related geographic data columns. When joining normal relational tables in a normal relational database, the = operator can be used to test whether values are equal; the columns used to perform the join must represent the same attribute in the relations, and the data types must be compatible. In a spatial database, however, the joined columns can store data for different kinds of geometry objects. For geometries the = operator is a little more naive: it only tests whether the bounding boxes of two geometries are the same [5]. In this experiment we want to perform a spatial linking that needs to know the relationship between a point and a polygon: if the geographic coordinate of the point is located inside the area covered by the polygon, the two are related, and the corresponding data sets can be joined together.
Comparing the relationship between geometry objects is more complicated than comparing normal SQL data types, so it needs special functionality. The experiment explores how to join tables that contain geographic information with the help of an OGSA-DAI data service.

5.2. How the linking operation is performed

In the experiment, the database used to store the GIS data set is a PostgreSQL database with the PostGIS extension. The linking operation goes through the following tables:

Table espc: This dataset should be considered as the user's own private table.

Column Name      Data type
Postcode         varchar (10)
Address          varchar (300)
Property_type    char (2)

Table postcode: This is an intermediate table that maps a postcode to a coordinate.

Column Name      Data type
Pcd              varchar (7)
Pcd2             varchar (8)
Coord            Point

Table intermediategeography: This table has a polygon column and the corresponding geographic code for referencing the data in the neighborhood table.

Column Name      Data type
Intgeocode       varchar (12)
Intgename        varchar (50)
the_geom         geometry

Table neighborhood: This is the public dataset containing statistical data of interest to the user; like the intermediategeography table, it has an Intgeocode column.

Column Name      Data type
Intgeocode       varchar (12)
House_price      integer
Salary           integer

The link between the espc and postcode tables is achieved by a normal relational join operation. To get the Intgeocode corresponding to a user-specified postcode, a link must be established between the postcode and intermediategeography tables. In the implementation of the experiment, the spatial database supports a geometry relationship function, distance(geometry, geometry), which returns the distance between two geometries. This function can be used to check whether a point is within a polygon; e.g. distance(coordA, polygonB) < 1 means that point A is located inside polygon B. To link the postcode and intermediategeography tables, the following query returns the data of the joined table:

    Select postcode.*, intermediategeography.*
    From postcode, intermediategeography
    Where distance (coord, the_geom) < 1;

The linking operation for obtaining the final dataset is actually divided into several join operations between tables. The spatial join between point and polygon can only be performed by a special geographic relationship function, so the data needs to be moved from the normal relational database to the spatial database. Most of the time, the user's private dataset is concentrated in a local area, which means the postcodes in the user's table may belong to only a few polygons. In that situation, comparing each point with every polygon is inefficient, because most polygons do not contain the user-specified points.

5.3. How to improve efficiency of spatial join

Computing the distance is expensive. If the table containing the points or polygons is large, simply joining them by applying the distance function to each row can be very slow, because the distance is calculated between every polygon row and every point row, and most of these calculations involve unrelated points and polygons. For example, in the application scenario the public database could contain information on every region of Britain, and the polygon table contains the related geographic information. However, the user is only interested in the statistical data for his local area, e.g. the Edinburgh area. For such a requirement, querying only the geographic data of Scotland is sufficient. The following figure illustrates this example.

[Figure: The polygon table contains geographic information for each region of Britain, but only Scotland contains the coordinates of the user-specified postcodes.]
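The idea of rejecting unrelated polygons cheaply before doing the expensive comparison can be illustrated outside the database with a plain Java sketch. It is not the method used in the experiment, which relies on the distance function of the spatial database; here a bounding-box check is followed by a standard ray-casting point-in-polygon test. The class PointInPolygon, its methods and the example polygon are hypothetical.

```java
/** Point-in-polygon test with a cheap bounding-box check before the exact test. */
class PointInPolygon {

    /** True if (x, y) lies inside the axis-aligned bounding box of the polygon. */
    static boolean inBoundingBox(double[] xs, double[] ys, double x, double y) {
        double minX = xs[0], maxX = xs[0], minY = ys[0], maxY = ys[0];
        for (int i = 1; i < xs.length; i++) {
            minX = Math.min(minX, xs[i]);
            maxX = Math.max(maxX, xs[i]);
            minY = Math.min(minY, ys[i]);
            maxY = Math.max(maxY, ys[i]);
        }
        return x >= minX && x <= maxX && y >= minY && y <= maxY;
    }

    /** Exact test by ray casting: count edge crossings of a horizontal ray from (x, y). */
    static boolean contains(double[] xs, double[] ys, double x, double y) {
        if (!inBoundingBox(xs, ys, x, y)) {
            return false;                    // unrelated polygons are rejected here
        }
        boolean inside = false;
        for (int i = 0, j = xs.length - 1; i < xs.length; j = i++) {
            boolean straddles = (ys[i] > y) != (ys[j] > y);
            if (straddles
                    && x < (xs[j] - xs[i]) * (y - ys[i]) / (ys[j] - ys[i]) + xs[i]) {
                inside = !inside;
            }
        }
        return inside;
    }

    public static void main(String[] args) {
        // A made-up L-shaped polygon; vertices listed in order, not closed.
        double[] xs = {0, 4, 4, 2, 2, 0};
        double[] ys = {0, 0, 2, 2, 4, 4};
        System.out.println(contains(xs, ys, 1, 3));   // point in the vertical arm
        System.out.println(contains(xs, ys, 3, 3));   // in the bounding box, outside the polygon
        System.out.println(contains(xs, ys, 5, 1));   // outside the bounding box
    }
}
```

The bounding-box check costs a handful of comparisons per polygon, so when most polygons are far from the user's points, as in the Britain and Scotland example, nearly all candidates are discarded before the exact test runs.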

Edinburgh Research Explorer

Edinburgh Research Explorer Edinburgh Research Explorer Profiling OGSA-DAI Performance for Common Use Patterns Citation for published version: Dobrzelecki, B, Antonioletti, M, Schopf, JM, Hume, AC, Atkinson, M, Hong, NPC, Jackson,

More information

Simple Object Access Protocol (SOAP) Reference: 1. Web Services, Gustavo Alonso et. al., Springer

Simple Object Access Protocol (SOAP) Reference: 1. Web Services, Gustavo Alonso et. al., Springer Simple Object Access Protocol (SOAP) Reference: 1. Web Services, Gustavo Alonso et. al., Springer Minimal List Common Syntax is provided by XML To allow remote sites to interact with each other: 1. A common

More information

(9A05803) WEB SERVICES (ELECTIVE - III)

(9A05803) WEB SERVICES (ELECTIVE - III) 1 UNIT III (9A05803) WEB SERVICES (ELECTIVE - III) Web services Architecture: web services architecture and its characteristics, core building blocks of web services, standards and technologies available

More information

Sistemi ICT per il Business Networking

Sistemi ICT per il Business Networking Corso di Laurea Specialistica Ingegneria Gestionale Sistemi ICT per il Business Networking SOA and Web Services Docente: Vito Morreale (vito.morreale@eng.it) 1 1st & 2nd Generation Web Apps Motivation

More information

OGSA-DAI Client Toolkit

OGSA-DAI Client Toolkit Client Toolkit Technology Update GridWorld Community Activity GGF15, Boston, MA (USA) Amy Krause EPCC a.krause@epcc.ed.ac.uk Outline Client Toolkit 13 March 2005 http://www.ogsadai.org.uk/ 2 DataBrowser

More information

Lesson 3 SOAP message structure

Lesson 3 SOAP message structure Lesson 3 SOAP message structure Service Oriented Architectures Security Module 1 - Basic technologies Unit 2 SOAP Ernesto Damiani Università di Milano SOAP structure (1) SOAP message = SOAP envelope Envelope

More information

Protocols SPL/ SPL

Protocols SPL/ SPL Protocols 1 Application Level Protocol Design atomic units used by protocol: "messages" encoding reusable, protocol independent, TCP server, LinePrinting protocol implementation 2 Protocol Definition set

More information

Last Class: RPCs and RMI. Today: Communication Issues

Last Class: RPCs and RMI. Today: Communication Issues Last Class: RPCs and RMI Case Study: Sun RPC Lightweight RPCs Remote Method Invocation (RMI) Design issues Lecture 9, page 1 Today: Communication Issues Message-oriented communication Persistence and synchronicity

More information

Chapter 8 Web Services Objectives

Chapter 8 Web Services Objectives Chapter 8 Web Services Objectives Describe the Web services approach to the Service- Oriented Architecture concept Describe the WSDL specification and how it is used to define Web services Describe the

More information

Service Interface Design RSVZ / INASTI 12 July 2006

Service Interface Design RSVZ / INASTI 12 July 2006 Architectural Guidelines Service Interface Design RSVZ / INASTI 12 July 2006 Agenda > Mandatory standards > Web Service Styles and Usages > Service interface design > Service versioning > Securing Web

More information

Chapter 4 Communication

Chapter 4 Communication DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 4 Communication Layered Protocols (1) Figure 4-1. Layers, interfaces, and protocols in the OSI

More information

COP 4814 Florida International University Kip Irvine. Inside WCF. Updated: 11/21/2013

COP 4814 Florida International University Kip Irvine. Inside WCF. Updated: 11/21/2013 COP 4814 Florida International University Kip Irvine Inside WCF Updated: 11/21/2013 Inside Windows Communication Foundation, by Justin Smith, Microsoft Press, 2007 History and Motivations HTTP and XML

More information

SOFTWARE ARCHITECTURES ARCHITECTURAL STYLES SCALING UP PERFORMANCE

SOFTWARE ARCHITECTURES ARCHITECTURAL STYLES SCALING UP PERFORMANCE SOFTWARE ARCHITECTURES ARCHITECTURAL STYLES SCALING UP PERFORMANCE Tomas Cerny, Software Engineering, FEE, CTU in Prague, 2014 1 ARCHITECTURES SW Architectures usually complex Often we reduce the abstraction

More information

DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI

DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI Department of Computer Science and Engineering IT6801 - SERVICE ORIENTED ARCHITECTURE Anna University 2 & 16 Mark Questions & Answers Year / Semester: IV /

More information

COMMUNICATION PROTOCOLS

COMMUNICATION PROTOCOLS COMMUNICATION PROTOCOLS Index Chapter 1. Introduction Chapter 2. Software components message exchange JMS and Tibco Rendezvous Chapter 3. Communication over the Internet Simple Object Access Protocol (SOAP)

More information

Distributed Multitiered Application

Distributed Multitiered Application Distributed Multitiered Application Java EE platform uses a distributed multitiered application model for enterprise applications. Logic is divided into components https://docs.oracle.com/javaee/7/tutorial/overview004.htm

More information

Middleware. Adapted from Alonso, Casati, Kuno, Machiraju Web Services Springer 2004

Middleware. Adapted from Alonso, Casati, Kuno, Machiraju Web Services Springer 2004 Middleware Adapted from Alonso, Casati, Kuno, Machiraju Web Services Springer 2004 Outline Web Services Goals Where do they come from? Understanding middleware Middleware as infrastructure Communication

More information

Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions

Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions Chapter 1: Solving Integration Problems Using Patterns 2 Introduction The Need for Integration Integration Challenges

More information

Web Services: Introduction and overview. Outline

Web Services: Introduction and overview. Outline Web Services: Introduction and overview 1 Outline Introduction and overview Web Services model Components / protocols In the Web Services model Web Services protocol stack Examples 2 1 Introduction and

More information

DYNAMIC CONFIGURATION OF COLLABORATION IN NETWORKED ORGANISATIONS

DYNAMIC CONFIGURATION OF COLLABORATION IN NETWORKED ORGANISATIONS 22 DYNAMIC CONFIGURATION OF COLLABORATION IN NETWORKED ORGANISATIONS Brian Shields and Owen Molloy Department of Information Technology, National University of Ireland, Galway, IRELAND. brian.shields@geminga.it.nuigalway.ie,

More information

Communication. Overview

Communication. Overview Communication Chapter 2 1 Overview Layered protocols Remote procedure call Remote object invocation Message-oriented communication Stream-oriented communication 2 Layered protocols Low-level layers Transport

More information

MOM MESSAGE ORIENTED MIDDLEWARE OVERVIEW OF MESSAGE ORIENTED MIDDLEWARE TECHNOLOGIES AND CONCEPTS. MOM Message Oriented Middleware

MOM MESSAGE ORIENTED MIDDLEWARE OVERVIEW OF MESSAGE ORIENTED MIDDLEWARE TECHNOLOGIES AND CONCEPTS. MOM Message Oriented Middleware MOM MESSAGE ORIENTED MOM Message Oriented Middleware MIDDLEWARE OVERVIEW OF MESSAGE ORIENTED MIDDLEWARE TECHNOLOGIES AND CONCEPTS Peter R. Egli 1/25 Contents 1. Synchronous versus asynchronous interaction

More information

Research and Design Application Platform of Service Grid Based on WSRF

Research and Design Application Platform of Service Grid Based on WSRF DOI: 10.7763/IPEDR. 2012. V49. 27 Research and Design Application Platform of Service Grid Based on WSRF Jianmei Ge a, Shying Zhang a College of Computer Science and Technology, Beihua University, No.1

More information

1.264 Lecture 16. Legacy Middleware
