Scientific Data & Workflow Engineering. Outline

Size: px
Start display at page:

Download "Scientific Data & Workflow Engineering. Outline"

Transcription

1 Scientific Data & Workflow Engineering Preliminary Notes from the Cyberinfrastructure Trenches Bertram Ludäscher Associate Professor Dept. of Computer Science & Genome Center University of California, Davis UC DAVIS Department of Computer Science Fellow San Diego Supercomputer Center University of California, San Diego San Diego Supercomputer Center Outline Introduction: CI Sample Architectures Scientific Data Integration Scientific Workflow Management Links & Crystallization Points Summary 2 Scientific Data & WF Engineering, B.Ludäscher 1

2 Science Environment for Ecological Knowledge (SEEK) Overview Domain Science Driver Ecology (LTER), biodiversity, Analysis & Modeling System Design & execution of ecological models & analysis ( scientific workflows ) {application,upper}-ware Kepler system Semantic Mediation System Data Integration of hard-torelate sources and processes Semantic Types and Ontologies upper middleware Sparrow Toolkit EcoGrid Access to ecology data and tools {middle,under}-ware unified API to SRB/MCAT, MetaCat, DiGIR, datasets sample CS problem [DILS 04] 3 Scientific Data & WF Engineering, B.Ludäscher Common CI Infrastructure Pieces Other CI-projects (e.g. GEON, ) have similar service-oriented architectures: Seamless and uniform data access ( Data-Grid ) data & metadata registry distributed and high performance computing platform ( Compute-Grid ) service registry Federated, integrated, mediated databases often use of semantic extensions (e.g. ontologies) User-friendly workbench / problem-solving environment scientific workflows add to this sensors, observing systems 4 Scientific Data & WF Engineering, B.Ludäscher 2

3 Example: Realtime Environment for Analytical Processing (REAP vision) 5 Scientific Data & WF Engineering, B.Ludäscher The Great Unified System Many engineering and CS challenges! we ll see some Our focus: Scientific data integration How to associate, mediate, integrate complex scientific data? Scientific workflows How to devise larger scientific workflows for process automation from individual components (e.g. web services)? Disclaimer: often scratching the surface; see references & research literature for details 6 Scientific Data & WF Engineering, B.Ludäscher 3

4 Outline Introduction: CI Sample Architectures Scientific Data Integration Scientific Workflow Management Links & Crystallization Points Summary 7 Scientific Data & WF Engineering, B.Ludäscher An Online Shopper s s Information Integration Problem El Cheapo: Where can I get the cheapest copy (including shipping cost) of Wittgenstein s Tractatus Logicus-Philosophicus within a week? addall.com? Mediator (virtual DB) Information (vs. Datawarehouse) Integration NOTE: non-trivial data engineering challenges! One-World Mediation amazon.com barnes&noble.com half.com A1books.com 8 Scientific Data & WF Engineering, B.Ludäscher 4

5 A Home Buyer s s Information Integration Problem What houses for sale under $500k have at least 2 bathrooms, 2 bedrooms, a nearby school ranking in the upper third, in a neighborhood with below-average crime rate and diverse population?? Information Integration Multiple-Worlds Mediation Realtor Realtor Crime Crime Stats Stats School School Rankings Rankings Demographics Demographics 9 Scientific Data & WF Engineering, B.Ludäscher A Neuroscientist s s Information Integration Problem Biomedical Informatics Research Network What is the cerebellar distribution of rat proteins with more than 70% homology with human NCS-1? Any structure specificity? How about other rodents? Inter-source links: unclear for the non-scientists hard for the scientist? Information Integration Complex Multiple-Worlds Mediation protein localization (NCMIR) sequence info (CaPROT) morphometry (SYNAPSE) neurotransmission (SENSELAB) 10 Scientific Data & WF Engineering, B.Ludäscher 5

6 11 Scientific Data & WF Engineering, B.Ludäscher Interoperability & Integration Challenges reconciling S 5 heterogeneities gluing together resources bridging information and knowledge gaps computationally System aspects: Grid Middleware distributed data & computing, SOA web services, WSDL/SOAP, WSRF, OGSA, sources = functions, files, data sets Syntax & Structure: (XML-Based) Data Mediators wrapping, restructuring (XML) queries and views sources = (XML) databases Semantics: Model-Based/Semantic Mediators conceptual models and declarative views Knowledge Representation: ontologies, description logics (RDF(S),OWL...) sources = knowledge bases (DB+CMs+ICs) Synthesis: Scientific Workflow Design & Execution Composition of declarative and procedural components into larger workflows (re)sources = services, processes, actors, 12 Scientific Data & WF Engineering, B.Ludäscher 6

7 Information Integration Challenges: S 4 Heterogeneities System aspects platforms, devices, data & service distribution, APIs, protocols, Grid middleware technologies + e.g. single sign-on, platform independence, transparent use of remote resources, Syntax & Structure heterogeneous data formats (one for each tool...) heterogeneous data models (RDBs, ORDBs, OODBs, XMLDBs, flat files, ) heterogeneous schemas (one for each DB...) Database mediation technologies + XML-based data exchange, integrated views, transparent query rewriting, Semantics descriptive metadata, different terminologies, hidden semantics (context), implicit assumptions, Knowledge representation & semantic mediation technologies + smart data discovery & integration + e.g. ask about X ( mafic ); find data about Y ( diorite ); be happy anyways! 13 Scientific Data & WF Engineering, B.Ludäscher Information Integration Challenges: S 5 Heterogeneities Synthesis of applications, analysis tools, data & query components, into scientific workflows How to make use of these wonderful things & put them together to solve a scientist s problem? Scientific Problem Solving Environments (PSEs( PSEs) Portals,Workbench ( scientist s view ) + ontology-enhanced data registration, discovery, manipulation + creation and registration of new data products from existing ones, Scientific Workflow System ( engineer s view ) + for designing, re-engineering, deploying analysis pipelines and scientific workflows; a tool to make new tools + e.g., creation of new datasets from existing ones, dataset registration, 14 Scientific Data & WF Engineering, B.Ludäscher 7

8 Information Integration from a Database Perspective Information Integration Problem Given: data sources S 1,..., S k (databases, web sites,...) and user questions Q 1,..., Q n that can in principle be answered using the information in the S i Find: the answers to Q 1,..., Q n The Database Perspective: source = database S i has a schema (relational, XML, OO,...) S i can be queried define virtual (or materialized) integrated (or global) view G over local sources S 1,..., S k using database query languages (SQL, XQuery,...) questions become queries Q i against G(S 1,..., S k ) 15 Scientific Data & WF Engineering, B.Ludäscher Standard (XML-Based) Mediator Architecture USER/Client 1. Query Q ( G (S 1,..., S k ) ) 6. {answers(q)} Integrated View Definition G(..) S 1 (..)S k (..) Integrated Global (XML) View G MEDIATOR Query Post rewriting processing 3. Q1 Q2 Q3 4. {answers(q1)} {answers(q2)} {answers(q3)} (XML) View Wrapper (XML) View Wrapper (XML) View Wrapper web services as wrapper APIs S 1 S 2 S k 16 Scientific Data & WF Engineering, B.Ludäscher 8

9 Query Planning in Data Integration Given: Declarative user query Q: answer() G... & { G S } global-as-view (GAV) & { S G} local-as-view (LAV) & { ic() SG} integrity constraints (ICs) Find: equivalent (or minimal containing, maximal contained) query plan Q : answer() S query rewriting (logical/calculus, algebraic, physical levels) Results: A variety of results/algorithms; depending on classes of queries, views, and ICs: P, NP,, undecidable hot research area in core CS (database community) 17 Scientific Data & WF Engineering, B.Ludäscher Scientific Data Integration using Semantic Extensions 18 Scientific Data & WF Engineering, B.Ludäscher 9

10 19 Scientific Data & WF Engineering, B.Ludäscher Example: Geologic Map Integration Given: Geologic maps from different state geological surveys (shapefiles w/ different data schemas) Different ontologies: Geologic age ontology (e.g. USGS) Rock classification ontologies: Multiple hierarchies (chemical, fabric, texture, genesis) from Geological Survey of Canada (GSC) Single hierarchy from British Geological Survey (BGS) Problem: Support uniform queries across all map using different ontologies Support registration w/ ontology A, querying w/ ontology B 20 Scientific Data & WF Engineering, B.Ludäscher 10

11 Schema Integration ( registering registering local schemas to the global schema) Sources Arizona Colorado Utah Nevada ABBREV PERIOD NAME PERIOD TYPE PERIOD FMATN TIME_UNIT Formation Age Formation Age Formation Age Formation Age Formation Age Composition Fabric Texture Integration Schema FORMATION AGE LITHOLOGY Idaho Wyoming New Mexico Montana E. NAME PERIOD NAME PERIOD FORMATION PERIOD Formation Age Formation Age Formation Age Formation Age Composition Fabric Texture Livingston formation Tertiary- Cretaceous FORMATION AGEMontana West LITHOLOGY andesitic sandstone Sources 21 Scientific Data & WF Engineering, B.Ludäscher Multihierarchical Rock Classification Ontology (Taxonomies) for Thematic Queries (GSC) Genesis Fabric Composition Texture 22 Scientific Data & WF Engineering, B.Ludäscher 11

12 Ontology-Enabled Application Example: Geologic Map Integration domain knowledge Knowledge representation Geologic Age ONTOLOGY Nevada Show Show formations where where AGE AGE = Paleozic (without age age ontology) Show Show formations where where AGE AGE = Paleozic (with age age ontology) +/- a few hundred million years 23 Scientific Data & WF Engineering, B.Ludäscher Querying by Geologic Age 24 Scientific Data & WF Engineering, B.Ludäscher 12

13 Querying by Geologic Age: Results 25 Scientific Data & WF Engineering, B.Ludäscher Querying by Chemical Composition (GSC) 26 Scientific Data & WF Engineering, B.Ludäscher 13

14 Semantic Mediation (via (via semantic registration of schemas and ontology articulations) Schema elements and/or data values are associated with concept expressions from the target ontology conceptual queries through the ontology Articulation ontology source registration to A, querying through B Semantic mediation: query rewriting w/ ontologies Database1 semantic registration ontology articulations Ontology A Concept-based ( semantic ) queries Database2 semantic registration Ontology B 27 Scientific Data & WF Engineering, B.Ludäscher Different views on State Geological Maps 28 Scientific Data & WF Engineering, B.Ludäscher 14

15 Sedimentary Rocks: BGS Ontology 29 Scientific Data & WF Engineering, B.Ludäscher Sedimentary Rocks: GSC Ontology 30 Scientific Data & WF Engineering, B.Ludäscher 15

16 Implementation in OWL: Not only for the machine 31 Scientific Data & WF Engineering, B.Ludäscher Source Contextualization through Ontology Refinement In addition to registering ( hanging off ) data relative to existing concepts, a source may also refine the mediator s domain map... sources can register new concepts at the mediator Scientific Data & WF Engineering, B.Ludäscher 16

17 Outline Introduction: CI Sample Architectures Scientific Data Integration Scientific Workflow Management Links & Crystallization Points Summary 33 Scientific Data & WF Engineering, B.Ludäscher What is a Scientific Workflow (SWF)? Goals: automate a scientist s repetitive data management and analysis tasks typical phases: data access, scheduling, generation, transformation, aggregation, analysis, visualization design, test, share, deploy, execute, reuse, SWFs 34 Scientific Data & WF Engineering, B.Ludäscher 17

18 Promoter Identification Workflow Source: Matt Coleman (LLNL) 35 Scientific Data & WF Engineering, B.Ludäscher Source: NIH BIRN (Jeffrey Grethe, UCSD) 36 Scientific Data & WF Engineering, B.Ludäscher 18

19 Ecology: GARP Analysis Pipeline for Invasive Species Prediction Registered Ecogrid Database EcoGrid Query Species presence & absence points (native range) (a) + A1 + + A2 A3 Sample Data Training sample (d) Test sample (d) GARP rule set (e) Data Calculation Map Generation Native range prediction map (f) Validation Registered Ecogrid Database Integrated layers (native range) (c) Model quality parameter (g) Registered Ecogrid Database Registered Ecogrid Database EcoGrid Query Environmental layers (native range) (b) Environmental layers (invasion area) (b) Species presence &absence points (invasion area) (a) Layer Integration Layer Integration Integrated layers (invasion area) (c) Map Generation Validation Invasion area prediction map (f) Model quality parameter (g) Archive To Ecogrid User Selected prediction maps (h) Generate Metadata Source: NSF SEEK (Deana Pennington et. al, UNM) 37 Scientific Data & WF Engineering, B.Ludäscher 38 Scientific Data & WF Engineering, B.Ludäscher 19

20 Commercial & Open Source Scientific Workflow (well Dataflow) ) Systems Kensington Discovery Edition from InforSense Triana Taverna 39 Scientific Data & WF Engineering, B.Ludäscher SCIRun: : Problem Solving Environments for Large-Scale Scientific Computing SCIRun: : PSE for interactive construction, debugging, and steering of large-scale scientific computations Component model, based on generalized dataflow programming Steve Parker (cs.utah.edu) 40 Scientific Data & WF Engineering, B.Ludäscher 20

21 Ptolemy II read! see! try! Source: Edward Lee et al Scientific Data & WF Engineering, B.Ludäscher Why Ptolemy II (and thus KEPLER)? Ptolemy II Objective: The focus is on assembly of concurrent components. The key underlying principle in the project is the use of well-defined models of computation that govern the interaction between components. A major problem area being addressed is the use of heterogeneous mixtures of models of computation. Dataflow Process Networks w/ natural support for abstraction, pipelining (streaming) actor-orientation, orientation, actor reuse User-Orientation Workflow design & exec console (Vergil GUI) Application/Glue-Ware excellent modeling and design support run-time support, monitoring, not a middle-/underware (we use someone else s, e.g. Globus, SRB, ) but middle-/underware is conveniently accessible through actors! PRAGMATICS Ptolemy II is mature, continuously extended & improved, well-documented (500+pp) open source system Ptolemy II folks actively participate in KEPLER 42 Scientific Data & WF Engineering, B.Ludäscher 21

22 KEPLER/CSP CSP: Contributors, Sponsors, Projects (or loosely coupled Communicating Sequential Persons ;-); Ilkay Altintas SDM, Resurgence Kim Baldridge Resurgence, NMI Chad Berkley SEEK Shawn Bowers SEEK Terence Critchlow SDM Tobin Fricke ROADNet Jeffrey Grethe BIRN Christopher H. Brooks Ptolemy II Zhengang Cheng SDM Dan Higgins SEEK Efrat Jaeger GEON Matt Jones SEEK Werner Krebs, EOL Edward A. Lee Ptolemy II Kai Lin GEON Bertram Ludaescher SEEK, GEON, SDM, ROADNet, BIRN Mark Miller EOL Steve Mock NMI Steve Neuendorffer Ptolemy II Jing Tao SEEK Mladen Vouk SDM Xiaowen Xin SDM Yang Zhao Ptolemy II Bing Zhu SEEK project.org Ptolemy II 43 Scientific Data & WF Engineering, B.Ludäscher KEPLER: An Open Collaboration Initiated by members from NSF SEEK and DOE SDM/SPA; now several other projects (GEON, Ptolemy II, EOL, Resurgence/NMI, ) Open Source (BSD-style license) Intensive Communications: Web-archived mailing lists IRC (!) Co-development: via shared CVS repository joining as a new co-developer (currently): get a CVS account (read-only) local development + contribution via existing KEPLER member be voted in as a member/co-developer Software & social engineering Software & social engineering How to better accommodate new groups/communities? How to better accommodate different usage/contribution models (core dev special purpose extender user)? 44 Scientific Data & WF Engineering, B.Ludäscher 22

23 Ptolemy II/KEPLER GUI (Vergil( Vergil) Directors define the component interaction & execution semantics Large, polymorphic component ( Actors ) and Directors libraries (drag & drop) 45 Scientific Data & WF Engineering, B.Ludäscher KEPLER/Ptolemy II GUI refined Ontology based actor (service) and dataset search Result Display 46 Scientific Data & WF Engineering, B.Ludäscher 23

24 Web Services Actors (WS Harvester) Minute-made (MM) WS-based application integration Similarly: MM workflow design & sharing w/o implemented components 47 Scientific Data & WF Engineering, B.Ludäscher Some Recent Actor Additions 48 Scientific Data & WF Engineering, B.Ludäscher 24

25 An early example: Promoter Identification SSDBM, AD 2003 Scientist models application as a workflow of connected components ( actors ) If all components exist, the workflow can be automated/ executed Different directors can be used to pick appropriate execution model (often pipelined execution: PN director) 49 Scientific Data & WF Engineering, B.Ludäscher Reengineering a Geoscientist s Mineral Classification Workflow 50 Scientific Data & WF Engineering, B.Ludäscher 25

26 Job Management (here: NIMROD) J ob management infrastructure in place Results database: under development Goal: 1000 s of GAMESS jobs (quantum mechanics) Fall/Winter Scientific Data & WF Engineering, B.Ludäscher ORB 52 Scientific Data & WF Engineering, B.Ludäscher 26

27 Rapid Web Service-based Prototyping (Here: ROADNet Command & Control Services for LOOKING Kick-Off Mtg) Source: Ilkay Altintas, SDM, NLADR ROADNet: Vernon, Orcutt et al Web services: Tony Fountain et al 53 Scientific Data & WF Engineering, B.Ludäscher in KEPLER (w/ editable script) Source: Dan Higgins, Kepler/SEEK 54 Scientific Data & WF Engineering, B.Ludäscher 27

28 in KEPLER (interactive session) Source: Dan Higgins, Kepler/SEEK 55 Scientific Data & WF Engineering, B.Ludäscher Blurring Design (ToDo( ToDo) and Execution 56 Scientific Data & WF Engineering, B.Ludäscher 28

29 Scientific Workflow Challenges Typical Features data-intensive and/or compute-intensive plumbing-intensive (consecutive web services won t fit) dataflow-oriented distributed (remote data, remote processing) user-interaction in the middle, vs. (C-z; bg; fg)-ing ( detach and reconnect) advanced programming constructs (map(f), zip, takewhile, ) logging, provenance, registering back (intermediate) products 57 Scientific Data & WF Engineering, B.Ludäscher designed to fit designed to fit hand-crafted control solution; also: forces sequential execution! [Altintas-et-al-PIW-SSDBM 03] hand-crafted Web-service actor No data transformations available Complex backward control-flow 58 Scientific Data & WF Engineering, B.Ludäscher 29

30 A Scientific Workflow Problem: Solved (Computer Scientist s s view) map(f)-style iterators Powerful type checking Generic, declarative programming constructs Generic data transformation actors Solution based on declarative, functional dataflow process network (= also a data streaming model!) Higher-order constructs: map(f) no control-flow spaghetti data-intensive apps free concurrent execution free type checking automatic support to go from piw(geneid) to PIW :=map(piw) over [GeneId] Forward-only, abstractable subworkflow piw(geneid) 59 Scientific Data & WF Engineering, B.Ludäscher Promoter Identification Workflow Redesigned map(genbankws) Input: { NM_001924, NM } Output: { CAGTAATATGAC", GGGGACAAAGA } 60 Scientific Data & WF Engineering, B.Ludäscher 30

31 map(f o g) instead of map(f) o map(g) A Research Problem: Optimization by Rewriting Example: PIW as a declarative, referentially transparent functional process optimization via functional rewriting possible e.g. map(f o g) = map(f) o map(g) Technical report & PIW specification in Haskell Combination of map and zip 61 Scientific Data & WF Engineering, B.Ludäscher A KR+DI+Scientific Workflow Problem Services can be semantically compatible,, but structurally incompatible Semantic Type P s Source Service Structural Type P s P s Ontologies (OWL) Compatible Incompatible δ ( ) δ(p s ) ( ) ( ) Desired Connection Structural Type P t P t Target Service Semantic Type P t Source: [Bowers-Ludaescher, DILS 04] 62 Scientific Data & WF Engineering, B.Ludäscher 31

32 Ontology-Informed Data Transformation ( Structure-Shim ) Ontologies (OWL) Semantic Compatible ( ) Semantic Type P s Registration Mapping (Output) Registration Mapping (Input) Type P t Structural Type P s Correspondence Structural Type P t Generate δ(p s ) Source Service P s Transformation Desired Connection P t Target Service Source: [Bowers-Ludaescher, DILS 04] 63 Scientific Data & WF Engineering, B.Ludäscher Outline Introduction: CI Sample Architectures Scientific Data Integration Scientific Workflow Management Links & Crystallization Points Summary 64 Scientific Data & WF Engineering, B.Ludäscher 32

33 Link-Up & Crystallization Points Shared (Domain) Science Vision, Goals NVO, SCEC, Human Genome Project, Technology Waves XML, web services, WSRF, Semantic Web (OWL), Portlets, Standards for data exchange, metadata, data access protocols, GML, EML, netcdf, HDF,, ADN,, DODS/OpenDAP, Organizations: W3C, GGF,, Community ontologies GO (Gene Ontology), ecoinformatics, seismology, geochemistry, from Saulus to Paulus Shared Community Tools and Tool Co-Development SRB, Globus,, Kepler, 65 Scientific Data & WF Engineering, B.Ludäscher Shared Science Vision, Goals: SCEC/CME Southern California Earthquake Center / Community Modeling Environment Project Simulation of Seismic Wave Propagation of a Magnitude 7.7 Earthquake on San Andreas Fault PIs: Thomas Jordan, Bernard Minster, Reagan Moore, Carl Kesselman Simulation 240 Processors for 5 days 47 Terabytes of data generated SDSC SAC project optimized code on DataStar parallel computer (both MPI I/O management and checkpointing) Future simulation Increase resolution a factor of 2, implies 1 PB of simulation results, 1000 processors for 20 days Source: Reagan Moore, SDSC 66 Scientific Data & WF Engineering, B.Ludäscher 33

34 Example: NVO Community Processes - created standard data encoding format (FITS image format) - made accessible common digital holdings (sky survey images) - defined Uniform Content Descriptors (common metadata attributes) - created standard services (standard access mechanisms to catalogs and surveys) - created digital library (manage derived data products) - created portals (for combining services interactively) - created processing pipelines (for automated processing) - created preservation environment And found a new brown dwarf! Source: Reagan Moore, SDSC 67 Scientific Data & WF Engineering, B.Ludäscher Semantic Mediation Waterfall Ontologies Iterative Development Semantic Data, Service Annotation Resource Discovery Resource Integration Workflow Analysis Workflow Planning Source: Shawn Bowers, SEEK AHM Scientific Data & WF Engineering, B.Ludäscher 34

35 GEON Dataset Generation & Registration (a co-development in KEPLER) % Makefile Makefile $> $> ant ant run run Matt,Chad, Dan et al. (SEEK) SQL database access (J DBC) Efrat (GEON) Ilkay (SDM) Yang (Ptolemy) Xiaowen (SDM) Edward et al.(ptolemy) 69 Scientific Data & WF Engineering, B.Ludäscher KEPLER as a Melting Pot A grass-roots roots project Needed a coalition of the (really!) willing Intra-project links e.g. in SEEK: AMS SMS EcoGrid Inter-project links SEEK ITR, GEON ITR, ROADNet ITRs, DOE SciDAC SDM, Ptolemy II, NIH BIRN (coming we hope ), UK escience mygrid, Inter-technology technology links Globus, SRB, JDBC, web services, soaplab services, command line tools, R, GRASS, XSLT, Interdisciplinary links CS, IT, domain sciences, (recently: usability engineer) 70 Scientific Data & WF Engineering, B.Ludäscher 35

36 Outline Introduction: CI Sample Architectures Scientific Data Integration Scientific Workflow Management Links & Crystallization Points Summary 71 Scientific Data & WF Engineering, B.Ludäscher Summary/Lessons Learned Eat your own dog-food (or at least try) start using your own (CI) tools early! Collaboration tools CVS repositories (+cvsview, webcvs) Mailing lists (e.g. mailman googlified) Bugzilla (detailed tracking of tech. issues & bugs) WIKI (community authored web resource, e.g. high-level tech. issues) Where is the XYZ repository/registry? EcoGrid (SEEK) registry, GEON registry, KEPLER actor & datasets repository, UDDI what? CI Melting Pots : SDSC, NCEAS, LTER, NLADR (w/ NCSA), KU Specify, Genome Center@UC Davis (moving in ) 72 Scientific Data & WF Engineering, B.Ludäscher 36

37 Q & A 73 Scientific Data & WF Engineering, B.Ludäscher Further Reading under review available upon request from ludaesch@sdsc.edu 74 Scientific Data & WF Engineering, B.Ludäscher 37

38 Related Publications Semantic Data Registration and Integration On Integrating Scientific Resources through Semantic Registration,, S. Bowers, K. Lin, and B. Ludäscher scher, 16th International Conference on Scientific and Statistical Database Management (SSDBM'04), June 2004, Santorini Island, Greece. A System for Semantic Integration of Geologic Maps via Ontologies,, K. Lin and B. Ludäscher scher.. In Semantic Web Technologies for Searching and Retrieving Scientific c Data (SCISW), Sanibel Island, Florida, Towards a Generic Framework for Semantic Registration of Scientific Data,, S. Bowers and B. Ludäscher scher.. In Semantic Web Technologies for Searching and Retrieving Scientific c Data (SCISW), Sanibel Island, Florida, The Role of XML in Mediated Data Integration Systems with Examples from Geological (Map) Data Interoperability,, B. Brodaric,, B. Ludäscher scher,, and K. Lin. In Geological Society of America (GSA) Annual Meeting,, volume 35(6), November Semantic Mediation Services in Geologic Data Integration: A Case Study from the GEON Grid,, K. Lin, B. Ludäscher scher,, B. Brodaric,, D. Seber,, C. Baru,, and K. A. Sinha.. In Geological Society of America (GSA) Annual Meeting,, volume 35(6), November Query Planning and Rewriting Processing First-Order Queries under Limited Access Patterns,, Alan Nash and B. Ludäscher scher, Proc. 23rd ACM Symposium on Principles of Database Systems (PODS'04)) Paris, France, June Processing Unions of Conjunctive Queries with Negation under Limited ited Access Patterns,, Alan Nash and B. Ludäscher scher., 9th Intl. Conference on Extending Database Technology (EDBT'04) Heraklion,, Crete, Greece, March 2004, LNCS Web Service Composition Through Declarative Queries: The Case of Conjunctive Queries with Union and Negation,, B. Ludäscher and Alan Nash. Research abstract (poster), 20th Intl. Conference on Data Engineering (ICDE'04)) Boston, IEEE Computer Society, April Scientific Data & WF Engineering, B.Ludäscher Scientific Workflows Related Publications Kepler: : An Extensible System for Design and Execution of Scientific Workflows rkflows,, I. Altintas,, C. Berkley, E. Jaeger, M. Jones, B. Ludäscher scher,, S. Mock, 16th International Conference on Scientific and Statistical Database Management (SSDBM'04), June 2004, Santorini Island, Greece. Kepler: : Towards a Grid-Enabled System for Scientific Workflows, Ilkay Altintas,, Chad Berkley, Efrat Jaeger, Matthew Jones, Bertram Ludäscher scher,, Steve Mock, Workflow in Grid Systems (GGF10),, Berlin, March 9th, An Ontology-Driven Framework for Data Transformation in Scientific Workflows,, S. Bowers and B. Ludäscher scher, Intl. Workshop on Data Integration in the Life Sciences (DILS'04), March 25-26, 26, 2004 Leipzig, Germany, LNCS A Web Service Composition and Deployment Framework for Scientific c Workflows, I. Altintas, E. Jaeger, K. Lin, B. Ludaescher,, A. Memon,, In the 2nd Intl. Conference on Web Services (ICWS), San Diego, California, July Scientific Data & WF Engineering, B.Ludäscher 38

ECS289F Winter 05 Scientific Data Management

ECS289F Winter 05 Scientific Data Management ECS289F Winter 05 Scientific Data Management Data Integration Bertram Ludäscher Associate Professor Dept. of Computer Science & Genome Center University of California, Davis ludaesch@ucdavis.edu Process

More information

KEPLER: Overview and Project Status

KEPLER: Overview and Project Status KEPLER: Overview and Project Status Bertram Ludäscher ludaesch@ucdavis.edu Associate Professor Dept. of Computer Science & Genome Center University of California, Davis UC DAVIS Department of Computer

More information

Final Project Assignments. Promoter Identification Workflow (PIW)

Final Project Assignments. Promoter Identification Workflow (PIW) Final Project Assignments Schema Matching Ji-Yeong Chong Biological Pathways & Ontologies Russell D Sa FCA Theory and Practice Bill Man, Betty Chan Practice of Data Integration (GAV) Jenny Wang Kepler/Data

More information

Scientific Data Integration: From the Big Picture to some Gory Details

Scientific Data Integration: From the Big Picture to some Gory Details Scientific Data Integration: From the Big Picture to some Gory Details Bertram Ludäscher ludaesch@ucdavis.edu Associate Professor Dept. of Computer Science & Genome Center University of California, Davis

More information

Scientific Workflow Tools. Daniel Crawl and Ilkay Altintas San Diego Supercomputer Center UC San Diego

Scientific Workflow Tools. Daniel Crawl and Ilkay Altintas San Diego Supercomputer Center UC San Diego Scientific Workflow Tools Daniel Crawl and Ilkay Altintas San Diego Supercomputer Center UC San Diego 1 escience Today Increasing number of Cyberinfrastructure (CI) technologies Data Repositories: Network

More information

Accelerating the Scientific Exploration Process with Kepler Scientific Workflow System

Accelerating the Scientific Exploration Process with Kepler Scientific Workflow System Accelerating the Scientific Exploration Process with Kepler Scientific Workflow System Jianwu Wang, Ilkay Altintas Scientific Workflow Automation Technologies Lab SDSC, UCSD project.org UCGrid Summit,

More information

Using Web Services and Scientific Workflow for Species Distribution Prediction Modeling 1

Using Web Services and Scientific Workflow for Species Distribution Prediction Modeling 1 WAIM05 Using Web Services and Scientific Workflow for Species Distribution Prediction Modeling 1 Jianting Zhang, Deana D. Pennington, and William K. Michener LTER Network Office, the University of New

More information

A(nother) Vision of ppod Data Integration!?

A(nother) Vision of ppod Data Integration!? Scientific Workflows: A(nother) Vision of ppod Data Integration!? Bertram Ludäscher Shawn Bowers Timothy McPhillips Dave Thau Dept. of Computer Science & UC Davis Genome Center University of California,

More information

A High-Level Distributed Execution Framework for Scientific Workflows

A High-Level Distributed Execution Framework for Scientific Workflows A High-Level Distributed Execution Framework for Scientific Workflows Jianwu Wang 1, Ilkay Altintas 1, Chad Berkley 2, Lucas Gilbert 1, Matthew B. Jones 2 1 San Diego Supercomputer Center, UCSD, U.S.A.

More information

Scientific Workflows

Scientific Workflows Scientific Workflows Overview More background on workflows Kepler Details Example Scientific Workflows Other Workflow Systems 2 Recap from last time Background: What is a scientific workflow? Goals: automate

More information

DataONE: Open Persistent Access to Earth Observational Data

DataONE: Open Persistent Access to Earth Observational Data Open Persistent Access to al Robert J. Sandusky, UIC University of Illinois at Chicago The Net Partners Update: ONE and the Conservancy December 14, 2009 Outline NSF s Net Program ONE Introduction Motivating

More information

Knowledge-based Grids

Knowledge-based Grids Knowledge-based Grids Reagan Moore San Diego Supercomputer Center (http://www.npaci.edu/dice/) Data Intensive Computing Environment Chaitan Baru Walter Crescenzi Amarnath Gupta Bertram Ludaescher Richard

More information

A Three Tier Architecture for LiDAR Interpolation and Analysis

A Three Tier Architecture for LiDAR Interpolation and Analysis A Three Tier Architecture for LiDAR Interpolation and Analysis Efrat Jaeger-Frank 1, Christopher J. Crosby 2,AshrafMemon 1, Viswanath Nandigam 1, J. Ramon Arrowsmith 2, Jeffery Conner 2, Ilkay Altintas

More information

Automatic Transformation from Geospatial Conceptual Workflow to Executable Workflow Using GRASS GIS Command Line Modules in Kepler *

Automatic Transformation from Geospatial Conceptual Workflow to Executable Workflow Using GRASS GIS Command Line Modules in Kepler * Automatic Transformation from Geospatial Conceptual Workflow to Executable Workflow Using GRASS GIS Command Line Modules in Kepler * Jianting Zhang, Deana D. Pennington, and William K. Michener LTER Network

More information

A geoinformatics-based approach to the distribution and processing of integrated LiDAR and imagery data to enhance 3D earth systems research

A geoinformatics-based approach to the distribution and processing of integrated LiDAR and imagery data to enhance 3D earth systems research A geoinformatics-based approach to the distribution and processing of integrated LiDAR and imagery data to enhance 3D earth systems research Christopher J. Crosby, J Ramón Arrowsmith, Jeffrey Connor, Gilead

More information

Workflow Exchange and Archival: The KSW File and the Kepler Object Manager. Shawn Bowers (For Chad Berkley & Matt Jones)

Workflow Exchange and Archival: The KSW File and the Kepler Object Manager. Shawn Bowers (For Chad Berkley & Matt Jones) Workflow Exchange and Archival: The KSW File and the Shawn Bowers (For Chad Berkley & Matt Jones) University of California, Davis May, 2005 Outline 1. The 2. Archival and Exchange via KSW Files 3. Object

More information

7 th International Digital Curation Conference December 2011

7 th International Digital Curation Conference December 2011 Golden Trail 1 Golden-Trail: Retrieving the Data History that Matters from a Comprehensive Provenance Repository Practice Paper Paolo Missier, Newcastle University, UK Bertram Ludäscher, Saumen Dey, Michael

More information

A High-Level Distributed Execution Framework for Scientific Workflows

A High-Level Distributed Execution Framework for Scientific Workflows Fourth IEEE International Conference on escience A High-Level Distributed Execution Framework for Scientific Workflows Jianwu Wang 1, Ilkay Altintas 1, Chad Berkley 2, Lucas Gilbert 1, Matthew B. Jones

More information

Hybrid-Type Extensions for Actor-Oriented Modeling (a.k.a. Semantic Data-types for Kepler) Shawn Bowers & Bertram Ludäscher

Hybrid-Type Extensions for Actor-Oriented Modeling (a.k.a. Semantic Data-types for Kepler) Shawn Bowers & Bertram Ludäscher Hybrid-Type Extensions for Actor-Oriented Modeling (a.k.a. Semantic Data-types for Kepler) Shawn Bowers & Bertram Ludäscher University of alifornia, Davis Genome enter & S Dept. May, 2005 Outline 1. Hybrid

More information

Enhanced Access to High-Resolution LiDAR Topography through Cyberinfrastructure-Based Data Distribution and Processing

Enhanced Access to High-Resolution LiDAR Topography through Cyberinfrastructure-Based Data Distribution and Processing Enhanced Access to High-Resolution LiDAR Topography through Cyberinfrastructure-Based Data Distribution and Processing Christopher J. Crosby, J Ramón Arrowsmith Jeffrey Connor, Gilead Wurman Efrat Jaeger-Frank,

More information

Kepler and Grid Systems -- Early Efforts --

Kepler and Grid Systems -- Early Efforts -- Distributed Computing in Kepler Lead, Scientific Workflow Automation Technologies Laboratory San Diego Supercomputer Center, (Joint work with Matthew Jones) 6th Biennial Ptolemy Miniconference Berkeley,

More information

DSpace Fedora. Eprints Greenstone. Handle System

DSpace Fedora. Eprints Greenstone. Handle System Enabling Inter-repository repository Access Management between irods and Fedora Bing Zhu, Uni. of California: San Diego Richard Marciano Reagan Moore University of North Carolina at Chapel Hill May 18,

More information

Mitigating Risk of Data Loss in Preservation Environments

Mitigating Risk of Data Loss in Preservation Environments Storage Resource Broker Mitigating Risk of Data Loss in Preservation Environments Reagan W. Moore San Diego Supercomputer Center Joseph JaJa University of Maryland Robert Chadduck National Archives and

More information

Introduction to Grid Computing

Introduction to Grid Computing Milestone 2 Include the names of the papers You only have a page be selective about what you include Be specific; summarize the authors contributions, not just what the paper is about. You might be able

More information

Kepler: An Extensible System for Design and Execution of Scientific Workflows

Kepler: An Extensible System for Design and Execution of Scientific Workflows DRAFT Kepler: An Extensible System for Design and Execution of Scientific Workflows User Guide * This document describes the Kepler workflow interface for design and execution of scientific workflows.

More information

Cheshire 3 Framework White Paper: Implementing Support for Digital Repositories in a Data Grid Environment

Cheshire 3 Framework White Paper: Implementing Support for Digital Repositories in a Data Grid Environment Cheshire 3 Framework White Paper: Implementing Support for Digital Repositories in a Data Grid Environment Paul Watry Univ. of Liverpool, NaCTeM pwatry@liverpool.ac.uk Ray Larson Univ. of California, Berkeley

More information

Kepler/pPOD: Scientific Workflow and Provenance Support for Assembling the Tree of Life

Kepler/pPOD: Scientific Workflow and Provenance Support for Assembling the Tree of Life Kepler/pPOD: Scientific Workflow and Provenance Support for Assembling the Tree of Life Shawn Bowers 1, Timothy McPhillips 1, Sean Riddle 1, Manish Anand 2, Bertram Ludäscher 1,2 1 UC Davis Genome Center,

More information

San Diego Supercomputer Center, UCSD, U.S.A. The Consortium for Conservation Medicine, Wildlife Trust, U.S.A.

San Diego Supercomputer Center, UCSD, U.S.A. The Consortium for Conservation Medicine, Wildlife Trust, U.S.A. Accelerating Parameter Sweep Workflows by Utilizing i Ad-hoc Network Computing Resources: an Ecological Example Jianwu Wang 1, Ilkay Altintas 1, Parviez R. Hosseini 2, Derik Barseghian 2, Daniel Crawl

More information

Overview. Scientific workflows and Grids. Kepler revisited Data Grids. Taxonomy Example systems. Chimera GridDB

Overview. Scientific workflows and Grids. Kepler revisited Data Grids. Taxonomy Example systems. Chimera GridDB Grids and Workflows Overview Scientific workflows and Grids Taxonomy Example systems Kepler revisited Data Grids Chimera GridDB 2 Workflows and Grids Given a set of workflow tasks and a set of resources,

More information

Where we are so far. Intro to Data Integration (Datalog, mediators, ) more to come (your projects!): schema matching, simple query rewriting

Where we are so far. Intro to Data Integration (Datalog, mediators, ) more to come (your projects!): schema matching, simple query rewriting Where we are so far Intro to Data Integration (Datalog, mediators, ) more to come (your projects!): schema matching, simple query rewriting Intro to Knowledge Representation & Ontologies description logic,

More information

The International Journal of Digital Curation Issue 1, Volume

The International Journal of Digital Curation Issue 1, Volume Towards a Theory of Digital Preservation 63 Towards a Theory of Digital Preservation Reagan Moore, San Diego Supercomputer Center June 2008 Abstract A preservation environment manages communication from

More information

Data Integration and Data Warehousing Database Integration Overview

Data Integration and Data Warehousing Database Integration Overview Data Integration and Data Warehousing Database Integration Overview Sergey Stupnikov Institute of Informatics Problems, RAS ssa@ipi.ac.ru Outline Information Integration Problem Heterogeneous Information

More information

Situations and Ontologies: helping geoscientists understand and share the semantics surrounding their computational resources

Situations and Ontologies: helping geoscientists understand and share the semantics surrounding their computational resources Situations and Ontologies: helping geoscientists understand and share the semantics surrounding their computational resources Mark Gahegan GeoVISTA Center, Department of Geography, The Pennsylvania State

More information

IRODS: the Integrated Rule- Oriented Data-Management System

IRODS: the Integrated Rule- Oriented Data-Management System IRODS: the Integrated Rule- Oriented Data-Management System Wayne Schroeder, Paul Tooby Data Intensive Cyber Environments Team (DICE) DICE Center, University of North Carolina at Chapel Hill; Institute

More information

THE GLOBUS PROJECT. White Paper. GridFTP. Universal Data Transfer for the Grid

THE GLOBUS PROJECT. White Paper. GridFTP. Universal Data Transfer for the Grid THE GLOBUS PROJECT White Paper GridFTP Universal Data Transfer for the Grid WHITE PAPER GridFTP Universal Data Transfer for the Grid September 5, 2000 Copyright 2000, The University of Chicago and The

More information

Interoperability and eservices

Interoperability and eservices Interoperability and eservices Aphrodite Tsalgatidou and Eleni Koutrouli Department of Informatics & Telecommunications, National & Kapodistrian University of Athens, Greece {atsalga, ekou}@di.uoa.gr Abstract.

More information

EarthCube and Cyberinfrastructure for the Earth Sciences: Lessons and Perspective from OpenTopography

EarthCube and Cyberinfrastructure for the Earth Sciences: Lessons and Perspective from OpenTopography EarthCube and Cyberinfrastructure for the Earth Sciences: Lessons and Perspective from OpenTopography Christopher Crosby, San Diego Supercomputer Center J Ramon Arrowsmith, Arizona State University Chaitan

More information

Digital Curation and Preservation: Defining the Research Agenda for the Next Decade

Digital Curation and Preservation: Defining the Research Agenda for the Next Decade Storage Resource Broker Digital Curation and Preservation: Defining the Research Agenda for the Next Decade Reagan W. Moore moore@sdsc.edu http://www.sdsc.edu/srb Background NARA research prototype persistent

More information

Opus: University of Bath Online Publication Store

Opus: University of Bath Online Publication Store Patel, M. (2004) Semantic Interoperability in Digital Library Systems. In: WP5 Forum Workshop: Semantic Interoperability in Digital Library Systems, DELOS Network of Excellence in Digital Libraries, 2004-09-16-2004-09-16,

More information

Jeffery S. Horsburgh. Utah Water Research Laboratory Utah State University

Jeffery S. Horsburgh. Utah Water Research Laboratory Utah State University Advancing a Services Oriented Architecture for Sharing Hydrologic Data Jeffery S. Horsburgh Utah Water Research Laboratory Utah State University D.G. Tarboton, D.R. Maidment, I. Zaslavsky, D.P. Ames, J.L.

More information

DataONE Enabling Cyberinfrastructure for the Biological, Environmental and Earth Sciences

DataONE Enabling Cyberinfrastructure for the Biological, Environmental and Earth Sciences DataONE Enabling Cyberinfrastructure for the Biological, Environmental and Earth Sciences William K. Michener 1,2, Rebecca Koskela 1,2, Matthew B. Jones 2,3, Robert B. Cook 2,4, Mike Frame 2,5, Bruce Wilson

More information

GMA-PSMH: A Semantic Metadata Publish-Harvest Protocol for Dynamic Metadata Management Under Grid Environment

GMA-PSMH: A Semantic Metadata Publish-Harvest Protocol for Dynamic Metadata Management Under Grid Environment GMA-PSMH: A Semantic Metadata Publish-Harvest Protocol for Dynamic Metadata Management Under Grid Environment Yaping Zhu, Ming Zhang, Kewei Wei, and Dongqing Yang School of Electronics Engineering and

More information

Provenance Collection Support in the Kepler Scientific Workflow System

Provenance Collection Support in the Kepler Scientific Workflow System Provenance Collection Support in the Kepler Scientific Workflow System Ilkay Altintas 1, Oscar Barney 2, and Efrat Jaeger-Frank 1 1 San Diego Supercomputer Center, University of California, San Diego,

More information

The Virtual Observatory and the IVOA

The Virtual Observatory and the IVOA The Virtual Observatory and the IVOA The Virtual Observatory Emergence of the Virtual Observatory concept by 2000 Concerns about the data avalanche, with in mind in particular very large surveys such as

More information

Semantics and Ontologies For EarthCube

Semantics and Ontologies For EarthCube Semantics and Ontologies For EarthCube Gary Berg-Cross 1, Isabel Cruz 2, Mike Dean 3, Tim Finin 4, Mark Gahegan 5, Pascal Hitzler 6, Hook Hua 7, Krzysztof Janowicz 8, Naicong Li 9, Philip Murphy 9, Bryce

More information

Opal: Simple Web Services Wrappers for Scientific Applications

Opal: Simple Web Services Wrappers for Scientific Applications Opal: Simple Web Services Wrappers for Scientific Applications Sriram Krishnan*, Brent Stearn, Karan Bhatia, Kim K. Baldridge, Wilfred W. Li, Peter Arzberger *sriram@sdsc.edu ICWS 2006 - Sept 21, 2006

More information

DATA MANAGEMENT SYSTEMS FOR SCIENTIFIC APPLICATIONS

DATA MANAGEMENT SYSTEMS FOR SCIENTIFIC APPLICATIONS DATA MANAGEMENT SYSTEMS FOR SCIENTIFIC APPLICATIONS Reagan W. Moore San Diego Supercomputer Center San Diego, CA, USA Abstract Scientific applications now have data management requirements that extend

More information

Parameter Space Exploration using Scientific Workflows

Parameter Space Exploration using Scientific Workflows Parameter Space Exploration using Scientific Workflows David Abramson 1, Blair Bethwaite 1, Colin Enticott 1, Slavisa Garic 1, Tom Peachey 1 1 Faculty of Information Technology, Monash University, Clayton,

More information

and the bringing cabig cancer data to the.net Developer and Microsoft Office User Communities

and the bringing cabig cancer data to the.net Developer and Microsoft Office User Communities and the bringing cabig cancer data to the.net Developer and Microsoft Office User Communities http://xl-cabig-client.sourceforge.net/ Robert Macura Tom Macura escience Workshop October 2005 Science Paradigms

More information

Discovery Net : A UK e-science Pilot Project for Grid-based Knowledge Discovery Services. Patrick Wendel Imperial College, London

Discovery Net : A UK e-science Pilot Project for Grid-based Knowledge Discovery Services. Patrick Wendel Imperial College, London Discovery Net : A UK e-science Pilot Project for Grid-based Knowledge Discovery Services Patrick Wendel Imperial College, London Data Mining and Exploration Middleware for Distributed and Grid Computing,

More information

Wade Sheldon. Georgia Coastal Ecosystems LTER University of Georgia CUAHSI Virtual Workshop Field Data Management Solutions

Wade Sheldon. Georgia Coastal Ecosystems LTER University of Georgia   CUAHSI Virtual Workshop Field Data Management Solutions Wade Sheldon Georgia Coastal Ecosystems LTER University of Georgia email: sheldon@uga.edu CUAHSI Virtual Workshop Field Data Management Solutions 01-Oct-2014 Georgia Coastal Ecosystems LTER started in

More information

Juliusz Pukacki OGF25 - Grid technologies in e-health Catania, 2-6 March 2009

Juliusz Pukacki OGF25 - Grid technologies in e-health Catania, 2-6 March 2009 Grid Technologies for Cancer Research in the ACGT Project Juliusz Pukacki (pukacki@man.poznan.pl) OGF25 - Grid technologies in e-health Catania, 2-6 March 2009 Outline ACGT project ACGT architecture Layers

More information

Putting the Archives to Work: Workflow and Metadata-driven Analysis in LTER Science

Putting the Archives to Work: Workflow and Metadata-driven Analysis in LTER Science Putting the Archives to Work: Workflow and Metadata-driven Analysis in LTER Science Wade Sheldon Georgia Coastal Ecosystems LTER University of Georgia Acknowledgements: John Porter (Virginia Coast Reserve

More information

Java Development and Grid Computing with the Globus Toolkit Version 3

Java Development and Grid Computing with the Globus Toolkit Version 3 Java Development and Grid Computing with the Globus Toolkit Version 3 Michael Brown IBM Linux Integration Center Austin, Texas Page 1 Session Introduction Who am I? mwbrown@us.ibm.com Team Leader for Americas

More information

Portals and workflows: Taverna Workbench. Paolo Romano National Cancer Research Institute, Genova

Portals and workflows: Taverna Workbench. Paolo Romano National Cancer Research Institute, Genova Portals and workflows: Taverna Workbench Paolo Romano National Cancer Research Institute, Genova (paolo.romano@istge.it) 1 Summary Information and data integration in biology Web Services and workflow

More information

Reducing Consumer Uncertainty

Reducing Consumer Uncertainty Spatial Analytics Reducing Consumer Uncertainty Towards an Ontology for Geospatial User-centric Metadata Introduction Cooperative Research Centre for Spatial Information (CRCSI) in Australia Communicate

More information

Science-as-a-Service

Science-as-a-Service Science-as-a-Service The iplant Foundation Rion Dooley Edwin Skidmore Dan Stanzione Steve Terry Matthew Vaughn Outline Why, why, why! When duct tape isn t enough Building an API for the web Core services

More information

Semantic Web Technologies

Semantic Web Technologies 1/33 Semantic Web Technologies Lecture 11: SWT for the Life Sciences 4: BioRDF and Scientifc Workflows Maria Keet email: keet -AT- inf.unibz.it home: http://www.meteck.org blog: http://keet.wordpress.com/category/computer-science/72010-semwebtech/

More information

Dynamic, Rule-based Quality Control Framework for Real-time Sensor Data

Dynamic, Rule-based Quality Control Framework for Real-time Sensor Data Dynamic, Rule-based Quality Control Framework for Real-time Sensor Data Wade Sheldon Georgia Coastal Ecosystems LTER University of Georgia Introduction Quality Control of high volume, real-time data from

More information

Agent-Enabling Transformation of E-Commerce Portals with Web Services

Agent-Enabling Transformation of E-Commerce Portals with Web Services Agent-Enabling Transformation of E-Commerce Portals with Web Services Dr. David B. Ulmer CTO Sotheby s New York, NY 10021, USA Dr. Lixin Tao Professor Pace University Pleasantville, NY 10570, USA Abstract:

More information

Generating EML from a Relational Database Management System (RDBMS)

Generating EML from a Relational Database Management System (RDBMS) Page 13 of 34 Generating EML from a Relational Database Management System (RDBMS) - Zhiqiang Yang and Don Henshaw (AND) The Ecological Metadata Language (EML) is an open metadata specification and provides

More information

Workflow Fault Tolerance for Kepler. Sven Köhler, Thimothy McPhillips, Sean Riddle, Daniel Zinn, Bertram Ludäscher

Workflow Fault Tolerance for Kepler. Sven Köhler, Thimothy McPhillips, Sean Riddle, Daniel Zinn, Bertram Ludäscher Workflow Fault Tolerance for Kepler Sven Köhler, Thimothy McPhillips, Sean Riddle, Daniel Zinn, Bertram Ludäscher Introduction Scientific Workflows Automate scientific pipelines Have long running computations

More information

Model Driven Dynamic Composition of Web Services Flow for Business Process Integration

Model Driven Dynamic Composition of Web Services Flow for Business Process Integration OMG s 2nd Workshop On Web Services Modeling, Architectures, Infrastructures And Standards Model Driven Dynamic Composition of Web Services Flow for Business Process Integration Liang-Jie Zhang, Jen-Yao

More information

Integrating Existing Scientific Workflow Systems: The Kepler/Pegasus Example

Integrating Existing Scientific Workflow Systems: The Kepler/Pegasus Example Integrating Existing Scientific Workflow Systems: The Kepler/Pegasus Example Nandita Mandal, Ewa Deelman, Gaurang Mehta, Mei-Hui Su, Karan Vahi USC Information Sciences Institute Marina Del Rey, CA 90292

More information

Information Workbench

Information Workbench Information Workbench The Optique Technical Solution Christoph Pinkel, fluid Operations AG Optique: What is it, really? 3 Optique: End-user Access to Big Data 4 Optique: Scalable Access to Big Data 5 The

More information

Hands-on tutorial on usage the Kepler Scientific Workflow System

Hands-on tutorial on usage the Kepler Scientific Workflow System Hands-on tutorial on usage the Kepler Scientific Workflow System (including INDIGO-DataCloud extension) RIA-653549 Michał Konrad Owsiak (@mkowsiak) Poznan Supercomputing and Networking Center michal.owsiak@man.poznan.pl

More information

Engaging and Connecting Faculty:

Engaging and Connecting Faculty: Engaging and Connecting Faculty: Research Discovery, Access, Re-use, and Archiving Janet McCue and Jon Corson-Rikert Albert R. Mann Library Cornell University CNI Spring 2007 Task Force Meeting April 16,

More information

The NeuroLOG Platform Federating multi-centric neuroscience resources

The NeuroLOG Platform Federating multi-centric neuroscience resources Software technologies for integration of process and data in medical imaging The Platform Federating multi-centric neuroscience resources Johan MONTAGNAT Franck MICHEL Vilnius, Apr. 13 th 2011 ANR-06-TLOG-024

More information

metamatrix enterprise data services platform

metamatrix enterprise data services platform metamatrix enterprise data services platform Bridge the Gap with Data Services Leaders of high-priority application projects in businesses and government agencies are looking to complete projects efficiently,

More information

Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21)

Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21) Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21) NSF-wide Cyberinfrastructure Vision People, Sustainability, Innovation, Integration Alan Blatecky Director OCI 1 1 Framing the

More information

2/12/11. Addendum (different syntax, similar ideas): XML, JSON, Motivation: Why Scientific Workflows? Scientific Workflows

2/12/11. Addendum (different syntax, similar ideas): XML, JSON, Motivation: Why Scientific Workflows? Scientific Workflows Addendum (different syntax, similar ideas): XML, JSON, Python (a) Python (b) w/ dickonaries XML (a): "meta schema" JSON syntax LISP Source: h:p://en.wikipedia.org/wiki/json XML (b): "direct" schema Source:

More information

Implementing Trusted Digital Repositories

Implementing Trusted Digital Repositories Implementing Trusted Digital Repositories Reagan W. Moore, Arcot Rajasekar, Richard Marciano San Diego Supercomputer Center 9500 Gilman Drive, La Jolla, CA 92093-0505 {moore, sekar, marciano}@sdsc.edu

More information

Development of an Ontology-Based Portal for Digital Archive Services

Development of an Ontology-Based Portal for Digital Archive Services Development of an Ontology-Based Portal for Digital Archive Services Ching-Long Yeh Department of Computer Science and Engineering Tatung University 40 Chungshan N. Rd. 3rd Sec. Taipei, 104, Taiwan chingyeh@cse.ttu.edu.tw

More information

Software + Services for Data Storage, Management, Discovery, and Re-Use

Software + Services for Data Storage, Management, Discovery, and Re-Use Software + Services for Data Storage, Management, Discovery, and Re-Use CODATA 22 Conference Stellenbosch, South Africa 25 October 2010 Alex D. Wade Director Scholarly Communication Microsoft External

More information

Datagridflows: Managing Long-Run Processes on Datagrids

Datagridflows: Managing Long-Run Processes on Datagrids Datagridflows: Managing Long-Run Processes on Datagrids Arun Jagatheesan 1,2, Jonathan Weinberg 1, Reena Mathew 1, Allen Ding 1, Erik Vandekieft 1, Daniel Moore 1,3, Reagan Moore 1, Lucas Gilbert 1, Mark

More information

Introduction to GT3. Introduction to GT3. What is a Grid? A Story of Evolution. The Globus Project

Introduction to GT3. Introduction to GT3. What is a Grid? A Story of Evolution. The Globus Project Introduction to GT3 The Globus Project Argonne National Laboratory USC Information Sciences Institute Copyright (C) 2003 University of Chicago and The University of Southern California. All Rights Reserved.

More information

Grid Programming: Concepts and Challenges. Michael Rokitka CSE510B 10/2007

Grid Programming: Concepts and Challenges. Michael Rokitka CSE510B 10/2007 Grid Programming: Concepts and Challenges Michael Rokitka SUNY@Buffalo CSE510B 10/2007 Issues Due to Heterogeneous Hardware level Environment Different architectures, chipsets, execution speeds Software

More information

ACET s e-research Activities

ACET s e-research Activities 18 June 2008 1 Computing Resources 2 Computing Resources Scientific discovery and advancement of science through advanced computing Main Research Areas Computational Science Middleware Technologies for

More information

Delivering Data Management for Engineers on the Grid 1

Delivering Data Management for Engineers on the Grid 1 Delivering Data Management for Engineers on the Grid 1 Jasmin Wason, Marc Molinari, Zhuoan Jiao, and Simon J. Cox School of Engineering Sciences, University of Southampton, UK {j.l.wason, m.molinari, z.jiao,

More information

Networking European Digital Repositories

Networking European Digital Repositories Networking European Digital Repositories What to Network? Researchers generate knowledge This is going to become an amazing paper I only hope I will be able to access it Knowledge is wrapped in publications

More information

Webinar Annotate data in the EUDAT CDI

Webinar Annotate data in the EUDAT CDI Webinar Annotate data in the EUDAT CDI Yann Le Franc - e-science Data Factory, Paris, France March 16, 2017 This work is licensed under the Creative Commons CC-BY 4.0 licence. Attribution: Y. Le Franc

More information

Teiid Designer User Guide 7.5.0

Teiid Designer User Guide 7.5.0 Teiid Designer User Guide 1 7.5.0 1. Introduction... 1 1.1. What is Teiid Designer?... 1 1.2. Why Use Teiid Designer?... 2 1.3. Metadata Overview... 2 1.3.1. What is Metadata... 2 1.3.2. Editing Metadata

More information

Bio-Workflows with BizTalk: Using a Commercial Workflow Engine for escience

Bio-Workflows with BizTalk: Using a Commercial Workflow Engine for escience Bio-Workflows with BizTalk: Using a Commercial Workflow Engine for escience Asbjørn Rygg, Scott Mann, Paul Roe, On Wong Queensland University of Technology Brisbane, Australia a.rygg@student.qut.edu.au,

More information

Managing the Emerging Semantic Risks

Managing the Emerging Semantic Risks The New Information Security Agenda: Managing the Emerging Semantic Risks Dr Robert Garigue Vice President for information integrity and Chief Security Executive Bell Canada Page 1 Abstract Today all modern

More information

JAVA COURSES. Empowering Innovation. DN InfoTech Pvt. Ltd. H-151, Sector 63, Noida, UP

JAVA COURSES. Empowering Innovation. DN InfoTech Pvt. Ltd. H-151, Sector 63, Noida, UP 2013 Empowering Innovation DN InfoTech Pvt. Ltd. H-151, Sector 63, Noida, UP contact@dninfotech.com www.dninfotech.com 1 JAVA 500: Core JAVA Java Programming Overview Applications Compiler Class Libraries

More information

Florida Coastal Everglades LTER Program

Florida Coastal Everglades LTER Program Florida Coastal Everglades LTER Program Metadata Workshop April 13, 2007 Linda Powell, FCE Information Manager Workshop Objectives I. Short Introduction to the FCE Metadata Policy What needs to be submitted

More information

Executive Committee Meeting

Executive Committee Meeting Executive Committee Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Implementing the Army Net Centric Data Strategy in a Service Oriented Environment

Implementing the Army Net Centric Data Strategy in a Service Oriented Environment Implementing the Army Net Centric Strategy in a Service Oriented Environment Michelle Dirner Army Net Centric Strategy (ANCDS) Center of Excellence (CoE) Service Team Lead RDECOM CERDEC SED in support

More information

Approaches & Languages for Schema Transformation

Approaches & Languages for Schema Transformation Approaches & Languages for Schema Transformation Findings of HUMBOLDT & follow-up Activities INSPIRE KEN Workshop on Schema Transformation Paris, France, 08.10.2013 Thorsten Reitz Esri R&D Center Zurich

More information

20. Situational Applications and Mashups

20. Situational Applications and Mashups 20. Situational Applications and Mashups 5 November 2008 Bob Glushko Plan for Today's Lecture Platforms for Composite Applications Mash-ups Mash-ups {and,or,vs} Composite Applications A Vision: Rapid Service

More information

Analysis and summary of stakeholder recommendations First Kepler/CORE Stakeholders Meeting, May 13-15, 2008

Analysis and summary of stakeholder recommendations First Kepler/CORE Stakeholders Meeting, May 13-15, 2008 Analysis and summary of stakeholder recommendations First Kepler/CORE Stakeholders Meeting, May 13-15, 2008 I. Assessing Kepler/CORE development priorities The first Kepler Stakeholder s meeting brought

More information

A three tier architecture applied to LiDAR processing and monitoring

A three tier architecture applied to LiDAR processing and monitoring Scientific Programming 14 (2006) 185 194 185 IOS Press A three tier architecture applied to LiDAR processing and monitoring Efrat Jaeger-Frank a,, Christopher J. Crosby b, Ashraf Memon a, Viswanath Nandigam

More information

Workflow, Planning and Performance Information, information, information Dr Andrew Stephen M c Gough

Workflow, Planning and Performance Information, information, information Dr Andrew Stephen M c Gough Workflow, Planning and Performance Information, information, information Dr Andrew Stephen M c Gough Technical Coordinator London e-science Centre Imperial College London 17 th March 2006 Outline Where

More information

ITM DEVELOPMENT (ITMD)

ITM DEVELOPMENT (ITMD) ITM Development (ITMD) 1 ITM DEVELOPMENT (ITMD) ITMD 361 Fundamentals of Web Development This course will cover the creation of Web pages and sites using HTML, CSS, Javascript, jquery, and graphical applications

More information

Introduction to Federation Server

Introduction to Federation Server Introduction to Federation Server Alex Lee IBM Information Integration Solutions Manager of Technical Presales Asia Pacific 2006 IBM Corporation WebSphere Federation Server Federation overview Tooling

More information

Multiple Broker Support by Grid Portals* Extended Abstract

Multiple Broker Support by Grid Portals* Extended Abstract 1. Introduction Multiple Broker Support by Grid Portals* Extended Abstract Attila Kertesz 1,3, Zoltan Farkas 1,4, Peter Kacsuk 1,4, Tamas Kiss 2,4 1 MTA SZTAKI Computer and Automation Research Institute

More information

THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel

THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel National Center for Supercomputing Applications University of Illinois

More information

IBM Rational Application Developer for WebSphere Software, Version 7.0

IBM Rational Application Developer for WebSphere Software, Version 7.0 Visual application development for J2EE, Web, Web services and portal applications IBM Rational Application Developer for WebSphere Software, Version 7.0 Enables installation of only the features you need

More information

USING THE BUSINESS PROCESS EXECUTION LANGUAGE FOR MANAGING SCIENTIFIC PROCESSES. Anna Malinova, Snezhana Gocheva-Ilieva

USING THE BUSINESS PROCESS EXECUTION LANGUAGE FOR MANAGING SCIENTIFIC PROCESSES. Anna Malinova, Snezhana Gocheva-Ilieva International Journal "Information Technologies and Knowledge" Vol.2 / 2008 257 USING THE BUSINESS PROCESS EXECUTION LANGUAGE FOR MANAGING SCIENTIFIC PROCESSES Anna Malinova, Snezhana Gocheva-Ilieva Abstract:

More information

Kepler Scientific Workflow and Climate Modeling

Kepler Scientific Workflow and Climate Modeling Kepler Scientific Workflow and Climate Modeling Ufuk Turuncoglu Istanbul Technical University Informatics Institute Cecelia DeLuca Sylvia Murphy NOAA/ESRL Computational Science and Engineering Dept. NESII

More information