A(nother) Vision of pPOD Data Integration!?

1 Scientific Workflows: A(nother) Vision of pPOD Data Integration!?
Bertram Ludäscher, Shawn Bowers, Timothy McPhillips, Dave Thau
Dept. of Computer Science & UC Davis Genome Center, University of California, Davis

2 Overview
- Scientific workflows: overview and vision
- Examples using Kepler (from NSF/ITR SEEK)
- Provenance in scientific workflows: from single runs to project histories
- pPOD & Kepler: next steps

3 Different Kinds of (Data) Integration
- Traditional information (& data) integration
  - syntactic & structural heterogeneities, schema mappings, schema matching, query rewriting (parsing, matching, [G]LAV, Chase [+IC], Resolution)
  - deals with fundamentally the same (largely overlapping) information: find ways to integrate different representations
- Scientific Information Integration (SII)
  - includes the above, but often deals with combining fundamentally different information
  - there is more than one way to combine and integrate the data
  - integration invokes scientific theories and models that cannot be inferred from data, schemas, and ontologies alone
  - joining of data and chaining of analysis steps happen in the scientist's head ( y := f(x); z := g(x,y); )
  - make these analysis pipelines first-class citizens
  - scientific workflows can provide an end-to-end framework
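The chained steps y := f(x); z := g(x,y) can be written down as an explicit object instead of staying implicit in a script or in the scientist's head. The following is a minimal Python sketch of that idea, not Kepler's API: `Step`, `Pipeline`, and the toy functions `f` and `g` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Step:
    name: str                # variable the step produces
    func: Callable           # analysis function to apply
    inputs: Tuple[str, ...]  # names of the variables it consumes


@dataclass
class Pipeline:
    steps: List[Step] = field(default_factory=list)

    def add(self, name, func, *inputs):
        """Append a step and return self so calls can be chained."""
        self.steps.append(Step(name, func, inputs))
        return self

    def run(self, **initial):
        """Execute the steps in order and return every named value."""
        env = dict(initial)
        for step in self.steps:
            env[step.name] = step.func(*(env[i] for i in step.inputs))
        return env

    def describe(self):
        """Render the pipeline itself as data that can be inspected or shared."""
        return [f"{s.name} := {s.func.__name__}({', '.join(s.inputs)})"
                for s in self.steps]


def f(x):  # toy analysis step (hypothetical)
    return x + 1


def g(x, y):  # toy analysis step (hypothetical)
    return x * y


pipeline = Pipeline().add("y", f, "x").add("z", g, "x", "y")
result = pipeline.run(x=3)
print(pipeline.describe())
print(result)
```

Because the pipeline is a value, it can be documented, stored, and rerun on other inputs, which is the sense of "first-class" used on the slide.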

4 Types of Information Integration
- Conventional information integration: schema-based, view-based, at the instance level
  [Figure: layered federated-schema diagram with data sources, local schemas, component schemas, export schemas, and federated schemas]
- Spatial (co-)registration / overlay of different data: 2D, 3D, 4D (x,y,z,t), (4+n)D: "GIS++"
- Extended DI approaches using ontologies, controlled vocabularies, metadata, annotations
- Scientific Information Integration = data + process/application integration
  - scientific workflows can include all the others, plus statistics, data mining, visualization, ...

5 Scientific Workflows = Cyberinfrastructure "Upperware"
[Figure: cyberinfrastructure layers: Upperware, Upper Middleware, Middleware, Underware. Example: Science Environment for Ecological Knowledge (SEEK)]

6 Science Environment for Ecological Knowledge (SEEK)
- EcoGrid
  - Access distributed environmental, ecological, and systematics data
  - Enable data sharing & reuse
  - Enhance data discovery at global scales
  - Distributed data network
- Kepler
  - Design, reuse, and execute scientific analyses
  - Enable communication and collaboration for analysis
  - Enable reuse of analytical components and analyses
  - Integrated data access
- SMS / OBOE / Taxonomic concept services
  - Data discovery and integration
  - Addressing a variety of semantic data heterogeneity issues
  - Ontology and controlled-vocabulary development
  - Semantic data and actor annotations
  - Resolve taxonomic ambiguities

7 Kepler Data Access via the EcoGrid
- Lightweight API for providers & clients, implemented via web services
- Common metadata query syntax
- Common mechanism for accessing ecological (KNB), museum specimen (DiGIR), environmental (SRB), and geological (GEON) data
- Catalog-based integration
  - NOT a single CDM (common data model): leave the integration to the workflow designer!

8 Scientific Workflow
- Capture how a scientist works with data and analytical tools
  - data access, transformation, analysis, visualization
  - possible worldview: dataflow-oriented
- Scientific workflow (wf) benefits (compared with script-based approaches):
  - wf automation
  - wf & component reuse
  - wf design, documentation
  - wf archival, sharing
  - built-in concurrency (task and pipeline parallelism)
  - built-in provenance support
  - distributed execution (Grid) support

9 Kepler Collaboration (alive and evolving)
- Open source; builds on Ptolemy II from UC Berkeley
- Contributors from: SEEK, SciDAC SDM, Ptolemy, GEON, ROADNet, Resurgence, AToL (CIPRES, Phyl-O'Data (POD)), Natural Diversity Discovery Project
- Goals: create powerful analytical tools that are useful across disciplines (ecology, biology, engineering, geology, physics, chemistry, astronomy, ...)

10 Basic Kepler User Interface
[Screenshot callouts: Tool Bar, Quick Search, Workflow Canvas, Actor Libraries, Thumbnail Navigation]

11 Kepler Data Access via the EcoGrid
- Data Quick Search tab
- Metadata keyword search
- Access multiple EcoGrid sources
- Return data sets as actors to drag and drop onto the canvas

12 Input/Output Semantic Annotation
- Actor input/output port annotation:
  - Each port can be annotated with multiple classes from multiple ontologies
  - Annotations are stored with actor metadata (MoML)
  - Actors can be discovered, validated, etc., via their semantic types
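To make discovery and validation via semantic types concrete, here is a small Python sketch under stated assumptions. It does not reproduce Kepler's annotation format or MoML; the toy ontology, the `Actor` class, and the actor and port names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Toy ontology (hypothetical terms): child class -> parent class.
SUBCLASS_OF = {
    "BacterialAbundance": "Abundance",
    "Abundance": "Measurement",
    "Temperature": "Measurement",
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


@dataclass
class Actor:
    name: str
    # port name -> set of ontology classes annotating that port
    inputs: Dict[str, Set[str]] = field(default_factory=dict)
    outputs: Dict[str, Set[str]] = field(default_factory=dict)


def find_actors_accepting(actors, cls):
    """Discovery: actors with an input port whose annotation covers cls."""
    return [a.name for a in actors
            if any(is_a(cls, ann)
                   for anns in a.inputs.values() for ann in anns)]


def compatible(producer, out_port, consumer, in_port):
    """Validation: every class required on the input port must be
    satisfied by some class annotating the output port."""
    return all(any(is_a(o, i) for o in producer.outputs[out_port])
               for i in consumer.inputs[in_port])


reader = Actor("ReadAbundance", outputs={"data": {"BacterialAbundance"}})
summarize = Actor("Summarize", inputs={"values": {"Measurement"}})
plot_temp = Actor("PlotTemperature", inputs={"temps": {"Temperature"}})
library = [reader, summarize, plot_temp]

print(find_actors_accepting(library, "BacterialAbundance"))
print(compatible(reader, "data", summarize, "values"))
print(compatible(reader, "data", plot_temp, "temps"))
```

The subclass check is what lets a generic actor annotated with a broad class accept more specific data, while a mismatched connection is rejected before the workflow runs.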

13 Actor Annotations
- Actor annotations for indexing & classification:
  - New actors can be annotated and indexed into the component library (e.g., specializing generic actors)
  - Existing components can also be revised, annotated, and indexed (hiding previous versions)
  - Quick search leverages metadata, including annotations & ontologies

14 Kepler Demo: Building a simple workflow
- Select actors from the Kepler actor library:
  - Local or remote actors
  - View actor metadata/documentation (not shown)
  - Drag the desired actor to the canvas
  - Connect actor ports
[Screenshot callout: other actor examples]

15 Kepler Demo: Building a simple workflow
- Select input data:
  - Shown here is an EcoGrid data source for bacterial abundance
  - Connect data actors to workflow inputs
[Screenshot callout: many ways to import data]

16 Kepler Demo: Building a simple workflow
- Using EcoGrid data sources:
  - Display metadata (EML)
  - Query data via an SQL/QBE interface, even if the source is a tab-delimited file (see above)
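Querying a tab-delimited file with SQL can be illustrated with the Python standard library alone. This is a sketch of the general technique, not of how Kepler or the EcoGrid implement their query interface; the table name `samples`, the column names, and the values are made up.

```python
import csv
import io
import sqlite3

# Stand-in for a tab-delimited data file (hypothetical contents).
TSV = "site\tdepth_m\tabundance\nA\t1\t120\nA\t5\t80\nB\t1\t200\n"


def load_tsv(conn, table, text):
    """Create a table from the header row and load the remaining rows."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)


conn = sqlite3.connect(":memory:")
load_tsv(conn, "samples", TSV)

# Values arrive as text, so cast before aggregating.
rows = conn.execute(
    "SELECT site, SUM(CAST(abundance AS INTEGER)) "
    "FROM samples GROUP BY site ORDER BY site"
).fetchall()
print(rows)
```

In a real setting the column names and types would come from the data set's metadata (for example EML) rather than from the header row alone.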

17 Kepler Demo: Building a simple workflow
- Run the workflow
- Also: set parameters, select & configure the director, run window, etc.

18 SEEK Ecological Niche Modeling Workflows
- Complex workflows with many levels of nesting (sub-workflows)
- Predict species locations from presence data and environmental layers
- Designed to support different prediction algorithms (reusability)
- Currently uses GARP (Genetic Algorithm for Rule-Set Prediction)
[Screenshot callout: n levels down]

19 Drilling down: Calculate Best Rulesets
[Screenshot callout: climate change data]

20 SEEK Ecological Niche Modeling Workflows
- Includes a number of workflows for automating special-purpose data-integration tasks
  - Integration of multiple data sets and data types
  - Workflows for local caching of data, format and content conversions
  - Rescale grid data: adjust resolutions and extents, merge grids
  - Integrate Hydro1K North and South American data, including warp/projection, format conversion, etc.
[Screenshot callout: rescaling]
NESCENT, Sept 07

21 The Joy of Exa-Scale Cyberinfrastructure
- Are we working at the right level of abstraction?
- Are we optimizing the right thing?
  - Optimize human cycles, not just CPU cycles! (cf. John McCarthy, of AI/LISP fame)
- Make data & scientific workflows effectively (re-)usable for scientists
- Make workflows first-class, shareable knowledge artifacts
- Support user-oriented provenance queries

22 (Data) Provenance & Scientific Workflows
- (Data) provenance: data lineage, processing history
- Query the lineage of a data product: what data it is derived from, and how
- Evaluate the results of a workflow: is the approach correct?
- Reuse intermediate or final products of one workflow in another
- Explain unexpected results
- Discover all results derived from a given data set
- Accurately prepare the methods section of a publication
- Archive scientific results in a repository
- Replicate the results reported by another researcher

23 Inferring a phylogenetic tree from disparate data
[Figure: workflow diagram]
- Datasets (input): aligned DNA sequences, discrete morphological data, continuous characters
- Actors: maximum likelihood tree (DNA), maximum parsimony tree, maximum likelihood tree (continuous characters), integrate
- Datasets (output): consensus tree(s)
- Provenance store

24 Scientific provenance questions (single run)
- What DNA sequences were input to (and what phylogenetic trees were output by) the workflow?
- What intermediate phylogenetic trees were created?
- Which actor created this phylogenetic tree?
- Which input sequences does this consensus tree depend on?
- Which input sequences were not used to derive any consensus tree?
- What sequence alignment (key intermediate data) was used to infer this tree?

25 A (very) simple phylogenetics workflow

26 Data lineage + processing history for a consensus tree
[Figure: lineage graph whose edges are labeled with actor invocations such as TextFileReader:1, NexusFileParser:1, PhylipPars:1, and PhylipConsense:1]
- Derivation (processing history) of a data item in a scientific workflow run (a DAG)
  - Nodes = data items the workflow run operated on or created
  - Edges = "was directly used in", labeled by the actor invocation that performed the computation
- Different (emerging) provenance extensions to Kepler
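A lineage DAG of this kind is enough to answer the single-run provenance questions listed earlier. The sketch below is a minimal Python illustration, not one of the Kepler provenance extensions. The data item names and the `Align:1` invocation are hypothetical, while `PhylipPars:1` and `PhylipConsense:1` are borrowed from the figure labels.

```python
from collections import defaultdict

# Each edge: (data item used, actor invocation, data item generated).
EDGES = [
    ("seq_A", "Align:1", "alignment_1"),
    ("seq_B", "Align:1", "alignment_1"),
    ("alignment_1", "PhylipPars:1", "tree_1"),
    ("alignment_1", "PhylipPars:1", "tree_2"),
    ("tree_1", "PhylipConsense:1", "consensus_1"),
    ("tree_2", "PhylipConsense:1", "consensus_1"),
]
# seq_C was supplied to the run but never consumed by any actor.
INPUTS = {"seq_A", "seq_B", "seq_C"}

parents = defaultdict(set)   # item -> items it was directly derived from
children = defaultdict(set)  # item -> items directly derived from it
creator = defaultdict(set)   # item -> actor invocations that produced it
for used, actor, generated in EDGES:
    parents[generated].add(used)
    children[used].add(generated)
    creator[generated].add(actor)


def closure(start, neighbors):
    """All nodes reachable from start by following neighbors transitively."""
    seen, stack = set(), [start]
    while stack:
        for nxt in neighbors.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def lineage(item):
    """Everything the item depends on."""
    return closure(item, parents)


def derived_from(item):
    """Everything computed, directly or indirectly, from the item."""
    return closure(item, children)


def unused_inputs(inputs, products):
    """Inputs that no listed product depends on."""
    used = set().union(*(lineage(p) for p in products))
    return inputs - used


print(sorted(lineage("consensus_1") & INPUTS))
print(creator["consensus_1"])
print(unused_inputs(INPUTS, ["consensus_1"]))
```

Both directions of traversal matter: `lineage` answers "what does this tree depend on?", while `derived_from` answers "which results must be revisited if this input changes?".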

27 Provenance: Single Run

28 Provenance: Multiple Runs

29 Conceptual workflows: series of subworkflows

30 Manual, data visualization, and quality assessment steps are interleaved with automated steps

31 Projects comprise multiple conceptual workflows

32 Workflows are run multiple times with different parameter settings

33 How Kepler is used today
- Aware of only one workflow, one run at a time
- Data, workflows, and provenance records reside outside the system between runs
- Users must perform most data and provenance management outside of the system
- Workflows must be modified or reconfigured to operate on different input data

34 Support for project folders & histories
- Data is registered. Project folders allow users to organize data.
- Project history records and depicts past workflow runs and the flow of data between runs.
- Data is staged from the project folders (and project history).
- Run outputs appear in the project history (along with the input) if the run is committed.
- All or part of the output of a run may be used to update the project folders.
- Workflows can be applied to different data sets without being modified or reconfigured.
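The bookkeeping behind project folders and histories can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the design from the project-histories work: `ProjectHistory`, `Run`, and the data names and values are hypothetical, and data items are plain strings.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Run:
    run_id: int
    workflow: str
    params: dict
    inputs: Dict[str, int]   # data name -> version that was staged
    outputs: Dict[str, int]  # data name -> version that was produced


class ProjectHistory:
    def __init__(self):
        self.versions = {}  # data name -> list of values; index = version
        self.current = {}   # project folder view: data name -> version
        self.runs = []      # committed runs, oldest first

    def _store(self, name, value):
        self.versions.setdefault(name, []).append(value)
        return len(self.versions[name]) - 1

    def register(self, name, value):
        """Register new data and show it in the project folder."""
        self.current[name] = self._store(name, value)

    def commit_run(self, workflow, params, input_names, outputs,
                   update_folder=True):
        """Record a run: which versions it used and which it produced."""
        used = {n: self.current[n] for n in input_names}
        produced = {n: self._store(n, v) for n, v in outputs.items()}
        if update_folder:
            self.current.update(produced)
        run = Run(len(self.runs) + 1, workflow, params, used, produced)
        self.runs.append(run)
        return run

    def get(self, name, version=None):
        """Folder view by default; any older version on request."""
        v = self.current[name] if version is None else version
        return self.versions[name][v]


history = ProjectHistory()
history.register("alignment", "aln-v0")
# Same workflow, two parameter settings (values are placeholders).
history.commit_run("infer_tree", {"bootstrap": 100},
                   ["alignment"], {"tree": "tree-a"})
history.commit_run("infer_tree", {"bootstrap": 1000},
                   ["alignment"], {"tree": "tree-b"})
print(history.get("tree"), history.get("tree", version=0))
```

The folder shows only the latest tree, yet the replaced version stays reachable through the history, and each run records exactly which input versions and parameters produced it.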

35 Project history relieves need to perform data versioning via project folders
- Recomputed data can replace old versions, be stored elsewhere in folders, or simply be left in the project history.
- Replaced data are always accessible via the project history.
- Provenance queries provide access to all data regardless of location.

36 Managing workflow evolution
- The workflow library is not a flat list of available workflows.
- Workflows evolve throughout a project, and previous versions must be retained for reference and for further use.
- The workflow evolution view complements the run history.

37 Summary & Next Steps
- Kepler today: used in ecoinformatics (SEEK), ChIP-chip, geoinformatics, ...
  - data catalog, data grid
  - workflows for data integration
  - data annotation and semantic extensions
- Kepler next steps (planned deliverables):
  - PHYLOGENETIC SCIENTIFIC WORKFLOWS
    - Develop use cases / conceptual workflows: tree construction (understood); post-tree analysis, supertree/matrix construction (exciting :)
    - Community-driven!
    - Implement a subset of those in Kepler
    - Generate an actor library targeting community use cases
  - PROJECT HISTORIES SUPPORT (cf. DILS'07 paper)
    - Extend use cases to exploit project histories / provenance
    - Implement those
  - pPOD REPOSITORY (Orchestra!?)
    - Extend Kepler to use the pPOD data repository

38 Consilience: The Unity of Knowledge (E. O. Wilson)
- "Literally a jumping together of knowledge by the linking of facts and fact-based theory across disciplines to create a common groundwork for explanation." (E. O. Wilson)
- e-Science, cyberinfrastructure: mechanisms to make progress
- Scientific workflows: crucial elements to get the most mileage out of CI to fuel e-Science, accelerating knowledge discovery
- Identify the real bottlenecks in this quest!
- "We must know, we will know." (David Hilbert)
- "Anyone who has visions should go see a doctor." (Helmut Schmidt on Willy Brandt)

39 Questions?
kepler-project.org

40 References
- Niche Modeling
  - D Pennington, D Higgins, AT Peterson, M Jones, B Ludaescher, S Bowers. Ecological niche modeling using the Kepler workflow system. Workflows for e-science: Scientific Workflows for Grids, Springer-Verlag.
  - Ecological Niche Modeling in Kepler. User Manual. Draft, 2007.
- Semantic Annotation
  - S Bowers, B Ludaescher. A calculus for propagating semantic annotations through scientific workflow queries. QLQP.
  - S Bowers, B Ludaescher. Actor-oriented design of scientific workflows. ER.
  - C Berkley, S Bowers, M Jones, B Ludaescher, M Schildhauer, J Tao. Incorporating semantics in scientific workflow authoring. SSDBM.
  - S Bowers, B Ludaescher. An ontology-driven framework for data transformation in scientific workflows. DILS.

41 References
- Provenance in Workflows
  - S Bowers, T McPhillips, M Wu, B Ludaescher. Project histories: Managing data provenance across collection-oriented scientific workflow runs. DILS.
  - S Bowers, T McPhillips, B Ludaescher. Provenance in collection-oriented workflows. Concurrency and Computation: Practice and Experience.
  - B Ludaescher, N Podhorszki, I Altintas, S Bowers, T McPhillips. From computation models to models of provenance: The RWS approach. Concurrency and Computation: Practice and Experience, 2007.

42 Additional Related Publications
- Semantic Type Annotation
  - S Bowers, B Ludaescher. A Calculus for Propagating Semantic Annotations through Scientific Workflow Queries. ICDE Workshop on Query Languages and Query Processing (QLQP), LNCS.
  - S Bowers, B Ludaescher. Towards Automatic Generation of Semantic Types in Scientific Workflows. International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS), WISE 2005 Workshop Proceedings, LNCS.
  - C Berkley, S Bowers, M Jones, B Ludaescher, M Schildhauer, J Tao. Incorporating Semantics in Scientific Workflow Authoring. SSDBM.
  - B Ludaescher, K Lin, S Bowers, E Jaeger-Frank, B Brodaric, C Baru. Managing Scientific Data: From Data Integration to Scientific Workflows. GSA Today, Special Issue on Geoinformatics.
  - S Bowers, D Thau, R Williams, B Ludaescher. Data Procurement for Enabling Scientific Workflows: On Exploring Inter-Ant Parasitism. VLDB Workshop on Semantic Web and Databases (SWDB).
  - S Bowers, K Lin, B Ludaescher. On Integrating Scientific Resources through Semantic Registration. SSDBM.
  - S Bowers, B Ludaescher. An Ontology-Driven Framework for Data Transformation in Scientific Workflows. International Workshop on Data Integration in the Life Sciences (DILS), LNCS.
  - S Bowers, B Ludaescher. Towards a Generic Framework for Semantic Registration of Scientific Data. International Semantic Web Conference Workshop on Semantic Web Technologies for Searching and Retrieving Scientific Data.
- Workflow Design and Modeling
  - T McPhillips, S Bowers, B Ludaescher. Collection-Oriented Scientific Workflows for Integrating and Analyzing Biological Data. Workshop on Data Integration in the Life Sciences (DILS), LNCS.
  - S Bowers, T McPhillips, B Ludaescher, S Cohen, SB Davidson. A Model for User-Oriented Data Provenance in Pipelined Scientific Workflows. International Provenance and Annotation Workshop (IPAW), LNCS.
  - S Bowers, B Ludaescher, AHH Ngu, T Critchlow. Enabling Scientific Workflow Reuse through Structured Composition of Dataflow and Control-Flow. IEEE Workshop on Workflow and Data Flow for Scientific Applications (SciFlow).
  - S Bowers, B Ludaescher. Actor-Oriented Design of Scientific Workflows. International Conference on Conceptual Modeling (ER), LNCS.
  - T McPhillips, S Bowers. Pipelining Nested Data Collections in Scientific Workflows. SIGMOD Record.
- Kepler
  - D Pennington, D Higgins, AT Peterson, M Jones, B Ludaescher, S Bowers. Ecological Niche Modeling using the Kepler Workflow System. Workflows for e-science, Springer-Verlag, to appear.
  - W Michener, J Beach, S Bowers, L Downey, M Jones, B Ludaescher, D Pennington, A Rajasekar, S Romanello, M Schildhauer, D Vieglais, J Zhang. SEEK: Data Integration and Workflow Solutions for Ecology. Workshop on Data Integration in the Life Sciences (DILS), LNCS.
  - S Romanello, W Michener, J Beach, M Jones, B Ludaescher, A Rajasekar, M Schildhauer, S Bowers, D Pennington. Creating and Providing Data Management Services for the Biological and Ecological Sciences: Science Environment for Ecological Knowledge. SSDBM, 2005.


More information

Sliding Window Calculations on Streaming Data using the Kepler Scientific Workflow System

Sliding Window Calculations on Streaming Data using the Kepler Scientific Workflow System Available online at www.sciencedirect.com Procedia Computer Science 9 (2012 ) 1639 1646 International Conference on Computational Science, ICCS 2012 Sliding Window Calculations on Streaming Data using

More information

Bio-Workflows with BizTalk: Using a Commercial Workflow Engine for escience

Bio-Workflows with BizTalk: Using a Commercial Workflow Engine for escience Bio-Workflows with BizTalk: Using a Commercial Workflow Engine for escience Asbjørn Rygg, Scott Mann, Paul Roe, On Wong Queensland University of Technology Brisbane, Australia a.rygg@student.qut.edu.au,

More information

Reproducible & Transparent Computational Science with Galaxy. Jeremy Goecks The Galaxy Team

Reproducible & Transparent Computational Science with Galaxy. Jeremy Goecks The Galaxy Team Reproducible & Transparent Computational Science with Galaxy Jeremy Goecks The Galaxy Team 1 Doing Good Science Previous talks: performing an analysis setting up and scaling Galaxy adding tools libraries

More information

Workflow Management in Spatial Studies:

Workflow Management in Spatial Studies: Workflow Management in Spatial Studies: Just an extra document or something with intelligence? Auteur: John Stuiver, Wageningen University Centre for Geo-Information Spatial information Answers to questions

More information

ICD Wiki Framework for Enabling Semantic Web Service Definition and Orchestration

ICD Wiki Framework for Enabling Semantic Web Service Definition and Orchestration ICD Wiki Framework for Enabling Semantic Web Service Definition and Orchestration Dean Brown, Dominick Profico Lockheed Martin, IS&GS, Valley Forge, PA Abstract As Net-Centric enterprises grow, the desire

More information

Final Project Assignments. Promoter Identification Workflow (PIW)

Final Project Assignments. Promoter Identification Workflow (PIW) Final Project Assignments Schema Matching Ji-Yeong Chong Biological Pathways & Ontologies Russell D Sa FCA Theory and Practice Bill Man, Betty Chan Practice of Data Integration (GAV) Jenny Wang Kepler/Data

More information

Towards Rule Learning Approaches to Instance-based Ontology Matching

Towards Rule Learning Approaches to Instance-based Ontology Matching Towards Rule Learning Approaches to Instance-based Ontology Matching Frederik Janssen 1, Faraz Fallahi 2 Jan Noessner 3, and Heiko Paulheim 1 1 Knowledge Engineering Group, TU Darmstadt, Hochschulstrasse

More information

Metadata Zoo Dataset Metadata Rebecca Koskela Execu4ve Director, DataONE

Metadata Zoo Dataset Metadata Rebecca Koskela Execu4ve Director, DataONE Metadata Zoo Dataset Metadata Rebecca Koskela Execu4ve Director, DataONE eurocris September 9, 2013 Outline Data Challenges Metadata Solu=on DataONE addressing the Data Challenge Enabling Scien=fic Discovery

More information

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. 1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. Integrating Complex Financial Workflows in Oracle Database Xavier Lopez Seamus Hayes Oracle PolarLake, LTD 2 Copyright 2011, Oracle

More information

Managing Rapidly-Evolving Scientific Workflows

Managing Rapidly-Evolving Scientific Workflows Managing Rapidly-Evolving Scientific Workflows Juliana Freire, Cláudio T. Silva, Steven P. Callahan, Emanuele Santos, Carlos E. Scheidegger, and Huy T. Vo University of Utah Abstract. We give an overview

More information

Wade Sheldon. Georgia Coastal Ecosystems LTER University of Georgia

Wade Sheldon. Georgia Coastal Ecosystems LTER University of Georgia Wade Sheldon Georgia Coastal Ecosystems LTER University of Georgia email: sheldon@uga.edu Regardless of Q/A procedures, data quality issues guaranteed with environmental sensor data Without good Q/C data

More information

FedX: A Federation Layer for Distributed Query Processing on Linked Open Data

FedX: A Federation Layer for Distributed Query Processing on Linked Open Data FedX: A Federation Layer for Distributed Query Processing on Linked Open Data Andreas Schwarte 1, Peter Haase 1,KatjaHose 2, Ralf Schenkel 2, and Michael Schmidt 1 1 fluid Operations AG, Walldorf, Germany

More information

Provenance-Aware Faceted Search in Drupal

Provenance-Aware Faceted Search in Drupal Provenance-Aware Faceted Search in Drupal Zhenning Shangguan, Jinguang Zheng, and Deborah L. McGuinness Tetherless World Constellation, Computer Science Department, Rensselaer Polytechnic Institute, 110

More information

On the use of Abstract Workflows to Capture Scientific Process Provenance

On the use of Abstract Workflows to Capture Scientific Process Provenance On the use of Abstract Workflows to Capture Scientific Process Provenance Paulo Pinheiro da Silva, Leonardo Salayandia, Nicholas Del Rio, Ann Q. Gates The University of Texas at El Paso CENTER OF EXCELLENCE

More information

Flexible Framework for Mining Meteorological Data

Flexible Framework for Mining Meteorological Data Flexible Framework for Mining Meteorological Data Rahul Ramachandran *, John Rushing, Helen Conover, Sara Graves and Ken Keiser Information Technology and Systems Center University of Alabama in Huntsville

More information

USING THE BUSINESS PROCESS EXECUTION LANGUAGE FOR MANAGING SCIENTIFIC PROCESSES. Anna Malinova, Snezhana Gocheva-Ilieva

USING THE BUSINESS PROCESS EXECUTION LANGUAGE FOR MANAGING SCIENTIFIC PROCESSES. Anna Malinova, Snezhana Gocheva-Ilieva International Journal "Information Technologies and Knowledge" Vol.2 / 2008 257 USING THE BUSINESS PROCESS EXECUTION LANGUAGE FOR MANAGING SCIENTIFIC PROCESSES Anna Malinova, Snezhana Gocheva-Ilieva Abstract:

More information

An Archiving System for Managing Evolution in the Data Web

An Archiving System for Managing Evolution in the Data Web An Archiving System for Managing Evolution in the Web Marios Meimaris *, George Papastefanatos and Christos Pateritsas * Institute for the Management of Information Systems, Research Center Athena, Greece

More information

NextData System of Systems Infrastructure (ND-SoS-Ina)

NextData System of Systems Infrastructure (ND-SoS-Ina) NextData System of Systems Infrastructure (ND-SoS-Ina) DELIVERABLE D2.3 (CINECA, CNR-IIA) - Web Portal Architecture DELIVERABLE D4.1 (CINECA, CNR-IIA) - Test Infrastructure Document identifier: D2.3 D4.1

More information

Virtualization of Workflows for Data Intensive Computation

Virtualization of Workflows for Data Intensive Computation Virtualization of Workflows for Data Intensive Computation Sreekanth Pothanis (1,2), Arcot Rajasekar (3,4), Reagan Moore (3,4). 1 Center for Computation and Technology, Louisiana State University, Baton

More information

Implementing the Army Net Centric Data Strategy in a Service Oriented Environment

Implementing the Army Net Centric Data Strategy in a Service Oriented Environment Implementing the Army Net Centric Strategy in a Service Oriented Environment Michelle Dirner Army Net Centric Strategy (ANCDS) Center of Excellence (CoE) Service Team Lead RDECOM CERDEC SED in support

More information

Kepler User Manual Version October, 2010

Kepler User Manual Version October, 2010 Kepler User Manual Version 2.1.0 October, 2010 1. Introduction to Kepler...6 1.1 What is Kepler?... 6 1.1.1 Features...9 1.1.2 Architecture... 11 1.2 History of the Kepler Project...13 1.3 Kepler Code

More information

Leveraging metadata standards in ArcGIS to support Interoperability. David Danko and Aleta Vienneau

Leveraging metadata standards in ArcGIS to support Interoperability. David Danko and Aleta Vienneau Leveraging metadata standards in ArcGIS to support Interoperability David Danko and Aleta Vienneau Leveraging Metadata Standards in ArcGIS for Interoperability Why metadata and metadata standards? Overview

More information

Semantic Web Technologies

Semantic Web Technologies 1/33 Semantic Web Technologies Lecture 11: SWT for the Life Sciences 4: BioRDF and Scientifc Workflows Maria Keet email: keet -AT- inf.unibz.it home: http://www.meteck.org blog: http://keet.wordpress.com/category/computer-science/72010-semwebtech/

More information

Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language

Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language Dong Han and Kilian Stoffel Information Management Institute, University of Neuchâtel Pierre-à-Mazel 7, CH-2000 Neuchâtel,

More information

Tackling the Provenance Challenge One Layer at a Time

Tackling the Provenance Challenge One Layer at a Time CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE [Version: 2002/09/19 v2.02] Tackling the Provenance Challenge One Layer at a Time Carlos Scheidegger 1, David Koop 2, Emanuele Santos 1, Huy Vo 1, Steven

More information

Dictionary Driven Exchange Content Assembly Blueprints

Dictionary Driven Exchange Content Assembly Blueprints Dictionary Driven Exchange Content Assembly Blueprints Concepts, Procedures and Techniques (CAM Content Assembly Mechanism Specification) Author: David RR Webber Chair OASIS CAM TC January, 2010 http://www.oasis-open.org/committees/cam

More information

Bibster A Semantics-Based Bibliographic Peer-to-Peer System

Bibster A Semantics-Based Bibliographic Peer-to-Peer System Bibster A Semantics-Based Bibliographic Peer-to-Peer System Peter Haase 1, Björn Schnizler 1, Jeen Broekstra 2, Marc Ehrig 1, Frank van Harmelen 2, Maarten Menken 2, Peter Mika 2, Michal Plechawski 3,

More information

Key cyberinfrastructure elements implemented as RESTful webservices

Key cyberinfrastructure elements implemented as RESTful webservices Key cyberinfrastructure elements implemented as RESTful webservices Investigator Toolkit Web Interface Analysis, Visualization Data Management Client Libraries Java Python Command Line Member Nodes Service

More information

Software + Services for Data Storage, Management, Discovery, and Re-Use

Software + Services for Data Storage, Management, Discovery, and Re-Use Software + Services for Data Storage, Management, Discovery, and Re-Use CODATA 22 Conference Stellenbosch, South Africa 25 October 2010 Alex D. Wade Director Scholarly Communication Microsoft External

More information

Parmenides. Semi-automatic. Ontology. construction and maintenance. Ontology. Document convertor/basic processing. Linguistic. Background knowledge

Parmenides. Semi-automatic. Ontology. construction and maintenance. Ontology. Document convertor/basic processing. Linguistic. Background knowledge Discover hidden information from your texts! Information overload is a well known issue in the knowledge industry. At the same time most of this information becomes available in natural language which

More information

Principles of Dataspaces

Principles of Dataspaces Principles of Dataspaces Seminar From Databases to Dataspaces Summer Term 2007 Monika Podolecheva University of Konstanz Department of Computer and Information Science Tutor: Prof. M. Scholl, Alexander

More information

Situations and Ontologies: helping geoscientists understand and share the semantics surrounding their computational resources

Situations and Ontologies: helping geoscientists understand and share the semantics surrounding their computational resources Situations and Ontologies: helping geoscientists understand and share the semantics surrounding their computational resources Mark Gahegan GeoVISTA Center, Department of Geography, The Pennsylvania State

More information

A geoinformatics-based approach to the distribution and processing of integrated LiDAR and imagery data to enhance 3D earth systems research

A geoinformatics-based approach to the distribution and processing of integrated LiDAR and imagery data to enhance 3D earth systems research A geoinformatics-based approach to the distribution and processing of integrated LiDAR and imagery data to enhance 3D earth systems research Christopher J. Crosby, J Ramón Arrowsmith, Jeffrey Connor, Gilead

More information

A three tier architecture applied to LiDAR processing and monitoring

A three tier architecture applied to LiDAR processing and monitoring Scientific Programming 14 (2006) 185 194 185 IOS Press A three tier architecture applied to LiDAR processing and monitoring Efrat Jaeger-Frank a,, Christopher J. Crosby b, Ashraf Memon a, Viswanath Nandigam

More information

Using ESML in a Semantic Web Approach for Improved Earth Science Data Usability

Using ESML in a Semantic Web Approach for Improved Earth Science Data Usability Using in a Semantic Web Approach for Improved Earth Science Data Usability Rahul Ramachandran, Helen Conover, Sunil Movva and Sara Graves Information Technology and Systems Center University of Alabama

More information

Context-Aware Actors. Outline

Context-Aware Actors. Outline Context-Aware Actors Anne H. H. Ngu Department of Computer Science Texas State University-San Macos 02/8/2011 Ngu-TxState Outline Why Context-Aware Actor? Context-Aware Scientific Workflow System - Architecture

More information

Tackling the Provenance Challenge One Layer at a Time

Tackling the Provenance Challenge One Layer at a Time CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE [Version: 2002/09/19 v2.02] Tackling the Provenance Challenge One Layer at a Time Carlos Scheidegger 1, David Koop 2, Emanuele Santos 1, Huy Vo 1, Steven

More information

San Diego Supercomputer Center, UCSD, U.S.A. The Consortium for Conservation Medicine, Wildlife Trust, U.S.A.

San Diego Supercomputer Center, UCSD, U.S.A. The Consortium for Conservation Medicine, Wildlife Trust, U.S.A. Accelerating Parameter Sweep Workflows by Utilizing i Ad-hoc Network Computing Resources: an Ecological Example Jianwu Wang 1, Ilkay Altintas 1, Parviez R. Hosseini 2, Derik Barseghian 2, Daniel Crawl

More information

Validation and Inference of Schema-Level Workflow Data-Dependency Annotations

Validation and Inference of Schema-Level Workflow Data-Dependency Annotations Validation and Inference of Schema-Level Workflow Data-Dependency Annotations Shawn Bowers 1, Timothy McPhillips 2, Bertram Ludäscher 2 1 Dept. of Computer Science, Gonzaga University 2 School of Information

More information

Acquiring Experience with Ontology and Vocabularies

Acquiring Experience with Ontology and Vocabularies Acquiring Experience with Ontology and Vocabularies Walt Melo Risa Mayan Jean Stanford The author's affiliation with The MITRE Corporation is provided for identification purposes only, and is not intended

More information

SEXTANT 1. Purpose of the Application

SEXTANT 1. Purpose of the Application SEXTANT 1. Purpose of the Application Sextant has been used in the domains of Earth Observation and Environment by presenting its browsing and visualization capabilities using a number of link geospatial

More information

Open Research Online The Open University s repository of research publications and other research outputs

Open Research Online The Open University s repository of research publications and other research outputs Open Research Online The Open University s repository of research publications and other research outputs The Smart Book Recommender: An Ontology-Driven Application for Recommending Editorial Products

More information

NeAT Business Plan Component Data Integration and Annotation Services in Biodiversity (DIAS-B) 1. Service Description

NeAT Business Plan Component Data Integration and Annotation Services in Biodiversity (DIAS-B) 1. Service Description NeAT Business Plan Component Data Integration and Annotation Services in Biodiversity (DIAS-B) 1. Service Description 1.1. Description of a research community and the eresearch service need The Atlas of

More information

Comparing Provenance Data Models for Scientific Workflows: an Analysis of PROV-Wf and ProvOne

Comparing Provenance Data Models for Scientific Workflows: an Analysis of PROV-Wf and ProvOne Comparing Provenance Data Models for Scientific Workflows: an Analysis of PROV-Wf and ProvOne Wellington Oliveira 1, 2, Paolo Missier 3, Daniel de Oliveira 1, Vanessa Braganholo 1 1 Instituto de Computação,

More information

Automatic Generation of Workflow Provenance

Automatic Generation of Workflow Provenance Automatic Generation of Workflow Provenance Roger S. Barga 1 and Luciano A. Digiampietri 2 1 Microsoft Research, One Microsoft Way Redmond, WA 98052, USA 2 Institute of Computing, University of Campinas,

More information

Design of Distributed Data Mining Applications on the KNOWLEDGE GRID

Design of Distributed Data Mining Applications on the KNOWLEDGE GRID Design of Distributed Data Mining Applications on the KNOWLEDGE GRID Mario Cannataro ICAR-CNR cannataro@acm.org Domenico Talia DEIS University of Calabria talia@deis.unical.it Paolo Trunfio DEIS University

More information

The NCAR Community Data Portal

The NCAR Community Data Portal The NCAR Community Data Portal http://cdp.ucar.edu/ QuickTime and a TIFF (Uncompressed) decompressor are needed to see this picture. QuickTime and a TIFF (Uncompressed) decompressor are needed to see this

More information

Semantic Web Mining and its application in Human Resource Management

Semantic Web Mining and its application in Human Resource Management International Journal of Computer Science & Management Studies, Vol. 11, Issue 02, August 2011 60 Semantic Web Mining and its application in Human Resource Management Ridhika Malik 1, Kunjana Vasudev 2

More information

On Optimizing Workflows Using Query Processing Techniques

On Optimizing Workflows Using Query Processing Techniques On Optimizing Workflows Using Query Processing Techniques Georgia Kougka and Anastasios Gounaris Department of Informatics, Aristotle University of Thessaloniki, Greece {georkoug,gounaria}@csd.auth.gr

More information

Metadata Quality Assessment: A Phased Approach to Ensuring Long-term Access to Digital Resources

Metadata Quality Assessment: A Phased Approach to Ensuring Long-term Access to Digital Resources Metadata Quality Assessment: A Phased Approach to Ensuring Long-term Access to Digital Resources Authors Daniel Gelaw Alemneh University of North Texas Post Office Box 305190, Denton, Texas 76203, USA

More information

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing.

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing. About the Tutorial A data warehouse is constructed by integrating data from multiple heterogeneous sources. It supports analytical reporting, structured and/or ad hoc queries and decision making. This

More information