Experience in Using a Process Language to Define Scientific Workflow and Generate Dataset Provenance
|
|
- Ferdinand Jordan
- 5 years ago
- Views:
Transcription
1 Experience in Using a Process Language to Define Scientific Workflow and Generate Dataset Provenance Leon J. Osterweil 1, Lori A. Clarke 1, Aaron Ellison 2, Rodion Podorozhny 3, Alexander Wise 1, Emery Boose 2, and Julian Hadley 2 1 Lab. for Advanced SW Engineering Research (LASER), Dept of Computer Science Univ. of Massachusetts Amherst, MA Harvard Forest, Harvard University, Petersham, MA 3 Dept. of Computer Science, Texas State Univ., San Marcos, TX
2 Or: Why software engineers should be interested in scientific data and how it is processed
3 Or: Why software engineers should be interested in scientific data and how it is processed Scientific data management raises problems in: Configuration Management Abstraction/modularity Exception management Concurrency control Static Analysis --and more
4 Or: Why software engineers should be interested in scientific data and how it is processed Scientific data management raises problems in: Configuration Management Abstraction/modularity Exception management Concurrency control Static Analysis --and more Addressing these Problems can shed light on fundamental issues in software engineering
5 Our approach Evaluate applying the principles of o Modern programming languages o Software engineering o Other CS domains In the form of the Analytic Web concept o Two interrelated types of graphs To see how this helps address the problems of scientists o And what it tells us about the CS domains
6 Our approach Evaluate applying the principles of o Modern programming languages o Software engineering o Other CS domains In the form of the Analytic Web concept o Two interrelated types of graphs To see how this helps address the problems of scientists o And what it tells us about the CS domains
7 You re not a real scientist, you re a Computer Scientist --A Physicist/Astronomer to Lee
8 Computer Scientists do real science, but it is a science of abstract, non-tangible things. And this is a science that you need our help with Lee, back to Physicist/Astronomer
9 So, What do real scientists do (in the abstract)?: The Traditional Structure of Science--The Scientific Method Hypothesis Experimental Design o Define experimentation process Experimentation o Execute the process to produce data Evaluation o Analyze datasets, producing more datasets Reproduction by others o Reapply processes to compare results and increase confidence in the conclusions
10 Focus on Datasets Datasets o Their production o Their reproduction by others o Their use by consumers E.g., In producing subsequent datasets Dataset Provenance o Document where datasets come from o Who, What, When, Where (Typical Metadata) o How What computer programs What cleansing and interpolation algorithms Process Provenance Metadata
11 Real Datasets: Real Complications Is there really any raw data? o Mostly recorded automatically o Often preprocessed by computers Considerable cleaning of datasets o Some by humans o Some by automation Documentation is usually scant or absent Reproduction is hard or impossible o Processes are complex and hard to document Real threat to the foundations of science o Reproducibility is the bedrock of science
12 Perils in Inadequate Dataset Documentation Some datasets are wrong o Incorrectly cleaned o Inappropriate algorithms applied incorrectly o Incompletely processed Statistical data is often misunderstood/misused o Used under assumptions that are unwarranted Incorrect/incorrectly used data can be the basis for gravely important decisions o Clear cut forests o Release of unsafe pharmaceuticals How to know which datasets to trust?
13 The Internet Aggravates These Problems Scientific datasets are published o In the literature o ON THE WEB (NSF mandate) What might be published? o All the data o Analyzed datasets o With all of the fudge factors? What does get published o Highly processed datasets o With important details omitted THE DETAILS MATTER A GREAT DEAL
14 Key Problems to be Addressed Producers need automation help for o Generation o Distribution o Provenance documentation Consumers need help for o Understanding o Reproduction o Regeneration o Reuse These are primarily process issues
15 A Real Ecological Research Problem: The Water Budget Problem Measure and analyze the flow of water through a small forested watershed ds = P - ET- Q Precipitation (P) o Two gauges, P1, P2 used to minimize gaps in data. Surface Discharge (Q) o Measured at a stream gauge o Gaps caused by sensor failure, ice build-up, etc. o Fill as a function of preceding P and Q. Evapotranspiration (ET) o measured at an eddy-flux tower. Photosynthetically active radiation (PAR) o PAR can be used to estimate ET
16 An Eddy Covariance (Flux) Tower Flux source area: = f(canopy structure, wind speed, wind direction, turbulence) CO 2 photosynthesis CO 2 soil respiration CO 2 CO 2 soil respiration Eddy covariance tower
17 Key Features of the Water Budget Process Frequent measurements taken in real-time o Some wirelessly o Considerable manipulation in real-time Datasets published in real-time(?) 30-day retrospective revision of dataset o Maybe revisions at other (more immediate?) intervals Other revisions done at other times o Adjustments for sensor drift o Alternative model evaluation o Use of data from other sites (?) Different scientists want to do different things????
18 DFG of Water Budget Process
19 Some key issues to be dealt with Processes are complex o Intertwined iterations o Exceptions (not shown) interrupt nominal flow o Concurrency Create profusions of artifact instances o How to keep track of multiple instances of a single dataset type? o Need an artifact structure o And configuration management Still other issues o Mixed human and automated agents
20 DFG does not do well with some of these issues Producers generate datasets by executing paths o DFG has some spurious unexecutable paths Consumers need provenance documentation o How datasets were generated o How data items were created o But DFG nodes are types. There are no instances Both need reproducibility o Sufficient details are needed o And can be hard to produce Both need analyses o Which requires rigor and precision o As well as details
21 Our Proposed Approach: Analytic Web Represent scientific datasets as a web structure specifying: o What data were used to derive each new dataset o The process by which each dataset was derived Use two complementary graphs o Process Definition Graph (PDG) (a visual program) Defines process in sufficient detail to support Understanding, Execution, Reproducibility, Scrutiny o Data Derivation Graph (DDGs) (program traces) Documents how each dataset instance was actually derived In terms of actual process steps applied and choices made
22 The Two Graphs Process Derivation Graph (PDG) o Defines the process o Types of tasks that are applied to types of datasets Data Derivation Graph (DDG) o Which instances derived by which others o By which activities PDG execution leads to the creation of the DDG Dataset provenance metadata created from DDG nodes
23 The Process Definition Challenges How to represent processes o Accurately,completely,clearly,reproducibly Real processes are large and complex o Even easy ones How to support execution of these processes How to communicate the processes to working scientists How to validate these processes o Scientific scrutiny o Analysis to assure that composed processes are not anomalous
24 The Little-JIL Process Language Executable,visual process language emphasizing o Use of abstraction o Exception management o Concurrency control o Resource specification and late-binding Use it to define PDGs to determine the characteristics and features needed Execute the PDGs to generate DDGs o That define dataset provenance
25 Value of Hierarchy, Scoping, and Abstraction Process definition is a hierarchical decomposition Think of steps as procedure invocations o They define scopes o Copy and restore argument semantics o Creation of scopes Encourages use of abstraction o Eg. process fragment reuse Use of recursion helps define, propagate, exploit context information to elucidate iteration
26 Elaboration of on key step Augment Realtime Dataset X Get RT Readings ~Data not OK Read Precip Read Others Data not OK Clean Data X ~Dataset Review OK Select Values Clean Data Apply Models Select Best Model Fill Gaps in Data Dataset Review OK
27 Makes use of recursion Apply Models Apply A Model Apply Models Get Model Compute Model Values
28 Use of different types of agents too Apply Models Apply A Model Apply Models Agent is scientist Get Model Compute Model Values
29 The whole process
30 Elaborate this corner
31 Need for complex exception handling Read Precip = Handle No Precip Read Precip 1 Read Precip 2 X Choose Source O X Sensor Down X Read Precip 1 Read Precip 2 Get Airport Put Null
32 The whole process
33 howing process odule reuse
34 re process dule reuse
35 The DDG A DAG of instances o Dataset instances o Process step instances (and their agents) Edges indicate which datasets are o Derived from executing which step instance o Using which dataset instances as inputs
36 Different executions lead to different DDGs Read Precip = Handle No Precip Read Precip 1 Read Precip 2 X Choose Source O X Sensor Down X Read Precip 1 Read Precip 2 Get Airport Put Null
37 DDG if both sensors are read successfully Read Precip 1 Read Precip 2 P1 P2 Read Precip P1 P2 Get RT Readings
38 DDG if Read of Precip 1 Fails twice, then gets value from alternative source (airport) Read Precip 1 Get Airport Read Precip 1 Read Precip 2 P1 P2 P1 Retry Airport P1 Read Precip Handle No Precip P2 P1 Get RT Readings
39 DDG showing consideration of two alternative models P1 P2 Model 1 Model 2 Apply A Model Apply A Model P from Model1 P from Model 2 Add to Dataset P
40 Full DDG: Two Models; Both readings succeeded P1 Read Precip 1 Read Precip Read Precip 2 P2 P1 P2 Get RT Readings P1 P2 Apply Models P1 Model 1 Model 2 P2 Apply A Model Apply A Model
41 Read Precip 1 Get Airport Read Precip 1 Read Precip 2 P1 P2 P1 Retry Airport P1 Read Precip Handle No Precip P2 DDG with two failed Precip readings, then airport read P1 P1 P1 Get RT Readings Apply Models Model 1 Model 2 P2 P2 Apply A Model Apply A Model
42 Read Precip 1 Get Airport Read Precip 1 Read Precip 2 P1 P2 P1 Retry Airport P1 Read Precip Handle No Precip P1 P2 Eliding for Clarity (?) Get RT Readings: Apply Models Model 1 Model 2 Apply A Model Apply A Model
43 Read Precip 1 Read Precip 2 A Further DDG Read Precip; Get RT Readings; Apply Models P1 P2 Model 1 Model 2 Apply A Model Apply A Model P from Model1 P from Model 2 Add to Dataset P
44 sensordatum Read PModelChannel at time 1 Read PModelChannel at time 2 Read PModelChannel at time 3 IdofPModel 1 IdofPModel 2 IdofPModel 3 DDG for Selection from among three Models Auto Select Model at time 1 Auto Select Model at time 2 Auto Select Model at time 3 Cached data IdofPModel 1 After Auto Select Model IdofPModel 2 After Auto Select Model IdofPModel 3 After Auto Select Model Eval Selected Model for Model1 Eval Selected Model for Model2 Eval Selected Model for Model3 SensorDatum.original SensorDatum 1 SensorDatum 2 SensorDatum 3 PModelID SensorDatum.modeled Select Best Model
45 Read try1 time Sensor1 try1 time Read try2 time try2 time Read PModelChannel at time 1 Read PModelChannel at time 2 Read PModelChannel at time 3 IdofPModel 1 IdofPModel 2 IdofPModel 3 Get Airport Data streams Auto Select Model at time 1 Auto Select Model at time 2 Auto Select Model at time 3 Airport airport try time Bind Airport Datum As P1 Cached Data IdofPModel 1 After Auto Select Model Eval Selected Model for Model1 IdofPModel 2 After Auto Select Model Eval Selected Model for Model2 IdofPModel 3 After Auto Select Model Eval Selected Model for Model3 P1 after Get Airport SensorDatum 1 SensorDatum 2 SensorDatum 3 Filter Range Data P1, after Check Ranges Check Ranges Read try1 time P2 Read PModelChannel at time 4 Select Values of P SensorDatum.original Select Best Model PModelID SensorDatum.modeled IdofPModel 4 Auto Select Model at time 4 P set from P2 sensordatum Enhanced Cached Data SensorDatum.original The DDG further along Next PModelID Select Best Model Next SensorDatum.modeled IdofPModel 4 After Auto Select Model Eval Selected Model for Model4 SensorDatum 4
46 Process Provenance Metadata Derived from the DDG o Root of the appropriate DDG subdag o Efficiencies by storing one DDG Provenance metadata stored as pointer Tradeoffs possible o Cache some intermediate values o Rederive the rest
47 Some Related Work in Scientific Workflow The Kepler Project o Largest, best funded o Based on Ptolemy II Data Flow Graph Approach o Oodles of tools Taverna, Teuta, Chimera, JOpera o Different Data Flow Graph projects Provenance Projects o Database approaches o Process history logging approaches
48 Problems with Current Scientific Workflow Languages tend to be semantically weak o Based on box and arrow charts Principal semantic weaknesses o Lack of abstraction facilities o Weak exception management o Non-trivial iteration can be hard to understand Other problems (with some) o Lack of focus on data and data flow o Provenance viewed as a separate issue o Poorly defined semantics Restricts the rigor of analyses Kepler is better than most
49 Kepler Large toolset to support users Highly visual Box and arrow charts are defined by Directors, which provide semantics o But different levels of hierarchy may have different Directors o No standard usage of Directors E.g. for loops and exceptions o Hard to tell what a diagram is saying without looking at Its Directors Director of its parent Programming language experience can inform work in this area
50 Important Clarification DFG-based scientific workflow systems can do (most or) all of the above BUT: Process definitions tend to be ungainly, opaque The Turing Tarpit for process definition Focus here: o What language constructs should be used to make these processes clear, evolvable, analyzable, etc. o Software engineering has much to say about this
51 Observations Creating a PDG makes scientists think carefully The PDG helps scientists create flexible and scalable processes The PDG supports re-execution using different input data and/or different tools. The PDG provides a useful and efficient framework for structuring and generating the DDG. PDGs can be rigorously evaluated for logical, statistical and propagation of measurement errors.
52 Summary Defined PDGs for ecological research processes o Carbon Flux o Water Budget Key semantic features needed o Exception management o Abstraction/modularity/reuse o Agent definition Not favorable for use of DFGs as PDGs Little-JIL interpreter executed simple processes only DDGs done only manually o Automation needed o Application to other process areas and problems?
53 Future work SciWalker 2 o Environment for building and maintaining PDGs and DDGs Focus on DDGs o How to build, maintain, optimize, display them o What else are they good for? (Software) rework? Running benchmarks???? Integrate with Kepler o Use Kepler for low-level tasks o Little-JIL for high-level process definition
54 Questions and Discussion?
Experience in Using a Process Language to Define Scientific Workflow and Generate Dataset Provenance
Experience in Using a Process Language to Define Scientific Workflow and Generate Dataset Provenance Leon J. Osterweil Dept. Computer Science Univ. of Massachusetts Amherst, MA 01003 USA ljo@cs.umass.edu
More informationSupporting Undo and Redo in Scientific Data Analysis
Supporting Undo and Redo in Scientific Data Analysis Xiang Zhao, Emery R. Boose, Yuriy Brun, xiang@cs.umass.edu, boose@fas.harvard.edu, brun@cs.umass.edu Barbara Staudt Lerner, Leon J. Osterweil blerner@mtholyoke.edu,
More informationCONCEPTS & SYNTHESIS
CONCEPTS & SYNTHESIS EMPHASIZING NEW IDEAS TO STIMULATE RESEARCH IN ECOLOGY Ecology, 87(6), 2006, pp. 1345 1358 Ó 2006 by the Ecological Society of America ANALYTIC WEBS SUPPORT THE SYNTHESIS OF ECOLOGICAL
More informationThe Role of Context in Exception-Driven Rework
The Role of Context in Exception-Driven Rework Xiang Zhao University of Massachuestts Amherst Amherst, USA xiang@cs.umass.edu Barbara Staudt Lerner Mount Holyoke College South Hadley, USA blerner@mtholyoke.edu
More informationThe Role of Context in Exception-Driven Rework
Laboratory for Advanced Software Engineering Research 1 University of Massachusetts Amherst 2 Mount Holyoke College 5th International Workshop on Exception Handling (WEH.12) Zurich, Switzerland June 16,
More informationTesting! Prof. Leon Osterweil! CS 520/620! Spring 2013!
Testing Prof. Leon Osterweil CS 520/620 Spring 2013 Relations and Analysis A software product consists of A collection of (types of) artifacts Related to each other by myriad Relations The relations are
More informationSupporting Process Undo and Redo in Software Engineering Decision Making
Supporting Process Undo and Redo in Software Engineering Decision Making Xiang Zhao Yuriy Brun Leon J. Osterweil School of Computer Science University of Massachusetts Amherst, MA, USA {xiang,brun,ljo}@cs.umass.edu
More informationCoding and Unit Testing! The Coding Phase! Coding vs. Code! Coding! Overall Coding Language Trends!
Requirements Spec. Design Coding and Unit Testing Characteristics of System to be built must match required characteristics (high level) Architecture consistent views Software Engineering Computer Science
More informationApplying Real-Time Scheduling Techniques to Software Processes: A Position Paper
To Appear in Proc. of the 8th European Workshop on Software Process Technology.19-21 June 2001. Witten, Germany. Applying Real-Time Scheduling Techniques to Software Processes: A Position Paper Aaron G.
More informationLittle-JIL/Juliette: A Process Definition Language and Interpreter
Proceedings of the 22nd International Conference on Software Engineering. 4-11 June 2000. Limerick, Ireland. Pages 754 757. Little-JIL/Juliette: A Process Definition Language and Interpreter Aaron G. Cass
More informationCS 147: Computer Systems Performance Analysis
CS 147: Computer Systems Performance Analysis Test Loads CS 147: Computer Systems Performance Analysis Test Loads 1 / 33 Overview Overview Overview 2 / 33 Test Load Design Test Load Design Test Load Design
More informationSolve the Data Flow Problem
Gaining Condence in Distributed Systems Gleb Naumovich, Lori A. Clarke, and Leon J. Osterweil University of Massachusetts, Amherst Computer Science Department University of Massachusetts Amherst, Massachusetts
More informationOverview. Scientific workflows and Grids. Kepler revisited Data Grids. Taxonomy Example systems. Chimera GridDB
Grids and Workflows Overview Scientific workflows and Grids Taxonomy Example systems Kepler revisited Data Grids Chimera GridDB 2 Workflows and Grids Given a set of workflow tasks and a set of resources,
More informationIssues in the Development of Transactional Web Applications R. D. Johnson D. Reimer IBM Systems Journal Vol 43, No 2, 2004 Presenter: Barbara Ryder
Issues in the Development of Transactional Web Applications R. D. Johnson D. Reimer IBM Systems Journal Vol 43, No 2, 2004 Presenter: Barbara Ryder 3/21/05 CS674 BGR 1 Web Applications Transactional processing
More informationData Governance. Mark Plessinger / Julie Evans December /7/2017
Data Governance Mark Plessinger / Julie Evans December 2017 12/7/2017 Agenda Introductions (15) Background (30) Definitions Fundamentals Roadmap (15) Break (15) Framework (60) Foundation Disciplines Engagements
More informationStatic Analysis! Prof. Leon J. Osterweil! CS 520/620! Fall 2012! Characteristics of! System to be! built must! match required! characteristics!
Static Analysis! Prof. Leon J. Osterweil! CS 520/620! Fall 2012! Requirements Spec.! Design! Test Results must! match required behavior! Characteristics of! System to be! built must! match required! characteristics!
More informationCS-XXX: Graduate Programming Languages. Lecture 9 Simply Typed Lambda Calculus. Dan Grossman 2012
CS-XXX: Graduate Programming Languages Lecture 9 Simply Typed Lambda Calculus Dan Grossman 2012 Types Major new topic worthy of several lectures: Type systems Continue to use (CBV) Lambda Caluclus as our
More informationCS:2820 (22C:22) Object-Oriented Software Development
The University of Iowa CS:2820 (22C:22) Object-Oriented Software Development! Spring 2015 Software Complexity by Cesare Tinelli Complexity Software systems are complex artifacts Failure to master this
More informationRepetition Through Recursion
Fundamentals of Computer Science I (CS151.02 2007S) Repetition Through Recursion Summary: In many algorithms, you want to do things again and again and again. For example, you might want to do something
More informationIntegration With the Business Modeler
Decision Framework, J. Duggan Research Note 11 September 2003 Evaluating OOA&D Functionality Criteria Looking at nine criteria will help you evaluate the functionality of object-oriented analysis and design
More informationModeling Dependencies for Cascading Selective Undo
Modeling Dependencies for Cascading Selective Undo Aaron G. Cass and Chris S. T. Fernandes Union College, Schenectady, NY 12308, USA, {cassa fernandc}@union.edu Abstract. Linear and selective undo mechanisms
More informationTest and Evaluation of Autonomous Systems in a Model Based Engineering Context
Test and Evaluation of Autonomous Systems in a Model Based Engineering Context Raytheon Michael Nolan USAF AFRL Aaron Fifarek Jonathan Hoffman 3 March 2016 Copyright 2016. Unpublished Work. Raytheon Company.
More informationActor-Oriented Design: Concurrent Models as Programs
Actor-Oriented Design: Concurrent Models as Programs Edward A. Lee Professor, UC Berkeley Director, Center for Hybrid and Embedded Software Systems (CHESS) Parc Forum Palo Alto, CA May 13, 2004 Abstract
More informationAccelerating the Scientific Exploration Process with Kepler Scientific Workflow System
Accelerating the Scientific Exploration Process with Kepler Scientific Workflow System Jianwu Wang, Ilkay Altintas Scientific Workflow Automation Technologies Lab SDSC, UCSD project.org UCGrid Summit,
More informationModeling Dependencies for Cascading Selective Undo
Modeling Dependencies for Cascading Selective Undo Aaron G. Cass and Chris S. T. Fernandes Union College, Schenectady, NY 12308, USA, {cassa fernandc}@union.edu Abstract. Linear and selective undo mechanisms
More informationManaging the Evolution of Dataflows with VisTrails
Managing the Evolution of Dataflows with VisTrails Juliana Freire http://www.cs.utah.edu/~juliana University of Utah Joint work with: Steven P. Callahan, Emanuele Santos, Carlos E. Scheidegger, Claudio
More informationProject Name. The Eclipse Integrated Computational Environment. Jay Jay Billings, ORNL Parent Project. None selected yet.
Project Name The Eclipse Integrated Computational Environment Jay Jay Billings, ORNL 20140219 Parent Project None selected yet. Background The science and engineering community relies heavily on modeling
More informationIntroduction to Data Science
UNIT I INTRODUCTION TO DATA SCIENCE Syllabus Introduction of Data Science Basic Data Analytics using R R Graphical User Interfaces Data Import and Export Attribute and Data Types Descriptive Statistics
More informationFORMALIZED SOFTWARE DEVELOPMENT IN AN INDUSTRIAL ENVIRONMENT
FORMALIZED SOFTWARE DEVELOPMENT IN AN INDUSTRIAL ENVIRONMENT Otthein Herzog IBM Germany, Dept. 3100 P.O.Box 80 0880 D-7000 STUTTGART, F. R. G. ABSTRACT tn the IBM Boeblingen Laboratory some software was
More informationAADL Graphical Editor Design
AADL Graphical Editor Design Peter Feiler Software Engineering Institute phf@sei.cmu.edu Introduction An AADL specification is a set of component type and implementation declarations. They are organized
More informationAbsolute C++ Walter Savitch
Absolute C++ sixth edition Walter Savitch Global edition This page intentionally left blank Absolute C++, Global Edition Cover Title Page Copyright Page Preface Acknowledgments Brief Contents Contents
More informationDatabase Systems: Design, Implementation, and Management Tenth Edition. Chapter 1 Database Systems
Database Systems: Design, Implementation, and Management Tenth Edition Chapter 1 Database Systems Objectives In this chapter, you will learn: The difference between data and information What a database
More informationIntroduction to Data Management for Ocean Science Research
Introduction to Data Management for Ocean Science Research Cyndy Chandler Biological and Chemical Oceanography Data Management Office 12 November 2009 Ocean Acidification Short Course Woods Hole, MA USA
More informationAutomatic Fault Tree Derivation from Little-JIL Process Definitions
Automatic Fault Tree Derivation from Little-JIL Process Definitions Bin Chen, George S. Avrunin, Lori A. Clarke, and Leon J. Osterweil Department of Computer Science, University of Massachusetts, Amherst,
More informationDeveloping a Research Data Policy
Developing a Research Data Policy Core Elements of the Content of a Research Data Management Policy This document may be useful for defining research data, explaining what RDM is, illustrating workflows,
More informationMassive Data Algorithmics
In the name of Allah Massive Data Algorithmics An Introduction Overview MADALGO SCALGO Basic Concepts The TerraFlow Project STREAM The TerraStream Project TPIE MADALGO- Introduction Center for MAssive
More informationOpen Data Standards for Administrative Data Processing
University of Pennsylvania ScholarlyCommons 2018 ADRF Network Research Conference Presentations ADRF Network Research Conference Presentations 11-2018 Open Data Standards for Administrative Data Processing
More information7 th International Digital Curation Conference December 2011
Golden Trail 1 Golden-Trail: Retrieving the Data History that Matters from a Comprehensive Provenance Repository Practice Paper Paolo Missier, Newcastle University, UK Bertram Ludäscher, Saumen Dey, Michael
More informationMinsoo Ryu. College of Information and Communications Hanyang University.
Software Reuse and Component-Based Software Engineering Minsoo Ryu College of Information and Communications Hanyang University msryu@hanyang.ac.kr Software Reuse Contents Components CBSE (Component-Based
More informationManaging Exploratory Workflows
Managing Exploratory Workflows Juliana Freire Claudio T. Silva http://www.sci.utah.edu/~vgc/vistrails/ University of Utah Joint work with: Erik Andersen, Steven P. Callahan, David Koop, Emanuele Santos,
More informationOutline. Proof Carrying Code. Hardware Trojan Threat. Why Compromise HDL Code?
Outline Proof Carrying Code Mohammad Tehranipoor ECE6095: Hardware Security & Trust University of Connecticut ECE Department Hardware IP Verification Hardware PCC Background Software PCC Description Software
More informationBuilding a Runnable Program and Code Improvement. Dario Marasco, Greg Klepic, Tess DiStefano
Building a Runnable Program and Code Improvement Dario Marasco, Greg Klepic, Tess DiStefano Building a Runnable Program Review Front end code Source code analysis Syntax tree Back end code Target code
More informationReasoning About Programs Panagiotis Manolios
Reasoning About Programs Panagiotis Manolios Northeastern University February 26, 2017 Version: 100 Copyright c 2017 by Panagiotis Manolios All rights reserved. We hereby grant permission for this publication
More information18-642: Software Architecture & High Level Design
18-642: Software Architecture & High Level Design 9/27/2018 All the really important mistakes are made the first day. Eberhardt Rechtin, System Architecting 1 YOU ARE HERE Product Requirements SPECIFY
More informationNAPA COUNTY RESPONSE TO THE GRAND JURY FINAL REPORT NAPA COUNTY S WEBSITE NEEDS IMPROVEMENT MAY 24, 2016
NAPA COUNTY RESPONSE TO THE GRAND JURY FINAL REPORT NAPA COUNTY S WEBSITE NEEDS IMPROVEMENT MAY 24, 2016 The Grand Jury requested responses from the Board of Supervisors, County Executive Officer, Library
More informationData Flow Diagram" Answering many of these questions can be facilitated by allowing hierarchical decomposition"
Computer Science 520/620 Spring 2011 Prof. L. Osterweil" Software Models and Representations" Part 2" Consistency is a principal concern" Are the diagrams consistent with each other?" Top view consistent
More informationFinding Firmware Defects Class T-18 Sean M. Beatty
Sean Beatty Sean Beatty is a Principal with High Impact Services in Indianapolis. He holds a BSEE from the University of Wisconsin - Milwaukee. Sean has worked in the embedded systems field since 1986,
More informationCalibrating HART Transmitters. HCF_LIT-054, Revision 1.1
Calibrating HART Transmitters HCF_LIT-054, Revision 1.1 Release Date: November 19, 2008 Date of Publication: November 19, 2008 Document Distribution / Maintenance Control / Document Approval To obtain
More informationComputer Science 520/620 Spring 2013 Prof. L. Osterweil" Use Cases" Software Models and Representations" Part 4" More, and Multiple Models"
Computer Science 520/620 Spring 2013 Prof. L. Osterweil Software Models and Representations Part 4 More, and Multiple Models Use Cases Specify actors and how they interact with various component parts
More informationComputer Science 520/620 Spring 2013 Prof. L. Osterweil" Software Models and Representations" Part 4" More, and Multiple Models" Use Cases"
Computer Science 520/620 Spring 2013 Prof. L. Osterweil Software Models and Representations Part 4 More, and Multiple Models Use Cases Specify actors and how they interact with various component parts
More informationCS 241 Honors Memory
CS 241 Honors Memory Ben Kurtovic Atul Sandur Bhuvan Venkatesh Brian Zhou Kevin Hong University of Illinois Urbana Champaign February 20, 2018 CS 241 Course Staff (UIUC) Memory February 20, 2018 1 / 35
More information1. true / false By a compiler we mean a program that translates to code that will run natively on some machine.
1. true / false By a compiler we mean a program that translates to code that will run natively on some machine. 2. true / false ML can be compiled. 3. true / false FORTRAN can reasonably be considered
More informationVocabulary-Driven Enterprise Architecture Development Guidelines for DoDAF AV-2: Design and Development of the Integrated Dictionary
Vocabulary-Driven Enterprise Architecture Development Guidelines for DoDAF AV-2: Design and Development of the Integrated Dictionary December 17, 2009 Version History Version Publication Date Author Description
More informationHistory of object-oriented approaches
Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr http://www.yildiz.edu.tr/~naydin Object-Oriented Oriented Systems Analysis and Design with the UML Objectives: Understand the basic characteristics of object-oriented
More informationCS 360 Programming Languages Interpreters
CS 360 Programming Languages Interpreters Implementing PLs Most of the course is learning fundamental concepts for using and understanding PLs. Syntax vs. semantics vs. idioms. Powerful constructs like
More informationSystem Development Life Cycle Methods/Approaches/Models
Week 11 System Development Life Cycle Methods/Approaches/Models Approaches to System Development System Development Life Cycle Methods/Approaches/Models Waterfall Model Prototype Model Spiral Model Extreme
More informationLambda Architecture for Batch and Stream Processing. October 2018
Lambda Architecture for Batch and Stream Processing October 2018 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document is provided for informational purposes only.
More informationIntermediate Representation (IR)
Intermediate Representation (IR) Components and Design Goals for an IR IR encodes all knowledge the compiler has derived about source program. Simple compiler structure source code More typical compiler
More informationProgramming Languages
Programming Languages As difficult to discuss rationally as religion or politics. Prone to extreme statements devoid of data. Examples: "It is practically impossible to teach good programming to students
More informationRoy Fielding s PHD Dissertation. Chapter s 5 & 6 (REST)
Roy Fielding s PHD Dissertation Chapter s 5 & 6 (REST) Architectural Styles and the Design of Networkbased Software Architectures Roy Fielding University of California - Irvine 2000 Chapter 5 Representational
More informationFormal Verification! Prof. Leon Osterweil! Computer Science 520/620! Spring 2012!
Formal Verification Prof. Leon Osterweil Computer Science 520/620 Spring 2012 Relations and Analysis A software product consists of A collection of (types of) artifacts Related to each other by myriad
More informationWorking Together In Communities of Practice with Metadata Standards
International Open Government Data Conference Working Together In Communities of Practice with Metadata Standards Sandra Cannon, Ph.D., Assistant Director, Division of Research and Statistics, Board of
More informationDEPARTMENT OF COMPUTER SCIENCE
Department of Computer Science 1 DEPARTMENT OF COMPUTER SCIENCE Office in Computer Science Building, Room 279 (970) 491-5792 cs.colostate.edu (http://www.cs.colostate.edu) Professor L. Darrell Whitley,
More informationViolations of the contract are exceptions, and are usually handled by special language constructs. Design by contract
Specification and validation [L&G Ch. 9] Design patterns are a useful way to describe program structure. They provide a guide as to how a program fits together. Another dimension is the responsibilities
More informationLab 3: Digitizing in ArcGIS Pro
Lab 3: Digitizing in ArcGIS Pro What You ll Learn: In this Lab you ll be introduced to basic digitizing techniques using ArcGIS Pro. You should read Chapter 4 in the GIS Fundamentals textbook before starting
More informationBig Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara
Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case
More informationD2K Support for Standard Content Repositories: Design Notes. Request For Comments
D2K Support for Standard Content Repositories: Design Notes Request For Comments Abstract This note discusses software developments to integrate D2K [1] with Tupelo [6] and other similar repositories.
More informationUsing Metadata Queries To Build Row-Level Audit Reports in SAS Visual Analytics
SAS6660-2016 Using Metadata Queries To Build Row-Level Audit Reports in SAS Visual Analytics ABSTRACT Brandon Kirk and Jason Shoffner, SAS Institute Inc., Cary, NC Sensitive data requires elevated security
More informationComputer Science 520/620 Spring 2014 Prof. L. Osterweil" Use Cases" Software Models and Representations" Part 4" More, and Multiple Models"
Computer Science 520/620 Spring 2014 Prof. L. Osterweil Software Models and Representations Part 4 More, and Multiple Models Use Cases Specify actors and how they interact with various component parts
More informationSection 1: True / False (2 points each, 30 pts total)
Section 1: True / False (2 points each, 30 pts total) Circle the word TRUE or the word FALSE. If neither is circled, both are circled, or it impossible to tell which is circled, your answer will be considered
More informationPANDA A System for Provenance and Data. Example: Sales Prediction Workflow. Example: Sales Prediction Workflow. Backward Tracing. Item Sales.
PANDA A System for Provenance and Data Example: Prediction Workflow Union Predict Agg -1 s 2 Example: Prediction Workflow Backward Tracing Union Predict Agg -1 s Name Amelie Jacques Isabelle Name Address
More informationContinued Investigation of Small-Scale Air-Sea Coupled Dynamics Using CBLAST Data
Continued Investigation of Small-Scale Air-Sea Coupled Dynamics Using CBLAST Data Dick K.P. Yue Center for Ocean Engineering Department of Mechanical Engineering Massachusetts Institute of Technology Cambridge,
More informationProcess and data flow modeling
Process and data flow modeling Vince Molnár Informatikai Rendszertervezés BMEVIMIAC01 Budapest University of Technology and Economics Fault Tolerant Systems Research Group Budapest University of Technology
More informationSystem R and the Relational Model
IC-65 Advances in Database Management Systems Roadmap System R and the Relational Model Intro Codd s paper System R - design Anastasia Ailamaki www.cs.cmu.edu/~natassa 2 The Roots The Roots Codd (CACM
More informationchallenges in domain-specific modeling raphaël mannadiar august 27, 2009
challenges in domain-specific modeling raphaël mannadiar august 27, 2009 raphaël mannadiar challenges in domain-specific modeling 1/59 outline 1 introduction 2 approaches 3 debugging and simulation 4 differencing
More informationInformation System Support for Analytical Methods
Information System Support for Analytical Methods Martin Doerr Center for Cultural Informatics, Institute of Computer Science Foundation for Research and Technology - Hellas Heraklion, October 18, 2017
More informationAlan J. Perlis - Epigrams on Programming
Programming Languages (CS302 2007S) Alan J. Perlis - Epigrams on Programming Comments on: Perlis, Alan J. (1982). Epigrams on Programming. ACM SIGPLAN Notices 17(9), September 1982, pp. 7-13. 1. One man
More informationObjective and Subjective Specifications
2017-7-10 0 Objective and Subjective Specifications Eric C.R. Hehner Department of Computer Science, University of Toronto hehner@cs.utoronto.ca Abstract: We examine specifications for dependence on the
More informationScripted Components: Problem. Scripted Components. Problems with Components. Single-Language Assumption. Dr. James A. Bednar
Scripted Components: Problem Scripted Components Dr. James A. Bednar jbednar@inf.ed.ac.uk http://homepages.inf.ed.ac.uk/jbednar (Cf. Reuse-Oriented Development; Sommerville 2004 Chapter 4, 18) A longstanding
More informationScripted Components Dr. James A. Bednar
Scripted Components Dr. James A. Bednar jbednar@inf.ed.ac.uk http://homepages.inf.ed.ac.uk/jbednar SAPM Spring 2012: Scripted Components 1 Scripted Components: Problem (Cf. Reuse-Oriented Development;
More informationUSING THE BUSINESS PROCESS EXECUTION LANGUAGE FOR MANAGING SCIENTIFIC PROCESSES. Anna Malinova, Snezhana Gocheva-Ilieva
International Journal "Information Technologies and Knowledge" Vol.2 / 2008 257 USING THE BUSINESS PROCESS EXECUTION LANGUAGE FOR MANAGING SCIENTIFIC PROCESSES Anna Malinova, Snezhana Gocheva-Ilieva Abstract:
More informationParallel Programming Patterns Overview CS 472 Concurrent & Parallel Programming University of Evansville
Parallel Programming Patterns Overview CS 472 Concurrent & Parallel Programming of Evansville Selection of slides from CIS 410/510 Introduction to Parallel Computing Department of Computer and Information
More informationInitiate a PSI-BLAST search simply by choosing the option on the BLAST input form.
1 2 Initiate a PSI-BLAST search simply by choosing the option on the BLAST input form. But note: invoking the algorithm is trivial. Using it correctly and interpreting the results, perhaps not so much.
More informationQuality Assured (QA) data
Quality Assured (QA) data Towards DOI quality of data generated at the UFZ Mark Frenzel (Ecologist) & Thomas Schnicke (IT) DataCite / Helmholtz Open Science Workshop Leipzig, 12.01.2016 QA + DOI: Best
More informationData Curation Profile Human Genomics
Data Curation Profile Human Genomics Profile Author Profile Author Institution Name Contact J. Carlson N. Brown Purdue University J. Carlson, jrcarlso@purdue.edu Date of Creation October 27, 2009 Date
More informationSoftware Testing and Maintenance 1. Introduction Product & Version Space Interplay of Product and Version Space Intensional Versioning Conclusion
Today s Agenda Quiz 3 HW 4 Posted Version Control Software Testing and Maintenance 1 Outline Introduction Product & Version Space Interplay of Product and Version Space Intensional Versioning Conclusion
More informationDesign Rationale Modeling Representations
Agenda Design Rationale Modeling Representations EE382V Software Architecture and Design Intent Pankaj Adhikari, Bill Reinhart 7 MAR 2006 Why Rationale? Design Rationale s Roots Modeling Rationale IBIS
More informationData management is fun. Casey Dunn Assistant Professor Ecology and Evolutionary Biology
Data management is fun Casey Dunn Assistant Professor Ecology and Evolutionary Biology What is science? The study of the natural world through observation and experiment. Reproducible study. Prove it isn
More informationSardar Vallabhbhai Patel Institute of Technology (SVIT), Vasad M.C.A. Department COSMOS LECTURE SERIES ( ) (ODD) Code Optimization
Sardar Vallabhbhai Patel Institute of Technology (SVIT), Vasad M.C.A. Department COSMOS LECTURE SERIES (2018-19) (ODD) Code Optimization Prof. Jonita Roman Date: 30/06/2018 Time: 9:45 to 10:45 Venue: MCA
More informationProblem Solving with C++
GLOBAL EDITION Problem Solving with C++ NINTH EDITION Walter Savitch Kendrick Mock Ninth Edition PROBLEM SOLVING with C++ Problem Solving with C++, Global Edition Cover Title Copyright Contents Chapter
More informationWade Sheldon. Georgia Coastal Ecosystems LTER University of Georgia
Wade Sheldon Georgia Coastal Ecosystems LTER University of Georgia email: sheldon@uga.edu Regardless of Q/A procedures, data quality issues guaranteed with environmental sensor data Without good Q/C data
More informationA PRIMITIVE EXECUTION MODEL FOR HETEROGENEOUS MODELING
A PRIMITIVE EXECUTION MODEL FOR HETEROGENEOUS MODELING Frédéric Boulanger Supélec Département Informatique, 3 rue Joliot-Curie, 91192 Gif-sur-Yvette cedex, France Email: Frederic.Boulanger@supelec.fr Guy
More informationRAISE in Perspective
RAISE in Perspective Klaus Havelund NASA s Jet Propulsion Laboratory, Pasadena, USA Klaus.Havelund@jpl.nasa.gov 1 The Contribution of RAISE The RAISE [6] Specification Language, RSL, originated as a development
More informationThe SPL Programming Language Reference Manual
The SPL Programming Language Reference Manual Leonidas Fegaras University of Texas at Arlington Arlington, TX 76019 fegaras@cse.uta.edu February 27, 2018 1 Introduction The SPL language is a Small Programming
More informationEXPERT SERVICES FOR IoT CYBERSECURITY AND RISK MANAGEMENT. An Insight Cyber White Paper. Copyright Insight Cyber All rights reserved.
EXPERT SERVICES FOR IoT CYBERSECURITY AND RISK MANAGEMENT An Insight Cyber White Paper Copyright Insight Cyber 2018. All rights reserved. The Need for Expert Monitoring Digitization and external connectivity
More informationConcept as a Generalization of Class and Principles of the Concept-Oriented Programming
Computer Science Journal of Moldova, vol.13, no.3(39), 2005 Concept as a Generalization of Class and Principles of the Concept-Oriented Programming Alexandr Savinov Abstract In the paper we describe a
More informationBuild and Provision: Two Sides of the Coin We Love to Hate
Build and Provision: Two Sides of the Coin We Love to Hate Ed Merks Eclipse Modeling Project Lead 1 The Software Pipeline Software artifacts flow from developer to developer and ultimately to the clients
More informationHDF5 Single Writer/Multiple Reader Feature Design and Semantics
HDF5 Single Writer/Multiple Reader Feature Design and Semantics The HDF Group Document Version 5.2 This document describes the design and semantics of HDF5 s single writer/multiplereader (SWMR) feature.
More informationScientific Workflow Tools. Daniel Crawl and Ilkay Altintas San Diego Supercomputer Center UC San Diego
Scientific Workflow Tools Daniel Crawl and Ilkay Altintas San Diego Supercomputer Center UC San Diego 1 escience Today Increasing number of Cyberinfrastructure (CI) technologies Data Repositories: Network
More informationCategorizing Migrations
What to Migrate? Categorizing Migrations A version control repository contains two distinct types of data. The first type of data is the actual content of the directories and files themselves which are
More information