Integrating large, fast-moving, and heterogeneous data sets in biology.
1 Integrating large, fast-moving, and heterogeneous data sets in biology. C. Titus Brown Asst Prof, CSE and Microbiology; BEACON NSF STC Michigan State University
2 Introduction Background: Modeling & data analysis undergrad => Open source software development + software engineering + developmental biology + genomics PhD => Bio + computer science faculty => Data-driven biology. Currently working with next-gen sequencing data (mRNAseq, metagenomics, difficult genomes). Thinking hard about how to do data-driven modeling & model-driven data analysis.
3 Goal & outline Address challenges and opportunities of heterogeneous data integration: 1000 ft view. Outline: What types of analysis and discovery do we want to enable? What are the technical challenges, common solutions, and common failure points? Where might we look for success stories, and what lessons can we port to biology? My conclusions.
4 Specific types of questions I have a known chemical/gene interaction; do I see it in this other data set? I have a known chemical/gene interaction; what other gene expression is affected? What does chemical X do to overall phenotype, effect on gene expression, altered protein localization, and patterns of histone modification? More complex/combinatorial interactions: What does this chemical do in this genetic background? What kind of additional gene expression changes are generated by the combination of these two chemicals? What are common effects of this class of chemicals?
5 What general behavior do we want to enable? Reuse of data by groups that did not/could not produce it. Publication of reusable/forkable data analysis pipelines and models. Integration of data and models. Serendipitous uses and cross-referencing of data sets ("mashups"). Rapid scientific exploration and hypothesis generation in data space.
6 (Executable papers & data reuse) ENCODE: All data is available; all processing scripts for papers are available on a virtual machine. QIIME (microbial ecology): Amazon virtual machine containing software and data for: Collaborative cloud-enabled tools allow rapid, reproducible biological insights. (pmid ) Digital normalization paper: Amazon virtual machine, again:
7 Executable papers can support easy replication & reuse of code, data. (IPython Notebook; also see RStudio)
8 What general behavior do we want to enable? Reuse of data by groups that did not/could not produce it. Publication of reusable/forkable data analysis pipelines and models. Integration of data and models. Serendipitous uses and cross-referencing of data sets ("mashups"). Rapid scientific exploration and hypothesis generation in data space.
9 An entertaining digression -- A mashup of Facebook top 10 books by college and per-college SAT rankings
10 Technical obstacles Syntactic incompatibility: the first 90% of bioinformatics: your IDs are different from my IDs. Semantic incompatibility: the second 90% of bioinformatics: what does "gene" mean in your database? Impedance mismatch: SQL is notoriously bad at representing intervals and hierarchies. Genomes consist of intervals; ontologies consist of hierarchies! SQL databases dominate (vs graph or object DBs). Data volume & velocity: large & expanding data sets just make everything harder. Unstructured data, aka publications: most scientific knowledge is locked up
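To make the interval point concrete, here is a minimal Python sketch of the overlap test that genome queries keep needing. The `Interval` type, the coordinate convention (0-based, half-open), and the gene names are all assumptions made up for illustration; they do not come from the talk.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (an assumed convention)
    end: int    # exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """Half-open intervals overlap when each one starts before the other ends."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def find_overlapping(query: Interval, features: List[Interval]) -> List[str]:
    """Linear scan: return names of all features that overlap the query."""
    return [f.name for f in features if overlaps(query, f)]


# Hypothetical annotation: two overlapping genes on chr1, one on chr2.
genes = [
    Interval("chr1", 100, 500, "geneA"),
    Interval("chr1", 450, 900, "geneB"),
    Interval("chr2", 100, 500, "geneC"),
]

hits = find_overlapping(Interval("chr1", 480, 490, "query"), genes)
```

The predicate is two inequalities on two different columns, which is the shape of query that a single ordinary index on `start` or `end` tends to serve poorly; dedicated interval structures exist for exactly this reason.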
11 Typical solutions Entity resolution: accession numbers or other common identifiers; requires global naming system OR translators. Top-down imposition of structure: centralized DB; "Here is the schema you will all use"; limits flexibility, prevents use of unstructured data, heavyweight. Ontologies to enable correct communication: centrally coordinated vocabulary; slow, hard to get right, doesn't solve unstructured data problem. Balancing theoretical rigor with practical applicability is particularly hard. Ad hoc entity resolution ("winging it"): common solution; doesn't work that well.
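As a rough illustration of the "translators" option for entity resolution, the sketch below maps each database's local IDs onto a shared accession and compares records through it. The source names, IDs, and accessions are hypothetical placeholders, not real identifiers.

```python
from typing import Optional, Tuple

# Hypothetical translation table: (source database, local ID) -> shared accession.
ID_MAP = {
    ("lab_a", "g001"): "ACC0001",
    ("lab_b", "XYZ-17"): "ACC0001",
    ("lab_b", "XYZ-18"): "ACC0002",
}


def to_accession(source: str, local_id: str) -> Optional[str]:
    """Translate a local ID to the shared accession, or None if unmapped."""
    return ID_MAP.get((source, local_id))


def same_entity(a: Tuple[str, str], b: Tuple[str, str]) -> bool:
    """Two records match only if both map to the same known accession."""
    acc_a = to_accession(*a)
    acc_b = to_accession(*b)
    return acc_a is not None and acc_a == acc_b
```

Unmapped IDs are deliberately treated as "not the same" rather than guessed at, which is the difference between a maintained translator and winging it.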
12 Are better standards the solution?
13 Rephrasing technical goals How can we best provide a platform or platforms to support flexible data integration and data investigation across a wide range of data sets and data types in biology? My interests: Avoid master data manager and centralization. Support federated roll-out of new data and functionality. Provide flexible extensibility of ontologies and hierarchies. Support diverse ecology of databases, interfaces, and analysis software.
14 Success stories outside of biology? Look for domains: with really large amounts of heterogeneous data, that are continually increasing in size, are being effectively mined on an ongoing basis, have widely used programmatic interfaces that support mashups and other cross-database stuff, and are intentional, with principles that we can steal or adapt.
15 Success stories outside of biology? Look for domains: with really large amounts of heterogeneous data, that are continually increasing in size, are being effectively mined on an ongoing basis, have widely used programmatic interfaces that support mashups and other cross-database stuff, and are intentional, with principles that we can steal or adapt. Amazon.
16 Amazon: > 50 million users, > 1 million product partners, billions of reviews, dozens of compute services. Continually changing/updating data sets. Explicitly adopted a service-oriented architecture that enables both internal and external use of this data. For example, the amazon.com Web site is itself built from over 150 independent services. Amazon routinely deploys new services and functionality.
17 Sources: The Platform Rant (Steve Yegge) -- in which he compares the Google and Amazon approaches. A summary at HighScalability.com. (They are both long and tech-y, note, but the first is especially entertaining.)
18 A brief summary of core principles Mandates from the CEO: 1. All teams must expose data and functionality solely through a service interface. 2. All communication between teams happens through that service interface. 3. All service interfaces must be designed so that they can be exposed to the outside world.
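One way to picture these mandates in code is a small Python sketch in which the data store is private and every consumer, internal or external, goes through the same published interface. The class and method names here are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Protocol


class ExpressionService(Protocol):
    """The only published way to reach expression data (hypothetical interface)."""

    def expression(self, gene_id: str) -> Dict[str, float]: ...


class InMemoryExpressionService:
    """Toy implementation; the underlying table is never handed out directly."""

    def __init__(self, table: Dict[str, Dict[str, float]]) -> None:
        self._table = {gene: dict(values) for gene, values in table.items()}

    def expression(self, gene_id: str) -> Dict[str, float]:
        # Return a copy so callers cannot mutate the service's internal state.
        return dict(self._table.get(gene_id, {}))


def mean_expression(service: ExpressionService, gene_id: str) -> Optional[float]:
    """An 'internal' analysis that uses only the public interface."""
    values = service.expression(gene_id)
    if not values:
        return None
    return sum(values.values()) / len(values)


service = InMemoryExpressionService({"g1": {"control": 2.0, "treated": 4.0}})
```

Because `mean_expression` depends only on the interface, the in-memory table could be swapped for a remote service without touching the analysis code.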
19 More colloquially: You should eat your own dogfood. Design and implement the database and database functionality to meet your own needs; and only use the functionality you've explicitly made available to everyone. To adapt to research: database functionality should be designed in tight integration with researchers who are using it, both at a user interface level and programmatically. (Genome databases have done a really good job of this, albeit generally in a centralized model.)
20 If the customers aren't integrated into the development loop:
21 A platform view? [Diagram of services:] Metabolic model; Diff'n gene expression query; Data exploration; WWW; Gene ID translator; Chemical relationships; Expression normalization; Isoform resolution/comparison; Expression data (tiling); Expression data (microarray); Expression data (mRNAseq); Expression data II (mRNAseq)
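The platform idea is that higher-level queries are composed from smaller services. The Python sketch below, with entirely made-up IDs and values, shows a differential expression query built on top of a gene ID translator and two expression data sources; each function stands in for a separate service.

```python
from typing import Callable, Dict, List


def translate_gene_id(local_id: str) -> str:
    """Stand-in for a 'Gene ID translator' service (hypothetical mapping)."""
    table = {"MA_17": "geneA", "RS_0042": "geneA", "MA_18": "geneB"}
    return table[local_id]


def microarray_expression() -> Dict[str, float]:
    """Stand-in for a microarray data service; values are invented log2 fold changes."""
    return {"MA_17": 1.5, "MA_18": 0.2}


def mrnaseq_expression() -> Dict[str, float]:
    """Stand-in for an mRNAseq data service; values are invented log2 fold changes."""
    return {"RS_0042": 1.8}


def differential_expression_query(threshold: float) -> Dict[str, List[float]]:
    """Combine both sources under shared IDs; keep genes that change in every source."""
    sources: List[Callable[[], Dict[str, float]]] = [
        microarray_expression,
        mrnaseq_expression,
    ]
    combined: Dict[str, List[float]] = {}
    for source in sources:
        for local_id, log2fc in source().items():
            combined.setdefault(translate_gene_id(local_id), []).append(log2fc)
    return {
        gene: values
        for gene, values in combined.items()
        if all(abs(v) >= threshold for v in values)
    }


result = differential_expression_query(1.0)
```

Adding a third expression source means appending one function to `sources`; the query itself does not change, which is the kind of federated roll-out the platform view is after.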
22 A few points Open source and agile software development approaches can be surprisingly effective and inexpensive. Developing services in small groups that include customer-facing developers helps ensure utility. Implementing services in the cloud (e.g. virtual machines, or on top of "infrastructure as a service" services) gives developers flexibility in tools, approaches, implementation; also enables scaling and reusability.
23 Combining modelling with data Data-driven modeling: connections and parameters can be, to some extent, determined from data. Model-driven data investigation: data that doesn't fit the known model is particularly interesting. The second approach is essentially how particle physicists work with accelerator data: build a model & then interpret the data using the model. (In biology, models are less constraining, though; more unknowns.)
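A minimal sketch of model-driven data investigation, assuming a toy linear dose-response model and invented observations: compare each data point with the model's prediction and surface the ones that do not fit.

```python
from typing import List, Tuple


def model(dose: float) -> float:
    """Hypothetical 'known model': response is linear in dose."""
    return 2.0 * dose + 1.0


def surprising(
    observations: List[Tuple[float, float]], tolerance: float
) -> List[Tuple[float, float]]:
    """Return (dose, response) pairs whose residual exceeds the tolerance."""
    return [
        (dose, response)
        for dose, response in observations
        if abs(response - model(dose)) > tolerance
    ]


# Invented measurements; the third one disagrees with the model.
observations = [(0.0, 1.1), (1.0, 2.9), (2.0, 9.0), (3.0, 7.2)]
flagged = surprising(observations, tolerance=0.5)
```

The flagged points are the interesting ones: either the measurement is wrong or the model is missing something. In biology the model is usually far less constraining than this toy, so the tolerance and the model form would both need real justification.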
24
25 Using developmental models Davidson et al.,
26 Using developmental models Models can contain useful abstractions of specific processes; here, the direct effects of blocking nuclearization of β-catenin can be predicted by following the connections. Models provide a common language for (dis)agreement in a community.
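"Following the connections" can be sketched as graph traversal. In the Python sketch below the network is a made-up toy, not the network from Davidson et al., and every edge is treated as a required activation, which is a strong simplification of a real regulatory model.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical regulator -> targets edges, for illustration only.
NETWORK: Dict[str, List[str]] = {
    "b-catenin": ["geneA", "geneB"],
    "geneA": ["geneC"],
    "geneB": [],
    "geneC": [],
    "geneD": ["geneC"],
}


def downstream(network: Dict[str, List[str]], blocked: str) -> Set[str]:
    """Breadth-first search for every node reachable from the blocked node."""
    seen: Set[str] = set()
    queue = deque(network.get(blocked, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(network.get(node, []))
    return seen


direct_effects = set(NETWORK["b-catenin"])      # first hop only
all_effects = downstream(NETWORK, "b-catenin")  # first hop plus indirect targets
```

The first hop gives the direct effects the slide refers to; the full traversal adds indirect ones. A prediction like this is also something a community can disagree with precisely, edge by edge.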
27 Using developmental models Davidson et al.,
28 Social obstacles Training of biologically aware software developers is lacking. Molecular biologists are still very much of a computationally naïve mindset: "give me the answer so I can do the real work." Incentives for data sharing, much less useful data sharing, are not yet very strong. Pubs, grants, respect... Patterns for useful data sharing are still not well understood, in general.
29 Other places to look NEON and other NSF centers (e.g. NCEAS) are collecting vast heterogeneous data sets, and are explicitly tackling the data management/use/integration/reuse problem. SBML ("Systems Biology Markup Language") is a descriptive modeling language that enables interoperability of modeling software. Software Carpentry runs free workshops on effective use of computation for science.
30 My conclusions We need a platform mentality to make the most use of our data, even if we don't completely embrace loose coupling and distribution. Agile and end-user focused software development methodologies have worked well in other areas; much of the hard technical space has already been explored in Internet companies (and probably social networking companies, too). Data is most useful in the context of an explicit model; models can be generated from data, and models can feed back into data gathering.
31 Things I didn't discuss Database maintenance and active curation is incredibly important. Most data only makes sense in the context of other data (think: controls; wild type vs knockout; other backgrounds; etc.) so we will need lots more data to interpret the data we already have. Deep learning is a promising field for extracting correlations from multiple large data sets. All of these technical problems are easier to solve than the social problems (incentives; training).
32 Thanks -- This talk and ancillary notes will be available on my blog ~soon. Please do contact me at ctb@msu.edu if you have questions or comments.
Low Friction Data Warehousing WITH PERSPECTIVE ILM DATA GOVERNOR
Low Friction Data Warehousing WITH PERSPECTIVE ILM DATA GOVERNOR Table of Contents Foreword... 2 New Era of Rapid Data Warehousing... 3 Eliminating Slow Reporting and Analytics Pains... 3 Applying 20 Years
More informationScience-as-a-Service
Science-as-a-Service The iplant Foundation Rion Dooley Edwin Skidmore Dan Stanzione Steve Terry Matthew Vaughn Outline Why, why, why! When duct tape isn t enough Building an API for the web Core services
More informationSELF-SERVICE SEMANTIC DATA FEDERATION
SELF-SERVICE SEMANTIC DATA FEDERATION WE LL MAKE YOU A DATA SCIENTIST Contact: IPSNP Computing Inc. Chris Baker, CEO Chris.Baker@ipsnp.com (506) 721 8241 BIG VISION: SELF-SERVICE DATA FEDERATION Biomedical
More informationThe Emerging Data Lake IT Strategy
The Emerging Data Lake IT Strategy An Evolving Approach for Dealing with Big Data & Changing Environments bit.ly/datalake SPEAKERS: Thomas Kelly, Practice Director Cognizant Technology Solutions Sean Martin,
More informationANNUAL REPORT Visit us at project.eu Supported by. Mission
Mission ANNUAL REPORT 2011 The Web has proved to be an unprecedented success for facilitating the publication, use and exchange of information, at planetary scale, on virtually every topic, and representing
More informationIntroduction to Grid Computing
Milestone 2 Include the names of the papers You only have a page be selective about what you include Be specific; summarize the authors contributions, not just what the paper is about. You might be able
More informationPowering Knowledge Discovery. Insights from big data with Linguamatics I2E
Powering Knowledge Discovery Insights from big data with Linguamatics I2E Gain actionable insights from unstructured data The world now generates an overwhelming amount of data, most of it written in natural
More informationUnstructured Text in Big Data The Elephant in the Room
Unstructured Text in Big Data The Elephant in the Room David Milward ICIC, October 2013 Click Unstructured to to edit edit Master Master Big title Data style title style Big Data Volume, Variety, Velocity
More informationExtracting reproducible simulation studies from model repositories using the CombineArchive Toolkit
Extracting reproducible simulation studies from model repositories using the CombineArchive Toolkit Martin Scharm, Dagmar Waltemath Department of Systems Biology and Bioinformatics University of Rostock
More informationAdvances in Data Integration & Representation in Systems Biology
Advances in Data Integration & Representation in Systems Biology Susie Stephens Principal Product Manager, Life Sciences Oracle susie.stephens@oracle.com Outline Systems Biology Data Requirements Semantic
More informationBuilding High Performance Apps using NoSQL. Swami Sivasubramanian General Manager, AWS NoSQL
Building High Performance Apps using NoSQL Swami Sivasubramanian General Manager, AWS NoSQL Building high performance apps There is a lot to building high performance apps Scalability Performance at high
More informationExtending SOA Infrastructure for Semantic Interoperability
Extending SOA Infrastructure for Semantic Interoperability Wen Zhu wzhu@alionscience.com ITEA System of Systems Conference 26 Jan 2006 www.alionscience.com/semantic Agenda Background Semantic Mediation
More informationDataspaces: A New Abstraction for Data Management. Mike Franklin, Alon Halevy, David Maier, Jennifer Widom
Dataspaces: A New Abstraction for Data Management Mike Franklin, Alon Halevy, David Maier, Jennifer Widom Today s Agenda Why databases are great. What problems people really have Why databases are not
More informationBig Data - Some Words BIG DATA 8/31/2017. Introduction
BIG DATA Introduction Big Data - Some Words Connectivity Social Medias Share information Interactivity People Business Data Data mining Text mining Business Intelligence 1 What is Big Data Big Data means
More informationIntermediate/Advanced Python. Michael Weinstein (Day 1)
Intermediate/Advanced Python Michael Weinstein (Day 1) Who am I? Most of my experience is on the molecular and animal modeling side I also design computer programs for analyzing biological data, particularly
More informationWhat is Text Mining? Sophia Ananiadou National Centre for Text Mining University of Manchester
National Centre for Text Mining www.nactem.ac.uk University of Manchester Outline Aims of text mining Text Mining steps Text Mining uses Applications 2 Aims Extract and discover knowledge hidden in text
More informationBUILDING MICROSERVICES ON AZURE. ~ Vaibhav
BUILDING MICROSERVICES ON AZURE ~ Vaibhav Gujral @vabgujral About Me Over 11 years of experience Working with Assurant Inc. Microsoft Certified Azure Architect MCSD, MCP, Microsoft Specialist Aspiring
More informationOntologies and Database Schema: What s the Difference? Michael Uschold, PhD Semantic Arts.
Ontologies and Database Schema: What s the Difference? Michael Uschold, PhD Semantic Arts. Objective To settle once and for all the question: What is the difference between an ontology and a database schema?
More informationWhen, Where & Why to Use NoSQL?
When, Where & Why to Use NoSQL? 1 Big data is becoming a big challenge for enterprises. Many organizations have built environments for transactional data with Relational Database Management Systems (RDBMS),
More informationWEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS
1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,
More informationIntroduction to Semantic Web
ه عا ی Semantic Web Introduction to Semantic Web Morteza Amini Sharif University of Technology Fall 95-96 Outline Thinking and Intelligent Applications The World Wide Web History The Problem with the Web
More informationStrategic Briefing Paper Big Data
Strategic Briefing Paper Big Data The promise of Big Data is improved competitiveness, reduced cost and minimized risk by taking better decisions. This requires affordable solution architectures which
More informationSOA: Service-Oriented Architecture
SOA: Service-Oriented Architecture Dr. Kanda Runapongsa (krunapon@kku.ac.th) Department of Computer Engineering Khon Kaen University 1 Gartner Prediction The industry analyst firm Gartner recently reported
More informationWither OWL in a knowledgegraphed, Linked-Data World?
Wither OWL in a knowledgegraphed, Linked-Data World? Jim Hendler @jahendler Tetherless World Professor of Computer, Web and Cognitive Science Director, Rensselaer Institute for Data Exploration and Applications
More informationInteroperability ~ An Introduction
Interoperability ~ An Introduction Cyndy Chandler Biological and Chemical Oceanography Data Management Office (BCO-DMO) Woods Hole Oceanographic Institution 26 July 2008 MMI OOS Interoperability Planning
More informationEmpowering People with Knowledge the Next Frontier for Web Search. Wei-Ying Ma Assistant Managing Director Microsoft Research Asia
Empowering People with Knowledge the Next Frontier for Web Search Wei-Ying Ma Assistant Managing Director Microsoft Research Asia Important Trends for Web Search Organizing all information Addressing user
More informationThe 7 Habits of Highly Effective API and Service Management
7 Habits of Highly Effective API and Service Management: Introduction The 7 Habits of Highly Effective API and Service Management... A New Enterprise challenge has emerged. With the number of APIs growing
More informationHow to integrate data into Tableau
1 How to integrate data into Tableau a comparison of 3 approaches: ETL, Tableau self-service and WHITE PAPER WHITE PAPER 2 data How to integrate data into Tableau a comparison of 3 es: ETL, Tableau self-service
More informationManagement Information Systems MANAGING THE DIGITAL FIRM, 12 TH EDITION FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT
MANAGING THE DIGITAL FIRM, 12 TH EDITION Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT VIDEO CASES Case 1: Maruti Suzuki Business Intelligence and Enterprise Databases
More informationBioinformatics Data Distribution and Integration via Web Services and XML
Letter Bioinformatics Data Distribution and Integration via Web Services and XML Xiao Li and Yizheng Zhang* College of Life Science, Sichuan University/Sichuan Key Laboratory of Molecular Biology and Biotechnology,
More informationEnabling Open Science: Data Discoverability, Access and Use. Jo McEntyre Head of Literature Services
Enabling Open Science: Data Discoverability, Access and Use Jo McEntyre Head of Literature Services www.ebi.ac.uk About EMBL-EBI Part of the European Molecular Biology Laboratory International, non-profit
More informationData management is fun. Casey Dunn Assistant Professor Ecology and Evolutionary Biology
Data management is fun Casey Dunn Assistant Professor Ecology and Evolutionary Biology What is science? The study of the natural world through observation and experiment. Reproducible study. Prove it isn
More informationStarting small to go Big: Building a Living Database
Starting small to go Big: Building a Living Database Michael Sabbatino 1,2, Baker, D.V. Vic 3,4, Rose, K. 1, Romeo, L. 1,2, Bauer, J. 1, and Barkhurst, A. 3,4 1 US Department of Energy, National Energy
More informationTitle: Episode 11 - Walking through the Rapid Business Warehouse at TOMS Shoes (Duration: 18:10)
SAP HANA EFFECT Title: Episode 11 - Walking through the Rapid Business Warehouse at (Duration: 18:10) Publish Date: April 6, 2015 Description: Rita Lefler walks us through how has revolutionized their
More informationSemantic Technologies for Nuclear Knowledge Modelling and Applications
Semantic Technologies for Nuclear Knowledge Modelling and Applications D. Beraha 3 rd International Conference on Nuclear Knowledge Management 7.-11.11.2016, Vienna, Austria Why Semantics? Machines understanding
More informationTaming Structured And Unstructured Data With SAP HANA Running On VCE Vblock Systems
1 Taming Structured And Unstructured Data With SAP HANA Running On VCE Vblock Systems The Defacto Choice For Convergence 2 ABSTRACT & SPEAKER BIO Dealing with enormous data growth is a key challenge for
More informationREPORT MICROSOFT PATTERNS AND PRACTICES
REPORT MICROSOFT PATTERNS AND PRACTICES Corporate Headquarters Nucleus Research Inc. 100 State Street Boston, MA 02109 Phone: +1 617.720.2000 Nucleus Research Inc. TOPICS Application Development & Integration
More informationYour Data Demands More NETAPP ENABLES YOU TO LEVERAGE YOUR DATA & COMPUTE FROM ANYWHERE
Your Data Demands More NETAPP ENABLES YOU TO LEVERAGE YOUR DATA & COMPUTE FROM ANYWHERE IN ITS EARLY DAYS, NetApp s (www.netapp.com) primary goal was to build a market for network-attached storage and
More informationCMIS An Industry Effort to Define a Service-Based Interoperability Standard for Content Management
CMIS An Industry Effort to Define a Service-Based Interoperability Standard for Content Management Dr. David Choy Content Management & Archiving CTO Office Chair, OASIS CMIS Technical Committee Patricia
More informationFoundations of a Data Centric Organization A NDREW K A R CHER SQL SAT U R D AY #740 A P R IL 1 4,
Foundations of a Data Centric Organization A NDREW K A R CHER SQL SAT U R D AY #740 A P R IL 1 4, 20 1 8 About Me http://www.andrewkarcher.com Twitter: @akarcher LinkedIn, Twitter Email: akarcher@gmail.com
More informationPlantSimLab An Innovative Web Application Tool for Plant Biologists
PlantSimLab An Innovative Web Application Tool for Plant Biologists Feb. 17, 2014 Sook S. Ha, PhD Postdoctoral Associate Virginia Bioinformatics Institute (VBI) 1 Outline PlantSimLab Project A NSF proposal
More informationData Analysis and Validation for ML
Analysis and for ML Neoklis (Alkis) Polyzotis, Google Research Collaborators: Eric Breck, Sudip Roy, Steven Whang, Martin Zinkevich Outline ML in production is hard, and a big part of hardness is related
More informationOutline. Quick Introduction to Database Systems. Data Manipulation Tasks. What do they all have in common? CSE142 Wi03 G-1
Outline Quick Introduction to Database Systems Why do we need a different kind of system? What is a database system? Separating the what the how: The relational data model Querying the databases: SQL May
More informationMeeting the OMB FY2012 Objective: Experiences, Observations, Lessons-Learned, and Other Thoughts
Meeting the OMB FY2012 Objective: Experiences, Observations, Lessons-Learned, and Other Thoughts 2013 Federal Interagency Workshop 9 December, 2013 Ron Broersma DREN Chief Engineer ron@dren.mil Introduction
More informationGetting Started with Semantics in the Enterprise. November 10, 2010, AWOSS, Moncton, NB
Getting Started with Semantics in the Enterprise Bradley Shoebottom November 10, 2010, AWOSS, Moncton, NB Introduction Should your enterprises first ontology look like this (and take 2 years to get there)?
More informationProf. Dr. Christian Bizer
STI Summit July 6 th, 2011, Riga, Latvia Global Data Integration and Global Data Mining Prof. Dr. Christian Bizer Freie Universität ität Berlin Germany Outline 1. Topology of the Web of Data What data
More informationTWOO.COM CASE STUDY CUSTOMER SUCCESS STORY
TWOO.COM CUSTOMER SUCCESS STORY With over 30 million users, Twoo.com is Europe s leading social discovery site. Twoo runs the world s largest scale-out SQL deployment, with 4.4 billion transactions a day
More informationXML in the bipharmaceutical
XML in the bipharmaceutical sector XML holds out the opportunity to integrate data across both the enterprise and the network of biopharmaceutical alliances - with little technological dislocation and
More informationTowards Practical Differential Privacy for SQL Queries. Noah Johnson, Joseph P. Near, Dawn Song UC Berkeley
Towards Practical Differential Privacy for SQL Queries Noah Johnson, Joseph P. Near, Dawn Song UC Berkeley Outline 1. Discovering real-world requirements 2. Elastic sensitivity & calculating sensitivity
More informationData-Intensive Distributed Computing
Data-Intensive Distributed Computing CS 451/651 431/631 (Winter 2018) Part 5: Analyzing Relational Data (1/3) February 8, 2018 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo
More informationWhere the Social Web Meets the Semantic Web. Tom Gruber RealTravel.com tomgruber.org
Where the Social Web Meets the Semantic Web Tom Gruber RealTravel.com tomgruber.org Doug Engelbart, 1968 "The grand challenge is to boost the collective IQ of organizations and of society. " Tim Berners-Lee,
More informationEducation Brochure. Education. Accelerate your path to business discovery. qlik.com
Education Education Brochure Accelerate your path to business discovery Qlik Education Services offers expertly designed coursework, tools, and programs to give your organization the knowledge and skills
More informationBig Data in Translational Science
Big Data in Translational Science Albert Wang Associate Director, Translational R&D IT Bristol-Myers Squibb 2015 AAPS Annual Meeting Agenda Perspectives on Big Data Big Data in Translational R&D Selected
More informationHOW THE RIGHT CMS MAKES CONTENT AVAILABLE WHEN AND WHERE CUSTOMERS NEED IT
HOW THE RIGHT CMS MAKES CONTENT AVAILABLE WHEN AND WHERE CUSTOMERS NEED IT We have never lived in a more oversaturated content environment than we do now. We have images and hashtags and blog posts demanding
More informationCSE 3241: Database Systems I Databases Introduction (Ch. 1-2) Jeremy Morris
CSE 3241: Database Systems I Databases Introduction (Ch. 1-2) Jeremy Morris 1 Outline What is a database? The database approach Advantages Disadvantages Database users Database concepts and System architecture
More informationLet me begin by introducing myself. I have been a Progress Application Partner since 1986 and for many years I was the architect and chief developer
Let me begin by introducing myself. I have been a Progress Application Partner since 1986 and for many years I was the architect and chief developer for our ERP application. In recent years, I have refocused
More informationBuilding a Data Strategy for a Digital World
Building a Data Strategy for a Digital World Jason Hunter, CTO, APAC Data Challenge: Pushing the Limits of What's Possible The Art of the Possible Multiple Government Agencies Data Hub 100 s of Service
More informationTHE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel
THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel National Center for Supercomputing Applications University of Illinois
More informationDatabases in the Cloud
Databases in the Cloud Ani Thakar Alex Szalay Nolan Li Center for Astrophysical Sciences and Institute for Data Intensive Engineering and Science (IDIES) The Johns Hopkins University Cloudy with a chance
More informationBiocomputing II Coursework guidance
Biocomputing II Coursework guidance I refer to the database layer as DB, the middle (business logic) layer as BL and the front end graphical interface with CGI scripts as (FE). Standardized file headers
More informationNational Centre for Text Mining NaCTeM. e-science and data mining workshop
National Centre for Text Mining NaCTeM e-science and data mining workshop John Keane Co-Director, NaCTeM john.keane@manchester.ac.uk School of Informatics, University of Manchester What is text mining?
More informationBuild Scientific Computing Infrastructure with Rebar3 and Docker. Eric Sage
Build Scientific Computing Infrastructure with Rebar3 and Docker Eric Sage A scientific telecommunications network Hello, I d like an automated gene ontology please! Agenda - An example biological service
More informationDatabase Management Systems Chapter 1 Instructor: Oliver Schulte Database Management Systems 3ed, R. Ramakrishnan and J.
Database Management Systems Chapter 1 Instructor: Oliver Schulte oschulte@cs.sfu.ca Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 What is a database? A database (DB) is a very large,
More informationIntroduction to Data Management for Ocean Science Research
Introduction to Data Management for Ocean Science Research Cyndy Chandler Biological and Chemical Oceanography Data Management Office 12 November 2009 Ocean Acidification Short Course Woods Hole, MA USA
More informationPlanning & Managing Migrations
Planning & Managing Migrations It s for the birds. Har har. Aimee Degnan / aimee@hook42.com Expectation Setting This is the first run of this presentation. It is being shaped for DrupalCon. Is text heavy
More informationSensor Data Collection and Processing
Sensor Data Collection and Processing Applying Web Scale To Sensor Data Today s speaker Josh Patterson josh@cloudera.com / twitter: @jpatanooga Master s Thesis: self-organizing mesh networks Published
More informationOverview. Data-mining. Commercial & Scientific Applications. Ongoing Research Activities. From Research to Technology Transfer
Data Mining George Karypis Department of Computer Science Digital Technology Center University of Minnesota, Minneapolis, USA. http://www.cs.umn.edu/~karypis karypis@cs.umn.edu Overview Data-mining What
More informationBUILDING the VIRtUAL enterprise