In-Memory Databases: Applications in Healthcare

1 Dr. Matthieu-P. Schapranow Apr 21,

2 Frankfurter Allgemeine Zeitung, publisher's special "Medizin zwischen Möglichkeiten und Erfolg" (Medicine between Possibilities and Success), April 17

3 Important things first: Where do you find additional information? Online: Visit we.analyzegenomes.com for the latest research results, tools, and news. Offline: Read more about it, e.g. High-Performance In-Memory Genome Data Analysis: How In-Memory Database Technology Accelerates Personalized Medicine, In-Memory Data Management Research, Springer, ISBN: , 2014. In person: Join us for German Biotechnology Days, Apr 22-23, in Cologne, Germany.

4 IT Challenges: Distributed Heterogeneous Data Sources. Human genome/biological data: 600GB per full genome, 15PB+ in databases of leading institutes. Human proteome: 160M data points (2.4GB) per sample, >3TB raw proteome data in ProteomicsDB. Hospital information systems: often more than 50GB. PubMed database: >23M articles. Cancer patient records: >160k records at NCT. Prescription data: 1.5B records from 10,000 doctors and 10M patients (100GB). Medical sensor data: scan of a single organ in 1s creates 10GB of raw data. Clinical trials: currently more than 30k recruiting on ClinicalTrials.gov.

5 Our Approach: Analyze Genomes, Real-time Analysis of Big Medical Data. (Architecture diagram.) Applications: Oncolyzer, Clinical Trial Assessment, Pathway Topology Analysis, Drug Response Analysis, Cohort Analysis, Medical Knowledge Cockpit. Platform services: Real-time Analysis; Access Control, Data Protection; Data Exchange, App Store; Statistical Tools; Fair Use; Extensions for Life Sciences; App-spanning User Profiles. Combined and linked data: Genome Data, Research Publications, Genome Metadata, Pipeline and Analysis Models, Cellular Pathways, Drugs and Interactions. All of this is built on an In-Memory Database.

6 The Setting: Actors in Oncology. Patients: individual anamnesis, family history, and background; require fast access to individualized therapy. Clinicians: identify root and extent of disease using laboratory tests; evaluate therapy alternatives and adapt existing therapy. Researchers: conduct laboratory work, e.g. analyze patient samples; create new research findings and come up with treatment alternatives.

7 Our Motivation: Make Precision Medicine Become Routine in Real Life. Can we enable clinicians to make their therapy decisions (1) incorporating all available specifics about each individual patient, (2) referencing the latest lab results and worldwide medical knowledge, and (3) interactively during their ward round?

8 Cloud-based Services for Processing of DNA Data: Control center for processing of DNA data in formats such as FASTQ, SAM, and VCF. A personal user profile guarantees privacy of uploaded and processed data. Supports a reproducible research process by storing all relevant process parameters. Standardized modeling and runtime environment for analysis pipelines. Implements prioritized data processing and fair use, e.g. per department or per institute. Supports additional services, such as data annotations, billing, and sharing, for all Analyze Genomes services. Honored by the 2014 European Life Science Award.

9 Real-time Processing of Event Data from Medical Sensors: Processing of sensor data, e.g. from Intensive Care Units (ICUs) or wearable sensor devices (quantified self). Multi-modal real-time analysis to detect indicators for severe events, such as heart attacks or strokes. Comparison of waveform data with the history of similar patients. Incorporates machine-learning algorithms to detect severe events and to inform clinical personnel in time. Successfully tested with a 100 Hz event rate, i.e. sufficient for ICU use.

10 Drug Safety: Statistical Analysis of Drug Side Effects Data. Combines confirmed side effect data from different data sources. Interactive statistical analysis, e.g. apriori rules, to discover still unknown interactions. Integrates personal prescription data and lets users report side effects directly. Work together with your doctor to prevent interactions with already prescribed drugs. Unified access to international side effect data. On-the-fly extension of the database schema to add side effect databases.
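
The "apriori rules" analysis named on this slide can be illustrated with a small sketch. It is a simplified support/confidence computation over drug pairs in the spirit of association rule mining, not the full Apriori algorithm and not the presenters' implementation. The report data, drug names, and thresholds are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical reports: (set of drugs taken, set of observed side effects).
REPORTS = [
    ({"drug_a", "drug_b"}, {"nausea"}),
    ({"drug_a", "drug_b"}, {"nausea", "rash"}),
    ({"drug_a"}, set()),
    ({"drug_b"}, set()),
    ({"drug_a", "drug_b", "drug_c"}, {"nausea"}),
    ({"drug_c"}, {"rash"}),
]


def pair_rules(reports, min_support=0.3, min_confidence=0.8):
    """Return rules (drug pair -> side effect) that meet both thresholds."""
    n = len(reports)
    pair_counts = Counter()   # how often each drug pair was taken together
    rule_counts = Counter()   # how often a pair co-occurred with an effect
    for drugs, effects in reports:
        for pair in combinations(sorted(drugs), 2):
            pair_counts[pair] += 1
            for effect in effects:
                rule_counts[(pair, effect)] += 1
    rules = []
    for (pair, effect), hits in rule_counts.items():
        support = hits / n                    # share of all reports
        confidence = hits / pair_counts[pair] # share of reports with the pair
        if support >= min_support and confidence >= min_confidence:
            rules.append((pair, effect, support, confidence))
    return sorted(rules)


RULES = pair_rules(REPORTS)
```

With the sample reports, only the combination of drug_a and drug_b is associated with nausea often enough to pass both thresholds; such a rule is a candidate for review, not a confirmed interaction.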

11 Real-time Assessment of Clinical Trial Candidates: Switch from trial-centric to patient-centric clinical trials. Real-time matching and clustering of patients and clinical trial inclusion/exclusion criteria. No manual pre-screening of patients for months: in-memory technology enables an interactive pre-screening process. Reassessment of already screened or already participating patients reduces recruitment costs. Assessment of patients' preconditions for clinical trials.
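
Matching patients against inclusion/exclusion criteria, as described on this slide, could look roughly like the following sketch. The record layout, field names, and criteria are assumptions made for illustration; real trial criteria are far richer and are often free text.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    patient_id: str
    age: int
    diagnoses: set = field(default_factory=set)
    prior_drugs: set = field(default_factory=set)


@dataclass
class Trial:
    trial_id: str
    min_age: int
    max_age: int
    required_diagnoses: set  # inclusion: patient must have all of these
    excluded_drugs: set      # exclusion: patient must have taken none


def is_eligible(patient, trial):
    """Check one patient against one trial's (hypothetical) criteria."""
    if not trial.min_age <= patient.age <= trial.max_age:
        return False
    if not trial.required_diagnoses <= patient.diagnoses:
        return False
    return not (trial.excluded_drugs & patient.prior_drugs)


def trials_for_patient(patient, trials):
    """Patient-centric view: all trials this patient qualifies for."""
    return [t.trial_id for t in trials if is_eligible(patient, t)]


TRIALS = [
    Trial("trial_1", 18, 70, {"colon_cancer"}, {"drug_x"}),
    Trial("trial_2", 40, 80, {"colon_cancer"}, set()),
    Trial("trial_3", 18, 99, {"lung_cancer"}, set()),
]
PATIENT = Patient("p1", 55, {"colon_cancer"}, {"drug_x"})
MATCHES = trials_for_patient(PATIENT, TRIALS)
```

Here the patient is excluded from trial_1 because of a previously taken drug and from trial_3 because the required diagnosis is missing, so only trial_2 remains.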

12 Drug Response Analysis: Data Sources and Matching. Patient-specific data: metadata, e.g. smoking status, tumor classification, and age. Tumor-specific data: genome data, e.g. raw DNA data and genetic variants. Experiment data: experiment results, e.g. medication effectivity obtained from the wet laboratory. Goal: enable interactive data exploration and analysis.

13 Drug Response Analysis: Interactive Data Exploration. Drug response depends on individual genetic variants of tumors. Challenge: identification of relevant genetic variants and their impact on drug response is an ongoing research activity, e.g. xenograft models. Exploration of experiment results is time-consuming and Excel-driven. Interactive analysis of correlations between drugs and genetic variants. In-memory technology enables interactive exploration of experiment data to leverage new scientific insights.

14 Medical Knowledge Cockpit for Patients and Clinicians: Search for affected genes in distributed and heterogeneous data sources. Immediate exploration of relevant information, such as gene descriptions, molecular impact and related pathways, scientific publications, and suitable clinical trials. Unified access to structured and unstructured data sources. Automatic clinical trial matching built on text analysis features. No manual searching for hours or days: in-memory technology translates searching into interactive finding!

15 Medical Knowledge Cockpit for Patients and Clinicians: Publications. In-place preview of relevant data, such as publications and publication metadata. Incorporates individual filter settings, e.g. additional search terms.

16 Medical Knowledge Cockpit for Patients and Clinicians: Latest Clinical Trials. Personalized clinical trials, e.g. by incorporating patient specifics. Classification of internal/external trials based on the treating institute.

17 Medical Knowledge Cockpit for Patients and Clinicians: Pathway Topology Analysis. Today, search in pathways is limited to the question "is a certain element contained?". Integrated >1.5k pathways from international sources, e.g. KEGG, HumanCyc, and WikiPathways, into HANA. Implemented graph-based topology exploration and ranking based on patient specifics. Unified access to multiple formerly disjoint data sources. Pathway analysis of genetic variants with a graph engine. Enables interactive identification of possible dysfunctions affecting the course of a therapy before its start.
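
Graph-based topology exploration goes beyond asking whether an element is contained in a pathway. A minimal sketch, assuming pathways are stored as directed adjacency lists with made-up element names, ranks pathways by how many affected genes they contain and how many downstream elements those genes can reach. This is an illustration only and does not reflect the HANA graph engine's API.

```python
from collections import deque

# Hypothetical pathways: element -> list of directly downstream elements.
PATHWAYS = {
    "pathway_1": {"GENE_A": ["GENE_B"], "GENE_B": ["GENE_C"], "GENE_C": []},
    "pathway_2": {"GENE_X": ["GENE_Y"], "GENE_A": ["GENE_Y"], "GENE_Y": []},
    "pathway_3": {"GENE_M": ["GENE_N"], "GENE_N": []},
}


def downstream(graph, start):
    """Breadth-first search: all elements reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


def rank_pathways(pathways, affected_genes):
    """Rank pathways by affected genes contained, then by downstream reach."""
    scores = []
    for name, graph in pathways.items():
        hits = [g for g in affected_genes if g in graph]
        if not hits:
            continue
        impacted = set()
        for gene in hits:
            impacted |= downstream(graph, gene)
        scores.append((name, len(hits), len(impacted)))
    return sorted(scores, key=lambda s: (-s[1], -s[2], s[0]))


RANKING = rank_pathways(PATHWAYS, ["GENE_A"])
```

For a patient with a variant in GENE_A, pathway_1 ranks first because two downstream elements are potentially affected, compared with one in pathway_2.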

18 Medical Knowledge Cockpit for Patients and Clinicians: Search in Structured and Unstructured Medical Data. Extended the text analysis feature by medical terminology: genes (122, ,771 synonyms), medical terms and categories (98,886 diseases, 47 categories), pharmaceutical ingredients (7,099). Unified access to structured and unstructured data sources. Clinical trial matching using text analysis features. Indexed the clinicaltrials.gov database (145k trials/30,138 recruiting). Extracted, e.g., 320k genes, 161k ingredients, 30k periods. Select studies based on multiple filters in less than 500ms.

19 Medical Knowledge Cockpit for Patients and Clinicians: Seamless Integration of Patient Specifics. Google-like user interface for searching data. Seamless integration of individual EMR data. Search various sources for biomarkers, literature, and diseases.

20 Our Methodology: Design Thinking Methodology

21 Our Methodology: Design Thinking Methodology. Desirability: leveraging directed customer services; portfolio of integrated services for clinicians, researchers, and patients; include latest research results, e.g. most effective therapies. Viability: enable personalized medicine also in far-off regions and developing countries; share data via the Internet to get feedback from world-wide experts (cost-saving); combine research data (publications, annotations, genome data) from international databases in a single knowledge base. Feasibility: HiSeq 2500 enables high-coverage whole genome sequencing in 20h; IMDB enables allele frequency determination of 12B records within <1s; detection of 1 relevant annotation out of 80M in <1s; cloud-based data processing services reduce TCO.

22 Our Technology: In-Memory Database Technology. Combined column and row store; Map/Reduce; Single and multi-tenancy; Insert only for time travel; Real-time replication; Working on integers; Active/passive data store; Minimal projections; Group key; Dynamic multithreading; Bulk load of data; Object-relational mapping; No aggregate tables; Data partitioning; Any attribute as index; On-the-fly extensibility; Analytics on historical data; Multi-core/parallelization; Lightweight compression; SQL interface on columns and rows; Reduction of software layers; Text retrieval and extraction engine; No disk.

23 In-Memory Data Management: Overview. Advances in hardware: multi-core architecture (6 x 12 core CPU per blade); parallel scaling across blades; 1 blade, 50k USD = 1 enterprise-class server; 64 bit address space, 4 TB in current server boards; 4 MB/ms/core data throughput; cost-performance ratio rapidly declining. Advances in software: row and column store; insert only; compression; partitioning; parallelization; active & passive data stores.

24 In-Memory Database Technology: Use Case: Analysis of Genomic Data. Alignment and variant calling: bound to CPU performance; duration hours to days; at HPI: minutes; enabled by multi-core. Analysis of annotations in world-wide DBs: bound to memory capacity; duration weeks; at HPI: real-time; enabled by partitioning & compression.

25 In-Memory Database Technology: Hardware Characteristics at HPI FSOC Lab. 1,000 core cluster at Hasso Plattner Institute with 25 TB main memory. Consists of 25 nodes, each with 40 cores, 1 TB main memory, Intel Xeon E GHz, and 30 MB cache.

26 Combined Column and Row Store: Row stores are designed for operative workloads, e.g. creating and maintaining metadata for tests or accessing a complete record of a trial or test series. Column stores are designed for analytical work, e.g. evaluating the number of positive test results or identifying correlations or test candidates. In-memory approach: combination of both stores. Performance for analytical work increases while operative performance remains interactive.
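
The difference between the two layouts can be sketched in a few lines. The example uses plain Python lists and dicts as stand-ins for a row store and a column store; the table contents are hypothetical and the sketch says nothing about how an actual in-memory database lays out its memory.

```python
# Row layout: one complete record per entry (good for operative access).
ROWS = [
    {"test_id": 1, "patient": "P1", "result": "positive"},
    {"test_id": 2, "patient": "P2", "result": "negative"},
    {"test_id": 3, "patient": "P3", "result": "positive"},
]


def to_columns(rows):
    """Column layout: one contiguous list per attribute (good for analytics)."""
    return {key: [row[key] for row in rows] for key in rows[0]}


COLUMNS = to_columns(ROWS)

# Operative workload: fetch one complete record. The row layout touches
# exactly one entry.
RECORD = ROWS[1]

# Analytical workload: count positive results. The column layout scans a
# single attribute and never touches test_id or patient.
POSITIVES = COLUMNS["result"].count("positive")
```

The analytical query reads one list of three values instead of three full records, which is the effect a column store exploits at much larger scale.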

27 Insert-Only / Append-Only: Traditional databases allow four data operations: INSERT, SELECT, DELETE, and UPDATE. Insert-only requires only INSERT and SELECT to maintain a complete history (as in bookkeeping systems). Insert-only enables time travel, e.g. to trace changes and reconstruct decisions, to document the complete history of changes, therapies, etc., and to enable statistical observations.
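
A minimal sketch of the insert-only idea, assuming integer timestamps that are inserted in non-decreasing order: an update is expressed as a new version, and a query "as of" an earlier time reconstructs the state at that point. The class and its method names are hypothetical.

```python
class InsertOnlyTable:
    """Append-only table: updates are new versions, nothing is overwritten."""

    def __init__(self):
        self._versions = []  # list of (timestamp, key, value)

    def insert(self, timestamp, key, value):
        # Assumption: callers insert in non-decreasing timestamp order.
        self._versions.append((timestamp, key, value))

    def select(self, key, as_of=None):
        """Latest value for key, or the value valid at time as_of."""
        result = None
        for timestamp, k, value in self._versions:
            if k == key and (as_of is None or timestamp <= as_of):
                result = value
        return result

    def history(self, key):
        """Complete change history for key."""
        return [(t, v) for t, k, v in self._versions if k == key]


therapy = InsertOnlyTable()
therapy.insert(1, "patient_1", "therapy_a")
therapy.insert(5, "patient_1", "therapy_b")  # an "update" is a new version
```

Selecting with `as_of=3` returns the therapy that was valid at time 3, which is how a past decision can be reconstructed.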

28 Lightweight Compression: Main memory access is the new bottleneck. Lightweight compression can reduce this bottleneck, i.e. it is lossless, improves usage of data bus capacity, and allows working directly on compressed data. Typical compression factor of 10:1 for enterprise software; in financial applications up to 50:1. [Figure: dictionary encoding of a table column. A data dictionary maps each ValueId to a value (1 Larynx, 2 Lip, 3 Rectum, 4 Colon, 5 Mama); the attribute vector stores one ValueId per RecId; an inverted index maps each ValueId to its RecIdList. The sample table pairs values with codes, e.g. Colon C18.0, Larynx C32.0, Rectum C20.0.]
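
The three structures named on this slide (data dictionary, attribute vector, inverted index) can be sketched as follows. The column contents are illustrative, and ValueIds are assigned in order of first appearance, which is an assumption and differs from the numbering in the slide's figure.

```python
def dictionary_encode(column):
    """Return (dictionary, attribute_vector) for a list of values."""
    dictionary = {}        # value -> ValueId
    attribute_vector = []  # one ValueId per record
    for value in column:
        value_id = dictionary.setdefault(value, len(dictionary) + 1)
        attribute_vector.append(value_id)
    return dictionary, attribute_vector


def build_inverted_index(attribute_vector):
    """Map each ValueId to the list of RecIds (1-based) where it occurs."""
    index = {}
    for rec_id, value_id in enumerate(attribute_vector, start=1):
        index.setdefault(value_id, []).append(rec_id)
    return index


def scan_equals(dictionary, attribute_vector, value):
    """Equality scan that compares small integers, never the strings."""
    value_id = dictionary.get(value)
    if value_id is None:
        return []
    return [rec_id
            for rec_id, vid in enumerate(attribute_vector, start=1)
            if vid == value_id]


COLUMN = ["Colon", "Larynx", "Lip", "Colon", "Rectum", "Rectum", "Mama", "Colon"]
DICTIONARY, VECTOR = dictionary_encode(COLUMN)
INDEX = build_inverted_index(VECTOR)
```

The scan works directly on the compressed attribute vector: the search value is translated to its ValueId once, and every comparison after that is an integer comparison.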

29 Partitioning: Horizontal partitioning cuts long tables into shorter segments, e.g. to group samples with the same relevance. Vertical partitioning splits off columns to individual resources, e.g. to separate personalized data from experiment data. Partitioning is the basis for parallel execution of database queries and for the implementation of data aging and data retention management.
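
Both partitioning directions can be sketched on a small list of records. The column names and the choice of which columns count as personalized are hypothetical.

```python
ROWS = [
    {"rec_id": 1, "patient_name": "A. Example", "relevance": "high", "response": 0.8},
    {"rec_id": 2, "patient_name": "B. Example", "relevance": "low", "response": 0.1},
    {"rec_id": 3, "patient_name": "C. Example", "relevance": "high", "response": 0.6},
]


def horizontal_partition(rows, key):
    """Cut the table into shorter segments, one per distinct value of key."""
    partitions = {}
    for row in rows:
        partitions.setdefault(row[key], []).append(row)
    return partitions


def vertical_partition(rows, id_column, personal_columns):
    """Split personalized columns from the rest; both parts keep the id."""
    personal, experiment = [], []
    for row in rows:
        personal.append({c: row[c] for c in row
                         if c == id_column or c in personal_columns})
        experiment.append({c: row[c] for c in row
                           if c not in personal_columns})
    return personal, experiment


BY_RELEVANCE = horizontal_partition(ROWS, "relevance")
PERSONAL, EXPERIMENT = vertical_partition(ROWS, "rec_id", {"patient_name"})
```

Each horizontal segment can be scanned by a separate worker, and the vertical split allows the experiment data to be shared or analyzed without the personalized columns.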

30 Multi-core and Parallelization: Modern server systems consist of x CPUs, each consisting of y CPU cores, e.g. 12. Consider each of the x*y CPU cores as an individual worker, e.g. 6x12. Each worker performs one task at a time, and all workers run in parallel. A full table scan of a database table with 1M entries therefore takes 1/(x*y) of the sequential search time when traversed in parallel. Benefits: reduced response time, no need for pre-aggregated totals and redundant data, improved usage of hardware, and instant analysis of data.
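
The split-scan-merge structure behind a parallel table scan can be sketched with Python's standard library. The sketch uses threads, which in CPython do not speed up pure-Python work because of the global interpreter lock, so it illustrates the structure rather than the 1/(x*y) speed-up; the function names and worker count are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_chunk(chunk, offset, predicate):
    """Scan one partition; return absolute positions of matching entries."""
    return [offset + i for i, value in enumerate(chunk) if predicate(value)]


def parallel_scan(values, predicate, workers=4):
    """Split the column into one chunk per worker, scan, and merge in order."""
    chunk_size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [(values[start:start + chunk_size], start)
              for start in range(0, len(values), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda c: scan_chunk(c[0], c[1], predicate), chunks)
        return [pos for part in parts for pos in part]


VALUES = list(range(1000))
HITS = parallel_scan(VALUES, lambda v: v % 250 == 0, workers=4)
```

With 4 workers each chunk holds 250 entries, so every worker scans a quarter of the column and the partial results are concatenated in chunk order.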

31 Active and Passive Data Store: Consider two categories of data stores: active and passive. Active data are accessed frequently and updates are expected, e.g. most recent experiment results (such as the last two weeks) or samples that have not been processed yet. Passive data are used for analytical and statistical purposes, e.g. samples that were processed 5 years ago or metadata about seeds that are no longer produced. Moving passive data to slower storage reduces main memory demands and improves performance for active data.
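
A data aging rule along the lines of this slide could be sketched as follows. The record fields and the 14-day window (taken from the "last two weeks" example) are assumptions; a real system would apply such a rule per partition rather than per record.

```python
from datetime import date, timedelta


def split_active_passive(records, today, active_days=14):
    """Classify records: unprocessed or recently updated ones stay active."""
    cutoff = today - timedelta(days=active_days)
    active, passive = [], []
    for record in records:
        is_active = (not record["processed"]) or record["last_update"] >= cutoff
        (active if is_active else passive).append(record)
    return active, passive


TODAY = date(2015, 4, 21)  # arbitrary reference date for the example
RECORDS = [
    {"sample": "s1", "processed": True, "last_update": date(2015, 4, 15)},
    {"sample": "s2", "processed": False, "last_update": date(2014, 1, 1)},
    {"sample": "s3", "processed": True, "last_update": date(2010, 4, 21)},
]
ACTIVE, PASSIVE = split_active_passive(RECORDS, TODAY)
```

Sample s3 was processed five years before the reference date and moves to the passive store; s2 is old but still unprocessed, so it remains active.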

32 Reduction of Application Layers: Layers are introduced to abstract from complexity. Each layer offers complete functionality, e.g. metadata of samples. Fewer layers result in less code to maintain, more specific code, and reduced resource demands. This improves the performance of applications by eliminating obsolete processing steps.

33 What to take home? Test-drive it yourself: For patients: identify relevant clinical trials and medical experts; start the most appropriate therapy as early as possible. For clinicians: preventive diagnostics to identify risk patients early; indicate pharmacokinetic correlations; scan for similar patient cases, e.g. to evaluate therapy. For researchers: enable real-time analysis of medical data and its assessment, e.g. assess pathways to identify the impact of detected variants; combined free-text search in publications, diagnosis, and EMR data, i.e. structured and unstructured data.

34 Keep in contact with us! Hasso Plattner Institute, Enterprise Platform & Integration Concepts (EPIC). Program Manager E-Health: Dr. Matthieu-P. Schapranow, August-Bebel-Str Potsdam, Germany.


More information

ICTR UW Institute of Clinical and Translational Research. i2b2 User Guide. Version 1.0 Updated 9/11/2017

ICTR UW Institute of Clinical and Translational Research. i2b2 User Guide. Version 1.0 Updated 9/11/2017 ICTR UW Institute of Clinical and Translational Research i2b2 User Guide Version 1.0 Updated 9/11/2017 Table of Contents Background/Search Criteria... 2 Accessing i2b2... 3 Navigating the Workbench...

More information

Dispatcher. Phoenix. Dispatcher Phoenix Enterprise White Paper Version 0.2

Dispatcher. Phoenix. Dispatcher Phoenix Enterprise White Paper Version 0.2 Dispatcher Phoenix Dispatcher Phoenix Enterprise CONTENTS Introduction... 3 Terminology... 4 Planning & Considerations... 5 Security Features... 9 Enterprise Features... 10 Cluster Overview... 11 Deployment

More information

A Study of Skew in MapReduce Applications

A Study of Skew in MapReduce Applications A Study of Skew in MapReduce Applications YongChul Kwon Magdalena Balazinska, Bill Howe, Jerome Rolia* University of Washington, *HP Labs Motivation MapReduce is great Hides details of distributed execution

More information

D B M G Data Base and Data Mining Group of Politecnico di Torino

D B M G Data Base and Data Mining Group of Politecnico di Torino DataBase and Data Mining Group of Data mining fundamentals Data Base and Data Mining Group of Data analysis Most companies own huge databases containing operational data textual documents experiment results

More information

Hyrise - a Main Memory Hybrid Storage Engine

Hyrise - a Main Memory Hybrid Storage Engine Hyrise - a Main Memory Hybrid Storage Engine Philippe Cudré-Mauroux exascale Infolab U. of Fribourg - Switzerland & MIT joint work w/ Martin Grund, Jens Krueger, Hasso Plattner, Alexander Zeier (HPI) and

More information
