Advances In Data Integration: The No ETL Approach
Marcos A. Campos, Principal Consultant, The Cognatic Group
Sponsored by Capsenta
capsenta.com
INTRODUCTION

Data integration is a costly activity. Current approaches naturally lead to the question: Is it possible to bring multiple data sources together in a less costly, fully automated way, and still satisfactorily respond to the full breadth of business needs?

The answer to that question would represent a breakthrough in several industries. For example, in the pharmaceutical industry there are two primary areas of executive pain. The first is the pre-clinical data discovery phase. There is a need to "fail faster." With many potential drugs being tested, pharmaceutical executives want to be able to rule out failures as early as possible. They also want to rule out drugs that don't meet specific criteria, and drugs that have already been evaluated or patented by a competitor.

The second pain point occurs in the later stages (2 and 3) of the clinical trial phase. Finding appropriate patients for a new drug trial can be difficult and painstaking. The ability to integrate key clinical data from electronic medical record systems and third-party repositories could alleviate these problems and reduce the time it takes to develop a new drug. In the healthcare industry, access to the right data, at the right time and in the right setting, can save lives and improve efficiency.

In general, the goals of this no ETL approach are to achieve a 50% reduction in the time needed to integrate data sources, and a 50% reduction in costs related to business intelligence [1] (BI) data integration.

To some extent, recent methods in systems and software architecture have already broken down some of the walls between data silos. Service Oriented Architecture (SOA) methods have improved the way that data is moved between separate databases. However, mapping one data model to another, and physically transforming data from one model to another, remains expensive. Mapping data is a surprisingly hands-on and labor-intensive process.
In general, these labor-intensive processes can be categorized as Extract, Transform, and Load (ETL) activities. ETL activity has become a mainstay of Business Intelligence (BI) tools and data warehousing. In this white paper, Capsenta proposes a solution that renders ETL activities obsolete. To understand where the innovation is happening, one must understand the shift that has occurred. It's a shift that redefines what we mean by data integration.

[1] Bluepatent.com. Business Intelligence (BI) is the set of techniques and tools for the transformation of raw data into meaningful and useful information for business analysis purposes.
IS ETL BECOMING OBSOLETE?

Most organizations that deal with large amounts of data are using methods and approaches that are becoming obsolete. From a business perspective, how often have business leaders identified a new business opportunity only to be told that either the data is not compliant with company standards, or that integration is too costly to be feasible?

In some situations, operations come to a stop because key technical people who are familiar with a legacy database leave the organization. Maintenance and modifications have to be put on hold. Time and resources go into finding new hires to learn the formats of legacy data, or consultants to transform the existing data into supported formats. And yet, addressing all of these problems still does not solve the root problem of deriving sufficient business value from disconnected data sets.

Part of the reason is that typical data sources are complex. Data must be tended to by technical staff. In large data warehouses, technical staff often work in islands of heterogeneous technology. For example, one department's Oracle database (with tables designed 20 years ago) might not play well with another department's Microsoft SQL Server database (whose tables are changing). Data is not easily aggregated. This is part of why so much time and effort goes into Extract, Transform, and Load (ETL) activities.

For many years, the dominant form of data storage in business, commerce and government has been the relational database. Because of that, business intelligence, search functions, and data analytics have required ETL operations. The problem with current ETL methods is that they take time and resources, and include hidden costs that are not apparent at the beginning of a project.
In an article titled "The True Cost of Integration in the World of BI," David Linthicum states: "When it comes to the cost of a BI deployment, it's not the software that will get you; it's the miscellany -- the miscellaneous integration work, in particular." [2]

It is logical to assume that data integration is a significant cost for any BI project. In the implementation phase, estimates put data integration at almost 80% of the cost, compared with 20% for the analytics component [3]. Data integration hassles are legendary, including bringing together all relevant data from various operational systems that were not designed to feed BI systems.

In addition to integration and conversion costs, there are ongoing costs. Linthicum states, "When you look at ongoing costs, though, the roles reverse, making data integration 20 percent of the costs versus reporting and analytics" [4].

Why so expensive? On a commercial scale, data integration is hard. It's complex. It's perhaps one of the most difficult jobs in the world of BI. However, it's also critical. Indeed, the ability to bring in the right data on a timely basis is directly related to the value that the BI system will bring to the business, more so than the types of analytics it's looking to drive. It's the old garbage-in, garbage-out concept.

[2], [3], [4] Linthicum, David, The True Cost of Data Integration, TDWI, Aug. 20, 2013
What about real-time analysis? The need to perform ETL activities means that real-time data analysis is not possible. Typically, new search queries mean that new report formats have to be created manually. In contrast, semantic technologies provide tools for real-time data analysis. That's another reason why data integration is moving toward semantic technologies, and away from ETL activities.

SHIFT TO SEMANTIC TECHNOLOGY

The shift to semantic technology sometimes comes under the umbrella of Big Data or Data Analytics. But neither term, in itself, implies that ETL is eliminated. The true shift is in the way that we treat data: removing the focus from methods of storage and placing it on the business value contained in the data itself. That's what semantic technology does. It adds meaning to data.

This shift has become possible because we have turned a corner in machine processing speeds, network availability, network performance, and the cost of hardware. As a result, computing methods that were previously impractical have become available to the business community. Until recently, these programming methods and algorithms were better suited to an academic environment, where any realistic application would have had to run on a supercomputer. These methods and emerging algorithms are now commercially viable.

By enriching data with semantics we improve:

- SEARCH: Enable search across multiple, heterogeneous data sources
- ANALYTICS: Provide data analytics in real time between previously unmapped data sets
- SPEED TO MARKET: Reduce the time, capital, and human resources associated with data mapping

Semantic technology gets at the root problem that ETL ignores. Coupled with next-generation information architecture, the latest semantic technologies represent a giant leap toward the end of the need for ETL.
By applying certain semantic techniques, an organization can automate computer processing to produce those benefits in search, analytics, and speed to market. Data scientists at universities, governments, and commercial companies around the world are taking advantage of this technology. The technology is moving quickly. New methods are emerging and existing methods are being optimized.

EMERGING SOLUTION

Capsenta has developed technologies based on industry standards and semantic techniques that provide a better approach to data integration. Capsenta uses computational methods and approaches that are not restricted by underlying storage mechanisms, and do not require ETL activity. Capsenta technology can:

- Virtualize data as graphs
- Leverage common vocabularies
- Use automated mapping
- Support federation
This technology and approach has already proven its value in the area of cardiovascular medical devices. Capsenta is powering first-to-market connectivity for every implantable device patient in the United States. Medical device manufacturing is among the most competitive industries in the healthcare domain. Individual companies and organizations do not have an incentive to merge their data. There is little attempt to collaborate in order to centralize data for the purpose of analytics and patient search. In addition, almost 50% of patients with implantable devices know neither the type of their device nor the name of its manufacturer. These issues make it hard to identify which manufacturer's service representative should be contacted during patient triage, which can result in lengthy delays before clinical care is performed. Capsenta is working with industry professionals to enable federated search of device data in emergency rooms. This will enable faster identification of a patient's device, and reduce critical clinical time.

VIRTUALIZE DATA AS A GRAPH

A key part of the solution is to virtualize data as a graph. A graph is a data model that uses nodes and edges instead of tables, columns and rows. One advantage of using a graph is that integrating multiple data sets is easier than it would be if the data were in tables. Integrating multiple, disparate graphs can be done by adding more edges and nodes. In this way, multiple graphs can become one.

By exposing relational data as graphs, we change the hard problem of relational database integration into the relatively simple activity of finding edges between nodes. The tables, columns, and rows in the original database still exist. But with the use of graphs, the data can be represented in a virtual way and integrated with other databases without the need to map foreign keys and normalize data entities.
To do this, Capsenta uses semantic web standards established by the World Wide Web Consortium (W3C). One key standard is the Resource Description Framework (RDF). RDF describes resources, and is used for modeling data as graphs. We apply RDF conventions to data that is in a relational database so that it can be represented as an RDF graph. Another useful open standard is SPARQL, which is used for querying graphs. Capsenta's OntoExplorer search tool uses SPARQL to perform semantic searches.

Capsenta also uses a W3C standard called the Web Ontology Language (OWL). We use OWL to represent complex knowledge about things, groups of things, and the relationships between things within the context of a graph. As the name implies, OWL helps describe ontologies. In this context, an ontology is like a vocabulary: a complex collection of terms.
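The idea of exposing relational rows as a graph, and of joining two graphs by adding edges, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Capsenta's Ultrawrap implementation: the table and column names, the `ex:` prefix, and the `ex:implantedIn` edge are hypothetical, triples are plain tuples rather than a real RDF store, and the `match` helper merely imitates a single SPARQL triple pattern.

```python
def rows_to_triples(table, key, rows):
    """Expose each row as a node and each non-key column value as an edge."""
    triples = set()
    for row in rows:
        subject = f"ex:{table}/{row[key]}"
        triples.add((subject, "rdf:type", f"ex:{table}"))
        for column, value in row.items():
            if column != key and value is not None:
                triples.add((subject, f"ex:{table}#{column}", str(value)))
    return triples


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Two hypothetical tables that live in two separate databases.
patients = [{"id": 1, "name": "Pedro"}, {"id": 2, "name": "Maria"}]
devices = [{"serial": "D-100", "manufacturer": "ExampleCo", "patient_id": 1}]

graph_a = rows_to_triples("patient", "id", patients)
graph_b = rows_to_triples("device", "serial", devices)

# Integration is a set union plus the edges that connect the two graphs.
merged = graph_a | graph_b
for device, _, patient_id in match(graph_b, p="ex:device#patient_id"):
    merged.add((device, "ex:implantedIn", f"ex:patient/{patient_id}"))

# Follow the new edge from the device to the patient's name.
device_edges = match(merged, p="ex:implantedIn")
names = match(merged, s=device_edges[0][2], p="ex:patient#name")
print(names)
```

The source tables are never modified; the triples are a view computed from them, which is the sense in which the data is "virtualized" here.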
LEVERAGE COMMON VOCABULARIES

The semantic approach becomes much more useful if it can leverage common vocabularies and taxonomies. Vocabularies are sets of metadata, terms, and relationships. In the context of semantic standards, vocabularies and ontologies essentially mean the same thing.

Every industry has its own set of standard vocabularies. Many industries, such as the healthcare industry, have mature vocabularies that are shared between companies and organizations. For example, in the healthcare industry there are organizations such as Health Level Seven International (HL7) and the Coalition for ICD-10 that maintain standards and common dictionaries in order to promote interoperability. In the finance industry, there is the Financial Industry Business Ontology (FIBO), an initiative to define common financial terms, definitions and synonyms.

The suite of semantic technologies, including RDF and OWL, allows the creation of rich knowledge models (ontologies) where business logic can be captured and maintained. OWL ontologies are machine-readable. This is important because it means that an application can directly use those ontologies (vocabularies) without the need for traditional data analysis.

For example, let's say that in Database A, the ontology reveals that a person named Pedro is Mexican. In Database B, the ontology reveals that a person named Maria is Cuban. What if you were to ask the question: Who are all the Hispanics in the data set? Suppose that there is a layer of semantics (ontological knowledge) that includes the assertions that Mexican is Hispanic and Cuban is Hispanic. Given those ontological relationships, a semantic search could return a result showing that both Pedro and Maria are Hispanic, even though the word Hispanic does not appear in Database A or in Database B.

USE AUTOMATED MAPPING

Applying semantic technology to graph data models makes automated mapping possible.
This is not the kind of mapping seen in traditional technologies. Using the supervised automation features in Capsenta's Ultrawrap(TM) product, initial discovery and visualization of a source relational database schema and its metadata is almost instantaneous.

Traditional methods of mapping data elements require human judgment. Seemingly simple data elements can require human effort to sort out the logic. For example, consider a simple element such as a name. One data source might store a name as LAST_NAME with a separate element for FIRST_NAME. A different data source might store a name as LAST_FIRST_MIDDLE. And yet another data source might store a name as FIRST_MIDDLE_LAST. Traditional mapping of one element to another would require not only human judgment, but also some method of conversion in order to normalize the data.

Applying semantic technology to graph data makes it possible to connect the name elements from those separate databases automatically, without the need for manual conversion or normalization. Using semantic search in combination with common vocabularies, we can automate the mapping between source schemas and target ontologies. This saves time and effort compared with traditional integrations, which would have required a human to map the elements of one database to another.
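To make the name example concrete, the sketch below shows what the result of such a mapping might look like when three differently shaped sources are connected to one shared vocabulary. Everything beyond the three column layouts is an assumption made for illustration: the target properties `givenName` and `familyName`, the sample records, and the rule that combined name fields are space-separated with at least two parts. The rules are written by hand here; they are not taken from Ultrawrap, which is described above as discovering mappings with supervised automation.

```python
def split_last_first_middle(value):
    # Assumes space-separated parts in the order: last, first, [middle...].
    parts = value.split()
    return {"familyName": parts[0], "givenName": parts[1]}


def split_first_middle_last(value):
    # Assumes space-separated parts in the order: first, [middle...], last.
    parts = value.split()
    return {"givenName": parts[0], "familyName": parts[-1]}


# Hypothetical mapping: source column -> rule producing vocabulary properties.
COLUMN_RULES = {
    "LAST_NAME": lambda value: {"familyName": value},
    "FIRST_NAME": lambda value: {"givenName": value},
    "LAST_FIRST_MIDDLE": split_last_first_middle,
    "FIRST_MIDDLE_LAST": split_first_middle_last,
}


def to_common_vocabulary(record):
    """Translate one source record into the shared vocabulary's properties."""
    person = {}
    for column, value in record.items():
        rule = COLUMN_RULES.get(column)
        if rule is not None:  # columns with no known rule are ignored
            person.update(rule(value))
    return person


# The same (made-up) person as stored by three different sources.
source_a = {"LAST_NAME": "Rivera", "FIRST_NAME": "Maria"}
source_b = {"LAST_FIRST_MIDDLE": "Rivera Maria Elena"}
source_c = {"FIRST_MIDDLE_LAST": "Maria Elena Rivera"}

mapped = [to_common_vocabulary(r) for r in (source_a, source_b, source_c)]
print(mapped)
```

Once every source is expressed in the same two properties, the records can be compared or linked without first rewriting the underlying tables.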
Traditional data mapping can often lead to problems of missing data in search results. For example, consider a search in a clinical database for all patients with diabetes. A search using the term "diabetes" might yield an incomplete list of patients. What if some patients are recorded as having "NIDDM" or as being "diabetic"? What if the intention of the search was to find those patients also? Using semantics, an automated mapping can tie these terms together to yield a search result for all patients associated with any of those diagnoses, without the need for a human to make that determination.

Another advantage is that a computer can make inferences about the relationships between data elements. This kind of inference in search results goes beyond finding synonyms. It finds related data that are not necessarily synonyms. For example, consider a recent case study involving databases of constitutional language. There are data warehouses that contain text from the constitutions of nations all over the world. If one were to search for text that contains the word "religion," the search results might miss instances of text that deal with religious issues. For example, traditional methods of mapping the data would miss phrases such as "church and state." This kind of error of omission is much less likely in a semantic search, where these phrases would have been inferred to be related to each other and would have been found in the search results.

SUPPORT FEDERATION

One of the most significant improvements that comes from using semantic technology to perform search operations is that the data can exist anywhere. Data sources can be in different areas of an organization, in different formats, or even stored within different organizations. The data sources do not have to be centralized in order to support the virtualization, mapping and search operations we have described.
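A minimal sketch of a federated, vocabulary-aware search follows. It assumes two independent in-memory sources standing in for separate databases; the source names, the record layout, and the small vocabulary that relates "diabetes" to "NIDDM" and "diabetic" are hypothetical. The point is only that the query is expanded once and then sent to each source where it lives, with no central copy of the data.

```python
# Hypothetical shared vocabulary: a search term and the terms related to it.
VOCABULARY = {"diabetes": {"diabetes", "niddm", "diabetic"}}


def expand(term):
    """Return the term together with its related terms, lower-cased."""
    key = term.lower()
    return VOCABULARY.get(key, {key})


class Source:
    """An independent data source that answers searches locally."""

    def __init__(self, name, records):
        self.name = name
        self._records = records

    def search(self, terms):
        return [
            (self.name, record["patient"])
            for record in self._records
            if record["diagnosis"].lower() in terms
        ]


def federated_search(term, sources):
    """Expand the term once, query every source in place, merge the hits."""
    terms = expand(term)
    results = []
    for source in sources:
        results.extend(source.search(terms))
    return results


clinic_a = Source("clinic_a", [
    {"patient": "P1", "diagnosis": "diabetes"},
    {"patient": "P2", "diagnosis": "NIDDM"},
    {"patient": "P3", "diagnosis": "asthma"},
])
registry_b = Source("registry_b", [
    {"patient": "P4", "diagnosis": "diabetic"},
])

# Literal matching finds one patient; the expanded search finds three.
literal_hits = clinic_a.search({"diabetes"}) + registry_b.search({"diabetes"})
semantic_hits = federated_search("diabetes", [clinic_a, registry_b])
print(literal_hits)
print(semantic_hits)
```

Each source evaluates the query itself, so in a real deployment the matching could be delegated to that source's own query engine rather than done in application code as it is here.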
As we mentioned earlier, individual companies and organizations do not always have an incentive to merge their data with other companies and organizations. Those barriers between data sources, however, are not barriers to using semantic search technology. Virtualization, mapping, and search can be done across independent databases without the need to centralize the data.

In addition, this federated approach does not impose any performance overhead. A search across independent data sources is no slower than a non-federated search. This is possible because the Capsenta tools leverage the optimization mechanisms that are inherent in each individual database. The Ultrawrap middleware takes advantage of the query optimizers that already exist in relational databases such as Oracle, DB2, and Microsoft SQL Server.

CONCLUSION

The tools and capabilities available to IT data professionals have lagged behind business demands. Semantic technology provides a solution to existing data integration problems and the overhead of ETL activities. The solution comprises several elements. One element is the representation of relational databases as graphs (virtualizing data without changing the underlying database). Another element is connecting the virtualized data with common vocabularies to create useful ontologies. And finally, to be useful in a practical application, the solution must also support automated mapping and federation.
When all these elements come together, ETL is no longer required. Capsenta Inc. has created a solution that does this, called Ultrawrap. The Ultrawrap middleware uses semantic technology to integrate and enrich disparate data sources. OntoExplorer enables next-generation search capability. Together, the combined solution brings all of the above-mentioned elements together to provide a significant advance in data integration and search. Capsenta's solution has already shown value in the healthcare domain. The semantic approach could provide similar benefits in other domains.
ABOUT CAPSENTA

Capsenta was founded in 2011 as a spin-out of the University of Texas at Austin. The Ultrawrap technology is a product of more than 6 years of research. Capsenta increases the value of a company's data by applying knowledge graphs and semantic technologies. Semantics improve search, analysis, and interpretation.

Capsenta's patent-pending technology is the only complete W3C-compliant turnkey solution for making existing SQL and SQL data warehouse infrastructure upward compatible with semantics. The approach assures scalability and robustness. Traditional applications seamlessly coexist with semantically enriched applications. By enriching data with semantics, Capsenta is able to accelerate the provisioning process from months to days. Capsenta's technology starts with standards compliance and is able to integrate the result with any existing database infrastructure.

Capsenta provides Data as a Service (DaaS): database integration services as a one-time cloud service. Capsenta customers benefit from the power of semantics from day one, with low risk.

TARGET MARKET: Life Science, Drug Discovery, Clinical Trials, Personalized Medicine, Health IT

For further guidance on this topic and additional detail, contact us at ultrawrap@capsenta.com. You can learn more about database automation and search at or reach Capsenta directly by calling
5225 FOSSIL RIM RD., AUSTIN TX
capsenta.com
More informationSEMANTIC SOLUTIONS FOR OIL & GAS: ROLES AND RESPONSIBILITIES
SEMANTIC SOLUTIONS FOR OIL & GAS: ROLES AND RESPONSIBILITIES Jeremy Carroll, Ralph Hodgson, {jeremy,ralph}@topquadrant.com This paper is submitted to The W3C Workshop on Semantic Web in Energy Industries
More informationDesigning High-Performance Data Structures for MongoDB
Designing High-Performance Data Structures for MongoDB The NoSQL Data Modeling Imperative Danny Sandwell, Product Marketing, erwin, Inc. Leigh Weston, Product Manager, erwin, Inc. Learn More at erwin.com
More informationwarwick.ac.uk/lib-publications
Original citation: Zhao, Lei, Lim Choi Keung, Sarah Niukyun and Arvanitis, Theodoros N. (2016) A BioPortalbased terminology service for health data interoperability. In: Unifying the Applications and Foundations
More informationMigrate from Netezza Workload Migration
Migrate from Netezza Automated Big Data Open Netezza Source Workload Migration CASE SOLUTION STUDY BRIEF Automated Netezza Workload Migration To achieve greater scalability and tighter integration with
More informationCertification for Meaningful Use Experiences and Observations from the Field June 2011
Certification for Meaningful Use Experiences and Observations from the Field June 2011 Principles for Certification to Support Meaningful Use Certification should promote EHR adoption by giving providers
More informationConverged Infrastructure Matures And Proves Its Value
A Custom Technology Adoption Profile Commissioned By Hewlett-Packard May 2013 Introduction Converged infrastructure (CI) solutions have been widely adopted by a range of enterprises, and they offer significant
More informationCOMPUTER AND INFORMATION SCIENCE JENA DB. Group Abhishek Kumar Harshvardhan Singh Abhisek Mohanty Suhas Tumkur Chandrashekhara
JENA DB Group - 10 Abhishek Kumar Harshvardhan Singh Abhisek Mohanty Suhas Tumkur Chandrashekhara OUTLINE Introduction Data Model Query Language Implementation Features Applications Introduction Open Source
More informationTransforming the Data Center into the Information Center. Jack Domme Chief Executive Officer Hitachi Data Systems
Transforming the Data Center into the Information Center Jack Domme Chief Executive Officer Hitachi Data Systems What Customers Are Saying Budgets are down by as much as 50% My data keeps growing We are
More informationThe Analytic Utility of Anonymized Data
The Analytic Utility of Anonymized Data Data has become a precious, prized asset for healthcare organizations looking to control costs and improve patient care that can capture and action the considerable
More informationThird generation of Data Virtualization
White Paper Third generation of Data Virtualization Write back to the sources An Enterprise Enabler white paper from Stone Bond Technologies Copyright 2014 Stone Bond Technologies, L.P. All rights reserved.
More informationIntroduction to K2View Fabric
Introduction to K2View Fabric 1 Introduction to K2View Fabric Overview In every industry, the amount of data being created and consumed on a daily basis is growing exponentially. Enterprises are struggling
More informationICTR UW Institute of Clinical and Translational Research. i2b2 User Guide. Version 1.0 Updated 9/11/2017
ICTR UW Institute of Clinical and Translational Research i2b2 User Guide Version 1.0 Updated 9/11/2017 Table of Contents Background/Search Criteria... 2 Accessing i2b2... 3 Navigating the Workbench...
More informationCost-Benefit Analysis of Retrospective vs. Prospective Data Standardization
Cost-Benefit Analysis of Retrospective vs. Prospective Data Standardization Vicki Seyfert-Margolis, PhD Senior Advisor, Science Innovation and Policy Food and Drug Administration IOM Sharing Clinical Research
More informationHow to integrate data into Tableau
1 How to integrate data into Tableau a comparison of 3 approaches: ETL, Tableau self-service and WHITE PAPER WHITE PAPER 2 data How to integrate data into Tableau a comparison of 3 es: ETL, Tableau self-service
More informationThe NIH Collaboratory Distributed Research Network: A Privacy Protecting Method for Sharing Research Data Sets
The NIH Collaboratory Distributed Research Network: A Privacy Protecting Method for Sharing Research Data Sets Jeffrey Brown, Lesley Curtis, and Rich Platt June 13, 2014 Previously The NIH Collaboratory:
More informationWEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS
1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,
More informationin collaboration with
in collaboration with Table of Contents 01 Turn Silos of Data into Operational Intelligence page 04 02 Gain a Competitive Advantage with Cisco and Splunk page 06 03 Improve Insight with IT Operations Analytics
More informationTaming Structured And Unstructured Data With SAP HANA Running On VCE Vblock Systems
1 Taming Structured And Unstructured Data With SAP HANA Running On VCE Vblock Systems The Defacto Choice For Convergence 2 ABSTRACT & SPEAKER BIO Dealing with enormous data growth is a key challenge for
More informationInformation Architecture and the Actionable Enterprise Architecture
ApTSi TM Information Architecture Communications >Applied Technology Solutions, Inc.(ApTSi TM ) Applying Technology to Business Problems TM Information Architecture and the Actionable Enterprise Architecture
More informationAccelerate Your Enterprise Private Cloud Initiative
Cisco Cloud Comprehensive, enterprise cloud enablement services help you realize a secure, agile, and highly automated infrastructure-as-a-service (IaaS) environment for cost-effective, rapid IT service
More informationJanuary 16, Re: Request for Comment: Data Access and Data Sharing Policy. Dear Dr. Selby:
Dr. Joe V. Selby, MD, MPH Executive Director Patient-Centered Outcomes Research Institute 1828 L Street, NW, Suite 900 Washington, DC 20036 Submitted electronically at: http://www.pcori.org/webform/data-access-and-data-sharing-policypublic-comment
More informationHybrid IT for SMBs. HPE addressing SMB and channel partner Hybrid IT demands ANALYST ANURAG AGRAWAL REPORT : HPE. October 2018
V REPORT : HPE Hybrid IT for SMBs HPE addressing SMB and channel partner Hybrid IT demands October 2018 ANALYST ANURAG AGRAWAL Data You Can Rely On Analysis You Can Act Upon HPE addressing SMB and partner
More informationThe Next Frontier in Medical Device Security
The Next Frontier in Medical Device Security Session #76, February 21, 2017 Denise Anderson, President, NH-ISAC Dr. Dale Nordenberg, Executive Director, MDISS 1 Speaker Introduction Denise Anderson, MBA
More informationSymantec Data Center Transformation
Symantec Data Center Transformation A holistic framework for IT evolution As enterprises become increasingly dependent on information technology, the complexity, cost, and performance of IT environments
More informationBenefits and Costs of Structured Data. August 11, Secretary Securities and Exchange Commission 100 F Street, NE Washington, DC
August 11, 2015 1211 Avenue of the Americas 19 th Floor New York, NY 10036 Secretary Securities and Exchange Commission 100 F Street, NE Washington, DC 20549-1090 RE: Investment Company Reporting Modernization,
More informationThe Top Five Reasons to Deploy Software-Defined Networks and Network Functions Virtualization
The Top Five Reasons to Deploy Software-Defined Networks and Network Functions Virtualization May 2014 Prepared by: Zeus Kerravala The Top Five Reasons to Deploy Software-Defined Networks and Network Functions
More informationManaging Trust in e-health with Federated Identity Management
ehealth Workshop Konolfingen (CH) Dec 4--5, 2007 Managing Trust in e-health with Federated Identity Management Dr. rer. nat. Hellmuth Broda Distinguished Director and CTO, Global Government Strategy, Sun
More informationData Mining: Approach Towards The Accuracy Using Teradata!
Data Mining: Approach Towards The Accuracy Using Teradata! Shubhangi Pharande Department of MCA NBNSSOCS,Sinhgad Institute Simantini Nalawade Department of MCA NBNSSOCS,Sinhgad Institute Ajay Nalawade
More informationMODERNIZE INFRASTRUCTURE
SOLUTION OVERVIEW MODERNIZE INFRASTRUCTURE Support Digital Evolution in the Multi-Cloud Era Agility and Innovation Are Top of Mind for IT As digital transformation gains momentum, it s making every business
More informationVocabulary Harvesting Using MatchIT. By Andrew W Krause, Chief Technology Officer
July 31, 2006 Vocabulary Harvesting Using MatchIT By Andrew W Krause, Chief Technology Officer Abstract Enterprises and communities require common vocabularies that comprehensively and concisely label/encode,
More informationWhen, Where & Why to Use NoSQL?
When, Where & Why to Use NoSQL? 1 Big data is becoming a big challenge for enterprises. Many organizations have built environments for transactional data with Relational Database Management Systems (RDBMS),
More informationIT Infrastructure for BIM and GIS 3D Data, Semantics, and Workflows
IT Infrastructure for BIM and GIS 3D Data, Semantics, and Workflows Hans Viehmann Product Manager EMEA ORACLE Corporation November 23, 2017 @SpatialHannes Safe Harbor Statement The following is intended
More informationTimeXtender extends beyond data warehouse automation with Discovery Hub
IMPACT REPORT TimeXtender extends beyond data warehouse automation with Discovery Hub MARCH 28 2017 BY MATT ASLETT TimeXtender is best known as a provider of data warehouse automation (DWA) software, but
More informationControlled vocabularies, taxonomies, and thesauruses (and ontologies)
Controlled vocabularies, taxonomies, and thesauruses (and ontologies) When people maintain a vocabulary of terms and sometimes, metadata about these terms they often use different words to refer to this
More informationPartner Presentation Faster and Smarter Data Warehouses with Oracle OLAP 11g
Partner Presentation Faster and Smarter Data Warehouses with Oracle OLAP 11g Vlamis Software Solutions, Inc. Founded in 1992 in Kansas City, Missouri Oracle Partner and reseller since 1995 Specializes
More informationAccelerate your SAS analytics to take the gold
Accelerate your SAS analytics to take the gold A White Paper by Fuzzy Logix Whatever the nature of your business s analytics environment we are sure you are under increasing pressure to deliver more: more
More informationShaping the Cloud for the Healthcare Industry
Shaping the Cloud for the Healthcare Industry Louis Caschera Chief Information Officer CareTech Solutions www.caretech.com > 877.700.8324 Information technology (IT) is used by healthcare providers as
More informationMAPR DATA GOVERNANCE WITHOUT COMPROMISE
MAPR TECHNOLOGIES, INC. WHITE PAPER JANUARY 2018 MAPR DATA GOVERNANCE TABLE OF CONTENTS EXECUTIVE SUMMARY 3 BACKGROUND 4 MAPR DATA GOVERNANCE 5 CONCLUSION 7 EXECUTIVE SUMMARY The MapR DataOps Governance
More informationNatural Language Processing with PoolParty
Natural Language Processing with PoolParty Table of Content Introduction to PoolParty 2 Resolving Language Problems 4 Key Features 5 Entity Extraction and Term Extraction 5 Shadow Concepts 6 Word Sense
More informationPaper. Delivering Strong Security in a Hyperconverged Data Center Environment
Paper Delivering Strong Security in a Hyperconverged Data Center Environment Introduction A new trend is emerging in data center technology that could dramatically change the way enterprises manage and
More information21ST century enterprise. HCL Technologies Presents. Roadmap for Data Center Transformation
21ST century enterprise HCL Technologies Presents Roadmap for Data Center Transformation june 2016 21st Century Impact on Data Centers The rising wave of digitalization has changed the way IT impacts business.
More informationTECHNOLOGY BRIEF: CA ERWIN DATA PROFILER. Combining Data Profiling and Data Modeling for Better Data Quality
TECHNOLOGY BRIEF: CA ERWIN DATA PROFILER Combining Data Profiling and Data Modeling for Better Data Quality Table of Contents Executive Summary SECTION 1: CHALLENGE 2 Reducing the Cost and Risk of Data
More informationGuidance for Exchange and Medicaid Information Technology (IT) Systems
Department of Health and Human Services Office of Consumer Information and Insurance Oversight Centers for Medicare & Medicaid Services Guidance for Exchange and Medicaid Information Technology (IT) Systems
More informationFor Healthcare Providers: How All-Flash Storage in EHR and VDI Can Lower Costs and Improve Quality of Care
For Healthcare Providers: How All-Flash Storage in EHR and VDI Can Lower Costs and Improve Quality of Care WHITE PAPER Table of Contents The Benefits of Flash for EHR...2 The Benefits of Flash for VDI...3
More informationSemantic Web and ehealth
White Paper Series Semantic Web and ehealth January 2013 SEMANTIC IDENTITY OBSERVE REASON IMAGINE Publication Details Author: Renato Iannella Email: ri@semanticidentity.com Date: January 2013 ISBN: 1 74064
More informationUrika: Enabling Real-Time Discovery in Big Data
Urika: Enabling Real-Time Discovery in Big Data Discovery is the process of gaining valuable insights into the world around us by recognizing previously unknown relationships between occurrences, objects
More informationAn Introduction to Big Data Formats
Introduction to Big Data Formats 1 An Introduction to Big Data Formats Understanding Avro, Parquet, and ORC WHITE PAPER Introduction to Big Data Formats 2 TABLE OF TABLE OF CONTENTS CONTENTS INTRODUCTION
More informationSemantic Annotation, Search and Analysis
Semantic Annotation, Search and Analysis Borislav Popov, Ontotext Ontology A machine readable conceptual model a common vocabulary for sharing information machine-interpretable definitions of concepts in
More informationPerform scalable data exchange using InfoSphere DataStage DB2 Connector
Perform scalable data exchange using InfoSphere DataStage Angelia Song (azsong@us.ibm.com) Technical Consultant IBM 13 August 2015 Brian Caufield (bcaufiel@us.ibm.com) Software Architect IBM Fan Ding (fding@us.ibm.com)
More informationATA DRIVEN GLOBAL VISION CLOUD PLATFORM STRATEG N POWERFUL RELEVANT PERFORMANCE SOLUTION CLO IRTUAL BIG DATA SOLUTION ROI FLEXIBLE DATA DRIVEN V
ATA DRIVEN GLOBAL VISION CLOUD PLATFORM STRATEG N POWERFUL RELEVANT PERFORMANCE SOLUTION CLO IRTUAL BIG DATA SOLUTION ROI FLEXIBLE DATA DRIVEN V WHITE PAPER Create the Data Center of the Future Accelerate
More informationPERSPECTIVE. Data Virtualization A Potential Antidote for Big Data Growing Pains. Abstract
PERSPECTIVE Data Virtualization A Potential Antidote for Big Data Growing Pains Abstract Enterprises are already facing challenges around data consolidation, heterogeneity, quality, and value. Now they
More informationHybrid Data Platform
UniConnect-Powered Data Aggregation Across Enterprise Data Warehouses and Big Data Storage Platforms A Percipient Technology White Paper Author: Ai Meun Lim Chief Product Officer Updated Aug 2017 2017,
More information