IBM Cognitive Systems
Cognitive Infrastructure for the digital business transformation
July 2017
Dilek Sezgün, Cognitive Solution Infrastructure Sales Leader
dilek@de.ibm.com
0160/90741619
Pain Points of the Digital Business Transformation
Data is the new competitive advantage
Cloud is the new approach to agility
Open collaboration is the new path to innovation
The digital economy is driven by big data. To deal with it, companies require more agile, flexible, and scalable tools. Open source applications and databases are changing the enterprise.
Open Source Software enabling digital transformation

"Open source undoubtedly speeds the digital transformation for most companies."
Kelly Stirman, VP, MongoDB
http://www.cmswire.com/digital-experience/how-open-source-guides-digital-transformation/

"By 2018, more than 70% of new in-house applications will be developed on an OSDBMS, and 50% of existing commercial RDBMS instances will have been converted or will be in process... we now believe that the cost of managing OSDBMSs and the availability of skills are now close to parity with those of the commercial DBMS offerings. We therefore believe there are clear savings in TCO for the OSDBMS. With software costs skyrocketing, this has become a major focus of IT management and is a major impact of the OSDBMS."
Source: The State of Open-Source RDBMSs, 2015, Gartner, Donald Feinberg, Merv Adrian, April 2015
Open Source everywhere
http://www.datacenter-insider.de/open-source-fuer-die-oeffentliche-verwaltung-a-594726/?cmp=nl-86&uuid=8497CF90-0DA1-4B09-94A8-B8E1B439E3B1
What's going on in the market?
Database Growth Rates

Rank  DBMS              Growth (20 months)
1     Oracle            -5%
2     MySQL             2%
3     MS SQL Server     -10%
4     MongoDB           172%
5     PostgreSQL        40%
6     DB2               11%
7     Microsoft Access  -26%
8     Cassandra         87%
9     SQLite            19%

Source: http://db-engines.com/en/ranking
The Modern Data Platform

New data sources: Mobile, Data, Social, Services, Sensors, Location

Your value:
Lower cost, greater scale, speed
Open source, DevOps model
Reduce vendor lock-in
Born on / designed for the cloud

New Relational Open Source SQL: MySQL compatible, PostgreSQL compatible
New Data Models: NoSQL (Document, Column, Key Value, Graph); Hadoop/Spark (Unstructured Data / Text)
Conventional Data Platform: Systems of Record (RDBMS, OLTP, ACID, Structured Data, Transactions); Systems of Insight (Data Warehouse & Marts, Structured Data)

Note: some new data companies shown are not based on open source.
OSDB Scenarios
Opportunity / Situation / Probable OSDB Candidates
Short-term (0-6 months): Creating (implementing) a new application and database
Medium-term (6-12 months): Migration of existing OSDB to Power
Long-term (12-18+ months): Migration of existing relational databases and applications to an open source DB
IBM POWER8: Breakthrough performance for YOUR data
POWER8 compared with x86:
4X threads per core* (POWER8 SMT8 vs. x86 Hyperthread; parallel processing)
4X memory bandwidth* (data flow)
4X more cache*
5X faster
POWER8 + OpenPOWER: optimized for a broad range of database, big data, and analytics workloads
Open Source Myth and Infrastructure
Myth: x86 is the best platform for open source databases.
Myth buster: Open source databases deliver 1.8-3X+ greater value on POWER8 than on x86, which matters to anyone who wants to outperform the competition.
POWER8 Price-Performance Guarantee*

MongoDB: IBM Power Systems guarantees the Power S822LC for Big Data system built with POWER8 delivers at least a 2X price-performance advantage vs. x86 based servers when running a customer application/workload based on MongoDB.

EnterpriseDB: IBM Power Systems guarantees the S822LC for Big Data system built with POWER8 delivers at least a 1.8X price-performance advantage versus x86 based servers when running a virtualized customer application/workload based on EnterpriseDB Postgres 9.5.

Hortonworks: IBM Power Systems guarantees the Power S822LC for Big Data system built with POWER8 delivers at least a 3X price-performance advantage vs. x86 based results when running a customer application/workload with Tez/Hive LLAP on Hortonworks HDP.

* For more information, see the backup slides.
Build your own server with Open Source Databases
Easy to use
All open source databases and information included
Design a system from the ground up around the databases
https://www-03.ibm.com/systems/uk/power/hardware/linux-lc-solutions.html
IBM Systems
Open Source and AI on Cognitive Infrastructure
Data
Power S822LC for HPC (Minsky) with open source deep learning frameworks
Machine learning on System z
Power LC S812, S821, S822
Power LC 822 for Big Data
IBM Storage and Software Defined Storage
IBM Spectrum Computing
ESS, Disk, Deep Flash, Scale
Products, Services
Thank You!
Open Source Databases: Classification and Use Cases

MongoDB
Classification: NoSQL document store
Description: The most widely used open source NoSQL document database. Changes the data model from relational to document-oriented. Flexible schema that evolves with apps.
Optimized for: Document model, document stores, semi-structured or unstructured data.
Use cases / data types: Single view of customer records, enterprise content management, catalogs, personalization.

Redis
Classification: NoSQL in-memory key-value store
Description: An open source in-memory NoSQL database that retrieves data based on a key value. Most popular key-value stores are also in-memory, often working as a caching engine with other more permanent data stores.
Optimized for: Data queues, strings, lists, counts, caching, statistics, text, session IDs, pictures, videos.
Use cases / data types: Live in-memory cache, data queues, user session data, shopping cart data.

Cassandra
Classification: NoSQL wide column store
Description: Apache Cassandra is a free and open source distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients.
Optimized for: NoSQL environments that need very high performance and scalability, very high data volumes.
Use cases / data types: Messaging, fraud detection, Internet of Things sensor data, log data, telco call detail records.

Neo4j
Classification: NoSQL graph store
Description: Neo4j is an open source, NoSQL, enterprise grade, highly scalable graph database. It enables you to establish the relationships within your data and across multiple data sources.
Optimized for: Data stored as edges, nodes, or attributes (graphs).
Use cases / data types: Fraud detection, social network analysis, location-aware apps, master data management, machine learning.

Postgres (EnterpriseDB)
Classification: Relational database
Description: Pre-defined data model, strong consistency. Wide variety of transactional work at lower TCO, relational/structured queries. Deep database compatibility with Oracle plus familiar Oracle tools like EDB*Plus and EDB*Loader. Note: the Oracle extensions are closed source under a commercial license.
Optimized for: Wide variety of transactional work at lower TCO, from relational/structured queries to object store and retrieval.
Use cases / data types: Oracle RDBMS migrations and takeouts.

PostgresPure (Splendid Data)
Classification: Relational database
Description: Matches the functionality of Oracle. Important note: this is not the same as being Oracle compatible. 100% open source; a native migration to Postgres with no Oracle traces left behind.
Optimized for: Wide variety of transactional work at lower TCO, from relational/structured queries to object store and retrieval.
Use cases / data types: Oracle RDBMS migrations and takeouts.

MariaDB
Classification: Relational database
Description: Used for processing large volumes of data, e.g. as seen at web portals. MariaDB was created by the founders of MySQL (formerly SkySQL) as a fork of MySQL and is growing partly by moving MySQL customers to MariaDB. Example POWER8 solution with MariaDB: Turbo LAMP.
Optimized for: Lower cost transactional SQL based queries and updates.
Use cases / data types: Migrations from Oracle MySQL, Turbo LAMP stack.
Power Systems
MongoDB running on POWER8
Price-Performance Guarantee

IBM Power Systems guarantees the Power S822LC for Big Data system built with POWER8 delivers at least a 2X price-performance advantage vs. x86 based servers when running a customer application/workload based on MongoDB.

2X price-performance means that the customer's documented throughput performance on the S822LC POWER8 divided by the price of the system will be at least 2 times higher than the customer's documented throughput performance on the x86 based system divided by the price of the comparable x86 system.

Example: If transactions per second on the S822LC are 20,000 and 10,000 on the x86 based system, while the price of the S822LC is $10,000 and the price of the x86 based system is $10,000, then the throughput performance per price would be exactly 2 times higher and the guarantee would be met.

The IBM Power S822LC for Big Data server (20-core/2.92 GHz, 128 GB memory, 4 TB SATA storage) must be purchased from IBM or an authorized IBM Business Partner prior to June 30, 2017. The guarantee period is valid for three (3) months from the date of purchase. The x86 based systems must be comparably configured branded servers from Cisco, Dell, HP, or Lenovo, and the client is responsible for all MongoDB licenses.

2X throughput performance per price means that the customer's documented throughput performance on the S822LC POWER8 system, based on either queries, operations, or transactions per second, divided by the price of such system will be at least 2 times higher than the customer's same documented throughput performance on the x86 based system divided by the price of such comparable x86 system.

Remediation: IBM will provide additional performance optimization and tuning services consistent with IBM Best Practices, at no charge. If unable to reach the guaranteed level of price-performance, IBM will provide additional equally configured systems to those already purchased to reach the guaranteed level of price-performance.

Notes:
1. Client's POWER8 Machine and the x86 Machine must be running at similar utilization rates.
2. Client's POWER8 Machine's system performance cannot be constrained by the I/O subsystem. Specifically, the I/O subsystem on the POWER8 Machine must achieve I/O bandwidth and operations per second greater than or equal to those of the x86 Machine.
3. Client's POWER8 Machine's physical memory must be the same as or greater than the physical memory on the x86 Machine.
4. Client is responsible for demonstrating a comparable, real-world representative workload between the POWER8 Machine and the x86 Machine through the use of the IBM provided tools and comparable tools on x86 systems.
5. The 2X guarantee is based on list price for the x86 server (Dell, Cisco, HP, or Lenovo) and for the IBM S822LC for Big Data.

2017 2016 IBM Corporation
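The MongoDB guarantee reduces to comparing two throughput-per-price values. The following is a minimal Python sketch of that arithmetic, assuming throughput and price have already been measured and documented by the customer. The function names are illustrative and not part of any IBM tool; exact fractions are used only to avoid floating-point rounding at the threshold.

```python
from fractions import Fraction


def throughput_per_price(throughput, price):
    """Documented throughput (e.g. transactions/s) divided by system price."""
    if price <= 0:
        raise ValueError("price must be positive")
    return Fraction(throughput) / Fraction(price)


def price_performance_ratio(power_tput, power_price, x86_tput, x86_price):
    """POWER8 throughput-per-price divided by x86 throughput-per-price."""
    return (throughput_per_price(power_tput, power_price)
            / throughput_per_price(x86_tput, x86_price))


def guarantee_met(ratio, required=2):
    """True when the ratio reaches the guaranteed factor (2X for MongoDB)."""
    return ratio >= required


# Figures from the slide's example: 20,000 vs. 10,000 transactions per
# second, with both systems priced at $10,000.
ratio = price_performance_ratio(20_000, 10_000, 10_000, 10_000)
print(float(ratio), guarantee_met(ratio))  # 2.0 True
```

With equal prices the ratio is simply the throughput ratio; a more expensive POWER8 system would need proportionally more throughput to reach the same factor.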
Power Systems
EnterpriseDB on Power Systems
Price-Performance Guarantee

IBM Power Systems guarantees the S822LC for Big Data system built with POWER8 delivers at least a 1.8X price-performance advantage versus x86 based servers when running a virtualized customer application/workload based on EnterpriseDB Postgres 9.5.

1.8X price-performance means that the customer's documented throughput performance on the S822LC POWER8 divided by the sum of the price of the system and associated EnterpriseDB licenses will be at least 1.8 times that of the customer's documented throughput performance on the x86 based system divided by the sum of the price of the comparable x86 system and associated EnterpriseDB licenses.

Example: If transactions per second on the S822LC are 18,000 and 10,000 on the x86 based system, while the price of the S822LC and associated EnterpriseDB licenses is $10,000 and the price of the x86 based system and associated EnterpriseDB licenses is $10,000, then the throughput performance per price would be exactly 1.8 times higher and the guarantee would be met.

The IBM Power S822LC for Big Data server (20-core/2.92 GHz, 256 GB memory, 4 TB SATA storage) must be purchased from IBM or an authorized IBM Business Partner prior to June 30, 2017. The guarantee period is valid for three (3) months from the date of purchase. The x86 based systems must be comparably configured branded servers from Cisco, Dell, or HP, and the client is responsible for all EnterpriseDB licenses.

Remediation: IBM will provide additional performance optimization and tuning services consistent with IBM Best Practices, at no charge. If unable to reach the guaranteed level of price-performance, IBM will provide additional equally configured systems to those already purchased to reach the guaranteed level of price-performance.

Notes:
1. Client's POWER8 Machine and the x86 Machine must be running at similar utilization rates. The Eligible Machine and the Compared Machine must be partitioned with at least 4 equal sized partitions.
2. Client's POWER8 Machine's system performance cannot be constrained by the I/O subsystem. Specifically, the I/O subsystem on the POWER8 Machine must achieve I/O bandwidth and operations per second greater than or equal to those of the x86 Machine.
3. Client's POWER8 Machine's physical memory must be the same as or greater than the physical memory on the x86 Machine.
4. Client is responsible for demonstrating a comparable, real-world representative workload between the POWER8 Machine and the x86 Machine through the use of the IBM provided tools and comparable tools on x86 systems.
5. The 1.8X guarantee is based on list price for the x86 based server (Dell, Cisco, or HP) and list price for the IBM S822LC for Big Data.
6. EnterpriseDB Postgres Advanced Server 9.5 licenses are priced at $1750 per core (EDB 9.5): http://www.enterprisedb.com/products-services-training/subscriptions-power
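Unlike the MongoDB guarantee, the EnterpriseDB calculation adds license cost to the system price before dividing. A minimal Python sketch under stated assumptions: the helper names are hypothetical, the $1750 per-core rate comes from Note 6, and whether that same rate applies to the x86 system is not stated on the slide, so the totals are passed in explicitly.

```python
from fractions import Fraction

EDB_LICENSE_PER_CORE = 1750          # USD per core, from Note 6
REQUIRED_RATIO = Fraction("1.8")     # guaranteed factor for EnterpriseDB


def license_cost(cores, per_core=EDB_LICENSE_PER_CORE):
    """EnterpriseDB license cost for a given number of licensed cores."""
    return cores * per_core


def edb_ratio(power_tput, power_total, x86_tput, x86_total):
    """Throughput per (system price + licenses), POWER8 relative to x86."""
    power = Fraction(power_tput) / Fraction(power_total)
    x86 = Fraction(x86_tput) / Fraction(x86_total)
    return power / x86


def meets_edb_guarantee(power_tput, power_total, x86_tput, x86_total):
    """True when the ratio is at least 1.8."""
    return edb_ratio(power_tput, power_total, x86_tput, x86_total) >= REQUIRED_RATIO


# License cost for the 20-core S822LC configuration named on the slide.
print(license_cost(20))  # 35000

# Slide example: 18,000 vs. 10,000 transactions per second, with system
# plus licenses totalling $10,000 on each side.
print(meets_edb_guarantee(18_000, 10_000, 10_000, 10_000))  # True
```

Because licenses are priced per core, the core counts of the two systems affect the denominator as well as the throughput.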
Power Systems
Hortonworks HDP running on POWER8
Price-Performance Guarantee

IBM Power Systems guarantees the Power S822LC for Big Data system built with POWER8 delivers at least a 3X price-performance advantage vs. x86 based results when running a customer application/workload with Tez/Hive LLAP on Hortonworks HDP under the conditions noted below. A Worker Node is a server carrying out the HDP query functions, with one Worker Node per server.

3X price-performance means that the customer's documented throughput performance on the cluster of S822LC for Big Data Worker Nodes divided by the price of the cluster of Worker Nodes will be at least 3 times higher than the customer's documented throughput performance on the cluster of x86 based Worker Nodes divided by the price of the cluster of x86 Worker Nodes.

Example: If queries per second on the cluster of S822LC Worker Nodes are 30,000 and 10,000 on the cluster of x86 based Worker Nodes, while the price of the S822LC Worker Node cluster is $10,000 and the price of the x86 based Worker Node cluster is $10,000, then the throughput performance per price would be exactly 3 times higher and the guarantee would be met.

The IBM Power S822LC for Big Data servers (22-core/2.89 GHz) used as Worker Nodes must be purchased from IBM or an authorized IBM Business Partner prior to September 30, 2017. The guarantee period is valid for three (3) months from the date of purchase. The x86-based Worker Nodes must be comparably configured branded servers from Cisco, Dell, HP, or Lenovo, and the client is responsible for all Hortonworks licenses.

3X throughput performance per price means that the customer's documented throughput performance on the cluster of Power S822LC for BD Worker Nodes, based on either queries, operations, or transactions per second, divided by the price of the cluster of Worker Nodes will be at least 3 times higher than the customer's same documented throughput performance on the cluster of x86 Worker Nodes divided by the price of said cluster of x86 Worker Nodes.

Remediation: IBM will provide additional performance optimization and tuning services consistent with IBM Best Practices, at no charge. If unable to reach the guaranteed level of price-performance, IBM will provide additional equally configured Worker Nodes to those already purchased to reach the guaranteed level of price-performance.

Notes:
1. Client's Power S822LC for BD Worker Nodes and the x86 Worker Nodes must be running at similar utilization rates of at least 50%, must use the same software stack as described in Note 4, and must be configured similarly.
2. Client's Power S822LC for BD performance cannot be constrained by the I/O subsystem. Specifically, the I/O subsystem on the Power S822LC for BD Worker Node must achieve I/O bandwidth and operations per second greater than or equal to those of the x86 Worker Node.
3. Client's Power S822LC for BD Worker Node's physical memory must be the same as or greater than the physical memory on the x86 Worker Node.
4. Applicable software stack is Tez/Hive LLAP on HDP 2.6 or later for both the Power S822LC and x86-based Worker Nodes.
5. Client is responsible for demonstrating a comparable, real-world representative workload between the Power S822LC for BD Worker Node and the x86 Worker Node through the use of the IBM provided tools and comparable tools on x86 systems.
6. The 3X guarantee is based on list price for x86 servers from Dell, Cisco, HP, or Lenovo based on E5-2600 v4 or earlier processor technology and for the IBM S822LC for Big Data.

POP04058USEN-01
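Notes 1 to 4 of the Hortonworks guarantee act as preconditions on the comparison itself. The sketch below shows one way to check them before computing any ratio. It is an illustration only: the `WorkerNode` fields, the example measurements, and the 10-point tolerance used to interpret "similar utilization" are assumptions, since the slide does not define how similarity is measured.

```python
from dataclasses import dataclass


@dataclass
class WorkerNode:
    utilization: float     # fraction between 0 and 1
    io_bandwidth: float    # any unit, as long as both nodes use the same one
    io_ops_per_sec: float
    memory_gb: float
    hdp_version: tuple     # e.g. (2, 6)


def unmet_conditions(power, x86, similar_tolerance=0.10):
    """Return the guarantee notes that this comparison would violate."""
    problems = []
    if power.utilization < 0.5 or x86.utilization < 0.5:
        problems.append("Note 1: utilization below 50%")
    # Assumption: "similar" is read as within similar_tolerance of each other.
    if abs(power.utilization - x86.utilization) > similar_tolerance:
        problems.append("Note 1: utilization rates not similar")
    if (power.io_bandwidth < x86.io_bandwidth
            or power.io_ops_per_sec < x86.io_ops_per_sec):
        problems.append("Note 2: POWER8 I/O below x86")
    if power.memory_gb < x86.memory_gb:
        problems.append("Note 3: POWER8 memory below x86")
    if power.hdp_version < (2, 6) or x86.hdp_version < (2, 6):
        problems.append("Note 4: HDP older than 2.6")
    return problems


# Hypothetical measurements, for illustration only.
power_node = WorkerNode(0.70, 10.0, 50_000, 256, (2, 6))
x86_node = WorkerNode(0.65, 8.0, 40_000, 256, (2, 6))
print(unmet_conditions(power_node, x86_node))  # []
```

An empty list means the comparison satisfies the listed preconditions; the price-performance ratio still has to be computed separately from documented throughput and list prices.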