What's New in DataStax Enterprise 3.1? A Guide for Developers, Architects and IT Managers. White Paper by DataStax Corporation, November 2013
Table of Contents

Abstract
Introduction
What's New in DataStax Enterprise Edition 3.1?
Performance Enhancements
  Improved Native Memory Management
  Faster Node Bootup/Startup
  Murmur3Partitioner
  Query Profiling/Tracing
  New Compression Option
  Miscellaneous Cassandra Performance Enhancements
Management Enhancements
  Virtual Nodes
  Parallel Leveled Compaction
  Improved JBOD Functionality
Developer Enhancements
  Collections
  Atomic Batches
  Native/Binary CQL Transport
  Concurrent Schema Changes
  Composite Column Support for Hive and Solr
  Other CQL Enhancements
Enterprise Search Improvements
Updates to Security
Improvements to Visual Enterprise Manageability
Why DataStax Enterprise?
Use Cases Handled by DataStax Enterprise
Conclusion
About DataStax
Abstract

DataStax Enterprise Edition is an enterprise NoSQL platform, built with the C* Seal of Approval: a production-certified version of Apache Cassandra. Architected to manage real-time, batch analytics, and enterprise search data all in the same database cluster, DataStax Enterprise 3.1 features numerous components and upgrades that make management and development easier than ever before, while delivering unmatched performance. DataStax Enterprise 3.1 also continues to provide the most comprehensive security feature set of any NoSQL solution provider. This paper details the enhancements you'll find in DataStax Enterprise Edition 3.1.

Introduction

DataStax Enterprise Edition is an enterprise NoSQL platform, built with the C* Seal of Approval: a production-certified version of Apache Cassandra. It is architected to manage real-time, batch analytics, and enterprise search data all in the same database cluster. DataStax Enterprise has three components:

1. DataStax Enterprise Server, which is an enterprise-class NoSQL data management platform that uses Apache Cassandra for real-time data management, and provides easy methods for running analytics and enterprise search operations on Cassandra data.
2. OpsCenter Enterprise, which is a visual, browser-based management tool that allows administrators to manage all database clusters, whether they are on premise, across multiple data centers, or in the cloud, from a single interface.
3. Expert support, which is delivered by the NoSQL professionals at DataStax and includes around-the-clock coverage.

This paper discusses the enhancements included in version 3.1 of DataStax Enterprise Edition.

What's New in DataStax Enterprise Edition 3.1?

DataStax Enterprise Edition 3.1 includes a number of improvements in the following areas:

- Performance: new additions provide for faster query and DML response times, more efficient data compression, and better insight into performance bottlenecks.
- Management: altering an existing database cluster (adding, removing nodes, etc.) is now much easier in DataStax Enterprise 3.1, with the end result being even less time needed for cluster management and maintenance.
- Developer Enablement: many new Cassandra and Solr features are included in DataStax Enterprise 3.1, giving developers more flexibility and capabilities for developing online applications plus reducing the amount of manual code that needs to be written.

The following sections explain these feature additions in more detail.
Performance Enhancements

DataStax Enterprise 3.1 contains a number of enhancements that improve performance and scale. For many use cases, up to 10x the amount of data per Cassandra node can now be managed with the same levels of performance as prior versions. Such capabilities can reduce hardware costs and personnel overhead for large scale-out deployments. The following features contribute to 3.1's enhanced performance and scale.

Improved Native Memory Management

DataStax Enterprise 3.1 uses native memory rather than the Java heap for Cassandra bloom filters, compression metadata, and the partition summary. This allows Cassandra to handle much more data per node than in the past, and reduces garbage collection operations.

Faster Node Bootup/Startup

Each node in a cluster starts faster in 3.1, with internal tests performed at DataStax showing up to 80% less time needed to start a DataStax Enterprise node. The startup reductions were realized through more efficient sampling and loading of SSTable indexes into memory caches.

Murmur3Partitioner

Version 3.1 introduces a new Cassandra partitioner: the Murmur3Partitioner, based on the Murmur3 hash. The Murmur3 hash is 3x-5x faster than the MD5 hash used in earlier versions of Cassandra; this translates into performance gains of over 10% for index-heavy workloads.

Query Profiling/Tracing

A new Cassandra performance diagnostic utility is available in 3.1 to help understand, diagnose, and troubleshoot CQL statements sent to a Cassandra cluster. Users can interrogate individual CQL statements in an ad-hoc manner, or perform a system-wide collection of all queries/commands that are sent to a cluster.

New Compression Option

LZ4 compression is now included as a compression option. LZ4 provides about a 50% performance improvement over the current default Snappy compression.
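As a rough illustration of the LZ4 compression option described above, the sketch below renders the kind of CQL statement that switches a table's SSTable compressor. The keyspace and table names are hypothetical, and the `sstable_compression` option key is an assumption based on the CQL3 syntax of the Cassandra 1.2 generation; check the documentation for the release in use before relying on it.

```python
def compression_cql(keyspace, table, compressor="LZ4Compressor"):
    """Render a CQL3 ALTER TABLE statement that selects an SSTable compressor.

    The keyspace and table names are placeholders supplied by the caller; the
    'sstable_compression' key is assumed from Cassandra 1.2-era CQL3.
    """
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH compression = {{ 'sstable_compression' : '{compressor}' }};"
    )


if __name__ == "__main__":
    # Hypothetical keyspace "demo" and table "users".
    print(compression_cql("demo", "users"))
    # The same helper can render a statement that keeps Snappy.
    print(compression_cql("demo", "users", compressor="SnappyCompressor"))
```

The statement text would then be run through cqlsh or a client driver; the helper itself only builds the string and does not contact a cluster.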
Miscellaneous Cassandra Performance Enhancements

Other performance enhancements include:

- A new approach to index maintenance, improving the speed at which indexes are updated.
- More efficient and faster streaming of data during bootstrap or repair operations.
- Faster replica recovery via a new concurrent hint delivery mechanism.
- The compaction_throughput_mb_per_sec setting now works better with large partitions. In prior releases, Cassandra only checked the compaction throughput between partitions, so large partitions could still cause spikes of I/O demand.
- The removal of cell-name bloom filters results in faster queries against large partitions, since they are no longer part of the partition header.
- The enablement of a thread-local allocation setting results in a 15% gain in read performance.
Management Enhancements

Database clusters running DataStax Enterprise 3.1 are now easier to manage due to the following management enhancements.

Virtual Nodes

Virtual nodes, or vnodes, change the previous Cassandra paradigm of using one token or range per node to many per node, which makes a cluster much easier to manage and grow. Vnodes provide the following core benefits:

- Rather than just one or a couple of nodes participating in bootstrapping new nodes, all nodes participate in the operation, thus parallelizing the task. End result: much faster performance for node addition operations.
- Vnodes automatically maintain the data distribution and balance of a cluster, so there is no need to perform any rebalance operation after a cluster has been modified.

Note that virtual nodes may be combined with traditional token-assigned nodes in a DataStax Enterprise cluster. For example, Cassandra nodes may use virtual nodes, whereas Solr nodes may use traditional token assignments in the same database cluster.

Parallel Leveled Compaction

Cassandra parallel leveled compaction provides more efficient and faster compaction operations, especially for deployments that target SSD hardware. Whereas the general idea for compaction processes is to mitigate impact on the overall operation of nodes (which typically results in longer compaction times but less resource-intensive operations), SSD implementations lend themselves to speeding up compaction tasks, which is what parallel leveled compaction does.

Improved JBOD Functionality

In prior versions, a single disk going down in a JBOD (just a bunch of disks) configuration had the potential to make an entire node unavailable for I/O operations. A new disk_failure_policy configuration setting allows architects to choose from a number of new policies that deal with disk failure, so this doesn't have to be the case.
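The two vnode benefits listed under Virtual Nodes can be illustrated with a toy token ring. The sketch below is a simplified model and not Cassandra's implementation: it ignores replication, uses random tokens in an arbitrary 64-bit space, and assumes 256 tokens per node purely for illustration. All function names are hypothetical.

```python
import bisect
import random
from collections import Counter

RING_SIZE = 2**64  # size of the toy token space (an assumption for this sketch)


def build_ring(nodes, tokens_per_node, rng):
    """Assign random tokens to each node and return sorted (token, node) pairs."""
    ring = [(rng.randrange(RING_SIZE), node)
            for node in nodes
            for _ in range(tokens_per_node)]
    ring.sort()
    return ring


def ownership(ring):
    """Return the fraction of the token space each node owns.

    Each token owns the range from the previous token (exclusive) up to itself.
    """
    if len(ring) == 1:
        return {ring[0][1]: 1.0}
    owned = Counter()
    for i, (token, node) in enumerate(ring):
        previous = ring[i - 1][0]  # index -1 wraps around the ring
        owned[node] += (token - previous) % RING_SIZE
    return {node: span / RING_SIZE for node, span in owned.items()}


def donors(ring, new_tokens):
    """Return the existing nodes that would hand over data for the new tokens."""
    tokens = [token for token, _ in ring]
    result = set()
    for token in new_tokens:
        # The first existing token at or after the new one currently owns it.
        index = bisect.bisect_left(tokens, token) % len(ring)
        result.add(ring[index][1])
    return result


if __name__ == "__main__":
    rng = random.Random(42)
    nodes = [f"node{i}" for i in range(6)]

    single = build_ring(nodes, 1, rng)    # one token per node
    vnodes = build_ring(nodes, 256, rng)  # many tokens per node

    one_new_token = [rng.randrange(RING_SIZE)]
    many_new_tokens = [rng.randrange(RING_SIZE) for _ in range(256)]

    print("donors, one token per node:", sorted(donors(single, one_new_token)))
    print("donors, 256 tokens per node:", sorted(donors(vnodes, many_new_tokens)))
    print("ownership with vnodes:", ownership(vnodes))
```

With one token per node, a joining node takes over part of a single existing range, so one neighbor supplies the data. With many tokens per node, the new node's tokens fall into ranges spread across the whole cluster, which is why bootstrapping is parallelized and ownership stays close to even without a manual rebalance.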
Developer Enhancements

Version 3.1 provides a number of new features that help developers create applications with DataStax Enterprise.

Collections

DataStax Enterprise 3.1 includes a new mechanism for storing Cassandra data called collections. The general idea behind collections is to provide easier methods for inserting and manipulating data that consists of multiple items that a user wants to store in a single column; for example, multiple addresses for a single employee. There are three different types of collections: (1) sets; (2) lists; (3) maps.

Atomic Batches

Prior versions supported Cassandra batch operations, which allowed users to group related updates into a single statement. If some of the replicas for the batch failed mid-operation, the coordinator would hint those rows automatically. However, if the coordinator itself failed mid-operation, users could end up with partially applied batches. In DataStax Enterprise 3.1, batch operations are atomic by default (i.e. all or nothing). Atomic batches carry a performance penalty, so for use cases that need batch operations but have client-side workarounds or other methods for ensuring batch atomicity, a BEGIN UNLOGGED BATCH command is supplied for cases when performance is more important than atomicity guarantees.

Native/Binary CQL Transport

Prior to version 3.1, the Cassandra Query Language (CQL) API had been using Thrift as a network transport, but now a new binary protocol is available for CQL that does not require Thrift. The new native CQL transport provides a number of benefits:

- Thrift is a synchronous transport, meaning only one request can be active at a time for a connection. By contrast, the new native CQL transport allows each connection to handle more than one active request at the same time. This means client libraries only need to maintain a relatively low number of open connections to a Cassandra node in order to maximize performance, which helps scale large clusters.
- Thrift is an RPC mechanism, which means a user cannot have a Cassandra server push information to a client. The new native CQL protocol, however, allows clients to register for certain types of event notifications from a server. Currently supported events include (1) cluster topology changes (e.g. a node joins the cluster, is removed, is moved, etc.); (2) status changes (e.g. a node is detected up/down); (3) schema changes (e.g. a table has been modified, etc.). These new capabilities allow clients to stay up to date with the state of the Cassandra cluster without having to poll the cluster regularly.
- The new native protocol allows for messages to be compressed if desired.

Thrift is still the default transport in Cassandra 1.2. To use the new binary protocol, change the start_native_transport option to true in the cassandra.yaml file. Client drivers that support the new binary protocol are also needed, such as the new DataStax Java and .NET drivers.
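To make the collection types and the two batch modes described under Developer Enhancements more concrete, the sketch below assembles example CQL text in Python. The `employees` table, its columns, and the sample values are hypothetical, and the helper only builds strings; it is a minimal illustration of the syntax, not a driver integration.

```python
# Hypothetical table using the three collection types: set, list, and map.
CREATE_EMPLOYEES = """
CREATE TABLE employees (
    id int PRIMARY KEY,
    name text,
    emails set<text>,
    phones list<text>,
    addresses map<text, text>
);
""".strip()

# Updates that add one element to each collection column (sample values only).
ADD_EMAIL = "UPDATE employees SET emails = emails + {'pat@example.com'} WHERE id = 1"
ADD_PHONE = "UPDATE employees SET phones = phones + ['555-0100'] WHERE id = 1"
SET_ADDRESS = "UPDATE employees SET addresses['home'] = '1 Main St' WHERE id = 1"


def build_batch(statements, logged=True):
    """Wrap CQL statements in a batch.

    logged=True renders the default (atomic) form; logged=False renders
    BEGIN UNLOGGED BATCH, trading the atomicity guarantee for speed.
    """
    header = "BEGIN BATCH" if logged else "BEGIN UNLOGGED BATCH"
    body = "\n".join(f"  {statement.rstrip(';')};" for statement in statements)
    return f"{header}\n{body}\nAPPLY BATCH;"


if __name__ == "__main__":
    print(CREATE_EMPLOYEES)
    print(build_batch([ADD_EMAIL, ADD_PHONE, SET_ADDRESS]))
    print(build_batch([ADD_EMAIL, ADD_PHONE], logged=False))
```

Keeping the multi-valued data in one row, as in the multiple-addresses example, is the point of collections; the batch wrapper then shows where the UNLOGGED keyword goes when atomicity is handled elsewhere.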
Concurrent Schema Changes

While earlier versions introduced the ability to modify objects in a concurrent fashion across a cluster, they did not include support for programmatically creating and dropping column families / tables (either permanent or temporary) in a concurrent manner. This is now supported in 3.1, which means multiple users may add/drop tables at the same time in the same cluster.

Composite Column Support for Hive and Solr

Composite columns are no longer constrained to only work with Cassandra. Users can now create objects with composite columns in both Hive/Hadoop and Solr.

Other CQL Enhancements

Numerous CQL enhancements are in version 3.1. Changes include a new ALTER KEYSPACE statement, syntax additions to understand how long a TTL column has remaining, support for conditional operators, and much more. For a full list of all CQL additions, please see the DataStax online documentation.
Enterprise Search Improvements

DataStax has certified Solr 4.3 for production use in DataStax Enterprise. With Solr 4.3 come nearly 60 new features, mostly targeted at providing developers with more functionality for creating search applications. Solr 4.3 also brings a number of improvements that provide greater quality and stability to the platform. In addition to Solr 4.3, version 3.1 of DataStax Enterprise includes the following new search features, which help make Solr near real-time in its data management handling:

- Per segment filters.
- Multi-value facets.

Updates to Security

DataStax Enterprise 3.1 includes a number of maintenance patches to several of the security features that were introduced in version 3.0. DataStax Enterprise 3.1 continues to provide the most comprehensive security feature set of any NoSQL solution provider, with data protection capabilities including the following:

- Internal authentication: supports standard login ID/password external security for Cassandra.
- External authentication: provides integration with Kerberos and LDAP (with Kerberos) to secure Cassandra, Hadoop and Solr using popular third-party security software.
- Object permission management: controls who can create and edit objects as well as view and manipulate data via the familiar RDBMS GRANT and REVOKE paradigm.
- Client to node encryption: ensures that data in flight from client machines to/from a database cluster cannot be stolen.
- Node to node encryption: guarantees that data being moved between nodes cannot be intercepted and accessed.
- Transparent data encryption: protects data at rest so that sensitive data is not compromised.
- Data auditing: allows administrators to understand who is accessing the database and what actions they have performed.

Improvements to Visual Enterprise Manageability

OpsCenter Enterprise 3.2 is included with the DataStax Enterprise NoSQL platform and provides visual management and monitoring for Cassandra, Hadoop, and Solr.
Enhancements were provided in OpsCenter 3.0 for visually creating and editing database clusters, as well as restoring a cluster from backup. Version 3.2 supplies a number of improvements to these features including:

- Support for provisioning database clusters with virtual nodes.
- Recognizing objects created with CQL3.
- Improved monitoring with a new caching component that results in performance metrics being displayed about 10x faster than in previous versions.
- Numerous stability and visual user interface improvements.
Why DataStax Enterprise?

Version 3.1 of DataStax Enterprise continues to help modern businesses transform the way they do business. Today, line-of-business (LOB) applications are evolving to meet the need of providing greater capabilities and data insights more than ever before, and this necessitates a new kind of technology aimed at handling today's data in a technically efficient and cost effective way. Using Apache Cassandra, DataStax Enterprise provides the type of technology and architecture that targets modern LOB applications. Moreover, DataStax Enterprise is unique in the data management marketplace in that it smartly handles all key data dimensions (real time, batch analytic, and enterprise search) all in one easily managed database cluster. The benefits delivered by DataStax Enterprise can be summarized as follows:

Features: Production-Certified Cassandra; Continuously Available; Analytics on Cassandra data; Fault-Tolerant; Search on Cassandra data; Workload Separation/Isolation; No Need for ETL; Easy Data Migration; Simple Log Integration; Enterprise Management; Control; Expert Support; Cost Efficient.

Benefits:

- Apache Cassandra is a massively scalable NoSQL database that is an acknowledged industry leader at handling today's LOB systems. DataStax certifies a version of Cassandra for its big data platform via lengthy testing, benchmarking, validation with third-party software, and defect resolution to ensure a chosen version is ready for production environments.
- Running analytics on Cassandra data is made easy via the integration of a number of Hadoop components. Support for MapReduce, Hive, Pig, Mahout, and Sqoop is included in DataStax Enterprise.
- Enterprise search operations are run on Cassandra data with built-in Solr integration.
- The mixed workload problem is solved in the platform as no workload (real time, analytics, search) competes with any other for compute resources; all are isolated and yet integrated together.
- The need to extract-transform-load data from real time to analytic to search systems goes away as built-in replication keeps all data domains in sync.
- Migrating data from existing RDBMS systems is easy via built-in migration utilities. Third-party migration tools also support the platform.
- Application logging data is easily consumed via logging interfaces, and then can be analyzed and searched.
- Administrators are instantly productive and save time with OpsCenter Enterprise, which allows management of all database clusters within a single interface.
- Professional, around-the-clock support ensures questions get quickly answered and help is available to ensure applications stay online.
- Cost reductions over typical RDBMS vendors routinely run 80%.
Use Cases Handled by DataStax Enterprise

Because DataStax Enterprise is a comprehensive and integrated big data platform, it is capable of handling use cases that have real-time, batch analytics, and enterprise search requirements. Example use cases such as the following can be managed by DataStax Enterprise:

Real-Time:

- Time series data
- Device/Sensor/Data exhaust systems
- Distributed applications
- Media streaming
- Online Web retail (transactional, shopping carts, etc.)
- Real-time data analytics
- Social media capture and analysis
- Web click-stream analysis
- Write-intensive transactional systems

Batch Analytics:

- Buyer behavior analytics
- Compliance/regulatory analysis
- Customer recommendation output
- Fraud detection
- Risk analysis
- Sales program campaign analysis
- Supply chain analytics
- Batch Web clickstream analysis

Enterprise Search:

- General Web search
- Web retail faceted (categorization) search
- Search/hit prioritization and highlighting
- Application log search and analysis
- Document (PDF, MS Word, etc.) search and analysis
- Geospatial search
- Real estate location and property search
- Social media match ups
Conclusion

DataStax Enterprise Edition 3.1 provides increased performance, scale, and developer flexibility for creating modern line-of-business applications that have real-time, analytic, and enterprise search requirements. For more information on DataStax Enterprise Edition 3.1, visit the DataStax website for downloads, online documentation, and more. Note that DataStax Enterprise Edition 3.1 may be downloaded and used free of charge in development environments with no restrictions (e.g. data size, RAM, CPU, etc.); production deployments, however, require a purchased subscription. For information on subscription pricing, please send an email to DataStax at info@datastax.com.

About DataStax

DataStax powers the online applications that transform business for more than 300 customers, including startups and 20 of the Fortune 100. DataStax delivers a massively scalable, flexible and continuously available big data platform built on Apache Cassandra. DataStax integrates enterprise-ready Cassandra, Apache Hadoop for analytics and Apache Solr for search across multi-data centers and in the cloud. Companies such as Adobe, Healthcare Anytime, eBay and Netflix rely on DataStax to transform their businesses. Based in San Mateo, Calif., DataStax is backed by industry-leading investors: Lightspeed Venture Partners, Crosslink Capital and Meritech Capital Partners. For more information, visit DataStax.
More informationMigrating to Cassandra in the Cloud, the Netflix Way
Migrating to Cassandra in the Cloud, the Netflix Way Jason Brown - @jasobrown Senior Software Engineer, Netflix Tech History, 1998-2008 In the beginning, there was the webapp and a single database in a
More informationWhen, Where & Why to Use NoSQL?
When, Where & Why to Use NoSQL? 1 Big data is becoming a big challenge for enterprises. Many organizations have built environments for transactional data with Relational Database Management Systems (RDBMS),
More informationData 101 Which DB, When. Joe Yong Azure SQL Data Warehouse, Program Management Microsoft Corp.
Data 101 Which DB, When Joe Yong (joeyong@microsoft.com) Azure SQL Data Warehouse, Program Management Microsoft Corp. The world is changing AI increased by 300% in 2017 Data will grow to 44 ZB in 2020
More informationMySQL for Database Administrators Ed 3.1
Oracle University Contact Us: 1.800.529.0165 MySQL for Database Administrators Ed 3.1 Duration: 5 Days What you will learn The MySQL for Database Administrators training is designed for DBAs and other
More informationADVANCED DATABASES CIS 6930 Dr. Markus Schneider
ADVANCED DATABASES CIS 6930 Dr. Markus Schneider Group 2 Archana Nagarajan, Krishna Ramesh, Raghav Ravishankar, Satish Parasaram Drawbacks of RDBMS Replication Lag Master Slave Vertical Scaling. ACID doesn
More informationMicrosoft. [MS20762]: Developing SQL Databases
[MS20762]: Developing SQL Databases Length : 5 Days Audience(s) : IT Professionals Level : 300 Technology : Microsoft SQL Server Delivery Method : Instructor-led (Classroom) Course Overview This five-day
More informationITS. MySQL for Database Administrators (40 Hours) (Exam code 1z0-883) (OCP My SQL DBA)
MySQL for Database Administrators (40 Hours) (Exam code 1z0-883) (OCP My SQL DBA) Prerequisites Have some experience with relational databases and SQL What will you learn? The MySQL for Database Administrators
More informationExadata Implementation Strategy
Exadata Implementation Strategy BY UMAIR MANSOOB 1 Who Am I Work as Senior Principle Engineer for an Oracle Partner Oracle Certified Administrator from Oracle 7 12c Exadata Certified Implementation Specialist
More informationCS 655 Advanced Topics in Distributed Systems
Presented by : Walid Budgaga CS 655 Advanced Topics in Distributed Systems Computer Science Department Colorado State University 1 Outline Problem Solution Approaches Comparison Conclusion 2 Problem 3
More informationWhy is Office 365 the right choice?
Why is Office 365 the right choice? People today want to be productive wherever they go. They want to work faster and smarter across their favorite devices, while staying current and connected. Simply
More informationHow Apache Hadoop Complements Existing BI Systems. Dr. Amr Awadallah Founder, CTO Cloudera,
How Apache Hadoop Complements Existing BI Systems Dr. Amr Awadallah Founder, CTO Cloudera, Inc. Twitter: @awadallah, @cloudera 2 The Problems with Current Data Systems BI Reports + Interactive Apps RDBMS
More informationFLORIDA DEPARTMENT OF TRANSPORTATION PRODUCTION BIG DATA PLATFORM
FLORIDA DEPARTMENT OF TRANSPORTATION PRODUCTION BIG DATA PLATFORM RECOMMENDATION AND JUSTIFACTION Executive Summary: VHB has been tasked by the Florida Department of Transportation District Five to design
More informationSoftNAS Cloud Performance Evaluation on AWS
SoftNAS Cloud Performance Evaluation on AWS October 25, 2016 Contents SoftNAS Cloud Overview... 3 Introduction... 3 Executive Summary... 4 Key Findings for AWS:... 5 Test Methodology... 6 Performance Summary
More informationHortonworks and The Internet of Things
Hortonworks and The Internet of Things Dr. Bernhard Walter Solutions Engineer About Hortonworks Customer Momentum ~700 customers (as of November 4, 2015) 152 customers added in Q3 2015 Publicly traded
More informationDatacenter replication solution with quasardb
Datacenter replication solution with quasardb Technical positioning paper April 2017 Release v1.3 www.quasardb.net Contact: sales@quasardb.net Quasardb A datacenter survival guide quasardb INTRODUCTION
More informationEsgynDB Enterprise 2.0 Platform Reference Architecture
EsgynDB Enterprise 2.0 Platform Reference Architecture This document outlines a Platform Reference Architecture for EsgynDB Enterprise, built on Apache Trafodion (Incubating) implementation with licensed
More informationIntroduction to the Active Everywhere Database
Introduction to the Active Everywhere Database INTRODUCTION For almost half a century, the relational database management system (RDBMS) has been the dominant model for database management. This more than
More informationDistributed File Systems II
Distributed File Systems II To do q Very-large scale: Google FS, Hadoop FS, BigTable q Next time: Naming things GFS A radically new environment NFS, etc. Independence Small Scale Variety of workloads Cooperation
More informationOverview. : Cloudera Data Analyst Training. Course Outline :: Cloudera Data Analyst Training::
Module Title Duration : Cloudera Data Analyst Training : 4 days Overview Take your knowledge to the next level Cloudera University s four-day data analyst training course will teach you to apply traditional
More informationPercona Server for MySQL 8.0 Walkthrough
Percona Server for MySQL 8.0 Walkthrough Overview, Features, and Future Direction Tyler Duzan Product Manager MySQL Software & Cloud 01/08/2019 1 About Percona Solutions for your success with MySQL, MongoDB,
More informationTopics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples
Hadoop Introduction 1 Topics Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples 2 Big Data Analytics What is Big Data?
More informationCloud Computing & Visualization
Cloud Computing & Visualization Workflows Distributed Computation with Spark Data Warehousing with Redshift Visualization with Tableau #FIUSCIS School of Computing & Information Sciences, Florida International
More informationMySQL & NoSQL: The Best of Both Worlds
MySQL & NoSQL: The Best of Both Worlds Mario Beck Principal Sales Consultant MySQL mario.beck@oracle.com 1 Copyright 2012, Oracle and/or its affiliates. All rights Safe Harbour Statement The following
More informationData Analytics at Logitech Snowflake + Tableau = #Winning
Welcome # T C 1 8 Data Analytics at Logitech Snowflake + Tableau = #Winning Avinash Deshpande I am a futurist, scientist, engineer, designer, data evangelist at heart Find me at Avinash Deshpande Chief
More informationComposite Software Data Virtualization The Five Most Popular Uses of Data Virtualization
Composite Software Data Virtualization The Five Most Popular Uses of Data Virtualization Composite Software, Inc. June 2011 TABLE OF CONTENTS INTRODUCTION... 3 DATA FEDERATION... 4 PROBLEM DATA CONSOLIDATION
More informationWhy Migrate from MySQL to Cassandra? WHITE PAPER
Why Migrate from MySQL to Cassandra? WHITE PAPER By DataStax Corporation July 2012 Contents Introduction... 3 Why Stay with MySQL?...4 Why Migrate from MySQL?...4 Architectural Limitations...6 Data Model
More informationOracle Big Data Connectors
Oracle Big Data Connectors Oracle Big Data Connectors is a software suite that integrates processing in Apache Hadoop distributions with operations in Oracle Database. It enables the use of Hadoop to process
More informationWe are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info
We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info START DATE : TIMINGS : DURATION : TYPE OF BATCH : FEE : FACULTY NAME : LAB TIMINGS : PH NO: 9963799240, 040-40025423
More informationGetting to know. by Michelle Darling August 2013
Getting to know by Michelle Darling mdarlingcmt@gmail.com August 2013 Agenda: What is Cassandra? Installation, CQL3 Data Modelling Summary Only 15 min to cover these, so please hold questions til the end,
More informationJargons, Concepts, Scope and Systems. Key Value Stores, Document Stores, Extensible Record Stores. Overview of different scalable relational systems
Jargons, Concepts, Scope and Systems Key Value Stores, Document Stores, Extensible Record Stores Overview of different scalable relational systems Examples of different Data stores Predictions, Comparisons
More informationFusion iomemory PCIe Solutions from SanDisk and Sqrll make Accumulo Hypersonic
WHITE PAPER Fusion iomemory PCIe Solutions from SanDisk and Sqrll make Accumulo Hypersonic Western Digital Technologies, Inc. 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Executive
More informationFlash Storage Complementing a Data Lake for Real-Time Insight
Flash Storage Complementing a Data Lake for Real-Time Insight Dr. Sanhita Sarkar Global Director, Analytics Software Development August 7, 2018 Agenda 1 2 3 4 5 Delivering insight along the entire spectrum
More informationAchieving Digital Transformation: FOUR MUST-HAVES FOR A MODERN VIRTUALIZATION PLATFORM WHITE PAPER
Achieving Digital Transformation: FOUR MUST-HAVES FOR A MODERN VIRTUALIZATION PLATFORM WHITE PAPER Table of Contents The Digital Transformation 3 Four Must-Haves for a Modern Virtualization Platform 3
More informationPostgres Plus and JBoss
Postgres Plus and JBoss A New Division of Labor for New Enterprise Applications An EnterpriseDB White Paper for DBAs, Application Developers, and Enterprise Architects October 2008 Postgres Plus and JBoss:
More informationNutanix Tech Note. Virtualizing Microsoft Applications on Web-Scale Infrastructure
Nutanix Tech Note Virtualizing Microsoft Applications on Web-Scale Infrastructure The increase in virtualization of critical applications has brought significant attention to compute and storage infrastructure.
More informationMigrate from Netezza Workload Migration
Migrate from Netezza Automated Big Data Open Netezza Source Workload Migration CASE SOLUTION STUDY BRIEF Automated Netezza Workload Migration To achieve greater scalability and tighter integration with
More informationBig Data Architect.
Big Data Architect www.austech.edu.au WHAT IS BIG DATA ARCHITECT? A big data architecture is designed to handle the ingestion, processing, and analysis of data that is too large or complex for traditional
More informationOutline. Introduction Background Use Cases Data Model & Query Language Architecture Conclusion
Outline Introduction Background Use Cases Data Model & Query Language Architecture Conclusion Cassandra Background What is Cassandra? Open-source database management system (DBMS) Several key features
More informationEnterprise Architectures The Pace Accelerates Camberley Bates Managing Partner & Analyst
Enterprise Architectures The Pace Accelerates Camberley Bates Managing Partner & Analyst Change is constant in IT.But some changes alter forever the way we do things Inflections & Architectures Solid State
More informationDEMYSTIFYING BIG DATA WITH RIAK USE CASES. Martin Schneider Basho Technologies!
DEMYSTIFYING BIG DATA WITH RIAK USE CASES Martin Schneider Basho Technologies! Agenda Defining Big Data in Regards to Riak A Series of Trade-Offs Use Cases Q & A About Basho & Riak Basho Technologies is
More informationNoSQL Databases MongoDB vs Cassandra. Kenny Huynh, Andre Chik, Kevin Vu
NoSQL Databases MongoDB vs Cassandra Kenny Huynh, Andre Chik, Kevin Vu Introduction - Relational database model - Concept developed in 1970 - Inefficient - NoSQL - Concept introduced in 1980 - Related
More informationMicrosoft Developing SQL Databases
1800 ULEARN (853 276) www.ddls.com.au Length 5 days Microsoft 20762 - Developing SQL Databases Price $4290.00 (inc GST) Version C Overview This five-day instructor-led course provides students with the
More informationHyper-Converged Infrastructure: Providing New Opportunities for Improved Availability
Hyper-Converged Infrastructure: Providing New Opportunities for Improved Availability IT teams in companies of all sizes face constant pressure to meet the Availability requirements of today s Always-On
More informationCloud Analytics and Business Intelligence on AWS
Cloud Analytics and Business Intelligence on AWS Enterprise Applications Virtual Desktops Sharing & Collaboration Platform Services Analytics Hadoop Real-time Streaming Data Machine Learning Data Warehouse
More informationGplus Adapter 6.1. Gplus Adapter for WFM. Hardware and Software Requirements
Gplus Adapter 6.1 Gplus Adapter for WFM Hardware and Software Requirements The information contained herein is proprietary and confidential and cannot be disclosed or duplicated without the prior written
More informationUnderstanding Virtual System Data Protection
Understanding Virtual System Data Protection Server virtualization is the most important new technology introduced in the data center in the past decade. It has changed the way we think about computing
More informationWhat s New in MySQL 5.7 Geir Høydalsvik, Sr. Director, MySQL Engineering. Copyright 2015, Oracle and/or its affiliates. All rights reserved.
What s New in MySQL 5.7 Geir Høydalsvik, Sr. Director, MySQL Engineering Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes
More informationJVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra
JVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra Legal Notices Apache Cassandra, Spark and Solr and their respective logos are trademarks or registered trademarks of
More informationDeveloping SQL Databases
Course 20762B: Developing SQL Databases Page 1 of 9 Developing SQL Databases Course 20762B: 4 days; Instructor-Led Introduction This four-day instructor-led course provides students with the knowledge
More information