Index. © Sam R. Alapati 2018. S. R. Alapati, Expert Apache Cassandra Administration


A

Accrual detection mechanism, 157 Administration tools cassandra-stress tool, 24 cassandra utility, 24 nodetool utility, 24 SSTable utilities, 24 Akka, 280 Allocation algorithm, ALTER KEYSPACE privilege, 433 Amazon EC2, 157 AMIs, 90 AWS, 91 configuring Cassandra cluster, create instance, 91 add storage, 92 AMI, 91 choose type, 91 configure details, 92 configure security group, 92 key pair, 93 review launch, 92 tag, 92 install Cassandra, Amazon Machine Image (AMI), 91 Amazon's DynamoDB, 17 Amazon Web Services (AWS), 91 Anti-entropy repair definition, 166 Merkle trees, 166 nodetool repair command, 167 ANY consistency level, 134 Apache Cassandra, 20 bloom filters, 21 build from source, 39 cassandra.yaml file, 42 check status, clearing cassandra data, 62 cluster, 27 compaction, 21 compression, 20 definition, 3 distributed data, 22 drawbacks immutable tables and mutation, 19 lack of support for transactions, 19 no indexes, 18 no joins, 18 querying data, 18 eventual consistency, 18 flags, highly scalable, 6 high performing database, 5 large environments, 21 node, 27 prerequisites Java, Python,

Apache Cassandra (cont.) service command, stopping, table rows, 26 user and group, 36 verify version, 63 write-heavy workloads, 22 writes, 26 Apache configuration Cassandra-specific plugins, installing and configuring NRPE, 367 Nagios server configuration, 368 Apache Kafka, 280 Apache Mesos, 280 Apache Spark, 109, 271 Cassandra database, configuration, 273 install, prerequisites, pyspark command, RDD, 275 spark-cassandra-connector, 275 spark-shell command, start up cluster, Apache Sqoop, 300 Atomic operation, 143 Atomicity, Consistency, Isolation, and Durability (ACID) atomicity, 129 batch method, 130 durability, 130 isolation, 130 periodic, 130 requirements, 14 Authentication, auto_bootstrap property, 83

B

Backing up data automatic snapshots, 287 incremental, 287 snapshot, SSTables, 283, 285 Basically available eventually consistent (BASE) system, 16 Batchlog, 143 Batch method, 130 Batch operations, single and multiple partition configuration, 144 INSERT statements, 145 single and multiple partitions, Behavior-driven development (BDD), 259 Coshxlabs code, 263 Gherkin syntax, 262 installing Cucumber, 261 installing Docker, 260 installing Docker-compose, 261 run with Cucumber, Big data, 7, 12 Bin directory, 53 Binary tarball, Bloom filters, 21, 30, 124, 379 B-tree, 117, 128 Bulk data COPY command cassandra table, import and export data, 296 sstableloader import, 300 load external data, 299 Bundler,

C

Cache directory, 45 Cardinality, 230 Cassandra, node, 150 CassandraAuthorizer, 431 Cassandra Cluster Manager (CCM) Apache Spark (see Apache Spark) BDD Coshxlabs code, 263 Gherkin syntax, 262 installing Cucumber, 261 installing Docker, 260 installing Docker-compose, 261 run with Cucumber, Cassandra cluster, 268 definition, 266 install, 267 SSTable, 270 status, 269 using cqlsh, 270 Cassandra data modeling avoid querying across partitions, 23 avoid updates and deletes, 23 duplicating data, 23 even data distribution, 23 vs. RDBMS, 31 Cassandra query language (CQL), 7 altering a table, 213 bin directory, 53 capture command, 48 cluster topology, 199 collection data types frozen values, 243 list type, 242 map type, 243 multiple/ addresses, 240 set type, collection set, list/map, column_definition, 202 command line options, 48 composite partition key, compound keys and clustering columns, conditional statement, connect, 65 copy command, 49 cqlshrc config file, create keyspace, 66 create table, 66 counter column to track values, 212 deleting rows and columns, table, 216 describe command, dropping a table, 214 expand command, find versions, 64 functions, aggregates and user types, 200 garbage collection, 219 getting help, indexing, Cassandra database dropping an index, 225 high and low cardinality columns, 223 partitions, 223 primary index, 221 secondary index, 221, 224 updated/deleted columns, 223 usage, 222 INSERT statement, insert test data, 67 keyspace altering, 194 creation, durable writes property,

Cassandra query language (CQL) (cont.) management, 190 multiple datacenters, 193 nodetool repair command, 195 qualifier, 197 relational database system, 189 removing, repairing, 197 replicas, data centers, 196 replication factor, materialized view creation, denormalization, 226 dropping, 226 updation, 227 options, 64 partition key, primary key, query quarters table, 67 secondary index creation, 230 drawbacks, SASI, 231 SELECT statement built-in functions, collection column, 236 filtering, GROUP BY clause, 232 IN keyword, JSON format, 236 LIMIT N option, ORDER BY clause, selection clause, 228 syntax, 227 WHERE clause, start, 46 static columns, structures, 189 table creation, table options clustering order, 211 compact tables, 211 tombstones, table configuration, TIMESTAMP, 217 tombstones, deletion marker, 217 truncating a table, 215 TTL value, tuples, 243 UDFs, UPDATE statement, user-defined types, 244 zombie records and node repair, time zones, 46 tracing command, 52 Cassandra's access control matrix, 433 Cassandra-stress tool command, replication, compaction and compression options, running mixed workload, 393 multiple nodes, 395 read test, 393 write test, YAML-based profile, 395, 397 cassandra.yaml file, 43 Change data capture (CDC) logging, 239 Clearing cassandra data, 62 Clear the screen (CLS), 64 Cloud applications, 3 Cluster, 27, 150 Linux server (see Linux server) Cluster deployment Cassandra-stress, 69 choose CPUs,

choose storage, 70 compaction, 72 disk storage, 71 install PDSH, network considerations, 70 NFS, SAN and NAS, 71 RAM, 70 storage requirements, 72 usable disk capacity, 72 Cluster health check JConsole, JMX clients, 346 nodetool info command, 342, 343 nodetool status command, 342 thread pools statistics, Clustering key/column, 202 Cluster maintenance tasks flushing and draining data, handling data corruption, rebuilding indexes, 307 repairing data (nodetool repair), system.size_estimates table, 307 cluster_name parameter, 43 Column family databases, 11 Column-oriented database, 14 Commit log, 27, 45, 151 Commodity servers, 6 Compactions, 21, 30, 72, 125 Compaction strategies enabling and disabling, 404 global compaction parameters, configuration compaction_throughput_mb_per_sec parameter, 406 concurrent_compactors property, 405 snapshot_before_compaction property, 405 sstable_preemptive_open_interval_in_mb property, 406 LCS, logging, 408 setting, 407 STCS, testing, TWCS, Compare and set (CAS), 146 Compound primary key, 209 Conceptual modeling, 103 Concurrent-Mark Sweep (CMS), 75 conf directory, 54 Consistency, availability, and partition tolerance (CAP) theorem, 131 BASE system, 16 consistency, 17 principles availability, 15 consistency, 15 partition tolerance, 15 Container, 250 Coordinators, 28, 122 Counter cache, 119, Counter data type, 212 Cucumber,

D

Data caches, 20 Datacenter, 5, 27, 150 Datacenter-related maintenance tasks, Data corruption checking, 331 fixing, Data file directory,

Data manipulation language (DML), 189, 227 Data modeling cluster nodes, 106 data-driven vs. query-driven data, 100 ease of use, 102 partition, 106 physical, 105 Pro Cycling statistics, 103 queries, 107 read limitations, 109 reliability, 102 scalability, 102 sort order, 101 structured process, 102 write limitations, 108 DataStax, 13 courses, 4 development tools, 33 DSE, DataStax Enterprise (DSE), 13, DataStax, Inc., 32 Debian packages, 39, 41 Decentralized database, 13 Decommissioning datacenter, Disk storage, Distributed database system, 10, 16, 26 Docker broadcast_address, 258 Cassandra cluster, command line utility, 252 container, 250 dc property, 258 endpoint_snitch, 258 environment variables for, host server, 259 images, listen_address, 258 num_tokens, 258 rack property, 258 run cqlsh, 257 systemctl status command, 251 Ubuntu server installation, using volumes, 258 Document databases, 10 Dynamic ring participation, 181

E

Endpoint range vs. subrange repair, 169 Eventual consistency, 14, 18 anti-entropy, 111 consistency levels, 111 reconciliation, 111 repairing data, requirements, 110

F

Facebook, 4 Failure detection mechanism, 157 Fault tolerance, 4 Firewall, 45 port access, 79 ports configuration, 439 Flexible data model, 5 Full vs. incremental repair, 168

G

Garbage collection, 219 Gossip management, 29, 323, 325 Gossip protocols, 126, 149 accrual detection mechanism, 157 cluster_name,

failure detection mechanism, 157 listen_address, 156 process, seed nodes and, 156 seed_provider, 156 storage_port, 156 Graph databases, node, 12

H

Handling consistency, Handoff process, Hash ring, 31 Hinted handoff, 29, 31 consists of, 159 for datacenter, 160, 161 definition, 158 enable cluster, 158 max_hint_window_in_ms attribute, 162 sethintedhandoffthrottlekb property, 162 statushandoff command, 160 stores in directory, 160 truncatehints command, 161 write_request_timeout_in_ms property, 162

I

Incremental repairs, 167, 170, 172 files, 124

J

Javadoc directory, 55 Java garbage collection (GC), 75, Java heap size, 75 Java hugepages setting, 77 Java Management Extensions (JMX), 25, 346 Java Virtual Machine (JVM), 25 JConsole, 347 connection login page, 348 jmxsh, 350 Overview tab, 349 JMX authentication and authorization,

K

Kafka, Apache, 280 Key cache, 119, 381 Key nodetool maintenance commands decommissioning nodes cassandra.override_decommission=true option, 312 command, data remove and restart, nodetool assassinate, 312, 314 Keyspaces, 28, 151 Key-value databases, 10

L

LeveledCompactionStrategy (LCS), lib directory, 54 Lightweight transactions, 108 cautions, 147 insert cyclist with ID number, set of operations, 147 Linearizable consistency, 146 Linux server disable swap, 76 disable zone_reclaim_mode, 73 Java heap size, 75 Java hugepages setting, 77 Java version,

Linux server (cont.) and Kernel settings, 73 NUMA systems, 73 PAM security settings, 75 setting shell limits, 76 synchronize clock and enable NTP, 73 TCP settings, 74 user resource limits, 74 listen_address property, 44 listen_address parameter, 82 listen_interface parameter, 44 Logback, 353 Logging configuration, 352 locations setting, 352 logback configure, 353 logback logging framework appender, benefits, 353 layout, 355 logback.xml file, logger class, 354 setting up log rotation, 359 nodetool setlogginglevel command, Logical modeling, 105 Log-structured merge-tree (LSM tree), 128 Log-structured storage engine, 13

M

Medium data, 7 Memtable, 27, 114 definition, 151 threshold, 416 memtable_cleanup_threshold parameter, 416 memtable_flush_writers parameter, 417 Merkle trees, 166 Mesos, Apache, 280 Minimal configuration properties, 43 MongoDB, 12, 17 Monitoring Cassandra LAMP stack installation, Nagios configuration file, 366 installation, plugins installation, 364 NRPE installation, Multi-node Cassandra cluster auto_bootstrap property, 83 broadcast_rpc_address property, 82 change node IP address, 87 client ports, 79 configuration, 81 datacenter configuration, endpoint_snitch option, firewall port access configuration, 79 initialize cluster with multiple datacenters, 84 inter-node ports, 79 IP addresses, 80 keyspaces, 89 listen_address parameter, 82 node is down, num_tokens property, 81 ports, 79 rack names, rpc_address property, 82 seed nodes, 80 seeds property, 82 select name for datacenter, startup process, stopping, 85 version mismatch, Murmur3 partitioning strategy,

N

Nagios build dependencies, 363 Cassandra cluster hosts, monitoring, 367 configuration, 366 installation, NRPE, 361 plugins, 361, 364 user and group, 363 Nagios Remote Plugin Executor (NRPE), 361 Network-attached storage (NAS), 71 Network information, 340 Network interface cards (NICs), 71 Network Time Protocol (NTP), 73 NetworkTopologyStrategy, 28 Node management adding, data center, cluster joining, dead node replacement, 319 decommission datacenter, moving, 320 removing node, 318 running node replacement, Node repair, definition, 158 hinted handoff (see Hinted handoff) Node restart method, 293, 295 Nodetool drain command, 326 nodetool proxyhistograms command, 337, 338 nodetool tablehistograms command, nodetool upgradesstables command, 414 Nodetool utility, 24, 53 nodetool info command, nodetool status command, 58 Normalization theory, 100 NoSQL databases column family, 11 document, 10 graph, key-value, 10 num_tokens property, 81

O

ONE consistency level, 133 Open source database, 4 OpsCenter, 33 Optimal storage, 70 Optimistic replication, 109 Oracle JDK, 6,

P

PAM security settings, 75 Parallel distributed shell (PDSH), Parallel repair, 168 Partitioner function, 29 Partitioner range repair, 168 Partitioners Murmur3Partitioner, 183 and partitioning strategies, 182 RandomPartitioner, 182 Partition key cache, 20 Paxos protocol, 19, Peer-to-peer architecture, 9 Peer-to-peer system, 149 Performance, Cassandra compression data ALTER TABLE statement, 414 configuration, efficacy testing, 415 modifying, compression algorithm, 414 turning off,

Performance, Cassandra (cont.) data caching Cassandra stores, 382 configuration, counter cache, global caching parameters, monitoring, tracing database operations, types, 381 JVM and garbage collection strategies, 422 stress testing cassandra (see Cassandra-stress tool) tracing to analyze performance managing tracing, 374 read request, write request, tuning bloom filters, 379 Phi Accrual Failure Detection, 155 Physical data modeling, 105 Probabilistic tracing strategy, Python,

Q

Query-driven data modeling, 100 Querying data, see Cassandra query language (CQL), SELECT statement Quorum calculate, 135 datacenter cluster, 135 EACH_QUORUM, 132 LOCAL_QUORUM, 133, 138 read consistency levels, 138 replication factor, write consistency levels, 133 Quorum reads and writes, 110

R

Rack, 150 Random selection algorithm, 178 Rapid read protection, 141 consistency levels, 164 speculative_retry property, 165 supports, 165 Read consistency levels ALL, 137 cluster with two datacenters, 142 requests, 137 single datacenter, 142 Reading data coordinator, 127 filter command, 127 gossip protocol, 126 request, 127 write data affects, 128 Read repair definition, 162 read_repair_chance property, 163 Read requests direct, 140 repair request, 140 replica node, 140 Referential integrity, 101 Relational database management system (RDBMS), 7, 31 Relational databases data locality, 9 lower cost, 9 no failover, 9 peer-to-peer architecture, 9 RDBMSs and big data, 7 reliability, 9 sharding, 8 third normal form, 8

Repairing data, Replica placement strategy, 28, 152 Replication strategy definition, 179 group, 180 NetworkTopologyStrategy, 180 SimpleStrategy, 180 switch keyspace, 182 Resilient Distributed Dataset (RDDs), 275 Restoring data commit log manual archive, 295 point-in-time recovery, 295 restore, 295 cycling keyspace, 289, 291, 293 node restart method, 293, 295 run repair, 293 set location, 296 set timestamp, 296 from snapshot, 288 using sstableloader, 293 Role-backed access control cycling_admin, 436 granting permissions, 435 login accounts, object permissions, 436 permissions command, 437, 438 view, permissions granted, 438 Row caching, 20, 119, 381

S

SAN, 71 Secondary indexes, 18, 109 Security configuring authentication, firewall ports configuration, 439 JMX authentication and authorization, 451, 453 roles creation administrator privileges, 424 AllowAuthenticator, 425 assigning permissions, login accounts, configure authentication, logging, 424 password changing, 429 properties, 427 superuser account, 430 SSL encryption (see SSL encryption) Seed nodes, 80, 156 Secondary indexes, 105 Sequential vs. parallel repair, 168 Serial consistency settings, 139 Sharding, 8 Simple Build Tool (SBT), 272 Single-token architecture, 174 SizeTieredCompactionStrategy, SMACK stack, Snapshot before compacting data, 287 copy data, 289, 291, 293 list node, restoring data, 288 run repair, 293 using sstableloader, 293 schema, Snitches, 30 cassandra-rackdc.properties, cassandra-topology.properties, 187 CloudstackSnitch, 186 dynamic by default, 185 Ec2MultiRegionSnitch, 186 Ec2Snitch, 186 GoogleCloudSnitch,

Snitches (cont.) GossipingPropertyFileSnitch, 185 PropertyFileSnitch, 186 RackInferringSnitch, 186 SimpleSnitch, 185 Snitch serves, Solid-state drives (SSDs), 70, 129 Sorted string table (SSTable), 27 Speculative retrying, Sqoop, Apache, 300 SSH tools, 24 SSL encryption client encryption, enabling, inter-node encryption, enabling, 448 java cryptography extension files installation, 440 server certificates CA to Keystore, 446 certificate authority, creation, cluster configuration, keystores, 447 creation, nodes, server truststore, 447 signed certificates, keystore, 446 signing, CA's public key, 445 signing requests, 444 SSTable Attached Secondary (SASI), 222, 231 SSTables, 19, 21, 151 caching data, 119 compaction operation, data file, 117 data structures, for durable storage, 117 Stores data four-node, 152 hash values, Strict consistency, 110 Subrange repair, 169 Switching snitches, 321, 323 Symbolic link, 42

T

Table statistics, TCP settings, 74 Test-driven development (TDD), 259 Third normal form, 8 Time-to-live (TTL), 212, TimeWindowCompactionStrategy (TWCS), Tokens, Tombstones, 19, 21, 108, 217 Tools directory, 54 Tracing data, 372 Tunable consistency, 14, 29, 100, 109, 111, 131

U

Ubuntu server, Universally unique identifier (UUID), 205 User-defined aggregates (UDAs), 189, 247 User-defined functions (UDFs), 189, User-defined types (UDTs), 189, 244

V

Vagrant tool, 260 Virtual machine (VM), 250 Virtual nodes (vnodes), 81 disable, 179 num_tokens parameter, 178 rebalance data, 173 ring with, 176 tokens,

W, X

Write amplification (WA), 129 Write consistency ALL consistency level, 132 ANY consistency level, 134 default, 132 hinted handoff, 132 LOCAL_ONE consistency level, 134, 138 ONE consistency level, 133, 138 quorum-related levels, 138 serial consistency settings, 139 TWO and THREE consistency levels, 133, 138 Writes data bloom filters, 124 commit log binary files, 123 to protect changes, space threshold, 115 index file, 124 internal operations, 122 memtable configure cleanup threshold, 115 configure flushing data, 114 database flushes, 113 durability, 114 flushing to disk, 123 nodetool drain command, 116 nodetool flush command, 116 request flow, 119, 121 role of hints, 122 SSTables (see SSTables)

Y

YAML-based profile file, 395

Z

Zombie,


More information

Scylla Open Source 3.0

Scylla Open Source 3.0 SCYLLADB PRODUCT OVERVIEW Scylla Open Source 3.0 Scylla is an open source NoSQL database that offers the horizontal scale-out and fault-tolerance of Apache Cassandra, but delivers 10X the throughput and

More information

CO MySQL for Database Administrators

CO MySQL for Database Administrators CO-61762 MySQL for Database Administrators Summary Duration 5 Days Audience Administrators, Database Designers, Developers Level Professional Technology Oracle MySQL 5.5 Delivery Method Instructor-led

More information

Cassandra, MongoDB, and HBase. Cassandra, MongoDB, and HBase. I have chosen these three due to their recent

Cassandra, MongoDB, and HBase. Cassandra, MongoDB, and HBase. I have chosen these three due to their recent Tanton Jeppson CS 401R Lab 3 Cassandra, MongoDB, and HBase Introduction For my report I have chosen to take a deeper look at 3 NoSQL database systems: Cassandra, MongoDB, and HBase. I have chosen these

More information

. International Journal of Advance Research in Engineering, Science & Technology. Identifying Vulnerabilities in Apache Cassandra

. International Journal of Advance Research in Engineering, Science & Technology. Identifying Vulnerabilities in Apache Cassandra Impact Factor (SJIF): 4.542. International Journal of Advance Research in Engineering, Science & Technology e-issn: 2393-9877, p-issn: 2394-2444 Volume 4, Issue 4, April-2017 Identifying Vulnerabilities

More information

@joerg_schad Nightmares of a Container Orchestration System

@joerg_schad Nightmares of a Container Orchestration System @joerg_schad Nightmares of a Container Orchestration System 2017 Mesosphere, Inc. All Rights Reserved. 1 Jörg Schad Distributed Systems Engineer @joerg_schad Jan Repnak Support Engineer/ Solution Architect

More information

CS Amazon Dynamo

CS Amazon Dynamo CS 5450 Amazon Dynamo Amazon s Architecture Dynamo The platform for Amazon's e-commerce services: shopping chart, best seller list, produce catalog, promotional items etc. A highly available, distributed

More information

Cassandra Design Patterns

Cassandra Design Patterns Cassandra Design Patterns Sanjay Sharma Chapter No. 1 "An Overview of Architecture and Data Modeling in Cassandra" In this package, you will find: A Biography of the author of the book A preview chapter

More information

Getting to know. by Michelle Darling August 2013

Getting to know. by Michelle Darling August 2013 Getting to know by Michelle Darling mdarlingcmt@gmail.com August 2013 Agenda: What is Cassandra? Installation, CQL3 Data Modelling Summary Only 15 min to cover these, so please hold questions til the end,

More information

Overview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development::

Overview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development:: Title Duration : Apache Spark Development : 4 days Overview Spark is a fast and general cluster computing system for Big Data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized

More information

Compactions in Apache Cassandra

Compactions in Apache Cassandra Thesis no: MSEE-2016-21 Compactions in Apache Cassandra Performance Analysis of Compaction Strategies in Apache Cassandra Srinand Kona Faculty of Computing Blekinge Institute of Technology SE-371 79 Karlskrona

More information

Intro Cassandra. Adelaide Big Data Meetup.

Intro Cassandra. Adelaide Big Data Meetup. Intro Cassandra Adelaide Big Data Meetup instaclustr.com @Instaclustr Who am I and what do I do? Alex Lourie Worked at Red Hat, Datastax and now Instaclustr We currently manage x10s nodes for various customers,

More information

Making Non-Distributed Databases, Distributed. Ioannis Papapanagiotou, PhD Shailesh Birari

Making Non-Distributed Databases, Distributed. Ioannis Papapanagiotou, PhD Shailesh Birari Making Non-Distributed Databases, Distributed Ioannis Papapanagiotou, PhD Shailesh Birari Dynomite Ecosystem Dynomite - Proxy layer Dyno - Client Dynomite-manager - Ecosystem orchestrator Dynomite-explorer

More information

How we build TiDB. Max Liu PingCAP Amsterdam, Netherlands October 5, 2016

How we build TiDB. Max Liu PingCAP Amsterdam, Netherlands October 5, 2016 How we build TiDB Max Liu PingCAP Amsterdam, Netherlands October 5, 2016 About me Infrastructure engineer / CEO of PingCAP Working on open source projects: TiDB: https://github.com/pingcap/tidb TiKV: https://github.com/pingcap/tikv

More information

CIB Session 12th NoSQL Databases Structures

CIB Session 12th NoSQL Databases Structures CIB Session 12th NoSQL Databases Structures By: Shahab Safaee & Morteza Zahedi Software Engineering PhD Email: safaee.shx@gmail.com, morteza.zahedi.a@gmail.com cibtrc.ir cibtrc cibtrc 2 Agenda What is

More information

NoSQL BENCHMARKING AND TUNING. Nachiket Kate Santosh Kangane Ankit Lakhotia Persistent Systems Ltd. Pune, India

NoSQL BENCHMARKING AND TUNING. Nachiket Kate Santosh Kangane Ankit Lakhotia Persistent Systems Ltd. Pune, India NoSQL BENCHMARKING AND TUNING Nachiket Kate Santosh Kangane Ankit Lakhotia Persistent Systems Ltd. Pune, India Today large variety of available NoSQL options has made it difficult for developers to choose

More information

Analysis, Archival, & Retrieval Guide. StorageX 8.0

Analysis, Archival, & Retrieval Guide. StorageX 8.0 Analysis, Archival, & Retrieval Guide StorageX 8.0 March 2018 Copyright 2018 Data Dynamics, Inc. All Rights Reserved. The trademark Data Dynamics is the property of Data Dynamics, Inc. All other brands,

More information

DataStax Enterprise on VMware vsan 6.6 All-Flash for Development First Published On: Last Updated On:

DataStax Enterprise on VMware vsan 6.6 All-Flash for Development First Published On: Last Updated On: DataStax Enterprise on VMware vsan 6.6 All-Flash for First Published On: 01-09-2018 Last Updated On: 03-09-2018 1 Table of Contents 1. Executive Summary 1.1.Business Case 1.2.Solution Overview 1.3.Key

More information

PROFESSIONAL. NoSQL. Shashank Tiwari WILEY. John Wiley & Sons, Inc.

PROFESSIONAL. NoSQL. Shashank Tiwari WILEY. John Wiley & Sons, Inc. PROFESSIONAL NoSQL Shashank Tiwari WILEY John Wiley & Sons, Inc. Examining CONTENTS INTRODUCTION xvil CHAPTER 1: NOSQL: WHAT IT IS AND WHY YOU NEED IT 3 Definition and Introduction 4 Context and a Bit

More information

MongoDB and Mysql: Which one is a better fit for me? Room 204-2:20PM-3:10PM

MongoDB and Mysql: Which one is a better fit for me? Room 204-2:20PM-3:10PM MongoDB and Mysql: Which one is a better fit for me? Room 204-2:20PM-3:10PM About us Adamo Tonete MongoDB Support Engineer Agustín Gallego MySQL Support Engineer Agenda What are MongoDB and MySQL; NoSQL

More information

Distributed PostgreSQL with YugaByte DB

Distributed PostgreSQL with YugaByte DB Distributed PostgreSQL with YugaByte DB Karthik Ranganathan PostgresConf Silicon Valley Oct 16, 2018 1 CHECKOUT THIS REPO: github.com/yugabyte/yb-sql-workshop 2 About Us Founders Kannan Muthukkaruppan,

More information

GridGain and Apache Ignite In-Memory Performance with Durability of Disk

GridGain and Apache Ignite In-Memory Performance with Durability of Disk GridGain and Apache Ignite In-Memory Performance with Durability of Disk Dmitriy Setrakyan Apache Ignite PMC GridGain Founder & CPO http://ignite.apache.org #apacheignite Agenda What is GridGain and Ignite

More information

Big Data Analytics using Apache Hadoop and Spark with Scala

Big Data Analytics using Apache Hadoop and Spark with Scala Big Data Analytics using Apache Hadoop and Spark with Scala Training Highlights : 80% of the training is with Practical Demo (On Custom Cloudera and Ubuntu Machines) 20% Theory Portion will be important

More information

SCYLLA: NoSQL at Ludicrous Speed. 主讲人 :ScyllaDB 软件工程师贺俊

SCYLLA: NoSQL at Ludicrous Speed. 主讲人 :ScyllaDB 软件工程师贺俊 SCYLLA: NoSQL at Ludicrous Speed 主讲人 :ScyllaDB 软件工程师贺俊 Today we will cover: + Intro: Who we are, what we do, who uses it + Why we started ScyllaDB + Why should you care + How we made design decisions to

More information

What s New in MySQL 5.7 Geir Høydalsvik, Sr. Director, MySQL Engineering. Copyright 2015, Oracle and/or its affiliates. All rights reserved.

What s New in MySQL 5.7 Geir Høydalsvik, Sr. Director, MySQL Engineering. Copyright 2015, Oracle and/or its affiliates. All rights reserved. What s New in MySQL 5.7 Geir Høydalsvik, Sr. Director, MySQL Engineering Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes

More information

WHITEPAPER. MemSQL Enterprise Feature List

WHITEPAPER. MemSQL Enterprise Feature List WHITEPAPER MemSQL Enterprise Feature List 2017 MemSQL Enterprise Feature List DEPLOYMENT Provision and deploy MemSQL anywhere according to your desired cluster configuration. On-Premises: Maximize infrastructure

More information

Bitnami Cassandra for Huawei Enterprise Cloud

Bitnami Cassandra for Huawei Enterprise Cloud Bitnami Cassandra for Huawei Enterprise Cloud Description Apache Cassandra is an open source distributed database management system designed to handle large amounts of data across many commodity servers,

More information

[This is not an article, chapter, of conference paper!]

[This is not an article, chapter, of conference paper!] http://www.diva-portal.org [This is not an article, chapter, of conference paper!] Performance Comparison between Scaling of Virtual Machines and Containers using Cassandra NoSQL Database Sogand Shirinbab,

More information

Hadoop Development Introduction

Hadoop Development Introduction Hadoop Development Introduction What is Bigdata? Evolution of Bigdata Types of Data and their Significance Need for Bigdata Analytics Why Bigdata with Hadoop? History of Hadoop Why Hadoop is in demand

More information

CAP Theorem, BASE & DynamoDB

CAP Theorem, BASE & DynamoDB Indian Institute of Science Bangalore, India भ रत य व ज ञ न स स थ न ब गल र, भ रत DS256:Jan18 (3:1) Department of Computational and Data Sciences CAP Theorem, BASE & DynamoDB Yogesh Simmhan Yogesh Simmhan

More information

Cloud Computing & Visualization

Cloud Computing & Visualization Cloud Computing & Visualization Workflows Distributed Computation with Spark Data Warehousing with Redshift Visualization with Tableau #FIUSCIS School of Computing & Information Sciences, Florida International

More information

Installation and Configuration Guide for Cassandra Message Store Release 8.0.2

Installation and Configuration Guide for Cassandra Message Store Release 8.0.2 [1]Oracle Communications Messaging Server Installation and Configuration Guide for Cassandra Message Store Release 8.0.2 E79615-01 October 2017 Oracle Communications Messaging Server Installation and Configuration

More information

Zadara Enterprise Storage in

Zadara Enterprise Storage in Zadara Enterprise Storage in Google Cloud Platform (GCP) Deployment Guide March 2017 Revision A 2011 2017 ZADARA Storage, Inc. All rights reserved. Zadara Storage / GCP - Deployment Guide Page 1 Contents

More information

MySQL for Database Administrators Ed 3.1

MySQL for Database Administrators Ed 3.1 Oracle University Contact Us: 1.800.529.0165 MySQL for Database Administrators Ed 3.1 Duration: 5 Days What you will learn The MySQL for Database Administrators training is designed for DBAs and other

More information

Introduction to Hadoop. High Availability Scaling Advantages and Challenges. Introduction to Big Data

Introduction to Hadoop. High Availability Scaling Advantages and Challenges. Introduction to Big Data Introduction to Hadoop High Availability Scaling Advantages and Challenges Introduction to Big Data What is Big data Big Data opportunities Big Data Challenges Characteristics of Big data Introduction

More information

TIBCO ActiveMatrix BusinessWorks Plug-in for Apache Cassandra User's Guide

TIBCO ActiveMatrix BusinessWorks Plug-in for Apache Cassandra User's Guide TIBCO ActiveMatrix BusinessWorks Plug-in for Apache Cassandra User's Guide Software Release 6.3 August 2017 Two-Second Advantage 2 Important Information SOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO

More information

c 2014 Ala Walid Shafe Alkhaldi

c 2014 Ala Walid Shafe Alkhaldi c 2014 Ala Walid Shafe Alkhaldi LEVERAGING METADATA IN NOSQL STORAGE SYSTEMS BY ALA WALID SHAFE ALKHALDI THESIS Submitted in partial fulfillment of the requirements for the degree of Master of Science

More information

SQL Server 2014 Training. Prepared By: Qasim Nadeem

SQL Server 2014 Training. Prepared By: Qasim Nadeem SQL Server 2014 Training Prepared By: Qasim Nadeem SQL Server 2014 Module: 1 Architecture &Internals of SQL Server Engine Module : 2 Installing, Upgrading, Configuration, Managing Services and Migration

More information

Course Content MongoDB

Course Content MongoDB Course Content MongoDB 1. Course introduction and mongodb Essentials (basics) 2. Introduction to NoSQL databases What is NoSQL? Why NoSQL? Difference Between RDBMS and NoSQL Databases Benefits of NoSQL

More information

Dynamo. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Motivation System Architecture Evaluation

Dynamo. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Motivation System Architecture Evaluation Dynamo Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/20 Outline Motivation 1 Motivation 2 3 Smruti R. Sarangi Leader

More information

Facebook, 14 Fast projection index, 84 First database revolution data handling code, 6 DBMS, 6 network and hierarchical model, 6 7

Facebook, 14 Fast projection index, 84 First database revolution data handling code, 6 DBMS, 6 network and hierarchical model, 6 7 Index A Aerospike, 91, 217 Aerospike query language (AQL), 218 AJAX. See Asynchronous JavaScript and XML (AJAX) Alternative persistence model, 92 Amazon ACID RDBMS, 46 Dynamo, 14, 45 46 DynamoDB, 219 hashing,

More information

Hadoop. Course Duration: 25 days (60 hours duration). Bigdata Fundamentals. Day1: (2hours)

Hadoop. Course Duration: 25 days (60 hours duration). Bigdata Fundamentals. Day1: (2hours) Bigdata Fundamentals Day1: (2hours) 1. Understanding BigData. a. What is Big Data? b. Big-Data characteristics. c. Challenges with the traditional Data Base Systems and Distributed Systems. 2. Distributions:

More information

Virtualization with VMware ESX and VirtualCenter SMB to Enterprise

Virtualization with VMware ESX and VirtualCenter SMB to Enterprise Virtualization with VMware ESX and VirtualCenter SMB to Enterprise This class is an intense, five-day introduction to virtualization using VMware s immensely popular Virtual Infrastructure suite including

More information

Exploring Cassandra and HBase with BigTable Model

Exploring Cassandra and HBase with BigTable Model Exploring Cassandra and HBase with BigTable Model Hemanth Gokavarapu hemagoka@indiana.edu (Guidance of Prof. Judy Qiu) Department of Computer Science Indiana University Bloomington Abstract Cassandra is

More information

NoSQL systems. Lecture 21 (optional) Instructor: Sudeepa Roy. CompSci 516 Data Intensive Computing Systems

NoSQL systems. Lecture 21 (optional) Instructor: Sudeepa Roy. CompSci 516 Data Intensive Computing Systems CompSci 516 Data Intensive Computing Systems Lecture 21 (optional) NoSQL systems Instructor: Sudeepa Roy Duke CS, Spring 2016 CompSci 516: Data Intensive Computing Systems 1 Key- Value Stores Duke CS,

More information

ITS. MySQL for Database Administrators (40 Hours) (Exam code 1z0-883) (OCP My SQL DBA)

ITS. MySQL for Database Administrators (40 Hours) (Exam code 1z0-883) (OCP My SQL DBA) MySQL for Database Administrators (40 Hours) (Exam code 1z0-883) (OCP My SQL DBA) Prerequisites Have some experience with relational databases and SQL What will you learn? The MySQL for Database Administrators

More information

Oracle NoSQL Database Enterprise Edition, Version 18.1

Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database is a scalable, distributed NoSQL database, designed to provide highly reliable, flexible and available data management across

More information

Document Sub Title. Yotpo. Technical Overview 07/18/ Yotpo

Document Sub Title. Yotpo. Technical Overview 07/18/ Yotpo Document Sub Title Yotpo Technical Overview 07/18/2016 2015 Yotpo Contents Introduction... 3 Yotpo Architecture... 4 Yotpo Back Office (or B2B)... 4 Yotpo On-Site Presence... 4 Technologies... 5 Real-Time

More information

The course modules of MongoDB developer and administrator online certification training:

The course modules of MongoDB developer and administrator online certification training: The course modules of MongoDB developer and administrator online certification training: 1 An Overview of the Course Introduction to the course Table of Contents Course Objectives Course Overview Value

More information

OpsCenter 6.0 User Guide (Earlier version)

OpsCenter 6.0 User Guide (Earlier version) OpsCenter 6.0 User Guide (Earlier version) Updated: 2018-08-20-07:00 2018 DataStax, Inc. All rights reserved. DataStax, Titan, and TitanDB are registered trademark of DataStax, Inc. and its subsidiaries

More information

FAQs Snapshots and locks Vector Clock

FAQs Snapshots and locks Vector Clock //08 CS5 Introduction to Big - FALL 08 W.B.0.0 CS5 Introduction to Big //08 CS5 Introduction to Big - FALL 08 W.B. FAQs Snapshots and locks Vector Clock PART. LARGE SCALE DATA STORAGE SYSTEMS NO SQL DATA

More information

Ghislain Fourny. Big Data 5. Wide column stores

Ghislain Fourny. Big Data 5. Wide column stores Ghislain Fourny Big Data 5. Wide column stores Data Technology Stack User interfaces Querying Data stores Indexing Processing Validation Data models Syntax Encoding Storage 2 Where we are User interfaces

More information

Tools for Social Networking Infrastructures

Tools for Social Networking Infrastructures Tools for Social Networking Infrastructures 1 Cassandra - a decentralised structured storage system Problem : Facebook Inbox Search hundreds of millions of users distributed infrastructure inbox changes

More information

Copyright 2013, Oracle and/or its affiliates. All rights reserved.

Copyright 2013, Oracle and/or its affiliates. All rights reserved. 1 Oracle NoSQL Database: Release 3.0 What s new and why you care Dave Segleau NoSQL Product Manager The following is intended to outline our general product direction. It is intended for information purposes

More information

VMware vsphere with ESX 4 and vcenter

VMware vsphere with ESX 4 and vcenter VMware vsphere with ESX 4 and vcenter This class is a 5-day intense introduction to virtualization using VMware s immensely popular vsphere suite including VMware ESX 4 and vcenter. Assuming no prior virtualization

More information

DATA SCIENCE USING SPARK: AN INTRODUCTION

DATA SCIENCE USING SPARK: AN INTRODUCTION DATA SCIENCE USING SPARK: AN INTRODUCTION TOPICS COVERED Introduction to Spark Getting Started with Spark Programming in Spark Data Science with Spark What next? 2 DATA SCIENCE PROCESS Exploratory Data

More information