NoSQL BENCHMARKING AND TUNING. Nachiket Kate Santosh Kangane Ankit Lakhotia Persistent Systems Ltd. Pune, India
The large variety of available NoSQL options today has made it difficult for developers to choose the appropriate system for their usage. This paper presents the authors' experiences with MongoDB and Cassandra in a benchmarking activity carried out in a non-biased manner with the help of the YCSB framework. The databases were benchmarked against close-to-real-life workloads, and performance parameters were tuned along the way. The intent is to share the experience and results of this benchmarking activity.

1. Introduction:
This paper presents the benchmarking and tuning activity carried out with MongoDB and Cassandra. Some challenges we faced before beginning were: which databases? What type of hardware? How to benchmark these databases in a non-biased and generic way? Some systems have made the decision to optimize writes by using on-disk structures that can be maintained using sequential I/O (as in the case of Cassandra and HBase), while others have optimized for random reads by using a more traditional buffer-pool architecture (as in the case of PNUTS). Furthermore, decisions about data partitioning and placement, replication, transactional consistency, and so on all have an impact on performance. Before starting this activity we went through many published benchmarks for the chosen databases and others. The benchmark published by MongoDB shows MongoDB beating the others, and the same was the case with Cassandra and the rest; thus we decided to come up with a non-biased benchmark. We chose MongoDB and Cassandra for this activity because of their popularity and good community support in terms of deployment, performance tuning, etc. We chose the YCSB framework from Yahoo because of its generic, close-to-real-life workloads and customizable structure. In addition, the framework is extensible to newer types of workloads with very little effort.
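The write-optimized, sequential-I/O design mentioned above (the approach taken by Cassandra and HBase) can be pictured as a toy log-structured store: writes land in an in-memory table that is periodically flushed as an immutable sorted run, while reads must consult the memtable and every run. This is a deliberately simplified illustration, not either database's actual implementation:

```python
class TinyLSM:
    """Toy log-structured store: writes append to an in-memory memtable,
    which is flushed as an immutable sorted run (sequential I/O); reads
    check the memtable, then the runs from newest to oldest."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.runs = []                 # sorted (key, value) lists, newest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # Flush: write a whole sorted run at once (sequential, not in-place).
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:          # newest run wins for overwritten keys
            for k, v in run:           # linear scan for brevity; real stores binary-search
                if k == key:
                    return v
        return None

store = TinyLSM()
for i in range(10):
    store.put(f"key{i}", i)
```

Writes here never touch old data in place, which is why such designs favor insert/update-heavy workloads, while each read may pay for checking several runs.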
The goal of this activity was to benchmark the two databases mentioned above using YCSB, find useful performance parameters, and share the experiences with the community.

1.1 Yahoo! Cloud Serving Benchmark
The Yahoo! Cloud Serving Benchmark (YCSB) is an open-source specification and program suite for evaluating the retrieval and maintenance capabilities of data stores, and is often used to compare the relative performance of NoSQL database management systems. YCSB was published by Yahoo in 2010 as a tool for benchmarking NoSQL databases. It provides connectors to multiple stores, including MongoDB, Cassandra, HBase, MySQL, Redis, etc., with the goal of facilitating performance comparisons of the new generation of cloud data-serving systems against a core set of benchmarks.

Benchmarking tiers: Following YCSB, the benchmark tiers considered for this activity were:
1. Performance
2. Scaling
Based on the operations permitted by NoSQL databases and the operations relevant to these benchmarking tiers, CRUD operations were exercised.
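YCSB drives each workload by choosing an operation according to the workload's mix and picking target records from a request distribution such as Zipfian, so a few "popular" records receive most requests. A minimal illustrative sketch of both ideas (not YCSB's actual generator):

```python
import random

def zipfian_sampler(n, theta=0.99):
    """Return a sampler over keys 0..n-1 with Zipfian skew:
    low-numbered keys are far more popular than high-numbered ones."""
    weights = [1.0 / (i + 1) ** theta for i in range(n)]
    def sample():
        return random.choices(range(n), weights=weights)[0]
    return sample

def next_operation(mix):
    """Pick an operation according to a workload mix,
    e.g. workload A is {'read': 0.50, 'update': 0.50}."""
    ops, probs = zip(*mix.items())
    return random.choices(ops, weights=probs)[0]

sample_key = zipfian_sampler(1000)
op = next_operation({'read': 0.50, 'update': 0.50})
```

Under this skew a session-store or photo-tagging workload hammers a small hot set, which is exactly what the row/key caches and buffer pools discussed later are designed to exploit.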
Table 1: Benchmarking Workloads
Workload A (update heavy): Read 50% / Update 50%, Zipfian record selection. Example: session store recording recent actions in a user session.
Workload B (read heavy): Read 95% / Update 5%, Zipfian. Example: photo tagging; adding a tag is an update, but most operations read tags.
Workload C (read only): Read 100%, Zipfian. Example: user profile cache where profiles are constructed elsewhere (e.g. Hadoop).
Workload E (short ranges): Scan 95% / Insert 5%, Zipfian/Uniform. Example: threaded conversations where each scan is for the posts in a given thread (assumed to be clustered by thread id).
Workload F (mixed): Read 47% / Update 47% / Insert 6%, Zipfian/Uniform. A widely occurring mix of user activities.

Figure 1: YCSB architecture
Figure 2: MongoDB test environment (YCSB server, mongos with load balancing, Shards 1-5)

2. Test Environment
The test system was hosted in a private cloud in a lab, as shown in Figures 2 and 3. Load generation was done from a single dedicated YCSB machine in both the MongoDB and Cassandra cases. Each virtual machine in this benchmark had the same set of software packages, with a CentOS operating system and 1 GB of storage space. All servers were on the same subnetwork of 1 GB/s bandwidth.

2.1 MongoDB
The MongoDB config server was hosted on a dedicated machine with 3 instances of mongos (1 active and 2 for fault tolerance). The cluster consisted of five MongoDB shards. Each shard had a 3-member replica set (1 primary and 2 secondaries) hosted on the same machine. Writes were performed on the primary, and reads were load-balanced across all 3 replicas.

2.2 Cassandra
The Cassandra cluster consisted of five servers.

Figure 3: Cassandra test environment (ring of Nodes 1-5)

3. Benchmark Phases and Tuning Parameters
3.1. Benchmark Phases
3.1.1 Phase 1 (Insert Operation)
Goal: Benchmark the insert operation using the YCSB load phase, fine-tuning server parameters based on observations during test runs.
Figure 4: Phase 1 load variation (workload A; record counts from 1K up to 1M)

3.1.2 Phase 2 (Read-Write Operations)
Goal: Benchmark the read/update/insert capacity of each database using the YCSB run phase with tuning parameters.
Figure 5: Phase 2 load variation (workloads A, B, C, E, F; record counts from 1K up to 1M; default maximum connections)

3.1.3 Phase 3 (Latency and Throughput Scaling)
Goal: Benchmark the scaling capacity of each database for latency and throughput.
Figure 6: Phase 3 load variation (workloads A, B, E; varying cluster size)

3.2. Tuning Parameters
The following tuning parameters were considered during the tests.

MongoDB
a. nohttpinterface=true: The HTTP interface was disabled, since enabling it can increase network exposure.
b. noobjcheck: Disables the default document validation that MongoDB performs on all incoming BSON documents.
c. maxconns=2: Maximum number of connections to mongod.
d. Journaling=disabled: With journaling enabled, MongoDB creates a journal subdirectory within the directory defined by dbpath (/data/db by default). The journal directory holds journal files, which contain write-ahead redo logs, as well as a last-sequence-number file.
e. Sharding: Sharding (horizontal scaling) divides the data set and distributes the data over multiple servers. Each shard is an independent database, and collectively the shards make up a single logical database.
f. Collection capping: Capped collections are fixed-size collections that support high-throughput operations that insert and retrieve documents based on insertion order. Capped collections work like circular buffers: once a collection fills its allocated space, it makes room for new documents by overwriting the oldest documents in the collection. Capped collections guarantee preservation of the insertion order.
As a result, queries do not need an index to return documents in insertion order; without this indexing overhead, they can support higher insertion throughput.
g. slaveOk: Allows read operations on the current connection to run on secondary members.

Cassandra
a. index_interval=128: Controls the sampling of row keys for each SSTable. The default value of 128 means one out of every 128 keys is held in memory. index_interval is independent of the key cache.
b. Bloom filter=0.1: Cassandra uses Bloom filters to determine whether an SSTable has data for a particular row. Bloom filters are unused for range scans, but are used for index scans. Bloom filter settings range from 0 to 1.0 (disabled).
c. Consistency level=QUORUM: Consistency levels in Cassandra can be configured to manage availability versus data accuracy, for reads as well as writes.
i. ONE: a write must be written to the commit log and memtable of at least one replica node.
ii. ANY: a write must be written to at least one node.
iii. ALL: a write must be written to the commit log and memtable on all replica nodes in the cluster for that partition.
iv. QUORUM: a write must be written to the commit log and memtable on a quorum of replica nodes.
d. Read repair chance=default (0.1): Read repair keeps data consistent by comparing and updating the data across all the replicas. Each column family has a read_repair_chance property that controls the chance of a read repair being triggered.
e. Caching=ALL: Cassandra offers built-in key and row caches. The key cache is essentially a cache of the primary key index for a table, whereas the row cache is more similar to a traditional cache like memcached: when a row is accessed, the entire row is pulled into memory.
f. Compaction=SizeTieredCompactionStrategy: The compaction process merges keys, combines columns, evicts tombstones, consolidates SSTables, and creates a new index in the merged SSTable. Using CQL, one of the following compaction strategies can be configured:
i. Size-tiered compaction
ii. Date-tiered compaction
iii. Leveled compaction
g. Load balancing policy: A measure of how distant a node is from the client, which may influence how the load balancer distributes requests and how many connections are opened to the node. Load balancing policies decide how to distribute requests among the possible coordinator nodes in the cluster. Subclasses of the load balancing policy include RoundRobinPolicy, DCAwareRoundRobinPolicy, WhiteListRoundRobinPolicy, and TokenAwarePolicy. This property was not set, as it needs to be set from the code modules responsible for data insertion; in our case YCSB was used for data generation.
h. Concurrency settings: tuning concurrent reads and concurrent writes. concurrent_reads (default 8): a good rule of thumb is 4 concurrent reads per processor core; the value can be increased for systems with fast I/O storage. concurrent_writes (default 32): usually does not need tuning, since writes are fast; if needed, increase the value for systems with many cores.
i. Swap space=OFF: Swap space in Linux is used when physical memory (RAM) is full: if the system needs more memory and RAM is full, inactive pages are moved to the swap space.
j. JVM tuning: Garbage collection is the JVM's process of freeing up unused Java objects in the Java heap, where the objects of a Java program live. The JVM heap size determines how often and how long the VM spends collecting garbage. An acceptable garbage-collection rate is application-specific and should be adjusted after analyzing the actual time and frequency of collections. A large heap size makes full garbage collection slower but less frequent. To ensure maximum performance during benchmarking, we set high heap size values so that garbage collection would not occur during the entire run of a benchmark.
k. Replication factor=number of nodes in the ring: Cassandra stores replicas on multiple nodes to ensure reliability and fault tolerance. A replication strategy determines the nodes where replicas are placed; the total number of replicas across the cluster is referred to as the replication factor. A replication factor of 1 means only one copy of each row is present, on one node; a replication factor of 2 means two copies of each row are present, each on a different node.
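The QUORUM level and the replication factor interact arithmetically: a quorum is a majority of the replicas, and a read is guaranteed to see the latest write whenever the read and write replica sets must overlap (R + W > RF). A small sketch of this arithmetic, using RF = 5 to match the 5-node ring configured above:

```python
def quorum(replication_factor):
    """Quorum size: a majority of the replicas, floor(RF/2) + 1."""
    return replication_factor // 2 + 1

def is_strongly_consistent(read_replicas, write_replicas, replication_factor):
    """Reads are guaranteed to see the latest write when the read set
    and write set must intersect: R + W > RF."""
    return read_replicas + write_replicas > replication_factor

# With RF = 5, QUORUM reads and writes each touch 3 nodes and always overlap.
rf = 5
assert quorum(rf) == 3
assert is_strongly_consistent(quorum(rf), quorum(rf), rf)
```

This is also why ONE/ONE (1 + 1 = 2, not greater than 5) trades consistency for latency, while ALL pays the full replication cost on every operation.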
4. Benchmark Results
4.1 Phase 1 (Insert Capacity)

Graph 1: Insert latency vs. throughput (MongoDB vs. Cassandra)
Graph 2: Insert capacity, no. of records inserted (MongoDB vs. Cassandra)

Observations: As the graphs above show, Cassandra is designed for insert/modify-heavy workloads, and in this phase, which focuses on the insert capacity of a database, Cassandra completely outperforms MongoDB. Average insert latency in Cassandra was around 4us versus 25us in MongoDB; average throughput in Cassandra was around 23 ops/sec versus 4 ops/sec in MongoDB. In this phase only configuration tuning was done and testing used the default number of threads, so throughput is not high for either database; throughput increases with the number of threads up to a saturation point. The insert-capacity graph shows that at a low number of records the insertion time is very small (less than 1 second) and hence not visible in the graph, but at higher loads, i.e. at 1M records, the difference is clear. The graph also shows that MongoDB was unable to scale well with an increasing number of records: we could only reach up to 1M records at a time in MongoDB, while Cassandra scaled without any failure even at 2M records.

4.2 Phase 2 (Read-Update Operations)

Graph 3: Read latency per workload (A, B, C)
Graph 4: Update latency per workload (A, B)
Graph 5: Throughput per workload (A, B, C)
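Figures such as average latency and ops/sec throughout these results are derived from per-operation timings recorded during a run. A minimal sketch of that computation (illustrative only, not YCSB's implementation):

```python
def summarize(latencies_us, duration_s):
    """Reduce raw per-operation timings to the two headline metrics:
    average latency in microseconds and throughput in operations/sec."""
    avg_latency_us = sum(latencies_us) / len(latencies_us)
    throughput_ops = len(latencies_us) / duration_s
    return avg_latency_us, throughput_ops

# e.g. 4 operations taking 200-400 us, observed over a 2-second window
avg, ops = summarize([200, 250, 300, 400], 2.0)   # avg 287.5 us, 2.0 ops/sec
```

Note that an average can hide tail behavior; YCSB itself also reports latency percentiles, which is why the graphs in this section should be read alongside the per-workload latency distributions.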
Observations: In this phase, environments were also tested with tuning parameters suggested by the individual databases and parameters found in other database benchmarking activities. The graphs show that Cassandra beats MongoDB by a clear margin in all cases, i.e. read operations, update operations, and overall system throughput. In terms of read latency, the average in MongoDB is 2.2ms but 1ms in Cassandra, and this latency was consistent across all workloads. For update latency, Cassandra serves its designed purpose best: the average for MongoDB is above 3ms but less than 5us in Cassandra. As workload C is read-only, it has no update latency. In terms of overall throughput, Cassandra reaches up to 14 ops/sec for workload A and at worst 1 ops/sec for workload C, whereas MongoDB reaches only up to 38 ops/sec at best for workload C and 317 ops/sec at worst for workload A. The benchmarking tests showed that MongoDB performs better as the proportion of read requests grows relative to updates, performing best on workload C. On the contrary, Cassandra performs best as the update/insert proportion grows relative to reads: by architecture, Cassandra is designed to serve update/insert requests better than reads. Even so, our observations show that Cassandra performs better than MongoDB for read operations as well.

4.3 Phase 3 (Scaling)

Three tests were conducted for each workload on clusters of 4, 5 and 6 nodes (for MongoDB, 1 mongos with the remaining nodes as shards), with record counts varying from 1K to 2M, performing the insert operation using the YCSB load phase.

Comparison for workload A
The following graphs show that MongoDB read and update latencies increase with increasing cluster size; exceptionally, they drop to quite good values at cluster size 4. According to MongoDB, cluster size affects latencies and this behavior is normal, but the trend line at higher node counts shows read latency increasing with the number of nodes.
Due to unavailability of additional hardware/nodes, we couldn't carry out the scale test with a higher number of nodes. Throughput in MongoDB increases with the number of nodes in the cluster, but with many nodes, node management becomes difficult and cluster performance starts degrading. In MongoDB, mongos can be a performance bottleneck and a single point of failure at higher loads, so replicas of mongos are also recommended.

Graph 6: Read latency, workload A
Graph 7: Update latency, workload A
Graph 8: Throughput, workload A
Graph 9: Read latency, workload B
Graph 10: Update latency, workload B
Graph 11: Throughput, workload B

Cassandra follows a ring architecture, i.e. there is no single master in the ring and all nodes can serve requests. The graphs above show that in Cassandra, scaling up does not hamper read and update latencies much, and throughput was also consistent with an increasing number of nodes; for insert/update operations Cassandra scales almost linearly. There is no single point of failure in Cassandra, as no master is present. All values presented are at the peak throughput of each system, for both MongoDB and Cassandra. For workload A, which is update heavy, Cassandra outperforms MongoDB in terms of update latency and throughput. Read latency was consistently better in MongoDB for all cluster sizes. Latencies in Cassandra were almost consistent across cluster sizes, unlike MongoDB.

Comparison for workload B
The graphs above show the behavior with increasing cluster size for workload B, i.e. read heavy. This workload matches the real-life scenario where most users perform more read operations than writes/updates. As MongoDB is better at read operations, it performs better in this scenario than in workload A: read latency drops with increasing cluster size, update latency is consistent, and the throughput achieved is higher than for workload A. Even so, Cassandra performs better than MongoDB in workload B, with considerably higher throughput. Similar to MongoDB, read and update latencies in Cassandra were stable with increasing cluster size. Though workload B is read heavy, update latencies in Cassandra stayed in the range of 1ms to 1.5ms, and read latencies were below 3ms to 4ms, decreasing with an increasing number of nodes.
Based on the graphs, Cassandra continues to outperform MongoDB for workload B as well. For workload B, read latencies in both databases decreased with an increasing number of nodes, with Cassandra showing lower latencies than MongoDB; from the graphs it appears that at higher node counts MongoDB would win on read latency. In terms of update latencies and throughput, Cassandra is the clear winner: update latencies were far lower than MongoDB's, and throughput was also higher, though not as high as the throughput Cassandra obtained for workload A.

5. Resource Utilization
5.1 MongoDB

Graph 12: CPU utilization on mongos
Graph 13: Memory utilization on mongos

These graphs show resource utilization of the systems for the test duration, with maximum resource utilization and the maximum number of connections available to the database; note that they cover tests with a varying number of DB connections, and the spikes are resource utilization during the tests. All resource counters stayed within limits even with the highest number of database connections. System load was sometimes high on the mongos node, but no errors were found in test execution. Resource utilization on the shards was very low compared to the mongos node, i.e. less than 3% for almost all performance counters. Though resource utilization was low, throughput did not scale accordingly in the tests. The limiting factor in this scenario could be the I/O device: a single storage device was used to store data, logs and replication, which can limit system scalability. Another factor could be the type of data and YCSB limitations: query tuning, read consistency and data distribution across collections were not possible with YCSB.

5.2 Cassandra

Graph 14: CPU utilization on Cassandra, workload A

The Cassandra graphs presented are for the scaling phase and workload A, which showed the highest resource utilization compared to workloads B and E.
Graph 15: Memory utilization on Cassandra, workload A
Graph 16: CPU utilization on Cassandra, workload E
Graph 17: Memory utilization on Cassandra, workload E

We observed that workloads A and E showed maximum resource utilization compared to workload B. As with MongoDB, the graphs above cover tests with a varying number of DB connections. They show that resource utilization was high in terms of CPU and memory (~8%); the reason is the increased number of threads in the system. Resource utilization increases with the number of threads; throughput also varied, but past the saturation point utilization keeps increasing while system throughput does not. System resource utilization reached almost 8% independent of cluster size as the thread count increased, so graphs for only one cluster size are included.

6. Summary:
a. According to our tests using YCSB, we found that Cassandra performs better than MongoDB in almost all cases, for all workloads.
b. In terms of throughput, Cassandra outperforms MongoDB and can scale to double the number of operations/sec: the maximum throughput achieved by MongoDB was 12 ops/sec versus 23 ops/sec for Cassandra.
c. For the insert operation, MongoDB couldn't scale well beyond 1M records: the operation failed with a disk I/O error, possibly because of the disk's low I/O rate. Our recommendation is to prefer SSDs or high-quality disks, and to keep logging and data storage on separate disks.
d. For the insert operation, Cassandra scaled much further than MongoDB: we were able to load 2M records without any failure, with an average insert latency of 4us.
e. In terms of read latency, MongoDB performs a little better than Cassandra at higher loads for the read-heavy workload. Our observation is that with increasing cluster size, read latency increases by a very small margin for Cassandra and decreases for MongoDB, but overall throughput remains lower in MongoDB than in Cassandra.
f. In terms of insert/update operation latency, Cassandra outperforms MongoDB: average update latency in Cassandra is 4us whereas it is 4us in MongoDB. Consistent latencies with very little fluctuation across cluster sizes help Cassandra scale well and reliably; in MongoDB, latency varies with cluster size by a noticeable amount, which makes it less predictable.
g. Another observation with Cassandra was that results vary by a very small amount across consecutive test executions. This seems to contradict the consistency statement above, but the variation is very small compared with overall latencies, and we therefore used the average of 3 runs.
h. MongoDB performs best with a consistency factor of 3 for all cluster sizes, but in Cassandra the consistency/replication factor should equal the number of nodes in the cluster for optimal performance.
i. Resource utilization in MongoDB is better than in Cassandra. mongos is the heaviest node in MongoDB and can be a single point of failure under heavy load; in Cassandra there is no load balancer or single master node in the cluster. Average resource utilization observed in both databases was around 5%-8%, but heap utilization is a point of concern in Cassandra at higher loads.
j. JVM tuning plays a much larger role in performance management for Cassandra than for MongoDB.
k. Cassandra scales almost linearly for the insert operation with increasing cluster size, though we couldn't test beyond a 5-node cluster; MongoDB doesn't scale well for the insert operation.
l. Both databases have their pros and cons, but Cassandra is the best choice for insert/update-heavy workloads where reads are comparatively rare, while MongoDB is the better choice for read-heavy workloads.
m. Both databases perform best when the thread pool/connection size is set to the number of cores * 5.

Acknowledgment
The authors of this paper would like to thank Mataprasad Agrawal, Senior Architect, and Dr. Rajesh Mansharamani for their support and guidance. The authors would also like to thank the anonymous CMG referees for their valuable feedback and reviews.
More informationMigrating to Cassandra in the Cloud, the Netflix Way
Migrating to Cassandra in the Cloud, the Netflix Way Jason Brown - @jasobrown Senior Software Engineer, Netflix Tech History, 1998-2008 In the beginning, there was the webapp and a single database in a
More informationScaling Without Sharding. Baron Schwartz Percona Inc Surge 2010
Scaling Without Sharding Baron Schwartz Percona Inc Surge 2010 Web Scale!!!! http://www.xtranormal.com/watch/6995033/ A Sharding Thought Experiment 64 shards per proxy [1] 1 TB of data storage per node
More informationBigtable: A Distributed Storage System for Structured Data. Andrew Hon, Phyllis Lau, Justin Ng
Bigtable: A Distributed Storage System for Structured Data Andrew Hon, Phyllis Lau, Justin Ng What is Bigtable? - A storage system for managing structured data - Used in 60+ Google services - Motivation:
More informationCLOUD-SCALE FILE SYSTEMS
Data Management in the Cloud CLOUD-SCALE FILE SYSTEMS 92 Google File System (GFS) Designing a file system for the Cloud design assumptions design choices Architecture GFS Master GFS Chunkservers GFS Clients
More informationSharding Introduction
search MongoDB Home Admin Zone Sharding Sharding Introduction Sharding Introduction MongoDB supports an automated sharding architecture, enabling horizontal scaling across multiple nodes. For applications
More informationFlash Storage Complementing a Data Lake for Real-Time Insight
Flash Storage Complementing a Data Lake for Real-Time Insight Dr. Sanhita Sarkar Global Director, Analytics Software Development August 7, 2018 Agenda 1 2 3 4 5 Delivering insight along the entire spectrum
More informationPerformance Evaluation of NoSQL Databases
Performance Evaluation of NoSQL Databases A Case Study - John Klein, Ian Gorton, Neil Ernst, Patrick Donohoe, Kim Pham, Chrisjan Matser February 2015 PABS '15: Proceedings of the 1st Workshop on Performance
More informationVOLTDB + HP VERTICA. page
VOLTDB + HP VERTICA ARCHITECTURE FOR FAST AND BIG DATA ARCHITECTURE FOR FAST + BIG DATA FAST DATA Fast Serve Analytics BIG DATA BI Reporting Fast Operational Database Streaming Analytics Columnar Analytics
More informationCertified Apache Cassandra Professional VS-1046
Certified Apache Cassandra Professional VS-1046 Certified Apache Cassandra Professional Certification Code VS-1046 Vskills certification for Apache Cassandra Professional assesses the candidate for skills
More informationSpotify. Scaling storage to million of users world wide. Jimmy Mårdell October 14, 2014
Cassandra @ Spotify Scaling storage to million of users world wide! Jimmy Mårdell October 14, 2014 2 About me Jimmy Mårdell Tech Product Owner in the Cassandra team 4 years at Spotify
More information4 Myths about in-memory databases busted
4 Myths about in-memory databases busted Yiftach Shoolman Co-Founder & CTO @ Redis Labs @yiftachsh, @redislabsinc Background - Redis Created by Salvatore Sanfilippo (@antirez) OSS, in-memory NoSQL k/v
More informationHow To Rock with MyRocks. Vadim Tkachenko CTO, Percona Webinar, Jan
How To Rock with MyRocks Vadim Tkachenko CTO, Percona Webinar, Jan-16 2019 Agenda MyRocks intro and internals MyRocks limitations Benchmarks: When to choose MyRocks over InnoDB Tuning for the best results
More informationCassandra- A Distributed Database
Cassandra- A Distributed Database Tulika Gupta Department of Information Technology Poornima Institute of Engineering and Technology Jaipur, Rajasthan, India Abstract- A relational database is a traditional
More informationNew Oracle NoSQL Database APIs that Speed Insertion and Retrieval
New Oracle NoSQL Database APIs that Speed Insertion and Retrieval O R A C L E W H I T E P A P E R F E B R U A R Y 2 0 1 6 1 NEW ORACLE NoSQL DATABASE APIs that SPEED INSERTION AND RETRIEVAL Introduction
More informationTools for Social Networking Infrastructures
Tools for Social Networking Infrastructures 1 Cassandra - a decentralised structured storage system Problem : Facebook Inbox Search hundreds of millions of users distributed infrastructure inbox changes
More informationExperiment-Driven Evaluation of Cloud-based Distributed Systems
Experiment-Driven Evaluation of Cloud-based Distributed Systems Markus Klems,, TU Berlin 11th Symposium and Summer School On Service-Oriented Computing Agenda Introduction Experiments Experiment Automation
More informationAccelerate MySQL for Demanding OLAP and OLTP Use Cases with Apache Ignite. Peter Zaitsev, Denis Magda Santa Clara, California April 25th, 2017
Accelerate MySQL for Demanding OLAP and OLTP Use Cases with Apache Ignite Peter Zaitsev, Denis Magda Santa Clara, California April 25th, 2017 About the Presentation Problems Existing Solutions Denis Magda
More informationThe Google File System
October 13, 2010 Based on: S. Ghemawat, H. Gobioff, and S.-T. Leung: The Google file system, in Proceedings ACM SOSP 2003, Lake George, NY, USA, October 2003. 1 Assumptions Interface Architecture Single
More informationThe Google File System
The Google File System Sanjay Ghemawat, Howard Gobioff and Shun Tak Leung Google* Shivesh Kumar Sharma fl4164@wayne.edu Fall 2015 004395771 Overview Google file system is a scalable distributed file system
More informationNutanix Tech Note. Virtualizing Microsoft Applications on Web-Scale Infrastructure
Nutanix Tech Note Virtualizing Microsoft Applications on Web-Scale Infrastructure The increase in virtualization of critical applications has brought significant attention to compute and storage infrastructure.
More informationYCSB++ Benchmarking Tool Performance Debugging Advanced Features of Scalable Table Stores
YCSB++ Benchmarking Tool Performance Debugging Advanced Features of Scalable Table Stores Swapnil Patil Milo Polte, Wittawat Tantisiriroj, Kai Ren, Lin Xiao, Julio Lopez, Garth Gibson, Adam Fuchs *, Billie
More informationJVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra
JVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra Legal Notices Apache Cassandra, Spark and Solr and their respective logos are trademarks or registered trademarks of
More informationNOSQL DATABASE SYSTEMS: DECISION GUIDANCE AND TRENDS. Big Data Technologies: NoSQL DBMS (Decision Guidance) - SoSe
NOSQL DATABASE SYSTEMS: DECISION GUIDANCE AND TRENDS h_da Prof. Dr. Uta Störl Big Data Technologies: NoSQL DBMS (Decision Guidance) - SoSe 2017 163 Performance / Benchmarks Traditional database benchmarks
More informationGoal of the presentation is to give an introduction of NoSQL databases, why they are there.
1 Goal of the presentation is to give an introduction of NoSQL databases, why they are there. We want to present "Why?" first to explain the need of something like "NoSQL" and then in "What?" we go in
More informationCA485 Ray Walshe Google File System
Google File System Overview Google File System is scalable, distributed file system on inexpensive commodity hardware that provides: Fault Tolerance File system runs on hundreds or thousands of storage
More information<Insert Picture Here> MySQL Cluster What are we working on
MySQL Cluster What are we working on Mario Beck Principal Consultant The following is intended to outline our general product direction. It is intended for information purposes only,
More informationCompaction strategies in Apache Cassandra
Thesis no: MSEE-2016:31 Compaction strategies in Apache Cassandra Analysis of Default Cassandra stress model Venkata Satya Sita J S Ravu Faculty of Computing Blekinge Institute of Technology SE-371 79
More informationCaSSanDra: An SSD Boosted Key- Value Store
CaSSanDra: An SSD Boosted Key- Value Store Prashanth Menon, Tilmann Rabl, Mohammad Sadoghi (*), Hans- Arno Jacobsen * UNIVERSITY OF TORONTO!1 Outline ApplicaHon Performance Management Cassandra and SSDs
More informationSTORAGE LATENCY x. RAMAC 350 (600 ms) NAND SSD (60 us)
1 STORAGE LATENCY 2 RAMAC 350 (600 ms) 1956 10 5 x NAND SSD (60 us) 2016 COMPUTE LATENCY 3 RAMAC 305 (100 Hz) 1956 10 8 x 1000x CORE I7 (1 GHZ) 2016 NON-VOLATILE MEMORY 1000x faster than NAND 3D XPOINT
More informationBuilding High Performance Apps using NoSQL. Swami Sivasubramanian General Manager, AWS NoSQL
Building High Performance Apps using NoSQL Swami Sivasubramanian General Manager, AWS NoSQL Building high performance apps There is a lot to building high performance apps Scalability Performance at high
More informationDeploy a High-Performance Database Solution: Cisco UCS B420 M4 Blade Server with Fusion iomemory PX600 Using Oracle Database 12c
White Paper Deploy a High-Performance Database Solution: Cisco UCS B420 M4 Blade Server with Fusion iomemory PX600 Using Oracle Database 12c What You Will Learn This document demonstrates the benefits
More informationBigtable. Presenter: Yijun Hou, Yixiao Peng
Bigtable Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google, Inc. OSDI 06 Presenter: Yijun Hou, Yixiao Peng
More informationBigtable. A Distributed Storage System for Structured Data. Presenter: Yunming Zhang Conglong Li. Saturday, September 21, 13
Bigtable A Distributed Storage System for Structured Data Presenter: Yunming Zhang Conglong Li References SOCC 2010 Key Note Slides Jeff Dean Google Introduction to Distributed Computing, Winter 2008 University
More informationarxiv: v1 [cs.db] 25 Nov 2018
Enabling Efficient Updates in KV Storage via Hashing: Design and Performance Evaluation Yongkun Li, Helen H. W. Chan, Patrick P. C. Lee, and Yinlong Xu University of Science and Technology of China The
More informationPerformance Evaluation of Cassandra in a Virtualized environment
Master of Science in Computer Science February 2017 Performance Evaluation of Cassandra in a Virtualized environment Mohit Vellanki Faculty of Computing Blekinge Institute of Technology SE-371 79 Karlskrona
More informationReducing the Tail Latency of a Distributed NoSQL Database
University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Computer Science and Engineering: Theses, Dissertations, and Student Research Computer Science and Engineering, Department
More informationBig Data Development CASSANDRA NoSQL Training - Workshop. November 20 to (5 days) 9 am to 5 pm HOTEL DUBAI GRAND DUBAI
Big Data Development CASSANDRA NoSQL Training - Workshop November 20 to 24 2016 (5 days) 9 am to 5 pm HOTEL DUBAI GRAND DUBAI ISIDUS TECH TEAM FZE PO Box 9798 Dubai UAE, email training-coordinator@isidusnet
More informationMongoDB Backup and Recovery Field Guide. Tim Vaillancourt Sr Technical Operations Architect, Percona
MongoDB Backup and Recovery Field Guide Tim Vaillancourt Sr Technical Operations Architect, Percona `whoami` { name: tim, lastname: vaillancourt, employer: percona, techs: [ mongodb, mysql, cassandra,
More informationFusion iomemory PCIe Solutions from SanDisk and Sqrll make Accumulo Hypersonic
WHITE PAPER Fusion iomemory PCIe Solutions from SanDisk and Sqrll make Accumulo Hypersonic Western Digital Technologies, Inc. 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Executive
More informationMongoDB and Mysql: Which one is a better fit for me? Room 204-2:20PM-3:10PM
MongoDB and Mysql: Which one is a better fit for me? Room 204-2:20PM-3:10PM About us Adamo Tonete MongoDB Support Engineer Agustín Gallego MySQL Support Engineer Agenda What are MongoDB and MySQL; NoSQL
More informationCSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2015 Lecture 14 NoSQL
CSE 544 Principles of Database Management Systems Magdalena Balazinska Winter 2015 Lecture 14 NoSQL References Scalable SQL and NoSQL Data Stores, Rick Cattell, SIGMOD Record, December 2010 (Vol. 39, No.
More informationCIB Session 12th NoSQL Databases Structures
CIB Session 12th NoSQL Databases Structures By: Shahab Safaee & Morteza Zahedi Software Engineering PhD Email: safaee.shx@gmail.com, morteza.zahedi.a@gmail.com cibtrc.ir cibtrc cibtrc 2 Agenda What is
More informationEngineering Goals. Scalability Availability. Transactional behavior Security EAI... CS530 S05
Engineering Goals Scalability Availability Transactional behavior Security EAI... Scalability How much performance can you get by adding hardware ($)? Performance perfect acceptable unacceptable Processors
More informationOutline. Introduction Background Use Cases Data Model & Query Language Architecture Conclusion
Outline Introduction Background Use Cases Data Model & Query Language Architecture Conclusion Cassandra Background What is Cassandra? Open-source database management system (DBMS) Several key features
More informationMyRocks deployment at Facebook and Roadmaps. Yoshinori Matsunobu Production Engineer / MySQL Tech Lead, Facebook Feb/2018, #FOSDEM #mysqldevroom
MyRocks deployment at Facebook and Roadmaps Yoshinori Matsunobu Production Engineer / MySQL Tech Lead, Facebook Feb/2018, #FOSDEM #mysqldevroom Agenda MySQL at Facebook MyRocks overview Production Deployment
More informationInsight Case Studies. Tuning the Beloved DB-Engines. Presented By Nithya Koka and Michael Arnold
Insight Case Studies Tuning the Beloved DB-Engines Presented By Nithya Koka and Michael Arnold Who is Nithya Koka? Senior Hadoop Administrator Project Lead Client Engagement On-Call Engineer Cluster Ninja
More informationMore on Testing and Large Scale Web Apps
More on Testing and Large Scale Web Apps Testing Functionality Tests - Unit tests: E.g. Mocha - Integration tests - End-to-end - E.g. Selenium - HTML CSS validation - forms and form validation - cookies
More informationJargons, Concepts, Scope and Systems. Key Value Stores, Document Stores, Extensible Record Stores. Overview of different scalable relational systems
Jargons, Concepts, Scope and Systems Key Value Stores, Document Stores, Extensible Record Stores Overview of different scalable relational systems Examples of different Data stores Predictions, Comparisons
More informationCISC 7610 Lecture 5 Distributed multimedia databases. Topics: Scaling up vs out Replication Partitioning CAP Theorem NoSQL NewSQL
CISC 7610 Lecture 5 Distributed multimedia databases Topics: Scaling up vs out Replication Partitioning CAP Theorem NoSQL NewSQL Motivation YouTube receives 400 hours of video per minute That is 200M hours
More informationEsgynDB Enterprise 2.0 Platform Reference Architecture
EsgynDB Enterprise 2.0 Platform Reference Architecture This document outlines a Platform Reference Architecture for EsgynDB Enterprise, built on Apache Trafodion (Incubating) implementation with licensed
More informationNovember 7, DAN WILSON Global Operations Architecture, Concur. OpenStack Summit Hong Kong JOE ARNOLD
November 7, 2013 DAN WILSON Global Operations Architecture, Concur dan.wilson@concur.com @tweetdanwilson OpenStack Summit Hong Kong JOE ARNOLD CEO, SwiftStack joe@swiftstack.com @joearnold Introduction
More informationBest Practices for MySQL Scalability. Peter Zaitsev, CEO, Percona Percona Technical Webinars May 1, 2013
Best Practices for MySQL Scalability Peter Zaitsev, CEO, Percona Percona Technical Webinars May 1, 2013 About the Presentation Look into what is MySQL Scalability Identify Areas which impact MySQL Scalability
More informationMySQL Database Scalability
MySQL Database Scalability Nextcloud Conference 2016 TU Berlin Oli Sennhauser Senior MySQL Consultant at FromDual GmbH oli.sennhauser@fromdual.com 1 / 14 About FromDual GmbH Support Consulting remote-dba
More informationNPTEL Course Jan K. Gopinath Indian Institute of Science
Storage Systems NPTEL Course Jan 2012 (Lecture 39) K. Gopinath Indian Institute of Science Google File System Non-Posix scalable distr file system for large distr dataintensive applications performance,
More informationHashKV: Enabling Efficient Updates in KV Storage via Hashing
HashKV: Enabling Efficient Updates in KV Storage via Hashing Helen H. W. Chan, Yongkun Li, Patrick P. C. Lee, Yinlong Xu The Chinese University of Hong Kong University of Science and Technology of China
More informationMySQL High Availability. Michael Messina Senior Managing Consultant, Rolta-AdvizeX /
MySQL High Availability Michael Messina Senior Managing Consultant, Rolta-AdvizeX mmessina@advizex.com / mike.messina@rolta.com Introduction Michael Messina Senior Managing Consultant Rolta-AdvizeX, Working
More informationRocksDB Key-Value Store Optimized For Flash
RocksDB Key-Value Store Optimized For Flash Siying Dong Software Engineer, Database Engineering Team @ Facebook April 20, 2016 Agenda 1 What is RocksDB? 2 RocksDB Design 3 Other Features What is RocksDB?
More informationCapacity Planning for Application Design
WHITE PAPER Capacity Planning for Application Design By Mifan Careem Director - Solutions Architecture, WSO2 1. Introduction The ability to determine or forecast the capacity of a system or set of components,
More informationOS-caused Long JVM Pauses - Deep Dive and Solutions
OS-caused Long JVM Pauses - Deep Dive and Solutions Zhenyun Zhuang LinkedIn Corp., Mountain View, California, USA https://www.linkedin.com/in/zhenyun Zhenyun@gmail.com 2016-4-21 Outline q Introduction
More informationAdaptation in distributed NoSQL data stores
Adaptation in distributed NoSQL data stores Kostas Magoutis Department of Computer Science and Engineering University of Ioannina, Greece Institute of Computer Science (ICS) Foundation for Research and
More informationCISC 7610 Lecture 2b The beginnings of NoSQL
CISC 7610 Lecture 2b The beginnings of NoSQL Topics: Big Data Google s infrastructure Hadoop: open google infrastructure Scaling through sharding CAP theorem Amazon s Dynamo 5 V s of big data Everyone
More informationvsan 6.6 Performance Improvements First Published On: Last Updated On:
vsan 6.6 Performance Improvements First Published On: 07-24-2017 Last Updated On: 07-28-2017 1 Table of Contents 1. Overview 1.1.Executive Summary 1.2.Introduction 2. vsan Testing Configuration and Conditions
More information