Kinetic Action: Micro and Macro Benchmark-based Performance Analysis of Kinetic Drives Against LevelDB-based Servers


Abstract

There is an unprecedented growth in the amount of unstructured data. Therefore, key-value object storage is becoming extremely important for its ease of storing and retrieving information. Seagate recently launched Kinetic hard drives, which are accessed directly over Ethernet and incorporate a LevelDB key-value store inside the drive. In this work, we employ micro-benchmarks as well as a macro-benchmark (the Yahoo Cloud Serving Benchmark) to evaluate these drives and to understand the performance limits, trade-offs, and implications of replacing traditional hard drives with Kinetic Drives in data centers and high-performance systems. We perform in-depth throughput and latency benchmarking of these Kinetic Drives (each acting as a tiny independent server) from a client machine connected to them via Ethernet. We compare these results to a SATA-based and a faster SAS-based traditional server running LevelDB. Our CPU-bound functionality-sample drives average a sequential write throughput of 63 MB/sec and a sequential read throughput of 78 MB/sec for 1 MB value sizes, and we are able to demonstrate unique Kinetic features including direct disk-to-disk data transfer. Our macro-benchmark (YCSB) shows that the full-fledged LevelDB servers outperform the Kinetic Drives for several workloads; however, this is not always the case. For larger value sizes, even these sample Kinetic Drives outperform a full server for several different workloads.

I. INTRODUCTION

The amount of digital data is growing at an extremely rapid pace, and it is estimated that its volume will grow at 40%-50% per year [1]. Most of this data explosion is due to unstructured data. According to predictions from International Data Corporation (IDC), 80% of global data (approximately 33 EB in 2017) will be unstructured [2]. Managing such an enormous amount of data is a challenging task. Most of the data in existing systems is stored and accessed using traditional file-based or block-based systems [3]. Unfortunately, these traditional systems are becoming inefficient as file-based access and hardware requirements limit their scalability. Therefore, there is a need for a data access method that is highly scalable and has the flexibility of scaling out horizontally [4].

Object storage overcomes limitations of file-based systems by offering scalability and being software defined [5]. The data is communicated as objects rather than files or blocks, and additional metadata can be stored alongside the object [3]. Object storage is flat structured and prevents higher-level applications from needing to manipulate data at the lowest level. Therefore, object storage has the flexibility to scale out horizontally. Furthermore, a unique identifier called a "key" is used to access the object or "value"; this form of storage is therefore called a key-value store (KV store). Several key-value stores are deployed to support large websites, such as Dynamo at Amazon [6], Redis at GitHub [7], and Memcached at Facebook [7]. All these systems store ordered (key, value) pairs. Even though key-value stores address the problem of scaling and managing huge amounts of data for the above systems, the existing key-value stores also have several limitations. They run on top of multiple layers of legacy software and hardware, such as POSIX, RAID controllers, etc., designed for file-based systems [8].
Also, these huge systems consume significant amounts of power and rack space [9]. To help overcome these hardware and software limitations, Seagate has announced a new class of hard drives called Kinetic Drives [10]. These drives have a built-in processor that runs a LevelDB-based key-value store directly on the drive [11]. Rather than the typical SATA or SAS interface, Kinetic Drives communicate externally via TCP/IP over Ethernet. Each drive acts as a tiny server in itself. An important function of the drives is direct P2P (peer-to-peer) transfer, which allows direct data transfer from one drive to another via Ethernet without the need to copy data through a storage controller or other server [12]. By replacing hardware and software layers, Kinetic Drives can reduce the cost and complexity of a large-scale object storage system.

In this work, we attempt to better understand the key functionalities, features, and performance of the drives. We use this information to compare Kinetic Drives with other LevelDB-based servers to derive insights about the possibility of replacing traditional hard drives with Kinetic Drives. Furthermore, the specification sheets do not provide a detailed analysis of the throughput [13], especially in comparison to other LevelDB-based servers. We also study the ease of programmability of the Kinetic Drives and share our experiences. To the best of our knowledge, there are no prior works that evaluate Kinetic Drives this thoroughly.

To understand several salient features, including P2P transfer, get, put, and others, we developed several tests. With these tests, we identify bottlenecks and limitations of the drives. As an example, the internal processing capability of these functionality-sample drives limits the I/O and thus the data transfer rate. Overall, these tests give us experience using this latest class of drives from Seagate and help us determine throughput bounds and performance characteristics in different scenarios. Beyond academic interest, we believe the information in this paper may be useful to system-level engineers at data centers and for High Performance Computing or Big Data and Analytics systems. Our contributions are the following:

- Conduct tests on Kinetic Drives to understand the throughput and performance bottlenecks, limits, trade-offs, and characteristics

- Make comparisons between Kinetic Drives and servers running the LevelDB key-value store on conventional hard drives
- Share the experience of using the Kinetic platform

The rest of the paper is organized as follows. Section II presents the motivation and architecture of the Kinetic Drives as well as a description of the software platform and API. Section III provides the test system configuration, while Section IV discusses our test methodology, results, and other findings for Kinetic Drives. Section V presents a comparison of the throughput of Kinetic Drives and LevelDB servers. Section VI discusses latency results for Kinetic Drives and the other servers. Findings for "keys not found" are presented in Section VII, and macro-benchmark results based on the Yahoo Cloud Serving Benchmark are presented in Section VIII. Section IX provides an overview of related work. Finally, Section X gives a brief conclusion and directions for future research.

II. KINETIC PLATFORM - MOTIVATION AND ARCHITECTURE

Kinetic Drives are the latest class of hard drives from Seagate and have a similar form factor to a conventional 3.5" drive. However, Kinetic Drives are fundamentally different from traditional drives in two ways: Ethernet connectivity and an internal processor that converts the storage drive to a key-value store [14]. Each drive has two Ethernet ports (1 Gbps each) to transfer data to the clients or other drives using TCP/IP over Ethernet. However, these ports are intended for fault tolerance, and only one should be used at a time. Newer (second-generation) models are expected to have two 2.5 Gbps ports; however, after inquiring, we found that these drives are not available to the general public, and hence we could not test the second-generation models. Furthermore, the drives do not use the SCSI interface and instead have a key-value interface. Applications interact with the drives directly using this interface and a standard Ethernet connection, with IPs assigned to the drives by a DHCP server [14]. The use of Ethernet allows the storage system to be expanded without the need to install a separate Storage Area Network. Figure 1 is a simplified comparison of the hardware and software stack of a traditional storage system and that of a Kinetic system.

The second major benefit of using these drives is that they encapsulate the concept of in-storage processing by running a LevelDB-based key-value store in the drives. There is a single-core 1.2 GHz System-on-Chip (SoC) processor inside the drives that runs LevelDB and manages the data on the disk itself. This can reduce the software I/O complexity for applications and simplify the storage rack by eliminating the need for separate storage controllers [9].

Seagate provides a software platform to help write programs for these drives. This Kinetic API comes with libraries for several popular programming languages, a simulator, installation instructions, some test utilities, and some basic test code [15]. The simulator allows test and custom programs to run without the actual drives (but we run all our experiments on actual drives). Applications or programs are created using the functions and protocols defined in the API library. The protocols are responsible for the actual connection and communication with the Kinetic Drives via messages (Figure 2). This Kinetic API is now maintained by the Linux Foundation. The drives have a built-in capability to support multiple threads and sessions.
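To make this interaction model concrete, the sketch below shows the general shape of a client program that writes and reads one key-value pair on a drive. It is only a sketch under stated assumptions: the wrapper names (kd_connect, kd_put, kd_get, kd_close), the port number, and the demo credentials are illustrative placeholders and do not reproduce the exact kinetic-c signatures.

```c
/*
 * Minimal sketch of a client talking to one Kinetic Drive.  The kd_* names
 * are illustrative placeholders, not the real kinetic-c API; they stand in
 * for the vendor library calls that open a session and issue put/get.
 */
#include <stdio.h>
#include <stdlib.h>

typedef struct kd_session kd_session;   /* opaque handle for one drive session */

kd_session *kd_connect(const char *drive_ip, int port,
                       int identity, const char *hmac_key);
int kd_put(kd_session *s, const void *key, size_t key_len,
           const void *value, size_t value_len);      /* 0 on success */
int kd_get(kd_session *s, const void *key, size_t key_len,
           void *value, size_t *value_len);
void kd_close(kd_session *s);

int main(void)
{
    /* The drive obtains its IP via DHCP; 8123 is assumed here as the Kinetic
     * service port, and identity 1 / "asdfasdf" as the usual demo credentials. */
    kd_session *drive = kd_connect("192.168.1.101", 8123, 1, "asdfasdf");
    if (!drive) return EXIT_FAILURE;

    char key[16] = "key-0000000001";       /* 16-byte key, as in our tests     */
    static char value[1024 * 1024];        /* up to the 1 MB value size limit  */
    size_t value_len = sizeof value;

    if (kd_put(drive, key, sizeof key, value, sizeof value) != 0)
        fprintf(stderr, "put failed\n");
    if (kd_get(drive, key, sizeof key, value, &value_len) != 0)
        fprintf(stderr, "get failed\n");

    kd_close(drive);
    return EXIT_SUCCESS;
}
```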
While the Kinetic API abstracts away most of the internal details of the drives, there are set limits on certain parameters. These limits are listed in Table I. Table II lists the main API commands used to interact with the key-value store on the drive. In addition, the API allows the programmer to perform write synchronization in several different modes, as described in Table III. These modes allow the programmer to force immediate synchronization or to let the drive control the write operations.

A particularly unique feature that comes with the drive is the P2P transfer: one drive can directly transfer data to another drive via Ethernet. The process initiates when a client issues a "P2P Put" or "P2P Get" command to a particular Kinetic Drive. This drive takes control of the process and automatically (without the client's involvement) performs "put/write" or "get/read" operations to/from the target drive using Ethernet. After this process is completed, both drives have the same key-value pairs. This is a unique feature of the Kinetic Drive that is not available in traditional drives. These instructions therefore eliminate the several software layers otherwise required to transfer the data, which reduces I/O time drastically.

We employed several different commands based on the API to evaluate the performance of the drives and to better understand their limitations and bottlenecks. These findings are presented in the upcoming sections.

Fig. 1. Software Stack Comparison between a Server with Traditional Drives and Kinetic Drives

TABLE I
SPECIFICATIONS FOR THE KINETIC DRIVES

Device Feature                      Limit
Max Key Size                        4 KB
Max Value Size                      1 MB
Max Version Size                    2048
Max Tag Size                        128
Max Connections
Max Outstanding Read Requests       3
Max Outstanding Write Requests      2
Max Message Size                    1 MB
Max Key Range Count                 200
Max Identity Count
Max Pin Size                        2
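The write-synchronization modes described in Table III can be thought of as a flag carried by each put request. The enum and function names below are our own placeholders rather than the vendor's identifiers; the sketch only illustrates how a program would choose between forcing durability per request and letting the drive batch writes.

```c
/*
 * The three write-synchronization modes of Table III, expressed as a flag
 * passed with each put.  kd_put_mode() and the enum constants are
 * illustrative placeholders, not real kinetic-c identifiers.
 */
#include <stddef.h>

typedef struct kd_session kd_session;

typedef enum {
    KD_WRITETHROUGH,  /* this request is persistent before the call returns      */
    KD_WRITEBACK,     /* the drive persists it when it chooses (or on a flush)   */
    KD_FLUSH          /* this request and all pending WriteBack requests persist */
} kd_sync_mode;

int kd_put_mode(kd_session *s, const void *key, size_t key_len,
                const void *value, size_t value_len, kd_sync_mode mode);

/* Force a single key to be durable immediately; a bulk loader would instead
 * queue its puts with KD_WRITEBACK and end with one KD_FLUSH put, the
 * "WriteBack-Flush" pattern evaluated in Section IV. */
static int put_durable(kd_session *s, const char key[16],
                       const void *value, size_t value_len)
{
    return kd_put_mode(s, key, 16, value, value_len, KD_WRITETHROUGH);
}
```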

TABLE II
COMMANDS AND FUNCTIONS OF THE KINETIC DRIVE

Command        Function
put            Writes the specified key-value (KV) pair to the persistent store
get            Reads the key-value (KV) pair
delete         Deletes the KV pair
getnext        Gets the next key that is after the specified key in the sequence
getprevious    Gets the previous entry associated with a specified key in the sequence
getkeyrange    Gets a list of keys in the sequence based on the given key range
getmetadata    Gets an entry's metadata for a specified key
secureerase    Securely erases all the user data, configurations, and setup information on the drive
instanterase   Erases all data in the database for the drive. This is faster than secureerase
setsecurity    Sets the access control list of the Kinetic Drive
P2P Get        The Kinetic Drive reads from another Kinetic Drive directly
P2P Put        The Kinetic Drive writes to another Kinetic Drive directly

TABLE III
DIFFERENT WRITE MODES OF KINETIC DRIVES

Kinetic Write Mode   Description
WriteThrough         The request is made persistent before returning. This does not affect any other pending operations
WriteBack            The requests are made persistent when the drive chooses, or when a FLUSH is sent to the drive
Flush                All the write requests that are pending and are not written yet will be made persistent to the disk

Fig. 2. Message Protocol for Kinetic Drive

III. CONFIGURATION

In this section, we cover the three system configurations we employ to test and evaluate the Kinetic Drives and compare them with LevelDB-based servers. To manage and operate the Kinetic Drives, we must also understand the Kinetic API, which is explained below. To use the commands described in Table II, we have to leverage the Kinetic API. This API is installed on the client and is used to send commands to different Kinetic Drives, which act as tiny independent servers, as shown in Figure 3. The Kinetic API is designed for several programming platforms including C, C++, Java, and Python; our testing uses the C library of the API. Additionally, the write operation can be performed in several different modes (Table III). The programmer uses these modes to control when write operations are made persistent or to allow the drive to manage the write operations.

The client machine is a Dell R420 running Ubuntu Server 14.04 LTS 64-bit (kernel 3.13) on two quad-core Intel Xeon E5 CPUs with 2 GB of RAM. This machine uses a 10 Gbps Ethernet PCIe network card to connect to a Kinetic disk enclosure containing four Kinetic sample drives. Internally, this enclosure contains redundant Ethernet switches with 1 Gbps ports for the Kinetic Drives and a 10 Gbps uplink port. The Kinetic enclosure also has a separate Ethernet connection that provides access to a web interface for basic management of each drive (power cycling the drive, seeing its IP address, etc.). The drives are model ST4000NK0001, and we upgraded the firmware to version 3.0.4 using one of the included utility programs. These disks are 4 TB, 5900 RPM, and have 64 MB of disk cache plus another 512 MB of onboard RAM to run the modified version of Linux and LevelDB on the added Marvell 370 SoC [13]. Additional details are in the previous section.

To make comparisons between the Kinetic Drives and LevelDB-based servers, we have set up two different systems: Server-SATA and Server-SAS. Server-SATA has the same specifications as the Kinetic client system described above, plus LevelDB installed and two additional traditional SATA drives, separate from the OS drive, connected directly to the motherboard via SATA.
These two extra SATA drives, Seagate model ST4000VN000, were chosen because they have specifications similar to those of the Kinetic Drives, i.e., 4 TB, 5900 RPM, and 64 MB of disk cache. Server-SAS also has LevelDB installed but is a more powerful and modern system. This machine runs Ubuntu Server 14.04 LTS 64-bit (kernel 4.x) on two six-core Intel Xeon E5-2620 v3 2.4 GHz CPUs and 64 GB of RAM. A PCIe LSI MegaRAID SAS-3 3108 adapter is connected to an external Dell MD1400 disk enclosure containing Seagate model ST6000NM0034 Enterprise Capacity SAS drives. Each drive is 6 TB, 7200 RPM, and has 128 MB of disk cache.

For both Server-SATA and Server-SAS, LevelDB is installed and db_bench.cc (or a modified version) is run on the various drive configurations from the same machine, i.e., the client benchmark is run against a local server. For tests using two or more drives, Linux software RAID-0 is configured with the default mdadm settings offered by the GNOME Disk Utility. The target location is then formatted with default ext4 settings and mounted for use.

IV. METHODOLOGY AND RESULTS FOR KINETIC DRIVES

This section details a variety of our tests, focusing on the Kinetic Drives themselves. Performance results comparing Kinetic Drives with LevelDB-based servers are presented in later sections. Our aim is to first inform readers about the raw performance of the Kinetic Drives, which will help system engineers in designing their systems.

The primary measurements we made were sequential read and write, along with random read and write, for the Kinetic Drives. We also present the throughputs for peer-to-peer transfer, which is a unique feature of the Kinetic Drives.

Fig. 3. System Configuration with Kinetic Drives as Independent Servers

A. Comparison of Writethrough vs. Flush Mode

There are two different modes for forcing synchronous write operations, Writethrough and Flush, as mentioned in Table III. Of the two, Writethrough operations tend to be faster than pure Flush operations. Flush mode must make sure all previously written commands in the asynchronous WriteBack mode are made permanent, whereas requests made in Writethrough mode are made permanent without checking the status of any other command. Results of a test to illustrate this difference are shown in Table IV. In this test there are two variations, one where all writes are issued in Writethrough mode and one where all writes are issued in Flush mode. In both cases, each write (or put) is made persistent before returning a completion signal to the application. However, for Flush mode, the drive must also ensure that there are no pending write requests waiting to be made persistent. For both tested numbers of key-value pairs, Writethrough gives higher throughput than repeated Flush-mode writes. This illustrates that Writethrough should be used for synchronous writes and that it is bad programming practice to use Flush mode for repeated writes. Furthermore, it can be concluded that the throughput depends on the write mode, the value size, and the number of key-value pairs. Therefore, we obtain different throughputs in Table IV and in Figures 4 and 5.

TABLE IV
COMPARISON OF THROUGHPUTS FOR DIFFERENT WRITE MODES

Value Size   Key-Value Pairs   Flush           WriteThrough
1 MB                           5.4 (MB/sec)    8.99 (MB/sec)
1 MB                           4.8 (MB/sec)    8.9 (MB/sec)

B. Comparison of Throughputs for Write and Read Operations

1) Sequential Order Writes: This testing consists of two sets of experiments. For Sequential Write, write requests are issued by the client in increasing key ID order, i.e., from key number 1 to key number n. The write operation issues the first n-1 commands in WriteBack mode, and then the nth command is written in Flush mode to make all the commands persistent. This series of repeated asynchronous WriteBack writes followed by a single Flush command is the correct way to use Flush, and we call this WriteBack-Flush mode. After Sequential Write, the data is read back from key number 1 to key number n sequentially for Sequential Read. For Random Read, after performing Sequential Write the data is read back in random key ID order.

Figure 4 shows the throughput obtained from the Kinetic Drives for Random Read and Sequential Read for up to 1 TB of sequentially written data. Since the write process is the same for both experiments, it is shown by only one plot in Figure 4. For these experiments we vary the size of the value from 2 bytes to 1 MB but keep the key size constant at 16 bytes. We keep the number of key-value pairs constant at 1,000,000. Thus, the x-axis can also be interpreted as value size times one million. Random Order Read throughput is lower than Sequential Order Read throughput because the head has to be repositioned on the platter a lot more often for random order reads than for sequential order reads.

2) Random Order Writes: This testing also consists of two sets of experiments. This time, the client issues write requests in random key ID order for Random Write.
After the data is successfully written in the WriteBack-Flush mode (as explained in Section IV-B), the key-value pairs in one experiment are read back sequentially. In the other set of experiments, the data is randomly written and then read back randomly. The value size and other settings for the Random Order Write tests are similar to those of the Sequential Order Write tests. The throughput comparisons are shown in Figure 5. Again, the write is only plotted once because the results are the same for both sets of experiments.

Fig. 4. Sequential Order Write then Random Order Read or Sequential Order Read for 1,000,000 Key-Value Pairs with Value Sizes up to 1 MB and a 16-Byte Key Size

Fig. 5. Throughputs of Random Order Write then Random Order Read or Sequential Order Read for 1,000,000 Key-Value Pairs with Value Sizes up to 1 MB and a 16-Byte Key Size
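The write experiments above can be summarized by a single benchmark loop. The sketch below reuses the placeholder kd_put_mode wrapper and kd_sync_mode enum introduced in Section II (an assumption, not the real client API): it issues the first n-1 puts in WriteBack mode and the nth in Flush mode, reports the average throughput, and shuffling the key-ID array turns the sequential test into the random order test.

```c
/*
 * Sketch of the Section IV-B write benchmark: n puts with 16-byte keys and a
 * fixed value size, issued in WriteBack-Flush mode, in either sequential or
 * shuffled key-ID order.  kd_put_mode()/kd_sync_mode are placeholder names.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

typedef struct kd_session kd_session;
typedef enum { KD_WRITETHROUGH, KD_WRITEBACK, KD_FLUSH } kd_sync_mode;
int kd_put_mode(kd_session *s, const void *key, size_t key_len,
                const void *value, size_t value_len, kd_sync_mode mode);

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Returns average throughput in MB/sec, or -1.0 on error. */
double write_benchmark(kd_session *s, const long *key_ids, long n,
                       const char *value, size_t value_len)
{
    char key[16];
    double start = now_sec();

    for (long i = 0; i < n; i++) {
        /* First n-1 puts are asynchronous; the last one flushes everything. */
        kd_sync_mode mode = (i == n - 1) ? KD_FLUSH : KD_WRITEBACK;
        snprintf(key, sizeof key, "k%014ld", key_ids[i]);
        if (kd_put_mode(s, key, sizeof key, value, value_len, mode) != 0)
            return -1.0;
    }
    double elapsed = now_sec() - start;
    return ((double)n * (double)value_len) / (1024.0 * 1024.0) / elapsed;
}

/* Fisher-Yates shuffle: converts the sequential key-ID array into the
 * random key order used for the Random Write experiments. */
void shuffle_ids(long *ids, long n)
{
    for (long i = n - 1; i > 0; i--) {
        long j = rand() % (i + 1);
        long tmp = ids[i]; ids[i] = ids[j]; ids[j] = tmp;
    }
}
```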

We observe that the throughputs of Random Order Writes and Sequential Order Writes are similar. We believe this trend happens because the drive does not consider the key IDs of the KV pairs; instead, it receives a write request and simply writes it to the disk. Moreover, LevelDB (installed in the drives) is supposed to sort the key-value pairs based on the IDs. We expected that the throughput of random writes would be lower than the sequential write throughput. We can only conclude that the LevelDB inside the drives has been modified by Seagate. We are investigating this phenomenon by conducting similar tests on a server with LevelDB installed.

C. Maximum Throughput for Different Value Sizes

In this test we try to find the maximum throughput that can be obtained for different value sizes. Each data point on the x-axis in Figure 6 is a different experiment. For each experiment, we vary the number of key-value pairs and perform the sequential order write operation, as explained in Section IV-B. These experiments give us the maximum write throughputs that can be extracted from the Kinetic Drives, while the previous results show an average. The number of key-value pairs required to achieve the maximum throughput was also different for each experiment. Maximum write throughput increases with value size, as shown in Figure 6. This trend is likely because the CPU-bound Kinetic Drive wastes fewer resources on overhead with larger value sizes.

D. Impact of Key Size on Throughput Performance

Figure 7 depicts the impact of different key sizes on throughput. The comparison is made when data is written sequentially (as explained in Section IV-B) in WriteBack-Flush mode with two different key sizes, 4 KB and 16 bytes. In this experiment, we vary the value size from 2 bytes to 1 MB (the max value size), and 1,000,000 key-value pairs are sequentially written. To write more than 1 TB of data we have to increase the number of key-value pairs to 2,000,000 or more with 1 MB as the value size. Therefore, for 3.6 TB of data written, the client sends 3,600,000 key-value pairs of value size 1 MB each. The better throughput for smaller keys is partly because the total amount of data transferred from the client to the Kinetic Drive, key size plus value size, is smaller. When we write a million key-value pairs with 4 KB keys, we transmit 1 TB of value data and 4 GB of data for their keys. With 16-byte keys, there is hardly any key data sent. This performance difference is also likely due to the already CPU-bound Kinetic Drive struggling to process the larger keys.

E. Multi-threaded Throughput

This section presents the throughput obtained when two or more threads perform the write operation. The API allows two or more threads to perform sequential writes. The API also allows for two simultaneous sessions. The write operation for each thread is similar to the write operation described in Section IV-D. In Figure 8, it can be observed that the cumulative throughputs for multiple threads are similar to that of a single thread. This is again likely because the internal Kinetic CPU is the limiting factor, and it cannot process multiple threads any faster than a single thread. The drives also have two Ethernet ports, but when we attempt to utilize both of these ports simultaneously for read or write operations, the test programs crash. We conclude that these ports are meant for fault tolerance rather than for extracting parallel performance from the drives.

F. Throughput for P2P Transfer

This section presents the results for a unique feature called P2P transfer.
The client issues a P2P Get or P2P Put command to a Kinetic Drive (KD-Init) along with the IP address of the target Kinetic Drive (KD-Target). KD-Init establishes a connection over Ethernet to KD-Target and initiates put/write or get/read operations based on the command sent from the client. After this operation is successfully completed, both drives have copies of the transferred key-value pairs.

In Table V, we show the throughput and internal Kinetic CPU load while using this feature. First, we write 512 key-value pairs to KD-Target. Thereafter, we perform a P2P Get operation in which KD-Init copies the 512 key-value pairs, each with a 1 MB value size, from KD-Target. After the operation is successfully completed, both drives have these same key-value pairs. This data copying process takes place without interference from the client: the client only issues the initial instruction to begin the P2P transfer.

Fig. 6. Maximum Throughput for Different Value Sizes

Fig. 7. Total Data Transfer with Different Key Sizes

TABLE V
THROUGHPUT FOR P2P TRANSFER

Value Size   No. of KV Pairs   Read Throughput (MB/sec)   Internal CPU
1 KB                           .43                        ~100%
1 MB                                                      ~100%
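The measurement behind Table V reduces to a single command from the client. The sketch below again uses a hypothetical wrapper (kd_p2p_get) and a simplified key-range argument; the real command carries the keys (or key range) to be copied and more options, so this only illustrates that the client's role ends once the instruction is issued.

```c
/*
 * Sketch of the drive-to-drive copy measured in Table V.  kd_p2p_get() is a
 * placeholder for the vendor's P2P command, not the real kinetic-c call.
 */
#include <stddef.h>

typedef struct kd_session kd_session;

/* Ask the connected drive (KD-Init) to copy `count` keys starting at
 * `first_id` from the drive at target_ip, directly over Ethernet and
 * without routing the data through the client. */
int kd_p2p_get(kd_session *kd_init, const char *target_ip, int target_port,
               long first_id, long count);

int copy_pairs_for_table_v(kd_session *kd_init)
{
    /* 512 key-value pairs of 1 MB each (about 0.5 GB); larger transfers
     * crashed on our functionality-sample firmware. */
    return kd_p2p_get(kd_init, "192.168.1.102", 8123, 0, 512);
}
```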

Fig. 8. Aggregate Throughputs for Multithreaded Performance

Fig. 9. Throughput Comparison between the Write Throughput of Kinetic Drives and the P2P Operation for 25, KV Pairs

Unfortunately, when we attempt to transfer more than 0.5 GB, or more than 512 key-value (KV) pairs with a value size of 1 MB, the experiments crash. We are working with Seagate to fix this problem, and it is most likely a software bug. As with many of our other experiments, we see in Table V that the internal processor's utilization is almost 100% while the P2P transfer is taking place. After observing the throughputs for different-size key-value pairs, we conclude that the throughput obtained for this process depends on the value size of each key-value (KV) pair and that the utilization of the internal processor remains high with small or large value sizes. Therefore, we believe that the Kinetic CPU acts as a bottleneck and prevents throughputs from increasing beyond a certain range. Seagate is working on upgrading this processor for future Kinetic models.

G. Get Previous Performance of Kinetic Drives

Figure 10 compares the read performance of reading key-value pairs in reverse order. Here, we test the "previous" command feature that comes with the Kinetic API. First, we perform sequential writes as described in Section IV-D and, after completion, we use the previous command of the API to read the key-value pairs in reverse order, i.e., from key number n to 1. As an illustration, if we give key n to the Kinetic Drive, then by using this feature the Kinetic Drive returns to us the key-value pair with key n-1. We compare this test with a custom-created program that issues read commands from the nth key-value pair back to the first key-value pair. It can be observed that the performance is comparable. Therefore, this feature only brings convenience to the users. It can also be concluded from the experimental data that the reverse command and the commands issued by the client are handled similarly by the Kinetic Drive.

Fig. 10. Comparison of Throughputs for the Reverse Command and a Custom Backwards Read; KD = Kinetic Drive, cmd = command/instruction, KD_reverse = program created to test and compare

V. THROUGHPUT COMPARISONS OF KINETIC DRIVES AND LEVELDB SERVERS

This section compares the throughput of Kinetic Drives against traditional servers running LevelDB (Server-SATA and Server-SAS). We believe it is important for readers to understand the possible implications of replacing traditional hard drives with Kinetic Drives for various workloads and disk configurations. As described in Section III, we have two different LevelDB-based server systems: Server-SATA has 2 GB of RAM and two extra 5900 RPM SATA disks with properties similar to the Kinetic Drives, and the faster Server-SAS has 64 GB of RAM with several 7200 RPM enterprise drives.

1) Sequential Order Write and Sequential Order Read for Single and Multiple Drives: In this set of tests, we compare the sequential key order throughput of the three systems (Kinetic Drives, Server-SATA, and Server-SAS) for reads and writes with various value sizes. As in the previous figures, each point on the x-axis represents a different experiment.
For example, 2_B in Figure 11 represents 1,000,000 KV pairs, each with a value size of 2 bytes, first written in sequential key order for one experiment and then read back in sequential key order for another experiment. The total data transfer of each run is 2 MB. Similarly, 256_B represents another 1,000,000 KV pairs written and then read back, each with a value size of 256 bytes (the total data transfer is 256 MB per run). It is also important to note that the y-axis is in log scale, because some of the server read experiments with smaller value sizes fit mostly in the server RAM and avoid many of the slower disk accesses. For the read tests, the LevelDB benchmark, db_bench.cc, first loads some data to read but does not purge any caches before measuring the reads.

The most interesting observation is that the Kinetic Drive can sustain higher throughput than a LevelDB server for large enough value sizes. For small value sizes, however, the servers with their large RAM size can easily outperform a Kinetic Drive.

Overall, even ignoring the small experiments that mostly fit in server RAM, we can also conclude that read operations achieve higher throughput than writes for all three systems. During reads, the drives simply have to deliver the data that has already been sorted by the systems. When there are many writes, LevelDB has to do compaction and level management, which introduces write amplification.

We extend these experiments by adding drives, striped in software RAID-0, to Server-SATA and Server-SAS, as shown in Figures 12, 13, and 14. Only Server-SAS has enough disks for the four- and eight-drive tests. Having conducted simultaneous parallel experiments on all four Kinetic disks to verify that there is no individual performance degradation, we extrapolate the plots for the multiple Kinetic Drive tests by assuming the load is balanced evenly across the independent disks and that the performance will scale linearly until the Kinetic enclosure's 10 Gbps network link becomes a bottleneck. While the LevelDB servers do show throughput increases as we add more disks, their performance does not scale up as much as expected. For example, sequential order writes on Server-SAS stop improving beyond four drives. Because Server-SAS has faster and far more RAM, it shows significantly higher read throughput than the Kinetic Drives or Server-SATA for small value sizes. However, Server-SATA and Server-SAS both exhibit a "memory cliff" trend in read throughputs: after the read throughputs increase continuously, they suddenly drop beyond a certain value size. As the amount of data transferred in each experiment increases with value size, the system memory can no longer hide the slower disk accesses and throughput drops. At these points, the Kinetic Drives usually take over and start giving better throughputs than either of the servers.

Fig. 11. Sequential (Seq) Read (RD) and Write (WR) Throughput Comparison of a Kinetic Drive (KD), Server-SATA, and Server-SAS Using One Drive (1D)

2) Random Order Write and Random Order Read for Single and Multiple Drives: In this section, we compare throughputs when data is written and read back in a random order. Again, we use db_bench.cc, which comes with LevelDB, to perform random reads and writes on a single drive as well as multiple drives. As in the previous subsection, the y-axis is in a log scale so the server RAM results do not overwhelm the graph, and the multiple-drive Kinetic results are a linear extrapolation of the single-drive results in Section IV.

Fig. 12. Sequential (Seq) Read (RD) and Write (WR) Throughput Comparison of Kinetic Drives (KD), Server-SATA, and Server-SAS Using Two Drives (2D)

Fig. 13. Sequential (Seq) Read (RD) and Write (WR) Throughput Comparison of Kinetic Drives (KD) and Server-SAS Using Four Drives (4D)

Fig. 14. Sequential (Seq) Read (RD) and Write (WR) Throughput Comparison of Kinetic Drives (KD) and Server-SAS Using Eight Drives (8D)
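For reference, the extrapolation rule used for the multi-drive Kinetic curves in Figures 12-14 (and 16-18 below) is simply linear scaling of the measured single-drive throughput, capped by the shared enclosure uplink; the 1250 MB/sec figure below is our rough payload estimate for a 10 Gbps link and is an assumption rather than a measured value.

```c
/* Extrapolation assumed for the multi-drive Kinetic plots: n independent
 * drives scale linearly until the enclosure's shared uplink saturates. */
double kinetic_aggregate_mbps(double per_drive_mbps, int n_drives,
                              double uplink_mbps)
{
    double linear = per_drive_mbps * (double)n_drives;
    return (linear < uplink_mbps) ? linear : uplink_mbps;
}

/* Example: 8 drives at 63 MB/sec each give 504 MB/sec, still below a
 * 10 Gbps uplink (~1250 MB/sec), so the extrapolation remains linear. */
```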
Looking at Figures 15, 16, 17, and 18, we notice that the overall throughputs for these random server tests are lower than for sequential order reads and writes. This reduced random performance is due to the underlying spinning-platter hard drive technology being unable to prefetch or batch-write nearby data as it can when it reads or writes sequentially. The servers show particularly poor random write performance. In fact, as the value size increases, the random write throughput decreases to below 1 MB/s. The larger value sizes were so slow that we were forced to abandon those tests. In addition, we realize that the system's memory size is quite critical for the server's performance. As before, however, when the requested data increases beyond what is cached in the server's RAM, the high throughputs due to memory are not maintained and the throughputs for read operations drop.

In this section, we have shown that the Kinetic Drives scale well for datasets with larger value sizes that do not fit in the server's memory. In the next section, we will discuss latency measurements from the different systems.

Fig. 15. Random (RND) Read (RD) and Write (WR) Throughput Comparison of a Kinetic Drive (KD), Server-SATA, and Server-SAS Using One Drive (1D)

Fig. 16. Random (RND) Read (RD) and Write (WR) Throughput Comparison of Kinetic Drives (KD), Server-SATA, and Server-SAS Using Two Drives (2D)

Fig. 17. Random (RND) Read (RD) and Write (WR) Throughput Comparison of Kinetic Drives (KD) and Server-SAS Using Four Drives (4D)

Fig. 18. Random (RND) Read (RD) and Write (WR) Throughput Comparison of Kinetic Drives (KD) and Server-SAS Using Eight Drives (8D)

VI. LATENCY TESTS OF KINETIC DRIVES AND LEVELDB SERVERS

Another important characteristic of the drives is latency. In this section, we ascertain the latency of Kinetic Drives in various scenarios and compare it with that of the LevelDB servers. The configuration of the Kinetic Drives and LevelDB servers remains the same as explained in the earlier sections.

A. Effect of Randomly Written Data on Random Read Latency

Here we describe the latency results obtained on the Kinetic Drives, Server-SATA, and Server-SAS. Three different sets of tests were conducted:

- Effect of 1 GB and 3.7 TB of random order and sequential order writes on random read latency for the Kinetic Drives
- Effect of 1 GB of random order and sequential order writes on random read latency for Server-SATA
- Effect of 1 GB of random order and sequential order writes on random read latency for Server-SAS

1) Read Latency of Kinetic Drives Following Random and Sequential Order Writes: The first set of experiments studies the impact of writing 1 GB or 3.7 TB in random order or sequential order on the read latency of a subsequent random set of 1,000 KV pairs, and Figure 19 presents a histogram of our findings. To obtain the experimental data, we first perform the sequential key order write operation as described in Section IV-B and then read 1 GB of that data by randomly requesting 1,000 key-value pairs (value size of 1 MB and key size of 16 bytes). We record the time it takes to successfully complete each read operation and put these latencies into different histogram bins to determine a count or frequency of accesses in that latency range. In a subsequent experiment, we write the 1 GB of data in random order (as described in Section IV-B2) and then randomly read 1,000 key-value pairs. We place these latencies in bins as well. Before comparing these frequencies with other results, we obtain relative frequencies by dividing the number in each bin by the number of key-value pairs (1,000 in this case), as shown in Figure 19.
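A minimal sketch of the binning just described, assuming 10 ms-wide, right-closed bins up to 100 ms plus an overflow bin; counts are divided by the number of requests (1,000 randomly read KV pairs here) to give the relative frequencies plotted in Figures 19 and 20.

```c
/* Bin read latencies (in ms) into [0,10], (10,20], ..., (90,100], >100 and
 * convert counts to relative frequencies by dividing by the request count. */
#include <math.h>

#define NBINS 11

void bin_latencies(const double *lat_ms, int n, double rel_freq[NBINS])
{
    int counts[NBINS] = { 0 };

    for (int i = 0; i < n; i++) {
        int b;
        if (lat_ms[i] > 100.0) {
            b = NBINS - 1;                        /* overflow bin: > 100 ms */
        } else {
            b = (int)ceil(lat_ms[i] / 10.0) - 1;  /* right-closed bins      */
            if (b < 0) b = 0;                     /* a latency of exactly 0 */
        }
        counts[b]++;
    }
    for (int b = 0; b < NBINS; b++)
        rel_freq[b] = (double)counts[b] / (double)n;
}
```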
In addition, we test the effect of the amount of data written on the latencies by repeating the same experiment with 3.7 TB of data written in random order as well as in sequential order. Referring again to Figure 19, we can observe that the read latencies are higher when the Kinetic Drive is full.

Reads following a drive filled with random order writes show yet higher latency. Another interesting observation is that around 4-5% of the reads have a latency above 100 ms. We can speculate that this is due to some poorly timed LevelDB compaction, but we are not certain why this happens. To rule out network latency, we tested the network ping time from the client to the Kinetic Drives and found that it is usually around 0.3 ms. The ping is never above 1 ms, even when the Kinetic Drive is at 100% CPU load. The latency also depends on the distribution of the searched key-value pairs, so it is possible that a few of the randomly chosen keys are requested in an order that is particularly difficult to process. Overall, we can conclude that the amount of data written and the amount of randomness in that data affect the latencies of subsequent read operations on these Kinetic Drives.

From our results, we observe that a substantial number of read latencies for Server-SAS fall in the (10,20] bin, which implies that Server-SAS tends to be faster for read operations than Server-SATA. These findings reflect the larger memory size of Server-SAS. Therefore, increasing the memory size of a server improves the read performance of the system.

Fig. 19. 1 GB Random Read after Varied Writes on Kinetic Drives

2) Comparison of Latency Tests between Kinetic Drives, Server-SATA, and Server-SAS: In our study of Kinetic Drives, we consider it important to compare the performance of Kinetic Drives with the servers with LevelDB installed. Figure 20 presents our findings for running the LevelDB benchmarking script on Server-SATA and Server-SAS. Overall, we performed four sets of operations on the two servers, and the order of the writes and reads is defined by the script itself:

- 1 GB Random Write followed by 1 GB Random Read on Server-SATA
- 1 GB Sequential Write followed by 1 GB Random Read on Server-SATA
- 1 GB Random Write followed by 1 GB Random Read on Server-SAS
- 1 GB Sequential Write followed by 1 GB Random Read on Server-SAS

For the first experiment, we wrote 1 GB of data in random order (key size of 16 bytes and 1 MB value size) on Server-SATA. This data is written on a single drive of Server-SATA. Afterwards, we read 1,000 KV pairs in random order to record the read latency of each KV pair, and we bin these latencies after computing their relative frequencies (similar to what is explained in Section VI-A). Furthermore, we perform the four above-mentioned experiments and compare them in Figure 20.

Fig. 20. 1 GB Random Read after Varied Writes on Server-SATA and Server-SAS

Moreover, we wanted to better understand the results by studying the trends in the read latencies. Therefore, we obtain the mean of the latencies and use it to find 99% confidence intervals, shown in Figures 21, 22, 23, and 24. We calculated the mean and confidence interval based on the book by Lilja [16]. For calculating the mean, we sampled groups of data points representing the latencies of reading KV pairs, and we employed this mean to calculate the 99% confidence interval. We plotted data points from 1,000 KV pairs. We obtained similar findings for Server-SAS and Server-SATA. The plots show a transition in the latencies in all the graphs; overall, this transition depends on the randomness of the data.

Fig. 21. 99% Confidence Interval for Reading 1,000 KV Pairs in Random Order After Writing 1 GB in Random Order with 1 MB Value Size and 16-Byte Key Size on Server-SATA

Based on our findings, it can be concluded that there are significant differences between the Kinetic Drive's LevelDB and the LevelDB-based servers. Overall, the latencies for Server-SATA and Server-SAS are much higher than the latencies for the Kinetic Drives.
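The 99% confidence intervals in Figures 21-24 follow the standard large-sample formula from Lilja's text, mean +/- z * s / sqrt(n) with z = 2.576 at the 99% level; a minimal sketch of that computation for one group of sampled latencies is shown below.

```c
/* Sample mean and 99% confidence interval for one group of read latencies,
 * using the large-sample normal approximation: mean +/- z * s / sqrt(n),
 * with z = 2.576 for a 99% level.  Requires n >= 2. */
#include <math.h>

void mean_ci99(const double *x, int n, double *mean, double *lo, double *hi)
{
    double sum = 0.0, sumsq = 0.0;

    for (int i = 0; i < n; i++) {
        sum   += x[i];
        sumsq += x[i] * x[i];
    }
    *mean = sum / n;

    double var  = (sumsq - n * (*mean) * (*mean)) / (n - 1);  /* sample variance */
    double half = 2.576 * sqrt(var / n);                      /* z at 0.995       */

    *lo = *mean - half;
    *hi = *mean + half;
}
```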
We plotted data points from, KV pairs. We did similar findings for Server-SAS and Server-SATA. The pattern demonstrates us a transition in the latencies of all the plotted graphs. Overall, these transition depends on the randomness of the data. Mean of Latency for Read Operation KV Pair Number * ^3 mean Confidence Interval Fig % Confidence Interval for Reading, KV Pairs in Random Order After Writing GB in Random Order with MB and Key Size 6 Bytes on Server-SATA Based on our findings it could be concluded that there are significant differences between the Kinetic Drive s LevelDB and the LevelDB based servers. Overall, the latencies for the Server-SATA and Server-SAS are much higher than the latencies for the Kinetic Drives

Fig. 22. 99% Confidence Interval for Reading 1,000 KV Pairs in Random Order After Writing 1 GB in Sequential Order with 1 MB Value Size and 16-Byte Key Size on Server-SATA

Fig. 23. 99% Confidence Interval for Reading 1,000 KV Pairs in Random Order After Writing 1 GB in Random Order with 1 MB Value Size and 16-Byte Key Size on Server-SAS

Fig. 24. 99% Confidence Interval for Reading 1,000 KV Pairs in Random Order After Writing 1 GB in Sequential Order with 1 MB Value Size and 16-Byte Key Size on Server-SAS

3) Latency Tests for Server-SAS with Multiple Drives: In the previous section, we discussed latency tests that were conducted on single drives. In this section, we discuss the latency findings for the LevelDB-based servers with multiple drives. To set up the system for multiple drives we used RAID-0. For these experiments as well, we used the LevelDB-based benchmarking script and wrote 1 GB in random order. The key size was 16 bytes and the value size was 1 MB. Afterwards, 1,000 KV pairs are read back, and the latency is plotted for all the key-value pairs. Figure 25 demonstrates the obtained latencies for 2 drives, 4 drives, and 8 drives on Server-SAS, respectively. We noticed that with multiple drives the overall latencies for read operations decrease.

Fig. 25. 1 GB Random Read after Varied Writes on Server-SAS with Multiple Drives

VII. KEYS NOT FOUND - TRENDS FOR KINETIC DRIVES AND LEVELDB-BASED SERVERS

In this section, we study the time it takes when the searched keys are "not found" on a Kinetic Drive or on a server. This aspect is important when a large number of drives are used to create a system. There is a strong possibility that users search for key-value pairs and do not find them on a particular Kinetic Drive; therefore, the users will have to search other Kinetic Drives. The time it takes to perform this search for key-value pairs on a particular server or Kinetic Drive can affect the entire system's performance.

A. Time for Search Operation When Keys Are Not Found in a Kinetic Drive

1) With 1 GB Data Written: For this test, we performed the write operation in sequential order, as explained in Section IV-B, and wrote 1 GB of data in the form of 1,000 KV pairs, each with a value size of 1 MB. Following that, we requested 1,000 key-value pairs from a Kinetic Drive in sequential order. We ensured that the requested keys do not exist on the Kinetic Drive. We record the time it takes, for each request, for the Kinetic Drive to return the status that the requested key is not found on the drive. We plot this time against the total number of requests made in Figure 26.

To find out whether the drives treat random order search differently from sequential order search, we performed the above-mentioned experiment again. However, this time we requested the 1,000 KV pairs in random order after writing the 1 GB of data in sequential order. We plot the time it takes to serve each request, when keys are not found, against the KV pairs requested, as shown in Figure 27. Based on our experience from the previous latency-based experiments, we attempted to introduce more randomness into the process.
We wrote 1 GB of data in random order and tried to search for keys in random order as well. Our observations are given in Figure 28.
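The "keys not found" measurements in Figures 26-31 come down to timing negative lookups. The sketch below reuses the placeholder kd_get prototype from Section II (an assumption, not the real client call) and simply requests key IDs that lie outside the written range, recording the wall-clock time of each miss.

```c
/*
 * Sketch of the "keys not found" timing loop: request key IDs outside the
 * written range so every get misses, and record each request's latency.
 * kd_get() is the placeholder prototype used earlier; the real library
 * signals "not found" through its own status codes.
 */
#include <stdio.h>
#include <time.h>

typedef struct kd_session kd_session;
int kd_get(kd_session *s, const void *key, size_t key_len,
           void *value, size_t *value_len);

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Keys 0..n_written-1 exist on the drive; ask for IDs at and above
 * n_written so every lookup misses, storing per-request latency (seconds). */
void time_missing_keys(kd_session *s, long n_written, long n_requests,
                       double *latency_sec)
{
    static char value[1024 * 1024];
    char key[16];

    for (long i = 0; i < n_requests; i++) {
        size_t value_len = sizeof value;
        snprintf(key, sizeof key, "k%014ld", n_written + i);

        double t0 = now_sec();
        (void)kd_get(s, key, sizeof key, value, &value_len);  /* expect a miss */
        latency_sec[i] = now_sec() - t0;
    }
}
```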

Fig. 26. Keys Not Found - 1k KV Pairs Searched in Sequential Order with 1 GB of Data Written in Sequential Order

Fig. 27. Keys Not Found - 1k KV Pairs Searched in Random Order with 1 GB of Data Written in Sequential Order

Fig. 28. Keys Not Found - 1k KV Pairs Searched in Random Order with 1 GB of Data Written in Random Order

We found a very interesting pattern in Figures 26, 27, and 28: all the figures depict a very similar and regular pattern. This data led us to conclude that the Kinetic Drives are searching an index and returning the result quickly. This entire index should fit inside the memory of the Kinetic Drives to yield such similar patterns.

2) With 3.7 TB Data Written: In this section we discuss experiments that are very similar to those in the previous subsection. However, for these experiments, we completely filled the drive with 3.7 TB of data. This was done to observe the impact of filling the drive completely. For the first experiment, we wrote 3.7 TB of data in sequential order in the form of 3,700,000 KV pairs with keys of 16 bytes each and a value size of 1 MB. All the generated keys were unique, and "kinetic-c-util" generated the 1 MB values. Figure 29 presents our findings for the time it takes to search 1,000 keys and report them as not found.

Fig. 29. Keys Not Found - 1k KV Pairs Searched in Sequential Order with 3.7 TB of Data Written in Sequential Order

Following this, we performed the same operation; however, we requested the KV pairs in random order, and the observations are presented in Figure 30. None of these KV pairs or keys are located on the Kinetic Drive. We realized that the pattern obtained in all the cases was very similar. Therefore, while it could not be confirmed that searching for unavailable keys is entirely independent of the order in which data is written, the order has at most a minimal effect on the time it takes to search for keys that are not found. From these four experiments, it can be concluded that the time spent fulfilling a request for each key that is not found is extremely small compared to the time for a successful read operation on the Kinetic Drive. Perhaps the index, which records a list of all the KV pairs on the drive, can easily fit inside the memory and thus takes minimal time to search.

Fig. 30. Keys Not Found - 1k KV Pairs Searched in Random Order with 3.7 TB of Data Written in Sequential Order

B. Effect of Storing KV Pairs Before Reading Back

When read and write operations are performed, it is likely that some of the KV pairs remain in the memory of the Kinetic Drives. Therefore, if we read back the data immediately after writing, the Kinetic Drives might give us higher throughput, or our searches might be faster.

Therefore, we wrote 3.7 TB of data to the Kinetic Drive in random order and then kept the drive idle for a week. After waiting for about a week, we attempted to search for 1,000 KV pairs that could not be found on the drive. We wanted to find out the pattern for accessing 1,000 keys that are not stored on the drive. We plot the time it takes to search for each key in Figure 31. The pattern we obtained is quite different from the previously obtained data (shown in Figure 30). This is primarily because all the data must have been flushed to the disk, and the Kinetic Drive had to fetch it back from the persistent drive.

Fig. 31. Keys Not Found - 1,000 KV Pairs Searched in Random Order After Waiting for 7 Days, with 3.7 TB of Data Written in Random Order

C. Impact of the LevelDB-Based Server-SATA's System Memory on Performance When Keys Are Not Found

In this section, we tried to ascertain the performance of a LevelDB-installed Server-SATA. We performed a test similar to the one mentioned above in this section. However, to test the effect of Server-SATA's memory on the search for unavailable keys, we attempted two different sequences of read and write operations, as given below:

1) Random Write 1 GB, Search for Missing Keys (Figure 32)
2) Random Write 1 GB, Random Read 1 GB, Sequential Read 1 GB, Search for Missing Keys (Figure 33)

For the first sequence, we simply performed the write operation by writing 1,000 key-value pairs of 1 MB value each. Thereafter, we immediately performed the search for 1,000 missing keys and obtained Figure 32. For the second sequence, we performed the write operation by writing 1,000 key-value pairs of 1 MB value each using the script that is provided by Google, Inc. for testing servers. After successful completion, we performed a random order read operation of all the stored key-value pairs. We then read all 1 GB of data in sequential order as well. After these operations, we then searched for keys that are not stored on Server-SATA. This was intentionally done to see the impact of the repeated operations on the latencies of searching for missing key-value pairs.

It is interesting to conclude from these two figures that the latencies reduce substantially when multiple operations have been performed (Figure 33). This trend is obtained primarily because, after repeating several operations, the mem_table is brought into the memory of Server-SATA, and subsequently it becomes much faster for the server to search for keys that are missing from the drive. Hence, the latencies to search for missing keys reduce substantially after repeating several operations.

Fig. 32. Time for Searching Keys Which Are Not Found on the LevelDB-Installed Server-SATA with Write and Read Operations

Fig. 33. Time for Searching Keys Which Are Not Found on the LevelDB-Installed Server-SATA

D. Keys Not Found on the LevelDB-Installed Server-SAS with Multiple Drives

In this section, we discuss the performance implications when the requested keys are not available on the LevelDB-based Server-SAS. We conducted these tests on multiple drives; as an illustration, we tested the Server-SAS system with 1 drive, 2 drives, and 4 drives. We ran the script that comes with LevelDB (db_bench). This script is helpful in benchmarking the system it runs on.
1) 1-Drive Tests: For the first experiment, we used db_bench to write 1,000 KV pairs with a 1 MB value size and a 16-byte key size. We used only 1 drive for this, and afterwards we requested keys that are not stored on the drives of the system. The findings of this experiment are presented in Figure 34. We performed a similar test as mentioned above; however, this time we wrote 1 GB of data in sequential order using db_bench. The findings are presented in Figure 35. It could


VoltDB vs. Redis Benchmark Volt vs. Redis Benchmark Motivation and Goals of this Evaluation Compare the performance of several distributed databases that can be used for state storage in some of our applications Low latency is expected

More information

GearDB: A GC-free Key-Value Store on HM-SMR Drives with Gear Compaction

GearDB: A GC-free Key-Value Store on HM-SMR Drives with Gear Compaction GearDB: A GC-free Key-Value Store on HM-SMR Drives with Gear Compaction Ting Yao 1,2, Jiguang Wan 1, Ping Huang 2, Yiwen Zhang 1, Zhiwen Liu 1 Changsheng Xie 1, and Xubin He 2 1 Huazhong University of

More information

The Impact of SSD Selection on SQL Server Performance. Solution Brief. Understanding the differences in NVMe and SATA SSD throughput

The Impact of SSD Selection on SQL Server Performance. Solution Brief. Understanding the differences in NVMe and SATA SSD throughput Solution Brief The Impact of SSD Selection on SQL Server Performance Understanding the differences in NVMe and SATA SSD throughput 2018, Cloud Evolutions Data gathered by Cloud Evolutions. All product

More information

SAS Enterprise Miner Performance on IBM System p 570. Jan, Hsian-Fen Tsao Brian Porter Harry Seifert. IBM Corporation

SAS Enterprise Miner Performance on IBM System p 570. Jan, Hsian-Fen Tsao Brian Porter Harry Seifert. IBM Corporation SAS Enterprise Miner Performance on IBM System p 570 Jan, 2008 Hsian-Fen Tsao Brian Porter Harry Seifert IBM Corporation Copyright IBM Corporation, 2008. All Rights Reserved. TABLE OF CONTENTS ABSTRACT...3

More information

Fusion iomemory PCIe Solutions from SanDisk and Sqrll make Accumulo Hypersonic

Fusion iomemory PCIe Solutions from SanDisk and Sqrll make Accumulo Hypersonic WHITE PAPER Fusion iomemory PCIe Solutions from SanDisk and Sqrll make Accumulo Hypersonic Western Digital Technologies, Inc. 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Executive

More information

Best Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell Storage PS Series Arrays

Best Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell Storage PS Series Arrays Best Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell Storage PS Series Arrays Dell EMC Engineering December 2016 A Dell Best Practices Guide Revisions Date March 2011 Description Initial

More information

Cold Storage: The Road to Enterprise Ilya Kuznetsov YADRO

Cold Storage: The Road to Enterprise Ilya Kuznetsov YADRO Cold Storage: The Road to Enterprise Ilya Kuznetsov YADRO Agenda Technical challenge Custom product Growth of aspirations Enterprise requirements Making an enterprise cold storage product 2 Technical Challenge

More information

CSE 124: Networked Services Lecture-17

CSE 124: Networked Services Lecture-17 Fall 2010 CSE 124: Networked Services Lecture-17 Instructor: B. S. Manoj, Ph.D http://cseweb.ucsd.edu/classes/fa10/cse124 11/30/2010 CSE 124 Networked Services Fall 2010 1 Updates PlanetLab experiments

More information

IBM B2B INTEGRATOR BENCHMARKING IN THE SOFTLAYER ENVIRONMENT

IBM B2B INTEGRATOR BENCHMARKING IN THE SOFTLAYER ENVIRONMENT IBM B2B INTEGRATOR BENCHMARKING IN THE SOFTLAYER ENVIRONMENT 215-4-14 Authors: Deep Chatterji (dchatter@us.ibm.com) Steve McDuff (mcduffs@ca.ibm.com) CONTENTS Disclaimer...3 Pushing the limits of B2B Integrator...4

More information

Maximizing VMware ESX Performance Through Defragmentation of Guest Systems

Maximizing VMware ESX Performance Through Defragmentation of Guest Systems Maximizing VMware ESX Performance Through Defragmentation of Guest Systems This paper details the results of testing performed to determine if there was any measurable performance benefit to be derived

More information

arxiv: v1 [cs.db] 25 Nov 2018

arxiv: v1 [cs.db] 25 Nov 2018 Enabling Efficient Updates in KV Storage via Hashing: Design and Performance Evaluation Yongkun Li, Helen H. W. Chan, Patrick P. C. Lee, and Yinlong Xu University of Science and Technology of China The

More information

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018 Cloud Computing and Hadoop Distributed File System UCSB CS70, Spring 08 Cluster Computing Motivations Large-scale data processing on clusters Scan 000 TB on node @ 00 MB/s = days Scan on 000-node cluster

More information

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Kefei Wang and Feng Chen Louisiana State University SoCC '18 Carlsbad, CA Key-value Systems in Internet Services Key-value

More information

OPERATING SYSTEM. PREPARED BY : DHAVAL R. PATEL Page 1. Q.1 Explain Memory

OPERATING SYSTEM. PREPARED BY : DHAVAL R. PATEL Page 1. Q.1 Explain Memory Q.1 Explain Memory Data Storage in storage device like CD, HDD, DVD, Pen drive etc, is called memory. The device which storage data is called storage device. E.g. hard disk, floppy etc. There are two types

More information

Block Storage Service: Status and Performance

Block Storage Service: Status and Performance Block Storage Service: Status and Performance Dan van der Ster, IT-DSS, 6 June 2014 Summary This memo summarizes the current status of the Ceph block storage service as it is used for OpenStack Cinder

More information

MOHA: Many-Task Computing Framework on Hadoop

MOHA: Many-Task Computing Framework on Hadoop Apache: Big Data North America 2017 @ Miami MOHA: Many-Task Computing Framework on Hadoop Soonwook Hwang Korea Institute of Science and Technology Information May 18, 2017 Table of Contents Introduction

More information

CSE 124: Networked Services Lecture-16

CSE 124: Networked Services Lecture-16 Fall 2010 CSE 124: Networked Services Lecture-16 Instructor: B. S. Manoj, Ph.D http://cseweb.ucsd.edu/classes/fa10/cse124 11/23/2010 CSE 124 Networked Services Fall 2010 1 Updates PlanetLab experiments

More information

White Paper. File System Throughput Performance on RedHawk Linux

White Paper. File System Throughput Performance on RedHawk Linux White Paper File System Throughput Performance on RedHawk Linux By: Nikhil Nanal Concurrent Computer Corporation August Introduction This paper reports the throughput performance of the,, and file systems

More information

Low Latency Evaluation of Fibre Channel, iscsi and SAS Host Interfaces

Low Latency Evaluation of Fibre Channel, iscsi and SAS Host Interfaces Low Latency Evaluation of Fibre Channel, iscsi and SAS Host Interfaces Evaluation report prepared under contract with LSI Corporation Introduction IT professionals see Solid State Disk (SSD) products as

More information

New Oracle NoSQL Database APIs that Speed Insertion and Retrieval

New Oracle NoSQL Database APIs that Speed Insertion and Retrieval New Oracle NoSQL Database APIs that Speed Insertion and Retrieval O R A C L E W H I T E P A P E R F E B R U A R Y 2 0 1 6 1 NEW ORACLE NoSQL DATABASE APIs that SPEED INSERTION AND RETRIEVAL Introduction

More information

Benchmarking Cloud Serving Systems with YCSB 詹剑锋 2012 年 6 月 27 日

Benchmarking Cloud Serving Systems with YCSB 詹剑锋 2012 年 6 月 27 日 Benchmarking Cloud Serving Systems with YCSB 詹剑锋 2012 年 6 月 27 日 Motivation There are many cloud DB and nosql systems out there PNUTS BigTable HBase, Hypertable, HTable Megastore Azure Cassandra Amazon

More information

TPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage

TPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage TPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage Performance Study of Microsoft SQL Server 2016 Dell Engineering February 2017 Table of contents

More information

Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades

Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades Evaluation report prepared under contract with Dot Hill August 2015 Executive Summary Solid state

More information

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE RAID SEMINAR REPORT 2004 Submitted on: Submitted by: 24/09/2004 Asha.P.M NO: 612 S7 ECE CONTENTS 1. Introduction 1 2. The array and RAID controller concept 2 2.1. Mirroring 3 2.2. Parity 5 2.3. Error correcting

More information

Analysis of high capacity storage systems for e-vlbi

Analysis of high capacity storage systems for e-vlbi Analysis of high capacity storage systems for e-vlbi Matteo Stagni - Francesco Bedosti - Mauro Nanni May 21, 212 IRA 458/12 Abstract The objective of the analysis is to verify if the storage systems now

More information

Performance of Multicore LUP Decomposition

Performance of Multicore LUP Decomposition Performance of Multicore LUP Decomposition Nathan Beckmann Silas Boyd-Wickizer May 3, 00 ABSTRACT This paper evaluates the performance of four parallel LUP decomposition implementations. The implementations

More information

SoftNAS Cloud Performance Evaluation on AWS

SoftNAS Cloud Performance Evaluation on AWS SoftNAS Cloud Performance Evaluation on AWS October 25, 2016 Contents SoftNAS Cloud Overview... 3 Introduction... 3 Executive Summary... 4 Key Findings for AWS:... 5 Test Methodology... 6 Performance Summary

More information

SAS Technical Update Connectivity Roadmap and MultiLink SAS Initiative Jay Neer Molex Corporation Marty Czekalski Seagate Technology LLC

SAS Technical Update Connectivity Roadmap and MultiLink SAS Initiative Jay Neer Molex Corporation Marty Czekalski Seagate Technology LLC SAS Technical Update Connectivity Roadmap and MultiLink SAS Initiative Jay Neer Molex Corporation Marty Czekalski Seagate Technology LLC SAS Connectivity Roadmap Background Connectivity Objectives Converged

More information

CS Project Report

CS Project Report CS7960 - Project Report Kshitij Sudan kshitij@cs.utah.edu 1 Introduction With the growth in services provided over the Internet, the amount of data processing required has grown tremendously. To satisfy

More information

W H I T E P A P E R U n l o c k i n g t h e P o w e r o f F l a s h w i t h t h e M C x - E n a b l e d N e x t - G e n e r a t i o n V N X

W H I T E P A P E R U n l o c k i n g t h e P o w e r o f F l a s h w i t h t h e M C x - E n a b l e d N e x t - G e n e r a t i o n V N X Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.872.8200 F.508.935.4015 www.idc.com W H I T E P A P E R U n l o c k i n g t h e P o w e r o f F l a s h w i t h t h e M C x - E n a b

More information

JMR ELECTRONICS INC. WHITE PAPER

JMR ELECTRONICS INC. WHITE PAPER THE NEED FOR SPEED: USING PCI EXPRESS ATTACHED STORAGE FOREWORD The highest performance, expandable, directly attached storage can be achieved at low cost by moving the server or work station s PCI bus

More information

CSE 451: Operating Systems Spring Module 12 Secondary Storage. Steve Gribble

CSE 451: Operating Systems Spring Module 12 Secondary Storage. Steve Gribble CSE 451: Operating Systems Spring 2009 Module 12 Secondary Storage Steve Gribble Secondary storage Secondary storage typically: is anything that is outside of primary memory does not permit direct execution

More information

HTRC Data API Performance Study

HTRC Data API Performance Study HTRC Data API Performance Study Yiming Sun, Beth Plale, Jiaan Zeng Amazon Indiana University Bloomington {plale, jiaazeng}@cs.indiana.edu Abstract HathiTrust Research Center (HTRC) allows users to access

More information

Strata: A Cross Media File System. Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson

Strata: A Cross Media File System. Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson A Cross Media File System Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson 1 Let s build a fast server NoSQL store, Database, File server, Mail server Requirements

More information

Clustering and Reclustering HEP Data in Object Databases

Clustering and Reclustering HEP Data in Object Databases Clustering and Reclustering HEP Data in Object Databases Koen Holtman CERN EP division CH - Geneva 3, Switzerland We formulate principles for the clustering of data, applicable to both sequential HEP applications

More information

CSE 124: Networked Services Fall 2009 Lecture-19

CSE 124: Networked Services Fall 2009 Lecture-19 CSE 124: Networked Services Fall 2009 Lecture-19 Instructor: B. S. Manoj, Ph.D http://cseweb.ucsd.edu/classes/fa09/cse124 Some of these slides are adapted from various sources/individuals including but

More information

NPTEL Course Jan K. Gopinath Indian Institute of Science

NPTEL Course Jan K. Gopinath Indian Institute of Science Storage Systems NPTEL Course Jan 2012 (Lecture 39) K. Gopinath Indian Institute of Science Google File System Non-Posix scalable distr file system for large distr dataintensive applications performance,

More information

Microsoft SQL Server 2012 Fast Track Reference Configuration Using PowerEdge R720 and EqualLogic PS6110XV Arrays

Microsoft SQL Server 2012 Fast Track Reference Configuration Using PowerEdge R720 and EqualLogic PS6110XV Arrays Microsoft SQL Server 2012 Fast Track Reference Configuration Using PowerEdge R720 and EqualLogic PS6110XV Arrays This whitepaper describes Dell Microsoft SQL Server Fast Track reference architecture configurations

More information

Microsoft SQL Server in a VMware Environment on Dell PowerEdge R810 Servers and Dell EqualLogic Storage

Microsoft SQL Server in a VMware Environment on Dell PowerEdge R810 Servers and Dell EqualLogic Storage Microsoft SQL Server in a VMware Environment on Dell PowerEdge R810 Servers and Dell EqualLogic Storage A Dell Technical White Paper Dell Database Engineering Solutions Anthony Fernandez April 2010 THIS

More information

BIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE

BIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE BIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE BRETT WENINGER, MANAGING DIRECTOR 10/21/2014 ADURANT APPROACH TO BIG DATA Align to Un/Semi-structured Data Instead of Big Scale out will become Big Greatest

More information

CSCI-GA Database Systems Lecture 8: Physical Schema: Storage

CSCI-GA Database Systems Lecture 8: Physical Schema: Storage CSCI-GA.2433-001 Database Systems Lecture 8: Physical Schema: Storage Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com View 1 View 2 View 3 Conceptual Schema Physical Schema 1. Create a

More information

Accelerate Database Performance and Reduce Response Times in MongoDB Humongous Environments with the LSI Nytro MegaRAID Flash Accelerator Card

Accelerate Database Performance and Reduce Response Times in MongoDB Humongous Environments with the LSI Nytro MegaRAID Flash Accelerator Card Accelerate Database Performance and Reduce Response Times in MongoDB Humongous Environments with the LSI Nytro MegaRAID Flash Accelerator Card The Rise of MongoDB Summary One of today s growing database

More information

Database Solutions Engineering. Best Practices for Deploying SSDs in an Oracle OLTP Environment using Dell TM EqualLogic TM PS Series

Database Solutions Engineering. Best Practices for Deploying SSDs in an Oracle OLTP Environment using Dell TM EqualLogic TM PS Series Best Practices for Deploying SSDs in an Oracle OLTP Environment using Dell TM EqualLogic TM PS Series A Dell Technical White Paper Database Solutions Engineering Dell Product Group April 2009 THIS WHITE

More information

HyperDex. A Distributed, Searchable Key-Value Store. Robert Escriva. Department of Computer Science Cornell University

HyperDex. A Distributed, Searchable Key-Value Store. Robert Escriva. Department of Computer Science Cornell University HyperDex A Distributed, Searchable Key-Value Store Robert Escriva Bernard Wong Emin Gün Sirer Department of Computer Science Cornell University School of Computer Science University of Waterloo ACM SIGCOMM

More information

NVMe SSDs Future-proof Apache Cassandra

NVMe SSDs Future-proof Apache Cassandra NVMe SSDs Future-proof Apache Cassandra Get More Insight from Datasets Too Large to Fit into Memory Overview When we scale a database either locally or in the cloud performance 1 is imperative. Without

More information

Correlation based File Prefetching Approach for Hadoop

Correlation based File Prefetching Approach for Hadoop IEEE 2nd International Conference on Cloud Computing Technology and Science Correlation based File Prefetching Approach for Hadoop Bo Dong 1, Xiao Zhong 2, Qinghua Zheng 1, Lirong Jian 2, Jian Liu 1, Jie

More information

Storage. Hwansoo Han

Storage. Hwansoo Han Storage Hwansoo Han I/O Devices I/O devices can be characterized by Behavior: input, out, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections 2 I/O System Characteristics

More information

RACKSPACE ONMETAL I/O V2 OUTPERFORMS AMAZON EC2 BY UP TO 2X IN BENCHMARK TESTING

RACKSPACE ONMETAL I/O V2 OUTPERFORMS AMAZON EC2 BY UP TO 2X IN BENCHMARK TESTING RACKSPACE ONMETAL I/O V2 OUTPERFORMS AMAZON EC2 BY UP TO 2X IN BENCHMARK TESTING EXECUTIVE SUMMARY Today, businesses are increasingly turning to cloud services for rapid deployment of apps and services.

More information

Seagate Enterprise SATA SSD with DuraWrite Technology Competitive Evaluation

Seagate Enterprise SATA SSD with DuraWrite Technology Competitive Evaluation August 2018 Seagate Enterprise SATA SSD with DuraWrite Technology Competitive Seagate Enterprise SATA SSDs with DuraWrite Technology have the best performance for compressible Database, Cloud, VDI Software

More information

Characterizing Storage Resources Performance in Accessing the SDSS Dataset Ioan Raicu Date:

Characterizing Storage Resources Performance in Accessing the SDSS Dataset Ioan Raicu Date: Characterizing Storage Resources Performance in Accessing the SDSS Dataset Ioan Raicu Date: 8-17-5 Table of Contents Table of Contents...1 Table of Figures...1 1 Overview...4 2 Experiment Description...4

More information

Upgrade to Microsoft SQL Server 2016 with Dell EMC Infrastructure

Upgrade to Microsoft SQL Server 2016 with Dell EMC Infrastructure Upgrade to Microsoft SQL Server 2016 with Dell EMC Infrastructure Generational Comparison Study of Microsoft SQL Server Dell Engineering February 2017 Revisions Date Description February 2017 Version 1.0

More information

CMSC 424 Database design Lecture 12 Storage. Mihai Pop

CMSC 424 Database design Lecture 12 Storage. Mihai Pop CMSC 424 Database design Lecture 12 Storage Mihai Pop Administrative Office hours tomorrow @ 10 Midterms are in solutions for part C will be posted later this week Project partners I have an odd number

More information

Quiz for Chapter 6 Storage and Other I/O Topics 3.10

Quiz for Chapter 6 Storage and Other I/O Topics 3.10 Date: 3.10 Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: 1. [6 points] Give a concise answer to each of the following

More information

Lenovo RAID Introduction Reference Information

Lenovo RAID Introduction Reference Information Lenovo RAID Introduction Reference Information Using a Redundant Array of Independent Disks (RAID) to store data remains one of the most common and cost-efficient methods to increase server's storage performance,

More information

CSE 451: Operating Systems Spring Module 12 Secondary Storage

CSE 451: Operating Systems Spring Module 12 Secondary Storage CSE 451: Operating Systems Spring 2017 Module 12 Secondary Storage John Zahorjan 1 Secondary storage Secondary storage typically: is anything that is outside of primary memory does not permit direct execution

More information

IX: A Protected Dataplane Operating System for High Throughput and Low Latency

IX: A Protected Dataplane Operating System for High Throughput and Low Latency IX: A Protected Dataplane Operating System for High Throughput and Low Latency Belay, A. et al. Proc. of the 11th USENIX Symp. on OSDI, pp. 49-65, 2014. Reviewed by Chun-Yu and Xinghao Li Summary In this

More information

Next-Generation Cloud Platform

Next-Generation Cloud Platform Next-Generation Cloud Platform Jangwoo Kim Jun 24, 2013 E-mail: jangwoo@postech.ac.kr High Performance Computing Lab Department of Computer Science & Engineering Pohang University of Science and Technology

More information

Accelerating Big Data: Using SanDisk SSDs for Apache HBase Workloads

Accelerating Big Data: Using SanDisk SSDs for Apache HBase Workloads WHITE PAPER Accelerating Big Data: Using SanDisk SSDs for Apache HBase Workloads December 2014 Western Digital Technologies, Inc. 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents

More information

Accelerating Parallel Analysis of Scientific Simulation Data via Zazen

Accelerating Parallel Analysis of Scientific Simulation Data via Zazen Accelerating Parallel Analysis of Scientific Simulation Data via Zazen Tiankai Tu, Charles A. Rendleman, Patrick J. Miller, Federico Sacerdoti, Ron O. Dror, and David E. Shaw D. E. Shaw Research Motivation

More information

BENEFITS AND BEST PRACTICES FOR DEPLOYING SSDS IN AN OLTP ENVIRONMENT USING DELL EQUALLOGIC PS SERIES

BENEFITS AND BEST PRACTICES FOR DEPLOYING SSDS IN AN OLTP ENVIRONMENT USING DELL EQUALLOGIC PS SERIES WHITE PAPER BENEFITS AND BEST PRACTICES FOR DEPLOYING SSDS IN AN OLTP ENVIRONMENT USING DELL EQUALLOGIC PS SERIES Using Solid State Disks (SSDs) in enterprise storage arrays is one of today s hottest storage

More information

High Performance Solid State Storage Under Linux

High Performance Solid State Storage Under Linux High Performance Solid State Storage Under Linux Eric Seppanen, Matthew T. O Keefe, David J. Lilja Electrical and Computer Engineering University of Minnesota April 20, 2010 Motivation SSDs breaking through

More information

Extremely Fast Distributed Storage for Cloud Service Providers

Extremely Fast Distributed Storage for Cloud Service Providers Solution brief Intel Storage Builders StorPool Storage Intel SSD DC S3510 Series Intel Xeon Processor E3 and E5 Families Intel Ethernet Converged Network Adapter X710 Family Extremely Fast Distributed

More information

Comparing Software versus Hardware RAID Performance

Comparing Software versus Hardware RAID Performance White Paper VERITAS Storage Foundation for Windows Comparing Software versus Hardware RAID Performance Copyright 2002 VERITAS Software Corporation. All rights reserved. VERITAS, VERITAS Software, the VERITAS

More information

Google File System. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google fall DIP Heerak lim, Donghun Koo

Google File System. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google fall DIP Heerak lim, Donghun Koo Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google 2017 fall DIP Heerak lim, Donghun Koo 1 Agenda Introduction Design overview Systems interactions Master operation Fault tolerance

More information

10 Steps to Virtualization

10 Steps to Virtualization AN INTEL COMPANY 10 Steps to Virtualization WHEN IT MATTERS, IT RUNS ON WIND RIVER EXECUTIVE SUMMARY Virtualization the creation of multiple virtual machines (VMs) on a single piece of hardware, where

More information

Sun Lustre Storage System Simplifying and Accelerating Lustre Deployments

Sun Lustre Storage System Simplifying and Accelerating Lustre Deployments Sun Lustre Storage System Simplifying and Accelerating Lustre Deployments Torben Kling-Petersen, PhD Presenter s Name Principle Field Title andengineer Division HPC &Cloud LoB SunComputing Microsystems

More information

S4D-Cache: Smart Selective SSD Cache for Parallel I/O Systems

S4D-Cache: Smart Selective SSD Cache for Parallel I/O Systems S4D-Cache: Smart Selective SSD Cache for Parallel I/O Systems Shuibing He, Xian-He Sun, Bo Feng Department of Computer Science Illinois Institute of Technology Speed Gap Between CPU and Hard Drive http://www.velobit.com/storage-performance-blog/bid/114532/living-with-the-2012-hdd-shortage

More information

WHITE PAPER AGILOFT SCALABILITY AND REDUNDANCY

WHITE PAPER AGILOFT SCALABILITY AND REDUNDANCY WHITE PAPER AGILOFT SCALABILITY AND REDUNDANCY Table of Contents Introduction 3 Performance on Hosted Server 3 Figure 1: Real World Performance 3 Benchmarks 3 System configuration used for benchmarks 3

More information

White paper ETERNUS Extreme Cache Performance and Use

White paper ETERNUS Extreme Cache Performance and Use White paper ETERNUS Extreme Cache Performance and Use The Extreme Cache feature provides the ETERNUS DX500 S3 and DX600 S3 Storage Arrays with an effective flash based performance accelerator for regions

More information

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Organization and Structure. Bing-Yu Chen National Taiwan University Computer Organization and Structure Bing-Yu Chen National Taiwan University Storage and Other I/O Topics I/O Performance Measures Types and Characteristics of I/O Devices Buses Interfacing I/O Devices

More information

This project must be done in groups of 2 3 people. Your group must partner with one other group (or two if we have add odd number of groups).

This project must be done in groups of 2 3 people. Your group must partner with one other group (or two if we have add odd number of groups). 1/21/2015 CS 739 Distributed Systems Fall 2014 PmWIki / Project1 PmWIki / Project1 The goal for this project is to implement a small distributed system, to get experience in cooperatively developing a

More information

CA485 Ray Walshe Google File System

CA485 Ray Walshe Google File System Google File System Overview Google File System is scalable, distributed file system on inexpensive commodity hardware that provides: Fault Tolerance File system runs on hundreds or thousands of storage

More information

IBM V7000 Unified R1.4.2 Asynchronous Replication Performance Reference Guide

IBM V7000 Unified R1.4.2 Asynchronous Replication Performance Reference Guide V7 Unified Asynchronous Replication Performance Reference Guide IBM V7 Unified R1.4.2 Asynchronous Replication Performance Reference Guide Document Version 1. SONAS / V7 Unified Asynchronous Replication

More information

MODERN FILESYSTEM PERFORMANCE IN LOCAL MULTI-DISK STORAGE SPACE CONFIGURATION

MODERN FILESYSTEM PERFORMANCE IN LOCAL MULTI-DISK STORAGE SPACE CONFIGURATION INFORMATION SYSTEMS IN MANAGEMENT Information Systems in Management (2014) Vol. 3 (4) 273 283 MODERN FILESYSTEM PERFORMANCE IN LOCAL MULTI-DISK STORAGE SPACE CONFIGURATION MATEUSZ SMOLIŃSKI Institute of

More information

Using Transparent Compression to Improve SSD-based I/O Caches

Using Transparent Compression to Improve SSD-based I/O Caches Using Transparent Compression to Improve SSD-based I/O Caches Thanos Makatos, Yannis Klonatos, Manolis Marazakis, Michail D. Flouris, and Angelos Bilas {mcatos,klonatos,maraz,flouris,bilas}@ics.forth.gr

More information

VERITAS Foundation Suite TM 2.0 for Linux PERFORMANCE COMPARISON BRIEF - FOUNDATION SUITE, EXT3, AND REISERFS WHITE PAPER

VERITAS Foundation Suite TM 2.0 for Linux PERFORMANCE COMPARISON BRIEF - FOUNDATION SUITE, EXT3, AND REISERFS WHITE PAPER WHITE PAPER VERITAS Foundation Suite TM 2.0 for Linux PERFORMANCE COMPARISON BRIEF - FOUNDATION SUITE, EXT3, AND REISERFS Linux Kernel 2.4.9-e3 enterprise VERSION 2 1 Executive Summary The SPECsfs3.0 NFS

More information

CIS 601 Graduate Seminar. Dr. Sunnie S. Chung Dhruv Patel ( ) Kalpesh Sharma ( )

CIS 601 Graduate Seminar. Dr. Sunnie S. Chung Dhruv Patel ( ) Kalpesh Sharma ( ) Guide: CIS 601 Graduate Seminar Presented By: Dr. Sunnie S. Chung Dhruv Patel (2652790) Kalpesh Sharma (2660576) Introduction Background Parallel Data Warehouse (PDW) Hive MongoDB Client-side Shared SQL

More information

Using Synology SSD Technology to Enhance System Performance Synology Inc.

Using Synology SSD Technology to Enhance System Performance Synology Inc. Using Synology SSD Technology to Enhance System Performance Synology Inc. Synology_WP_ 20121112 Table of Contents Chapter 1: Enterprise Challenges and SSD Cache as Solution Enterprise Challenges... 3 SSD

More information

Recovering Disk Storage Metrics from low level Trace events

Recovering Disk Storage Metrics from low level Trace events Recovering Disk Storage Metrics from low level Trace events Progress Report Meeting May 05, 2016 Houssem Daoud Michel Dagenais École Polytechnique de Montréal Laboratoire DORSAL Agenda Introduction and

More information