A Mathematical Model in Support of Efficient Offloading for Active Storage Architectures


A Mathematical Model in Support of Efficient Offloading for Active Storage Architectures

Dr. Naveenkumar Jayakumar and Mr. Sairam Iyer, Department of Computer Engineering, Bharati Vidyapeeth Deemed University, Pune; EMC Corporation, Bengaluru.
Dr. S. D. Joshi and Dr. S. H. Patil, Department of Computer Engineering, Bharati Vidyapeeth Deemed University, Pune.

Abstract: With the advent of big data and third-platform application technologies, enterprise-class applications are going through a rapid transformation. These applications are expected to deliver actionable business intelligence rather than merely present volumes of information drawn from raw data. As they move up the value chain they become increasingly data intensive, which increases the volume of data that must be moved between the hosts and the storage subsystems. However, the storage I/O subsystem has lagged behind the advancement of processors and memory, forcing the system to deliver huge amounts of data for the execution of data-driven applications in the face of bandwidth limitations, excessive power consumption, network congestion, and redundant copies of data across the end-to-end application stack. This bottleneck in the I/O path negatively impacts end-to-end application performance. One solution proposed by earlier research is to move the compute-intensive tasks of an application closer to the data, so as to minimize the amount of raw data transferred between server and storage. This paper explores the offload architecture options and analyses the factors of application offload architectures that influence end-to-end application performance.

Keywords: Active storage; storage architecture; storage arrays; performance model; modelling of computer architecture; emerging technologies; parallel I/O; processor architecture; distributed systems; multiple-processor systems; performance measures; parallel systems.

I. INTRODUCTION

Applications in high performance computing are I/O intensive because they must access and process vast amounts of data. With current disk drive technology there is an ever-growing gap between compute performance and I/O access rates. Even when raw hardware capacity such as processors and storage is available, it is vital to understand the complexity of scaling and parallelism in order to build software that improves the performance of high performance computing applications. Nowadays a large class of applications has been developed and deployed in almost every domain. These applications generate and process enormous amounts of data; applications that generate, store, retrieve and process huge data sets are called data-intensive applications. Since they interact frequently with the I/O subsystem, they are sometimes also regarded as I/O-intensive applications. There are many emerging technologies in the storage field, such as software-defined storage and storage virtualization, but the question being raised is whether the technologies being developed and deployed will be able to deliver the end-to-end performance required to serve I/O-intensive applications. The answer relates to the process of improving storage performance and identifying bottlenecks in the infrastructure and in application execution. In a cluster system, the compute node is responsible for executing the I/O-intensive functions.
These functions fetch huge amounts of data from the storage array, increasing the bottleneck at the network and under-utilizing hardware resources, since compute, memory and power are spent on network/fabric activity at the storage array end. The network traffic increases the response time of the applications, and end-to-end application performance suffers in the traditional cluster infrastructure. One option is to deploy active storage, a conceptual technology that is still under research. Active storage reduces the consumption of bandwidth between the compute node and the storage array by leveraging the computing power of the storage array: the user's I/O-intensive functions or tasks are offloaded from the compute node to the storage node and executed there. This paper discusses efficient offloading for active storage architectures, considering hardware resources such as core and processor utilization and port utilization; based on these metric values it is decided when to offload I/O-intensive functions from the compute node to the storage node, i.e. when it is feasible for the storage node to accept offloadable tasks or functions for execution at the storage array end. The primary idea is to ascertain the efficiency and performance of the storage array, and the effective performance of the application, with and without offloading tasks to the storage array. The remainder of the paper is organized as follows: Section II explores the various

technologies and terminologies associated with active storage; Section III presents the mathematical model that supports active storage offloading, based on the parameters that decide when to offload a function or process; Section IV evaluates the parameters that support offloading using two applications; and finally the conclusion of this research is presented.

II. LITERATURE SURVEY

Extensive research work has taken place on improving the performance of applications in high performance computing environments. Researchers have proposed new active storage prototypes that help improve end-to-end application performance. Much of the work done in the active storage domain describes only the offload framework and its performance benefits; the notion of modelling the hardware resource consumption of the host as well as the storage array is new and has not been widely explored. [1][2][3] proposed the concept of an active disk architecture in which processing power and memory are embedded on each hard disk. Tasks are offloaded from the host to the hard disks through a stream-based programming model and are executed using the compute power and memory embedded on the disks. An active-disk-based file system was proposed by [1][4][5]; this file system offloads part of its functionality to the active disks. Application-specific tasks can also be migrated to the active disks for execution, with only the results returned to the application. [1][16][17][19] proposed an analytical model for evaluating applications with different patterns and wide applicability that may benefit from active storage. [6][7][8] suggested, based on experimental results on smart disks, that a distributed architecture performs better than partially distributed and centralized systems; the smart disks contained a processor, on-disk memory and memory interfaces. [1][3][8][9] proposed their research in the context of parallel file systems; the approach provides server-to-server communication for reduction and aggregation and designs an enhanced programming interface that enables application code to be embedded in the parallel file system. A combination of active disks and the object storage model was also proposed; the various research ideas in the field of OSD-based active storage focused on changing the architecture of the storage systems. [14][15][16] demonstrated active storage implemented in the Lustre file system. [17][18][19] proposed active storage in Lustre deployed in user space and compared it with active storage in kernel space; data movement is reduced drastically in both prototypes. The user-space implementation is preferable to the kernel-space implementation because, in kernel space, operating system code needs to be altered and new APIs need to be developed to implement active storage, whereas the user-space version is readily deployable, more flexible and faster.

III. WHEN TO OFFLOAD MODEL

A. Active Storage

Exadata is an intelligent storage appliance that supports offloading of tasks close to the data, although this offloading is not full-fledged and is ad hoc in nature. Exadata components can be partitioned into three compartments: database servers, network fabrics and storage servers, also called storage cells. The more storage cells there are, the higher the capacity and bandwidth.
The offloading process in Exadata happens between the database servers and the storage cells through iDB (the intelligent database protocol). iDB ships functions, transparently mapping database operations to Exadata operations, and is also used for transferring data between the database nodes and the storage cells. iDB is implemented using LIBCELL, and the Exadata binaries are linked with LIBCELL to facilitate cell communication. The heart of the Exadata storage cell, providing its unique features and services, is CELLSRV, which provides the majority of the services. Cell offload significantly reduces I/O transfer over the storage network, as well as memory and CPU utilization on the database-tier nodes. Since a direct path read mechanism is implemented, the buffer cache does not get saturated by large I/O requests. DDN also emerged with its own Storage Fusion Architecture (SFA), designed for multiprocessor and multi-core systems. The processors inside the storage are divided into application processors and RAID processors; the application processors are dedicated to executing tightly coupled applications within the storage subsystem. [30] With the help of a virtualization tool the applications are brought into the storage subsystem and executed on the application processors. The operating system within the SFA acts as a hypervisor that controls processor/core, memory, I/O and virtual disk allocations. It also ensures that applications running in the embedded space cannot hinder block operations or their memory space, and that the applications only use the processing and I/O resources with which they have been provisioned. It has been observed that the cost of bandwidth for moving data between the processing nodes and the storage nodes has not improved significantly, while over the same period the capacity of the disk has improved dramatically. Many applications require high-end computing and are shifting towards data-centric computation. Data-centric applications therefore request storage nodes to transfer huge amounts of data to the compute nodes where the applications are processed, and this data transfer often dominates application execution. Highly data-intensive applications have stringent performance requirements. The advent of low-latency, high-bandwidth network architectures and embedded processing has enabled hard disks to function as active storage devices. In order to meet these performance demands, it is essential to optimize the processing and I/O subsystems, and one promising approach is to use active storage. Storage infrastructure heavily impacts application performance

whether in a virtualized or a non-virtualized environment. [30] Storage configurations are directly influenced by the demands of the various applications; the configuration factors can range from array cache sizes and the number of disks per LUN to fan-in/fan-out and other variables. In turn these factors influence how I/O performance is handled and how applications respond, and the situation is no different in a virtualized environment. Many proprietary parallel/distributed file systems (Google's GFS, IBM's GPFS, Panasas ActiveScale, SGI's CXFS) and open-source ones (Lustre, PVFS, Red Hat's GFS) have been developed to manage the increasing volumes of data in high performance computing environments. [2] Active storage is a system that takes advantage of the under-utilized processing power of the computing units in storage nodes instead of using them simply for storing data. Employing the storage nodes to process data as well as store it results in a drastic reduction of redundant data transfers over the network, which substantially boosts performance. Offloading some computing operations and tasks from the application servers (compute nodes) to the computing units of the storage nodes is the core concept of active storage: the computation is moved closer to the data instead of moving data closer to the computation. Instead of moving data out of the storage node, the computation is brought into the storage, which significantly scales down data movement across the network and hence the overall network traffic. [9] Active storage is targeted at I/O-intensive applications.

B. Offload Model

This section begins by emphasizing the main idea of this paper for data-intensive applications in high performance computing environments. Figure 1 below shows the generic idea of offloading taking place between the host and the storage array. The core question addressed in this paper is whether the storage array will perform well, or its performance will degrade, if tasks are offloaded to the array from the application server, and whether end-to-end application performance improves as a result. Figure 1 shows a simple generic offload framework consisting of the components that make offloading feasible; the dotted components are hardware and the others are logical components. This paper derives the question of when to offload from the work proposed in [9], which states that offloading should only be done once a bandwidth requirement analysis, built into the dynamic active storage, has been completed. At the host end, applications interact with the distributed file system API to carry out I/O operations with the storage array. A new API is developed at the host end as well as at the storage array side and is embedded into the file systems and kernels. These newly created APIs interact with each other in order to perform simple I/O operations or to offload tasks to the active storage array; the active storage API responds to offloading requests. As in earlier studies, the distributed file system is used to create the local I/O API, which allows the local stripes to be viewed as a file and helps the kernels read the data locally. [30] The I/O requests are received by the API at the active storage end, which starts gathering the scattered pieces of data required by the request.
Once evaluation of the requested data is complete, the I/O requests and the set of tasks are offloaded from the host to this active storage node. [4] These tasks are then executed by the storage array and the results are returned to the host.

Figure 1: Offloading model

C. When to Offload

A storage array contains various components: the front-end ports connecting the host HBAs to the storage array; the cache, which serves requests from the hosts and thereby increases the performance of the storage array [J 2015]; the back-end ports, which connect the front end and the cache to the disks; the disks, which store the data; and the processors, which execute I/O requests and send the data to the host. The front-end ports are an important component, receiving requests from one or more hosts via their HBAs.
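To make the offload exchange of Figure 1 concrete before formalizing it, a minimal sketch of the request/response interaction between the host-side API and the active storage API is given below. The message fields, thresholds and admission rule are illustrative assumptions introduced here, not the interface proposed in this paper; the actual conditions are developed in the remainder of this section.

# Hypothetical sketch of the host-to-active-storage offload exchange shown in Figure 1.
# The field names, thresholds and admission rule below are illustrative assumptions,
# not the interface proposed in this paper.
from dataclasses import dataclass

@dataclass
class OffloadRequest:
    task_id: str             # identifier of the I/O-intensive function to offload
    est_host_time_s: float   # estimated execution time if run natively on the host
    est_array_time_s: float  # estimated execution time if run on the storage array
    est_offload_bytes: int   # bytes needed to ship the task (not the data) to the array

@dataclass
class OffloadResponse:
    accepted: bool
    reason: str

def handle_offload(req: OffloadRequest,
                   port_utilization: float,    # current front-end port utilization, 0..1
                   cpu_utilization: float,     # current array CPU utilization, 0..1
                   port_threshold: float = 0.8,
                   cpu_threshold: float = 0.8) -> OffloadResponse:
    # Storage-side admission check in the spirit of the conditions developed below.
    if req.est_array_time_s >= req.est_host_time_s:
        return OffloadResponse(False, "no execution-time benefit at the array")
    if port_utilization >= port_threshold:
        return OffloadResponse(False, "front-end ports already too busy")
    if cpu_utilization >= cpu_threshold:
        return OffloadResponse(False, "array CPUs already too busy")
    return OffloadResponse(True, "accepted: execute near the data")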

The utilization of the ports and processors plays an important role in determining whether to offload functions from the host to the array. In current storage technology many functions are being embedded in the storage array; arrays perform functions such as compression and snapshots in addition to servicing the I/O requests from the hosts. While the regular I/O operations are being performed, the components are utilized and the data transferred from the disks to the hosts consumes network bandwidth in proportion to the data size required by the applications. If, in addition to these I/O requests, functions are offloaded from the host to the storage array, there is a chance of very high resource utilization, which leads to higher response times and a reduced number of IOPS; this in turn degrades the performance of the storage array executing the work near the data. The factors that impact the performance of the storage array are the storage array port utilization, the storage core utilization, and the bandwidth consumed for offloading. [Jayakumar et al. 2015; Bhardwaj 2015; Naveenkumar et al. 2015] When any function is offloaded to the storage array, the dotted components in Figure 1 are impacted. Each component has a threshold value that is monitored whenever the array is in action; the thresholds are set to keep performance high, and once a threshold is crossed by any component in the array, the performance is checked to see whether it is dropping. The overall performance of the storage depends on both software and hardware resource utilization. This paper centres on finding whether offloading to the array will increase performance, and when an offload request will be accepted and serviced by the storage array.

The total cost of execution time of the functions or tasks, viewed end to end from the application to the storage array, is given by equation (1), where Task execution host is the task execution time at the host and Task execution array fraction is the fraction of the local execution time that remote execution takes. From equation (1) we can conclude that offloading is only beneficial when the host execution time is greater than the array execution time, which is expressed as equation (2). At the same time, if the cost involved in executing the function at the array side is small relative to the total capacity available and also less than what the host requires, then as per equation (2) the functions or tasks are eligible for offloading to the storage array. With respect to the network, the cost of bandwidth should also be greater than the cost of offloading the functions to the storage array, as shown in equation (3).

Keeping the performance of the end-to-end application and storage array deployment in mind, it is important to monitor the resource utilization and the bandwidth consumed when the functions or tasks are offloaded to the storage for execution. If the bandwidth consumption for offloading the I/O requests and tasks to the active storage system is high and the processing cost is also more than that of native I/O execution, then the host takes up the tasks and I/O requests and executes them natively rather than offloading them to the active storage node. If the bandwidth cost is lower, the host offloads the tasks and I/O requests to the active storage node for execution using the parallel I/O APIs. [4]

As discussed earlier, component utilization also plays a very important role in determining whether tasks can be offloaded to the storage array. The first components that receive the requests from the host HBAs are the front-end ports. If the demand of the front-end ports is exceeded by the utilization of the ports, then offloading the functions to the storage array is not recommended. To understand the scenario it is necessary to monitor how much is being demanded and for how long, and what request sizes are being transferred to and from the front-end ports and for what duration. The demand for the front-end ports is given by equation (4), and the utilization of the ports is defined by equation (5). Thus from equations (4) and (5) we can check the feasibility and viability of offloading to the storage array. It is only effective to offload, and to expect high performance from the storage array, when the port demand is less than the port utilization, as shown in equation (6).
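For orientation, one plausible way to write the relationships referred to as (1) to (6) above, consistent with the surrounding definitions, is sketched below in LaTeX-style notation. The symbols and the primed numbering are introduced here for illustration only and should not be read as the authors' exact equations.

T_{array} = f_{array} \cdot T_{host}   (1')
T_{array} < T_{host}, \text{ i.e. } f_{array} < 1   (2')
C_{bw}(\text{data transfer}) > C_{offload}(\text{function transfer and remote execution})   (3')
D_{port} = \dfrac{\text{request size} \times \text{request count}}{\text{port capacity} \times \text{observation interval}}   (4')
U_{port} = \dfrac{\text{port busy time}}{\text{observation interval}}   (5')
D_{port} < U_{port}   (6')

Here T_{host} is the task execution time at the host, f_{array} is the fraction of the local execution time that remote execution takes, and C denotes cost; these names are assumptions of this sketch.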

The CPUs in the storage array are responsible for supervising the disks and the array components. The CPUs receive the I/O transfer requests from the host; these host requests are translated into individual I/O operations on the physical drives in the array. The disk array CPU is essentially a compute engine that routes data from host to disk and vice versa. Before proposing a new storage architecture, or making improvements to the storage systems and the storage network, it is necessary to make sure the performance is as expected. Since in the active storage approach the functions are offloaded, the CPUs also need to perform computation along with responding to the I/O requests. In order to find whether a function can be offloaded from the host, it is therefore important to consider the utilization of the processors inside the array at the time the offloading takes place; this is expressed by equation (7). The CPU clock frequency is assumed constant, so capacity is the product of the number of cores and the time interval for which they are used. The required capacity is calculated as the product of the interval of time for which CPUs are required and the number of CPUs consumed, while the total capacity is the product of the total number of CPUs and the total interval time. If the potentially available CPU is small, the utilization of the CPUs is already at its maximum, and the demanded utilization is further increased by offloading, then the performance of the array will degrade; hence, to keep the utilization of the CPUs in the storage array in check, it is necessary to consider these environment parameters while offloading tasks and requests from the host. Meanwhile, the migration of data between the two nodes can be drastically reduced by deploying intelligent data distribution methods that consider the dependencies among the distributed data. [4] The offloaded computations are carried out by the kernel at the array end; the processing kernels are invoked by a helper daemon process to execute the offloaded computations and I/O requests.

The performance of the storage array is realized through the response time and IOPS metrics. The utilization of the array is proportional to the number of request jobs, as given by equation (8), where N is the number of request jobs. The response time and IOPS are measured through equations (9) and (10), where tr is the response time and Ats is the array service time. These equations are only applicable to active storage architectures, and the performance of the storage is given by equations (9) and (10).

Considering the set of equations described above, an operation should be offloaded to the storage array only when the cost of execution at the array is less than the cost of execution at the host, the current port utilization in the array is less than what is being demanded at that time, and the CPU utilization of the array does not exceed its threshold while the current demand does not exceed the current utilization of the CPUs in the array. If these three conditions, equations (2), (3) and (7), are satisfied, then offloading will improve the performance of the storage array.

IV. EVALUATION

A. Testbed

A testbed was prepared in order to check whether offloading of functions yields performance improvements at the application level and utilizes the hardware resources of the I/O subsystem. The testbed consists of two systems connected through 1 Gbps Ethernet. The first system, assumed to be the storage node, had an i7 processor at 3.1 GHz, 8 GB RAM and four 250 GB hard disks, and was running CentOS 6.5. The second system, considered to be the application hosting system, had an i7 processor at 3.1 GHz, 4 GB RAM and one 250 GB hard disk, and was also running CentOS 6.5.

B. Applications for Testing

The applications chosen for testing are the Hadoop word count example and an RDBMS, PostgreSQL.

C. Preparing Data

The database created is 10 GB in size, with nearly 800 tables and 200 varchar and integer attributes across the database. No metadata was created, so any query fired has to access all the tables directly rather than relying on indexes and their values. Multiple queries were generated which access all the tables and scan the individual tables one by one. For the word count program there were in total 10 text files containing stories, each nearly 1.2 GB in size.
The total size of the data prepared for the word count program was nearly 12 GB.

D. Test Execution

The performance measurement was done on the machine assumed to be the storage system. The data is stored in the storage system and the application executes on the other node. The offloading model explained above is used to offload the application from the hosting system to the storage system. The performance of the storage system is measured with respect to CPU utilization, impact on application performance, and network data transfer. These metrics are measured in both scenarios: when the application runs entirely at the host end, i.e. without offloading, and when the application is offloaded to the storage system. The graphs and tables below show the results for both applications, the word count program and PostgreSQL.
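For reference, the kind of sampling used to obtain these metrics can be sketched as a short script. The psutil package, the one-second sampling interval and the helper names below are assumptions of this illustration, not the instrumentation used in the experiments.

# Minimal measurement sketch: sample CPU utilization and bytes sent on the network
# while a workload runs, for either testbed configuration. psutil, the one-second
# sampling interval and the helper names are assumptions of this illustration.
import threading
import time
import psutil

def measure(run_workload, interval_s=1.0):
    # Returns (average CPU percent, elapsed seconds, megabytes sent on the network).
    cpu_samples = []
    bytes_before = psutil.net_io_counters().bytes_sent
    started = time.time()

    done = threading.Event()
    worker = threading.Thread(target=lambda: (run_workload(), done.set()))
    worker.start()
    while not done.is_set():
        cpu_samples.append(psutil.cpu_percent(interval=interval_s))
    worker.join()

    elapsed = time.time() - started
    mb_sent = (psutil.net_io_counters().bytes_sent - bytes_before) / 1e6
    avg_cpu = sum(cpu_samples) / max(len(cpu_samples), 1)
    return avg_cpu, elapsed, mb_sent

Calling measure() once with the query executed locally and once with it offloaded yields CPU utilization, latency and network transfer values of the kind compared in the tables and graphs that follow.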

Table 1: Latency measured while executing the query with and without the offloading logic for the word count application

Table 2: Network bandwidth measured while executing the query with and without the offloading logic for the word count application

Figure 2: Latency measured while executing the query with and without the offloading logic for the word count application

Figure 3: Network bandwidth measured while executing the query with and without the offloading logic for the word count application

Table 3: CPU utilization measured while executing the query with and without the offloading logic for the word count application

Figure 4: CPU utilization measured while executing the query with and without the offloading logic for the word count application

E. Execution Mechanics

The PostgreSQL database processes execute in a client-server fashion, with communication happening through the client interface library; the inter-process communication mechanism is shown in the diagram below. The testbed evaluated and explained in this paper includes configurations both with and without the offloading logic. Through the offloading logic explained in the offload model section above, the PostgreSQL server (backend) process component is pushed into the storage system while the remaining database processes execute at the database/application hosting node.

Figure 5: PostgreSQL without offloading logic

The figure above shows the process components of PostgreSQL. The backend process consists of components such as the parser, rewriter, planner and executor. The executor is the most I/O-intensive component; it executes functions such as fetching data, storing data and scanning. The logic behind the executor is the execution of a plan tree, which is a pipelined, demand-pull arrangement of processing nodes. The execution of queries and processes in PostgreSQL without the offloading logic is depicted in the figure above: the data is fetched from the storage system and stored in the shared disk buffers (shared memory), and the executor then fetches the data and executes the query over it. In this style of execution the data has to be moved over the network to the application host, which increases the transfer time because of the size of the data to

be migrated; in turn this process impacts the application response time. In order to improve the performance of the application and the utilization of resources at the storage end, the offloading logic is proposed and deployed. The offloading logic works using the concepts of process migration and RPC. When it is applied to the existing configuration shown in Figure 5, the executor, i.e. the server (backend) process, is migrated or offloaded to the storage system, thereby moving the I/O-intensive processing/executing component closer to the component where the data resides. The offloading works on the concept of RPC/RMI packages and is being developed to offload the I/O-intensive components of the application and migrate them near the data. Since this paper focuses on whether it is viable to offload the compute logic to the storage system every time, and on which resources are affected by offloading, we limit the discussion of the offloading logic to a conceptual view. The figure below explains the configuration of the second system in the testbed, where the offloading logic is deployed at the application hosting node. Since a Linux flavour is installed at the storage node, the migration becomes possible with the set of existing libraries at the storage end.

Figure 6: PostgreSQL with offloading logic

The offloading logic identifies the I/O-bound tasks and migrates them to the node where the data resides. The postmaster daemon receives the offloaded tasks from the kernel libraries at the storage end and spawns them across multiple cores for effective execution of the queries. Once these cores are assigned the compute components, they schedule both the I/O services and the offloaded tasks with the help of the operating system. This paper focuses on how well these cores can manage the execution of both kinds of work in parallel; in some cases, high resource utilization at the storage end may hinder the normal functioning of the system. In order to understand the viability of offloading, tests were performed to measure and compare how CPU utilization and application performance react under both testbed configurations.

F. Test Observations

The evaluations were made with the objective of understanding the impact on storage resource utilization and application performance when I/O-intensive tasks are offloaded from the application hosting system to the storage system. Two applications, Hadoop's word count and PostgreSQL, were considered for testing their performance under both system models. Application performance and CPU utilization metrics were recorded and graphs were generated for them. This paper considers two models. The first model is the traditional server-storage configuration and communication channel, in which the application running at the application server requests data from the storage server, the required data is migrated from storage to the memory of the server system, and the application then executes on the data at the server end. The second model is the offloaded model, in which the hardware resource structure remains intact and only the interaction between the server and the storage system changes: the application's I/O-intensive functions, such as the executor process of PostgreSQL or the count function (including the scan and comparator) of word count, are partitioned from the whole application and migrated to the storage node where the data resides.
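As an illustration of this kind of partitioning, a minimal RPC-based offload of the word count scan might look as follows. This sketch uses Python's standard xmlrpc module purely for illustration; the authors' offloading logic is based on their own RPC/RMI packages, and the host name, port and data path below are assumptions.

# storage_node.py: runs on the node where the data resides. A hypothetical sketch
# using Python's standard xmlrpc module; the host name, port, data path and function
# are illustrative and do not describe the authors' RPC/RMI implementation.
import os
from collections import Counter
from xmlrpc.server import SimpleXMLRPCServer

DATA_DIR = "/data/stories"   # assumed location of the word count input files

def word_count(filename, top_n=10):
    # Scan one local file and return only the top_n most frequent words; the
    # multi-gigabyte file itself never leaves the storage node.
    counts = Counter()
    with open(os.path.join(DATA_DIR, filename), "r", errors="ignore") as fh:
        for line in fh:
            counts.update(line.lower().split())
    return dict(counts.most_common(top_n))

if __name__ == "__main__":
    server = SimpleXMLRPCServer(("0.0.0.0", 8000), allow_none=True)
    server.register_function(word_count)
    server.serve_forever()

The application host then invokes the function remotely instead of copying the files:

# host_node.py: the application server ships the call, not the data.
from xmlrpc.client import ServerProxy

storage = ServerProxy("http://storage-node:8000/", allow_none=True)
print(storage.word_count("story01.txt", 5))   # only the small result crosses the network

Only the result dictionary crosses the network; the input files are read locally on the storage node, which is the effect measured in the observations below.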
From the graph of application latency for the word count application (Figure 2), it is observed that there is a drastic drop in execution time when the application is executed in the second model, with offloading. When the application is executed in the traditional system model, it copies the data from the storage to the buffer memory of the server where the application executes; in this case the response time increases because the data is transferred between nodes. In the latter model, the I/O-intensive function is moved close to the data, which takes less time because, compared to the size of the data, the size of the application's I/O-intensive task is small, so it can be transferred quickly to the storage node and executed on the storage node itself, with only the results transferred back to the application server. The graph reflects nearly 6 GB of data being transferred from the storage to the server node before the scan and count functions execute on the transferred data, which takes considerably longer to complete and return results. When the same application is executed in the second model with offloading, in which the count and comparator functions of word count are transferred to the storage system and executed there, the observed response time is 2.3 seconds, a drastic drop from the earlier measured value. The reason behind this effective response time is that the transfer of a huge amount of data is cut out and only the data-intensive functions are transferred instead.

In continuation of the above evaluation of application response time, two further performance components were also covered. The throughput also comes down, since data is no longer being transferred and only tasks move between the server and the storage. For the word count as well as the database application, it is observed that the data transfer rate in MB/s is reduced when the application is executed with offloading of the data-intensive tasks.

Table 4: Latency measured while executing the query with and without the offloading logic for the PostgreSQL application

Table 5: Throughput measured while executing the query with and without the offloading logic for the PostgreSQL application

Figure 7: Latency measured while executing the query with and without the offloading logic for the PostgreSQL application

Figure 8: Throughput measured while executing the query with and without the offloading logic for the PostgreSQL application

Table 6: CPU utilization measured while executing the query with and without the offloading logic for the PostgreSQL application

Figure 9: CPU utilization measured while executing the query with and without the offloading logic for the PostgreSQL application

In the storage system the cores are used by the operating system and the I/O services, and utilization varies from nearly 0% to 33% in the traditional storage system. By some standards, storage array CPU utilization can go up to 80%, with the remaining 20% kept as a buffer so that, if something goes wrong, the CPU utilization can be extended by that further 20%. But since in the majority of cases the CPU utilization only reaches about 50% to 70%, this resource is under-utilized at the storage end, and offloading becomes an effective way to increase CPU utilization. When the application is offloaded to the storage system, the CPU utilization increases to 51%, in exchange for a gain in application performance, and there is still headroom for further offloading to the storage. Thus it can be concluded that when offloading happens the CPU utilization can increase drastically, depending on the size of the application and the native I/O services provided by the storage system. To keep track of resource utilization at the storage end, this paper has proposed in the sections above when to offload the I/O-intensive tasks.

CONCLUSION

This research paper discusses the key factors that influence the performance of the storage array if and when tasks are offloaded to it. Using these factors, or performance metrics, it can be decided whether offloading particular functions or tasks will improve the efficiency and performance of the storage array. The metrics include CPU utilization (%) and the bandwidth cost of task execution. The metrics elaborated in the sections above can guide the design and development of offloading logic that can be deployed in active storage architectures or frameworks, whereby resources are used efficiently by permitting or denying offloading.

REFERENCES

[1] Riedel, E., Gibson, G. and Faloutsos, C., 1998, August. Active storage for large-scale data mining and multimedia applications. In Proceedings of the 24th Conference on Very Large Databases.
[2] Piernas, J., Nieplocha, J. and Felix, E.J., 2007, November. Evaluation of active storage strategies for the Lustre parallel file system. In Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (p. 28). ACM.
[3] Henley, M.R., International Business Machines Corporation. Method and system for managing movement of large multi-media data files from an archival storage to an active storage within a multi-media server computer system. U.S. Patent 5,745,756.
[4] Shimada, A., Nozawa, M. and Nakano, T., Hitachi, Ltd. External storage unit comprising active and inactive storage wherein data is stored in an active storage if in use and archived to an inactive storage when not accessed in predetermined time by the host processor. U.S. Patent 5,584,008.
[5] Chockler, G. and Malkhi, D. Active disk Paxos with infinitely many processes. Distributed Computing, 18(1).

[6] Chen, C. and Chen, Y., 2012, September. Dynamic active storage for high performance I/O. In 2012 International Conference on Parallel Processing (ICPP). IEEE.
[7] Son, S.W., Lang, S., Carns, P., Ross, R., Thakur, R., Ozisikyilmaz, B., Kumar, P., Liao, W.K. and Choudhary, A., 2010, May. Enabling active storage on parallel I/O software stacks. In 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST) (pp. 1-12). IEEE.
[8] Wickremesinghe, R., Chase, J.S. and Vitter, J.S. Distributed computing with load-managed active storage. In Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC). IEEE.
[9] Felix, E.J., Fox, K., Regimbal, K. and Nieplocha, J., 2006, April. Active storage processing in a parallel file system. In Proc. of the 6th LCI International Conference on Linux Clusters: The HPC Revolution (p. 85).
[10] Qin, L. and Feng, D., 2006, April. Active storage framework for object-based storage device. In 20th International Conference on Advanced Information Networking and Applications, Volume 1 (AINA'06) (Vol. 2). IEEE.
[11] Ma, X. and Reddy, A.N. MVSS: an active storage architecture. IEEE Transactions on Parallel and Distributed Systems, 14(10).
[12] Xie, Y., Muniswamy-Reddy, K.K., Feng, D., Long, D.D., Kang, Y., Niu, Z. and Tan, Z., 2011, May. Design and evaluation of Oasis: An active storage framework based on the T10 OSD standard. In 2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST) (pp. 1-12). IEEE.
[13] Amiri, K., Petrou, D., Ganger, G. and Gibson, G. Dynamic function placement in active storage clusters. Carnegie Mellon University, School of Computer Science.
[14] Zhang, Y. and Feng, D., 2008, March. An active storage system for high performance computing. In 22nd International Conference on Advanced Information Networking and Applications (AINA 2008). IEEE.
[15] John, T.M., Ramani, A.T. and Chandy, J.A., 2008, September. Active storage using object-based devices. In 2008 IEEE International Conference on Cluster Computing. IEEE.
[16] Fitch, B.G., Rayshubskiy, A., Pitman, M.C., Ward, T.J. and Germain, R.S., 2009, November. Using the active storage fabrics model to address petascale storage challenges. In Proceedings of the 4th Annual Workshop on Petascale Data Storage. ACM.
[17] Acharya, A., Uysal, M. and Saltz, J. Active disks: Programming model, algorithms and evaluation. ACM SIGPLAN Notices, 33(11).
[18] Riedel, E., Faloutsos, C., Gibson, G.A. and Nagle, D. Active disks for large-scale data processing. Computer, 34(6).
[19] Riedel, E. and Gibson, G. Active disks: remote execution for network-attached storage. Carnegie Mellon University, School of Computer Science.
[20] Uysal, M., Acharya, A. and Saltz, J. Evaluation of active disks for decision support databases. In Proceedings of the Sixth International Symposium on High-Performance Computer Architecture (HPCA-6). IEEE.
[21] Riedel, E., Gibson, G. and Faloutsos, C., 1998, August. Active storage for large-scale data mining and multimedia applications. In Proceedings of the 24th Conference on Very Large Databases.
[22] Delmerico, J.A., Byrnes, N.A., Bruno, A.E., Jones, M.D., Gallo, S.M. and Chaudhary, V., 2009, December. Comparing the performance of clusters, Hadoop, and Active Disks on microarray correlation computations. In HiPC.
[23] Keeton, K., Patterson, D.A. and Hellerstein, J.M. A case for intelligent disks (IDISKs). ACM SIGMOD Record, 27(3).
[24] D. T. Jayakumar and R. Naveenkumar, Active Storage, Int. J. Adv. Res. Comput. Sci. Softw. Eng., vol. 2, no. 9.
[25] M. N. Jayakumar, M. F. Zaeimfar, M. M. Joshi, and S. D. Joshi, A Generic Performance Evaluation Model for the File Systems, Int. J. Comput. Eng. Technol., vol. 5, no. 1.
[26] M. N. Jayakumar, M. F. Zaeimfar, M. M. Joshi, and S. D. Joshi, International Journal of Computer Engineering & Technology (IJCET), J. Impact Factor, vol. 5, no. 1.
[27] N. Jayakumar, T. Bhardwaj, K. Pant, S. D. Joshi, and S. H. Patil, A Holistic Approach for Performance Analysis of Embedded Storage Array, Int. J. Sci. Technol. Eng., vol. 1, no. 12.
[28] N. Jayakumar, S. Singh, S. H. Patil, and S. D. Joshi, Evaluation Parameters of Infrastructure Resources Required for Integrating Parallel Computing Algorithm and Distributed File System, IJSTE, vol. 1, no. 12.
[29] S. D. J. M. J. Naveenkumar Farid, A Generic Performance Evaluation Model for the File Systems, Int. J. Comput. Eng. Technol., vol. 5, no. 1, p. 6.
[30] P. D. S. D. J. Naveenkumar J, Evaluation of Active Storage System Realized through MobilityRPC, Int. J. Innov. Res. Comput. Commun. Eng., vol. 3, no. 11.
[31] S. D. J. Naveenkumar Jayakumar Farid Zaeimfar, Workload Characteristics Impacts on File System Benchmarking, Int. J. Adv. Res. Comput. Sci. Softw. Eng., vol. 4, no. 2.
[32] J. Naveenkumar, R. Makwana, S. D. Joshi, and D. M. Thakore, Performance Impact Analysis of Application Implemented on Active Storage Framework, Int. J., vol. 5, no. 2.
[33] J. Naveenkumar, R. Makwana, S. D. Joshi, and D. M. Thakore, Offloading Compression and Decompression Logic Closer to Video Files Using Remote Procedure Call, J. Impact Factor, vol. 6, no. 3.
[34] R. Salunkhe, A. D. Kadam, N. Jayakumar, and S. Joshi, Luster A Scalable Architecture File System: A Research Implementation on Active Storage Array Framework with Luster File System, in ICEEOT.
[35] R. Salunkhe, A. D. Kadam, N. Jayakumar, and D. Thakore, In Search of a Scalable File System: State-of-the-art File Systems Review and Map View of New Scalable File System, in International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), 2015.
[36] Acharya, A., Uysal, M. and Saltz, J. Active disks: Programming model, algorithms and evaluation. ACM SIGPLAN Notices, 33(11).


More information

The Analysis Research of Hierarchical Storage System Based on Hadoop Framework Yan LIU 1, a, Tianjian ZHENG 1, Mingjiang LI 1, Jinpeng YUAN 1

The Analysis Research of Hierarchical Storage System Based on Hadoop Framework Yan LIU 1, a, Tianjian ZHENG 1, Mingjiang LI 1, Jinpeng YUAN 1 International Conference on Intelligent Systems Research and Mechatronics Engineering (ISRME 2015) The Analysis Research of Hierarchical Storage System Based on Hadoop Framework Yan LIU 1, a, Tianjian

More information

MVSS: Multi-View Storage System

MVSS: Multi-View Storage System MVSS: Multi-View Storage System Xiaonan Ma and A. L. Narasimha Reddy Department of Electrical Engineering Texas A & M University College Station, TX 77843-3128 xiaonan, reddy @ee.tamu.edu Abstract This

More information

EMC XTREMCACHE ACCELERATES ORACLE

EMC XTREMCACHE ACCELERATES ORACLE White Paper EMC XTREMCACHE ACCELERATES ORACLE EMC XtremSF, EMC XtremCache, EMC VNX, EMC FAST Suite, Oracle Database 11g XtremCache extends flash to the server FAST Suite automates storage placement in

More information

Data Archiving Using Enhanced MAID

Data Archiving Using Enhanced MAID Data Archiving Using Enhanced MAID Aloke Guha COPAN Systems Inc. aloke.guha@copansys.com Abstract * This paper discusses archive data and its specific attributes, as well as the requirements for storing,

More information

Performance Extrapolation for Load Testing Results of Mixture of Applications

Performance Extrapolation for Load Testing Results of Mixture of Applications Performance Extrapolation for Load Testing Results of Mixture of Applications Subhasri Duttagupta, Manoj Nambiar Tata Innovation Labs, Performance Engineering Research Center Tata Consulting Services Mumbai,

More information

S K T e l e c o m : A S h a r e a b l e D A S P o o l u s i n g a L o w L a t e n c y N V M e A r r a y. Eric Chang / Program Manager / SK Telecom

S K T e l e c o m : A S h a r e a b l e D A S P o o l u s i n g a L o w L a t e n c y N V M e A r r a y. Eric Chang / Program Manager / SK Telecom S K T e l e c o m : A S h a r e a b l e D A S P o o l u s i n g a L o w L a t e n c y N V M e A r r a y Eric Chang / Program Manager / SK Telecom 2/23 Before We Begin SKT NV-Array (NVMe JBOF) has been

More information

朱义普. Resolving High Performance Computing and Big Data Application Bottlenecks with Application-Defined Flash Acceleration. Director, North Asia, HPC

朱义普. Resolving High Performance Computing and Big Data Application Bottlenecks with Application-Defined Flash Acceleration. Director, North Asia, HPC October 28, 2013 Resolving High Performance Computing and Big Data Application Bottlenecks with Application-Defined Flash Acceleration 朱义普 Director, North Asia, HPC DDN Storage Vendor for HPC & Big Data

More information

Performance of relational database management

Performance of relational database management Building a 3-D DRAM Architecture for Optimum Cost/Performance By Gene Bowles and Duke Lambert As systems increase in performance and power, magnetic disk storage speeds have lagged behind. But using solidstate

More information

IOmark-VM. VMware VSAN Intel Servers + VMware VSAN Storage SW Test Report: VM-HC a Test Report Date: 16, August

IOmark-VM. VMware VSAN Intel Servers + VMware VSAN Storage SW Test Report: VM-HC a Test Report Date: 16, August IOmark-VM VMware VSAN Intel Servers + VMware VSAN Storage SW Test Report: VM-HC-160816-a Test Report Date: 16, August 2016 Copyright 2010-2016 Evaluator Group, Inc. All rights reserved. IOmark-VM, IOmark-VDI,

More information

IBM InfoSphere Streams v4.0 Performance Best Practices

IBM InfoSphere Streams v4.0 Performance Best Practices Henry May IBM InfoSphere Streams v4.0 Performance Best Practices Abstract Streams v4.0 introduces powerful high availability features. Leveraging these requires careful consideration of performance related

More information

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands Unleash Your Data Center s Hidden Power September 16, 2014 Molly Rector CMO, EVP Product Management & WW Marketing

More information

Nutanix Tech Note. Virtualizing Microsoft Applications on Web-Scale Infrastructure

Nutanix Tech Note. Virtualizing Microsoft Applications on Web-Scale Infrastructure Nutanix Tech Note Virtualizing Microsoft Applications on Web-Scale Infrastructure The increase in virtualization of critical applications has brought significant attention to compute and storage infrastructure.

More information

The Fusion Distributed File System

The Fusion Distributed File System Slide 1 / 44 The Fusion Distributed File System Dongfang Zhao February 2015 Slide 2 / 44 Outline Introduction FusionFS System Architecture Metadata Management Data Movement Implementation Details Unique

More information

EMC Business Continuity for Microsoft Applications

EMC Business Continuity for Microsoft Applications EMC Business Continuity for Microsoft Applications Enabled by EMC Celerra, EMC MirrorView/A, EMC Celerra Replicator, VMware Site Recovery Manager, and VMware vsphere 4 Copyright 2009 EMC Corporation. All

More information

Virtualizing SQL Server 2008 Using EMC VNX Series and VMware vsphere 4.1. Reference Architecture

Virtualizing SQL Server 2008 Using EMC VNX Series and VMware vsphere 4.1. Reference Architecture Virtualizing SQL Server 2008 Using EMC VNX Series and VMware vsphere 4.1 Copyright 2011, 2012 EMC Corporation. All rights reserved. Published March, 2012 EMC believes the information in this publication

More information

Shared Parallel Filesystems in Heterogeneous Linux Multi-Cluster Environments

Shared Parallel Filesystems in Heterogeneous Linux Multi-Cluster Environments LCI HPC Revolution 2005 26 April 2005 Shared Parallel Filesystems in Heterogeneous Linux Multi-Cluster Environments Matthew Woitaszek matthew.woitaszek@colorado.edu Collaborators Organizations National

More information

Scaling-Out with Oracle Grid Computing on Dell Hardware

Scaling-Out with Oracle Grid Computing on Dell Hardware Scaling-Out with Oracle Grid Computing on Dell Hardware A Dell White Paper J. Craig Lowery, Ph.D. Enterprise Solutions Engineering Dell Inc. August 2003 Increasing computing power by adding inexpensive

More information

Benchmark of a Cubieboard cluster

Benchmark of a Cubieboard cluster Benchmark of a Cubieboard cluster M J Schnepf, D Gudu, B Rische, M Fischer, C Jung and M Hardt Steinbuch Centre for Computing, Karlsruhe Institute of Technology, Karlsruhe, Germany E-mail: matthias.schnepf@student.kit.edu,

More information

A SAS/AF Application for Parallel Extraction, Transformation, and Scoring of a Very Large Database

A SAS/AF Application for Parallel Extraction, Transformation, and Scoring of a Very Large Database Paper 11 A SAS/AF Application for Parallel Extraction, Transformation, and Scoring of a Very Large Database Daniel W. Kohn, Ph.D., Torrent Systems Inc., Cambridge, MA David L. Kuhn, Ph.D., Innovative Idea

More information

Running VMware vsan Witness Appliance in VMware vcloudair First Published On: April 26, 2017 Last Updated On: April 26, 2017

Running VMware vsan Witness Appliance in VMware vcloudair First Published On: April 26, 2017 Last Updated On: April 26, 2017 Running VMware vsan Witness Appliance in VMware vcloudair First Published On: April 26, 2017 Last Updated On: April 26, 2017 1 Table of Contents 1. Executive Summary 1.1.Business Case 1.2.Solution Overview

More information

EXTRACT DATA IN LARGE DATABASE WITH HADOOP

EXTRACT DATA IN LARGE DATABASE WITH HADOOP International Journal of Advances in Engineering & Scientific Research (IJAESR) ISSN: 2349 3607 (Online), ISSN: 2349 4824 (Print) Download Full paper from : http://www.arseam.com/content/volume-1-issue-7-nov-2014-0

More information

NetVault Backup Client and Server Sizing Guide 3.0

NetVault Backup Client and Server Sizing Guide 3.0 NetVault Backup Client and Server Sizing Guide 3.0 Recommended hardware and storage configurations for NetVault Backup 12.x September 2018 Page 1 Table of Contents 1. Abstract... 3 2. Introduction... 3

More information

BEST PRACTICES FOR OPTIMIZING YOUR LINUX VPS AND CLOUD SERVER INFRASTRUCTURE

BEST PRACTICES FOR OPTIMIZING YOUR LINUX VPS AND CLOUD SERVER INFRASTRUCTURE BEST PRACTICES FOR OPTIMIZING YOUR LINUX VPS AND CLOUD SERVER INFRASTRUCTURE Maximizing Revenue per Server with Parallels Containers for Linux Q1 2012 1 Table of Contents Overview... 3 Maximizing Density

More information

MOHA: Many-Task Computing Framework on Hadoop

MOHA: Many-Task Computing Framework on Hadoop Apache: Big Data North America 2017 @ Miami MOHA: Many-Task Computing Framework on Hadoop Soonwook Hwang Korea Institute of Science and Technology Information May 18, 2017 Table of Contents Introduction

More information

Research on Implement Snapshot of pnfs Distributed File System

Research on Implement Snapshot of pnfs Distributed File System Applied Mathematics & Information Sciences An International Journal 2011 NSP 5 (2) (2011), 179S-185S Research on Implement Snapshot of pnfs Distributed File System Liu-Chao, Zhang-Jing Wang, Liu Zhenjun,

More information

Cisco 4000 Series Integrated Services Routers: Architecture for Branch-Office Agility

Cisco 4000 Series Integrated Services Routers: Architecture for Branch-Office Agility White Paper Cisco 4000 Series Integrated Services Routers: Architecture for Branch-Office Agility The Cisco 4000 Series Integrated Services Routers (ISRs) are designed for distributed organizations with

More information

New Approach to Unstructured Data

New Approach to Unstructured Data Innovations in All-Flash Storage Deliver a New Approach to Unstructured Data Table of Contents Developing a new approach to unstructured data...2 Designing a new storage architecture...2 Understanding

More information

Technology Insight Series

Technology Insight Series IBM ProtecTIER Deduplication for z/os John Webster March 04, 2010 Technology Insight Series Evaluator Group Copyright 2010 Evaluator Group, Inc. All rights reserved. Announcement Summary The many data

More information

vsan Remote Office Deployment January 09, 2018

vsan Remote Office Deployment January 09, 2018 January 09, 2018 1 1. vsan Remote Office Deployment 1.1.Solution Overview Table of Contents 2 1. vsan Remote Office Deployment 3 1.1 Solution Overview Native vsphere Storage for Remote and Branch Offices

More information

Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades

Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades Evaluation report prepared under contract with Dot Hill August 2015 Executive Summary Solid state

More information

DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE

DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE WHITEPAPER DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE A Detailed Review ABSTRACT While tape has been the dominant storage medium for data protection for decades because of its low cost, it is steadily

More information

Lossless 10 Gigabit Ethernet: The Unifying Infrastructure for SAN and LAN Consolidation

Lossless 10 Gigabit Ethernet: The Unifying Infrastructure for SAN and LAN Consolidation . White Paper Lossless 10 Gigabit Ethernet: The Unifying Infrastructure for SAN and LAN Consolidation Introduction As organizations increasingly rely on IT to help enable, and even change, their business

More information

Network Design Considerations for Grid Computing

Network Design Considerations for Grid Computing Network Design Considerations for Grid Computing Engineering Systems How Bandwidth, Latency, and Packet Size Impact Grid Job Performance by Erik Burrows, Engineering Systems Analyst, Principal, Broadcom

More information

Parallel File Systems for HPC

Parallel File Systems for HPC Introduction to Scuola Internazionale Superiore di Studi Avanzati Trieste November 2008 Advanced School in High Performance and Grid Computing Outline 1 The Need for 2 The File System 3 Cluster & A typical

More information

Microsoft Office SharePoint Server 2007

Microsoft Office SharePoint Server 2007 Microsoft Office SharePoint Server 2007 Enabled by EMC Celerra Unified Storage and Microsoft Hyper-V Reference Architecture Copyright 2010 EMC Corporation. All rights reserved. Published May, 2010 EMC

More information

Abstract /10/$26.00 c 2010 IEEE

Abstract /10/$26.00 c 2010 IEEE Abstract Clustering solutions are frequently used in large enterprise and mission critical applications with high performance and availability requirements. This is achieved by deploying multiple servers

More information

Oasis: An Active Storage Framework for Object Storage Platform

Oasis: An Active Storage Framework for Object Storage Platform Oasis: An Active Storage Framework for Object Storage Platform Yulai Xie 1, Dan Feng 1, Darrell D. E. Long 2, Yan Li 2 1 School of Computer, Huazhong University of Science and Technology Wuhan National

More information

IBM Spectrum NAS. Easy-to-manage software-defined file storage for the enterprise. Overview. Highlights

IBM Spectrum NAS. Easy-to-manage software-defined file storage for the enterprise. Overview. Highlights IBM Spectrum NAS Easy-to-manage software-defined file storage for the enterprise Highlights Reduce capital expenditures with storage software on commodity servers Improve efficiency by consolidating all

More information

EMC XTREMCACHE ACCELERATES VIRTUALIZED ORACLE

EMC XTREMCACHE ACCELERATES VIRTUALIZED ORACLE White Paper EMC XTREMCACHE ACCELERATES VIRTUALIZED ORACLE EMC XtremSF, EMC XtremCache, EMC Symmetrix VMAX and Symmetrix VMAX 10K, XtremSF and XtremCache dramatically improve Oracle performance Symmetrix

More information

LATEST INTEL TECHNOLOGIES POWER NEW PERFORMANCE LEVELS ON VMWARE VSAN

LATEST INTEL TECHNOLOGIES POWER NEW PERFORMANCE LEVELS ON VMWARE VSAN LATEST INTEL TECHNOLOGIES POWER NEW PERFORMANCE LEVELS ON VMWARE VSAN Russ Fellows Enabling you to make the best technology decisions November 2017 EXECUTIVE OVERVIEW* The new Intel Xeon Scalable platform

More information

4th National Conference on Electrical, Electronics and Computer Engineering (NCEECE 2015)

4th National Conference on Electrical, Electronics and Computer Engineering (NCEECE 2015) 4th National Conference on Electrical, Electronics and Computer Engineering (NCEECE 2015) Benchmark Testing for Transwarp Inceptor A big data analysis system based on in-memory computing Mingang Chen1,2,a,

More information

EMC Backup and Recovery for Microsoft SQL Server

EMC Backup and Recovery for Microsoft SQL Server EMC Backup and Recovery for Microsoft SQL Server Enabled by Microsoft SQL Native Backup Reference Copyright 2010 EMC Corporation. All rights reserved. Published February, 2010 EMC believes the information

More information

Executive Brief June 2014

Executive Brief June 2014 (707) 595-3607 Executive Brief June 2014 Comparing IBM Power Systems to Cost/Benefit Case for Transactional Applications Introduction Demand for transaction processing solutions continues to grow. Although

More information

Four-Socket Server Consolidation Using SQL Server 2008

Four-Socket Server Consolidation Using SQL Server 2008 Four-Socket Server Consolidation Using SQL Server 28 A Dell Technical White Paper Authors Raghunatha M Leena Basanthi K Executive Summary Businesses of all sizes often face challenges with legacy hardware

More information

Native vsphere Storage for Remote and Branch Offices

Native vsphere Storage for Remote and Branch Offices SOLUTION OVERVIEW VMware vsan Remote Office Deployment Native vsphere Storage for Remote and Branch Offices VMware vsan is the industry-leading software powering Hyper-Converged Infrastructure (HCI) solutions.

More information

RIGHTNOW A C E

RIGHTNOW A C E RIGHTNOW A C E 2 0 1 4 2014 Aras 1 A C E 2 0 1 4 Scalability Test Projects Understanding the results 2014 Aras Overview Original Use Case Scalability vs Performance Scale to? Scaling the Database Server

More information

Linux Software RAID Level 0 Technique for High Performance Computing by using PCI-Express based SSD

Linux Software RAID Level 0 Technique for High Performance Computing by using PCI-Express based SSD Linux Software RAID Level Technique for High Performance Computing by using PCI-Express based SSD Jae Gi Son, Taegyeong Kim, Kuk Jin Jang, *Hyedong Jung Department of Industrial Convergence, Korea Electronics

More information

Oracle EXAM - 1Z Oracle Exadata Database Machine Administration, Software Release 11.x Exam. Buy Full Product

Oracle EXAM - 1Z Oracle Exadata Database Machine Administration, Software Release 11.x Exam. Buy Full Product Oracle EXAM - 1Z0-027 Oracle Exadata Database Machine Administration, Software Release 11.x Exam Buy Full Product http://www.examskey.com/1z0-027.html Examskey Oracle 1Z0-027 exam demo product is here

More information

SolidFire and Pure Storage Architectural Comparison

SolidFire and Pure Storage Architectural Comparison The All-Flash Array Built for the Next Generation Data Center SolidFire and Pure Storage Architectural Comparison June 2014 This document includes general information about Pure Storage architecture as

More information

Introducing SUSE Enterprise Storage 5

Introducing SUSE Enterprise Storage 5 Introducing SUSE Enterprise Storage 5 1 SUSE Enterprise Storage 5 SUSE Enterprise Storage 5 is the ideal solution for Compliance, Archive, Backup and Large Data. Customers can simplify and scale the storage

More information

HCI: Hyper-Converged Infrastructure

HCI: Hyper-Converged Infrastructure Key Benefits: Innovative IT solution for high performance, simplicity and low cost Complete solution for IT workloads: compute, storage and networking in a single appliance High performance enabled by

More information

Active Storage using OSD. John A. Chandy Department of Electrical and Computer Engineering

Active Storage using OSD. John A. Chandy Department of Electrical and Computer Engineering Active Storage using OSD John A. Chandy Department of Electrical and Computer Engineering Active Disks We already have intelligence at the disk Block management Arm scheduling Can we use that intelligence

More information