Elastic Complex Event Processing
Thomas Heinze
SAP Research Dresden
Dresden, Germany

ABSTRACT

Complex Event Processing (CEP) systems are designed to process large amounts of information by simultaneously evaluating multiple queries over event streams. The two main requirements imposed by the users of CEP systems are: (1) the ability to process high-throughput event data and (2) the ability to answer queries with very low latency. In order to meet these requirements, CEP systems are becoming increasingly distributed. Distributing queries as well as event streams across multiple nodes increases the throughput of CEP systems while maintaining low response times. The widespread adoption of cloud computing and the accompanying pay-as-you-go model has added new dimensions to the problem of complex event processing in a distributed system. Nowadays, it is not only important to be able to scale the processing out to a large number of nodes; it is equally important to be able to scale the processing down as soon as the load or user requirements decrease. The ability to scale processing up and down along with the load and user requirements is called elasticity.

The goal of the thesis described in this paper is to develop a component that allows elastic scaling of distributed CEP systems in response to variations in the load and in the contractual obligations regarding quality of service. To this end, the thesis will address the following three major topics: (1) multi query optimization, (2) operator placement in distributed environments, and (3) cost efficiency. This paper outlines the state of the art for these three topics and presents the overall draft of the solution to the problem of elastic complex event processing.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MDS 2011, December 12th, 2011, Lisbon, Portugal. Copyright 2011 ACM /11/12...$

1. INTRODUCTION

Complex Event Processing (CEP) systems [21] are used in various scenarios, including (but not limited to) stock trading [20], click stream monitoring [11], and information integration workflows [7]. Moreover, Complex Event Processing is becoming a central component of Business Intelligence (BI) systems, as it facilitates solving BI tasks in (near) real time by shortcutting the pipeline between raw data and the dashboard layer [7]. In contrast to classical database systems, CEP systems avoid persisting incoming events and process the data directly in memory. This allows CEP systems to process event streams with several thousands of events per second with sub-millisecond latency [34].

With the increasing amount of data available for analysis [14] and the increasing complexity and availability of Business Intelligence analysis tools, single-machine solutions are quickly becoming overloaded. Based on reports from SAP customers, one can currently expect that CEP systems will have to cope with as many as [...] simultaneous users, with an estimated 5 queries being asked by every single user. The Options Price Reporting Authority (OPRA), which aggregates all quotes and trades from options exchanges, estimated peak rates of 456,000 events/sec (July 2007), with rates roughly doubling every year [33]. Since load shedding is not an applicable strategy when the correctness of results is required, distribution of the complex event processing emerges as the only viable option.
First attempts at creating distributed CEP systems have already been undertaken and have resulted in systems such as RAPIDE [22], Borealis [1], and DCEP [28]. However, in order to run large-scale distributed systems competitively, cloud computing [8] resources have to be used. The key feature provided by cloud computing is the notion of elasticity. Elasticity has been defined [24] as: "[...] capabilities [which] can be rapidly and elastically provisioned, in some cases automatically, to quickly scale out and rapidly released to quickly scale in. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be purchased in any quantity at any time."

Elasticity involves not only transparent scaling up of the system, but also transparent scaling down if the current setup is too big with respect to the current load or the requested quality of service. In contrast, a classical distributed CEP system [22, 1, 28] is designed to always support the peak load. Therefore, it is periodically scaled up in case the input rate increases. In practice, however, the load imposed on a distributed system does not increase linearly, but shows an unpredictable variability (see Figure 1). With static resource provisioning, either not enough resources are available (underprovisioning) and requests are dropped, or too many resources are used (overprovisioning), creating unnecessary costs. In contrast, with elastic resource provisioning the system can alter its resource utilization dynamically, which avoids both overprovisioning and underprovisioning.

Figure 1: Comparison of static and elastic resource provisioning (source: [35])

Figure 2: Example dataflow on StreamMine

Recently a number of prototypes for cloud-based CEP engines have been proposed, including SEEP [25], an extension for System S [27], Flood [4], StreamCloud [12] and the system by Kleiminger et al. [17]. In Table 1 we characterize these systems based on the level at which elasticity is realized.

Table 1: Different levels of elasticity and corresponding actions

Elastic level | Action taken | System
Engine | New engine instance is started and new queries are deployed on this engine | [17]
Query | New query instance is started and the input data is split | [12, 25]
Operator | New operator instance is created and the input data is split | [4, 27]

A CEP engine can be made elastic by parallelizing at the engine, query or operator level. Each of these approaches has benefits and shortcomings. The engine level approach requires the smallest set of changes to the overall system; however, it also introduces the largest overhead for initializing a new instance and offers the least flexibility and the coarsest granularity. The query level approach imposes less overhead; however, it requires the system to be able to optimize for a possibly very large number of queries (see Section 4 for more details on this approach). The query level approach is also less flexible than the operator level approach. The operator level approach helps to scale the system in the most fine-grained fashion. However, it also requires fully elastic operators, which are difficult to implement, especially if a system permits user-defined operators. Moreover, too high a degree of distribution might result in communication overhead becoming a limiting factor with respect to scalability [12].

In order to exploit elasticity, a complex event processing system has to be designed in an elastic way, allowing for operation with a variable number of nodes.
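The difference between static and elastic provisioning discussed above can be made concrete with a small calculation. The following Python sketch is only an illustration: the load trace, the per-node capacity and the function name `provision_report` are hypothetical, and the elastic case assumes an idealized system that can resize instantly in every time step, which a real deployment cannot do.

```python
import math


def provision_report(load, capacity_per_node, static_nodes=None):
    """Return (dropped_events, idle_capacity, node_steps) for a load trace.

    If static_nodes is None, the node count is chosen per time step
    (idealized elastic provisioning); otherwise it is fixed.
    """
    dropped = idle = node_steps = 0
    for events in load:
        if static_nodes is None:
            # Elastic: just enough nodes for the current load (at least one).
            nodes = max(1, math.ceil(events / capacity_per_node))
        else:
            nodes = static_nodes
        capacity = nodes * capacity_per_node
        dropped += max(0, events - capacity)   # underprovisioning
        idle += max(0, capacity - events)      # overprovisioning
        node_steps += nodes                    # proxy for monetary cost
    return dropped, idle, node_steps


# Hypothetical load trace (events per time step) and node capacity.
load = [100, 400, 900, 300, 100]
static_small = provision_report(load, 200, static_nodes=2)
static_peak = provision_report(load, 200, static_nodes=5)
elastic = provision_report(load, 200)

print("static, 2 nodes:", static_small)
print("static, 5 nodes:", static_peak)
print("elastic        :", elastic)
```

In this toy trace the small static setup drops events during the peak, the peak-sized static setup pays for mostly idle nodes, and the elastic setup avoids dropped events at less than half the node usage of the peak-sized one.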
To realize such systems, the following three questions have to be answered [35]:

1. Which components or layers in my application architecture can become elastic?
2. What will it take to make them elastic?
3. What will be the impact of realizing elasticity onto my overall system architecture?

The current focus of most CEP systems is on implementing answers to questions (1) and (2). None of these works has intensively studied the extensions needed to make elasticity transparent to the user. The thesis described in this paper tries to fill this gap by researching question (3). The system needs to decide automatically when to scale up and when to scale down again to reduce the monetary cost. In addition, the system has to detect operators that are becoming bottlenecks and react, e.g., by moving them to another node.

The remainder of this paper is structured as follows: Section 2 introduces the system model and the general architecture of the CEP system which will be used as a foundation for the work carried out within the described thesis. In Section 3 the main components of the proposed system are introduced, including multi query optimization, operator placement, and cost efficiency. Section 7 describes the planned evaluation strategy for the whole system. Section 8 concludes this paper.

2. SYSTEM MODEL

The thesis described in this paper is part of the SRT-15 research project (footnote 1). The SRT-15 research project is funded by the European Commission within the Seventh Framework Programme (FP7) under the Grant Agreement number in the area of Internet of Services, Software & Virtualisation (ICT ). The software foundation for both the SRT-15 research project and the described thesis is the StreamMine [23] system. StreamMine is a framework for processing events using a stage-based approach, where each stage can be programmed to express an independent logic unit and store independent state.
Within a single stage, multiple instances of the stage logic realize the same functionality for different data. To achieve this, StreamMine offers functionality to split the input data using a key-based approach (see Figure 2 for the architectural overview).

Footnote 1: Project homepage:
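The key-based splitting of input data between instances of a stage, as described above, can be illustrated with a short Python sketch. This is not StreamMine's API: the function names, the `symbol` key field and the choice of a CRC32 hash are assumptions made only for illustration.

```python
import zlib
from collections import defaultdict


def instance_for(key, num_instances):
    """Map an event key to a stage instance using a stable hash (assumed: CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_instances


def split_by_key(events, num_instances, key_field="symbol"):
    """Partition events so that all events with the same key reach the same instance."""
    partitions = defaultdict(list)
    for event in events:
        partitions[instance_for(event[key_field], num_instances)].append(event)
    return dict(partitions)


# Hypothetical stock events keyed by ticker symbol.
events = [
    {"symbol": "SAP", "price": 41.0},
    {"symbol": "IBM", "price": 180.5},
    {"symbol": "SAP", "price": 41.2},
]
parts = split_by_key(events, num_instances=4)
for instance, assigned in sorted(parts.items()):
    print(instance, assigned)
```

Because the mapping depends only on the key, each instance can keep per-key state (for example, the last seen price of a symbol) without coordinating with other instances. Changing the number of instances changes the mapping, which is one reason why elastic scaling at the operator level requires state to be moved.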
Figure 3: Thesis prototype

StreamMine can be executed in a distributed environment. For each execution, all involved stages have to be assigned to the underlying physical system nodes. Subsequently, stages have to be explicitly connected using point-to-point connections. This makes it possible to scale the different stages based on the expected event load by: (1) adding new stages to the already existing system or (2) adding additional nodes to already running stages. StreamMine can therefore be used to realize elasticity on an operator level.

Within the thesis described in this paper, we define typical CEP operators (Selection, Join, Aggregation and Sequence) on top of the StreamMine model. We assume that the input for the CEP system running on top of StreamMine consists of a set of infinite event streams and a set of continuous queries. The number of queries can change during runtime, as queries can be added and removed dynamically. We also assume that queries submitted to the CEP system model the data flow as an acyclic graph. Figure 2 illustrates such a data flow consisting of two Selection operators and one Join operator.

3. SYSTEM ARCHITECTURE

The main challenge (and simultaneously advantage) when designing a cloud-based CEP engine is that it must handle a rapidly changing set of events and queries, where both the number of queries and the event rate are very hard to predict. As outlined in Section 1, peak load rates can be as high as 100,000 concurrent queries and 400,000 events per second. However, the normal load is expected to differ significantly from these values, depending mostly on the application type. This in turn requires the system to scale up and down automatically along with the varying load, without any user interaction in the elastic scaling process. In order to allow for such functionality we propose the system architecture presented in Figure 3.
The key architectural concept of the thesis prototype is that it will be decoupled from the underlying elastic CEP system. All communication between the thesis prototype and the underlying CEP system will be executed via generic interfaces. This allows the thesis prototype to be used in combination with any elastic CEP engine, as long as the underlying CEP engine implements the specified interfaces and the required functionality, including distributed execution, dynamic addition of queries, operators and operator instances, as well as dynamic movement of operator instances to other nodes. The prototype consists of three major components:

Multiple query optimization: Given a large number of concurrently running queries, it is likely that results produced by parts of queries or even by whole queries are identical. In order to lower the load on the system, such overlaps should be exploited. A solution can be based on multiple query optimization [29], a technique which allows the creation of global query plans that reuse common results.

Operator placement: Based on a global query plan and a set of available nodes, an assignment of operators or queries onto the physical nodes has to be performed. In order to accommodate the varying load, operator placement has to be aware of the current load levels as well as the current utilization of system nodes. Moreover, considering a deployment in the cloud, the operator placement component must be able to request and release resources as needed in order to cope with the varying workload and with quality of service requirements such as a maximal end-to-end latency or an expected minimal throughput.

Cost efficiency: In most scenarios users provide not only event sources and queries to run against them, but also a monetary budget and quality of service requirements which need to be met. Therefore, both aforementioned components must be able to operate in a cost efficient way.
This means that the query plan and its execution must be tuned so as to minimize the execution cost while simultaneously meeting the specified quality of service requirements. In addition, newly added queries should be rejected by an admission control in case they cannot be executed within the given budget.

In the following sections we describe each of the aforementioned components in more detail. Specifically, we motivate the decisions regarding the design of the components and present related work and the progress beyond the state of the art.

4. MULTIPLE QUERY OPTIMIZATION FOR ELASTIC CEP

The goal of Multiple Query Optimization [29] (MQO) is to reduce the number of concurrently running queries in a CEP and/or database system. An MQO approach involves the identification and reuse of identical results produced by operators within a global query plan. MQO avoids repetitive computation of identical results and is especially well suited for CEP systems, which work with long-running queries [7]. Figure 4 illustrates the multi query optimization approach.

The solution to the MQO problem is a global query plan incorporating multiple queries. The goal of MQO when creating the global query plan is to minimize a cost metric, such as the sum of the costs of all operators. The task is known to be NP-complete [30], because the globally optimal plan cannot be constructed by simply combining optimal per-query plans. Instead, all possible query plans for all queries have to be examined, which leads to an exponential growth of the search space. To avoid this problem most approaches use heuristics [26] to find a near-optimal query plan.

Figure 4: Example for an incremental multi query optimization

The MQO approach used in the context of this thesis should be incremental, allowing queries to be added and removed during runtime. The global query plan should contain only the currently active queries to allow efficient resource usage. Many classical approaches used in database systems [6, 26] are not incremental, which means that each time a new query arrives the computation of the optimal query plan has to be restarted. This can become very expensive, especially when the query arrival frequency is high. In addition, when new queries are added, the already running ones should not be interrupted, i.e., changes to the currently executed global query plan should be avoided. If the position of operators within a running global query plan is changed, this might change the semantics of the current operator state (e.g., the previously seen and still valid events inside a join operator), which results either in a complex reinitialization or in incorrect results.

In the context of data streaming, the usage of multi query optimization was first addressed in [13, 16]. Both approaches show an efficient way to optimize a large number of continuous queries. However, neither approach allows queries to be added or removed during runtime. A system for incremental query optimization was first proposed in [15]; it uses internal index structures to easily identify possible merge points for new queries. Additionally, the authors use query subsumption [3] for filter expressions, which results in a higher degree of result reuse. However, the set of operators supported by that system is very limited and the operators are only executed on the same input stream.
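The incremental behaviour required above (adding and removing queries without rebuilding the global plan or touching operators that are still in use) can be sketched in Python as follows. This is a deliberately simplified illustration and not the algorithm of [15] or of the thesis: queries are assumed to be linear operator chains, operators are shared only when their signature and their entire upstream chain match exactly, and the class name `GlobalPlan` and the signature strings are hypothetical.

```python
class GlobalPlan:
    """Global query plan that shares identical operator prefixes between queries."""

    def __init__(self):
        self.refcount = {}  # plan node (tuple of operator signatures) -> number of users
        self.queries = {}   # query id -> plan nodes used by that query

    def add_query(self, query_id, operators):
        """Add a query; return the plan nodes that had to be newly created."""
        created, nodes = [], []
        for i in range(1, len(operators) + 1):
            node = tuple(operators[:i])  # an operator plus its whole upstream chain
            if node not in self.refcount:
                self.refcount[node] = 0
                created.append(node)
            self.refcount[node] += 1     # existing nodes are reused, not rebuilt
            nodes.append(node)
        self.queries[query_id] = nodes
        return created

    def remove_query(self, query_id):
        """Remove a query; return the plan nodes that are no longer needed."""
        removed = []
        for node in self.queries.pop(query_id):
            self.refcount[node] -= 1
            if self.refcount[node] == 0:
                del self.refcount[node]
                removed.append(node)
        return removed

    def operator_count(self):
        return len(self.refcount)


plan = GlobalPlan()
q1 = ["src:trades", "select:price>100", "agg:avg(price)"]
q2 = ["src:trades", "select:price>100", "agg:max(price)"]
new_q1 = plan.add_query("q1", q1)
new_q2 = plan.add_query("q2", q2)        # reuses the source and the selection
count_shared = plan.operator_count()
removed = plan.remove_query("q1")        # only q1's private aggregation goes away
print(len(new_q1), len(new_q2), count_shared, removed)
```

Adding the second query creates only one new operator, and removing the first query leaves the shared source and selection untouched, so the state of running operators is not disturbed. Exact prefix matching finds far fewer sharing opportunities than subsumption-based or annotation-based matching; it is used here only to keep the sketch short.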
The thesis described in this paper will extend the above approach to accommodate (1) a full set of CEP operators, (2) multiple input streams and (3) query removal from the system. In addition, instead of comparing the semantics of the operators, a stream annotation approach will be used. In our approach every stream is annotated with a set of predicates, derived from the operator semantics, that define the tuples contained within this stream. The goal is to avoid query rewriting and to make use of efficient index structures to allow for fast integration of new queries.

5. OPERATOR PLACEMENT FOR ELASTIC CEP

Given a query plan and a set of available nodes, it has to be decided which operators are to be mapped onto which nodes and how many nodes should execute a given operator. This problem can be solved by operator placement algorithms [19], which try to place a set of operators on a fixed set of distributed nodes. A number of approaches [2, 16, 32, 36] which try to optimize operator placement based on a specified metric have been proposed. The authors of [19] outline different design dimensions for operator placement algorithms. In the following we mention those which are relevant to the thesis described in this paper:

Architecture: Existing approaches can be categorized as either an independent module [16, 32] or as distributed logic [2, 36]. With respect to the architecture, the thesis prototype described in this paper is implemented as a centralized and independent module.

Metric: The operator placement can be optimized based on various metrics, including load, latency, network bandwidth or operator importance. Within the thesis described in this paper different metrics, including monetary cost (see Section 6), will be evaluated.

Adjustment: The operator placement can either be static, with a new placement calculated only when a query arrives or departs, or dynamic, with the placement being adjusted subject to varying load and node utilization.
Within the thesis described in this paper, a dynamic operator placement based on the system load and utilization will be used.

Within the thesis described in this paper two additional problems will be investigated: (1) how to decide how many nodes per operator are needed and (2) how to perform an a priori (instead of an a posteriori) node provisioning. The first problem can be solved with a two-phase approach. First, an initial estimate is made based on the number of available nodes and the complexity of the operator. Subsequently, after deployment has taken place, runtime information is used to further refine the number of nodes used to handle the operator load.

The second problem stems from the fact that the number of available nodes is not fixed. Nodes are purchased on demand from the cloud provider. Ideally, one should avoid situations where nodes become bottlenecks (underprovisioning) as well as overprovisioning of the overall system. The major challenge is to detect such situations early enough to migrate the operator to another machine before the given QoS constraints are violated. This is motivated by the fact that neither the allocation of a new node nor the movement of an operator to a newly allocated node is instantaneous. Moreover, frequent allocation and deallocation of nodes (called thrashing) as a result of a bursty load should be avoided.

The overall idea for the operator placement is to provide a fast and simple initial placement strategy and to perform subsequent adjustments according to the varying load. The motivation is based on the following facts: (1) in CEP systems, unlike in classical databases, it is not possible to predict the load and (2) the initial operator placement is only the starting point for the search for a (sub)optimal solution.
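One common way to avoid the thrashing mentioned above is to combine separate scale-up and scale-down thresholds with a cooldown period after every change. The Python sketch below illustrates this idea only; it is not the placement algorithm of the thesis, and the class name `ScalingController`, the thresholds and the load trace are hypothetical.

```python
class ScalingController:
    """Threshold-based node scaling with a cooldown to damp reactions to bursts."""

    def __init__(self, high=0.8, low=0.3, cooldown=3, min_nodes=1):
        self.high = high            # scale up above this utilization
        self.low = low              # scale down below this utilization
        self.cooldown = cooldown    # ticks to wait after a change
        self.min_nodes = min_nodes
        self.nodes = min_nodes
        self.last_change = -cooldown  # allow an action at the first tick

    def step(self, tick, load, capacity_per_node):
        """Observe the load at a tick and return the node count afterwards."""
        if tick - self.last_change < self.cooldown:
            return self.nodes  # still cooling down, ignore short bursts
        utilization = load / (self.nodes * capacity_per_node)
        if utilization > self.high:
            self.nodes += 1
            self.last_change = tick
        elif utilization < self.low and self.nodes > self.min_nodes:
            self.nodes -= 1
            self.last_change = tick
        return self.nodes


# Hypothetical bursty load (events per tick); each node handles 100 events per tick.
bursty_load = [90, 20, 90, 20, 90, 20]

damped_ctrl = ScalingController(cooldown=3)
damped = [damped_ctrl.step(t, load, 100) for t, load in enumerate(bursty_load)]

naive_ctrl = ScalingController(cooldown=0)
naive = [naive_ctrl.step(t, load, 100) for t, load in enumerate(bursty_load)]

print("with cooldown   :", damped)
print("without cooldown:", naive)
```

Without a cooldown the controller allocates or releases a node at every tick of the bursty trace, while the damped controller changes the node count only twice. The price is slower reaction, which is why the text argues for detecting overload early rather than relying on reactive thresholds alone.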
Figure 5: Cost efficient execution of queries

6. COST EFFICIENT ELASTIC CEP

Most cloud providers use a pay-as-you-go model, which means that a user is charged for every CPU cycle used. The Elastic CEP prototype will use this property to reduce the information processing cost for the user as much as possible. Users of the Elastic CEP prototype will be able to specify a maximum cost for information processing along with QoS constraints, such as a lower bound for the throughput and an upper bound for the latency. The Elastic CEP prototype will try to optimize the data flow so as to stay within these constraints. From the large number of possible global query plans, those which fulfill both the expected QoS constraints and the cost limits will be selected. Figure 5 illustrates a set of possible query plans with their expected throughput and associated cost, along with an expected minimal throughput and a maximum budget. The chosen query plan should meet both criteria, which is true for the query plan in the dark gray area.

A number of authors [9, 18, 31] have studied the problem of cost efficient computation, especially in the context of MapReduce [10] and cloud computing. For batch-based tasks it has been shown [18] how to integrate monetary costs into the schedule computation using an additional cost dimension during optimization. The approach proposed in [9] shows how to combine a pricing model with a provisioning infrastructure and how to incorporate existing SLAs defined between customers and the service provider. The authors also demonstrate how service providers have to pay compensation fees in case of SLA violations. The authors of [31] present a heuristic optimization method which speeds up a given task within a predefined budget by optimizing the number of allocated machines.
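The selection illustrated in Figure 5 amounts to filtering candidate plans by a minimum throughput and a maximum budget and then choosing among the remaining ones. The following Python sketch shows this under stated assumptions: the plan estimates are invented numbers, the names `PlanEstimate` and `select_plan` are hypothetical, and picking the cheapest feasible plan is one possible tie-breaking policy rather than the one fixed by the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanEstimate:
    name: str
    throughput: float  # expected events per second
    cost: float        # expected monetary cost per hour


def select_plan(plans, min_throughput, max_cost):
    """Return the cheapest plan meeting both constraints, or None if none does."""
    feasible = [
        p for p in plans
        if p.throughput >= min_throughput and p.cost <= max_cost
    ]
    if not feasible:
        return None  # an admission control would reject the request here
    # Prefer lower cost; break ties in favour of higher throughput.
    return min(feasible, key=lambda p: (p.cost, -p.throughput))


# Hypothetical estimates for four candidate global query plans.
plans = [
    PlanEstimate("A", 50_000, 2.0),    # cheap but too slow
    PlanEstimate("B", 120_000, 4.5),
    PlanEstimate("C", 150_000, 9.0),   # fast but over budget
    PlanEstimate("D", 110_000, 5.0),
]
chosen = select_plan(plans, min_throughput=100_000, max_cost=6.0)
rejected = select_plan(plans, min_throughput=200_000, max_cost=6.0)
print(chosen, rejected)
```

A latency upper bound could be added as a third field and a third filter condition in the same way. In a streaming setting the estimates themselves change with the event rate, so such a selection would have to be re-evaluated as the load varies.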
However, the complexity of cost efficient planning for applications involving streaming data is much higher, because the concrete rate of the event sources cannot be predicted. If the number of machines has to be increased (due to an increased number of incoming events), the remaining budget has to be taken into consideration. As a result, the system has to decide (using an admission control) whether to reject a new set of queries because it would violate the available budget.

7. EVALUATION

In order to evaluate the Elastic CEP prototype, a benchmark containing both synthetic and real-world data is needed. The goal of the evaluation is to show the strengths and limitations of the Elastic CEP prototype with respect to handling varying load (event rates and number of queries) and the responsiveness of the system. The evaluation will examine the following metrics:

Throughput: The Elastic CEP prototype must demonstrate that it can handle query loads exceeding the capacity of a single node. The Elastic CEP prototype should demonstrate the ability to execute a distributed global query plan.

Query load: This metric will evaluate the multiple query optimization component. Both the performance of an optimized query plan and the performance of the optimizer will be measured.

Responsiveness: The responsiveness of the Elastic CEP prototype describes the time needed to adapt to changing query loads as well as the ability to detect overloaded operators.

Cost efficiency: To test the cost efficiency, the Elastic CEP prototype will be faced with situations in which queries have to be rejected or the available budget is limited. The system should show its ability to meet the budget constraints.

The evaluation will be based on real-world and synthetic test data. Therefore, existing benchmarks will be considered as a starting point for the evaluation. The most common benchmark for CEP systems is Linear Road [5].
The Linear Road benchmark simulates a toll system on a large number of highways. The system uses a fixed set of queries with well-defined constraints regarding correctness and response times. The Linear Road benchmark comes with an event generator which can change the number of events sent per second. The main goal of the Linear Road benchmark is to test the maximum event rate a system can sustain. However, both the event rates and the query load show a linear behavior, which does not allow the elasticity of the overall system to be tested. Elasticity has to be tested by additional test cases. Therefore, one of the tasks of the thesis described in this paper is to develop a benchmark suite which can be used for comparing the performance of elastic CEP systems.

8. CONCLUSION

The thesis described in this paper introduces an extension to existing CEP systems which reflects the high dynamism of a cloud environment. This involves studying the problems of: (1) how to handle a large number of concurrent queries, (2) how to make efficient use of allocated resources and (3) how to realize a cost efficient computation. The Elastic CEP prototype developed within the thesis will accept the following input: (1) a set of queries, (2) a set of event sources and (3) a set of available nodes and their utilization. As output, the component will provide an assignment of operators to nodes for an elastic CEP system, taking into account the runtime statistics. This research is based on existing approaches for distributed complex event processing; progress beyond the state of the art is made in that the dynamism of the cloud environment and of the load is considered. The Elastic CEP prototype will be evaluated using a new benchmark system which allows comparison with other systems with respect to their support for elasticity.
9. REFERENCES

[1] D. Abadi, Y. Ahmad, M. Balazinska, U. Cetintemel, M. Cherniack, J. Hwang, W. Lindner, A. Maskey, A. Rasin, E. Ryvkina, et al. The design of the Borealis stream processing engine. In Second Biennial Conference on Innovative Data Systems Research (CIDR 2005), Asilomar, CA. Citeseer.
[2] Y. Ahmad and U. Çetintemel. Network-aware query processing for stream-based applications. In Proceedings of the Thirtieth International Conference on Very Large Data Bases, Volume 30. VLDB Endowment.
[3] M. Al-Qasem and S. Deen. Query subsumption. Flexible Query Answering Systems, pages 29-42.
[4] D. Alves, P. Bizarro, and P. Marques. Flood: Elastic streaming MapReduce. In DEBS '10.
[5] A. Arasu, M. Cherniack, E. Galvez, D. Maier, A. Maskey, E. Ryvkina, M. Stonebraker, and R. Tibbetts. Linear Road: A stream data management benchmark. In Proceedings of the Thirtieth International Conference on Very Large Data Bases, Volume 30. VLDB Endowment.
[6] S. Chaudhuri. An overview of query optimization in relational systems. In Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 34-43, Seattle, Washington, June. ACM Press.
[7] S. Chaudhuri, U. Dayal, and V. Narasayya. An overview of business intelligence technology. Communications of the ACM, 54(8):88-98, August.
[8] M. Creeger. Cloud computing: An overview. Queue, 7(5):3-4, June.
[9] I. Cunha, J. Almeida, V. Almeida, and M. Santos. Self-adaptive capacity management for multi-tier virtualized environments. In Integrated Network Management, IM th IFIP/IEEE International Symposium on. IEEE.
[10] J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. Commun. ACM, 51(1).
[11] M. Gualtieri and J. R. Rymer. The Forrester Wave: Complex Event Processing (CEP) Platforms, Q Online, August.
[12] V. Gulisano, R. Jimenez-Peris, M. Patino-Martinez, and P. Valduriez. StreamCloud: A large scale data streaming system. In 2010 International Conference on Distributed Computing Systems. IEEE.
[13] M. Hong, M. Riedewald, C. Koch, J. Gehrke, and A. Demers. Rule-based multi-query optimization. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology. ACM.
[14] A. Jacobs. The pathologies of big data. Queue, 7(6):10-19.
[15] C. Jin and J. Carbonell. Predicate indexing for incremental multi-query optimization. Foundations of Intelligent Systems.
[16] E. Kalyvianaki, W. Wiesemann, Q. Vu, D. Kuhn, and P. Pietzuch. SQPR: Stream query planning with reuse. Imperial College London, Tech. Rep.
[17] W. Kleiminger, E. Kalyvianaki, and P. Pietzuch. Balancing load in stream processing with the cloud. In Proceedings of the 6th International Workshop on Self Managing Database Systems (SMDB 2011), Hannover, Germany. IEEE.
[18] H. Kllapi, E. Sitaridi, M. Tsangaris, and Y. Ioannidis. Schedule optimization for data processing flows on the cloud. In Proceedings of the 2011 International Conference on Management of Data. ACM.
[19] G. Lakshmanan, Y. Li, and R. Strom. Placement strategies for internet-scale data stream systems. Internet Computing, IEEE, 12(6):50-60.
[20] N. Leavitt. Complex-event processing poised for growth. Computer, 42(4):17-20.
[21] D. Luckham. The power of events: An introduction to complex event processing in distributed enterprise systems. Addison-Wesley Longman Publishing Co., Inc.
[22] D. Luckham and B. Frasca. Complex event processing in distributed systems. Computer Systems Laboratory Technical Report CSL-TR, Stanford University, Stanford.
[23] A. Martin, T. Knauth, S. Creutz, D. B. de Brum, S. Weigert, A. Brito, and C. Fetzer. Low-overhead fault tolerance for high-throughput data processing systems. In ICDCS '11: Proceedings of the st IEEE International Conference on Distributed Computing Systems, Los Alamitos, CA, USA, June. IEEE Computer Society.
[24] P. Mell and T. Grance. The NIST definition of cloud computing. National Institute of Standards and Technology, 53(6).
[25] M. Migliavacca, D. Eyers, J. Bacon, Y. Papagiannis, B. Shand, and P. Pietzuch. SEEP: Scalable and elastic event processing. In Middleware '10 Posters and Demos Track, page 4. ACM.
[26] P. Roy, S. Seshadri, S. Sudarshan, and S. Bhobe. Efficient and extensible algorithms for multi query optimization. In ACM SIGMOD Record, volume 29. ACM.
[27] S. Schneider, H. Andrade, B. Gedik, A. Biem, and K. Wu. Elastic scaling of data parallel operators in stream processing. In Parallel & Distributed Processing, IPDPS IEEE International Symposium on. IEEE.
[28] N. Schultz-Moller, M. Migliavacca, and P. Pietzuch. Distributed complex event processing with query rewriting. In Proceedings of the Third ACM International Conference on Distributed Event-Based Systems. ACM.
[29] T. Sellis. Multiple-query optimization. ACM Transactions on Database Systems (TODS), 13(1):23-52.
[30] T. Sellis and S. Ghosh. On the multiple-query optimization problem. Knowledge and Data Engineering, IEEE Transactions on, 2(2).
[31] J. Silva, L. Veiga, and P. Ferreira. Heuristic for resources allocation on utility computing infrastructures. In Proceedings of the 6th International Workshop on Middleware for Grid Computing, page 9. ACM.
[32] U. Srivastava, K. Munagala, and J. Widom. Operator placement for in-network stream query processing. In Proceedings of the Twenty-Fourth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. ACM.
[33] StreamBase. Real-time data processing with a stream processing engine. Technical report.
[34] Sybase. Evaluating Sybase CEP performance. Technical report, October.
[35] J. Varia. Architecting for the cloud: Best practices. Technical report, Amazon, Inc.
[36] Y. Xing, S. Zdonik, and J. Hwang. Dynamic load distribution in the Borealis stream processor. In Data Engineering, ICDE Proceedings. 21st International Conference on. IEEE, 2005.