Using Bad Learners to find Good Configurations
|
|
- Allen Ramsey
- 5 years ago
- Views:
Transcription
1 Using Bad Learners to find Good Configurations Vivek Nair Tim Menzies North Carolina State University Norbert Siegmund Sven Apel Bauhaus-Universität Weimar University of Passau
2 Configurable Systems and Variability System
3 Configurable Systems and Variability t_2k_480p24.y4m System
4 Configurable Systems and Variability t_2k_480p24.y4m System trailer_480p24.x264
5 Configurable Systems and Variability Configuration options c1 c2 c3 c4 t_2k_480p24.y4m... System cn trailer_480p24.x264
6 Configurable Systems and Variability Configuration options c1 c2 c3 c4 t_2k_480p24.y4m... System cn trailer_480p24.x264 Non-functional behavior: response time, throughput, etc.
7 Configurable Systems and Variability Configuration options c1 c2 c3 c4 t_2k_480p24.y4m... System cn trailer_480p24.x264 Non-functional behavior: response time, throughput, etc.
8 Performance Optimization is Necessary!
9 Performance Optimization is getting more Complex! Necessary [1] Xu et. al Hey, you have given me too many knobs!: understanding and dealing with over-designed configuration in system software.fse 2015 [2] Van Aken, Dana, et al. "Automatic Database Management System Tuning Through Large-scale Machine Learning." International Conference on Management of Data. ACM, 2017.
10 Performance Optimization is required since Default Configuration is Bad! Necessary Complex [1] Van Aken, Dana, et al. "Automatic Database Management System Tuning Through Large-scale Machine Learning." International Conference on Management of Data. ACM, [2] Herodotou, Herodotos, et al. "Starfish: A Self-tuning System for Big Data Analytics." CIDR
11 Performance Optimization can be Expensive! Evaluation of single instance of their hardware/software co-design problem can take weeks[1] Necessary Complex Default is bad Rolling Sort use-case required 21 days, within a total experimental time of about 2.5 months[2] Test suite generation using Evolutionary Algorithm can take weeks[3] Image recognition workload and speech recognition workload, jobs ran for many hours or days[4] [1] Zuluaga, Marcela, et al. "Active learning for multi-objective optimization." International Conference on Machine Learning [2] Jamshidi, Pooyan, and Giuliano Casale. "An uncertainty-aware approach to optimal configuration of stream processing systems."mascots-2016 [3] Wang, Tiantian, et al. "Searching for better configurations: a rigorous approach to clone evaluation." FSE-2013 [4] Venkataraman, Shivaram, et al. "Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics." NSDI
12 Existing Solutions [Guo 13] [Siegmund 12] [Sarkar 15] [Nair 17] Configuration Configuration F1= True F2= False F3= True Response Time = 2100 ms Request F1= True F2= False F3= True Response Time ~ 2100 ms Request Software System [Siegmund 12] Siegmund, Norbert, et al. "Predicting performance via automated feature-interaction detection." ICSE [Guo 13] Guo, Jienmei, et al. Variability-aware performance prediction: A statistical learning approach. ASE-2013 [Sarkar 15] Sarkar, Atri, et al. "Cost-efficient sampling for performance prediction of configurable systems (t)." ASE-2015 [Nair 17] Nair, Vivek, et al. "Faster discovery of faster system configurations with spectral learning." ASE Journal to appear. Surrogate System
13 Existing Solutions 2012 [Siegmund 12] 2013 [Guo 13] 2015 [Sarkar 15] Configuration Space [Siegmund 12] Siegmund, Norbert, et al. "Predicting performance via automated feature-interaction detection." ICSE [Guo 13] Guo, Jienmei, et al. Variability-aware performance prediction: A statistical learning approach. ASE-2013 [Sarkar 15] Sarkar, Atri, et al. "Cost-efficient sampling for performance prediction of configurable systems (t)." ASE-2015 [Nair 17] Nair, Vivek, et al. "Faster discovery of faster system configurations with spectral learning." ASE Journal to appear [Nair 17]
14 Existing Solutions 2012 [Siegmund 12] 2013 [Guo 13] [Sarkar 15] 1. [Nair 17] Divide the configuration space into training and testing sets; Train 1 Test Configuration Space [Siegmund 12] Siegmund, Norbert, et al. "Predicting performance via automated feature-interaction detection." ICSE [Guo 13] Guo, Jienmei, et al. Variability-aware performance prediction: A statistical learning approach. ASE-2013 [Sarkar 15] Sarkar, Atri, et al. "Cost-efficient sampling for performance prediction of configurable systems (t)." ASE-2015 [Nair 17] Nair, Vivek, et al. "Faster discovery of faster system configurations with spectral learning." ASE Journal to appear.
15 Existing Solutions [Siegmund 12] [Guo 13] [Sarkar 15] [Nair 17] Divide the configuration space into training and testing sets; Measure all the configurations in the testing set; Train 1 Test 2 Configuration Space [Siegmund 12] Siegmund, Norbert, et al. "Predicting performance via automated feature-interaction detection." ICSE [Guo 13] Guo, Jienmei, et al. Variability-aware performance prediction: A statistical learning approach. ASE-2013 [Sarkar 15] Sarkar, Atri, et al. "Cost-efficient sampling for performance prediction of configurable systems (t)." ASE-2015 [Nair 17] Nair, Vivek, et al. "Faster discovery of faster system configurations with spectral learning." ASE Journal to appear.
16 Existing Solutions [Siegmund 12] [Sarkar 15] [Guo 13] Train [Nair 17] Divide the configuration space into training and testing sets; Measure all the configurations in the testing set; Iteratively sampling configuration from training set to build a model and test the model against testing set; Test 2 Configuration Space [Siegmund 12] Siegmund, Norbert, et al. "Predicting performance via automated feature-interaction detection." ICSE [Guo 13] Guo, Jienmei, et al. Variability-aware performance prediction: A statistical learning approach. ASE-2013 [Sarkar 15] Sarkar, Atri, et al. "Cost-efficient sampling for performance prediction of configurable systems (t)." ASE-2015 [Nair 17] Nair, Vivek, et al. "Faster discovery of faster system configurations with spectral learning." ASE Journal to appear.
17 Existing Solutions [Siegmund 12] [Sarkar 15] [Guo 13] Train 1 4 Test [Nair 17] Divide the configuration space into training and testing sets; Measure all the configurations in the testing set; Iteratively sampling configuration from training set to build a model and test the model against testing set; Exit when an accurate model is built (e.g., error = 0.1) 2 Configuration Space [Siegmund 12] Siegmund, Norbert, et al. "Predicting performance via automated feature-interaction detection." ICSE [Guo 13] Guo, Jienmei, et al. Variability-aware performance prediction: A statistical learning approach. ASE-2013 [Sarkar 15] Sarkar, Atri, et al. "Cost-efficient sampling for performance prediction of configurable systems (t)." ASE-2015 [Nair 17] Nair, Vivek, et al. "Faster discovery of faster system configurations with spectral learning." ASE Journal to appear.
18 Limitation of Existing Solutions
19 Limitation of Existing Solutions
20 Limitation of Existing Solutions
21 Limitation of Existing Solutions
22 Limitation of Existing Solutions
23 Limitation of Existing Solutions
24 Core Insight
25 Rank Preserving Model
26 Rank Preserving Model
27 Rank Preserving Model
28 Rank Preserving Model
29 Rank Preserving Model
30 Rank Preserving Model
31 Rank Preserving Model
32 Rank Preserving Model Train 1 Test 2 Configuration Space Divide the configuration space into training and testing sets; Measure all the configurations in the testing set; Iteratively sampling configuration from training set to build a model and test the model against testing set;
33 Rank Preserving Model Train 1 4 Test 2 Configuration Space 4. Divide the configuration space into training and testing sets; Measure all the configurations in the testing set; Iteratively sampling configuration from training set to build a model and test the model against testing set; Calculate accuracy model should get progressively more accurate
34 Rank Preserving Model Train 1 4 Test 2 Configuration Space Divide the configuration space into training and testing sets; Measure all the configurations in the testing set; Iteratively sampling configuration from training set to build a model and test the model against testing set; Calculate accuracy model should get progressively more accurate Exit when a model built does not improve (accuracy plateau)
35 Evaluation
36 Baselines Progressive Sampling[1] Sequentially (randomly) sample configuration to build a decision tree till threshold accuracy is reached Projective Sampling[2] Using minimal set of initial sample configurations to project the sampling cost based on a threshold accuracy [1] Guo, Jianmei, et al. Variability-aware performance prediction: A statistical learning approach. ASE-2013 [2] Sarkar, Atri, et al. "Cost-efficient sampling for performance prediction of configurable systems (t)." ASE-2015
37 Research Questions RQ1 - Can inaccurate models accurately rank configurations? RQ2 - How expensive is a rank-based method?
38 Subject Software Systems Video Encoder Databases Utility Compression Numerical Grid benchmark Web server Data processing
39 Subject Software Systems Pooyan Jamshidi Sven Apel Norbert Siegmund Combined effort = 6 computational months
40 Experimental Settings Data (100)
41 Experimental Settings Data (100) Training Pool (40) Prune Pool (20) Testing Pool (40)
42 Experimental Settings Data (100) Training Pool (40) Prune Pool (20) Testing Pool (40)
43 Results
44 RQ1: Can inaccurate models accurately rank configurations?
45 RQ1: Can inaccurate models accurately rank configurations?
46 RQ1: Can inaccurate models accurately rank configurations?
47 RQ1: Can inaccurate models accurately rank configurations? The lower the better
48 RQ1: Can inaccurate models accurately rank configurations? The lower the better
49 RQ1: Can inaccurate models accurately rank configurations? 6 predictedoptima l 1 actualoptimal The lower the better
50 RQ1: Can inaccurate models accurately rank configurations? RD = 1-6 =5 6 predictedoptimal 1 actualoptimal The lower the better
51 RQ1: Can inaccurate models accurately rank configurations?
52 RQ1: Can inaccurate models accurately rank configurations?
53 RQ1: Can inaccurate models accurately rank configurations? 8/1535
54 RQ1: Can inaccurate models accurately rank configurations?
55 RQ2: How expensive is a rank-based approach?
56 RQ2: How expensive is a rank-based approach? The lower the better
57 RQ2: How expensive is a rank-based approach?
58 RQ2: How expensive is a rank-based approach?
59 Conclusion
60
arxiv: v2 [cs.se] 28 Jun 2017
arxiv:1702.05701v2 [cs.se] 28 Jun 2017 ABSTRACT Vivek Nair North Carolina State University Raleigh, North Carolina, USA Norbert Siegmund Bauhaus-University Weimar, Germany Finding the optimally performing
More informationWhy CART Works for Variability-Aware Performance Prediction? An Empirical Study on Performance Distributions
GSDLAB TECHNICAL REPORT Why CART Works for Variability-Aware Performance Prediction? An Empirical Study on Performance Distributions Jianmei Guo, Krzysztof Czarnecki, Sven Apel, Norbert Siegmund, Andrzej
More informationChallenges and Insights from Optimizing Configurable Software Systems
Challenges and Insights from Optimizing Configurable Software Systems Norbert Siegmund 2/6/2019 A Joint-Research Endeavor Christian Kästner Pooyan Jamshidi Miguel Velez Sven Apel Alexander Grebhahn Christian
More informationScaling Exact Multi Objective Combinatorial Optimization by Parallelization
Scaling Exact Multi Objective Combinatorial Optimization by Parallelization Jianmei Guo, Edward Zulkoski, Rafael Olaechea, Derek Rayside, Krzysztof Czarnecki, Sven Apel, Joanne M. Atlee Multi Objective
More informationMeta-learning Performance Prediction of Highly Configurable Systems: A Cost-oriented Approach
Meta-learning Performance Prediction of Highly Configurable Systems: A Cost-oriented Approach by Atri Sarkar A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for
More informationFocus: Querying Large Video Datasets with Low Latency and Low Cost
Focus: Querying Large Video Datasets with Low Latency and Low Cost Kevin Hsieh Ganesh Ananthanarayanan, Peter Bodik, Shivaram Venkataraman, Paramvir Bahl, Matthai Philipose, Phillip B. Gibbons, Onur Mutlu
More informationResearch Works to Cope with Big Data Volume and Variety. Jiaheng Lu University of Helsinki, Finland
Research Works to Cope with Big Data Volume and Variety Jiaheng Lu University of Helsinki, Finland Big Data: 4Vs Photo downloaded from: https://blog.infodiagram.com/2014/04/visualizing-big-data-concepts-strong.html
More informationAutomatic Speech Recognition (ASR)
Automatic Speech Recognition (ASR) February 2018 Reza Yazdani Aminabadi Universitat Politecnica de Catalunya (UPC) State-of-the-art State-of-the-art ASR system: DNN+HMM Speech (words) Sound Signal Graph
More informationOtterTune. Automatic Database Management System Tuning Through Large-scale Machine Learning
OtterTune Automatic Database Management System Tuning Through Large-scale Machine Learning Dana Van Aken, Andrew Pavlo, Geoffrey J. Gordon, Bohan Zhang [image source] 2 DBMS Tuning Tuning a DBMS s configuration
More informationBohr: Similarity Aware Geo-distributed Data Analytics. Hangyu Li, Hong Xu, Sarana Nutanong City University of Hong Kong
Bohr: Similarity Aware Geo-distributed Data Analytics Hangyu Li, Hong Xu, Sarana Nutanong City University of Hong Kong 1 Big Data Analytics Analysis Generate 2 Data are geo-distributed Frankfurt US Oregon
More informationSAP HANA Scalability. SAP HANA Development Team
SAP HANA Scalability Design for scalability is a core SAP HANA principle. This paper explores the principles of SAP HANA s scalability, and its support for the increasing demands of data-intensive workloads.
More informationAgenda. Cassandra Internal Read/Write path.
Rafiki: A Middleware for Parameter Tuning of NoSQL Data-stores for Dynamic Metagenomics Workloads Ashraf Mahgoub, Paul Wood, Sachandhan Ganesh Purdue University Subrata Mitra Adobe Research Wolfgang Gerlach,
More informationSelf Programming Networks
Self Programming Networks Is it possible for to Learn the control planes of networks and applications? Operators specify what they want, and the system learns how to deliver CAN WE LEARN THE CONTROL PLANE
More informationDB2 is a complex system, with a major impact upon your processing environment. There are substantial performance and instrumentation changes in
DB2 is a complex system, with a major impact upon your processing environment. There are substantial performance and instrumentation changes in versions 8 and 9. that must be used to measure, evaluate,
More informationData Driven Networks
Data Driven Networks Is it possible for to Learn the control planes of networks and applications? Operators specify what they want, and the system learns how to deliver CAN WE LEARN THE CONTROL PLANE OF
More informationAggregation and Degradation in JetStream: Streaming Analytics in the Wide Area
Aggregation and Degradation in JetStream: Streaming Analytics in the Wide Area Ariel Rabkin Princeton University asrabkin@cs.princeton.edu Work done with Matvey Arye, Siddhartha Sen, Vivek S. Pai, and
More informationAutomatic Identification of Application I/O Signatures from Noisy Server-Side Traces. Yang Liu Raghul Gunasekaran Xiaosong Ma Sudharshan S.
Automatic Identification of Application I/O Signatures from Noisy Server-Side Traces Yang Liu Raghul Gunasekaran Xiaosong Ma Sudharshan S. Vazhkudai Instance of Large-Scale HPC Systems ORNL s TITAN (World
More informationLightweight Streaming-based Runtime for Cloud Computing. Shrideep Pallickara. Community Grids Lab, Indiana University
Lightweight Streaming-based Runtime for Cloud Computing granules Shrideep Pallickara Community Grids Lab, Indiana University A unique confluence of factors have driven the need for cloud computing DEMAND
More informationDRIZZLE: FAST AND Adaptable STREAM PROCESSING AT SCALE
DRIZZLE: FAST AND Adaptable STREAM PROCESSING AT SCALE Shivaram Venkataraman, Aurojit Panda, Kay Ousterhout, Michael Armbrust, Ali Ghodsi, Michael Franklin, Benjamin Recht, Ion Stoica STREAMING WORKLOADS
More informationUsing Alluxio to Improve the Performance and Consistency of HDFS Clusters
ARTICLE Using Alluxio to Improve the Performance and Consistency of HDFS Clusters Calvin Jia Software Engineer at Alluxio Learn how Alluxio is used in clusters with co-located compute and storage to improve
More informationExplore Co-clustering on Job Applications. Qingyun Wan SUNet ID:qywan
Explore Co-clustering on Job Applications Qingyun Wan SUNet ID:qywan 1 Introduction In the job marketplace, the supply side represents the job postings posted by job posters and the demand side presents
More informationTen things hyperconvergence can do for you
Ten things hyperconvergence can do for you Francis O Haire Director, Technology & Strategy DataSolutions Evolution of Enterprise Infrastructure 1990s Today Virtualization Server Server Server Server Scale-Out
More informationThe Memetic Algorithm for The Minimum Spanning Tree Problem with Degree and Delay Constraints
The Memetic Algorithm for The Minimum Spanning Tree Problem with Degree and Delay Constraints Minying Sun*,Hua Wang* *Department of Computer Science and Technology, Shandong University, China Abstract
More informationIntelligent QoS Grid for Virtualized Workloads Gaurav Gupta Tata Consultancy Services
Intelligent Grid for ized Workloads Gaurav Gupta Tata Consultancy Services Characteristics of Data Analytics BI Image Processing Multi Media Static Content OLTP BigData NoSQL ECM Cloud IOT ERP Web 2.0
More informationTable of Contents 1 Introduction A Declarative Approach to Entity Resolution... 17
Table of Contents 1 Introduction...1 1.1 Common Problem...1 1.2 Data Integration and Data Management...3 1.2.1 Information Quality Overview...3 1.2.2 Customer Data Integration...4 1.2.3 Data Management...8
More informationA REVIEW PAPER ON BIG DATA ANALYTICS
A REVIEW PAPER ON BIG DATA ANALYTICS Kirti Bhatia 1, Lalit 2 1 HOD, Department of Computer Science, SKITM Bahadurgarh Haryana, India bhatia.kirti.it@gmail.com 2 M Tech 4th sem SKITM Bahadurgarh, Haryana,
More informationOnline Optimization of VM Deployment in IaaS Cloud
Online Optimization of VM Deployment in IaaS Cloud Pei Fan, Zhenbang Chen, Ji Wang School of Computer Science National University of Defense Technology Changsha, 4173, P.R.China {peifan,zbchen}@nudt.edu.cn,
More informationPREDICTIVE DATACENTER ANALYTICS WITH STRYMON
PREDICTIVE DATACENTER ANALYTICS WITH STRYMON Vasia Kalavri kalavriv@inf.ethz.ch QCon San Francisco 14 November 2017 Support: ABOUT ME Postdoc at ETH Zürich Systems Group: https://www.systems.ethz.ch/ PMC
More informationData Driven Networks. Sachin Katti
Data Driven Networks Sachin Katti Is it possible for to Learn the control planes of networks and applications? Operators specify what they want, and the system learns how to deliver CAN WE LEARN THE CONTROL
More informationCLUSTERING BIG DATA USING NORMALIZATION BASED k-means ALGORITHM
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IMPACT FACTOR: 5.258 IJCSMC,
More informationImplementation of Data Mining for Vehicle Theft Detection using Android Application
Implementation of Data Mining for Vehicle Theft Detection using Android Application Sandesh Sharma 1, Praneetrao Maddili 2, Prajakta Bankar 3, Rahul Kamble 4 and L. A. Deshpande 5 1 Student, Department
More informationIntelligent QoS Grid for Virtualized Workloads
Intelligent Grid for ized Workloads Gaurav Gupta Delivery Head, HiTech Industry Solution Unit Tata Consultancy Services 27 May 2016 SDC India 2016 1 Copyright 2016 Tata Consultancy Services Limited Parallel
More informationBe a VDI hero with Nutanix
Be a VDI hero with Nutanix Francis O Haire Director, Technology & Strategy DataSolutions Evolution of Enterprise Infrastructure 1990s Today App App App App Virtualization Server Server Server Server Scale-Out
More informationCPU-GPU hybrid computing for feature extraction from video stream
LETTER IEICE Electronics Express, Vol.11, No.22, 1 8 CPU-GPU hybrid computing for feature extraction from video stream Sungju Lee 1, Heegon Kim 1, Daihee Park 1, Yongwha Chung 1a), and Taikyeong Jeong
More informationIO Priority: To The Device and Beyond
IO Priority: The Device and Beyond Adam Manzanares Filip Blagojevic Cyril Guyot 2017 Western Digital Corporation or its affiliates. All rights reserved. 7/24/17 1 The Relationship of Throughput and Latency
More informationREAL-TIME AND PARALLEL QUALITY CONTROL PROCESSING ON INDUSTRIAL PRODUCTION LINES
POLLACK PERIODICA An International Journal for Engineering and Information Sciences DOI: 10.1556/Pollack.3.2008.3.9 Vol. 3, No. 3, pp. 105 111 (2008) www.akademiai.com REAL-TIME AND PARALLEL QUALITY CONTROL
More informationLITERATURE SURVEY (BIG DATA ANALYTICS)!
LITERATURE SURVEY (BIG DATA ANALYTICS) Applications frequently require more resources than are available on an inexpensive machine. Many organizations find themselves with business processes that no longer
More informationA What-if Engine for Cost-based MapReduce Optimization
A What-if Engine for Cost-based MapReduce Optimization Herodotos Herodotou Microsoft Research Shivnath Babu Duke University Abstract The Starfish project at Duke University aims to provide MapReduce users
More informationBottleneck Identification and Scheduling in Multithreaded Applications. José A. Joao M. Aater Suleman Onur Mutlu Yale N. Patt
Bottleneck Identification and Scheduling in Multithreaded Applications José A. Joao M. Aater Suleman Onur Mutlu Yale N. Patt Executive Summary Problem: Performance and scalability of multithreaded applications
More informationMachine Learning 13. week
Machine Learning 13. week Deep Learning Convolutional Neural Network Recurrent Neural Network 1 Why Deep Learning is so Popular? 1. Increase in the amount of data Thanks to the Internet, huge amount of
More informationResolve: Generation of High Performance Sorting Architectures from High Level Synthesis
Resolve: Generation of High Performance Sorting Architectures from High Level Synthesis Janarbek Matai, Dustin Richmond, Dajung Lee, Zac Blair, Qiongzhi Wu, Amin Abazari, Ryan Kastner Department of Computer
More informationManuel F. Dolz, Juan C. Fernández, Rafael Mayo, Enrique S. Quintana-Ortí. High Performance Computing & Architectures (HPCA)
EnergySaving Cluster Roll: Power Saving System for Clusters Manuel F. Dolz, Juan C. Fernández, Rafael Mayo, Enrique S. Quintana-Ortí High Performance Computing & Architectures (HPCA) University Jaume I
More informationFGDEFRAG: A Fine-Grained Defragmentation Approach to Improve Restore Performance
FGDEFRAG: A Fine-Grained Defragmentation Approach to Improve Restore Performance Yujuan Tan, Jian Wen, Zhichao Yan, Hong Jiang, Witawas Srisa-an, Baiping Wang, Hao Luo Outline Background and Motivation
More informationLecture: Benchmarks, Pipelining Intro. Topics: Performance equations wrap-up, Intro to pipelining
Lecture: Benchmarks, Pipelining Intro Topics: Performance equations wrap-up, Intro to pipelining 1 Measuring Performance Two primary metrics: wall clock time (response time for a program) and throughput
More informationADAPTIVE HANDLING OF 3V S OF BIG DATA TO IMPROVE EFFICIENCY USING HETEROGENEOUS CLUSTERS
INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 ADAPTIVE HANDLING OF 3V S OF BIG DATA TO IMPROVE EFFICIENCY USING HETEROGENEOUS CLUSTERS Radhakrishnan R 1, Karthik
More informationApplying Supervised Learning
Applying Supervised Learning When to Consider Supervised Learning A supervised learning algorithm takes a known set of input data (the training set) and known responses to the data (output), and trains
More informationGraph Partitioning for Scalable Distributed Graph Computations
Graph Partitioning for Scalable Distributed Graph Computations Aydın Buluç ABuluc@lbl.gov Kamesh Madduri madduri@cse.psu.edu 10 th DIMACS Implementation Challenge, Graph Partitioning and Graph Clustering
More informationFacilitating Magnetic Recording Technology Scaling for Data Center Hard Disk Drives through Filesystem-level Transparent Local Erasure Coding
Facilitating Magnetic Recording Technology Scaling for Data Center Hard Disk Drives through Filesystem-level Transparent Local Erasure Coding Yin Li, Hao Wang, Xuebin Zhang, Ning Zheng, Shafa Dahandeh,
More informationQoS-aware resource allocation and load-balancing in enterprise Grids using online simulation
QoS-aware resource allocation and load-balancing in enterprise Grids using online simulation * Universität Karlsruhe (TH) Technical University of Catalonia (UPC) Barcelona Supercomputing Center (BSC) Samuel
More informationHow to ask the right questions about backup reporting products Lindsay Morris Product Architect
When the SRM Honeymoon s over How to ask the right questions about backup reporting products Lindsay Morris Product Architect The software honeymoon cycle 1. We need some help! Let s see what s out there.
More informationHiTune. Dataflow-Based Performance Analysis for Big Data Cloud
HiTune Dataflow-Based Performance Analysis for Big Data Cloud Jinquan (Jason) Dai, Jie Huang, Shengsheng Huang, Bo Huang, Yan Liu Intel Asia-Pacific Research and Development Ltd Shanghai, China, 200241
More informationWhite paper ETERNUS Extreme Cache Performance and Use
White paper ETERNUS Extreme Cache Performance and Use The Extreme Cache feature provides the ETERNUS DX500 S3 and DX600 S3 Storage Arrays with an effective flash based performance accelerator for regions
More informationEvolving To The Big Data Warehouse
Evolving To The Big Data Warehouse Kevin Lancaster 1 Copyright Director, 2012, Oracle and/or its Engineered affiliates. All rights Insert Systems, Information Protection Policy Oracle Classification from
More informationOracle Database 11g: Real Application Testing & Manageability Overview
Oracle Database 11g: Real Application Testing & Manageability Overview Top 3 DBA Activities Performance Management Challenge: Sustain Optimal Performance Change Management Challenge: Preserve Order amid
More informationAnalytics Research Internship at Hewlett Packard Labs
Analytics Research Internship at Hewlett Packard Labs Stefanie Deo Mentor: Mehran Kafai September 12, 2016 First, another opportunity that came my way but didn t pan out: Data Science Internship at Intuit!
More informationFeedback Arc Set in Bipartite Tournaments is NP-Complete
Feedback Arc Set in Bipartite Tournaments is NP-Complete Jiong Guo 1 Falk Hüffner 1 Hannes Moser 2 Institut für Informatik, Friedrich-Schiller-Universität Jena, Ernst-Abbe-Platz 2, D-07743 Jena, Germany
More informationQoS Management of Web Services
QoS Management of Web Services Zibin Zheng (Ben) Supervisor: Prof. Michael R. Lyu Department of Computer Science & Engineering The Chinese University of Hong Kong Dec. 10, 2010 Outline Introduction Web
More informationBuilding High Performance Apps using NoSQL. Swami Sivasubramanian General Manager, AWS NoSQL
Building High Performance Apps using NoSQL Swami Sivasubramanian General Manager, AWS NoSQL Building high performance apps There is a lot to building high performance apps Scalability Performance at high
More informationGrid Scheduler. Grid Information Service. Local Resource Manager L l Resource Manager. Single CPU (Time Shared Allocation) (Space Shared Allocation)
Scheduling on the Grid 1 2 Grid Scheduling Architecture User Application Grid Scheduler Grid Information Service Local Resource Manager Local Resource Manager Local L l Resource Manager 2100 2100 2100
More informationConcurrent execution of an analytical workload on a POWER8 server with K40 GPUs A Technology Demonstration
Concurrent execution of an analytical workload on a POWER8 server with K40 GPUs A Technology Demonstration Sina Meraji sinamera@ca.ibm.com Berni Schiefer schiefer@ca.ibm.com Tuesday March 17th at 12:00
More informationAutomated Database Workload Characterization, Mapping, and Tuning through Machine Learning. Abstract
Automated Database Workload Characterization, Mapping, and Tuning through Machine Learning Madeline MacDonald University of Utah UUCS-18-007 School of Computing University of Utah Salt Lake City, UT 84112
More informationSoftware Development of Automatic Data Collector for Bus Route Planning System
International Journal of Electrical and Computer Engineering (IJECE) Vol. 5, No. 1, February 2015, pp. 150~157 ISSN: 2088-8708 150 Software Development of Automatic Data Collector for Bus Route Planning
More informationInteractive Branched Video Streaming and Cloud Assisted Content Delivery
Interactive Branched Video Streaming and Cloud Assisted Content Delivery Niklas Carlsson Linköping University, Sweden @ Sigmetrics TPC workshop, Feb. 2016 The work here was in collaboration... Including
More informationReview on Managing RDF Graph Using MapReduce
Review on Managing RDF Graph Using MapReduce 1 Hetal K. Makavana, 2 Prof. Ashutosh A. Abhangi 1 M.E. Computer Engineering, 2 Assistant Professor Noble Group of Institutions Junagadh, India Abstract solution
More informationResearch on Applications of Data Mining in Electronic Commerce. Xiuping YANG 1, a
International Conference on Education Technology, Management and Humanities Science (ETMHS 2015) Research on Applications of Data Mining in Electronic Commerce Xiuping YANG 1, a 1 Computer Science Department,
More informationCustomizing Progressive JPEG for Efficient Image Storage
Customizing Progressive JPEG for Efficient Image Storage Eddie Yan Kaiyuan Zhang Xi Wang Karin Strauss Luis Ceze HotStorage 17 July 11, 2017 2 2 2 2 2 2 2 2 2 Summary Today s image hosts need to store
More informationProfile-Guided Program Simplification for Effective Testing and Analysis
Profile-Guided Program Simplification for Effective Testing and Analysis Lingxiao Jiang Zhendong Su Program Execution Profiles A profile is a set of information about an execution, either succeeded or
More informationVarexJ: A Variability-Aware Java Interpreter
VarexJ: A Variability-Aware Java Interpreter Testing Configurable Systems Jens Meinicke, Chu-Pan Wong, Christian Kästner FOSD Meeting 2015 Feature Interaction Jens Meinicke VarexJ - Testing Configurable
More informationImproving Face Recognition by Exploring Local Features with Visual Attention
Improving Face Recognition by Exploring Local Features with Visual Attention Yichun Shi and Anil K. Jain Michigan State University Difficulties of Face Recognition Large variations in unconstrained face
More informationScheduling of Independent Tasks in Cloud Computing Using Modified Genetic Algorithm (FUZZY LOGIC)
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 9, September 2015,
More informationTim Ebringer The university of Melbourne, Australia Li Sun RMIT university, Australia Serdar Boztas RMIT university, Australia VB 2008 Ottawa,
A FAST RANDOMNESS TEST THAT PRESERVES LOCAL DETAIL Tim Ebringer The university of Melbourne, Australia Li Sun RMIT university, Australia Serdar Boztas RMIT university, Australia VB 2008 Ottawa, Canada,
More informationTesting Properties of Dataflow Program Operators
Testing Properties of Dataflow Program Operators Zhihong Xu Martin Hirzel Gregg Rothermel Kun-Lung Wu U. of Nebraska IBM Research U. of Nebraska IBM Research Presentation at ASE, November 2013 Big Data
More informationPatterns that Matter
Patterns that Matter Describing Structure in Data Matthijs van Leeuwen Leiden Institute of Advanced Computer Science 17 November 2015 Big Data: A Game Changer in the retail sector Predicting trends Forecasting
More informationTechnical Brief: Domain Risk Score Proactively uncover threats using DNS and data science
Technical Brief: Domain Risk Score Proactively uncover threats using DNS and data science 310 Million + Current Domain Names 11 Billion+ Historical Domain Profiles 5 Million+ New Domain Profiles Daily
More informationEvolution of Big Data Facebook. Architecture Summit, Shenzhen, August 2012 Ashish Thusoo
Evolution of Big Data Architectures@ Facebook Architecture Summit, Shenzhen, August 2012 Ashish Thusoo About Me Currently Co-founder/CEO of Qubole Ran the Data Infrastructure Team at Facebook till 2011
More informationHIGH-PERFORMANCE STORAGE FOR DISCOVERY THAT SOARS
HIGH-PERFORMANCE STORAGE FOR DISCOVERY THAT SOARS OVERVIEW When storage demands and budget constraints collide, discovery suffers. And it s a growing problem. Driven by ever-increasing performance and
More informationContents. Part I Setting the Scene
Contents Part I Setting the Scene 1 Introduction... 3 1.1 About Mobility Data... 3 1.1.1 Global Positioning System (GPS)... 5 1.1.2 Format of GPS Data... 6 1.1.3 Examples of Trajectory Datasets... 8 1.2
More informationFlash Storage Complementing a Data Lake for Real-Time Insight
Flash Storage Complementing a Data Lake for Real-Time Insight Dr. Sanhita Sarkar Global Director, Analytics Software Development August 7, 2018 Agenda 1 2 3 4 5 Delivering insight along the entire spectrum
More informationA Comprehensive Analytical Performance Model of DRAM Caches
A Comprehensive Analytical Performance Model of DRAM Caches Authors: Nagendra Gulur *, Mahesh Mehendale *, and R Govindarajan + Presented by: Sreepathi Pai * Texas Instruments, + Indian Institute of Science
More informationSearch Engines Chapter 8 Evaluating Search Engines Felix Naumann
Search Engines Chapter 8 Evaluating Search Engines 9.7.2009 Felix Naumann Evaluation 2 Evaluation is key to building effective and efficient search engines. Drives advancement of search engines When intuition
More informationFSE Clone- Based and Interactive Recommendation for Modifying Pasted Code
FSE 2015 Clone- Based and Interactive Recommendation for Modifying Pasted Code Yun Lin, Xin Peng, Zhenchang Xing, Diwen Zheng, and Wenyun Zhao Software School, Fudan University pengxin@fudan.edu.cn http://www.se.fudan.edu.cn/pengxin
More informationAUTOMATIC CLUSTERING PRASANNA RAJAPERUMAL I MARCH Snowflake Computing Inc. All Rights Reserved
AUTOMATIC CLUSTERING PRASANNA RAJAPERUMAL I MARCH 2019 SNOWFLAKE Our vision Allow our customers to access all their data in one place so they can make actionable decisions anytime, anywhere, with any number
More informationPROFILING BASED REDUCE MEMORY PROVISIONING FOR IMPROVING THE PERFORMANCE IN HADOOP
ISSN: 0976-2876 (Print) ISSN: 2250-0138 (Online) PROFILING BASED REDUCE MEMORY PROVISIONING FOR IMPROVING THE PERFORMANCE IN HADOOP T. S. NISHA a1 AND K. SATYANARAYAN REDDY b a Department of CSE, Cambridge
More informationITIL Capacity Management Deep Dive
ITIL Capacity Management Deep Dive Chris Molloy IBM Distinguished Engineer International Business Machines Agenda IBM Global Services ITIL Business Model ITIL Architecture ITIL Capacity Management Introduction
More informationExperimental Study of Virtual Machine Migration in Support of Reservation of Cluster Resources
Experimental Study of Virtual Machine Migration in Support of Reservation of Cluster Resources Ming Zhao, Renato J. Figueiredo Advanced Computing and Information Systems (ACIS) Electrical and Computer
More informationDesign of a Dynamic Data-Driven System for Multispectral Video Processing
Design of a Dynamic Data-Driven System for Multispectral Video Processing Shuvra S. Bhattacharyya University of Maryland at College Park ssb@umd.edu With contributions from H. Li, K. Sudusinghe, Y. Liu,
More informationA Comparative study of Clustering Algorithms using MapReduce in Hadoop
A Comparative study of Clustering Algorithms using MapReduce in Hadoop Dweepna Garg 1, Khushboo Trivedi 2, B.B.Panchal 3 1 Department of Computer Science and Engineering, Parul Institute of Engineering
More informationModification and Evaluation of Linux I/O Schedulers
Modification and Evaluation of Linux I/O Schedulers 1 Asad Naweed, Joe Di Natale, and Sarah J Andrabi University of North Carolina at Chapel Hill Abstract In this paper we present three different Linux
More informationSecurity analytics: From data to action Visual and analytical approaches to detecting modern adversaries
Security analytics: From data to action Visual and analytical approaches to detecting modern adversaries Chris Calvert, CISSP, CISM Director of Solutions Innovation Copyright 2013 Hewlett-Packard Development
More informationA METRIC BASED EVALUATION OF TEST CASE PRIORITATION TECHNIQUES- HILL CLIMBING, REACTIVE GRASP AND TABUSEARCH
A METRIC BASED EVALUATION OF TEST CASE PRIORITATION TECHNIQUES- HILL CLIMBING, REACTIVE GRASP AND TABUSEARCH 1 M.Manjunath, 2 N.Backiavathi 1 PG Scholar, Department of Information Technology,Jayam College
More informationMachine Learning for Software Engineering
Machine Learning for Software Engineering Single-State Meta-Heuristics Prof. Dr.-Ing. Norbert Siegmund Intelligent Software Systems 1 2 Recap: Goal is to Find the Optimum Challenges of general optimization
More informationA PERSONALIZED RECOMMENDER SYSTEM FOR TELECOM PRODUCTS AND SERVICES
A PERSONALIZED RECOMMENDER SYSTEM FOR TELECOM PRODUCTS AND SERVICES Zui Zhang, Kun Liu, William Wang, Tai Zhang and Jie Lu Decision Systems & e-service Intelligence Lab, Centre for Quantum Computation
More informationOracle Big Data Connectors
Oracle Big Data Connectors Oracle Big Data Connectors is a software suite that integrates processing in Apache Hadoop distributions with operations in Oracle Database. It enables the use of Hadoop to process
More informationFujitsu/Fujitsu Labs Technologies for Big Data in Cloud and Business Opportunities
Fujitsu/Fujitsu Labs Technologies for Big Data in Cloud and Business Opportunities Satoshi Tsuchiya Cloud Computing Research Center Fujitsu Laboratories Ltd. January, 2012 Overview: Fujitsu s Cloud and
More informationNetezza The Analytics Appliance
Software 2011 Netezza The Analytics Appliance Michael Eden Information Management Brand Executive Central & Eastern Europe Vilnius 18 October 2011 Information Management 2011IBM Corporation Thought for
More informationCombine Native SQL Flexibility with SAP HANA Platform Performance and Tools
SAP Technical Brief Data Warehousing SAP HANA Data Warehousing Combine Native SQL Flexibility with SAP HANA Platform Performance and Tools A data warehouse for the modern age Data warehouses have been
More informationMixApart: Decoupled Analytics for Shared Storage Systems. Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto and NetApp
MixApart: Decoupled Analytics for Shared Storage Systems Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto and NetApp Hadoop Pig, Hive Hadoop + Enterprise storage?! Shared storage
More informationNoVA MySQL October Meetup. Tim Callaghan VP/Engineering, Tokutek
NoVA MySQL October Meetup TokuDB and Fractal Tree Indexes Tim Callaghan VP/Engineering, Tokutek 2012.10.23 1 About me, :) Mark Callaghan s lesser-known but nonetheless smart brother. [C. Monash, May 2010]
More informationSHARDS & Talus: Online MRC estimation and optimization for very large caches
SHARDS & Talus: Online MRC estimation and optimization for very large caches Nohhyun Park CloudPhysics, Inc. Introduction Efficient MRC Construction with SHARDS FAST 15 Waldspurger at al. Talus: A simple
More informationConference The Data Challenges of the LHC. Reda Tafirout, TRIUMF
Conference 2017 The Data Challenges of the LHC Reda Tafirout, TRIUMF Outline LHC Science goals, tools and data Worldwide LHC Computing Grid Collaboration & Scale Key challenges Networking ATLAS experiment
More information