Information Visualization and Visual Analytics roles, challenges, and examples Giuseppe Santucci
Transcription
1 Information Visualization and Visual Analytics roles, challenges, and examples Giuseppe Santucci
2 VisDis and the Database & User Interface. The background of the VisDis and Database/Interface group covers: Visual Information Access; Data quality; Data integration; Adaptive Interfaces; User Centered Design; Usability and Accessibility; Infovis evaluation; Visual quality metrics; Visual Analytics; Data sampling; Density map optimization.
3 Outline: Information Visualization (main issues); Data overloading; Visual Analytics; Automatic data analysis; Three examples; Projects and books.
4 Information visualization! 1. Infovis is perfect for exploration, when we don't know exactly what to look at; it supports vague goals. 2. Infovis is perfect to explain complex data and to support decisions. Other approaches to data analysis: Statistics: strong verification, but does not support exploration and vague goals. Data mining: actionable and reliable, but black box, not interactive, question-response style. Visual analytics (formerly Visual Data Mining) is trying to join the two worlds.
5 Canonical steps in infovis. STEP 1: DATA to Internal Representation. Example data domains: Sport, Mathematics, Physics, Chemistry, Art, Geography, Literature, History. Encoding of values: univariate data, bivariate data, trivariate data, multidimensional data. Encoding of relations: temporal data, maps & diagrams, graphs/trees, data streams.
6 Canonical steps in infovis. STEP 2: Internal Representation to Presentation. Space limitations: scrolling, overview + details, distortion, suppression, zoom & pan, semantic zoom. Time limitation. Perceptual issues. Cognitive issues.
7 SO WE ARE DONE! (?)
8 Outline Information Visualization Data overloading Visual Analytics Automatic data analysis Three examples Projects and books and conferences
9 Data size and complexity! 100 million FedEx transactions per day; 150 million VISA credit card transactions per day; 300 million long distance AT&T calls per day; 50 billion e-mails per day; 600 billion IP packets per day; 1 trillion (10^12) web pages (according to Google), corresponding to about 3 petabytes of data; Google processes 20 petabytes of data per day. Data streams (sensor networks, IP traffic, etc.). Units: kilobyte, megabyte, gigabyte, terabyte, petabyte.
10 Rescuing information. In different situations people need to exploit the hidden information resting in large, unexplored data sets: decision-makers, analysts, engineers, emergency response teams... Several techniques are devoted to this aim: automatic analysis techniques (e.g., data mining) and manual analysis techniques (e.g., information visualization). Petabyte datasets require a joint effort:
11 Visual Analytics
12 VA is highly interdisciplinary. Components: Data Mining; Data Management; Scientific & Information Visualisation; Spatio-Temporal Data; Human Perception + Cognition; Infrastructure; Evaluation. Each component presents challenging issues.
13 Visualization. Scientific Visualization & Information Visualization: interactivity & scalability issues. Challenges: design of new scalable structures that support visual abstractions (e.g., clustering, sampling, etc.) and rapid update of visual displays for billion-record databases (10 frames per second).
14 Data Management. Answering a query against a large data set is now possible. Among the other challenges: Integration of heterogeneous data such as numeric data, graphs, text, audio and video signals, semi-structured data. Data streams: in many applications data are continuously produced (sensor data, stock market data, news data, etc.). Data provenance: understanding where data come from. Data reduction: visualizing billions of records is not possible; we need to reduce and abstract the data to support interaction at different detail levels (see, e.g., Google Earth)...
15 Data mining. Methods to automatically extract insights: Supervised learning from examples, using training samples to learn models for the classification (or prediction) of previously unseen data samples. Cluster analysis, which aims to extract structure from unknown data, grouping data instances into classes based on mutual similarity, and to identify outliers. Association rule mining (analysis of co-occurrence of data items) and dimensionality reduction. Challenges come from: semi-structured and complex data (web data, documents); interaction with visualizations.
16 Spatio-Temporal Data. Data about time and space are widespread: geographic measurements, GPS position data, remote sensing applications (e.g., satellite data). Finding spatial relationships and patterns among these data is of special interest. The analysis of data with references both in space and in time is a challenging research topic. Scale: clusters and other phenomena may only occur at particular scales, which may not be the scale at which data is recorded. Uncertainty: spatio-temporal data are often incomplete, interpolated, collected at different times, etc.
17 Perception and cognition. A critical element is the human being. Visual analysis tasks require the careful design of apt human-computer interfaces. Challenges: need to integrate Psychology, Sociology, Neurosciences, and Design issues; user-centred analysis and modelling; multimodal interaction techniques for visualization and exploration of large information spaces; availability of improved display resources; novel interaction algorithms; perceptual, cognitive and graphical principles which in combination lead to improved visual communication of data and analysis results. [Figure: action cycle with the stages Form Intention, Form Action Plan, Execute Action, Perception, Interpretation, Evaluation]
18 Evaluation and Infrastructure. How to assess (evaluate) the effectiveness of visual analytics environments is a topic of lively debate. The same happens for infrastructures: agreed solutions are still under investigation. Both topics are still in the phase of workshop results... D3!
19 Back to the Automatic Data Analysis. We can classify the automatic activities into three main groups. 1. Deriving new values from the dataset for ad-hoc visualization: this is the least standard and most creative part of the process. 2. Data reduction / data mining: clustering / classification / sampling / pixel-oriented visualization; dimension reduction. 3. Visualization improvement: data distribution; perceptual issues; cognitive issues.
20 Example for group 1 Deriving new values from the dataset for ad-hoc visualization (you are going to visualize DERIVED data)
21 A Visual Analytics example (Group 1): Deriving new values from the dataset for ad-hoc visualization. How to visually compare J. London and M. Twain books? [D. A. Keim and D. Oelke. Literature Fingerprinting: A New Method for Visual Literary Analysis. IEEE Symp. on Visual Analytics Science and Technology (VAST '07)] 1. Split the book into several text blocks (e.g., pages, paragraphs, sentences). 2. Measure, for each text block, a relevant feature (e.g., average sentence length, word usage, etc.). 3. Associate the relevant feature with a visual attribute (e.g., color). 4. Visualize it.
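The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Keim and Oelke implementation: the function names, the block size of three sentences, the regular-expression sentence splitter, and the linear grey scale are all assumptions made for the example.

```python
import re

def split_blocks(text, sentences_per_block=3):
    """Step 1: split the text into blocks of a few sentences each (assumed block size)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [sentences[i:i + sentences_per_block]
            for i in range(0, len(sentences), sentences_per_block)]

def avg_sentence_length(block):
    """Step 2: the measured feature, here the average number of words per sentence."""
    return sum(len(s.split()) for s in block) / len(block)

def to_grey(value, lo, hi):
    """Step 3: map a feature value linearly to a grey level in 0..255."""
    if hi == lo:
        return 0
    return round(255 * (value - lo) / (hi - lo))

def fingerprint(text, sentences_per_block=3):
    """Steps 1-3 together: one grey level per text block."""
    features = [avg_sentence_length(b) for b in split_blocks(text, sentences_per_block)]
    lo, hi = min(features), max(features)
    return [to_grey(f, lo, hi) for f in features]

# Hypothetical sample text: a terse block followed by a wordy one.
sample = ("Short one. Tiny. Brief here. "
          "This sentence is quite a bit longer than the previous ones. "
          "And this one also keeps going for a good while longer. "
          "Finally another fairly long sentence closes the second block.")

# Step 4 would draw one colored cell per block; here we just print the levels.
print(fingerprint(sample))
```

Each returned value would color one cell of the fingerprint, so books with different sentence-length profiles yield visibly different cell patterns.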
22 J.London vs M.Twain average sentence lengths
23 User interaction (a non uniform book?)
24 Details of a book
25 What about the Bible?
26 Example 2 Data reduction / data mining
27 Visual Analytics of Anomaly Detection in Large Data Streams (paper from Daniel Keim's group). You have to monitor a network composed of 8 systems with 16 servers each. Each server provides basic information: CPU % occupation, DISK % occupation, MEM % occupation... That corresponds to 128 temporal data streams (overplotting!!). [Figure: CPU % over time]
28 Pixel-oriented visualization. 28 days (5 min windows), about 8k observations. Each observation takes a pixel. The color codes the CPU %.
29 The whole system. Color is preattentive!
30 Automated analysis: computing high CPU % clusters, which selects hot time intervals.
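The arithmetic behind "about 8k observations" and the one-pixel-per-observation idea can be written out as a small sketch. The row-major layout (one row per day) and the green-to-red color ramp are assumptions chosen for illustration; the slide does not say which pixel arrangement or color map the original system uses.

```python
WINDOWS_PER_DAY = 24 * 60 // 5      # 5-minute windows in one day: 288
DAYS = 28
N_OBS = DAYS * WINDOWS_PER_DAY      # 8064 observations, i.e. "about 8k"

def pixel_position(index, width=WINDOWS_PER_DAY):
    """Place observation `index` on a grid, one row per day (assumed layout)."""
    return divmod(index, width)     # (row, column)

def cpu_to_color(cpu_percent):
    """Map CPU % in 0..100 to an (r, g, b) triple from green (idle) to red (busy)."""
    clamped = max(0.0, min(100.0, cpu_percent))
    level = round(255 * clamped / 100)
    return (level, 255 - level, 0)

print(N_OBS)
print(pixel_position(N_OBS - 1))    # last window of the last day
print(cpu_to_color(100))
```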
31 Automated analysis... Detecting persistent anomalies
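One simple way to obtain "hot time intervals" and to keep only persistent ones is thresholding followed by run detection. The sketch below is an assumption for illustration and not the method of the cited paper: the 90% threshold, the `min_length` parameter, and the function name are all hypothetical.

```python
def hot_intervals(cpu, threshold=90.0, min_length=1):
    """Return (start, end) index pairs, end exclusive, of runs where cpu >= threshold.

    Runs shorter than `min_length` windows are dropped, so a larger
    `min_length` keeps only persistent anomalies.
    """
    intervals = []
    start = None
    for i, value in enumerate(cpu):
        if value >= threshold:
            if start is None:
                start = i               # a new hot run begins here
        elif start is not None:
            if i - start >= min_length:
                intervals.append((start, i))
            start = None                # the run has ended
    if start is not None and len(cpu) - start >= min_length:
        intervals.append((start, len(cpu)))   # run reaching the end of the stream
    return intervals

# Hypothetical CPU % readings, one per 5-minute window.
cpu = [10, 95, 97, 20, 92, 15, 91, 93, 96, 99]
print(hot_intervals(cpu))                 # every hot run
print(hot_intervals(cpu, min_length=3))   # only runs lasting 3+ windows
```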
32 Looking for correlations
33 Example 3 Visualization improvement
34 A Visual Analytics example (Group 3: Visualization improvement). Data distribution and perceptual issues. Density maps: when 4 data items are plotted on the same pixel, that pixel has density d=4. [Figure: 8x8 pixel grid with filled and empty pixels] We can map the density values to a 256-level grey or color scale.
35 The case study (Infovis contest 2005). About 60,000 USA companies plotted on an 800x450 (360,000 pixels) scatter plot. 126 distinct density values ranging over [1..1,633]. 7,042 active pixels (i.e., hosting at least one company): 2,526 pixels (36%) host exactly one company (d=1); 1,182 pixels (17%) host two companies (d=2); ... 1 pixel ( %) hosts 1,633 companies (d=1633).
36 What is the problem? The choice of the right mapping is crucial, because the density frequency distribution is very skewed. [Figure: pixel number vs. density (126 distinct values, up to 1,633); labelled frequencies 36%, 17%, 0.001%]
37 The mapping. 126 different data densities = {1, 2, ..., 1,633} → ? → 256 color codes = {0, 1, 2, ..., 255}. Available solutions: - Linear mapping - Non-linear mappings
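A density map of this kind can be computed by counting how many data items land on each pixel. The following is a minimal sketch under stated assumptions: points are assumed to lie inside the given ranges, and the function name, argument layout, and sample coordinates are hypothetical.

```python
from collections import Counter

def density_map(points, width, height, x_range, y_range):
    """Count the data items falling on each pixel; returns {(col, row): density}.

    Pixels that host no item simply do not appear in the result.
    """
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    counts = Counter()
    for x, y in points:
        # Scale to pixel coordinates; clamp so x == x_max stays on the last pixel.
        col = min(int((x - x_min) / (x_max - x_min) * width), width - 1)
        row = min(int((y - y_min) / (y_max - y_min) * height), height - 1)
        counts[(col, row)] += 1
    return counts

# Four items collide on one pixel of an 8x8 grid, one item sits alone.
points = [(0.1, 0.1), (0.11, 0.12), (0.12, 0.1), (0.1, 0.11), (0.9, 0.9)]
dm = density_map(points, 8, 8, (0.0, 1.0), (0.0, 1.0))
print(dict(dm))
```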
38 Linear mapping: $\mathrm{ColorCode}(d) = \mathrm{Round}\left(255 \cdot \frac{d - d_{min}}{d_{max} - d_{min}}\right)$. [Figure: transfer function, showing color collisions] Straightforward solution, but useless in this situation: most pixels share very low color codes; few color codes are used (46 out of 256). [Figure: color code frequency distribution] Different low density values are represented by the same color code: densities in [1..10] are mapped on codes {1,2}.
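The collision problem is easy to reproduce. The sketch below assumes the linear mapping reads ColorCode(d) = Round(255 (d - d_min) / (d_max - d_min)) with d_min = 1 and d_max = 1,633 from the case study; the exact code values it yields depend on that reading and on the rounding rule, so only the number of distinct codes is the point here.

```python
def linear_color_code(d, d_min, d_max):
    """Linear mapping of a density onto the 256 color codes 0..255 (assumed formula)."""
    return round(255 * (d - d_min) / (d_max - d_min))

D_MIN, D_MAX = 1, 1633   # density range of the case study

# The ten lowest densities, which cover most of the active pixels.
codes = {d: linear_color_code(d, D_MIN, D_MAX) for d in range(1, 11)}
print(codes)
print(len(set(codes.values())), "distinct codes for 10 distinct densities")
```

Under this reading the ten lowest densities share just two color codes, which is why the linear mapping wastes most of the scale on a handful of very dense pixels.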
39 Density function mapping: $\mathrm{ColorCode}(d_j) = \mathrm{Round}\left(255 \cdot \sum_{i=1}^{j} \frac{N_{AP}(d_i)}{N_{AP}}\right)$, where $N_{AP}(d_i)$ is the number of active pixels with density $d_i$ and $N_{AP}$ is the total number of active pixels. Hermann et al. [HMM00]. Quite similar to histogram equalization. Better than linear mapping. [Figure: transfer function and color code frequency distribution] Few color codes are used (39 out of 256). Lowest color code unnecessarily high. Codes ranging only on [ ]. Different high density values are represented by the same color code: densities in [48..1,633] -> [250,255].
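A cumulative mapping in the spirit of histogram equalization can be sketched as follows. It assumes the color code of a density is 255 times the fraction of active pixels whose density is at most that value; this is a reading of the slide, not a verified transcription of [HMM00]. The histogram is toy data, loosely shaped after the 36% and 17% figures of the case study.

```python
def density_function_codes(pixels_per_density):
    """Map each density to 255 * (cumulative share of active pixels), rounded.

    `pixels_per_density` maps a density value to the number of active
    pixels having exactly that density.
    """
    n_ap = sum(pixels_per_density.values())     # total number of active pixels
    codes, cumulative = {}, 0
    for d in sorted(pixels_per_density):
        cumulative += pixels_per_density[d]
        codes[d] = round(255 * cumulative / n_ap)
    return codes

# Toy histogram (hypothetical): 100 active pixels, heavily skewed to low densities.
hist = {1: 36, 2: 17, 3: 12, 5: 10, 9: 8, 20: 6, 48: 5, 200: 3, 900: 2, 1633: 1}
codes = density_function_codes(hist)
print(codes)
```

With this toy data the lowest density already receives code 92, and the highest densities crowd into the top few codes, mirroring the two drawbacks listed on the slide.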
40 Our proposal. We take into account that: densities and color codes are discrete and finite; color codes that are too close are hardly distinguishable (for human beings). [E. Bertini, A. Di Girolamo, G. Santucci. See what you know: analyzing data distribution to improve density map visualization. Eurovis 2007 conference]
41 Uniform scale mapping. We use a reduced color scale, e.g. with 15 codes ($N_L = 15$). This implies that different density values will necessarily be represented by the same color code; to reduce the degradation, the mapping is performed through an algorithm that tries to assign the same number of pixels, $N_{AP}/N_L$, to each code. [Figure: target color code frequency distribution over codes $c_1, c_2, c_3, \ldots, c_{N_L}$]
42 $N_{DV} > N_L$: uniform scale mapping. ColorCode($d_j$) = DistributePixels. Because densities are discrete, the algorithm cannot ensure the $N_{AP}/N_L$ value, and through a peak analysis it minimizes the variance. Full color scale usage [0..255]. All the color codes are used. Maximum color code separation. [Figure: color code frequency distribution]
43 Visual comparison: linear mapping, density function mapping, uniform scale mapping.
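The idea of giving each of the N_L codes roughly N_AP/N_L pixels can be illustrated with a simple greedy binning. This is explicitly not the authors' DistributePixels algorithm (which uses a peak analysis to minimize the variance); it is a hypothetical stand-in that walks the densities in increasing order and moves to the next code once the running pixel count reaches the next multiple of the target. The histogram is the same kind of toy data as before.

```python
def uniform_scale_codes(pixels_per_density, n_levels=15):
    """Greedy sketch: spread densities over n_levels codes with similar pixel counts.

    The n_levels codes are spaced evenly over 0..255, so neighbouring codes
    stay as far apart as possible.
    """
    if n_levels < 2:
        raise ValueError("n_levels must be at least 2")
    n_ap = sum(pixels_per_density.values())
    target = n_ap / n_levels                    # ideal pixel count per code
    levels, level, cumulative = {}, 0, 0
    for d in sorted(pixels_per_density):
        levels[d] = level
        cumulative += pixels_per_density[d]
        # Move on by at most one code per density, so no code is skipped.
        if level < n_levels - 1 and cumulative >= (level + 1) * target:
            level += 1
    step = 255 / (n_levels - 1)
    return {d: round(lvl * step) for d, lvl in levels.items()}

# Toy histogram (hypothetical): 100 active pixels, heavily skewed to low densities.
hist = {1: 36, 2: 17, 3: 12, 5: 10, 9: 8, 20: 6, 48: 5, 200: 3, 900: 2, 1633: 1}
codes = uniform_scale_codes(hist, n_levels=4)
print(codes)
```

With four levels the toy data ends up with 36, 17, 22 and 25 pixels per code, and the codes are spread over the whole range 0..255.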
44 Visual comparison
45
46
47 The parcel dataset Postal parcels plotted by weight (x) and volume (y)
48 Grey scale
Mapping               CSU    CsAR   CS
Linear                0.53   1      2.83
Density function      0.18   0.62   5.23
Uniform color scale   1      1      8.79
49 Conclusions. Visual Analytics is a new (exciting) emerging research field. Information visualization is a core component of VA. Automated data analysis can be classified into three main groups: deriving new values (more creative); data reduction (sometimes creative); image improvement (very technical). It is highly interdisciplinary and requires a collaborative approach. It is more a METHODOLOGY / VISION than a technique. However, a collection of available results / proposals is quickly growing.
50 The new (European) book on VA. Illuminating the Path: The Research and Development Agenda for Visual Analytics (2005), focusing on USA homeland security. Managing the Information Age: Solving Problems with Visual Analytics (2010), one of the major outcomes of VisMaster. Available for free at:
51 5 books you HAVE to read (greedy order). Robert Spence - Information Visualization: Design for Interaction (2nd Edition) - Addison-Wesley (ACM Press) - BASIC ISSUES. Chaomei Chen - Information Visualization - Second Edition - Springer - AN UPDATED OVERVIEW. Managing the Information Age: Solving Problems with Visual Analytics (2010) - VISMASTER BOOK. Colin Ware - Information Visualization, Third Edition: Perception for Design (Interactive Technologies) - Morgan Kaufmann - PERCEPTUAL ISSUES. Card, Mackinlay, Shneiderman - Readings in Information Visualization - HISTORICAL.
52 Visual Analytics projects
53 The Vismaster CA project
54 The Promise NoE project
55 PanopteSec Network Cyber Security 3 years European IP project!
56
57 PanopteSec: Call for Master Thesis. Design, implement, and test a Visual Analytics environment for network security (D3 framework). It includes the Information Visualization homework.
1. Inroduction to Data Mininig 1.1 Introduction Universe of Data Information Technology has grown in various directions in the recent years. One natural evolutionary path has been the development of the
More informationData Mining. Data preprocessing. Hamid Beigy. Sharif University of Technology. Fall 1395
Data Mining Data preprocessing Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 15 Table of contents 1 Introduction 2 Data preprocessing
More informationTexture Image Segmentation using FCM
Proceedings of 2012 4th International Conference on Machine Learning and Computing IPCSIT vol. 25 (2012) (2012) IACSIT Press, Singapore Texture Image Segmentation using FCM Kanchan S. Deshmukh + M.G.M
More informationFast Approximations for Analyzing Ten Trillion Cells. Filip Buruiana Reimar Hofmann
Fast Approximations for Analyzing Ten Trillion Cells Filip Buruiana (filipb@google.com) Reimar Hofmann (reimar.hofmann@hs-karlsruhe.de) Outline of the Talk Interactive analysis at AdSpam @ Google Trade
More informationInteraction. CS Information Visualization. Chris Plaue Some Content from John Stasko s CS7450 Spring 2006
Interaction CS 7450 - Information Visualization Chris Plaue Some Content from John Stasko s CS7450 Spring 2006 Hello. What is this?! Hand back HW! InfoVis Music Video! Interaction Lecture remindme.mov
More informationCourse Curriculum for Master Degree in Network Engineering and Security
Course Curriculum for Master Degree in Network Engineering and Security The Master Degree in Network Engineering and Security is awarded by the Faculty of Graduate Studies at Jordan University of Science
More informationMassive Data Analysis
Professor, Department of Electrical and Computer Engineering Tennessee Technological University February 25, 2015 Big Data This talk is based on the report [1]. The growth of big data is changing that
More informationData Mining. Yi-Cheng Chen ( 陳以錚 ) Dept. of Computer Science & Information Engineering, Tamkang University
Data Mining Yi-Cheng Chen ( 陳以錚 ) Dept. of Computer Science & Information Engineering, Tamkang University Why Mine Data? Commercial Viewpoint Lots of data is being collected and warehoused Web data, e-commerce
More informationImproving the Efficiency of Fast Using Semantic Similarity Algorithm
International Journal of Scientific and Research Publications, Volume 4, Issue 1, January 2014 1 Improving the Efficiency of Fast Using Semantic Similarity Algorithm D.KARTHIKA 1, S. DIVAKAR 2 Final year
More informationChallenges for Data Driven Systems
Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Data Centric Systems and Networking Emergence of Big Data Shift of Communication Paradigm From end-to-end to data
More informationClassification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University
Classification Vladimir Curic Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Outline An overview on classification Basics of classification How to choose appropriate
More informationStatistical Learning and Data Mining CS 363D/ SSC 358
Statistical Learning and Data Mining CS 363D/ SSC 358! Lecture: Introduction Pradeep Ravikumar pradeepr@cs.utexas.edu What is this course about (in 1 minute) Big Data Data Mining, Statistical Learning
More informationExploring the Structure of Data at Scale. Rudy Agovic, PhD CEO & Chief Data Scientist at Reliancy January 16, 2019
Exploring the Structure of Data at Scale Rudy Agovic, PhD CEO & Chief Data Scientist at Reliancy January 16, 2019 Outline Why exploration of large datasets matters Challenges in working with large data
More informationDS595/CS525: Urban Network Analysis --Urban Mobility Prof. Yanhua Li
Welcome to DS595/CS525: Urban Network Analysis --Urban Mobility Prof. Yanhua Li Time: 6:00pm 8:50pm Wednesday Location: Fuller 320 Spring 2017 2 Team assignment Finalized. (Great!) Guest Speaker 2/22 A
More informationName of the lecturer Doç. Dr. Selma Ayşe ÖZEL
Y.L. CENG-541 Information Retrieval Systems MASTER Doç. Dr. Selma Ayşe ÖZEL Information retrieval strategies: vector space model, probabilistic retrieval, language models, inference networks, extended
More informationData Mining. Jeff M. Phillips. January 7, 2019 CS 5140 / CS 6140
Data Mining CS 5140 / CS 6140 Jeff M. Phillips January 7, 2019 What is Data Mining? What is Data Mining? Finding structure in data? Machine learning on large data? Unsupervised learning? Large scale computational
More informationWELCOME! Lecture 3 Thommy Perlinger
Quantitative Methods II WELCOME! Lecture 3 Thommy Perlinger Program Lecture 3 Cleaning and transforming data Graphical examination of the data Missing Values Graphical examination of the data It is important
More informationThis tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.
About the Tutorial Data Mining is defined as the procedure of extracting information from huge sets of data. In other words, we can say that data mining is mining knowledge from data. The tutorial starts
More informationDETECTION OF ANOMALIES FROM DATASET USING DISTRIBUTED METHODS
DETECTION OF ANOMALIES FROM DATASET USING DISTRIBUTED METHODS S. E. Pawar and Agwan Priyanka R. Dept. of I.T., University of Pune, Sangamner, Maharashtra, India M.E. I.T., Dept. of I.T., University of
More informationUNCLASSIFIED. R-1 ITEM NOMENCLATURE PE D8Z: Data to Decisions Advanced Technology FY 2012 OCO
Exhibit R-2, RDT&E Budget Item Justification: PB 2012 Office of Secretary Of Defense DATE: February 2011 BA 3: Advanced Development (ATD) COST ($ in Millions) FY 2010 FY 2011 Base OCO Total FY 2013 FY
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Features and Feature Selection Hamid R. Rabiee Jafar Muhammadi Spring 2013 http://ce.sharif.edu/courses/91-92/2/ce725-1/ Agenda Features and Patterns The Curse of Size and
More informationBased on Big Data: Hype or Hallelujah? by Elena Baralis
Based on Big Data: Hype or Hallelujah? by Elena Baralis http://dbdmg.polito.it/wordpress/wp-content/uploads/2010/12/bigdata_2015_2x.pdf 1 3 February 2010 Google detected flu outbreak two weeks ahead of
More informationA Statistical Approach to Culture Colors Distribution in Video Sensors Angela D Angelo, Jean-Luc Dugelay
A Statistical Approach to Culture Colors Distribution in Video Sensors Angela D Angelo, Jean-Luc Dugelay VPQM 2010, Scottsdale, Arizona, U.S.A, January 13-15 Outline Introduction Proposed approach Colors
More informationdan.fay@microsoft.com Scientific Data Intensive Computing Workshop 2004 Visualizing and Experiencing E 3 Data + Information: Provide a unique experience to reduce time to insight and knowledge through
More information