Infographics and Visualisation (or: Beyond the Pie Chart) LSS: ITNPBD4, 1 November 2016
- Aubrey McBride
2 Overview
- Overview (short: we covered most of this in the tutorial)
- Why infographics and visualisation?
  - What's the problem we're trying to solve?
- What makes for good infographics and visualisations?
- Where are we now in this area?
- Interactive visualisations
ITNPD4: Applications of Big Data
3 The problem
- Data analysis may tell you something about the structure of a problem
  - Or may predict how to optimise something: profit, energy usage, etc.
- BUT: in general you will have to convince someone else
  - And they may not be convinced by the numbers on their own
  - They expect some sort of graphic that they can show to the Board/CEO to convince them
  - A visualisation, perhaps an infographic
- The other side of this is that people may be presenting their data with a particular axe to grind
4 Visualisation and infographics
- Visualisation is the generic name for displaying data
  - May be a single image
  - Or a movie, for example
- "Visualizations help people see things that were not obvious to them before" (SAS website)
- There is also sonification, where data is sounded out: this works because our ears are very good at picking up patterns
  - E.g. Geiger counters, reversing systems in modern cars
- Infographics may be single images
  - Providing a visualisation of a specific set of data
  - But they may also be interactive
5 Infographics
- An infographic is a picture that displays information in an accessible and/or informative way
- Can be quite simple or quite complex
6 Not a new idea (Minard, 1869)!
- The standard text in this area is E. R. Tufte, The Visual Display of Quantitative Information
7 Infographic shows the troops and troop movements on the Eastern Front in World War 2.
8 Visualisation of low-dimensional datasets
- Low-dimensional datasets are often visualised as simple X/Y graphs, but even here there are issues
- For both X and Y axes:
  - Offset (is the origin at 0?)
  - Scale: linear or logarithmic?
  - Continuous or broken axes?
- Graph lines:
  - One or more than one?
  - Line style: continuous, dashed, dotted
  - Line colour
  - Symbols and/or lines?
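The linear-versus-logarithmic choice can be illustrated without a plotting library. The following is a minimal sketch, assuming a made-up series that doubles at every step; the function name `axis_positions` and the data are illustrative and not from the lecture. It computes where each value would sit along an axis of unit length under each scale.

```python
import math

# Hypothetical series that doubles at every step (not from the lecture).
values = [2 ** k for k in range(8)]  # 1, 2, 4, ..., 128


def axis_positions(data, scale="linear"):
    """Map data values to positions in [0, 1] along an axis."""
    if scale == "log":
        # A log axis needs strictly positive values.
        data = [math.log10(v) for v in data]
    lo, hi = min(data), max(data)
    return [(v - lo) / (hi - lo) for v in data]


linear_pos = axis_positions(values, "linear")
log_pos = axis_positions(values, "log")

# Gaps between neighbouring points along each axis.
linear_gaps = [b - a for a, b in zip(linear_pos, linear_pos[1:])]
log_gaps = [b - a for a, b in zip(log_pos, log_pos[1:])]

# On the linear axis the small values are squashed together near the origin;
# on the log axis consecutive doublings are equally spaced.
print([round(g, 3) for g in linear_gaps])
print([round(g, 3) for g in log_gaps])
```

The same data therefore tells a different visual story depending on the scale, which is why the axis choice should be stated on the graph.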
10 Using different line styles and colours
11 Visualising 3D data
12 Visualising high-dimensional datasets
- This is harder, and can be where infographics come in
- Cannot do this directly
  - Can plot two or three dimensions directly, but not more
  - Clever infographics can plot more dimensions, for example using geographical location, lines of varying thickness and colour, multiple symbols
- How can we show the structure of such datasets?
  - When we can't think of one-off, target-domain clever tricks
  - Discuss earlier infographics
- Clearly depends on what we are trying to show!
  - Geography as timeline, for example
- See also
13 What can we do in general?
- Let's say that we don't have any inspiration for designing a good infographic (!)
  - Infographics often depend on specific factors, e.g. dates, geographic distribution
- Can we find 2 or 3 (or even a few more) dimensions that in some sense summarise (what we want to emphasise about) the dataset?
- Ways forward: projecting and clustering
14 Choosing dimensions and projecting data
- What if the data is randomly spread throughout all the dimensions and has no structure?
  - Give up. There's nothing to be learned from it (if it really is random)
- Datasets that have something to tell us have some form of structure
  - Maybe the data lie (largely) on a lower-dimensional subset of the high-dimensional space
  - As opposed to being spread randomly and evenly throughout the original space
15 Example
- Say that we have 3-dimensional data, sampled over time
  - Each point is $(x, y, z, t)$: really 4-dimensional data
  - and $0 \le x^2 + y^2 + z^2 \le 1$, $0 \le t \le 10$ (the points $(x, y, z)$ are inside a sphere of radius 1, centred at the origin)
- Let's also say that at each time $t$, $\sqrt{x^2 + y^2 + z^2} = t/10$
  - So that the points at time $t$ are on the surface of a sphere of radius $t/10$
- Clearly, if we simply look at all the $(x, y, z)$ points (ignoring $t$) they are spread throughout the sphere
  - But not in an unstructured way
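The example can be simulated directly. This is a minimal sketch under stated assumptions: the sampling scheme (uniform random times, uniform random directions), the seed, and the names `sample_point` and `recovered_t` are illustrative choices, not part of the lecture.

```python
import math
import random


def sample_point(t, rng):
    """Return (x, y, z, t) with (x, y, z) on the sphere of radius t/10."""
    # Normalising a 3D Gaussian vector gives a uniformly random direction.
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in g))
        if norm > 0.0:
            break
    r = t / 10.0
    x, y, z = (r * c / norm for c in g)
    return (x, y, z, t)


rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability
points = [sample_point(rng.uniform(0.0, 10.0), rng) for _ in range(1000)]

# Ignoring t, the (x, y, z) points are spread throughout the unit sphere ...
radii = [math.sqrt(x * x + y * y + z * z) for x, y, z, _ in points]

# ... but the structure is that the radius determines t exactly.
recovered_t = [10.0 * r for r in radii]
max_error = max(abs(rt - t) for rt, (_, _, _, t) in zip(recovered_t, points))
print(f"largest |10*r - t| = {max_error:.2e}")
```

Here a single derived quantity, the radius, summarises the four recorded values: the data lie on a lower-dimensional (though curved) subset of the original space.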
16 Discovering structure in data
- There are many techniques for discovering (uncovering) structure
- Principal component analysis (PCA)
  - Linearly projecting a high-dimensional dataset on to a smaller number of dimensions
  - In such a way that as much as possible of the variance in the data is contained in this smaller number of dimensions
  - And the dimensions are orthogonal to each other
  - Well-understood and commonly used technique for data dimension reduction
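The idea behind PCA can be sketched in plain Python. This sketch finds only the first principal component, using power iteration on the covariance matrix; a real analysis would normally use a library routine and compute several components. The synthetic data (3D points scattered close to a single line) and all function names are assumptions made for illustration.

```python
import math
import random


def mean_centre(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of mean-centred rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Leading eigenvector of cov, found by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


# Synthetic 3D data lying close to a line with this unit direction.
rng = random.Random(1)
direction = (1 / 3, 2 / 3, 2 / 3)
data = []
for _ in range(500):
    s = rng.uniform(-5.0, 5.0)
    data.append([s * u + rng.gauss(0.0, 0.05) for u in direction])

centred = mean_centre(data)
cov = covariance(centred)
pc1 = first_component(cov)

# Project every point on to the first component (3 values become 1).
scores = [sum(a * b for a, b in zip(row, pc1)) for row in centred]
total_var = sum(cov[i][i] for i in range(3))
explained = (sum(s * s for s in scores) / (len(scores) - 1)) / total_var
print(f"fraction of variance kept by one component: {explained:.4f}")
```

Because the synthetic points were generated near a line, one component retains almost all of the variance; for data without such structure the fraction would be much lower.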
18 Independent components analysis
- Independent components analysis (ICA): "a statistical and computational technique for revealing hidden factors that underlie sets of random variables, measurements, or signals" (Hyvärinen, U Helsinki)
- Essentially looking for dimensions that co-vary
  - Finding ways of summarising points in the N-dimensional space using fewer than N values
- Data is assumed to be a linear mixture of underlying latent variables
  - These are assumed non-Gaussian and mutually independent: the independent components
- Related to PCA, but can find structure when PCA fails to do so
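The linear-mixture assumption can be written compactly. The notation below is introduced here for illustration and does not appear on the slides: $\mathbf{x}$ is an observed data vector, $\mathbf{s}$ the vector of latent independent components, $A$ the unknown mixing matrix, and $W$ the unmixing matrix that ICA estimates.

```latex
\mathbf{x} = A\,\mathbf{s},
\qquad
p(\mathbf{s}) = \prod_{i=1}^{n} p_i(s_i)
\quad \text{(each } p_i \text{ non-Gaussian)},
\qquad
\hat{\mathbf{s}} = W\,\mathbf{x}
```

In this formulation $W$ can recover $A^{-1}$ only up to the order and scaling of the components, since permuting or rescaling the $s_i$ (with a matching change to $A$) leaves $\mathbf{x}$ unchanged.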
19 Example: input
20 ICA output
22 Clustering data
- Often, rather than projecting data on to other axes, it is better to look at how the data points are grouped
  - The aim is to classify a large number of data vectors into a small number of manageable groups
- Does the data fall into clusters?
  - How unevenly distributed is the data?
- Does it cluster in
  - The original high-dimensional space?
  - A lower-dimensional projected space?
23 How does clustering work?
- Techniques: partition or hierarchical
24 Examples
25 Partition-based clustering
- Based on distance between vectors
- But which distance?
  - Euclidean
  - City-block
  - Weighted versions
  - Chebyshev distance
- Forming clusters: a simple method
  - Start with each vector as a single-element cluster
  - Identify the two closest vectors and combine them into the same cluster
  - Keep doing this until the distance between the two closest vectors not in the same cluster is large
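The three named distances and the simple merging method can be sketched as follows. The merging loop corresponds to what is usually called single-linkage agglomerative clustering. The `threshold` parameter is a hypothetical stand-in for "large", and the sample points are made up; neither comes from the lecture.

```python
import math


def euclidean(a, b):
    """Straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def city_block(a, b):
    """Sum of absolute coordinate differences (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def merge_closest(vectors, threshold, distance=euclidean):
    """Repeatedly merge the clusters holding the two closest vectors.

    Stops when the two closest vectors in different clusters are
    further apart than `threshold`.
    """
    clusters = [[v] for v in vectors]  # every vector starts alone
    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(distance(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (25, 0)]
clusters = merge_closest(points, threshold=3.0)
print(clusters)
```

Passing `distance=city_block` or `distance=chebyshev` changes which vectors count as closest, and can change the resulting clusters for the same data and threshold.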
26 Criticisms of clustering
- Clustering is descriptive, and not unique
  - Actual clusters may depend on the techniques used, as well as on the data
- Clustering techniques will always find clusters
  - Even when there aren't any!
  - (This implies some measure of the quality of a clustering should be used)
- Clustering techniques depend strongly on the measures used
  - There should ideally be some conceptual support for the measures used to calculate distances between vectors
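One possible quality measure is the silhouette coefficient. The lecture does not name a specific measure, so this choice is an assumption made here, as are the function names and sample data. For each point it compares the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b); values near 1 suggest well-separated clusters, values near 0 suggest the split is arbitrary.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette(clusters, distance=euclidean):
    """Mean silhouette coefficient; needs at least two clusters."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for pi, p in enumerate(cluster):
            if len(cluster) == 1:
                scores.append(0.0)  # usual convention for singletons
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(distance(p, q)
                    for qi, q in enumerate(cluster) if qi != pi)
            a /= len(cluster) - 1
            # b: smallest mean distance to the members of another cluster
            b = min(sum(distance(p, q) for q in other) / len(other)
                    for cj, other in enumerate(clusters) if cj != ci)
            denom = max(a, b)
            scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Two clearly separated groups (made-up data).
separated = [[(0, 0), (0, 1), (1, 0)], [(10, 10), (10, 11), (11, 10)]]
# One compact blob that has been split in two anyway.
forced = [[(0, 0), (1, 0)], [(0, 1), (1, 1)]]

print(f"separated groups: {silhouette(separated):.2f}")
print(f"forced split:     {silhouette(forced):.2f}")
```

A clustering technique would happily return both groupings; a score of this kind gives some evidence about which of them reflects real structure.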
27 Examples
- Google News indexes
  - Uses text to create topic clusters
  - Title, article listings
  - Used to discover multiple reports of the same story
- Video clusters on YouTube
  - Uses keywords, popularity, viewer engagement, user browsing history
28 Infographics tools
- At its simplest, Excel has many facilities for creating infographics and visualisations
  - But it's limited, and proprietary (though one can import comma-separated values)
- Matlab?
  - Not free!
  - Good graphing tools
- Flot: jQuery and JavaScript based
- Google Chart API: free, JavaScript based, browser output
- D3: JavaScript based, very powerful
29 Using visualisation and infographics
- As noted earlier, infographics and visualisation are about communicating ideas about data, discoveries from data mining, etc. to others
- But visualisation has another important use as well
  - Exploratory (initial) data analysis
  - How can you decide which tools to apply to data, and how to apply them, if you haven't an initial idea of what might be useful?
Making Science Graphs and Interpreting Data Eye Opener: 5 mins What do you see? What do you think? Look up terms you don t know What do Graphs Tell You? A graph is a way of expressing a relationship between
More informationVisual Encoding Design
CSE 442 - Data Visualization Visual Encoding Design Jeffrey Heer University of Washington Review: Expressiveness & Effectiveness / APT Choosing Visual Encodings Assume k visual encodings and n data attributes.
More informationDistribution-free Predictive Approaches
Distribution-free Predictive Approaches The methods discussed in the previous sections are essentially model-based. Model-free approaches such as tree-based classification also exist and are popular for
More informationNearest Neighbor Predictors
Nearest Neighbor Predictors September 2, 2018 Perhaps the simplest machine learning prediction method, from a conceptual point of view, and perhaps also the most unusual, is the nearest-neighbor method,
More informationSetting up Blogger. We have focused on Blogger as it is easy to use and ideal for someone starting blogging.
Setting up Blogger The three most popular platforms for blogging are WordPress, Tumblr and Blogger. In Module 1 the primary features of each platform were outlined. We have focused on Blogger as it is
More informationCS Information Visualization Sep. 2, 2015 John Stasko
Multivariate Visual Representations 2 CS 7450 - Information Visualization Sep. 2, 2015 John Stasko Recap We examined a number of techniques for projecting >2 variables (modest number of dimensions) down
More informationAn Introduction to PDF Estimation and Clustering
Sigmedia, Electronic Engineering Dept., Trinity College, Dublin. 1 An Introduction to PDF Estimation and Clustering David Corrigan corrigad@tcd.ie Electrical and Electronic Engineering Dept., University
More informationCOSC 311: ALGORITHMS HW1: SORTING
COSC 311: ALGORITHMS HW1: SORTIG Solutions 1) Theoretical predictions. Solution: On randomly ordered data, we expect the following ordering: Heapsort = Mergesort = Quicksort (deterministic or randomized)
More informationSEO: SEARCH ENGINE OPTIMISATION
SEO: SEARCH ENGINE OPTIMISATION SEO IN 11 BASIC STEPS EXPLAINED What is all the commotion about this SEO, why is it important? I have had a professional content writer produce my content to make sure that
More informationPre-Requisites: CS2510. NU Core Designations: AD
DS4100: Data Collection, Integration and Analysis Teaches how to collect data from multiple sources and integrate them into consistent data sets. Explains how to use semi-automated and automated classification
More information