CIS 4930/6930 Spring 2014 Introduction to Data Science /Data Intensive Computing. University of Florida, CISE Department Prof.


1 CIS 4930/6930 Spring 2014 Introduction to Data Science /Data Intensive Computing University of Florida, CISE Department Prof. Daisy Zhe Wang

2 Data Visualization
- Value of Visualization
- Data and Image Models
- Visualization Design
- Exploratory Data Analysis
Slides adapted from Jeffrey Heer at University of Washington

3 What is visualization?
- "Transformation of the symbolic into the geometric" [McCormick et al. 1987]
- "... finding the artificial memory that best supports our natural means of perception." [Bertin 1967]
- "The use of computer-generated, interactive, visual representations of data to amplify cognition." [Card, Mackinlay, & Shneiderman 1999]

4 Data

5 Visual Representation

6 Why visualization? Efficient use of attention
"What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention, and a need to allocate that attention efficiently among the overabundance of information sources that might consume it."
Herb Simon, as quoted by Hal Varian, Scientific American, September

7 Why create visualization?
- Answer questions, or discover them (e.g., What is the Silk Road that runs from Europe to China?)
- Make decisions (e.g., stock market, monitoring systems in hospitals)
- See data in context (e.g., map)
- Expand memory (e.g., multiplication)
- Find patterns (e.g., astronomy data, transactions)
- Present an argument or tell a story (e.g., growth of Walmart)
- Inspire (e.g., textbook medicine, genome, DNA)

8 The Value of Visualization
- Record information
  - Blueprints, photographs, seismographs, ...
- Analyze data to support reasoning
  - Develop and assess hypotheses
  - Discover errors in data
  - Expand memory
  - Find patterns
- Communicate information to others
  - Share and persuade
  - Collaborate and revise

9 Record information
Leonardo da Vinci: Map of Imola, created for Cesare Borgia (top); Proportions of man (left)

10 Support Reasoning
Which animal has the most powerful brain?

11 The most powerful brain?

12 Communicate Information From the New York Times

13 The Value of Visualization
- Record information
  - Blueprints, photographs, seismographs, ...
- Analyze data to support reasoning
  - Develop and assess hypotheses
  - Discover errors in data
  - Expand memory
  - Find patterns
- Communicate information to others
  - Share and persuade
  - Collaborate and revise

14 Visualization Reference Model

15 Visualization Generation Process

16 Topics
- Properties of data
- Properties of images
- Mapping data to images

17 Data models vs. Conceptual models
- Data models are low-level descriptions of the data (math abstraction)
  - Math: sets with operations on them
  - Example: integers with + and × operators
- Conceptual models are mental constructions
  - Include semantics and support reasoning
- Examples (data vs. conceptual)
  - (1D floats) vs. Temperature
  - (3D vector of floats) vs. Space

18 Taxonomy of data types
- 1D (sets and sequences)
- Temporal
- 2D (maps) -- Spatial
- 3D (shapes)
- nD (relational)
- Trees (hierarchies)
- Networks (graphs)
- Combinations: e.g., spatial + temporal, spatial + relational

19 Types of variables
- Physical types
  - Characterized by storage format
  - Characterized by machine operations
  - Example: bool, short, int32, float, double, string, ...
- Abstract types
  - Provide descriptions of the data
  - May be characterized by methods
  - May be organized into a hierarchy (e.g., ontology)

20 Abstract types of variables
- Categorical (data that are counted)
  - Nominal
  - Ordinal
- Quantitative or numerical (data that are measured)
  - Interval
  - Ratio
Why is the type of variable important? The methods used to display, summarize, and analyze data depend on whether the variables are categorical or quantitative.

21 Categorical: Nominal
Nominal variables are named, i.e., classified into one or more qualitative categories that describe the characteristic of interest.
- No ordering of the different categories
- No measure of distance between values
- Categories can be listed in any order without affecting the relationship between them
Nominal variables are the simplest type of variable.

22 Categorical: Ordinal
Ordinal variables have an inherent order to the relationship among the different categories.
- An implied ordering of the categories (levels)
- Quantitative distance between levels is unknown
- Distances between the levels may not be the same
- Meaning of different levels may not be the same for different individuals

23 Quantitative/Numerical
- Interval: variables that have constant, equal distances between values, but the zero point is arbitrary.
- Ratio: variables that have equal intervals between values, a meaningful zero point, and meaningful numerical relationships between numbers.
- Continuous vs. discrete
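A minimal sketch of why the interval/ratio distinction matters, using temperature as an assumed example (it is not taken from this slide): Celsius has an arbitrary zero, so only differences are meaningful, while Kelvin has a true zero, so ratios are meaningful too. The helper name `celsius_to_kelvin` is illustrative.

```python
def celsius_to_kelvin(c: float) -> float:
    """Convert Celsius (interval scale) to Kelvin (ratio scale)."""
    return c + 273.15


a_c, b_c = 10.0, 20.0  # two temperatures in degrees Celsius

# Differences are meaningful on an interval scale: the gap is the
# same whether we measure it in Celsius or in Kelvin.
diff_c = b_c - a_c
diff_k = celsius_to_kelvin(b_c) - celsius_to_kelvin(a_c)

# Ratios depend on where zero sits. 20 °C is not "twice as hot" as
# 10 °C; on the Kelvin scale the ratio is only about 1.035.
ratio_c = b_c / a_c
ratio_k = celsius_to_kelvin(b_c) / celsius_to_kelvin(a_c)

print(diff_c, round(diff_k, 6), ratio_c, round(ratio_k, 3))
```

The difference survives the change of scale, but the ratio does not, which is the practical meaning of "the zero point is arbitrary".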

24 Nominal, Ordinal and Quantitative
- N - Nominal (labels)
  - Fruits: apples, oranges, ...
- O - Ordinal (ordered list)
  - Quality of meat: Grade A, AA, AAA
- Q - Interval (location of zero arbitrary)
  - Dates: Jan 19, 2006; Location: (LAT 33.98, LONG )
  - Cannot compare directly
  - Only differences (i.e., intervals) may be compared
- Q - Ratio (zero fixed)
  - Physical measurement: length, mass, temp
  - Counts and amounts
  - Origin is meaningful

25 Level of Measurement
Higher-level variables can always be expressed at a lower level, but the reverse is not true: Q > O > N.
For example, Body Mass Index (BMI) is typically measured at an interval level. BMI can be collapsed into lower-level ordinal categories such as:
- >30: Obese
- 25-30: Overweight
- <25: Normal or underweight
or nominal categories such as:
- Overweight
- Not overweight
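The collapse from Q to O to N can be sketched in a few lines. This is an illustration only: the function names, the label for values below 25, the choice to count "Obese" as "Overweight" in the nominal version, and the sample BMI values are all assumptions, not part of the slides.

```python
def bmi_to_ordinal(bmi: float) -> str:
    """Collapse a quantitative BMI into ordered categories (Q -> O)."""
    if bmi > 30:
        return "Obese"
    if bmi >= 25:
        return "Overweight"
    return "Below 25"


def bmi_to_nominal(bmi: float) -> str:
    """Collapse further into two unordered labels (Q -> N).

    Assumption: anything at or above 25 counts as "Overweight".
    """
    return "Overweight" if bmi >= 25 else "Not overweight"


sample_bmis = [22.0, 27.5, 31.2]  # hypothetical values for illustration
ordinal = [bmi_to_ordinal(b) for b in sample_bmis]
nominal = [bmi_to_nominal(b) for b in sample_bmis]

print(ordinal)
print(nominal)
```

Going the other way is not possible: from the label "Overweight" alone there is no way to recover 27.5.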

26 Operations on N, O, Q Data Types
- N - Nominal (labels)
  - Operations: =, ≠
- O - Ordinal (ordered list)
  - Operations: =, ≠, <, >
- Q - Interval (location of zero arbitrary)
  - Operations: =, ≠, <, >, -
  - Can measure distances or spans
- Q - Ratio (zero fixed)
  - Operations: =, ≠, <, >, -, %
  - Can measure ratios or proportions
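One way to see which operations each level supports is to map the levels onto Python types that happen to allow the same operations. This is a sketch under assumptions: the `Grade` class, the ordering A < AA < AAA, and the sample values are illustrative choices, not part of the slides.

```python
from datetime import date
from enum import Enum

# Nominal: only equality and inequality make sense.
fruit_a, fruit_b = "apple", "orange"
same_fruit = fruit_a == fruit_b


# Ordinal: equality plus ordering, but no arithmetic on the levels.
class Grade(Enum):
    A = 1
    AA = 2
    AAA = 3

    def __lt__(self, other):
        if isinstance(other, Grade):
            return self.value < other.value
        return NotImplemented


lower_grade = Grade.A < Grade.AAA

# Interval: differences are defined, ratios are not.
d1, d2 = date(2006, 1, 19), date(2006, 2, 19)
span_days = (d2 - d1).days
try:
    d2 / d1  # dividing one date by another is not supported
    ratio_defined = True
except TypeError:
    ratio_defined = False

# Ratio: zero is fixed, so division is meaningful.
length_a, length_b = 2.0, 4.0  # metres
length_ratio = length_b / length_a

print(same_fruit, lower_grade, span_days, ratio_defined, length_ratio)
```

Python's `date` type behaves like an interval variable here: subtraction yields a span, while division raises `TypeError`.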

27 From data models to N, O, Q data types
- Data model: 32.5, 54.0, -17.3, ... (floats)
- Conceptual model: Temperature (°C)
- Data type
  - Burned vs. not burned (N)
  - Hot, warm, cold (O)
  - Continuous range of values (Q)

28 Example
Sepal and petal lengths and widths for three species of iris [Fisher 1936].

29 Example
Sepal and petal lengths and widths for three species of iris [Fisher 1936].

30 Relational data model
- Represent data as a table (relation)
- Each row (tuple) represents a single record
  - Each record is a fixed-length tuple
- Each column (attribute) represents a single variable
  - Each attribute has a name and a data type
- A table's schema is the set of names and data types
- A database is a collection of tables (relations)
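The relational vocabulary above can be sketched with a typed tuple. This is an illustration, not the course's code: the `Iris` class, its attribute names, and the three rows are assumptions chosen to echo the iris example, and the rows are illustrative values rather than data from the slides.

```python
from typing import NamedTuple, get_type_hints


class Iris(NamedTuple):
    """One record: a fixed-length tuple with named, typed attributes."""
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float
    species: str


# A relation (table) is a collection of tuples sharing one schema.
relation = [
    Iris(5.1, 3.5, 1.4, 0.2, "setosa"),
    Iris(7.0, 3.2, 4.7, 1.4, "versicolor"),
    Iris(6.3, 3.3, 6.0, 2.5, "virginica"),
]

# The schema is the set of attribute names and their data types.
schema = {name: typ.__name__ for name, typ in get_type_hints(Iris).items()}

# A database is a collection of named relations.
database = {"iris": relation}

print(schema)
```

Every row has the same length and the same attribute order, which is what "fixed-length tuple" means in practice.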

31 Relational Algebra [Codd]
Data transformations (SQL)
- Projection (select)
- Selection (where)
- Sorting (order by)
- Aggregation (group by, sum, min, ...)
- Set operations (union, ...)
- Combine (inner join, outer join, ...)
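Several of the transformations listed above can be tried with Python's built-in `sqlite3` module. This is a minimal sketch: the `sales` table, its columns, and its rows are hypothetical and exist only to make the queries runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", "apples", 10.0),
        ("east", "pears", 5.0),
        ("west", "apples", 7.0),
        ("west", "pears", 3.0),
    ],
)

# Projection (SELECT columns), selection (WHERE), sorting (ORDER BY).
east = conn.execute(
    "SELECT product, amount FROM sales WHERE region = 'east' ORDER BY amount"
).fetchall()

# Aggregation (GROUP BY with SUM).
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Set operation (UNION removes duplicates across both queries).
labels = conn.execute(
    "SELECT region FROM sales UNION SELECT product FROM sales ORDER BY 1"
).fetchall()

print(east)
print(totals)
print(labels)
conn.close()
```

Joins follow the same pattern with a second table and a `JOIN ... ON` clause.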

32 Statistical data model
- Variables or measurements
- Categories or factors or dimensions
- Observations or cases

33 Dimensions and Measures
- Dimensions: discrete variables describing data
  - Dates, categories of values (independent vars)
- Measures: data values that can be aggregated
  - Numbers to be analyzed (dependent vars)
  - Aggregate as sum, count, average, std. deviation
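Grouping by a dimension and aggregating a measure can be sketched in plain Python. The `aggregate` helper, the field names, and the four records below are hypothetical; the population standard deviation (`pstdev`) is one possible choice for "std. deviation".

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical records: "year" and "category" are dimensions,
# "value" is the measure.
records = [
    {"year": 2000, "category": "a", "value": 4.0},
    {"year": 2000, "category": "b", "value": 6.0},
    {"year": 2010, "category": "a", "value": 1.0},
    {"year": 2010, "category": "b", "value": 3.0},
]


def aggregate(rows, dimension, measure):
    """Group rows by one dimension and summarize one measure per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[measure])
    return {
        key: {
            "sum": sum(vals),
            "count": len(vals),
            "average": mean(vals),
            "std": pstdev(vals),
        }
        for key, vals in groups.items()
    }


by_year = aggregate(records, "year", "value")
by_category = aggregate(records, "category", "value")

print(by_year)
```

The same measure can be rolled up along any dimension; only the grouping key changes.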

34 Example: U.S. Census Data
- People: # of people in group
- Year: (every decade)
- Age:
- Sex: Male, Female
- Marital Status: Single, Married, Divorced, ...

35 Example: U.S. Census
Columns: People, Year, Age, Sex, Marital Status
2348 data points

36 Census: N, O, Q (R/I)?
- People Count: Q-Ratio
- Year: Q-Interval (O)
- Age: Q-Ratio (O)
- Sex (M/F): N
- Marital Status: N

37 Census: Measure or Dimension?
- People Count: Measure
- Year: Dimension
- Age: Dimension
- Sex (M/F): Dimension
- Marital Status: Dimension
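The two census classifications above can be recorded together as field metadata. This is a sketch only: the `Field` class and the lowercase field names are assumptions, while the type and role assigned to each field follow slides 36 and 37.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    data_type: str  # "N", "O", "Q-Interval", or "Q-Ratio"
    role: str       # "measure" or "dimension"


CENSUS_FIELDS = [
    Field("people", "Q-Ratio", "measure"),
    Field("year", "Q-Interval", "dimension"),   # may also be treated as O
    Field("age", "Q-Ratio", "dimension"),       # may also be treated as O
    Field("sex", "N", "dimension"),
    Field("marital_status", "N", "dimension"),
]

measures = [f.name for f in CENSUS_FIELDS if f.role == "measure"]
dimensions = [f.name for f in CENSUS_FIELDS if f.role == "dimension"]

print("measures:", measures)
print("dimensions:", dimensions)
```

Keeping type and role side by side makes it easy to see that a quantitative field such as age can still act as a dimension when it is used for grouping.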


More information

8.NS.1 8.NS.2. 8.EE.7.a 8.EE.4 8.EE.5 8.EE.6

8.NS.1 8.NS.2. 8.EE.7.a 8.EE.4 8.EE.5 8.EE.6 Standard 8.NS.1 8.NS.2 8.EE.1 8.EE.2 8.EE.3 8.EE.4 8.EE.5 8.EE.6 8.EE.7 8.EE.7.a Jackson County Core Curriculum Collaborative (JC4) 8th Grade Math Learning Targets in Student Friendly Language I can identify

More information

Part I, Chapters 4 & 5. Data Tables and Data Analysis Statistics and Figures

Part I, Chapters 4 & 5. Data Tables and Data Analysis Statistics and Figures Part I, Chapters 4 & 5 Data Tables and Data Analysis Statistics and Figures Descriptive Statistics 1 Are data points clumped? (order variable / exp. variable) Concentrated around one value? Concentrated

More information

DLM Mathematics Year-End Assessment Model Blueprint

DLM Mathematics Year-End Assessment Model Blueprint DLM Mathematics Year-End Assessment Model 2018-19 Blueprint In this document, the blueprint refers to the range of Essential Elements (s) that will be assessed during the spring 2019 assessment window.

More information

Benjamin Adlard School 2015/16 Maths medium term plan: Autumn term Year 6

Benjamin Adlard School 2015/16 Maths medium term plan: Autumn term Year 6 Benjamin Adlard School 2015/16 Maths medium term plan: Autumn term Year 6 Number - Number and : Order and compare decimals with up to 3 decimal places, and determine the value of each digit, and. Multiply

More information

The Semiology of Graphics Pat Hanrahan Stanford University Representations

The Semiology of Graphics Pat Hanrahan Stanford University Representations The Semiology of Graphics 2 Pat Hanrahan Stanford University Representations Page 1 Number Scrabble [Simon] Given: The numbers 1 through 9 Goal: Pick three numbers that sum to 15 Number Scrabble [Simon]

More information

Stats 170A: Project in Data Science Exploratory Data Analysis: Clustering Algorithms

Stats 170A: Project in Data Science Exploratory Data Analysis: Clustering Algorithms Stats 170A: Project in Data Science Exploratory Data Analysis: Clustering Algorithms Padhraic Smyth Department of Computer Science Bren School of Information and Computer Sciences University of California,

More information

Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Knowledge Discovery and Data Mining Unit # 2 Sajjad Haider Spring 2010 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric

More information

CSC Advanced Scientific Computing, Fall Numpy

CSC Advanced Scientific Computing, Fall Numpy CSC 223 - Advanced Scientific Computing, Fall 2017 Numpy Numpy Numpy (Numerical Python) provides an interface, called an array, to operate on dense data buffers. Numpy arrays are at the core of most Python

More information

MATH& 146 Lesson 8. Section 1.6 Averages and Variation

MATH& 146 Lesson 8. Section 1.6 Averages and Variation MATH& 146 Lesson 8 Section 1.6 Averages and Variation 1 Summarizing Data The distribution of a variable is the overall pattern of how often the possible values occur. For numerical variables, three summary

More information

CS317 File and Database Systems

CS317 File and Database Systems CS317 File and Database Systems Lecture 3 Relational Model & Languages Part-1 September 7, 2018 Sam Siewert More Embedded Systems Summer - Analog, Digital, Firmware, Software Reasons to Consider Catch

More information

Part I. Fill in the blank. 2 points each. No calculators. No partial credit

Part I. Fill in the blank. 2 points each. No calculators. No partial credit Math 108 (105) Final Exam Page 1 Spring 2015 Part I. Fill in the blank. 2 points each. No calculators. No partial credit 1) Fill in the blank a) 2 8 h) 5 0 21 4 b) 5 7 i) 8 3 c) 2 3 = j) 2 7 d) The additive

More information

USING SOFT COMPUTING TECHNIQUES TO INTEGRATE MULTIPLE KINDS OF ATTRIBUTES IN DATA MINING

USING SOFT COMPUTING TECHNIQUES TO INTEGRATE MULTIPLE KINDS OF ATTRIBUTES IN DATA MINING USING SOFT COMPUTING TECHNIQUES TO INTEGRATE MULTIPLE KINDS OF ATTRIBUTES IN DATA MINING SARAH COPPOCK AND LAWRENCE MAZLACK Computer Science, University of Cincinnati, Cincinnati, Ohio 45220 USA E-mail:

More information

Data Mining Concepts & Techniques

Data Mining Concepts & Techniques Data Mining Concepts & Techniques Lecture No. 02 Data Processing, Data Mining Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology

More information

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES Chapter 2 2.1 Objectives 2.1 What Are the Types of Data? www.managementscientist.org 1. Know the definitions of a. Variable b. Categorical versus quantitative

More information

A Simple Guide to Using SPSS (Statistical Package for the. Introduction. Steps for Analyzing Data. Social Sciences) for Windows

A Simple Guide to Using SPSS (Statistical Package for the. Introduction. Steps for Analyzing Data. Social Sciences) for Windows A Simple Guide to Using SPSS (Statistical Package for the Social Sciences) for Windows Introduction ٢ Steps for Analyzing Data Enter the data Select the procedure and options Select the variables Run the

More information

IENG484 Quality Engineering Lab 1 RESEARCH ASSISTANT SHADI BOLOUKIFAR

IENG484 Quality Engineering Lab 1 RESEARCH ASSISTANT SHADI BOLOUKIFAR IENG484 Quality Engineering Lab 1 RESEARCH ASSISTANT SHADI BOLOUKIFAR SPSS (Statistical package for social science) Originally is acronym of Statistical Package for the Social Science but, now it stands

More information

Machine Learning. Decision Trees. Le Song /15-781, Spring Lecture 6, September 6, 2012 Based on slides from Eric Xing, CMU

Machine Learning. Decision Trees. Le Song /15-781, Spring Lecture 6, September 6, 2012 Based on slides from Eric Xing, CMU Machine Learning 10-701/15-781, Spring 2008 Decision Trees Le Song Lecture 6, September 6, 2012 Based on slides from Eric Xing, CMU Reading: Chap. 1.6, CB & Chap 3, TM Learning non-linear functions f:

More information

Week 2: Frequency distributions

Week 2: Frequency distributions Types of data Health Sciences M.Sc. Programme Applied Biostatistics Week 2: distributions Data can be summarised to help to reveal information they contain. We do this by calculating numbers from the data

More information

74 Wyner Math Academy I Spring 2016

74 Wyner Math Academy I Spring 2016 74 Wyner Math Academy I Spring 2016 CHAPTER EIGHT: SPREADSHEETS Review April 18 Test April 25 Spreadsheets are an extremely useful and versatile tool. Some basic knowledge allows many basic tasks to be

More information

Perception Maneesh Agrawala CS : Visualization Fall 2013 Multidimensional Visualization

Perception Maneesh Agrawala CS : Visualization Fall 2013 Multidimensional Visualization Perception Maneesh Agrawala CS 294-10: Visualization Fall 2013 Multidimensional Visualization 1 Visual Encoding Variables Position Length Area Volume Value Texture Color Orientation Shape ~8 dimensions?

More information

CORE BODY OF KNOWLEDGE MATH GRADE 6

CORE BODY OF KNOWLEDGE MATH GRADE 6 CORE BODY OF KNOWLEDGE MATH GRADE 6 For each of the sections that follow, students may be required to understand, apply, analyze, evaluate or create the particular concepts being taught. Course Description

More information

ECLT 5810 Data Preprocessing. Prof. Wai Lam

ECLT 5810 Data Preprocessing. Prof. Wai Lam ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate

More information

Data Exploration and Preparation Data Mining and Text Mining (UIC Politecnico di Milano)

Data Exploration and Preparation Data Mining and Text Mining (UIC Politecnico di Milano) Data Exploration and Preparation Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References Jiawei Han and Micheline Kamber, "Data Mining, : Concepts and Techniques", The Morgan Kaufmann

More information

Implementation of Relational Operations

Implementation of Relational Operations Implementation of Relational Operations Module 4, Lecture 1 Database Management Systems, R. Ramakrishnan 1 Relational Operations We will consider how to implement: Selection ( ) Selects a subset of rows

More information