Cleaning the data: Who should do What, When? José Antonio Mejía Inter American Development Bank SDS/POV MECOVI Program February 28, 2001
2 Precious resource Better to answer these questions than to have to wonder: Who? Where? With what? When? Why? Integrated surveys cannot afford to lose observations. The integrity of the data must be preserved.
3 Who does What? Data quality control has been pushed back as close to production as possible. The best person to correct errors is the informant. There is still some work that needs to be done at the central office so that the final data files are ready for analysis and distribution: definitely in terms of organization, but also some in terms of data quality.
4 Do the job while the questionnaires are still at hand. Go back to them, check and re-check them. The work done at the central office should be one of CLEANING, not of CHANGING. Cleaning is based on facts, on reviewing the questionnaires. Changing is altering the data. I don't think that is correct.
5 Data quality checks after data entry (Who: Data manager in central office) Gather all files with household data and verify that all households (hh) are included without duplication. A good system of hh identifiers should facilitate this process. Convert household-based files into thematic files more useful for analysis. Files should be converted to the format of the software that will be used for analysis, typically one of the popular statistical packages (SAS, STATA, SPSS). Always keep a master version of the files in ASCII.
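The household check described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `check_household_ids`, the ID format and the sample lists are hypothetical and do not come from the presentation.

```python
from collections import Counter


def check_household_ids(expected_ids, file_ids):
    """Compare the household IDs found in the data files with the sample list.

    Both arguments are plain lists of identifiers (hypothetical format).
    """
    counts = Counter(file_ids)
    # IDs entered more than once
    duplicated = sorted(hh for hh, n in counts.items() if n > 1)
    # IDs in the sample list that never reached the data files
    missing = sorted(set(expected_ids) - set(counts))
    # IDs in the data files that are not in the sample list
    unexpected = sorted(set(counts) - set(expected_ids))
    return {"duplicated": duplicated, "missing": missing, "unexpected": unexpected}


# Hypothetical sample list and entered IDs, for illustration only
sample_list = ["001-01", "001-02", "001-03", "002-01"]
entered = ["001-01", "001-02", "001-02", "002-01", "003-09"]

report = check_household_ids(sample_list, entered)
print(report)
```

Each ID the report lists is a reason to go back to the questionnaires, not a value to be corrected automatically.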
6 Data quality checks after data entry Check structural consistency of files, so that different thematic files with hh data can be matched with each other, with individual data and community data. Compile basic univariate statistics for each variable: frequencies for qualitative variables and, for quantitative variables, the minimum, maximum and mean values. Issues with logical consistency have been taken care of by a good data entry program and concurrent data entry.
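A minimal sketch of these two checks, assuming each thematic file has been read into a list of dictionaries keyed by variable name. The helper names, the `hhid` key and the toy records are hypothetical.

```python
from collections import Counter
from statistics import mean


def unmatched_keys(child_rows, parent_rows, key):
    """Return key values present in child_rows but absent from parent_rows."""
    parent_keys = {row[key] for row in parent_rows}
    return sorted({row[key] for row in child_rows} - parent_keys)


def frequencies(rows, var):
    """Frequency table for a qualitative variable."""
    return Counter(row[var] for row in rows)


def min_max_mean(rows, var):
    """Minimum, maximum and mean of a quantitative variable, skipping missing values."""
    values = [row[var] for row in rows if row[var] is not None]
    return min(values), max(values), mean(values)


# Hypothetical household-level and individual-level files
households = [
    {"hhid": 1, "roof": "tile"},
    {"hhid": 2, "roof": "zinc"},
    {"hhid": 3, "roof": "tile"},
]
persons = [
    {"hhid": 1, "age": 34},
    {"hhid": 1, "age": 6},
    {"hhid": 2, "age": 51},
    {"hhid": 4, "age": 29},  # no matching household record
]

print(unmatched_keys(persons, households, "hhid"))
print(frequencies(households, "roof"))
print(min_max_mean(persons, "age"))
```

Here the individual file holds a person whose household (4) is absent from the household file, which is exactly the kind of structural problem that prevents the files from being matched.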
7 Data quality checks after data entry Any issues left related to logical consistency should be left to the analysts, because there is no consensus as to how to identify or treat outliers and missing observations. Best to give the raw data and let each analyst perform whatever cleaning she/he thinks best.
8 Data quality checks after data entry The statistical office, as an analyst, can do some more data cleaning. If the office chooses to make public its corrections, imputed values, treatment of outliers and missing values, and aggregated variables, it should always do it with the proper labels and tags, plus a well documented account of what was done. And, of course, this data should be distributed in addition to, not instead of, the original data.
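One way to tag imputed values while leaving the original variable untouched is sketched below. Median imputation is used purely as a stand-in (the presentation does not prescribe any imputation method), and the `_imp` and `_flag` suffixes are hypothetical naming choices.

```python
from statistics import median


def impute_with_flag(rows, var):
    """Add var_imp (imputed copy) and var_flag (1 = imputed) next to the original var.

    The original variable is never overwritten. Median imputation is an
    illustrative assumption, not a recommendation from the slides.
    """
    observed = [r[var] for r in rows if r[var] is not None]
    fill = median(observed)
    out = []
    for r in rows:
        new = dict(r)  # copy, so the raw record stays as it was
        was_missing = r[var] is None
        new[var + "_imp"] = fill if was_missing else r[var]
        new[var + "_flag"] = 1 if was_missing else 0
        out.append(new)
    return out


# Hypothetical raw records with one missing income
raw = [
    {"hhid": 1, "income": 120},
    {"hhid": 2, "income": None},
    {"hhid": 3, "income": 300},
    {"hhid": 4, "income": 180},
]

tagged = impute_with_flag(raw, "income")
print(tagged[1])
```

Because the raw value, the imputed value and the flag sit side by side, a user who disagrees with the imputation can simply ignore the `_imp` column.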
9 Data cleaning Minimize dirty data: focus on training, field work, and data entry. Document changes: keep program files with an explanation of all changes made to the raw data and of the construction of variables. Maintain original data: give users the option to change assumptions. Use robust estimation techniques: use statistics that are relatively insensitive to outliers (e.g. median instead of mean).
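The point about robust statistics can be seen with a small made-up example: a single keying error moves the mean a long way but leaves the median where it was. The figures below are invented for illustration.

```python
from statistics import mean, median

# Hypothetical monthly expenditures for five households
clean = [210, 230, 250, 270, 290]

# Same data with one keying error (290 entered as 29000)
dirty = clean[:-1] + [29000]

print("clean:", mean(clean), median(clean))
print("dirty:", mean(dirty), median(dirty))
```

With the erroneous value, the mean jumps from 250 to 5992 while the median stays at 250.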
10 Distribution Organize data by section/section sub-part, in one-to-one correspondence with the questionnaire. File naming convention should be informative, transparent and well documented. Add variable and value labels. Links between variables in questionnaire and data files should be clear and well documented. Distribute original data files with good and complete documentation: users are analysts, not detectives.
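As a sketch of what a transparent link between questionnaire and data file could look like, the snippet below derives variable names from the section and question numbers and attaches variable and value labels. The `sNNqNN` pattern, the question chosen and the labels are all hypothetical; the presentation does not specify a naming scheme.

```python
def variable_name(section, question):
    """Build a variable name that points back to the questionnaire (hypothetical scheme)."""
    return f"s{section:02d}q{question:02d}"


# Hypothetical labels for section 2, question 5
VARIABLE_LABELS = {variable_name(2, 5): "Main material of the roof"}
VALUE_LABELS = {variable_name(2, 5): {1: "Tile", 2: "Zinc sheet", 3: "Thatch"}}


def describe(var, code):
    """Render a coded value together with its variable and value labels."""
    label = VARIABLE_LABELS[var]
    meaning = VALUE_LABELS[var].get(code, "undocumented code")
    return f"{var} ({label}): {code} = {meaning}"


print(describe("s02q05", 2))
```

A code that is missing from the value labels is reported as undocumented rather than silently passed through, which makes gaps in the documentation visible before distribution.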
11 Key messages Extremely important steps! Bad data is just bad data. Good data badly organized is bad data. Try to catch errors early. Documentation is essential. Cleaning is very different from changing. Learn from mistakes.