Sampling for Single and Multi-Mode Surveys Using Address-Based Sampling

Sampling for Single and Multi-Mode Surveys Using Address-Based Sampling
Colm O'Muircheartaigh
NORC at the University of Chicago and University of Chicago Harris School
NSF Conference: The Future of Survey Research
Arlington, VA, 3 October 2012

Outline
1. Introduction
2. The (C)DSF
   - Characteristics
   - Multimode
3. Multi-mode ABS
   - Mode choice
   - Sequences and costs
Conclusion

Sampling Purpose
- Identifying and selecting potential sample members: sample from frame
- Choosing a mode of data collection: in-person, phone, mail, web

Population Coverage
[Diagram labels: Frame, Population, Overcoverage, Undercoverage]

Address-Based Sampling: Big Picture 1
- Possible universe concepts
- Model A
  - Could include all possible telephone numbers: RDD plus cell
  - But lots of blanks (non-existent and ineligible numbers)
- Model B
  - Can now do something similar for residential addresses
  - (Computerized) Delivery Sequence File ("(C)DSF") gets close
  - Much, much less overcoverage than telephone frame, however
- Model C
  - Internet addresses

Address-Based Sampling: Big Picture 2
- Start from an address as basic frame unit
  - Addresses contain households
  - Households contain members
  - Some members belong to multiple households/addresses
  - What is population of interest?
- Address is tied to housing unit
  - Street address or Rural Route number
  - Usually: PO Box
- New challenge for telephones and web
  - Wire-line phones assume household
  - Mobile phones mostly individual
  - Web relationship to households and individuals can be quite complex

Multimode Requirements
1. Identical or at least compatible frames: necessary for inference
2. Compatible sample design: necessary for case transfer and control

An Available Address List: The DSF or CDSF
- US Postal Service delivery sequence file
  - All addresses receiving mail in USA
  - DSF* has 98% coverage of USA (Link et al. 2008)
  - May be closer to 100% now
- The (Computerized) Delivery Sequence File
  - For each subdivision of the Postal Service down to the individual mail carrier
  - For each mail carrier's delivery route: all the addresses in the order in which the mail is delivered by the mail carrier

The DSF 1: What It Is
- Simply mailing addresses
  - Street-style addresses
  - PO BOXes (OWGM flag)
  - Rural Route BOXes
  - Drop points
  - Businesses
  - Throwback
  - Vacant

The DSF 2: Source
- Organizational tool for USPS
  - All mailable addresses in urban and suburban areas
  - All non-vacant mailable addresses in rural areas
  - Updated by individual carriers via edit books
- Available through licensees
  - Direct mail, list compilers
- USPS provides licensing arrangements
  1. Allows comparison of list to DSF but not supplement: less expensive, less complete
  2. Corrects missing addresses where at least 90% complete by ZIP
- License from a vendor: Valassis, MSG, SSI, CIS
  - If same category of license, very similar

[Screenshot of DSF]

The DSF 3: Survey Research Potential
- Application to survey research
  - Basis for US Census Master Address File (MAF)
  - Addresses in standard format
  - Operational incentives for updating
  - Can be geocoded and mapped, except in rural areas and for Post Office Box addresses
  - Geocoding error affects coverage (Eckman and English 2011)
- Coverage and utility have undergone evaluation since early 2000s (Iannacchione et al. 2003, O'Muircheartaigh et al. 2003)
  - Covers mid to upper 90s percent of US households for face-to-face
  - Missing: simplified addresses, PO BOXes, vacants
  - Expect E911 conversion and advances in geocoding to raise coverage

The DSF 4: Needs Manipulation
- DSF not a sampling frame by design
  - Requires processing
- Is organized by Postal geographies
  - ZIP code
  - ZIP + 4
  - Carrier Route
  - Walk Sequence
- Geocoding to associate with Census areas
  - Tract
  - Block Group
  - Block
- One decides which addresses to include
  - Vacant, seasonal, college
  - PO BOX, RR BOX
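The inclusion decision on this slide can be sketched as a simple filter. This is only an illustration: the record layout, the field names, the `kind` codes, and the default of dropping business addresses are all assumptions made for the example, not the actual DSF file format or any vendor's rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DsfRecord:
    """Hypothetical, simplified stand-in for one address record."""
    address_id: str
    kind: str = "street"      # assumed codes: "street", "po_box", "rr_box"
    vacant: bool = False
    seasonal: bool = False
    business: bool = False


def build_frame(records, include_vacant=False, include_seasonal=False,
                include_box=False):
    """Return the records kept on the sampling frame under the given rules."""
    frame = []
    for rec in records:
        if rec.business:
            continue  # assumption: a household survey drops business addresses
        if rec.vacant and not include_vacant:
            continue
        if rec.seasonal and not include_seasonal:
            continue
        if rec.kind in ("po_box", "rr_box") and not include_box:
            continue
        frame.append(rec)
    return frame


records = [
    DsfRecord("a1"),
    DsfRecord("a2", vacant=True),
    DsfRecord("a3", kind="po_box"),
    DsfRecord("a4", business=True),
]
face_to_face_frame = build_frame(records, include_vacant=True)
mail_frame = build_frame(records, include_box=True)
```

Keeping the rules as explicit switches makes it easy to build a different frame per mode from the same source list, for example vacants in for face-to-face and PO BOXes in for mail.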

The DSF 5: Why the DSF is Important and Relevant
- Can be frame source for
  - Mail
  - In-person / face-to-face (FtF)
  - Phone (if matched)
  - Web (if matched)
- For mail: excellent
- For FtF: a departure from the past, where survey frames required field listing for adequate coverage
- For phone: like directories linked to addresses
- For web: dubious value

Why Multimode?
- RDD deteriorated
  1. Under-coverage of target population: cell-only households
  2. Low response rates
  3. Not geographically targeted
- Entirely FtF considered not cost-effective
- Mail resuscitated, but mail has its own drawbacks

Multimode 1
- Multi-mode becoming more the norm (Couper 2011)
  - Proliferation: compatibility across modes more feasible than it used to be due to technology
  - Complexity: computers can handle rules
- Rise of mixed-mode surveys to counteract coverage and response rate problems of single mode
- Benefits of multi-mode surveys
  - Improved response rates
  - But providing simultaneous choice does not necessarily improve them (Millar and Dillman, 2011)

Multimode 2
- Drawbacks of multi-mode surveys
  - Mode effects
  - Complexity
- Distinguish between respondent capture and data capture
- If have multiple waves (or panel), think differently

Closed population vs. General population
- Closed population with full multimode access
  - e.g. employees in a single organization
  - No external frame needed
  - Have email, mail, telephone, office address
- General population
  - Need to identify a comprehensive frame for at least one mode
  - Build crosswalks between it and other modal frames
- Here: the general population

Face-to-Face Surveys
- Require clustering of sample if population geographically dispersed
- Eventually small area as ultimate sampling unit
  - Blocks, block groups, tracts, etc.
- Within these areas need to have a list of addresses as a sampling frame
  - External source, if available
  - Otherwise must list

Key to mode choice
- Complexity, expense, timing all have an impact
- Mode may affect results unpredictably
- Can phone anywhere for same cost
- Can mail (almost) anywhere for same cost
  - Upgrade to premium services (FedEx, UPS)
- Cost is variable for face-to-face
  - Frame construction and travel
- How about using more than one mode?

Multimode Requirements
1. An identical or at least compatible frame: necessary for inference
2. Compatible sample design: necessary for case transfer and control

Practical Considerations of DSF
- Varies depending on mode
- Drop points
  - Acceptable for face-to-face
  - Does not work for mail, telephone
- PO BOX
  - Acceptable for mail (with geographic ambiguity)
  - Does not work for face-to-face, telephone
- Geocoding can misplace units
- Lag with new construction
- Missing vacant units
- Inconsistencies in numbers of units, numbering
- All require consideration for survey implementation

Drop Points
- Important flag is Drop
  - Delivery units that receive mail through one shared receptacle
  - 1.9% of city-style addresses overall
- Substantial share in some cities
  - 9% New York MSA, 20% New York City
  - 5% Chicago MSA, 15% City of Chicago
  - 4% Boston MSA, 10% City of Boston
- Issue is clustered in certain areas
  - Age of housing
  - Some Chicago neighborhoods 60% drop delivery
  - Blocks entirely
- Cannot mail, match reliably
- Undercoverage risk
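Because drop points cluster in certain areas, one practical check is to tabulate the drop-point share by area before choosing modes. The following is a minimal sketch under stated assumptions: the input is a list of `(area, is_drop_point)` pairs, and the 10% review threshold is purely illustrative, not a recommended cutoff.

```python
from collections import defaultdict


def drop_point_share(addresses):
    """Map each area to the fraction of its addresses flagged as drop points.

    `addresses` is an iterable of (area, is_drop_point) pairs; the pair
    layout is an assumption made for this example.
    """
    totals = defaultdict(int)
    drops = defaultdict(int)
    for area, is_drop in addresses:
        totals[area] += 1
        if is_drop:
            drops[area] += 1
    return {area: drops[area] / totals[area] for area in totals}


def flag_areas(shares, threshold=0.10):
    """List areas whose drop-point share meets or exceeds the threshold."""
    return sorted(area for area, share in shares.items() if share >= threshold)


sample = [
    ("tract_A", True), ("tract_A", False),
    ("tract_B", False), ("tract_B", False),
]
shares = drop_point_share(sample)
needs_review = flag_areas(shares)
```

Areas that come out flagged are the ones where mail or telephone matching is least reliable and where in-person listing may be worth its cost.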

Example Drop Points

Non City-Style Addresses
- PO BOX, RR BOX addresses
  - Approx. 12% of residential DSF
- Do not provide direct link to housing units
  - Geocode to post office location
  - Duplication if include all PO BOX locations
- OWGM flag
  - 10% of PO BOXes
- Problematic phone matching

Telephone Matching
- Vendors match telephone numbers to addresses
  - MSG, SSI, Valassis
- Expect 60%+ for national sample
  - Higher in single-family, listed, owner-occupied
- Degrees of confidence
  - Can raise match rate artificially if don't screen for accuracy
- Problematic: multi-unit buildings, non-city-style
  - Multi-unit can have one-to-many matching
  - Unreliable probabilities of selection
- Favors listed, single-family, home-owner
  - Similar to RDD

Mail Return Rates
- Variable mail return
- Mail return has cascading effect on response rate, costs

GIS Limitations
- Geographic information systems (GIS) fundamental to ABS
  - Geocode large address lists
  - Intersect with areal information (boundaries)
- Error in either process has downstream effects
  - Geocoding error: put address in wrong place
  - Offset error: address in right place, wrong boundary
  - Creates "side of street" problems
  - Source of urban undercoverage (Eckman and English forthcoming)

Sequence Protocol
- Mode combination requires case flow with clear sequence
- Order
  - Mail first? Phone first?
  - Field usually last
- Transfer rules
  - When to switch modes?
  - How many attempts?
- Transfer timing
  - What lag in between modes?

Multi-Mode ABS Case Flow: Telephone First
[Flowchart nodes, in order: Address Sample; Telephone Matching (Match?); Advance Letter; CATI (Response?); SAQ (Response?); CAPI (Response?); Resp]

Multi-Mode ABS Case Flow: Mail First
[Flowchart nodes, in order: Address Sample; SAQ (Response?); Telephone Matching (Match?); Advance Letter; CATI (Resp?); CAPI; Resp]
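The two case flows can be sketched as a small routing function. The flowcharts did not survive transcription, so the transitions below are one plausible reading of the node order rather than the presenter's exact diagrams: in particular, sending unmatched telephone-first cases straight to the SAQ, and unmatched mail-first nonrespondents straight to CAPI, are assumptions, as are all the function and step names.

```python
# Steps at which a case can actually respond; the others are processing steps.
CONTACT_MODES = {"CATI", "SAQ", "CAPI"}


def planned_steps(first_mode, phone_matched):
    """Ordered steps a case would pass through if it never responded."""
    phone_leg = ["telephone_matching"]
    if phone_matched:
        phone_leg += ["advance_letter", "CATI"]
    if first_mode == "telephone":
        return ["address_sample"] + phone_leg + ["SAQ", "CAPI"]
    if first_mode == "mail":
        return ["address_sample", "SAQ"] + phone_leg + ["CAPI"]
    raise ValueError(f"unknown first mode: {first_mode!r}")


def run_case(first_mode, phone_matched, responds_to):
    """Trace one case until it responds or the sequence is exhausted.

    `responds_to` is the set of contact modes this hypothetical case
    would answer if reached.
    """
    path = []
    for step in planned_steps(first_mode, phone_matched):
        path.append(step)
        if step in CONTACT_MODES and step in responds_to:
            path.append("respondent")
            return path
    path.append("nonrespondent")
    return path


telephone_first = run_case("telephone", True, {"SAQ"})
mail_first = run_case("mail", False, {"CAPI"})
```

Writing the sequence as data (a list of steps) rather than nested conditionals makes it straightforward to change the order or add transfer rules such as attempt limits and lags between modes.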

Case flow overview
- Different starting points
  - Mail first
  - Phone first
  - Distinguish from advance letters, not mail data collection
- Flow can be affected by
  - Drop points
  - Phone matching
  - Mail return rates
- What to do if not going in-person

Costs in ABS
- ABS is flexible
- Costs are a major component of ABS decisions
  - When to mail
  - When to dial
  - Include cell phones?
  - How many to visit in-person
- Cost considerations
  - Incidence of target population: households vs. rare group
  - Type of mailing: USPS bulk vs. premium service
  - Incentives: how much, up front?
  - Subsampling for in-person

Costs for American Community Survey
- Mail: $13.76
  - Mail first to all
- Telephone: $18.54
  - Telephone to mail nonrespondents who could be matched
- In-Person: $143.60
  - 10% subsample of nonrespondents to mail and telephone
- BRFSS had similar cost ratios between mail and phone
  - Mail higher RRs in 5 of 6 test states (Link et al. 2008)
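To see how these figures combine in a sequential design, the sketch below computes an expected cost per sampled address. It rests on several assumptions that the slide does not state: that each dollar figure is a cost per case worked in that mode, and that the mail response rate, telephone match rate, and telephone response rate are inputs the user supplies (the values in the usage line are hypothetical, not ACS results).

```python
def expected_cost_per_address(mail_rr, match_rate, phone_rr,
                              mail_cost=13.76, phone_cost=18.54,
                              field_cost=143.60, field_subsample=0.10):
    """Expected cost per sampled address for mail, then phone, then field.

    Assumes each cost applies per case worked in that mode (an
    interpretation of the slide, not something it states).
    """
    for name, value in (("mail_rr", mail_rr), ("match_rate", match_rate),
                        ("phone_rr", phone_rr),
                        ("field_subsample", field_subsample)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    # Everyone is mailed; only matched mail nonrespondents are phoned.
    phone_share = (1.0 - mail_rr) * match_rate
    # Cases still open after mail and phone: unmatched, or matched but silent.
    still_open = (1.0 - mail_rr) * (1.0 - match_rate * phone_rr)
    # Only a subsample of the remaining nonrespondents is visited in person.
    field_share = still_open * field_subsample

    return mail_cost + phone_share * phone_cost + field_share * field_cost


# Hypothetical rates chosen only to exercise the function.
cost = expected_cost_per_address(mail_rr=0.5, match_rate=0.6, phone_rr=0.5)
```

Varying `field_subsample` in a function like this shows how quickly the in-person stage comes to dominate total cost, which is the motivation for subsampling nonrespondents.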

Issue of Respondent Capture vs. Data Capture
- One-time cross-section vs. panel
  - Resources available
  - Recruitment is key to long-term response rates
- One mode as a means of encouraging response to another
  - Mail as a means of encouraging response
    - To phone or FtF (advance letter)
  - Phone as a means of encouraging response
    - Either to mail
    - Or to in-person
  - In-person to recruit long-term panel member
  - Other

Summary: Limitations of DSF and Survey Implications
- Overcoverage
  - Units that no longer exist
  - Misclassified businesses
  - Misgeocoded addresses
- Undercoverage
  - Drop points
  - Rural vacants
  - New construction
  - Simplified addresses
  - Incomplete ZIP codes

Augmenting the DSF
- Vendors can add demographic information at the address level
  - e.g., InfoUSA, MSG, Experian
  - Race/ethnicity
  - Income
- Market-research data
  - Car ownership
  - Magazine subscriptions
  - Etc.
- Question of coverage

The Future
- New technology
  - Satellite and other images
- The NoStats file
  - Rural vacancies
  - Some drop points
- Lattice designs
- Streams of information
  - Dispositions
  - Projections
  - Dynamic adjustments
  - Mode switching

What We Need: Flexibility and Versatility
- Database linkages
  - Frames
  - Sampling
- Hierarchical / relational data structures
  - Within units
  - Within instrument
- Tracking capacity
  - Rapid response to projections
  - Capacity to move partially completed cases across modes
- Version flexibility
  - Development of comparable stimuli for different modes
  - Question form, sequence,

Thank You!

The goal:
1) We know a lot about best practices - here's a lot of what we know.
2) But many major surveys are not yet conforming to best practices.
3) And there is a lot that we don't know and need to know and need future research to explore.