Small area statistics on de jure and de facto populations

Similar documents
Supporting Growth in Society and Industry Using Statistical Data from Mobile Terminal Networks Overview of Mobile Spatial Statistics

Telecommunications Services (As of end of June 2007)

State of Numbers of Subscribers to Telecommunications Services

Results of Telecommunications Usage Trend Survey 2007

RRFSS 2012: Cell Phone Use and Texting While Driving

3.1. 3x 4y = 12 3(0) 4y = 12. 3x 4y = 12 3x 4(0) = y = x 0 = 12. 4y = 12 y = 3. 3x = 12 x = 4. The Rectangular Coordinate System

Exercise 4.1. Create New Variables in a Shapefile. GIS Techniques for Monitoring and Evaluation of HIV/AIDS and Related Programs

Telephone Survey Response: Effects of Cell Phones in Landline Households

2 ICT in Aruba introduction Central Bureau of Statistics Aruba

Activity-Based Human Mobility Patterns Inferred from Mobile Phone Data: A Case Study of Singapore

CountryData Workshop Technologies for Data Exchange. Reference Metadata

Research Methods for Business and Management. Session 8a- Analyzing Quantitative Data- using SPSS 16 Andre Samuel

UNITED STATES SECURITIES AND EXCHANGE COMMISSION Washington, D.C FORM SD

Joint Research Project on Disaster Reduction using Information Sharing Technologies

Are dual-frame surveys necessary in market research?

Delivering On Canada s Broadband Commitment Presentation to OECD/WPIE Public Sector Broadband Procurement Workshop December 4, 2002

Section 2-2 Frequency Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc

A population grid for Andalusia Year Institute of Statistics and Cartography of Andalusia (IECA) Sofia (BU), 24 th October 2013

Spring Change Assessment Survey 2010 Final Topline 6/4/10 Data for April 29 May 30, 2010

Survey Questions and Methodology

How Young Children Are Watching TV and DVDs: From the June 2016 Rating Survey on Young Children's TV Viewing

United States Personal Device History and Forecast, April 2019

I. INTRODUCTION OBJECTIVES OF THE STUDY RESEARCH METHODOLOGY

Statistics: Normal Distribution, Sampling, Function Fitting & Regression Analysis (Grade 12) *

The Impact of Residential Broadband Traffic on Japanese ISP Backbones -- SRCCS Workshop on Internet Measurement, Modeling, and Analysis --

NANOS SURVEY NANOS SURVEY

Yokohama Partnership of Resources and Technologies (Y-PORT)

Before the FEDERAL COMMUNICATIONS COMMISSION Washington, D.C In the Matter of ) ) Telephone Number Portability ) CC Docket No.

Customer Insights Customer Insights

How Young Children Are Watching TV and DVDs: From the June 2013 Rating Survey on Young Children s TV Viewing

Analyzing data saves lives, II

Regionalism and Nationalism in Mobile Communications: A Comparison of East Asia and Europe

Development of Synthetic Microdata for Educational Use in Japan

US Geo-Explorer User s Guide. Web:

Review of Cartographic Data Types and Data Models

The Growing Gap between Landline and Dual Frame Election Polls

Communications Usage Trend Survey in 2005 Compiled

using GPS. As a result, the many location-related use is very popular. obtained beforehand to provide positioning granularity issues.

Disclosure of Quarterly Data concerning Competition Review in the Telecommunications Business Field

ICT Policy in Japan - Broadband and Mobile -

Chattanooga Cell Phone External O-D Matrix Development

Segmented or Overlapping Dual Frame Samples in Telephone Surveys

B. Graphing Representation of Data

Chapter 5. Track Geometry Data Analysis

Vodafone Analyst & Investor Day Monday 27 September 2004

WELCOME! Lecture 3 Thommy Perlinger

Mathematics Scope & Sequence Grade 7 Revised: June 9, 2017 First Quarter (38 Days)

Economic Evaluation of ehealth in Japan

Basic facts about Dudley

28 CHAPTER 2 Summarizing and Graphing Data

A User Authentication Based on Personal History - A User Authentication System Using History -

Acquiring and Processing NREL Wind Prospector Data. Steven Wallace, Old Saw Consulting, 27 Sep 2016

Predicting housing price

INTERNATIONAL TELECOMMUNICATION UNION Telecommunication Development Bureau Telecommunication Statistics and Data Unit

LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA

Paper SDA-11. Logistic regression will be used for estimation of net error for the 2010 Census as outlined in Griffin (2005).

Mobile phone exposures in children

Simulating Growth of Transportation Networks

2017 NEW JERSEY STATEWIDE SURVEY ON OUR HEALTH AND WELL BEING Methodology Report December 1, 2017

Financial Results for the 1 st Half of the Fiscal Year Ending March 2015 (from April to September, 2014) President Takashi Tanaka KDDI Corporation

Mr. Kongmany Chaleunvong. GFMER - WHO - UNFPA - LAO PDR Training Course in Reproductive Health Research Vientiane, 22 October 2009

Disclosure of Quarterly Data concerning Competition Review in the Telecommunications Business Field

Week 2: Frequency distributions

GALLUP NEWS SERVICE GALLUP POLL SOCIAL SERIES: VALUES AND BELIEFS

ITSMR Research Note. Crashes Involving Cell Phone Use and Distracted Driving KEY FINDINGS ABSTRACT INTRODUCTION. Crash Analyses.

Bulk Registration File Specifications

Development of the framework for disaster mitigating information sharing platform and its application to a local government

Neighbourhood Operations Specific Theory

Resilient Networks in Japan

Pilot Study for the WHOIS Accuracy Reporting System: Preliminary Findings

ITSMR RESEARCH NOTE EFFECTS OF CELL PHONE USE AND OTHER DRIVER DISTRACTIONS ON HIGHWAY SAFETY: 2006 UPDATE. Introduction SUMMARY

Consistent Measurement of Broadband Availability

CCSSM Curriculum Analysis Project Tool 1 Interpreting Functions in Grades 9-12

AMERICANS USE OF THE U.S. POSTAL SERVICE: AN AARP BULLETIN SURVEY

Geography 281 Map Making with GIS Project Three: Viewing Data Spatially

Age & Stage Structure: Elephant Model

Telecommunications Customer Satisfaction

HANDWRITTEN FORMS WILL NOT BE ACCEPTED APPLICATION MUST BE SINGLE SIDED DO NOT STAPLE. Directions for Completion of the IRB Application Form

Consistent Measurement of Broadband Availability

Mr. Takaharu Nakamura Acting Chairman, Technical Committee 5GMF and Fujitsu. Hosted by

How Young Children Are Watching TV and DVDs:

Executive Summary on. Privacy Awareness Survey on Smartphones. and Smartphone Apps

Product Guide Verizon Pennsylvania Inc. Verizon Pennsylvania Inc. Original Sheet 1 TABLE OF CONTENTS. Business Exchange Service...

Organizing and Summarizing Data

Chapter 2. Frequency distribution. Summarizing and Graphing Data

Objectives Learn how GMS uses rasters to support all kinds of digital elevation models and how rasters can be used for interpolation in GMS.

DOCUMENT NAME: etrakit Process Manual etrakit Process Manual

Survey Questions and Methodology

IP Paging Considered Unnecessary:

GEOGRAPHIC INFORMATION SYSTEMS Lecture 18: Spatial Modeling

Tracking and Evaluating Changes to Address-Based Sampling Frames over Time

Identity Proofing Standards and Beyond

Dual-Frame Weights (Landline and Cell) for the 2009 Minnesota Health Access Survey

Lecture 6: GIS Spatial Analysis. GE 118: INTRODUCTION TO GIS Engr. Meriam M. Santillan Caraga State University

Progress Report of NTT DOCOMO LTE Trials January 26, 2009 Takehiro Nakamura NTT DOCOMO

TDWI strives to provide course books that are contentrich and that serve as useful reference documents after a class has ended.

Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

GOSPEL HARVEST Quick Reference Guide

Open to Every Citizen

Holiday Shopping With Mobile Phones October 2010

Transcription:

Small area statistics on de jure and de facto populations - population census and operational data of mobile phone network Naoki Makita a, *, Masakazu Kimura a, Masayuki Terada b, Motonari Kobayashi b and Yuki Oyabu b a National Statistics Center b NTT DOCOMO R&D Center Application of Geostatistic (session 1) Tuesday October 8 11:10-12:10 Casa de Montejo 2, Centro Banamex Mexico City October 7-9 National Statistics Center Incorporated Administrative Agency

Contents 1. Introduction 2. Grid Square Statistics of Population Census 3. Mobile Spatial Statistics 4. Qualitative difference between GPC and MSS 5. Quantitative comparison of GPC and MSS 6. Deviation rate in Tokyo area 7. Concluding Remarks 2

1. Introduction p National statistics office(s) n The Statistics Bureau of Japan (SBJ) and the National Statistics Center (NSTAC) (SBJ's affiliate) p A mobile telecommunication company n NTT DOCOMO 3

1. Introduction p Population Census conducted by National Statistics Offices (i.e. INEGI, SBJ/NSTAC, US Census Bureau, etc.) p Small area statistics derived from Population Census n Census data by census tract / city block n Grid Square Statistics (Gridded population) :-) Essential and most popular data for geostatistics :-( Produced not very often Statistics users want data more frequently. 4

1. Introduction p A new kind of small area statistics: Mobile Spatial Statistics (MSS) n developed by NTT DOCOMO, derived from operational data of mobile phone network n would produce small area population estimates as frequently as on an hourly basis. Is it really reliable? 5

1. Introduction p NSTAC and DOCOMO have undertook a research to examine the validity of MSS n by identifying qualitative differences between GPC and MSS; n by calculating 'deviation rates' to quantitatively compare GPC and MSS p We concluded that MSS would be plausible for densely populated areas. 6

2. Grid Square Statistics of Population Census p Population Census of Japan n taken place every five years p Grid Square Statistics of Population Census (GPC) n compiled from the population counts by Census Enumeration District (CED) Population counts by CED 592 826 631 691 719 982 982 Boundary of CEDs 787 Grid lines based on latitude and longitude Assign population counts of CEDs to grid squares GPC Population counts by grid square Grid lines based on latitude and longitude 7

2. Grid Square Statistics of Population Census Total Population, all Japan at Basic Grid Square level GPC 2010 A Basis Grid Square is a square about 1km on a side 8

2. Grid Square Statistics of Population Census Proportion of Aged Population (65 years and older), All Japan at Basic Grid Square level GPC 2000 GPC 2010 9

3. Mobile Spatial Statistics p NTT DOCOMO n The largest service provider of mobile phones in Japan n Four out of ten people use DOCOMO s mobile phones. p The mobile phone penetration rate 100% p Japanese Population: 128 million p DOCOMO has more than 60 million subscribers. 10

3. Mobile Spatial Statistics p Mobile Spatial Statistics (MSS) n Small area population estimated by DOCOMO derived from the operational data of their mobile network. Operational Data Mobile Spatial Statistics Mobile Phone Network A Mobile Phone A coverage area (a cell ) Population Estimates on an hourly basis 75 120 30 90 135 105 45 60 No- Data A base station 11

3. Mobile Spatial Statistics Operational Data Base station#4: Base station#3: Base station#2: Base station#1, Date and Time: 090-XXXX-,..., Male, Birthday:10/11/1968, 1-2 B Str. Chiyoda Tokyo 090-YYYY-,..., Female, Birthday: 10/11/1955, 3-4 D Ave. Urayasu Chiba... 1. De-identification process Mobile Spatial Statistics 75 120 30 90 135 105 45 60 No -Data By age group and sex, etc. 3. Disclosure Control process BS#4 BS#3 BS#2 Base station#1, Date and Time: Male, 40's, A-city Tokyo Female, 50's, C-city Chiba... Mobile phone counts by cell 2. Estimation process Extrapolate population from Mobile phone counts reflecting the DOCOMO's share in the mobile phone market by sex, age group and district (prefecture) 75 120 30 90 135 105 45 60 6 By age group and sex, etc. Population estimates by grid square 12

3. Mobile Spatial Statistics p Privacy protection measures are taken. No one can follow a mobile phone nor a particular user over time from MSS, because: n Identifiers of mobile phone and personal identification information (name, telephone number, etc.) are removed. n Users' dates of birth and residential address are coded in groups. 13

3. Mobile Spatial Statistics Operational Data Mobile Spatial Statistics Population Distributions male female 0 50,000 100,000 150,000 66 歳 ~ ~65 歳 ~60 歳 ~55 歳 ~50 歳 ~45 歳 ~40 歳 ~35 歳 ~30 歳 ~25 歳 100,000 50,000 0 150,000 ~20 歳 Population Pyramid 14

3. Mobile Spatial Statistics DEMO 15

3. Mobile Spatial Statistics p MSS has been in research phase. n MSS data has not been publicly available. n DOCOMO has formed private partnerships with universities to research the application of MSS, but the validity of MSS had not been examined yet. n A joint research project by DOCOMO and NSTAC was launched to compare MSS with GPC. 16

4. Qualitative difference between GPC and MSS GPC and MSS look alike, but there are differences between the two statistics p 4.1 Characteristics of population p 4.2 Data source p 4.3 Coverage of age p 4.4 Frequency and Timeliness p 4.5 Cost to collect data 17

4. Qualitative difference between GPC and MSS p 4.1 Characteristics of population n GPC: de jure population p An individual's location is recorded to their place of residence on Census day. n MSS: de facto population p An individual's location is recorded to where the person is present at the time of reference 18

4. Qualitative difference between GPC and MSS p 4.1 Characteristics of population (cont.) GPC de jure population People are attributed to their residences on Census day. Residence Office, School, etc. MSS de facto population People are attributed to where they are present depending on the time of reference Residence Office, School, etc. n GPC and MSS are diferrent each other. n You can not say which one is better than the other. 19

4. Qualitative difference between GPC and MSS p 4.2 Data source n GPC: Complete enumeration p is aggregated from the Population Census that counts all the people all over Japan; p can present the population for any given small area. n MSS: DOCOMO s mobile phone users p based on estimates from DOCOMO users (40% of people in Japan); p could be viable only for areas with large enough population; p is not applicable for an area with sparse population because the sample of the mobile phone users would be too small to accurately inform estimation. 20

4. Qualitative difference between GPC and MSS p 4.2 Data source (conti.) n MSS: DOCOMO s mobile phone users p The base stations does not completely cover Japan, where more than half of the land area is forest/mountainous zone. Only red zone is coverage area of the base stations. 21

4. Qualitative difference between GPC and MSS p 4.3 Coverage of age n GPC: all ages n MSS: limited to 15-79 years old p Those aged 14 and under and 80 and over are excluded from the estimation because mobile phone penetration rates for these age groups are low, and thus the sample size is too small for MSS to provide accurate estimates. The population aged 15-79 makes up around 80 percent of the total population (2010 Population Census.) 22

4. Qualitative difference between GPC and MSS p 4.4 Frequency and Timeliness n GPC is produced every five years p It takes around two years for GPC to be released after the Population Census is conducted. n MSS could be produced on an hourly basis p It would take a few weeks after the reference time to finalize the MSS result ensuring the consistency and confidentiality. 23

4. Qualitative difference between GPC and MSS p 4.5 Cost to collect data n GPC p An enormous budget is required for Population Census to mobilize enumerators to collect census questionnaires and to capture the data n MSS p The operational data are routinely collected from the mobile phone network 24

4. Qualitative difference between GPC and MSS 4.1 Characteristics of population GPC de jure population MSS de facto population 4.2 Data source complete population 4.3 Coverage of age all ages DOCOMO users (40% of people) 15-79 years old (generations of active mobile phone users) 4.4 Frequency and Timeliness 4.5 Cost to collect data not so often and slow enormous often and quick very small (routinely collected from the mobile phone network) 25

4. Qualitative difference between GPC and MSS p Lessons n You can not easily compare the two data, because GPC and MSS are not equivalent: de jure population and de facto population. n MSS is not applicable for an area with sparse population p Even so, there is a possibility: GPC population could be closely aligned to MSS population at night on Census day in densely populated area 26

4. Qualitative difference between GPC and MSS p it would be worth measuring the extent of the closeness of GPC and 'MSS at night.' GPC de jure population People are attributed to their residences on Census day. Residence Office, School, etc. MSS de facto population People are attributed to where they are present depending on the time of reference Residence Office, School, etc. 27

5. Quantitative comparison of GPC and MSS p To examine the validity of MSS, it would be worth measuring the extent of the closeness of GPC 2010 population and MSS population at 4:00 AM on Oct. 1 2010 (the Census day) by calculating 'deviation rate' δ(i) 28

5. Quantitative comparison of GPC and MSS p Definition of 'deviation rate' δ(i) MSS(i): the MSS population in a grid square i GPC(i): the GPC population aged 15-79 in i n The deviation rate δ(i) of the grid square i: δ(i) = {MSS(i)-GPC(i)} /{MSS(i)+GPC(i)} 29

5. Quantitative comparison of GPC and MSS p Interval of the deviation rate: -1 δ(i) 1 n If δ(i) is close to 0, MSS(i) and GPC(i) are close to each other, and the point (GPC(i), MSS(i)) is close to the 45 degree line. n If δ(i)>0, MSS > GPC and the point (GPC(i), MSS(i)) is above the 45 degree line. (GPC(i), MSS(i)) δ(i) = {MSS(i) - GPC(i)} / {MSS(i)+GPC(i)} = β(i) / α(i) β(i): the perpendicular distance from the point (GPC(i), MSS(i)) to the 45 degree slope α(i): the distance between (0,0) and projection of (GPC(i), MSS(i)) on to the 45 degree slope. 30

5. Quantitative comparison of GPC and MSS p A Apprx. 70,000 pairs of GSS(i) and MSS(i) are inputted to calculate the deviation rates. A scatter plot of the deviation rates of Basic Grid Squares (apprx. 1km on a side) p The fluctuation in deviation rates tend to decrease as the MSS population in the grid square increases p Above 4,000 of estimated MSS populations, deviation rates become steady. Deviation rate -1.0-0.5 0 0.5 1.0 100 2000 4000 6000 8000 10000 12000 14000 MSS(i) 31

6. Deviation rate in Tokyo area p For the most part in Tokyo area, the deviation rate is between +/- 0.1 (in green). p There remain, however, some areas where the deviation rates exceed this range, mostly beyond +0.1 (in warm color). A Choropleth map of the deviation rate of Basic Grid Squares in Tokyo area Deviation rate 32

6. Deviation rate in Tokyo area A Choropleth map of the deviation rate of Basic Grid Squares in Tokyo area Central Tokyo (Chiyoda, Chuo and Minato Cities) A scatter plot of Basic Grid Square in central Tokyo Deviation rate 33

7. Concluding Remarks p There are differences between MSS and GPC, and limitations in applying MSS to a small area with sparse population. p Nevertheless, we compared them by calculating the deviation rate in densely populated area, and found that the use of MSS would be plausible to a certain degree. p MSS was impacted by particular patterns of behavior, as seen in the busy blocks of central Tokyo early in the morning. 34

Thank you for your attention! p Views and opinions expressed in this document are those of the authors, and not necessarily those of the organizations which the authors belong to. p Reference n 'Can mobile phone network data be used to estimate small area population?: a comparison from Japan,' Statistical Journal of the IAOS, Volume 29, 2013 (forthcoming) n GPC n MSS p Grid Square Statistics, Statistics Bureau of Japan http://www.stat.go.jp/english/data/mesh/index.htm p Technical Journal Number VOL.14 NO.3, NTT DOCOMO http://www.nttdocomo.co.jp/english/corporate/technology/rd/ technical_journal/bn/vol14_3/index.html 35