Measurements of traffic in DITL 2008

Similar documents
Large-scale DNS. Hot Topics/An Analysis of Anomalous Queries

Is Your Caching Resolver Polluting the Internet?

Day In The Life of the Internet 2008 Data Collection Event.

Increase of Root and JP queries -- Long-term trends of number of queries --

An Update on Anomalous DNS Behavior

A Day at the Root of the Internet

Enumerating Privacy Leaks in DNS Data Collected above the Recursive

Packet Traces from a Simulated Signed Root

Managing Caching DNS Server

MCTS Guide to Microsoft Windows Server 2008 Network Infrastructure Configuration. Chapter 5 Introduction to DNS in Windows Server 2008

RFC 2181 Ranking data and referrals/glue importance --- new resolver algorithm proposal ---

CDAR Continuous Data-driven Analysis of Root Stability

Understanding and preparing for DNS evolution

RFC1918 updates on servers near M and F roots C A I D A. Andre Broido, work in progress. CAIDA WIDE Workshop ISI, CAIDA / SDSC / UCSD

IDN query trends seen at JP and Root. Kazunori Fujiwara, JPRS 2016/4/3, IEPG meeting

Falling Trees or If a DNS Server is Lame but Nobody Queries It, Should You Send an ?

A First Look at QUIC in the Wild

Identifier Technology Health Indicators (ITHI) Alain Durand, Christian Huitema 13 March 2018

Introduction to Network. Topics

A Root DNS Server. Akira Kato. Brief Overview of M-Root. WIDE Project

Identifier Technology Health Indicators (ITHI)

DSC on top of PacketQ

Measuring IPv6 Adoption

Practical Comprehensive Bounds on Surreptitious Communication Over DNS

A Look at RFC 8145 Trust Anchor Signaling for the 2017 KSK Rollover

Figure 1: DNS server properties and the DNS server icon

Domain Name System (DNS) DNS Fundamentals. Computers use IP addresses. Why do we need names? hosts.txt does not scale. The old solution: HOSTS.

DNS. dr. C. P. J. Koymans. September 16, Informatics Institute University of Amsterdam. dr. C. P. J. Koymans (UvA) DNS September 16, / 46

DNS Flag day. A tale of five cctlds. Hugo Salgado,.CL Sebastián Castro,.NZ DNS-OARC 29, Amsterdam

Domain Name System (DNS) Session-1: Fundamentals. Joe Abley AfNOG Workshop, AIS 2017, Nairobi

Is Your Caching Resolver Polluting the Internet?

Objectives. Upon completion you will be able to:

DNS Anycast Statistic Collection

Assessing IPv6 Adop/on. Michael Bailey, University of Michigan NANOG 57

DNS/DNSSEC Workshop. In Collaboration with APNIC and HKIRC Hong Kong. Champika Wijayatunga Regional Security Engagement Manager Asia Pacific

An Approach for Determining the Health of the DNS

Protecting DNS Critical Infrastructure Solution Overview. Radware Attack Mitigation System (AMS) - Whitepaper

Detection of DNS Traffic Anomalies in Large Networks

What is all that crap? Analysis of DNS root server bogus queries

DNS Measurements at the.cn TLD Servers

DNS. Some advanced topics. Karst Koymans. Informatics Institute University of Amsterdam. (version 17.2, 2017/09/25 12:41:57)

ECE 435 Network Engineering Lecture 7

Is Your Caching Resolver Polluting the Internet?

DNSSEC for the Root Zone. IEPG IETF 77 Anaheim, USA March 2010

Writing Assignment #1. A Technical Description for Two Different Audiences. Yuji Shimojo WRTG 393. Instructor: Claudia M. Caruana

Some Lacunae in APNIC DNS Measurement. George Michaelson CAIDA WIDE Workshop Marina Del Ray 2006

DNS. Karst Koymans & Niels Sijm. Friday, September 14, Informatics Institute University of Amsterdam

ICANN and Technical Work: Really? Yes! Steve Crocker DNS Symposium, Madrid, 13 May 2017

Table of Contents DNS. Short history of DNS (1) DNS and BIND. Specification and implementation. A short history of DNS. Root servers.

Internet Technology 2/18/2016

DNS. A Massively Distributed Database. Justin Scott December 12, 2018

DNSwitness: recent developments and the new passive monitor

Naming in Distributed Systems

Overview. Last Lecture. This Lecture. Next Lecture. Scheduled tasks and log management. DNS and BIND Reference: DNS and BIND, 4 th Edition, O Reilly

Table of Contents DNS. Short history of DNS (1) DNS and BIND. Specification and implementation. A short history of DNS.

Mutlticast DNS (mdns)

Re-engineering the DNS One Resolver at a Time. Paul Wilson Director General APNIC channeling Geoff Huston Chief Scientist

Domain Name System (DNS) Session 2: Resolver Operation and debugging. Joe Abley AfNOG Workshop, AIS 2017, Nairobi

Development of the Domain Name System

Covert channel detection using flow-data

CSc 450/550 Computer Networks Domain Name System

Keeping DNS parents and children in sync at Internet Speed! Ólafur Guðmundsson

Understanding and Characterizing Hidden Interception of the DNS Resolution Path

page 1 Plain Old DNS WACREN, DNS/DNSSEC Regional Workshop Ouagadougou, October 2016

Some advanced topics. Karst Koymans. Tuesday, September 16, 2014

Multimedia Streaming. Mike Zink

DNS Traffic Analysis CDN and the World IPv6 Launch

Dataset Comparison. IPv4 vs IPv6 traffic seen at DNS root servers. Bradley Huffaker, Marina Fomenkov, and kc claffy

CSCE 463/612 Networks and Distributed Processing Spring 2018

Domain Name System (DNS) Session-1: Fundamentals. Computers use IP addresses. Why do we need names? hosts.txt does not scale

Protocol Classification

Domain Statistics Collector Tutorial

Domain Name System (DNS)

CS519: Computer Networks. Lecture 6: Apr 5, 2004 Naming and DNS

Table of Contents DNS. Short history of DNS (1) DNS and BIND. Specification and implementation. A short history of DNS. Root servers.

Network Protocols. DNS Intel *slightly modified public version of another talk. TDC 375 Autumn 2009/10 John Kristoff DePaul University 1

Computer Networks. Domain Name System. Jianping Pan Spring /25/17 CSC361 1

DNS Session 2: DNS cache operation and DNS debugging. Joe Abley AfNOG 2006 workshop

Evaluation and consideration of multiple responses. Kazunori Fujiwara, JPRS OARC 28

DNS Basics BUPT/QMUL

ANALYSING RA/RD BIT USAGE IN ROOT SERVER TRAFFIC

Domain Name System - Advanced Computer Networks

DNS. Introduction To. everything you never wanted to know about IP directory services

Computer Networks - Midterm

Network Working Group

DNS Type Query Support Added to the DNS Analyzer

Recursives in the Wild: Engineering Authoritative DNS Servers

ECE 650 Systems Programming & Engineering. Spring 2018

APNIC elearning: DNS Concepts

Analysis of query traffic to.com/.net name servers! Duane Wessels, Matt Larson, Allison Mankin! Verisign Labs! APRICOT 2013!

Internet Engineering Task Force (IETF) Request for Comments: Category: Best Current Practice ISSN: January 2019

DNS Session 2: DNS cache operation and DNS debugging. How caching NS works (1) What if the answer is not in the cache? How caching NS works (2)

How to Configure DNS Zones

EECS 428 Final Project Report Distributed Real-Time Process Control Over TCP and the Internet Brian Robinson

Hierarchical Intelligent Cuttings: A Dynamic Multi-dimensional Packet Classification Algorithm

RIPE NCC DNS Update. Wolfgang Nagele DNS Services Manager

Open Resolvers in COM/NET Resolution!! Duane Wessels, Aziz Mohaisen! DNS-OARC 2014 Spring Workshop! Warsaw, Poland!

RSSAC Activities Update. Lars Johan Liman and Tripti Sinha RSSAC Chair ICANN-54 October 2015

USING ~300 BILLION DNS QUERIES TO ANALYSE THE NAME COLLISION PROBLEM. Jim Reid, RTFM LLP

Observing DNSSEC validation in the wild

Transcription:

Measurements of traffic in DITL 2008 Sebastian Castro secastro@caida.org CAIDA / NIC Chile 2008 OARC Workshop Sep 2008 Ottawa, CA

Overview DITL 2008 General statistics Query characteristics Query rate comparison Client rate comparison Query types Distribution of queries/clients Client classification Per reverse names IP TTL Histogram Reputation Score Source Port Randomness evolution Invalid traffic Comparison with 2007 Exploration of sources Recursive queries A-for-A Invalid TLD 2

DITL 2008 Particularly successful in terms of variety of DNS traffic 8 root servers 2 old root servers 2 ORSN servers 5 TLD (1 gtld, 4 cctld) 2 RIR 7 instances of AS112 Cache traces from SIE and University of Rome Also includes traces and measurements 3

General statistics DITL 2007 DITL 2008 Dataset duration 24h 24h Dataset start Jan 9, noon (UTC) Mar 19, midnight (UTC) Root server list and instances C: 4/4 F: 36/38 K: 15/17 M: 6/6 A: 1/1 C: 4/4 E: 1/1 F: 40/42 H: 1/1 K: 15/17 L: 2/2 M: 6/6 Number of queries 3.84 billion 8.00 billion Number of unique clients ~2.8 million ~ 5.6 million Recursive queries 17.04% 11.99% TCP Bytes Packets Queries 1.65% 2.67% ~700K N/A Queries from RFC1918 4.26% N/A addresses 4

Query rates Traffic seen at the old-l root Variation of query rates along the years Between 2007 and 2008, the qrate grew: C: 40% F: 13% K: 33% M: 5% Between 2006 and 2008: C: 139% F: 71% K: 74% 5

Client rate Follows the same pattern of query rates: A, C, F, K and M with similar behavior E and H L But if old-l traffic is added E, H and L are on the same level 6

Distribution of queries by query type The highest fraction of queries are A queries (slightly below 60%) Important increase on AAAA queries (pink): from around 8% in 2007 to 15% in 2008. Reduction of MX queries (purple): K- root drop from 13% to 4% 7

Distribution of clients/queries DITL 2008 Leftmost column: ~2.8% of the queries are sent by ~86.4% of clients Rightmost column: 1200 clients generated ~54.3% of the queries. 8

Client classification We attempted to classify the clients sending queries to the roots. Using the reverse names Using the IP TTL of their packets Using external sources of data Mainly blacklists 9

Reverse Names For each address, query the corresponding PTR record. Using CAIDA s HostDB engine Five major groups No match found Failed By connection type DSL, cable, fiber, dialup, etc By address assignment static, dynamic By a service mail, dns, resolver, fw, etc 10

IP TTL For each sending queries to the roots, count the observed IP TTL One thin line per root 68 clients presented more than 40 different TTL values 11

Client Reputation Sampled 1200 clients on each query rate interval bin Queried for the address on 5 different DNSRBL Assign a reputation score based on the number of matches found. 12

SPR Measurements For each client sending queries to.br,.org and.uk At least 20 queries in total Two datasets: Mar- 19 and Aug-9 Three metrics Port changes/queries ratio # different ports/queries ratio Bits of randomness Presented by Duane Wessels at CAIDA/WIDE/C ASFI workshop (using standard deviation as a metric of randomness) 13

Invalid queries analysis To prepare the invalid queries analysis we required to split the traces per source address. We sampled 10% of the unique source addresses observed on each root Each query could fit in nine categories of invalid queries The match was done sequentially If none matched, was counted as valid query 14

Invalid queries categories Unused query class: Any class not in IN, CHAOS, HESIOD, NONE or ANY A-for-A: A-type query for a name is already a IPv4 Address <IN, A, 192.16.3.0> Invalid TLD: a query for a name with an invalid TLD <IN, MX, localhost.lan> Non-printable characters: <IN, A, www.ra^b.us.> Queries with _ : <IN, SRV, _ldap._tcp.dc._msdcs.sk0530-k32-1.> RFC 1918 PTR: <IN, PTR, 171.144.144.10.in-addr.arpa.> Identical queries: a query with the same class, type, name and id (during the whole period) Repeated queries: a query with the same class, type and name Referral-not-cached: a query seen with a referral previously given. 15

Query validity (the graph) 16

Query validity (the numbers) Category A C E F H K L M Total Unused 0.1 0.0 0.1 0.0 0.1 0.0 0.1 0.1 0.1 A-for-A 1.6 1.9 1.2 3.6 2.7 3.8 2.6 2.7 2.7 Invalid TLD 19.3 18.5 19.8 25.5 25.6 22.9 24.8 22.9 22.0 Non-print char 0.0 0.1 0.1 0.1 0.1 0.0 0.1 0.0 0.0 Queries with _ 0.2 0.1 0.2 0.1 0.2 0.1 0.1 0.1 0.1 RFC 1918 PTR 0.6 0.3 0.5 0.2 0.5 0.2 0.1 0.3 0.4 Identical queries 27.3 10.4 14.9 12.3 17.4 17.9 12.0 17.0 15.6 Repeated queries 38.5 51.4 49.3 45.3 38.7 42.0 44.2 43.9 44.9 Referral not cached 10.7 15.2 12.1 10.9 12.9 11.1 14.3 11.1 12.4 Valid 2008 1.7 2.0 1.8 1.9 1.8 2.0 1.8 1.8 1.8 Valid 2007 4.1 2.3 1.8 4.4 2.5 17

Query validity (the words) Based on our first graphs, the query load keeps increasing So the pollution The fraction of valid traffic is decreasing The pollution is dominated by invalid TLD, repeated and identical queries. 18

Looking some of the sources of pollution We explored more details on the sources of pollution Recursive queries A-for-A queries Including some evidence of address space scanning and a new type of trash. Invalid TLD and propose some solutions 19

Recursive Queries During 2008 the number of recursive queries reduced compared to 2007 2008: 11.99%; 2007: 17.04% But the number of sources increased 2007: 290K (11.3%) 2008: 1.97M (36.4%); What to do? Return a REFUSED Bad Idea Drop the query? Even worst Delay the query? Do nothing 20

A-for-A: Address space scanning All /16 s between 80/8-83/8 All /16 s between 88/8-89/8 Took all QNAME and convert them to addresses Group them by /24 and /16 18270 sources sent queries for the 80/8 83/8 8845 sources sent queries for the 88/8 89/8 8115 sources in common Seemed coordinated: different sources sent queries for different partitions, iterating over the third octet. 21

A6-for-A? AAAA-for-A? Originally this category included A-queries with a query name in the form of an IPv4 address What about the other query types for addresses? The result: 3.32% of this type of queries were for A6/AAAA queries 00:04:03.347275 IP 195.2.83.107.5553 > 12.0.0.2.53: 40248 [1au] A? 221.0.93.99. (40) 00:04:03.347392 IP 195.2.83.107.5553 > 12.0.0.2.53: 1887 [1au] AAAA? 221.0.93.99. (40) 00:04:03.347642 IP 195.2.83.107.5553 > 12.0.0.2.53: 2737 [1au] A6? 221.0.93.99. (40) 00:04:59.579904 IP 195.2.83.107.5553 > 6.0.0.30.53: 40723 [1au] A? 84.52.73.160. (41) 00:05:36.016886 IP 195.2.83.107.5553 > 11.0.0.8.53: 28473 [1au] A? 148.240.4.32. (41) 00:05:36.016902 IP 195.2.83.107.5553 > 11.0.0.8.53: 27782 [1au] AAAA? 148.240.4.32. (41) 00:05:36.016908 IP 195.2.83.107.5553 > 11.0.0.8.53: 1175 [1au] A6? 148.240.4.32. (41) 00:06:58.022212 IP 195.2.83.107.5553 > 13.0.0.1.53: 28596 [1au] A? 61.143.210.226. (43) 00:06:58.022647 IP 195.2.83.107.5553 > 13.0.0.1.53: 10748 [1au] AAAA? 61.143.210.226. (43) 00:06:58.023381 IP 195.2.83.107.5553 > 13.0.0.1.53: 12721 [1au] A6? 61.143.210.226. (43) 22

Invalid TLD Queries for invalid TLD represent 22% of the total traffic at the roots 20.6% during DITL 2007 Top 10 invalid TLD represent 10.5% of the total traffic RFC 2606 reserves some TLD to avoid future conflicts We propose: Include some of these TLD (local, lan, home, localdomain) to RFC 2606 Encourage cache implementations to answer queries for RFC 2606 TLDs locally (with data or error) Percentage of total TLD queries 2007 2008 local 5.018 5.098 belkin 0.436 0.781 localhost 2.205 0.710 lan 0.509 0.679 home 0.321 0.651 invalid 0.602 0.623 domain 0.778 0.550 localdomain 0.318 0.332 wpad 0.183 0.232 corp 0.150 0.231 23

Repeated/identical queries Minas Gjoka at CAIDA found 50% of the repeated/identical queries arrived within a 10-sec time window The use of Bloom filters was proposed to detect if a query reaching a server has been seen within the last k seconds Using a hash of <QNAME, QCLASS, QTYPE> If seen, take some action (discard? delay?). Probably we will work on an implementation to test effectiveness and performance. 24

Conclusions The traffic grows, the pollution grows We don t know much about the sources of unwanted traffic But we do learn a little bit more every time And we will continue looking for answers By simulating combinations of elements that might create pollution More brain power is needed to analyze this huge amount of data 25

Questions? Suggestions? Thanks for your time 26