A Robust Classifier for Passive TCP/IP Fingerprinting

Similar documents
A Robust Classifier for Passive TCP/IP Fingerprinting

Application Presence Fingerprinting for NAT-Aware Router

CSC 574 Computer and Network Security. TCP/IP Security

Improving TCP/IP Security Through Randomization Without Sacrificing Interoperability. Michael J. Silbersack. November 26th, 2005

CSE 565 Computer Security Fall 2018

SinFP3. More Than a Complete Framework for Operating System Fingerprinting v1.0. Patrice

ICS 451: Today's plan

Internet Layers. Physical Layer. Application. Application. Transport. Transport. Network. Network. Network. Network. Link. Link. Link.

Exploring NAT Host Counting Using Network Traffic Flows

INF5290 Ethical Hacking. Lecture 3: Network reconnaissance, port scanning. Universitetet i Oslo Laszlo Erdödi

CIT 480: Securing Computer Systems

BSc. (Hons) Web Technologies. Examinations for 2017 / Semester 1

Measuring and Characterizing IPv6 Router Availability

On Inferring TCP Behavior

TBIT: TCP Behavior Inference Tool

TCP TCP/IP: TCP. TCP segment. TCP segment. TCP encapsulation. TCP encapsulation 1/25/2012. Network Security Lecture 6

Managing and Securing Computer Networks. Guy Leduc. Chapter 2: Software-Defined Networks (SDN) Chapter 2. Chapter goals:

Port Mirroring in CounterACT. CounterACT Technical Note

Lecture 16: Network Layer Overview, Internet Protocol

Network Intrusion Detection Systems. Beyond packet filtering

The following topics describe how to configure correlation policies and rules.

Introduction to Network Discovery and Identity

TCP /IP Fundamentals Mr. Cantu

Network Function Property Algorithm. CounterACT Technical Note

What this talk is about?

Introduction to Network Discovery and Identity

Using a VMware Network Infrastructure to Collect Traffic Traces for Intrusion Detection Evaluation

Service Cloaking and Anonymous Access; Combining Tor with Single Packet Authorization (SPA)

Single Network: applications, client and server hosts, switches, access links, trunk links, frames, path. Review of TCP/IP Internetworking

Announcements. Designing IP. Our Story So Far (Context) Goals of Today s Lecture. Our Story So Far (Context), Con t. The Internet Hourglass

Identifying Operating System Using Flow-based Traffic Fingerprinting

Networking Theory CSCI 201 Principles of Software Development

CS 716: Introduction to communication networks th class; 7 th Oct Instructor: Sridhar Iyer IIT Bombay

Faulds: A Non-Parametric Iterative Classifier for Internet-Wide OS Fingerprinting

Investigating Transparent Web Proxies in Cellular Networks

Prof. Shervin Shirmohammadi SITE, University of Ottawa. Internet Protocol (IP) Lecture 2: Prof. Shervin Shirmohammadi CEG

inside: THE MAGAZINE OF USENIX & SAGE April 2002 Volume 27 Number 2 SECURITY A Remote Active OS Fingerprinting Tool Using ICMP BY OFIR ARKIN

Computer Security Spring Firewalls. Aggelos Kiayias University of Connecticut

TECHNICAL NOTE CLEARPASS PROFILING QUICK START GUIDE

Ambiguity Resolution via Passive OS Fingerprinting

Transport Over IP. CSCI 690 Michael Hutt New York Institute of Technology

MPTCP: Design and Deployment. Day 11

Chair for Network Architectures and Services Department of Informatics TU München Prof. Carle. Network Security. Chapter 8

cs/ee 143 Communication Networks

SinFP, Unication Of Active And Passive Operating System Fingerprinting

FOCUS on Intrusion Detection: Intrusion Detection Level Analysis of Nmap and Queso Page 1 of 6

Supporting Service Differentiation for Real-Time and Best-Effort Traffic in Stateless Wireless Ad-Hoc Networks (SWAN)

Scanning. Course Learning Outcomes for Unit III. Reading Assignment. Unit Lesson UNIT III STUDY GUIDE

What is the difference between unicast and multicast? (P# 114)

Ch.9 Internet Protocol: Error And Control Messages (ICMP)

Da t e: August 2 0 th a t 9: :00 SOLUTIONS

CS 5114 Network Programming Languages Data Plane. Nate Foster Cornell University Spring 2013

Internet Protocol. Outline Introduction to Internet Protocol Header and address formats ICMP Tools CS 640 1

Lecture 3: Packet Forwarding

IPv6. Internet Technologies and Applications

Internet Applications and the Application Layer Material from Kurose and Ross, Chapter 2: The Application Layer

Can we trust the inter-packet time for traffic classification?

Proxy server is a server (a computer system or an application program) that acts as an intermediary between for requests from clients seeking

Internetworking/Internetteknik, Examination 2G1305 Date: August 18 th 2004 at 9:00 13:00 SOLUTIONS

Introduction to Computer Networks. CS 166: Introduction to Computer Systems Security

OSI Layer OSI Name Units Implementation Description 7 Application Data PCs Network services such as file, print,

9th Slide Set Computer Networks

Fundamental Questions to Answer About Computer Networking, Jan 2009 Prof. Ying-Dar Lin,

CSCI 680: Computer & Network Security

CounterACT DHCP Classifier Plugin

Network Forensics and Covert Channels Analysis in Internet Protocols

IpMorph : Unification of OS fingerprinting defeating or, how to defeat common OSFP tools.

Features of a proxy server: - Nowadays, by using TCP/IP within local area networks, the relaying role that the proxy

Network layer: Overview. Network layer functions IP Routing and forwarding NAT ARP IPv6 Routing

NT1210 Introduction to Networking. Unit 10

Operation Manual IP Addressing and IP Performance H3C S5500-SI Series Ethernet Switches. Table of Contents

User Datagram Protocol

Our Narrow Focus Computer Networking Security Vulnerabilities. Outline Part II

The Network 15 Layer IPv4 and IPv6 Part 3

Applied IT Security. System Security. Dr. Stephan Spitz 6 Firewalls & IDS. Applied IT Security, Dr.

D4.4 Device fingerprinting

Goal of Today s Lecture. EE 122: Designing IP. The Internet Hourglass. Our Story So Far (Context) Our Story So Far (Context), Con t

Network layer: Overview. Network Layer Functions

Lecture 8. Basic Internetworking (IP) Outline. Basic Internetworking (IP) Basic Internetworking (IP) Service Model

On the State of ECN and TCP Options on the Internet

Configuring IP Services

n Given a scenario, analyze and interpret output from n A SPAN has the ability to copy network traffic passing n Capacity planning for traffic

9. Security. Safeguard Engine. Safeguard Engine Settings

Distributed Systems. 27. Firewalls and Virtual Private Networks Paul Krzyzanowski. Rutgers University. Fall 2013

IP: Addressing, ARP, Routing

ECE4110 Internetwork Programming. Introduction and Overview

BEng. (Hons) Telecommunications. Examinations for / Semester 2

CSE/EE 461 Lecture 13 Connections and Fragmentation. TCP Connection Management

Basic Concepts in Intrusion Detection

Lecture 8. Reminder: Homework 3, Programming Project 2 due on Thursday. Questions? Tuesday, September 20 CS 475 Networks - Lecture 8 1

Rob Sherwood Bobby Bhattacharjee Ryan Braud. University of Maryland. Misbehaving TCP Receivers Can Cause Internet-Wide Congestion Collapse p.

A quick theorical introduction to network scanning. 23rd November 2005

Network Forensics Prefix Hijacking Theory Prefix Hijacking Forensics Concluding Remarks. Network Forensics:

Detecting Malicious Hosts Using Traffic Flows

Forescout. Configuration Guide. Version 8.1

Agenda of today s lecture. Firewalls in General Hardware Firewalls Software Firewalls Building a Firewall

Authors: Mark Handley, Vern Paxson, Christian Kreibich

Application. Transport. Network. Link. Physical

Large-Scale Classification of IPv6-IPv4 Siblings with Variable Clock Skew

Juniper Networks Certified Professional Security Bootcamp, AJSEC and JIPS (JNCIP-SEC BC)

Transcription:

A Robust Classifier for Passive TCP/IP Fingerprinting Rob Beverly MIT CSAIL rbeverly@csail.mit.edu April 20, 2004 PAM 2004 Typeset by FoilTEX

Outline A Robust Classifier for Passive TCP/IP Fingerprinting Background Motivation Our Approach/Description of Tool Application 1: Measuring an Exchange Point Application 2: NAT Inference Conclusions Questions? 1

Background Objective: Identify Properties of a Remote System over the Network Grand Vision: Passively Determine TCP Implementation in Real Time [Paxson 97] Easier: Identify Remote Operating System/Version Passively Fingerprinting What s the Motivation? 2

Motivation Fingerprinting Often Regarded as Security Attack Fingerprinting as a Tool: In Packet Traces, Distinguish Effects due to OS from Network Path Intrusion Detection Systems [Taleck 03] Serving OS-Specific Content Fingerprinting a Section of the Network: Provides a Unique Cross-Sectional View of Traffic Building Representative Network Models Inventory 3

Motivation Con t We Select Two Applications: Characterizing One-Hour of Traffic from Commercial Internet USA Exchange Point Inferring NAT (Network Address Translation) Deployment More on these later... 4

TCP/IP Fingerprinting Background Observation: TCP Stacks between Vendors and OS Versions are Unique Differences Due to: Features Implementation Settings, e.g. socket buffer sizes Two Ways to Fingerprint: Active Passive 5

TCP/IP Fingerprinting Con t Active Fingerprinting A Probe Host Sends Traffic to a Remote Machine Scans for Open Ports Sends Specially Crafted Packets Observe Response; Match to list of Response Signatures. Active Probe Probe 1 Reply 1 Probe 2 6

TCP/IP Fingerprinting Con t Passive Fingerprinting Assume Ability to Observe Traffic Make Determination based on Normal Traffic Flow A B Passive Monitor Classifier 7

Active vs. Passive Fingerprinting Active Fingerprinting Advantages: Can be run anywhere, Adaptive Disadvantages: Intrusive, detectable, not scalable Tool: nmap. Database of 450 signatures. Passive Fingerprinting Advantages: Non-intrusive, scalable Disadvantages: Requires acceptable monitoring point Tool: p0f relies on SYN uniqueness exclusively We want to Fingerprint all Traffic on a Busy, Representative Link Use Passive Fingerprinting 8

Robust Classifier Passive Rule-Based tools on Exchange Point Traces: Fail to identify up to 5% of trace hosts Problems: TCP Stack Scrubbers [Smart, et. al 00] TCP Parameter Tuning Signatures must be Updated Regularly Idea: Use Statistical Learning Methods to make a Best-Guess for each Host 9

Robust Classifier Con t Created Classifier Tool: Naive Bayesian Classifier Maximum-Likelihood Inference of Host OS Each Classification has a Degree of Confidence Difficult Question: How to Train Classifier? Train Classifier Using: p0f Signatures ( 200) Web-Logs Special Collection Web Page + Altruistic Users 10

Robust Classifier Con t Question: Why not Measure OS Distribution using, e.g. Web Logs? Want General Method, Not HTTP-Specific Avoid Deep-Packet Inspection Web Browsers Can Lie for anonymity and compatibility 11

Robust Classifier Con t Inferences Made Based on Initial SYN of TCP Handshake Fields with Differentiation Power: Originating TTL (0-255, as packet left host) Initial TCP Window Size (bytes) SYN Size (bytes) Don t Fragment Bit (on/off) 12

Robust Classifier Con t Originating TTL: Next highest power of 2 trick Example: Monitor Observes Packet with TTL=59. Infer TTL=64. Initial TCP Window Size can be: Fixed Function of MTU (Maximum Transmission Unit) or MSS (Maximum Segment Size) Other Initial TCP Window Size Matching: No visibility into TCP-options For common MSS (1460, 1380, 1360, 796, 536) ± IP Options Size Check if an Integer Multiple of Window Size 13

Example Win SYN RuleBased Bayesian Description TTL Size Size DF Conf Correct Correct FreeBSD 5.2 64 65535 60 T 0.988 Y Y FreeBSD (1) 64 65535 70 T 0.940 N Y FreeBSD (2) 64 65530 60 T 0.497 N Y Example 2: Tuned FreeBSD; Window Scaling Throws Off Ruled-Based kern.ipc.maxsockbuf=4194304 net.inet.tcp.sendspace=1048576 net.inet.tcp.recvspace=1048576 net.inet.tcp.rfc3042=1 net.inet.tcp.rfc3390=1 More Fields in Rule-based Approach Fragile Learning on Additional Fields more Robust 14

Classifying a Cross-Section of the Internet Traces: MIT LCS Border Router NLANR MOAT Commercial Internet Exchange Point Link (USA) Analyze One-Hour Trace from Exchange Point Collected in 2003 at 16:00 PST on a Wednesday 15

Classifying a Cross-Section of the Internet Traces: Commercial Internet Exchange Point Link (USA) AS 1 AS 3 AS 2 AS 4 AS N Passive Monitor Classifier AS M 16

Classifying a Cross-Section of the Internet For Brevity (and Easier Computationally) Group in Six Broad OS Categories Measure Host, Packet and Byte Distribution Using p0f-trained Bayesian, Web-trained Bayesian and Rule-Based 17

Host Distribution Windows Dominates Host Count: 92.6-94.8% Host Distribution (59,595 unique) 100 Bayesian 90 WT Bayesian Rule Based 80 70 60 Percent 50 40 30 20 10 0 Windows Linux Mac BSD Solaris Other Unknown Note: Unknown applies only to Rule-Based 18

Packet Distribution Windows: 76.9-77.8%; Linux: 18.7-19.1% 80 70 60 50 Packet Distribution (30.7 MPackets) Bayesian WT Bayesian Rule Based Percent 40 30 20 10 0 Windows Linux Mac BSD Solaris Other Unknown 19

Byte Distribution Windows: 44.6-45.2%; Linux: 52.3-52.6% 60 50 Byte Distribution (7.2 GBytes) Bayesian WT Bayesian Rule Based 40 Percent 30 20 10 0 Windows Linux Mac BSD Solaris Other Unknown 20

Byte Distribution Interesting Results Windows Dominates Hosts, but Linux hosts contribute the most traffic! Top 10 Largest Flows: 55% of byte traffic! 5 Linux, 2 Windows Software Mirror, Web Crawlers (packet every 2-3ms) SMTP servers Aggressive pre-fetching web caches Conclusion: Linux Dominates Traffic, Primarily due to Server Applications in our Traces (YMMV) 21

Classifying for NAT Inference Second Potential Application of Classifier Goal: Understand NAT prevalence in Internet Motivation: E2E-ness of Internet Assume hosts behind an IP-Masquerading NAT have different OS or OS versions (strong assumption) Look for traffic from same IP with different signature to get NAT lower-bound In hour-long trace, assume DHCP and dual-booting machine influence negligible 22

NAT Inference Existing Approaches: sflow [Phaal 03], IP ID [Bellovin 02] sflow: Monitor must be before 1st hop router Using TTL trick, look for unexpectedly low TTLs (decremented by NAT) 23

NAT Inference IP ID [Bellovin 02]: If IP ID is a sequential counter Construct IP ID sequences Coalesce, prune with empirical thresholds Number of remaining sequences estimates number of hosts 1,2,3,4 NAT 22,23,24,26,28 101,102,105,107,106 Passive Monitor 24

Sequence Matching Obstacles Question of whether IP ID Sequence Matching Works: IP ID used for Reassembling Fragmented IP packets No defined semantic, e.g *BSD uses pseudo-random number generator! If DF-bit set, no need for reassembly. NAT sets IP ID to 0. Proper NAT should rewrite IP ID to ensure uniqueness! Further, these obstacles will become significant in the future! We seek to determine the practical impact of these limitations and how well alternate approach works in comparison. 25

Evaluating NAT Inference Algorithms To evaluate different NAT inference algorithms Gathered 2.5M packets from academic building (no NAT) Synthesize NAT traffic Reduce number of unique addresses by combining traffic of n IP addresses into 1. We term n the NAT Inflation factor 26

Evaluating NAT Inference Synthetic Traces Created with 2.0 NAT Inflation Factor Inferred NAT Inflation: IP ID Sequence Matching: 2.07 TCP Signature: 1.22 IP ID Technique works well! TCP Classification does not have enough Discrimination Power 27

NAT Inflation in the Internet Results: IP ID Sequence Matching: 1.092 TCP Signature: 1.02 Measurement-based lower bound to understanding NAT prevalence in Internet 28

Future How to Validate Performance of Classifier? (What s the Correct Answer?) Expand Learning to Additional Fields/Properties of Flow Properly Train Classifier? Web Page (Honest Users Please!): http://momo.lcs.mit.edu/finger/finger.php Identifying TCP Stack Variant (e.g. Reno, Tahoe) 29

Conclusions Contributions of this Work: Developed Robust tool for TCP/IP Fingerprinting Measure Operating System host, packet and byte distribution in the wild Understand NAT inflation factor Measured 9% NAT inflation 30

Questions? Questions? More: http://momo.lcs.mit.edu/finger/finger.php Thanks! 31