Detecting Malicious Activity with DNS Backscatter Kensuke Fukuda John Heidemann Proc. of ACM IMC '15, pp , 2015.

Size: px
Start display at page:

Download "Detecting Malicious Activity with DNS Backscatter Kensuke Fukuda John Heidemann Proc. of ACM IMC '15, pp , 2015."

Transcription

1 Detecting Malicious Activity with DNS Backscatter Kensuke Fukuda John Heidemann Proc. of ACM IMC '15, pp , Presented by Xintong Wang and Han Zhang Challenges in Network Monitoring Need a better monitoring service for network-wide activities Malicious activity: Spammer, scanner Non-malicious activity: Ad tracker, CDN Hard to achieve: Decentralized nature Reverse DNS (DNS Backscatter) provides a centralized strategic point Reverse DNS DNS Backscatter Sensor

2 DNS Backscatter DNS backscatter is the set of reverse DNS queries observed by a DNS authority Cache happens at all layers Final authority vs. root authority Final authority sees all queries for a specific originator Root authority should see all originators, if not cached Privacy Concerns over DNS traffic Get approval from IRB (though sometimes an IRB review is not enough) Reasons to address privacy concerns in this case: Caching and shared cache mask individual traffic, focusing on prevalent network activity instead Authorities have little interaction with targets due to recursive resolvers Mostly automated traffic, not human traffic, in reverse DNS Methodology - Datasets Collected at authorities One national authority managing.jp country TLD, two root servers (B, M) out of 13 And a final authority? (Not clear in the paper) Format: (originator, querier, authority) tuple Methodology - Features Derive static features from querier s domain name (mail.google.com) mail, ns, firewall, cdn, nxdomain, etc Dynamic features from query patterns queries per querier, unique ASes, unique countries, etc Classes for originator ad-tracker, cdn, cloud, mail, spam, etc Manually label originators for training

3 Constraints in Backscatter Limited information about targets Based only on querier domain name Backscatter is spread over multiple authorities due to anycast Could be tricked by careful spammer. Only increase the cost at certain degree Outline DNS Backscatters Methodology Validation Evaluation Validation Select appropriate features Label ground truth Choose learning algorithm Validate through cross-validation Select Appropriate Features Static features to distinguish different classes of originators Static features from domain names of the queriers Originator classes

4 Select Appropriate Features Dynamic features to distinguish different classes of originators Label Ground Truth Generate moderate to large lists of potential IP addresses in each application class from external sources; Intersect with the top originators in dataset by the number of queries; Manually verify intersection Choose Learning Algorithm Classification Accuracy Learning algorithms: Classification And Regression Tree (CART) Random Forest (RF) Kernel Support-Vector Machines (SVM) Metrics: Accuracy: (tp + tn) / all Precision: tp / (tp + fp) Recall: tp / (tp + fn) F1-score: 2tp / (2tp + fp + fn) Benchmark: 0.08 accuracy for randomly guessing Roots are attenuated (B post-ditl & M ditl)

5 Discriminative Features Gini Impurity Larger Gini values indicate features with greater discriminative power. Evaluate DNS Caching Backscatter is highly attenuated due to disinterested targets and DNS caching. Roughly 1 querier per 1000 targets Results - Size of Originator Footprints There are hundreds of originators that touch large parts of the Internet Classification of Top Originators Focus on the originators with the largest footprints; Understand the type of activity and the aggressiveness of activity. (Log-log scale)

6 Results - Trends of Network-wide Activities Results - Trends of Network-wide Activities Fluctuations of originators may be explained by reactions to network security Public announcement of the Heartbleed vulnerability events. (a security bug in the OpenSSL cryptography library) The number of originators in each originator class for each dataset Big footprints are often unsavory activities! Results - Trends of Network-wide Activities Results - Trends of Network-wide Activities Heartbleed Shellshock Very large scanners come and go.

7 Contributions Identify DNS backscatter as a new source of information about benign and malicious network-wide activity; Keep in mind of privacy and address any potential related issues in paper; Collect trainable dataset with ground truth label; Understand the type and trend of network-wide activity based on classifications; Discussions Adoption of botnets to circumvent the system Intentionally camouflage network traffic at each originator The possibility of other prominent features The possibility of other classifiers Limited training data The number of data points in some application classes is too small

Detecting Malicious Activity with DNS Backscatter Over Time

Detecting Malicious Activity with DNS Backscatter Over Time IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 25, NO. 3, JULY 217 1 Detecting Malicious Activity with DNS Backscatter Over Time Kensuke Fukuda John Heidemann Abdul Qadeer Abstract Network-wide activity is

More information

Detecting DGA Malware Traffic Through Behavioral Models. Erquiaga, María José Catania, Carlos García, Sebastían

Detecting DGA Malware Traffic Through Behavioral Models. Erquiaga, María José Catania, Carlos García, Sebastían Detecting DGA Malware Traffic Through Behavioral Models Erquiaga, María José Catania, Carlos García, Sebastían Outline Introduction Detection Method Training the threshold Dataset description Experiment

More information

Mobile Content Hosting Infrastructure in China: A View from a Cellular ISP. Zhenhua Li Chunjing Han Gaogang Xie

Mobile Content Hosting Infrastructure in China: A View from a Cellular ISP. Zhenhua Li Chunjing Han Gaogang Xie Mobile Content Hosting Infrastructure in China: A View from a Cellular ISP Zhenyu Li Donghui Yang Zhenhua Li Chunjing Han Gaogang Xie Continuous increase of mobile data CISCO projected: the mobile data

More information

Network Traffic Measurements and Analysis

Network Traffic Measurements and Analysis DEIB - Politecnico di Milano Fall, 2017 Sources Hastie, Tibshirani, Friedman: The Elements of Statistical Learning James, Witten, Hastie, Tibshirani: An Introduction to Statistical Learning Andrew Ng:

More information

Evaluation of different biological data and computational classification methods for use in protein interaction prediction.

Evaluation of different biological data and computational classification methods for use in protein interaction prediction. Evaluation of different biological data and computational classification methods for use in protein interaction prediction. Yanjun Qi, Ziv Bar-Joseph, Judith Klein-Seetharaman Protein 2006 Motivation Correctly

More information

Naming in Distributed Systems

Naming in Distributed Systems Naming in Distributed Systems Dr. Yong Guan Department of Electrical and Computer Engineering & Information Assurance Center Iowa State University Outline for Today s Talk Overview: Names, Identifiers,

More information

Detecting malware even when it is encrypted

Detecting malware even when it is encrypted Detecting malware even when it is encrypted Machine Learning for network HTTPS analysis František Střasák strasfra@fel.cvut.cz @FrenkyStrasak Sebastian Garcia sebastian.garcia@agents.fel.cvut.cz @eldracote

More information

Adaptive Learning of an Accurate Skin-Color Model

Adaptive Learning of an Accurate Skin-Color Model Adaptive Learning of an Accurate Skin-Color Model Q. Zhu K.T. Cheng C. T. Wu Y. L. Wu Electrical & Computer Engineering University of California, Santa Barbara Presented by: H.T Wang Outline Generic Skin

More information

Artificial Intelligence. Programming Styles

Artificial Intelligence. Programming Styles Artificial Intelligence Intro to Machine Learning Programming Styles Standard CS: Explicitly program computer to do something Early AI: Derive a problem description (state) and use general algorithms to

More information

Vulnerability Disclosure in the Age of Social Media: Exploiting Twitter for Predicting Real-World Exploits

Vulnerability Disclosure in the Age of Social Media: Exploiting Twitter for Predicting Real-World Exploits Vulnerability Disclosure in the Age of Social Media: Exploiting Twitter for Predicting Real-World Exploits Carl Sabottke Octavian Suciu Tudor Dumitraș University of Maryland 2 Problem Increasing number

More information

Discovering new malicious domains using DNS and big data Case study: Fast Flux domains. Dhia Mahjoub OpenDNS May 25 th, 2013

Discovering new malicious domains using DNS and big data Case study: Fast Flux domains. Dhia Mahjoub OpenDNS May 25 th, 2013 Discovering new malicious domains using DNS and big data Case study: Fast Flux domains Dhia Mahjoub OpenDNS May 25 th, 2013 Background A@ackers seek to keep their operabons online at all Bmes The Network

More information

WP1: Video Data Analysis

WP1: Video Data Analysis Leading : UNICT Participant: UEDIN Fish4Knowledge Final Review Meeting - November 29, 2013 - Luxembourg Workpackage 1 Objectives Fish Detection: Background/foreground modeling algorithms able to deal with

More information

An introduction to random forests

An introduction to random forests An introduction to random forests Eric Debreuve / Team Morpheme Institutions: University Nice Sophia Antipolis / CNRS / Inria Labs: I3S / Inria CRI SA-M / ibv Outline Machine learning Decision tree Random

More information

Contents Machine Learning concepts 4 Learning Algorithm 4 Predictive Model (Model) 4 Model, Classification 4 Model, Regression 4 Representation

Contents Machine Learning concepts 4 Learning Algorithm 4 Predictive Model (Model) 4 Model, Classification 4 Model, Regression 4 Representation Contents Machine Learning concepts 4 Learning Algorithm 4 Predictive Model (Model) 4 Model, Classification 4 Model, Regression 4 Representation Learning 4 Supervised Learning 4 Unsupervised Learning 4

More information

Data Mining Concepts & Techniques

Data Mining Concepts & Techniques Data Mining Concepts & Techniques Lecture No. 03 Data Processing, Data Mining Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology

More information

Enumerating Privacy Leaks in DNS Data Collected above the Recursive

Enumerating Privacy Leaks in DNS Data Collected above the Recursive Enumerating Privacy Leaks in DNS Data Collected above the Recursive Basileal Imana 1, Aleksandra Korolova 1 and John Heidemann 2 1 University of Southern California 2 USC/Information Science Institute

More information

IEE 520 Data Mining. Project Report. Shilpa Madhavan Shinde

IEE 520 Data Mining. Project Report. Shilpa Madhavan Shinde IEE 520 Data Mining Project Report Shilpa Madhavan Shinde Contents I. Dataset Description... 3 II. Data Classification... 3 III. Class Imbalance... 5 IV. Classification after Sampling... 5 V. Final Model...

More information

Applying Supervised Learning

Applying Supervised Learning Applying Supervised Learning When to Consider Supervised Learning A supervised learning algorithm takes a known set of input data (the training set) and known responses to the data (output), and trains

More information

Increase of Root and JP queries -- Long-term trends of number of queries --

Increase of Root and JP queries -- Long-term trends of number of queries -- Increase of Root and JP queries -- Long-term trends of number of queries -- Kazunori Fujiwara, JPRS DNS-OARC 2015 Spring Workshop Last Update: 2015/5/10 1945 (UTC) 1 Are DNS queries

More information

A Network Intrusion Detection System Architecture Based on Snort and. Computational Intelligence

A Network Intrusion Detection System Architecture Based on Snort and. Computational Intelligence 2nd International Conference on Electronics, Network and Computer Engineering (ICENCE 206) A Network Intrusion Detection System Architecture Based on Snort and Computational Intelligence Tao Liu, a, Da

More information

DailyCatch: A Provider-centric View of Anycast Behaviour

DailyCatch: A Provider-centric View of Anycast Behaviour DailyCatch: A Provider-centric View of Anycast Behaviour Stephen McQuistin University of Glasgow Sree Priyanka Uppu Marcel Flores Verizon Digital Media Services What is IP anycast? 2 What is IP anycast?

More information

Evaluating Classifiers

Evaluating Classifiers Evaluating Classifiers Reading for this topic: T. Fawcett, An introduction to ROC analysis, Sections 1-4, 7 (linked from class website) Evaluating Classifiers What we want: Classifier that best predicts

More information

INTRODUCTION TO MACHINE LEARNING. Measuring model performance or error

INTRODUCTION TO MACHINE LEARNING. Measuring model performance or error INTRODUCTION TO MACHINE LEARNING Measuring model performance or error Is our model any good? Context of task Accuracy Computation time Interpretability 3 types of tasks Classification Regression Clustering

More information

A SUBSYSTEM FOR FAST (IP) FLUX BOTNET DETECTION

A SUBSYSTEM FOR FAST (IP) FLUX BOTNET DETECTION Chapter 6 A SUBSYSTEM FOR FAST (IP) FLUX BOTNET DETECTION 6.1 Introduction 6.1.1 Motivation Content Distribution Networks (CDNs) and Round-Robin DNS (RRDNS) are the two standard methods used for resource

More information

Basic Tokenizing, Indexing, and Implementation of Vector-Space Retrieval

Basic Tokenizing, Indexing, and Implementation of Vector-Space Retrieval Basic Tokenizing, Indexing, and Implementation of Vector-Space Retrieval 1 Naïve Implementation Convert all documents in collection D to tf-idf weighted vectors, d j, for keyword vocabulary V. Convert

More information

CS145: INTRODUCTION TO DATA MINING

CS145: INTRODUCTION TO DATA MINING CS145: INTRODUCTION TO DATA MINING Clustering Evaluation and Practical Issues Instructor: Yizhou Sun yzsun@cs.ucla.edu November 7, 2017 Learnt Clustering Methods Vector Data Set Data Sequence Data Text

More information

Random Forest A. Fornaser

Random Forest A. Fornaser Random Forest A. Fornaser alberto.fornaser@unitn.it Sources Lecture 15: decision trees, information theory and random forests, Dr. Richard E. Turner Trees and Random Forests, Adele Cutler, Utah State University

More information

Pattern recognition (4)

Pattern recognition (4) Pattern recognition (4) 1 Things we have discussed until now Statistical pattern recognition Building simple classifiers Supervised classification Minimum distance classifier Bayesian classifier (1D and

More information

Classification of Subject Motion for Improved Reconstruction of Dynamic Magnetic Resonance Imaging

Classification of Subject Motion for Improved Reconstruction of Dynamic Magnetic Resonance Imaging 1 CS 9 Final Project Classification of Subject Motion for Improved Reconstruction of Dynamic Magnetic Resonance Imaging Feiyu Chen Department of Electrical Engineering ABSTRACT Subject motion is a significant

More information

Evaluation Measures. Sebastian Pölsterl. April 28, Computer Aided Medical Procedures Technische Universität München

Evaluation Measures. Sebastian Pölsterl. April 28, Computer Aided Medical Procedures Technische Universität München Evaluation Measures Sebastian Pölsterl Computer Aided Medical Procedures Technische Universität München April 28, 2015 Outline 1 Classification 1. Confusion Matrix 2. Receiver operating characteristics

More information

The Comparative Study of Machine Learning Algorithms in Text Data Classification*

The Comparative Study of Machine Learning Algorithms in Text Data Classification* The Comparative Study of Machine Learning Algorithms in Text Data Classification* Wang Xin School of Science, Beijing Information Science and Technology University Beijing, China Abstract Classification

More information

Documentation for: MTA developers

Documentation for: MTA developers This document contains implementation guidelines for developers of MTA products/appliances willing to use Spamhaus products to block as much spam as possible. No reference is made to specific products.

More information

Classification and Regression

Classification and Regression Classification and Regression Announcements Study guide for exam is on the LMS Sample exam will be posted by Monday Reminder that phase 3 oral presentations are being held next week during workshops Plan

More information

Evaluation Metrics. (Classifiers) CS229 Section Anand Avati

Evaluation Metrics. (Classifiers) CS229 Section Anand Avati Evaluation Metrics (Classifiers) CS Section Anand Avati Topics Why? Binary classifiers Metrics Rank view Thresholding Confusion Matrix Point metrics: Accuracy, Precision, Recall / Sensitivity, Specificity,

More information

Information Retrieval

Information Retrieval Information Retrieval ETH Zürich, Fall 2012 Thomas Hofmann LECTURE 6 EVALUATION 24.10.2012 Information Retrieval, ETHZ 2012 1 Today s Overview 1. User-Centric Evaluation 2. Evaluation via Relevance Assessment

More information

DoH and DoT experience. Ólafur Guðmundsson Marek Vavrusa

DoH and DoT experience. Ólafur Guðmundsson Marek Vavrusa DoH and DoT experience Ólafur Guðmundsson Marek Vavrusa Announced April 1 st 2018 Our mission: to help build a better Internet. We use 1.1.1.1 and 1.0.0.1 (easy to remember) for our resolver. DNS resolver,

More information

Open Resolvers in COM/NET Resolution!! Duane Wessels, Aziz Mohaisen! DNS-OARC 2014 Spring Workshop! Warsaw, Poland!

Open Resolvers in COM/NET Resolution!! Duane Wessels, Aziz Mohaisen! DNS-OARC 2014 Spring Workshop! Warsaw, Poland! Open Resolvers in COM/NET Resolution!! Duane Wessels, Aziz Mohaisen! DNS-OARC 2014 Spring Workshop! Warsaw, Poland! Outine! Why do we care about Open Resolvers?! Surveys at Verisign! Characterizing Open

More information

Evaluating Classifiers

Evaluating Classifiers Evaluating Classifiers Charles Elkan elkan@cs.ucsd.edu January 18, 2011 In a real-world application of supervised learning, we have a training set of examples with labels, and a test set of examples with

More information

The State and Challenges of the DNSSEC Deployment. Eric Osterweil Michael Ryan Dan Massey Lixia Zhang

The State and Challenges of the DNSSEC Deployment. Eric Osterweil Michael Ryan Dan Massey Lixia Zhang The State and Challenges of the DNSSEC Deployment Eric Osterweil Michael Ryan Dan Massey Lixia Zhang 1 Monitoring Shows What s Working and What needs Work DNS operations must already deal with widespread

More information

An Intelligent Clustering Algorithm for High Dimensional and Highly Overlapped Photo-Thermal Infrared Imaging Data

An Intelligent Clustering Algorithm for High Dimensional and Highly Overlapped Photo-Thermal Infrared Imaging Data An Intelligent Clustering Algorithm for High Dimensional and Highly Overlapped Photo-Thermal Infrared Imaging Data Nian Zhang and Lara Thompson Department of Electrical and Computer Engineering, University

More information

Classifying Spam using URLs

Classifying Spam using URLs Classifying Spam using URLs Di Ai Computer Science Stanford University Stanford, CA diai@stanford.edu CS 229 Project, Autumn 2018 Abstract This project implements support vector machine and random forest

More information

Quad9: A Free, Secure DNS Resolver. Nishal Goburdhan Internet Infrastructure Analyst Packet Clearing House

Quad9: A Free, Secure DNS Resolver. Nishal Goburdhan Internet Infrastructure Analyst Packet Clearing House Quad9: A Free, Secure DNS Resolver Nishal Goburdhan Internet Infrastructure Analyst Packet Clearing House nishal@pch.net The Domain Name System (DNS) is the phone book of the Internet. It translates domain

More information

Performance Evaluation of Various Classification Algorithms

Performance Evaluation of Various Classification Algorithms Performance Evaluation of Various Classification Algorithms Shafali Deora Amritsar College of Engineering & Technology, Punjab Technical University -----------------------------------------------------------***----------------------------------------------------------

More information

The Kinect Sensor. Luís Carriço FCUL 2014/15

The Kinect Sensor. Luís Carriço FCUL 2014/15 Advanced Interaction Techniques The Kinect Sensor Luís Carriço FCUL 2014/15 Sources: MS Kinect for Xbox 360 John C. Tang. Using Kinect to explore NUI, Ms Research, From Stanford CS247 Shotton et al. Real-Time

More information

Understanding and Characterizing Hidden Interception of the DNS Resolution Path

Understanding and Characterizing Hidden Interception of the DNS Resolution Path Who Is Answering My Queries? Understanding and Characterizing Hidden Interception of the DNS Resolution Path Baojun Liu, Chaoyi Lu, Haixin Duan, YingLiu, ZhouLi, ShuangHaoand MinYang ISP DNS Resolver DNS

More information

Free for All! Assessing User Data Exposure to Advertising Libraries on Android

Free for All! Assessing User Data Exposure to Advertising Libraries on Android Free for All! Assessing User Data Exposure to Advertising Libraries on Android Soteris Demetriou, Whitney Merrill, Wei Yang, Aston Zhang, Carl Gunter University of Illinois at Urbana - Champaign Approach

More information

The UCSD Network Telescope

The UCSD Network Telescope The UCSD Network Telescope Colleen Shannon cshannon @ caida.org NSF CIED Site Visit November 22, 2004 UCSD CSE Motivation Blocking technologies for automated exploits is nascent and not widely deployed

More information

DNSSM: A Large Scale Passive DNS Security Monitoring Framework

DNSSM: A Large Scale Passive DNS Security Monitoring Framework samuel.marchal@uni.lu 16/04/12 DNSSM: A Large Scale Passive DNS Security Monitoring Framework Samuel Marchal, Jérôme François, Cynthia Wagner, Radu State, Alexandre Dulaunoy, Thomas Engel, Olivier Festor

More information

Advanced Video Content Analysis and Video Compression (5LSH0), Module 8B

Advanced Video Content Analysis and Video Compression (5LSH0), Module 8B Advanced Video Content Analysis and Video Compression (5LSH0), Module 8B 1 Supervised learning Catogarized / labeled data Objects in a picture: chair, desk, person, 2 Classification Fons van der Sommen

More information

Behavior Based Malware Analysis: A Perspective From Network Traces and Program Run-Time Structure

Behavior Based Malware Analysis: A Perspective From Network Traces and Program Run-Time Structure Behavior Based Malware Analysis: A Perspective From Network Traces and Program Run-Time Structure Chun-Ying Huang chuang@ntou.edu.tw Assistant Professor Department of Computer Science and Engineering National

More information

Using Machine Learning to Identify Security Issues in Open-Source Libraries. Asankhaya Sharma Yaqin Zhou SourceClear

Using Machine Learning to Identify Security Issues in Open-Source Libraries. Asankhaya Sharma Yaqin Zhou SourceClear Using Machine Learning to Identify Security Issues in Open-Source Libraries Asankhaya Sharma Yaqin Zhou SourceClear Outline - Overview of problem space Unidentified security issues How Machine Learning

More information

CS570: Introduction to Data Mining

CS570: Introduction to Data Mining CS570: Introduction to Data Mining Classification Advanced Reading: Chapter 8.4 & 8.5 Han, Chapters 4.5 & 4.6 Tan Anca Doloc-Mihu, Ph.D. Slides courtesy of Li Xiong, Ph.D., 2011 Han, Kamber & Pei. Data

More information

Supervised Learning Classification Algorithms Comparison

Supervised Learning Classification Algorithms Comparison Supervised Learning Classification Algorithms Comparison Aditya Singh Rathore B.Tech, J.K. Lakshmipat University -------------------------------------------------------------***---------------------------------------------------------

More information

Tracking Evil with Passive DNS

Tracking Evil with Passive DNS Tracking Evil with Passive DNS Bojan Ždrnja, CISSP, GCIA, GCIH Bojan.Zdrnja@infigo.hr INFIGO IS http://www.infigo.hr Who am I? Senior information security consultant with INFIGO IS (Croatia) Mainly doing

More information

Internet Content Distribution

Internet Content Distribution Internet Content Distribution Chapter 1: Introduction Jussi Kangasharju Chapter Outline Introduction into content distribution Basic concepts TCP DNS HTTP Outline of the rest of the course Kangasharju:

More information

CS145: INTRODUCTION TO DATA MINING

CS145: INTRODUCTION TO DATA MINING CS145: INTRODUCTION TO DATA MINING 08: Classification Evaluation and Practical Issues Instructor: Yizhou Sun yzsun@cs.ucla.edu October 24, 2017 Learnt Prediction and Classification Methods Vector Data

More information

MCTS Guide to Microsoft Windows Server 2008 Network Infrastructure Configuration. Chapter 5 Introduction to DNS in Windows Server 2008

MCTS Guide to Microsoft Windows Server 2008 Network Infrastructure Configuration. Chapter 5 Introduction to DNS in Windows Server 2008 MCTS Guide to Microsoft Windows Server 2008 Network Infrastructure Configuration Chapter 5 Introduction to DNS in Windows Server 2008 Objectives Discuss the basics of the Domain Name System (DNS) and its

More information

Scrutinizing a Country using Passive DNS and Picviz or how to analyze big dataset without loosing your mind

Scrutinizing a Country using Passive DNS and Picviz or how to analyze big dataset without loosing your mind Scrutinizing a Country using Passive DNS and Picviz or how to analyze big dataset without loosing your mind Sebastien Tricaud, Alexandre Dulaunoy March 10, 2012 Disclaimer Passive DNS is a technique to

More information

Network Protocols. DNS Intel *slightly modified public version of another talk. TDC 375 Autumn 2009/10 John Kristoff DePaul University 1

Network Protocols. DNS Intel *slightly modified public version of another talk. TDC 375 Autumn 2009/10 John Kristoff DePaul University 1 Network Protocols DNS Intel *slightly modified public version of another talk TDC 375 Autumn 2009/10 John Kristoff DePaul University 1 What's in a name? dns research01.cti.depaul.edu. TDC 375 Autumn 2009/10

More information

Chapter 6: Distributed Systems: The Web. Fall 2012 Sini Ruohomaa Slides joint work with Jussi Kangasharju et al.

Chapter 6: Distributed Systems: The Web. Fall 2012 Sini Ruohomaa Slides joint work with Jussi Kangasharju et al. Chapter 6: Distributed Systems: The Web Fall 2012 Sini Ruohomaa Slides joint work with Jussi Kangasharju et al. Chapter Outline Web as a distributed system Basic web architecture Content delivery networks

More information

Large-scale DNS. Hot Topics/An Analysis of Anomalous Queries

Large-scale DNS. Hot Topics/An Analysis of Anomalous Queries Large-scale DNS Caching Servers Hot Topics/An Analysis of Anomalous Queries Shintaro NAKAGAMI, Tsuyoshi TOYONO Keisuke ISHIBASHI, Haruhiko NISHIDA, and Haruhiko OHSHIMA NTT Communications, OCN NTT Laboratories

More information

The DNS. Application Proxies. Circuit Gateways. Personal and Distributed Firewalls The Problems with Firewalls

The DNS. Application Proxies. Circuit Gateways. Personal and Distributed Firewalls The Problems with Firewalls Network Security - ISA 656 Application Angelos Stavrou August 20, 2008 Application Distributed Why move up the stack? Apart from the limitations of packet filters discussed last time, firewalls are inherently

More information

Measurements of traffic in DITL 2008

Measurements of traffic in DITL 2008 Measurements of traffic in DITL 2008 Sebastian Castro secastro@caida.org CAIDA / NIC Chile 2008 OARC Workshop Sep 2008 Ottawa, CA Overview DITL 2008 General statistics Query characteristics Query rate

More information

DATA MINING LECTURE 11. Classification Basic Concepts Decision Trees Evaluation Nearest-Neighbor Classifier

DATA MINING LECTURE 11. Classification Basic Concepts Decision Trees Evaluation Nearest-Neighbor Classifier DATA MINING LECTURE 11 Classification Basic Concepts Decision Trees Evaluation Nearest-Neighbor Classifier What is a hipster? Examples of hipster look A hipster is defined by facial hair Hipster or Hippie?

More information

DNS Security. Ch 1: The Importance of DNS Security. Updated

DNS Security. Ch 1: The Importance of DNS Security. Updated DNS Security Ch 1: The Importance of DNS Security Updated 8-21-17 DNS is Essential Without DNS, no one can use domain names like ccsf.edu Almost every Internet communication begins with a DNS resolution

More information

Day In The Life of the Internet 2008 Data Collection Event.

Day In The Life of the Internet 2008 Data Collection Event. Day In The Life of the Internet 2008 Data Collection Event http://www.caida.org/projects/ditl Duane Wessels The Measurement Factory/CAIDA k claffy CAIDA NANOG 42 February 19, 2008 NANOG 42 0 The Measurement

More information

List of Exercises: Data Mining 1 December 12th, 2015

List of Exercises: Data Mining 1 December 12th, 2015 List of Exercises: Data Mining 1 December 12th, 2015 1. We trained a model on a two-class balanced dataset using five-fold cross validation. One person calculated the performance of the classifier by measuring

More information

Hoda Rohani Anastasios Poulidis Supervisor: Jeroen Scheerder. System and Network Engineering July 2014

Hoda Rohani Anastasios Poulidis Supervisor: Jeroen Scheerder. System and Network Engineering July 2014 Hoda Rohani Anastasios Poulidis Supervisor: Jeroen Scheerder System and Network Engineering July 2014 DNS Main Components Server Side: Authoritative Servers Resolvers (Recursive Resolvers, cache) Client

More information

Countering Spam Using Classification Techniques. Steve Webb Data Mining Guest Lecture February 21, 2008

Countering Spam Using Classification Techniques. Steve Webb Data Mining Guest Lecture February 21, 2008 Countering Spam Using Classification Techniques Steve Webb webb@cc.gatech.edu Data Mining Guest Lecture February 21, 2008 Overview Introduction Countering Email Spam Problem Description Classification

More information

Classification Algorithms in Data Mining

Classification Algorithms in Data Mining August 9th, 2016 Suhas Mallesh Yash Thakkar Ashok Choudhary CIS660 Data Mining and Big Data Processing -Dr. Sunnie S. Chung Classification Algorithms in Data Mining Deciding on the classification algorithms

More information

Improving Positron Emission Tomography Imaging with Machine Learning David Fan-Chung Hsu CS 229 Fall

Improving Positron Emission Tomography Imaging with Machine Learning David Fan-Chung Hsu CS 229 Fall Improving Positron Emission Tomography Imaging with Machine Learning David Fan-Chung Hsu (fcdh@stanford.edu), CS 229 Fall 2014-15 1. Introduction and Motivation High- resolution Positron Emission Tomography

More information

HOG-based Pedestriant Detector Training

HOG-based Pedestriant Detector Training HOG-based Pedestriant Detector Training evs embedded Vision Systems Srl c/o Computer Science Park, Strada Le Grazie, 15 Verona- Italy http: // www. embeddedvisionsystems. it Abstract This paper describes

More information

ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4. Prof. James She

ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4. Prof. James She ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4 Prof. James She james.she@ust.hk 1 Selected Works of Activity 4 2 Selected Works of Activity 4 3 Last lecture 4 Mid-term

More information

Franzes Francisco Manila IBM Domino Server Crash and Messaging

Franzes Francisco Manila IBM Domino Server Crash and Messaging Franzes Francisco Manila IBM Domino Server Crash and Messaging Topics to be discussed What is SPAM / email Spoofing? How to identify one? Anti-SPAM / Anti-email spoofing basic techniques Domino configurations

More information

snoc Snoc DDoS Protection Fast Secure Cost effective Introduction Snoc 3.0 Global Scrubbing Centers Web Application DNS Protection

snoc Snoc DDoS Protection Fast Secure Cost effective Introduction Snoc 3.0 Global Scrubbing Centers Web Application DNS Protection Snoc DDoS Protection Fast Secure Cost effective sales@.co.th www..co.th securenoc Introduction Snoc 3.0 Snoc DDoS Protection provides organizations with comprehensive protection against the most challenging

More information

A novel supervised learning algorithm and its use for Spam Detection in Social Bookmarking Systems

A novel supervised learning algorithm and its use for Spam Detection in Social Bookmarking Systems A novel supervised learning algorithm and its use for Spam Detection in Social Bookmarking Systems Anestis Gkanogiannis and Theodore Kalamboukis Department of Informatics Athens University of Economics

More information

DATA MINING LECTURE 9. Classification Basic Concepts Decision Trees Evaluation

DATA MINING LECTURE 9. Classification Basic Concepts Decision Trees Evaluation DATA MINING LECTURE 9 Classification Basic Concepts Decision Trees Evaluation What is a hipster? Examples of hipster look A hipster is defined by facial hair Hipster or Hippie? Facial hair alone is not

More information

TempR: Application of Stricture Dependent Intelligent Classifier for Fast Flux Domain Detection

TempR: Application of Stricture Dependent Intelligent Classifier for Fast Flux Domain Detection I. J. Computer Network and Information Security, 2016, 10, 37-44 Published Online October 2016 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijcnis.2016.10.05 TempR: Application of Stricture Dependent

More information

Evaluating Classifiers

Evaluating Classifiers Evaluating Classifiers Reading for this topic: T. Fawcett, An introduction to ROC analysis, Sections 1-4, 7 (linked from class website) Evaluating Classifiers What we want: Classifier that best predicts

More information

Classifying Encrypted Traffic with TLSaware

Classifying Encrypted Traffic with TLSaware Classifying Encrypted Traffic with TLSaware Telemetry Blake Anderson, David McGrew, and Alison Kendler blaander@cisco.com, mcgrew@cisco.com, alkendle@cisco.com FloCon 2016 Problem Statement I need to understand

More information

Major Project Part II. Isolating Suspicious BGP Updates To Detect Prefix Hijacks

Major Project Part II. Isolating Suspicious BGP Updates To Detect Prefix Hijacks Major Project Part II Isolating Suspicious BGP Updates To Detect Prefix Hijacks Author: Abhishek Aggarwal (IIT Delhi) Co-authors: Anukool Lakhina (Guavus Networks Inc.) Prof. Huzur Saran (IIT Delhi) !

More information

CS4491/CS 7265 BIG DATA ANALYTICS

CS4491/CS 7265 BIG DATA ANALYTICS CS4491/CS 7265 BIG DATA ANALYTICS EVALUATION * Some contents are adapted from Dr. Hung Huang and Dr. Chengkai Li at UT Arlington Dr. Mingon Kang Computer Science, Kennesaw State University Evaluation for

More information

Chapter 6 Evaluation Metrics and Evaluation

Chapter 6 Evaluation Metrics and Evaluation Chapter 6 Evaluation Metrics and Evaluation The area of evaluation of information retrieval and natural language processing systems is complex. It will only be touched on in this chapter. First the scientific

More information

COMP 465: Data Mining Classification Basics

COMP 465: Data Mining Classification Basics Supervised vs. Unsupervised Learning COMP 465: Data Mining Classification Basics Slides Adapted From : Jiawei Han, Micheline Kamber & Jian Pei Data Mining: Concepts and Techniques, 3 rd ed. Supervised

More information

Information Retrieval and Organisation

Information Retrieval and Organisation Information Retrieval and Organisation Chapter 16 Flat Clustering Dell Zhang Birkbeck, University of London What Is Text Clustering? Text Clustering = Grouping a set of documents into classes of similar

More information

Data Imbalance Problem solving for SMOTE Based Oversampling: Study on Fault Detection Prediction Model in Semiconductor Manufacturing Process

Data Imbalance Problem solving for SMOTE Based Oversampling: Study on Fault Detection Prediction Model in Semiconductor Manufacturing Process Vol.133 (Information Technology and Computer Science 2016), pp.79-84 http://dx.doi.org/10.14257/astl.2016. Data Imbalance Problem solving for SMOTE Based Oversampling: Study on Fault Detection Prediction

More information

Intrusion detection in computer networks through a hybrid approach of data mining and decision trees

Intrusion detection in computer networks through a hybrid approach of data mining and decision trees WALIA journal 30(S1): 233237, 2014 Available online at www.waliaj.com ISSN 10263861 2014 WALIA Intrusion detection in computer networks through a hybrid approach of data mining and decision trees Tayebeh

More information

Networked systems and their users

Networked systems and their users Networked systems and their users q The expansion of the Google serving infrastructure q A personalized livestreaming system q A platform for crowdsourcing web QoE measurements Mapping the expansion of

More information

Data Mining in Bioinformatics Day 1: Classification

Data Mining in Bioinformatics Day 1: Classification Data Mining in Bioinformatics Day 1: Classification Karsten Borgwardt February 18 to March 1, 2013 Machine Learning & Computational Biology Research Group Max Planck Institute Tübingen and Eberhard Karls

More information

D-mystifying the D-Root Address Change

D-mystifying the D-Root Address Change D-mystifying the D-Root Address Change Matthew Lentz, Dave Levin, Jason Castonguay, Neil Spring, Bobby Bhattacharjee University of Maryland Domain Name System (DNS). Root arpa edu com gov... Top-Level

More information

How to Configure DNS Sinkholing in the Firewall

How to Configure DNS Sinkholing in the Firewall UDP DNS traffic handled by the Firewall service is monitored and, if a domain is found that is considered to be malicious, the A and AAAA DNS response is replaced by fake IP addresses. An access rule blocks

More information

DATA MINING INTRODUCTION TO CLASSIFICATION USING LINEAR CLASSIFIERS

DATA MINING INTRODUCTION TO CLASSIFICATION USING LINEAR CLASSIFIERS DATA MINING INTRODUCTION TO CLASSIFICATION USING LINEAR CLASSIFIERS 1 Classification: Definition Given a collection of records (training set ) Each record contains a set of attributes and a class attribute

More information

DATA MINING LECTURE 9. Classification Decision Trees Evaluation

DATA MINING LECTURE 9. Classification Decision Trees Evaluation DATA MINING LECTURE 9 Classification Decision Trees Evaluation 10 10 Illustrating Classification Task Tid Attrib1 Attrib2 Attrib3 Class 1 Yes Large 125K No 2 No Medium 100K No 3 No Small 70K No 4 Yes Medium

More information

John Munro / Jason Trost / FlonCon 2013 January 7 10 Albuquerque, New Mexico

John Munro / Jason Trost / FlonCon 2013 January 7 10 Albuquerque, New Mexico John Munro / jmunro@endgame.com Jason Trost / jtrost@endgame.com FlonCon 2013 January 7 10 Albuquerque, New Mexico Introductions John Munro (jmunro@endgame.com) Network Security Researcher and Data Scientist

More information

CS249: ADVANCED DATA MINING

CS249: ADVANCED DATA MINING CS249: ADVANCED DATA MINING Classification Evaluation and Practical Issues Instructor: Yizhou Sun yzsun@cs.ucla.edu April 24, 2017 Homework 2 out Announcements Due May 3 rd (11:59pm) Course project proposal

More information

MIT Samberg Center Cambridge, MA, USA. May 30 th June 2 nd, by C. Rea, R.S. Granetz MIT Plasma Science and Fusion Center, Cambridge, MA, USA

MIT Samberg Center Cambridge, MA, USA. May 30 th June 2 nd, by C. Rea, R.S. Granetz MIT Plasma Science and Fusion Center, Cambridge, MA, USA Exploratory Machine Learning studies for disruption prediction on DIII-D by C. Rea, R.S. Granetz MIT Plasma Science and Fusion Center, Cambridge, MA, USA Presented at the 2 nd IAEA Technical Meeting on

More information

8. Tree-based approaches

8. Tree-based approaches Foundations of Machine Learning École Centrale Paris Fall 2015 8. Tree-based approaches Chloé-Agathe Azencott Centre for Computational Biology, Mines ParisTech chloe agathe.azencott@mines paristech.fr

More information

CS5670: Computer Vision

CS5670: Computer Vision CS5670: Computer Vision Noah Snavely Lecture 33: Recognition Basics Slides from Andrej Karpathy and Fei-Fei Li http://vision.stanford.edu/teaching/cs231n/ Announcements Quiz moved to Tuesday Project 4

More information

Trade-offs in Explanatory

Trade-offs in Explanatory 1 Trade-offs in Explanatory 21 st of February 2012 Model Learning Data Analysis Project Madalina Fiterau DAP Committee Artur Dubrawski Jeff Schneider Geoff Gordon 2 Outline Motivation: need for interpretable

More information

Indicate whether the statement is true or false.

Indicate whether the statement is true or false. Indicate whether the statement is true or false. 1. NIDPSs can reliably ascertain if an attack was successful or not. 2. Intrusion detection consists of procedures and systems that identify system intrusions

More information