Schematizing a Global SPAM Indicative Probability


NIKOLAOS KORFIATIS, MARIOS POULOS, SOZON PAPAVLASSOPOULOS

Department of Management Science and Technology, Athens University of Economics and Business, Athens, Greece
Department of Archive and Library Sciences, Ionian University, Corfu, Greece

Abstract

In this paper we propose a middleware infrastructure to address the problem of filtering unsolicited e-mail messages (known as SPAM). Our approach uses Bayesian classification of SPAM messages, built upon categorization models that map a probability to a word using text analysis. The analysis is applied not only to unsolicited mail but also to legitimate mail messages, which makes it easier to draw a cumulative inference about the nature of a message. Our proposed architecture extends these models with Collaborative Filtering methods expressed via peer-to-peer networks, which should help to build more effective and accurate anti-spam filters.

Key-Words: E-mail, SPAM, Privacy, Peer-to-peer, Bayesian Classifiers

1 Introduction

SPAM [1], also known as mass commercial unsolicited e-mail, is a fast-growing phenomenon that affects all levels of internet users, from end users to large enterprises such as Internet Service Providers (ISPs). SPAM is the most common type of e-mail that a typical internet user receives every day. Socio-technical aspects of SPAM range from bandwidth costs to security and privacy matters. Furthermore, the development of sophisticated software crawlers, which make it easier for spammers to acquire the e-mail addresses of people who have made them public via a website or by participating in an internet community such as the USENET news, poses a threat to the use of e-mail as the primary means of computer-mediated communication.
SPAM protection currently follows two approaches. The first is the legal approach, now being applied in the US and the EU, which punishes senders responsible for large numbers of unwanted e-mails sent to internet users in violation of their privacy rights. The other side of the coin is the cost-sensitive application of techniques already developed in fields such as information retrieval or text categorization. Following this second side, we take a collaborative filtering approach that uses the concept of node interconnections for information exchange, which is the main architecture of a peer-to-peer network. Collaborative filtering refers to the exchange of preferences and annotations regarding the same corpora of documents and information. Following the axiom that SPAM is not sent only to certain types of users, which makes it a global phenomenon, we address the need for a collaborative filtering infrastructure that will make accurate recommendations about the intention of an e-mail message.

In the next paragraphs we categorize current SPAM filtering techniques by the part of the mail message that they target (Table 1). Next we analyse our modelling approach, which uses a certain type of peer-to-peer network architecture to realize the system, so that internet users can benefit both from a reduction in the amount of SPAM they receive every day and from a lower cost of misclassifying legitimate messages as SPAM.

Table 1: Classification methods and targeted part of the mail message

Method                            Message Part
Structured Text Filters           Body
Verification Filters              Header
Distributed Adaptive Blacklists   Header
Rule Based Rankings               Body
Bayesian Classifiers              Body/Header

2 Current Approaches on Spam Filtering

Technically speaking, SPAM filtering is a cost-sensitive application of text categorization. We characterize SPAM filtering as cost sensitive because the cost, in terms of information loss, of misclassifying a legitimate message as SPAM cannot be predicted. With this in mind, many classification methods have emerged, influenced by fields such as natural language processing and information retrieval. SPAM filtering methods can be classified by the part of the message that they target. A typical e-mail message consists of two parts:

- The mail header, which contains information about the origin of the message, such as the route that it followed to reach the user's mail server
- The content of the message, which is going to be read by the user

Following the above taxonomy, several families of SPAM filtering have emerged, depending on the part of the mail message they target.

2.1 Header Based Filtering

Header based filtering is the basic method for large-scale implementations of SPAM protection. When an e-mail message is received by the mail transfer agent (MTA), the header is separated from the body. The header contains fields such as the sender and the recipient of the message. It is then analyzed using a lexical parser in order to identify the values of the fields.
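The header/body separation described above can be sketched with Python's standard `email` package. This is only an illustration of the step, not the parser used in the proposed system; the sample message, its addresses and the helper name `split_message` are assumptions made for the example.

```python
from email import message_from_string
from typing import Dict, Tuple

# Hypothetical raw message used only for illustration.
RAW_MESSAGE = """\
From: offers@example.com
To: user@example.org
Subject: Limited offer

Buy now and save.
"""


def split_message(raw: str) -> Tuple[Dict[str, str], str]:
    """Separate a raw message into its header fields and its body text."""
    msg = message_from_string(raw)
    # Header field names are case-insensitive, so normalise them to lower case.
    headers = {name.lower(): value for name, value in msg.items()}
    # For a simple, non-multipart message the payload is the body string.
    body = msg.get_payload()
    return headers, body


headers, body = split_message(RAW_MESSAGE)
```

After this step, `headers["from"]` and `headers["to"]` hold the field values that a header based filter would inspect, while `body` is left for content based methods.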
Having header values we can apply two different types of filtering:

- Verification filtering (also known as whitelist filtering)
- Trusted filtering (also known as blacklist filtering)

Verification and Trusted filtering are the two poles of the same method, in which the header values are validated against a vector constructed by the user. Consider a domain $S$ of mail messages $S_i$ and a set $C$ of predefined categories, $C = \{SPAM, LEGITIMATE\}$. We consider a vector $S$ that represents the messages, $S = (S_1, S_2, S_3, \ldots)$, and a vector $P$ that refers to the classifiers for $C$, $P = (P_1, P_2, P_3, \ldots)$. The task of categorizing an e-mail message as SPAM may be formalized as approximating the target function $\Phi : S \times C \to \{T, F\}$. When $\Phi(S_i, C_i) = T$, $S_i$ is a positive example of $C_i$: the classification of the mail message according to $P$ agrees with the user's preferred classification. Otherwise, $F$ represents a negative example of the classification process, in which the user's preference differs from the automated classification. In both cases (Verification and Trusted filtering) the target function uses the classification vector $P$ to process the message. The difference between the two methods lies in the construction of the classification vector.

Header based filtering is very efficient for the user, since the message can be filtered before it arrives in the user's mailbox by automated processes based on the classification method discussed above. On the other hand, it cannot be considered a trustworthy method of SPAM filtering, since it classifies messages based on a very small portion of the message, which senders can often alter using network programming techniques.

2.2 Message/Content Based Filtering

While header based filtering represents the majority of SPAM filtering policies implemented, especially in large organizations, there are also several implementations of SPAM protection in commercial and open source mail clients, such as Outlook and Mozilla [2], that target the content part of the mail message. The basic underlying procedure for classifying a mail message as SPAM is rule based filtering. Typically a user can construct rules, i.e. sets of classifiers, that are activated when a new message arrives. In the simplest form the user names a set of words and the rule engine (part of the software client) tries to match these words against the content of the mail message. As with header based filtering, the message is categorized as SPAM only if it matches the rule. In a more advanced form of content filtering, a combination of rules is applied and a message is classified based on the overall score it collects when several rules are applied to it. This score is then checked against a predefined threshold, usually set by the user, and the message is classified as SPAM when the overall score exceeds the threshold.

Using message content as the targeted part of the SPAM filtering method gives the advantage of an accurate mechanism that classifies mail messages based on the user's preferences about classification terms, which makes the filtering method more closely targeted to the characteristics of the individual user's mail. The main disadvantage of this type of filtering model is that it is built upon certain characteristics of SPAM messages that are not global, and interaction with the user is always needed to enter new values or modify existing ones in the rule vector.

2.3 Mixed Mode Filtering

Mixed mode filtering addresses a combination of the filtering methods discussed above.
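As a minimal sketch of such a combination, the example below checks the sender against user-built whitelist and blacklist vectors (Section 2.1) and, if neither matches, falls back to a rule score compared with a threshold (Section 2.2). The addresses, rule words, weights and threshold are all hypothetical values chosen for illustration; they do not come from any particular mail client.

```python
# Hypothetical user-constructed vectors and rules (illustrative values only).
WHITELIST = {"colleague@example.org"}   # verification (whitelist) filtering
BLACKLIST = {"offers@example.com"}      # trusted (blacklist) filtering
RULE_WEIGHTS = {"offer": 2.0, "free": 1.5, "winner": 3.0}
THRESHOLD = 3.0                         # user-defined score threshold


def content_score(body: str) -> float:
    """Sum the weights of all rule words that occur in the body.

    Words are split on whitespace only, so punctuation handling is
    deliberately left out of this sketch.
    """
    words = set(body.lower().split())
    return sum(weight for term, weight in RULE_WEIGHTS.items() if term in words)


def classify(sender: str, body: str) -> str:
    """Apply header checks first, then the content rule score."""
    if sender in WHITELIST:
        return "LEGITIMATE"
    if sender in BLACKLIST:
        return "SPAM"
    return "SPAM" if content_score(body) > THRESHOLD else "LEGITIMATE"
```

With these values, a message from an unknown sender containing "free offer today" scores 3.5 and is classified as SPAM, while "free lunch" scores 1.5 and is kept. The order of the checks (header before content) is a design choice of the sketch, not something the text prescribes.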
Typically, mixed mode filtering accumulates classifier decisions by applying filtering methods to both the header and the content part of the message. Examinations of existing SPAM mail corpora show that a typical SPAM message contains suspicious terms in both parts of the message. Radical implementations of this filtering policy use a mixed mode of interaction with the user in order to find how far each message part influences the classification decision.

3 The Bayesian Filtering Approaches

A special type of mixed mode filtering is a Bayesian filter that analyses both the header and the content parts of the message. This type of filtering uses a probability model to characterize a message based on the total probability accumulated by specific terms in the header and content parts of the message. Graham [3] suggested building Bayesian probability models of SPAM and non-SPAM words. The general pattern is that some words occur more frequently in known SPAM, and other words occur more frequently in legitimate messages. Using this approach we can generate probabilities for each attribute of the message and, following a supervised learning period, create a probability distribution of the terms that occur in both SPAM and legitimate messages.

Naive Bayes classifiers have recently proved extremely accurate for SPAM classification [4]. This family of filters works in mixed mode by analyzing both content and header values of spam corpora. A mail message $s$ is represented by a vector $\vec{x} = (x_1, x_2, \ldots, x_n)$, where $x_1, x_2, \ldots, x_n$ are the values of the attributes $X_1, X_2, \ldots, X_n$. In our case attributes correspond to words, i.e. each attribute shows whether a particular word (e.g. "offer") appears in the message.
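A minimal sketch of a classifier over such word attributes is given below. It treats each vocabulary word as a binary attribute (present or absent), estimates per-class word probabilities from two small corpora, and combines them under the usual naive Bayes assumption that attributes are independent given the class. The Laplace smoothing, the log-space arithmetic, the whitespace tokenizer and all function names are assumptions of this sketch rather than details taken from [3] or [4]; both corpora must be non-empty.

```python
import math
from collections import Counter


def tokens(text):
    """Binary attributes: the set of distinct lower-cased words in a message."""
    return set(text.lower().split())


def train(spam_msgs, legit_msgs):
    """Count, per class, how many messages contain each word."""
    vocab = set()
    counts = {"SPAM": Counter(), "LEGITIMATE": Counter()}
    totals = {"SPAM": len(spam_msgs), "LEGITIMATE": len(legit_msgs)}
    for label, msgs in (("SPAM", spam_msgs), ("LEGITIMATE", legit_msgs)):
        for msg in msgs:
            words = tokens(msg)
            counts[label].update(words)
            vocab |= words
    return vocab, counts, totals


def word_probability(word, label, counts, totals):
    """Estimate P(word present | class) with Laplace smoothing (an assumption)."""
    return (counts[label][word] + 1) / (totals[label] + 2)


def spam_posterior(text, vocab, counts, totals):
    """Return P(SPAM | message) under the naive independence assumption."""
    present = tokens(text)
    n_messages = sum(totals.values())
    log_joint = {}
    for label in totals:
        log_p = math.log(totals[label] / n_messages)  # class prior
        for word in vocab:
            p = word_probability(word, label, counts, totals)
            log_p += math.log(p if word in present else 1.0 - p)
        log_joint[label] = log_p
    # Normalise in log space to avoid underflow on long vocabularies.
    top = max(log_joint.values())
    weights = {label: math.exp(v - top) for label, v in log_joint.items()}
    return weights["SPAM"] / sum(weights.values())


# Tiny hypothetical corpora, for illustration only.
vocab, counts, totals = train(
    ["cheap offer now", "special offer inside"],
    ["meeting agenda attached", "lunch tomorrow"],
)
```

Calling `spam_posterior("offer now", vocab, counts, totals)` on these toy corpora gives a value above 0.5, and `spam_posterior("meeting tomorrow", vocab, counts, totals)` a value below 0.5. Real corpora would need a far larger vocabulary and a proper tokenizer.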
From Bayes' theorem and the theorem of total probability, and treating the attributes as conditionally independent given the category, the probability that a mail message $s$ with vector $\vec{x} = (x_1, \ldots, x_n)$ belongs to category $c$ is

\[
P(C = c \mid \vec{X} = \vec{x}) = \frac{P(C = c) \prod_{i=1}^{n} P(X_i = x_i \mid C = c)}{\sum_{\kappa \in \{SPAM, Legitimate\}} P(C = \kappa) \prod_{i=1}^{n} P(X_i = x_i \mid C = \kappa)}
\]

with $\kappa \in \{SPAM, Legitimate\}$.

This simple formula is the basis for many spam-filtering approaches that have been developed in order to improve the accuracy of SPAM classification for users (e.g. SpamBayes). The approach can be customized through the values of the document vector. The overall SPAM probability of a novel message, based on the collection of words it contains, can be computed as follows:

\[
S_d = \frac{\prod_i P_i(Term_i)}{\prod_i P_i(Term_i) + \prod_i \bigl(1 - P_i(Term_i)\bigr)}
\]

$S_d$ can change during the supervised learning period. This method has the following benefits:

1. It can generate a filter automatically from corpora of categorized messages rather than requiring human effort in rule development.
2. It can be customized to an individual user's characteristic spam and legitimate messages.
3. A probability set can be built, and well known methods from decision theory can be applied to improve the accuracy of the filter (for example decision trees).

As mentioned before, in our system the SPAM indicative probability comes from a supervised learning process that constructs a user profile. The profile is built not only from a corpus of SPAM messages but also from a corpus of legitimate messages, which yields a more coherent probabilistic model of the indicative probabilities of the message terms.

We have reviewed SPAM filtering techniques that are widely implemented in a large variety of software. We now schematize a collaborative filtering mediator that uses peer-to-peer networks to permit an exchange of these filters, producing a global SPAM indicative probability that can be used in workgroups and in cases where SPAM message terms often correlate.

4 Exchanging SPAM Indicative Probabilities over Peer to Peer Networks

Recently the concept of peer-to-peer networks has emerged as a new kind of decentralized architecture in which nodes of equal roles and capabilities exchange information and services directly with each other [5]. The most well known characteristic of a peer-to-peer network is its decentralized architecture, which gives advantages such as no single point of failure or independence of escalation.

Figure 1: Peer to Peer deployed graph
Most types of peer-to-peer networks are built upon an anonymous policy that permits anyone to join the network and exchange certain types of information with other peers. In our approach we use a type of peer-to-peer network that requires a light authentication process, implemented by an exchange of invitations between peers. This type of peer-to-peer network, also known as a trusted network, can also be used efficiently in security applications [6]. By joining end users of the same workgroup in a filter exchange platform, we can combine their filter characteristics and apply a probability classification scheme that becomes more effective as peers join the network and exchange their indicative probabilities.

4.1 Overall System Architecture

As can be seen from Figure 1, we define two types of nodes in our peer-to-peer network: Query Agents and Simple Peers. Query agents handle the service requests that come from the client application. The user agent then makes a reverse lookup against the total probability distribution of the message terms that comes from the network.

Figure 2: Client-Network Architecture

We now define a parameter $\lambda$ that expresses the classification criterion:

\[
\lambda = \frac{\bigl( P(C = spam \mid X_i = x_i) \bigr)_{i=1}^{i=n}}{P(C = spam \mid X_{global} = x_{global})}
\]

with $\lambda \in (0, 1]$, and assuming that $\lambda$ follows the normal distribution $N(0, 1)$. For certain values of $\lambda$, shown in Table 2, a supervised interaction with the user is required.

Table 2: λ values

λ%      Threshold   Supervision
68%     Low         High
95%     Medium      Medium
99.7%   High        Low

4.2 Querying & LoopBack Service

The second component of our architecture is the LoopBack service, which is used to supervise the classification parameter. We now define the cost of classifying a legitimate message as SPAM as a vector space $C(legit \to SPAM)$. Let $W$ be a vector representation of a simple whitelist filter. We declare $R_I$ as the header value of a misclassified message $r_i$, with $r_i \in C(legit \to SPAM)$, so that $W = (R_1, R_2, \ldots, R_n)$. The Query Agent now takes a query in the form

\[
Q(s) = P(C = SPAM \mid X = x_{global})
\]

with $x \notin R_I$.

Adding more values to the whitelist vector incurs extra effort, because certain nodes must be queried about the rule $R_i$ each time. We are therefore currently examining the concept of a special type of peer (supernode) in our architecture that stores a query history and makes the querying process more efficient. Under this architecture, the classification can only be as accurate as the network's indicative probability for the specific term. Similar architectures for SPAM can be found in Zhou [7], which uses the approximate object location method to identify spam by mediating the matching of header fields and constructing a fingerprint verification that can be used by the Mail Transfer Agent.
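The role of Table 2 in Section 4.1 can be sketched as a simple lookup from a value of λ to the threshold and supervision levels. Two points here are assumptions of the sketch and are not stated in the text: the λ percentages of Table 2 are read as upper edges of bands, and the per-term probabilities reported by peers are pooled with a plain mean to obtain a global indicative probability. The function names are hypothetical.

```python
from statistics import mean

# Rows of Table 2 as (lambda upper edge, threshold level, supervision level).
# Reading the listed percentages as band edges is an assumption of this sketch.
LAMBDA_BANDS = [
    (0.68, "Low", "High"),
    (0.95, "Medium", "Medium"),
    (0.997, "High", "Low"),
]


def global_probability(peer_probabilities):
    """Pool the spam probabilities that peers report for one term.

    A plain mean is used purely for illustration; the pooling rule is
    an assumption, not the method of the proposed system.
    """
    return mean(peer_probabilities)


def supervision_for(lam):
    """Map a lambda value in (0, 1] to (threshold, supervision) per Table 2."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("lambda must lie in (0, 1]")
    for upper_edge, threshold, supervision in LAMBDA_BANDS:
        if lam <= upper_edge:
            return threshold, supervision
    # Values above the last listed edge fall into the last row.
    return LAMBDA_BANDS[-1][1:]
```

For example, `supervision_for(0.9)` returns `("Medium", "Medium")`, meaning a medium threshold with a medium amount of user supervision.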
The wide use of spoofing techniques, where senders hide their address or use false addresses, makes the above architecture sensitive to SPAM attacks that use this method to bypass filters.

5 Ongoing and Future Work

This system is research in progress. We are currently evaluating the accuracy of the proposed system by creating a functional prototype deployed on top of the JXTA API [8]. Considering the phenomenon of SPAM as an emerging problem for all users of the internet community, we would be happy to collaborate with researchers who are interested in extending our prototype.

References

[1] Cranor L.F. and LaMacchia B.A. Spam! Communications of the ACM, vol. 41(8), 1998, pp.
[2] Mozilla Spam Filtering.
[3] Graham A. A Plan for SPAM. Online, August
[4] Sahami M., Dumais S., Heckerman D., and Horvitz E. A Bayesian Approach to Filtering Junk E-mail. In Learning for Text Categorization: Papers from the 1998 Workshop. AAAI Technical Report WS-98-05, Madison, Wisconsin,
[5] Androutsellis-Theotokis S. A Survey of Peer-to-Peer File Sharing Technologies. Tech. Rep. WHP, Athens University of Economics and Business, Athens, Greece,
[6] Vlachos V., Androutsellis-Theotokis S., and Spinellis D. Security applications of peer-to-peer networks. Computer Networks, vol. 45(2), June 2004, pp.
[7] Zhou F., Zhuang L., Zhao B.Y., Huang L., Joseph A.D., and Kubiatowicz J. Approximate Object Location and SPAM Filtering on Peer-to-Peer Systems. In Endler M. and Schmidt D. (eds.), Proceedings of ACM/IFIP/USENIX International Middleware Conference (Middleware 2003), vol. of Lecture Notes in Computer Science. Springer Verlag, Rio de Janeiro, Brazil,
[8] Gong L. Project JXTA: A technology overview. Tech. rep., SUN Microsystems, April

An Empirical Performance Comparison of Machine Learning Methods for Spam Categorization

An Empirical Performance Comparison of Machine Learning Methods for Spam  Categorization An Empirical Performance Comparison of Machine Learning Methods for Spam E-mail Categorization Chih-Chin Lai a Ming-Chi Tsai b a Dept. of Computer Science and Information Engineering National University

More information

A Reputation-based Collaborative Approach for Spam Filtering

A Reputation-based Collaborative Approach for Spam Filtering Available online at www.sciencedirect.com ScienceDirect AASRI Procedia 5 (2013 ) 220 227 2013 AASRI Conference on Parallel and Distributed Computing Systems A Reputation-based Collaborative Approach for

More information

Filtering Spam by Using Factors Hyperbolic Trees

Filtering Spam by Using Factors Hyperbolic Trees Filtering Spam by Using Factors Hyperbolic Trees Hailong Hou*, Yan Chen, Raheem Beyah, Yan-Qing Zhang Department of Computer science Georgia State University P.O. Box 3994 Atlanta, GA 30302-3994, USA *Contact

More information

Project Report. Prepared for: Dr. Liwen Shih Prepared by: Joseph Hayes. April 17, 2008 Course Number: CSCI

Project Report. Prepared for: Dr. Liwen Shih Prepared by: Joseph Hayes. April 17, 2008 Course Number: CSCI University of Houston Clear Lake School of Science & Computer Engineering Project Report Prepared for: Dr. Liwen Shih Prepared by: Joseph Hayes April 17, 2008 Course Number: CSCI 5634.01 University of

More information

Efficacious Spam Filtering and Detection in Social Networks

Efficacious Spam Filtering and Detection in Social Networks Indian Journal of Science and Technology, Vol 7(S7), 180 184, November 2014 ISSN (Print) : 0974-6846 ISSN (Online) : 0974-5645 Efficacious Spam Filtering and Detection in Social Networks U. V. Anbazhagu

More information

Keywords : Bayesian, classification, tokens, text, probability, keywords. GJCST-C Classification: E.5

Keywords : Bayesian,  classification, tokens, text, probability, keywords. GJCST-C Classification: E.5 Global Journal of Computer Science and Technology Software & Data Engineering Volume 12 Issue 13 Version 1.0 Year 2012 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global

More information

Spam Classification Documentation

Spam Classification Documentation Spam Classification Documentation What is SPAM? Unsolicited, unwanted email that was sent indiscriminately, directly or indirectly, by a sender having no current relationship with the recipient. Objective:

More information

Increasing the Accuracy of a Spam-Detecting Artificial Immune System

Increasing the Accuracy of a Spam-Detecting Artificial Immune System Increasing the Accuracy of a Spam-Detecting Artificial Immune System Terri Oda Carleton University 1125 Colonel By Drive Ottawa, ON K1S 5B6 terri@zone12.com Tony White Carleton University 1125 Colonel

More information

Diagnosis of Spams Some Statistical Considerations

Diagnosis of  Spams Some Statistical Considerations International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 3, Issue 4 (August 2012), PP. 05-09 Diagnosis of Email Spams Some Statistical Considerations

More information

P2P Contents Distribution System with Routing and Trust Management

P2P Contents Distribution System with Routing and Trust Management The Sixth International Symposium on Operations Research and Its Applications (ISORA 06) Xinjiang, China, August 8 12, 2006 Copyright 2006 ORSC & APORC pp. 319 326 P2P Contents Distribution System with

More information

Untitled Page. Help Documentation

Untitled Page. Help Documentation Help Documentation This document was auto-created from web content and is subject to change at any time. Copyright (c) 2018 SmarterTools Inc. Antispam Administration SmarterMail comes equipped with a number

More information

A Framework for Securing Databases from Intrusion Threats

A Framework for Securing Databases from Intrusion Threats A Framework for Securing Databases from Intrusion Threats R. Prince Jeyaseelan James Department of Computer Applications, Valliammai Engineering College Affiliated to Anna University, Chennai, India Email:

More information

A novel supervised learning algorithm and its use for Spam Detection in Social Bookmarking Systems

A novel supervised learning algorithm and its use for Spam Detection in Social Bookmarking Systems A novel supervised learning algorithm and its use for Spam Detection in Social Bookmarking Systems Anestis Gkanogiannis and Theodore Kalamboukis Department of Informatics Athens University of Economics

More information

Content Based Spam Filtering

Content Based Spam  Filtering 2016 International Conference on Collaboration Technologies and Systems Content Based Spam E-mail Filtering 2nd Author Pingchuan Liu and Teng-Sheng Moh Department of Computer Science San Jose State University

More information

Detecting Spam Bots in Online Social Networking Sites: A Machine Learning Approach

Detecting Spam Bots in Online Social Networking Sites: A Machine Learning Approach Detecting Spam Bots in Online Social Networking Sites: A Machine Learning Approach Alex Hai Wang College of Information Sciences and Technology, The Pennsylvania State University, Dunmore, PA 18512, USA

More information

An Experimental Evaluation of Spam Filter Performance and Robustness Against Attack

An Experimental Evaluation of Spam Filter Performance and Robustness Against Attack An Experimental Evaluation of Spam Filter Performance and Robustness Against Attack Steve Webb, Subramanyam Chitti, and Calton Pu {webb, chittis, calton}@cc.gatech.edu College of Computing Georgia Institute

More information

COSC 301 Network Management. Lecture 14: Electronic Mail

COSC 301 Network Management. Lecture 14: Electronic Mail COSC 301 Network Management Lecture 14: Electronic Mail Zhiyi Huang Computer Science, University of Otago COSC301 Lecture 14: Electronic Mail 1 Today s Focus Electronic Mail -- How does it work? -- How

More information

ANALYSIS AND EVALUATION OF DISTRIBUTED DENIAL OF SERVICE ATTACKS IDENTIFICATION METHODS

ANALYSIS AND EVALUATION OF DISTRIBUTED DENIAL OF SERVICE ATTACKS IDENTIFICATION METHODS ANALYSIS AND EVALUATION OF DISTRIBUTED DENIAL OF SERVICE ATTACKS IDENTIFICATION METHODS Saulius Grusnys, Ingrida Lagzdinyte Kaunas University of Technology, Department of Computer Networks, Studentu 50,

More information

STUDYING OF CLASSIFYING CHINESE SMS MESSAGES

STUDYING OF CLASSIFYING CHINESE SMS MESSAGES STUDYING OF CLASSIFYING CHINESE SMS MESSAGES BASED ON BAYESIAN CLASSIFICATION 1 LI FENG, 2 LI JIGANG 1,2 Computer Science Department, DongHua University, Shanghai, China E-mail: 1 Lifeng@dhu.edu.cn, 2

More information

SPAM PRECAUTIONS: A SURVEY

SPAM PRECAUTIONS: A SURVEY International Journal of Advanced Research in Engineering ISSN: 2394-2819 Technology & Sciences Email:editor@ijarets.org May-2016 Volume 3, Issue-5 www.ijarets.org EMAIL SPAM PRECAUTIONS: A SURVEY Aishwarya,

More information

VisNetic MailPermit. Enterprise Anti-spam Software. VisNetic MailPermit

VisNetic MailPermit. Enterprise Anti-spam Software. VisNetic MailPermit VisNetic MailPermit Enterprise Anti-spam Software VisNetic MailPermit p e r m i s s i o n - b a s e d email system Best of Class VisNetic MailPermit is on-premise anti-spam software that combines SpamAssassin

More information

A modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems

A modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems A modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems Anestis Gkanogiannis and Theodore Kalamboukis Department of Informatics Athens University

More information

Trust4All: a Trustworthy Middleware Platform for Component Software

Trust4All: a Trustworthy Middleware Platform for Component Software Proceedings of the 7th WSEAS International Conference on Applied Informatics and Communications, Athens, Greece, August 24-26, 2007 124 Trust4All: a Trustworthy Middleware Platform for Component Software

More information

Introduction. Logging in. WebQuarantine User Guide

Introduction. Logging in. WebQuarantine User Guide Introduction modusgate s WebQuarantine is a web application that allows you to access and manage your email quarantine. This user guide walks you through the tasks of managing your emails using the WebQuarantine

More information

Introduction to Antispam Practices

Introduction to Antispam Practices By Alina P Published: 2007-06-11 18:34 Introduction to Antispam Practices According to a research conducted by Microsoft and published by the Radicati Group, the percentage held by spam in the total number

More information

Introduction. Logging in. WebMail User Guide

Introduction. Logging in. WebMail User Guide Introduction modusmail s WebMail allows you to access and manage your email, quarantine contents and your mailbox settings through the Internet. This user guide will walk you through each of the tasks

More information

Getting Started 2 Logging into the system 2 Your Home Page 2. Manage your Account 3 Account Settings 3 Change your password 3

Getting Started 2 Logging into the system 2 Your Home Page 2. Manage your Account 3 Account Settings 3 Change your password 3 Table of Contents Subject Page Getting Started 2 Logging into the system 2 Your Home Page 2 Manage your Account 3 Account Settings 3 Change your password 3 Junk Mail Digests 4 Digest Scheduling 4 Using

More information

Ethical Hacking and. Version 6. Spamming

Ethical Hacking and. Version 6. Spamming Ethical Hacking and Countermeasures Version 6 Module XL Spamming News Source: http://www.nzherald.co.nz/ Module Objective This module will familiarize you with: Spamming Techniques used by Spammers How

More information

Sender Reputation Filtering

Sender Reputation Filtering This chapter contains the following sections: Overview of, on page 1 SenderBase Reputation Service, on page 1 Editing Score Thresholds for a Listener, on page 4 Entering Low SBRS Scores in the Message

More information

A Comparison of Text-Categorization Methods applied to N-Gram Frequency Statistics

A Comparison of Text-Categorization Methods applied to N-Gram Frequency Statistics A Comparison of Text-Categorization Methods applied to N-Gram Frequency Statistics Helmut Berger and Dieter Merkl 2 Faculty of Information Technology, University of Technology, Sydney, NSW, Australia hberger@it.uts.edu.au

More information

Computer aided mail filtering using SVM

Computer aided mail filtering using SVM Computer aided mail filtering using SVM Lin Liao, Jochen Jaeger Department of Computer Science & Engineering University of Washington, Seattle Introduction What is SPAM? Electronic version of junk mail,

More information

A Three-Way Decision Approach to Spam Filtering

A Three-Way Decision Approach to  Spam Filtering A Three-Way Decision Approach to Email Spam Filtering Bing Zhou, Yiyu Yao, and Jigang Luo Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 {zhou200b,yyao,luo226}@cs.uregina.ca

More information

A Content Vector Model for Text Classification

A Content Vector Model for Text Classification A Content Vector Model for Text Classification Eric Jiang Abstract As a popular rank-reduced vector space approach, Latent Semantic Indexing (LSI) has been used in information retrieval and other applications.

More information

Identifying Important Communications

Identifying Important Communications Identifying Important Communications Aaron Jaffey ajaffey@stanford.edu Akifumi Kobashi akobashi@stanford.edu Abstract As we move towards a society increasingly dependent on electronic communication, our

More information

Automated Online News Classification with Personalization

Automated Online News Classification with Personalization Automated Online News Classification with Personalization Chee-Hong Chan Aixin Sun Ee-Peng Lim Center for Advanced Information Systems, Nanyang Technological University Nanyang Avenue, Singapore, 639798

More information

Filtering Spam Using Fuzzy Expert System 1 Hodeidah University, Faculty of computer science and engineering, Yemen 3, 4

Filtering Spam Using Fuzzy Expert System 1 Hodeidah University, Faculty of computer science and engineering, Yemen 3, 4 Filtering Spam Using Fuzzy Expert System 1 Siham A. M. Almasan, 2 Wadeea A. A. Qaid, 3 Ahmed Khalid, 4 Ibrahim A. A. Alqubati 1, 2 Hodeidah University, Faculty of computer science and engineering, Yemen

More information

Basic Concepts in Intrusion Detection
