Malicious Drive-By-Download Website Classification Using JavaScript Features. Sam Wang. B.Sc., University of Victoria, 2014


Malicious Drive-By-Download Website Classification Using JavaScript Features
by Sam Wang
B.Sc., University of Victoria, 2014

An Industrial Project Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in the Department of Computer Science

Sam Wang, 2016
University of Victoria

All rights reserved. This project may not be reproduced in whole or in part, by photocopy or other means, without the permission of the author.

Supervisory Committee

Dr. Jianping Pan, Department of Computer Science (Supervisor)
Dr. Sudhakar Ganti, Department of Computer Science (Departmental Member)

Abstract

In recent years, drive-by-download attacks have made up over 90% of web-based attacks on web users. Many web users fall victim to this type of attack because it is simple and requires little of the victim: they only need to click on a malicious URL while running a browser with some vulnerabilities for attackers to compromise their machine and obtain their sensitive information. To combat these attacks, proactive blacklists are now used to prevent web users from accessing malicious web pages. This report attempts to supplement the existing proactive blacklisting framework by introducing JavaScript feature vectors for classification. These feature vectors capture the functionality of the JavaScript in terms of JavaScript bytecode, as well as some string analysis properties, for the classification of benign and malicious web pages. Several classifiers are tested and compared to provide insight into the different JavaScript feature vectors defined.

Table of Contents

Supervisory Committee
Abstract
Table of Contents
List of Figures
List of Tables
Acknowledgements
1. Background
1.1 Drive-by-Download Attacks
1.2 Current Approaches
2. Related Work
3. AutoBLG
3.1 AutoBLG Overview
3.2 URL Expansion
3.3 URL Filtration
3.4 URL Verification
4. Analysis of the Current Framework and Plan
5. Classification via JavaScript Bytecode Features
6. Classification via JavaScript String Properties
6.1 JavaScript Obfuscation
6.2 String Tokenization
6.3 Work Flow
6.4 Data Source
6.5 Obtaining and Tokenizing JavaScript
6.6 N-gram Analysis
6.7 Input into Weka
7. Results
7.1 1-gram vs 2-gram vs 3-gram
7.2 Complete Set of Features
8. Future Improvements and Conclusion
9. References

List of Figures

Figure 1: Drive-by-Download attack model
Figure 2: High-level overview and work flow of AutoBLG
Figure 3: Snapshot of JavaScript bytecode ratio distribution for each website
Figure 4: Work flow of the classification process
Figure 5: Comparison of 1-gram vs 2-gram vs 3-gram tokenization
Figure 6: Classification result for Bayesian Network
Figure 7: Classification result for Logistic Regression
Figure 7A: Snapshot of the coefficient weights for the Logistic Function
Figure 8: Classification result for SMO
Figure 8A: Snapshot of the attribute weights for the support vectors
Figure 9: Classification result for J48 tree
Figure 9A: Pruned tree decisions for the J48 tree
Figure 10: Classification result for Random Forest
Figure 11: Comparison of the different classifiers

List of Tables

Table 1. Basic techniques of JavaScript obfuscation and evasion

Acknowledgements

I would like to acknowledge and thank the AutoBLG team (Network Security Lab) at Waseda University in Japan for providing me with the data set for this project and for the technical support provided in processing the data. I would also like to thank the members of the University of Victoria Parallel, Networking and Distributed Applications (PANDA) lab for their help and support during this project, and the Natural Sciences and Engineering Research Council of Canada (NSERC) for the research assistantship and equipment.

1. Background

1.1 Drive-by-Download Attacks

Drive-by-download is a type of web-based attack in which a user is attacked while accessing a malicious or compromised website, either via a malicious URL (linked in e-mails, or hot-linked by attackers) or during normal exploration of the Internet. The attack succeeds only if the user fits a vulnerability profile that the malicious site is looking for when the user accesses the site. As demonstrated in Fig. 1, when the user first clicks on the landing page URL, the user is redirected to an exploit URL in the background. The exploit URL probes the user's browser for vulnerabilities, and if any are found, the user is redirected to a malware download URL, where malware is downloaded onto the user's machine without notice. The downloaded files can then steal sensitive information or computer resources from the user. In recent years, drive-by-download attacks have made up over 90% of web-based attacks [1], which makes them a hot research topic and a priority for researchers and security engineers alike.

Fig. 1: Drive-by-Download attack model [1].

1.2 Current Approaches

Two main approaches are used to defend against drive-by-download attacks. The first is to actively identify browser vulnerabilities and patch them before they can be exploited. In many instances, however, browser vulnerabilities are not identified until attacks occur, and we cannot always rely on end users to be proactive in patching. The second approach is the use of URL blacklists. Blacklists are lists of blocked URLs that ISPs, search engines, or network administrators maintain to prevent end users from accessing malicious URLs. Reactive blacklists share the same weakness as patching: they rely on some users being attacked before the malicious URL can be registered. Recently, much work in the research community has focused on creating proactive blacklists that can register compromised sites and blacklist them before the attacks are even active.

2. Related Work

Many different angles have been explored in the detection of new malicious URLs. The International Computer Science Institute group at Berkeley proposes a domain name-based approach [2], in which they check for elements such as the age of the name server of a domain and the relationship between the name server and the domain name. The University of California, San Diego group explores the lexical features of a URL and host-based features such as where the hosts are, who owns them and how they are managed [3], and then uses machine learning approaches to classify good and bad URLs. The University of Texas group analyzes both network-layer traffic and application-layer website contents in their work [4]. The Japan Advanced Institute of Science and Technology (JAIST) group creates a model for predicting the download of malware during a drive-by-download attack session by leveraging Common Vulnerabilities and Exposures (CVE) ID vulnerability information as well as the JavaScript opcodes of websites [5]. Prophiler [6] is an all-around tool that takes in HTML, JavaScript, URL and host information to classify unknown websites as malicious or benign. A group of researchers at Yokohama National University in Japan explores ways of identifying JavaScript obfuscation [7] (obfuscation is discussed later in this report). Automatic Blacklist Generator (AutoBLG) is a lightweight proactive malicious URL blacklist framework [1]. This framework is explored further in the next section.

3. AutoBLG

3.1 AutoBLG Overview

AutoBLG is a lightweight static analysis framework for the discovery of new malicious URLs. The framework consists of three components: URL Expansion, URL Filtration and URL Verification (Fig. 2). The inputs of AutoBLG are currently known malicious URLs, and the outputs of the framework are new malicious URLs.

Fig. 2: High-level overview and work flow of AutoBLG [1].

3.2 URL Expansion

The URL expansion step takes in known malicious URLs and obtains the IP addresses associated with them. From this set of IP addresses, a passive DNS database is leveraged to return a set of Fully Qualified Domain Names (FQDNs) associated with the IP addresses. This set of FQDNs is considered the neighbourhood of the existing malicious URLs in terms of IP address. The FQDN itself is not sufficient, because the attacker likely places malicious web pages deep in the directory structure of a server. The next step therefore leverages search engines and web crawlers to obtain highly ranked and accessible pages inside the directories. The output of the URL expansion phase is a large set of URLs associated with the original malicious URLs.

3.3 URL Filtration

Due to the high number of URLs generated by the URL expansion step, a filter is used to trim them down to a much smaller set before they are processed by the verification techniques. The URL filtration step leverages the Bayesian Sets algorithm to rank URLs in terms of maliciousness. The features used for feature extraction and classification are 10 static features from the landing page contents: the number of iframe and frame tags, the number of hidden elements with a small display area, the number of out-of-place elements, the number of embed and object tags, the presence of unescape behaviour, the number of suspicious words in the script, the number of setTimeout functions, and the number of URLs with a different domain. Based on this ranking, the top-ranked malicious URLs are passed on to the URL verification step. Overall, about 1% of the inputs of URL filtration are output to the next step.

3.4 URL Verification

The URLs ranked as highly malicious are finally passed on to the URL verification step, which confirms whether they are malicious. Three tools are leveraged in this step: the Marionette web client honeypot, anti-virus software, and the online URL verification site VirusTotal. The web client honeypot can trace the redirections generated by drive-by-download attacks and identify the malware distribution URL. Anti-virus software analyzes HTML and JavaScript contents statically to check for malicious contents. The VirusTotal website compares submitted URLs against URL blacklists and cyber-attack detection systems, and then forwards the result of the comparison to users.
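As a rough illustration of what "static features from the landing page contents" can look like in practice, the sketch below counts a handful of the indicators listed above with regular expressions. It is not the AutoBLG implementation: the function name `static_features`, the dictionary keys, and the regex-based counting are assumptions made for this example, and only a subset of the ten features is covered.

```python
import re


def static_features(html: str) -> dict:
    """Count a few landing-page indicators (illustrative subset only).

    Hypothetical helper: the feature names and the regex-based counting
    are assumptions for this sketch, not the AutoBLG implementation.
    """
    lower = html.lower()
    return {
        # Opening tags only; "</iframe>" does not match "<iframe".
        "iframe_tags": len(re.findall(r"<iframe\b", lower)),
        # \b keeps "<frameset" from being counted as a frame tag.
        "frame_tags": len(re.findall(r"<frame\b", lower)),
        "embed_tags": len(re.findall(r"<embed\b", lower)),
        "object_tags": len(re.findall(r"<object\b", lower)),
        # Presence (0 or 1) of an unescape() call anywhere in the page.
        "has_unescape": int("unescape(" in lower),
        # Number of setTimeout(...) calls (case-folded above).
        "settimeout_calls": len(re.findall(r"\bsettimeout\s*\(", lower)),
    }
```

A real filter would also need the layout-dependent features (hidden or out-of-place elements), which require parsing element attributes and styles rather than simple pattern matching.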

4. Analysis of the Current Framework and Plan

The AutoBLG framework does a good job of detecting new potentially malicious URLs while remaining lightweight. However, as its authors mention, a few things can be improved.

Classification accuracy. Currently, of the 600 URLs input to the final verification step, 106 are found to be malicious or suspicious. The final verification step is tedious and time consuming even for just 600 URLs, so higher classification accuracy is desired in the URL filtration step, either by reducing the false positives (and thus the number of non-malicious URLs to be verified) or, even better, by decreasing the number of false negatives so that more malicious URLs reach the final verification step.

Cloaked URL detection. Currently, the set of URLs to be filtered and finally verified relies on search engines and web crawlers finding the site in the first place. Some sites employ advanced cloaking techniques to hide from search engines and web crawlers, and these cloaked URLs will never be in the set of URLs to be filtered or verified.

The AutoBLG framework currently leverages very few JavaScript features in the URL filtration step: only the number of suspicious words in the script is observed, and only for JavaScript directly inside the HTML landing pages. The authors themselves mention in the paper that they intend to increase the use of JavaScript in the landing page in the future. By gathering and using more JavaScript, we can address some of the issues above. In the URL filtration step, additional JavaScript-based features can complement the HTML-based features observed now to improve classification accuracy. By scrutinizing the JavaScript, we can also potentially identify hidden URLs that are missed by the web crawlers and pick up some malicious URLs that would otherwise be missed.

Unfortunately, much of the JavaScript on the Internet, especially on malicious sites, is obfuscated, so it is hard to analyze and tell exactly what it is doing. JavaScript obfuscation is a technique in which the JavaScript code and data fields are scrambled and modified so that they are hard for a reader or a program analyzer to understand, while remaining functionally equivalent to the JavaScript before obfuscation. This technique is used on many sites, not only malicious ones but also some benign ones.

Based on the above, my plan was to identify JavaScript classification vectors via static analysis of two different sources. The first source is the JavaScript bytecode features extracted from the JavaScript, which give us an idea of what the JavaScript is doing. The second source takes advantage of the fact that many malicious sites contain obfuscated JavaScript, which looks different from non-obfuscated JavaScript and has a different distribution of characters. Leveraging these differences also allows us to extract feature vectors for classification.

5. Classification via JavaScript Bytecode Features

Work for this part of the research was conducted last semester in the CSC 591 Directed Studies Web Security course, so I will only briefly explain the reasoning and methodology behind this part of the classification. This section is included because the final results of this report also rely on this work.

Because of JavaScript obfuscation, instead of looking directly at the JavaScript source for feature vectors, I employed a technique similar to what the JAIST group used for predicting malware download [5]: the interpreted bytecode of the JavaScript is used for the feature vectors instead of the JavaScript itself. Bytecode has the advantage of bypassing obfuscation, as well as exposing functions and URLs hidden inside the JavaScript, because the JavaScript is partially interpreted. The bytecode used is the SpiderMonkey bytecode developed by Mozilla. Although this bytecode is specific to the Mozilla browser, it is sufficient for our feature extraction over the instruction set.

"SpiderMonkey bytecodes are the canonical form of code representation that is used in the JavaScript engine. The JavaScript front end constructs an Abstract Syntax Tree (AST) from the source text, then emits stack-based bytecodes from that AST as a part of the JSScript data structure. Bytecodes can reference atoms and objects (typically by array index) which are also contained in the JSScript data structure." - Mozilla Developer Network

The process of obtaining the JavaScript bytecode involves going through the HTML of the landing pages, extracting all JavaScript blocks, and downloading all associated JavaScript files. These JavaScript files are then passed to the SpiderMonkey shell and disassembled into bytecode form. The feature vectors used for classification (Fig. 3) are the ratios of the number of occurrences of each bytecode instruction to the total number of bytecode instructions.
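The ratio features described above can be sketched as follows. This is a minimal illustration that assumes the opcode mnemonics (for example `call`, `goto`, `string`) have already been parsed out of the disassembler output into a list; the function name `opcode_ratios` and the idea of passing a fixed `vocabulary` are assumptions for this example, and the parsing of the SpiderMonkey shell output is not shown.

```python
from collections import Counter


def opcode_ratios(opcodes, vocabulary):
    """Return, for each opcode in `vocabulary`, its share of all opcodes.

    `opcodes` is the sequence of mnemonics seen for one website and
    `vocabulary` is the fixed list of opcodes used as feature columns
    (both hypothetical inputs for this sketch).
    """
    counts = Counter(opcodes)
    total = len(opcodes)
    if total == 0:
        # A site with no JavaScript gets an all-zero feature vector.
        return {op: 0.0 for op in vocabulary}
    return {op: counts[op] / total for op in vocabulary}
```

Using a fixed vocabulary keeps the feature columns identical across websites, which is what a tabular classifier input requires.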

[Fig. 3 table: one row per website (geico.com, elpais.com, pornerbros.com, blogger.com, gazetaexpress.com, hdzog.com, zdf.de, instantdownloaderpro.com, tilestwra.com, freepik.com, echoroukonline.com, meituan.com, wordpress.org, huffingtonpost.ca, smallpdf.com, allegro.pl, elmogaz.com, gmw.cn, dpstream.net) and one column per bytecode ratio (strictne, string, hole, call, strict-delelem, initelem_array, enditer, length, goto, setname)]

Fig. 3: Snapshot of JavaScript bytecode ratio distribution for each website.

6. Classification via JavaScript String Properties

6.1 JavaScript Obfuscation

The previous section can be described as classification using the functionality of the JavaScript, whereas this section covers classification using the visual difference between benign and malicious JavaScript, that is, how the structures of the strings inside the JavaScript differ. This takes advantage of the fact that many malicious scripts are obfuscated whereas very few benign scripts are [6]. To better understand how string properties can be leveraged to detect obfuscated JavaScript and malicious web pages, we should first understand the basics of JavaScript obfuscation. Table 1 summarizes some of the basic techniques used for obfuscation.

Table 1. Basic techniques of JavaScript obfuscation and evasion [8].

Technique: Description
String encoding: Encode literals to generate unreadable versions of them (e.g., the character A can be presented as %41 using URL encoding)
Integer obfuscation: Apply mathematical operations to generate a numerical value as the evaluation of a mathematical expression
Whitespace and comment randomization: Remove indentations, whitespaces, and comments and write the source script into a single line
Identifier reassignment: Give alias names to the defined function calls to hinder tainting analysis
Block randomization: Manipulate nested control structures to hinder analysis even by web developers
String splitting: Exploit methods of the string object to dynamically generate literals

As the table suggests, obfuscated JavaScript should have a character distribution that differs from that of non-obfuscated JavaScript, and we should be able to leverage this difference to tell the two apart.

6.2 String Tokenization

A fair amount of work has been done on detecting JavaScript obfuscation [7,8], and some of it attempts to deobfuscate the JavaScript back to its original form. Most of these works involve string analysis methods and look to entropy analysis for classification. Through an iterative testing process, I ended up employing a method closely resembling that of the 2016 paper by the group at Yokohama National University [7]. Instead of analyzing the raw characters of the JavaScript, the JavaScript is first tokenized into four categories: lower case letters are converted to '0', upper case letters are converted to '1', digits are converted to '2', and punctuation (except white space) is converted to '3'. White space is removed in this process. This tokenization preserves the fact that obfuscated JavaScript has a different distribution from normal JavaScript, while keeping only four types of characters allows more complicated feature extraction such as n-gram analysis without producing too many feature vectors.
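The four-category tokenization described above can be sketched in a few lines. This is a minimal illustration, not the program used in the project: the function name `tokenize` is hypothetical, letters and digits are matched against ASCII ranges only, and every other non-whitespace character (including non-ASCII characters, which the report does not discuss) is treated as category '3' by assumption.

```python
def tokenize(js_source: str) -> str:
    """Map JavaScript source text onto the alphabet {'0', '1', '2', '3'}.

    '0' = lower case letter, '1' = upper case letter, '2' = digit,
    '3' = any other non-whitespace character (assumption: this also
    covers non-ASCII characters). Whitespace is dropped.
    """
    out = []
    for ch in js_source:
        if ch.isspace():
            continue  # whitespace is removed, not tokenized
        if "a" <= ch <= "z":
            out.append("0")
        elif "A" <= ch <= "Z":
            out.append("1")
        elif "0" <= ch <= "9":
            out.append("2")
        else:
            out.append("3")
    return "".join(out)
```

For example, `tokenize("var X = 41;")` yields `"00013223"`: three lower case letters, one upper case letter, the `=` sign, two digits and the semicolon.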

6.3 Work Flow

Fig. 4: Work flow of the classification process.

6.4 Data Source

The work flow used to obtain the string features is shown in Fig. 4 above. For the benign inputs, the list of Alexa top 5000 sites is used, from which 120 URLs are randomly selected (SelectURL.java) to serve as our benign sites. To obtain the HTML of these sites, the wget command is invoked from a Python script (htmlget.py). For the malicious inputs, I used the web trace data supplied by the AutoBLG team at Waseda. Initially I used Wireshark to pull the files directly via HTTP export, but this turned out to be too time consuming for the large number of sites in the .pcap file, so I looked for another tool for this task. The Network Miner tool takes in .pcap files, outputs all files in the trace, and saves them inside folders separated by IP address. From these files, I am only interested in the JavaScript and HTML files, so all files except .html, .htm, .js and .javascript are removed first. Next, because the malicious sites also call and refer to many non-malicious sites, which could distort our classification, I cross-referenced the malicious URL list provided by Waseda with the IP and URL list from Network Miner, and kept only the folders whose IP addresses belong to the malicious URLs themselves. We now have a set of folders containing JavaScript and HTML files that originate from the malicious URLs themselves. This set contains about 120 sites, although it is reduced slightly later because some sites have no or very few JavaScript features. The JavaScript bytecode files for the bytecode analysis are obtained from the same sources. The count files consist purely of the number of occurrences of each bytecode instruction, which allows us to tally them up and compute the ratios and percentages for classification.

6.5 Obtaining and Tokenizing JavaScript

For the benign sites, the JSGet2.java program is used to obtain the JavaScript from the HTML files, both the scripts inside the main body of the HTML itself and the external JavaScript files called by the HTML. JSGet2.java looks for the tags <script...> and </script> in the HTML file. If the opening tag is found in the form <script>, the following section up to </script> is assumed to be JavaScript. If the phrase " src=" (a space followed by src=) is found in the script tag, the JavaScript is assumed to be loaded from an external file, and different conditions are set to handle the associated URLs. If the tag is not a simple <script> and the src= phrase is not found, the phrase "javascript" is looked for in the tag, and if it is found, the section following the tag is again assumed to be JavaScript and saved along with the simple <script> tag cases.
A small number of scripts cannot be obtained by the Java program. These are recorded in a log file and obtained manually through the browser. For the malicious sites, the HTML files output by Network Miner are treated in the same way, and the JavaScript files output by Network Miner are set aside for the next step as well. The JavaScript files are then tokenized into the characters 0, 1, 2 and 3, with white space removed.
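The extraction of inline scripts and external script URLs can be sketched as below. Note that this swaps in a different technique from the one described above: JSGet2.java is described as matching strings in the tags, whereas this sketch uses Python's standard `html.parser` module. The class name `ScriptCollector` and the function `collect_scripts` are hypothetical, and downloading the external files is left out.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect inline <script> bodies and external script URLs."""

    def __init__(self):
        super().__init__()
        self.inline_scripts = []   # JavaScript found between script tags
        self.external_urls = []    # values of src= attributes
        self._in_script = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if src:
            # External script: only the URL is recorded here.
            self.external_urls.append(src)
        self._in_script = True
        self._buffer = []

    def handle_data(self, data):
        if self._in_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            body = "".join(self._buffer).strip()
            if body:
                self.inline_scripts.append(body)
            self._in_script = False


def collect_scripts(html: str):
    """Return (inline_scripts, external_urls) for one HTML document."""
    parser = ScriptCollector()
    parser.feed(html)
    parser.close()
    return parser.inline_scripts, parser.external_urls
```

A parser-based approach tolerates attribute order and letter case in the tags, but like the string-matching approach it will still miss scripts that are injected dynamically at run time.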

6.6 N-gram Analysis

From this reduced set of characters, n-gram analysis is used for classification. N=1 gives us the percentage of 0, 1, 2 and 3 characters in the JavaScript, whereas N>1 gives us more sequence information. For example, for N=3 the features are the sequences '000', '001', '002' and so on up to '333', where each feature is the percentage of n-grams in the JavaScript that match that sequence. N=3 was chosen based on test results that are explained later in the results section.

6.7 Input into Weka

For our classification, the popular machine learning tool Weka is used.

"Waikato Environment for Knowledge Analysis (Weka) is a popular suite of machine learning software written in Java, developed at the University of Waikato, New Zealand. It is free software licensed under the GNU General Public License." - Wikipedia

To classify in Weka, each row of the input .csv file corresponds to one of our websites, the columns are the features we are inputting, and one of the columns (usually the last) is the nominal variable stating whether the website is malicious or benign. For the final classification input, 147 bytecode features (the ratio of each bytecode instruction to the total number of instructions), 64 n-gram features from N=3, and 5 JavaScript features (white space percentage, long string percentage, eval() percentage, setTimeout() percentage and setInterval() percentage) are used. The JavaScript features are selected to round off our feature set: the white space percentage makes up for the white space information lost when tokenizing the JavaScript, and the three JavaScript calls have been shown by others [6] to be good indicators when classifying malicious web pages. Long string percentage here is the ratio of strings between white spaces that are longer than a certain threshold. Testing showed that a threshold of 40 works best, which agrees with the results presented by others [6].
Inside Weka, 10-fold cross-validation is used to evaluate the classifiers.
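The n-gram features can be sketched as follows. This is a minimal illustration under two assumptions that the report does not state: n-grams are taken over overlapping (sliding) windows, and the input string contains only the characters '0' to '3' produced by the tokenization step. The function name `ngram_features` is hypothetical.

```python
from itertools import product


def ngram_features(tokens: str, n: int = 3) -> dict:
    """Return the relative frequency of every possible n-gram over '0123'.

    Assumes `tokens` contains only the characters '0', '1', '2', '3'
    and that n-grams overlap (sliding window of width n).
    """
    # Fixed key order: '000', '001', ..., '333' for n = 3 (4**n keys).
    keys = ["".join(p) for p in product("0123", repeat=n)]
    counts = dict.fromkeys(keys, 0)
    windows = max(len(tokens) - n + 1, 0)
    for i in range(windows):
        counts[tokens[i:i + n]] += 1
    if windows == 0:
        # Script shorter than n tokens: all-zero feature vector.
        return {k: 0.0 for k in keys}
    return {k: counts[k] / windows for k in keys}
```

With n = 3 this produces the 64 columns used in the final feature set. The values of the returned dictionary, followed by the class label, would form one row of the .csv file given to Weka.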

7. Results

7.1 1-gram vs 2-gram vs 3-gram

[Fig. 5 panels, each plotting classification accuracy for 1-gram, 2-gram and 3-gram features: a) Bayes Net, b) Sequential Minimal Optimization, c) Logistic Regression, d) J48 Tree, e) Random Forest]

Fig. 5: Comparison of 1-gram vs 2-gram vs 3-gram tokenization.

Fig. 5 above shows the classification accuracy of each n-gram tokenization under five commonly used classifiers. These classifications use the tokenization features only and do not include any of the other features. For 1-gram, 4 features are used; for 2-gram, 16 features; and for 3-gram, 64 features. As expected, the classification accuracy increases as the number of features increases, but the gain from 2-gram to 3-gram is smaller than the gain from 1-gram to 2-gram, indicating that the improvement is levelling off. Because of this, and because the number of features grows exponentially with N, 3-grams (64 features) are chosen as the stopping point and form part of our feature set for the final classification.

7.2 Complete Set of Features

For the complete set of features to be classified, 64 3-gram token features, 147 bytecode features and 5 JavaScript features are used, for a total of 216 features. As for the number of sites, 105 benign sites and 94 malicious sites are classified. Five commonly used but different classifiers are applied to observe how our feature vector set classifies.

Bayesian Net

A Bayesian network is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph.

Fig. 6: Classification result for Bayesian Network.

Fig. 6 shows the overall classification accuracy of the Bayesian Network classifier. The overall accuracy is %, with a true positive rate of 74.3% for the benign sites and 85.1% for the malicious sites.

Logistic Regression

Logistic regression is a regression model that measures the relationship between a categorical dependent variable (malicious, 0 or 1, in our case) and one or more independent variables by estimating probabilities using a logistic function, which is the cumulative logistic distribution.
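For reference, the logistic function mentioned above has the standard textbook form below. The notation is an assumption for this illustration and is not taken from the Weka output: $x_1, \dots, x_k$ are the feature values of one website (here $k = 216$), $\beta_0, \dots, \beta_k$ are the fitted coefficients, and $y = 1$ denotes the malicious class.

```latex
P(y = 1 \mid x_1, \dots, x_k)
  = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{i=1}^{k} \beta_i x_i\right)}}
```

A feature with a large positive coefficient pushes the estimated probability towards the malicious class as its value grows, which is why coefficient weights are often read as a rough indication of feature importance.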

Fig. 7: Classification result for Logistic Regression.

Fig. 7A: Snapshot of the coefficient weights for the Logistic Function.

Fig. 7 shows the overall classification accuracy of the Logistic Regression classifier. The overall accuracy is %, with a true positive rate of 75.2% for benign sites and 77.7% for malicious sites. Fig. 7A is a partial snapshot (partial because of the high number of total features) of the coefficient weights associated with some of the features for the logistic function. The features with the most weight include the bytecode features strict-delelem, strict-setgname, getrval, callee and Jssettimeout, and the JavaScript features settimeout and setinterval.

SMO (Weka's method of SVM)

Support vector machines (SVMs) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis. Sequential Minimal Optimization (SMO) is one way of solving the SVM training problem more efficiently than the standard quadratic programming solvers.

Fig. 8: Classification result for SMO.

Fig. 8A: Snapshot of the attribute weights for the support vectors.

Fig. 8 shows the overall classification accuracy of the SMO classifier. The overall accuracy is %, with a true positive rate of 81.9% for benign sites and 78.7% for malicious sites. Fig. 8A is a partial snapshot (partial because of the high number of total features) of the weights associated with the attributes for SMO. The features with the most weight include the bytecode features strictne, initelem_array, retrval, globalthis, loopentry, newobject and loophead; the 3-gram features 001, 033, 233 and 331; and the JavaScript features white_space_percentage, settimeout and setinterval.

Decision Tree (J48 in Weka)

Decision tree learning uses a decision tree as a predictive model that maps observations about an item to conclusions about the item's target value. The generated tree and the classification result are shown below.

Fig. 9: Classification result for J48 tree.

Fig. 9A: Pruned tree decisions for the J48 tree.

Fig. 9 shows the overall classification accuracy of the J48 tree classifier. The overall accuracy is %, with a true positive rate of 77.1% for benign sites and 75.5% for malicious sites. Fig. 9A shows the pruned tree decisions for the classifier. Some of the features selected as nodes are void, initelem, initprop and the 032 pattern.

Random Forest

Random forest operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees.

Fig. 10: Classification result for Random Forest.

Fig. 10 shows the overall classification accuracy of the Random Forest classifier. The overall accuracy is %, with a true positive rate of 84.8% for benign sites and 83.0% for malicious sites. The branching decisions differ in each iteration of tree construction, and Weka currently does not output the pruning decisions for this classifier.

[Fig. 11: horizontal bar chart of classification accuracy (axis from 70% to 85%) for Random Forest, J48, SMO, Logistic Regression and Bayesian Net]

Fig. 11: Comparison of the different classifiers.

As shown in Fig. 11, the Random Forest and SVM-based classifiers outperformed the other classifiers, which seems to be consistent with the results presented by others [5] and perhaps indicates that Random Forest and SVM should be preferred for this type of classification. In terms of the feature vectors observed, the different classifiers give different weights to each feature, so it is hard to pinpoint exactly which features are most telling. We can, however, rank the features by their associated weights, and the features that carry heavier weight across different classification algorithms should be good indicators of which features matter for classification. Also, some features are better indicators of benign sites while others are better indicators of malicious sites. Depending on whether we are trying to optimize for true positives or true negatives, the importance of each feature can vary.

8. Future Improvements and Conclusion

Due to the time constraints and scope of this project, I did not get a chance to integrate this JavaScript classification into the AutoBLG pipeline, so the results are currently stand-alone. If given the opportunity in the future, I will try to integrate this process to observe possible classification improvements.

In terms of the source data for the malicious sites, I extracted all JavaScript and HTML from the traces. However, this does not capture JavaScript and HTML that the trace did not see, and in the URL filtration step we would not have trace files at all. For the best and most consistent results, the malicious HTML and JavaScript should therefore be obtained in the same way as for the benign sites, that is, by parsing the HTML and fetching the JavaScript files while the site is still live (of course most, if not all, of the sites in the trace are long gone now, so this is not currently possible).

As for the feature vectors, not all of them contribute equally to these classifications. As mentioned above, with more observations and iterations in the future we can isolate the better performing feature vectors to further improve the classification process.

Besides the current bytecode and string tokenization feature vectors, other techniques can also be considered, such as lexical feature analysis of the URL set. There is much work in this area, especially for phishing attack URLs, and whether these techniques are suitable for drive-by-download URLs remains to be tested.

9. References

[1] Sun, B., Akiyama, M., Yagi, T., Hatada, M., and Mori, T.: AutoBLG: Automatic URL Blacklist Generator Using Search Space Expansion and Filters. Proc. 20th IEEE Symposium on Computers and Communication (ISCC 2015), pp (2015).

[2] Felegyhazi, M., Kreibich, C., and Paxson, V.: On the Potential of Proactive Domain Blacklisting. Proc. 3rd USENIX Conference on Large-scale Exploits and Emergent Threats: Botnets, Spyware, Worms, and More (LEET 2010), pp. 6 (2010).

[3] Ma, J., Saul, L.K., Savage, S., and Voelker, G.M.: Beyond Blacklists: Learning to Detect Malicious Web Sites from Suspicious URLs. Proc. 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2009), pp (2009).

[4] Xu, L., Zhan, Z., Xu, S., and Ye, K.: Cross-Layer Detection of Malicious Websites. Proc. CODASPY, pp (2013).

[5] Adachi, T., and Omote, K.: An Approach to Predict Drive-by-Download Attacks by Vulnerability Evaluation and Opcode. 10th Asia Joint Conference on Information Security, pp (2015).

[6] Canali, D., Cova, M., Vigna, G., and Kruegel, C.: Prophiler: A Fast Filter for the Large-Scale Detection of Malicious Web Pages. Proc. WWW, pp (2011).

[7] Su, J., Yoshioka, K., Shikata, J., and Matsumoto, T.: An Efficient Method for Detecting Obfuscated Suspicious JavaScript Based on Text Pattern Analysis. International Workshop on Traffic Measurements for Cybersecurity (WTMC) (2016).

[8] AL-Taharwa, I., Lee, H., Jeng, A., Wu, K., Ho, C., and Chen, S.: JSOD: JavaScript Obfuscation Detector. Security and Communication Networks, vol. 8, pp (2015).

Detection of Cross Site Scripting Attack and Malicious Obfuscated Javascript Code

Detection of Cross Site Scripting Attack and Malicious Obfuscated Javascript Code International Journal of Engineering Research in Computer Science and Engineering Detection of Cross Site Scripting Attack and Malicious Obfuscated Javascript Code [1] Vrushali S. Bari [2] Prof. Nitin

More information

Hybrid Obfuscated Javascript Strength Analysis System for Detection of Malicious Websites

Hybrid Obfuscated Javascript Strength Analysis System for Detection of Malicious Websites Hybrid Obfuscated Javascript Strength Analysis System for Detection of Malicious Websites R. Krishnaveni, C. Chellappan, and R. Dhanalakshmi Department of Computer Science & Engineering, Anna University,

More information

Detecting Malicious Web Links and Identifying Their Attack Types

Detecting Malicious Web Links and Identifying Their Attack Types Detecting Malicious Web Links and Identifying Their Attack Types Anti-Spam Team Cellopoint July 3, 2013 Introduction References A great effort has been directed towards detection of malicious URLs Blacklisting

More information

Detecting Drive-by-Download Attacks based on HTTP Context-Types Ryo Kiire, Shigeki Goto Waseda University

Detecting Drive-by-Download Attacks based on HTTP Context-Types Ryo Kiire, Shigeki Goto Waseda University Detecting Drive-by-Download Attacks based on HTTP Context-Types Ryo Kiire, Shigeki Goto Waseda University 1 Outline Background Related Work Purpose Method Experiment Results Conclusion & Future Work 2

More information

Malicious Web Pages Detection Based on Abnormal Visibility Recognition

Malicious Web Pages Detection Based on Abnormal Visibility Recognition Malicious Web Pages Detection Based on Abnormal Visibility Recognition Bin Liang 1 2, Jianjun Huang 1, Fang Liu 1, Dawei Wang 1, Daxiang Dong 1, Zhaohui Liang 1 2 1. School of Information, Renmin University

More information

Validation of Web Alteration Detection using Link Change State in Web Page

Validation of Web Alteration Detection using Link Change State in Web Page Web 182-8585 1 5-1 m-shouta@uec.ac.jp,zetaka@computer.org Web Web URL Web Alexa Top 100 Web Validation of Web Alteration Detection using Link Change State in Web Page Shouta Mochizuki Tetsuji Takada The

More information

[Rajebhosale*, 5(4): April, 2016] ISSN: (I2OR), Publication Impact Factor: 3.785

[Rajebhosale*, 5(4): April, 2016] ISSN: (I2OR), Publication Impact Factor: 3.785 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY A FILTER FOR ANALYSIS AND DETECTION OF MALICIOUS WEB PAGES Prof. SagarRajebhosale*, Mr.Abhimanyu Bhor, Ms.Tejashree Desai, Ms.

More information

Finding Vulnerabilities in Web Applications

Finding Vulnerabilities in Web Applications Finding Vulnerabilities in Web Applications Christopher Kruegel, Technical University Vienna Evolving Networks, Evolving Threats The past few years have witnessed a significant increase in the number of

More information

Regular Paper Classification Method of Unknown Web Sites Based on Distribution Information of Malicious IP addresses

Regular Paper Classification Method of Unknown Web Sites Based on Distribution Information of Malicious IP addresses International Journal of Informatics Society, VOL.10, NO.1 (2018) 41-50 41 Regular Paper Classification Method of Unknown Web Sites Based on Distribution Information of Malicious IP addresses Shihori Kanazawa

More information

deseo: Combating Search-Result Poisoning Yu USF

deseo: Combating Search-Result Poisoning Yu USF deseo: Combating Search-Result Poisoning Yu Jin @MSCS USF Your Google is not SAFE! SEO Poisoning - A new way to spread malware! Why choose SE? 22.4% of Google searches in the top 100 results > 50% for

More information

JSObfusDetector: A Binary PSO-based One-Class Classifier Ensemble to Detect Obfuscated JavaScript Code

JSObfusDetector: A Binary PSO-based One-Class Classifier Ensemble to Detect Obfuscated JavaScript Code 2015 International Symposium on Artificial Intelligence and Signal Processing (AISP) JSObfusDetector: A Binary PSO-based One-Class Classifier Ensemble to Detect Obfuscated JavaScript Code Mehran Jodavi,

More information

Fighting Spam, Phishing and Malware With Recurrent Pattern Detection

Fighting Spam, Phishing and Malware With Recurrent Pattern Detection Fighting Spam, Phishing and Malware With Recurrent Pattern Detection White Paper September 2017 www.cyren.com 1 White Paper September 2017 Fighting Spam, Phishing and Malware With Recurrent Pattern Detection

More information

CS 161 Computer Security

CS 161 Computer Security Paxson Spring 2017 CS 161 Computer Security Discussion 12 Week of April 24, 2017 Question 1 Detection strategies (20 min) Suppose you are responsible for detecting attacks on the UC Berkeley network, and

More information

Next Generation Endpoint Security Confused?

Next Generation Endpoint Security Confused? SESSION ID: CEM-W06 Next Generation Endpoint Security Confused? Greg Day VP & Chief Security Officer, EMEA Palo Alto Networks @GreDaySecurity Brief Intro Questions we will answer Do I need a new (NG) endpoint

More information

Copyright 2014 NTT corp. All Rights Reserved.

Copyright 2014 NTT corp. All Rights Reserved. Credential Honeytoken for Tracking Web-based Attack Cycle Mitsuaki Akiyama (akiama.mitsuaki@lab.ntt.co.jp) NTT Secure Platform Laboratories / NTT-CERT Who I am Mitsuaki Akiyama Security Researcher (Ph.D)

More information

Self-Learning Systems for Network Intrusion Detection

Self-Learning Systems for Network Intrusion Detection Self-Learning Systems for Network Intrusion Detection Konrad Rieck Computer Security Group University of Göttingen GEORG-AUGUST-UNIVERSITÄT GÖTTINGEN About Me» Junior Professor for Computer Security» Research

More information

Identification of Malicious Web Pages with Static Heuristics

Identification of Malicious Web Pages with Static Heuristics Identification of Malicious Web Pages with Static Heuristics Christian Seifert, Ian Welch, Peter Komisarczuk Victoria University of Wellington P. O. Box 600 Wellington 6140, New Zealand Email: {cseifert,ian,peterk}@mcs.vuw.ac.nz

More information

Cross-Layer Detection of Malicious Websites

Cross-Layer Detection of Malicious Websites ONE UTSA CIRCLE SAN ANTONIO, TEXAS 78249-0631 210 458-4317 BUSINESS.UTSA.EDU THE UNIVERSITY OF TEXAS AT SAN ANTONIO, COLLEGE OF BUSINESS Working Paper SERIES Date February 21, 2013 WP # 0003MSS-432-2013

More information

Application Layer Attacks. Application Layer Attacks. Application Layer. Application Layer. Internet Protocols. Application Layer.

Application Layer Attacks. Application Layer Attacks. Application Layer. Application Layer. Internet Protocols. Application Layer. Application Layer Attacks Application Layer Attacks Week 2 Part 2 Attacks Against Programs Application Layer Application Layer Attacks come in many forms and can target each of the 5 network protocol layers

More information

Security+ Guide to Network Security Fundamentals, Third Edition. Chapter 3 Protecting Systems

Security+ Guide to Network Security Fundamentals, Third Edition. Chapter 3 Protecting Systems Security+ Guide to Network Security Fundamentals, Third Edition Chapter 3 Protecting Systems Objectives Explain how to harden operating systems List ways to prevent attacks through a Web browser Define

More information

CSI5387: Data Mining Project

CSI5387: Data Mining Project CSI5387: Data Mining Project Terri Oda April 14, 2008 1 Introduction Web pages have become more like applications that documents. Not only do they provide dynamic content, they also allow users to play

More information

White Paper. New Gateway Anti-Malware Technology Sets the Bar for Web Threat Protection

White Paper. New Gateway Anti-Malware Technology Sets the Bar for Web Threat Protection White Paper New Gateway Anti-Malware Technology Sets the Bar for Web Threat Protection The latest version of the flagship McAfee Gateway Anti-Malware technology adapts to new threats and plans for future

More information

P2_L12 Web Security Page 1

P2_L12 Web Security Page 1 P2_L12 Web Security Page 1 Reference: Computer Security by Stallings and Brown, Chapter (not specified) The web is an extension of our computing environment, because most of our daily tasks involve interaction

More information

Ms. Jevitha. K. P Assistant Professor

Ms. Jevitha. K. P Assistant Professor Prediction of Cross-Site Scripting Attack Using Machine Learning Algorithms Vishnu. B. A PG Student Amrita School of Engineering Amrita Vishwa Vidyapeetham (University) Coimbatore vishnuba89@gmail.com

More information

Machine Learning and Next-Generation Intrusion Prevention System (NGIPS)

Machine Learning and Next-Generation Intrusion Prevention System (NGIPS) A Trend Micro White Paper May 2017 Machine Learning and Next-Generation Intrusion Prevention System (NGIPS) Building a smarter NGIPS >> How Trend Micro is using machine learning to tackle today s complex

More information

shortcut Tap into learning NOW! Visit for a complete list of Short Cuts. Your Short Cut to Knowledge

shortcut Tap into learning NOW! Visit  for a complete list of Short Cuts. Your Short Cut to Knowledge shortcut Your Short Cut to Knowledge The following is an excerpt from a Short Cut published by one of the Pearson Education imprints. Short Cuts are short, concise, PDF documents designed specifically

More information

Attacks Against Websites 3 The OWASP Top 10. Tom Chothia Computer Security, Lecture 14

Attacks Against Websites 3 The OWASP Top 10. Tom Chothia Computer Security, Lecture 14 Attacks Against Websites 3 The OWASP Top 10 Tom Chothia Computer Security, Lecture 14 OWASP top 10. The Open Web Application Security Project Open public effort to improve web security: Many useful documents.

More information

Detecting Malicious URLs. Justin Ma, Lawrence Saul, Stefan Savage, Geoff Voelker. Presented by Gaspar Modelo-Howard September 29, 2010.

Detecting Malicious URLs. Justin Ma, Lawrence Saul, Stefan Savage, Geoff Voelker. Presented by Gaspar Modelo-Howard September 29, 2010. Detecting Malicious URLs Justin Ma, Lawrence Saul, Stefan Savage, Geoff Voelker Presented by Gaspar Modelo-Howard September 29, 2010 Publications Justin Ma, Lawrence K. Saul, Stefan Savage, and Geoffrey

More information

Business Club. Decision Trees

Business Club. Decision Trees Business Club Decision Trees Business Club Analytics Team December 2017 Index 1. Motivation- A Case Study 2. The Trees a. What is a decision tree b. Representation 3. Regression v/s Classification 4. Building

More information

Finding the Linchpins of the Dark Web: A Study on Topologically Dedicated Hosts on Malicious Web Infrastructures

Finding the Linchpins of the Dark Web: A Study on Topologically Dedicated Hosts on Malicious Web Infrastructures Finding the Linchpins of the Dark Web: A Study on Topologically Dedicated Hosts on Malicious Web Infrastructures Zhou Li, Indiana University Bloomington Sumayah Alrwais, Indiana University Bloomington

More information

Discovering Advertisement Links by Using URL Text

Discovering Advertisement Links by Using URL Text 017 3rd International Conference on Computational Systems and Communications (ICCSC 017) Discovering Advertisement Links by Using URL Text Jing-Shan Xu1, a, Peng Chang, b,* and Yong-Zheng Zhang, c 1 School

More information

Detecting and Characterizing Malicious Websites

Detecting and Characterizing Malicious Websites Detecting and Characterizing Malicious Websites APPROVED BY SUPERVISING COMMITTEE: Prof. Shouhuai Xu, Ph.D. Prof. Tom Bylander, Ph.D. Prof. Hugh B. Maynard, Ph.D. Prof. Ravi Sandhu, Ph.D. Prof. Maochao

More information

Detecting Botnets Using Cisco NetFlow Protocol

Detecting Botnets Using Cisco NetFlow Protocol Detecting Botnets Using Cisco NetFlow Protocol Royce Clarenz C. Ocampo 1, *, and Gregory G. Cu 2 1 Computer Technology Department, College of Computer Studies, De La Salle University, Manila 2 Software

More information

Anti-Phishing Method for Detecting Suspicious URLs in Twitter

Anti-Phishing Method for Detecting Suspicious URLs in Twitter Anti-Phishing Method for Detecting Suspicious URLs in Twitter Salu Sudhakar 1, Narasimhan T 2 P.G. Scholar, Dept of Computer Science, Mohandas College of engineering and technology Anad, TVM 1 Assistant

More information

EXPLOIT KITS. Tech Talk - Fall Josh Stroschein - Dakota State University

EXPLOIT KITS. Tech Talk - Fall Josh Stroschein - Dakota State University EXPLOIT KITS Tech Talk - Fall 2016 Josh Stroschein - Dakota State University Delivery Methods Spam/Spear-phishing Delivery Methods Spam/Spear-phishing Office Documents Generally refer to MS office suite

More information

On the Surface. Security Datasheet. Security Datasheet

On the Surface.  Security Datasheet.  Security Datasheet Email Security Datasheet Email Security Datasheet On the Surface No additional hardware or software required to achieve 99.9%+ spam and malware filtering effectiveness Initiate service by changing MX Record

More information

Topology-Based Spam Avoidance in Large-Scale Web Crawls

Topology-Based Spam Avoidance in Large-Scale Web Crawls Topology-Based Spam Avoidance in Large-Scale Web Crawls Clint Sparkman Joint work with Hsin-Tsang Lee and Dmitri Loguinov Internet Research Lab Department of Computer Science and Engineering Texas A&M

More information

Detecting Network Intrusions

Detecting Network Intrusions Detecting Network Intrusions Naveen Krishnamurthi, Kevin Miller Stanford University, Computer Science {naveenk1, kmiller4}@stanford.edu Abstract The purpose of this project is to create a predictive model

More information

FREE ONLINE WEBSITE MALWARE SCANNER WEBSITE SECURITY

FREE ONLINE WEBSITE MALWARE SCANNER WEBSITE SECURITY PDF 11 AWESOME TOOLS FOR WEBSITE MALWARE SCANNING FREE ONLINE WEBSITE SECURITY 1 / 5 2 / 5 3 / 5 website malware scanner pdf Qualys Malware Detection helps you to scan continuously for malware against

More information

Deep instinct For MSSPs

Deep instinct For MSSPs Deep instinct For MSSPs Deep Instinct Solution Deep Instinct is the first and only Endpoint & Mobile Cybersecurity solution that is based on a proprietary deep learning framework that was specifically

More information

Polygraph: Automatically Generating Signatures for Polymorphic Worms

Polygraph: Automatically Generating Signatures for Polymorphic Worms Polygraph: Automatically Generating Signatures for Polymorphic Worms James Newsome Brad Karp Dawn Song Presented by: Jeffrey Kirby Overview Motivation Polygraph Signature Generation Algorithm Evaluation

More information

Attacking CAPTCHAs for Fun and Profit

Attacking CAPTCHAs for Fun and Profit Attacking Author: Gursev Singh Kalra Managing Consultant Foundstone Professional Services Table of Contents Attacking... 1 Table of Contents... 2 Introduction... 3 A Strong CAPTCHA Implementation... 3

More information

Detecting Obfuscated JavaScript Malware Using Sequences of Internal Function Calls

Detecting Obfuscated JavaScript Malware Using Sequences of Internal Function Calls Detecting Obfuscated JavaScript Malware Using Sequences of Internal Function Calls Alireza Gorji Tarbiat Modares University Tehran, Iran alireza.gorji@modares.ac.ir Mahdi Abadi Tarbiat Modares University

More information

Application vulnerabilities and defences

Application vulnerabilities and defences Application vulnerabilities and defences In this lecture We examine the following : SQL injection XSS CSRF SQL injection SQL injection is a basic attack used to either gain unauthorized access to a database

More information

JSOD: JavaScript obfuscation detector

JSOD: JavaScript obfuscation detector SECURITY AND COMMUNICATION NETWORKS Security Comm. Networks 2015; 8:1092 1107 Published online 1 July 2014 in Wiley Online Library (wileyonlinelibrary.com)..1064 RESEARCH ARTICLE JSOD: JavaScript obfuscation

More information

Detecting Spam Zombies By Monitoring Outgoing Messages

Detecting Spam Zombies By Monitoring Outgoing Messages International Refereed Journal of Engineering and Science (IRJES) ISSN (Online) 2319-183X, (Print) 2319-1821 Volume 5, Issue 5 (May 2016), PP.71-75 Detecting Spam Zombies By Monitoring Outgoing Messages

More information

CLOAK OF VISIBILITY : DETECTING WHEN MACHINES BROWSE A DIFFERENT WEB

CLOAK OF VISIBILITY : DETECTING WHEN MACHINES BROWSE A DIFFERENT WEB CLOAK OF VISIBILITY : DETECTING WHEN MACHINES BROWSE A DIFFERENT WEB CIS 601: Graduate Seminar Prof. S. S. Chung Presented By:- Amol Chaudhari CSU ID 2682329 AGENDA About Introduction Contributions Background

More information

You Are Being Watched Analysis of JavaScript-Based Trackers

You Are Being Watched Analysis of JavaScript-Based Trackers You Are Being Watched Analysis of JavaScript-Based Trackers Rohit Mehra IIIT-Delhi rohit1376@iiitd.ac.in Shobhita Saxena IIIT-Delhi shobhita1315@iiitd.ac.in Vaishali Garg IIIT-Delhi vaishali1318@iiitd.ac.in

More information

SO YOU THINK YOU ARE PROTECTED? THINK AGAIN! NEXT GENERATION ENDPOINT SECURITY

SO YOU THINK YOU ARE PROTECTED? THINK AGAIN! NEXT GENERATION ENDPOINT SECURITY SO YOU THINK YOU ARE PROTECTED? THINK AGAIN! NEXT GENERATION ENDPOINT SECURITY www.securelink.net BACKGROUND Macro trends like cloud and mobility change the requirements for endpoint security. Data can

More information

Technical Brief: Domain Risk Score Proactively uncover threats using DNS and data science

Technical Brief: Domain Risk Score Proactively uncover threats using DNS and data science Technical Brief: Domain Risk Score Proactively uncover threats using DNS and data science 310 Million + Current Domain Names 11 Billion+ Historical Domain Profiles 5 Million+ New Domain Profiles Daily

More information

Automated Website Fingerprinting through Deep Learning

Automated Website Fingerprinting through Deep Learning Automated Website Fingerprinting through Deep Learning Vera Rimmer 1, Davy Preuveneers 1, Marc Juarez 2, Tom Van Goethem 1 and Wouter Joosen 1 NDSS 2018 Feb 19th (San Diego, USA) 1 2 Website Fingerprinting

More information

ISSN: (Online) Volume 2, Issue 2, February 2014 International Journal of Advance Research in Computer Science and Management Studies

ISSN: (Online) Volume 2, Issue 2, February 2014 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 2, Issue 2, February 2014 International Journal of Advance Research in Computer Science and Management Studies Research Article / Paper / Case Study Available online at:

More information

ENTERPRISE ENDPOINT PROTECTION BUYER S GUIDE

ENTERPRISE ENDPOINT PROTECTION BUYER S GUIDE ENTERPRISE ENDPOINT PROTECTION BUYER S GUIDE TABLE OF CONTENTS Overview...3 A Multi-Layer Approach to Endpoint Security...4 Known Attack Detection...5 Machine Learning...6 Behavioral Analysis...7 Exploit

More information

ROSAEC Survey Workshop SELab. Soohyun Baik

ROSAEC Survey Workshop SELab. Soohyun Baik ROSAEC Survey Workshop SELab. Soohyun Baik Cross-Site Scripting Prevention with Dynamic Data Tainting and Static Analysis Philipp Vogt, Florian Nentwich, Nenad Jovanovic, Engin Kirda, Christopher Kruegel,

More information

McPAD and HMM-Web: two different approaches for the detection of attacks against Web applications

McPAD and HMM-Web: two different approaches for the detection of attacks against Web applications McPAD and HMM-Web: two different approaches for the detection of attacks against Web applications Davide Ariu, Igino Corona, Giorgio Giacinto, Fabio Roli University of Cagliari, Dept. of Electrical and

More information

VULNERABILITIES IN 2017 CODE ANALYSIS WEB APPLICATION AUTOMATED

VULNERABILITIES IN 2017 CODE ANALYSIS WEB APPLICATION AUTOMATED AUTOMATED CODE ANALYSIS WEB APPLICATION VULNERABILITIES IN 2017 CONTENTS Introduction...3 Testing methods and classification...3 1. Executive summary...4 2. How PT AI works...4 2.1. Verifying vulnerabilities...5

More information

WebShell UDURRANI.COM

WebShell UDURRANI.COM WebShell UDURRANI.COM Webshell is simply a backdoor used by attackers to enable remote administration and control. It s normally an obfuscated script i.e. php, cgi, aspx. Attacker could access webshell

More information

Improved Signature-Based Antivirus System

Improved Signature-Based Antivirus System Improved Signature-Based Antivirus System Osaghae E. O. Department of Computer Science Federal University, Lokoja, Kogi State, Nigeria Abstract: The continuous updating of antivirus database with malware

More information

Emerging Threat Intelligence using IDS/IPS. Chris Arman Kiloyan

Emerging Threat Intelligence using IDS/IPS. Chris Arman Kiloyan Emerging Threat Intelligence using IDS/IPS Chris Arman Kiloyan Who Am I? Chris AUA Graduate (CS) Thesis : Cyber Deception Automation and Threat Intelligence Evaluation Using IDS Integration with Next-Gen

More information

Characterizing Home Pages 1

Characterizing Home Pages 1 Characterizing Home Pages 1 Xubin He and Qing Yang Dept. of Electrical and Computer Engineering University of Rhode Island Kingston, RI 881, USA Abstract Home pages are very important for any successful

More information

with Advanced Protection

with Advanced  Protection with Advanced Email Protection OVERVIEW Today s sophisticated threats are changing. They re multiplying. They re morphing into new variants. And they re targeting people, not just technology. As organizations

More information

Ranking Vulnerability for Web Application based on Severity Ratings Analysis

Ranking Vulnerability for Web Application based on Severity Ratings Analysis Ranking Vulnerability for Web Application based on Severity Ratings Analysis Nitish Kumar #1, Kumar Rajnish #2 Anil Kumar #3 1,2,3 Department of Computer Science & Engineering, Birla Institute of Technology,

More information

Machine Learning in Digital Security

Machine Learning in Digital Security Machine Learning in Digital Security White Paper www.seqrite.com Table of Contents 1. Introduction 2. Introduction to Machine Learning 3. Machine Learning usage in Security Industry 4. Clustering Samples

More information

INF3700 Informasjonsteknologi og samfunn. Application Security. Audun Jøsang University of Oslo Spring 2015

INF3700 Informasjonsteknologi og samfunn. Application Security. Audun Jøsang University of Oslo Spring 2015 INF3700 Informasjonsteknologi og samfunn Application Security Audun Jøsang University of Oslo Spring 2015 Outline Application Security Malicious Software Attacks on applications 2 Malicious Software 3

More information

MALICIOUS URL DETECTION AND PREVENTION AT BROWSER LEVEL FRAMEWORK

MALICIOUS URL DETECTION AND PREVENTION AT BROWSER LEVEL FRAMEWORK International Journal of Mechanical Engineering and Technology (IJMET) Volume 8, Issue 12, December 2017, pp. 536 541, Article ID: IJMET_08_12_054 Available online at http://www.iaeme.com/ijmet/issues.asp?jtype=ijmet&vtype=8&itype=12

More information

Hybrid Feature Selection for Modeling Intrusion Detection Systems

Hybrid Feature Selection for Modeling Intrusion Detection Systems Hybrid Feature Selection for Modeling Intrusion Detection Systems Srilatha Chebrolu, Ajith Abraham and Johnson P Thomas Department of Computer Science, Oklahoma State University, USA ajith.abraham@ieee.org,

More information

Search Engines. Information Retrieval in Practice

Search Engines. Information Retrieval in Practice Search Engines Information Retrieval in Practice All slides Addison Wesley, 2008 Web Crawler Finds and downloads web pages automatically provides the collection for searching Web is huge and constantly

More information

JAVASCRIPT AND JQUERY: AN INTRODUCTION (WEB PROGRAMMING, X452.1)

JAVASCRIPT AND JQUERY: AN INTRODUCTION (WEB PROGRAMMING, X452.1) Technology & Information Management Instructor: Michael Kremer, Ph.D. Class 1 Professional Program: Data Administration and Management JAVASCRIPT AND JQUERY: AN INTRODUCTION (WEB PROGRAMMING, X452.1) WHO

More information

Overview Cross-Site Scripting (XSS) Christopher Lam Introduction Description Programming Languages used Types of Attacks Reasons for XSS Utilization Attack Scenarios Steps to an XSS Attack Compromises

More information

Mobile Friendly Website. Checks whether your website is responsive. Out of 10. Out of Social

Mobile Friendly Website. Checks whether your website is responsive. Out of 10. Out of Social Website Score Overview On-Page Optimization Checks your website for different issues impacting performance and Search Engine Optimization problems. 3 3 WEBSITE SCORE 62 1 1 Competitor Analysis 4.6 5 Analysis

More information

Naming in Distributed Systems

Naming in Distributed Systems Naming in Distributed Systems Dr. Yong Guan Department of Electrical and Computer Engineering & Information Assurance Center Iowa State University Outline for Today s Talk Overview: Names, Identifiers,

More information

A Comparative Study of Selected Classification Algorithms of Data Mining

A Comparative Study of Selected Classification Algorithms of Data Mining Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 6, June 2015, pg.220

More information

OWASP AppSec Research The OWASP Foundation New Insights into Clickjacking

OWASP AppSec Research The OWASP Foundation  New Insights into Clickjacking New Insights into Clickjacking Marco `embyte` Balduzzi iseclab @ EURECOM embyte@iseclab.org AppSec Research 2010 Joint work with Egele, Kirda, Balzarotti and Kruegel Copyright The Foundation Permission

More information

Categorization of Phishing Detection Features. And Using the Feature Vectors to Classify Phishing Websites. Bhuvana Namasivayam

Categorization of Phishing Detection Features. And Using the Feature Vectors to Classify Phishing Websites. Bhuvana Namasivayam Categorization of Phishing Detection Features And Using the Feature Vectors to Classify Phishing Websites by Bhuvana Namasivayam A Thesis Presented in Partial Fulfillment of the Requirements for the Degree

More information

Zero Trust on the Endpoint. Extending the Zero Trust Model from Network to Endpoint with Advanced Endpoint Protection

Zero Trust on the Endpoint. Extending the Zero Trust Model from Network to Endpoint with Advanced Endpoint Protection Zero Trust on the Endpoint Extending the Zero Trust Model from Network to Endpoint with Advanced Endpoint Protection March 2015 Executive Summary The Forrester Zero Trust Model (Zero Trust) of information

More information

Copyright

Copyright 1 Security Test EXTRA Workshop : ANSWER THESE QUESTIONS 1. What do you consider to be the biggest security issues with mobile phones? 2. How seriously are consumers and companies taking these threats?

More information

C1: Define Security Requirements

C1: Define Security Requirements OWASP Top 10 Proactive Controls IEEE Top 10 Software Security Design Flaws OWASP Top 10 Vulnerabilities Mitigated OWASP Mobile Top 10 Vulnerabilities Mitigated C1: Define Security Requirements A security

More information

UP L13: Leveraging the full protection of SEP 12.1.x

UP L13: Leveraging the full protection of SEP 12.1.x UP L13: Leveraging the full protection of SEP 12.1.x Hands on lab Description In this hands on lab you will learn about the different protection technologies bundled in SEP 12.1.x and see how they complement

More information

Cisco Cloud Security. How to Protect Business to Support Digital Transformation

Cisco Cloud Security. How to Protect Business to Support Digital Transformation Cisco Cloud Security How to Protect Business to Support Digital Transformation Dragan Novakovic Cybersecurity Consulting Systems Engineer January 2018. Security Enables Digitization Digital Disruption,

More information

Comment Extraction from Blog Posts and Its Applications to Opinion Mining

Comment Extraction from Blog Posts and Its Applications to Opinion Mining Comment Extraction from Blog Posts and Its Applications to Opinion Mining Huan-An Kao, Hsin-Hsi Chen Department of Computer Science and Information Engineering National Taiwan University, Taipei, Taiwan

More information

Malicious Activity and Risky Behavior in Residential Networks

Malicious Activity and Risky Behavior in Residential Networks Malicious Activity and Risky Behavior in Residential Networks Gregor Maier 1, Anja Feldmann 1, Vern Paxson 2,3, Robin Sommer 2,4, Matthias Vallentin 3 1 TU Berlin / Deutsche Telekom Laboratories 2 International

More information

Trend Micro SMB Endpoint Comparative Report Performed by AV-Test.org

Trend Micro SMB Endpoint Comparative Report Performed by AV-Test.org Trend Micro SMB Endpoint Comparative Report Performed by AV-Test.org Results from October 2010 Executive Summary In October of 2010, AV-Test.org performed endpoint security benchmark testing on five marketleading

More information

THE EFFECTIVE APPROACH TO CYBER SECURITY VALIDATION BREACH & ATTACK SIMULATION

THE EFFECTIVE APPROACH TO CYBER SECURITY VALIDATION BREACH & ATTACK SIMULATION BREACH & ATTACK SIMULATION THE EFFECTIVE APPROACH TO CYBER SECURITY VALIDATION Cymulate s cyber simulation platform allows you to test your security assumptions, identify possible security gaps and receive

More information

Prevx 3.0 v Product Overview - Core Functionality. April, includes overviews of. MyPrevx, Prevx 3.0 Enterprise,

Prevx 3.0 v Product Overview - Core Functionality. April, includes overviews of. MyPrevx, Prevx 3.0 Enterprise, Prevx 3.0 v3.0.1.65 Product Overview - Core Functionality April, 2009 includes overviews of MyPrevx, Prevx 3.0 Enterprise, and Prevx 3.0 Banking and Ecommerce editions Copyright Prevx Limited 2007,2008,2009

More information
