Dynamic update of binary logistic regression model for fraud detection in electronic credit card transactions


Fidel Beraldi (1,2) and Alair Pereira do Lago (2)

(1) First Data, Fraud Risk Department, São Paulo, Brazil
(2) University of São Paulo, Department of Computer Science, São Paulo, Brazil

ABSTRACT

In this paper we develop Dynamic Model Averaging (DMA) models for electronic transactions from the e-commerce environment; these models incorporate the trends and characteristics of fraud in each period of analysis. We also develop logistic regression models in order to compare their performance in the fraud detection process. For the experiment, an e-commerce company in Brazil provided a dataset used to develop the models and compare their results.

1 Introduction

With technological and economic development, which has made communication easier and increased purchasing power, credit card transactions have become the primary payment method for Brazilian and international retailers [1]. In this scenario, as the number of credit card transactions grows, more opportunities arise for fraudsters to devise new kinds of fraud, resulting in large losses for the financial system [2]. Fraud indicators show that e-commerce transactions are riskier than card-present transactions, since the former do not use secure and efficient processes to authenticate the cardholder, such as a personal identification number (PIN) [1].

2 Methods

Because fraudsters quickly adapt to fraud prevention measures, statistical models for fraud detection need to be adaptable and flexible enough to change dynamically over time. Fraud scoring models can be updated sporadically or continuously, which raises the question of how to update the model parameters dynamically to detect fraud. Raftery et al. (2012) developed a method called Dynamic Model Averaging (DMA) [3,4,5], which implements a process of continuous updating over time. The DMA methodology combines several existing ideas: Bayesian Model Averaging (BMA), Markov chains and forgetting factors in a state-space formulation. These characteristics make DMA models a well-suited option for fraud scoring.

For the experiment, an e-commerce company provided the transaction data processed by its payment system from July 2009 to January 2014. The data analysis was performed under a non-disclosure agreement, as recommended by the PCI Data Security Standard [6]. The data made it possible to compare the DMA methodology against the classical logistic regression model, which is often used in fraud detection.

The following steps were taken in the experiment:

Step 1 - Original dataset: the structure for accessing tables and fields to export the data was created in the SQL database management system.
Step 2 - Tables and fields selection: all tables and fields analyzed and agreed with the business team were selected and exported. In total, 28 tables were exported, comprising 354 fields.
Step 3 - Data filters and structure: all credit card transactions approved from July 2009 to January 2014 were selected. For this period, the records identified as fraudulent, either by the cardholder or by the company's internal analysis, were flagged in the database. The full selection of fields and table relationships was carried out, so that the original content of the fields in the exported tables was preserved.
Step 4 - Development of derived variables: based on the original variables, derived variables were created for the modeling process.

Step 5 - Final dataset for sampling: the final dataset of records and variables for sampling and modeling was assembled.

Additionally, for validation purposes, some records were looked up in the company's internal system in order to verify that the collected variable values matched the originals.

The database for the experiment is composed of 7,716,09 records of credit card transactions, distributed from July 2009 to January 2014. For each extracted record, there are 52 independent variables (original and derived) and one dependent variable (fraud/non-fraud transaction). Denied transactions, regardless of the reason for denial, were excluded from the database, so only approved transactions are considered in the modeling process.

Fraudulent transactions were identified in two ways: through chargeback information and through analysis by the company's internal team. Transactions identified in either way were classified as fraud in the final database; all other transactions were labeled as non-fraud.

For the modeling process, we sampled 428,256 records, with 22,615 fraud and 405,641 non-fraud transactions. This implies a ratio of about 1 fraud record for every 18 non-fraud transactions, which is very close to the values adopted for this kind of problem in previous experiments [7,8,9]. The sampled records were then split into 80% (342,605) for model development and 20% (85,651) for validation.
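As an illustration of the continuous updating that DMA performs, the following minimal Python sketch shows how model weights could be re-estimated after each transaction using a forgetting factor α, following the general recursion of Raftery et al. [3]. It is a sketch under stated assumptions, not the code used in the experiment: the function name dma_step, the two candidate models and their predicted fraud probabilities are hypothetical, and each model's predictive likelihood is approximated by the Bernoulli probability of the observed label.

```python
import numpy as np


def dma_step(weights, fraud_probs, y, alpha=0.95):
    """One DMA-style weight update for a single transaction.

    weights     -- posterior model weights after the previous transaction
    fraud_probs -- each model's predicted fraud probability (hypothetical inputs)
    y           -- observed label: 1 for fraud, 0 for non-fraud
    alpha       -- forgetting factor for the model weights (0 < alpha <= 1)
    """
    weights = np.asarray(weights, dtype=float)
    fraud_probs = np.asarray(fraud_probs, dtype=float)

    # Prediction step: raising the weights to the power alpha flattens them,
    # so older evidence is gradually forgotten.
    predicted = weights ** alpha
    predicted /= predicted.sum()

    # Model-averaged fraud probability, available before the label is known.
    averaged = float(predicted @ fraud_probs)

    # Update step: reweight each model by how well it predicted the label
    # (assumption: plug-in Bernoulli likelihood).
    likelihood = np.where(y == 1, fraud_probs, 1.0 - fraud_probs)
    posterior = predicted * likelihood
    posterior /= posterior.sum()
    return averaged, posterior


# Two hypothetical models scoring one fraudulent transaction.
avg, post = dma_step([0.5, 0.5], [0.9, 0.1], y=1)
print(avg, post)
```

With α = 1 the weights are never flattened and the recursion reduces to ordinary Bayesian model averaging; smaller values of α let the weights react faster when fraud behavior changes.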

3 Results

Table 1 evaluates the logistic regression and DMA models. Within their respective categories, the Stepwise Modified model and the DMA 95 model (λ = α = 0.95) showed the best performance indicators. In general, however, the DMA models outperform the logistic regression models on most indicators; the main exception is the detection rate, which differs by approximately 10% between the Stepwise and DMA 95 models.

                              Logistic Regression         DMA Model
Indicator                     Stepwise    Stepwise        DMA 99          DMA 95
                                          Modified        (λ=α=0.99)      (λ=α=0.95)
Classification Performance
  Detection Rate              62.9%       66.2%*          50.8%           56.5%
  Specificity                 83.1%       83.9%           96.3%           97.9%*
  False Positive Rate         16.9%       16.1%           3.7%            2.1%*
  Ratio Non-Fraud/Fraud       4.8         4.4             1.3             0.7*
  Precision (Fraud)           17.2%       18.7%           43.2%           60.1%*
  Precision (Non-Fraud)       97.6%       97.8%*          97.2%           97.6%
Model Performance
  KS                          48.1%       52.1%           61.3%           70.4%*
  Accuracy                    82.1%       83.0%           93.9%           95.7%*
  Area under ROC curve        81.7%       83.9%           88.3%           92.9%*
  F-measure                   27.0%       29.2%           46.7%           58.2%*

Values marked with * are the best results for the indicator.

Table 1. Adjusted Models Performance.

Figure 1 shows AUC values for the four models developed. Taking the Stepwise model as the baseline, the Stepwise Modified, DMA 99 and DMA 95 models perform 3%, 8% and 14% better, respectively.

Figure 1. ROC curves of adjusted models.

Since we are interested in both the detection rate and the accuracy of the developed models, we can use the F-measure to compare them. Table 2 shows that the DMA models performed better, DMA 95 in particular.

Model                    F-measure
DMA 95 (λ=α=0.95)        58%
DMA 99 (λ=α=0.99)        47%
Stepwise Modified        29%
Stepwise                 27%

Table 2. Adjusted models F-measures.
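The F-measures in Table 2 and the relative AUC gains quoted above can be reproduced from the Table 1 values. The short Python sketch below is an illustrative check, not the authors' code; it assumes the F-measure is the harmonic mean of the fraud precision and the detection rate (recall), and the helper names are hypothetical.

```python
# Table 1 values in percent: fraud precision, detection rate (recall), AUC.
TABLE1 = {
    "Stepwise":          {"precision": 17.2, "detection": 62.9, "auc": 81.7},
    "Stepwise Modified": {"precision": 18.7, "detection": 66.2, "auc": 83.9},
    "DMA 99":            {"precision": 43.2, "detection": 50.8, "auc": 88.3},
    "DMA 95":            {"precision": 60.1, "detection": 56.5, "auc": 92.9},
}


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    return 2 * precision * recall / (precision + recall)


def relative_gain(value, baseline):
    """Relative improvement of value over baseline, in percent."""
    return 100 * (value / baseline - 1)


baseline_auc = TABLE1["Stepwise"]["auc"]

# F-measure per model, rounded to one decimal as in Table 1.
f_scores = {
    name: round(f_measure(m["precision"], m["detection"]), 1)
    for name, m in TABLE1.items()
}

# AUC gain over the Stepwise baseline, rounded to whole percent.
auc_gains = {
    name: round(relative_gain(m["auc"], baseline_auc))
    for name, m in TABLE1.items()
}

for name in TABLE1:
    print(f"{name:18s} F-measure {f_scores[name]:5.1f}%  AUC gain {auc_gains[name]:3d}%")
```

The computed F-measures (27.0, 29.2, 46.7 and 58.2) and AUC gains (3%, 8% and 14% over Stepwise) agree with the reported figures, which suggests the tables are internally consistent.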

4 Conclusion

The experiment shows that DMA models present better results than logistic regression models with respect to the area under the ROC curve (AUC) and the F-measure. The best DMA model reached an F-measure of 58%, against 29% for the best logistic regression model. For the AUC, the DMA model reached 93% and the classical model 84%.

Considering the results for the DMA models, we conclude that their ability to update over time makes a large difference in the analysis of fraud data, which undergo continuous behavioral changes. Given all that, their application proved appropriate for detecting fraudulent transactions in the e-commerce environment.

References

1. Bolton, R. J. & Hand, D. J. Statistical fraud detection: A review. Statistical Science 17 (2002).
2. Chan, P. K., Fan, W., Prodromidis, A. L. & Stolfo, S. J. Distributed data mining in credit card fraud detection. IEEE Intelligent Systems 14 (1999).
3. Raftery, A. E., Karny, M. & Ettler, P. Online prediction under model uncertainty via dynamic model averaging: Application to a cold rolling mill. American Statistical Association 52 (2010).
4. McCormick, T. H., Raftery, A. E., Madigan, D. & Burd, R. S. Dynamic logistic regression and dynamic model averaging for binary classification. Biometrics (2012).
5. Madigan, D. & Raftery, A. E. Model selection and accounting for model uncertainty in graphical models using Occam's window. American Statistical Association 89 (1994). Washington.
6. PCI-DSS. Payment Card Industry (PCI) - Data Security Standard (2013). URL pcisecuritystandards.org/documents.
7. Chan, P. K. & Stolfo, S. J. Toward scalable learning with non-uniform class and cost distribution: A case study in credit card detection. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (1998).
8. Gadi, M. F. A. Uma comparação de métodos de classificação aplicados à detecção de fraude em cartões de crédito. Dissertação de Mestrado - Instituto de Matemática e Estatística da Universidade de São Paulo (2006).
9. Gadi, M. F. A., Wang, X. & do Lago, A. P. Comparison with parametric optimization in credit card fraud detection. Seventh International Conference on Machine Learning and Applications (2008).


More information

Lies, Damned Lies and Statistics Using Data Mining Techniques to Find the True Facts.

Lies, Damned Lies and Statistics Using Data Mining Techniques to Find the True Facts. Lies, Damned Lies and Statistics Using Data Mining Techniques to Find the True Facts. BY SCOTT A. BARNES, CPA, CFF, CGMA The adversarial nature of the American legal system creates a natural conflict between

More information

Online Signature Verification Technique

Online Signature Verification Technique Volume 3, Issue 1 ISSN: 2320-5288 International Journal of Engineering Technology & Management Research Journal homepage: www.ijetmr.org Online Signature Verification Technique Ankit Soni M Tech Student,

More information

MACHINE LEARNING TOOLBOX. Logistic regression on Sonar

MACHINE LEARNING TOOLBOX. Logistic regression on Sonar MACHINE LEARNING TOOLBOX Logistic regression on Sonar Classification models Categorical (i.e. qualitative) target variable Example: will a loan default? Still a form of supervised learning Use a train/test

More information

NIST. Support Vector Machines. Applied to Face Recognition U56 QC 100 NO A OS S. P. Jonathon Phillips. Gaithersburg, MD 20899

NIST. Support Vector Machines. Applied to Face Recognition U56 QC 100 NO A OS S. P. Jonathon Phillips. Gaithersburg, MD 20899 ^ A 1 1 1 OS 5 1. 4 0 S Support Vector Machines Applied to Face Recognition P. Jonathon Phillips U.S. DEPARTMENT OF COMMERCE Technology Administration National Institute of Standards and Technology Information

More information

Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing

Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Generalized Additive Model and Applications in Direct Marketing Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Abstract Logistic regression 1 has been widely used in direct marketing applications

More information

Slice Intelligence!

Slice Intelligence! Intern @ Slice Intelligence! Wei1an(Wu( September(8,(2014( Outline!! Details about the job!! Skills required and learned!! My thoughts regarding the internship! About the company!! Slice, which we call

More information

ISyE 6416 Basic Statistical Methods - Spring 2016 Bonus Project: Big Data Analytics Final Report. Team Member Names: Xi Yang, Yi Wen, Xue Zhang

ISyE 6416 Basic Statistical Methods - Spring 2016 Bonus Project: Big Data Analytics Final Report. Team Member Names: Xi Yang, Yi Wen, Xue Zhang ISyE 6416 Basic Statistical Methods - Spring 2016 Bonus Project: Big Data Analytics Final Report Team Member Names: Xi Yang, Yi Wen, Xue Zhang Project Title: Improve Room Utilization Introduction Problem

More information

Real-time Fraud Detection with Innovative Big Graph Feature. Gaurav Deshpande, VP Marketing, TigerGraph; Mingxi Wu, VP Engineering, TigerGraph

Real-time Fraud Detection with Innovative Big Graph Feature. Gaurav Deshpande, VP Marketing, TigerGraph; Mingxi Wu, VP Engineering, TigerGraph Real-time Fraud Detection with Innovative Big Graph Feature Gaurav Deshpande, VP Marketing, TigerGraph; Mingxi Wu, VP Engineering, TigerGraph Speaking Today Gaurav Deshpande VP Marketing, TigerGraph gaurav@tigergraph.com

More information

Data Preprocessing. Supervised Learning

Data Preprocessing. Supervised Learning Supervised Learning Regression Given the value of an input X, the output Y belongs to the set of real values R. The goal is to predict output accurately for a new input. The predictions or outputs y are

More information

90% of data breaches are caused by software vulnerabilities.

90% of data breaches are caused by software vulnerabilities. 90% of data breaches are caused by software vulnerabilities. Get the skills you need to build secure software applications Secure Software Development (SSD) www.ce.ucf.edu/ssd Offered in partnership with

More information

NETWORK FAULT DETECTION - A CASE FOR DATA MINING

NETWORK FAULT DETECTION - A CASE FOR DATA MINING NETWORK FAULT DETECTION - A CASE FOR DATA MINING Poonam Chaudhary & Vikram Singh Department of Computer Science Ch. Devi Lal University, Sirsa ABSTRACT: Parts of the general network fault management problem,

More information

Taking Your Application Design to the Next Level with Data Mining

Taking Your Application Design to the Next Level with Data Mining Taking Your Application Design to the Next Level with Data Mining Peter Myers Mentor SolidQ Australia HDNUG 24 June, 2008 WHO WE ARE Industry experts: Growing, elite group of over 90 of the world s best

More information

Statistics 202: Data Mining. c Jonathan Taylor. Outliers Based in part on slides from textbook, slides of Susan Holmes.

Statistics 202: Data Mining. c Jonathan Taylor. Outliers Based in part on slides from textbook, slides of Susan Holmes. Outliers Based in part on slides from textbook, slides of Susan Holmes December 2, 2012 1 / 1 Concepts What is an outlier? The set of data points that are considerably different than the remainder of the

More information

Payment Card Industry (PCI) Data Security Standard

Payment Card Industry (PCI) Data Security Standard Payment Card Industry (PCI) Data Security Standard Attestation of Compliance for Onsite Assessments Service Providers Version 3.2 April 2016 Section 1: Assessment Information Instructions for Submission

More information

On classification, ranking, and probability estimation

On classification, ranking, and probability estimation On classification, ranking, and probability estimation Peter Flach 1 and Edson Takashi Matsubara 2 1 Department of Computer Science, University of Bristol, United Kingdom Peter.Flach@bristol.ac.uk 2 Instituto

More information

Data Classification, Security, and Privacy

Data Classification, Security, and Privacy Data Classification, Security, and Privacy Jennifer Bayuk Securities Industry and Financial Markets Association Internal Audit Division October, 2007 Overview of Information Classification Logical Relationship

More information

Section 1: Assessment Information

Section 1: Assessment Information Section 1: Assessment Information Instructions for Submission This document must be completed as a declaration of the results of the merchant s self-assessment with the Payment Card Industry Data Security

More information

Package gbts. February 27, 2017

Package gbts. February 27, 2017 Type Package Package gbts February 27, 2017 Title Hyperparameter Search for Gradient Boosted Trees Version 1.2.0 Date 2017-02-26 Author Waley W. J. Liang Maintainer Waley W. J. Liang

More information

Boost your Analytics with Machine Learning for SQL Nerds. Julie mssqlgirl.com

Boost your Analytics with Machine Learning for SQL Nerds. Julie mssqlgirl.com Boost your Analytics with Machine Learning for SQL Nerds Julie Koesmarno @MsSQLGirl mssqlgirl.com 1. Y ML 2. Operationalizing ML 3. Tips & Tricks 4. Resources automation delighting customers Deepen Engagement

More information

SOCIAL MEDIA MINING. Data Mining Essentials

SOCIAL MEDIA MINING. Data Mining Essentials SOCIAL MEDIA MINING Data Mining Essentials Dear instructors/users of these slides: Please feel free to include these slides in your own material, or modify them as you see fit. If you decide to incorporate

More information

Efficient Scalable Multi-Level Classification Scheme for Credit Card Fraud Detection

Efficient Scalable Multi-Level Classification Scheme for Credit Card Fraud Detection IJCSNS International Journal of Computer Science and Network Security, VOL.10 No.8, August 2010 123 Efficient Scalable Multi-Level Classification Scheme for Credit Card Fraud Detection Dipti D.Patil, Sunita

More information

The Iterative Bayesian Model Averaging Algorithm: an improved method for gene selection and classification using microarray data

The Iterative Bayesian Model Averaging Algorithm: an improved method for gene selection and classification using microarray data The Iterative Bayesian Model Averaging Algorithm: an improved method for gene selection and classification using microarray data Ka Yee Yeung, Roger E. Bumgarner, and Adrian E. Raftery April 30, 2018 1

More information

The Devil is in the Details: The Secrets to Complying with PCI Requirements. Michelle Kaiser Bray Faegre Baker Daniels

The Devil is in the Details: The Secrets to Complying with PCI Requirements. Michelle Kaiser Bray Faegre Baker Daniels The Devil is in the Details: The Secrets to Complying with PCI Requirements Michelle Kaiser Bray Faegre Baker Daniels 1 PCI DSS: What? PCI DSS = Payment Card Industry Data Security Standard Payment card

More information

Principles of Machine Learning

Principles of Machine Learning Principles of Machine Learning Lab 3 Improving Machine Learning Models Overview In this lab you will explore techniques for improving and evaluating the performance of machine learning models. You will

More information

June 30, Phyllis Schneider, AAP, Director, Network Rules ᅳ Rules Development & Technical Support

June 30, Phyllis Schneider, AAP, Director, Network Rules ᅳ Rules Development & Technical Support June 30, 2010 TO: FROM: ACH Rulebook Subscribers Phyllis Schneider, AAP, Director, Network Rules ᅳ Rules Development & Technical Support RE: 2010 ACH Rulebook ᅳ Supplement #1-2010 Rules Simplification

More information

Phishing Activity Trends Report August, 2006

Phishing Activity Trends Report August, 2006 Phishing Activity Trends Report, 26 Phishing is a form of online identity theft that employs both social engineering and technical subterfuge to steal consumers' personal identity data and financial account

More information

CSE Data Mining Concepts and Techniques STATISTICAL METHODS (REGRESSION) Professor- Anita Wasilewska. Team 13

CSE Data Mining Concepts and Techniques STATISTICAL METHODS (REGRESSION) Professor- Anita Wasilewska. Team 13 CSE 634 - Data Mining Concepts and Techniques STATISTICAL METHODS Professor- Anita Wasilewska (REGRESSION) Team 13 Contents Linear Regression Logistic Regression Bias and Variance in Regression Model Fit

More information

Penalizied Logistic Regression for Classification

Penalizied Logistic Regression for Classification Penalizied Logistic Regression for Classification Gennady G. Pekhimenko Department of Computer Science University of Toronto Toronto, ON M5S3L1 pgen@cs.toronto.edu Abstract Investigation for using different

More information

Payment Card Industry (PCI) Data Security Standard

Payment Card Industry (PCI) Data Security Standard Payment Card Industry (PCI) Data Security Standard Attestation of Compliance for Onsite Assessments Service Providers Version 3.1 April 2015 Section 1: Assessment Information Instructions for Submission

More information

CS249: ADVANCED DATA MINING

CS249: ADVANCED DATA MINING CS249: ADVANCED DATA MINING Classification Evaluation and Practical Issues Instructor: Yizhou Sun yzsun@cs.ucla.edu April 24, 2017 Homework 2 out Announcements Due May 3 rd (11:59pm) Course project proposal

More information

Payment Card Industry (PCI) Data Security Standard

Payment Card Industry (PCI) Data Security Standard Payment Card Industry (PCI) Data Security Standard Attestation of Compliance for Onsite Assessments Service Providers Version 3.2 April 2016 Section 1: Assessment Information Instructions for Submission

More information