Data Mining: Models and Methods

Author: Kirill Goltsman
A White Paper, July
Copyright

What is Data Mining?

Data mining refers to the discovery and extraction of patterns and knowledge from large sets of structured and unstructured data. Data mining techniques have been around for many decades; however, recent advances in Machine Learning (ML), computer performance, and numerical computation have made data mining methods easier to apply to large data sets and business-centric tasks. The growing popularity of data mining in business analytics and marketing is also due to the proliferation of Big Data and Cloud Computing. Large distributed databases and methods for parallel data processing, such as MapReduce, make huge volumes of data manageable and useful for companies and academia. Similarly, the cost of storing and managing data is reduced by cloud service providers (CSPs), who offer a pay-as-you-go model for access to virtualised servers, storage capacity (disc drives), GPUs (Graphics Processing Units), and distributed databases. As a result, companies can store, process, and analyze more data and get better business insights.

By themselves, state-of-the-art data mining methods are powerful in many classes of tasks, among them anomaly detection, clustering, classification, association rule learning, regression, and summarization. Each of these tasks plays a crucial role in almost any setting one might think of. For example, anomaly detection techniques help companies protect against network intrusion and data breaches. In their turn, regression models are powerful in the prediction of business trends, revenues, and expenses. Clustering techniques are most useful for grouping huge volumes of data into cohesive entities that reveal patterns and dependencies, both within and among the groups, without prior knowledge of any laws that govern the observations. As these examples illustrate, data mining has the power to put data into the service of businesses and entire communities.

Data Mining Models

There exist numerous ways to organize and analyze data. Which approach to select depends largely on our purpose (e.g. prediction, or the inference of relationships) and on the form of the data (structured vs. unstructured). We can end up with a particular configuration of data that is good for one task but not so good for another. Thus, to make data usable, one should be aware of the theoretical models and approaches used in data mining and understand the possible trade-offs and pitfalls of each.

Parametric and Non-Parametric Models

One way of looking at a data mining model is to determine whether or not it has parameters. In terms of parameters, we have a choice between parametric and non-parametric models. In the first type of model, we select a function that, in our view, best fits the training data. For instance, we may choose a linear function of the form

F(X) = θ0 + θ1x1 + θ2x2 + ... + θpxp

in which the x's are features of the input data (e.g. house size, floor, number of rooms) and the θ's are the unknown parameters of the model. These parameters may be thought of as weights that determine the contribution of the different features to the value of the response Y (e.g. the house price). The task of a parametric model is then to find the parameters θ using some statistical method, such as linear regression or logistic regression.
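As an illustration, here is a minimal Python sketch (the house data and feature values are invented) that fits a parametric linear function of exactly this form by ordinary least squares, one of the statistical methods just mentioned:

```python
import numpy as np

# Hypothetical training data: each row is one house, with features
# x1 = size (sq. m), x2 = floor, x3 = number of rooms.
X = np.array([[50.0, 1.0, 2.0],
              [80.0, 3.0, 3.0],
              [120.0, 5.0, 4.0],
              [65.0, 2.0, 2.0]])
y = np.array([150.0, 230.0, 360.0, 180.0])  # prices, in thousands

# Prepend a column of ones so that theta_0 acts as the intercept.
X_b = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: solve for theta = (theta_0, theta_1, ..., theta_p).
theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)

x_new = np.array([1.0, 100.0, 4.0, 3.0])  # [1, size, floor, rooms]
print("learned parameters:", theta)
print("predicted price:", x_new @ theta)
```

The learned θ's can be read directly as the weight each feature contributes to the price, which is exactly the interpretability that makes parametric models attractive.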

The main advantage of parametric models is that they carry intuition about the relationships among the features in our data. This makes parametric models an excellent heuristic, inference, and prediction tool. At the same time, however, parametric models have several pitfalls. If the function we have selected is too simple, it may fail to properly explain patterns in complex data. This problem, known as underfitting, is frequent when linear functions are used with non-linear data. On the other hand, if our function is too complex (e.g. a high-degree polynomial), we may end up with overfitting, a scenario in which our model responds to the noise in the data rather than the actual patterns and does not generalize to new examples.

Figure #1: Examples of normal, underfit, and overfit models

Non-parametric models are free from these issues because they make no assumptions about the underlying form of the function. Therefore, non-parametric models are good at dealing with unstructured data. On the other hand, since non-parametric models do not reduce the problem to the estimation of a small number of parameters, they require very large data sets in order to obtain a precise estimate of the function.

Restrictive vs. Flexible Methods

Data mining and ML models may also differ in terms of flexibility. Generally speaking, parametric models, such as linear regression, are considered highly restrictive, because they need structured data and actual responses (Y) to work. This very feature, however, makes them suitable for inference: finding relationships between features (e.g. how the crime rate in a neighborhood affects house prices). Because of this, restrictive models are interpretable and clear. This observation, though, is not true for flexible models (e.g. non-parametric models). Because flexible models make no assumptions about the form of the function that governs the observations, they are less interpretable. In many settings, however, the lack of interpretability is not a concern. For example, when our only interest is the prediction of stock prices, we need not care about the interpretability of the model at all.

Supervised vs. Unsupervised Learning

Nowadays, we hear a lot about supervised and unsupervised Machine Learning. New neural networks based on these concepts are making daily progress in image and speech recognition and in autonomous driving. A natural question, though, is: what is the difference between supervised and unsupervised learning? The main difference lies in the form of the data used and in the techniques for analyzing it. In a supervised learning setting, we use labeled data that consists of features/variables and a dependent variable (Y, or the response). This data is fed to a learning algorithm that searches for patterns and for a function that governs the relationship between the independent and dependent variables. The retrieved function may then be applied to predict future observations. In unsupervised learning, we also observe a vector of features (e.g. house size, floor); the difference from supervised learning, though, is that we do not have any associated responses (Y). In this case, we cannot apply a linear regression model, since there are no response values to predict. Thus, in an unsupervised setting, we are working blind in some sense.

Data Mining Methods

In this section, we describe the technical details of several data mining methods. Our choice fell on linear regression, classification, and clustering, which are among the most popular methods in data mining because they solve a wide variety of tasks, including inference and prediction. These methods also illustrate well the key features of the data mining models described above: linear regression and classification (logistic regression) are examples of parametric, supervised, and restrictive methods, whereas clustering (k-means) belongs to the family of non-parametric, unsupervised methods.

Linear Regression for Machine Learning

Linear regression is a method of finding a linear function that reasonably approximates the relationship between the data points and a dependent variable. In other words, it finds an optimized function to represent and explain the data. Contemporary advances in processing power and computational methods allow linear regression to be combined with ML algorithms to produce quick and efficient function optimization. In this section, we describe an implementation of linear regression with gradient descent that algorithmically fits a linear function to data.

Image #1: Linear regression

For this task, let's take the case of house price prediction. Assume we have a training set of 100 house examples (m = 100). Each house in this sample may be denoted x^(1), x^(2), ..., x^(m). Correspondingly, each house has a set of features or properties, such as house size and floor. Features may be thought of as variables that determine the house price; for example, the variable x1^(1) would refer to the size of the first house in the training sample. Finally, our training set has a list of prices for each house, denoted y^(1), y^(2), ..., y^(m). This data tells us much by itself (e.g. we may apply methods of descriptive statistics to interpret it); however, in order to run a linear regression, we should first formulate an initial hypothesis. Our hypothesis may be defined as a simple linear function with three parameters θ:

hθ(x) = θ0 + θ1x1 + θ2x2

where x1 and x2 are the features (house size and floor) and θ0, θ1, θ2 are the parameters of the function that we want to learn with linear regression. In essence, this hypothesis says that a house price is determined by house size and floor, weighted by certain parameters θ. Thus, we have a confirmation that linear regression is a parametric model, in which we try to fit the right parameters to find the configuration that best explains the data.

However, what method should we apply to determine the right parameters? Intuitively, our task is to fit parameters that ensure our hypothesis hθ(x) for each house is close to y, the real-world price. For that purpose, we define a cost function that evaluates the difference between the predicted values and the actual values:

Equation #1: Cost function for linear regression

J(θ) = (1/2m) Σ_{i=1..m} (hθ(x^(i)) - y^(i))^2

The right-most part of this equation is a version of the popular least-squares method: it sums the squared differences between the training values y^(i) and the values predicted by our hypothesis function hθ(x^(i)). Our task is then to minimize this cost function so that the prediction error is as small as possible.
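Written out in Python, the cost function is a direct translation of this definition; the tiny data set below is invented purely for illustration:

```python
import numpy as np

def cost(theta, X_b, y):
    """J(theta) = (1/2m) * sum over i of (h_theta(x^(i)) - y^(i))^2."""
    m = len(y)
    residuals = X_b @ theta - y  # prediction error for every training example
    return residuals @ residuals / (2 * m)

# Tiny invented sample: an intercept column plus one feature (house size).
X_b = np.array([[1.0, 50.0],
                [1.0, 80.0],
                [1.0, 120.0]])
y = np.array([150.0, 230.0, 360.0])
print(cost(np.zeros(2), X_b, y))  # cost of the initial all-zero guess
```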

One of the most popular solutions to this minimization problem is the gradient descent algorithm, based on the mathematical properties of the gradient. The gradient is a vector-valued function that points in the direction of the greatest rate of increase of a function; for a multi-variate function f, the gradient is the vector whose components are the partial derivatives of f. Since the gradient points in the direction of the function's growth, it may be used to find parameters that minimize the function: we simply need to move in the opposite direction. This technique of gradually moving down the function to find a local or global minimum is known as gradient descent and is demonstrated in the image below.

Image #2: Gradient descent

To implement gradient descent for our linear regression, we start with random parameters θ and then repeatedly update them in the direction opposite to the gradient vector until convergence to the global minimum (our linear cost function guarantees that such a global minimum actually exists). The gradient descent procedure is defined by the update algorithm

Equation #2: Gradient descent algorithm

θj := θj - α · ∂J(θ)/∂θj   (simultaneously for every parameter j)

where α is the learning rate at which we set our algorithm to learn. The learning rate should not be too large, so that we do not jump over the global minimum, and should not be too small, because then the process would take too much time. The partial derivative in the right-most part of the equation is calculated for each parameter to construct the gradient vector. It is derived in the following way:

Equation #3: Partial derivative for gradient descent

∂J(θ)/∂θj = (1/m) Σ_{i=1..m} (hθ(x^(i)) - y^(i)) · xj^(i)

Putting the partial derivatives and the learning rate together produces the final update rule

Equation #4: Gradient descent update rule

θj := θj - (α/m) Σ_{i=1..m} (hθ(x^(i)) - y^(i)) · xj^(i)

which should be repeated until the algorithm finds the global minimum. That is the point at which our parameters θ produce the function that best explains the training data. The learned function may now be used to predict prices for houses not included in the training sample, and employed in the inference of various relationships between the features of our model. This functionality makes linear regression with gradient descent a powerful technique in both data mining and machine learning.
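Putting the update rule together with the cost function gives a compact batch gradient descent loop. The sketch below uses invented house data, and the learning rate and iteration count are illustrative choices, not prescriptions:

```python
import numpy as np

def gradient_descent(X_b, y, alpha=0.5, iterations=10_000):
    """Repeat theta_j := theta_j - (alpha/m) * sum((h(x^(i)) - y^(i)) * x_j^(i))."""
    m, n = X_b.shape
    theta = np.zeros(n)  # initial guess for the parameters
    for _ in range(iterations):
        gradient = X_b.T @ (X_b @ theta - y) / m  # vector of partial derivatives
        theta -= alpha * gradient                 # move against the gradient
    return theta

# Invented data: house size (in hundreds of sq. m) vs. price in thousands.
X_b = np.column_stack([np.ones(4), np.array([0.5, 0.8, 1.2, 0.65])])
y = np.array([150.0, 230.0, 360.0, 180.0])
theta = gradient_descent(X_b, y)
print("theta:", theta)  # converges toward the least-squares fit
```

The size feature is scaled down here so that a single learning rate works well for both parameters; in practice the learning rate is usually tuned together with feature scaling.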

Classification with Logistic Regression

Classification is the process of determining the class/category to which an object belongs. Classification techniques implemented via machine learning algorithms have numerous applications, ranging from spam filtering to medical diagnostics and recommender systems. As with linear regression, in a classification problem we work with a labeled training set that includes some features. However, observations in the data set map not to a quantitative value, as in linear regression, but to a categorical value (e.g. a class). For example, patients' medical records may determine two classes of patients: those with benign and those with malignant cancers. The task of the classification algorithm is then to learn a function that best predicts which type of cancer (malignant vs. benign) a patient has. If there are only two classes, the problem is known as binary classification. In contrast, multi-class classification may be used when we have more classes of data.

One of the most common classification techniques in data science and ML is logistic regression. Logistic regression is based on the sigmoid function, which has an interesting property: it maps any real number into the (0, 1) interval. As a result, it may be used to evaluate the probability (between 0 and 1) that an observation falls within a certain category. For example, if we define benign cancer as 0 and malignant cancer as 1, a logistic value of 0.6 would mean that there is a 60% chance that a patient's cancer is malignant. These properties make the sigmoid function useful for binary classification, but multi-class classification is also possible.

Image #3: Sigmoid function

The formula for the sigmoid function is:

Equation #5: Sigmoid function

g(z) = 1 / (1 + e^(-z))

where e ≈ 2.71828 is the base of the natural logarithm, with the property that its natural logarithm is equal to 1.
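A one-line Python translation (a sketch using numpy) makes the squashing property easy to verify:

```python
import numpy as np

def sigmoid(z):
    """Map any real number into the (0, 1) interval: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(np.array([-10.0, -1.0, 0.0, 1.0, 10.0])))
# -> close to 0, below 0.5, exactly 0.5, above 0.5, close to 1
```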

To build a working classification model, we should plug our hypothesis into the sigmoid function. Remember that our hypothesis function has the form hθ(x) = θ0 + θ1x1 + θ2x2. For convenience, it may be written in the vector form z = θᵀx, where the superscript T refers to the transpose of the parameter vector θ. As a result of this transformation, we get the following function:

Equation #6: Logistic regression model

hθ(x) = g(z) = 1 / (1 + e^(-θᵀx))

where z = θᵀx is the vector representation of our initial hypothesis. In order to fit the parameters θ of our logistic regression model, we should first redefine it in probabilistic terms. This is also needed to leverage the power of the sigmoid function as a classifier.

Equation #7: Probabilistic interpretation of classification

P(y = 1 | x; θ) = hθ(x)
P(y = 0 | x; θ) = 1 - hθ(x)

The above definition stems from the basic rule that probabilities always add up to 1. So, for example, if the probability of a malignant cancer is 0.7, the probability of a benign cancer is automatically 1 - 0.7 = 0.3. The equation above formalizes this observation.
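As a short sketch with invented parameter values, this is how a fitted logistic model would turn a new observation into the two complementary class probabilities:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-4.0, 0.05])  # hypothetical learned parameters (theta_0, theta_1)
x = np.array([1.0, 95.0])       # new observation: intercept term plus one feature
p_malignant = sigmoid(theta @ x)  # P(y = 1 | x; theta)
print(p_malignant)                # ~0.68 -> predict class 1 (malignant)
print(1.0 - p_malignant)          # P(y = 0 | x; theta); the two sum to 1
```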

Now that we have defined the hypothesis and the probabilistic assumptions, it is time to construct a cost function in the same way we did for linear regression. For that purpose, we cannot simply reuse the least-squares cost shown below: with the sigmoid plugged in, it becomes a non-convex function that can produce many local minima, so gradient descent would have a hard time converging.

Equation #8: Linear regression (least-squares) cost function

J(θ) = (1/2m) Σ_{i=1..m} (hθ(x^(i)) - y^(i))^2

Instead, to represent the intuition behind the sigmoid function, we may use the log-probability applied to the above-mentioned probabilistic definition of the classification problem.

Equation #9: Logistic regression cost function

Cost(hθ(x), y) = -log(hθ(x))      if y = 1
Cost(hθ(x), y) = -log(1 - hθ(x))  if y = 0

The graph below illustrates that the log function assigns a high cost if our hypothesis is wrong and no cost if the hypothesis is right. If y = 1 and hθ(x) = 1, then the cost → 0. In contrast, if y = 1 and hθ(x) → 0, the cost goes to infinity. The opposite happens when y = 0.

Image #4: The -log(x) function

We can make a simplified version of the cost function by merging these two cases together. The final cost function, ready for use with logistic regression, has the following form:

Equation #10: Logistic regression cost function (simplified)

J(θ) = -(1/m) Σ_{i=1..m} [ y^(i)·log(hθ(x^(i))) + (1 - y^(i))·log(1 - hθ(x^(i))) ]

Now that the cost function is formulated, we can apply gradient descent exactly as in linear regression; the resulting update rule has the same form.

Equation #11: Logistic regression update rule

θj := θj - (α/m) Σ_{i=1..m} (hθ(x^(i)) - y^(i)) · xj^(i)
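A compact sketch of the whole training loop, on an invented one-feature toy set, might look like this; the hyperparameters are illustrative choices, not prescriptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X_b, y, alpha=0.1, iterations=10_000):
    """Gradient descent on the simplified logistic cost J(theta)."""
    m, n = X_b.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        error = sigmoid(X_b @ theta) - y      # h_theta(x^(i)) - y^(i)
        theta -= alpha * (X_b.T @ error) / m  # same update form as linear regression
    return theta

# Invented one-feature data: label 1 when the feature is large.
X_b = np.column_stack([np.ones(6), np.array([1.0, 2.0, 3.0, 6.0, 7.0, 8.0])])
y = np.array([0, 0, 0, 1, 1, 1], dtype=float)
theta = train_logistic(X_b, y)
print(sigmoid(X_b @ theta).round(2))  # low for the first three, high for the last three
```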

A similar technique may be applied to the multi-class classification problem, with more than two classes. In general, multi-class problems use a one-vs-all approach, in which we choose one class and lump all the others into a single second class. We do this repeatedly, applying binary logistic regression to each case, and then use the hypothesis that returned the highest value as our prediction.

Image #5: One-vs-all (multi-class) classification
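The one-vs-all recipe is only a few lines on top of binary logistic regression. The sketch below invents a tiny three-class toy set and reuses the same gradient descent training loop:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X_b, y, alpha=0.1, iterations=5_000):
    """Binary logistic regression via gradient descent (see the previous sketch)."""
    theta = np.zeros(X_b.shape[1])
    for _ in range(iterations):
        theta -= alpha * (X_b.T @ (sigmoid(X_b @ theta) - y)) / len(y)
    return theta

# Invented toy data: one feature, three classes (0, 1, 2) along its range.
x = np.array([0.1, 0.2, 0.3, 1.1, 1.2, 1.3, 2.1, 2.2, 2.3])
labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
X_b = np.column_stack([np.ones(len(x)), x])

# One-vs-all: train one binary classifier per class ("this class" vs. "all the rest").
thetas = [train_logistic(X_b, (labels == k).astype(float)) for k in range(3)]

# Predict by choosing the class whose hypothesis returns the highest value.
x_new = np.array([1.0, 1.25])
scores = [sigmoid(theta @ x_new) for theta in thetas]
print("scores:", np.round(scores, 3))
print("predicted class:", int(np.argmax(scores)))  # the middle band -> class 1
```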

Clustering Methods

As we have seen, clustering is an unsupervised method that is useful when the data is not labeled, i.e. when there are no response values (y). Clustering the observations of a data set means partitioning them into distinct groups, so that observations within each group are quite similar to each other, while observations in different groups have less in common. To illustrate this method, let's take an example from marketing. Assume that we have a large volume of data about consumers. This data may include median household income, occupation, distance from the nearest urban area, and so forth. This information may then be used for market segmentation: our task is to identify various groups of customers without prior knowledge of the commonalities that may exist among them. Such a segmentation may then be used to tailor marketing campaigns that target specific clusters of consumers.

There are many different clustering techniques for doing this; the most popular are the k-means clustering algorithm and hierarchical clustering. In this section, we describe the k-means method, a very efficient algorithm that covers a wide range of use cases. In k-means clustering, we want to partition the observations into a pre-specified number of clusters. Although having to set the number of clusters in advance is considered a limitation of the k-means algorithm, it is still a very powerful technique. In our clustering problem, we are given a training set of consumers x^(1), ..., x^(m); each observation is a vector of features x_j that describe various properties of the consumer, such as median income, age, gender, and so forth. The rule is that each observation (consumer) should belong to exactly one cluster, and no observation should belong to more than one cluster.

Image #6: Illustration of the clustering process

The idea behind k-means clustering is that a good cluster is one for which the within-cluster variation, or within-cluster sum of squares (WCSS), is minimal. In other words, consumers in the same cluster should have more in common with each other than with consumers from other clusters. To achieve this configuration, our task is to algorithmically minimize the WCSS over all pre-specified clusters. This task is expressed in the following equation:

Equation #12: Within-cluster sum of squares (WCSS)

minimize over C1, ..., CK:  Σ_{k=1..K} (1/|Ck|) Σ_{i,i' ∈ Ck} Σ_{j=1..p} (x_ij - x_i'j)^2

where |Ck| denotes the number of observations in the kth cluster. In words, the equation above says that we want to partition the observations into K clusters so that the total within-cluster variation, summed over all K clusters, is as small as possible. The within-cluster variation for the kth cluster is the sum of all of the pairwise squared Euclidean distances between the observations in that cluster, divided by the total number of observations in the kth cluster.

As in the case of linear regression, to minimize this function we should start from some initial guess. Our task is to find the cluster centroids (the average position of all the points in the space), or means, of the clusters. This may be achieved via a three-step algorithm in which we:

1. Randomly initialize K cluster centroids m1, m2, ..., mK.

2. Assign each observation in the data set to the cluster that yields the least WCSS. Intuitively, the least WCSS corresponds to the nearest mean (centroid). To find the nearest centroid, we calculate the Euclidean distances between the observation and the centroids and select the centroid with the smallest distance.

Equation #13: Assigning an observation to the closest centroid

c^(i) := arg min_k ||x^(i) - mk||^2

3. Move to new centroids by computing the centroid of each of the K clusters. The centroid of the kth cluster may be defined as the vector of the p feature means for the observations in the kth cluster. So, for example, if x^(1), x^(2), x^(3) belong to the same cluster c2, then the centroid m2 is defined by the average:

Equation #14: Centroid calculation example

m2 = (x^(1) + x^(2) + x^(3)) / 3

Since the arithmetic mean is a good least-squares estimator, this step also minimizes the within-cluster sum of squares. This means that as the algorithm runs, the clustering obtained continually improves until the result no longer changes. When this happens, a local optimum has been reached and the clusters become stable.
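The three steps translate almost line-for-line into Python. The sketch below is a minimal implementation under simplifying assumptions: the random initialization picks K observations as the starting centroids, and the consumer data is invented:

```python
import numpy as np

def k_means(X, k, iterations=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: initialize the K centroids by picking k random observations.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iterations):
        # Step 2: assign every observation to its nearest centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned observations
        # (keeping the old centroid if a cluster happens to go empty).
        new_centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                  else centroids[j] for j in range(k)])
        if np.allclose(new_centroids, centroids):  # stable: a local optimum is reached
            break
        centroids = new_centroids
    return labels, centroids

# Invented consumer data: [median income in $1000s, age].
X = np.array([[30.0, 25.0], [35.0, 30.0], [32.0, 27.0],
              [80.0, 45.0], [85.0, 50.0], [90.0, 48.0]])
labels, centroids = k_means(X, k=2)
print(labels)     # e.g. [0 0 0 1 1 1]: two market segments
print(centroids)  # the "average consumer" of each segment
```

Running this with different seeds illustrates why k-means is usually restarted several times: each run may settle into a different local optimum.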

Conclusion

The data mining models and methods described in this paper allow data scientists to perform a wide array of tasks, including inference, prediction, and analysis. Linear regression is powerful for predicting trends and inferring relationships between features. In its turn, logistic regression may be used for the automatic classification of behaviors, processes, and objects, which makes it useful in business analytics and anomaly detection. Finally, clustering allows us to draw insights from unlabeled data and infer hidden relationships that may drive effective business decisions and strategic choices.

About the Data Science Foundation

The Data Science Foundation is a professional body representing the interests of the data science industry. Its membership consists of suppliers who offer a range of big data analytical and technical services, and of companies and individuals with an interest in the commercial advantages that can be gained from big data. The organisation aims to raise the profile of this developing industry, to educate people about the benefits of knowledge-based decision making, and to encourage firms to start using big data techniques.

Contact: contact@datascience.foundation
Telephone:
Atlantic Business Centre, Atlantic Street, Altrincham WA14 5NQ
web:
