Cluster Analysis in Data Mining

Size: px
Start display at page:

Download "Cluster Analysis in Data Mining"

Transcription

1 Cluster Analysis in Data Mining Part II Andrew Kusiak 9 Seamans Center Iowa City, IA - 7 andrew-kusiak@uiowa.edu Tel. 9-9 Cluster Analysis Decomposition Aggregation Grouping Page

2 Models Matrix formulation Mathematical programming formulation Graph hformulation Cluster Representation G B A C D H E F Mutually exclusive regions Page

3 Cluster Representation G B A E C D F H Overlapping regions Cluster Representation A B C E F D G H Hierarchy Page

4 Cluster Representation Clusters A... B... C... D... Objects E...7 F...7 G... H... Fuzzy clusters Two types of data Object with decisions Object without decisions Page

5 Clustering Data with Decisions Method Ignore decisions Treat the objects as data without decisions Method Group objects according to the decisions Treat each group of objects with decisions as a separate data set Solving the Clustering Problem: Binary Matrix Formulation Similarity il it coefficient i methods Sorting based algorithms Bond energy algorithm Cost-based method Cluster identification algorithm Extended cluster identification algorithm Page

6 Clustering Algorithms Desired Properties Low computational time complexity Structure enhancement Enhancement of human interaction and simplification of interface with software Generalizability Similarity Coefficient Method n δ(aik, ajk ) k = sij = n δ(aik, ajk ) k = if aik = ajk where δ (aik, ajk ) = { 0 otherwise Generalization to qualitative features Page

7 δ (aik, ajk ) = { 0 if aik = ajk otherwise n = the number of features Consider GO- { Yes Yes GO- { Page 7 7

8 s = s = = 0. s = = 0. δ (aik, ajk ) = { δ (aik, ajk ) = { if aik = ajk 0 otherwise 0 if aik = ajk otherwise 0 s = s = s = = 0 Yes GO- { Yes GO- { Tree Representing Similarity Coefficients Object number Sim milarity (%) GO- GO- GO - GF- Yes Yes Page 8 8

9 Cluster Identification Algorithm Decomposition Problem Decompose an object-feature incidence matrix into mutually separable submatrices (groups of objects and groups of features) with the minimum number of overlapping objects (or features) subject to the following constraints: Constraint C: Constraint C: Empty groups of objects or features are not allowed. The number of objects in a group does not exceed an upper limit, b (or alternatively, the number of features in a group does not exceed, d). Page 9 9

10 Cluster Identification Algorithm Step 0. Set iteration number k =. Step. Select row i of incidence matrix [aij](k) and draw a horizontal line hi through it ([aij](k) is read: matrix [aij] at iteration k ). Step. For each entry of * crossed by the horizontal line hi draw a vertical line vj. Step. For each entry of * crossed-once by the vertical line vj draw a horizontal line hk. Step. Repeat steps and until there are no more crossed-once entries of * in [aij](k). All crossed-twice entries * in [aij](k) form row cluster RC-k and column cluster CC-k. Step. Transform the incidence matrix [aij](k) into [aij](k+) by removing rows and columns corresponding to the horizontal and vertical lines drawn in steps through. Step. If matrix [aij](k+) = 0 (where 0 denotes a matrix with all empty elements ), stop; otherwise set k = k + and go to step. Example Incidence matrix Object 7 Feature 7 8 * * Page 0 0

11 7 7 8 * * Next Step Delete all double-crossed elements Page

12 7 8 * 7 7 * * Resultant Matrix 7 * Page

13 7 Iteration * h h v v Iteration 7 * h h v v7 Final decomposition result RC- RC- RC- 7 CC- CC- CC- 8 7 * * * Page

14 Three types of matrices (a) (b) (c) a b c d e f g h a b c d e f g h a b c d e f * * * * * * * * * * * * 7 * (a) decomposable matrix (b) non-decomposable matrix with overlapping features (c) non-decomposable matrix with objects Extended CI Algorithm Step 0. Set iteration number k =. (k) Step. Select those objects (rows of matrix [aij] ) that based on the user's expertise, are potential candidates for inclusion in machine cell GO-k. Draw a horizontal line hi through each row of matrix [aij] (k) corresponding to these objects. In the absence of the user's expertise any machine can be selected. (k) Step. For each column in [aij] corresponding to entry, single-crossed by any of the horizontal lines hi, draw a vertical line vj. (k) Step. For each row in [aij] corresponding to the entry, single-crossed by the vertical line vj, drawn in Step, draw a horizontal line hi. Based on the objects corresponding to all the horizontal lines drawn in Step and Step, a temporary machine cell GO-k, is formed. Page

15 If the user's expertise indicates that some of the objects cannot be included in the temporary group GO-k, (k) erase the corresponding horizontal lines in the matrix [aij]. Removal of these horizontal lines results in group GO'-k. (k ) Delete from matrix [aij] features (columns) that are contained in at least one of the objects already included in GO-k. Place these features on a separate list. Draw a vertical line vj through each single-crossed entry in [aij](k) which does not involve any other objects than those included in GO-k. (k) Step.For all the double-crossed entries in [aij], form a cluster GO-k and a group GF-k Step. Transform the incidence matrix [aij] (k+) into [aij] by removing all the rows and columns included in GO-k and GF-k, respectively. (k +) Step. If matrix [aij] = 0 (where 0 denotes a matrix with all elements equal to zero), stop; otherwise set k = k + and go to step. (k) Page

16 Extended CI Algorithm Constraints: Max GO = Objects and in the cluster 7 Feature number Object number Step 0. Set iteration number k =. Step. Since user's expertise indicates that objects and should be included cluster GO-, two horizontal lines h and h are drawn. Feature number h h Page

17 Step. For columns,,,, and 7 crossed by the horizontal lines h and h, five vertical lines, v, v, v, v, and v7 are drawn. Feature number h h v v v v v7 Step. Three horizontal lines, h, h, and h7 are drawn through rows,, and 7 corresponding of the single-crossed elements. Feature number h h h h h7 v v v v v7 Page 7 7

18 A temporary machine cell GO'- with objects {,,,, 7} is formed. objects and are excluded from in GO'- as they include less double-crossed (committed) elements than object 7. The horizontal lines h and h are erased h h h7 v v v v7 Step. The double-crossed entries of matrix indicate: Object cluster GO- = {,, 7} and Feature cluster GF- = {,,, 7} Step. Matrix is transformed into the matrix below. 8 0 Page 8 8

19 Step. Set k = k + = and go to Step. The second iteration (k = ) results in: Object cluster GO- = {,,, } and Feature cluster GF- = {, 8, 0, } 8 0 v v8 v0 v h h h h The final result GO - GO - 7 GF - GF Page 9 9

20 Notation m = number of objects n = number of features p = number of clusters if feature i belongs to xij = { cluster j 0 otherwise dij = distance between objects i and j n n min dij xij i = j = subject to: n xij = for all i =,..., n j = n xjj = p j = xij <= xjj for all i =,..., n, j =,..., n xij = 0, for all i =,..., n, j =,..., n Page 0 0

21 m dij = d(aki, akj j) k = Two objects Hamming distance Hamming Distance 0 Blue Blue O 0 = O = 0 0 d = = Decision variable xij Object (or feature) number Object (or feature) number Cluster number Page

22 Given p = m dij = d(aki, akj ) k = [dij] = Object number Feature number Feature number Feature number MIN X+X+X+X+X+X+X+X+ X+X+X+X+X+X+X+X n SUBJECT TO xij = for all i =,..., n j = X+X+X+X+X= X+X+X+X+X= Feature number X+X+X+X+X= X+X+X+X+X= X+X+X+X+X=[dij] = X+X+X+X+X= Feature number Page

23 X-X<=0 X-X<=0 X-X<=0 X-X<=0 X-X<=0 X-X<=0 X-X<=0 X-X<=0 X-X<=0X< X-X<=0 X-X<=0 X-X<=0 Constraint xij <= xjj X-X<=0 X-X<=0 X-X<=0 X-X<=0 X-X<=0 X-X<=0 X-X<=0 X-X<=0 END INTEGER Solution x =, x = x =, x =, x = Based on the definition of xij, two groups of features are formed: GF- = {, } GF- = {,, } GO- = {, } GO- = {, } Feature number GF - GF- Object number GO- GO- Page

24 GO- { GO- { Solution: x=, x = x =, x =, x= Page

DM and Cluster Identification Algorithm

DM and Cluster Identification Algorithm DM and Cluster Identification Algorithm Andrew Kusiak, Professor oratory Seamans Center Iowa City, Iowa - Tel: 9-9 Fax: 9- E-mail: andrew-kusiak@uiowa.edu Homepage: http://www.icaen.uiowa.edu/~ankusiak

More information

Contents PROCESS DECOMPOSITION INTRODUCTION. Decomposition Approaches Desirable Properties

Contents PROCESS DECOMPOSITION INTRODUCTION. Decomposition Approaches Desirable Properties PRCESS DECMPSTN Contents Andrew Kusiak ntelligent Systems Laboratory Seamans Center The University of owa owa City, owa - Tel: - Fax: - andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak NTRDUCTN

More information

Evolutionary Computation: Solution Representation. Set Covering Problem. Set Covering Problem. Set Covering Problem 2/28/2008.

Evolutionary Computation: Solution Representation. Set Covering Problem. Set Covering Problem. Set Covering Problem 2/28/2008. Evolutionary Computation: Solution Representation Andrew Kusiak 9 Seamans Center Iowa City, Iowa - Tel: 9-9 Fax: 9-669 andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak Set Covering Problem

More information

Outline. Graph Representation of Dependencies. Dependency (Design) Structure Matrix

Outline. Graph Representation of Dependencies. Dependency (Design) Structure Matrix Dependency (Design) Structure Matrix Outline Andrew Kusiak Seamans Center Iowa City, Iowa - Tel: - ax: - andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak DSM definition DSM and innovation Topological

More information

Transportation problem

Transportation problem Transportation problem It is a special kind of LPP in which goods are transported from a set of sources to a set of destinations subjects to the supply and demand of the source and destination, respectively,

More information

Lecture 57 Dynamic Programming. (Refer Slide Time: 00:31)

Lecture 57 Dynamic Programming. (Refer Slide Time: 00:31) Programming, Data Structures and Algorithms Prof. N.S. Narayanaswamy Department of Computer Science and Engineering Indian Institution Technology, Madras Lecture 57 Dynamic Programming (Refer Slide Time:

More information

Balancing the flow. Part 3. Two-Machine Flowshop SCHEDULING SCHEDULING MODELS. Two-Machine Flowshop Two-Machine Job Shop Extensions

Balancing the flow. Part 3. Two-Machine Flowshop SCHEDULING SCHEDULING MODELS. Two-Machine Flowshop Two-Machine Job Shop Extensions PRODUCTION PLANNING AND SCHEDULING Part 3 Andrew Kusiak 3 Seamans Center Iowa City, Iowa 54-57 Tel: 3-335 534 Fax: 3-335 566 andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak SCHEDULING Assignment

More information

Data Set. What is Data Mining? Data Mining (Big Data Analytics) Illustrative Applications. What is Knowledge Discovery?

Data Set. What is Data Mining? Data Mining (Big Data Analytics) Illustrative Applications. What is Knowledge Discovery? Data Mining (Big Data Analytics) Andrew Kusiak Intelligent Systems Laboratory 2139 Seamans Center The University of Iowa Iowa City, IA 52242-1527 andrew-kusiak@uiowa.edu http://user.engineering.uiowa.edu/~ankusiak/

More information

Main approach: always make the choice that looks best at the moment. - Doesn t always result in globally optimal solution, but for many problems does

Main approach: always make the choice that looks best at the moment. - Doesn t always result in globally optimal solution, but for many problems does Greedy algorithms Main approach: always make the choice that looks best at the moment. - More efficient than dynamic programming - Doesn t always result in globally optimal solution, but for many problems

More information

Computational Complexity CSC Professor: Tom Altman. Capacitated Problem

Computational Complexity CSC Professor: Tom Altman. Capacitated Problem Computational Complexity CSC 5802 Professor: Tom Altman Capacitated Problem Agenda: Definition Example Solution Techniques Implementation Capacitated VRP (CPRV) CVRP is a Vehicle Routing Problem (VRP)

More information

Data Mining and Evolutionary Computation Algorithms for Process Modeling and Optimization

Data Mining and Evolutionary Computation Algorithms for Process Modeling and Optimization Data Mining and Evolutionary Computation Algorithms for Process Modeling and Optimization Zhe Song, Andrew Kusiak 2139 Seamans Center Iowa City, Iowa 52242-1527 andrew-kusiak@uiowa.edu Tel: 319-335-5934

More information

Mathematics for Decision Making: An Introduction. Lecture 4

Mathematics for Decision Making: An Introduction. Lecture 4 Mathematics for Decision Making: An Introduction Lecture 4 Matthias Köppe UC Davis, Mathematics January 15, 2009 4 1 Modeling the TSP as a standard optimization problem, I A key observation is that every

More information

Parallel Programming with MPI and OpenMP

Parallel Programming with MPI and OpenMP Parallel Programming with MPI and OpenMP Michael J. Quinn Chapter 6 Floyd s Algorithm Chapter Objectives Creating 2-D arrays Thinking about grain size Introducing point-to-point communications Reading

More information

XVIII Open Cup named after E.V. Pankratiev Stage 1: Grand Prix of Romania, Sunday, September 17, 2017

XVIII Open Cup named after E.V. Pankratiev Stage 1: Grand Prix of Romania, Sunday, September 17, 2017 Problem A. Balance file: 1 second 512 mebibytes We say that a matrix A of size N N is balanced if A[i][j] + A[i + 1][j + 1] = A[i + 1][j] + A[i][j + 1] for all 1 i, j N 1. You are given a matrix A of size

More information

What is Data Mining? Data Mining. Data Mining Architecture. Illustrative Applications. Pharmaceutical Industry. Pharmaceutical Industry

What is Data Mining? Data Mining. Data Mining Architecture. Illustrative Applications. Pharmaceutical Industry. Pharmaceutical Industry Data Mining Andrew Kusiak Intelligent Systems Laboratory 2139 Seamans Center The University it of Iowa Iowa City, IA 52242-1527 andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak Tel. 319-335

More information

What is Data Mining? Data Mining. Data Mining Architecture. Illustrative Applications. Pharmaceutical Industry. Pharmaceutical Industry

What is Data Mining? Data Mining. Data Mining Architecture. Illustrative Applications. Pharmaceutical Industry. Pharmaceutical Industry Data Mining Andrew Kusiak Intelligent Systems Laboratory 2139 Seamans Center The University of Iowa Iowa City, IA 52242-1527 andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak Tel. 319-335 5934

More information

Distance-based Methods: Drawbacks

Distance-based Methods: Drawbacks Distance-based Methods: Drawbacks Hard to find clusters with irregular shapes Hard to specify the number of clusters Heuristic: a cluster must be dense Jian Pei: CMPT 459/741 Clustering (3) 1 How to Find

More information

CHAPTER IX MULTI STAGE DECISION MAKING APPROACH TO OPTIMIZE THE PRODUCT MIX IN ASSIGNMENT LEVEL UNDER FUZZY GROUP PARAMETERS

CHAPTER IX MULTI STAGE DECISION MAKING APPROACH TO OPTIMIZE THE PRODUCT MIX IN ASSIGNMENT LEVEL UNDER FUZZY GROUP PARAMETERS CHAPTER IX MULTI STAGE DECISION MAKING APPROACH TO OPTIMIZE THE PRODUCT MIX IN ASSIGNMENT LEVEL UNDER FUZZY GROUP PARAMETERS Introduction: Aryanezhad, M.B [2004] showed that one of the most important decisions

More information

10 Introduction to Distributed Computing

10 Introduction to Distributed Computing CME 323: Distributed Algorithms and Optimization, Spring 2016 http://stanford.edu/~rezab/dao. Instructor: Reza Zadeh, Matroid and Stanford. Lecture 10, 4/27/2016. Scribed by Ting-Po Lee and Yu-Sheng Chen.

More information

Lecture notes on Transportation and Assignment Problem (BBE (H) QTM paper of Delhi University)

Lecture notes on Transportation and Assignment Problem (BBE (H) QTM paper of Delhi University) Transportation and Assignment Problems The transportation model is a special class of linear programs. It received this name because many of its applications involve determining how to optimally transport

More information

Two dimensional bin packing problem

Two dimensional bin packing problem Two dimensional bin packing problem The two dimensional bin packing problem is essentially a problem of fitting rectangles into minimum number of given large rectangles. Given rectangles (bins) of size

More information

Transforming Imperfectly Nested Loops

Transforming Imperfectly Nested Loops Transforming Imperfectly Nested Loops 1 Classes of loop transformations: Iteration re-numbering: (eg) loop interchange Example DO 10 J = 1,100 DO 10 I = 1,100 DO 10 I = 1,100 vs DO 10 J = 1,100 Y(I) =

More information

Semi-Supervised Clustering with Partial Background Information

Semi-Supervised Clustering with Partial Background Information Semi-Supervised Clustering with Partial Background Information Jing Gao Pang-Ning Tan Haibin Cheng Abstract Incorporating background knowledge into unsupervised clustering algorithms has been the subject

More information

Non-convex Multi-objective Optimization

Non-convex Multi-objective Optimization Non-convex Multi-objective Optimization Multi-objective Optimization Real-world optimization problems usually involve more than one criteria multi-objective optimization. Such a kind of optimization problems

More information

Solutions for Operations Research Final Exam

Solutions for Operations Research Final Exam Solutions for Operations Research Final Exam. (a) The buffer stock is B = i a i = a + a + a + a + a + a 6 + a 7 = + + + + + + =. And the transportation tableau corresponding to the transshipment problem

More information

Hamming Codes. s 0 s 1 s 2 Error bit No error has occurred c c d3 [E1] c0. Topics in Computer Mathematics

Hamming Codes. s 0 s 1 s 2 Error bit No error has occurred c c d3 [E1] c0. Topics in Computer Mathematics Hamming Codes Hamming codes belong to the class of codes known as Linear Block Codes. We will discuss the generation of single error correction Hamming codes and give several mathematical descriptions

More information

lpsymphony - Integer Linear Programming in R

lpsymphony - Integer Linear Programming in R lpsymphony - Integer Linear Programming in R Vladislav Kim October 30, 2017 Contents 1 Introduction 2 2 lpsymphony: Quick Start 2 3 Integer Linear Programming 5 31 Equivalent and Dual Formulations 5 32

More information

Graph Theory: Applications and Algorithms

Graph Theory: Applications and Algorithms Graph Theory: Applications and Algorithms CIS008-2 Logic and Foundations of Mathematics David Goodwin david.goodwin@perisic.com 11:00, Tuesday 21 st February 2012 Outline 1 n-cube 2 Gray Codes 3 Shortest-Path

More information

SETS. Sets are of two sorts: finite infinite A system of sets is a set, whose elements are again sets.

SETS. Sets are of two sorts: finite infinite A system of sets is a set, whose elements are again sets. SETS A set is a file of objects which have at least one property in common. The objects of the set are called elements. Sets are notated with capital letters K, Z, N, etc., the elements are a, b, c, d,

More information

SAS Visual Analytics 8.2: Working with Report Content

SAS Visual Analytics 8.2: Working with Report Content SAS Visual Analytics 8.2: Working with Report Content About Objects After selecting your data source and data items, add one or more objects to display the results. SAS Visual Analytics provides objects

More information

A Software Testing Optimization Method Based on Negative Association Analysis Lin Wan 1, Qiuling Fan 1,Qinzhao Wang 2

A Software Testing Optimization Method Based on Negative Association Analysis Lin Wan 1, Qiuling Fan 1,Qinzhao Wang 2 International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2015) A Software Testing Optimization Method Based on Negative Association Analysis Lin Wan 1, Qiuling Fan

More information

(5.2) 151 Math Exercises. Graph Terminology and Special Types of Graphs. Malek Zein AL-Abidin

(5.2) 151 Math Exercises. Graph Terminology and Special Types of Graphs. Malek Zein AL-Abidin King Saud University College of Science Department of Mathematics 151 Math Exercises (5.2) Graph Terminology and Special Types of Graphs Malek Zein AL-Abidin ه Basic Terminology First, we give some terminology

More information

COMP 182: Algorithmic Thinking Dynamic Programming

COMP 182: Algorithmic Thinking Dynamic Programming Luay Nakhleh 1 Formulating and understanding the problem The LONGEST INCREASING SUBSEQUENCE, or LIS, Problem is defined as follows. Input: An array A[0..n 1] of integers. Output: A sequence of indices

More information

The University of Iowa Intelligent Systems Laboratory The University of Iowa Intelligent Systems Laboratory

The University of Iowa Intelligent Systems Laboratory The University of Iowa Intelligent Systems Laboratory Warehousing Outline Andrew Kusiak 2139 Seamans Center Iowa City, IA 52242-1527 andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak Tel. 319-335 5934 Introduction warehousing concepts Relationship

More information

Iterative Signature Algorithm for the Analysis of Large-Scale Gene Expression Data. By S. Bergmann, J. Ihmels, N. Barkai

Iterative Signature Algorithm for the Analysis of Large-Scale Gene Expression Data. By S. Bergmann, J. Ihmels, N. Barkai Iterative Signature Algorithm for the Analysis of Large-Scale Gene Expression Data By S. Bergmann, J. Ihmels, N. Barkai Reasoning Both clustering and Singular Value Decomposition(SVD) are useful tools

More information

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER II EXAMINATION MH 1301 DISCRETE MATHEMATICS TIME ALLOWED: 2 HOURS

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER II EXAMINATION MH 1301 DISCRETE MATHEMATICS TIME ALLOWED: 2 HOURS NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER II EXAMINATION 2015-2016 MH 1301 DISCRETE MATHEMATICS May 2016 TIME ALLOWED: 2 HOURS INSTRUCTIONS TO CANDIDATES 1. This examination paper contains FIVE (5) questions

More information

MCL. (and other clustering algorithms) 858L

MCL. (and other clustering algorithms) 858L MCL (and other clustering algorithms) 858L Comparing Clustering Algorithms Brohee and van Helden (2006) compared 4 graph clustering algorithms for the task of finding protein complexes: MCODE RNSC Restricted

More information

Locating Restricted Facilities on Binary Maps

Locating Restricted Facilities on Binary Maps Locating Restricted Facilities on Binary Maps Mugurel Ionut Andreica, Politehnica University of Bucharest, mugurel.andreica@cs.pub.ro Cristina Teodora Andreica, Commercial Academy Satu Mare Madalina Ecaterina

More information

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview Chapter 888 Introduction This procedure generates D-optimal designs for multi-factor experiments with both quantitative and qualitative factors. The factors can have a mixed number of levels. For example,

More information

Statistical Methods and Optimization in Data Mining

Statistical Methods and Optimization in Data Mining Statistical Methods and Optimization in Data Mining Eloísa Macedo 1, Adelaide Freitas 2 1 University of Aveiro, Aveiro, Portugal; macedo@ua.pt 2 University of Aveiro, Aveiro, Portugal; adelaide@ua.pt The

More information

Counting Pigeonhole principle

Counting Pigeonhole principle CS 44 Discrete Mathematics for CS Lecture 23 Counting Pigeonhole principle Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square Course administration Homework 7 is due today Homework 8 is out due on

More information

Cluster Analysis. Andrew Kusiak Intelligent Systems Laboratory

Cluster Analysis. Andrew Kusiak Intelligent Systems Laboratory Cluster Aalysis Adrew Kusiak Itelliget Systems Laboratory 2139 Seamas Ceter The Uiversity of Iowa Iowa City, Iowa 52242-1527 adrew-kusiak@uiowa.edu http://www.icae.uiowa.edu/~akusiak Two geeric modes of

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2008 CS 551, Spring 2008 c 2008, Selim Aksoy (Bilkent University)

More information

Matrix Multiplication

Matrix Multiplication Matrix Multiplication CPS343 Parallel and High Performance Computing Spring 2013 CPS343 (Parallel and HPC) Matrix Multiplication Spring 2013 1 / 32 Outline 1 Matrix operations Importance Dense and sparse

More information

Discrete Mathematics, Spring 2004 Homework 8 Sample Solutions

Discrete Mathematics, Spring 2004 Homework 8 Sample Solutions Discrete Mathematics, Spring 4 Homework 8 Sample Solutions 6.4 #. Find the length of a shortest path and a shortest path between the vertices h and d in the following graph: b c d a 7 6 7 4 f 4 6 e g 4

More information

Fingerprint Image Compression

Fingerprint Image Compression Fingerprint Image Compression Ms.Mansi Kambli 1*,Ms.Shalini Bhatia 2 * Student 1*, Professor 2 * Thadomal Shahani Engineering College * 1,2 Abstract Modified Set Partitioning in Hierarchical Tree with

More information

Introduction. Computer Vision & Digital Image Processing. Preview. Basic Concepts from Set Theory

Introduction. Computer Vision & Digital Image Processing. Preview. Basic Concepts from Set Theory Introduction Computer Vision & Digital Image Processing Morphological Image Processing I Morphology a branch of biology concerned with the form and structure of plants and animals Mathematical morphology

More information

Morphological Image Processing

Morphological Image Processing Morphological Image Processing Ranga Rodrigo October 9, 29 Outline Contents Preliminaries 2 Dilation and Erosion 3 2. Dilation.............................................. 3 2.2 Erosion..............................................

More information

GraphBLAS Mathematics - Provisional Release 1.0 -

GraphBLAS Mathematics - Provisional Release 1.0 - GraphBLAS Mathematics - Provisional Release 1.0 - Jeremy Kepner Generated on April 26, 2017 Contents 1 Introduction: Graphs as Matrices........................... 1 1.1 Adjacency Matrix: Undirected Graphs,

More information

One-mode Additive Clustering of Multiway Data

One-mode Additive Clustering of Multiway Data One-mode Additive Clustering of Multiway Data Dirk Depril and Iven Van Mechelen KULeuven Tiensestraat 103 3000 Leuven, Belgium (e-mail: dirk.depril@psy.kuleuven.ac.be iven.vanmechelen@psy.kuleuven.ac.be)

More information

Matrix Multiplication

Matrix Multiplication Matrix Multiplication CPS343 Parallel and High Performance Computing Spring 2018 CPS343 (Parallel and HPC) Matrix Multiplication Spring 2018 1 / 32 Outline 1 Matrix operations Importance Dense and sparse

More information

Graph Theory. Part of Texas Counties.

Graph Theory. Part of Texas Counties. Graph Theory Part of Texas Counties. We would like to visit each of the above counties, crossing each county only once, starting from Harris county. Is this possible? This problem can be modeled as a graph.

More information

Giovanni De Micheli. Integrated Systems Centre EPF Lausanne

Giovanni De Micheli. Integrated Systems Centre EPF Lausanne Two-level Logic Synthesis and Optimization Giovanni De Micheli Integrated Systems Centre EPF Lausanne This presentation can be used for non-commercial purposes as long as this note and the copyright footers

More information

1.1 - Introduction to Sets

1.1 - Introduction to Sets 1.1 - Introduction to Sets Math 166-502 Blake Boudreaux Department of Mathematics Texas A&M University January 18, 2018 Blake Boudreaux (Texas A&M University) 1.1 - Introduction to Sets January 18, 2018

More information

Automatic Drawing for Tokyo Metro Map

Automatic Drawing for Tokyo Metro Map Automatic Drawing for Tokyo Metro Map Masahiro Onda 1, Masaki Moriguchi 2, and Keiko Imai 3 1 Graduate School of Science and Engineering, Chuo University monda@imai-lab.ise.chuo-u.ac.jp 2 Meiji Institute

More information

JME Language Reference Manual

JME Language Reference Manual JME Language Reference Manual 1 Introduction JME (pronounced jay+me) is a lightweight language that allows programmers to easily perform statistic computations on tabular data as part of data analysis.

More information

Comparative Analysis of Vertical Fragmentation Techniques in Distributed Environment

Comparative Analysis of Vertical Fragmentation Techniques in Distributed Environment Comparative Analysis of Vertical Fragmentation Techniques in Distributed Environment Mukta Goel 1, Shalini Bhaskar Bajaj 2 1 Assistant Professor, IT, TIT&S, Bhiwani, Haryana, India, 2 Professor, CSE, AUH,

More information

2 Dept. of Computer Applications 3 Associate Professor Dept. of Computer Applications

2 Dept. of Computer Applications 3 Associate Professor Dept. of Computer Applications International Journal of Computing Science and Information Technology, 2014, Vol.2(2), 15-19 ISSN: 2278-9669, April 2014 (http://ijcsit.org) Optimization of trapezoidal balanced Transportation problem

More information

IV. Special Linear Programming Models

IV. Special Linear Programming Models IV. Special Linear Programming Models Some types of LP problems have a special structure and occur so frequently that we consider them separately. A. The Transportation Problem - Transportation Model -

More information

Orange3 Data Fusion Documentation. Biolab

Orange3 Data Fusion Documentation. Biolab Biolab Mar 07, 2018 Widgets 1 IMDb Actors 1 2 Chaining 5 3 Completion Scoring 9 4 Fusion Graph 13 5 Latent Factors 17 6 Matrix Sampler 21 7 Mean Fuser 25 8 Movie Genres 29 9 Movie Ratings 33 10 Table

More information

Chapter 2: Entity-Relationship Model

Chapter 2: Entity-Relationship Model Chapter 2: Entity-Relationship Model! Entity Sets! Relationship Sets! Design Issues! Mapping Constraints! Keys! E-R Diagram! Extended E-R Features! Design of an E-R Database Schema! Reduction of an E-R

More information

Algebraic method for Shortest Paths problems

Algebraic method for Shortest Paths problems Lecture 1 (06.03.2013) Author: Jaros law B lasiok Algebraic method for Shortest Paths problems 1 Introduction In the following lecture we will see algebraic algorithms for various shortest-paths problems.

More information

Neural Networks. Neural Network. Neural Network. Neural Network 2/21/2008. Andrew Kusiak. Intelligent Systems Laboratory Seamans Center

Neural Networks. Neural Network. Neural Network. Neural Network 2/21/2008. Andrew Kusiak. Intelligent Systems Laboratory Seamans Center Neural Networks Neural Network Input Andrew Kusiak Intelligent t Systems Laboratory 2139 Seamans Center Iowa City, IA 52242-1527 andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak Tel. 319-335

More information

arxiv: v1 [math.gr] 19 Dec 2018

arxiv: v1 [math.gr] 19 Dec 2018 GROWTH SERIES OF CAT(0) CUBICAL COMPLEXES arxiv:1812.07755v1 [math.gr] 19 Dec 2018 BORIS OKUN AND RICHARD SCOTT Abstract. Let X be a CAT(0) cubical complex. The growth series of X at x is G x (t) = y V

More information

Package sparsereg. R topics documented: March 10, Type Package

Package sparsereg. R topics documented: March 10, Type Package Type Package Package sparsereg March 10, 2016 Title Sparse Bayesian Models for Regression, Subgroup Analysis, and Panel Data Version 1.2 Date 2016-03-01 Author Marc Ratkovic and Dustin Tingley Maintainer

More information

Linear Programming with Bounds

Linear Programming with Bounds Chapter 481 Linear Programming with Bounds Introduction Linear programming maximizes (or minimizes) a linear objective function subject to one or more constraints. The technique finds broad use in operations

More information

Chapter 3 Linear Structures: Lists

Chapter 3 Linear Structures: Lists Plan Chapter 3 Linear Structures: Lists 1. Two constructors for lists... 3.2 2. Lists... 3.3 3. Basic operations... 3.5 4. Constructors and pattern matching... 3.6 5. Simple operations on lists... 3.9

More information

CSE 373 Autumn 2012: Midterm #2 (closed book, closed notes, NO calculators allowed)

CSE 373 Autumn 2012: Midterm #2 (closed book, closed notes, NO calculators allowed) Name: Sample Solution Email address: CSE 373 Autumn 0: Midterm # (closed book, closed notes, NO calculators allowed) Instructions: Read the directions for each question carefully before answering. We may

More information

Chapter 4 Graphs and Matrices. PAD637 Week 3 Presentation Prepared by Weijia Ran & Alessandro Del Ponte

Chapter 4 Graphs and Matrices. PAD637 Week 3 Presentation Prepared by Weijia Ran & Alessandro Del Ponte Chapter 4 Graphs and Matrices PAD637 Week 3 Presentation Prepared by Weijia Ran & Alessandro Del Ponte 1 Outline Graphs: Basic Graph Theory Concepts Directed Graphs Signed Graphs & Signed Directed Graphs

More information

EE/CSCI 451 Midterm 1

EE/CSCI 451 Midterm 1 EE/CSCI 451 Midterm 1 Spring 2018 Instructor: Xuehai Qian Friday: 02/26/2018 Problem # Topic Points Score 1 Definitions 20 2 Memory System Performance 10 3 Cache Performance 10 4 Shared Memory Programming

More information

E-BLOW: E-Beam Lithography Overlapping aware Stencil Planning for MCC System

E-BLOW: E-Beam Lithography Overlapping aware Stencil Planning for MCC System E-BLOW: E-Beam Lithography Overlapping aware Stencil Planning for MCC System Bei Yu, Kun Yuan*, Jhih-Rong Gao, and David Z. Pan ECE Dept. University of Texas at Austin, TX *Cadence Inc., CA Supported in

More information

How to use Pivot table macro

How to use Pivot table macro How to use Pivot table macro Managing Pivot Tables Table Filter and Charts for Confluence add-on allows you to summarize your table data and produce its aggregated view in the form of a pivot table. You

More information

AN IMPROVED GRAPH BASED METHOD FOR EXTRACTING ASSOCIATION RULES

AN IMPROVED GRAPH BASED METHOD FOR EXTRACTING ASSOCIATION RULES AN IMPROVED GRAPH BASED METHOD FOR EXTRACTING ASSOCIATION RULES ABSTRACT Wael AlZoubi Ajloun University College, Balqa Applied University PO Box: Al-Salt 19117, Jordan This paper proposes an improved approach

More information

Overlapping Communities

Overlapping Communities Yangyang Hou, Mu Wang, Yongyang Yu Purdue Univiersity Department of Computer Science April 25, 2013 Overview Datasets Algorithm I Algorithm II Algorithm III Evaluation Overview Graph models of many real

More information

Graph Theory for Modelling a Survey Questionnaire Pierpaolo Massoli, ISTAT via Adolfo Ravà 150, Roma, Italy

Graph Theory for Modelling a Survey Questionnaire Pierpaolo Massoli, ISTAT via Adolfo Ravà 150, Roma, Italy Graph Theory for Modelling a Survey Questionnaire Pierpaolo Massoli, ISTAT via Adolfo Ravà 150, 00142 Roma, Italy e-mail: pimassol@istat.it 1. Introduction Questions can be usually asked following specific

More information

Generalized Network Flow Programming

Generalized Network Flow Programming Appendix C Page Generalized Network Flow Programming This chapter adapts the bounded variable primal simplex method to the generalized minimum cost flow problem. Generalized networks are far more useful

More information

BOOLEAN MATRIX FACTORIZATIONS. with applications in data mining Pauli Miettinen

BOOLEAN MATRIX FACTORIZATIONS. with applications in data mining Pauli Miettinen BOOLEAN MATRIX FACTORIZATIONS with applications in data mining Pauli Miettinen MATRIX FACTORIZATIONS BOOLEAN MATRIX FACTORIZATIONS o THE BOOLEAN MATRIX PRODUCT As normal matrix product, but with addition

More information

MATH 117 Statistical Methods for Management I Chapter Two

MATH 117 Statistical Methods for Management I Chapter Two Jubail University College MATH 117 Statistical Methods for Management I Chapter Two There are a wide variety of ways to summarize, organize, and present data: I. Tables 1. Distribution Table (Categorical

More information

Dynamically Motivated Models for Multiplex Networks 1

Dynamically Motivated Models for Multiplex Networks 1 Introduction Dynamically Motivated Models for Multiplex Networks 1 Daryl DeFord Dartmouth College Department of Mathematics Santa Fe institute Inference on Networks: Algorithms, Phase Transitions, New

More information

Engineering Methods in Microsoft Excel. Part 1:

Engineering Methods in Microsoft Excel. Part 1: Engineering Methods in Microsoft Excel Part 1: by Kwabena Ofosu, Ph.D., P.E., PTOE Abstract This course is the first of a series on engineering methods in Microsoft Excel tailored to practicing engineers.

More information

2. Discovery of Association Rules

2. Discovery of Association Rules 2. Discovery of Association Rules Part I Motivation: market basket data Basic notions: association rule, frequency and confidence Problem of association rule mining (Sub)problem of frequent set mining

More information

Applied Regression Modeling: A Business Approach

Applied Regression Modeling: A Business Approach i Applied Regression Modeling: A Business Approach Computer software help: SPSS SPSS (originally Statistical Package for the Social Sciences ) is a commercial statistical software package with an easy-to-use

More information

2.2 Set Operations. Introduction DEFINITION 1. EXAMPLE 1 The union of the sets {1, 3, 5} and {1, 2, 3} is the set {1, 2, 3, 5}; that is, EXAMPLE 2

2.2 Set Operations. Introduction DEFINITION 1. EXAMPLE 1 The union of the sets {1, 3, 5} and {1, 2, 3} is the set {1, 2, 3, 5}; that is, EXAMPLE 2 2.2 Set Operations 127 2.2 Set Operations Introduction Two, or more, sets can be combined in many different ways. For instance, starting with the set of mathematics majors at your school and the set of

More information

Touchstone File Format Specification

Touchstone File Format Specification Touchstone File Format Specification Version 2. Touchstone File Format Specification Version 2. Ratified by the IBIS Open Forum April 24, 29 Copyright 29 by TechAmerica. This specification may be distributed

More information

TRIE BASED METHODS FOR STRING SIMILARTIY JOINS

TRIE BASED METHODS FOR STRING SIMILARTIY JOINS TRIE BASED METHODS FOR STRING SIMILARTIY JOINS Venkat Charan Varma Buddharaju #10498995 Department of Computer and Information Science University of MIssissippi ENGR-654 INFORMATION SYSTEM PRINCIPLES RESEARCH

More information

[A] The Earth is Flat!

[A] The Earth is Flat! [A] The Earth is Flat! flat.(c cpp java) flat.in Gold Consider the following language: expression = { c where c is a single, lower-case letter ( e 1 e 2 e t n ) zero or more expressions followed by a natural

More information

Introduction to Clustering and Classification. Psych 993 Methods for Clustering and Classification Lecture 1

Introduction to Clustering and Classification. Psych 993 Methods for Clustering and Classification Lecture 1 Introduction to Clustering and Classification Psych 993 Methods for Clustering and Classification Lecture 1 Today s Lecture Introduction to methods for clustering and classification Discussion of measures

More information

Cluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1

Cluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1 Cluster Analysis Mu-Chun Su Department of Computer Science and Information Engineering National Central University 2003/3/11 1 Introduction Cluster analysis is the formal study of algorithms and methods

More information

Describe the two most important ways in which subspaces of F D arise. (These ways were given as the motivation for looking at subspaces.

Describe the two most important ways in which subspaces of F D arise. (These ways were given as the motivation for looking at subspaces. Quiz Describe the two most important ways in which subspaces of F D arise. (These ways were given as the motivation for looking at subspaces.) What are the two subspaces associated with a matrix? Describe

More information

CS473-Algorithms I. Lecture 11. Greedy Algorithms. Cevdet Aykanat - Bilkent University Computer Engineering Department

CS473-Algorithms I. Lecture 11. Greedy Algorithms. Cevdet Aykanat - Bilkent University Computer Engineering Department CS473-Algorithms I Lecture 11 Greedy Algorithms 1 Activity Selection Problem Input: a set S {1, 2,, n} of n activities s i =Start time of activity i, f i = Finish time of activity i Activity i takes place

More information

Mathematics. 2.1 Introduction: Graphs as Matrices Adjacency Matrix: Undirected Graphs, Directed Graphs, Weighted Graphs CHAPTER 2

Mathematics. 2.1 Introduction: Graphs as Matrices Adjacency Matrix: Undirected Graphs, Directed Graphs, Weighted Graphs CHAPTER 2 CHAPTER Mathematics 8 9 10 11 1 1 1 1 1 1 18 19 0 1.1 Introduction: Graphs as Matrices This chapter describes the mathematics in the GraphBLAS standard. The GraphBLAS define a narrow set of mathematical

More information

Analyzing Dshield Logs Using Fully Automatic Cross-Associations

Analyzing Dshield Logs Using Fully Automatic Cross-Associations Analyzing Dshield Logs Using Fully Automatic Cross-Associations Anh Le 1 1 Donald Bren School of Information and Computer Sciences University of California, Irvine Irvine, CA, 92697, USA anh.le@uci.edu

More information

Unlabeled equivalence for matroids representable over finite fields

Unlabeled equivalence for matroids representable over finite fields Unlabeled equivalence for matroids representable over finite fields November 16, 2012 S. R. Kingan Department of Mathematics Brooklyn College, City University of New York 2900 Bedford Avenue Brooklyn,

More information

Solving ONE S interval linear assignment problem

Solving ONE S interval linear assignment problem RESEARCH ARTICLE OPEN ACCESS Solving ONE S interval linear assignment problem Dr.A.Ramesh Kumar 1,S. Deepa 2, 1 Head, Department of Mathematics, Srimad Andavan Arts and Science College (Autonomous), T.V.Kovil,

More information

(Team Name) (Project Title) Software Design Document. Student Name (s):

(Team Name) (Project Title) Software Design Document. Student Name (s): (Team Name) (Project Title) Software Design Document Student Name (s): TABLE OF CONTENTS 1. INTRODUCTION 2 1.1Purpose 2 1.2Scope 2 1.3Overview 2 1.4Reference Material 2 1.5Definitions and Acronyms 2 2.

More information

Graphs and Graph Algorithms. Slides by Larry Ruzzo

Graphs and Graph Algorithms. Slides by Larry Ruzzo Graphs and Graph Algorithms Slides by Larry Ruzzo Goals Graphs: defns, examples, utility, terminology Representation: input, internal Traversal: Breadth- & Depth-first search Three Algorithms: Connected

More information

Class diagrams. Modeling with UML Chapter 2, part 2. Class Diagrams: details. Class diagram for a simple watch

Class diagrams. Modeling with UML Chapter 2, part 2. Class Diagrams: details. Class diagram for a simple watch Class diagrams Modeling with UML Chapter 2, part 2 CS 4354 Summer II 2014 Jill Seaman Used to describe the internal structure of the system. Also used to describe the application domain. They describe

More information

Linear Loop Transformations for Locality Enhancement

Linear Loop Transformations for Locality Enhancement Linear Loop Transformations for Locality Enhancement 1 Story so far Cache performance can be improved by tiling and permutation Permutation of perfectly nested loop can be modeled as a linear transformation

More information

Rectangle-Efficient Aggregation in Spatial Data Streams

Rectangle-Efficient Aggregation in Spatial Data Streams Rectangle-Efficient Aggregation in Spatial Data Streams Srikanta Tirthapura Iowa State David Woodruff IBM Almaden The Data Stream Model Stream S of additive updates (i, Δ) to an underlying vector v: v

More information

Chapter 3 A 3D Parity Bit Structure 1

Chapter 3 A 3D Parity Bit Structure 1 Chapter 3 A 3D Parity Bit Structure 1 3. Introduction There is always a major role of data structure used in any program or software. Bad programmers worry about the code, good programmers worry about

More information