UNIVERSITI PUTRA MALAYSIA CLASSIFICATION SYSTEM FOR HEART DISEASE USING BAYESIAN CLASSIFIER

Similar documents
AN IMPROVED PACKET FORWARDING APPROACH FOR SOURCE LOCATION PRIVACY IN WIRELESS SENSORS NETWORK MOHAMMAD ALI NASSIRI ABRISHAMCHI

BLOCK-BASED NEURAL NETWORK MAPPING ON GRAPHICS PROCESSOR UNIT ONG CHIN TONG UNIVERSITI TEKNOLOGI MALAYSIA

This item is protected by original copyright

INTEGRATION OF CUBIC MOTION AND VEHICLE DYNAMIC FOR YAW TRAJECTORY MOHD FIRDAUS BIN MAT GHANI

SEMANTICS ORIENTED APPROACH FOR IMAGE RETRIEVAL IN LOW COMPLEX SCENES WANG HUI HUI

HARDWARE AND SOFTWARE CO-SIMULATION PLATFORM FOR CONVOLUTION OR CORRELATION BASED IMAGE PROCESSING ALGORITHMS SAYED OMID AYAT

SUPERVISED MACHINE LEARNING APPROACH FOR DETECTION OF MALICIOUS EXECUTABLES YAHYE ABUKAR AHMED

MODELLING AND REASONING OF LARGE SCALE FUZZY PETRI NET USING INFERENCE PATH AND BIDIRECTIONAL METHODS ZHOU KAIQING

ENHANCING TIME-STAMPING TECHNIQUE BY IMPLEMENTING MEDIA ACCESS CONTROL ADDRESS PACU PUTRA SUARLI

LOGICAL OPERATORS AND ITS APPLICATION IN DETERMINING VULNERABLE WEBSITES CAUSED BY SQL INJECTION AMONG UTM FACULTY WEBSITES NURUL FARIHA BINTI MOKHTER

ADAPTIVE ONLINE FAULT DETECTION ON NETWORK-ON-CHIP BASED ON PACKET LOGGING MECHANISM LOO LING KIM UNIVERSITI TEKNOLOGI MALAYSIA

A LEVY FLIGHT PARTICLE SWARM OPTIMIZER FOR MACHINING PERFORMANCES OPTIMIZATION ANIS FARHAN BINTI KAMARUZAMAN UNIVERSITI TEKNOLOGI MALAYSIA

DESIGN AND IMPLEMENTATION OF A MUSIC BOX USING FPGA TAN KIAN YIAK

IMPLEMENTATION OF UNMANNED AERIAL VEHICLE MOVING OBJECT DETECTION ALGORITHM ON INTEL ATOM EMBEDDED SYSTEM

ENHANCING SRAM PERFORMANCE OF COMMON GATE FINFET BY USING CONTROLLABLE INDEPENDENT DOUBLE GATES CHONG CHUNG KEONG UNIVERSITI TEKNOLOGI MALAYSIA

MICRO-SEQUENCER BASED CONTROL UNIT DESIGN FOR A CENTRAL PROCESSING UNIT TAN CHANG HAI

DETECTION OF WORMHOLE ATTACK IN MOBILE AD-HOC NETWORKS MOJTABA GHANAATPISHEH SANAEI

OPTIMIZE PERCEPTUALITY OF DIGITAL IMAGE FROM ENCRYPTION BASED ON QUADTREE HUSSEIN A. HUSSEIN

A TRUST MODEL FOR BUSINESS TO CUSTOMER CLOUD E-COMMERCE HOSSEIN POURTAHERI

ISOGEOMETRIC ANALYSIS OF PLANE STRESS STRUCTURE CHUM ZHI XIAN

SECURE-SPIN WITH HASHING TO SUPPORT MOBILITY AND SECURITY IN WIRELESS SENSOR NETWORK MOHAMMAD HOSSEIN AMRI UNIVERSITI TEKNOLOGI MALAYSIA

AUTOMATIC APPLICATION PROGRAMMING INTERFACE FOR MULTI HOP WIRELESS FIDELITY WIRELESS SENSOR NETWORK

IMPROVED IMAGE COMPRESSION SCHEME USING HYBRID OF DISCRETE FOURIER, WAVELETS AND COSINE TRANSFORMATION MOH DALI MOUSTAFA ALSAYYH

OPTIMIZED BURST ASSEMBLY ALGORITHM FOR MULTI-RANKED TRAFFIC OVER OPTICAL BURST SWITCHING NETWORK OLA MAALI MOUSTAFA AHMED SAIFELDEEN

ENHANCING WEB SERVICE SELECTION USING ENHANCED FILTERING MODEL AJAO, TAJUDEEN ADEYEMI

HARDWARE/SOFTWARE SYSTEM-ON-CHIP CO-VERIFICATION PLATFORM BASED ON LOGIC-BASED ENVIRONMENT FOR APPLICATION PROGRAMMING INTERFACING TEO HONG YAP

COLOUR IMAGE WATERMARKING USING DISCRETE COSINE TRANSFORM AND TWO-LEVEL SINGULAR VALUE DECOMPOSITION BOKAN OMAR ALI

HERMAN. A thesis submitted in fulfilment of the requirements for the award of the degree of Doctor of Philosophy (Computer Science)

HIGH SPEED SIX OPERANDS 16-BITS CARRY SAVE ADDER AWATIF BINTI HASHIM

RECOGNITION OF PARTIALLY OCCLUDED OBJECTS IN 2D IMAGES ALMUASHI MOHAMMED ALI UNIVERSITI TEKNOLOGI MALAYSIA

PRIVACY FRIENDLY DETECTION TECHNIQUE OF SYBIL ATTACK IN VEHICULAR AD HOC NETWORK (VANET) SEYED MOHAMMAD CHERAGHI

STUDY OF FLOATING BODIES IN WAVE BY USING SMOOTHED PARTICLE HYDRODYNAMICS (SPH) HA CHEUN YUEN UNIVERSITI TEKNOLOGI MALAYSIA

AMBA AXI BUS TO NETWORK-ON-CHIP BRIDGE NG KENG YOKE UNIVERSITI TEKNOLOGI MALAYSIA

PERFOMANCE ANALYSIS OF SEAMLESS VERTICAL HANDOVER IN 4G NETWOKS MOHAMED ABDINUR SAHAL

A SEED GENERATION TECHNIQUE BASED ON ELLIPTIC CURVE FOR PROVIDING SYNCHRONIZATION IN SECUERED IMMERSIVE TELECONFERENCING VAHIDREZA KHOUBIARI

Signature :.~... Name of supervisor :.. ~NA.lf... l.?.~mk.. :... 4./qD F. Universiti Teknikal Malaysia Melaka

THESIS PROJECT ARCHIVE SYSTEM (T-PAS) SHAHRUL NAZMI BIN ISMAIL

SMART AQUARJUM (A UTOMATIC FEEDING MACHINE) SY AFINAZ ZURJATI BINTI BAHARUDDIN

Design and Implementation of I2C BUS Protocol on Xilinx FPGA. Meenal Pradeep Kumar

THE COMPARISON OF IMAGE MANIFOLD METHOD AND VOLUME ESTIMATION METHOD IN CONSTRUCTING 3D BRAIN TUMOR IMAGE

Data Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1395

SOLUTION AND INTERPOLATION OF ONE-DIMENSIONAL HEAT EQUATION BY USING CRANK-NICOLSON, CUBIC SPLINE AND CUBIC B-SPLINE WAN KHADIJAH BINTI WAN SULAIMAN

DYNAMIC TIMETABLE GENERATOR USING PARTICLE SWARM OPTIMIZATION (PSO) METHOD TEH YUNG CHUEN UNIVERSITY MALAYSIA PAHANG

ENHANCEMENT OF UML-BASED WEB ENGINEERING FOR METAMODELS: HOMEPAGE DEVELOPMENT CASESTUDY KARZAN WAKIL SAID

Data Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1394

MAC PROTOCOL FOR WIRELESS COGNITIVE NETWORK FARAH NAJWA BINTI MOKHTAR

HARDWARE-ACCELERATED LOCALIZATION FOR AUTOMATED LICENSE PLATE RECOGNITION SYSTEM CHIN TECK LOONG UNIVERSITI TEKNOLOGI MALAYSIA

RGB COLOR IMAGE WATERMARKING USING DISCRETE WAVELET TRANSFORM DWT TECHNIQUE AND 4-BITS PLAN BY HISTOGRAM STRETCHING KARRAR ABDUL AMEER KADHIM

USING A CONTINGENT HEURISTIC APPROACH

SYSTEMATIC SECURE DESIGN GUIDELINE TO IMPROVE INTEGRITY AND AVAILABILITY OF SYSTEM SECURITY ASHVINI DEVI A/P KRISHNAN

IMPLEMENTATION AND PERFORMANCE ANALYSIS OF IDENTITY- BASED AUTHENTICATION IN WIRELESS SENSOR NETWORKS MIR ALI REZAZADEH BAEE

MAGNETIC FLUX LEAKAGE SYSTEM FOR WIRE ROPE INSPECTION USING BLUETOOTH COMMUNICATION MUHAMMAD MAHFUZ BIN SALEHHON UNIVERSITI TEKNOLOGI MALAYSIA

LINK QUALITY AWARE ROUTING ALGORITHM IN MOBILE WIRELESS SENSOR NETWORKS RIBWAR BAKHTYAR IBRAHIM UNIVERSITI TEKNOLOGI MALAYSIA

A NEW STEGANOGRAPHY TECHNIQUE USING MAGIC SQUARE MATRIX AND AFFINE CIPHER WALEED S. HASAN AL-HASAN UNIVERSITI TEKNOLOGI MALAYSIA

SLANTING EDGE METHOD FOR MODULATION TRANSFER FUNCTION COMPUTATION OF X-RAY SYSTEM FARHANK SABER BRAIM UNIVERSITI TEKNOLOGI MALAYSIA

UNIVERSITI MALAYSIA PAHANG

COMP 465 Special Topics: Data Mining

INFORM DEPARTURE AND ARRIVING OF BUSSES USING BLUETOOTH MOHD SUHKRI BIN YASRI

WEB MANAGEMENT SYSTEM FOR SERIOUS GAME IN INTERNAL MEDICAL PRACTICE. Phoon Wei Yin

The Discovery and Retrieval of Temporal Rules in Interval Sequence Data

A Comparative Study of Data Mining Process Models (KDD, CRISP-DM and SEMMA)

A REVIEW ON VARIOUS APPROACHES OF CLUSTERING IN DATA MINING

ADAPTIVE VIDEO STREAMING FOR BANDWIDTH VARIATION WITH OPTIMUM QUALITY

Introduction to Data Mining and Data Analytics

TOMRAS: A Task Oriented Mobile Remote Access System for Desktop Applications

PROBLEMS ASSOCIATED WITH EVALUATION OF EXTENSION OF TIME (EOT) CLAIM IN GOVERNMENT PROJECTS

This tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X

INTELLIGENT NON-DESTRUCTIVE CLASSIFICATION OF JOSAPINE PINEAPPLE MATURITY USING ARTIFICIAL NEURAL NETWORK

The Establishment of Large Data Mining Platform Based on Cloud Computing. Wei CAI

COMPARISON OF DIFFERENT CLASSIFICATION TECHNIQUES

Data Mining with R Programming Language for Optimizing Credit Scoring in Commercial Bank

AUTOMATED STUDENT S ATTENDANCE ENTERING SYSTEM BY ELIMINATING FORGE SIGNATURES

code pattern analysis of object-oriented programming languages

FUZZY NEURAL NETWORKS WITH GENETIC ALGORITHM-BASED LEARNING METHOD M. REZA MASHINCHI UNIVERSITI TEKNOLOGI MALAYSIA

Data Mining Technology Based on Bayesian Network Structure Applied in Learning

UNIVERSITI PUTRA MALAYSIA ADAPTIVE METHOD TO IMPROVE WEB RECOMMENDATION SYSTEM FOR ANONYMOUS USERS

Pengguna akan diberikan Username dan Password oleh Administrator untuk login sebagai admin/conference Manager bagi conference yang akan diadakan.

AN ENHANCED SIMULATED ANNEALING APPROACH FOR CYLINDRICAL, RECTANGULAR MESH, AND SEMI-DIAGONAL TORUS NETWORK TOPOLOGIES NORAZIAH BINTI ADZHAR

Data Mining and Warehousing

DEVELOPMENT OF SPAKE S MAINTENANCE MODULE FOR MINISTRY OF DEFENCE MALAYSIA SYED ARDI BIN SYED YAHYA KAMAL UNIVERSITI TEKNOLOGI MALAYSIA

Dr. Prof. El-Bahlul Emhemed Fgee Supervisor, Computer Department, Libyan Academy, Libya

Particle Swarm Optimization Methods for Pattern. Recognition and Image Processing

Question Bank. 4) It is the source of information later delivered to data marts.

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset.

A STUDY OF SOME DATA MINING CLASSIFICATION TECHNIQUES

The Data Mining usage in Production System Management

Business Intelligence Roadmap HDT923 Three Days

DATA MINING AND WAREHOUSING

Meaning & Concepts of Databases

ARM PROCESSOR EMULATOR MOHAMAD HASRUZAIRIN B MOHD HASHIM

DROWSINESS DETECTION FOR CAR ASSISTED DRIVER SYSTEM USING IMAGE PROCESSING ANALYSIS INTERFACING WITH HARDWARE HUONG NICE QUAN

DYNAMIC TIMESLOT ALLOCATION TECHNIQUE FOR WIRELESS SENSOR NETWORK OON ERIXNO

Hiking Gears Comparing System (HGCS) Using Web Scraping Technique

Parametric Comparisons of Classification Techniques in Data Mining Applications

Data Mining and Data Warehousing Introduction to Data Mining

An Introduction to Data Mining in Institutional Research. Dr. Thulasi Kumar Director of Institutional Research University of Northern Iowa

STATISTICAL APPROACH FOR IMAGE RETRIEVAL KHOR SIAK WANG DOCTOR OF PHILOSOPHY UNIVERSITI PUTRA MALAYSIA

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing.

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques

Transcription:

UNIVERSITI PUTRA MALAYSIA CLASSIFICATION SYSTEM FOR HEART DISEASE USING BAYESIAN CLASSIFIER ANUSHA MAGENDRAM. FSKTM 2007 9

CLASIFICATION SYSTEM FOR HEART DISEASE USING BAYESIAN CLASSIFIER ANUSHA MAGENDRAM Thesis submitted in Partial Fulfilment of the Requirement for the Degree of Master of Science in the Faculty of Computer Science and Information Technology University Putra Malaysia DEC 2007

APPROVAL SHEET This thesis is received, checked and approved for submission in partial fulfilment of the requirement for the degree of Master of Science in the Faculty of Computer Science and Information Technology, University Putra Malaysia, Serdang, Selangor Dam1 Ehsan.... (Prof. Madya Dr. Md. Nasir Sulaiman) (Supervisor) Faculty of Computer Science and Information Technology University Putra Malaysia Serdang Selangor Dam1 Ehsan... Date

DECLARATION I hereby declare that the thesis is based on my original work except for quotations and citations, which have been dully acknowledged. I also declare that it has not been previously or concurrently submitted for any other degree at UPM or other institutions. Signature Anusha Magendram

ABSTRACT Increase of hearth problem in this world is rising each day. Classification system for heart disease is a system that able to justify whether a patient has heart problem or not. This is a new approach that able to use by doctors to rectify the heart problem. This system was developing base on to three main part which is data processing, testing and implementation of the algorithm. In this system a Bayesian algorithm was used in order to implement the system. This system was mainly developing using java programming. Apache Tom cat was used as a server in order to run the application smoothly. This system is tested using the test and training data set. It has proven that the system able to provide an accurate result on justifying whether the patient has has problem or not. As a conclusion by using this system doctors able to improvise the effectiveness in medical field.

ABSTRAK Pada masa kini,penyakit jantung semakin meningkat dari masa ke semasa.sistem klasifikasi untuk penyakit jantung merupakan sebuah sistem yang mapumegesan sama ada pesakit itu menghadapi masalah jantung.ia juga merupakan kaedah baru yang dapat membantu doctor megenal pasti jantung yang bermasalah. Sistem ini dibina berdasarkan tigafasa iaitu pemprocessan data, pengujian dan implementasi berdasarkan algoritma. Algoritma Bayesian telah di gunakan dalam sistem ini untuk process pengimplementasian. Secara keseluruhannya,sistem ini telah di bangunkan mengunakan program java.apache tom cat digunakan sebagai terminalpelayan supaya perjalanan sistem lebih lancar. Sistem ini telah diuji mengunakan kaedah pengujian data dan pengujian latihanbahawa sistem ini mampu memberikan keputusan yang tepat dalam menjustifikasi masalah pesakit. Kesimpulanny, pengunaan sistem ini boleh memberikan manfaat kepada doctor untuk memperbaiki keberkesanan sesuatu process di bidang perubatan.

Blessing of My Family & Best Friend For their loving and caring inspired me: "There's No Success without Hard work"

ACKNOWLEDGEMENTS This thesis means a lot to me. Many people have played their role to assist me in all manners. Support and encouragement was the main thing I ever needed. Deepest thanks to my supervisor; Prof. Madya Dr. Md. Nasir Sulaiman for his assistance, guidance and support throughout the work of this thesis is greatly appreciated. Dedications to dear mum, dad and all my family members who had been the very supportive, my best friend for his encouragement and others who were always there with help whenever I need it. Last but no least, my thanks to all that are involved in this project either directly or indirectly. I hope that god blesses all of the efforts being put on this project by the whole of you. vii

TABLE OF CONTENTS APPROVAL SHEET DECLARATION ABSTRACT DEDICATION ACKNOWLEDGEMENT. TABLE OF CONTENTS LIST OF FIGURES LIST OF TABLES Pages.. 11... 111 iv vi vii xi xii... Xlll CHAPTERS 1 INTRODUCTION 1.1 Background 1.1 Problem Statement 1.3 Objective 1.4 Scope 1.5 Organization of the thesis LITERATURE REVIEW Introduction... Vlll

Classification Two Tier Software Architectures 2.3.1 Technical Details 2.3.2 Usage of two tier architecture Previous Researches 2.4.1 Ensemble Feature Selection with the Simple Bayesian Classification in Medical Diagnostics 2.2.4 A global optimization approach to classification in medical 15 diagnosis and prognosis 2.4.3Mining association rules using asthma patient profile data set METHODOLOGY 3.1 Introduction 3.2 Software Development Life Cycle (SDLC) 3.2.1 User Requirement analysis 3.2.2 Design 3.2.3 Coding 3.2.4 implementation 3.2.5 Testing 3.2.6 Maintenance and Operation Guide Line of The Research Work Data Preparation 3.4.1 Raw Data Structure 3.4.2 Pre - Processing Data

Generating Of Experiment & Test Data Set Implementation of The Algorithm Summary 4 IMPLEMENTATION AND EXPERIMENTAL DESIGN 4.1 System Architecture 4.2 System Module 4.3 User interface 4.3.1 Interface Design 4.3.2 User Input 4.3.3 Output Design 4.4 Connection With The Server 4.5 Connection With The Database 4.6 Hardware & Software Requirement 4.7 System Work Flow 4.8 Conclusion 5 TESTING 5.1 Testing 5.1.1 Unit Testing 5.1.2 Integration Testing 5.1.3 System Testing 6 RESULT AND DISCUSSIONS 6.1 Performance of the System Data Set Conversion

6.3 Summary 7 CONCLUSION AND FUTURE WORKS 7.1 Conclusion 7.2 Contribution 7.3 Future Works REFERENCES

LIST OF FIGURES FIGURE NO TITLE 2.1 Two Tier Client Server Architecture PAGE 11 Software Development Life Cycle Guide line Chart Pre -Processing Steps Architecture Design System Module Main Page lnterface Design Uploading Screen lnterface Data Processing Interface Design System work Flow Chart Main screen Uploading Data Screen Display Data Screen Missing Data Screen Algorithm / Data Concept implementation Screen Option To calculate Screen Calculation Screen Report Screen Statistic Report xii

Gender Comparison Report Clean Data Set Converted Data Set Optional Calculation Bayesian Calculation LIST OF TABLES TABLE NO 4.1 TITLE Data Set Options Data Dictionary Hardware & Software Requirement System Testing Actual Data Set Test Data Set PAGE 36

CHAPTER 1 INTRODUCTION 1.1 Background Health is the most important aspect in human race. Everybody wants to live without disease but there's no escape to it. The only way out is precautions and early detection. There are many medical databases created with patient details and records. It became tans of collection.but no body has time to always over look at this. This is because there are too many data's and they are not organized. How can they summarize it, so that it is easier for doctors to read it? By using Data mining concepts. Data mining also known as knowledge discovery in database. Data mining is a process of analyzing data from different perspectives and summarizing it into useful information, which can be used to increase revenue and cut cost. It has gained considerable attention among the practitioners and researchers. There many kinds of data mining concepts such as association, classification, cluster, outlier and evolution. These concepts able to help to arrange the data in a systematic approach. Further more able to discover the trend and hidden pattern within the data's that could significantly enhanced our understanding of the patients,diseases and the diagnose. In this research classification is going to be the concept. Classification is the a suitable data mining method in order to analysis the raw data. It is an important area of research

in data mining. Classification partitions massive quantities of data into sets of common characteristic and properties. The classification model can be used to categorize future data samples, as well as providing a better understanding of the database contents. Classification is particularly useful when a database contains examples that can be used as the basis for future decision making, e.g. for assessing credit risks, for medical diagnosis, or for scientific data analysis. Time and cost is the main issue when coming to saving lives.the quicker we save a life the cheaper the cost is. Well this is where the concept lays. By analysis the dieses and the relations with symptoms and diagnose quicken a doctors task. Well here is the method to analysis the disease.

1.2 Problem statement Everybody has their own limitation, reading from whole list of data doesn't tell us anything. We are puzzle when looking at whole lots of details in one go. How do we solve this? The main aim is to improvise the way of detecting a heart disease problem. Currently it is done manually.it will be good and much more effective if there is a system which can compile all the test result in order to detect the heart problem.there few question rise when it come to detecting heart problem. For an example : How can a doctor notice a patient has heart problem immediately? Is there a way to analysis the test results?

1.3 Objectives of the Research The objective of this research is to develop a classification system for detecting heart disease using Bayesian classifier. Basically by using a data mining approach. Expect the system to detect heart disease problem using the Bayesian algorithm. This system also able to detect the missing value on the data and able to generate reports in graph forms. 1.4 Scope of the Research This research is mainly focus on heart disease. It is to detect the present of heart disease on individual patient. In order to test this, the raw data is taken from the internet. The raw data is separated into two potions. A training data set and a test data set. The system will be tested using this two set of data.

1.5 Organization of the Thesis The thesis has seven chapters which covers introduction, literature review,methodology,implementation and experimental design, testing, result and discussions and finally conclusion and future works. Introduction chapter covers the background of the research, information of heart disease and methodology. In here, the objective and scope of the research also is covered. The chapter two which is literature review covers the introduction of the overview of data mining concepts and classification concept. In here it is also explained about the architecture chosen, the purpose of it and the technical details. Further more, this chapter also covers the work of previous researchers which are used as a guide line for this research. Chapter three covers the methodology used to implement this system. The introduction explains the over view of the whole concepts in methodology. In this chapter it also talks about the way of system development, the requirement, design and steps taken to The following would be chapter four. This chapter covers the implementation of the system. It also explains the over view and functions of the system.

Chapter five covers testing part. In this chapter it is explain what kind of the testing is available and should be done in a system. Also included the test done in this system and the result of it. Result and discussion is chapter six. In this chapter, it is discuss the over view of the whole result of the system and the out put. Finally the last chapter, chapter seven is the conclusion and future works. It explains the over view of whole research based and explores promising directions for future works and development.

CHAPTER 2 Literature Review 2.1 Introduction In today's world, knowledge is power. An important source of knowledge is the data stored in databases. Data allows us to learn from the past and to predict the future. With the rapid computerization of businesses and organizations, a huge amount of data has been collected and stored in databases, and the rate at which data is stored is growing at a phenomenal rate. As a result, traditional ad hoc mixtures of statistical techniques and data management tools are no longer adequate for analyzing this vast collection of data. Data mining (or knowledge discovery in databases or KDD in short) has emerged as a growing field of multidisciplinary research for discovering interestingluseful knowledge from large databases. KDD is defined as the extraction of implicit, previously unknown, and potentially useful patterns from data. Data mining is the use of automated data analysis techniques to uncover previously undetected relationships among data items. Data mining often involves the analysis of data stored in a data warehouse. Three of the major data mining techniques are regression, classification and clustering.

Data mining consists of five major elements: Extract, transform, and load transaction data onto the data warehouse system. Q Store and manage the data in a multidimensional database system. *:* Provide data access to business analysts and information technology professionals. *:* Analyze the data by application software. *:* Present the data in a useful format, such as a graph or table

2.2 Classification Classification enables the categorization of data (or entities) into pre-defined classes (of course there should be two or more classes pre-defined before running categorization). The use of classification algorithms involves a training set consisting of pre-classified examples. The classifier calibration algorithm uses the pre-classified examples to determine a set of parameters required for proper discrimination between the classes. The algorithm then encodes these parameters into a model called a classifier. Once such a classifier is calibrated, it can assign new filings to either of the classes. There are many algorithms that can be used for classification, such as decision trees, neural networks, logistic regression, etc. Using this method Data Mining system learns from examples or the data (data warehouses, databases etc) how to partition or classify certain objects (it can be an object, an action, or any other information, that can be formalized). As a result, data mining software formulates classification rules.

2.3 Two Tier Software Architectures Two tier software architectures were developed in the 1980s from the file server software architecture design. The two-tier architecture is intended to improve usability by supporting a forms-based, user-friendly interface. The two-tier architecture improves scalability by accommodating up to 100 users (file server architectures only accommodate a dozen users), and improves flexibility by allowing data to be shared, usually within a homogeneous environment. The two-tier architecture requires minimal operator intervention, and is frequently used in non-complex, non-time critical information processing systems (Darleen Sadoski,2007). 2.3.1 Technical Details Two tier architectures consist of three components distributed in two layers: client (requester of services) and server (provider of services). The three components are as below: 1. User System Interface (such as session, text input, dialog, and display management services) 2. Processing Management (such as process development, process enactment, process monitoring, and process resource services) 3. Database Management (such as data and file services)

The two-tier design allocates the user system interface exclusively to the client. It places database management on the server and splits the processing management between client and server, creating two layers. Figure 1 depicts the two-tier software architecture. Figure 2.1: Two Tier Client Server Architecture In general, the user system interface client invokes services from the database management server. In many two-tier designs, most of the application portion of processing is in the client environment. The database management server usually provides the portion of the processing related to accessing data (often implemented in store procedures). Clients commonly communicate with the server through SQL statements or a call-level interface. It should be noted that connectivity between tiers could be dynamically changed depending upon the user's request for data and services. As compared to the file server software architecture (that also supports distributed systems), the two-tier architecture improves flexibility and scalability by