Cluster Analysis of Electrical Behavior


Journal of Computer and Communications, 2015, 3, 88-93
Published Online May 2015 in SciRes. http://www.scirp.org/journal/jcc
http://dx.doi.org/10.4236/jcc.2015.35011

Lin Liu
School of Electrical and Electronic Engineering, North China Electric Power University, Beijing, China
Email: bhdliulin@163.com

Received February 2015

Abstract
In this paper, we apply clustering analysis from data mining to the power system. We adopt the K-means clustering algorithm to analyze customer load and to identify similar behavior among electricity customers, and we adopt principal component analysis to make the clustering result visible. Simulation and analysis in MATLAB verify that the clustering is rational. The conclusions of this paper can provide an important basis for peak regulation, stable operation, and security of the power system.

Keywords
K-Means Clustering Analysis, Principal Component Analysis, The Power System

1. Introduction
On the one hand, in the age of big data, with such massive amounts of information, data affects our work and lives every second, and data mining and clustering analysis are becoming more and more important. On the other hand, with the rapid development of our national economy, power consumption keeps growing. Our current power supply relies mainly on thermal power, so to ensure the stable operation of the power system, power dispatch and peak regulation become more and more important. Clustering analysis of customer power load is a key link in power decision-making. Therefore, this paper focuses on the application of big data in the power system.

Clustering algorithms can be classified in different ways according to different standards. Commonly used algorithms in clustering analysis include the K-means clustering algorithm, the agglomerative hierarchical clustering algorithm, the SOM neural network clustering algorithm, the FCM fuzzy clustering algorithm, and so on [1]. By comparison, we find that K-means and FCM both have good overall performance; however, FCM is too complex for our purposes.
Power system data is produced every second, so K-means stands out for its high efficiency. We select the K-means clustering algorithm to analyze customer power load. Power load can then be balanced according to the different classes, and different services can be provided to different kinds of customers. The characteristic of this article is that every detail is analyzed, from the generation of customer power load to data clustering.

How to cite this paper: Liu, L. (2015) Cluster Analysis of Electrical Behavior. Journal of Computer and Communications, 3, 88-93. http://dx.doi.org/10.4236/jcc.2015.35011

2. The Source Data of Power Load
The source data used for clustering analysis in this paper comes from reference [2]. We sample the reference data and then interpolate it to regenerate the data. We select 4 classes of power load, and each class

respectively has 100 sets of data, for a total of 400 sets of data. To analyze one day's power load, we take a sample every 10 minutes, so each data set contains 144 values (as shown in Figure 1).

Figure 1. Source data: customer load.

3.1. The Algorithm and Process
Clustering is one of the important research topics in data mining; it is the process of grouping physical objects into multiple classes or clusters [3] [4]. Objects in the same cluster are as similar as possible, while objects in different clusters are as different as possible. Clustering can handle different field types, discover clusters of arbitrary shape, and process abnormal data; it is not sensitive to data order and depends little on professional knowledge.

The K-means algorithm is one of the most classic clustering algorithms in common use today. It has advantages in the following three aspects [5]: it is quick and simple; it is highly efficient and scalable for large data sets; and it has nearly linear time complexity, which makes it suitable for mining large data sets. The time complexity of the K-means clustering algorithm is a function of n, k, and t, where n is the number of objects in the data set, t is the number of iterations of the algorithm, and k is the number of clusters.

So this paper uses the K-means clustering algorithm to analyze customer load, following the flow chart shown in Figure 2.

3.2. The Steps of K-Means Algorithm
1) Randomly select k points in the data set $\{x_i\}_{i=1}^{N}$ as the initial clustering centers $\mu_1, \mu_2, \ldots, \mu_k$, where N is the number of samples.

2) For each sample point $x_i$ in the data set, calculate the Euclidean distance between it and each clustering center, and get its category label:

$$\mathrm{label}_i = \arg\min_{j} \lVert x_i - \mu_j \rVert^2, \quad i = 1, \ldots, N;\ j = 1, \ldots, k \tag{1}$$

3) Recalculate the k cluster centers according to Formula (2):

$$\mu_j = \frac{1}{N_j} \sum_{x_i \in \mu_j} x_i, \quad j = 1, \ldots, k \tag{2}$$

In the formula, $N_j$ is the number of objects in cluster $\mu_j$.

Repeat step 2) and step 3) until the convergence criterion is met. Convergence is evaluated with the squared-error criterion shown in Formula (3):

$$E = \sum_{j=1}^{k} \sum_{x \in \mu_j} \lVert x - m_j \rVert^2 \tag{3}$$

In the formula, E is the sum of squared errors of all the objects in the database; x is a point in space; $m_j$ is the mean of cluster $\mu_j$. This objective function makes the generated clusters as compact and as independent as possible.

Applying the above K-means algorithm to the data, we find that the customer load clearly divides into 4 categories; the clustering centers are shown in Figure 3.

Figure 2. Flow chart.

4. Visualization of Clustering Results
Because of the large amount of data, with 144 sampling moments selected every day, we cannot express the clustering results directly. To make the clustering results visual, we use principal component analysis (PCA) to study them. PCA is a mathematical method of dimensionality reduction. It transforms many correlated variables into a new set of independent variables [6]. Using as few variables as possible to express as much information as possible is one of the basic principles of PCA. Here the 144-dimensional data is mapped to a 3-dimensional space and then analyzed; that is, 3 principal components are selected. The calculation steps of PCA are introduced below.

Standardization of the original data
Assume the sample observation data matrix is

$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$$

Then the original data are standardized as follows:

$$x_{ij}^{*} = \frac{x_{ij} - \bar{x}_j}{\sqrt{\operatorname{Var}(x_j)}} \quad (i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, p)$$

where

$$\bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij}$$

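$$\operatorname{Var}(x_j) = \frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2 \quad (j = 1, 2, \ldots, p)$$

Calculation of the sample correlation coefficient matrix
For convenience, assume that the standardized data is still denoted by X. The correlation coefficient matrix of the standardized data is

$$R = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1p} \\ r_{21} & r_{22} & \cdots & r_{2p} \\ \vdots & \vdots & & \vdots \\ r_{p1} & r_{p2} & \cdots & r_{pp} \end{pmatrix}$$

where

$$r_{ij} = \operatorname{cov}(x_i, x_j) = \frac{\sum_{k=1}^{n} (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)}{n-1} \quad (n > 1)$$

Calculation of the characteristic values and corresponding characteristic vectors of the correlation coefficient matrix R
Characteristic values: $\lambda_1, \lambda_2, \ldots, \lambda_p$
Characteristic vectors: $a_i = (a_{i1}, a_{i2}, \ldots, a_{ip}),\ i = 1, 2, \ldots, p$

Selection of the important principal components and their expression
Through principal component analysis we can get p principal components, but because the variance of each successive principal component decreases, the amount of information it carries also declines. In practical analysis, the first k principal components are selected according to their contribution rates; usually the cumulative contribution rate should reach more than 85%, to ensure that the integrated variables carry most of the information of the original variables. The contribution rate of the i-th principal component is

$$\text{contribution rate} = \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j}$$

and the coefficients $a_{ij}$ are arranged as

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$

Calculation of principal component scores
Using the standardized original data, the principal component values are calculated for each sample from the principal component expressions. This gives the new data of every sample on each principal component, that is, the principal component scores. The specific form is

$$\begin{pmatrix} F_{11} & F_{12} & \cdots & F_{1k} \\ F_{21} & F_{22} & \cdots & F_{2k} \\ \vdots & \vdots & & \vdots \\ F_{n1} & F_{n2} & \cdots & F_{nk} \end{pmatrix}$$

where

$$F_{ij} = a_{j1} x_{i1} + a_{j2} x_{i2} + \cdots + a_{jp} x_{ip} \quad (i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, k)$$

Having finished the clustering analysis and the principal component analysis, we make the scatter plot of the clustering results in MATLAB, as shown in Figure 5.

5. Simulation Analysis
In order to verify the rationality of the K-means algorithm used in this paper, this paper uses MATLAB simulation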

analysis as a demonstration. Figure 1 shows the customer load source data before clustering analysis, and Figure 3 shows each clustering center after clustering. Comparing the two pictures, all of customer load 1 was clustered into category D, customer load 2 into category B, customer load 3 into category A, and customer load 4 into category C, which shows that our clustering method is reasonable.

Figure 4 depicts the distance from every point to each clustering center:
Subgraph A shows the distance from all 4 kinds of customer load to clustering center A; all points of customer load 3 are nearest to clustering center A.
Subgraph B shows the distance from all 4 kinds of customer load to clustering center B; all points of customer load 2 are nearest to clustering center B.
Subgraph C shows the distance from all 4 kinds of customer load to clustering center C; all points of customer load 4 are nearest to clustering center C.
Subgraph D shows the distance from all 4 kinds of customer load to clustering center D; all points of customer load 1 are nearest to clustering center D.
This is consistent with the definition of the clustering center, and the relationship is also compatible with that exhibited by Figure 1 and Figure 3.

Based on the earlier K-means clustering and principal component analysis, we draw the visualization map of the clustering results, shown in Figure 5. From the figure we can also see directly that the customer load can be divided into 4 categories. These four categories correspond to the four types of customer load in the source data, which further supports the correctness of this clustering analysis.

Figure 3. Clustering center.
Figure 4. The distance between each point and each clustering center.
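The clustering check described in this section can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: it uses Python with NumPy rather than the MATLAB used in the paper, the four load shapes produced by `make_synthetic_loads` are invented stand-ins for the source data of reference [2], and all function names are hypothetical. To keep the run reproducible, the initial centers are chosen with a farthest-first rule instead of the purely random selection of step 1); steps 2) and 3) and the squared-error criterion of Formula (3) follow Section 3.2.

```python
import numpy as np


def make_synthetic_loads(n_per_class=100, n_points=144, noise=0.03, seed=0):
    """Hypothetical data: 4 daily load shapes (144 ten-minute samples) plus noise."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 24.0, n_points, endpoint=False)  # time of day in hours
    shapes = [
        0.5 + 0.4 * np.exp(-((t - 9.0) ** 2) / 8.0),    # morning peak
        0.5 + 0.4 * np.exp(-((t - 19.0) ** 2) / 8.0),   # evening peak
        0.9 - 0.3 * np.exp(-((t - 13.0) ** 2) / 18.0),  # midday dip
        np.full_like(t, 0.3),                           # flat base load
    ]
    data = np.vstack(
        [s + noise * rng.standard_normal((n_per_class, n_points)) for s in shapes]
    )
    true_class = np.repeat(np.arange(len(shapes)), n_per_class)
    return data, true_class


def farthest_first_centers(x, k, seed=0):
    """Assumption: spread-out initial centers, swapped in for random selection."""
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(x)))]
    min_dist2 = ((x - x[chosen[0]]) ** 2).sum(axis=1)
    while len(chosen) < k:
        nxt = int(min_dist2.argmax())  # sample farthest from all chosen centers
        chosen.append(nxt)
        min_dist2 = np.minimum(min_dist2, ((x - x[nxt]) ** 2).sum(axis=1))
    return x[chosen]


def squared_distances(x, centers):
    """Squared Euclidean distance from every sample to every center."""
    return ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)


def kmeans(x, k, max_iter=100, seed=0):
    centers = farthest_first_centers(x, k, seed)
    for _ in range(max_iter):
        assign = squared_distances(x, centers).argmin(axis=1)  # Formula (1)
        # Formula (2): mean of each cluster; an empty cluster keeps its old center
        new_centers = np.array([
            x[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    dist2 = squared_distances(x, centers)
    assign = dist2.argmin(axis=1)
    sse = dist2[np.arange(len(x)), assign].sum()  # Formula (3)
    return centers, assign, sse


if __name__ == "__main__":
    loads, true_class = make_synthetic_loads()
    centers, assign, sse = kmeans(loads, k=4)
    for c in range(4):
        print("load class", c, "-> cluster(s)", np.unique(assign[true_class == c]))
    print("sum of squared errors:", sse)
```

Comparing `assign` with `true_class` plays the role of the comparison between Figure 1 and Figure 3, and the columns of `squared_distances(loads, centers)` correspond to the per-center distances plotted in the subgraphs of Figure 4.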

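The PCA steps of Section 4 that lead to the scatter plot of Figure 5 can be sketched in the same way. Again this is a hedged illustration, not the paper's MATLAB code: the function name `pca_scores`, its parameters, and the small random matrix in the demo are assumptions introduced here. The sketch standardizes the data, forms the correlation coefficient matrix R, takes its characteristic values and vectors, computes the contribution rates, and returns the principal component scores, either for a fixed number of components (the paper uses 3) or for the smallest number whose cumulative contribution rate reaches 85%.

```python
import numpy as np


def pca_scores(x, n_components=None, min_cumulative=0.85):
    """Principal component scores computed from the correlation matrix of x (n x p)."""
    n, p = x.shape
    # Standardization: subtract the column mean, divide by the sample std (n - 1)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # Correlation coefficient matrix R of the standardized data
    r = (z.T @ z) / (n - 1)
    # Characteristic values and vectors of the symmetric matrix R, largest first
    eigvals, eigvecs = np.linalg.eigh(r)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Contribution rate of each principal component
    contribution = eigvals / eigvals.sum()
    if n_components is None:
        # Smallest k whose cumulative contribution rate reaches the threshold
        k = int(np.searchsorted(np.cumsum(contribution), min_cumulative)) + 1
        n_components = min(k, p)
    # Scores: F[i, j] = sum over m of a[j, m] * z[i, m]
    scores = z @ eigvecs[:, :n_components]
    return scores, contribution


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical correlated data: 400 samples, 10 variables driven by 3 factors
    factors = rng.standard_normal((400, 3))
    demo = factors @ rng.standard_normal((3, 10)) + 0.1 * rng.standard_normal((400, 10))
    scores, contribution = pca_scores(demo, n_components=3)
    print("scores shape:", scores.shape)
    print("cumulative contribution of 3 components:", contribution[:3].sum())
```

Plotting the three score columns against each other, colored by cluster label, would give a scatter of the kind described for Figure 5. Using `np.linalg.eigh` rather than a general eigenvalue routine is a design choice that relies on R being symmetric.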
Figure 5. The scatter of the clustering results.

6. Conclusion
In this paper, the K-means clustering method from data mining was applied to the clustering analysis of customer power load in the power system, and principal component analysis was used to visualize the clustering results, which demonstrates the rationality and correctness of the clustering. This provides an important basis for power system decision-making and helps ensure the stable operation of the power system.

References
[1] Feng, X.P. and Zhang, T.F. (200) Comparison of Four Clustering Methods. Microcomputer and Application, 6.
[2] Zhang, M.M., Chen, J.Q., Wang, K., Peng, B. and Wu, H. (2014) The Multi Time Scale Coordinated Orderly Power Use Centralized Decision Method. Automation of Electric Power Systems, 38, 70-77.
[3] Xu, J., Huang, Y.L. and Li, F. (2004) Research on Comparing the Sequential Learning with Batch Learning for K-Means. Computer Science, 3, 56-58.
[4] Keogh, E. and Pazzani, M. (1998) An Enhanced Representation of Time Series Which Allows Fast and Accurate Classification, Clustering and Relevance Feedback. Proceedings of the 3rd International Conference of Knowledge Discovery and Data Mining, The Association for the Advancement of Artificial Intelligence, New York, 239-241.
[5] Zhang, S.X., Lu, J.M., Zhao, B.Z. and Cao, J.P. (2013) Analysis of Cloud Computing of Residential Consumption Behavior Model Based on. Power System Technology, 37, 542-546.
[6] Zhuo, J.W., Li, B.W., Wei, Y.S. and Qin, J. (2014) Application of MATLAB in Mathematical Modeling. The Beihang University Press, Beijing, 39-4.