A Data-Mining Approach for Wind Turbine Power Generation Performance Monitoring Based on Power Curve


pp.456-46
http://dx.doi.org/1.1457/astl.16.

Jianlou Lou 1,1, Heng Lu 1, Jia Xu and Zhaoyang Qu 1,

1 Department of Information and Engineering, Northeast DianLi University, Jilin 131, China
Long Yuan(Beijing) Wind Power Engineering Technology CO.,LTD. Beijing 134, China
loujianlou@qq.com

Abstract. This paper proposes a data-mining approach for monitoring the power generation performance of wind turbines based on power curve profiles. The method identifies weakened power generation performance by assessing wind-speed and power datasets. The shapes of wind power curve profiles over consecutive time intervals are constructed by fitting power curve models to these datasets. An optimal constraint for each sub-dataset is developed to govern the data-driven wind-power generation method, based on distance-based outlier detection and a variance analysis model. The Auto-adapt Optimal Interclass Variance algorithm realizes self-optimization of the threshold parameter and achieves a high degree of robustness to variations in wind-power generation performance monitoring. Blind industrial studies are conducted to validate the effectiveness of the approach; they show a decrease in error rates when detecting weakened power generation performance that causes financial loss.

Keywords: Wind turbine; Power curve; Data-mining; Performance monitoring

1 Introduction

The power curve reflects the operating performance of a wind turbine [1]. It is often influenced by air density, system control and ambient temperature, so the raw data collected by the SCADA system usually contain many anomalies.
However, traditional data-mining methods, which detect weakened performance by training models on a fixed proportion of the datasets, generally carry an unavoidable prediction error, so poor power generation performance of turbines is not identified accurately from real-time data [3-5].

1 Project supported by the National Natural Science Foundation of China (No. 651773) and Jilin Province Science and Technology Development Project (15484GX). Corresponding author: Jianlou Lou. Email: loujianlou@qq.com.

ISSN: 87-133 ASTL Copyright 16 SERSC

In this paper, a data-mining approach is proposed to assess wind power generation performance by analyzing the variation of wind power curves rather than individual data points. The data are partitioned into sub-datasets based on consecutive equal time intervals. Outlier detection and a variance analysis model are then used to realize self-optimization of the threshold in the Auto-adapt Optimal Interclass Variance (AOIV) algorithm. The effectiveness of this approach is demonstrated through blind industrial studies.

2 Data-Mining Based on Power Curve

2.1 Data Preprocessing

Before mining the power curve profiles, the data are preprocessed according to the methodology used by wind power generation enterprises in China. It encompasses the following three steps: 1) validity check, 2) data range check, and 3) missing data processing. The anomalies found in these steps should be removed and, if necessary, analyzed separately. In practice, the following simplified rules are often used to save time and improve efficiency.

a) wind speed < cut-in speed, where the cut-in speed is the wind speed at which the turbine starts operating; pitch angle about 9 o.
b) wind speed between cut-in speed and cut-out speed, with zero or negative power output;
c) wind speed > cut-out speed, where the cut-out speed is the wind speed at which the turbine stops generating available power; pitch angle about 9 o.

2.2 OIV Algorithm

The optimal interclass variance (OIV) algorithm based on the power curve is a simple and efficient data-mining method that analyzes the power curve and detects anomalies by means of an initial variance threshold. The algorithm is as follows. Consider a sample dataset of the power curve, $U = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, with the samples ordered by their power values $y_i$, where $x$ represents wind speed, $y$ represents power, and $n$ is the total number of sample points. A profile depicts the characteristic of the power curve contained in $U$ if and only if it satisfies (1).
$$\arg\max \Big\{ \Big( \lambda \cdot \sum_{j=1}^{n-1} (y_j - \bar{y})^2 \Big) \le S \Big\} \qquad (1)$$

where $y_j$ is the $j$th power value; $\bar{y}$ is the average of the first power values; $\lambda$ is a constant; $S$ is the initial threshold.
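The simplified preprocessing rules a) to c) of Section 2.1 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not the enterprise procedure itself: the function name `flag_halt_records` and the default `cut_in` and `cut_out` values are hypothetical placeholders, and the pitch-angle condition mentioned in rules a) and c) is left out because only wind speed and power are assumed to be available.

```python
import numpy as np


def flag_halt_records(wind_speed, power, cut_in=3.0, cut_out=25.0):
    """Return a boolean mask, True where a record matches rule a), b) or c).

    cut_in and cut_out (m/s) are hypothetical defaults; real values depend
    on the turbine model. The pitch-angle condition is deliberately omitted.
    """
    ws = np.asarray(wind_speed, dtype=float)
    pw = np.asarray(power, dtype=float)

    below_cut_in = ws < cut_in                                    # rule a)
    idle_in_range = (ws >= cut_in) & (ws <= cut_out) & (pw <= 0)  # rule b)
    above_cut_out = ws > cut_out                                  # rule c)
    return below_cut_in | idle_in_range | above_cut_out


# Five illustrative records (wind speed in m/s, power in kW).
speeds = np.array([2.0, 5.0, 8.0, 12.0, 26.0])
powers = np.array([0.0, 300.0, -5.0, 1500.0, 0.0])
mask = flag_halt_records(speeds, powers)
print(mask.tolist())  # [True, False, True, False, True]
```

Records flagged by the mask would be set aside before the power curve profiles are mined; the remaining records form the wind-speed and power dataset used in the following steps.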

Each sub-dataset can be classified by S, and the results for similar datasets are finally summarized into normal and abnormal.

2.3 AOIV Algorithm

The auto-adapt optimal interclass variance (AOIV) algorithm is proposed in this research to enhance accuracy by combining outlier detection with optimization of the variance threshold.

A. The Detection of Outliers

Distance-based outlier detection is effective for power curves with different curvatures. A detected outlier is removed and assigned to the abnormal dataset. Let $u_i = (x_i, y_i)$ represent a data point in the sample set $U$. The distance between two points is defined as

$$d_k(u_i, u_{i+1}) = \left( |x_i - x_{i+1}|^k + |y_i - y_{i+1}|^k \right)^{1/k} \qquad (2)$$

Definition 3.1 For any point $u_i = (x_i, y_i)$ in $U$, given a relatively small positive number $\varepsilon$, if a point $u_{i+1}$ in $U$ satisfies $d_k(u_i, u_{i+1}) \le \varepsilon$, then $u_{i+1}$ is an $\varepsilon$-proximal point of $u_i$, and the set of all $\varepsilon$-proximal points is the $\varepsilon$-neighborhood of $u_i$.

Definition 3.2 For any point $u_i = (x_i, y_i)$ in $U$, given a relatively small positive number $\varepsilon$, select an empirical critical value $N$. Let $N_i$ be the number of points in the $\varepsilon$-neighborhood of $u_i$. If $N_i < N$, then $u_i$ is called an isolated point of $U$.

B. The Optimization of Variance Threshold

The improved variance analysis model optimizes the threshold in each sub-dataset. Let the set of sliding variance samples be $Z = \{z_1, z_2, \ldots, z_k\}$, where $k$ is the number of sample points. The range of $S$ is defined as $[s_1, s_2]$, and satisfy with S N. The default value is s1 1, s 5. The sample set $Z$ is divided into two groups by trying the values of $S$ in turn. The following calculation is performed only if both groups exist; otherwise the next value of $S$ is selected. Assume that, for a specific $S$, the two groups are $Z_1 = \{z_1, z_2, \ldots, z_m\}$ and $Z_2 = \{z_{m+1}, z_{m+2}, \ldots, z_k\}$. The internal error and the external error between $Z_1$ and $Z_2$ are calculated by (3) and (4).
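$$e_{\mathrm{in}} = \sum_{j=1}^{m} (z_j - \bar{Z}_1)^2 + \sum_{j=m+1}^{k} (z_j - \bar{Z}_2)^2 \qquad (3)$$

$$e_{\mathrm{ex}} = (\bar{Z}_1 - \bar{Z})^2 \cdot m + (\bar{Z}_2 - \bar{Z})^2 \cdot (k - m) \qquad (4)$$

where $e_{\mathrm{ex}}$ represents the external error; $e_{\mathrm{in}}$ represents the internal error; $z$ represents the sliding variance; $m$ is the number of samples in $Z_1$; $\bar{Z}_1$ is the mean value of $Z_1$; $\bar{Z}_2$ is the mean value of $Z_2$.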
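The two building blocks of the AOIV algorithm, isolated-point detection (Definitions 3.1 and 3.2) and the threshold search built on the internal and external errors, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, proximal points are searched among all points of the set, the two groups are formed by comparing each sliding variance with the candidate threshold, and the selected threshold is the candidate that maximizes the ratio of external to internal error. Wind speed and power are assumed to be scaled to comparable ranges before distances are computed.

```python
import numpy as np


def isolated_points(points, eps, n_min, k=2):
    """Mask of points whose eps-neighbourhood holds fewer than n_min points.

    Distances follow the order-k form of equation (2). Neighbours are
    searched among all points, which is an assumption of this sketch.
    """
    pts = np.asarray(points, dtype=float)
    diff = np.abs(pts[:, None, :] - pts[None, :, :])  # pairwise |dx|, |dy|
    dist = (diff ** k).sum(axis=2) ** (1.0 / k)
    neighbours = (dist <= eps).sum(axis=1) - 1        # exclude the point itself
    return neighbours < n_min


def optimal_threshold(z, candidates):
    """Return the candidate S with the largest external/internal error ratio.

    Returns None when no candidate splits z into two non-empty groups.
    """
    z = np.asarray(z, dtype=float)
    best_s, best_ratio = None, -np.inf
    for s in candidates:
        low, high = z[z <= s], z[z > s]
        if low.size == 0 or high.size == 0:
            continue  # both groups must exist, otherwise try the next S
        internal = ((low - low.mean()) ** 2).sum() + ((high - high.mean()) ** 2).sum()
        external = (low.size * (low.mean() - z.mean()) ** 2
                    + high.size * (high.mean() - z.mean()) ** 2)
        ratio = np.inf if internal == 0 else external / internal
        if ratio > best_ratio:  # strict comparison keeps the first best candidate
            best_s, best_ratio = s, ratio
    return best_s


# Illustrative data only: three clustered points plus one far away,
# and sliding variances forming a low group and a high group.
outlier_mask = isolated_points([(0, 0), (0.1, 0), (0, 0.1), (5, 5)], eps=0.5, n_min=1)
threshold = optimal_threshold([1, 1.2, 0.9, 1.1, 8, 9, 8.5], range(1, 6))
```

In this toy example the far-away point is the only one flagged as isolated, and the search returns the smallest candidate that separates the low variances from the high ones. In practice the candidate range would be chosen per wind farm.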

Here $\bar{Z}$ is the mean value of $Z$. $S$ is the optimal variance threshold for the sub-dataset only when it satisfies (5):

$$S = \arg\max_{s_1 \le S \le s_2} \left( e_{\mathrm{ex},S} / e_{\mathrm{in},S} \right) \qquad (5)$$

where $e_{\mathrm{ex},S}$ and $e_{\mathrm{in},S}$ are the external and internal errors obtained with threshold $S$. To avoid error detection in the operation mode of the wind turbine, a supplementary margin usually needs to be added to $S$. The block diagram of the AOIV algorithm is as follows:

[Block diagram: Data Collection, Partition into Sub Dataset 1 to N (Outlier Detection), Transform into Sliding Dataset 1 to N (Variance Model), Filter into Normal and Abnormal]

Fig. 1. Block diagram of AOIV algorithm.

3 Industrial Studies

3.1 Data Preparation

To verify the effectiveness of the approach and its application to power generation performance monitoring, this paper investigates the 1-min SCADA data of 33 wind turbines collected from May 31, 13 to May 8, 13 in a 1MW Class wind farm in China. The collected data contain values of wind turbine performance parameters, wind conditions, and the fault logs.

3.2 Contrastive Analysis between Algorithms

To compare the data-mining effect of the two algorithms in chapter 2, this section presents industrial studies based on turbines' 1-min SCADA data collected randomly from a wind farm in China. The data-mining results of the two algorithms are shown in Figure 2, and the performance comparison is shown in Table 1.

Advanced Science and Technology Letters

[Figure: two power curve panels, vertical axis Power (kW); legend: normal, abnormal]

Fig. 2. Power curve with normal data (cross) and abnormal data (circle). Left: OIV algorithm; right: AOIV algorithm.

Figure 2 shows the results of the two algorithms. The AOIV algorithm clearly has the better data-mining ability.

Table 1. Contrastive Analysis Table

Parameter Artificial OIV AOIV
ND 1341 115 137
LD 46 66 397
HD 938 938 938
EDoN 3 3
EDoL 68 1
EDoH

Table 1 compares the data-mining results of the artificial statistics, the OIV algorithm and the AOIV algorithm. ND, LD and HD represent the detected data sizes of normal, limited-power and halt data. EDoN, EDoL and EDoH represent the error-detected data sizes of normal, limited-power and halt data. With the OIV algorithm, both normal data and limited-power data show error detection to different degrees. Halt data are detected with the same rules as in chapter 2.1, so the amount of error-detected halt data is zero. After the improvement, the data-mining results have a lower error rate and higher accuracy.

3.3 Performance Monitoring of Turbines

A. The Data-Mining with AOIV Algorithm

Given the number of test turbines, only the data-mining results of some randomly selected turbines from 158151 to 1581533 are presented here. The results are shown in Figure 3.

[Figure: two power curve panels; legend: normal, abnormal]

Fig. 3. Power curve with normal data (cross) and abnormal data (circle). Left: turbine 158155; right: turbine 158157.

Figure 3 shows that the approach effectively detects various power curve profiles. The results for turbines 158155 and 158157 are good, although some error detection occurs in the rated wind speed zone: the amount of data there is so small that points are mistaken for isolated points. These errors can be neglected. Accordingly, the approach shows good versatility and accuracy.

B. The Analysis of Detected Faults

By analyzing the anomalies identified through data mining, the root cause of a turbine's poor power generation performance can be found. Turbine 1581515 is taken as an example to describe the detailed analysis procedure. The irregular profiles are shown in Figure 4.

[Figure: power curve of turbine 1581515]

Fig. 4. Detected anomaly of turbine 1581515.

In Figure 4, turbine 1581515 does not generate power normally, so its power curves display abnormal curvatures. This is caused by pitch faults, which usually seriously affect the power generation performance of turbines. The turbine should be maintained as soon as possible.

4 Conclusion

In this paper, we proposed a data-mining method to identify impaired power generation performance by analyzing the curvature and shape of the wind power curve. Distance-based outlier detection was applied to eliminate the impact of isolated points on the power curve analysis, while the variance analysis model realizes a self-adaptive threshold to enhance the accuracy of data mining. Compared with other methods, this approach detects turbine anomalies more accurately and quickly, simply by analyzing the power curve and without complex sample training. It can also process real-time data for online monitoring and identify the weakened performance of a generating unit. Industrial studies were conducted to demonstrate the effectiveness and accuracy of the algorithm. We mined the power curves of 33 wind turbines and obtained good results. Future research will investigate an intelligent method to identify the fault type and the root cause of poor turbine performance. In addition, the relationship between wind power and multiple parameters will be considered in the data mining.

References

1. Schlechtingen, M., Santos, I. F., & Achiche, S.: Wind turbine condition monitoring based on SCADA data using normal behavior models. Part 1: System description. Applied Soft Computing, 13(1), 259-270 (2013)
2. Kusiak, A., Zheng, H., & Song, Z.: On-line monitoring of power curves. Renewable Energy, 34(6), 1487-1493 (2009)
3. Zheng, H., & Kusiak, A.: Prediction of wind farm power ramp rates: A data-mining approach. Journal of Solar Energy Engineering, 131(3), 031011 (2009)
4. Schlechtingen, M., Santos, I. F., & Achiche, S.: Using data-mining approaches for wind turbine power curve monitoring: a comparative study. IEEE Transactions on Sustainable Energy, 4(3), 671-679 (2013)
5. Sanchez, H., Escobet, T., Puig, V., & Odgaard, P.: Fault Diagnosis of Advanced Wind Turbine Benchmark using Interval-based ARRs and Observers. In World Congress, vol. 19, No. 1, pp. 4334-4339 (2014)