Quality Assessment of Power Dispatching Based on Improved Cloud Model

Zhaoyang Qu, Shaohua Zhou*
School of Information Engineering, Northeast Electric Power University, Jilin, China

Abstract. This paper presents a data quality assessment method for power dispatching data based on an improved cloud model. The cloud model is used to make a soft segmentation of the evaluation comments. The evaluation results are transformed into cloud models by the reverse cloud generator, and the corresponding evaluation cloud is constructed. Finally, the similarity between the comprehensive evaluation cloud and each evaluation grade cloud is calculated by a similarity algorithm based on the improved cloud model, which determines the evaluation grade to which the power dispatching data belongs. The method is designed around the data characteristics and business characteristics of the power dispatching center, so that the evaluation result is fair and objective. An experiment verifies the practicality of the method, which compensates for the shortcomings of existing quality assessment methods for power dispatching data.

Keywords: power dispatching; data quality; quality assessment; cloud model.

1. Introduction

With the development of the smart grid, the power dispatching center holds a large amount of data on power grid operation, production management and market operation. These data are distributed across different production systems, and their dispersion, heterogeneity and inconsistency make it difficult to find the information or knowledge hidden in them directly [1, 2]. The smart grid dispatch center is building an integrated platform for data storage, management and application, and in doing so it faces the problem of how to ensure data quality [3].
The quality assessment results of power dispatching data can be used for early warning, release and correction. Applying a data quality evaluation method to the electricity and active power data of the data platform supports the checking and correction of operating data in line loss analysis, power grid parameter analysis, power grid section analysis and regional power balance, as well as anomaly detection. It also helps solve data quality problems in anomaly discovery, identification, rule accumulation and alarm repair, and improves the accuracy and practicability of the data. Therefore, it is of great practical significance to study data quality assessment methods and to improve data quality in the integration of power dispatching data.

2. Quality Assessment of Power Dispatching

2.1 Characteristics of Power Dispatching

The data stored in the power dispatching data center mainly include grid operation data, production management data and market operation data. In terms of volume, grid operation data dominate, accounting for 90 percent to more than 95 percent of all data [4, 5]. Analysis of the power dispatching data in the data center shows that the data are diverse, multi-level and multi-format. These characteristics appear in five specific aspects.

(1) Diversity of data sources. The data of the power dispatching center come from different production systems, such as the EMS system, Power system, OMS system, WAMS system and Scheduling system. These data relate to power grid operation, production management, market operation and other aspects, and they differ in form of expression, timeliness and precision.
(2) Inconsistent data standards and diverse data types.
There are many kinds of data representation in the dispatching data, such as two-dimensional table data, image data in various formats, and various kinds of text data.
(3) Uncertainty of data. Errors in data collection and entry, and non-standard operations in data analysis and processing, may affect the certainty of the data.
(4) Real-time and dynamic nature of data. The power dispatching data must reflect the real-time running state of the grid, production management and market operation.
(5) Poor data quality. Because no unified quality control system was formed during the acquisition of the scheduling data, the data quality is low. In addition, differences in the professional competence of personnel, and the loss and errors of data during collection and processing, also affect the quality of the source data.

2.2 Evaluation Method of Quality

Data quality is a property of the data itself, independent of how the data are produced and used, although it is usually explained from the perspective of data producers, managers and users. Existing approaches to evaluating data quality elements, at home and abroad, include the six-step method of qualitative evaluation, the Total Data Quality Management of MIT, and the six-tuple of Guiqing Zhang [6-9]. The many methods used in data quality evaluation can be divided into three categories. The first is the subjective evaluation method, mainly based on expert scoring, the Delphi method and the analytic hierarchy process. The second is the objective evaluation method, which mainly includes the variation coefficient method, the entropy method, factor analysis and principal component analysis.
The third is the comprehensive evaluation method, such as the fuzzy comprehensive evaluation method, which is based on objective data but whose membership function is generally determined by experts from experience and is therefore subjective [10-12]. At present, there is no established method for evaluating the quality of electric power dispatching data. Ting Lei analyzed the data quality problems of the power dispatching data center and proposed identification methods for power quality data, including single identification methods based on inherent attributes, statistical rules and electrical connections, and comprehensive identification methods based on network topology, event constraints and multi-source data [13-15].

Journal of Residuals Science & Technology, Vol. 3, No. 8, 206 5 206 DEStech Publications, Inc. doi:2783/issn.544-8053/3/8/50

In previous data quality evaluations, the final evaluation result is usually determined from the membership level of the data. On the one hand, this may produce inaccurate assessment results, because the randomness and fuzziness of the evaluation are lost. On the other hand, when the evaluation result lies at the boundary between two grades, the decision is strongly subjective.

2.3 Evaluation Index of Quality

In the evaluation of data quality, the key is to define evaluation indexes that reflect the data quality. In this paper, the data quality evaluation index system includes six statistical indicators.

(1) Correctness. Whether the data are consistent with the facts, and whether errors were introduced during collection, transmission and forwarding. The quantitative evaluation index of correctness is the correct rate: number of correct data items in the table / number of records in the table × 100%.
(2) Integrity. Whether data information is missing; the missing data may be an entire data record or a field of a record. The quantitative evaluation index of integrity is the missing field rate: number of data items in the data set that meet the conditions / total number of data items in the set × 100%.
(3) Consistency. Whether all related data are logically consistent and keep a unified format. The quantitative evaluation index of consistency is the consistency rate: number of records under investigation that meet all conditions / number of data records × 100%.
(4) Uniqueness. Whether there are duplicate data records. The quantitative evaluation index of uniqueness is the data repetition rate.
The formula is: number of duplicate data records / number of data records × 100%.
(5) Accuracy. Whether the accuracy of the data meets the requirements. The quantitative evaluation index of accuracy is the accuracy rate: number of data items that meet the accuracy requirements / total amount of data × 100%.
(6) Timeliness. Whether the value attribute of the data is within the valid time span. The formula for timeliness is: total amount of data not lost in the data set / total number of records in the set × 100%.

The assessment indicators in the index system differ in relative importance and in how strongly they determine the evaluation result, so the weight of each index must be determined according to its relative importance. At present, there are two families of methods for determining weights: subjective weighting and objective weighting. Subjective weighting methods, such as the Delphi method and the expert ranking method, assign weights from the experts' understanding of the importance of each indicator, so the weights depend strongly on the experts. Objective weighting methods, such as the factor analysis method and the correlation coefficient method, obtain the weights by mathematical calculation from the information carried by the indexes. Although they avoid the influence of human and subjective factors, the resulting weights do not necessarily reflect the actual importance of the indicators, and sometimes there is a gap between the weights and reality. Because a single weighting method is vulnerable to its own limitations, this paper combines subjective and objective weighting.
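As a rough illustration of the two weighting families combined in this paper, the sketch below computes a subjective weight vector with the G1 (order relation) method and an objective one with the entropy weight method, both of which are detailed in Section 2.4. It assumes the standard textbook form of each method; the function names `g1_weights` and `entropy_weights`, the importance ratios and the small score matrix are hypothetical and are not taken from the paper's data.

```python
import math


def g1_weights(ratios):
    """G1 weights for indexes already sorted from most to least important.

    ratios[0], ratios[1], ... hold r_2, r_3, ..., r_n, where
    r_k = w_{k-1} / w_k is the importance ratio of adjacent indexes.
    """
    n = len(ratios) + 1
    # w_n = 1 / (1 + sum_{k=2..n} prod_{i=k..n} r_i)
    total = sum(math.prod(ratios[j:]) for j in range(len(ratios)))
    weights = [0.0] * n
    weights[-1] = 1.0 / (1.0 + total)
    # Back-substitute w_{k-1} = r_k * w_k from the least important index up.
    for idx in range(n - 1, 0, -1):
        weights[idx - 1] = ratios[idx - 1] * weights[idx]
    return weights


def entropy_weights(matrix):
    """Entropy weights; matrix[i][j] is the normalized score of object i on index j.

    Assumes at least two objects, non-negative scores, non-zero column sums
    and at least one column whose scores are not all equal.
    """
    m, n = len(matrix), len(matrix[0])
    entropies = []
    for j in range(n):
        column = [matrix[i][j] for i in range(m)]
        col_sum = sum(column)
        e = 0.0
        for value in column:
            p = value / col_sum
            if p > 0:  # the term 0 * ln(0) is taken as 0
                e -= p * math.log(p)
        entropies.append(e / math.log(m))
    spread = [1.0 - e for e in entropies]
    return [s / sum(spread) for s in spread]


# Hypothetical expert ratios r_2..r_6 for six ordered indexes.
subjective = g1_weights([1.2, 1.0, 1.4, 1.2, 1.6])
# Hypothetical normalized scores of three objects on two indexes.
objective = entropy_weights([[0.9, 0.5], [0.6, 0.5], [0.3, 0.4]])
print(subjective, objective)
```

Both functions return weights that sum to one. How the two vectors are merged into a comprehensive weight is left out of the sketch, since it depends on the combination coefficients defined in Section 2.4.3.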
Because the weight of each index is obtained according to its actual meaning in data quality, the index weights are determined more fairly and the fairness of the evaluation result is guaranteed.

2.4 Calculation of Evaluation Index Weight

2.4.1 Calculation of Subjective Weight

The G1 method is used to determine the subjective weight of each index, to ensure the accuracy of the evaluation results. The G1 method improves on the AHP and avoids its shortcomings: it requires no judgment matrix and no consistency test, needs less calculation than the AHP, is simple, intuitive and easy to use, places no limit on the number of elements at the same level, and is order-preserving. The algorithm has two steps.

Step 1. Determination of the order relation. If evaluation index $x_i$ is more important than $x_j$ with respect to the evaluation result, this is marked as $x_i \succ x_j$. Given an evaluation index set $\{x_1, x_2, \ldots, x_n\}$, first the most important index is selected from the set and marked as $x_1^*$. Second, the most important index is selected from the remaining indexes and marked as $x_2^*$. Then, the most important index is selected from the remaining $n-(k-1)$ indexes and marked as $x_k^*$. Last, the remaining index is marked as $x_n^*$. In this way a unique order relation is determined, which can be expressed as $x_1^* \succ x_2^* \succ \cdots \succ x_n^*$.

Step 2. Determination of the relative importance between adjacent indexes. The ratio of importance between adjacent indexes $x_{k-1}^*$ and $x_k^*$ is expressed as $r_k = w_{k-1} / w_k$, where $w_k$ is the weight of the $k$-th index. The relative importance of each index is calculated from the order relation. Each expert determines $r_k$ independently, and the average is taken. The values of $r_k$ are shown in Table 1.

Table 1. Scale and meaning of $r_k$
$r_k$ | Meaning
1.0 | Index $x_{k-1}^*$ has the same importance as index $x_k^*$
1.1 | Between 1.0 and 1.2
1.2 | Index $x_{k-1}^*$ is slightly more important than index $x_k^*$
1.3 | Between 1.2 and 1.4
1.4 | Index $x_{k-1}^*$ is more important than index $x_k^*$
1.5 | Between 1.4 and 1.6
1.6 | Index $x_{k-1}^*$ is strongly more important than index $x_k^*$
1.7 | Between 1.6 and 1.8
1.8 | Index $x_{k-1}^*$ is extremely more important than index $x_k^*$

The weights are calculated by formulas (1) and (2).

$$w_n = \left(1 + \sum_{k=2}^{n} \prod_{i=k}^{n} r_i\right)^{-1} \tag{1}$$

$$w_{k-1} = r_k w_k, \quad k = n, n-1, \ldots, 2 \tag{2}$$

2.4.2 Calculation of Objective Weight

The objective weight of each index is determined by the entropy weight method in this paper. The calculation has three steps.

Step 1. If there are $m$ evaluation objects and each evaluation object has $n$ indexes, construct the normalized judgment matrix $R = (r_{ij})_{m \times n}$.

Step 2. The entropy of each evaluation index is defined by formulas (3) and (4).

$$H_j = -\frac{1}{\ln m} \sum_{i=1}^{m} f_{ij} \ln f_{ij} \tag{3}$$

$$f_{ij} = \frac{r_{ij}}{\sum_{i=1}^{m} r_{ij}} \tag{4}$$

Step 3. The entropy weight of the $j$-th evaluation index is calculated by formula (5).

$$\omega_j = \frac{1 - H_j}{n - \sum_{j=1}^{n} H_j} \tag{5}$$

2.4.3 Calculation of Comprehensive Weight

The subjective and objective weights are combined by the additive method, and the final comprehensive weight $W$ of each index is calculated by formulas (6), (7) and (8). In these formulas, the components of the subjective weight are taken in ascending order and $n$ is the number of evaluation indexes.

3. Quality Evaluation Model of Power Dispatching Based on Improved Cloud Model

3.1 Determination of the Evaluation Level Cloud

If the comments are divided into $p$ levels, the evaluation set can be represented as $\{V_1, V_2, \ldots, V_p\}$. If the range of an evaluation level is $(C_{\min}, C_{\max})$, the corresponding cloud model $(Ex, En, He)$ is calculated by formulas (9), (10) and (11).

$$Ex = \frac{C_{\min} + C_{\max}}{2} \tag{9}$$

$$En = \frac{C_{\max} - C_{\min}}{6} \tag{10}$$

$$He = k \tag{11}$$

The $k$ in formula (11) is a constant, determined by the range of the evaluation level. The larger the range of the evaluation level is, the larger $En$ is, and the greater the uncertainty and randomness of the corresponding evaluation level are.

3.2 Determination of the Comprehensive Evaluation Cloud

The evaluation results of each index are transformed into a cloud model by the backward cloud generator. Let the evaluation cloud of evaluation index $i$ be $(Ex_i, En_i, He_i)$, and let the comprehensive weight of the index be $W_i$.
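The two cloud constructions described in Sections 3.1 and 3.2 can be sketched as follows. The grade cloud assumes the common range-based rule, with the expectation at the midpoint of the range and the entropy equal to one sixth of its width. The backward cloud generator is the standard variant without certainty degrees, which may differ in detail from the generator used by the authors. The function names, the hyper-entropy value 0.1 and the sample scores are hypothetical.

```python
import math
import statistics


def grade_cloud(low, high, he):
    """Cloud (Ex, En, He) of an evaluation grade covering the range [low, high]."""
    ex = (low + high) / 2.0   # expectation: midpoint of the range
    en = (high - low) / 6.0   # entropy: one sixth of the range width
    return ex, en, he         # hyper-entropy: a constant chosen by the analyst


def backward_cloud(samples):
    """Estimate (Ex, En, He) from at least two sample scores."""
    samples = list(samples)
    ex = statistics.fmean(samples)
    # First-order absolute central moment, scaled by sqrt(pi / 2).
    mean_abs_dev = statistics.fmean([abs(x - ex) for x in samples])
    en = math.sqrt(math.pi / 2.0) * mean_abs_dev
    # Sample variance; He^2 = S^2 - En^2, clipped at zero for small samples.
    variance = statistics.variance(samples, ex)
    he = math.sqrt(max(variance - en * en, 0.0))
    return ex, en, he


# Grade "great" on a percentile scale, with a hypothetical hyper-entropy of 0.1.
print(grade_cloud(90, 100, 0.1))
# Hypothetical repeated scores of one quality index.
print(backward_cloud([98.1, 98.4, 98.3, 98.6, 98.2]))
```

Clipping the variance difference at zero is a pragmatic choice for small samples, where the estimated entropy can exceed the sample standard deviation.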
The comprehensive evaluation cloud $(Ex, En, He)$ generated from the six evaluation clouds is calculated by formulas (12), (13) and (14).

3.3 Calculation of Similarity

The traditional qualitative evaluation method is based on the expectation $Ex$ of the comprehensive evaluation cloud. If $Ex$ is within the range of level $k$, the evaluation result is considered to belong to level $k$. However, when $Ex$ is at the boundary of two grades, the decision is strongly subjective. Therefore, this paper uses a similarity algorithm based on the cloud model, which calculates the average membership of the comprehensive evaluation cloud and thereby determines the similarity between the comprehensive evaluation cloud and each evaluation grade cloud, so as to guarantee the fairness of the evaluation results. The similarity algorithm based on the improved cloud model can be described as follows.

Input: the digital characteristics $(Ex, En, He)$ of the comprehensive evaluation cloud, the digital characteristics $(Ex_k, En_k, He_k)$ of each evaluation grade cloud, and the number $N$ of generated cloud drops.
Output: the similarity $\delta_k$ between the comprehensive evaluation cloud and every level cloud.
Step 1. Generate a normal random number $En'$ with $En$ as expectation and $He$ as standard deviation.
Step 2. Generate a normal random number $x_i$ with $Ex$ as expectation and $En'$ as standard deviation.
Step 3. Generate a normal random number $En_k'$ with $En_k$ as expectation and $He_k$ as standard deviation.
Step 4. Calculate the membership degree of the cloud drop $x_i$ in the evaluation grade cloud.
Step 5. Repeat steps 1-4 until $N$ cloud drops have been generated.
Step 6. Calculate the similarity by formula (15).

$$\delta_k = \frac{1}{N} \sum_{i=1}^{N} \exp\left[-\frac{(x_i - Ex_k)^2}{2 (En_k')^2}\right] \tag{15}$$

4. Experiment Result and Discussion

In this paper, the SCADA data of the power dispatching system in the control centre of Jilin Electric Power Limited Corporation are selected as the research object. There are twenty statistical indicators: power generation, power supply, capacity of power generation equipment, maximum generation load of power grid, line loss of electric energy, cumulative maximum load utilization hours, equipment ID, average utilization hours of equipment, switch state, measurement state, active power, reactive power, voltage, daily operation rate, monthly operating rate, line loss rate, average power load rate, daily average power consumption, net power consumption and electricity sales. Each indicator is denoted by a symbol, as shown in Table 2.
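The similarity algorithm of Section 3.3 can be sketched as a small Monte Carlo routine. This is an illustration under stated assumptions rather than the authors' implementation: membership is taken to be the usual normal-cloud certainty degree, averaged over the generated cloud drops, and the grade clouds use the midpoint and one-sixth rule with the five percentile ranges of the experiment. The function name `cloud_similarity`, the hyper-entropy 0.1 of the grade clouds, and the entropy and hyper-entropy of the comprehensive cloud are hypothetical; only the expectation 93.9203 is the value reported in the experiment.

```python
import math
import random


def cloud_similarity(eval_cloud, grade_cloud, n_drops=10000, rng=None):
    """Average membership of drops from eval_cloud in grade_cloud."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    ex, en, he = eval_cloud
    ex_k, en_k, he_k = grade_cloud
    total = 0.0
    for _ in range(n_drops):
        en_prime = rng.gauss(en, he)          # step 1: En' ~ N(En, He^2)
        x = rng.gauss(ex, abs(en_prime))      # step 2: x ~ N(Ex, En'^2)
        en_k_prime = rng.gauss(en_k, he_k)    # step 3: En_k' ~ N(En_k, He_k^2)
        if en_k_prime == 0.0:
            # Degenerate grade cloud: only an exact hit counts as a member.
            total += 1.0 if x == ex_k else 0.0
            continue
        # Step 4: certainty degree of x in the grade cloud.
        total += math.exp(-((x - ex_k) ** 2) / (2.0 * en_k_prime ** 2))
    return total / n_drops                    # average over all drops


# Grade clouds from the percentile ranges; He = 0.1 is a hypothetical choice.
ranges = {"great": (90, 100), "good": (75, 90), "general": (60, 75),
          "poor": (30, 60), "bad": (0, 30)}
grades = {name: ((lo + hi) / 2.0, (hi - lo) / 6.0, 0.1)
          for name, (lo, hi) in ranges.items()}

# Reported expectation; entropy and hyper-entropy are hypothetical.
comprehensive = (93.9203, 0.03, 0.005)
scores = {name: cloud_similarity(comprehensive, cloud)
          for name, cloud in grades.items()}
print(max(scores, key=scores.get), scores)
```

Because the score sits more than four entropies away from every grade except "great", the other similarities are vanishingly small, so the ranking does not depend on the hypothetical hyper-entropy values.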
Table 2. Part of the statistical data of State Grid Company from 2014 to 2016
Time 2040 767 5829 5322 38 7 6 2040 476 40860 4075 2 43 5 0 6 2060 44 777 96 8 83758.67

According to expert opinion, the comments are divided into five levels {great, good, general, poor, bad}. Based on the percentile system, the numerical range of each evaluation grade is shown in Table 3.

Table 3. The numerical range of each evaluation grade
Evaluation grade | Numerical range
Great | [90, 100]
Good | [75, 90]
General | [60, 75]
Poor | [30, 60]
Bad | [0, 30]

According to Section 2.3, the evaluation results of each index are shown in Table 4.

Table 4. The evaluation results of each index
correctness integrity timeliness 3 7 93. 9. 00 00 98.3 00 00 00 00

According to Section 2.4, the weights of each evaluation index are shown in Table 5.

Table 5. Weight value of each evaluation index
Evaluation index | Subjective Weight | Objective Weight | Comprehensive Weight
correctness | 2754 | 2835 | 2789
integrity | 202 | 647 | 852
consistency, uniqueness, accuracy, timeliness: 574 325 448 59 0906 0836 306 766 466 5 0875 507

According to Section 3.1, the digital characteristics of each evaluation grade cloud are shown in Table 6. The result was produced after repeated experiments.

Table 6. The digital characteristics of each evaluation grade cloud
Evaluation grade | Ex | En | He
Great | 95 | 1.67 | 0
Good | 82.5 | 2.5 | 0
General | 67.5 | 2.5 | 0
Poor | 45 | 5 | 0
Bad | 15 | 5 | 0

Using the reverse cloud generator, we obtain the digital characteristics of each evaluation cloud, as shown in Table 7.

Table 7. The digital characteristics of each evaluation cloud
correctness 95 0 0 integrity 90 0 0 98. 3.. consistency 5 96 7908 0 uniqueness 0 0 0 accuracy 90 0 0 0 timeliness 0 0 0

The comprehensive evaluation cloud $(Ex, En, He)$ is generated from the evaluation clouds. The image of the corresponding cloud is shown in Figure 1.

Figure 1. The image of the corresponding evaluation cloud

It can be seen from the figure that when the score is 93.9203 the membership degree is 1, which shows that 93.9203 is the most representative value of the quality of the data. The cloud drops of the comprehensive evaluation cloud mostly lie between 93.84 and 94, a range that belongs to the evaluation level great. At the same time, the similarity between the comprehensive evaluation cloud and each evaluation level cloud determined by the similarity algorithm is great (82) > good (2.9497e-005) > poor (.669e-02) > general (6.253e-025) > bad (8.9968e-055). In short, the comprehensive evaluation cloud is most similar to the evaluation level cloud of great, so the quality level of the data is great. This result is
consistent with the actual situation.

5. Conclusion

In this paper, a data quality assessment method based on an improved cloud model is proposed to solve the quality problem of power dispatching data. The comments are divided into five grades: great, good, general, poor and bad. According to the basic characteristics of electric power dispatching data, the paper evaluates quality from six aspects: correctness, integrity, consistency, uniqueness, accuracy and timeliness. On this basis, the comprehensive evaluation cloud is constructed. Finally, the evaluation level to which the electric power dispatching data belongs is determined from the comprehensive evaluation cloud. The case analysis shows that the method is valid and feasible, avoids the subjectivity and arbitrariness of traditional methods, and ensures to the greatest extent that the evaluation results are fair and objective.

References

[1] Ting Lei, Taigui Huang, Lin Yuan, et al. Research and Practice on the Application of Power Dispatching Integration. Electric Power Information, 200, 8(8), pp. 8-20+22.
[2] Jianjun Cao, Jing Diao, Ting Wang, et al. Some Basic Problems in Quality Control Research. Microcomputer Information, 200, 9(9), pp. 2-4.
[3] Wufeng Huang, Hua Zheng. Research on Quality Assessment for Enterprise Informatization. Computer Technology and Development, 20, 2(), pp. 85-88+92.
[4] Liang Zhang. Research on Quality of Power Dispatching Center. East China Electric Power, 2009, 2(3), pp. 403-406.
[5] Keyan Liu, Jian Zhang, Shun Tao, et al. Voltage Quality Detection and Evaluation Method for Distribution Network SCADA System Based on Multi-source and Multi temporal Spatial Information. Power System Technology, 205, 39(), pp. 369-375.
[6] Jingyu Han, Lizhen Xu, Yisheng Dong. Review of Quality Research. Computer Science, 2008, 35(2), pp. -5.
[7] Qingyun Yang, Peiying Zhao, Dongqing Yang, et al. Research on Quality Assessment Methodology.
Computer Engineering and Applications, 2004, 40(9), pp. 3-4.
[8] Zhimao Guo, Aoying Zhou. Research on Quality and Cleaning: a Survey. Journal of Software, 2002, 3(), pp. 2076-2082.
[9] Batini C, Cappiello C, Francalanci C, et al. Methodologies for Quality Assessment and Improvement. ACM Computing Surveys (CSUR), 2009, 4(3), pp. 6-2.
[10] Shibin Zhang, Chunxiang Xu. Study on the Trust Evaluation Approach Based on Cloud Model. Chinese Journal of Computers, 203, 36(2), pp. 422-43.
[11] Yin K, Wang S, Liu Z, et al. Research and Development on Quality Assessment Management System. IEEE, 204, pp. 2594-2598.
[12] H Zhang, Y Hou, J Zhang, et al. A New Method for Nondestructive Quality Evaluation of the Resistance Spot Welding Based on the Radar Chart Method and the Decision Tree Classifier. The International Journal of Advanced Manufacturing Technology, 205, 78(5), pp. 84-85.
[13] Ting Lei, Chuanbai Zhu, Taigui Huang, et al. Quality Identification Method Based on Platform. Automation of Electric Power Systems, 202, 36(2), pp. 7-75.
[14] Hongwen Yan, Peng Chen. Research on the Method of Evaluating the Quality of Power Network Statistical Based on Cloud Model. Computer Applications and Software, 204, 3(2), pp. 00-03.
[15] Liu J, Yan S, Cai N. Analysis and Evaluation Model on the Power Quality of Wind Farm. ICIC Express Letters, 205, 6(), pp. 53-59.