LMP Step Pattern Detection based on Real-Time Data

Haoyu Yuan, Fangxing Li, Yanli Wei
Department of Electrical Engineering and Computer Science
The University of Tennessee, Knoxville
Knoxville, TN 37996 USA
{hyuan2, fli6, ywei9}@utk.edu

Abstract: Locational marginal pricing (LMP) methodology has been adopted by most independent system operators (ISOs) and regional transmission organizations (RTOs) in today's electricity markets. Previous studies show that LMP has a step change characteristic with varying load. Market participants can use this characteristic to predict the future electricity price and potential step changes of LMP. In this paper, an effective algorithm using quality threshold (QT) clustering is proposed to detect the step change pattern of the hourly LMP. A set of indices to differentiate various patterns is introduced. Furthermore, a web-based tool is built to demonstrate the price behavior of different locations based on the 5-minute real-time LMP data from ISOs/RTOs. The user-friendly design with clustering functionality enables easy statistical study over a large amount of historical data.

Index Terms: Locational marginal price (LMP), critical load level (CLL), QT clustering, step change pattern detection, market participant, price prediction.

I. INTRODUCTION
Most Independent System Operators (ISOs) and Regional Transmission Organizations (RTOs) in North America have adopted an economic dispatch-based model to clear the day-ahead energy market [1-3]. Locational marginal price (LMP) reflects the impacts of transmission congestion and losses based on network models, which contributes to the economic operation of the electricity market. Given the known unit commitment decisions, the step change characteristic of LMP has been well studied [4-6]. The critical load level (CLL) is introduced to indicate the point where a price spike occurs [4]. When the load level is close to the CLL, there is a higher risk of volatile prices.
This becomes more common when the system has a large amount of intermittent renewable generation. Uncertainty analysis associated with CLL has been studied from the perspective of both market participants [7] and operators [8].

Market participants, who are actively involved in the power trading business, need to investigate all the possible causes of LMP volatility. It is clear that CLL is an important signal for predicting price spikes. It is also desirable to have an easy-to-use web-based tool with which market participants can perform statistical analysis of the LMP over the time periods and locations of interest. With this motivation, this paper attempts to address the following questions:
1) What is an appropriate and effective way to model the LMP step pattern?
2) How can statistical studies be conducted to correlate the model with the market information most used by market participants?
3) How can a user-friendly environment be created to publicly show the results of these statistical studies?

The quality threshold (QT) clustering algorithm has been widely applied in research areas such as oncology for tumor classification [9-10]. In tumor classification the number of clusters is not known in advance, which is also the case when clustering LMP; this similarity motivated the adoption of QT clustering in this work. In addition, two indices, outlier and overlap indicator (OI), are defined to describe efficiently how well an LMP pattern fits a step pattern.

Despite the importance of detecting and visualizing the step pattern in real-time LMP data, there is no web application providing such information to market participants. Thus, we develop a web-based application on the Java EE platform to dynamically generate diagrams of the detected step pattern for interested users.

This paper is organized as follows. Section II introduces the clustering algorithm and a set of definitions for step detection.
Section III briefly describes the data included in our work and presents the structure and a snapshot of the web-based application. In Section IV, observations from the website are presented, and statistical results of the proposed step detection algorithm on the actual LMP data are obtained to verify the observations. Section V concludes the work.
II. STEP PATTERN DETECTION ALGORITHM

A. Quality Threshold (QT) Clustering Algorithm
The quality threshold (QT) clustering algorithm can cluster data without specifying the number of clusters a priori [11]. This characteristic fits the application of LMP clustering very well because the number of price steps within one hour is unknown. The disadvantage of QT clustering is its intensive, time-consuming computation. However, because the LMP data set is small, containing only 12 points each hour, this disadvantage does not have a significant impact on the clustering performance here. Note that hourly LMP data (i.e., 12 data points) are analyzed because the unit commitment changes every hour; thus, the sub-hourly LMP within an hour is not affected by changes in the committed generation units.

The detailed steps of the QT clustering algorithm are as follows:
1) Initialize the threshold distance allowed for clusters and the minimum cluster size.
2) Build a candidate cluster for each data point by including the closest point, the next closest, and so on, until the distance of the cluster surpasses the threshold.
3) Save the candidate cluster with the most points as the first true cluster, and remove all points in the cluster from the candidate pool.
4) Repeat with the reduced set of points until no more clusters can be formed without violating the minimum cluster size.

B. Indices for Step Pattern Detection
After clustering the 12 LMP data points within an hour, the load range of each price step can be obtained. Ideally, if all the records follow the ascending step pattern, the diagram will show the step pattern by simply drawing a straight line over each load range. However, the LMP at a specific bus may not always increase as the load level grows [4]. Thus, clusters may overlap along the load axis, and indices are needed to handle this case.
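The four QT clustering steps listed in Section II-A can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Java class used in the web-based tool: the function name `qt_cluster`, its parameter names, the choice of the LMP range (maximum minus minimum) as the cluster "distance", and the sample prices are all hypothetical.

```python
from typing import List


def qt_cluster(values: List[float], max_diameter: float,
               min_size: int = 1) -> List[List[int]]:
    """Cluster 1-D values (e.g., 12 five-minute LMPs) with a QT-style procedure.

    Returns clusters as sorted lists of indices into `values`.
    Assumption: the cluster "distance" is its diameter, max(value) - min(value).
    """
    remaining = set(range(len(values)))
    clusters: List[List[int]] = []
    while remaining:
        best: List[int] = []
        # Step 2: build one candidate cluster around every remaining point.
        for seed in sorted(remaining):
            neighbours = sorted(
                remaining - {seed},
                key=lambda j: (abs(values[j] - values[seed]), j),
            )
            candidate = [seed]
            lo = hi = values[seed]
            for j in neighbours:  # closest point first, then the next closest
                new_lo, new_hi = min(lo, values[j]), max(hi, values[j])
                if new_hi - new_lo > max_diameter:
                    break  # adding this point would surpass the threshold
                candidate.append(j)
                lo, hi = new_lo, new_hi
            # Step 3: remember the candidate with the most points.
            if len(candidate) > len(best):
                best = candidate
        # Step 4: stop when the largest candidate violates the minimum size.
        if len(best) < min_size:
            break
        clusters.append(sorted(best))
        remaining -= set(best)
    return clusters


# Hypothetical hourly snapshot: 12 synthetic LMP values in $/MWh.
SAMPLE_LMP = [6.6, 6.8, 7.0, 6.9, 11.3, 11.5, 11.7, 22.4, 22.6, 22.8, 22.6, 40.0]
SAMPLE_CLUSTERS = qt_cluster(SAMPLE_LMP, max_diameter=2.0)
```

With `min_size=1`, an isolated price such as the last synthetic value ends up in a one-point cluster; a larger `min_size` leaves such points unclustered instead. Because each hour holds only 12 points, the repeated candidate construction remains cheap.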
Two indices, the outlier and the overlap indicator (OI), are defined in this work for use in step change pattern detection.

1) Definition of Outlier: After clustering based on LMP, a cluster with only one point is regarded as an outlier.
Meaning of Outliers: A cluster with only one point cannot form a convincing step and is likely an outlier caused by changes in the system other than load variation (transmission outage, generator outage, etc.).

2) Definition of Overlap Indicator (OI): The ideal step pattern follows the rule that a step with a higher LMP has a larger load. In the actual data, there can be a shift in the pattern, which leads to some level of overlap. An OI refers to the situation where the point immediately to the right of the present point on the load axis has a lower LMP value.
Meaning of Overlap Indicator: The number of OIs shows the level of overlap. The larger the number of OIs, the worse the overlap.

Fig. 1 shows the LMP versus load diagram of the Long Island region (LONGIL) at 5 am on 3/1/2012. In the diagram, the purple cross marker represents an outlier because its cluster contains only one marker. The remaining 11 records form 3 clusters. The first cluster consists of three red markers with a mean LMP of $6.8/MWh. The second cluster consists of three blue markers with a mean LMP of $11.5/MWh. The third cluster consists of four green markers with a mean LMP of $22.6/MWh. In Cluster 1, the third point from the left does not follow the rule that a higher load comes with a higher price. When the step pattern is formed, the bold dashed black vertical line shown in Fig. 1 is used to connect the higher price cluster and the lower price cluster. Hence, it represents an OI.

Figure 1. Illustration of Outlier and Overlap Indicator (OI).

The reason for ruling out the outliers is that one point cannot make a clear step. However, we could assign a default load range, say +/- 10 MW, to the one-point cluster.
This can be interpreted as only one point within that hour having a price in that range. Such a point could still be meaningful to this research. Since the analysis is meaningful whether outliers are included or excluded, we consider both approaches in this work:
1) Approach A: considering the outliers, such that the results of the clustering include both outliers and OIs;
2) Approach B: excluding the outliers as a separate index (a one-point cluster is treated as an ordinary cluster), such that the results of the clustering include OIs only.

C. Definition of Step Change Pattern Detection
After defining the indices, we can describe how well a diagram follows the theoretical step change pattern based on these indices. Here, we give a strong definition and a weak definition of how a diagram fits the step pattern. For each definition, there are specific standards for Approach A and Approach B.
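As a concrete reading of the two indices and of the strong and weak standards stated in this subsection, the following Python sketch counts outliers and OIs for one hourly snapshot and classifies the result. It rests on assumptions that the paper does not spell out: an OI is counted whenever, in ascending load order, the next point belongs to a cluster with a lower mean LMP; Approach A drops one-point clusters before counting OIs; and the function names and the synthetic data are hypothetical (the data only mimics the situation described for Fig. 1 and is not the NYISO record).

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def count_oi(load: Sequence[float], lmp: Sequence[float],
             clusters: List[List[int]]) -> int:
    """Count overlap indicators among the points of the given clusters."""
    level: Dict[int, float] = {}
    for cluster in clusters:
        cluster_mean = mean(lmp[i] for i in cluster)  # price level of the step
        for i in cluster:
            level[i] = cluster_mean
    order = sorted(level, key=lambda i: load[i])  # walk along the load axis
    # An OI occurs when the point to the right sits on a lower price step.
    return sum(1 for left, right in zip(order, order[1:])
               if level[right] < level[left])


def step_indices(load: Sequence[float], lmp: Sequence[float],
                 clusters: List[List[int]]) -> Tuple[int, int, int]:
    """Return (outliers, OIs under Approach A, OIs under Approach B)."""
    outliers = [c for c in clusters if len(c) == 1]
    steps = [c for c in clusters if len(c) > 1]
    return len(outliers), count_oi(load, lmp, steps), count_oi(load, lmp, clusters)


def classify(n_oi: int, n_outliers: int, approach: str) -> str:
    """Apply the strong/weak thresholds; n_outliers is ignored for Approach B."""
    if approach == "A":
        if n_oi + n_outliers <= 2:
            return "strong"
        if n_oi <= 2 and n_outliers <= 2:
            return "weak"
    else:
        if n_oi <= 2:
            return "strong"
        if n_oi <= 4:
            return "weak"
    return "none"


# Synthetic snapshot (12 points): three price steps plus one isolated price.
LOAD = [100, 110, 120, 130, 140, 150, 160, 165, 170, 180, 190, 200]
LMP = [6.7, 6.8, 11.4, 6.9, 11.5, 11.6, 22.5, 45.0, 22.6, 22.7, 22.6, 22.6]
CLUSTERS = [[0, 1, 3], [2, 4, 5], [6, 8, 9, 10, 11], [7]]

N_OUT, OI_A, OI_B = step_indices(LOAD, LMP, CLUSTERS)
RESULT_A = classify(OI_A, N_OUT, "A")
RESULT_B = classify(OI_B, N_OUT, "B")
```

In this synthetic case one low-price point lies to the right of a mid-price point, and the isolated price sits inside the load range of the highest step, so the outlier adds one OI only when it is kept as a cluster under Approach B.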
1) Strong Definition of Step Pattern:
a) For Approach A, if the total number of OIs and outliers is less than or equal to 2, the diagram meets the criterion of the strong definition of step pattern.
b) For Approach B, if the number of OIs is less than or equal to 2, the diagram meets the criterion of the strong definition of step pattern.

As shown in Fig. 1, this case has 1 OI and 1 outlier for Approach A, which qualifies the case as a strong step pattern. For Approach B, the outlier is treated as a cluster with only one data point. This point satisfies the definition of OI, which increases the total number of OIs to 2. The diagram can therefore also be considered to have a strong step pattern. Intuitively, the pattern in Fig. 1 strongly leads us to regard it as a step pattern. In this sense, the definition is acceptable.

2) Weak Definition of Step Pattern:
a) For Approach A, if the number of OIs is less than or equal to 2 and the number of outliers is less than or equal to 2, the diagram meets the standard of the weak definition.
b) For Approach B, if the number of OIs is less than or equal to 4, the diagram meets the standard of the weak definition.

If the price versus load diagram of an hour meets the standard of the strong definition of step pattern, the sub-hourly data within this hour can be treated as following a step pattern. If the indices meet only the weak definition but not the strong definition, we may consider that there is a step trend in the diagram. If the indices cannot meet the weak definition, no clear step pattern exists.

III. IMPLEMENTATION OF STEP DETECTION ALGORITHM
The algorithm and indices described in Section II are implemented based on the published actual LMP and load data from NYISO. In this section, the studied data is described, and the web-based tool that dynamically generates the illustrative step pattern diagram is introduced.

A. Real Time Data from NYISO
In order to detect the step pattern of actual real-time LMP, we use the 5-minute LMP data published by the New York Independent System Operator (NYISO), which has 11 load regions. All the data were collected from the NYISO website [12]. In the load data file, there are 5 fields: "Time Stamp", "Time Zone", "Name", "PTID" and "Load". The Time Stamp field is the time when the data tuple was recorded; Time Zone is Eastern Daylight Time (EDT) for all NYISO internal load regions; PTID is a unique ID for each region; and Load records the real-time load at each region.

For the real-time LMP data, we use the real-time LBMP (Locational Based Marginal Pricing) data file published by NYISO. Despite the slight difference in the names, LBMP has a definition identical to that of LMP. The fields are "Time Stamp", "Name", "PTID", "LBMP ($/MWHr)", "Marginal Cost Losses ($/MWHr)", and "Marginal Cost Congestion ($/MWHr)". The first three fields have the same definition as in the load files, while the other fields are self-explanatory. The LBMP consists of three components: (1) Energy, (2) Congestion, and (3) Losses [12]. The energy component is the same for all buses. The LMP will be the same at all locations when transmission limits are not binding and losses are zero. The congestion component of LMP, or Marginal Cost Congestion in the price file, results in unequal LMPs at different locations. The losses component, represented by Marginal Cost Losses in the file, is the cost due to losses.

NYISO records the price and load data every 5 minutes; thus, there are 12 records for each region within 1 hour. In this work, we attempt to detect the step pattern for each hour. The reason is that unit commitment is usually performed on an hourly basis, which makes the step pattern less predictable across hours.

The data included in our database is the LMP and load data for all 11 load regions from August 2011 to July 2012. Considering the total load of all 11 load regions, the annual load peak occurred in summer (June, July, and August). Fig. 2 shows the average, maximum, and minimum load of each month over the 12 months of data in our database, starting from August 2011.

Figure 2. Total load of NYISO within one year (August 2011 to July 2012).

B. Web-based Tool Development
Web-based applications have been employed for engineering analysis [13]. In this study, a web-based tool is developed using the NetBeans IDE. It is a typical Java web application using Java EE 5 on the server side and MySQL as the database. There are three layers: User layer, Java Client, and MySQL Server. The user, represented by the user layer, interacts with the Java client through two web pages: welcome.jsp and response.jsp. The web pages are written in JSP (JavaServer Pages), which embeds HTML. The Java client interacts
with the MySQL Server through JDBC, a Java-based data access technology. The clustering method is programmed in a Java class and stored in the source file. The real-time data of NYISO is stored in the MySQL database. Fig. 3 gives a snapshot of the website. Users input the information through drop-down lists and submit the page. The clustered diagram is displayed beneath the drop-down lists.

Figure 3. Snapshot of the Web-based tool for LMP Step Pattern Detection.

IV. OBSERVATIONS AND STATISTICAL RESULTS
The web-based tool introduced in Section III provides convenient access to the step detection diagrams of NYISO regions. Observation and comparison of a large number of diagrams of historical data can provide useful information to market participants and system operators. Based on these observations, further verification can be conducted. In this section, the indices and the definitions of strong step pattern and weak step pattern for both Approach A and Approach B in Section II are used. The step detection algorithm is tested on all the price and load data of NYISO from August 2011 to July 2012 to obtain the yearly statistics. The following are two observations.

A. Setup of the NYISO Study
From the diagrams of the web-based tool, there is a clear step pattern in most of the hours. In order to justify the existence of the step pattern, the percentage of hours that qualify as a step pattern should be obtained. In Section II, Approaches A and B are defined. For a specific hour in a particular region, which will be referred to as a snapshot in the following discussion, there is a clustering result, which is basically a set of indices for each approach. Based on the result, one snapshot can be classified as strong step pattern, weak step pattern, or no step pattern for each approach. For example, the clustering result for the snapshot shown in Fig. 1 would look like this:
Approach A: number of clusters: 3; outlier number: 1; OI number: 1; strong step pattern.
Approach B: number of clusters: 4; OI number: 2; strong step pattern.
The difference here is that, while Approach A considers outliers and OIs, Approach B only considers OIs and treats each outlier as one cluster.

B. Existence of Step Pattern in LMP
After running the detection program over all the real-time LMP data, the percentage of snapshots classified as having a step pattern under Approaches A and B can be gathered. For all the data in the database, there are 365 days * 24 hours * 11 regions = 96360 snapshots. Table I gives the percentage of strong step pattern and weak step pattern over all 12 months of data. Note that the snapshots satisfying the strong definition of step pattern also meet the weak definition under the same approach.

TABLE I
SUMMARY OF STATISTICS FOR YEARLY DATA

              Strong Step Pattern   Weak Step Pattern
Approach A    46.68%                81.45%
Approach B    59.25%                91.85%

From the percentage data above, it is clear that Approach A, which considers outliers, has a stricter standard, and 81.45% of the snapshots meet the standard of weak step pattern for Approach A. For Approach B, 59.12% of the snapshots qualify as a strong step pattern and over 90% of the snapshots meet the weak definition standard. Since the price data here takes congestion and losses into account, a percentage above 80% is acceptable in our study to validate the existence of step change patterns. Therefore, the statistical results are consistent with the observation from the web-based tool, as well as with the theoretical analysis in the literature on critical load levels, where the step change occurs [4-6]. If we regard the snapshots that meet the weak step pattern standard as having a step pattern in the diagram, the existence of the step pattern is supported under both approaches.

V. CONCLUSIONS
In this work, an LMP step pattern detection algorithm based on QT clustering is proposed. Two indices, OI and
Outlier, are defined to describe the fitness of the step pattern. A web-based tool capable of detecting and visualizing the step pattern is constructed, along with a database storing the 5-minute real-time LMP data from NYISO. Observations based on the step detection diagrams generated by the web-based tool provide valuable information for market participants and system operators to analyze the price behavior. Statistical results from real-time data validate the existence of the step pattern in the sub-hourly LMP data. Furthermore, the historical step pattern provided by the web-based tool can serve as a useful predictor for market participants, and the percentage data, as well as the observed consistency of the pattern within one hour, can be treated as an indication of system congestion.

VI. ACKNOWLEDGEMENT
This work was supported in part by NSF grant ECCS-1001999. This work also made use of the Shared Facilities and the Industry Partnership Program supported by CURENT, an Engineering Research Center (ERC) Program of NSF and DOE under NSF grant EEC-1041877.

REFERENCES
[1] J. Yang, F. Li, and L.A.A. Freeman, "A market simulation program for the standard market design and generation/transmission planning," IEEE Power Engineering Society General Meeting, 2003.
[2] NYISO Transmission & Dispatch Operations Manual, NYISO, 1999.
[3] H. Chao, F. Li, L. H. Trinh, J. Pan, M. Gopinathan, and D. J. Pillo, "Market based transmission planning considering reliability and economic performances," 2004 International Conference on Probabilistic Methods Applied to Power Systems, pp. 557-562, 2004.
[4] F. Li and R. Bo, "Congestion and price prediction under load variation," IEEE Trans. on Power Systems, vol. 24, no. 2, pp. 911-922, 2009.
[5] R. Bo and F. Li, "Probabilistic LMP forecasting under AC optimal power flow framework: Theory and applications," Electric Power Systems Research, vol. 88, pp. 16-24, 2012.
[6] R. Bo, F. Li, and K. Tomsovic, "Prediction of critical load levels for AC optimal power flow dispatch model," International Journal of Electrical Power & Energy Systems, vol. 42, pp. 635-643, 2012.
[7] Y. Wei, F. Li, and K. Tomsovic, "Measuring the Volatility of Wholesale Electricity Prices Caused by Wind Power Uncertainty with a Correlation Model," IET Renewable Power Generation, vol. 6, no. 5, pp. 315-323, 2012.
[8] F. Li and Y. Wei, "A Probability-Driven Multilayer Framework for Scheduling Intermittent Renewable Energy," IEEE Transactions on Sustainable Energy, vol. 3, no. 3, pp. 455-464, 2012.
[9] A. N. Young, M. B. Amin, C. S. Moreno, et al., "Expression profiling of renal epithelial neoplasms: a method for tumor classification and discovery of diagnostic molecular markers," The American Journal of Pathology, vol. 158, pp. 1639-1651, 2001.
[10] Y. Wang, T. Jatkoe, Y. Zhang, M. G. Mutch, D. Talantov, J. Jiang, H. L. McLeod, and D. Atkins, "Gene expression profiles and molecular markers to predict recurrence of Dukes' B colon cancer," Journal of Clinical Oncology, vol. 22, pp. 1564-1571, 2004.
[11] L. J. Heyer, S. Kruglyak, and S. Yooseph, "Exploring expression data: identification and analysis of coexpressed genes," Genome Research, vol. 9, pp. 1106-1115, 1999.
[12] New York ISO, http://www.nyiso.com, accessed Aug. 2012.
[13] F. Li, L.A.A. Freeman, and R.E. Brown, "Web-Enabling Applications for Outsourced Computing," IEEE Power & Energy Magazine, vol. 1, no. 1, pp. 53-57, January 2003.