Indices for calibration data selection of the rainfall runoff model


WATER RESOURCES RESEARCH, VOL. 46, doi:10.1029/2009WR008668, 2010

Jia Liu and Dawei Han

Received 20 September 2009; revised 18 November 2009; accepted 1 December 2009; published 27 April 2010.

[1] The identification of rainfall runoff models requires selection of appropriate data for model calibration. Traditionally, hydrologists use rules of thumb to select a certain period of hydrological data to calibrate the models (e.g., 6 years of data). There are no numerical indices to help hydrologists select the calibration data quantitatively. Two questions arise: how long should the calibration data be (e.g., 6 months), and from which period should the data be selected (e.g., which 6 month window)? In this study, indices for the selection of calibration data with adequate lengths and appropriate durations are proposed by examining the spectral properties of data sequences before the calibration work. With the validation data determined beforehand, we assume that the more similarity the calibration data set bears to the validation set, the better the performance of the rainfall runoff model should be after calibration. Three approaches are applied to reveal the similarity between the validation and calibration data sets: the flow duration curve, the Fourier transform, and wavelet analysis. Data sets used for calibration are generated by designing three scenario groups with fixed lengths of 6, 12, and 24 months, respectively, from 8 year continuous observations in the Brue catchment of the United Kingdom. Scenarios in each group have different starting times and thus various durations with specific hydrological characteristics. With a predetermined 18 month validation set and the probability distributed model chosen as the rainfall runoff model, useful indices are produced for certain scenario groups by all three approaches.
The information cost function, an entropy like function based on the decomposition results of the discrete wavelet transform, is found to be the most effective index for the calibration data selection. The study demonstrates that the information content of the calibration data is more important than the data length; thus 6 month data may provide more useful information than longer data series. This is important for hydrological modelers, since shorter but more informative data help hydrologists build models more efficiently and effectively. The idea presented in this paper has also shown potential for enhancing the efficiency of calibration data utilization, especially for data limited catchments.

Citation: Liu, J., and D. Han (2010), Indices for calibration data selection of the rainfall runoff model, Water Resour. Res., 46, doi:10.1029/2009WR008668.

Water and Environmental Management Research Centre, Department of Civil Engineering, University of Bristol, Bristol, UK.
Copyright 2010 by the American Geophysical Union.

1. Introduction

[2] Mathematical rainfall runoff models are powerful tools which have been increasingly used in solving practical water resources engineering problems, ranging from online flood forecasting to land use change evaluations and the design of hydraulic structures. The confidence in a rainfall runoff model depends on the model uncertainty remaining after calibration [Yapo et al., 1996]. Besides the automatic optimization related issues on which many researchers have focused during the past two decades, the appropriate selection of calibration data has recently been gaining more and more attention as a means of obtaining a robust and reliable calibration procedure. In general, modelers tend to use as large a data set as possible to obtain a calibration data set representative of the various phenomena experienced by the watershed.
However, it is not the length of the data but the quality of the information contained in it that matters more in deciding the calibrated model performance, and the use of additional data beyond a certain amount will only marginally improve the parameter estimates [Sorooshian et al., 1983]. Gupta and Sorooshian [1985a, 1985b] provided a theoretical analysis indicating that data sequences containing greater hydrologic variability are more likely to result in reliable parameter estimates and thus enhance the performance of the calibrated model.

[3] Many researchers have focused on searching for the most adequate calibration data length and confirmed that using longer data for calibration does not necessarily produce better model performance. Different data lengths, ranging from 3 months to 10 years, were recommended for calibration with different models and optimization methods [Harlin, 1991; Yapo et al., 1996; Gan and Biftu, 1996; Gan et al., 1997; Anctil et al., 2004; Brath et al., 2004; Butts et al., 2004; Xia et al., 2004; Boughton, 2006; Perrin et al., 2007]. As early as the beginning of the 1990s, Harlin [1991] developed a process oriented calibration scheme for the automatic calibration of the Hydrologiska Byråns Vattenbalansavdelning model, and a calibration length between 2 and 6 years was found to be sufficient for optimal parameters in the test basins. Later, important contributions were made by Yapo et al. [1996] and Gan et al. [1997], both using the shuffled complex evolution algorithm for the automatic calibration of lumped models operated at a daily time scale. Their results suggested minimal period lengths of 8 years and 1 year, respectively, of continuous daily data to obtain reliable calibrations that are relatively insensitive to the period selected. Brath et al. [2004] found that there is also an optimal length of calibration data for spatially distributed hydrological models and showed how reducing the length of the calibration period below 3 months in their case study significantly degraded the model performances.

[4] However, the conclusions about appropriate calibration lengths from those studies all depended on the characteristics of the case studies and the types of rainfall runoff models used. The increasing attention given to the selection of the most representative calibration data of adequate length has also put a heavy burden on the calibration work, and this problem will worsen as more observed data are collected by modern telemetry systems. There is a lack of a simple but effective approach for selecting the proper data for calibration. Is it possible that the most appropriate set of calibration data could be decided before the calibration work takes place? Besides the model performance, which is unknown until the completion of the whole calibration procedure, are there other criteria indicating the right selection of the calibration data?
Assuming that the data used for model validation are ascertained, the problem can be simplified to finding criteria that evaluate the similarity between the validation set and different calibration data sets. We can assume that this similarity is in accordance with the model performance after calibration, which means that the more similar the calibration set is to the validation set, the better the performance of the model calibrated with that set should be. The main purpose of this study is to search for simple indices representing the similarity between the calibration and validation data and then to verify the consistency of the similarity shown by the indices with the model performance after calibration.

[5] In this study, the discrete wavelet transform (DWT) is applied, and an entropy like indicator named the information cost function (ICF) is constructed based on the wavelet analysis to evaluate the spectral characteristics and the similarity between the validation and calibration data sets using the observed flow data. Before the DWT, two basic approaches, the flow duration curve (FDC) and the fast Fourier transform (FFT), are adopted to investigate their potential for representing the hydrological and spectral similarity between the validation and calibration data. All the results are verified by the model performances after the calibration of a conceptual rainfall runoff model, the probability distributed model (PDM). In order to eliminate the differences caused by using different automatic calibration methods, three optimization algorithms, particle swarm optimization (PSO), the genetic algorithm (GA), and sequential quadratic programming (SQP), are used to generate averaged and stable calibration results. An 8 year rainfall runoff data set is split into three scenario groups, with the length of each scenario being 6, 12, and 24 months, respectively.
All the analyses performed with the FDC, FFT, and DWT explore whether the similarities identified by the chosen indices are consistent with the model performances obtained after calibration with the different calibration data sets in the three scenario groups.

2. Methodology

2.1. Flow Duration Curve and Fourier Transform

[6] A flow duration curve provides the percentage of time (duration) a flow with a certain time interval is exceeded over a historical period for a particular river basin [Vogel and Fennessey, 1994]. It may also be viewed as the complement of the cumulative distribution function of the considered flows [LeBoutillier and Waylen, 1993]. The empirical FDC can easily be constructed from streamflow observations using the standardized nonparametric procedures described by Vogel and Fennessey [1994].

[7] The Fourier transform is a frequency domain technique that decomposes a periodic signal into a linear superposition of sinusoids of different frequencies [Newland, 1993]. For discrete data (although streamflow is a continuous signal, the digital recording device measures the flow at a prefixed time interval, so the resultant data are in discrete format), the discrete Fourier transform (DFT) is often applied [Robin et al., 1993]. Two indices commonly used to visualize and analyze the results of the DFT are the absolute amplitude, A_{p+1}, and the power, P_{p+1}:

A_{p+1} = |X_{p+1}| / n,  (1)

P_{p+1} = |X_{p+1}|^2 / n,  (2)

where X_{p+1} is the transformed Fourier series with a length of n and the index p runs from 0 to n − 1. Cooley and Tukey [1965] proposed the fast Fourier transform (FFT), which can compute the DFT more efficiently, with complexity O(n log n) instead of O(n^2).
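As a minimal illustration, the amplitude and power indices above can be computed directly from the FFT output. The sketch below assumes NumPy; the function name is ours, not from the paper:

```python
import numpy as np

def fft_amplitude_power(x):
    """Absolute amplitude and power of a discrete signal from its DFT."""
    n = len(x)
    X = np.fft.fft(x)             # transformed Fourier series, p = 0 .. n-1
    amplitude = np.abs(X) / n     # A = |X| / n
    power = np.abs(X) ** 2 / n    # P = |X|^2 / n
    return amplitude, power
```

By Parseval's theorem, the total power summed over all frequency components equals the sum of the squared signal values, which is the total power quantity later compared between the validation and calibration sets.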
In this paper, the amplitude and total power of the Fourier series after the FFT, together with the flow duration curves, are applied first for the comparison between the validation and calibration data sets.

2.2. Wavelet Analysis and the Information Cost Function (ICF)

[8] The wavelet transform is a strong mathematical tool that provides a time frequency representation of an analyzed signal [Daubechies, 1990; Polikar, 1999]. It is a more efficient approach than the Fourier transform for studying nonstationary time series. In recent years, there has been increasing interest in the use of wavelet analysis in a wide range of fields in water resources and meteorology. Besides its successful application in the characterization and periodic analysis of climatic and hydrological data [Smith et al., 1998; Torrence and Compo, 1998; Park and Mann, 2000; Penalba and Vargas, 2004; Partal and Kahya, 2006], wavelet analysis is also a powerful tool for determining the relationships between different climatic or hydrological elements through analyzing and synthesizing their variable structures in the frequency domain [Nakken, 1999; Drago and Boxall, 2002; Taleb and Druyan, 2003; Kulkarni, 2000; Li et al., 2009]. Results from recent studies have demonstrated the feasibility of wavelet analysis in locating the irregularly distributed multiscale features of hydrometeorological data and in quantitatively correlating different observation series through their wavelet based expressions.

[9] Basic ideas about wavelets and how the different wavelet transforms (including both the continuous wavelet transform (CWT) and the DWT) are performed can be found in the work of Meyer [1993]. An efficient way to implement the DWT was devised by Mallat [1989] as the Mallat decomposition algorithm, which utilizes a number of successive filtering steps with which the original signal f can be decomposed into a series of approximations and details as follows:

S^0_n = f[n], n ∈ N,  (3)

S^j_k = Σ_{n=0}^{L−1} h[n] S^{j−1}_{n+2k}, j = 1, 2, …, J,  (4)

C^j_k = Σ_{n=0}^{L−1} g[n] S^{j−1}_{n+2k}, j = 1, 2, …, J,  (5)

where S^j_k and C^j_k represent the approximation and detail coefficients, respectively, N is the total number of data points in the signal f, h[n] and g[n] are the impulse responses of the low pass filter H and the high pass filter G, respectively, L is the number of nonzero impulse responses in h[n] and g[n], and J is the maximum possible scale of the Mallat decomposition algorithm, with J ≤ [log_2(N/L)] + 1 [Li et al., 1997]. The original signal f is first decomposed into an approximation and an accompanying detail. The approximation coefficients S^j_k are obtained by convolving the signal with the decomposition low pass filter H, while the detail coefficients C^j_k are obtained with the high pass filter G. The decomposition process is then iterated, with successive approximations being decomposed in turn, so that the original signal is broken down into many lower resolution components.
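A compact sketch of the Mallat decomposition above, together with the level energies and the entropy of their distribution used for the ICF, is given below. For brevity it uses the simple Haar filter pair (the decomposition applied later in this paper uses a Daubechies wavelet of order 10); the function names are ours:

```python
import numpy as np

def haar_dwt_energies(f, J):
    """Mallat decomposition with the Haar filter pair, returning the detail
    energies E_j = sum_k (C^j_k)^2 for j = 1..J and the final approximation energy."""
    h = np.array([1.0, 1.0]) / np.sqrt(2)   # low-pass filter H
    g = np.array([1.0, -1.0]) / np.sqrt(2)  # high-pass filter G
    s = np.asarray(f, dtype=float)
    energies = []
    for _ in range(J):
        # convolve with the reversed filters and downsample by 2
        c = np.convolve(s, g[::-1])[1::2]   # detail coefficients at this level
        s = np.convolve(s, h[::-1])[1::2]   # approximation passed to next level
        energies.append(np.sum(c ** 2))
    return np.array(energies), np.sum(s ** 2)

def icf(energies):
    """Entropy of the energy distribution across levels (information cost function)."""
    p = energies / energies.sum()  # P_j = E_j / E_tot
    p = p[p > 0]                   # terms with P_j = 0 contribute zero
    return -np.sum(p * np.log(p))
```

Because the Haar filters form an orthonormal pair, the detail energies plus the final approximation energy reproduce the total energy of the input signal, which provides a quick sanity check on the decomposition.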
As a result, the approximations are the high scale, low frequency components of the signal, while the details are the low scale, high frequency components.

[10] With the wavelet coefficients C^j_k and S^j_k, the sum E_j = Σ_k (C^j_k)^2 (or E_j = Σ_k (S^j_k)^2) gives the energy of the details (or approximations) of the signal f at level j. If the total energy is denoted as E_tot = Σ_j E_j, the corresponding percentile energy at level j is

P_j = E_j / E_tot.  (6)

The level j is associated with a frequency band ΔF obtained in the following way:

2^{−(j+1)} F_s ≤ F ≤ 2^{−j} F_s,  (7)

where F_s is the sampling frequency and j = 1, 2, …, J.

[11] The sequence P_j gives the probability distribution of the energy over the levels j. This distribution has a Shannon entropy that is defined as the information cost function [Blanco et al., 1998], which essentially measures the order inside the system:

ICF = −Σ_j P_j ln P_j,  (8)

where any term with P_j = 0 is interpreted as zero. The ICF is an entropy like function that is easy to calculate and gives a good estimate of the degree of disorder of a system [Figliola and Serrano, 1997]. In this study, the ICF and the energy distribution described by the total energy of the wavelet coefficients of both details and approximations are regarded as indicators of similarity in the comparisons between the validation and calibration sets in the frequency domain.

3. Rainfall Runoff Model, Optimization Methods, and Data Description

3.1. Probability Distributed Model

[12] The rainfall runoff model used in this study is the PDM developed by Moore [1985]. The PDM has been widely applied in various catchments in the United Kingdom and can be viewed as representative of the conceptual saturation excess hydrological models used for runoff simulation in humid and semihumid regions. It is developed based on the scheme of the Xinanjiang model, with a soil moisture storage capacity that varies over the catchment, described by a simple Pareto distribution.
There are 13 parameters in the PDM to be calibrated, including f_c (the rainfall factor), c_min and c_max (the minimum and maximum storage capacities, respectively), b (exponent of the Pareto distribution controlling the spatial variability of the store capacity), b_e (exponent in the actual evaporation function), k_g (groundwater recharge time constant), b_g (exponent of the recharge function), S_t (soil tension storage capacity), k_1 and k_2 (time constants of a cascade of two linear reservoirs in the surface routing system), k_b (base flow time constant of the groundwater routing system), q_c (constant flow representing returns or abstractions), and t_d (time delay). Moore [2007] gives an extensive description of the parameters and the model structure. The design of the surface and groundwater storage routing models can be found in the work of O'Connor [1982] and Dooge [1973].

3.2. Calibration Methods

[13] The calibration of a rainfall runoff model is normally performed either manually or using computer based automatic procedures. An integration of automatic optimization with visually interactive parameter estimation is recommended for the PDM [Moore, 2007]. Because of the number of calibration sets in this study, and in order to reduce subjective decisions in the calibration work, automatic calibration of all 13 parameters is chosen here.

[14] Recent research into global search methods has led to the use of population evolution based optimization algorithms [Gupta et al., 1998] such as the GA [Wang, 1991, 1997], the shuffled complex evolution (SCE) algorithm [Duan et al., 1992, 1994], and simulated annealing (SA) [Sumner et al., 1997], which have proven to be both effective and relatively efficient in dealing with water resources systems [Zakermoshfegh et al., 2008]. Nowadays a newer evolutionary technique, PSO, has gained much attention and wide application in different fields [Eberhart and Shi, 2001]. It is a population based stochastic optimization technique developed in 1995, inspired by the simulation of social behavior [Eberhart and Kennedy, 1995]. PSO shares many similarities with the population evolution based techniques, especially the genetic algorithms, but it has shown many attractive characteristics, such as a simple concept, easy implementation, and quick convergence [Liu et al., 2005]. Recently the PSO method has been successfully used in many parameter optimization cases of rainfall runoff models [Chau, 2006, 2007; Gill et al., 2006; Goswami and O'Connor, 2007; Reddy and Kumar, 2007; Zakermoshfegh et al., 2008].

[15] For the various calibration data scenarios in this study, although single optimization methods could provide similar calibration results, they are not stable for all the cases, and sometimes the results of certain scenarios can be quite different. In order to exclude the influence of the choice of optimization method on the calibration results and to emphasize the selection of the calibration data, a more stable combined approach, in which the PSO method is used together with the genetic algorithm and another nonlinear optimization algorithm, SQP [Biggs, 1975; Han, 1977; Powell, 1978a, 1978b], is chosen to perform the automatic calibration of the PDM. It takes more computation time than a single optimization approach, but the improved stability helps to derive more reliable results.
The objective function is chosen as the Nash Sutcliffe efficiency coefficient (NSE) [Nash and Sutcliffe, 1970].

3.3. Catchment and Data Description

[16] Data used in this study are from the Brue catchment, which is located in Somerset, United Kingdom ( N, 2.58 W), with a drainage area of 135 km^2. It is a predominantly rural catchment of modest relief, with spring fed headwaters rising in the Mendip Hills and Salisbury Plain. The rain gauge network consists of 49 Casella 0.2 mm tipping bucket rain gauges. An automatic weather station and an automatic soil water station are located in the catchment and record the global solar radiation, net radiation, and other weather parameters, such as wind speed, wet and dry bulb temperatures, and atmospheric pressure, at hourly intervals.

[17] The Natural Environment Research Council funded the Hydrological Radar Experiment project, which ran from May 1993 to April 1997 in the Brue catchment (its data collection was extended to 2000). Eight years of 15 min rainfall runoff data obtained from this project are used in this study. Because there was a gap due to a failure of data collection from July to November 1998 during the project, that gap is taken as the division between the calibration and validation data sets. Observed data before the gap are used for calibration. Starting at the same time, three sets of calibration data are first made with lengths of 6, 12, and 24 months. Shifted by a 1 month sliding window, the three sets then form three scenario groups of calibration sets with respective data lengths of 6, 12, and 24 months. This results in a total of 135 calibration sets, that is, 53 sets in the 6 month scenario group, 47 sets in the 12 month group, and 35 sets in the 24 month group. In the Brue catchment, the wet period normally lasts from November to the following April and the dry period from May to October, which divides the year into two 6 month periods.
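The sliding window construction of the scenario groups can be sketched as follows. The helper below is hypothetical (not from the paper); a pre-gap record of 58 months is implied by the reported group sizes:

```python
def calibration_windows(n_months, lengths=(6, 12, 24)):
    """Enumerate 1-month-shifted calibration windows of each fixed length
    over a record of n_months, as (start_month, end_month) index pairs."""
    return {L: [(start, start + L) for start in range(n_months - L + 1)]
            for L in lengths}

windows = calibration_windows(58)  # 53 + 47 + 35 = 135 scenarios
```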
A period of 12 months covers a year with four seasons, which also fits the concept of the hydrological year. That is why the period of 6 months and its two integral multiples, 12 and 24 months, are chosen in this study as the lengths of the three groups of calibration scenarios. The remaining 18 months of observed data after the gap are used for validation. It should be noted that although the selection of validation data is also of great importance to the evaluation of the calibrated model, in order to fully investigate how the starting time and duration of the calibration data influence the calibration results, the validation data set is fixed in this study, and the validation results are considered the evaluation criteria for the performances of the models calibrated using the different calibration scenarios. Figure 1 shows the hydrographs and the rainfall variations of the validation and calibration data sets in the three scenario groups. Daily potential evaporation data are obtained from the Met Office Surface Exchange Scheme and are split into 15 min data in accordance with the rainfall runoff data before being processed in the PDM.

4. Results

4.1. Model Performances of Different Calibration Scenarios

[18] Calibration runs are conducted for the 135 calibration scenarios with fixed lengths of 6, 12, and 24 months and various starting times. For each scenario, three optimization runs are performed using the three algorithms PSO, GA, and SQP. The optimization results of the three algorithms are very similar, and to smooth out the fluctuating outcomes of the individual algorithms, the average results are adopted for analysis. All the calibrated models of the 135 scenarios are then validated against the 18 month validation data set.
Besides the NSE, several other statistics are explored to evaluate the model performance based on both the validation and calibration results, including the root mean square error, the mean absolute error, the mean bias error, and the correlation coefficient. Because all the evaluation statistics give consistent results, the NSE is chosen as the only assessment of model performance in the following sections of analysis. Figure 2 shows the changes in model performance due to variations of the calibration data with different starting times and durations.

[19] By comparing the three series of model performances of the 6, 12, and 24 month scenario groups in Figures 2a and 2b, it can be noted that although the validation results are slightly poorer than the calibration results, similar trends exist in the two series. Better calibrated models produce better validation results, while poor model performance is often caused by poor calibration scenarios. This tendency is most obvious for the 6 month group. The overfitting phenomenon that normally happens with numerical models cannot be found here. This may

Figure 1. Rainfall and runoff of the validation and calibration data sets for the three scenario groups, where the x axis values are the indices and starting times of the calibration sets, with the interval representing 1 month. For example, calibration set 25 of the 6 month group can be found in Figure 1b with x values ranging from 25 to 31, and calibration set 25 in the 12 and 24 month groups corresponds to the sections with x values ranging from 25 to 37 and from 25 to 49, respectively.

Figure 2. Average results of (a) calibration and (b) validation using the three optimization methods. The x axis values are the indices of the scenarios in the 6, 12, and 24 month groups, and the y axis values are the model performance of each calibration scenario indicated by the Nash Sutcliffe efficiency coefficient (NSE).

be because the largest calibration set length in this study (24 months) is still appropriate for calibrating the PDM.

[20] The empirical cumulative distribution functions (CDFs) of the NSE statistic representing the model performance are constructed for both the calibration and validation results (Figure 3). The CDF of each scenario group indicates the chance of obtaining an NSE of magnitude less than a specific value if a calibration data set in that group is selected at random. In both Figures 3a and 3b, the CDFs become less steep and wider ranging as we progress from the 24 month group to the 6 month group. Increasing steepness indicates a reducing sensitivity of model performance to the selection of the calibration data set [Yapo et al., 1996], which means that the 12 and 24 month scenario groups can produce more stable model performances than the 6 month group. Examining the NSE statistics of the three groups in Table 1 gives similar results. The validation results of the 12 and 24 month groups have relatively higher average NSE values, although the 6 month group can achieve better model performance after calibration for some data selections. Both the maximum and minimum values of NSE are yielded by scenarios in the 6 month group, and as the length increases, the range of NSE clearly shrinks, with a decreasing standard deviation.

[21] From another viewpoint, although in general the 6 month group gives less stable results, it does perform better in some cases than the 12 and 24 month groups.
If the underlying relationship between the model performance and the starting time and duration of the calibration data can be found, we can pick out the best scenario (or at least the better ones) easily, without the trade off between data length and stable model performance.

[22] A further insight into the model performances in Figure 2, together with the hydrographs in Figure 1, helps to reveal the reason for the several lowest NSE values in the 6 month group.

Figure 3. Empirical cumulative distribution functions (CDFs) of the NSE of (a) calibration and (b) validation results of the 6, 12, and 24 month groups.

It can easily be noticed that in those scenarios, most of the calibration periods are occupied by the dry months, which normally occur during May to October in the study catchment and whose duration is no more than 6 months. Typical examples are scenarios 8, 20, and 33 in the 6 month group. That is easy to understand and can be avoided by direct experience when choosing the calibration data. In contrast, for the 12 and 24 month groups, it is tricky to identify the relatively poor scenarios before modeling, because dry months take up at least half of the whole period for all the scenarios. At the same time, the good scenarios in all three groups cannot easily be found by a simple visualization of the hydrographs beforehand. However, from the case of the 6 month group, we can assume that the calibration data sets of the good scenarios may have a higher similarity with the validation data, while those of the poor ones have the least similarity. In the following sections, the exploration of the flow similarity between the calibration and validation data sets is carried out using the flow duration curve and two spectral analysis tools, the Fourier transform and the wavelet analysis.

4.2. Similarity Identified by the Flow Duration Curve

[23] The flow duration curves of the validation and calibration sets of all 135 scenarios are constructed using daily observed flow data. We can assume that, when plotted together, the calibration curves of scenarios with good model performances should be close to the validation curve, while scenarios with poor model performances have relatively distant calibration curves. In order to quantify the similarity between the validation and different calibration curves, the Nash Sutcliffe efficiency coefficient is calculated based on the data series of the validation and calibration curves.
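The two ingredients of this comparison, an empirical flow duration curve and the NSE between two curves evaluated on a common exceedance probability grid, can be sketched as below. The function names and the Weibull plotting positions are our choices, not specified by the paper:

```python
import numpy as np

def flow_duration_curve(q, probs):
    """Empirical FDC: flow exceeded with each probability in probs."""
    q_sorted = np.sort(q)[::-1]                        # flows in descending order
    exceed = np.arange(1, len(q) + 1) / (len(q) + 1)   # Weibull plotting positions
    return np.interp(probs, exceed, q_sorted)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency between two series."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1 - np.sum((obs - sim) ** 2) / np.sum((obs - np.mean(obs)) ** 2)

# similarity of a calibration FDC to the validation FDC on a common grid:
# probs = np.linspace(0.01, 0.99, 99)
# fdc_similarity = nse(flow_duration_curve(q_val, probs),
#                      flow_duration_curve(q_cal, probs))
```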
Figure 4 plots these NSE values against the corresponding model performances. The fitted regression line in Figure 4a shows a high correlation between the flow duration curve similarity and the model performance in the 6 month group, which means that the more similar the calibration set is to the validation set in terms of the flow duration curve, the better the model performance it can produce. Unfortunately, the regression lines for the 24 month group (Figure 4b) are nearly flat, and the results are not as clear as for the 6 month group, so they can hardly reflect the same correlation between the curve similarity and the model performance. The results of the 12 month group show an average tendency between the 6 and 24 month groups; they are not displayed here to keep the paper concise.

Figure 4. Relationship between the model performance and the similarity of the flow duration curves of the validation and calibration data sets in the 6 and 24 month groups.

4.3. Flow Similarity in the Frequency Domain Using the Fourier Transform

[24] The flow duration curve works quite well for the 6 month group but less effectively for the 12 and 24 month groups. In this section, the Fourier transform is explored to check whether a better indicator revealing the relationship between the model performance and the similarity of the calibration and validation data can be found.

[25] The Fourier transform can help to study the spectral characteristics of a signal in the frequency domain. For feasible comparisons of signals with different data lengths, before being transformed from the time domain to the frequency domain, the 135 calibration sets in the three scenario groups together with the validation set are replicated different numbers of times to generate new validation and calibration data sets of the same length, calculated as their least common multiple (this is also for the convenience of the fast Fourier transform computations). Replication does not change the signal amplitude after transforming, so the spectral characteristics of the signal remain the same. The scatterplots showing the total power of each calibration set in the 6 and 24 month groups against the model performances are displayed in Figure 5. The vertical lines indicate the value of the total power of the validation set on the x axis. The total power reflects the amount of energy contained in a signal, obtained by adding together the powers at each frequency component in the Fourier series. Again, we assume that the closer the total power of a calibration set is to that of the validation set, the more similar the two sets are in the frequency domain and thus the better the calibrated model performs using that calibration data set. A consistent result for the 6 month group can be found in Figure 5a. Although a majority of points gather in the middle with a wide range of model performances, the tendency is quite clear at the left and right ends of the scatter. For the results of the 24 month group shown in Figure 5b, the tendencies are not as clear as for the 6 month group. The 12 month group gives an average performance between those of the 6 and 24 month groups, whose tendency is also too weak to identify the similarity.

Table 1. Nash Sutcliffe Efficiency Coefficient Statistics of the Calibration and Validation Results for the Comparison of the Average Model Performances Produced by the 6, 12, and 24 Month Groups (for each group, both the calibration and validation results are summarized by the average value, standard deviation, maximum (MAX), upper quartile (P75), median (P50), lower quartile (P25), and minimum (MIN)).
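The replication step described above, tiling two series to the least common multiple of their lengths so their FFT spectra can be compared bin for bin, can be sketched as follows (the function name is ours):

```python
import numpy as np
from math import lcm

def replicate_to_common_length(a, b):
    """Tile two series to their least-common-multiple length so that their
    Fourier spectra share a common frequency grid."""
    n = lcm(len(a), len(b))
    return np.tile(a, n // len(a)), np.tile(b, n // len(b))
```

Tiling a series r times multiplies both the nonzero DFT magnitudes and the length n by r, so the amplitude A = |X| / n at the corresponding frequencies is unchanged, which is why replication preserves the spectral characteristics being compared.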
[Figure 5. Relationship between the model performance and the total power of the calibration data sets in the 6 and 24 month groups after the Fourier transform.]

[26] The amplitude of the Fourier series after transformation can also be compared between the validation and calibration data sets to further explore the spectral similarities and how they relate to the model performances. The comparison results are shown in Figure 6. As with the flow duration curve, the similarity of the amplitude is evaluated by the Nash Sutcliffe efficiency coefficient between the validation and the calibration data sets. To eliminate data noise in the high frequency domain, all the transformed Fourier series are processed with the moving average method using a window size of 50. The results are similar to those using the flow duration curve and the total power: the regression lines show a good correlation between the model performance and the amplitude similarity for the 6 month group in Figure 6a but nearly random results for both the 24 month group (Figure 6b) and the 12 month group.

[Figure 6. Relationship between the model performance and the similarity of the amplitude of the validation and calibration data sets in the 6 and 24 month groups after the Fourier transform.]

4.4. Flow Similarity Described by Wavelet Analysis and ICF

[27] The relationship between the model performance and the spectrum similarity of the validation and calibration data sets can be investigated further by means of the DWT, which gives a more detailed subdivision of the frequency domain. The DWT in this paper is carried out by decomposing the validation and calibration sets into six levels of details (d1 d6) and approximations (a1 a6), with a basic Daubechies wavelet of order 10 chosen for the decomposition. More details about the Daubechies wavelets can be found in the work of Daubechies [1990].

[28] The details, containing the high frequency information, represent the flavor and nuance of a signal; they are regarded as more important than the approximations and are thus more frequently used in wavelet analysis when comparing signals. In contrast, the approximations are the low frequency components, giving the identity of a signal: as the wavelet decomposition proceeds, the approximation becomes an increasingly abstract representation of the original signal.
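Before turning to the wavelet energies, the amplitude comparison of section 4.3 can be sketched as follows (an illustration with simulated, equal-length records, not the paper's code): the one-sided Fourier magnitudes are smoothed with the 50-point moving average mentioned above and then scored with the Nash Sutcliffe efficiency.

```python
import numpy as np

def smoothed_amplitude(x, window=50):
    """One-sided amplitude spectrum of x, smoothed with a moving average
    (window size 50, as in the paper) to suppress high-frequency noise."""
    amp = np.abs(np.fft.rfft(np.asarray(x, dtype=float)))
    kernel = np.ones(window) / window
    return np.convolve(amp, kernel, mode="valid")

def nse(obs, sim):
    """Nash-Sutcliffe efficiency between two series (1 = perfect match)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - np.mean(obs)) ** 2)

# Hypothetical records, already replicated to a common length
# (see the least common multiple step described in section 4.3)
rng = np.random.default_rng(3)
n = 1080
t = np.arange(n)
validation = 10 + 5 * np.sin(2 * np.pi * t / 360) + rng.normal(0, 1, n)
calibration = 10 + 4 * np.sin(2 * np.pi * t / 360) + rng.normal(0, 1, n)

similarity = nse(smoothed_amplitude(validation), smoothed_amplitude(calibration))
```

As with the flow duration curve, a similarity closer to one suggests a more promising calibration period.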
In this study, both the total energy and the energy distribution of the details and approximations on different decomposition levels are examined in order to find a better representation of the spectrum similarity between the validation and calibration sets.

[29] The total energy of the wavelet coefficients on a given decomposition level is the amount of energy distributed in the corresponding frequency band, as described by equation (7). Because the approximation is an abstract of the original signal, the total energies of the approximations on all six levels give almost the same results as the FFT; therefore, only the total energies based on the details are presented here. The total energies of the details on different decomposition levels for the calibration sets in the 6 and 24 month groups are plotted against the corresponding model performances in Figures 7 and 8. The 12 month group shows trends similar to those of the 24 month group, so only the results of the 24 month group are presented. The vertical lines indicate the total energy of the validation data set on each of the six levels. For all three groups with different calibration data lengths, the results are highly consistent with the assumption made in section 4.3 on the Fourier analysis, namely that the closer the total energy of the calibration data set is to that of the validation set, the better the performance of the calibrated model. The assumption is verified particularly well by detail d5 (Figures 7e and 8e) on decomposition level 5 and to a lesser extent by the details on the other levels.

[30] The percentile energy, indicating the relative amount of energy distributed on a certain decomposition level, can also be considered a useful indicator for assessing the spectral similarity of the validation and calibration data sets. Figure 9 shows the percentile energies of details on different levels for the calibration sets in the 6 month group.
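The per-level energies of equation (7) and the percentile energies described above can be sketched as follows. This is an illustrative, dependency-free version: an orthonormal Haar filter pair replaces the paper's Daubechies wavelet of order 10 (with PyWavelets one would call `pywt.wavedec(x, 'db10', level=6)` instead), and the flow record is simulated.

```python
import numpy as np

def dwt_level_energies(x, levels=6):
    """Total energy of the detail coefficients d1..d6 of a discrete wavelet
    transform. An orthonormal Haar filter pair is used here only to keep the
    sketch free of dependencies; the paper itself uses a Daubechies wavelet
    of order 10."""
    a = np.asarray(x, dtype=float)
    energies = []
    for _ in range(levels):
        if a.size % 2:                                 # pad to even length
            a = np.append(a, a[-1])
        detail = (a[0::2] - a[1::2]) / np.sqrt(2.0)    # high-pass branch
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)         # low-pass branch
        energies.append(np.sum(detail ** 2))           # energy on this level
    return np.array(energies)                          # [e(d1), ..., e(d6)]

def percentile_energies(energies):
    """Relative (percentile) energy distribution over the levels."""
    e = np.asarray(energies, dtype=float)
    return e / e.sum()

# Hypothetical 6-month hourly flow record
rng = np.random.default_rng(2)
flow = rng.gamma(2.0, 5.0, size=4380)

total = dwt_level_energies(flow)        # total energy per detail level
relative = percentile_energies(total)   # fractions summing to one
```

Comparing `total` (or `relative`) level by level between a candidate calibration set and the validation set reproduces, in outline, the comparisons shown in Figures 7 to 10.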
The results of details d3 d6 are in strong agreement with the assumption that the greater the data similarity, the better the model performance; among them, detail d4 gives the most evident results. The results of d1 and d2 are not as good as the others, which may result from noise in the high frequency domains of the original signals.

[Figure 7. Total energies of details on different wavelet decomposition levels for the calibration data sets in the 6 month scenario group.]

[Figure 8. Total energies of details on different wavelet decomposition levels for the calibration data sets in the 24 month scenario group.]

[31] The results of the percentile energy based on details are not ideal for the 12 and 24 month groups, showing no obvious trends on any of the six levels and appearing as nearly random series when ranked by the values of the model performance. The reason may lie in the relatively low variances of the model performances in these two groups compared with the evident differences between the poor and good scenarios in the 6 month group. Details, which reveal the subtle differences in the high frequency domain, may not be sensitive enough for the comparison of similar signals (i.e., the comparison of the calibration sets in the 12 and 24 month groups with the validation set). Therefore, the approximations on the six decomposition levels of each scenario in those two groups are explored instead to calculate the percentile energies on the six levels. Beyond expectation, the results based on approximations are remarkably good for the 12 and 24 month groups. The 24 month group results are shown in Figure 10. For the results on levels 1 4 (Figures 10a 10d), when the percentile energies calculated on the approximations of the calibration sets are less than that of the validation set (the vertical line), the model performance rises markedly as the distance between the validation and calibration values decreases. Conversely, when the percentile energies of the calibration sets exceed the value of the validation set (Figure 10f), the model performance decreases as that distance increases. The 12 month group gives quite similar results to the 24 month group, and hence they are not presented in this paper.

[Figure 9. Percentile energies of details on different wavelet decomposition levels for the calibration data sets in the 6 month scenario group.]

[Figure 10. Percentile energies of approximations on different wavelet decomposition levels for the calibration data sets in the 24 month scenario group.]

[32] An entropy-like indicator, the ICF, is found to be a simple but efficient evaluation of the overall energy distribution on the different wavelet decomposition levels. It is defined from the degree of uniformity of the energy distribution, and by comparing it the spectral similarity of the validation and calibration sets can be easily assessed. The results are plotted in Figures 11 and 12 for the 6 and 24 month scenario groups, respectively. (The 12 month group shows nearly the same tendency as the 24 month group, and its results are not presented in this paper.) Figures 11a and 12a show the ICF values of the calibration data sets versus the respective model performances; the vertical lines indicate the ICF value of the validation set. Following the analysis of the percentile energy, the better results are chosen from the calculations based on either details or approximations for the three scenario groups. Figures 11b and 12b present another approach to describing the similarity of the overall energy distribution between the validation and calibration sets: the model performances are plotted versus the Nash Sutcliffe efficiency coefficients calculated between the percentile energy series {P_j, j = 1, 2, ..., 6} of the validation set and those of the calibration sets in the different scenario groups.

[Figure 11. Relationship between the model performance and the similarity of (a) information cost function (ICF) values and (b) percentile energy series {P_j, j = 1, 2, ..., 6} between the validation and calibration sets in the 6 month group, calculated on the details on different decomposition levels.]

[Figure 12. Relationship between the model performance and the similarity of (a) ICF values and (b) percentile energy series {P_j, j = 1, 2, ..., 6} between the validation and calibration sets in the 24 month scenario group, calculated on the approximations on different decomposition levels.]

[33] For the 6 month group, the better results come from the calculations on the details, shown in Figure 11. Although the ICF and Nash Sutcliffe efficiency coefficient indices fail to pick out the best of the calibration sets (because of the scatter in the high model performance area in Figures 11a and 11b), they perform effectively in identifying the worst scenarios for the 6 month group. As for the 12 and 24 month groups, the scenarios are more sensitive to the calculations based on the approximations, which show evident trends consistent with the assumption that the greater the similarity, the better the model performance. As shown in Figure 12, both the ICF and the NSE calculated on the percentile energy series can be viewed as good indicators for the selection of calibration sets in the 24 month group, and the same holds for the 12 month group.

5. Discussion

[34] Calibration results of the PDM model presented in this paper demonstrate the importance of selecting calibration data with the most appropriate length and duration. For the three scenario groups containing calibration

data with different starting times and fixed lengths of 6, 12, and 24 months, some scenarios in the 6 month group can even perform better than those in the 12 and 24 month groups, although on average the 12 and 24 month groups provide more stable and better model performances with less variance. These results are in line with the statement that it is not the length but the quality of information of the calibration data that matters more in deciding the model performance. From this point of view, the information contained in the good scenarios of the 6 month group is of better quality than that of the scenarios with calibration data lengths of 12 or 24 months. As the evaluation criterion is chosen to be the validation result of the calibrated model on an 18 month data set, it can be deduced that better quality of information in the calibration set implies, to some extent, an underlying similarity to the validation data set.

[35] If the modeler can evaluate the similarity or the information quality of all the possible calibration data sets of given durations beforehand, the calibration work needed to search for the most appropriate set can be reduced dramatically. However, it is difficult to compare the similarity of the validation and calibration sets visually, or to judge the quality of information in the calibration data by direct experience, except for several poorly performing scenarios in the 6 month group that consist almost entirely of dry months within a year. That is why the flow duration curve, the Fourier transform, and finally the wavelet analysis are applied in this study to help assess the similarity between the validation and calibration sets in the three scenario groups.
[36] The similarities described using the flow duration curve and the Fourier transform yield comparable results: the flow duration curve and the total power and amplitude of the Fourier series all show a good correlation between the model performance and the similarity of the validation and calibration sets for the scenarios in the 6 month group, while for the other two groups, with calibration data lengths of 12 and 24 months, there are no obvious trends. For that reason, the wavelet transform, which can reveal more detailed spectral properties of a signal in more specific frequency bands, is applied to search for better indices whose similarity measures agree more generally with the model performance. Comparisons of both the total and percentile energies of the validation and calibration data sets after the discrete wavelet transform provide more consistent results for all three scenario groups, showing an improvement in the model performance with increasing similarity, especially on particular decomposition levels that represent specific ranges of the frequency domain. On the basis of the wavelet results, the entropy-like function ICF, which efficiently evaluates the overall energy distribution of a signal across the decomposition levels, was constructed and appears to be the most suitable index of the similarity between the validation and calibration sets. The ICF provides evident results for all three groups, in high accordance with the previous assumption. It is interesting to note that the ICF performs better with the details for the 6 month group but with the approximations for the 12 and 24 month groups. The results of the ICF are confirmed by the comparisons of the percentile energy series, which are another means of describing the similarity of the overall energy distribution between the validation and calibration sets.
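As an illustration of how the ICF comparison singled out here might be computed, a sketch follows. The entropy-like form assumed is that of Blanco et al. [1998], cited in the references; the percentile-energy values are purely hypothetical.

```python
import numpy as np

def icf(level_energies):
    """Information cost function: a Shannon-entropy-like measure of how
    uniformly the signal energy is spread over the decomposition levels
    (the form of Blanco et al. [1998], assumed here to match the paper)."""
    p = np.asarray(level_energies, dtype=float)
    p = p / p.sum()                # normalise to percentile energies
    p = p[p > 0]                   # convention: 0 * log(0) = 0
    return -np.sum(p * np.log(p))

# Hypothetical percentile-energy series for a validation and a calibration set
e_val = np.array([0.05, 0.10, 0.20, 0.35, 0.20, 0.10])
e_cal = np.array([0.06, 0.12, 0.22, 0.30, 0.20, 0.10])

# The closer the ICF of a calibration set is to that of the validation set,
# the more similar their energy distributions; candidate calibration periods
# can be ranked by this distance before any calibration run is made.
distance = abs(icf(e_val) - icf(e_cal))
```

The ICF is maximal (log 6 for six levels) when the energy is spread uniformly and zero when it is concentrated on a single level, which is what makes it a compact one-number summary of the percentile energy series.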
[37] It should be mentioned that all three methods are also applied to the observed rainfall data, in addition to the flow observations, in this study (the flow duration curve methodology can be applied to rainfall data as well). Except for the results of the 6 month group using the duration curve, which show a trend similar to that obtained with the flow data, poor results are produced by the other two methods. However, this is only the case for the Brue catchment; the rainfall data may still be worth trying for other catchments and other case studies with differently designed calibration scenarios.

6. Conclusions

[38] Selection of the calibration data is an important task for hydrologists in building hydrological models. Despite the publication of numerous studies on model development, there is a lack of guidance on how to select adequate and appropriate calibration data. The traditional rule of thumb is based mainly on the data length (e.g., 6 years of data) and is inadequate for different catchment characteristics. It has gradually been recognized by modelers that it is not the length but the information quality of the data used for calibration that is the most significant factor affecting the performance of the calibrated rainfall runoff model. The selection of calibration data with an adequate length and an appropriate duration is becoming more and more important, especially as increasingly more observed data with high resolution are collected by modern telemetry systems. This study has provided several practical indices for the calibration data selection of rainfall runoff models. With the validation data determined, it is assumed that the more similarity a calibration data set bears to the validation set, the better the performance of the model calibrated using that calibration data set.
The three methods presented in this paper, the flow duration curve, the Fourier transform, and the wavelet analysis, are all found to produce good indices describing the similarities between the validation and calibration data sets for certain scenario groups in the case study, among which the ICF appears to be the most appropriate and efficient one for calibration data selection. It is interesting to note that some models calibrated using 6 month data in this study performed better than those using longer data lengths. This again verifies that the information content of the calibration data is more important than the data length. The idea presented in this paper has also shown its potential for enhancing the efficiency of data utilization, particularly when the modeler faces the problem of data-limited catchments.

[39] Clearly, the outcomes of this paper depend to some extent on the characteristics of the case study, for example, the design of the scenario groups, the catchment, and the rainfall runoff model that have been used. More research is needed to explore the applicability of the indices under other catchment conditions and with different choices of rainfall runoff models, in particular spatially distributed models, which have more complicated input requirements. One limitation of this study is that the validation data are determined beforehand. It is true that validation data are a

dominating factor in evaluating the performance of the calibrated model. In practice, the selection of the validation data is associated with the purpose of applying the rainfall runoff model, such as real time flood forecasting or hydraulic structure design, which in normal cases can be decided appropriately before the selection of the calibration data.

[40] For completeness, it should be noted that the calibration data sets selected by the suggested indices in the case study are expected to be the relatively best ones among the calibration sets in all the scenarios rather than the absolute best ones. There are no large differences between the model performances of the good scenarios in the 6 month group and those in the 12 and 24 month groups, although the average performance increases from the 6 to the 24 month group, which could lead to a general conclusion of improved performance with increasing calibration data length. Searching for the optimal length of the calibration data remains an unsolved and attractive issue for the future. We therefore hope this study will stimulate further research into the related calibration issues so that some generalizations on the selection of the optimal calibration data and more applicable indices may be found.

References

Anctil, F., C. Perrin, and V. Andréassian (2004), Impact of the length of observed records on the performance of ANN and of conceptual parsimonious rainfall runoff forecasting models, Environ. Modell. Software, 19(4), , doi: /s (03)00135-x.
Biggs, M. C. (1975), Constrained minimization using recursive quadratic programming, in Towards Global Optimization, edited by L. C. W. Dixon and G. P. Szegö, pp. , North Holland, Amsterdam, Netherlands.
Blanco, S., A. Figliola, R. Quian Quiroga, O. A. Rosso, and E. Serrano (1998), Time frequency analysis of electroencephalogram series. III. Wavelet packets and information cost function, Phys. Rev. E, 57(1), .
Boughton, W.
(2006), Calibrations of a daily rainfall runoff model with poor quality data, Environ. Modell. Software, 21(8), , doi: /j.envsoft.
Brath, A., A. Montanari, and E. Toth (2004), Analysis of the effects of different scenarios of historical data availability on the calibration of a spatially distributed hydrological model, J. Hydrol. Amsterdam, 291, , doi: /j.jhydrol.
Butts, M. B., J. T. Payne, M. Kristensen, and H. Madsen (2004), An evaluation of the impact of model structure on hydrological modelling uncertainty for streamflow simulation, J. Hydrol. Amsterdam, 298, , doi: /j.jhydrol.
Chau, K. W. (2006), Particle swarm optimization training algorithm for ANNs in stage prediction of Shing Mun River, J. Hydrol. Amsterdam, 329, , doi: /j.jhydrol.
Chau, K. W. (2007), A split step particle swarm optimization algorithm in river stage forecasting, J. Hydrol. Amsterdam, 346, , doi: /j.jhydrol.
Cooley, J. W., and J. W. Tukey (1965), An algorithm for the machine calculation of complex Fourier series, Math. Comput., 19(90), , doi: / .
Daubechies, I. (1990), The wavelet transform, time frequency localization and signal analysis, IEEE Trans. Inf. Theory, 36(5), , doi: / .
Dooge, J. C. I. (1973), Linear theory of hydrologic systems, Technical Bulletin 1468, U.S. Dept. of Agric., Washington, D. C.
Drago, A. F., and S. R. Boxall (2002), Use of the wavelet transform on hydro meteorological data, Phys. Chem. Earth, 27(32 34), , doi: /s (02).
Duan, Q., S. Sorooshian, and V. Gupta (1992), Effective and efficient global optimization for conceptual rainfall runoff models, Water Resour. Res., 28, , doi: /91wr.
Duan, Q., S. Sorooshian, and V. Gupta (1994), Optimal use of the SCE-UA global optimization method for calibrating watershed models, J. Hydrol. Amsterdam, 158, , doi: / (94).
Eberhart, R. C., and J. Kennedy (1995), A new optimizer using particle swarm theory, in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. , IEEE, Piscataway, N. J.
Eberhart, R. C., and Y.
Shi (2001), Particle swarm optimization: Developments, applications and resources, in Proceedings of the 2001 Congress on Evolutionary Computation, pp. , IEEE, Piscataway, N. J.
Figliola, A., and E. Serrano (1997), Analysis of physiological time series using wavelet transforms, IEEE Eng. Med. Biol., 16(3), 74 79, doi: / .
Gan, T. Y., and G. F. Biftu (1996), Automatic calibration of conceptual rainfall runoff models: Optimization algorithms, catchment conditions, and model structure, Water Resour. Res., 32, , doi: /95WR.
Gan, T. Y., E. M. Dlamini, and G. F. Biftu (1997), Effects of model complexity and structure, data quality, and objective functions on hydrologic modelling, J. Hydrol. Amsterdam, 192, , doi: /s (96).
Gill, M. K., Y. H. Kaheil, A. Khalil, M. McKee, and L. Bastidas (2006), Multiobjective particle swarm optimization for parameter estimation in hydrology, Water Resour. Res., 42, W07417, doi: /2005wr.
Goswami, M., and K. M. O'Connor (2007), Comparative assessment of six automatic optimization techniques for calibration of a conceptual rainfall runoff model, Hydrol. Sci. J., 52(3), , doi: /hysj.
Gupta, H. V., S. Sorooshian, and P. O. Yapo (1998), Toward improved calibration of hydrological models: Multiple and noncommensurable measures of information, Water Resour. Res., 34, , doi: /97WR.
Gupta, V. K., and S. Sorooshian (1985a), The relationship between data and the precision of estimated parameters, J. Hydrol. Amsterdam, 85, 57 77, doi: / (85).
Gupta, V. K., and S. Sorooshian (1985b), The automatic calibration of conceptual catchment models using derivative based optimization algorithms, Water Resour. Res., 21, , doi: /wr021i004p.
Han, S. P. (1977), A globally convergent method for nonlinear programming, J. Optim. Theory Appl., 22, , doi: /bf.
Harlin, J. (1991), Development of a process oriented calibration scheme for the HBV hydrological model, Nord. Hydrol., 22, .
Kulkarni, J. R.
(2000), Wavelet analysis of the association between the Southern Oscillation and the Indian Summer Monsoon, Int. J. Climatol., 20(1), , doi: /(sici) (200001)20:1<89::aid-JOC458>3.0.CO;2-W.
LeBoutillier, D. W., and P. R. Waylen (1993), A stochastic model of flow duration curves, Water Resour. Res., 29, , doi: /93WR.
Li, C. H., Z. F. Yang, G. H. Huang, and Y. P. Li (2009), Identification of relationship between sunspots and natural runoff in the Yellow River based on discrete wavelet analysis, Expert Syst. Appl., 36(2), , doi: /j.eswa.
Li, X. B., H. Q. Li, F. Q. Wang, and J. Ding (1997), A remark on the Mallat pyramidal algorithm of wavelet analysis, Commun. Nonlinear Sci. Numer. Simul., 2(4), , doi: /s (97).
Liu, B., L. Wang, Y. H. Jin, F. Tang, and D. X. Huang (2005), Improved particle swarm optimization combined with chaos, Chaos Solitons Fractals, 25(5), , doi: /j.chaos.
Mallat, S. (1989), A theory for multiresolution signal decomposition: The wavelet representation, IEEE Trans. Pattern Anal. Mach. Intell., 11(7), , doi: / .
Meyer, Y. (1993), Wavelets: Algorithms and Applications, Society for Industrial and Applied Mathematics, Philadelphia, Pa.
Moore, R. J. (1985), The probability distributed principle and runoff production at point and basin scales, Hydrol. Sci. J., 30(2), .
Moore, R. J. (2007), The PDM rainfall runoff model, Hydrol. Earth Syst. Sci., 11(1), .
Nakken, M. (1999), Wavelet analysis of rainfall runoff variability isolating climatic from anthropogenic patterns, Environ. Modell. Software, 14(4), , doi: /s (98).
Nash, J. E., and J. V. Sutcliffe (1970), River flow forecasting through conceptual models: Part 1. A discussion of principles, J. Hydrol. Amsterdam, 10, .
Newland, D. E. (1993), An Introduction to Random Vibrations, Spectral and Wavelet Analysis, 477 pp., Addison Wesley Longman, Harlow, Essex, U. K.


More information

Cpk: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc.

Cpk: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc. C: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc. C is one of many capability metrics that are available. When capability metrics are used, organizations typically provide

More information

Solar Radiation Data Modeling with a Novel Surface Fitting Approach

Solar Radiation Data Modeling with a Novel Surface Fitting Approach Solar Radiation Data Modeling with a Novel Surface Fitting Approach F. Onur Hocao glu, Ömer Nezih Gerek, Mehmet Kurban Anadolu University, Dept. of Electrical and Electronics Eng., Eskisehir, Turkey {fohocaoglu,ongerek,mkurban}

More information

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski Data Analysis and Solver Plugins for KSpread USER S MANUAL Tomasz Maliszewski tmaliszewski@wp.pl Table of Content CHAPTER 1: INTRODUCTION... 3 1.1. ABOUT DATA ANALYSIS PLUGIN... 3 1.3. ABOUT SOLVER PLUGIN...

More information

Automatic calibration of the MIKE SHE integrated hydrological modelling system

Automatic calibration of the MIKE SHE integrated hydrological modelling system Automatic calibration of the MIKE SHE integrated hydrological modelling system Henrik Madsen and Torsten Jacobsen DHI Water & Environment Abstract In this paper, automatic calibration of an integrated

More information

Descriptive Statistics, Standard Deviation and Standard Error

Descriptive Statistics, Standard Deviation and Standard Error AP Biology Calculations: Descriptive Statistics, Standard Deviation and Standard Error SBI4UP The Scientific Method & Experimental Design Scientific method is used to explore observations and answer questions.

More information

Table of Contents (As covered from textbook)

Table of Contents (As covered from textbook) Table of Contents (As covered from textbook) Ch 1 Data and Decisions Ch 2 Displaying and Describing Categorical Data Ch 3 Displaying and Describing Quantitative Data Ch 4 Correlation and Linear Regression

More information

ENV3104 Hydraulics II 2017 Assignment 1. Gradually Varied Flow Profiles and Numerical Solution of the Kinematic Equations:

ENV3104 Hydraulics II 2017 Assignment 1. Gradually Varied Flow Profiles and Numerical Solution of the Kinematic Equations: ENV3104 Hydraulics II 2017 Assignment 1 Assignment 1 Gradually Varied Flow Profiles and Numerical Solution of the Kinematic Equations: Examiner: Jahangir Alam Due Date: 27 Apr 2017 Weighting: 1% Objectives

More information

Constraining Rainfall Replicates on Remote Sensed and In-Situ Measurements

Constraining Rainfall Replicates on Remote Sensed and In-Situ Measurements Constraining Rainfall Replicates on Remote Sensed and In-Situ Measurements Seyed Hamed Alemohammad, Dara Entekhabi, Dennis McLaughlin Ralph M. Parsons Laboratory for Environmental Science and Engineering

More information

On Automatic Calibration of the SWMM Model

On Automatic Calibration of the SWMM Model On Automatic Calibration of the SWMM Model Van-Thanh-Van Nguyen, Hamed Javaheri and Shie-Yui Liong Conceptual urban runoff (CUR) models, such as the U.S. Environmental Protection Agency Storm Water Management

More information

Metaheuristic Development Methodology. Fall 2009 Instructor: Dr. Masoud Yaghini

Metaheuristic Development Methodology. Fall 2009 Instructor: Dr. Masoud Yaghini Metaheuristic Development Methodology Fall 2009 Instructor: Dr. Masoud Yaghini Phases and Steps Phases and Steps Phase 1: Understanding Problem Step 1: State the Problem Step 2: Review of Existing Solution

More information

Image Transformation Techniques Dr. Rajeev Srivastava Dept. of Computer Engineering, ITBHU, Varanasi

Image Transformation Techniques Dr. Rajeev Srivastava Dept. of Computer Engineering, ITBHU, Varanasi Image Transformation Techniques Dr. Rajeev Srivastava Dept. of Computer Engineering, ITBHU, Varanasi 1. Introduction The choice of a particular transform in a given application depends on the amount of

More information

Prediction of traffic flow based on the EMD and wavelet neural network Teng Feng 1,a,Xiaohong Wang 1,b,Yunlai He 1,c

Prediction of traffic flow based on the EMD and wavelet neural network Teng Feng 1,a,Xiaohong Wang 1,b,Yunlai He 1,c 2nd International Conference on Electrical, Computer Engineering and Electronics (ICECEE 215) Prediction of traffic flow based on the EMD and wavelet neural network Teng Feng 1,a,Xiaohong Wang 1,b,Yunlai

More information

You ve already read basics of simulation now I will be taking up method of simulation, that is Random Number Generation

You ve already read basics of simulation now I will be taking up method of simulation, that is Random Number Generation Unit 5 SIMULATION THEORY Lesson 39 Learning objective: To learn random number generation. Methods of simulation. Monte Carlo method of simulation You ve already read basics of simulation now I will be

More information

Research Article Path Planning Using a Hybrid Evolutionary Algorithm Based on Tree Structure Encoding

Research Article Path Planning Using a Hybrid Evolutionary Algorithm Based on Tree Structure Encoding e Scientific World Journal, Article ID 746260, 8 pages http://dx.doi.org/10.1155/2014/746260 Research Article Path Planning Using a Hybrid Evolutionary Algorithm Based on Tree Structure Encoding Ming-Yi

More information

Levenberg-Marquardt minimisation in ROPP

Levenberg-Marquardt minimisation in ROPP Ref: SAF/GRAS/METO/REP/GSR/006 Web: www.grassaf.org Date: 4 February 2008 GRAS SAF Report 06 Levenberg-Marquardt minimisation in ROPP Huw Lewis Met Office, UK Lewis:Levenberg-Marquardt in ROPP GRAS SAF

More information

Incorporating Likelihood information into Multiobjective Calibration of Conceptual Rainfall- Runoff Models

Incorporating Likelihood information into Multiobjective Calibration of Conceptual Rainfall- Runoff Models International Congress on Environmental Modelling and Software Brigham Young University BYU ScholarsArchive th International Congress on Environmental Modelling and Software - Barcelona, Catalonia, Spain

More information

Use of evaporation and streamflow data in hydrological model calibration

Use of evaporation and streamflow data in hydrological model calibration Use of evaporation and streamflow data in hydrological model calibration Jeewanthi Sirisena* Assoc./Prof. S. Maskey Prof. R. Ranasinghe IHE-Delft Institute for Water Education, The Netherlands Background

More information

Using Excel for Graphical Analysis of Data

Using Excel for Graphical Analysis of Data EXERCISE Using Excel for Graphical Analysis of Data Introduction In several upcoming experiments, a primary goal will be to determine the mathematical relationship between two variable physical parameters.

More information

WELCOME! Lecture 3 Thommy Perlinger

WELCOME! Lecture 3 Thommy Perlinger Quantitative Methods II WELCOME! Lecture 3 Thommy Perlinger Program Lecture 3 Cleaning and transforming data Graphical examination of the data Missing Values Graphical examination of the data It is important

More information

CHAPTER 3 AN OVERVIEW OF DESIGN OF EXPERIMENTS AND RESPONSE SURFACE METHODOLOGY

CHAPTER 3 AN OVERVIEW OF DESIGN OF EXPERIMENTS AND RESPONSE SURFACE METHODOLOGY 23 CHAPTER 3 AN OVERVIEW OF DESIGN OF EXPERIMENTS AND RESPONSE SURFACE METHODOLOGY 3.1 DESIGN OF EXPERIMENTS Design of experiments is a systematic approach for investigation of a system or process. A series

More information

Introduction to Exploratory Data Analysis

Introduction to Exploratory Data Analysis Introduction to Exploratory Data Analysis Ref: NIST/SEMATECH e-handbook of Statistical Methods http://www.itl.nist.gov/div898/handbook/index.htm The original work in Exploratory Data Analysis (EDA) was

More information

Comparison of multiple point and single point calibration performance for the Saginaw River Watershed

Comparison of multiple point and single point calibration performance for the Saginaw River Watershed Comparison of multiple point and single point calibration performance for the Saginaw River Watershed Fariborz Daneshvar, A 1. Pouyan Nejadhashemi 1, Matthew R. Herman 1 1 Department of Biosystems and

More information

Experimental Study on Bound Handling Techniques for Multi-Objective Particle Swarm Optimization

Experimental Study on Bound Handling Techniques for Multi-Objective Particle Swarm Optimization Experimental Study on Bound Handling Techniques for Multi-Objective Particle Swarm Optimization adfa, p. 1, 2011. Springer-Verlag Berlin Heidelberg 2011 Devang Agarwal and Deepak Sharma Department of Mechanical

More information

Bootstrapping Method for 14 June 2016 R. Russell Rhinehart. Bootstrapping

Bootstrapping Method for  14 June 2016 R. Russell Rhinehart. Bootstrapping Bootstrapping Method for www.r3eda.com 14 June 2016 R. Russell Rhinehart Bootstrapping This is extracted from the book, Nonlinear Regression Modeling for Engineering Applications: Modeling, Model Validation,

More information

Applying Supervised Learning

Applying Supervised Learning Applying Supervised Learning When to Consider Supervised Learning A supervised learning algorithm takes a known set of input data (the training set) and known responses to the data (output), and trains

More information

Aero-engine PID parameters Optimization based on Adaptive Genetic Algorithm. Yinling Wang, Huacong Li

Aero-engine PID parameters Optimization based on Adaptive Genetic Algorithm. Yinling Wang, Huacong Li International Conference on Applied Science and Engineering Innovation (ASEI 215) Aero-engine PID parameters Optimization based on Adaptive Genetic Algorithm Yinling Wang, Huacong Li School of Power and

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Underwater Acoustics Session 2aUW: Wave Propagation in a Random Medium

More information

NCAR SUMMER COLLOQUIUM: July 24-Aug.6, 2011 Boulder, Colorado, USA. General Large Area Crop Model (GLAM) TUTORIAL

NCAR SUMMER COLLOQUIUM: July 24-Aug.6, 2011 Boulder, Colorado, USA. General Large Area Crop Model (GLAM) TUTORIAL NCAR SUMMER COLLOQUIUM: July 24-Aug.6, 2011 Boulder, Colorado, USA General Large Area Crop Model (GLAM) TUTORIAL Gizaw Mengistu, Dept. of Physics, Addis Ababa University, Ethiopia This document provides

More information

Angela Ball, Richard Hill and Peter Jenkinson

Angela Ball, Richard Hill and Peter Jenkinson EVALUATION OF METHODS FOR INTEGRATING MONITORING AND MODELLING DATA FOR REGULATORY AIR QUALITY ASSESSMENTS Angela Ball, Richard Hill and Peter Jenkinson Westlakes Scientific Consulting Ltd, The Princess

More information

Supplementary Figure 1. Decoding results broken down for different ROIs

Supplementary Figure 1. Decoding results broken down for different ROIs Supplementary Figure 1 Decoding results broken down for different ROIs Decoding results for areas V1, V2, V3, and V1 V3 combined. (a) Decoded and presented orientations are strongly correlated in areas

More information

Development and Implementation of International and Regional Flash Flood Guidance (FFG) and Early Warning Systems. Project Brief

Development and Implementation of International and Regional Flash Flood Guidance (FFG) and Early Warning Systems. Project Brief Development and Implementation of International and Regional Flash Flood Guidance (FFG) and Early Warning Systems Project Brief 1 SUMMARY The purpose of this project is the development and implementation

More information

WAVELET USE FOR IMAGE CLASSIFICATION. Andrea Gavlasová, Aleš Procházka, and Martina Mudrová

WAVELET USE FOR IMAGE CLASSIFICATION. Andrea Gavlasová, Aleš Procházka, and Martina Mudrová WAVELET USE FOR IMAGE CLASSIFICATION Andrea Gavlasová, Aleš Procházka, and Martina Mudrová Prague Institute of Chemical Technology Department of Computing and Control Engineering Technická, Prague, Czech

More information

Fast Automated Estimation of Variance in Discrete Quantitative Stochastic Simulation

Fast Automated Estimation of Variance in Discrete Quantitative Stochastic Simulation Fast Automated Estimation of Variance in Discrete Quantitative Stochastic Simulation November 2010 Nelson Shaw njd50@uclive.ac.nz Department of Computer Science and Software Engineering University of Canterbury,

More information

Vocabulary. 5-number summary Rule. Area principle. Bar chart. Boxplot. Categorical data condition. Categorical variable.

Vocabulary. 5-number summary Rule. Area principle. Bar chart. Boxplot. Categorical data condition. Categorical variable. 5-number summary 68-95-99.7 Rule Area principle Bar chart Bimodal Boxplot Case Categorical data Categorical variable Center Changing center and spread Conditional distribution Context Contingency table

More information

IMAGE DE-NOISING IN WAVELET DOMAIN

IMAGE DE-NOISING IN WAVELET DOMAIN IMAGE DE-NOISING IN WAVELET DOMAIN Aaditya Verma a, Shrey Agarwal a a Department of Civil Engineering, Indian Institute of Technology, Kanpur, India - (aaditya, ashrey)@iitk.ac.in KEY WORDS: Wavelets,

More information

A *69>H>N6 #DJGC6A DG C<>C::G>C<,8>:C8:H /DA 'D 2:6G, ()-"&"3 -"(' ( +-" " " % '.+ % ' -0(+$,

A *69>H>N6 #DJGC6A DG C<>C::G>C<,8>:C8:H /DA 'D 2:6G, ()-&3 -(' ( +-   % '.+ % ' -0(+$, The structure is a very important aspect in neural network design, it is not only impossible to determine an optimal structure for a given problem, it is even impossible to prove that a given structure

More information

Tracking Changing Extrema with Particle Swarm Optimizer

Tracking Changing Extrema with Particle Swarm Optimizer Tracking Changing Extrema with Particle Swarm Optimizer Anthony Carlisle Department of Mathematical and Computer Sciences, Huntingdon College antho@huntingdon.edu Abstract The modification of the Particle

More information

Introduction to Geospatial Analysis

Introduction to Geospatial Analysis Introduction to Geospatial Analysis Introduction to Geospatial Analysis 1 Descriptive Statistics Descriptive statistics. 2 What and Why? Descriptive Statistics Quantitative description of data Why? Allow

More information

Towards an objective method of verifying the bend radius of HDD installations. Otto Ballintijn, CEO Reduct NV

Towards an objective method of verifying the bend radius of HDD installations. Otto Ballintijn, CEO Reduct NV International No-Dig 2010 28th International Conference and Exhibition Singapore 8-10 November 2010 Paper 001 Towards an objective method of verifying the bend radius of HDD installations Otto Ballintijn,

More information

Journal of Asian Scientific Research FEATURES COMPOSITION FOR PROFICIENT AND REAL TIME RETRIEVAL IN CBIR SYSTEM. Tohid Sedghi

Journal of Asian Scientific Research FEATURES COMPOSITION FOR PROFICIENT AND REAL TIME RETRIEVAL IN CBIR SYSTEM. Tohid Sedghi Journal of Asian Scientific Research, 013, 3(1):68-74 Journal of Asian Scientific Research journal homepage: http://aessweb.com/journal-detail.php?id=5003 FEATURES COMPOSTON FOR PROFCENT AND REAL TME RETREVAL

More information

Simulation Supported POD Methodology and Validation for Automated Eddy Current Procedures

Simulation Supported POD Methodology and Validation for Automated Eddy Current Procedures 4th International Symposium on NDT in Aerospace 2012 - Th.1.A.1 Simulation Supported POD Methodology and Validation for Automated Eddy Current Procedures Anders ROSELL, Gert PERSSON Volvo Aero Corporation,

More information

Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing

Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Generalized Additive Model and Applications in Direct Marketing Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Abstract Logistic regression 1 has been widely used in direct marketing applications

More information

Adaptive Fingerprint Image Enhancement Techniques and Performance Evaluations

Adaptive Fingerprint Image Enhancement Techniques and Performance Evaluations Adaptive Fingerprint Image Enhancement Techniques and Performance Evaluations Kanpariya Nilam [1], Rahul Joshi [2] [1] PG Student, PIET, WAGHODIYA [2] Assistant Professor, PIET WAGHODIYA ABSTRACT: Image

More information

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview Chapter 888 Introduction This procedure generates D-optimal designs for multi-factor experiments with both quantitative and qualitative factors. The factors can have a mixed number of levels. For example,

More information

TERTIARY INSTITUTIONS SERVICE CENTRE (Incorporated in Western Australia)

TERTIARY INSTITUTIONS SERVICE CENTRE (Incorporated in Western Australia) TERTIARY INSTITUTIONS SERVICE CENTRE (Incorporated in Western Australia) Royal Street East Perth, Western Australia 6004 Telephone (08) 9318 8000 Facsimile (08) 9225 7050 http://www.tisc.edu.au/ THE AUSTRALIAN

More information

A Framework for Incorporating Uncertainty Sources in SWAT Modeling

A Framework for Incorporating Uncertainty Sources in SWAT Modeling A Framework for Incorporating Uncertainty Sources in SWAT Modeling Haw Yen X. Wang, D. G. Fontane, R. D. Harmel, M. Arabi July 30, 2014 2014 International SWAT Conference Pernambuco, Brazil Outline Overview

More information

Hydro Office Software for Water Sciences. TS Editor 3.0. White paper. HydroOffice.org

Hydro Office Software for Water Sciences. TS Editor 3.0. White paper. HydroOffice.org Hydro Office Software for Water Sciences TS Editor 3.0 White paper HydroOffice.org White paper for HydroOffice tool TS Editor 3.0 Miloš Gregor, PhD. / milos.gregor@hydrooffice.org HydroOffice.org software

More information

Hydrologic modelling at a continuous permafrost site using MESH. S. Pohl, P. Marsh, and S. Endrizzi

Hydrologic modelling at a continuous permafrost site using MESH. S. Pohl, P. Marsh, and S. Endrizzi Hydrologic modelling at a continuous permafrost site using MESH S. Pohl, P. Marsh, and S. Endrizzi Purpose of Study Test the latest version of MESH at a continuous permafrost site Model performance will

More information

An Adaptive Color Image Visible Watermark Algorithm Supporting for Interested Area and its Application System Based on Internet

An Adaptive Color Image Visible Watermark Algorithm Supporting for Interested Area and its Application System Based on Internet MATEC Web of Conferences 25, 0301 8 ( 2015) DOI: 10.1051/ matecconf/ 20152 503018 C Owned by the authors, published by EDP Sciences, 2015 An Adaptive Color Image Visible Watermark Algorithm Supporting

More information

17. SEISMIC ANALYSIS MODELING TO SATISFY BUILDING CODES

17. SEISMIC ANALYSIS MODELING TO SATISFY BUILDING CODES 17. SEISMIC ANALYSIS MODELING TO SATISFY BUILDING CODES The Current Building Codes Use the Terminology: Principal Direction without a Unique Definition 17.1 INTRODUCTION { XE "Building Codes" }Currently

More information

Overview of Model Calibration General Strategy & Optimization

Overview of Model Calibration General Strategy & Optimization Overview of Model Calibration General Strategy & Optimization Logan Karsten National Center for Atmospheric Research General Strategy Simple enough. Right?... 2 General Strategy Traditional NWS lumped

More information

The Bootstrap and Jackknife

The Bootstrap and Jackknife The Bootstrap and Jackknife Summer 2017 Summer Institutes 249 Bootstrap & Jackknife Motivation In scientific research Interest often focuses upon the estimation of some unknown parameter, θ. The parameter

More information

Optimizing Pharmaceutical Production Processes Using Quality by Design Methods

Optimizing Pharmaceutical Production Processes Using Quality by Design Methods Optimizing Pharmaceutical Production Processes Using Quality by Design Methods Bernd Heinen, SAS WHITE PAPER SAS White Paper Table of Contents Abstract.... The situation... Case study and database... Step

More information

Curve Fit: a pixel level raster regression tool

Curve Fit: a pixel level raster regression tool a pixel level raster regression tool Timothy Fox, Nathan De Jager, Jason Rohweder* USGS La Crosse, WI a pixel level raster regression tool Working with multiple raster datasets that share a common theme

More information

Louis Fourrier Fabien Gaie Thomas Rolf

Louis Fourrier Fabien Gaie Thomas Rolf CS 229 Stay Alert! The Ford Challenge Louis Fourrier Fabien Gaie Thomas Rolf Louis Fourrier Fabien Gaie Thomas Rolf 1. Problem description a. Goal Our final project is a recent Kaggle competition submitted

More information

Application of Clustering Techniques to Energy Data to Enhance Analysts Productivity

Application of Clustering Techniques to Energy Data to Enhance Analysts Productivity Application of Clustering Techniques to Energy Data to Enhance Analysts Productivity Wendy Foslien, Honeywell Labs Valerie Guralnik, Honeywell Labs Steve Harp, Honeywell Labs William Koran, Honeywell Atrium

More information

DESIGN OF EXPERIMENTS and ROBUST DESIGN

DESIGN OF EXPERIMENTS and ROBUST DESIGN DESIGN OF EXPERIMENTS and ROBUST DESIGN Problems in design and production environments often require experiments to find a solution. Design of experiments are a collection of statistical methods that,

More information

High Resolution Geomodeling, Ranking and Flow Simulation at SAGD Pad Scale

High Resolution Geomodeling, Ranking and Flow Simulation at SAGD Pad Scale High Resolution Geomodeling, Ranking and Flow Simulation at SAGD Pad Scale Chad T. Neufeld, Clayton V. Deutsch, C. Palmgren and T. B. Boyle Increasing computer power and improved reservoir simulation software

More information

Evolved Multi-resolution Transforms for Optimized Image Compression and Reconstruction under Quantization

Evolved Multi-resolution Transforms for Optimized Image Compression and Reconstruction under Quantization Evolved Multi-resolution Transforms for Optimized Image Compression and Reconstruction under Quantization FRANK W. MOORE Mathematical Sciences Department University of Alaska Anchorage CAS 154, 3211 Providence

More information

Building Better Parametric Cost Models

Building Better Parametric Cost Models Building Better Parametric Cost Models Based on the PMI PMBOK Guide Fourth Edition 37 IPDI has been reviewed and approved as a provider of project management training by the Project Management Institute

More information

Automated Thiessen polygon generation

Automated Thiessen polygon generation WATER RESOURCES RESEARCH, VOL. 42,, doi:10.1029/2005wr004365, 2006 Automated Thiessen polygon generation D. Han 1 and M. Bray 1 Received 17 June 2005; revised 20 July 2006; accepted 2 August 2006; published

More information

Resolution Improvement Processing of Post-stack Seismic Data and Example Analysis - Taking NEB Gas Field As an Example in Indonesia

Resolution Improvement Processing of Post-stack Seismic Data and Example Analysis - Taking NEB Gas Field As an Example in Indonesia International Forum on Energy, Environment Science and Materials (IFEESM 2017) Resolution Improvement Processing of Post-stack Seismic Data and Example Analysis - Taking NEB Gas Field As an Example in

More information

An improved PID neural network controller for long time delay systems using particle swarm optimization algorithm

An improved PID neural network controller for long time delay systems using particle swarm optimization algorithm An improved PID neural network controller for long time delay systems using particle swarm optimization algorithm A. Lari, A. Khosravi and A. Alfi Faculty of Electrical and Computer Engineering, Noushirvani

More information

Example Applications of A Stochastic Ground Motion Simulation Methodology in Structural Engineering

Example Applications of A Stochastic Ground Motion Simulation Methodology in Structural Engineering Example Applications of A Stochastic Ground Motion Simulation Methodology in Structural Engineering S. Rezaeian & N. Luco U.S. Geological Survey, Golden, CO, USA ABSTRACT: Example engineering applications

More information

Research on the New Image De-Noising Methodology Based on Neural Network and HMM-Hidden Markov Models

Research on the New Image De-Noising Methodology Based on Neural Network and HMM-Hidden Markov Models Research on the New Image De-Noising Methodology Based on Neural Network and HMM-Hidden Markov Models Wenzhun Huang 1, a and Xinxin Xie 1, b 1 School of Information Engineering, Xijing University, Xi an

More information

CREATING THE DISTRIBUTION ANALYSIS

CREATING THE DISTRIBUTION ANALYSIS Chapter 12 Examining Distributions Chapter Table of Contents CREATING THE DISTRIBUTION ANALYSIS...176 BoxPlot...178 Histogram...180 Moments and Quantiles Tables...... 183 ADDING DENSITY ESTIMATES...184

More information

A new multiscale routing framework and its evaluation for land surface modeling applications

A new multiscale routing framework and its evaluation for land surface modeling applications WATER RESOURCES RESEARCH, VOL. 48,, doi:10.1029/2011wr011337, 2012 A new multiscale routing framework and its evaluation for land surface modeling applications Zhiqun Wen, 1,2,4 Xu Liang, 2,3 and Shengtian

More information

Spatial and temporal rainfall approximation using additive models

Spatial and temporal rainfall approximation using additive models ANZIAM J. 42 (E) ppc1599 C1611, 2000 C1599 Spatial and temporal rainfall approximation using additive models C. Zoppou S. Roberts M. Hegland (Received 7 August 2000) Abstract We investigate the approximation

More information

10.4 Measures of Central Tendency and Variation

10.4 Measures of Central Tendency and Variation 10.4 Measures of Central Tendency and Variation Mode-->The number that occurs most frequently; there can be more than one mode ; if each number appears equally often, then there is no mode at all. (mode

More information

10.4 Measures of Central Tendency and Variation

10.4 Measures of Central Tendency and Variation 10.4 Measures of Central Tendency and Variation Mode-->The number that occurs most frequently; there can be more than one mode ; if each number appears equally often, then there is no mode at all. (mode

More information

Transducers and Transducer Calibration GENERAL MEASUREMENT SYSTEM

Transducers and Transducer Calibration GENERAL MEASUREMENT SYSTEM Transducers and Transducer Calibration Abstracted from: Figliola, R.S. and Beasley, D. S., 1991, Theory and Design for Mechanical Measurements GENERAL MEASUREMENT SYSTEM Assigning a specific value to a

More information

EF5 Overview. University of Oklahoma/HyDROS Module 1.3

EF5 Overview. University of Oklahoma/HyDROS Module 1.3 EF5 Overview University of Oklahoma/HyDROS Module 1.3 Outline Day 1 WELCOME INTRODUCTION TO HYDROLOGICAL MODELS EF5 OVERVIEW Features of EF5 Model structure Control file options Warm-up and model states

More information

Introduction to and calibration of a conceptual LUTI model based on neural networks

Introduction to and calibration of a conceptual LUTI model based on neural networks Urban Transport 591 Introduction to and calibration of a conceptual LUTI model based on neural networks F. Tillema & M. F. A. M. van Maarseveen Centre for transport studies, Civil Engineering, University

More information

Data can be in the form of numbers, words, measurements, observations or even just descriptions of things.

Data can be in the form of numbers, words, measurements, observations or even just descriptions of things. + What is Data? Data is a collection of facts. Data can be in the form of numbers, words, measurements, observations or even just descriptions of things. In most cases, data needs to be interpreted and

More information

Fourier Transformation Methods in the Field of Gamma Spectrometry

Fourier Transformation Methods in the Field of Gamma Spectrometry International Journal of Pure and Applied Physics ISSN 0973-1776 Volume 3 Number 1 (2007) pp. 132 141 Research India Publications http://www.ripublication.com/ijpap.htm Fourier Transformation Methods in

More information